October 24th, 2002, 12:21 PM
Distributed computing explained...
As the technology we use today increases in speed and usability, there are those who are happy with the fastest computer. But then there are those who either want or need to go faster than the fastest. The solution lies in and around the Internet, and it has been rightfully termed "Distributed Computing."
Put simply, distributed computing is the splitting of a task among multiple computers. It uses the power of many computers that people already have, sparing the cost of buying a supercomputer of equal power, which in some cases is impossible.
Distributed computing isn't that hard a concept to grasp. The first step in setting up a distributed computing operation is selecting the problem that you want to tackle. This allows you to decide what the basic architecture of the network is going to be. To help describe the process better, I will step through the process of setting up an imaginary distributed computing project.
I am bored one day, surfing the web, and I run into RSA's homepage, where they have just announced a new competition offering ten thousand dollars to the person who successfully cracks any one of their encryption standards under the RC5 brand. I know that RC5-56 has been cracked, and RC5-64 is being cracked using brute force distributed computing. So I decide to take on RSA's "most secure" encryption standard: RC5-128, officially called RC5-32/12/16.
Tackling this problem with brute force is probably the easiest method, but this cipher is 2^64 times stronger than RC5-64 (its key space is 2^64 times larger), so we are going to need quite the backing for this. Seeing how money is only a slight problem, we go out and buy a web server, a key server, and a so-called stats server. We also buy a copy of Visual Studio 6.0 (for coding the clients).
I introduced some possibly foreign terms in the last paragraph, so let me explain. A key server is possibly the most important part of a distributed computing project. Its only purpose is to keep the clients supplied with blocks of keys. The stats server is a part of the network that isn't critical to the project, but to get the number of people we will need to crack this code, we want to make sure they're happy, and geeks love stats.
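To make the key server's job a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class name KeyServer, its methods, and the tiny 8-bit key space are made up, and a real key server would also have to deal with networking, lost blocks, and cheating clients. It only shows the core idea of handing out consecutive blocks of keys and keeping a simple count for the stats.

```python
class KeyServer:
    """Hypothetical key server: hands out consecutive blocks of keys, lowest first."""

    def __init__(self, key_bits, block_bits):
        self.total_keys = 1 << key_bits      # size of the whole key space
        self.block_size = 1 << block_bits    # number of keys in one block
        self.next_start = 0                  # first key of the next unassigned block
        self.outstanding = {}                # block start -> client that is working on it
        self.finished = set()                # starts of blocks reported as fully tested

    def request_block(self, client):
        """Give the client the next untested block, or None if none are left."""
        if self.next_start >= self.total_keys:
            return None
        start = self.next_start
        self.next_start += self.block_size
        self.outstanding[start] = client
        return start, self.block_size

    def report_done(self, start):
        """Record that a block was tested without finding the key."""
        self.outstanding.pop(start)
        self.finished.add(start)

    def fraction_done(self):
        """The kind of number a stats server would publish."""
        return len(self.finished) * self.block_size / self.total_keys


# Toy run: a 2^8 key space split into blocks of 2^6 keys gives four blocks.
server = KeyServer(key_bits=8, block_bits=6)
first = server.request_block("alice")    # (0, 64)
second = server.request_block("bob")     # (64, 64)
server.report_done(first[0])
print(first, second, server.fraction_done())
```

Keeping the outstanding blocks separate from the finished ones is what would let a server like this hand a block out again if a client never reports back.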
The client is a program that runs on a workstation. In our case, it will be testing keys against the encrypted message that was supplied to us by RSA. For this project, the encrypted message is:
d9 3b 27 72 11 8a 65 cb ef 5b 06 74 63 76 22 16
84 f9 ec 21 56 3b 1c 1c 02 e1 70 10 50 d1 71 00
06 aa bf c1 38 e1 f1 f8 2d 63 57 bb 24 a9 7d 5d
All the client needs to know is the first line or so. This is for speed, as it takes less time to test against one line than it does to test against three. If the client thinks it has a possible key, it puts a flag on that key and sends it back to the key server, which sees the flag and tests the key against the rest of the message. If it works, we send it off to RSA, and they make it official and send us our check.
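The two-stage check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the cipher here is a toy stand-in built from SHA-256, not RC5; the key space is tiny so the loop finishes instantly; and the idea that part of the plaintext is known in advance (the made-up KNOWN_PREFIX) is an assumption about how a client would recognize a correct key. The function names are hypothetical as well.

```python
import hashlib

HEAD_BYTES = 8  # assumed size of the piece of ciphertext the client looks at


def toy_decrypt(key, ciphertext):
    """Stand-in cipher (NOT RC5): XOR the data with a SHA-256 keystream made from the key."""
    stream = b""
    counter = 0
    while len(stream) < len(ciphertext):
        seed = key.to_bytes(16, "big") + counter.to_bytes(4, "big")
        stream += hashlib.sha256(seed).digest()
        counter += 1
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


toy_encrypt = toy_decrypt  # an XOR keystream cipher is its own inverse


def client_scan(start, count, cipher_head, known_head):
    """Client: test every key in the block against the start of the message only."""
    flagged = []
    for key in range(start, start + count):
        if toy_decrypt(key, cipher_head) == known_head:
            flagged.append(key)  # possible hit, flag it for the key server
    return flagged


def server_verify(key, ciphertext, known_prefix):
    """Key server: re-test a flagged key against more of the message."""
    return toy_decrypt(key, ciphertext).startswith(known_prefix)


# Toy run with a made-up known prefix and a made-up secret key.
KNOWN_PREFIX = b"known header of the message:"
SECRET_KEY = 0xBEEF
ciphertext = toy_encrypt(SECRET_KEY, KNOWN_PREFIX + b" the rest is secret")

flagged = client_scan(0xBE00, 0x100, ciphertext[:HEAD_BYTES], KNOWN_PREFIX[:HEAD_BYTES])
confirmed = [k for k in flagged if server_verify(k, ciphertext, KNOWN_PREFIX)]
print([hex(k) for k in confirmed])
```

The client does the cheap test millions of times, and the key server only does the fuller test on the rare keys that come back flagged.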
To reduce the load on the key server, we send the keys over the Internet in blocks of somewhere between 2^34 and 2^64 keys. The bigger blocks contain 18,446,744,073,709,551,616 keys, roughly 5.4 x 10^-18 percent of the total 340,282,366,920,938,463,463,374,607,431,768,211,456 keys, just to give you an idea of how big a project this is. We send the keys over the Internet for the simple reason that we need lots of people, and the Internet provides the perfect medium for communicating with all these people.
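The sizes involved are easy to check with exact integer arithmetic. The short Python sketch below does that for a 128-bit key space and the two block sizes mentioned; the constant and function names are made up for illustration.

```python
TOTAL_KEYS = 2 ** 128     # every possible 128-bit key
SMALL_BLOCK = 2 ** 34     # smallest block size mentioned
LARGE_BLOCK = 2 ** 64     # largest block size mentioned


def percent_of_key_space(block_keys):
    """Share of the whole key space covered by one block, in percent."""
    return 100 * block_keys / TOTAL_KEYS


print(f"total keys:       {TOTAL_KEYS:,}")
print(f"large block:      {LARGE_BLOCK:,}")
print(f"percent per block: {percent_of_key_space(LARGE_BLOCK):.3e}")
print(f"large blocks needed: {TOTAL_KEYS // LARGE_BLOCK:,}")
print(f"small blocks needed: {TOTAL_KEYS // SMALL_BLOCK:,}")
```

Even at the largest block size, it takes 2^64 blocks to cover the key space, which is as many blocks as there are keys in one block.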
So, we have all the hardware we need to take on this project, and now we have to code the client. We follow all the basic procedures for making any program, including beta testing. After the client is ready to be released, we start advertising: we put up a web page, post to message boards, install the client on our own computers, and start implementing the stats database. By the time we get the stats working, we should have a little bit of a following and about a millionth of a percent of the key space exhausted. We hand out the key space from our key server in a bottom-up fashion, simply because it sounds cool; it really makes no difference.
We are now on our way to cracking RC5-128, so now we sit back, wait and hope everything continues to run smoothly.
My involvement and interest in distributed computing sprouted from a lonesome link on a hardware page I often visit. The caption read, "Click here, sign up, and have the chance to win $1,000." With those few words, I was hooked. I signed up for Distributed.net and all associated projects on January 3rd, 1999, and have been experimenting with the concept since then.
To date, I have discovered many different projects that have implemented distributed computing to aid them in achieving their goal. The bulk of them use the Internet, and offer rewards to the person that achieves the goal.
But there is one that is out there to save money, to make that old hardware sitting in equipment closets somewhere do something. The computer science department at Carnegie Mellon University has set up a network of first- and second-generation computers running a flavor of Unix. They set up a client that splits the normal tasks of a web server among this network of slower machines, making a relatively fast web server. Oh, and of course you have to give all these computers a name, but since they all work together, the fine folks at Carnegie Mellon gave all the computers the same name: Andrew.
Another current project that is using distributed computing is SETI, a.k.a. the Search for Extraterrestrial Intelligence, which takes data recorded at the radio telescope in Puerto Rico and sends it out to the clients. The clients decode the data and send it back to the server, which analyzes it to see if there are any extraterrestrial radio signals in it.
Distributed computing can also be used for high-end mathematics, as GIMPS has done. GIMPS is the Great Internet Mersenne Prime Search, which is looking for the next largest Mersenne prime. I will avoid boring you with the technical definition of a Mersenne prime, since that's not what this paper is about. Anyway, GIMPS sends out blocks of numbers and has the Prime95 client check whether each number is prime.
And perhaps the most successful distributed computing project is the one headed by Distributed Computing Technologies Incorporated (DCTI). This is the company that is running a brute force attempt to break RC5-64, along with other encryption standards. Since their formation in late 1997, they have successfully cracked four ciphers: DES-I, DES-II, DES-III, and RC5-56. They are currently making simultaneous attempts to crack both the RC5-64 and the CSC ciphers, with well over 200,000 participants between the two competitions. I believe the key to DCTI's success is their ability to communicate with the public. As an example, you can go into an IRC channel and talk to any one of the many people who run the non-profit organization. Their success can also be credited to the ease of use their clients provide, their statistics, and the community that has formed around them.
Distributed computing is an idea that has done lots of work, and has the possibility to do a lot more. I believe that as technology advances, and the speed of computers on our desktops increases, we will see that distributed computing will become more useful, rather than fading away like an old tool that doesn't need to be used anymore. I believe this because distributed computing is infinitely scalable; the sky is the limit when it comes to how many computers you can link together. Distributed computing… just another fine example of what these things can do.
c/o Smoke Screen