-
May 7th, 2005, 11:45 PM
#1
Distributed Computing Rainbow Tables
Hi,
I know there are quite a few distributed computing projects out there, but has anyone ever thought of one to compute rainbow tables? I'm wondering if there is any reason it wouldn't work. The finished tables could be available for the public via a torrent file, so you wouldn't need to worry about the cost of the bandwitdth of that many people downloading rainbow tables. I think this would be a useful project, but can anyone find any flaws in it?
Thanks.
It is better to die on your feet than to live on your knees.
-
May 8th, 2005, 02:42 AM
#2
There are tables on the net to use already and 99.999% of people have no space for such a monster and really who wants to dl multiple terabytes? (or at very least many, many gigs)
Do unto others as you would have them do unto you.
The international ban against torturing prisoners of war does not necessarily apply to suspects detained in America\'s war on terror, Attorney General John Ashcroft told a Senate oversight committee
-- true colors revealed, a brown shirt and jackboots
-
May 8th, 2005, 04:29 AM
#3
I've yet to find good rainbow tables that are free, and you'd have to be using a ridiculous amount of characters to have a terabyte-sized rainbow table. I'd definately download a 10-20gig rainbow table if available.
It is better to die on your feet than to live on your knees.
-
May 8th, 2005, 09:11 AM
#4
Jareds411
AFAIK most rainbow tables are produced using distributed computing techniques. I guess government agencies may use supercomputers?
I have seen 14 character (all keyboard characters) upper & lower case tables on sale for $800. They take up 60 Gb They were produced using distributed computing, but I am afraid that I cannot remember the exact resources that were used..............it took quite a few months.
As for "free"..................what would be the justification for providing them "free"?
-
May 8th, 2005, 11:24 AM
#5
Hi
Nihil is right - rainbow tables can be produced in distributed mode.
But Jareds411, I salute you for the idea, which you had independently
It is possible to run the generation in embarrassingly parallel mode[1],
which shows "optimal" scaling - there is no communication from one node
to the other involved. Example for 7-byte password:
1. CPU: starting with a, and check the corresponding keyspace.
2. CPU: starting with b
...
70. CPU: starting with !
Feasibility?
Let us take an alphabet with 70 symbols (alphanumerics plus some specials).
- Each CPU has to look at a keyspace 70^6 = 117 billion elements.
- Assume a MD5 calculation of about 3.5 million hashes per second (I just run that benchmark
on a standard CPU (3Ghz, 32bit) using pure MD5 (fixed passwords, no comparison,
no writing to disc), so it is an upper limit)
-> 10 hours per CPU needed. (lower limit. reallife: multiply with 4 )
-> 700 CPU hours in total needed.
-> non-compressed file: 24 byte * 117 billion = 2.6 TB
-> compressed: ~ 250 GB
I claim, a Beowulf-Cluster provides as good a performance
as a supercomputer MD5 seems awkward to optimise for
vector-computation, the memory involved in the computation
is small (ie will fit into L2 or L3). This is just a guess of course,
and the whole considerations does not involve additional I/O time)
Cheers.
[1] http://en.wikipedia.org/wiki/Embarra...rallel_problem
If the only tool you have is a hammer, you tend to see every problem as a nail.
(Abraham Maslow, Psychologist, 1908-70)
-
May 8th, 2005, 12:44 PM
#6
I've seen a couple of rainbow tables provided free, this one esspecially
http://rainbowtables.shmoo.com/
i2c
-
May 8th, 2005, 04:19 PM
#7
I don't think that the "free" ones are much more than proof of concept exercises. AFAIK they will only work with 7 character passes.
The amount of computing power and effort required to set up full blown tables makes it highly unlikely that they will be provided free.
i2c
You will notice that the tables in your link are missing the "€" symbol.
-
May 8th, 2005, 04:35 PM
#8
The justification for providing the tables for free would be that people provided their processor time for free, so why not everyone share in the rewards too?
It is better to die on your feet than to live on your knees.
-
May 8th, 2005, 05:01 PM
#9
Well,
I think that it is just too large a project. Also, the tables get very large once you get beyond the 14 character limit of Windows 9x systems. Just think about ASCII characters as well
Anyway, what use would they have, other than to law enforcement agencies or professional recovery companies?
I really cannot see anyone bothering in a home computing environment.
-
May 9th, 2005, 08:58 AM
#10
Hi
I did a few benchmarks in order to answer Jareds411 question -
I think my point of argumentation was not clear from my previous
post. Following is a "Proof of concept", as nihil pointed it out ,
which supports his conclusion.
-> Distributed computation (eg. a la seti@home) does not
make sense in a (standard) home-user environment
Considering a 1Mbit/s sustained internet connection, it takes you as
much time to download a rainbow-table as to generate one youself!
Object: rainbow table, 70-symbols alphabet, 7-chars password
Production time (MD5, 70^7 non-optimised, standard CPU) : takes 1000h to produce
Download time (high compression (theory): 256 GByte table): takes 580h to download
It is your choice: Shall I use my bandwidth or my CPU?
Note: Apart from all this, authentication mechanisms should not
depend on the "uncrackablity" of some hashes - which should not
be "available" anyway. But, well, common-user real-world just is
like that.
Cheers.
P.s. I am happy if you can point out the errors in my chain of
argumentation.
------------------------
For the interested ones:
Benchmark results (70^5 passwords, I/O added!) - because I can
(gcc=gnu cc, icc=intel compiler, mcc=microsoft compiler)
Code:
XEON 3Ghz (gcc -O3) : 15m53.080s
XEON 3Ghz (icc -O3 -axW) : 12m19.424s
OPTERON 1.8 Ghz (gcc -O3): 17m53.259s
ITANIUM 1.6 Ghz (icc -O3): 12m30.017s
ITANIUM 1.6 Ghz (gcc -O3): 14m19.163s
AMD 3Ghz (mcc) : 13m28.013s
As you can see, very different architectures result in the same performance,
if one uses non-optimised, unspecific code for the machine. From experience,
speed gains of factor 2 to 100 (!) can be achieved by simply the
- choice of an optimal data structure
- choice of an optimal order of the computation
without even making use of further advantages of SSEx-instructions! I also
started a run on an old-generation cray, but it performed too bad, so I stopped
in order not to waste precious computation time.
If the only tool you have is a hammer, you tend to see every problem as a nail.
(Abraham Maslow, Psychologist, 1908-70)
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
|