Ok.... I am new to the encryption stuff and i've heard about this encryption called PGP, i've heard it is one of the best encryption things out there. How does it work? What is the math behind it?
Printable View
Ok.... I am new to the encryption stuff and i've heard about this encryption called PGP, i've heard it is one of the best encryption things out there. How does it work? What is the math behind it?
It stands for Pretty Good Privacy.
Here is more info than you'll ever want about it.
http://www.pgpi.org/doc/faq/pgpi/en/
http://www.lostvault.com/vault/docs/PGPx.htm
and
http://www.labnet.com/irg/pgp1.htm
Are a good place to start - they explains how PGP came to be and what it is. It also referances a few good books that explains it in more detail.
RRP
This is a pretty good basic description on the maths behind PGP at:
http://www.momentus.com.br/PGP/doc/howpgp.html#maths
As you can see it is all to do with prime numbers, and what happens when you multiply two of them together. This was first suggested as an encryption technique in the 1970's, but at that time there were few computers powerful enough to run this.
Today, any PC can easily run PGP.
Ok, I think I have the math down:
Encrypt: Plain text^public key % n = encrypted text
Decrypt: encrypted text ^ private key % n = plain text
but I don't get the making of the keys thing, what is the J in J(N)?
and in this: J(N) = (P-1)(Q-1) do they actually mean subtract one from the prime number you chose?
also after reading that I still don't understand how we pick the keys, could you actually make a set of keys? showing me the numbers and all?
also, forgot to ask, is there a max on the prime number you should pick before the equation just gets to big to work with? like is a prime number of 1299827 or up there ok?
The J(N) refers to Eulers quotient , aka the Jacobi symbol (hence the 'J').
You are getting into high level maths at this point, but yes, that does mean subtract 1 from each of the numbers p & q.
What happens is that PGP chooses the prime numbers p & q for you. When you see reference to a 1024 bit RSA algorithm it means that p & q are 1024 bits long, which is one hell of a big number - about 300+ (decimal) digits long. So 1299827 doesn't even come close !!
These are chosen by picking pseudo random 300+ odd digit numbers (p & q) and seeing if they are likely to be a prime. It uses the four Fermat tests to see if they are likely to be a prime. The odds of these tests getting it wrong are about 1 in 10^50 (that's a lot of zeros, and far, far less than your chance of winning the lottery!).
Any prime number will pass the four tests, but of course, most of the time the number chosen will not be prime, and fail one of the tests. So PGP has to try again, and again ... etc. until it finds one that passes. So you now have p, and the same process is repeated for q.
The reason for this not being guaranteed a prime number is that there are some numbers known as Carmichael numbers which can slip through (about 1 in 10^50).
There is no limit to how big these numbers are - it just depends on the power of your PC.
So to summarise it relies on the fact that you can make public the result of n, which equals p*q.
To break it you need to able to find out what p or q are, which for practical purposes is regarded as impossible - if the FBI want you they will try and seize your PC instead :D
The items in bold are just some things you may wish to read up on, as the maths does get very heavy at this point.
I know, as I struggled with some of it when I was at university.
EDIT: Oops: should have said you make 'e' public, not 'n' :o
Wow, ok now I don't feel stupid... i'm in 10th grade, :D . Ok, well since rsa encryption is close to impossible to crack, i'll just use the same prime numbers for each encryption, but I still don't understand the equation to get E & D.
Also, I hate to be even more of a pain... but how do you get J out of J(N) so you can have just N when you want to encrypt it? Since I don't know what to mod it by for the prime numbers.
Also Also, :), for a 1024 bit thing wouldn't that mean 128 bytes/digits? or is this 1024 thing referring to something else(when you said 300+ digits is that right?)
There is no specific equation to get D, but once you have chosen it then you can work out E.
Let's take a 'simple' worked example, where we have chosen P=47 & Q=59.
So N = 47*59 = 2773, and therefore J(N) = 46 * 58 = 2668.
Next we need to choose D which must be relatively prime to J(N) - seems weird I know, as these were not quite the numbers we started off with !!
So, in this example 157 will do.
We then use our PC to calculate a value for E using a variation of Euclids algorithm - you'll need to read up on this, as it is not straightforward, but the result is 17 in this case.
It can turn out that we cannot find a satisfactory value for E, in which case we try another value for D, and start over.
So now we have the numbers we require: E (17), which we make public, together with N (2773) & D(157) which we keep a closely guarded secret.
To give a simple example encrypting the number 920.
920^17 = 948 (mod 2773).
To decrypt, we use
948^157 = 920 (mod 2773)
Of course in real life, the numbers for N/E/D are far bigger, and the 1024 bits that RSA uses is a binary number (not a character string) i.e. about 10^308 - in other words a 300+ decimal number as you or I would write it down.
Hope this helps :)
actually that does help, thanks for the help, this was really confusing. Probably doesn't help i'm only in tenth grade, :p . But now I understand it more, thanks.
To elaborate a bit more.
The important thing here is that the values we have made public are E(17) & N(2773).
Crucially what we have not made public is J(N) i.e. 2668.
This is crucial as J(N) & D are used as seeds in Euclids algorithm to generate E.
In other words we provide it with D(157) & J(N) (2668), and most of the time it will find a satisfactory value for E - in this case 17 being a valid result (there may well be others as well!).
Incidentally, this branch of maths is called 'number theory', and some of the results you can get from it, like the PGP agorithm seem nonsensical unless you have gone through the maths yourself, and followed the proof. Showing my age here, but I first saw this published in a technical paper in 1975 when I was at university, folowed the proof with difficulty, and at the time thought 'that is really cool!'
The only mathematical link between E, N, & D is provided by J(N).
And obviously to calculate this we need to know what P & Q are in the first place!
This has stood the test of time, as in the last 30 years no one has found a mathematical way to reverse this process, and brute force doesn’t work either, as the number of primes (P) you have to guess is astronomical. Even on the fastest supercomputer on the planet you are typically looking at years of computing time for RSA 1024 bit - of course you might get lucky, but that is incredibly unlikely.
Actually, J(N) was clearly labled as Euler's Totient, aka the PHI function. phi(k) for all naturals k returns the number of naturals less than k yet relatively prime to it. Since the phi function is mulitplicate to a degree (phi(a) * phi(b) = phi(ab) iff (a,b)=1), and since we have trivially that (p,q)=1, since phi(p)=p-1, phi(q)=q-1, phi(pq)=(q-1)(p-1).
J(n) isn't jacobi's symbol, since jacobi's symbol is just a composition of legendre symbol, it needs two imputs i.e. j(n,m).
And another important point is that D is the inverse of E in cyclic group of order j(n). This basically means than in the cyclic group of order j(n), we have D*E=1.
This means that if T is your plaintext message (T^D)^E = T^DE = 1*T = T (middle step by Fermat's little theorem).
hmm... ok... I don't know if you saw earlier but I am in 10th grade and I don't get most of what you just said... could you explain it a little simpler?(sorry)
Sure.
Here is the process in just a few simple (though not completely satsifiying in explaination) steps.
1. You find two LARGE primes, p and q. We multiply these to get n (so n=p*q)
2. From this n, we calculate phi(n). (search "euler's totient"; "phi" on google)
3. We find two numbers, E and D, such that E * D divided by phi(n) leaves a remainder of 1.
4. Say T is our message in number form. to decrypt it, we take T to the Eth power. So the encrypted message is T^E.
5. To decrypt our now encrypted message, T^E, we take the whole thing to the Dth power. So we get T^E^D = T^E*D.
Now here's the tricky part. It's impossible to explain this w/o lots of number theory terminology, but trust me when I say that because of the way we made E and D, when we divivde T^E*D by n, we get a remainder of T. Thus, we have our original plaintext message back.
I know this explaination is a bit unsatsfying, but no one can give a significatly better explaination without going to the roots of a lot of the concepts in number theory. If you really want to start to understand number theory, I suggest you pick up "Introduction to the Theroy of Numbers" by Hardy and Wright. Btw, i'm in 11th grade.
Ok.... do you think that you could give me an example? Here are two prime numbers (obviously not anywhere large enough for a real 1024 bit encryption): 1299827 & 1144301. Can you please go through the steps for creating N and Creating E & D? I already have the equation for encrypting and decrypting (encrypting C=T^E % N, decrypting T=C^D % N) but can you please go through the getting of N,E, & D with those two numbers? (1299827 & 1144301). Please include all steps.
1. We get n = 1299827*1144301 = 1487393335927 .
2. The euler totient of this number phi(1487393335927) = 1487390891800. To find out how this number is calculated, search "euler's totient" or "phi euler" in google).
3. We pick an encrytion key. In this case, we will pick 2003 as our encryption number. This number MUST be relatively prime to 1487390891800, which 2003 is. Then we find its inverse by using Euclid's Algorithm (search on google). This number happens to be 1440608252667. This number will be the decrytion key.
So E=2003. We selected THIS number by finding a number relatvively prime to 1487390891800.
And D= 1440608252667, which is calculated with Euclid's algorithm from 2003 and 1487390891800.
4. Say we have a plaintest message and we want to encrypt it. Let the message be 31337.
To encrypt it, we take 31337 ^ 2003 % 1487393335927 = 1331504076685.
1331504076685 is the ENCRYPTED message.
5. To decrypt it, we take 1331504076685^D % N = 1331504076685^1440608252667 % 1487393335927= 31337. And we're done.
Just to add to this, in the case of PGP where n=p*q, and p & q are prime, then Eulers totient is always (p-1)*(q-1) i.e. (1299827-1)*(1144301-1) = 1487390891800.
It doesn't matter from a mathematical point of view whether we choose d & calculate e, or vice versa (i.e. choose e & calculate d). We use Euclids algorithm in either case.
What does matter is that we cannot reverse the process that Euclids algorithm uses i.e. given e & n, what we need to help us calculate d is j(N) i.e. (p-1)*(q-1), which is not remotely obvious if n is large enough.
Which is why the fundamental maths behind PGP relies on keeping p & q secret.
Of course, when you get down to the implementation of PGP on a PC, there are all sorts of practical problems. For example, normally p & q are stored as encrypted numbers on your PC (using a different type of encryption). Also, you cannot be absolutely sure that p & q are prime in the first case, in which case it is trivial to break.
Which is why it is called Pretty Good Privacy - in other words it is not foolproof, even from a mathematical point of view, but if used correctly, then it is about 99.9999 ... etc. % reliable.