March 25th, 2003 08:27 AM
Extremely Simple Crypto Tutorial
"Crypto," to use the all-purpose abbreviation for cryptography, cryptanalysis, and cryptology, is cool. Just plain cool. My biggest regret in life is that I never took a math class past Algebra II, so I really don't know jack about the mathematical foundations of intense crypto systems. But boy, do I respect those who do.
If you're a person who finds crypto textbooks really boring yet wants to understand this whole crypto bit in the broad sense, go read Neal Stephenson's Cryptonomicon. Sure, it's more than 900 pages of quasi-fiction, but it manages to tell a fascinating story while giving an incredible amount of insight into modern cryptography.
In this tutorial, you'll learn something or another about the common, Web-based uses for the following basic encryption techniques:
Asymmetric key-based algorithms. This method uses one key to encrypt data and a different key to decrypt the same data. You have likely heard of this technique; it is sometimes called public key/private key encryption, or something to that effect.
Symmetric key-based algorithms, or block and stream ciphers. With these cipher types, the same key is used to encrypt and decrypt your data. A block cipher separates your data into fixed-size chunks and encrypts each chunk with the key. A stream cipher encrypts the data a bit or a byte at a time, which can be smaller and faster than encrypting larger (block) chunks of data.
Hashing, or creating a digital summary of a string or file. This is the most common way to store passwords on a system, as the passwords aren't really what's stored, just a hash that can't be decrypted.
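To make the hashing idea concrete, here is a minimal sketch in Python of storing and checking a password without keeping the password itself. The use of a random salt and PBKDF2 is my own assumption about sensible practice, not something this tutorial prescribes, and the function names and iteration count are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate with the stored salt and compare the digests."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Only the salt and digest are stored; the password itself never is.
stored_salt, stored_digest = hash_password("1Am*Sh$b")
```

Checking a login then comes down to calling verify_password with the submitted password and the stored salt and digest. Nothing is ever "decrypted."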
The basic idea of key-based cryptography is that you take a chunk of data (plaintext) and scramble it up (ciphertext) so that the original information is hidden beneath a level of encryption. In theory, only the person (or machine) doing the scrambling and the recipient of the ciphertext know how to decrypt (unscramble) it, because it will have been encrypted using an agreed-upon set of keys or a specific cipher and passphrase (key).
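The scramble-and-unscramble idea can be sketched with a toy cipher in Python. This is strictly an illustration under my own assumptions: the keystream construction (SHA-256 over the key plus a counter) and the function names are hypothetical, and it should not be used to protect real data. It only shows that the same agreed-upon key turns plaintext into ciphertext and back again.

```python
import hashlib


def keystream(key, length):
    """Return `length` pseudo-random bytes derived from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Hash the key together with a running counter to get the next 32 bytes.
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_cipher(key, data):
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


key = b"agreed-upon passphrase"
plaintext = b"Meet me at the usual place."
ciphertext = xor_cipher(key, plaintext)   # scrambled
recovered = xor_cipher(key, ciphertext)   # unscrambled with the same key
```

Anyone holding the key can run the second call, and that is exactly the problem the car-key story gets at.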
This key-based method of cryptography is common, and it depends on the key being known only to the persons or machines doing the encrypting and decrypting. Think of it like a car key. The owner of the car has the key, obviously. When the owner walks away from the car, she locks it and keeps the key safely secured. No one can get into or use the car without some sort of "brute force."
The responsibility of protecting the key rests solely with the owner of the car. If the owner puts a set of keys in one of those magnetized key holders underneath the car, that's a very loose method of security. If the owner keeps the key with her at all times, even showering with it on a chain around her neck, that's a pretty good level of key security.
But say the owner's friend needs to borrow the car, so the owner passes along an extra set of keys for the friend to use. Both people can now drive the car, but the security of the key itself is compromised because someone else has it. If the friend makes copies of the key (for other people to use when the owner is out of town, say), the level of security becomes even more diluted. Eventually, the original lock-and-key security will be lost entirely, and in order to recover it, the owner will have to have new locks put on the car and new keys made.
Keys used in encrypted communications have the same problems as conventional keys: They can be lost, stolen, even bought and sold. And some can be discovered by crackers through a method called "social engineering."
Crackers don't necessarily use a serious amount of CPU cycles to crack a cipher. Most of the time, they just ask for the password from an unsuspecting technician. Or maybe they call up your receptionist "just to chat" and glean a tidbit or two of crucial information. You'd be surprised at how often this occurs.
Sometimes crackers play on the notion that most people choose passwords that are easy to crack, like any word found in a dictionary. Words like "hopscotch," "meteor," or "porcupine" may seem like nice, hard-to-guess and easy-to-remember non sequiturs, but they're all bad passwords because most password-cracking software cycles through a dictionary. If your password is anywhere in that dictionary, then say bye-bye to your sensitive data. Better passwords are alphanumeric and nonsensical, such as "1Am*Sh$b" or "BA8Hw2Lq."
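Here is a rough sketch in Python of why dictionary words make bad passwords. It assumes, purely for illustration, that the cracker has gotten hold of an unsalted SHA-256 hash of the password. The three-word list and the function names are hypothetical stand-ins for a real cracking dictionary and tool.

```python
import hashlib


def sha256_hex(word):
    """Hash a candidate password the same way the (assumed) target system did."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash, wordlist):
    """Try every word in the list; return the one whose hash matches, if any."""
    for word in wordlist:
        if sha256_hex(word) == target_hash:
            return word
    return None


# A tiny stand-in for the large dictionaries real cracking software uses.
WORDLIST = ["hopscotch", "meteor", "porcupine"]

weak_hash = sha256_hex("porcupine")     # a dictionary word
strong_hash = sha256_hex("BA8Hw2Lq")    # a nonsensical alphanumeric password

cracked_weak = dictionary_attack(weak_hash, WORDLIST)
cracked_strong = dictionary_attack(strong_hash, WORDLIST)
```

The dictionary word is found as soon as the loop reaches it, while the nonsensical password isn't in the list at all, so the attack comes back empty-handed.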
There are methods of cryptography that lean on the machine or program far more than on a separate key, but even those aren't foolproof. If the decryption program is essentially the key itself, then the machine becomes one big, concrete representation of the key, which can be stolen. For example, take the infamous Enigma machine. This machine was used by the Germans in World War II to encrypt and decrypt secret messages. Although it looked like a typewriter on steroids, the Enigma machine was not built to type plaintext. Based on a complex series of settings, wheels, and rotors, each typed letter was swapped for a different one, so as to produce the encrypted data. In this instance, the machine was in large part the key; it proved to be a very valuable piece of equipment, especially in the hands of the Allies.
Taking all of this into consideration — social engineering, careless people holding keys, encryption embedded into machines themselves — you may wonder if your sensitive data is really safe. If you keep your systems locked down, keep your private keys private, don't use an Enigma machine, and don't give your root password to your receptionist, your data is probably pretty safe. The techniques outlined in this tutorial will assist you as you attempt to reach a comfortable level of security, but be advised that these few Web-based tricks only scratch the surface of data encryption and security.
In public-key cryptography, a user has a pair of keys: public and private. As their names suggest, the private key is kept private, while the public key is distributed to other users. The owner of the private key never, ever shares the private key with anyone. The public and private keys of a particular user are related via complex mathematical structures in a way that inextricably links one key with the other. This relationship is crucial to making public/private key-based encryption work, as you will soon see.
The public key is used as the basis for encrypting a message, while the private key is necessary for the recipient to decrypt the encrypted message. Only the bearer of the private key can decrypt the message. Even the person who did the encrypting cannot decrypt the message he just encrypted, because he does not hold the private key.
OK ... let's try this again:
Suppose that Joe User has a public key and a private key. Jane User also has a public key and a private key. Joe and Jane want to send encrypted messages to each other, so they exchange public keys. Now Joe has his own private key and Jane's public key. Jane has her own private key, and Joe's public key.
Keys are kept on key rings: One ring is for private keys and another is for public keys. They are not unlike real key rings that hold your car, house, and other keys together. On Joe's public key ring, he has Jane's public key. On Jane's public key ring, she has Joe's public key. Both Joe and Jane also have private key rings, which should only ever hold their own private keys.
When Joe wants to send an encrypted message to Jane, he uses his encryption software to scramble the message based on Jane's public key. Jane receives the message, then uses her encryption software and her private key to decrypt it. Only Jane will be able to decrypt a message that has been encrypted by someone using her public key.
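The Joe-and-Jane exchange can be sketched with textbook RSA in Python. To be clear about the assumptions: the tiny primes, the exponent 17, and the function names are all hypothetical choices for illustration, and this is not how PGP or GnuPG actually work internally. Real software uses enormous keys plus padding and other safeguards. The sketch only shows that what one key scrambles, only the matching key can unscramble.

```python
def make_keypair(p, q, e=17):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; requires Python 3.8 or later
    return (n, e), (n, d)  # (public key, private key)


def encrypt(public_key, message):
    """Scramble a number smaller than n using the recipient's public key."""
    n, e = public_key
    return pow(message, e, n)


def decrypt(private_key, ciphertext):
    """Unscramble using the matching private key."""
    n, d = private_key
    return pow(ciphertext, d, n)


# Each user generates a pair and hands out only the public half.
joe_public, joe_private = make_keypair(89, 97)
jane_public, jane_private = make_keypair(61, 53)

# Joe encrypts with Jane's public key; only Jane's private key reverses it.
message = 65
ciphertext = encrypt(jane_public, message)
recovered = decrypt(jane_private, ciphertext)
```

Notice that Joe never needs Jane's private key to send her a message, and his own private key plays no part in it.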
In the early 1990s, Phil Zimmermann developed PGP, or Pretty Good Privacy, which quickly became a very popular piece of software for email and file encryption using public and private keys. Because of the United States' export regulations and the import regulations of other countries regarding encryption algorithms, however, the OpenPGP standard was developed, and the GnuPG software was built around it. Unlike PGP software, GnuPG does not use patented or restricted encryption algorithms, and thus, it has become a popular alternative to PGP.
Although US export laws were recently modified, both PGP and GnuPG will likely continue to co-exist in the developer community. In the next section, you'll learn to use either PGP or GnuPG with PHP to encrypt and send messages, so now is a good time to decide which you'd like to use. Here are some basic differences:
To use PGP commercially, you must pay a fee, while GnuPG is free for all types of uses.
GnuPG is primarily Unix-based, although a Windows version does exist. PGP has versions for Unix, Windows, and even the Mac.
Both PGP and GnuPG have some restrictions or warnings regarding export and distribution, although this problem hits PGP users harder than GnuPG users.
Both PGP and GnuPG are easy to install and subsequently use, but PGP has an extensive built-in GUI.
Take a look at both Web sites (www.pgp.com and www.gnupg.org) and decide for yourself. After determining which encryption software you want to use, follow the steps outlined in either of the following sections to learn how to set up PGP or GnuPG on your Web server and on your personal system, so you can use PHP to invoke the encryption and send your Web-based order forms and whatnot to yourself as encrypted messages.