An Introduction to Cryptography, and Common Electronic Cryptosystems – Part I
OK boys and girls, here it is… my first tutorial in about two years. I tried to approach it from a high-level, not delving too much into the specifics. However, some of you know that I tend to go off on a tangent, so…
If you slept through calculus, or ever sat there thinking to yourself that “I’ll never use this crap.”, you might want to skip my explanation of the RSA Cryptosystem…lol.
I’m by no means an expert in this stuff, so if any of my explinations seem fuzzy, I apologize in advance. If anyone finds a mistake or computational error, PLEASE post a correction. Yes nihil, this means you :p
Topics Covered in this Tutorial include:
A brief history of Cryptography
Symmetrical Key Encryption and Cryptosystems
Asymmetrical Key Encryption and Cryptosystems
I plan to follow this tutorial up with another one that goes more in depth, and possible another relating to PKI, Digital Certificates, Stenography, Escrowed Encryption, Key Management, and Encryption Technologies.
A Brief History of Cryptography
The Ancient Egyptians are the first known civilization to utilize a form of secret writing. This form of writing should be familiar to most of us: Hieroglyphics. The Egyptians began using these hieroglyphs around 3000 B.C. to conceal sacred texts in order to prevent them from being universally accessible.
The Ancient Spartans developed a method to encrypt military messages around 300 B.C. This method is known as Scytale. To use it, one simply wound a thin strip of material around a cylindrical rod of a specific length and diameter, with the material spiraling across the length of the rod. The message that was to be encrypted was then written lengthwise across the material. Additional characters could be written across the rod in other locations, to add complexity. Once unwound from the rod, the material appeared to contain nothing more than a hodgepodge of random characters. Only by winding the material around a rod identical in length and diameter could the message be decrypted.
The Romans developed one of the first known substitution ciphers around 50 B.C. Julius Caesar used this cipher to send messages Cicero during his military campaigns. This code, known as C3, worked by substituting every letter of the message with the third letter after it in the alphabet. So, A would become D, B would be E, and so on.
Around 815 A.D., The House of Wisdom was established in what is known today as Baghdad, Iraq. The studies of Mathematics and Linguistics that took place there pioneered the science of modern cryptography.
The first known encryption device was developed in 1460 by Leon Battista Alberti. This machine used two cipher disks with the alphabet scribed around the edges. One disk was placed inside the other, and by aligning a letter on the outer disk with a different one on the inner disk, a substitution cipher was created. The Confederate army used a device based upon this design to encrypt and decrypt battlefield messages during the American Civil War in the mid-1800s.
One of the first true mechanical encryption devices was the Hagelin machine, which was developed in Sweeden in the early 1920s. By improving and refining the design of the Hagelin machine, much more advanced encryption machines were developed that used rotors, making them capable of polyalphabetic substitution. Polyalphabetic substitution involves taking a base alphabet such as the modern Latin alphabet, and using substitution to encrypt it into another alphabet, such as Cyrillic. This can occur several times using multiple alphabets, and these alphabets can even consist of numeric characters or symbols. The German Enigma machine is used this method of polyalphabetic substitution to encrypt messages.
Cryptography generally involves two basic steps, Encryption and Decryption. Encryption is the process that converts a readable message, or plaintext, into an unreadable form, called ciphertext. Decryption is the process of converting ciphertext back into plain text so that the message can be disseminated.
Most electronic encryption methods use an encryption key to convert plaintext into ciphertext. This key is made up of a string of bits – random binary numbers. This string of bits makes up the code that is used to encrypt the message.
Cryptography provides four key security functions for data:
1. Confidentiality: In order to easily access the encrypted data, you must have the key. This helps to maintain the confidentiality of data.
2. Authenticity: Data can be easily authenticated by using the decryption key. If the use of the decryption key does not yield a plaintext message, the message is likely to not have come from the specified sender.
3. Integrity: It is difficult to alter an encrypted message, so integrity is maintained.
4. Nonrepudation: When you receive an encrypted message, receipts are provided so that the sender cannot deny sending the message, and the receiver cannot deny receipt.
The Importance of Encryption.
Strong encryption can be used for a myriad of purposes, from masking criminal activity to protecting military information. It can even be used as a weapon of war. In fact, the U.S. Government classifies encryption technologies as munitions and places strict export controls over them. The National Security Agency considers encryption technology crucial to national security, and the defense of the nation.
In the U.S., there are two main governmental departments that regulate the export of encryption technologies:
• The Bureau of Export Administration, part of the U.S. Department of Commerce.
• The Office of Defense Trade Controls, part of the U.S. Department of State.
In years past, the export of cryptographic technologies was strongly controlled by these organizations, which imposed difficult restrictions upon their export. Among other controls, the export of cryptographic technologies required a security background check, as well as a lengthy end-user agreement. Exports that were approved were all subject to a 30-day delay, and the most powerful encryption technologies were banned completely from export.
With the explosive growth of the Internet and e-mail communications, the demand for encryption technologies became great. Due to the tight export controls the U.S. placed on these technologies, many non-U.S. companies developed encryption technologies and began to dominate this market. Some of these encryption technologies were much more powerful than those allowed by U.S. export controls. This caused U.S. companies to lose millions of dollars each year in potential customers. This caused the Federal Government to ease export controls on cryptographic technologies in July of 2000.
Under new export regulations, U.S. companies are now allowed to export branded encryption technologies to any country within the EU, as well as eight other countries. They are also allowed to customize these technologies to meet the needs of financial and medical institutions.
Symmetric Key Encryption and Cryptosystems
Symmetric Key Encryption is a form of encryption that uses the same key to both encrypt and decrypt data. This is also known as private key or secret key encryption – the secrecy of the data depends upon the confidentiality of the key. Symmetric key encryption can be used on large amounts of data rather quickly and efficiently. But in order to decrypt the data, the key must be shared with the data recipient, leaving the key open to possible compromise. The most secure method of sharing the key is with a face-to-face exchange of the key. This makes it pretty impractical for most uses.
Symmetric encryption employs two main methods to encrypt data. One is called a block cipher, and the other is known as a stream cipher. In order to use a block cipher, a certain amount of data is required. This is usually a minimum of 64 bits. For data larger than this, the data is broken up into 64-bit blocks. For smaller amounts of data, padding is added to make the data a 64-bit block. The key is applied to create ciphertext that can then be decrypted using the same key used to encrypt the data.
There are several modes of operation that a block cipher can use. A few of these are:
• Cipher Block Chaining (CBC) – CBC encrypts plaintext in 64-bit blocks. If a pattern can be deduced from the plaintext, the resulting ciphertext will likely present a similar pattern. This can lead to compromise. To avoid this, CBC first alters each plaintext block to prevent recognition of patterns. The resulting jargon is then transformed by using an XOR function on the plaintext with a string of random initialization vectors called a seed. Once the plaintext has been seeded, the encryption key is applied using an XOR function to encrypt the data.
Once the first block of ciphertext has been completed, it is XORed with the next block of plaintext. Then the resulting data is XORed with the encryption key. This process repeats until the entire message is encrypted.
• Electronic Code Book (ECB) – ECB encrypts data in 64-bit blocks. This method is usually used to encrypt small amounts of data, such as encryption keys or seeds. It is fast, because it simply encrypts each plaintext block separately. Sucessive blocks of plaintext are not XORed with previous ciphertext blocks, resulting in each plaintext block having an identical ciphertext block. This process allows parallelization, and thus, speed is improved. This also causes a weakness, because any discernable patterns that exist in the plaintext block will also exist in the ciphertext block.
• Cipher Feedback (CFB) – CFB can be used in either block or stream ciphers. It encrypts plaintext and then XORs the resulting ciphertext back into the plaintext block, which results in the “current” ciphertext block. Any discernable patterns in the plaintext are masked by the output of the XOR, but identical plaintext and ciphertext blocks will still result in identical XOR output.
• Output Feedback (OFB) – Like CBC, OFB makes use of a seed during the encryption process. The encryption of each block is derived from the encryption of the previous block of plaintext as well. However, each IV or seed is generated independently, which prevents most errors that occur in CBC encryption.
Data Encryption Standard (DES), Blowfish, Twofish, MARS, Skipjack, RC5, and Rijndael are examples of symmetric encryption implementations that use block ciphers. Each has its own characteristics, and they are all used in a variety of applications. For example, Blowfish is the encryption method used in Secure Shell (SSH) technologies. It uses a variable key strength of up to 448-bit, and block lengths can vary also. Twofish, which is basically a lighter and faster blowfish, is commonly used in smartcards. Twofish uses 128-bit blocks with 28, 192, or 256-bit keys. MARS, like Twofish, is also used in smartcards, and uses 128-bit blocks – but uses key lengths in excess of 400-bit. Skipjack is a classified algorithm used by the NSA. It is used in clipper chips and the Fortezza token. Skipjack uses a 64-bit block, an 80-bit encryption key, and performs 32 processing rounds. RC5 uses variable block and key lengths, as well as variable processing rounds. The minimum to provide decent security using RC5 is 12 processing rounds using a 128-bit key length.
DES was created in 1972 by IBM, using the Data Encryption Algorithm. It was adopted by the U.S. Government as its standard encryption method for commercial and unclassified communications in 1977. DES begins the encryption process by using a 64-bit key. The NSA restricted the use of DES to a 56-bit key length, so DES discards 8-bits of the key and then uses the remaining key to encrypt data in 64-bit blocks. DES can operate in CBC, ECB, CFB, and OFB modes, giving it flexibility. It initially will use any one of these modes to encrypt each block, and then DES will repeat this process 16 times to produce a final ciphertext of each block. This process is fast when compared to other symmetric encryption methods. In order to mount a successful brute-force attack on 56-bit DES, you would have to try up to 256 quadrillion keys. It is estimated that an attacker with the proper resources and determination, could crack DES within 24 hours. In 1998, the supercomputer DES Cracker, assisted by 100,000 distributed PCs on the Internet, cracked DES in 22 hours. The U.S. Government has not used DES since 1998.
DES was superseded by Triple DES (3DES) in November of 1998. 3DES is exactly what it is named – it performs three iterations of DES encryption on each block. It can do this in a number of ways, but the most common method is the Minus Encrypt-Decrypt-Encrypt (-EDE) method. Each iteration of 3DES using –EDE will encrypt a block using a 56-bit key. After encryption, use a different 56-bit key to decrypt the block. On the last pass, a 56-bit key is used to encrypt the data again. This is equivalent to using a 168-bit encryption key. Another method that can be used is Minus Encrypt-Encrypt-Encrypt (-EEE). This is three successive encryptions using a different 56-bit key. There are several keying methods that 3DES uses. All three keys can be independent of each other, or the first and third keys can be identical, with the second key being unique. All three keys can also be identical, which provides the least security, but is also the fastest to encrypt with. 3DES is still approved for use by U.S. Governmental systems, but has been replaced by the Advanced Encryption Standard (AES).
AES was approved for use by the U.S, Government for the encryption of sensitive, but unclassified data in 2000. AES uses the Rijndael block cipher. Rijndael is a very resilient algorithm that has shown resistance to all known cryptographic attacks thus far. It is used in a number of implementations, including the encryption of communications and HDTV transmissions. Rijndael key and block length can be 128, 192 or 256-bits. If both the key-length and block length are 128-bit, Rijndael will perform 9 processing rounds. If the block or key is 192-bit, it performs 11 processing rounds. If either is 256-bit, Rijndael performs 13 processing rounds. These round counts do not include an extra round of encipherment performed at the end.
Each processing round involves four steps:
1. Bite Sub – Each byte of the plaintext block is replaced by its reciprocal from a substitution table that is generated before the round begins.
2. Shift Row – Each byte from the Byte Sub step is arranged in a table of rows and columns. Data in each row is incrementally shifted a certain number of columns.
3. Mix Column – Data in each column from the shift row step is multiplied by the algorithm’s matrix.
4. Add Round Key – The key for the processing round is XORed with the data.
The algorithm itself is designed to exponentially increase the time it would take to mount a brute force attack on the cipher to several billion years.
In contrast to block ciphers, stream ciphers work by applying the key to each individual bit or byte of data. RC4 is probably the best known symmetric encryption algorithm that uses a stream cipher. RC4 keys can be anywhere from 1 to 2048 bits in length. These are used to initiate a 256-byte state table. This table is used by RC4 to generate a psudo-random byte stream used to encrypt the plaintext. This stream is transformed with the plaintext data using an XOR (Exclusive OR Boolean Operator). The function will result in “True” only when a plaintext bit and its corresponding psudo-random stream bit are opposite in value. In addition, every element in the state table is swapped at least once. This is what produces the ciphertext.
Another stream cipher method is known as Password Based Encryption (PBE). PBE works by using data derived from a password to create an encryption key. A typical password does not contain enough data to be used as a key in its own right, so PBE uses two methods to generate the extra data needed – adding a salt, and using iteration counts.
A salt is simply a random value added to the password before transforming it into a key. The iteration count is the number of times the salt and password are transformed to make the key. In addition, each iteration count uses a completely different salt, adding additional complexity to the key. The A5 encryption algorithm used to encrypt GSM cellular communications is an example of a stream cipher that uses PBE. However, A5 uses a weak key of only 5 bytes, making the algorithm relatively easy to crack.
Asymmetric Key Encryption and Cryptosystems
Asymmetric Key Encryption uses two independent keys. One key is used to encrypt, the other to decrypt. This method is more popularly known as Public Key Encryption, as the encryption key can be made public without compromising the data – only the decryption key must remain secret. The encryption key is known as a public key, and the decryption key is a private key. Asymmetric encryption relies on a one-way function for its security. This is an operation that is straightforward in one direction, but extremely complex to reverse. To explain it another way, for a function f, if y = f(x), it should be easy to solve y if given x, but extremely difficult to derive x if given y.
The security of asymmetric encryption relies on the difficulty of deriving the private key from the public key. For example, let’s say you have a person’s name and address, but not their telephone number. You need their telephone number, so you get out the phone book and find it using their name and address. Now let’s flip the scenario. Now you only have their phone number, but you need their name and address. How difficult is it going to be to get their name and address from the phone book, using their telephone number? Not impossible, but very difficult.
Most of these one-way functions contain a process that will reverse the function easily. This is known as a trapdoor. It is extremely difficult to deduce the private key from the public key unless you know the trapdoor.
Asymmetric encryption is stronger than most symmetric encryption methods. This also makes it extremely slow – up to 1000 times slower. It is widely used for electronic communications, however, because anyone with a user’s public key can encrypt a message, but only the recipient can decrypt it. Another popular way to encrypt communication, such as e-mail, is to use a symmetric encryption method to encrypt the message, and use an asymmetric method to encrypt the symmetric key by using the recipient’s public key. The message and key are then packaged in a digital envelope and sent.
One popular asymmetrical cryptosystem is the Diffie-Hellman Key Exchange. This is the first asymmetric cryptosystem, and was developed in 1976. The Diffie-Hellman algorithm is based on the difficulty of finding separate logarithms in a finite field generated by a very large prime number. Let’s say that you and I want to exchange encryption keys so that we can communicate with encrypted messages. We agree to send each other a key using Diffie-Hellman Key Exchange. Both of our computers will generate a large random number. My computer doesn’t know your number, and you’re not going to give it up either. These numbers remain secret, and we’ll call these numbers “X”. The computers will then generate two additional numbers. The first one we will call “p”, and it is a random prime number. The next one we will call “g”, and it is a random number that has a value of less than “p”.
My computer will send you my “p” and “g” numbers, and vice versa. When I get your “p” and “g” numbers, my computer will use your “p” and “g” to compute Y = gX mod p. My computer then sends you the result, “Y”. The same thing happens on your end. Once we exchange “Y” with each other, our computers will use the other’s “Y” value to calculate a common value that we’ll call “K” by using the formula Y X mod p = K. This common value is the key that we will use to encrypt and decrypt messages.
El Gamal is a popular asymmetrical cryptosystem that is based on Diffie-Hellman Key Exchange.
Another popular asymmetrical cryptosystem is RSA. RSA is commonly used for encryption, authentication, and key exchange. RSA derives its name from the initials of the last name of each of its creators: Rivest, Shamir, and Addleman. It is commonly used with key strengths of 1024-bits, but its real strength relies on the prime factorization of very large numbers – RSA uses numbers well in excess of a googol long! (a googol is 1 x 10100, just in case you were wondering, and yes, that’s how you spell it…google is a search engine, not a number).
The RSA algorithm works by using two prime numbers, p and q. The following calculations are performed:
Find p, where p is a positive prime number < ; 156 digits in length.
Find q, where q ≠ p and q is a positive prime number < ; 156 digits in length.
Find the modulus n = pq
Find a number, e, where e < n, and relatively prime to (p – 1)(q – 1). This means that the e, and (p – 1)(q – 1) have no common factors except 1.
Find a number, d, where d ≠ e, such that (ed – 1) is divisible by (p – 1)(q – 1).
The values e and d are the public and private exponents, respectively. The public key is the pair (n, e) and the private key is the pair (n, d).
Where P is the plaintext and C is the ciphertext, the encryption process is:
C = P^e mod n
And the decryption process is:
P = C^d mod n
Another popular asymmetrical cryptosystem is the Merkle-Hellman Knapsack. This cryptosystem is based on the difficulty of finding a set of superincreasing numbers that add up to a given value. (A set of numbers is considered to be superincreasing when every member of the set is greater than the sum of all the previous members in the set.)
Other asymmetrical cryptosystems employ algorithms that are based on the mathematical fact that a logarithm for a particular point on a curve is harder to find than a conventional logarithm. This is known as Elliptic-curve cryptography. Elliptic-curve cryptography allows for smaller bit-length keys that are as secure as larger keys used by other asymmetrical methods, due to the extreme difficulty it takes to produce the key. Because smaller keys result in shorter processing times, these types of algorithms are commonly implemented in wireless technologies, and also in smartcards.
Hashes and Digital Signatures
Hash functions, also known as message digests, are one-way algorithms that are used to provide authentication, integrity, and nonrepudiation to data. Hashes are commonly used in e-mail communications, and also by some file systems and IDS/IPS systems. Hash functions produce a fixed length output, usually 128-bits, from an input of any length. This output is called a hash. A hash is unique to the input that produced it. Altering just one bit of the input data will produce a completely and noticeably different hash. There is no practical way to re-create the input from the hash, which is why it is considered a one-way algorithm.
Data integrity can be verified by comparing the hash produced from the received plaintext data with the hash produced by decrypting the sender’s digital signature with their public key. If the hashes match, the data has not been altered since the sender signed it. If the hashes are different, the data has been tampered with, or it may indicate that someone masquerading as the sender sent the data.
You can also use hashing on data that you do not intend to decrypt. This is common as an authentication method. For example, UNIX will create a hash from the data that a user inputs when prompted, and compare that hash with the hash stored in /etc/passwd to authenticate the user.
The most commonly used hashing algorithms in use are Message Digest 2 (MD2), Message Digest 4 (MD4), Message Digest 5 (MD5), and Secure Hashing Algorithm (SHA). Of these, SHA is the most secure. SHA is based on MD5 and was developed by the NSA for use as a National Institute of Standards and Technology (NIST) standard. It produces a hash of 160-bits using four processing rounds. The Message Digest series of algorithms were created by Ron Rivest (one of the creators of the RSA algorithm) and all produce a 128-bit hash. MD2 produces secure hashes, but is very slow. MD4 is simple and fast, but has been successfully attacked and defeated in the past. MD5 is basically an improved version of MD4 that uses additional processing rounds to increase security.
Digital Signatures make use of hashing algorithms to perform the same functions as a hash - authentication, integrity, and nonrepudiation. To produce a digital signature, the message is used to produce a hash. The message is then linked to the hash. The hash is then encrypted using the sender’s private key. Then information is added that includes the hashing algorithm among other things, such as a time stamp. This combination is the digital signature.
The digital signature is then encrypted using a randomly generated key. This random key is then encrypted using the recipient’s public key, producing a digital envelope. When the recipient gets the message, they decrypt the envelope using their private key, which reveals the random key and digital signature. The digital signature is then decrypted, and the sender’s public key is used to reveal the message, its hash, and the hashing algorithm. Using the hashing algorithm and the plaintext message, an integrity check is performed by generating a hash form the plaintext and comparing it to the hash received with the message. If they match, everything is good to go. The recipient also knows for certain that the sender’s identity is true. Remember that the sender used their private key to encrypt the hash, so only the sender’s public key can decrypt it.
That’s it for now. Thanks for letting me waste your time…lol. Again, if anyone has any comment or corrections, or wants to go into more detail about any of these topics, PLEASE, PLEASE, post.