# Thread: How to identified cryptographic algorithm used?

1. ## How to identified cryptographic algorithm used?

Hi there!
The following question is something I always questioned myself since I started learning crypto: how do you identify what algorithm has been used to encrypt a given cyphertext?

We know that different algorithms have different vulnerabilities. In order to crack the algorithm we should first know what algorithm we're dealing with right?

Is there a way to find out this directly without having to considered all major algorithms when cracking a given cyphertext?

2. Measures are usually taken into account to hide the algorithm. thats part of the security of cryptography.

3. Originally posted here by MK19
Measures are usually taken into account to hide the algorithm. thats part of the security of cryptography.
on the contrary mon ami, if you hide the algorithm, how i (as an user) can trust that the encription of my stuff is good enough?

now at the other (dark) side, looking at a (encripted) string, how can i discover the possible algorithms?

4. What you want to do is publish your algorithm so others can test it. Otherwise it's a mix between obscurity and security, and generally it won't get you anything good. Unless you're the NSA and have some of the best cryptographers/cryptologists in the world working for you, you need peer-review.

But supposing you've just gotten across a ciphertext and you'd like to break it...

First off you look at the types of characters involved. If it's alpha-numeric, you try out a bunch of things, like character frequency [different for different languages] or likelihood of coincidence. Also splitting into groups [blocks] or pattern analysis. If it's numbers, you could also try some of these things if you suppose it's a transposition cipher.

If it's binary data it'll be hard [I know everything is binary, but I mean something outside of characters and numbers] because you have to take into account complicated mathematical modelling that can only have an application in computers etc.

Some algorithms also leave a 'mark' of what they are. If the data is valuable, believe me that a cryptanalyst would spend time trying out possibilities of multiple ciphers/algorithms. Even a polyalphabetic substitution cipher can have you coding for days if you have no clue about what you're doing, but in general cryptanalysts know where to start looking.

Read Bruce Schneier's "Applied Cryptography" if you're really interested in the subject. I just got it in the mail yesterday [after repeatedly browsing it in bookstores and libraries] and it's an excellent read.

5. In short, the point of a great (computer encryption) algorithm is to make random data so that you can't make anything meaningful from it. So, given an arbitrary and perfectly encryted string you would not be able to know what algorithm was used to encrypt it, or to decrypt it. You would have no idea what the original message is. I haven't been able to devote the time to research and try to learn much otherwise (I don't know what information is avaliable about the topic), so the information in this post takes the short above statement to be fact. I won't be able to help you much in your original question -- I can only help if an algorithm is known.

Today, we are relying pretty much entirely on keyspace to protect our data. We are freely publishing the algorithm we've encrypted stuff with, and name it *.MD5 or *.SHA. (Note: Those are hashing-algorithms, but I've used them to show that we give things descriptive names telling everyone what was done in it) With public key crypto, we're including a public key that usually works great with a paticular algorithm (PGP DH) but might not work for something else (RSA) and vice-versa. In this environment, we know what algorithm is used, and we simply have a huge keyspace to brute-force.

What I understand as related to your quesion of getting an arbitrarly encrypted string:

As mentioned earlier, (modern computer-based) encryption aims to produce a nearly random jumble of characters that don't mean anything unless you know something about them beforehand. This randomizing is done by compressing the data to be encrypted, which removes non-random elements and leaves a nearly-random jumble of characters to be encrypted. (Compression removes non-random elements; completely random values cannot be compressed further) From here, no matter how crappy an encryption algorithm is, the output will remain generally random. So in a proper algorithm and implementation there is no character-analysis to speak of, since you can't tell what random numbers are.

If an algorithm is known, the job is much easier. (I can't really give much insight into how to find the algorithm of a well encrypted string since I've never been exposed to anything talking about it) For example, with XOR, unless a one-time-pad method is used, the key will repeat many times to completely encrypt a file. If you encrypt something 512bits long with a 128bit key, the key will have to be repeated 4 times. When it repeats, and we know where it repeats, we can compare different parts of the data accurately, even if we don't know what the data is. Once part of the key is cracked, in XOR encryption, everytime that part is repeated, the message at that part is successfully decrypted. So if we know the 32bits of the 128bit key, and the key repeats 4 times, we have 32*4 = 128bits of the file successfully decrypted. Thus, unless a one-time-pad is used (a key that does not repeat since it is as long as the original file and is random) cracking regular XOR algorithms is pretty easy. Combined with the right type of compression to randomize the data though, XOR is nearly impossible to crack...

Hopefully whatever I said above was of some use. I need a break because I think that I may have confused you with stuff you weren't specificially interested in with your original question.

-Tim_axe

6. well some programs are weaker written in C and C++ which could be broken

If u use powerfull algo like RSA then it is little difficult, and with in an week i am writing an tut on the encryption.

7. Some algorithms also leave a 'mark' of what they are.
is about that "mark" im talking about. On the book that you ve mencioned i could find nothing.

Tim had mencioned that once algorithm is known (if i understood correctly) we can proceed with decyphering.

But im still looking a way to identify to algorithm on this "enviroment":

1) i have a 1024 bytes message

2) its encrypted

3) i have no idea about its contents, (plain text or binary one, language, objectives, nothing)

4) i dont know which algorithm was used

5) i need to see its contents!

what is the "path" that a cryptoanalist follows? That is my question.

8. I'd like to point out incase I forgot to last time that I'm not a cryptanalyst although the topic interests me a lot and I try to learn what I can while applying my prior math knowledge to what I'm learning. So I could be terriably wrong, but I think I have something to stand up on

I'd think that if you had some clue what the message is, and you have enough different messages that are encrypted the same, you'd be able to somehow get the right message. At least for symetric algorithms.

I say this for the following reasons (strictly in symetric cyphers):
• If we know the messages are sensible, that is we know it will be english text or whatever else we know it to be, then the encrypted message will be that when it is decrypted, and if it isn't then we probably decrypted it wrong
• If the different messages are encrypted with the same algorithm/key, then there should be at least one algorithm/key that will successfully decrypt all of the messages, and that one should be the correct algorith/key.
• If the algorithm/key is wrong, then some of the messages will be gibberish. (see first point)

Basically, when we encrypt a file with an algorithm and a key, that same algorithm and key will decrypt any message that was encrypted the same way.

So if we encrypt "Hello, world" and "AntiOnline" with our algorithm and some key, then that angorithm's decryption function and key will give us the original messages "Hello, world" and "AntiOnline". The wrong algorithm and key might give us "83lawpa%sk8s" and "xp3#ln@nr9". It could have been what the person encrypted, but since we are under the assumption that they encrypted some form of English text, these random characters probably mean we decrypted it wrong and should try again. When both appear to be some form of the English language, we might have the right algorithm and key.

But we might also get the wrong messages "What's up yo" and "Follow now" that fit our rule of being part of the english language. The unfortunate fact is that this message meets our criteria for being the "right message" so we can't be sure which one is real.

But if we found a 3rd encrypted message, and what we used to get "What's up yo" gives us some random output, then we can eliminate what appears to be wrong and possibly even realize that the real algorithm gives us the messages "Hello, World", "AntiOnline", and "Polkadot".

In summary:

If we have enough messages, then statistically we should be able to narrow down the possible algorithms until we are down to a single set of possible messages from the algorithm we've created. But a single 1024 byte message probably isn't enough for this to work well. We need multiple different messages encrypted the same.

9. So, Tim, your argument (that appears to be correct - for me at least ) leads me that the best encription schema is:

send cripto messages with one time passwords.

since cripto analyst needs more "volume" to get the key, if i change it on every msg no one will get it!
BUT
it leads me also to "i can use a weak algorithm, since i m sending just one msg no one will get"
and i think that is wrong.
crypto analyst will "see" (i think) that is a weak algorithm with a one time password - so it can be broken by brute force. like a DES algorithm. but how it can see?

10. The One Time Pad (in XOR algorithms), where the key is the same length as what you are encrypting and it is completely random and non-repeating, is considered to be imposible to crack (because there is no way to know if you got the right message or not - all combinations are equally possible). Reusing the key of course nulls the security of it.

I'm pretty sure it is the volume of messages that is needed for a cryptoanalyst to understand how to read the message. But it *might* be possible for a cryptoanalyst to learn about the algorithm even if you switch keys. I think it would even make their job easier if you encrypt the same message with a different key. But I don't know how they would know it is the same message since when it is encrypted it will look different...

The problem with someone finding out if the algoritm is "weak" or "strong" is that encrypted data looks pretty much the same - a random jumble of characters and symbols. This is why even "snake oil" can look like strong encryption, when it really isn't. But once the cryptoanalysit knows and understands the algorithm, they can work on brute forcing keys for it. For example, DES was promoted by the government as "secure". All of their estimates showed it would take many thousands of computers many thousands of years to decrypt a simple message. In reality, the government knew it was not nearly as secure as they said, and a computer was later put together for under \$150,000 that cracked it in a week (although it was put together near the end of DES's 20 year life). Special purpose hardware was created that optimized the decryption process well beyond what could be done in a normal CPU at the time, because DES was very slow in software. So they knew what they needed to do to crack it because the algorithm was avaliable, but without the algorithm I'm not sure how they'd have done it...

Also, I should probably mention this again, the algorithm must be able to withstand scrutiny from the cryptographic community for many years before being accepted as "strong/secure" for the time being. They're not interested in seeing "try to crack this" type posts on AO, but would probably much rather see solid mathimatical basis for an algorithm, and be able to disect it and see all of the code involved. If you are trying to write something secure and haven't gone to any college-level math classes or don't provide the full algorithm, I don't think they're too interested in what you're doing...

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•