I'm making a drive mapping utility for use on computers to check their contents. I was wondering whether it would be to my advantage to use more than one hash. I was thinking of using MD5, SHA-1, and a hash engine that I designed myself (because the code for it is already written).
I was thinking about it and figured that it couldn't hurt to use more than one hash to verify the integrity of the data. Any thoughts?
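For reference, computing several hashes does not require reading the file more than once. Here is a minimal sketch in Python, assuming the standard hashlib module; the function name hash_file and its parameters are made up for illustration, and the custom hash engine is left out because its code isn't shown here.

```python
import hashlib

def hash_file(path, algorithms=("md5", "sha1"), chunk_size=65536):
    """Read the file once and feed every chunk to each hash object."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

The extra cost of a second hash is mostly CPU time, since the disk read is shared between the algorithms.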
I doubt it would be an advantage unless the two mechanisms provided separate features or degrees of reliability. As for which to use, since they're just hashes, and you have code already, I don't see any immediate downsides to using your own hashing thing so long as you're sure it's reliable. Who knows how you'd test that fully on your own, though.
Hope you end up with something good.
Oh yeah, just to add some stuff about more specifics of the program. It notes the file's path, hash, and structure. Here's a quick outline of an entry generated from a file:
256 bytes (file path)
016 bytes (MD5 hash)
020 bytes (SHA1 hash)
008 bytes (DPG hash)
064 bytes (characters)
004 bytes x 256 (counts) (max 256)
total: 1388 bytes per entry
The 64 bytes are for character entries. It is basically an array of 256 bits, with each bit corresponding to a particular character. Each 4-byte section is a 32-bit integer that shows how many instances of a given character there were in the file.
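The entry layout above could be packed roughly like this. It is only a sketch in Python using the standard struct module, and several details are assumptions: little-endian fields, a UTF-8 path padded with null bytes, one bit per byte value stored in the first 32 bytes of the 64-byte field (the rest left zero), and a placeholder of 8 zero bytes for the DPG hash because that algorithm isn't shown. The names ENTRY and build_entry are made up.

```python
import hashlib
import struct
from collections import Counter

# path, MD5, SHA-1, DPG hash, character bitmap, 256 four-byte counts
ENTRY = struct.Struct("<256s16s20s8s64s256I")  # ENTRY.size is 1388

def build_entry(path, data, dpg_hash=b"\x00" * 8):
    counts = Counter(data)            # iterating bytes yields ints 0..255
    bitmap = bytearray(64)
    for byte in counts:
        # assumed bit order: bit (byte % 8) of bitmap byte (byte // 8)
        bitmap[byte // 8] |= 1 << (byte % 8)
    # clamp each count so it fits in an unsigned 32-bit field
    count_fields = [min(counts[b], 0xFFFFFFFF) for b in range(256)]
    return ENTRY.pack(
        path.encode("utf-8"),         # struct pads short values with nulls
        hashlib.md5(data).digest(),
        hashlib.sha1(data).digest(),
        dpg_hash,                     # placeholder for the custom hash
        bytes(bitmap),
        *count_fields,
    )
```

Note that struct silently truncates a path longer than 256 bytes, so a real tool would want to check the length first.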
The entry length doesn't really matter as the database structure I'm using for this is a type of key vs data setup. You have one file with pointers into the other file that define the data regions of it.
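To illustrate the kind of key vs data setup described here, this is a rough Python sketch of one file holding fixed-size pointers (offset and length) into another file holding variable-length entries. The record format and the names INDEX_RECORD, append_entry and read_entry are assumptions for illustration, not the actual database format.

```python
import io
import struct

# Assumed pointer record: 8-byte offset plus 4-byte length, little-endian
INDEX_RECORD = struct.Struct("<QI")

def append_entry(index_file, data_file, payload):
    """Append payload to the data file and record where it landed."""
    data_file.seek(0, io.SEEK_END)
    offset = data_file.tell()
    data_file.write(payload)
    index_file.seek(0, io.SEEK_END)
    index_file.write(INDEX_RECORD.pack(offset, len(payload)))

def read_entry(index_file, data_file, n):
    """Look up the n-th pointer, then read the region it describes."""
    index_file.seek(n * INDEX_RECORD.size)
    offset, length = INDEX_RECORD.unpack(index_file.read(INDEX_RECORD.size))
    data_file.seek(offset)
    return data_file.read(length)
```

Both arguments can be real files opened in binary read/write mode or io.BytesIO objects, which makes the idea easy to try out in memory.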
UpperCell, I'd like to just use mine, but I'm not highly confident in it and I'd like to have backup hashes that will most likely be much more secure than mine. Thanks for the input though :)
Why not just use a single algorithm with a longer hash key?
If your aim is to make it harder for unauthorised modifications to intentionally generate the same hash value - that is very hard already with most algorithms.
Even with a 128-bit key, it would take a fantastically long time to find a file with the right hash value. Long enough for me not to worry about malware ingeniously constructing files with the same hash value.
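To put a rough number on that, here is a back-of-the-envelope calculation in Python. It assumes a plain brute-force search for a file matching one specific existing hash value (about 2^bits attempts on average) and an arbitrary rate of one billion hashes per second; both the rate and the function name are assumptions, and the estimate says nothing about weaknesses specific to a particular algorithm.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_years(hash_bits, hashes_per_second):
    """Expected years to hit one specific hash value by blind guessing."""
    expected_attempts = 2 ** hash_bits
    return expected_attempts / hashes_per_second / SECONDS_PER_YEAR

print(f"{brute_force_years(128, 1e9):.2e} years")
```

With those assumptions the result for 128 bits is on the order of 10^22 years.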
Couldn't you just use Tripwire? Isn't the free version of that still available? (rather than having to code something yourself)
Only logging filename, filelength and MD5 hash should be enough.
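A minimal sketch of that simpler approach in Python, assuming the standard os and hashlib modules; the name map_tree is made up, and skipping unreadable files is just one possible policy.

```python
import hashlib
import os

def map_tree(root, chunk_size=65536):
    """Yield (path, length in bytes, MD5 hex digest) for each file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            md5 = hashlib.md5()
            size = 0
            try:
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(chunk_size), b""):
                        md5.update(chunk)
                        size += len(chunk)
            except OSError:
                continue  # unreadable file (locked, permissions): skip it
            yield path, size, md5.hexdigest()
```

Comparing two such listings taken at different times shows which files were added, removed, or changed.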
After thinking about what you've said, I think I'm going to expand the program plans so that it encompasses multiple drive map options. I think I'll have the following options:
-File name (mandatory)
-MD5 hash (mandatory)
-File length (optional)
-File structure, simple (optional)
-File structure, complex (optional)
-Reported last modification time (optional)
-Database RLE compression (optional)
-Database Huffman compression (optional)