aspose file tools*
The moose likes Security and the fly likes Double Hashing - hashing SHA1 with MD5 Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Soft Skills this week in the Jobs Discussion forum!
JavaRanch » Java Forums » Engineering » Security
Bookmark "Double Hashing - hashing SHA1 with MD5" Watch "Double Hashing - hashing SHA1 with MD5" New topic
Author

Double Hashing - hashing SHA1 with MD5

M Jay
Ranch Hand

Joined: Sep 21, 2004
Posts: 66
Greetings,

I am using Sun DS as my password store, which stores passwords in SHA1 hashed form. I need to use the retrieved password (SHA1 hashed) to generate an AES-128 key. This is not possible with SHA1 as it is 20 byte long whereas I need a 16 byte hash to generate the key. Is rehashing my SHA1 password with MD5(16 bytes) going to make the generated key less secure? Unfortunately Sun DS doesn't support MD5 hashing of passwords hence my problem.

Cheers


SCJP J2SE 1.4<br />SCBCD J2EE 1.3
Elchin Asgarli
Ranch Hand

Joined: Mar 08, 2010
Posts: 222

Hashing your SHA1 hash with MD5 will give you MD5-level security for your SHA1 hash, but your initial password will still have SHA1 protection, so I think it should be ok for you to use this approach.


Personal page, SCJP 6 with 91%, SCWCD 5 with 84%, OCMJD
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 19060
    
  40

Why not just use 16 out of the 20 bytes?

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Elchin Asgarli
Ranch Hand

Joined: Mar 08, 2010
Posts: 222

Henry Wong wrote:Why not just use 16 out of the 20 bytes?

Henry


I believe that would affect the collision resistance (thus the security), since the algorithms are designed to be collision resistant for their full length of output bytes, not just parts of them. Can't provide mathematical proofs though
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659
    
    5

Elchin Asgarli wrote:
Henry Wong wrote:Why not just use 16 out of the 20 bytes?

Can't provide mathematical proofs though

The math is described in the "Birthday paradox", google or wikipedia have links.

If you only need a limited range of hash values, just pull them out of the SHA results. You can truncate the return string, or pick a substring out of the center, makes no difference. There are many cases where a long hash value is not necessary

There is zero reason to do a MD5 of the SHA. Its bad form.
Elchin Asgarli
Ranch Hand

Joined: Mar 08, 2010
Posts: 222

Pat Farrell wrote:
Elchin Asgarli wrote:
Henry Wong wrote:Why not just use 16 out of the 20 bytes?

Can't provide mathematical proofs though

The math is described in the "Birthday paradox", google or wikipedia have links.

If you only need a limited range of hash values, just pull them out of the SHA results. You can truncate the return string, or pick a substring out of the center, makes no difference. There are many cases where a long hash value is not necessary

There is zero reason to do a MD5 of the SHA. Its bad form.


I did not get it, the Birthday Paradox is in fact contrary to what you said about getting substring from the hash. Lets say checksum is a 'birthday' of the input string(or if you think other way around, birthday is log365 bit checksum of a person), so probability of getting collision is higher than it may first seem.
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659
    
    5

Elchin Asgarli wrote:I did not get it, the Birthday Paradox is in fact contrary to what you said about getting substring from the hash. Lets say checksum is a 'birthday' of the input string(or if you think other way around, birthday is log365 bit checksum of a person), so probability of getting collision is higher than it may first seem.

No, the birthday paradox explains exactly the math.

Then you have an engineering decision, what probability of collision is important to your application? You may not need the low probability of collision that the full SHA provides. If so, you can substring the results. And you don't have to guess, you can calculate the length you need for your requirements.

Doing a MD5 of a SHA1 yields only the MD5 collision strength, and MD5 is considered too weak for serious work. Doing the two step process is a waste of execution and programmer time.

Use a SHA, these days SHA256. Then truncate to suit your needs.
James Sabre
Ranch Hand

Joined: Sep 07, 2004
Posts: 781

Pat Farrell wrote:
Elchin Asgarli wrote:I did not get it, the Birthday Paradox is in fact contrary to what you said about getting substring from the hash. Lets say checksum is a 'birthday' of the input string(or if you think other way around, birthday is log365 bit checksum of a person), so probability of getting collision is higher than it may first seem.

No, the birthday paradox explains exactly the math.

Then you have an engineering decision, what probability of collision is important to your application? You may not need the low probability of collision that the full SHA provides. If so, you can substring the results. And you don't have to guess, you can calculate the length you need for your requirements.

Doing a MD5 of a SHA1 yields only the MD5 collision strength, and MD5 is considered too weak for serious work. Doing the two step process is a waste of execution and programmer time.

Use a SHA, these days SHA256. Then truncate to suit your needs.


The birthday paradox says that there is roughly an even chance of getting a collision when you have a sample size of about the square root of the number of possible outcomes. If you use N bytes from a hash (obviously N less than or equal to the hash length) whether they are a sub-string from an MD5 or a SHA1 or a SHA2 does not matter - the number of possible outcomes is 256^N so the approximately even chance comes with 256^(N/2) samples.

Even if one takes the MD5(SHA1(SHA256(RIPE16(SHA128(messages))))) or just MD5(message) one still ends up with 16 bytes so one still has an even chance with about 256^8 samples as long as no stage in the chain results in less than 16 bytes. To illustrate this restriction, consider a very limited hashing that produced only one byte of output. This means that after this stage all input messages result in only one byte so only 256 possible values. All the following stages therefore can only produce one value for each of these 256 possible outcomes. One would there for get an even chance of a collision with about 16 input messages even though the resulting output has 16 bytes.

So, doing an MD5 of the SHA1 hash is a waste of time. If one is going to use a hash to generate the key like this (I think a more formalized PKCS12 PBE approach would be better) Pat is correct when he say use a 16 byte sub-string of a sha (sha1, sha256, sha384, sha512 etc) .












Retired horse trader.
 Note: double-underline links may be advertisements automatically added by this site and are probably not endorsed by me.
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 19060
    
  40


My original response was merely a "common sense" statement. And I didn't intend for it to get into a long discussion about "collision resistence".... but since we are here, let me thrown my two cents in.

Arguing the merits of MD5 vs SHA1 vs MD5 of SHA1 seems silly to me. It is like trying to put an extra deadbolt on the security door, when the window has a simple lock. Keep in mind that, if lucky, an average password has about 7 or 8 ascii letters. At 5 relevant bits per letter, this adds up to about 40 bits. Arguing about keeping the 160 bits vs dropping the security level down to 128 bits, is kinda silly, when the password is only about 40 relevant bits long.

IMHO, the purpose of the SHA1 is more to generate a strong password for the AES, as sha1 output generally don't have long lengths of 0's or 1's. However, if I were to attack the security, I would attack it at the weakest point, and that would be a brute force attack at the password itself -- and not the key used for the AES.

Henry
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659
    
    5

Henry Wong wrote:Arguing the merits of MD5 vs SHA1 vs MD5 of SHA1 seems silly to me.


I don't think anyone is seriously arguing for hash(hash()). And using MD5 for anything modern is just silly.

Henry Wong wrote: Keep in mind that, if lucky, an average password has about 7 or 8 ascii letters. At 5 relevant bits per letter, this adds up to about 40 bits. Arguing about keeping the 160 bits vs dropping the security level down to 128 bits, is kinda silly, when the password is only about 40 relevant bits long.


I will argue that you are being overly generous in claiming that an average password has as many as 40 bits of entropy. A recent study of leaked passwords from the social networking site RockYou's (32 million passwords) showed "one fifth of users had selected passwords which were amongst the 50,000 most common on the web." This is less than 16 bits of entropy. Further, "Nearly 300,000 of the compromised accounts used "123456" as a password, while an additional 79,078 selected "12345" and 76,790 users had "123456789" as their password. Other commonly used passwords were "password", "iloveyou", and " princess"."

In the CERT list of weak passwords, there are only about 200 entries, and these very popular passwords have essentially zero entropy.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Double Hashing - hashing SHA1 with MD5