I’m updating a site which is still using md5 hashing for passwords. If I change the hash method to something newer like password_hash(), can I convert existing md5 passwords in the database to use this? I was thinking I could just make a script to do this, but I can’t “un-hash” the md5 and then re-hash and save back to the database. I suppose by nature, you are not supposed to be able to un-hash them, only compare hashed versions.
So how should I go about this? Do I need to get all registered people to create new passwords to use the newer hash method?
You could force them all to reset their passwords.
You could simply run the existing md5 hashes through password_hash, and flag those accounts with some arbitrary value (legacy=1) so that your system knows to apply md5 first and then the password_hash check when verifying them.
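A minimal sketch of that flag idea, assuming each user row is an array with hypothetical `password` and `legacy` keys (the function names are made up for illustration, and loading and saving rows is left out):

```php
<?php
// One-off migration step: wrap the stored MD5 hex digest in password_hash()
// and mark the row as legacy. The row layout is an assumption.
function wrapLegacyHash(array $row): array
{
    $row['password'] = password_hash($row['password'], PASSWORD_DEFAULT);
    $row['legacy'] = 1;
    return $row;
}

// Login check: legacy rows need md5() applied to the submitted password
// before password_verify(), because the stored value is hash(md5(password)).
function verifyPassword(string $plain, array $row): bool
{
    $candidate = $row['legacy'] ? md5($plain) : $plain;
    return password_verify($candidate, $row['password']);
}
```

With this, no plain md5 hash stays in the table after the migration script has run, even for users who never log in again.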
I was thinking I could add a field to the database to identify who has or has not updated their password. When they log in, those who have not are asked to do so, and that status is changed when they do.
Another approach is to first check the submitted password against the stored value using the new method. Use password_verify for this, since password_hash generates a fresh salt on every call and its output can’t be compared directly. If it matches what is in your database then great.
If not then md5 it and check again. If the md5 matches then update the stored password with the new hashed one. You have access to the plain text password at this point so it would be invisible to the user and they can keep their existing password.
If neither match then of course the password is invalid.
After a certain period of time you could then force a reset for any users that still have md5 versions. Those will be users that are probably defunct anyway.
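The login-time upgrade described above could look roughly like this. It is only a sketch: `checkLogin` is a hypothetical name, it assumes the legacy values are lowercase hex md5 digests, and writing the new hash back to the database is left to the caller.

```php
<?php
// Returns [bool $ok, ?string $newHash]. $newHash is non-null only when the
// stored value was a legacy MD5 digest and should now be replaced.
function checkLogin(string $plain, string $stored): array
{
    // Modern hash: password_verify() reads the algorithm and salt from $stored.
    if (password_verify($plain, $stored)) {
        return [true, null];
    }
    // Legacy MD5: compare in constant time, then produce the upgraded hash.
    if (hash_equals($stored, md5($plain))) {
        return [true, password_hash($plain, PASSWORD_DEFAULT)];
    }
    // Neither matched, so the password is invalid.
    return [false, null];
}
```

The caller would run something like `[$ok, $newHash] = checkLogin($submitted, $row['password']);` and save `$newHash` whenever it is not null.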
That’s better, the users would be converting their passwords to the new hash simply by logging on, without even knowing or having to change anything.
I agree, @ahundiak’s method is a good one. Maybe let that script run for a predetermined amount of time, until you’d like everyone to be on a secure non-md5 password, and then issue the resets for the remainder, who are probably inactive users of the site anyway if your time period was very long.
Thanks guys. That method was also a lot easier to do, less code. I did not have to go and make an extra “new password” form. It is done, I just need to test it now.
Personally, I prefer the flag approach (such as legacy=1) rather than the trial-and-error approach, especially if we’re going to strengthen old passwords by rehashing the MD5 value. I wrote a post a while ago that explains the process.
True. There will still be insecure md5 passwords lingering on, which defeats the object of the update.
I think I will go for a hybrid approach: re-hash the existing md5-hashed entries like you say, but still use ahundiak’s idea of updating to the new system on the next login. That way I don’t have to deal with double-hashed entries in every script that requires a password. I’ll just do it in the login script.
What I did was to add a specific value to the front of the MD5 hashes so the system could identify them as MD5 (equivalent to the legacy flag but without the extra field). The code then automatically replaced the hash with the new version after validating the password against the old hash the next time each person logged in. It is completely invisible to the users, and the only hashes not updated are those where the person hasn’t logged in since the code change.
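For illustration, a prefix check along those lines might be sketched like this. The `$md5$` marker and the function name are assumptions made up for the example; the post does not say which value was actually used.

```php
<?php
// Hypothetical marker placed in front of legacy MD5 digests.
// Single quotes keep the dollar signs literal.
const LEGACY_PREFIX = '$md5$';

// Returns [bool $ok, ?string $newHash]; $newHash is set when a legacy
// hash validated and should be replaced in the database by the caller.
function checkPrefixed(string $plain, string $stored): array
{
    if (strncmp($stored, LEGACY_PREFIX, strlen(LEGACY_PREFIX)) === 0) {
        // Strip the marker and compare against md5() of the submitted password.
        $md5 = substr($stored, strlen(LEGACY_PREFIX));
        if (!hash_equals($md5, md5($plain))) {
            return [false, null];
        }
        return [true, password_hash($plain, PASSWORD_DEFAULT)];
    }
    // No marker: treat the value as a password_hash() result.
    return [password_verify($plain, $stored), null];
}
```

Unlike the try-both approach, the marker tells the code up front which check to run, so only one comparison is made per login.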
I like it. No need for the extra table column… that I already created.
Frankly, I think you’re better off with the extra column. If you find yourself doing string joining on write and splitting on read – or really any form of multi-value serialization – just to avoid an extra column, then I think you’re missing the point of columns. You’re supposed to store separate values in separate columns. If you serialize multiple values to fit into one column, then you’re doing extra work just to avoid using the right tool for the job.
Well, I already have the column, so I won’t change direction at this point.
So why does password_hash store the type of hash, the salt and the hash itself all in the one field? Shouldn’t password_hash be using three separate fields to store that information instead of concatenating them together into a single value?
In what way is placing a value on the front of an MD5 hash to indicate that it is an MD5 hash different from password_hash placing a code on the front to indicate what type of hash was used?
Having a consistent way of identifying the hash used is why I put the value on the MD5 hashes - so that they at least partly match the way the other hashes are stored (although I didn’t amend the values so that they also contained the salt).
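To make the point about password_hash concrete, here is a small sketch that inspects a bcrypt hash. The password and cost value are arbitrary; the output comments assume the bcrypt algorithm is selected explicitly, as it is here.

```php
<?php
// password_hash() returns a single string that carries the algorithm
// identifier, the cost, the salt and the digest together.
$hash = password_hash('example', PASSWORD_BCRYPT, ['cost' => 10]);

// password_get_info() parses that leading identifier back out.
$info = password_get_info($hash);

echo $info['algoName'], "\n";   // bcrypt
echo strlen($hash), "\n";       // 60
echo substr($hash, 0, 7), "\n"; // $2y$10$
```

So the stored value is self-describing, which is what lets password_verify work without being told which algorithm or salt was used.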