Vova is a passionate serial entrepreneur and has been a professional full-stack developer since the age of 12. After his previous company Senexx was acquired by Gartner Inc., Vova left his position as CTO and founded RatingWidget.com. His newest company is disrupting the internet ratings and reviews industry for publishers and eCommerce by providing a ridiculously simple yet highly user-friendly star ratings solution for web developers.
PHP provides the popular md5() hash function out of the box, which returns a 32-character hexadecimal string. It’s a great way to generate a fingerprint for a string of arbitrary length. But what if you need to generate an integer fingerprint from a URL?
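As a quick illustration of that fixed-length property, here is a small sketch; the input strings are arbitrary examples and the variable name `$fingerprint` is only for illustration:

```php
<?php
// md5() always returns 32 hexadecimal characters, whatever the input length.
$fingerprint = md5('hello');

echo $fingerprint, PHP_EOL;                        // 5d41402abc4b2a76b9719d911017c592
echo strlen($fingerprint), PHP_EOL;                // 32
echo strlen(md5(str_repeat('x', 10000))), PHP_EOL; // 32
```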
We faced that challenge at RatingWidget when we had to bind our rating widgets to unique Int64 IDs based on the page each widget is loaded from. In theory we could just store the URLs and query the URL column, but URLs can be very long, and indexing a text column of unbounded length is very inefficient.
So if you are working on any kind of dynamic widget development that should load different data based on the URL it’s loaded from, this post will save you tonnes of time.
To simplify the problem, let’s divide it into two sub-challenges:
- URL Canonization
- String to unique Int64 conversion
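To make the split concrete, here is one possible shape of the two steps, as a minimal sketch rather than a definitive implementation. The function names `canonicalize_url()` and `url_to_int64()` are hypothetical, the canonicalization rules (lowercase host, no scheme, no trailing slash, sorted query parameters) are deliberately simplistic assumptions, and the code assumes a 64-bit PHP build.

```php
<?php
// Hypothetical helper: reduce equivalent URLs to one simplified canonical form.
// Assumption: scheme, host case, trailing slash and parameter order are irrelevant.
function canonicalize_url(string $url): string
{
    $parts = parse_url(trim($url));
    if ($parts === false || !isset($parts['host'])) {
        throw new InvalidArgumentException("Cannot parse URL: $url");
    }

    $host = strtolower($parts['host']);
    $path = rtrim($parts['path'] ?? '', '/');

    // Sort query parameters so that their order does not change the result.
    $query = '';
    if (isset($parts['query'])) {
        parse_str($parts['query'], $params);
        ksort($params);
        $query = http_build_query($params);
    }

    return $host . $path . ($query !== '' ? '?' . $query : '');
}

// Hypothetical helper: turn the canonical URL into an integer fingerprint.
// Only the first 15 hex characters (60 bits) of the MD5 hash are used, so the
// value always fits in a signed 64-bit integer on a 64-bit PHP build.
function url_to_int64(string $url): int
{
    return hexdec(substr(md5(canonicalize_url($url)), 0, 15));
}

// Both spellings of the same page map to the same ID.
var_dump(url_to_int64('HTTP://Example.com/page/?b=2&a=1')
    === url_to_int64('http://example.com/page?a=1&b=2')); // bool(true)
```

Because the hash is truncated to 60 bits, collisions are extremely unlikely but not impossible, so "unique" here means unique in practice rather than guaranteed.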