Thread: Image Comparison

Jun 12, 2010, 02:21 #1
 Join Date
 Jun 2010
 Posts
 4
Hello all. I'm working on a PHP project for myself, and I wanted to give users the ability to upload an image and use an algorithm to find which other image it most closely resembles. I figure I need some sort of image analysis library and was wondering if anyone could shed some light on how one might do something like this. Thanks for your time!

Jun 12, 2010, 03:37 #2
 Join Date
 Jan 2006
 Location
 Gold Coast, Australia
 Posts
 123
Do you have to use PHP? Maybe look into Python. I have done something similar using the SciPy library; it was very straightforward. Not sure if a PHP library like that exists.

Jun 12, 2010, 09:37 #3
I don't know whether this is possible, at least with the GD library.
Maybe ImageMagick has some magic.

Jun 12, 2010, 11:01 #4
 Join Date
 Apr 2008
 Location
 NorthEast, UK.
 Posts
 6,111
Sure, it's possible. You iterate through each pixel and record the color at that position, thus creating an image fingerprint (map).
The fun, however, comes when you need to take into account tone/contrast differences, scale, crops, etc.
GD's imagecolorat() will give you the color of the pixel at a given position.
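A rough sketch of that fingerprint idea in Python with PIL (the function names and the downscaling step are my own choices, not something prescribed above; downscaling just keeps the fingerprint small):

```python
from PIL import Image

def fingerprint(img, size=(8, 8)):
    # Iterate through each pixel of a downscaled copy and record the
    # color at that position, building a crude map of the image.
    small = img.convert("RGB").resize(size)
    return list(small.getdata())

def difference(fp_a, fp_b):
    # Total per-channel difference between two fingerprints;
    # 0 means the downscaled images are identical.
    return sum(abs(a - b) for pa, pb in zip(fp_a, fp_b)
                          for a, b in zip(pa, pb))
```

Note that this naive map is exactly as fragile as described: a shifted crop or a contrast change moves every pixel value, so the difference blows up even for "the same" picture.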

Jun 13, 2010, 21:13 #5
 Join Date
 Jun 2010
 Posts
 4
Thanks for all the replies. I'm definitely OK with using something besides PHP.
To explain further what I'm trying to accomplish: if a user uploaded an image of, say, a red car, I was wondering if there is any algorithm out there that could tell them that the photo is closer to another unique image of that same car than to a red apple.
I know this is very complex functionality; that's why I was looking to see if there is anything I could license to do this.
Thanks for all your help and input!

Jun 13, 2010, 21:57 #6
 Join Date
 Jan 2006
 Location
 Gold Coast, Australia
 Posts
 123
If you are open to trying out a new language, I would suggest using Python for this, as there is the versatile Python Imaging Library (PIL).
Below is basic code to compare two images:
Code Python:
import math
import operator
from PIL import Image

h1 = Image.open("image1").histogram()
h2 = Image.open("image2").histogram()
rms = math.sqrt(reduce(operator.add,
                       map(lambda a, b: (a - b) ** 2, h1, h2)) / len(h1))
However, all this method does is compare the histograms of both images, so inconsistencies are bound to arise.
A bit more about using histograms for image comparisons
Another less robust but potentially faster solution is to build feature histograms for each image, and choose the image with the histogram closest to the input image's histogram. I implemented this as an undergrad, and we used 3 color histograms (red, green, and blue) and two texture histograms, direction and scale. I'll give the details below, but I should note that this only worked well for matching images VERY similar to the database images. Rescaled, rotated, or discolored images can fail with this method, but small changes like cropping won't break the algorithm.
Computing the color histograms is straightforward: just pick the range for your histogram buckets, and for each range, tally the number of pixels with a color in that range. For example, consider the "green" histogram, and suppose we choose 4 buckets for our histogram: 0–63, 64–127, 128–191, and 192–255. Then for each pixel, we look at the green value, and add a tally to the appropriate bucket. When we're done tallying, we divide each bucket total by the number of pixels in the entire image to get a normalized histogram for the green channel.
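The green-channel tally above can be sketched in a few lines of Python with NumPy and PIL (my own function name; numpy's histogram does the bucketing):

```python
import numpy as np
from PIL import Image

def green_histogram(img, n_buckets=4):
    # Tally each pixel's green value into equal-width buckets
    # (0-63, 64-127, 128-191, 192-255 when n_buckets=4),
    # then normalize by the total pixel count.
    g = np.asarray(img.convert("RGB"))[:, :, 1].ravel()
    counts, _ = np.histogram(g, bins=n_buckets, range=(0, 256))
    return counts / g.size
```

The buckets of the returned array always sum to 1, which is what makes histograms of different-sized images comparable.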
For the texture direction histogram, we started by performing edge detection on the image. Each edge point has a normal vector pointing in the direction perpendicular to the edge. We quantized the normal vector's angle into one of 6 buckets between 0 and PI (since edges have 180-degree symmetry, we converted angles between -PI and 0 to be between 0 and PI). After tallying up the number of edge points in each direction, we have an unnormalized histogram representing texture direction, which we normalized by dividing each bucket by the total number of edge points in the image.
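A minimal sketch of that direction histogram, using a simple gradient threshold as a stand-in for a real edge detector (names and threshold are assumptions of mine, not from the post):

```python
import numpy as np

def direction_histogram(gray, n_buckets=6, thresh=10.0):
    # Crude edge detection: keep pixels whose gradient magnitude
    # exceeds a threshold.
    gy, gx = np.gradient(gray.astype(float))
    edges = np.hypot(gx, gy) > thresh
    # The gradient points along the edge normal; fold angles into
    # [0, PI) to account for the 180-degree symmetry of edges.
    angles = np.mod(np.arctan2(gy, gx)[edges], np.pi)
    counts, _ = np.histogram(angles, bins=n_buckets, range=(0, np.pi))
    # Normalize by the total number of edge points.
    return counts / max(edges.sum(), 1)
```

For a vertical step edge, every edge normal is horizontal (angle 0), so all the mass lands in the first bucket.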
To compute the texture scale histogram, for each edge point, we measured the distance to the next-closest edge point with the same direction. For example, if edge point A has a direction of 45 degrees, the algorithm walks in that direction until it finds another edge point with a direction of 45 degrees (or within a reasonable deviation). After computing this distance for each edge point, we dump those values into a histogram and normalize it by dividing by the total number of edge points.
Now you have 5 histograms for each image. To compare two images, you take the absolute value of the difference between each histogram bucket, and then sum these values. For example, to compare images A and B, we would compute
|A.green_histogram.bucket_1 - B.green_histogram.bucket_1|
for each bucket in the green histogram, and repeat for the other histograms, and then sum up all the results. The smaller the result, the better the match. Repeat for all images in the database, and the match with the smallest result wins. You'd probably want to have a threshold, above which the algorithm concludes that no match was found.
Source: StackOverflow :: Image comparison - fast algorithm
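The five-histogram comparison step above can be sketched like this (function names, the example threshold, and the database layout are my own, not from the original answer):

```python
import numpy as np

def histogram_distance(hists_a, hists_b):
    # Sum of absolute bucket differences across all histograms
    # (an L1 distance); smaller means more similar.
    return sum(np.abs(np.asarray(a) - np.asarray(b)).sum()
               for a, b in zip(hists_a, hists_b))

def best_match(query, database, threshold=1.0):
    # database: list of (name, histograms) pairs.
    # The smallest distance wins, unless it exceeds the
    # no-match threshold.
    name, dist = min(((n, histogram_distance(query, h)) for n, h in database),
                     key=lambda item: item[1])
    return name if dist <= threshold else None
```

The threshold value would need tuning against real data; 1.0 here is purely illustrative.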
Method 2:
Using the PIL library (mentioned above) together with the NumPy library.
The following code will give you a number as output: the closer the number is to 0, the greater the resemblance the 2 images share.
Code Python:
import sys
import numpy
from PIL import Image

def main():
    img1 = Image.open(sys.argv[1])
    img2 = Image.open(sys.argv[2])
    if img1.size != img2.size or img1.getbands() != img2.getbands():
        return -1
    s = 0
    for band_index, band in enumerate(img1.getbands()):
        m1 = numpy.array([p[band_index] for p in img1.getdata()]).reshape(*img1.size)
        m2 = numpy.array([p[band_index] for p in img2.getdata()]).reshape(*img2.size)
        s += numpy.sum(numpy.abs(m1 - m2))
    print s

if __name__ == "__main__":
    sys.exit(main())
Method 3:
Read more about the Hough Transform
The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing.[1] The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform.
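For straight lines, the voting procedure can be sketched in plain NumPy: every edge point votes for each (rho, theta) parameter pair it could lie on, and peaks in the accumulator are line candidates. This is only a rough sketch of the idea, not a production implementation:

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    # A line is parameterised as x*cos(theta) + y*sin(theta) = rho.
    # Each edge point casts one vote per theta; local maxima in the
    # accumulator correspond to detected lines.
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag, n_theta), dtype=int)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += 1
    return acc, thetas, diag
```

A vertical line of 20 edge pixels at x = 5, for instance, produces an accumulator cell with 20 votes at rho = 5, theta = 0.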
Better than picking 100 random points is picking 100 important points. Certain parts of an image have more information than others (particularly at edges and corners), and these are the ones you'll want to use for smart image matching. Google "keypoint extraction" and "keypoint matching" and you'll find quite a few academic papers on the subject. These days, SIFT keypoints are arguably the most popular, since they can match images under different scales, rotations, and lighting. Some SIFT implementations can be found here.
One downside to keypoint matching is the running time of a naive implementation: O(n^2 m) (you might need to familiarize yourself with Big O notation if you have not done so), where n is the number of keypoints in each image, and m is the number of images in the database. Some clever algorithms might find the closest match faster, like quadtrees or binary space partitioning.
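To make the naive cost concrete, here is a hypothetical brute-force matcher over keypoint descriptor vectors; every query descriptor is compared against every database descriptor, which is exactly the n-squared scan that tree structures like k-d trees replace with a logarithmic search:

```python
import numpy as np

def match_keypoints(desc_a, desc_b):
    # desc_a: (n, d) descriptors from the query image,
    # desc_b: (m, d) descriptors from a database image.
    # Broadcasting builds all n*m pairwise difference vectors,
    # so this does n*m distance computations (the naive cost).
    diffs = desc_a[:, None, :] - desc_b[None, :, :]
    d2 = (diffs ** 2).sum(axis=2)
    return d2.argmin(axis=1)  # index in desc_b of each best match
```

In practice a real matcher would also reject ambiguous matches (e.g. when the best and second-best distances are too close), but that refinement is omitted here.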

Jun 13, 2010, 22:27 #7
 Join Date
 Jun 2010
 Posts
 4
Thanks wackyjoe. I have no problem picking up Python. It sounds like a fun weekend project.
About the code: the function getbands() is examining and then comparing the amount of red, green, and blue in each image, right?

Jun 14, 2010, 00:07 #8
 Join Date
 Jan 2006
 Location
 Gold Coast, Australia
 Posts
 123
getbands() returns a sequence of strings, one per band, representing the mode (see below) of the image. For example, if image img1 has mode "RGB", img1.getbands() will return ('R', 'G', 'B').
So the mode in this instance would be:
"RGB", which has 3 bands, with the description: true red-green-blue color, three bytes per pixel.

Jun 14, 2010, 16:02 #9
 Join Date
 Jun 2010
 Posts
 4
OK. Thanks again. I'm going to give this a try this weekend.