Comparing two images

This is going way out on a limb. I know that this functionality is not built into PHP itself, but I was wondering if maybe someone has some code that might be able to do it.

I am building a website that, for the most part, runs on data that is imported nightly from an external datasource. My client is a realtor and she has subscribed to a local MLS database which provides property listings and images for all of the properties currently on the market in her area.

I need to give her the ability to add new listings and update (overwrite) the content that comes through. So if she adds her property to the MLS database using the MLS interface that MLS has provided for her, she can either wait until that property gets imported to the website in the nightly cron or she can add it again to her own website.

Okay, so here’s my issue: If she adds her images to both systems (MLS and the website), then her MLS-uploaded images will probably come through during the nightly cron. Then we’ll have a duplicate of each image (not good).

So I want to be able to compare the images and not import an image if it is the same as one I already have. I can’t do a binary comparison because MLS might have resized it or resaved it and changed the data. So what I need is a way to compare the colors at different points of the two images (even if they are different sizes) to see if they are the same image.

It’s like similar_text(), but comparing images rather than text. Maybe ImageMagick can do it? Any suggestions?

ImageMagick can probably do what you want; search the users section for "compare" - there are about 14 pages of results!
http://redux.imagemagick.org/discourse-server/index.php
Quite a lot of the examples use Bash scripts which you may be able to adapt.

One of Fred's bash scripts may do what you want. I do not know a lot about how bash scripts work, but I have run one of Fred's scripts from PHP before like this:

Download the script and upload it to your server.
Change the permissions to 755 on the script
Write your php code like this using the full path to the script
<?php
// Run the script; 2>&1 redirects errors so they are captured in $array too
exec("/FULL PATH TO AUTOTRIM/autotrim.sh zelda3_border2w.png fred.png 2>&1", $array);
// Display any output or errors
echo "<pre>";
print_r($array);
echo "</pre>";
?>
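
As a variation on the steps above, here is a minimal sketch of a wrapper that also captures the exit code, so a failed script can be told apart from a successful one. The `run_script` name and its return shape are made up for illustration; `exec`, `escapeshellcmd` and `escapeshellarg` are standard PHP functions.

```php
<?php
// Hypothetical helper: runs a command and returns its output lines and exit code.
function run_script($script, array $args)
{
    $command = escapeshellcmd($script);
    foreach ($args as $arg) {
        // Quote each argument so spaces and shell metacharacters are passed literally.
        $command .= ' ' . escapeshellarg($arg);
    }
    $lines = array();
    $exit_code = -1;
    // 2>&1 merges STDERR into STDOUT so error messages are captured as well.
    exec($command . ' 2>&1', $lines, $exit_code);
    return array('lines' => $lines, 'exit_code' => $exit_code);
}

// Stand-in command for the demo; a real call would pass the full script path.
$result = run_script('echo', array('hello world'));
```

An exit code of 0 conventionally means success; anything else suggests looking at the captured lines for an error message.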

Wow, sounds interesting…

However, maybe you’re approaching this the wrong way around. Surely these images come with text describing the property. Would it not be easier to compare that instead?

I can’t really see a way of efficiently comparing images. Maybe convert all the images to the same size and then to mono?

Iterate through each image pixel by pixel, and add a 1 for black and 0 for white to a string. Store this string then compare the strings?

Maybe give it a variance?
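
A rough sketch of that idea, assuming the GD extension is available for the image step. All function names, the 16x16 grid size and the brightness threshold of 128 are arbitrary choices for illustration, not tested values. The "variance" becomes the fraction of bits that are allowed to differ.

```php
<?php
// Turn a flat list of gray values (0-255) into a string of 1s (dark) and 0s (light).
function bits_from_gray(array $gray, $threshold = 128)
{
    $bits = '';
    foreach ($gray as $value) {
        $bits .= ($value < $threshold) ? '1' : '0';
    }
    return $bits;
}

// Fraction of positions where two equal-length bit strings differ (0.0 = identical).
function bit_difference($a, $b)
{
    $n = strlen($a);
    if ($n === 0 || $n !== strlen($b)) {
        return 1.0; // treat unusable input as completely different
    }
    $diff = 0;
    for ($i = 0; $i < $n; $i++) {
        if ($a[$i] !== $b[$i]) {
            $diff++;
        }
    }
    return (float) $diff / $n;
}

// Requires GD: shrink an image file to $size x $size and read one gray value per pixel.
function gray_grid($path, $size = 16)
{
    $src = imagecreatefromstring(file_get_contents($path));
    $small = imagecreatetruecolor($size, $size);
    imagecopyresampled($small, $src, 0, 0, 0, 0, $size, $size, imagesx($src), imagesy($src));
    $gray = array();
    for ($y = 0; $y < $size; $y++) {
        for ($x = 0; $x < $size; $x++) {
            $rgb = imagecolorat($small, $x, $y);
            // Average the red, green and blue channels into one brightness value.
            $gray[] = (int) (((($rgb >> 16) & 0xFF) + (($rgb >> 8) & 0xFF) + ($rgb & 0xFF)) / 3);
        }
    }
    return $gray;
}
```

Two files could then be treated as the same picture when `bit_difference(bits_from_gray(gray_grid($a)), bits_from_gray(gray_grid($b)))` stays under some small cutoff such as 0.1, which would need tuning on real photos.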

Sorry, I’m just thinking out loud. :)

I think I found what I want to do.

First, I’ll do a comparison on the width/height ratio. If they are a completely different shape, I’ll consider the image new and import it. If they are more or less the same shape, I’ll use ImageMagick to compare them.
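
The shape check could look something like this. It is only a sketch: `same_shape` is a made-up name and the default tolerance of 0.01 is an assumption to tune. The widths and heights would come from `getimagesize()`.

```php
<?php
// True when the two width/height ratios differ by less than $tolerance.
function same_shape($w1, $h1, $w2, $h2, $tolerance = 0.01)
{
    if ($h1 <= 0 || $h2 <= 0) {
        return false; // avoid dividing by zero on broken size data
    }
    return abs($w1 / $h1 - $w2 / $h2) < $tolerance;
}
```

For example, an 800x600 image and its 400x300 thumbnail have the same ratio, while a landscape 800x600 and a portrait 600x800 do not.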

Before comparing, I’ll resize one of the images to a temporary file, because ImageMagick apparently needs both images to be the same size:

compare -metric RMSE /tmp/resized-image1.png image2.jpg /tmp/unneeded-method-output.gif

That returns

10194.3 (0.155555)

After playing with it for a while, I’ve discovered that the second number is less than 0.10 if the images are most likely the same.
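
A small sketch of how that output could be read in PHP. The function names are made up, and 0.10 is just the cutoff found by experiment above. The number in parentheses looks like the error normalized to the 0-1 range (on a Q16 build, 10194.3 / 65535 ≈ 0.155555), which is why it is the more convenient one to test.

```php
<?php
// Pull the normalized value (the number in parentheses) out of compare's RMSE output.
// Returns null when the text does not look like "10194.3 (0.155555)".
function parse_rmse($output)
{
    if (preg_match('/^\s*[0-9.e+-]+ \(([0-9.e+-]+)\)/i', $output, $m) !== 1) {
        return null;
    }
    return (float) $m[1];
}

// Images count as "probably the same" when the normalized error is under the cutoff.
function probably_same($output, $threshold = 0.10)
{
    $rmse = parse_rmse($output);
    return $rmse !== null && $rmse < $threshold;
}
```

Unparseable output (for example an error message from `compare`) is treated as "not the same", so a failed comparison leads to an import rather than a silently skipped image.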

Thanks for your help!

Check the filesize as well - this way if a new 100 x 100 image is uploaded (in place of an old 100 x 100) you’ll still know a new one has been uploaded.

Thanks for your suggestion. I would be worried about completely different images that have a similar file size, I think.

You can also use this: http://libpuzzle.pureftpd.org/project/libpuzzle

Oh that looks cool! thanks.

I tried libpuzzle, but I doubt my client’s shared hosting will let me install it. And while installing it on my dev server, it threw errors when I ran “make check”. So grudgingly I bagged that idea. I decided to do what I mentioned in my second post.

Requirements:

  • You must have ImageMagick installed on the server
  • This was only tested on Ubuntu 8.04.2, PHP 5.2.4, ImageMagick 6.3.7 02/19/08 Q16

Here’s the main class:

<?php

require_once 'process.php';

define('IMAGE_COMPARE_SHAPE_RATIO_THRESHOLD', 0.01);
define('IMAGE_COMPARE_SIMILARITY_THRESHOLD', 0.1);
define('IMAGE_COMPARE_COMPARE_PATH', 'compare');
define('IMAGE_COMPARE_CONVERT_PATH', 'convert');

class ImageCompare {

	var $errors;
	var $debug;

	function __construct() {
		$this->_reset();
	}

	function compare($image1, $image2) {
		$this->_reset();

		// Valid image files?
		if (!$this->valid_image($image1) || !$this->valid_image($image2)) {
			return $this->_error('Invalid images -- compare failed');
		}

		// Get GIS for each
		$image1_gis = getimagesize($image1);
		$image2_gis = getimagesize($image2);

		// Both the same kind of file?
		if ($image1_gis['mime'] !== $image2_gis['mime']) {
			$this->_debug('Not the same kind of file');
			return false;
		}

		// Same shape?
		$image1_ratio = $image1_gis[0]/$image1_gis[1];
		$image2_ratio = $image2_gis[0]/$image2_gis[1];
		$ratio_difference = abs($image1_ratio - $image2_ratio);
		if ($ratio_difference >= IMAGE_COMPARE_SHAPE_RATIO_THRESHOLD) {
			$this->_debug('Not the same shape. Ratios: '.$image1_ratio.' '.$image2_ratio.'; Difference: '.$ratio_difference);
			return false;
		}

		// Same content?
		$process = new Process;
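		// Quote the paths for the shell; the trailing "!" forces the exact target size so both images match
		$process->execute(IMAGE_COMPARE_CONVERT_PATH.' '.escapeshellarg($image1).' -resize '.escapeshellarg($image2_gis[0].'x'.$image2_gis[1].'!').' miff:- | '.IMAGE_COMPARE_COMPARE_PATH.' -metric rmse - '.escapeshellarg($image2).' null:');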
		$preg_match = preg_match('/^[0-9]+\.?[0-9]* \(([0-9]+\.?[0-9]*)\)/', $process->stderr, $matches);
		if ($preg_match !== 1) {
			$this->_debug('Invalid output: "'.$process->stderr.'"');
			return false;
		}
		if (floatval($matches[1]) > IMAGE_COMPARE_SIMILARITY_THRESHOLD) {
			$this->_debug('Difference above the threshold: "'.$matches[1].'"');
			return false;
		}
		return true;
	}

	function valid_image($image) {
		if (!is_file($image)) {
			return $this->_error('Image file "'.$image.'" doesn\'t exist');
		}
		if (!is_readable($image)) {
			return $this->_error('Image file "'.$image.'" not readable');
		}
		if (getimagesize($image) === false) {
			return $this->_error('Image file "'.$image.'" not valid');
		}
		return true;
	}

	function _reset() {
		$this->errors = array();
		$this->debug = array();
	}

	function _error($error) {
		$this->errors[] = $error;
		return false;
	}

	function _debug($message) {
		$this->debug[] = $message;
	}

}

Here’s a supporting class I revised for calling bash:

<?php

class Process {

	var $stdout;
	var $stderr;
	var $exit_code;

	function __construct() {
		$this->_reset();
	}

	function _reset() {
		$this->stdout = '';
		$this->stderr = '';
		$this->exit_code = '';
	}

	function execute($command) {

		// Reset myself
		$this->_reset();

		// Execute command
		$resource = proc_open($command, array(array('pipe', 'r'), array('pipe', 'w'), array('pipe', 'w')), $pipes, null, $_ENV);
		if (is_resource($resource)) {

			// Collect STDOUT
			while (!feof($pipes[1])) {
				$this->stdout .= fgets($pipes[1]);
			}
			fclose($pipes[1]);

			// Collect STDERR
			while (!feof($pipes[2])) {
				$this->stderr .= fgets($pipes[2]);
			}
			fclose($pipes[2]);

			// Collect exit code
			$this->exit_code = proc_close($resource);
		}
	}

}

Here’s the usage:

require_once 'image_compare/image_compare.php';
$image_compare = new ImageCompare;
$same = $image_compare->compare('/home/user/image1.jpg', '/home/user/image1-resized-twice-as-a-test.jpg');
var_dump($same); // bool(true)
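
To tie this back to the nightly import, here is a hedged sketch of how the comparison might be used to skip duplicates. `filter_new_images` is a hypothetical helper, not part of the class above. It accepts any callable that takes two paths and returns true for duplicates.

```php
<?php
// Return the incoming image paths that do not match any existing image.
// $is_same is any callable taking two paths and returning true for duplicates.
function filter_new_images(array $incoming, array $existing, $is_same)
{
    $new = array();
    foreach ($incoming as $candidate) {
        $duplicate = false;
        foreach ($existing as $known) {
            if (call_user_func($is_same, $candidate, $known)) {
                $duplicate = true;
                break; // one match is enough to skip this image
            }
        }
        if (!$duplicate) {
            $new[] = $candidate;
        }
    }
    return $new;
}
```

With the class above, the call could look like `filter_new_images($mls_images, $site_images, array($image_compare, 'compare'))`, where `$mls_images` and `$site_images` are assumed to be arrays of file paths for one property. Every incoming image is compared against every existing one, so this is only practical for the handful of photos attached to a single listing.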