Programming - By Alex Walker

The Future of Imagery? Content Aware Image Resizing

Every now and again something comes along that just makes you go ‘wow’. I think this is one of those moments.

Last week Shai Avidan and Ariel Shamir demonstrated their new ‘Content Aware Image Resizing’ research for the first time, as seen in this YouTube clip. The demonstration does a far better job of explaining it than I can, but the executive summary goes something like this:

Currently we have two methods of presenting photographic imagery within a liquid (resizable) layout. Most of the time we crop our image to a size that we like and then lock it to display at those exact dimensions, allowing the text to flow and wrap around it when resizing occurs.

It’s also possible (though not common) to set the image size as a percentage of the page width, allowing it to scale with the page. Of course, the browser then has to resample the image, which inevitably introduces artifacts, distortion and noise at any size other than the image’s native dimensions.

‘Content Aware Image Resizing’ (CAIR) takes a completely different tack. If we scale our image width down by 1 pixel, rather than removing a random vertical column of pixels, the CAIR process determines an often winding ‘path of least information’ from top to bottom which it then removes from display. Visually important areas that are dense with detail (like people, faces and text) are left virtually untouched while ‘low data density areas’ like clear skies, grass and concrete are carefully trimmed away.
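To make the ‘path of least information’ idea concrete, here is a minimal sketch in plain Python. It is not the researchers’ implementation: it assumes a grayscale image stored as a list of rows, uses a simple neighbor-difference measure as a stand-in for ‘information’, and the function names (energy, find_vertical_seam, remove_seam, shrink_width) are my own.

```python
def energy(img):
    """Per-pixel 'information' estimate: |horizontal diff| + |vertical diff|.

    This gradient measure is an assumption made for illustration; any
    measure of local detail could be substituted.
    """
    h, w = len(img), len(img[0])
    e = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            e[y][x] = abs(right - left) + abs(down - up)
    return e


def find_vertical_seam(e):
    """Return one column index per row: the connected top-to-bottom path
    (each step moves at most one column sideways) with the least total energy."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    # Dynamic programming: cheapest way to reach each pixel from the top row.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(img, seam):
    """Drop one pixel from every row, following the seam."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def shrink_width(img, n):
    """Narrow the image by n pixels, recomputing the energy after each removal."""
    for _ in range(n):
        img = remove_seam(img, find_vertical_seam(energy(img)))
    return img
```

Calling shrink_width(img, 2) on an image with a flat area on the left and a sharp detail on the right trims pixels from the flat area and leaves the detail alone, which is the behavior the demonstration shows.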

The concept works similarly when scaling images up. Critical picture data is protected while new image data is generated in the low data areas. The process also allows users to manually ‘tag’ areas of the image as protected, so they won’t be touched by the processing algorithms.
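Both ideas can be sketched in a few lines of Python. This is only a guess at one plausible approach, not the published method: it assumes the image, the per-pixel ‘information’ scores and the protection tags are all lists of rows, that a path (one column index per row) has already been chosen, and the names PROTECT_BONUS, protect and insert_seam are hypothetical.

```python
# Assumed value: anything far larger than a real per-pixel score will do.
PROTECT_BONUS = 10**6


def protect(energy, mask):
    """Inflate the score of every tagged pixel so that a least-information
    path will route around it whenever an untagged route exists."""
    return [[e + PROTECT_BONUS if m else e for e, m in zip(erow, mrow)]
            for erow, mrow in zip(energy, mask)]


def insert_seam(img, seam):
    """Widen a grayscale image by one pixel. Next to each pixel on the path,
    insert a new pixel averaging it with its right-hand neighbor."""
    out = []
    for row, x in zip(img, seam):
        right = row[min(x + 1, len(row) - 1)]
        new_pixel = (row[x] + right) // 2
        out.append(row[:x + 1] + [new_pixel] + row[x + 1:])
    return out
```

Because the new pixels are blends of their neighbors, they are least noticeable when the path runs through a flat area such as sky, which is why the low data areas are the natural place to grow the image.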

Amazing and impressive stuff.

It does raise some interesting questions though.

a). Is it possible we could be seeing this technology make it through to our desktops sooner rather than later?

Although it’s likely the team behind the idea want to see some return on their investment of time and money, selling it to ‘Joe Sixpack’ consumers seems like a hard ask to me. However, if they were able to produce a set of free extensions or plug-ins for Firefox, IE7, Safari and Opera, they would quickly create a viable userbase for the technology. Companies like Adobe would then want to license the technology so that Photoshop users could mark up images with ‘resize-protected’ areas. If all goes to plan, browsers may eventually render CAIR images natively.

I’m not sure what the limitations are, but it seems to me it may be possible to encode this ‘resize-protected’ data within the existing PNG32 format, allowing users with the CAIR plug-in to get intelligent resizing, while others see a garden-variety PNG. Fireworks certainly seems to encode a lot of data into its PNGs, and that data is ignored by all other apps.
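The PNG format does make this kind of piggybacking possible: a file is a sequence of typed chunks, and decoders are expected to skip ancillary chunks they don’t recognize. The sketch below, using only Python’s standard library, shows how a protection mask might ride along in a private chunk. The chunk name rpMK and the zlib-compressed mask layout are inventions of mine for illustration, not part of any CAIR specification.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Hypothetical chunk name. The lowercase first letter marks it as ancillary
# (safe for decoders to ignore), the lowercase second letter as private, and
# the uppercase last letter as unsafe to copy if an editor alters the pixels.
CHUNK_TYPE = b"rpMK"


def make_chunk(ctype, data):
    """Serialize one chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def iter_chunks(png):
    """Yield (type, data) for every chunk in a PNG byte string."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        yield ctype, png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def add_protection_chunk(png, mask_bytes):
    """Return a copy of the PNG with the mask stored just before IEND."""
    out = [PNG_SIGNATURE]
    for ctype, data in iter_chunks(png):
        if ctype == b"IEND":
            out.append(make_chunk(CHUNK_TYPE, zlib.compress(mask_bytes)))
        out.append(make_chunk(ctype, data))
    return b"".join(out)


def read_protection_chunk(png):
    """Return the stored mask bytes, or None for a garden-variety PNG."""
    for ctype, data in iter_chunks(png):
        if ctype == CHUNK_TYPE:
            return zlib.decompress(data)
    return None
```

A CAIR-aware plug-in would call something like read_protection_chunk and resize accordingly, while every other viewer would simply skip the extra chunk and show the image as usual.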

Alternatively, Kevin has theorized that it might be more productive for them to write their image processor into an SWF and allow developers to license and deploy it on a site-by-site basis. In their position, I think I’d prefer to have a few big, important customers like Adobe and Microsoft rather than thousands of smaller customers.

It will be interesting to see what they have planned.

b). Would you be happy to have photo journalism from a news site like CNN or BBC ‘edited’ by a technology like CAIR?

Of course, when any human editor crops a photo, he or she is making an editorial change to the content of the image (leaving some bits in, cropping others out) and we all accept that as part of the process of journalism.

But with CAIR, we effectively have a ‘machine’ making an editorial decision on the story being told. It is adding or removing data that it judges to be of less value, but often for us it is the spaces between that tell the story. For instance:

  • How close were those war protesters to the riot police when the violence broke out?
  • Was the President a little too close and familiar with that pretty intern in the crowd?
  • Was that famous Brazilian striker really offside when he scored in the World Cup final?

Your answer to those questions may well vary depending on the resolution of your monitor.

Interesting times.

Shai Avidan has since been employed by Adobe in their Newton, Massachusetts office.