The Future of Imagery? Content Aware Image Resizing

Every now and again something comes along that just makes you go ‘wow’. I think this is one of those moments.

Last week Shai Avidan and Ariel Shamir demonstrated their new ‘Content Aware Image Resizing’ research for the first time, as seen in this YouTube clip. The demonstration does a far better job of explaining it than I can, but the executive summary goes something like this:

Currently we have two methods of presenting photographic imagery within a liquid (resizable) layout. Most of the time we crop our image to a size that we like and then lock it to display at those exact dimensions, allowing the text to flow and wrap around it when resizing occurs.

It’s also possible (though not common) to set the image size as a percentage of the page width, allowing it to scale with the page. Of course, this inevitably introduces scaling artifacts, distortion and noise at any size other than the image’s native dimensions.

‘Content Aware Image Resizing’ (CAIR) takes a completely different tack. If we scale our image width down by 1 pixel, rather than removing a random vertical column of pixels, the CAIR process determines an often winding ‘path of least information’ from top to bottom, which it then removes from display. Visually important areas that are dense with detail (like people, faces and text) are left virtually untouched, while ‘low data density areas’ like clear skies, grass and concrete are carefully trimmed away.
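To make the ‘path of least information’ idea a little more concrete, here is a minimal Python sketch of how such a path might be found and removed. It is my own illustration, not the researchers’ code: the ‘information’ measure (the absolute difference between neighbouring pixels), the tiny greyscale image and all function names are assumptions chosen to keep the example short.

```python
from typing import List

Image = List[List[int]]  # greyscale image: rows of pixel intensities


def energy_map(img: Image) -> List[List[int]]:
    """Assumed 'information' measure: |left - right| + |up - down|, clamped at the borders."""
    h, w = len(img), len(img[0])
    energy = []
    for y in range(h):
        row = []
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            row.append(abs(right - left) + abs(down - up))
        energy.append(row)
    return energy


def find_vertical_seam(energy: List[List[int]]) -> List[int]:
    """Return one column index per row: the connected top-to-bottom path with the lowest total energy."""
    h, w = len(energy), len(energy[0])
    # cost[y][x] = cheapest total energy of any path from the top row down to (y, x)
    cost = [energy[0][:]]
    for y in range(1, h):
        prev = cost[-1]
        cost.append([energy[y][x] + min(prev[max(x - 1, 0):min(x + 2, w)])
                     for x in range(w)])
    # Walk back up from the cheapest bottom pixel, moving at most one column per row
    x = min(range(w), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        x = min(range(max(x - 1, 0), min(x + 2, w)), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img: Image, seam: List[int]) -> Image:
    """Drop one pixel from every row, following the seam."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def shrink_width(img: Image, pixels: int) -> Image:
    """Narrow the image one seam at a time, recomputing the energy after each removal."""
    for _ in range(pixels):
        img = remove_vertical_seam(img, find_vertical_seam(energy_map(img)))
    return img


# Flat 'sky' on the left, busy detail on the right
photo = [
    [10, 10, 10, 200, 50],
    [10, 10, 10, 50, 200],
    [10, 10, 10, 200, 50],
]
print(shrink_width(photo, 2))
# [[10, 200, 50], [10, 50, 200], [10, 200, 50]]
```

Two columns of flat ‘sky’ disappear and the detailed columns on the right survive intact, which is the behaviour described above. A real implementation would work on colour images and would need to be far faster than this.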

The concept works similarly when scaling images up too. Critical picture data is protected while new image data is generated in the low data areas. The process also allows users to manually ‘tag’ areas of the image as protected, so they won’t be touched by the processing algorithms.
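Here is a rough Python sketch of how those two ideas could work, assuming a low-information path has already been found and is given as one column index per row. The penalty value, the averaging rule for the new pixel and the function names are all my own assumptions for illustration, not details taken from the demonstration.

```python
from typing import List

# Assumed value: large enough to outweigh any realistic 'information' score
PROTECTED_PENALTY = 10**6


def protect(energy: List[List[int]], mask: List[List[bool]]) -> List[List[int]]:
    """Raise the score of tagged pixels so a lowest-information path steers around them."""
    return [
        [e + PROTECTED_PENALTY if tagged else e for e, tagged in zip(e_row, m_row)]
        for e_row, m_row in zip(energy, mask)
    ]


def insert_vertical_seam(img: List[List[int]], seam: List[int]) -> List[List[int]]:
    """Widen the image by one pixel per row.

    Next to each seam pixel, insert the average of that pixel and its
    right-hand neighbour (assumed blending rule).
    """
    widened = []
    for row, x in zip(img, seam):
        right = row[min(x + 1, len(row) - 1)]
        new_pixel = (row[x] + right) // 2
        widened.append(row[:x + 1] + [new_pixel] + row[x + 1:])
    return widened


# Tag the top-left pixel as protected
print(protect([[0, 5], [3, 1]], [[True, False], [False, False]]))
# [[1000000, 5], [3, 1]]

# Grow a 3-pixel-wide image to 4 pixels along the path [0, 1]
print(insert_vertical_seam([[10, 20, 200], [10, 30, 200]], [0, 1]))
# [[10, 15, 20, 200], [10, 30, 115, 200]]
```

The point of the design is that protection needs no special case in the path-finding step: a tagged area simply looks like an extremely detailed one.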

Amazing and impressive stuff.

It does raise some interesting questions though.

a). Is it possible we could be seeing this technology make it through to our desktops sooner rather than later?

Although it’s likely the team behind the idea want to see some return on their time and money investment, selling it to ‘Joe Sixpack’ consumers seems like a hard ask to me. However, if they were able to produce a set of free extensions or plug-ins for Firefox, IE7, Safari and Opera, they would quickly create a viable userbase for the technology. Companies like Adobe would then want to license the technology to allow their Photoshop users to mark up images with ‘resize-protected’ areas. If all goes to plan, down the track browsers may render the CAIR images natively.

I’m not sure what the limitations are, but it seems to me it may be possible to encode this ‘resize-protected’ data within the existing PNG32 format, allowing users with the CAIR plug-in to get intelligent resizing, while others see a garden-variety PNG. Fireworks certainly seems to encode a lot of data into its PNGs, and that data is ignored by all other apps.
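As a rough illustration of the PNG idea, here is a Python sketch that tucks a ‘resize-protected’ rectangle into an extra chunk of an existing PNG. PNG decoders skip ancillary chunks they do not recognise, so an ordinary viewer should still show the same picture. The chunk name `caIr` and the payload layout (four big-endian integers: x, y, width, height) are entirely hypothetical; no such chunk is registered or used by any real tool that I know of.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Hypothetical chunk name. The lowercase first letter marks it ancillary
# (skippable); the lowercase second letter marks it private.
CHUNK_TYPE = b"caIr"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, then a CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def add_protection_chunk(png: bytes, payload: bytes) -> bytes:
    """Insert the hypothetical chunk directly after IHDR, which is always the first chunk."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    # 8-byte signature + IHDR chunk (4 length + 4 type + 13 data + 4 CRC)
    ihdr_end = 8 + 4 + 4 + 13 + 4
    return png[:ihdr_end] + make_chunk(CHUNK_TYPE, payload) + png[ihdr_end:]


def read_chunks(png: bytes):
    """Yield (type, data) for every chunk in the file."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


# Assumed payload layout: protect the rectangle at (10, 20), 100 wide and 50 high
payload = struct.pack(">4I", 10, 20, 100, 50)
```

A CAIR-aware plug-in could call something like `read_chunks` and look for `caIr`, while every other application would carry on as if the chunk were not there.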

Alternatively, Kevin has theorized that it might be more productive for them to write their image processor into an SWF and allow developers to license and deploy it on a site-by-site basis. I think ideally I’d prefer to have a few big, important customers like Adobe and Microsoft rather than thousands of smaller customers.

It will be interesting to see what they have planned.

b). Would you be happy to have photo journalism from a news site like CNN or BBC ‘edited’ by a technology like CAIR?

Of course, when any human editor crops a photo, he or she is making an editorial change to the content of the image (leaving some bits in, cropping others out) and we all accept that as part of the process of journalism.

But with CAIR, we effectively have a ‘machine’ making an editorial decision on the story being told. It is adding or removing data that it judges to be of less value, but often for us it is the spaces between that tell the story. For instance:

  • How close were those war protesters to the riot police when the violence broke out?
  • Was the President a little too close and familiar with that pretty intern in the crowd?
  • Was that famous Brazilian striker really offside when he scored in the World Cup final?

Your answer to those questions may well vary depending on the resolution of your monitor.

Interesting times.

Shai Avidan has since been employed by Adobe in their Newton, Massachusetts office.

  • http://www.solidmedia.net.au solidmedia

    I literally said wow at the end of that video.

    Nice – hope it gets implemented sooner rather than later.

  • http://www.tyssendesign.com.au Tyssen

    Would you be happy to have photo journalism from a news site like CNN or BBC ‘edited’ by a technology like CAIR?

    In cases where news sites did present scalable layouts where they knew that certain images might be changed by this sort of process, I’d think they’d provide a link to a page where you could view the image without any manipulation as well.

  • http://dtracorp.com dtra

    i guess it makes sense for logos and the like, but for photos i don’t think it works. cropping is one thing, but if this starts cutting middle sections from photos, then you’re losing a lot. that’s what thumbnails are for, i believe

    looks like a very cool application though
    maybe good for mad fold outs :D

  • http://www.sitepoint.com AlexW

    Looks like Shai Avidan has been snapped up by Adobe. It would be interesting to know exactly what they want him to work on.

    I guess ‘if an image resizes in the forest, but nobody sees it, did it really resize?’ Or, in plainer language, if a user can only see an image resize inside Photoshop, how valuable is the concept?

    Will we see Adobe venturing into the browser plugin market?

  • ScottX

    wow. i mean WOW.

    hmm, this is going to bring in a whole new age of forged images… like in the surfer bit… want to get the ex girlf out of old pictures? no problem… :p

  • MikesBarto2002

    But how does this fit in with copyright laws? I would think if, for example, you are resizing in this way, things would no longer take on their original shape. Also, if you are cropping people out of photos, that is removing the art behind a photographer’s photo. Is this even legal?

  • http://www.sitepoint.com AlexW

    hmm, this is going to bring in a whole new age of forged images… like in the surfer bit… want to get the ex girlf out of old pictures? no problem… :p

    I guess we’ve had control of the content of our own images for a long time. To some extent, this takes away some of that control and gives it to individual browsers.

    But how does this fit in with copyright laws? I would think if, for example, you are resizing in this way, things would no longer take on their original shape. Also, if you are cropping people out of photos, that is removing the art behind a photographer’s photo. Is this even legal?

    Great question. This is cutting-edge image technology right now, so there’s little likelihood the law will even recognise it’s an issue in the next five years. I think if I were Reuters Images, licensing news images to other news sources, I’d want some kind of tagging in my images that told browsers ‘No CAIR resizing’.

  • Frank Quist

    Of course, when any human editor crops a photo, he or she is making an editorial change to the content of the image (leaving some bits in, cropping others out) and we all accept that as part of the process of journalism.

    This is a different kind of editorial decision, though. Yes, bits are left out, but none of the bits that are left in are actually manipulated. While the viewer can be fooled by leaving out important bits, every bit that is left in can be trusted. Unlike others, this trend gave me little excitement. It just seems an odd, quirky technique that would be useful only in things like logos, ads, copywriting, etc. (basically anything non-photo, where no photo imagery is manipulated). When applied to photos (where most of the wow factor probably comes from) it just leaves me with a sour taste in my mouth that we will have even less trust in what we see (the threat here coming not from the technique per se but from the number of people who seem to find it unobjectionable, unlike other manipulation techniques).

    Should be a niche technique useful in copywriting companies, but make it stay outside of journalism, please.

  • Serge

    I have created an experimental implementation of the “seam carving” method. You may download it here: http://www.intuimage.com

  • RC

    It would be great to put images into portable devices that scale so we can see the whole picture in context, eliminating the need for scrolling.
    Great tool!

  • Irmgard

    Hi,
    If you are looking for software to try out seam carving, take a look at http://www.thegedanken.com/retarget

    The program that you can download there (for Windows and Linux, and free) is already highly optimized concerning speed, and apart from enlarging or decreasing image size you can also use masks to protect or delete certain parts of your image.

    Have fun,
    Irmgard

  • http://www.sitepoint.com AlexW

    Wow, that is impressive, Irmgard!

    I’ve seen a few implementations since this first appeared, but that’s easily the best so far. Great work.

  • Will

    Check out rsizr.com for a Flash-based implementation of seam carving that lets you resize your own images, both in height and width simultaneously, in real time. (You can rescale and crop images too!)

    http://rsizr.com/about/gallery/ for example images