Sitemap to index Full size images and not Thumbnails

sitemap
xml

#1

Dear Sitepoint Community,

I have a page full of thumbnails; clicking one opens the full-size image.
I need to make a sitemap to help Google index the original images and not the thumbnails. I know how to make a sitemap for images on a page, but how can I achieve it the other way round?
This is what my code looks like:

<div class="thumb">
  <a href="images/wallpaper.jpg">
     <img src="images/wallpaper-small.jpg" alt="wallpaper-description">
  </a>
</div>

Moreover, when someone searches Google Images for the alt text of a thumbnail, I need the result to lead to the original image.

Any suggestions or possible solutions? Thank you.


#2

Have you read Google's guidelines for image sitemaps?

https://support.google.com/webmasters/answer/178636?hl=en


#3

@TechnoBear Yes I did. It's pretty straightforward to make a sitemap for images visible on the page, the thumbnails in my case. But how do I add the full-size images to the sitemap and exclude the thumbnails from indexing?


#4

How is the page populated with thumbnails and links to the full-size images?
Is it a static page made manually, or is it generated dynamically from a database and back-end scripting?


#5

Are you saying that Google is currently indexing your thumbnails but not your large images?

I have a thumbnail gallery on one site, with alt text set on the thumbnails. If I search Google images for that alt text, it's the large image which is returned in the results. I don't have a sitemap of any sort; this is just default Google behaviour.


#6

You could try prefixing "images/wallpaper-small.jpg" with the domain name.

Google Chrome Site Links Test:

site:"YOUR-DOM.com" wallpaper-small.jpg


#7

@SamA74 It is a static page, HTML & CSS only. Thumbnails placed in rows and columns like a grid.
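
For a static page like this, the thumbnail-to-full-size pairs could be collected with a short script rather than by hand. The following is only a sketch using Python's standard `html.parser`; the class name `ThumbLinkParser` is made up for illustration, and it assumes every thumbnail is an `<img>` nested directly inside an `<a>`, as in the markup from post #1.

```python
from html.parser import HTMLParser


class ThumbLinkParser(HTMLParser):
    """Collect (full-size href, thumbnail src, alt) for each <a><img></a> pair."""

    def __init__(self):
        super().__init__()
        self._current_href = None  # href of the <a> we are currently inside
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._current_href = attrs.get("href")
        elif tag == "img" and self._current_href:
            # An image inside a link: the link target is the full-size file.
            self.pairs.append(
                (self._current_href, attrs.get("src"), attrs.get("alt", ""))
            )

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


# Markup copied from post #1.
SAMPLE = """
<div class="thumb">
  <a href="images/wallpaper.jpg">
     <img src="images/wallpaper-small.jpg" alt="wallpaper-description">
  </a>
</div>
"""

parser = ThumbLinkParser()
parser.feed(SAMPLE)
print(parser.pairs)
```

Each tuple gives the full-size path first, so the first element is what would go into an image sitemap.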


#8

@TechnoBear Yes, that's correct, only the thumbnails are indexed. And they are too small to attract attention (160 x 160 px).


#9

@John_Betong I don't quite understand what you mean. Do you mean making a link as in "www.mysite.com/images/wallpaper-small.jpg" instead of "images/wallpaper-small.jpg"?


#10

I'm puzzled as to why Google seems to be indexing my full-size images and ignoring the thumbnails (in image search results), but is apparently doing the opposite with your site. Are you sure Google can access the large images?

As far as adding the large images to your site map goes, I'd have thought you simply add them to the page the thumbnail appears on. Presumably the large version is opening there, and not taking you to a different page.
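
As a rough sketch of that idea, the script below builds an image sitemap in which the gallery page is the `<loc>` and each full-size file is an `<image:loc>` under it. The two namespaces are the ones used by the sitemap protocol and Google's image extension; the page name `gallery.html` and the function name `build_image_sitemap` are assumptions for illustration only. Listing the large images this way does not by itself stop the thumbnails from being indexed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize with a default namespace plus the "image:" prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(page_url, image_paths):
    """Return sitemap XML listing the given images under one page URL."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    for path in image_paths:
        image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
        # Sitemap entries need absolute URLs, so resolve against the page.
        ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = urljoin(page_url, path)
    return ET.tostring(urlset, encoding="unicode")


xml_text = build_image_sitemap(
    "http://www.mysite.com/gallery.html",  # hypothetical page URL
    ["images/wallpaper.jpg"],              # full-size files only, no thumbnails
)
print(xml_text)
```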


#11

Yes, a fully qualified domain name has made a difference when used on my sites.

Did you check in Google Chrome for the link I supplied?

Maybe the images are appearing on the http version and not on www of your site or vice-versa. A fully qualified domain name removes all doubt.


#12

It might be worth taking some time to look at Apache mod_expires to see if there are any less than preferred HTTP headers being sent.

http://httpd.apache.org/docs/current/mod/mod_expires.html


#13

@TechnoBear I am not sure if Google can access the large images! As for how the images open, kindly refer to the code I have provided.
When a thumbnail is clicked, a new tab opens and the URL bar looks like this:

www.mysite.com/images/wallpaper.jpg

It opens directly from the images folder inside my public_html folder: no lightbox, no JavaScript, just a plain image.


#14

@John_Betong Yes I did, and if I understand correctly I should type this:

site:"full link to the page that contains the thumbnails" large-photo-name.jpg
which did not return any results for any file name

site:"full link to the page that contains the thumbnails" thumbnail-photo-name.jpg
returns 1 result to the page and many thumbnails in image search

What should I conclude from the above results?
As for www and non-www, there is no problem, as I have redirected everything to the www version.


#15

@Mittineague I can't quite relate the article to my problem. Could you please elaborate on that? Thanks.


#16

Simply that if mod_expires is sending Expires and Cache-Control HTTP headers that amount to "don't save this, it's already obsolete", a search engine may consider the image not worth indexing.
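
To illustrate the kind of headers meant here, this is a small sketch that flags response headers discouraging caching. The function name `caching_warnings` and the particular directives it checks are my own choices, and whether a search engine actually weighs these headers when indexing is a guess on my part, not something documented.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def caching_warnings(headers, now=None):
    """Return a list of reasons the given headers discourage caching."""
    now = now or datetime.now(timezone.utc)
    warnings = []

    # Look for Cache-Control directives that forbid or defeat caching.
    cache_control = headers.get("Cache-Control", "").lower()
    for directive in (d.strip() for d in cache_control.split(",")):
        if directive in ("no-store", "no-cache", "max-age=0"):
            warnings.append(f"Cache-Control: {directive}")

    # An Expires date in the past marks the response as already stale.
    expires = headers.get("Expires")
    if expires:
        try:
            expires_at = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            warnings.append("Expires: unparseable date")
        else:
            if expires_at.tzinfo is None:
                expires_at = expires_at.replace(tzinfo=timezone.utc)
            if expires_at <= now:
                warnings.append("Expires: date is in the past")
    return warnings


# Example header sets (made up for illustration).
bad = {"Cache-Control": "no-store, max-age=0",
       "Expires": "Thu, 01 Jan 1970 00:00:00 GMT"}
good = {"Cache-Control": "public, max-age=86400"}
print(caching_warnings(bad))
print(caching_warnings(good))
```

The real headers for an image can be read from the Network tab of the browser's developer tools and passed in as a dictionary.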


#17

@TechnoBear @SamA74 @John_Betong @Mittineague
I have made this illustration to show how the page functions; it might help to understand the problem better.


Thank you all for your time.


#18

Why are you not prefixing both the thumbnail and image with the domain name?

I think that because the domain name is missing, the browser guesses the thumbnail and image locations and is still able to render the images.

When the site is crawled, the page contents are used and there is no complete URL. The thumbnails and images have incomplete URLs, so they are not cached.


#19

If it can't access your images, it can't index them, so that would be the first thing to check. Have you excluded your images folder in your robots.txt, for example?
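
One way to test this offline is with Python's standard `urllib.robotparser`. This is just a sketch: the robots.txt text below is an invented example of a rule that would block the folder, not the real file, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Invented example: a rule that blocks the whole images folder.
BLOCKING = """\
User-agent: *
Disallow: /images/
"""

# Invented example: a rule that leaves the images folder alone.
HARMLESS = """\
User-agent: *
Disallow: /private/
"""

url = "http://www.mysite.com/images/wallpaper.jpg"
print(is_allowed(BLOCKING, "Googlebot-Image", url))
print(is_allowed(HARMLESS, "Googlebot-Image", url))
```

Pasting the site's actual robots.txt into one of the strings would show whether the large images are reachable by that user agent.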

I'm wondering if the large image opening on its own, with no page around it, is part of the problem; Google isn't indexing them because it has no associated page to crawl. (I really don't know - I'm guessing here.) My images open on the same page the thumbnail is on, using Highslide viewer.


#20

Because I learnt to use relative URLs from the start and never found a benefit or a reason to use absolute URLs.

I don't really understand how the browser would guess the location. I use a relative URL such as "images/wallpaper.jpg", so the browser should have no problem fetching the image from its location.
I would like to note that when you click a relative link, the browser automatically shows the full absolute URL in the URL bar.
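
For what it's worth, relative URLs are resolved against the address of the page by a fixed rule, which Python's `urllib.parse.urljoin` also implements. A small sketch follows; both page addresses are hypothetical, since the real page URL is not given in this thread.

```python
from urllib.parse import urljoin

# Hypothetical page addresses; only the domain comes from the posts above.
page_in_root = "http://www.mysite.com/gallery.html"
page_in_folder = "http://www.mysite.com/wallpapers/index.html"

relative = "images/wallpaper.jpg"

# The same relative URL resolves differently depending on the page it is on.
print(urljoin(page_in_root, relative))
# -> http://www.mysite.com/images/wallpaper.jpg
print(urljoin(page_in_folder, relative))
# -> http://www.mysite.com/wallpapers/images/wallpaper.jpg
```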

I found some disadvantages in using absolute URLs (prefixing the domain name); some are below:

  • I would have to manually change the links to thousands of images if I decided to use HTTPS.
  • The same goes if I ever want to change the domain name.
  • With thousands of images, a few extra characters in each link will increase the page size.

I will try other methods first before experimenting with absolute URLs.
Thanks for your input.