On October 26, 2009, I set up a 301 permanent redirect from all www.lescavesdulac.de pages to www.lescavesdulac.com, but Google is still listing all the .de pages in its index. I think that's not good, because it might be interpreted as duplicate content.
Any ideas why Google is not removing those .de pages?
I would imagine it's because all those pages still exist, so they're still indexed. Simple as that. If you actually visit any of them from the SERP, you get taken to the .com version, though, so the redirect is working fine.
If you don't want those pages to be indexed, you either need to remove them or use Google Webmaster Tools to have them removed from the index. 301s are usually used when content has moved to a new URL or the old URL has been deleted.
I’m not sure if I understand correctly. The .de pages do not physically exist. The .de domain is just an alias of the .com domain that redirects all requests via 301 to .com on the Apache level.
If I removed that domain alias, all .de pages would return a 404 error.
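For reference, a redirect like the one described above could look roughly like this in an .htaccess file. This is only a sketch: it assumes mod_rewrite is enabled and that both domains are served from the same document root, and the actual rules on the server may well be different.

```apache
# Sketch only; assumes .htaccess context with mod_rewrite available
RewriteEngine On
# Match requests whose Host header is the .de domain, with or without "www."
RewriteCond %{HTTP_HOST} ^(www\.)?lescavesdulac\.de$ [NC]
# Send the same path to the .com domain with a 301 (permanent) status
RewriteRule ^(.*)$ http://www.lescavesdulac.com/$1 [R=301,L]
```

The R=301 flag is what makes the redirect permanent. Without an explicit status, R defaults to a temporary 302 redirect.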
Sometimes it takes a long time for these changes to take effect. As long as all the links still work and Google is serving up relevant pages for search queries, I wouldn't worry too much.
Another thing you can do is to put <link rel="canonical" href="..."> in the <head> of each page, with the URL that you want to be the definitive URL in the href.
I don't think duplicate content is likely to be a problem. It is common for websites to have multiple international versions, e.g. domain.com, domain.co.uk, domain.de, etc., and Google is smart enough to figure out that it is looking at different versions of the same website rather than different websites that are duplicating (i.e. copying) content from a single source.
Stevie is probably right because if you do a ‘site:www.lescavesdulac.de’ search those pages are still showing in the SERP. That’s what made me think they still existed.
A ‘cache:www.lescavesdulac.de/idex.php’ search shows that Google still has a page indexed for that too.