I have a client who has a number of technical PDF documents that he’d like to put on his website, which is built with WordPress. He doesn’t want to repurpose the document text by copying it into a post.
I’ve embedded the documents using the Google Doc Embedder plugin. That works and they display well, but the text of each PDF isn’t stored in the WordPress database, so the documents can’t be found through the site search. My client wants these PDFs/docs to be searchable.
Enter the quandary: how do I create a searchable attachment that is associated with a post?
After a little searching and digging I discovered that a Google Custom Search would index an attachment, but I couldn’t for the life of me get it to work. It always seemed to skip the posts, no matter how many times I checked that the directory was correct. I looked for alternatives and eventually found yolink, a customisable search engine with a WordPress plugin. I installed that and discovered that the same directory structure wouldn’t index either. A word with their support showed that this behaviour was due to the nature of the WordPress structure and that I should use their JavaScript version instead. I tried mucking around with sitemaps, but they produced the same results. Thankfully, yolink gives feedback from the crawl/indexing process, and I could now see that “Page skipped due to lack of content” was the reason.
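To illustrate how I read the “Page skipped due to lack of content” message, here is a rough Python sketch. It is only my guess at the kind of check a crawler might make, not yolink's actual logic: the names `visible_text` and `would_be_skipped`, the 50-character threshold and the example URLs are all made up for illustration. The point is that a post containing nothing but an embedded viewer has no readable text of its own.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text a crawler could read, ignoring script/style/iframe bodies."""

    SKIP = {"script", "style", "iframe"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside the skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    """Return the readable text of an HTML fragment as one string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


MIN_CHARS = 50  # hypothetical threshold, not a documented yolink value


def would_be_skipped(html, min_chars=MIN_CHARS):
    """Guess whether a crawler would treat the page as having no content."""
    return len(visible_text(html)) < min_chars


# A post whose body is only the embedded viewer (example URL is made up).
embed_only = (
    '<div class="post">'
    '<iframe src="https://example.com/viewer?file=manual.pdf"></iframe>'
    "</div>"
)

# The same post if the document text were also present in the page.
with_text = (
    '<div class="post"><p>'
    + "Torque settings for the model 12 pump housing. " * 3
    + "</p></div>"
)

print(would_be_skipped(embed_only))
print(would_be_skipped(with_text))
```

If that reading is right, the embed-only post is skipped because the indexer never sees the PDF text as part of the page, which matches what the crawl feedback reports.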
So how do I get a searchable attachment to register as the contents of a WordPress post? I can see that it USED to be possible with Google Docs, but that functionality was removed sometime in 2010. I can get the PDFs indexed, but they aren’t associated with the post in the results. They are standalone results, so the end user would go directly to a PDF opening in their browser rather than to my client’s website to view it there.
I’ve tried a number of approaches, but I can’t wrap my head around whether this is even possible. I’ve seen some more complex answers, such as installing a server-based search, but that is beyond my capability. Has anyone got any brilliant ideas I may be overlooking?
Thanks.