I have over 50,000 articles in my database (Joomla). It’s really slowing things down. I need to archive 49,000 articles and leave the last 1,000 on my current server.
I cannot delete those 49K articles, because they are reachable via search engines and deleting them would ruin my SEO. They are still accessed several times a day, but far less than the current part of the site (the last 1,000 articles), since we publish daily news, around 50 articles a day. Within a month an article gets viewed less and less, and people tend to read the current ones. It’s kind of like CNN.com: you read the daily news and don’t really go back to older articles.
Any idea how to split that database, put part of it on a different server, and still have the URLs forwarded?
Rather than expand to more databases, might it not make more sense to convert the older pages into static files?
I don’t imagine the content would be changing at this point, and other than design changes not being applied I don’t see where it should be a big problem to visitors.
By converting them to static files, would that mean that the 49K articles won’t be called from the CMS to the database? They would be in a folder and every file would have a .php extension?
When SitePoint changed from vBulletin to Discourse, rather than importing everything, it was decided to “archive” the “older” inactive threads.
The archived pages still had the vBulletin design, and a lot of the links would no longer work, but the pages were essentially static HTML pages.
I was hoping to provide an example, but it seems all of the links to old content now go to a generic “we’ve moved” page with links that redirect to the Discourse forum.
That approach is going to be problematic if/when the site is redesigned or features are added to the global layout that are expected to carry over to those articles. A simple example would be adding a menu link. If the client changes their mind, then you would have to go through all the articles manually and add the customization to each static page.
Clients always change their mind. What is an OK sacrifice one second is not an OK sacrifice the next. So if you were to go the static route, I would still leave all those articles intact in the database so they can be regenerated when content is added to the global layout.
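To illustrate the "keep the articles in the database and regenerate" idea, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `LAYOUT` template, the `regenerate` function, and the `slug`/`title`/`body` fields are hypothetical, not Joomla's schema or API. It also assumes the stored body is already trusted HTML.

```python
from pathlib import Path
from string import Template

# Hypothetical global layout; the menu is the part a client might change later.
LAYOUT = Template(
    "<html><body>$menu<article><h1>$title</h1>$body</article></body></html>"
)


def regenerate(articles, menu_html, out_dir):
    """Rewrite one static HTML file per article using the current layout."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for article in articles:
        page = LAYOUT.substitute(
            menu=menu_html, title=article["title"], body=article["body"]
        )
        target = out_dir / f"{article['slug']}.html"
        target.write_text(page, encoding="utf-8")
        written.append(target)
    return written
```

When a menu link is added, running `regenerate` again with the new `menu_html` overwrites every static file in one pass, so nobody has to edit the pages by hand.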
I don’t know a whole lot about Joomla, but does Joomla have some type of internal caching you can turn on? If all those articles are the same for every user, full-page caching is most definitely an option.
Regardless of whether they are HTML or PHP pages, I am afraid it would hurt my SEO.
If turning them static won’t hurt SEO, then it’s simple. I will put old-fashioned SSI tags inside each article, so whenever I need to update header and footer links, all I have to do is change two pages and it will apply to all.
But the question would be: by switching to static, would it hurt my SEO?
Also, how do I switch 49,000 articles to static?
Like I said, I’ve never used Joomla, but I have used Drupal and Magento. In both platforms there are ways to turn on different types of caching, including full page. I would think something similar exists in Joomla or via an add-on. Search Google for caching pages in Joomla. That is where I would start.
Caching is not the issue. I have caching on, and I have tried plugins… URL forwarding would be required.
I will look up how to switch articles to static pages. I have an instance with WordPress also; I converted all 50k+ articles to WordPress too, to see which is easier.
If the static content is identical to the dynamic content then the chances are the page will load a lot faster and benefit the SEO.
I have long forgotten how Joomla displays final content; with a bit of luck the page contents can be easily saved.
Failing that, after the page has rendered, use PHP’s file_get_contents(…) and save the result to your cached folder if and only if the cached file does not exist.
Edit
Actually, if the web page does exist in the cached folder, then the .htaccess rules should serve the page directly and not call PHP, MySQL or Joomla. This is what makes my pages fast.
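The "save it if and only if the cached file does not exist" step could look roughly like the sketch below. It is written in Python rather than PHP, with `urlopen` standing in for `file_get_contents`. The function names and the `__` file-naming scheme are assumptions made up for this example, not anything Joomla provides.

```python
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlopen


def cache_path_for(url, cache_dir):
    """Map a URL to a file inside cache_dir (hypothetical naming scheme)."""
    path = urlsplit(url).path.strip("/") or "index"
    # Flatten slashes so every page lands directly in cache_dir.
    return cache_dir / (path.replace("/", "__") + ".html")


def fetch_over_http(url):
    """Fetch the fully rendered page over HTTP."""
    with urlopen(url) as response:
        return response.read()


def cache_page_if_missing(url, cache_dir, fetch=fetch_over_http):
    """Save the rendered page once; return True only if a new file was written."""
    target = cache_path_for(url, cache_dir)
    if target.exists():
        return False  # already cached, nothing to fetch or write
    cache_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(fetch(url))
    return True
```

Passing the fetcher in as a parameter keeps the "only if missing" logic easy to try out without hitting a live site.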
There are a multitude of caching levels. I’m referring to caching the whole page, which is effectively the same thing as generating static pages. The static page would be served until it expires via an edit or update of its content. Furthermore, there are various services such as Varnish which can be placed in front of your application and effectively do the same thing, with a boatload of options for handling dynamic content.
If the pages are dynamically cached, then a page has to be visited once to generate its static copy. The .htaccess file checks if the URL is in the static cache folder, and if it is, the static page is served.
Deleting the cache folder contents will start generating new static pages with the new banners, etc.
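The flow just described (first visit generates, later visits are served from the folder, emptying the folder forces regeneration) can be modelled in a few lines. This is only a Python sketch of the logic, not .htaccess syntax. `serve`, `purge`, and the `render` callback are hypothetical names, with `render` standing in for the CMS building the page.

```python
from pathlib import Path


def serve(name, cache_dir, render):
    """Return the cached page if present; otherwise render it once and store it."""
    cached = cache_dir / f"{name}.html"
    if cached.is_file():
        # Cache hit: the CMS and database are not touched.
        return cached.read_text(encoding="utf-8")
    html = render(name)  # cache miss: the first visit generates the page
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_text(html, encoding="utf-8")
    return html


def purge(cache_dir):
    """Delete every cached page so the next visit regenerates it."""
    removed = 0
    for page in cache_dir.glob("*.html"):
        page.unlink()
        removed += 1
    return removed
```

After `purge`, the next `serve` call for any article goes through `render` again, which is how a new banner or menu would reach the old pages.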
[quote]Thanks John.
Now, how is that done? What do I use to convert to static? HTTrack? What about the URL redirects?[/quote]
Did you follow the link in my previous post #5?
Can you supply a link to one of your pages and I will try it on my server?
Meanwhile search for “htaccess file exists redirection” and learn how to modify your .htaccess file.
Try creating a cache folder and save a static file with the URL name. Then call the URL from your browser and see if the static page is displayed.
Also search for how to save a Joomla web-page to a static folder.