Google Algorithms Explained, Part 1: Don’t Be Spam
By Sam Gooch
This article is part of an SEO series from WooRank. Thank you for supporting the partners who make SitePoint possible.
In the SEO world, Google algorithm updates are big news. They often have a big impact on sites, one that can be felt for months or, in some cases, years. But for those who don’t spend a lot of time on SEO, or are new to the industry, keeping track of every algorithm can get pretty confusing: they have funny names and it’s not always clear what each one does. Lucky for you, in this article we’ll go over the major Google algorithms, what they do and how you can avoid incurring the wrath of a panda, penguin or hummingbird.
But first, what do we mean when we say “algorithm”? To boil down the Wikipedia definition, an algorithm is “a computer’s way of figuring out which steps to take to complete a task.” In Google’s instance, its algorithms decide what steps it takes to find pages relevant to keywords used in a search, and in what order it should display those pages.
PageRank is part of the original core of Google’s search algorithm, and is considered a big factor that differentiated Google from its early competitors such as Lycos and AltaVista. Developed in the ’90s by Larry Page and Sergey Brin, this algorithm works to determine the importance of a page or domain by counting and evaluating the links pointing at it, and then giving it a relative score between 0 and 10.
It is based on the idea that links act as endorsements of a page, domain or content, so the more links a page has, the better it is. According to Google:
PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important it is. The underlying assumption is that more important websites are likely to receive more links from other websites.
So if you spend a lot of time working on link building, or a lot of time dealing with emails from people looking for links, you have PageRank to thank.
Note that Google used to provide PageRank as part of Google Toolbar, but stopped updating it years ago and finally removed it entirely in 2016. However, that doesn’t mean PageRank stopped being important. Links are still one of the top ranking factors.
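To make the "links as endorsements" idea concrete, here is a minimal Python sketch of the PageRank iteration in its commonly published form. It is not Google's production algorithm: the damping factor of 0.85, the toy_web graph and its page names are assumptions for illustration, and the sketch doesn't attempt to reproduce the 0 to 10 score mentioned above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from a link graph.

    links maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative defaults.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site; "orphan" has no links pointing at it.
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page with the most incoming links ends up on top, and the page nobody links to ends up at the bottom, which is the behavior the Google quote describes.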
What Is Panda?
In the old days, websites could consistently publish lots of short articles (a few hundred words or fewer) with links pointing back to their site (often using exact match anchor text) across numerous article websites that accepted submissions, many without even looking at the quality of the content. The trick was to slightly change longer pieces and republish them (a process known as article spinning) or even blatantly plagiarize content in an attempt to appear at the top of search results. These article websites were called “content farms” and could rank very well.
The problem, from Google’s perspective, was that these websites didn’t help users find what they were looking for. They damaged Google’s user experience. Introduced in February 2011, Google’s Panda update was essentially a filter applied to search results to weed out sites with low quality content. How does Google define low quality?
- Thin: Research has shown that Google likes long content: the average page ranked in the top ten search results has almost 2,000 words. Of course that number isn’t some sort of requirement, but what it means is that the Panda algorithm is looking for content that has enough depth to provide its users with the most information or best experience.
- Duplicate: There’s not really a “duplicate penalty” from Panda, but publishing the same content as other sites, or publishing the same thing on multiple pages across your own site, will still significantly impact your ability to rank. If your content is similar enough, you run the risk of being left out of search results altogether.
- Over-Optimized: A big part of the problem with pre-Panda content was that many of those content farms published articles that were obviously written to help the linked pages to rank for certain keywords. That meant high keyword and synonym density with low user friendliness and usefulness.
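As a rough self-check for the thin and over-optimized signals above, you could measure a page's word count and keyword density yourself. The Python sketch below is only a heuristic: the function name, the 300-word minimum and the 3% density ceiling are assumptions chosen for illustration, not thresholds published by Google.

```python
import re


def content_signals(text, keyword, min_words=300, max_density=0.03):
    """Return rough thin-content and keyword-stuffing indicators.

    min_words and max_density are hypothetical thresholds.
    """
    words = re.findall(r"\w+", text.lower())  # rough word split
    phrase = keyword.lower().split()
    size = len(phrase)

    # Count how many times the keyword phrase appears as a word sequence.
    hits = sum(
        1
        for i in range(len(words) - size + 1)
        if words[i:i + size] == phrase
    )
    # Density: share of all words that belong to a keyword occurrence.
    density = (hits * size) / len(words) if words else 0.0

    return {
        "word_count": len(words),
        "keyword_density": density,
        "thin": len(words) < min_words,
        "over_optimized": density > max_density,
    }


# Hypothetical keyword-stuffed snippet.
sample = "cheap shoes " * 10 + "buy now"
report = content_signals(sample, "cheap shoes")
print(report)
```

A page can pass both checks and still be low quality, so treat the numbers as a prompt for a closer look rather than a verdict.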
The results were noticeable, to say the least. Here’s traffic for one site that got hit by Panda:
Avoiding Panda Problems
So how do you stay on Google’s good side when it comes to content? Well, the obvious answer is to publish quality content. A good content strategy is to focus on publishing evergreen content, which by definition is high quality, in-depth and adds value for visitors. Not everything you publish will gain enough traction to truly become evergreen, but following the evergreen content guide will help you create in-depth, unique and naturally optimized articles that generally rank well.
However, just being a good writer is not always enough. Depending on what type of site you have, you could have a problem with duplicate content and not even know it. This is particularly common with content management systems, syndicated content, e-commerce shopping cart systems, international sites, search/filter features and pagination. You’ll have to use some technical know-how to deal with these types of duplicate content issues.
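One way to spot duplicate or near-duplicate pages on your own site is to compare their text as overlapping word sequences ("shingles") and measure the overlap with Jaccard similarity. The Python sketch below shows that general technique; it is not how Google detects duplicates, and the shingle size of three words and the sample sentences are assumptions for illustration.

```python
import re


def shingles(text, size=3):
    """Break text into a set of overlapping word sequences."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Return shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)


# Hypothetical original sentence and a "spun" copy with one word swapped.
original = "The quick brown fox jumps over the lazy dog near the river bank"
spun = "The quick brown fox leaps over the lazy dog near the river bank"
similarity = jaccard_similarity(original, spun)
print(f"similarity: {similarity:.2f}")
```

Pairs of URLs that score close to 1.0 are good candidates for the technical fixes mentioned above.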
What Is Penguin?
Since links have always been such an important part of ranking in Google SERPs, SEOs have spent a lot of time and effort developing ways to build links. Even though PageRank took linking domain quality into account when scoring a page, building a huge number of lower quality links could very quickly add up and surpass sites that had links from fewer, better sites. This resulted in some link building techniques that were slightly less than white hat:
- Article marketing: As mentioned above, this technique consisted of writing an article with links pointing back to your site, and then submitting it to article websites that existed for the sole purpose of hosting content for backlinks. As we said before, the Panda update targeted those types of sites, which severely lowered the effectiveness of article marketing.
- Widgets: Tools or other embedded code can be very useful for users, but they can also be abused to manipulate link juice by adding a link to the code. Google recently reemphasized its position that using widgets for link building constitutes unnatural link building. You’re allowed to give yourself credit for creating a widget by linking to your site. Just use the rel="nofollow" attribute and avoid using keyword anchor text.
- Link wheels: A link wheel is a group of Web 2.0 properties, usually blogs (hence the alternative name “blog network”), that connect to one another through a series of links. Each blog then links back to the main site. Google has become quite adept at finding these link schemes. While you might get a small bump early (and it will be really small, because these blogs won’t have much link juice to pass), you might wind up with a link penalty, so it’s definitely not worth the risk.
- Exact match anchor text: This isn’t technically a link scheme, but if your link profile is full of keyword-rich anchor text, it’s a clue you’re using link spam to manipulate your ranking.
Since these links were built artificially, they resulted in less useful websites appearing high up the SERPs, to the detriment of Google’s users. As a result, Google created the Penguin algorithm to find unnatural link building and punish those who use it. Sites are affected in two ways: they lose search ranking or Google applies a manual penalty. In the first case, you need to clean up your link profile and disavow any link spam you might have.
In the second case, you have to do some legwork to convince the linking domains to remove the links to your site, and then disavow those that don’t.
Penguin was released back in 2012 and, for years, ran only as a periodic update. However, in late September 2016 Google added Penguin to its core algorithm. That means it now runs in real time, so you should see any changes to your link profile take effect in a matter of weeks. It’s still unclear, however, what exactly the implications are for link building and analysis.
Avoiding Penguin Problems
If you’ve just launched your website or blog, avoiding a Penguin penalty is as easy as avoiding unnatural link building techniques like those mentioned. However, if your site has been around for a while, or you’ve hired outside SEO help, you’ve probably got some bad links somewhere in your profile. In this case, you need to do a link audit to find any low quality links that could be hurting your SEO.
If you’ve got an account with WooRank, you can monitor a sample of your backlinks using an Advanced Review, giving you some early indications as to whether you have some low quality links lurking in your backlink profile. With these audits you can see the source domain, anchor text, link target and overall link quality.
If you’re unfortunate enough not to have a WooRank membership or an Advanced Review, follow our guide to conducting a link audit. When evaluating links for quality look at the following criteria:
- Anchor text: This is a strong indicator of link quality, both for an individual link and when aggregated for the entire link profile. Too much text that is an exact match to keywords is unnatural, so it’s important to have plenty of branded anchor text and generic text like “click here.” When looking at individual links, check to make sure the anchor text is relevant to both the linking and target page content.
- Page content: This is pretty subjective, but still very important. Is the linking page well designed? Does it have a good user experience, or at least try to? Is it covered with ads? Objectively speaking, check the page content for spelling, grammar, usage and keyword stuffing. As usual longer content is better than thin content. If you want an idea of the sort of pages Google doesn’t like, you can see actual examples of web spam that’s been removed from search results here.
- IP address: While not a deciding factor when evaluating individual links, it is useful in judging the quality of your overall profile. IP addresses in certain countries known for hosting spam, Russia and China in particular, should get an extra look if your company doesn’t operate there. Too many links from different domains that share the same IP address can also look suspicious.
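Two of the criteria above, anchor text and IP address, are easy to check in bulk once you've exported your backlinks. Here is a minimal Python sketch of such a check. The record fields (source_domain, anchor_text, ip), the sample domains and the target keywords are all hypothetical, so adapt them to whatever your backlink tool actually exports.

```python
from collections import Counter, defaultdict


def audit_link_profile(backlinks, target_keywords):
    """Summarize exact match anchor text and domains sharing an IP address."""
    keywords = {keyword.lower() for keyword in target_keywords}

    # Share of links whose anchor text exactly matches a target keyword.
    anchors = Counter(link["anchor_text"].strip().lower() for link in backlinks)
    total = sum(anchors.values())
    exact = sum(count for text, count in anchors.items() if text in keywords)

    # Group linking domains by IP to find clusters on the same host.
    domains_by_ip = defaultdict(set)
    for link in backlinks:
        domains_by_ip[link["ip"]].add(link["source_domain"])
    shared = {
        ip: sorted(domains)
        for ip, domains in domains_by_ip.items()
        if len(domains) > 1
    }

    return {
        "exact_match_share": exact / total if total else 0.0,
        "shared_ip_domains": shared,
    }


# Hypothetical export using reserved example domains and IP ranges.
backlinks = [
    {"source_domain": "blog-a.example", "anchor_text": "cheap shoes", "ip": "203.0.113.5"},
    {"source_domain": "blog-b.example", "anchor_text": "Cheap Shoes", "ip": "203.0.113.5"},
    {"source_domain": "news.example", "anchor_text": "click here", "ip": "198.51.100.7"},
    {"source_domain": "review.example", "anchor_text": "Acme Footwear", "ip": "192.0.2.44"},
]
summary = audit_link_profile(backlinks, ["cheap shoes"])
print(summary)
```

A high exact match share or a large cluster of domains on one IP isn't proof of spam, but both tell you which links deserve a manual look first.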
Once you’ve got your list of low quality and unnatural links, try to get them removed. First, reach out to the owner of the site linking to you, asking for the links to be removed or made “nofollow”. Save your correspondence as proof for your reconsideration request if you’ve received a manual link penalty. If your attempts at removal fail, it’s time to fall back on Google’s disavow tool. Add your links to a plain text file, including notes on when you asked the link to be removed. It should look like this:
# contacted owner for spamlinks.com on 10/01/2016
# requesting link removal but received no response
http://www.spamlinks.com/paid-links.html
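If your audit turns up more than a handful of links, you may prefer to generate the file rather than type it by hand. The Python sketch below builds the text from a list of entries: comment lines start with "#", each URL goes on its own line, and a "domain:" prefix disavows an entire domain. The function name, the entry fields and the link-wheel.example domain are assumptions for illustration.

```python
def build_disavow_file(entries):
    """Build disavow file text from a list of audit entries.

    Each entry is a dict with a "target" (URL or domain), an optional
    multi-line "note", and an optional "whole_domain" flag.
    """
    lines = []
    for entry in entries:
        # Notes become comment lines, one "#" line per note line.
        for note_line in entry.get("note", "").splitlines():
            lines.append(f"# {note_line}")
        if entry.get("whole_domain"):
            lines.append(f"domain:{entry['target']}")
        else:
            lines.append(entry["target"])
    return "\n".join(lines) + "\n"


entries = [
    {
        "target": "http://www.spamlinks.com/paid-links.html",
        "note": (
            "contacted owner for spamlinks.com on 10/01/2016\n"
            "requesting link removal but received no response"
        ),
    },
    # Hypothetical second entry that disavows a whole domain.
    {"target": "link-wheel.example", "whole_domain": True},
]
disavow_text = build_disavow_file(entries)
print(disavow_text)
```

Save the result as a plain text file before uploading it to the disavow tool.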
You can disavow links with Bing as well under Configure My Site in its Webmaster Tools. Unlike with Google, Bing does not allow you to upload a file; you have to enter each URL individually.
It may seem like Google is out to get you with their various algorithms and penalties but they’re really not. At the end of the day they’re just trying to do what’s best for their users by providing the most relevant and high quality websites in their search results. By devaluing and penalizing sites that attempt to get to the top of the rankings by manipulating the search engine, they’re doing just that. So if you actually focus on creating a quality website with useful content and a good user experience, you shouldn’t have to worry about Pandas or Penguins too much in the end.
As you’ve probably noticed, we only covered two of several algorithm updates. They also happen to be two of the most famous algorithms, as well as the ones dedicated to fighting web spam. In part two, we’ll dig into the other major algorithms and how they’ve changed the approach to search optimization and marketing.