Trying to figure out the logic Google uses to index a URL into SERP

So almost all pages of my blog prahladyeri.github.io are indexed by Google, except a few like this one, which isn’t so popular and which I don’t remember sharing widely.

But even if I didn’t share, Google should know about it since it’s the omniscient one, right? And if it knows, why didn’t it index this particular page? Can you find something odd about this page which can cause it to not get indexed unlike the other pages of this same website?

Can you add a GitHub-hosted site to Google Search Console? I am not sure if it would be possible to verify ownership of the domain. I think there is an option to add an HTML file to the site in order to verify ownership.

If you were to get it into Google Search Console, then you could get crawl errors and also manually submit URLs.

I was able to verify my GitHub Pages subdomain by adding their specified HTML file to the repository. Even primary domains can be verified that way; it’s just that you get fewer features compared to full domain verification by way of adding TXT records, etc.

If the site is added to GSC, is it giving any errors? Does it show up as indexed?

Yes, it does.

Google search says that 101 pages are indexed from my own blog prahladyeri.github.io.

And if we talk about all the blogs or websites under the github.io domain, the number comes to a staggering 33 million!

You need to go to Google Search Console and look for errors on why pages are not being indexed.

In my case, most of the URLs fall into the “Crawled - currently not indexed” category, which isn’t much help. When I click the “Learn More” button above this result, it leads me to this help page, which also doesn’t help in determining exactly what is wrong with this content.

Use the URL inspection tool on those URLs to see if you get more details on why they’re not indexed. You can also request re-indexing of the pages.

Make sure that you do not have these blocked in any way with a robots.txt file or meta tags. If you have a canonical URL set, make sure it’s correct.
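
For the robots.txt part, here is a rough Python sketch of how you could test a URL against a robots.txt file locally. The function name `is_blocked_by_robots`, the sample rules, and the example.com URLs are all made up for illustration. Note that Python's standard-library parser does not necessarily match URLs exactly the way Googlebot does, so treat this as a first sanity check only.

```python
from urllib.robotparser import RobotFileParser


def is_blocked_by_robots(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content and URLs, for illustration only.
sample_robots = """User-agent: *
Disallow: /drafts/
"""

print(is_blocked_by_robots(sample_robots, "https://example.com/drafts/post.html"))
print(is_blocked_by_robots(sample_robots, "https://example.com/blog/post.html"))
```

To check a real site, paste the contents of its /robots.txt file into `sample_robots` and pass the URL that is not getting indexed.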

Good luck.

Did you check robots.txt and the meta tags? I think there might be something wrong with these elements.

If a specific page on your blog isn’t indexed by Google, consider the following potential issues:

Ensure the page doesn’t have a noindex meta tag.
Check if the robots.txt file blocks the page.
Verify if there are sufficient internal links pointing to the page.
Use Google Search Console to check for crawling or indexing errors.
Ensure the content is unique and valuable, not duplicated.

Review these aspects to identify why the page might not be indexed.
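
The noindex check from the list above can be sketched in Python roughly like this. It scans a page's HTML for a robots meta tag containing noindex and also reports the canonical link, if any. The class `IndexingHintParser`, the helper `inspect_html`, and the sample HTML are hypothetical names and data, not part of any Google tooling, and the sketch only looks at the HTML source, not at HTTP headers.

```python
from html.parser import HTMLParser


class IndexingHintParser(HTMLParser):
    """Collects the noindex directive and the canonical URL from a page's HTML."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() in ("robots", "googlebot"):
            if "noindex" in attr.get("content", "").lower():
                self.noindex = True
        elif tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")


def inspect_html(html: str) -> dict:
    """Return the indexing hints found in the given HTML source."""
    parser = IndexingHintParser()
    parser.feed(html)
    return {"noindex": parser.noindex, "canonical": parser.canonical}


# Hypothetical page source, for illustration only.
sample_html = """<html><head>
<meta name="robots" content="noindex, follow">
<link rel="canonical" href="https://example.com/blog/post.html">
</head><body>Hello</body></html>"""

print(inspect_html(sample_html))
```

If `noindex` comes back True, or `canonical` points at a different URL than the page itself, that would be the first thing to fix.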

I mean, if I stick the title of one of OP’s “Crawled - currently not indexed” articles into Google, I find the page. So most likely the answer is “Wait.”

Google might not index your page for several reasons. Ensure the page has internal links, check your robots.txt and meta tags to make sure they’re not blocking it, and verify the content is unique and valuable. Also, include the page in your XML sitemap and consider promoting it to attract more traffic and backlinks.
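
If the site has no XML sitemap yet, a minimal one can be generated with a few lines of Python. This is only a sketch: the function name `build_sitemap` and the example.com URLs are placeholders, and a static site generator may already produce a sitemap for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap XML document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL automatically.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page URLs, for illustration only.
pages = [
    "https://example.com/",
    "https://example.com/blog/post.html",
]
print(build_sitemap(pages))
```

Save the output as sitemap.xml in the site root and submit its URL under Sitemaps in Google Search Console.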

Thin content may be one reason. Is your content unique and valuable to users? Another reason could be that Google isn’t rendering your page correctly. When you test the live URL, is Google rendering it correctly as a user sees it?