To Index, Or Not To Index...
It's an important question, but first things first. What’s an index … ?
An index is another name for the database of web pages used by a search engine.
Imagine it like the table of contents in a book with each of your webpages being a chapter. If it is not indexed, it is not shown in the table of contents.
Index is a critical term to understand in SEO. The index is where Google (and friends) collect all the information about all the web pages in their universes. If your site or webpage is not in an index, then according to the search engine it doesn’t exist. Users will not find it, and that could cost you customers. Being indexed also affects the value of your keywords and how your site stacks up against your competitors.
Have we triggered an existential crisis yet?!
We’re here to help you navigate the world wide web! As a starting point for Indexes, we have put together a quick guide for:
What search engines include in their index;
What to avoid sharing with search engines via index;
How to improve your site through indexing.
What search engines include in their index
As mentioned above, the index is a database of web pages on the internet. But that description is a bit misleading. The index includes not only your web pages, but also everything within the HTML code of each URL.
If a web page is open to indexing, Google will be able to crawl that page and add it to the index. Once you noindex a page, Google will still be able to find it. However, it is automatically categorized as “not important for search results”. This is why your noindexed pages are not found in search and do not generate any search traffic by themselves. This allows search engines to focus on what's important on your site.
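If you are curious whether a given page carries a noindex instruction, you can look for a robots meta tag in its HTML. Here is a minimal sketch in Python using only the standard library. The sample page and the names `RobotsMetaParser` and `is_noindexed` are made up for illustration, and the sketch only checks the meta tag, not HTTP headers or anything a plugin might add elsewhere.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so fall back to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                directive = part.strip().lower()
                if directive:
                    self.directives.add(directive)


def is_noindexed(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return bool({"noindex", "none"} & parser.directives)


# Hypothetical page used only for this example.
SAMPLE_PAGE = """<html><head>
<title>Thanks for your order</title>
<meta name="robots" content="noindex, follow">
</head><body>Thank you!</body></html>"""

print(is_noindexed(SAMPLE_PAGE))
```

You could feed it the HTML of any page you have saved or downloaded. A page without a robots meta tag is treated here as open to indexing.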
It is also good to note that by default, every WordPress post and page is open to indexing. You can check which pages from your site have been indexed in Google Search Console.
What to avoid sharing with search engines via index
Though it may be hard to hear, not all of the pages on your website are important. If a search engine robot finds an irrelevant page, this could actually hurt your SEO ranking.
It’s a bit of a catch-22: if you want search traffic, search engines need to be able to find you and your content. But if they find you and they don’t like what you’ve posted, you're penalized. They claim that you won’t be; they just won’t show your website in the search results because they deem it irrelevant and unsuitable for the searcher. This of course means that you are PENALIZED.
So, as tempting as it is to index as many pages as you can on your website in order to hopefully hit good rankings, it is not always the best approach.
All of your SEO efforts should be working to direct traffic to relevant pages. We recommend keeping unimportant or irrelevant pages out of the index so that they do not show up in results.
Don’t go overboard with indexing, since it may confuse the search engine or interfere with how it understands your site. Here are some pages that you should mark as noindex to avoid this “Index Bloat”:
Thank you pages - these pages are usually thin content pages, with upsell and social share options, but no value, content-wise. No reason for these pages to show up in the search result pages!
Login or Admin pages - seems obvious, but as mentioned above, on WordPress all pages are open to indexing by default. Be sure to add a noindex tag unless this is something you want your community to be able to find in order to interact with your product.
Old blog posts, case studies or media releases - these are interesting to you, but get little to no traffic. Pages that you want to keep on your site but don’t necessarily want or need to be found can be filed away. Just be sure to double check that they aren’t being indexed.
Internal search results - these are pretty much the last pages Google would want to send its visitors to. If you want to ruin a search experience, link to other search pages instead of an actual result. But the links on a search result page are still very valuable, so you definitely want Google to follow them.
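To make the list above a bit more concrete, here is a rough sketch of how you might decide which robots directive a page should get based on its URL. Everything specific in it is an assumption for the sake of the example: the path prefixes, the `s` search parameter, and the function names. Your own site's URLs will almost certainly look different.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical path prefixes for thank-you, login and admin pages.
NOINDEX_PREFIXES = ("/thank-you", "/wp-login.php", "/wp-admin")


def robots_directive(url):
    """Pick a robots directive for a URL, following the list above."""
    parts = urlsplit(url)
    path = parts.path.lower()
    query = parse_qs(parts.query)

    # Internal search results: keep them out of the index,
    # but let search engines follow the links they contain.
    if path.startswith("/search") or "s" in query:
        return "noindex, follow"

    # Thank-you, login and admin pages: keep them out of the index.
    if path.startswith(NOINDEX_PREFIXES):
        return "noindex"

    # Everything else stays open to indexing.
    return "index, follow"


def robots_meta_tag(url):
    """Build the meta tag that would go in the page's <head>."""
    return f'<meta name="robots" content="{robots_directive(url)}">'


for example_url in (
    "https://example.com/?s=running+shoes",
    "https://example.com/thank-you/",
    "https://example.com/blog/indexing-guide/",
):
    print(example_url, "->", robots_meta_tag(example_url))
```

Old blog posts and media releases are left out on purpose, since there is no reliable way to spot them from the URL alone. Those are better reviewed by hand.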
There may be more work than just removing obviously irrelevant pages. Do you know how many pages your website has? Perhaps you think there’s only 5-10, but run it through any SEO Analysis Tool. You’re likely to be surprised by the results!
Even if the overall number of pages on your site remains stable, there’s a chance you’re carrying unnecessary pages from the past. These pages could deplete your relevancy scores as Google makes changes to its algorithm.
How to improve your site through indexing
There are two easy methods of making sure the content that you want indexed is being included in a search engine’s index: URL Inspection and Sitemaps.
Google makes it easy to index relevant content. They’ve even provided us with a number of tools to do so, including the URL Inspection tool. This is part of Google Search Console (formerly known as Google Webmaster Tools). This tool allows you to submit a URL, see if it’s been crawled by Google, and if not, ask Google to crawl it. This puts it in their priority crawl queue. Put simply, Google has a list of URLs to crawl, and yours will be moved toward the front of it. Once you do that, it will usually get crawled and indexed faster.
Another technique to make sure the index is working for you is using sitemaps. If you're not using sitemaps, start. It is one of the quickest ways to get your URLs indexed. When you have URLs in your sitemap, you let Google know that they're actually there. There are a number of different ways to submit a sitemap.
The most basic way is to list your sitemap in your robots.txt file manually. You can even list multiple sitemaps. Another way is to use the Sitemaps report, another report in Google’s Search Console. There you can submit sitemaps very easily. You can also validate or even remove sitemaps.
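If you have never seen a sitemap or a robots.txt entry for one, the sketch below builds a tiny example of each and then reads the robots.txt back to confirm the sitemaps are listed. It is an illustration only: the example.com URLs and the helper names `build_sitemap` and `build_robots_txt` are placeholders, and most sites will generate these files with a plugin or their CMS rather than by hand.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap XML document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def build_robots_txt(sitemap_urls):
    """Return a robots.txt that allows crawling and lists each sitemap."""
    lines = ["User-agent: *", "Allow: /", ""]
    lines += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(lines) + "\n"


# Placeholder URLs for illustration only.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/blog/indexing-guide/",
])
robots_txt = build_robots_txt([
    "https://example.com/sitemap-pages.xml",
    "https://example.com/sitemap-posts.xml",
])

# Read the robots.txt back to check which sitemaps it announces.
# site_maps() is available in Python 3.8 and later.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
listed_sitemaps = parser.site_maps()

print(sitemap_xml)
print(robots_txt)
print(listed_sitemaps)
```

Listing more than one sitemap is simply a matter of adding another Sitemap line, which is why the robots.txt helper takes a list.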