Google Search Console (GSC) is a free SEO tool that you should definitely use if you aren’t already. It’s no secret that Google is the largest search engine out there, so you’ll want to know how Google views your website.
For example, if one of your pages isn’t indexed, it won’t show up in the Google search results.
In other words, your page is invisible to Google – and you’re getting zero SEO value from it.
That’s why Google Search Console is such an invaluable tool. One of its key features will let you know if Google ran into any errors when attempting to crawl and index your site.
As a result, you’ll be able to instantly identify any pages that aren’t indexed so you can fix the crawl issue (or another type of error).
Beyond that, GSC provides detailed reports and analytics on how your website is performing organically. That’s powerful information to have – as it lets you measure how effective your SEO tactics are, according to Google.
That’s why I put together this extensive guide for identifying and resolving Google Search Console errors. Read on to learn how to use the index coverage report to resolve problems and improve your overall SEO strategy.
Understanding Google’s Indexing Process
In order for your website to show up on search engine results pages (SERPs), it must go through the following three processes:
Discovering
Before Google can crawl and index your site, it first has to discover it. The most common way is by processing your XML sitemap, but Google can also discover pages by following links from your own site and from other sites, among other methods.
Besides indexing errors, issues can arise during the discovery process as well. The good news is you can submit your XML sitemap in GSC to help Google discover your site.
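To make the sitemap idea concrete, here’s a minimal Python sketch that builds a bare-bones XML sitemap from a list of URLs. The `build_sitemap` helper and the example URLs are placeholders for illustration, and most CMS plugins (Yoast included) will generate a sitemap for you, so treat this as a picture of what the file contains rather than something you need to hand-roll.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs for illustration only.
sitemap_xml = build_sitemap([
    "https://www.yourwebsite.com/",
    "https://www.yourwebsite.com/blog/",
])
print(sitemap_xml)
```

Each `<loc>` entry is one page you’re asking Google to discover.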
Crawling
Once Google discovers the site, it’s in the queue for crawling. During a crawl, Googlebot will gather the metadata, title tags, image alt text, and more for the indexing process. Once the crawl is complete, the page is passed along for indexing.
Indexing
This is the last phase of the process, and it’s where Googlebot attempts to make sense of the information from the crawling phase. In other words, the indexer will determine how relevant the content is for a search query.
As you can see, your website MUST make it past the indexing phase for it to show up on search engines. Errors can occur during each phase, so it’s critical to familiarize yourself with GSC’s error reports.
Too technical? Check out HOTH Technical SEO.
The Basics of Google Search Console
Now that you know why GSC is such a powerful SEO tool – it’s time to learn how to use it. Before you can dive into its numerous features, you’ll need to verify ownership of your website within GSC. Otherwise, the Console won’t know which domains you own – so it won’t have any analytics to show you.
Also, you’ll want to claim ownership of all your domains and subdomains. That includes all the different variations of your domain, such as:
- http://yourwebsite.com
- https://yourwebsite.com
- http://www.yourwebsite.com
- https://www.yourwebsite.com
- Any other subdomains that don’t contain ‘www’ (e.g., blog.yourwebsite.com)
Google will treat each of these variations as a separate website – which is why it’s imperative to claim ownership of them all. If you forget even one of them, you’ll miss out on crucial reports and data for it.
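If you manage several domains, it can help to generate the list of variations rather than type them out. Here’s a small Python sketch that does that; the `property_variants` name and the example domain are hypothetical, and you’d still add each variation to GSC by hand.

```python
def property_variants(domain, subdomains=("www",)):
    """List the http/https variations of a domain and its subdomains."""
    hosts = [domain] + [f"{sub}.{domain}" for sub in subdomains]
    return [f"{scheme}://{host}" for host in hosts for scheme in ("http", "https")]

# Placeholder domain; pass extra subdomains such as "blog" if you use them.
for variant in property_variants("yourwebsite.com", subdomains=("www", "blog")):
    print(variant)
```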
To verify ownership, you’ll need to head to your website’s verification settings page. If you use Yoast SEO on WordPress, you can easily verify your website by using an HTML tag. To do so, head to the Verify Ownership page on GSC. That’s where you’ll find the HTML tag, so make sure to copy it.
Inside WordPress, open the Yoast SEO plugin and paste the code into the ‘Google Verification Code’ text box in the ‘Webmaster Tools’ tab.
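Once the tag is in place, you can double-check that it actually made it into your page’s HTML. The sketch below uses Python’s built-in HTML parser to look for a `google-site-verification` meta tag. The sample HTML and the `EXAMPLE_TOKEN` value are made up for illustration; in practice you’d run it against your homepage’s source.

```python
from html.parser import HTMLParser

class VerificationTagFinder(HTMLParser):
    """Collects the content of every google-site-verification meta tag."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "google-site-verification":
            self.tokens.append(attr_map.get("content"))

def find_verification_tokens(html):
    finder = VerificationTagFinder()
    finder.feed(html)
    return finder.tokens

# Made-up page source; EXAMPLE_TOKEN stands in for the code GSC gives you.
sample_html = (
    '<html><head>'
    '<meta name="google-site-verification" content="EXAMPLE_TOKEN" />'
    '</head><body></body></html>'
)
print(find_verification_tokens(sample_html))
```

An empty list means the tag isn’t on the page, so GSC won’t be able to verify it.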
The performance tab
Now that you’ve verified ownership of your domains, you can use the Performance Tab to view your analytics. In it, you’ll be able to clearly see which keywords you’re ranking for on Google. There’s also a myriad of other helpful information and metrics, such as:
Number of clicks
This metric lets you know how many people actually clicked on your website through Google’s SERPs. If your number of clicks is low – it can be a sign that your title tags and meta descriptions need some work. If users aren’t clicking on them, they aren’t enticing enough to warrant a click.
Total impressions
What’s an impression? It’s whenever your website pops up in the results for a given keyword. An example would be if you sell guitars and your website shows up for the keyword ‘guitar sales.’ That would be considered an ‘impression,’ and this metric measures how many you receive.
Average CTR (click-through rate)
Your click-through rate is the percentage of impressions that resulted in a click, meaning the share of times your page showed up in a search and the user clicked through to it. In general, higher rankings translate to higher click-through rates. If you want to strengthen your CTR, try rewriting your title tag and meta description to make them more appealing (you can try including a call-to-action, for example).
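CTR is simply clicks divided by impressions, expressed as a percentage. Here’s that arithmetic as a short Python sketch; the function name and the numbers are invented for illustration.

```python
def click_through_rate(clicks, impressions):
    """CTR as a percentage: clicks divided by impressions, times 100."""
    if impressions == 0:
        return 0.0  # No impressions means there is nothing to click on.
    return 100.0 * clicks / impressions

# Example: 50 clicks from 2,000 impressions.
print(click_through_rate(50, 2000))  # 2.5
```

So a page with 2,000 impressions and 50 clicks has a CTR of 2.5%.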
Average position
This refers to the average ranking position you had for a particular keyword or page. That said, it tends to be the least reliable metric of the bunch, as rankings can vary heavily from user to user. It’s still useful as a sanity check on what the other three metrics are telling you.
The performance tab will be your go-to when measuring the success of your SEO efforts. Measuring, analyzing, and tweaking your results through GSC can bolster your digital marketing strategy – so it’s a tool well worth your while.
Index coverage tab and reports
Remember, your pages won’t bring you any SEO value if they aren’t indexed by Google. To make sure that Google has indexed each submitted URL, you can head over to the Index Coverage Tab.
This page will let you know how many of your pages are in Google’s index since the latest update, how many are not, and if any errors occurred during indexing.
Note: You’ll want to view the Index Coverage Report for not only the primary version of your website but all its other versions as well (e.g., the domain variations we listed before, such as http://www.yourwebsite.com).
To keep it simple, we’ll focus on troubleshooting the primary version of your website for now.
To run the report, pull up the Index Coverage Tab first. You’ll see a dashboard containing your index coverage, performance search results, and more. Scroll to the Index Coverage graph, and click on Open Report in the top right-hand corner.
From here, you’ll be able to discover common Google Search Console errors, indexing issues, coverage issues, and other types of problems. It’s also where you’ll be able to do maintenance to resolve and prevent these errors. The report breaks issues down into four different categories:
- Errors. This is where GSC will notify you of any major problems that took place during the discovery, crawling, and indexing process.
- Valid. These pages were indexed with no problems.
- Valid with warnings. The pages were indexed, but there are a few issues that you should look at.
- Excluded. Pages that were left out of the index, most often because they were set to noindex. Examples of pages that you want to leave out of the indexing process include admin pages, thank you pages, author archives, and more. In other words, you should noindex pages that you don’t want to drive traffic to.
You’ll want to pay the most attention to the Errors category, as it contains problems with important pages that you wanted to get indexed but weren’t. There are many types of errors that can occur, including:
- Server errors (5xx)
- Redirect errors
- Blocked by robots.txt file
- Marked ‘noindex’
- Soft 404 errors
- 404 Not Found
Let’s dive into each issue to discover why they occur, how to fix them, and how to prevent them from happening again in the future.
Server Errors (5xx)
There are a few different types of server errors, but the most common is that your server took too long to respond. When Googlebot crawls a website, it only waits a set period of time for the server to respond. If the server takes too long, Googlebot will give up, and the request will time out. Since the bot can’t crawl the page, it won’t get indexed, either.
People often confuse server errors with DNS (Domain Name System) errors.
A DNS error means the bot can’t even look up your URL in the first place, making it a discovery error, not a crawling one.
Server errors take place during the crawling phase. Google can discover your URL, but the server fails to respond properly when the bot tries to crawl it. Whenever a problem with the server occurs, it will show up as 5xx in GSC.
What’s that mean?
5xx refers to any HTTP status code that begins with 5. GSC groups all of them under 5xx, with 500, 502, and 503 being the most common. Here’s what each error code means:
- 500: Internal Server Error. The server ran into an unexpected problem that prevented it from completing the request. It could be a coding error in the CMS, improper PHP code, or a thousand other reasons.
- 502: Bad Gateway. You’ll get a 502 whenever a server acting as a gateway or proxy receives an invalid response from an upstream service. This upstream service could be running on the same machine or another machine. Regardless, something is causing it to malfunction. If you get a 502, it may be due to a problem with your WordPress CMS.
- 503: Service Unavailable. If your server is too busy or is down for maintenance, you’ll get a 503 error. It means that the server is temporarily unavailable, but will be back later. If your server is contending with a heavy amount of traffic, it may trigger a 503 error if a bot is trying to crawl your website.
Those are by no means the only error codes starting with 5 (there are PLENTY more), but these tend to be the most prevalent.
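If you’d like to look up what a given code means without memorizing the list, Python’s standard library knows the official names. Here’s a small sketch; the `describe_status` and `is_server_error` helper names are just for illustration.

```python
from http import HTTPStatus

def describe_status(code):
    """Return 'CODE Phrase' for a known HTTP status code."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        return f"{code} (unrecognized status code)"

def is_server_error(code):
    """True for any status code in the 5xx range."""
    return 500 <= code <= 599

for code in (500, 502, 503):
    print(describe_status(code), "->", is_server_error(code))
```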
How urgent is a server error?
If a server error pops up on GSC, you should strive to fix it as soon as possible. Server errors are incredibly urgent as they’re fundamental errors that will harm your site and negatively affect your search engine optimization.
The first step is to ensure that Google can discover your website. As such, you should make sure that your domain resolves correctly through DNS. Once you rule that out, you’ll know that you’re dealing with a server error occurring during the crawling phase.
In addition to fixing server errors, you should put preventative measures in place to stop them from happening again.
That’s because GSC reports server errors after the fact. Even if your website is running fine by the time they show up in the report, they already got in the way of past crawls. That’s why you’ll want to make sure that they don’t happen again.
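One preventative measure is to watch your own server logs for 5xx responses served to Googlebot, so you spot trouble before it lands in GSC. The sketch below assumes your access log uses the common “combined” format and that the crawler identifies itself with “Googlebot” in the user agent (which can be spoofed). The sample log lines are invented, so adjust the pattern to whatever your server actually writes.

```python
import re
from collections import Counter

# Assumes combined log format: "METHOD path protocol" status size "referer" "user agent"
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_server_errors(log_lines):
    """Count 5xx responses served to Googlebot user agents, per path."""
    errors = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # Skip lines that do not fit the assumed format.
        if "Googlebot" in match["agent"] and match["status"].startswith("5"):
            errors[match["path"]] += 1
    return errors

# Invented sample lines for illustration.
sample_log = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 503 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /pricing HTTP/1.1" 503 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    '66.249.66.1 - - [10/Oct/2023:13:56:02 +0000] "GET /blog HTTP/1.1" 200 8841 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
print(googlebot_server_errors(sample_log))
```

Any path that keeps showing up in the output is a good place to start investigating.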
How to fix server errors
Google has an official diagnostic you can run to see if the search engine can crawl your website or not. It’s the URL Inspection tool in GSC, which replaced the older Fetch as Google feature. If a live test retrieves your homepage (or another specific page) successfully, you know that Google can crawl it without any issues.
You’ll also want to diagnose the specific type of server error that you’re experiencing. For example, is it an internal server error or a bad gateway? Knowing this is imperative if you want to resolve the problem – so pay attention to the 5xx code you get from GSC. Here are a few methods to try to fix server errors:
- Refresh the page. The problem may be temporary, so a simple refresh may be all you need to do to fix it.
- Clear the browser cache. This is another simple fix to try before you dive deeper into debugging your site. If it doesn’t work, you know something else is the culprit.
- Check your CMS. There may be corrupted files in your WordPress (or other CMS) database. Try reinstalling plugins and themes, as well as reinstalling WordPress. You should also check your JavaScript and CSS, as faulty lines of code may be causing your issues.
- Check your PHP memory limit. Lastly, you may have exhausted your PHP memory limit – which can cause server issues.
These are by no means the only fixes, but they should help you get started with diagnosing the issue. For additional help, you can consult the GSC help page to fix server errors.
Redirect Errors
Sometimes Google runs into redirect errors with URLs. If you’ve redirected your URL more than a few times, this type of error can occur. The common causes of redirect errors include:
- A redirect loop. If you’ve used redirects for a while, you could have inadvertently created a redirect loop. That’s where redirects lead to other redirects and never point to a live URL.
- The redirect chain was too long. Even if the redirects don’t loop, sometimes there are too many redirects in a row, and the Googlebot gives up.
- A bad or empty URL in the chain. All it takes is one bad apple to ruin a redirect chain. If the bot comes across a bad URL or an empty one, that’s the end of that. The bot will give up and display a redirect error.
- The redirect URL exceeded the max character length. If a URL in the chain is longer than the maximum length Googlebot accepts, the bot won’t follow it, and the page won’t get crawled.
Since Google has so much content to crawl, it doesn’t mess around with lengthy redirects. The good news is that you can solve these issues by using a single redirect that goes to the final URL. You can also use a URL inspection tool to uncover and fix redirect errors.
How to fix redirect errors
To fix a redirect error, you’ll need to identify the original redirect and the final URL. There are various SEO tools out there that can help with this, such as SEO Minion. It will provide the entire redirect path for you to observe. That way, you can identify which area needs tweaking.
Do your best to cut out the middle steps – as they’re likely what’s causing the redirect error. Instead, keep it simple with one redirect and one URL at the end. Once you’ve done that, run GSC’s Index Coverage Report again to see if the problem is gone.
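If you keep your redirects in a simple source-to-target map, you can check them for loops and long chains yourself. Here’s a minimal Python sketch of that idea. The `MAX_HOPS` limit, the function names, and the example paths are all assumptions for illustration; Google doesn’t publish this as an official threshold, and the sketch works on a plain dictionary rather than live URLs.

```python
MAX_HOPS = 5  # Hypothetical limit for this sketch, not an official Googlebot value.

def resolve_redirects(start, redirects, max_hops=MAX_HOPS):
    """Follow `redirects` (source -> target) starting from `start`.

    Returns (status, path), where status is 'ok', 'loop', 'too_long', or 'empty'.
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        target = redirects[current]
        if not target:
            return "empty", path            # Bad or empty URL in the chain.
        if target in seen:
            return "loop", path + [target]  # We have been here before.
        if len(path) > max_hops:
            return "too_long", path         # Next hop would exceed the limit.
        path.append(target)
        seen.add(target)
        current = target
    return "ok", path

def flatten(redirects):
    """Point every source that resolves cleanly straight at its final URL."""
    flat = {}
    for source in redirects:
        status, path = resolve_redirects(source, redirects)
        if status == "ok":
            flat[source] = path[-1]
    return flat

# Made-up redirect map: one two-step chain and one loop.
redirects = {"/old": "/older", "/older": "/new", "/a": "/b", "/b": "/a"}
print(resolve_redirects("/old", redirects))
print(resolve_redirects("/a", redirects))
print(flatten(redirects))
```

The `flatten` step is the “one redirect and one URL at the end” advice in code form: every old URL points directly at the final destination, and looping entries are left out for you to fix by hand.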
Blocked by Robots.txt File
If this error pops up, it means a page you submitted for indexing is blocked by a rule in your robots.txt file.
What’s that?
A robots.txt file lets you tell search engine crawlers which parts of your site you don’t want them to crawl. As stated before, there are numerous reasons why you wouldn’t want certain pages to show up on search engines. These are primarily admin pages that contain no value to readers and aren’t part of your SEO strategy.
Note: You only need a robots.txt file if you have web pages that you don’t want crawled. If you don’t have a problem with Google crawling and indexing every page on your site, you don’t need one. All that will happen is that Googlebot will crawl your entire website. That’s not a big deal for smaller sites, but larger sites often have pages that they want to keep out of the search results.
A related problem occurs if you have a robots.txt file but Google cannot load it. In that case, Google may hold off on crawling your site until it can read the file. Either way, it’s an urgent issue because the affected pages won’t be crawled or indexed until you fix it.
How to fix blocked by robots.txt file
A reliable way to fix this issue is to use a robots.txt tester. It will let you know if there’s an issue with the file or not. Beyond using this tool, you’ll want to do a manual inspection to make sure that the file is properly configured.
Go through the file and make sure that it’s not blocking any pages that you want Google to crawl.
Beyond that, you’ll want to look for one line in particular: ‘Disallow: /’. On its own, this rule blocks the entire site for whichever crawlers it applies to. Unless you added it on purpose, remove it immediately, as it will prohibit your website from being crawled by Google.
If you’re still experiencing issues and don’t know why, it’s best to delete your robots.txt file for the time being. That’s because it’s better to go without a robots.txt file than have one that’s misconfigured. If you don’t have one, Google will crawl your website like normal. If you have one that’s not set up right, the crawling won’t take place until you resolve the issue.
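For a quick manual check, Python ships with a basic robots.txt parser you can use to see which URLs a given set of rules would block. Here’s a minimal sketch; the rules and URLs are invented for illustration. Keep in mind that Python’s parser doesn’t implement every extension Google supports (wildcards, for example), so treat it as a first pass and confirm with Google’s own tools.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules block for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]

# Invented rules and URLs for illustration.
robots_txt = """\
User-agent: *
Disallow: /wp-admin/
"""
urls = [
    "https://www.yourwebsite.com/",
    "https://www.yourwebsite.com/wp-admin/options.php",
]
print(blocked_urls(robots_txt, urls))
```

If you swap the rule for a bare `Disallow: /`, every URL in the list comes back as blocked, which is exactly why that line is so dangerous.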
Marked ‘Noindex’
This is a common problem, but it’s also a relatively simple one to fix. What happens is you tell Google to crawl a page – but you don’t know or remember that the page has a noindex tag. In other words, you’re giving Google some seriously mixed signals.
You may have noindexed pages by mistake as well. For example, an X-Robots-Tag HTTP response header can noindex a page. Since it doesn’t appear in the page’s HTML, it’s more difficult to spot.
To fix a URL marked as noindex, you need to remove the noindex directive, whether it’s a meta tag or an HTTP response header. To discover which pages are marked as noindex, you can check each page’s robots meta tag and its HTTP response headers. Once you remove the directive, the issue will resolve. So if you see a marked noindex error show up in your notifications, don’t panic. All you’ll need to do is remove the noindex directive, and you’ll be good to go.
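Here’s a minimal Python sketch of that check. Given a page’s HTML and its response headers, it reports where a noindex directive is coming from. The `noindex_sources` helper and the sample inputs are made up for illustration, and it only looks at the two places mentioned above: the robots meta tag and the X-Robots-Tag header.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr_map.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]

def noindex_sources(html, headers):
    """Return a list naming every place a noindex directive was found."""
    sources = []
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        sources.append("meta tag")
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            sources.append("X-Robots-Tag header")
    return sources

# Made-up page and headers for illustration.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(noindex_sources(page, {"Content-Type": "text/html"}))
print(noindex_sources("<p>Hello</p>", {"X-Robots-Tag": "noindex"}))
```

An empty list means neither source is telling Google to skip the page.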
Soft 404 Errors
You’ve likely run across a 404 Not Found page during your time browsing the internet – but what’s a Soft 404?
A Soft 404 occurs whenever a page displays 200 (which means found) but should display 404 (Not Found). In other words, the page says that it’s there, but it really isn’t. It’s a discrepancy that you won’t see on the content side of the page. All you’ll see (and users will see) is what appears to be a standard 404 Not Found page.
Yet, the crawler sees a 200 status code, not a 404 or 410 (which means gone). That’s what causes a ‘soft’ 404 error, and they can be very puzzling if you don’t know what causes them. It means the HTTP response does not return the 404 or 410 code for a non-existent page.
If a page doesn’t exist, your server should always return a 404 or 410 for it. Fixing this issue is as simple as correcting the HTTP status code.
Besides 200, a Soft 404 can occur due to a 301 redirect that points to non-related pages. An example would be a 301 redirect that sends you back to the homepage. So if you’re going to use 301 redirects, make sure that they’re for related content. If you use a ton of 301 redirects that point to the homepage, they’ll likely show up as Soft 404s.
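To get a feel for how a Soft 404 differs from a real one, here’s a rough Python heuristic that labels a response based on its status code and content. The phrases in `NOT_FOUND_PHRASES` and the function name are assumptions for illustration. Google’s actual detection isn’t public, so this only mirrors the two cases described above.

```python
# Assumed phrases that suggest a "not found" page; tune these for your own templates.
NOT_FOUND_PHRASES = ("page not found", "no longer available", "doesn't exist")

def classify_response(status, body, redirects_to_homepage=False):
    """Label a response as 'hard 404', 'possible soft 404', or 'ok'."""
    if status in (404, 410):
        return "hard 404"  # The server correctly reports the page as missing.
    looks_missing = any(phrase in body.lower() for phrase in NOT_FOUND_PHRASES)
    if status == 200 and looks_missing:
        return "possible soft 404"  # Says "found" but shows a not-found page.
    if status == 301 and redirects_to_homepage:
        return "possible soft 404"  # Redirect to an unrelated page.
    return "ok"

print(classify_response(200, "<h1>Page Not Found</h1>"))
print(classify_response(404, "<h1>Page Not Found</h1>"))
print(classify_response(301, "", redirects_to_homepage=True))
```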
The urgency of a Soft 404 depends on the pages it appears on. Pages that aren’t crucial to your operations that have Soft 404s aren’t very urgent. Yet, if Soft 404s show up on essential pages to your business, you’ll want to solve them ASAP.
404 Not Found
You’ll see a 404 Not Found error whenever Google attempts to crawl a web page on your site that doesn’t exist. It finds these errors whenever other pages or sites link to a non-existent page.
In Google’s Guidelines, they state that 404 pages on your site will not affect your ability to rank in the SERPs.
Yet, that doesn’t mean that you should ignore them by any means. While a few 404s on non-essential pages won’t mean much, 404s on crucial pages will hurt your SEO. Also, sometimes 404 errors only show up on mobile devices such as smartphones.
You’ll have a few options when dealing with 404 errors. If the page is of no relevance to your business, it’s ok to let it 404. If you’d rather redirect users to a relevant page, you can use a 301 redirect.
You can also check your CMS to make sure that the page is indeed published and not only in draft mode. If it’s listed amongst your valid pages, then you’ll know something else is afoot. Also, make sure that the URL returning the 404 is the correct variation of the page’s address.
Concluding Thoughts: Common Google Search Console Errors
That’s a breakdown of the most prevalent crawl errors that may show up on GSC. Remember, to rank on search engines, you’ll need Googlebot to be able to successfully discover, crawl, and index your site.
Numerous errors can occur at each of these phases, which is why GSC is such a useful tool. Remember to check the response code for each URL to make sure they add up. Also, you should test live URLs to make sure there are no redirect errors.
As long as you keep a keen eye on the Index Coverage Tab and your valid URLs, you should be able to keep your domains running smoothly. That way, Google won’t have trouble crawling and indexing them, and your SEO efforts will get the chance to bear fruit.
Do you want your domains to run error-free so you can generate the most traffic and revenue? If so, please don’t wait to schedule a call with our expert consultants. Our team can help you revolutionize the way you approach SEO.
If you don’t have time to run your own strategy, we’ll take care of everything for you with our HOTH X fully-managed SEO services.