
How to Smash Nasty Crawl Errors



Once you have optimized and adjusted the way Google indexes your site, you need to check the Crawl section of Google Search Console to correct any problems the Googlebot found. This is a vital step: if the Googlebot has issues with or cannot find many of the pages on your site, Google may conclude that your site is down or poses a hazard to visitors, and your page ranking can suffer severely if Google doubts the relevance of your website.

Crawl Errors

Crawl Errors are a list of URLs on a website that the Googlebot attempted to index but could not. If these errors appear, there is no reason to be alarmed, because many are easy to correct. They might be 404 errors, meaning the page doesn't exist or its name was changed, or 500 Internal Server Errors, which indicate that something has gone wrong on the website's server.
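The two error families above can be told apart by their HTTP status-code range: the 400s are client errors (the page is missing), the 500s are server errors (the page exists but the server failed). A quick sketch in Python (the helper name `classify_crawl_error` is hypothetical, purely for illustration):

```python
def classify_crawl_error(status_code: int) -> str:
    """Map an HTTP status code to the crawl-error family it belongs to."""
    if 400 <= status_code < 500:
        return "client error (e.g. 404 Not Found: missing or renamed page)"
    if 500 <= status_code < 600:
        return "server error (e.g. 500 Internal Server Error)"
    return "not a crawl error"

print(classify_crawl_error(404))  # client error
print(classify_crawl_error(500))  # server error
```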

404 errors are common and not difficult to fix. If you are getting 404 errors, you may have deleted a post, changed the name of a URL, linked to a post that no longer exists on your site, or linked to an external page that is no longer there. If an internal link points to a missing page, update the link to another post on your site that is relevant to the same subject. For a missing external link, replace it with another external page that covers similar material. For other 404 errors, sometimes you only need to fix a broken link.
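If the deleted or renamed URL still receives traffic, a 301 redirect sends both visitors and the Googlebot to the replacement page. A minimal sketch, assuming an Apache server and hypothetical paths /old-post/ and /new-post/ (place this in the site's .htaccess file):

```apache
# Permanently redirect the deleted post to its replacement
Redirect 301 /old-post/ https://www.example.com/new-post/
```

Once the Googlebot re-crawls the old URL and sees the redirect, the 404 should clear from the report.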

A 500 error is also nothing to panic about, but it is a bit more challenging than a 404 error. With a 500 Internal Server Error, it is possible that the server had a glitch that has since been corrected but the Googlebot's record has not yet been updated. The error can show up for a variety of reasons. The video below contains more information on such crawl errors:

Crawl Stats

The Crawl Stats tool chronicles Googlebot activity on your website for the last 90 days. It takes into account all of the files the Googlebot has downloaded from your website, including CSS, JavaScript, PDF, and image files. The Crawl Stats report records, per day, how many pages were crawled, how many kilobytes were downloaded, and how much time the downloads took. You will see spikes when you have added a lot of new information or have information Google deems to be very useful. At Netsville, for example, we recently had a large spike for this article.

Fetch as Google

Fetch as Google is a tool that checks whether Google can access a particular page on your site and returns the response code and HTML for the page. The Fetch and Render option goes further: it accesses the page, renders how it will be displayed, and checks whether the Googlebot can access page resources such as images and scripts.


Robots.txt

A robots.txt file directs Googlebot and other search engine robots (web crawlers) on how to crawl and index pages on a website. The robots.txt Tester tool shows you the code of the file and the last time Google read it. The file can be set to block all web crawlers from all content or from a specific folder on a site.
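The blocking rules described above can be sketched in a minimal robots.txt (the /private/ folder name is a hypothetical example):

```
# Block all crawlers from a non-public folder
User-agent: *
Disallow: /private/

# To block the entire site instead (use with caution):
# User-agent: *
# Disallow: /
```

The file must live at the root of the domain (e.g. https://www.example.com/robots.txt) for crawlers to find it.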

Moz.com has a very good cheat sheet on the many ways robots.txt can be set up. You may want to block crawlers from scanning non-public parts of your site or from indexing duplicate content. With the Tester you can check whether individual pages are blocked, instruct crawlers to allow or disallow certain pages, and verify that you are not blocking pages that should remain accessible.

At the bottom of the page in this section, you can also see how many syntax warnings and logic errors appear in your file. Syntax warnings flag characters or strings of code that are incorrectly written and fail to execute a command. Logic errors are bugs that cause a program to operate incorrectly. Both need to be fixed so the server does not fail to output the page.

Sitemaps

Having a sitemap is beneficial because, if you are continuously updating your content, it helps you maintain a higher search ranking: a sitemap alerts Google that your content has been updated. Ideally, the sitemap should be updated every time you add new content to your site so that the new pages are indexed and visible in Google's search results as soon as possible. This is especially important if you have a very large website that changes frequently.
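A sitemap is just an XML file listing your URLs and when they last changed. A minimal sketch following the sitemaps.org format (the URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/how-to-smash-crawl-errors/</loc>
    <lastmod>2016-06-01</lastmod>
  </url>
</urlset>
```

Updating the lastmod value whenever a page changes tells Google which pages to re-crawl first.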

URL Parameters

URL Parameters are added pieces of text that appear at the end of a permalink, such as a country code if your website sells products globally. They tell Google that you have multiple web pages containing the same content aimed at different countries, with the unique parameters defining the differences between the pages. The URL Parameters tool will tell you if the Googlebot is experiencing any problems and whether the parameters need to be configured.
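For example, the same product page might appear under several parameterized URLs (a hypothetical example, using a country parameter):

```
https://www.example.com/shop/widget?country=us
https://www.example.com/shop/widget?country=uk
https://www.example.com/shop/widget?country=de
```

The tool lets you tell Google whether a parameter like country actually changes the page's content or merely localizes it, so the variants are not treated as duplicate pages.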

It is not recommended that you do anything with this section unless you are very familiar with how parameters work. If you have reason to believe that problems exist, contacting an IT technician would be the best option to avoid breaking anything on your site.

Security Issues

If your site has been attacked, you will receive a notice in the Security Issues section. In the event you receive a warning, it is possible that your site has been hacked, and users will see a warning about your site on the search engine results page (SERP). This is another problem where the best solution is to get an IT technician involved.

Other Resources

Google is great about providing extra help when you need it. On the Other Resources page you will find information beyond Google Search Console to help you optimize your site even further, plus a link to Webmaster Academy to advance your knowledge of all of the tools we discussed in this series of articles. Here is the list of all the resources Google provides for help and testing:

Resources 

Based in Rochester, New York, Netsville is an Internet Property Management company that has specialized in Digital Marketing, Technical, and Business Solutions for its customers since 1994. For more information, please click here.