Web crawler visit frequency and depth are determined by a set of adaptive algorithms that search engine developers prefer not to disclose. When you run an advertising campaign to promote your goods or services over the Internet, or even just update your website content, it is important to understand how, and how quickly, search engines will pick up these changes.
Every search engine uses its own proprietary algorithms to determine the optimal crawl frequency for a website. When you analyse web crawler behaviour, it is important to know what is normal and what is an abnormal deviation. From a statistical point of view, it makes sense to analyse web crawler behaviour as the number of visits over a period of time. This approach reveals both sudden short-term drops in the number of web crawler visits to your website and long-term downward trends. Detecting these situations early helps you to take timely action and stay near the top of the search results. For this purpose, RST Cloud provides the SEO report “Expected number of visits”, which can help you to detect anomalies and discover the general trend in the number of web crawler visits to your website over a period of time. This analysis can be done for all crawlers or for any single crawler you choose: Google, Bing, Yahoo, Yandex and others.
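The idea of separating normal variation from abnormal deviation can be illustrated with a short Python sketch. It is not RST Cloud's actual method, which is not described here; it assumes you already have a list of daily visit counts for one crawler, and the function names and the threshold of two standard deviations are illustrative choices.

```python
from statistics import mean, stdev


def find_anomalies(daily_visits, threshold=2.0):
    """Return (day_index, count) pairs that deviate from the mean
    by more than `threshold` sample standard deviations."""
    if len(daily_visits) < 2:
        return []
    mu = mean(daily_visits)
    sigma = stdev(daily_visits)
    if sigma == 0:
        return []  # all days identical, nothing stands out
    return [
        (day, count)
        for day, count in enumerate(daily_visits)
        if abs(count - mu) > threshold * sigma
    ]


def linear_trend(daily_visits):
    """Least-squares slope in visits per day; a negative value
    suggests a long-term decline."""
    n = len(daily_visits)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(daily_visits)
    numerator = sum((x - x_mean) * (y - y_mean)
                    for x, y in enumerate(daily_visits))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator
```

A single bad day inflates both the mean and the standard deviation, so on short series a more robust measure such as the median may be a better baseline.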
Apart from the general web crawler visit statistics analysis, it is helpful to analyse the behaviour of each web crawler individually. The SEO report “Real number of visits” provides you with that information.
After investigating the number of visits of each web crawler you can see which web crawlers rarely visit your website and take action to promote your website within the corresponding search engine.
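A per-crawler visit count of this kind can be approximated from your own access logs. The sketch below assumes you have already extracted the User-Agent strings from the log; the token table and the `min_visits` cut-off are illustrative assumptions, and matching on User-Agent alone does not verify that a request really came from the named search engine.

```python
from collections import Counter

# Substrings commonly found in crawler User-Agent headers (an
# illustrative, incomplete table; extend it for your own needs).
CRAWLER_TOKENS = {
    "googlebot": "Google",
    "bingbot": "Bing",
    "slurp": "Yahoo",
    "yandexbot": "Yandex",
    "baiduspider": "Baidu",
}


def identify_crawler(user_agent):
    """Return the search engine name for a User-Agent, or None."""
    ua = user_agent.lower()
    for token, name in CRAWLER_TOKENS.items():
        if token in ua:
            return name
    return None


def visits_per_crawler(user_agents):
    """Count requests per known crawler, ignoring everything else."""
    counts = Counter()
    for ua in user_agents:
        name = identify_crawler(ua)
        if name is not None:
            counts[name] += 1
    return counts


def rare_crawlers(counts, min_visits=10):
    """Known crawlers seen fewer than `min_visits` times,
    including those that never visited at all."""
    return sorted(
        name for name in set(CRAWLER_TOKENS.values())
        if counts.get(name, 0) < min_visits
    )
```

Crawlers returned by `rare_crawlers` are the candidates for extra promotion effort within the corresponding search engine.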
Every web crawler has its own unique content parsing techniques. More often than not, different web crawlers will see the same web page differently. The SEO report “Return codes by crawlers” allows you to track these situations.
After you have found the discrepancies in how your website is presented to different web crawlers, you can do a further analysis to detect the problematic pages and optimise them for the specific web crawlers. For example, a 404 error code may indicate that a page was deleted, so a crawler would remove it from the index. Sometimes your internal links contain spelling errors in the URL, and you can correct these errors after the investigation.
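One way to look for such discrepancies in your own logs is to tabulate return codes per crawler and per URL. The following Python sketch assumes the log has already been parsed into `(crawler, url, status)` tuples; the record layout and function names are hypothetical and not taken from the RST Cloud report.

```python
from collections import defaultdict


def latest_status_by_url(records):
    """Map each URL to {crawler: last status code seen}.

    `records` is an iterable of (crawler, url, status) tuples
    in chronological order.
    """
    table = defaultdict(dict)
    for crawler, url, status in records:
        table[url][crawler] = status
    return dict(table)


def inconsistent_urls(records):
    """URLs for which crawlers received different status codes."""
    return {
        url: statuses
        for url, statuses in latest_status_by_url(records).items()
        if len(set(statuses.values())) > 1
    }


def urls_with_status(records, status=404):
    """Sorted URLs that returned `status` to at least one crawler."""
    return sorted({url for _, url, code in records if code == status})
```

URLs reported by `urls_with_status` with the default 404 are a natural starting point when hunting for misspelled internal links.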
It is important to understand how web crawler visit frequency changes over time. The more often a web crawler visits your website, the faster new information reaches the search engine and becomes available in user search results. At RST Cloud we provide the SEO report “Revisit time statistics” for this purpose.
An increasing interval between web crawler visits to your website may indicate that you need to update content more frequently or publish more links on external resources. It may also indicate that the crawl frequency for the website should be adjusted by editing sitemap.xml or robots.txt, or configured in the crawler’s console if one is available.
User-agent: *
Crawl-delay: 5
<url>
  <loc>http://www.example.com/</loc>
  <lastmod>2016-01-01</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.5</priority>
</url>
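The revisit interval discussed above can also be estimated directly from visit timestamps. This is a minimal Python sketch that assumes you have a list of `datetime` objects for one crawler; the half-versus-half comparison in `is_slowing_down` is a deliberately crude, illustrative heuristic rather than the statistic used in the report.

```python
from datetime import datetime
from statistics import mean, median


def revisit_intervals(timestamps):
    """Hours between consecutive visits, in chronological order."""
    ordered = sorted(timestamps)
    return [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(ordered, ordered[1:])
    ]


def is_slowing_down(intervals):
    """True if the later half of the intervals is longer on average
    than the earlier half, i.e. the crawler returns less often."""
    if len(intervals) < 2:
        return False
    middle = len(intervals) // 2
    return mean(intervals[middle:]) > mean(intervals[:middle])


if __name__ == "__main__":
    visits = [
        datetime(2016, 1, 1, 0, 0),
        datetime(2016, 1, 1, 6, 0),
        datetime(2016, 1, 1, 18, 0),
    ]
    gaps = revisit_intervals(visits)
    print("intervals (h):", gaps, "median:", median(gaps))
```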
For successful website promotion and for your users’ convenience, try to avoid dead-end pages that have no links to the other pages of your website. To deal with these issues, it is important to see on which pages the indexing process starts and on which pages it ends. The SEO report “Crawlers activity profile” helps you to detect such pages.
Profiling a web crawler’s activity makes it possible to detect these problems. When a dead-end page exists, you will often see that a web crawler starts and stops indexing on the same page. This may indicate that the page is problematic for that web crawler.
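A simple profile of where crawling starts and stops can be sketched in Python as follows. The code assumes a list of `(timestamp, url)` hits for a single crawler and treats a pause of more than 30 minutes as the start of a new crawl session; both the record layout and the 30-minute gap are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def split_sessions(hits, gap=timedelta(minutes=30)):
    """Group (timestamp, url) hits into sessions separated by
    pauses longer than `gap`."""
    sessions = []
    for timestamp, url in sorted(hits):
        if sessions and timestamp - sessions[-1][-1][0] <= gap:
            sessions[-1].append((timestamp, url))
        else:
            sessions.append([(timestamp, url)])
    return sessions


def entry_exit_profile(hits, gap=timedelta(minutes=30)):
    """Count how often each URL is the first or last page of a session."""
    entries, exits = Counter(), Counter()
    for session in split_sessions(hits, gap):
        entries[session[0][1]] += 1
        exits[session[-1][1]] += 1
    return entries, exits


def same_start_and_stop(hits, gap=timedelta(minutes=30)):
    """URLs on which a session both starts and stops, which may
    point to a dead-end or otherwise problematic page."""
    suspects = Counter()
    for session in split_sessions(hits, gap):
        if session[0][1] == session[-1][1]:
            suspects[session[0][1]] += 1
    return suspects
```

Pages that appear often in `same_start_and_stop` but rarely in the middle of longer sessions are worth checking for missing internal links.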
Depending on your SEO strategy, a website should be adapted to different crawlers. It is therefore sometimes useful to look at the content of the most frequently visited pages and apply the same content shaping methods to the other pages of your website. The SEO report “Most indexing pages” shows the top 10 pages for a selected web crawler.
Also, if you select Googlebot and then Baiduspider, you may find that they index your site differently, which helps you to optimise it for your needs.
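A comparable top-pages list can be built from parsed log records with a few lines of Python. The sketch assumes `(crawler, url)` tuples as input; the function names are hypothetical and the comparison helper is only one simple way to contrast two crawlers.

```python
from collections import Counter


def top_pages(records, crawler, n=10):
    """The `n` URLs most often requested by `crawler`,
    as (url, hit_count) pairs in descending order."""
    counts = Counter(url for name, url in records if name == crawler)
    return counts.most_common(n)


def only_in_first(records, first, second, n=10):
    """URLs in the top `n` of `first` but not in the top `n` of `second`."""
    first_urls = {url for url, _ in top_pages(records, first, n)}
    second_urls = {url for url, _ in top_pages(records, second, n)}
    return sorted(first_urls - second_urls)
```

For example, `only_in_first(records, "Googlebot", "Baiduspider")` lists pages that rank among Googlebot's favourites but not among Baiduspider's.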
Apart from the aforementioned functionality, RST Cloud provides you with the SEO analysis tools for classical tasks such as:
RST Cloud uses statistical mechanisms to detect patterns in web crawler behaviour and helps you to better understand the logic behind it. Knowledge of common web crawler behaviour tendencies, as well as of individual web crawler activity, helps you to promote your goods and services effectively over the Internet.