How do robots “see” the world? And how to upgrade to the new version of Search Console
Promoting your website should include optimizing your pages to attract the attention of search engine spiders. Before you start creating a search engine friendly website, you need to know how bots view your site.
Search engine spiders are not the search engines themselves, but small programs that are sent out to analyze your site once they learn the URL of a page. Spiders can also reach your site through links to it left on other Internet resources.
As soon as the robot reaches your website, it will immediately begin indexing pages by reading the contents of the BODY tag. It also fully reads all HTML tags and links to other sites.
The search engine then copies the site's content into its main database for indexing. Altogether, this process can take up to three months.
Search engine optimization is not such an easy matter. You must create a site that is spider-friendly. Bots pay no attention to fancy Flash design; they just want information. If you looked at your website through the eyes of a search robot, it would look rather plain.
It’s even more interesting to look through the eyes of a spider at your competitors’ websites. Competitors not only in your field, but simply popular resources that may not need any search engine optimization. In general, it’s very interesting to see how different sites look through the eyes of robots.
Text only
Search robots see your site much as text browsers do. They love text and ignore information contained in pictures. A spider can learn something about a picture only if you remember to add an ALT attribute with a description. This is why web designers who create complex websites with beautiful pictures and very little text are often deeply disappointed.
In fact, search engines simply love text, and they can only read HTML code. If a page is dominated by forms, JavaScript, or anything else that keeps a search engine from reading the HTML content, the spider will simply ignore it.
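To get a feel for this, here is a minimal sketch (Python, standard library only) of how a crawler reduces a page to plain text: everything inside script, style, and noscript tags is ignored, and only readable text survives. The class and function names are ours, chosen for illustration.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collects only the text a crawler can read, skipping
    the contents of <script>, <style>, and <noscript> tags."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many SKIP tags we are currently inside
        self.chunks = []  # readable text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())


def spider_text(html: str) -> str:
    """Return the page the way a text-only robot sees it."""
    parser = SpiderView()
    parser.feed(html)
    return " ".join(parser.chunks)
```

Feeding it a page full of JavaScript makes the point: the script body simply vanishes, and only the visible words remain.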
What search robots want to see
When a search engine crawls your page, it looks for a number of important things. Once it has stored a copy of your site, the search robot begins to rank it according to its algorithm.
Search engines guard and frequently change their algorithms so that spammers cannot adapt to them. It is very difficult to design a website that will rank high in all search engines, but you can gain some advantage by including the following elements in all your web pages:
- Keywords
- META tags
- Titles
- Links
- Emphasized (highlighted) text
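The elements in the list above are exactly what a simple parser can pull out of a page. A sketch using Python's standard `html.parser` (the tag and attribute names are standard HTML; the class name is ours):

```python
from html.parser import HTMLParser


class SignalParser(HTMLParser):
    """Collects the on-page elements a spider cares about:
    the <title>, META tags, links, and ALT text for images."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.metas = {}   # meta name -> content
        self.links = []   # href values of <a> tags
        self.alts = []    # alt attributes of <img> tags
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in a:
            self.metas[a["name"].lower()] = a.get("content", "")
        elif tag == "a" and "href" in a:
            self.links.append(a["href"])
        elif tag == "img":
            # an empty alt means the picture is invisible to the robot
            self.alts.append(a.get("alt", ""))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
```

A page that scores well on these signals gives the robot a title, a description, crawlable links, and a caption for every image.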
Read like a search engine
After you have developed a website, you still have to promote it in search engines. But looking at the site only in a browser is neither the best nor the most reliable technique: it is hard to evaluate your own work impartially.
It is much better to look at your creation through the eyes of a search simulator. In this case, you will get much more information about the pages and how the spider sees them.
We have created a search engine simulator that is, in our humble opinion, not bad. It lets you see a web page as a search spider sees it, and it also shows the number of keywords you entered, local and outbound links, and so on.
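Under the hood, a simulator like this does little more than count keyword occurrences and split links into local and outbound. A minimal sketch of those two checks; the function names are ours, and `own_host` is assumed to be your site's hostname:

```python
from urllib.parse import urlparse


def classify_links(links, own_host):
    """Split hrefs into local (relative or same-host) and outbound,
    the way a simple search-engine simulator would."""
    local, outbound = [], []
    for href in links:
        host = urlparse(href).netloc
        (local if host in ("", own_host) else outbound).append(href)
    return local, outbound


def keyword_count(text, keyword):
    """Count whole-word, case-insensitive occurrences of a keyword."""
    return text.lower().split().count(keyword.lower())
```

For example, `classify_links(["/about", "https://other.com/x"], "example.com")` treats the relative `/about` as local and the `other.com` URL as outbound.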
Upgrade guide for legacy users
We are developing a new version of Search Console, which will eventually replace the old service. In this guide, we will cover the main differences between the old and new versions.
General changes
In the new version of Search Console we have implemented the following improvements:
- Search traffic data can be viewed for 16 months instead of the previous three.
- Search Console now provides detailed information about specific pages. This information includes canonical URLs, indexing status, degree of mobile optimization, etc.
- The new version includes tools that allow you to monitor the crawling of your web pages, fix related errors and submit requests for re-indexing.
- The updated service offers both completely new tools and reports, as well as improved old ones. All of them are described below.
- The service can be used on mobile devices.
Comparison of tools and reports
We're constantly working to improve various Search Console tools and reports, and you can already use many of them in the updated version of this service. Below, the new versions of reports and tools are compared with the old ones. The list will be updated.
| Old report | Analogue in the new Search Console | Comparison |
| --- | --- | --- |
| Search Analytics (search query analysis) | Performance | The new report provides data for 16 months, and it has become more convenient to work with. |
| Rich Cards | Rich result status reports | The new reports provide detailed information that helps troubleshoot errors and make it easy to request rescans. |
| Links to your site; Internal links | Links | We have merged two old reports into one new one and improved the accuracy of link counting. |
| Index Status | Index Coverage report | The new report contains all the data from the old one, plus detailed information about each URL's status in the Google index. |
| Sitemaps report | Sitemaps report | The data in the report remains the same, but we have improved its design. The old report supports testing a sitemap without submitting it; the new one does not. |
| Accelerated Mobile Pages (AMP) | AMP status report | The new report adds new error types for which you can view information, and it also lets you submit a rescan request. |
| Manual Actions | Manual Actions | The new version of the report provides a history of manual actions taken, including information about submitted review requests and their results. |
| Fetch as Google | URL Inspection tool | In the URL Inspection tool, you can view information about the indexed version of a URL and the version available online, and submit a crawl request. It adds information about canonical URLs, noindex and nocrawl blocks, and whether the URL is in the Google index. |
| Mobile Usability | Mobile Usability | The data in the report remains the same, but working with it has become more convenient. We have also added the ability to request a rescan of a page after mobile usability issues have been fixed. |
| Crawl Errors | Index Coverage report and URL Inspection tool | Site-level crawl errors are shown in the new Index Coverage report; for errors on individual pages, use the new URL Inspection tool. The new reports help you prioritize issues and group pages with similar problems to identify common causes. The old report showed all errors from the last three months, including irrelevant, transient, and minor ones; the new report highlights issues found over the past month that matter to Google: only problems that could cause a page to be removed from the index or prevent it from being indexed, shown in order of priority. For example, 404 errors are flagged as errors only if you requested indexing of the page through a sitemap or another method. This lets you focus on issues that affect your site's position in the Google index rather than wading through every error Googlebot has ever found on your site. Desktop URL errors have been converted to the new report; smartphone URL errors are not currently shown, but we hope to include them in the future; site errors are no longer shown in the new version of Search Console. |
| Security Issues | Security Issues report | The new report retains much of the functionality of the old one and adds a history of the site's issues. |
| Structured Data | Rich Results Test and rich result status reports | To check individual URLs, use the Rich Results Test or the URL Inspection tool. Site-wide information is available in the rich result status reports for your site. Not all rich result data types are covered yet, but the number of reports is growing. |
| HTML Improvements | – | There is no similar report in the new version. To create informative page titles and descriptions, follow our guidelines. |
| Blocked Resources | URL Inspection tool | There is no way to view blocked resources across the entire site, but the URL Inspection tool shows blocked resources for each individual page. |
| Android Apps | – | Starting March 2019, Search Console no longer supports Android apps. |
| Property Sets | – | Starting March 2019, Search Console no longer supports property sets. |
You do not need to provide the same information twice: data and requests entered in one version of Search Console are automatically reflected in the other. For example, if you submitted a review request or a sitemap in the old Search Console, you do not need to submit it again in the new one.
New ways to perform common tasks
The new version of Search Console performs some legacy operations differently. The main changes are listed below.
Features not currently supported
The following features are not yet implemented in the new version of Search Console. To use them, return to the previous interface.
- Scanning statistics (the number of pages scanned per day, their loading time, the number of kilobytes downloaded per day).
- Checking the robots.txt file.
- Managing URL parameters in Google Search.
- Data Highlighter tool.
- Reading and managing messages.
- "Change Address" tool.
- Setting the preferred domain.
- Linking a Search Console property to a Google Analytics property.
- Disavowing links.
- Removing obsolete data from the index.
Robot crawlers are essentially stand-alone browser-like programs. They visit a site, scan the contents of its pages, make a text copy, and send it to the search database. How the site is indexed in a search engine depends on what the crawlers see there. There are also more specialized spider programs:
- “Mirrorers” - recognize duplicate resources.
- “Woodpeckers” determine the accessibility of the site.
- Robots that read frequently updated resources; there are also programs for scanning pictures and icons, determining the frequency of visits, and checking other characteristics.
What does the robot see on the site?
- Resource text.
- Internal and external links.
- HTML code of the page.
- Server response.
- The robots.txt file, the main document for working with a spider. In it you can point the robot to pages that deserve its attention and forbid it to view others. The crawler also consults this file on each return visit to the site.
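Python's standard library ships a robots.txt parser, so you can check what a robot is allowed to fetch. The rules below are hypothetical; in practice you would load the file from your own domain (e.g. `https://example.com/robots.txt`):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration.
rules = """
User-agent: *
Disallow: /admin/
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# The spider may crawl the shop pages but not the admin panel.
print(rp.can_fetch("Googlebot", "https://example.com/admin/panel"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/toys"))         # True
```

The same `can_fetch` check is what a well-behaved crawler performs before requesting any URL on your site.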
In what form does the robot see the site page?
There are several ways to look at a resource through the eyes of a program. If you are a website owner, then Google came up with Search Console for you.
- Add your resource to the service.
- Then select the "Fetch as Google" tool.
- Click "Fetch and render". After scanning, you will see the result.
This method displays the most complete and accurate picture of how the robot sees the site. If you are not the owner of the resource, then there are other options for you.
The simplest is to look at the page's cached copy in a search engine.
Let's assume that the resource has not yet been indexed and you cannot find it in a search engine. In this case, to find out how the robot sees the site, you need to perform the following algorithm.
- Install Mozilla Firefox.
- Add a web developer toolbar plugin to this browser.
- A toolbar will appear below the URL field. In it, under "Cookies" select "Disable Cookies", and under "Disable" click "Disable JavaScript" > "Disable All JavaScript".
- Be sure to reload the page.
- In the same toolbar, under "CSS" click "Disable Styles" > "Disable All Styles", and under "Images" check "Display ALT Attributes" and "Disable All Images". Done!
Why do you need to check how the robot sees the site?
When a search engine sees one set of information on your site and the user sees another, the resource ends up in the wrong search results. The user will hastily leave it without finding what he was looking for, and if a large number of visitors do this, your site will drop to the very bottom of the results.
You need to check at least 15-20 pages of the site and try to cover all types of pages.
It happens that some cunning people deliberately pull off such scams: for example, instead of a website about soft toys, they promote some casino, "Kukan". Sooner or later the search engine will detect this and apply filters to such a resource.
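One rough way to spot this kind of substitution yourself is to fetch the same page twice, once with a Googlebot-like User-Agent and once with a regular browser one, and compare the words served. The fetch helper and the 0.5 overlap threshold below are illustrative assumptions, not a description of Google's actual detection:

```python
from urllib.request import Request, urlopen


def fetch(url: str, user_agent: str) -> str:
    """Fetch a page while pretending to be the given client."""
    req = Request(url, headers={"User-Agent": user_agent})
    with urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def looks_cloaked(robot_html: str, user_html: str,
                  threshold: float = 0.5) -> bool:
    """Rough heuristic: flag the page if the word sets served to the
    robot and to a normal browser overlap less than `threshold`."""
    robot_words = set(robot_html.lower().split())
    user_words = set(user_html.lower().split())
    if not robot_words or not user_words:
        return True
    overlap = len(robot_words & user_words) / len(robot_words | user_words)
    return overlap < threshold


# Usage sketch against a hypothetical URL:
# url = "https://example.com/"
# robot_view = fetch(url, "Mozilla/5.0 (compatible; Googlebot/2.1)")
# user_view = fetch(url, "Mozilla/5.0 (Windows NT 10.0; rv:120.0)")
# print(looks_cloaked(robot_view, user_view))
```

If the robot is served text about soft toys while the visitor gets casino promotions, the overlap collapses toward zero and the heuristic fires.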