Site indexing in search engines: how does a site get indexed in Yandex and Google?

What is site indexing, and how does it work? You will find answers to these and other questions in this article. Web indexing (indexing in search engines) is the process by which a search engine robot adds information about a site to its database; that database is later used to find information on the web projects that have gone through this procedure.

The data collected about a web resource most often consists of keywords, articles, links, and documents. Audio, images, and so on can also be indexed. The algorithm for extracting keywords depends on the particular search engine.

Some types of content are indexed only partially or not at all (for example, Flash files and JavaScript-generated content).

Managing indexing

Indexing a site is a complex process. To manage it (for example, to keep a particular page out of the index), you use the robots.txt file and directives such as Allow, Disallow, Crawl-delay, User-agent, and others.


You can also manage indexing with the <noindex> tag (a Yandex-specific tag) and the rel="nofollow" link attribute, which hide parts of a resource from the Google and Yandex robots (Yahoo also honors nofollow).

In the Google search engine, new sites are indexed within a couple of days to a week; in Yandex, within one to four weeks.

Do you want your site to appear in search results? Then it must be processed by Rambler, Yandex, Google, Yahoo, and so on. You must inform the search engines of your website's existence, and their spiders will then crawl it in whole or in part.

Many sites remain unindexed for years; nobody but their owners ever sees the information on them.

Processing methods

Site indexing can be done in several ways:

  1. The first option is manual submission: you enter your site's details through the special forms offered by the search engines.
  2. In the second case, the search engine robot finds your website on its own by following links and indexes it. The robot can discover your site through links on other resources that point to your project. This method is the most effective: if a search engine finds a site this way, it considers the site significant.

The timing

Site indexing is not especially fast. The timing varies, starting from one to two weeks. Links from authoritative resources (with high PageRank and TIC) significantly speed up a site's entry into the search engine database. Today Google is considered the slowest, although until 2012 it could do the job within a week. Unfortunately, everything changes very quickly. Mail.ru is known to take about six months to process websites in this respect.

Yandex site indexing

Not every specialist can handle indexing a site in search engines. How quickly new pages are added to the database of already processed pages depends on how often the site's content changes. If fresh information constantly appears on a resource, the system considers it frequently updated and useful to people, and indexing speeds up.

You can follow the progress of website indexing in the special webmaster sections of the search engines.

Changes

So, we have figured out how a site is indexed. Note that search engine databases are updated frequently, so the number of your project's pages in them can change (both decrease and increase) for the following reasons:

  • search engine sanctions against the website;
  • errors on the site;
  • changes to search engine algorithms;
  • poor hosting (the server hosting the project is unreachable), and so on.

Yandex answers to common questions

Yandex is a search engine used by a great many people. It ranks fifth among the world's search engines by the number of search queries processed. If you have submitted a site to it, adding it to the database may take quite a long time.

Adding a URL does not guarantee indexing. It is just one way of informing the system's robot that a new resource has appeared. If a site has no links from other websites, or very few, submitting it helps the robot discover it faster.


If indexing has not happened, check whether the server failed at the moment the Yandex robot made its request. If the server returned an error, the robot stops and tries again during its next full crawl. Yandex employees cannot speed up the addition of pages to the search engine's database.

Indexing a site in Yandex is a rather difficult task. Don't know how to add a resource to the search engine? If other websites link to it, you don't need to add it yourself: the robot will find and index it automatically. If you have no such links, you can use the "Add URL" form to tell the search engine the site exists.

Keep in mind that adding a URL does not guarantee indexing of your creation (or its speed).

Many people wonder how long it takes to index a site in Yandex. The company's employees give no guarantees and make no predictions about timing. As a rule, once the robot learns about a site, its pages appear in search within two days, sometimes within a couple of weeks.

Processing process


Yandex is a search engine that demands accuracy and attention. Site indexing consists of three stages:

  1. The crawler crawls the resource pages.
  2. The site's content is recorded in the search engine's database (the index).
  3. After two to four weeks, once the database has been updated, you can see the results: your site will (or will not) appear in the search results.

Indexing Check

How to check site indexing? There are three ways to accomplish this:

  1. Enter your company name in the search bar (of Yandex, for example) and check every link on the first and second pages. If you find your site's URL there, the robot has done its job.
  2. Enter your site's URL in the search bar. You will see how many of its pages are shown, i.e., indexed.
  3. Register in the webmaster sections of Mail.ru, Google, and Yandex. After you verify ownership of the site, you can see the indexing results and other search engine services designed to improve your resource's operation.
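The second check above is usually done with the site: search operator. As a rough illustration, here is a small Python sketch that builds such a query URL for Google and Yandex; the base search URLs and parameter names are assumptions about the engines' public endpoints, not something stated in this article.

```python
from urllib.parse import urlencode

def site_query_url(engine: str, domain: str) -> str:
    """Build a search URL with the site: operator, to eyeball how many
    pages of a domain a search engine has indexed."""
    # Assumed public search endpoints; verify before relying on them.
    bases = {
        "google": "https://www.google.com/search",
        "yandex": "https://yandex.ru/search/",
    }
    # Yandex uses the "text" query parameter, Google uses "q".
    param = "text" if engine == "yandex" else "q"
    return bases[engine] + "?" + urlencode({param: f"site:{domain}"})

print(site_query_url("google", "example.com"))
print(site_query_url("yandex", "example.com"))
```

Opening the printed URL in a browser shows roughly how many of the domain's pages are in the index.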

Why does Yandex fail?

Site indexing in Google works as follows: the robot enters all of a site's pages into the database, both low-quality and high-quality, without choosing. Only useful documents, however, take part in the ranking. Yandex, by contrast, excludes web junk right away: it can index any page, but over time the search engine weeds out all the garbage.


Both systems have an incremental index, and in both, low-quality pages affect the rating of the website as a whole. A simple philosophy is at work here: a given user's favorite resources occupy higher positions in that user's results, while the same person is unlikely to be shown a site they disliked last time.

That is why you should first exclude duplicate web documents from indexing, check for blank pages, and keep substandard content out of the search results.

Acceleration of Yandex

How can I speed up site indexing at Yandex? The following steps must be completed:

  • Install the Yandex Browser on your computer and browse the site's pages with it.
  • Confirm your management rights to the resource in Yandex.Webmaster.
  • Post a link to the article on Twitter. Yandex is known to have cooperated with this company since 2012.
  • Add Yandex site search to the site. In the Indexing section you can specify your own URLs.
  • Install the Yandex.Metrica code without ticking the "Prohibit sending pages for indexing" checkbox.
  • Create a Sitemap, which exists only for the robot and is not visible to the audience. Crawling will start from it. The Sitemap address is entered in robots.txt or in the corresponding form in Webmaster: "Indexing settings", then "Sitemap files".
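The last step mentions a Sitemap. As a minimal sketch of what such a file contains, the following Python snippet generates a sitemap.xml body according to the sitemaps.org protocol; the site.ru URLs are placeholders, not real addresses.

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Build a minimal sitemap.xml document (sitemaps.org protocol)."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for page in urls:
        # Each page gets a <url> entry with its address in <loc>.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs for illustration only.
sitemap_xml = build_sitemap(["http://site.ru/", "http://site.ru/kolobok.html"])
print(sitemap_xml)
```

The resulting file is uploaded to the site root and its address is then listed in robots.txt or in the webmaster panel, as described above.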

Intermediate actions


What should be done before a webpage is indexed in Yandex? The domestic search engine must consider your site the original source. That is why, even before publishing an article, you should add its content to the "Original Texts" form. Otherwise plagiarists will copy the piece to their own resource, land in the database first, and as a result be recognized as the authors.

Google database

For Google, the same recommendations described above apply; only the services differ:

  • Google+ (instead of Twitter);
  • Google Chrome;
  • Google's webmaster tools: "Crawl", then "Fetch as Google", then the "Fetch" and "Add to index" options;
  • Google site search inside the resource;
  • Google Analytics (instead of Yandex.Metrica).

Prohibition

What is a ban on indexing a site? It can be applied to a whole page or to a separate part of it (a link or a fragment of text). In effect, there is both a global indexing ban and a local one. How is this implemented?

Consider banning the addition of a website to the search engine database via robots.txt. Using the robots.txt file, you can exclude a single page or a whole section of a resource from indexing like this:

  1. User-agent: *
  2. Disallow: /kolobok.html
  3. Disallow: /foto/

The first line says that the instructions apply to all search engines, the second prohibits indexing the kolobok.html file, and the third prevents the entire contents of the foto folder from being added to the database. If you need to exclude several pages or folders, list all of them in robots.txt.
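These rules can be verified programmatically. Python's standard urllib.robotparser module applies robots.txt rules the same way a well-behaved crawler would; here it is fed the three lines from the example above (site.ru is a placeholder host).

```python
import urllib.robotparser

# The robots.txt rules from the example above.
rules = """\
User-agent: *
Disallow: /kolobok.html
Disallow: /foto/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# kolobok.html and everything under /foto/ are blocked; other pages are allowed.
print(rp.can_fetch("*", "http://site.ru/kolobok.html"))  # False
print(rp.can_fetch("*", "http://site.ru/foto/cat.jpg"))  # False
print(rp.can_fetch("*", "http://site.ru/index.html"))    # True
```

Running such a check before deploying robots.txt helps catch typos that would accidentally block the whole site.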


To prevent a single web page from being indexed, you can use the robots meta tag. It differs from robots.txt in that it gives instructions to all search engines at once. This meta tag follows the general rules of the HTML format and should be placed in the page header, between the <head> and </head> tags. A ban, for example, can be written like this: <meta name="robots" content="noindex, nofollow">.
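To check which robots directives a page actually carries, you can parse its header with Python's standard html.parser module; this simplified sketch only looks at meta tags whose name is "robots".

```python
from html.parser import HTMLParser

class MetaRobotsParser(HTMLParser):
    """Collect the directives from <meta name="robots"> tags on a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            # content="noindex, nofollow" -> ["noindex", "nofollow"]
            self.directives += [d.strip() for d in a.get("content", "").split(",")]

page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
parser = MetaRobotsParser()
parser.feed(page)
print(parser.directives)  # ['noindex', 'nofollow']
```

If the list contains "noindex", the page asks all search engines to keep it out of their databases.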

Ajax

And how does Yandex index Ajax sites? Many website developers use Ajax technology today, and it certainly offers great possibilities: with it you can build fast, responsive interactive web pages.

However, a search robot sees a web page differently than a user's browser does. For example, a person sees a comfortable interface with dynamically loaded content, while for a search robot the same page may be empty or consist only of the static HTML that exists before any scripts run.

URLs with a # symbol can be used on Ajax sites, but the search robot does not use them: it usually discards the part of the URL after the #. This must be taken into account. Given a URL of the form http://site.ru/#example, the robot requests the resource's main page at http://site.ru instead. As a result, the page's content may never reach the database and will not appear in the search results.

To improve the indexing of Ajax sites, Yandex introduced changes to its search robot and to the rules for processing the URLs of such websites. Today webmasters can tell the Yandex search engine that a page needs indexing by building the appropriate scheme into the resource's structure. To do this, you must:

  1. Replace the # symbol in page URLs with #!. The robot will then understand that it can request an HTML version of that page's content.
  2. Make the HTML version of such a page's content available at the URL in which #! is replaced by ?_escaped_fragment_=.
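The URL rewrite in step 2 is mechanical, so it can be illustrated in a few lines of Python; the helper name and example URL are ours, and a production version would also percent-encode the fragment.

```python
def escaped_fragment_url(url: str) -> str:
    """Rewrite an Ajax #! URL into the ?_escaped_fragment_= form
    that crawlers request under the Ajax-crawling scheme."""
    if "#!" not in url:
        return url  # nothing to rewrite
    base, _, fragment = url.partition("#!")
    # Append with & if the base URL already has query parameters.
    sep = "&" if "?" in base else "?"
    return f"{base}{sep}_escaped_fragment_={fragment}"

print(escaped_fragment_url("http://site.ru/#!example"))
# http://site.ru/?_escaped_fragment_=example
```

The server must respond to the rewritten address with a static HTML snapshot of the page, which is what the robot then indexes.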

Source: https://habr.com/ru/post/C1001/

