
How to tell real Baidu spiders from fake ones, explained with an example

Lao Wei previously shared "Baidu Official Teaches You How to Identify Baidu Spiders Correctly". Some readers felt they still did not fully understand it and asked for a hands-on explanation, so today Lao Wei walks through a case that actually happened.

Baidu spider crawl record

The figure below shows Baidu spider crawl records as they appear in a website log.

In fact, one glance is enough to see that this is a fake spider: the real Baidu spider crawls only from Baidu's own servers, never from a third-party host, and this visitor comes from one.

Even though we already know it is fake, we will still run through the usual checks so that readers can follow each step.

[Figure: Baidu spider crawl record in the website log]

Check the spider's UA

The figure above shows that the visiting spider's UA is Mozilla/5.0+(compatible;+Baiduspider/2.0++http://www.baidu.com/search/spider.html). Apart from a few extra + signs, nothing about it looks obviously wrong.

A UA can be forged, however, so on its own it is not conclusive evidence.
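As a rough illustration of why the UA is only a starting point, here is a minimal Python sketch. It assumes the extra + signs are simply how the log writes spaces (some server log formats, IIS for example, do this). The function names are made up for this example. All the check can tell us is that a visitor claims to be Baiduspider.

```python
def normalize_logged_ua(logged_ua: str) -> str:
    """Turn "+" back into spaces and collapse repeated spaces.

    Caveat: a literal "+" in the original UA is also turned into a space.
    """
    return " ".join(logged_ua.replace("+", " ").split())


def claims_baiduspider(logged_ua: str) -> bool:
    """True if the UA merely claims to be Baiduspider; this proves nothing."""
    return "baiduspider" in normalize_logged_ua(logged_ua).lower()


# UA as quoted from the log in this article.
ua = "Mozilla/5.0+(compatible;+Baiduspider/2.0++http://www.baidu.com/search/spider.html)"
print(normalize_logged_ua(ua))
print(claims_baiduspider(ua))
```

Any visitor that passes this check still has to go through the IP checks that follow.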

Reverse DNS lookup on the IP

On Windows, press Win+R, type cmd, and press Enter to open a command prompt window, then run:

nslookup 39.104.66.126

The following results are obtained:

[Figure: nslookup result for 39.104.66.126]

The nslookup result does not contain a Baidu host name, which means this is a fake spider.
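The same check can be scripted. The sketch below is one possible way to do it in Python, not an official Baidu tool. It assumes, following Baidu's published guidance, that genuine Baiduspider host names end in .baidu.com or .baidu.jp. It also adds a forward lookup to confirm that the host name resolves back to the same IP; that is an extra precaution this article does not perform by hand. The function names and the injectable `reverse` and `forward` parameters are inventions for this example.

```python
import socket

# Assumption: genuine Baiduspider host names use these suffixes.
BAIDU_SUFFIXES = (".baidu.com", ".baidu.jp")


def is_baidu_hostname(hostname: str) -> bool:
    """True if the host name ends with one of the assumed Baidu suffixes."""
    return hostname.lower().rstrip(".").endswith(BAIDU_SUFFIXES)


def looks_like_real_baiduspider(ip, reverse=socket.gethostbyaddr,
                                forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the host name, then resolve it forward."""
    try:
        hostname = reverse(ip)[0]        # (hostname, aliases, addresses)
    except OSError:
        return False                     # no reverse record at all
    if not is_baidu_hostname(hostname):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses               # host name must map back to the IP
```

Calling `looks_like_real_baiduspider("39.104.66.126")` with the default resolvers performs live DNS queries, so the answer depends on the DNS records at the time it is run.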

URLs crawled by the spider

Listing the URLs this spider crawled shows that they are all non-existent addresses, starting from 1 and increasing by 1 each time. That is clearly a crawler tool generating requests automatically from a preset rule.

A real spider crawls by following hyperlinks in pages, and even when it hits a 404 page it will not keep requesting such pages over and over.

[Figure: list of URLs crawled by the spider]
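The "starting from 1 and increasing by 1" pattern is easy to spot by eye, but it can also be flagged with a few lines of Python. This is only a heuristic sketch: it takes the last number in each requested path and measures the longest run in which each request is the previous number plus one. The function names are invented for this example, and the sample paths in the usage line are hypothetical because the original figure is not reproduced here.

```python
import re


def trailing_number(path: str):
    """Return the last integer that appears in the path, or None."""
    numbers = re.findall(r"\d+", path)
    return int(numbers[-1]) if numbers else None


def longest_sequential_run(paths) -> int:
    """Length of the longest run where each number is the previous one + 1."""
    best = current = 0
    previous = None
    for path in paths:
        n = trailing_number(path)
        if n is None:
            current = 0                  # no number: the run is broken
        elif previous is not None and n == previous + 1:
            current += 1                 # continues the +1 sequence
        else:
            current = 1                  # starts a new candidate run
        previous = n
        best = max(best, current)
    return best


# Hypothetical paths, for illustration only.
print(longest_sequential_run(["/1.html", "/2.html", "/3.html", "/about"]))
```

A long run from a single IP, especially against URLs that return 404, is a hint that a tool is enumerating addresses rather than following links. Sites with numbered pagination will need a higher threshold.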

Search for the IP

Search for this IP on Baidu. For a real Baidu spider IP there will generally be records posted by other webmasters, but this IP has none.

The Baota panel regularly updates its spider list

The Baota panel regularly updates the spider IPs of each major search engine for its users.

The Baota panel's paid firewall provides a spider pool. Clicking "Synchronize" manually pulls in the regularly compiled and updated spider IPs of the major search engines, so that real spider IPs are let through and fake spiders are screened out, avoiding any impact on the website's SEO.
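To make the idea of a spider pool concrete, here is a minimal Python sketch of an allowlist check. It is not how the panel's firewall is implemented, which this article does not describe. The network ranges are reserved documentation addresses used purely as placeholders, NOT real Baidu spider IPs, and the names are invented for this example.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks.
# They are NOT real spider addresses; a real pool would be synchronized
# from a maintained source.
KNOWN_SPIDER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def in_spider_pool(ip: str, networks=KNOWN_SPIDER_NETWORKS) -> bool:
    """True if the IP falls inside any network in the pool."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


print(in_spider_pool("192.0.2.10"))
print(in_spider_pool("39.104.66.126"))
```

A visitor whose UA claims to be a search engine spider but whose IP is not in the pool would then be treated as a fake.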

For a website that will run long term, Lao Wei recommends purchasing the Baota paid firewall to keep the server and website running safely and stably.

There are three ways to get it:

  • Purchase the Baota paid firewall separately and pay monthly
  • Baota Professional Edition, which includes the paid firewall for free
  • Baota Enterprise Edition, which includes the paid firewall for free

Lao Wei considers the Baota panel well suited to beginners building a website; it supports both Linux and Windows. Registering an account on the Baota website comes with a free Professional Edition gift package worth ¥3188 (a permanent Professional Edition license costs only 1188 yuan), and new users can try the Professional Edition for 7 days for 0.99 yuan.
For servers to run the Baota panel on, Lao Wei recommends Alibaba Cloud or Tencent Cloud.

Summary

The steps above all point the same way: this is a fake Baidu spider IP. It is actually a crawler tool running on an Alibaba Cloud host and impersonating Baidu spider in order to scrape our website's content. You can block the IP address and, if necessary, report it to Alibaba Cloud through its official website with the website log records attached. There is generally an official reply within 72 hours. If Alibaba Cloud confirms the report, the host will be blocked and the account owner penalized.
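When preparing the log records to attach to a report, a small Python helper can pull out just the lines for the offending IP. This is a sketch under assumptions: it treats the log as plain text lines and matches the IP only where it is not part of a longer address. The function name and the sample lines are invented for this example.

```python
import re


def lines_for_ip(log_lines, ip: str):
    """Return the log lines that mention exactly this IP address."""
    # The lookarounds stop 39.104.66.12 from matching inside 39.104.66.126,
    # and 39.104.66.126 from matching inside 139.104.66.126.
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?![\d.])")
    return [line for line in log_lines if pattern.search(line)]


# Hypothetical log lines, for illustration only.
sample = [
    "39.104.66.126 GET /1.html 404",
    "139.104.66.126 GET /index.html 200",
    "39.104.66.126 GET /2.html 404",
]
for line in lines_for_ip(sample, "39.104.66.126"):
    print(line)
```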

Article name: "How to identify true and false Baidu spiders with examples?"
Article link: https://www.vpsss.net/28671.html
Copyright notice: The resources on this website are for personal learning and exchange only and may not be reproduced or used for commercial purposes; otherwise, you bear any legal consequences yourself.
The copyright of the pictures belongs to their respective creators. The picture watermarks are there to prevent unscrupulous people from stealing the fruits of others' labor.