The figure below shows a Baidu spider crawl record as it appears in a website log.
In fact, it is clear at first glance that this is a fake spider, because the real Baidu spider only crawls from Baidu's own servers, not from third-party hosts.
Although we already know it is fake, we will still go through the usual checks so that readers can follow the process.
Check the spider UA
The figure above shows that the visiting spider's UA is Mozilla/5.0+(compatible;+Baiduspider/2.0++http://www.baidu.com/search/spider.html). Apart from a few extra + characters, nothing looks obviously wrong.
A UA can be forged, however, so it is not conclusive evidence on its own.
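As a rough illustration of this first check, the Python sketch below flags a logged UA that claims to be Baiduspider. The function name is hypothetical, and a match only shortlists the request for the later checks; it proves nothing by itself.

```python
import re

# Case-insensitive match on the "Baiduspider" token inside a logged UA string.
BAIDU_UA_TOKEN = re.compile(r"baiduspider", re.IGNORECASE)


def claims_to_be_baiduspider(logged_ua: str) -> bool:
    """Return True if the UA string claims to be Baiduspider.

    This is only a claim: a UA is trivially forged, so a True result
    means "verify further", not "genuine spider".
    """
    return bool(BAIDU_UA_TOKEN.search(logged_ua))
```

Running it on the UA from the log above returns True, which is exactly why the UA alone cannot settle the question.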
Reverse DNS lookup of the IP
On Windows, press Win+R, type cmd, and press Enter to open a command prompt window, then run:
nslookup 39.104.66.126
The following results are obtained:
The lookup result does not contain a Baidu hostname, which means this is a fake spider.
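The same reverse lookup can be scripted. The sketch below is a minimal Python version, assuming genuine Baiduspider hostnames end in .baidu.com or .baidu.jp, as Baidu's webmaster help describes. The function names are hypothetical, and the forward-confirmation step is an extra precaution that the manual check above does not perform.

```python
import socket

# Assumption: genuine Baiduspider hostnames end with one of these suffixes.
BAIDU_SUFFIXES = (".baidu.com", ".baidu.jp")


def is_baidu_hostname(hostname: str) -> bool:
    """Check whether a hostname belongs to a Baidu domain (suffix match)."""
    return hostname.lower().rstrip(".").endswith(BAIDU_SUFFIXES)


def verify_baiduspider_ip(ip: str) -> bool:
    """Reverse-resolve the IP, check the hostname, then forward-confirm it.

    Requires network access. Any lookup failure is treated as "not verified".
    """
    try:
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)
    except OSError:  # no PTR record or resolver error
        return False
    if not is_baidu_hostname(hostname):
        return False
    try:
        # Forward lookup: the hostname should resolve back to the same IP.
        _name, _aliases, forward_ips = socket.gethostbyname_ex(hostname)
    except OSError:
        return False
    return ip in forward_ips
```

Calling verify_baiduspider_ip("39.104.66.126") would repeat the nslookup check from a script. The leading dot in each suffix matters: without it, a name such as fakebaidu.com would pass.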
URLs crawled by the spider
Listing the URLs this spider requested shows that they are all non-existent addresses, starting from 1 and incrementing by 1 each time. They were obviously generated automatically by a crawler tool following preset rules.
A real spider follows the hyperlinks it finds in pages, and it does not keep requesting 404 pages over and over.
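This "+1" pattern can also be spotted programmatically. The sketch below measures the longest run of consecutive requests whose number increases by exactly 1. The function name and the sample path format (/1.html, /2.html, ...) are assumptions for illustration, since the article does not show the exact URLs.

```python
import re


def longest_increment_run(paths):
    """Return the length of the longest run of consecutive requests in which
    the first number found in each path is exactly 1 more than the previous."""
    best = 0
    run = 0
    prev = None
    for path in paths:
        match = re.search(r"\d+", path)
        if match is None:
            # A path without a number breaks the current run.
            prev, run = None, 0
            continue
        num = int(match.group())
        run = run + 1 if prev is not None and num == prev + 1 else 1
        best = max(best, run)
        prev = num
    return best
```

A long run over URLs that return 404 points to a scripted crawler rather than a spider following links. What counts as "long" is a judgment call for each site.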
Search for the IP
Search for this IP on Baidu. A real Baidu spider IP usually has records posted by other users, but this IP has none.
The Baota panel regularly updates its spider lists
The Baota panel regularly updates the spider IPs of each major search engine for its users.
The Baota panel's paid firewall provides a spider pool. After you manually click "Synchronize", the regularly collected and updated spider IPs of the major search engines are used to let real spiders through and to screen out fake ones, which avoids any impact on the website's SEO.
For websites that run long term, Lao Wei recommends purchasing the Baota paid firewall to keep the server and website running safely and stably.
There are three ways to purchase:
Purchase the Baota paid firewall separately and pay monthly
Baota Professional Edition, which includes the paid firewall for free
Baota Enterprise Edition, which includes the paid firewall for free
Judging from the steps above, this is a fake Baidu spider IP. It is actually a crawler tool running on an Alibaba Cloud host and impersonating the Baidu spider to scrape our website content. You can block the IP address and, if necessary, report it on the Alibaba Cloud official website with the website log records attached. There is generally an official reply within 72 hours. If the report is confirmed, the host will be blocked and the account owner penalized.
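To gather the log records for such a report, a small filter is usually enough. The sketch below keeps only the log lines that mention a given IP as a whole address. The function name and the sample log lines are hypothetical, and it makes no assumption about the log format beyond the IP appearing as text.

```python
import re


def lines_for_ip(log_lines, ip):
    """Return the log lines that contain `ip` as a complete address,
    so 39.104.66.126 does not also match 139.104.66.126."""
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?![\d.])")
    return [line for line in log_lines if pattern.search(line)]


# Hypothetical log lines, for illustration only.
sample = [
    "39.104.66.126 GET /1.html 404",
    "139.104.66.126 GET /index.html 200",
    "39.104.66.126 GET /2.html 404",
]
evidence = lines_for_ip(sample, "39.104.66.126")
```

Here evidence holds the first and third sample lines, which could be saved to a file and attached to the report.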
Article name: "How to identify true and false Baidu spiders with examples?" Article link: https://www.vpsss.net/28671.html