Detecting whether a visitor is a search engine spider or an ordinary user in PHP

A visitor's user agent string (for example one containing Baiduspider or Googlebot) shows whether a request comes from a search engine spider or from an ordinary user. Below are several ways to make that distinction in PHP.

1. From Discuz x3.2

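<?php
// Return true when the user agent looks like a search engine spider.
function checkrobot($useragent = '') {
    static $kw_spiders = array('bot', 'crawl', 'spider', 'slurp', 'sohu-search', 'lycos', 'robozilla');
    static $kw_browsers = array('msie', 'netscape', 'opera', 'konqueror', 'mozilla');

    // Fall back to the current request's User-Agent header (empty string if it is missing).
    if (empty($useragent)) {
        $useragent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    }
    $useragent = strtolower($useragent);

    // A browser keyword with no 'http://' URL in the string is treated as a normal browser.
    if (strpos($useragent, 'http://') === false && dstrpos($useragent, $kw_browsers)) return false;
    if (dstrpos($useragent, $kw_spiders)) return true;
    return false;
}

// Return true (or the matched keyword when $returnvalue is true) if $string contains any item of $arr.
function dstrpos($string, $arr, $returnvalue = false) {
    if (empty($string)) return false;
    foreach ((array)$arr as $v) {
        if (strpos($string, $v) !== false) {
            $return = $returnvalue ? $v : true;
            return $return;
        }
    }
    return false;
}

if (checkrobot()) {
    echo 'Robot crawler';
} else {
    echo 'Normal user';
}
?>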

In practice, you can use the function to run code only for visitors that are not search engine spiders:

<?php
if (!checkrobot()) {
    // do something for ordinary (non-spider) visitors
}
?>
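As a hedged illustration of the "do something" branch, the sketch below counts page views only for visitors that do not look like spiders. The names `isRobotUa()` and `countHumanViews()` are hypothetical, and `isRobotUa()` is a simplified stand-in for `checkrobot()` so that the snippet runs on its own. It is not the Discuz implementation.

```php
<?php
// Hypothetical, simplified stand-in for checkrobot(): matches a few common spider keywords.
function isRobotUa(string $userAgent): bool
{
    return preg_match('/bot|crawl|spider|slurp/i', $userAgent) === 1;
}

// Hypothetical helper: count only the requests whose user agent does not look like a spider.
function countHumanViews(array $userAgents): int
{
    $views = 0;
    foreach ($userAgents as $userAgent) {
        if (!isRobotUa($userAgent)) {
            $views++;
        }
    }
    return $views;
}
```

In a real page you would pass the current request's user agent to the check and update your own counter storage. The array here only keeps the example self-contained.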

2. Logging spider visits with PHP

<?php
// Lowercase the User-Agent header so that the checks below are case-insensitive.
$useragent = strtolower(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');

// More specific tokens must be tested before general ones (for example 'msnbot' before 'msn').
if (strpos($useragent, 'googlebot') !== false) {
    $bot = 'Google';
} elseif (strpos($useragent, 'mediapartners-google') !== false) {
    $bot = 'Google Adsense';
} elseif (strpos($useragent, 'baiduspider') !== false) {
    $bot = 'Baidu';
} elseif (strpos($useragent, 'sogou spider') !== false) {
    $bot = 'Sogou';
} elseif (strpos($useragent, 'sogou web') !== false) {
    $bot = 'Sogou web';
} elseif (strpos($useragent, 'sosospider') !== false) {
    $bot = 'SOSO';
} elseif (strpos($useragent, '360spider') !== false) {
    $bot = '360Spider';
} elseif (strpos($useragent, 'yahoo') !== false) {
    $bot = 'Yahoo';
} elseif (strpos($useragent, 'msnbot') !== false) {
    $bot = 'msnbot';
} elseif (strpos($useragent, 'msn') !== false) {
    $bot = 'MSN';
} elseif (strpos($useragent, 'sohu') !== false) {
    $bot = 'Sohu';
} elseif (strpos($useragent, 'yodaobot') !== false) {
    $bot = 'Yodao';
} elseif (strpos($useragent, 'twiceler') !== false) {
    $bot = 'Twiceler';
} elseif (strpos($useragent, 'ia_archiver') !== false) {
    $bot = 'Alexa_';
} elseif (strpos($useragent, 'iaarchiver') !== false) {
    $bot = 'Alexa';
} elseif (strpos($useragent, 'slurp') !== false) {
    $bot = 'Yahoo';
} elseif (strpos($useragent, 'bot') !== false) {
    $bot = 'Other spiders';
}

// Append one tab-separated line per spider visit: time, IP, spider name, requested URL.
if (isset($bot)) {
    $line = date('Y-m-d H:i:s') . "\t" . $_SERVER['REMOTE_ADDR'] . "\t" . $bot . "\t"
          . 'http://' . $_SERVER['SERVER_NAME'] . $_SERVER['REQUEST_URI'] . "\r\n";
    $fp = @fopen('bot.txt', 'a');
    if ($fp !== false) { // skip logging if the file cannot be opened
        fwrite($fp, $line);
        fclose($fp);
    }
}
?>
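The script above only writes the log. To get actual statistics, the lines have to be counted. A minimal sketch follows, assuming the tab-separated format written above (time, IP, spider name, URL). The function name `countSpiderVisits()` is hypothetical and not part of the original code.

```php
<?php
/**
 * Count visits per spider from log lines in the format
 * "time \t IP \t spider name \t URL". Malformed lines are skipped.
 * Returns an array of spider name => visit count, most frequent first.
 */
function countSpiderVisits(array $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        $fields = explode("\t", rtrim($line, "\r\n"));
        if (count($fields) < 4) {
            continue; // not a complete log line
        }
        $bot = $fields[2];
        $counts[$bot] = ($counts[$bot] ?? 0) + 1;
    }
    arsort($counts); // sort by count, descending, keeping the spider names as keys
    return $counts;
}

// Usage, assuming bot.txt exists next to the script:
// $lines = file('bot.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// print_r(countSpiderVisits($lines ?: []));
```

Keeping the counting in a function that takes an array of lines makes it easy to try out without touching the real log file.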

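3. Matching the user agent against a list of known spiders

Each search engine spider sends its own distinctive identifier in the User-Agent header, so $_SERVER['HTTP_USER_AGENT'] can be compared against a list of known identifiers to decide whether the visitor is a spider.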

<?php
// Return true when the current request's user agent contains a known spider identifier.
function isCrawler() {
    $agent = strtolower(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');
    if (empty($agent)) {
        return false;
    }
    $spiderSite = array(
        "TencentTraveler", "Baiduspider+", "BaiduGame", "Googlebot",
        "msnbot", "Sosospider+", "Sogou web spider", "ia_archiver",
        "Yahoo! Slurp", "YoudaoBot", "Yahoo Slurp", "MSNBot",
        "Java (Often spam bot)", "BaiDuSpider", "Voila", "Yandex bot",
        "BSpider", "twiceler", "Sogou Spider", "Speedy Spider",
        "Google AdSense", "Heritrix", "Python-urllib", "Alexa (IA Archiver)",
        "Ask", "Exabot", "Custo", "OutfoxBot/YodaoBot",
        "yacy", "SurveyBot", "legs", "lwp-trivial",
        "Nutch", "StackRambler", "The web archive (IA Archiver)", "Perl tool",
        "MJ12bot", "Netcraft", "MSIECrawler", "WGet tools",
        "larbin", "Fish search",
        // Other spiders
    );
    // Compare case-insensitively: both sides are lowercased.
    foreach ($spiderSite as $val) {
        if (strpos($agent, strtolower($val)) !== false) {
            return true;
        }
    }
    return false; // no known identifier matched
}

if (isCrawler()) {
    echo "Hello spider spirit!";
} else {
    echo "Hello ordinary users!";
}
?>

Common spider identifiers are listed below. If any are wrong or missing, please leave a comment.

Baidu spider: Baiduspider
Baidu Images: Baiduspider-image
Baidu WAP: Baiduspider-mobile
Baidu Video: Baiduspider-video
Baidu News: Baiduspider-news
Google spider: Googlebot
360 spider: 360Spider
SOSO spider: Sosospider
Yahoo spider: Yahoo
Youdao spider: YoudaoBot, YodaoBot
Sogou spider: Sogou News Spider, Sogou web spider, Sogou inst spider, Sogou blog, Sogou Orion spider
Bing spider: bingbot
MSN spider: msnbot, msnbot-media
Yisou spider: YisouSpider
Alexa spider: ia_archiver
Easou spider: EasouSpider
Jike spider: JikeSpider
Etao spider: EtaoSpider
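The identifiers above can be turned into a lookup table. The sketch below maps a user agent string to a spider name. The function name `identifySpider()` and the display names are hypothetical, only part of the list is included, and the nullable return type assumes PHP 7.1 or later. More specific tokens are listed first so that, for example, Baiduspider-image is not reported as plain Baiduspider.

```php
<?php
// Return a display name for the spider found in $userAgent, or null if none matches.
function identifySpider(string $userAgent): ?string
{
    // Token => display name. Order matters: specific tokens come before general ones.
    $tokens = [
        'Baiduspider-image'  => 'Baidu Images',
        'Baiduspider-mobile' => 'Baidu WAP',
        'Baiduspider-video'  => 'Baidu Video',
        'Baiduspider-news'   => 'Baidu News',
        'Baiduspider'        => 'Baidu',
        'Googlebot'          => 'Google',
        '360Spider'          => '360',
        'Sosospider'         => 'SOSO',
        'bingbot'            => 'Bing',
        'msnbot-media'       => 'MSN Media',
        'msnbot'             => 'MSN',
        'ia_archiver'        => 'Alexa',
    ];
    foreach ($tokens as $token => $name) {
        // stripos() makes the comparison case-insensitive.
        if (stripos($userAgent, $token) !== false) {
            return $name;
        }
    }
    return null;
}
```

Bear in mind that the User-Agent header is supplied by the client, so a match only shows what the visitor claims to be.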

Link to this article: https://www.umtheme.com/seo/219.html
