Vulnerability type: Information leakage
Site-building software: Other
Server type: General
Programming language: Other
Description: A robots.txt file was found on the target web site.
1. robots.txt is the first file a search engine looks at when it visits a website.
2. The robots.txt file tells spider programs which files on the server may be viewed and which may not. A simple example: when a search spider visits a site, it first checks whether robots.txt exists in the site's root directory. If it does, the spider determines its crawling scope from the file's contents; if the file does not exist, search spiders can access every page on the site that is not password-protected. At the same time, robots.txt is publicly readable by anyone, so a malicious attacker can obtain sensitive directory and file paths simply by reading its contents.
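To make the attack concrete, the path-harvesting step can be sketched as a few lines of Python. The sample robots.txt content and the paths in it are hypothetical, purely for illustration:

```python
# Minimal sketch: extract every path a robots.txt file exposes.
# The sample content below is hypothetical, not from a real site.
sample = """User-agent: *
Disallow: /admin/
Disallow: /backup/
Disallow: /manager/login.php
"""

def disallowed_paths(robots_txt: str) -> list[str]:
    """Return each path listed after a Disallow directive."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # an empty Disallow means "allow everything"
                paths.append(path)
    return paths

print(disallowed_paths(sample))
```

Every path this returns is a hint an attacker can probe directly, which is exactly why the file must not contain sensitive entries.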
Harm: The robots.txt file may reveal sensitive information about the system, such as the address of an administration backend or other addresses the site does not want made public. A malicious attacker can use this information to mount further attacks.
Solution: 1. Make sure robots.txt contains no sensitive information. For directories or files you do not want exposed to the public, it is recommended to use access control instead, so that anonymous users cannot reach them.
2. Move sensitive files and directories into a separate, isolated subdirectory and exclude that single directory from web robot crawling. A good approach is to move the files under a directory with a non-descriptive name such as "folder":
New directory structure:
/folder/passwords.txt
/folder/sensitive_folder/
New robots.txt:
User-agent: *
Disallow: /folder/
3. If you cannot change the directory structure and must exclude specific directories from web robot crawling, list only partial names in the robots.txt file. This is not the best solution, but it at least makes guessing the full directory names harder. For example, to exclude "admin" and "manager", use the following entries (assuming no other file or directory in the web root starts with the same characters):
robots.txt:
User-agent: *
Disallow: /ad
Disallow: /ma
Original address: http://webscan.360.cn/vul/view/vulid/139
-
My rough understanding is that robots.txt can disclose the backend or other sensitive addresses of a website. When an address I don't want people to know about ends up in robots.txt, I also use the third solution above and write only partial strings. But these are all pure obscurity tricks: a sharp-eyed visitor can easily tell whether a blog runs WordPress or some other site-building software, so there is no real way to hide a sensitive directory, and hiding it is largely pointless anyway. Still, I'm not happy seeing the scan score come back at less than 100 points, so I'll bury my head in the sand and "solve" the problem!

My idea is simple: any request for robots.txt that does not come from a spider gets a 403. In other words, robots.txt is served only to spiders. The implementation is trivial; add the following to the Nginx configuration:

    # If robots.txt is requested and the User-Agent does not match a spider, return 403
    location = /robots.txt {
        if ($http_user_agent !~* "spider|bot|Python-urllib|pycurl") {
            return 403;
        }
    }
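The Nginx condition above is just a case-insensitive regex test on the User-Agent header. The same check can be mirrored in a few lines of Python to verify which clients would get the file and which would be rejected (the `robots_status` helper and the sample User-Agent strings are mine, for illustration):

```python
import re

# Mirror of the Nginx `!~*` condition: case-insensitive match on the UA.
SPIDER_RE = re.compile(r"spider|bot|Python-urllib|pycurl", re.IGNORECASE)

def robots_status(user_agent: str) -> int:
    """Return the HTTP status /robots.txt would answer with."""
    return 200 if SPIDER_RE.search(user_agent) else 403

print(robots_status("Baiduspider/2.0"))  # a spider UA is allowed through
print(robots_status("Mozilla/5.0"))      # an ordinary browser UA is rejected
```

Note that this only filters on a self-reported header: anyone can set their User-Agent to "spider" and read the file anyway, which is consistent with the head-in-the-sand spirit of the fix.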