Website construction

A simple fix for the robots.txt file "vulnerability" reported by 360's security scan

Jager · July 22, 2016 · 11,469 reads

I hadn't looked at 360's webmaster platform in a long time, and then noticed that my site's security score in 360 Search was 99 instead of 100. I went in to take a look and found this gem:

[Screenshot: 360 security scan flagging the discovery of a robots.txt file as a vulnerability]

What?! Finding a robots.txt file counts as a vulnerability? The explanation continues:

Vulnerability type: Information leakage
Site-building program: Other
Server type: General
Programming language: Other
Description: The robots.txt file was found on the target web site.

1. robots.txt is the first file a search engine looks at when it visits a website.

2. The robots.txt file tells spider programs which files on the server may be crawled and which may not. A simple example: when a search spider visits a site, it first checks whether robots.txt exists in the site's root directory. If it does, the spider determines its crawling scope from the file's contents; if it does not, the spider can reach every page on the site that is not password-protected. At the same time, robots.txt is publicly readable by anyone, so a malicious attacker can learn sensitive directory or file paths simply by reading it.
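For instance, a robots.txt along these lines (a made-up illustration, not taken from the 360 report) hands an attacker the backend paths directly:

    User-agent: *
    Disallow: /admin/
    Disallow: /backup/old-site/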

Harm:
The robots.txt file may reveal sensitive information about the system, such as the admin backend address or other addresses the site owner does not want made public. A malicious attacker may use this information to mount further attacks.
Solution:
1. Make sure robots.txt contains no sensitive information. For directories or files you do not want exposed to the public, it is recommended to use access control so that anonymous users cannot reach them.
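As a minimal sketch of such access control (not part of the 360 advisory; it assumes Nginx, as used later in this post, and the /backup/ directory and .htpasswd path are hypothetical placeholders):

    # Block anonymous access to a sensitive directory outright,
    # or protect it with HTTP basic auth instead of merely "hiding" it via robots.txt.
    location ^~ /backup/ {
        deny all;

        # Alternative: require a password instead of denying everyone
        # (remove "deny all" above and uncomment the two lines below)
        # auth_basic "Restricted";
        # auth_basic_user_file /etc/nginx/.htpasswd;
    }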

2. Move sensitive files and directories into a separate, isolated subdirectory and exclude that directory from web robots. For example, move the files into a non-descriptive directory name such as "folder":

New directory structure:
/folder/passwords.txt
/folder/sensitive_folder/

New robots.txt:

User-agent: *
Disallow: /folder/

3. If you cannot change the directory structure and must exclude specific directories from web robots, list only partial names in the robots.txt file. This is not the best solution, but it at least makes it harder to guess the full directory names. For example, to exclude "admin" and "manager", use the following (assuming no other file or directory in the web root starts with the same characters):

robots.txt:

User-agent: *

Disallow: /ad

Disallow: /ma

Original address: http://webscan.360.cn/vul/view/vulid/139

So the gist is that robots.txt can disclose the backend or other sensitive addresses of a website. When there is an address I don't want people to discover through robots.txt, I also use the third solution above and write only a partial string.

But these are all just security-through-obscurity tricks. Anyone with a trained eye can easily tell whether a blog runs WordPress or some other site-building program, so there is no real way to hide a sensitive directory, and hiding it doesn't achieve much anyway.

Still, I'm not happy seeing anything less than 100 points, so let's stick our fingers in our ears and "solve" the problem!

My idea is very simple: return 403 for any robots.txt request that does not come from a spider. In other words, robots.txt is served only to spiders. The implementation is trivial; just add the following to the Nginx configuration:

    # If robots.txt is requested and the User-Agent does not match a spider, return 403
    location = /robots.txt {
        if ($http_user_agent !~* "spider|bot|Python-urllib|pycurl") {
            return 403;
        }
    }

After adding it, reload Nginx, then open the robots.txt address in a browser: you should see a 403 Forbidden.
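You can also verify it from the command line (a quick sketch; this assumes curl is available, and example.com stands in for your own domain):

    # A plain browser-like request should now be blocked
    curl -I -A "Mozilla/5.0" https://example.com/robots.txt    # expect: HTTP/1.1 403 Forbidden
    # A spoofed spider User-Agent matching the regex should still get the file
    curl -I -A "Baiduspider" https://example.com/robots.txt    # expect: HTTP/1.1 200 OK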

Then I went back to 360 and ran the scan again. The result was just as expected:

[Screenshot: 360 security scan result after the fix]

Heh, problem "solved", all for the sake of that one line: "Hackers and whatnot are all just passing clouds; for website security, 360 is truly powerful!" Hehehe... Blocking anonymous scanners on the network is fine, but the rest really is just passing clouds.

53 responses
  1. Long Xiaotian 2016-7-22 · 14:02

    I've almost never paid attention to this stuff~~

  2. Koolight 2016-7-22 · 15:17

    360 is quite the oddball too. I guess too many sites were scoring 100, so the quota had to be trimmed.

  3. I wonder whether my free website even lets me modify robots.txt

  4. Chenghang Xiansen 2016-7-22 · 19:26

    I noticed this issue before but didn't know how to solve it. Brother Zhang, what about virtual hosting?

  5. Seon 2016-7-22 · 21:28

    Nicely done, hahaha :grin: I changed my avatar ages ago. Why hasn't your blog's cache picked it up yet?

  6. Hello, Jager~ Following your method, I added the code you provided to nginx.conf, and it does return 403. But when I went to the Baidu webmaster platform to update robots.txt, it told me that my robots file has a redirect configured and cannot be viewed for now. What should I do?

    • Jager 2016-7-23 · 21:04

      1. Use Baidu's fetch diagnosis tool to check whether robots.txt can be fetched.
      2. Check the Nginx log to see what is actually requested when you click to update robots.

  7. Hello, Jager~ Following your method, I added your code to nginx.conf. But when I went to the Baidu webmaster platform to update robots.txt, it told me that my robots file has a redirect configured and cannot be viewed for now. What should I do?

  8. Xingtai website construction 2016-7-23 · 12:45

    Thanks for sharing. Time to test the avatar, by the way.
    Also, it's nice that the avatar shows up as soon as you enter your email address.
    But I see an error when I inspect the element. Something about https.

    • Jager 2016-7-23 · 21:02

      I don't see any error message.. paste the URL next time

      • Xingtai website construction 2016-7-23 · 21:30

        Font from origin 'http://res.zgboke.com' has been blocked from loading by Cross-Origin Resource Sharing policy: The 'Access-Control-Allow-Origin' header contains multiple values 'https://zhang.ge, https://zhang.ge', but only one is allowed. Origin 'https://zhang.ge' is therefore not allowed access.

        It's on this page; I won't bother posting a screenshot

        • Jager 2016-7-24 · 8:01

          Got it. The old header rules are still cached on a few individual nodes; they'll refresh gradually.

  9. Handsome little Qiqi 2016-7-24 · 23:06

    Well, it really is just fooling yourself, but nobody's going to care anyway

  10. benen005 2016-7-25 · 11:50

    It's really good

  11. Easy task 2016-7-25 · 14:37

    With so many veterans commenting here, I'll grab a seat too

  12. Pioneer Blog 2016-7-28 · 8:50

    Thanks for sharing! :roll: :roll: :roll: :roll: :roll:

  13. Tianya SEO Self study Network 2016-7-28 · 10:45

    I get this warning too. Like you, I'm obsessive enough to want to see that 100!

  14. Personal blog 2016-7-28 · 16:19

    Thanks for sharing, Brother Zhang

  15. Without 100% real strength, a score of 100% is just self-deception (⊙o⊙)?

    • Jager 2016-7-30 · 8:57

      You sure know how to talk

  16. hey soul sister 2016-7-30 · 17:44

    A master is a master, haha

  17. Yitaojin Stock Blog 2016-7-30 · 22:29

    Thanks for sharing, I'm too lazy to test

  18. leaf 2016-7-31 · 12:25

    360 is funny....

  19. leaf 2016-7-31 · 12:26

    I guess it only fools newbie webmasters; someone like Brother Zhang sees right through it. 360 really is an oddball.

  20. Songsong soft text release 2016-8-3 · 13:18

    Thanks for sharing. Picked up something useful today. Back to lurking

  21. Sumei Haoxiang Blog 2016-8-4 · 11:28

    360's test is useless

    • Jager 2016-8-4 · 20:47

      Nobody said it was useful.

  22. Pseudo geek 2016-8-6 · 16:40

    Checked it, and it's true..

  23. Thanks for sharing; lots of professional posts here. I'll be back often :oops:

  24. Songsong soft text release 2016-8-15 · 15:11

    I really didn't know about this

  25. Yang Zeye 2016-8-18 · 20:20

    Awesome! I'm learning this too

  26. song 2016-9-8 · 21:44

    Hello, Jager, could you tell me how to set this up on a Windows Server 2008 R2 server with IIS 7.5?

  27. How can I block it on an Alibaba Cloud virtual host?

    • WingsBlog 2016-10-8 · 15:32

      Same question: how do you block it on a virtual host?

  28. Chrysanthemum 2016-10-12 · 13:03

    Thanks for sharing

  29. OtwoCn 2016-11-24 · 10:48

    I added it to the configuration following your method, but Nginx refuses to start no matter what. I tried every position, and it turns out it only fails to start when that first block is present. What should I do?~~~ My Nginx version is 1.11.5

    • Jager 2016-11-24 · 13:15

      There must be an error message when it fails to start. How am I supposed to know what's wrong if you don't post it? Everything in my articles is tested and works.
      Also, simplified as follows:

       location = /robots.txt {
           if ($http_user_agent !~* "spider|bot|Python-urllib|pycurl") {
               return 403;
           }
       }
      • OtwoCn 2016-11-25 · 9:48

        I'm a complete newbie at deployment~~~ I just see that there's no nginx process in Task Manager~~~~
        I've tried trimming the code down, but it still won't start. Could you point me to where I can find the error message? :!:

        • Jager 2016-11-25 · 13:47

          Oh, a Windows server... forget I said anything then

          • OtwoCn 2016-11-25 · 14:56

            oh my god.... I had just found the error message, and then the OP hits me with that....
            2016/11/24 09:13:40 [emerg] 4496#5036: unknown directive "if($http_user_agent" in C:/nginx-1.11.5/conf/nginx.conf:73

            • Jager 2016-11-26 · 11:09

              if($http_user_agent
              There needs to be a space between "if" and the opening parenthesis. My article clearly has that space.

              • OtwoCn 2016-12-5 · 9:47

                After adding the space it starts, but the 360 scan still gives 99 points :sad:

  30. Yang Xiaojie 2016-12-5 · 21:14

    Hello, blogger. Kangle isn't Nginx; how can I fix this there?

    • Jager 2016-12-8 · 17:13

      I'm not familiar with Kangle, but the principle is the same

  31. someone 2016-12-20 · 18:35

    Bro, robots.txt is meant for spiders. Configured like this, you might as well just delete the file

    • Jager 2016-12-21 · 9:36

      Please don't mislead other readers. The approach in this article serves robots.txt only to spiders; ordinary users simply can't see it. Commenting blind without reading the article carefully, and not even leaving an email address; are you really here to discuss?

  32. side dish 2016-12-21 · 9:17

    How would you write this in .htaccess for an Apache server?

  33. Boke112 navigation 2017-3-10 · 14:39

    The method works great. I found my site's 360 score was too low today, and now it's sorted: finally 100 points. Thanks for sharing.

  34. 1080 Benefits 2017-4-15 · 23:43

    My site is built with LNMP. I wrote it as location ~ /robots.txt and it returned 403.
    Is "=" any different from "~"? :roll:

    • Jager 2017-4-17 · 21:30

      That's fine. Go brush up on the basics a bit, then come back a new person.
      "~" means a regex match; the dot matches any character, e.g. location ~ "^/robots\.txt" { ... }
      "=" means an exact match.

  35. The voice is still murmuring 2017-7-18 · 20:42

    I'd like to ask the blogger: on Apache, how can I make robots.txt invisible to users while still letting search engines crawl it? Thank you.

  36. Sleepless at night 2017-8-16 · 16:08

    Why does this have no effect for me? robots.txt can still be accessed