WordPress knowledge sharing

What is a website log _ How to analyze website logs _ How to block malicious IPs when you find them

What is a website log

Website logs, also called web logs, record visits to a website, much like a video recording. Every day a site is visited not only by people but also by search engine spiders. The website log records every visit: where the visitor came from, when they visited, which pages they requested, and which browser and operating system they used. For search engine spiders, it records whether the visitor was the Baidu, Google or 360 spider, when it came, which pages it crawled, and what status code each request returned. If we can read and analyze the website logs well, we can observe the website from another angle and solve some of its problems.

Purpose of analyzing website logs

Generally speaking, there are three reasons to analyze website logs:

  • A new website has not been indexed by search engines for some time after launch. Download the website log to check whether search engines have crawled the content, and whether we have blocked spiders through a mistake of our own;
  • The website used to rank well, and then something abnormal shows up. Download the website log to check whether search engines have been crawling the website normally during this period;
  • The website has been attacked or compromised. Download the website log to analyze the attacking IP, attack time, attack method, attack characteristics, etc.

How to get website logs

Where can I download website logs?
  • On a virtual host, look for a directory such as /wwwlogs/; the name almost always contains the word "logs";
  • On a server, website logs are in a directory such as /www/wwwlogs/. The pagoda panel, for example, uses this directory. In pagoda panel > Security, the website logs are shown at the top right;
  • To download website logs from a virtual host or server to your local machine, FTP software is the usual choice. With the pagoda panel, you can also download them directly from the path mentioned above;
  • If the website log is too large, hundreds of megabytes or even more than 1 GB, you can use the log cutting function of the pagoda panel to split it into smaller files before downloading.
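
If no panel is available, an oversized log can also be split locally before it is opened. The following is only a minimal sketch, not the pagoda panel's log cutting function: the function name `split_log`, the `.partNNN` file naming and the default of 100,000 lines per part are all assumptions made for illustration.

```python
from pathlib import Path


def split_log(path, lines_per_part=100_000):
    """Split a log file into numbered parts of at most lines_per_part lines.

    Hypothetical helper: parts are written next to the source file as
    <name>.part001, <name>.part002, ... and their paths are returned.
    """
    src = Path(path)
    parts = []
    out = None
    try:
        # Binary mode keeps every byte of the log unchanged.
        with src.open("rb") as f:
            for i, line in enumerate(f):
                if i % lines_per_part == 0:
                    # Start a new part file every lines_per_part lines.
                    if out is not None:
                        out.close()
                    part = src.with_name(f"{src.name}.part{len(parts) + 1:03d}")
                    out = part.open("wb")
                    parts.append(part)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return parts
```

Splitting on line boundaries keeps every log record intact, so each part can be analyzed on its own.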

How to analyze website logs

To analyze website logs, we also need log analysis software. I tried several tools that were not very easy to use, and finally found a small program called logviewer pro, which works well.

It opens website log files directly, with no limit on file size. What you see is the log, one record per line, as shown in the following figure. Does your scalp go numb because you don't know where to start? Lao Wei will analyze specific examples below, and you will find it not so hard to understand after reading.

Figure: View website log

Take a line from the above figure to analyze as follows:

14.18.183.126 - - [06/Sep/2020:16:41:42 +0800] "GET /13264.html HTTP/1.1" 200 10177 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; Trident/5.0)"

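14.18.183.126 is the visitor's IP address. The access time is accurate to the second, and +0800 is the time zone offset of the timestamp (the server's time zone, not the visitor's). GET is the request method, /13264.html is the address of the page requested, HTTP/1.1 is the protocol, and 200 is the HTTP status code. 10177 is the size of the response in bytes. The "-" that follows is the referrer, which is empty here. The last field, starting with Mozilla, is the user agent, which carries the visitor's browser and operating system information.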
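
The field breakdown above can be sketched in code. This is a minimal example that assumes the raw log file uses plain ASCII quotes and hyphens and follows the common "combined" log layout of the sample line; the names `LOG_PATTERN` and `parse_line` are made up for illustration.

```python
import re
from datetime import datetime

# One named group per field described above (assumed "combined" log layout).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # %b expects English month abbreviations such as "Sep" (default C locale).
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec


sample = (
    '14.18.183.126 - - [06/Sep/2020:16:41:42 +0800] '
    '"GET /13264.html HTTP/1.1" 200 10177 "-" '
    '"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; Trident/5.0)"'
)
record = parse_line(sample)
print(record["ip"], record["path"], record["status"], record["size"])
# prints: 14.18.183.126 /13264.html 200 10177
```

Lines that do not fit the pattern return None, so a different log format shows up as skipped lines rather than wrong values.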

We don't need to analyze all the above information, just take the part we need for comprehensive analysis.

Here is another line from the figure above:

  • 203.208.60.98 - - [06/Sep/2020:16:42:09 +0800] "GET /21283.html HTTP/1.1" 200 9337 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
  • At 16:42:09 on September 6, 2020, the Google spider with IP address 203.208.60.98 crawled the page /21283.html, which was about 9 KB in size.

Baidu spider, 360 spider and Toutiao spider leave similar traces in the log. Each spider's mark is different, but it always carries its own brand name.
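
Because each spider names itself in the user agent, the visits can be tallied per spider. The sketch below is a rough illustration: the token list (`Googlebot`, `Baiduspider`, `360Spider`, `Bytespider`) is an assumption about what these spiders send and should be checked against your own logs, and the function names are hypothetical. It only reads the claimed name, so fake spiders are counted too.

```python
from collections import Counter

# Assumed user-agent tokens; adjust after checking your own logs.
SPIDER_TOKENS = {
    "Googlebot": "Google",
    "Baiduspider": "Baidu",
    "360Spider": "360",
    "Bytespider": "Toutiao",
}


def spider_name(user_agent):
    """Return the spider brand claimed by a user agent, or None."""
    ua = user_agent.lower()
    for token, name in SPIDER_TOKENS.items():
        if token.lower() in ua:
            return name
    return None


def count_spiders(lines):
    """Count log lines per claimed spider brand."""
    counts = Counter()
    for line in lines:
        # The user agent is the last double-quoted field of the line.
        parts = line.rstrip().rsplit('"', 2)
        if len(parts) < 3:
            continue
        name = spider_name(parts[1])
        if name is not None:
            counts[name] += 1
    return counts
```

Calling `count_spiders(open("access.log", encoding="utf-8", errors="replace"))` on a downloaded log (the file name is a placeholder) gives a quick picture of which spiders have been crawling.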

How to distinguish real and fake search engine spiders

Of course, there are many fake search engine spiders, so we should learn to tell real spiders from fake ones.

On Windows, press Win+R, type cmd in the window that pops up, and then enter the following at the command line, as shown in the figure below:

nslookup 203.208.60.98

There is a space between the command and the IP address.

This returns the server name shown in the figure below, which contains "googlebot". Combined with the fact that the IP falls in a Google spider IP range found by searching online, we can conclude that it is a real Google spider.
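
The same nslookup check can be scripted. The following is a sketch under stated assumptions: it treats a host name ending in `.googlebot.com` or `.google.com` as Google's, and it adds a forward lookup of that host name back to the IP, a confirmation step that goes beyond the manual check above. The function name `is_real_googlebot` and its `reverse`/`forward` parameters are hypothetical; the parameters exist only so the lookups can be swapped out.

```python
import socket

# Assumed domain suffixes for genuine Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Check an IP by reverse DNS, then confirm with a forward lookup."""
    try:
        # Reverse lookup: the equivalent of "nslookup <ip>".
        host = reverse(ip)[0]
    except OSError:
        return False
    if not host.lower().endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # Forward lookup: the host name must resolve back to the same IP.
        addresses = forward(host)[2]
    except OSError:
        return False
    return ip in addresses
```

For example, `is_real_googlebot("203.208.60.98")` performs live DNS queries, so the result depends on the network it is run from.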

Lao Wei reminds:

  • Many malicious IPs pose as search engine spiders while actually attacking the site or scraping its content. We should take care to tell them apart;
  • Malicious attackers often use proxy IPs, so the IP you see may not be the visitor's real IP.

Figure: How to distinguish true and false search engine spiders

HTTP status codes in website logs

Is it important to analyze the HTTP status codes in website logs? Yes. In the first screenshot of the example above we can already see the HTTP status code, 200 or 304, which is the result of a search engine spider or user requesting a page.

  • 200 means the page was fetched successfully;
  • 304 means the requested page has not been modified since the last request. When the server returns this response, it does not return the page content;
  • 404 means the requested link does not exist, so 404 is returned to the visitor.

There are many HTTP status codes. They fall into the 2xx, 3xx, 4xx and 5xx classes, each of which contains many individual codes. We only need to know roughly what the common ones mean: 200 is a successful fetch, 404 is a broken link, and 500 is a server error. There is no need to memorize them all. If you want to know more about HTTP status codes, you can search on Baidu.

If 404 errors keep appearing in your website logs, check what happened to those pages and why they keep returning 404. This helps us fix problems on the website.
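
To see which pages keep returning 404, the status codes can be tallied straight from the log. This is a minimal sketch that assumes the log layout of the sample lines earlier in the article; `REQUEST_RE` and `summarize_status` are names made up for this example.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it,
# e.g. "GET /13264.html HTTP/1.1" 200
REQUEST_RE = re.compile(r'"\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def summarize_status(lines):
    """Return (count per status code, count per path that returned 404)."""
    statuses = Counter()
    not_found = Counter()
    for line in lines:
        m = REQUEST_RE.search(line)
        if m is None:
            continue  # skip lines that do not look like access records
        status = int(m.group("status"))
        statuses[status] += 1
        if status == 404:
            not_found[m.group("path")] += 1
    return statuses, not_found
```

`not_found.most_common(20)` then lists the twenty addresses that produced the most 404 responses, which is a reasonable place to start checking for broken links.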

How to block malicious IPs once found

If you encounter a malicious IP, you can add it to the IP blacklist of the server firewall, using software such as Safedog ("security dog"). You can also block malicious IPs in the pagoda panel firewall.

The pagoda firewall includes the system firewall (network layer) and the paid firewall (software layer). The network layer has a wider scope than the software layer. Think of the network layer as the outermost layer, the first to meet incoming traffic: an IP blocked here cannot reach the server at all. The software layer is a particular web application on the server, and the paid firewall only restricts access to that web application.

When blacklisting IPs, blocking a legitimate visitor by accident is unavoidable. When that happens, just delete the IP from the blacklist.
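
Before blacklisting anything, it helps to see which IPs send the most requests. The sketch below only counts the first field of each log line; the function name `busiest_ips` and the threshold of 1000 requests are arbitrary assumptions. A high count is not proof of malice, since a real search engine spider can also be very active, so each candidate should still be checked to avoid blocking a legitimate visitor.

```python
from collections import Counter


def busiest_ips(lines, min_requests=1000):
    """Return (ip, request count) pairs at or above min_requests, busiest first."""
    # The client IP is the first space-separated field of each log line.
    counts = Counter(line.split(" ", 1)[0] for line in lines if line.strip())
    return [(ip, n) for ip, n in counts.most_common() if n >= min_requests]
```

The result is a list of candidates to review, not a blacklist to apply automatically.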

Lao Wei tips:

In a CC attack, blacklisting individual IPs is of little use, because it does nothing to stop the next round of attacks coming through other proxy IPs. Instead, install a server firewall, such as Safedog ("security dog") or the pagoda panel firewall, and use it to defend against CC attacks.

Extended reading: How to use the pagoda panel website firewall

Lao Wei's summary

For most novices, analyzing website logs is hard work that tires both the eyes and the brain. Even when the website has no problems, it is worth checking the website log now and then, because it reveals many things that cannot be seen just by looking at the site. When you find anything abnormal on the website, analyze the problem from the website log and deal with it promptly to keep the website running normally.

Article name: What is website log _ How to analyze website log _ How to shield malicious IP
Article link: https://www.vpsss.net/23304.html
Copyright notice: The resources on this website are for personal learning and exchange only. Reproduction and commercial use are not allowed; anyone who does so bears the legal consequences themselves.
The copyright of the pictures belongs to their respective creators, and the picture watermark is for the purpose of preventing unscrupulous people from stealing the fruits of labor.