Playing with Nginx logs by hand

Nginx logs are an undiscovered treasure for most people. Drawing on my experience building a log analysis system, here is a purely manual approach to analyzing Nginx logs.

Nginx logging is configured in two places: access_log and log_format.

Default format:

 access_log /data/logs/nginx-access.log;
 log_format old '$remote_addr [$time_local] $status $request_time $body_bytes_sent ' '"$request" "$http_referer" "$http_user_agent"';

Anyone who has used Nginx is familiar with the default log format and its content. The default format is human-readable, but hard to compute over.

Nginx's log-flushing policy is also configurable. For example, buffer log writes and flush to disk only once the buffer reaches 32k, or force a flush if 5 seconds have passed since the last one:

 access_log /data/logs/nginx-access.log buffer=32k flush=5s;

This determines whether you see log lines in real time, and how much impact logging has on disk I/O.
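As a further illustration, newer Nginx versions can also gzip-compress the buffered log before it hits disk (a sketch, assuming the same log path as above):

```nginx
# Buffer log lines, compress them at gzip level 1, and flush at least
# every 5 seconds; gzip requires buffering to be enabled
access_log /data/logs/nginx-access.log buffer=32k gzip=1 flush=5s;
```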

There are many variables Nginx can log that do not appear in the default configuration, for example:

Request size: $request_length

Response size: $bytes_sent

Request time: $request_time

Connection serial number: $connection

Number of requests on the current connection: $connection_requests

The default Nginx format is not easily computable, so you need to convert it into a format that is. For example, use the control character ^A (typed as Ctrl+V then Ctrl+A in a terminal) to separate the fields.

The log_format can then be defined as follows:

 log_format new '$remote_addr^A$http_x_forwarded_for^A$host^A$time_local^A$status^A' '$request_time^A$request_length^A$bytes_sent^A$http_referer^A$request^A$http_user_agent';
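The ^A in these examples is a literal Ctrl-A byte (0x01), which is easy to lose when copy-pasting. Bash's ANSI-C quoting, `$'\001'`, spells out the same byte explicitly; a small sketch with a fabricated log line:

```shell
# Build a sample log line joined by literal Ctrl-A (byte 0x01) separators;
# printf understands \001 as an octal escape
line=$(printf '1.2.3.4\001-\001example.com\00110/Jan/2016:09:00:01\001500')
# $'\001' is bash ANSI-C quoting for the same control byte, so no
# Ctrl+V Ctrl+A typing is needed; print the 5th field (the status code)
echo "$line" | awk -F $'\001' '{print $5}'   # prints: 500
```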

After that, we can analyze the logs with common Linux command-line tools:

Find the most frequently requested URLs and their counts:

cat access.log | awk -F '^A' '{print $10}' | sort | uniq -c | sort -nr

Find the 500-error requests in the current log file:

cat access.log | awk -F '^A' '{if($5 == 500) print $0}'

Count the 500 errors in the current log file:

cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | wc -l

Count the 500 errors within a given minute:

cat access.log | awk -F '^A' '{if($5 == 500) print $0}' | grep '09:00' | wc -l

Find slow requests that take more than 1s:

tail -f access.log | awk -F '^A' '{if($6>1) print $0}'

If you only want certain fields:

tail -f access.log | awk -F '^A' '{if($6>1) print $3"|"$4}'
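Beyond filtering, awk can also aggregate in a single pass, for example the average request time. A minimal sketch, assuming the "new" format above where field 6 is $request_time (the sample data is made up):

```shell
# Generate two fake Ctrl-A separated log lines; field 6 is request_time
printf 'a\001-\001h\001t\001200\0010.2\n'  > sample.log
printf 'a\001-\001h\001t\001200\0011.4\n' >> sample.log
# Sum request times and divide by the line count in one awk pass
awk -F $'\001' '{sum += $6; n++} END {printf "avg: %.1f\n", sum/n}' sample.log
# prints: avg: 0.8
```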

Find the URLs with the most 502 errors (the request is field 10 in this format):

cat access.log | awk -F '^A' '{if($5==502) print $10}' | sort | uniq -c | sort -nr

Find 200 responses with suspiciously small (blank-page) bodies:

cat access.log | awk -F '^A' '{if($5==200 && $8 < 100) print $3"|"$4"|"$10"|"$6}'

View the real-time log stream:

tail -f access.log | cat -e

or:

tail -f access.log | tr '^A' '|'

Summary

Following this approach, many other analyses are possible: the most frequent user agents, the client IPs with the highest request rates, request latency analysis, response size analysis, and so on.
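For instance, the busiest client IPs follow the same sort | uniq -c pattern, using field 1 ($remote_addr). A sketch with fabricated data:

```shell
# Fake three requests: two from 10.0.0.1, one from 10.0.0.2 (field 1 = IP)
printf '10.0.0.1\001-\001h\n10.0.0.1\001-\001h\n10.0.0.2\001-\001h\n' > ips.log
# Count requests per IP and list the busiest first
awk -F $'\001' '{print $1}' ips.log | sort | uniq -c | sort -nr | head
```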

This is the prototype of a large-scale web log analysis system, and this log format is also well suited to later batch and stream processing.

link: http://blog.eood.cn/nginx_logs

I remember someone asking about Nginx log analysis in the forum a while back; I came across this article today, tried it out myself, and found it useful, so I'm sharing it :)


Reprinted from Seven Travelers Blog. Permalink: https://www.qxzxp.com/4955.html
