
Interpretation of Software Load Balancing Implementation Based on Nginx

 

Load balancing is an important feature in server development. Besides serving as a regular web server, Nginx is widely used as a reverse-proxy front end, because its asynchronous, event-driven architecture can absorb a large number of concurrent requests and distribute them to backend servers (also known as the service pool, hereinafter "the backend") for the heavier computation, processing, and response. The advantages of this mode are considerable: the service hosts stay hidden and are therefore more secure, public IP addresses are conserved, and backend servers can be added easily when traffic grows.


Load balancing can be divided into hardware load balancing and software load balancing. The former is generally a combination of dedicated software and hardware; the vendor provides a complete, mature solution, which is usually expensive. Nginx accounts for the majority of software load balancing deployments, and this article studies the topic based on its manual.

1. Basic Introduction

Load balancing involves the following basic knowledge.
  

(1) Load balancing algorithm


  a. Round Robin: the simplest method; requests are sent to each backend in turn. This is also the default allocation method (a configuration sketch follows this list);

  b. Least Connections (least_conn): tracks the current number of active connections to each backend; the backend with the fewest connections is considered the least loaded and receives the request. This method also takes into account the weight assigned to each server in the upstream configuration;

  c. Least Time (least_time): requests are allocated to the backend with the fastest response time and the fewest active connections;

  d. IP Hash (ip_hash): a hash is computed from the request's source IP address (the first three octets for IPv4, all address bits for IPv6) and mapped to a backend;

  e. Generic Hash (hash): the hash is computed over a user-defined key (such as the URI); the optional consistent keyword enables consistent hashing.
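A rough sketch of an upstream block selecting one of these algorithms (server names are placeholders):

upstream backend {
    least_conn;                             # or: ip_hash; / hash $request_uri consistent;
    server backend1.example.com weight=3;   # weight is honored by round robin and least_conn
    server backend2.example.com;
}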


(2) Session consistency
  

When a user (browser) interacts with a server, some state is usually kept for the interaction. The whole process is called a session and is identified by a unique session ID. The concept of sessions is not limited to shopping carts: because HTTP is a stateless protocol, any scenario that requires logical context must rely on a session mechanism. In addition, an HTTP client caches some data locally to reduce requests and improve performance. If the load balancer distributes requests belonging to one session to different backend servers, that data would have to be shared across the backends, which is very inefficient. The simplest remedy is to guarantee session consistency: every request of the same session is allocated to the same backend.
  

(3) Dynamic configuration of background server
  

Problematic backends should be detected promptly and removed from the distribution group, and backends should be easy to add as the business grows. With today's popular elastic cloud computing services, providers can even add and remove backend hosts automatically according to the current load.
  

(4) DNS based load balancing
  

A domain name of a modern network service is usually associated with multiple hosts. By default, the DNS server answers queries with the list of IP addresses rotated in round-robin order, which naturally distributes client requests across different hosts. However, this method has inherent defects: DNS does not check the reachability of the hosts, so an IP address handed to a client is not guaranteed to be usable; and DNS resolution results are cached by the client and by intermediate DNS servers, so the distribution across backends is far from ideal.

2. Load balancing in Nginx

The load balancing configuration in Nginx is described in great detail in the manual, so it will not be repeated here. For common HTTP load balancing, we first define an upstream as the backend group, then forward requests to it with proxy_pass/fastcgi_pass; fastcgi_pass is almost the standard configuration for Nginx+PHP.
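A minimal sketch of this pattern (hostnames are placeholders):

upstream backend {
    server backend1.example.com;
    server backend2.example.com;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;   # forward to the backend group defined above
    }
}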

2.1 Session consistency

  

Session consistency in Nginx is enabled with the sticky directive (part of the commercial NGINX Plus subscription). It does not conflict with the load balancing algorithms above: after the first allocation, all subsequent requests of the session go to the same backend. Three modes of session consistency are currently supported:
  

(1).  Cookie Insertion
  

With the first response from the backend, a session cookie is added to the response header; that is, the load balancer plants the cookie on the client. The client's subsequent requests carry this cookie value, and Nginx can tell from it which backend to forward to.

sticky cookie srv_id expires=1h domain=.example.com path=/;

Here srv_id is the name of the cookie, while the expires, domain, and path parameters are optional.
  

(2).  Sticky Routes
  

With the first response from the backend, a route is generated; route information is usually extracted from the cookie or the URI.

sticky route $route_cookie $route_uri;

Nginx searches the $route_cookie and $route_uri parameters in order and selects the first non-empty one as the route. If both are empty, the default load balancing algorithm above decides which backend the request is distributed to.
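As a sketch of how routes can be bound to servers, loosely following the example in the Nginx documentation (the jsessionid cookie name and the route values a/b are illustrative):

map $cookie_jsessionid $route_cookie {
    ~.+\.(?<route>\w+)$ $route;      # take the route suffix of the session cookie
}

upstream backend {
    server backend1.example.com route=a;
    server backend2.example.com route=b;
    sticky route $route_cookie;
}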
  

(3).   Learn
  

This mode is more complex and intelligent: Nginx automatically watches for session information in requests and responses, and requests and responses that need session consistency usually carry it. Compared with the first method, it does not need to plant a cookie; it dynamically learns existing sessions.
  

This method requires a zone. In Nginx, a zone is a shared-memory area whose data can be shared by multiple worker processes. (One might ask why a shared-memory zone is not also used for the other session-consistency modes.)

sticky learn
       create=$upstream_cookie_examplecookie
       lookup=$cookie_examplecookie
       zone=client_sessions:1m
       timeout=1h;

2.2 Session Draining

Session draining is mainly needed when some backends must be taken down for maintenance or upgrade. The key is to drain gracefully: no new sessions are sent to the backend, while requests belonging to sessions previously assigned to it keep going there until those sessions finally complete.

  

To put a backend into the draining state, you can either edit the configuration file and reload it by sending a signal to the master process in the usual way, or use Nginx's on-the-fly configuration interface.

$ curl "http://localhost/upstream_conf?upstream=backend"

$ curl "http://localhost/upstream_conf?upstream=backend&id=1&drain=1"

In this way, first list the ID of each backend, then drain the backend with the given ID. After observing online that all of its sessions have completed, the backend can be taken offline.

2.3 Backend health monitoring

Backend failure detection involves two parameters, e.g. max_fails=1 fail_timeout=10s; this means that as soon as Nginx fails once to send a request to the backend or receives no response, the backend is considered unavailable for the next 10 seconds.
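Both parameters are set on the server line inside the upstream block, for example (hostnames are placeholders):

upstream backend {
    server backend1.example.com max_fails=1 fail_timeout=10s;
    server backend2.example.com max_fails=3 fail_timeout=30s;
}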

  

By periodically sending a special request to the backend and expecting a specific response, we can confirm that the backend is healthy and available. This is configured with health_check.

match server_ok {
    status 200-399;
    header Content-Type = text/html;
    body !~ "maintenance mode";
}

server {
    location / {
        proxy_pass http://backend;
        health_check interval=10 fails=3 passes=2 match=server_ok;
    }
}

In the line above, health_check itself is required and its parameters are optional. In particular, the match parameter customizes the conditions for server health, including the status code, headers, and response body; these conditions are combined with a logical AND. By default, Nginx sends a "/" request to each server in the backend group every interval seconds; if the request times out or a response code other than 2xx/3xx comes back, the backend is considered unhealthy, and Nginx stops sending it requests until it passes the check again.

  

When using the health_check feature, it is generally necessary to open a zone in the backend group. With the group configuration shared, the status of all backends can be shared by all worker processes; otherwise each worker keeps its own check counts and results, and the two cases can differ significantly.
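A sketch of such a zone combined with health_check (the zone name and size are illustrative):

upstream backend {
    zone backend 64k;                # shared memory: check counts and results visible to all workers
    server backend1.example.com;
    server backend2.example.com;
}

server {
    location / {
        proxy_pass http://backend;
        health_check;                # defaults: interval=5 fails=1 passes=1
    }
}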

2.4 Setting HTTP Load Balancing through DNS

Hosts in Nginx's backend group can be given as domain names. If the resolve parameter is appended to the domain name, Nginx resolves it periodically; when the resolution result changes, it takes effect automatically without a restart.

http {
    resolver 10.0.0.1 valid=300s ipv6=off;
    resolver_timeout 10s;

    server {
        location / {
            proxy_pass http://backend;
        }
    }

    upstream backend {
        zone backend 32k;
        least_conn;
        ...
        server backend1.example.com resolve;
        server backend2.example.com resolve;
    }
}

If the domain name resolves to multiple IP addresses, all of them are stored and take part in load balancing automatically.

2.5 Load Balancing of TCP/UDP Traffic

HTTP and HTTPS load balancing is generally called layer-7 load balancing, while TCP and UDP load balancing is called layer-4 load balancing. Since layer-7 load balancing targets the HTTP/HTTPS protocols specifically, it can be seen as a special case of layer-4 load balancing in which the balancer can additionally use the protocol headers (User-Agent, Accept-Language, etc.), response codes, and even response content to build extra rules and forward to backends under specific conditions and for specific purposes.

Besides the HTTP load balancing it specializes in, Nginx also supports load balancing of TCP and UDP traffic, which suits application scenarios such as LDAP/MySQL/RTMP and DNS/syslog/RADIUS. Load balancing of this kind is configured with the stream module, and Nginx must be compiled with the --with-stream option; check the manual for details. The configuration principles and parameters are similar to HTTP load balancing.
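A minimal stream sketch for TCP traffic (addresses and ports are placeholders):

stream {
    upstream mysql_backend {
        least_conn;
        server 10.0.0.11:3306;
        server 10.0.0.12:3306;
    }

    server {
        listen 3306;                 # no protocol scheme here: this is raw TCP
        proxy_pass mysql_backend;
    }
}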

  

Because TCP/UDP load balancing targets generic programs, the match conditions above (status, header, body) cannot be used; TCP and UDP services use send and expect instead for active health checks.

match http {
    send      "GET / HTTP/1.0\r\nHost: localhost\r\n\r\n";
    expect ~* "200 OK";
}

2.6 Other characteristics

slow_start=30s: this parameter prevents a newly added or recovered host from being overwhelmed by a sudden burst of requests. It lets the host's weight grow gradually from 0 to the configured value, so the load ramps up slowly.
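Like the parameters above, slow_start is set on the server line inside the upstream block (hostname is a placeholder):

upstream backend {
    server backend1.example.com slow_start=30s;
    server backend2.example.com;
}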

  

max_conns=30: sets the maximum number of connections to a backend; beyond that, requests are placed in a queue. The queue size and a timeout can also be set: when the number of queued requests exceeds the configured size, or a request waits longer than timeout without the backend becoming able to handle it, the client receives an error. This is usually an important parameter, because Nginx acting as a reverse proxy is often used to absorb concurrency; handing too many concurrent requests to the backend would consume too many of its resources (such as threads and processes in non-event-driven designs) and ultimately hurt its processing capacity.
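A sketch of max_conns together with a queue (values are illustrative):

upstream backend {
    server backend1.example.com max_conns=30;
    queue 100 timeout=70;            # up to 100 queued requests, each waiting at most 70s
}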

 

