Kafka's Performance Strategy

Background

In my work, I use Kafka and maintain several Kafka clusters. Kafka is without doubt the most popular message-queue middleware today. So what makes it so popular? Below, I will explain why from the perspective of performance.

What is Kafka

First, Kafka is a distributed streaming data processing platform. Its important functions include: publishing and subscribing to streams of records, storing those streams durably, and processing them as they occur.

In the context of big-data processing requirements, Kafka inevitably optimizes the performance of the above functions. The key points/bottlenecks of performance optimization lie in transmission efficiency, message quality, and the distributed design, which the sections below cover in turn.

Transmission efficiency

Kafka's performance improvements mainly rely on the IO optimization techniques of the operating system, breaking free from the memory limitations of the JVM.

Why start with the operating system? People use the operating system every day, but its role is generally overlooked. Recall that one of the major roles of the operating system is to hide hardware differences and provide a unified, standard API to user programs. As a result, most people's use of IO stops at calling the read/write system calls; backend engineers may go a step further and learn NIO's epoll/kqueue. Let's look at the following two optimization strategies Kafka uses.

mmap

In fact, modern operating systems apply sophisticated optimizations to disk IO. Under Linux, the virtual file system (VFS) layer works together with the page cache, caching disk (external storage) data in memory to improve read and write speed, for example (but not limited to):

[Figure: page-cache read/write optimizations]

The memory/disk mapping optimizations above depend on the operating system's prediction strategy. Generally speaking, sequential access to the disk is what the kernel predicts best (via read-ahead), so it is significantly more efficient than random access.

In addition to performing the above automatically, the operating system also provides the mmap API, which actively maps a file into the page cache and shares that page-cache memory with the user program. The user program does not have to allocate a buffer in advance; it reads the page cache directly to access the data. Therefore, when frequently reading or writing a large file, mmap is a better choice: it reduces the user-mode/kernel-mode context switches (and the repeated buffer copies) of the normal read/write path.
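In Java, this technique is exposed as FileChannel.map, which returns a MappedByteBuffer backed by the page cache. A minimal sketch (the temp file and 4 KiB mapping size are illustrative choices, not anything Kafka-specific):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

public class MmapDemo {
    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("mmap-demo", ".log");
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map 4 KiB of the file into memory; writes to the buffer go to
            // the shared page-cache pages, not a private user-space buffer.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.put("hello kafka".getBytes(StandardCharsets.UTF_8));

            // Read back through the same mapping, without a read() syscall.
            byte[] out = new byte[11];
            buf.position(0);
            buf.get(out);
            System.out.println(new String(out, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

Kafka uses memory mapping in this spirit for its index files, where small random reads would otherwise be expensive.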

Zero-Copy

Above, we saw the optimization strategy for the disk cache. How do we optimize the socket, the other frequently used IO object?

Linux 2.1+ introduced the sendfile system call. Via sendfile, data in the page cache can be copied directly to the socket buffer: only a file descriptor, a data offset, and a size are passed as parameters. The DMA (Direct Memory Access) engine copies the data from the kernel buffer to the protocol engine, with no user-mode context switch and no buffer prepared by the user program — the number of user-space copies is zero. This is Zero Copy technology.

 #include <sys/sendfile.h>

 ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

[Figure: sendfile/DMA data path]

Zero Copy technology is a boon for Java programs, freeing the size and speed of the cache from the limitations of the JVM.

Consider Kafka's usage scenario: the messages under one topic are pulled by multiple different consumers. Using Zero Copy, Kafka reads disk data into a single copy in the page cache, then calls sendfile to copy that page-cache copy into the different socket buffers. The entire copy process completes in the kernel, making the best use of the operating system; the bottleneck is almost entirely down to the hardware (disk read/write speed, network speed).
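In the JVM, Zero Copy is exposed as FileChannel.transferTo, which delegates to sendfile on Linux; this is the call Kafka's broker uses when serving consumer fetches. A minimal sketch — here a second temp file stands in for the consumer's socket channel:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

public class ZeroCopyDemo {
    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("segment", ".log");
        Path dst = Files.createTempFile("out", ".bin");
        Files.write(src, "message batch".getBytes(StandardCharsets.UTF_8));
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            // transferTo moves the bytes from the page cache to the target
            // channel inside the kernel; they never enter a JVM buffer.
            long sent = in.transferTo(0, in.size(), out);
            System.out.println(sent); // bytes transferred
        } finally {
            Files.deleteIfExists(src);
            Files.deleteIfExists(dst);
        }
    }
}
```

With a real SocketChannel as the destination, the same call becomes the sendfile-based fast path described above.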

Message quality

We can examine how Kafka guarantees message quality along three dimensions: real-time delivery, reliability, and throughput.

Space for time

To avoid an excessive number of network requests, Kafka producers combine multiple messages into a batch before submitting them, reducing the frequency of network IO and sacrificing a little latency for higher throughput.

In practice, we can configure this behavior on the producer client. For example, set the maximum batch size to 64 bytes and the linger time to 100 milliseconds. The effect: if the batch reaches 64 bytes within 100 milliseconds, it is sent immediately; if 100 milliseconds elapse before the batch reaches 64 bytes, the messages in the buffer are sent anyway.
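In the Java producer client, these two knobs are the real configuration keys batch.size (bytes) and linger.ms. A sketch of the configuration as plain Properties, mirroring the article's example values (the broker address is a placeholder; no broker is contacted here):

```java
import java.util.Properties;

public class ProducerBatchingConfig {
    public static Properties props() {
        Properties p = new Properties();
        // Real Kafka producer keys; values mirror the example in the text.
        p.put("bootstrap.servers", "localhost:9092"); // placeholder address
        p.put("batch.size", "64");   // send once a batch reaches 64 bytes...
        p.put("linger.ms", "100");   // ...or after waiting at most 100 ms
        return p;
    }

    public static void main(String[] args) {
        Properties p = props();
        System.out.println(p.getProperty("batch.size") + "," + p.getProperty("linger.ms"));
    }
}
```

These Properties would be passed to a KafkaProducer constructor in real code; note that 64 bytes is far smaller than the client's default batch.size and is used here only to match the article's example.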

Distributed design

Previously, we analyzed strategies in a single-machine environment (operating system, communication IO). What performance optimization strategies does Kafka have for horizontal scaling?

What should be considered to build a distributed messaging system?

How do we use the advantages of multiple distributed nodes to improve the message system's throughput, disaster tolerance, and elastic capacity expansion?

Let's think abstractly: messages are flowing water. A single machine is one water pipe; multiple nodes form a water supply network. To make the flow of messages stable, every link in the flow must be guaranteed.

First, list each link of the message flow.

How does Kafka deal with these links?

In terms of horizontal scaling, Kafka stores the messages collected from producers under one topic across multiple partitions, with the number of partitions typically greater than or equal to the number of Kafka nodes. Within a consumer group, each partition is assigned to at most one consumer.

Intuitively, partitions and consumers undergo rebalancing as follows.
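To make the "at most one consumer per partition" rule concrete, here is a toy round-robin assignment sketch; this is an illustration only, not Kafka's actual range/round-robin/sticky assignors:

```java
import java.util.*;

public class AssignDemo {
    // Assign each partition to exactly one consumer, round-robin.
    // A consumer may own several partitions; a partition never has two owners.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitions) {
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        for (String c : consumers) out.put(c, new ArrayList<>());
        for (int p = 0; p < partitions; p++) {
            out.get(consumers.get(p % consumers.size())).add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two consumers sharing five partitions.
        System.out.println(assign(Arrays.asList("c0", "c1"), 5));
    }
}
```

A rebalance is, in essence, re-running such an assignment whenever consumers join or leave the group.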

Backup Policy

In addition to horizontally scaled partitions, each partition is replicated into multiple backups (Replicas). A partition has one leader and several followers: the leader handles the partition's reads and writes, while the followers stay in the ISR (In-Sync Replicas) set. Once the leader fails, a new leader is elected from the most up-to-date followers.
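The failover rule — promote an in-sync replica when the leader dies — can be sketched as follows; this is a toy model, not Kafka's actual controller logic:

```java
import java.util.*;

public class LeaderElectionDemo {
    // Only replicas in the ISR are eligible; pick the first one still alive.
    static Optional<String> electLeader(List<String> isr, Set<String> alive) {
        return isr.stream().filter(alive::contains).findFirst();
    }

    public static void main(String[] args) {
        // broker-1 was the leader but has crashed; hypothetical broker names.
        List<String> isr = Arrays.asList("broker-1", "broker-2", "broker-3");
        Set<String> alive = new HashSet<>(Arrays.asList("broker-2", "broker-3"));
        System.out.println(electLeader(isr, alive).orElse("none"));
    }
}
```

Restricting the election to the ISR is what guarantees the new leader has all committed messages.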

Summary
