Performance comparison and analysis of Disruptor and LinkedBlockingQueue

2016/11/23 23:35

Introduction to Disruptor and LinkedBlockingQueue
Disruptor is a messaging component implemented in Java for inter-thread communication. Its core is a lock-free ring buffer. LinkedBlockingQueue is a blocking queue provided in the java.util.concurrent package. Because the two serve similar purposes, this article compares their performance.

Stress test
1. Stress test class for LinkedBlockingQueue

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class LinkedBlockingQueueTest {

        // Number of events pushed through the queue (also used by the Disruptor test)
        public static int eventNum = 5000000;

        public static void main(String[] args) {
            final BlockingQueue<LogEvent> queue = new LinkedBlockingQueue<LogEvent>();
            final long startTime = System.currentTimeMillis();

            // Producer: puts eventNum events into the queue
            new Thread(new Runnable() {
                @Override
                public void run() {
                    int i = 0;
                    while (i < eventNum) {
                        LogEvent logEvent = new LogEvent(i, "c" + i);
                        try {
                            queue.put(logEvent);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                        i++;
                    }
                }
            }).start();

            // Consumer: takes eventNum events, then prints the elapsed time
            new Thread(new Runnable() {
                @Override
                public void run() {
                    int k = 0;
                    while (k < eventNum) {
                        try {
                            queue.take();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                        k++;
                    }
                    long endTime = System.currentTimeMillis();
                    System.out.println("costTime = " + (endTime - startTime) + "ms");
                }
            }).start();
        }
    }

LinkedBlockingQueueTest implements a simple producer-consumer model: one thread inserts events and another thread reads them.

    import java.io.Serializable;

    public class LogEvent implements Serializable {

        private static final long serialVersionUID = 1L;

        private long logId;
        private String content;

        public LogEvent() {
        }

        public LogEvent(long logId, String content) {
            this.logId = logId;
            this.content = content;
        }

        public long getLogId() {
            return logId;
        }

        public void setLogId(long logId) {
            this.logId = logId;
        }

        public String getContent() {
            return content;
        }

        public void setContent(String content) {
            this.content = content;
        }
    }

The LogEvent entity class is also used by the Disruptor stress test class.

2. The following is the stress test class for Disruptor. It requires the Disruptor jar as a dependency:

    <dependency>
        <groupId>com.lmax</groupId>
        <artifactId>disruptor</artifactId>
        <version>3.3.6</version>
    </dependency>

Stress test class for Disruptor:

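    import com.lmax.disruptor.BusySpinWaitStrategy;
    import com.lmax.disruptor.RingBuffer;
    import com.lmax.disruptor.dsl.Disruptor;
    import com.lmax.disruptor.dsl.ProducerType;
    import com.lmax.disruptor.util.DaemonThreadFactory;

    public class DisruptorTest {

        public static void main(String[] args) {
            // LogEventFactory is not listed in this article; it is assumed to be an
            // EventFactory<LogEvent> that returns new LogEvent().
            LogEventFactory factory = new LogEventFactory();
            int ringBufferSize = 65536; // must be a power of 2

            final Disruptor<LogEvent> disruptor = new Disruptor<LogEvent>(factory,
                    ringBufferSize, DaemonThreadFactory.INSTANCE,
                    ProducerType.SINGLE, new BusySpinWaitStrategy());

            LogEventConsumer consumer = new LogEventConsumer();
            disruptor.handleEventsWith(consumer);
            disruptor.start();

            // Producer: claims a slot, fills in the pre-allocated event, then publishes it
            new Thread(new Runnable() {
                @Override
                public void run() {
                    RingBuffer<LogEvent> ringBuffer = disruptor.getRingBuffer();
                    for (int i = 0; i < LinkedBlockingQueueTest.eventNum; i++) {
                        long seq = ringBuffer.next();
                        LogEvent logEvent = ringBuffer.get(seq);
                        logEvent.setLogId(i);
                        logEvent.setContent("c" + i);
                        ringBuffer.publish(seq);
                    }
                }
            }).start();
        }
    }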

To keep the test data accurate, the Disruptor uses ProducerType.SINGLE mode and only one LogEventConsumer.

    import com.lmax.disruptor.EventHandler;

    public class LogEventConsumer implements EventHandler<LogEvent> {

        private long startTime;
        private int i; // number of events received so far

        public LogEventConsumer() {
            this.startTime = System.currentTimeMillis();
        }

        @Override
        public void onEvent(LogEvent logEvent, long sequence, boolean endOfBatch) throws Exception {
            i++;
            // Print the elapsed time once the last event has arrived
            if (i == LinkedBlockingQueueTest.eventNum) {
                long endTime = System.currentTimeMillis();
                System.out.println(" costTime = " + (endTime - startTime) + "ms");
            }
        }
    }

LogEventConsumer records the start time, the end time, and the number of messages received, so that the elapsed time can be computed.

Stress test results
Test environment:
Operating system: Windows 7 32-bit
CPU: Intel Core i3-2350M 2.3GHz 4-core
Memory: 3 GB
JDK: 1.6

Each of the two tests above was run several times and the results were averaged. The results are as follows:

The results show that the Disruptor is about 1.65 times as fast as the LinkedBlockingQueue. The test environment is my laptop, whose configuration is fairly low, so the gap is not especially pronounced. The same test on my company desktop (Win7 64-bit, Intel Core i5 4-core, 4 GB memory, JDK 1.7) shows a gap of about 3-4 times. The official figure is about 5 times: https://github.com/LMAX-Exchange/disruptor/wiki/Performance-Results

Analysis of the performance gap
1. Locks versus CAS
LinkedBlockingQueue uses locks, as shown below:

    /** Lock held by take, poll, etc */
    private final ReentrantLock takeLock = new ReentrantLock();

    /** Lock held by put, offer, etc */
    private final ReentrantLock putLock = new ReentrantLock();

The Disruptor avoids locks: it relies on CAS operations and, in this test, on the BusySpinWaitStrategy wait strategy.
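As a rough illustration of the two approaches, the sketch below increments one counter under a ReentrantLock and another with a CAS retry loop. It is not taken from the Disruptor or LinkedBlockingQueue source; the class and method names (CasVsLockSketch, incrementWithLock, incrementWithCas, run) are made up for this example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: the same counter update done with a lock and with CAS.
public class CasVsLockSketch {

    private static final ReentrantLock lock = new ReentrantLock();
    private static long lockedCounter = 0L;
    private static final AtomicLong casCounter = new AtomicLong();

    // Lock-based update: a contended thread may be parked until the lock is free.
    static void incrementWithLock() {
        lock.lock();
        try {
            lockedCounter++;
        } finally {
            lock.unlock();
        }
    }

    // CAS-based update: a thread that loses the race simply retries, it never blocks.
    static void incrementWithCas() {
        long current;
        do {
            current = casCounter.get();
        } while (!casCounter.compareAndSet(current, current + 1));
    }

    // Runs both kinds of increment from several threads and returns {locked, cas} totals.
    public static long[] run(int threads, final int perThread) throws InterruptedException {
        lockedCounter = 0L;
        casCounter.set(0L);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        incrementWithLock();
                        incrementWithCas();
                    }
                }
            });
        }
        for (Thread w : workers) {
            w.start();
        }
        for (Thread w : workers) {
            w.join(); // join makes the workers' writes visible to this thread
        }
        return new long[] { lockedCounter, casCounter.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] totals = run(4, 100000);
        System.out.println("locked = " + totals[0] + ", cas = " + totals[1]);
    }
}
```

Both counters end at the same total; the difference is in how a thread behaves under contention, which is where the cost of locking shows up.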

2. Avoiding false sharing
The cache system stores data in units of cache lines. A cache line is a run of consecutive bytes whose size is a power of 2, generally 32-256 bytes; the most common size is 64 bytes. When multiple threads modify independent variables that happen to share the same cache line, they unintentionally degrade each other's performance. This is called false sharing.

Look at an example:

    public class FalseSharing implements Runnable {

        public final static int NUM_THREADS = 4;
        public final static long ITERATIONS = 50000000;

        private final int arrayIndex;
        private static VolatileLong[] longs = new VolatileLong[NUM_THREADS];

        static {
            for (int i = 0; i < longs.length; i++) {
                longs[i] = new VolatileLong();
            }
        }

        public FalseSharing(final int arrayIndex) {
            this.arrayIndex = arrayIndex;
        }

        public static void main(final String[] args) throws Exception {
            final long start = System.currentTimeMillis();
            runTest();
            System.out.println("costTime = " + (System.currentTimeMillis() - start) + "ms");
        }

        private static void runTest() throws InterruptedException {
            Thread[] threads = new Thread[NUM_THREADS];
            for (int i = 0; i < threads.length; i++) {
                threads[i] = new Thread(new FalseSharing(i));
            }
            for (Thread t : threads) {
                t.start();
            }
            for (Thread t : threads) {
                t.join();
            }
        }

        @Override
        public void run() {
            // Each thread writes only to its own VolatileLong
            long i = ITERATIONS + 1;
            while (0 != --i) {
                longs[arrayIndex].value = i;
            }
        }

        public final static class VolatileLong {
            public volatile long value = 0L;
            public long p1, p2, p3, p4, p5, p6; // padding
        }
    }

Run the test twice: once as written, and once with the line public long p1, p2, p3, p4, p5, p6; in VolatileLong commented out. The padded (uncommented) version turns out to be about 4 times as fast. The reason is that the cache line size is 64 bytes, and with the padding a VolatileLong object occupies just about one cache line. Without the padding, several VolatileLong objects share one cache line, and the threads writing to them unintentionally slow each other down.
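The arithmetic behind "just occupies a cache line" can be sketched as follows. This is a back-of-the-envelope estimate, not a measurement: it assumes an 8-byte object header (typical of a 32-bit HotSpot JVM, as in the laptop environment above), 8-byte alignment, and objects laid out back to back in memory. On a 64-bit JVM the header is larger and the numbers shift. CacheLineMath and its methods are hypothetical names.

```java
// Hypothetical sketch: estimated footprint of VolatileLong with and without padding.
public class CacheLineMath {

    static final int CACHE_LINE_BYTES = 64; // cache line size quoted in the article
    static final int LONG_BYTES = 8;

    // Estimated object size: header plus long fields, rounded up to 8-byte alignment.
    public static int objectBytes(int headerBytes, int longFields) {
        int raw = headerBytes + longFields * LONG_BYTES;
        return (raw + 7) / 8 * 8;
    }

    // How many such objects fit in one cache line if they are laid out back to back.
    public static int objectsPerLine(int objectBytes) {
        return Math.max(1, CACHE_LINE_BYTES / objectBytes);
    }

    public static void main(String[] args) {
        int header = 8; // assumption: 32-bit HotSpot object header

        int padded = objectBytes(header, 7);   // value + p1..p6
        int unpadded = objectBytes(header, 1); // value only

        // prints "padded: 64 bytes, 1 per cache line"
        System.out.println("padded: " + padded + " bytes, "
                + objectsPerLine(padded) + " per cache line");
        // prints "unpadded: 16 bytes, 4 per cache line"
        System.out.println("unpadded: " + unpadded + " bytes, "
                + objectsPerLine(unpadded) + " per cache line");
    }
}
```

Under these assumptions all four unpadded VolatileLong objects can land in a single cache line, so every write by one thread invalidates the line for the other three.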

Looking at the source code of Disruptor, we can find many places that avoid false sharing, such as:

    abstract class SingleProducerSequencerPad extends AbstractSequencer {
        protected long p1, p2, p3, p4, p5, p6, p7;

        public SingleProducerSequencerPad(int bufferSize, WaitStrategy waitStrategy) {
            super(bufferSize, waitStrategy);
        }
    }

3. Use of the RingBuffer
The Disruptor uses a ring buffer to build its lock-free queue. For details on ring buffers, see the wiki: https://zh.wikipedia.org/wiki/%E7%92%B0%E5%BD%A2%E7%B7%A9%E8%A1%9D%E5%8D%80
The array is pre-allocated, which avoids the runtime overhead caused by Java GC. When producing a message or event, the producer updates the fields of an existing element in the RingBuffer rather than replacing the element.
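The idea of pre-allocating slots and overwriting their fields can be sketched with a very small single-producer, single-consumer ring buffer. This is a simplified illustration, not the Disruptor's actual implementation: RingBufferSketch, Slot, publish, takeLogId and sumThroughBuffer are hypothetical names, and the waiting loops use Thread.yield() rather than any of the Disruptor's wait strategies.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a single-producer, single-consumer ring buffer.
public class RingBufferSketch {

    // Mutable slot: created once up front and reused for every event.
    public static final class Slot {
        long logId;
        String content;
    }

    private final Slot[] slots;
    private final int mask;
    private final AtomicLong published = new AtomicLong(-1); // last sequence written
    private final AtomicLong consumed = new AtomicLong(-1);  // last sequence read

    public RingBufferSketch(int size) {
        if (Integer.bitCount(size) != 1) {
            throw new IllegalArgumentException("size must be a power of 2");
        }
        slots = new Slot[size];
        for (int i = 0; i < size; i++) {
            slots[i] = new Slot(); // pre-allocate every element
        }
        mask = size - 1;
    }

    // Producer: wait for a free slot, overwrite its fields, then publish the sequence.
    public void publish(long logId, String content) {
        long seq = published.get() + 1;
        while (seq - consumed.get() > slots.length) {
            Thread.yield(); // buffer full: the consumer has not released this slot yet
        }
        Slot slot = slots[(int) (seq & mask)]; // sequence wraps around the array
        slot.logId = logId;
        slot.content = content;
        published.set(seq); // volatile write makes the field updates visible
    }

    // Consumer: wait for the next sequence, read the slot, then release it.
    public long takeLogId() {
        long seq = consumed.get() + 1;
        while (published.get() < seq) {
            Thread.yield(); // nothing published yet
        }
        long id = slots[(int) (seq & mask)].logId;
        consumed.set(seq);
        return id;
    }

    // Pushes ids 0..events-1 through the buffer and returns their sum on the consumer side.
    public static long sumThroughBuffer(int size, final int events) throws InterruptedException {
        final RingBufferSketch buffer = new RingBufferSketch(size);
        Thread producer = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < events; i++) {
                    buffer.publish(i, "c" + i);
                }
            }
        });
        producer.start();
        long sum = 0;
        for (int i = 0; i < events; i++) {
            sum += buffer.takeLogId();
        }
        producer.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum = " + sumThroughBuffer(8, 1000)); // prints "sum = 499500"
    }
}
```

No Slot is allocated after construction, however many events pass through, and the array index is computed with a bit mask, which is why the size is required to be a power of 2.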

There are surely other reasons as well; these three are the ones sorted out so far.

Summary
The high performance of Disruptor has long been put to use in third-party libraries such as log4j2, where it gave log4j2 a qualitative leap in performance. Previously, we compared the performance of three mainstream logging frameworks: https://my.oschina.net/OutOfMemory/blog/789267
