TCP has no message boundaries: segmentation and reassembly (the so-called "packet sticking")

Segmentation and reassembly when sending and receiving TCP byte streams is a commonplace topic, and many people are used to calling it "packet sticking". In fact, TCP is a stream-mode protocol: it is byte-stream oriented, and at the TCP level there is no such thing as a "packet". The so-called "packet sticking" is really a headache for application-layer parsing.

What TCP is and what it does

Streaming protocol

Anyone who took a computer networks course in college has learned that the Transmission Control Protocol (TCP) is a connection-oriented, reliable, stream-mode protocol. So what is a streaming protocol, and why is it reliable?

TCP provides the upper layer, that is, the application layer, with a byte stream whose delivery and ordering are guaranteed. The upper layer therefore sees a reliable stream.

When sending, the upper layer drops bytes into TCP's send buffer. When certain conditions are met, TCP takes bytes out of the stream, segments them, and adds its own header to each piece, forming one TCP segment after another, which are handed down to the network layer for further processing.

Header

The header of the TCP segment is the key to "reliable transmission". Its most important fields are described below.

The sequence number is the offset of the first payload byte of this segment within the entire byte stream sent in this direction (the initial sequence number is random, not necessarily 0). The acknowledgment number marks how far into the peer's byte stream the sending host has received; in other words, it is the number of the next byte expected from the peer.

Note that the sequence number and the acknowledgment number are independent of each other. The sequence number identifies the byte stream we send, while the acknowledgment number identifies how much of the byte stream sent by the other party we have received. To improve efficiency, the acknowledgment piggybacks on our own data segments.
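As a rough worked example (the numbers are purely illustrative, and the sequence number consumed by the SYN is ignored): suppose the first data byte we send is numbered 1000, we have already sent 500 bytes, and the last in-order byte received from the peer is its byte 7999. The next segment we send then carries:

    seq = 1500    (the first payload byte of this segment is our byte number 1500)
    ack = 8000    (we have received the peer's bytes up to 7999 and expect byte 8000 next)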

Receiving segments and splicing the byte stream

After a segment arrives, it is temporarily stored in the receive buffer. TCP takes the bytes out and splices them into the correct order according to the sequence numbers in the segment headers, and sends an ACK segment (not necessarily immediately), that is, an acknowledgment of arrival.

If no ACK comes back, the segment has most likely been lost, and the missing data will be retransmitted. Of course, retransmission is governed by a very complex set of strategies; sliding windows, retransmission timers and the like are deep rabbit holes that cannot be covered in a few sentences.

Delayed ACK

ACKs are not sent in real time. For efficiency, TCP tries to send as few pure ACK segments as possible.

At the same time, a single ACK can acknowledge all preceding segments cumulatively, so it is not necessary to send one ACK for every segment received.

To improve efficiency, the ACK is held back for a short while, so that if we happen to send data it can piggyback on that segment, and if more segments arrive in the meantime the next ACK can cover them as well. If the wait times out, a pure ACK segment is sent.

Nagle algorithm

The Nagle algorithm stipulates that at most one small, unacknowledged segment may be outstanding on a connection at any given time.

For example, once a small packet has been sent but its ACK has not yet arrived, any further small packets can only wait in the send buffer. If the ACK is delayed, more and more small writes accumulate in the buffer; once the buffered data reaches the MSS, it is sent out as one large segment.

What is the so-called "subcontracting sticking"

As said at the beginning, TCP itself has no notion of a "packet". The so-called "packet sticking" is really a problem for application-layer parsing.

The byte stream maintained by TCP guarantees that the bytes arrive in a consistent order, but the stream itself has no boundaries.

Whatever the application layer sends, whether in one call or in two, is put into the send buffer and assembled into a byte stream for sliding-window processing; TCP only cares about byte-stream order.

A long byte stream runs into the MTU limit and gets split; to keep transmission efficient, or because of delayed ACKs, short writes may be merged into a larger segment by the Nagle algorithm before being sent.

Retransmission and congestion control may also come into play, affecting when segments arrive, with the receiver still waiting for missing pieces to be filled in.

The lower-layer MTU limit + the Nagle algorithm + delayed ACK + retransmission + congestion control together produce the curious phenomenon this article is about.

The end result: the application layer sends 100 bytes in one call, which are thrown into the send buffer and transmitted according to TCP's policies; the other end calls its receive method once and gets only 10 bytes, or perhaps 120 bytes.

The application layer on the other side fetches data once and finds it is not what it expected: too many bytes, too few bytes, or data that arrives late (because small packets were delayed in sending). Some people call this phenomenon "TCP packet sticking", and TCP refuses to take the blame.
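A minimal loopback sketch of this effect in C# (illustrative only; class and variable names are made up, and the byte counts actually observed depend on timing and the local TCP stack). The sender makes two 100-byte writes; the receiver's single Read call may return 100 bytes, 200 bytes, or some other amount, because the write boundaries simply do not exist on the wire:

    using System;
    using System.Net;
    using System.Net.Sockets;
    using System.Text;
    using System.Threading.Tasks;

    class StickyDemo
    {
        static void Main()
        {
            var listener = new TcpListener(IPAddress.Loopback, 0);
            listener.Start();
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;

            // Receiver: one Read call knows nothing about the sender's write boundaries.
            var receiver = Task.Run(() =>
            {
                using var server = listener.AcceptTcpClient();
                var buffer = new byte[4096];
                int n = server.GetStream().Read(buffer, 0, buffer.Length);
                Console.WriteLine($"First Read returned {n} bytes"); // may be 100, 200, or something else
            });

            using var client = new TcpClient();
            client.Connect(IPAddress.Loopback, port);
            byte[] payload = Encoding.UTF8.GetBytes(new string('A', 100));
            client.GetStream().Write(payload, 0, payload.Length); // "send 100 bytes"
            client.GetStream().Write(payload, 0, payload.Length); // and another 100 bytes
            receiver.Wait();
        }
    }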

How to solve it

1. Dealing with the lack of boundaries

Although the byte stream maintained by TCP has no boundaries, we can design an application-layer protocol that puts boundaries back into the stream.

In a previous project I put together a simple application-layer protocol. Its structure is as follows:

Byte(s)            Content           Meaning
0                  '/'               Marker identifying the start of a packet
1-2                unsigned short    Total length of the packet (including the marker and this field itself)
3-n (n < 65538)    UTF-8             JSON data

When a chunk of the byte stream is received, it is first stored in a temporary buffer; then the marker '/' is located to delimit a packet, the header is parsed to obtain the length, and finally that many bytes are read and converted from UTF-8 into a string.
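A minimal sketch of that framing logic in C# (the class and method names are made up for illustration, not taken from the original project, and the little-endian layout of the length field is an assumption):

    using System;
    using System.Collections.Generic;
    using System.Text;

    class FrameParser
    {
        private readonly List<byte> _buffer = new List<byte>();

        // Append freshly received bytes, then yield every complete JSON payload found so far.
        public IEnumerable<string> Feed(byte[] data, int count)
        {
            for (int i = 0; i < count; i++) _buffer.Add(data[i]);

            while (true)
            {
                // 1. Find the frame marker '/' and discard anything before it.
                int start = _buffer.IndexOf((byte)'/');
                if (start < 0) { _buffer.Clear(); yield break; }
                if (start > 0) _buffer.RemoveRange(0, start);

                // 2. We need the marker plus the 2-byte length field before the frame size is known.
                if (_buffer.Count < 3) yield break;
                ushort totalLength = (ushort)(_buffer[1] | (_buffer[2] << 8)); // little-endian length (assumption)
                if (totalLength < 3) { _buffer.RemoveAt(0); continue; }        // corrupt header, resync on the next marker

                // 3. Wait until the whole frame (header + payload) has arrived.
                if (_buffer.Count < totalLength) yield break;

                // 4. Extract the UTF-8 JSON payload and drop the consumed frame.
                string json = Encoding.UTF8.GetString(_buffer.GetRange(3, totalLength - 3).ToArray());
                _buffer.RemoveRange(0, totalLength);
                yield return json;
            }
        }
    }

Whatever a single receive call delivers is fed in as-is; zero, one, or several complete JSON messages may come out, which is exactly the boundary problem the protocol exists to solve.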

The application scenario at the time was numerical monitoring: real-time freshness mattered a lot, but gaps in the data did not. So once the temporary buffer grew beyond a certain size, parsing simply jumped to the last marker and threw the earlier bytes away, avoiding ever-slower updates and a snowballing backlog.
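That trimming step could look roughly like this (illustrative only; MaxBuffered is a made-up threshold and _buffer is the temporary buffer from the sketch above):

    // Freshness matters more than completeness here: once the buffer grows too large,
    // keep only the bytes from the last marker onward and throw the older ones away.
    if (_buffer.Count > MaxBuffered)
    {
        int last = _buffer.LastIndexOf((byte)'/');
        _buffer.RemoveRange(0, last < 0 ? _buffer.Count : last);
    }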

2. Dealing with delayed transmission of short writes

To improve network transmission efficiency, when the data in the send buffer is smaller than the MSS, TCP control policies such as Nagle may delay sending the segment, and delayed ACK aggravates this delay.

Therefore, if the real-time requirement is extremely strict, you can disable the Nagle algorithm. Taking the C# Socket as an example:

    socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.NoDelay, true);

Here SocketOptionName.NoDelay disables the Nagle algorithm.
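Assuming an already constructed System.Net.Sockets.Socket (TcpClient exposes the same property), the NoDelay convenience property achieves the same thing:

    // Equivalent to setting SocketOptionName.NoDelay via SetSocketOption above.
    socket.NoDelay = true;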

Zimiao haunting blog (azimiao.com). All rights reserved. Please note the link when reprinting: https://www.azimiao.com/6417.html