Tencent Cloud PCDN: From P2P to Internet of Everything Framework

Looking back at the history of XP2P: it started out as a way to cut bandwidth costs for the live streaming business that has emerged in recent years. At the bottom layer it has built a solid foundation for direct connectivity with STUN, port prediction, birthday attacks, and UPnP. On top of direct UDP connectivity it provides XNTP, an efficient and reliable transport protocol, and on top of XNTP it implements the widely used HTTP protocol. With that, the service framework for the Internet of Everything is basically in place, characterized by low network load, high transmission performance, and stability. This article is based on a talk given by Zhang Peng, senior engineer at Tencent Cloud, at LiveVideoStackCon 2019 Beijing.

By Zhang Peng

Compiled by LiveVideoStack

Hello, I'm Zhang Peng from Tencent Cloud. I have worked on P2P technology since 2014, tackling its hard technical problems. Over the past few years, Tencent Cloud XP2P has launched in several product lines and has stood the test of high-traffic live broadcasts such as the parade and major events. Today I will focus on how Tencent Cloud P2P handles network penetration, network transmission, and network topology construction.

1. What is XP2P?

The P2P architecture embodies the core idea of Internet architecture. In short: "you have something, I have something, and everyone shares what they have". The concept was already described in RFC 1, and it is the architecture the earliest Internet builders dreamed of most. So why does the centralized service model still dominate? Mainly because some historical and technical problems have never been solved. Even so, P2P products have kept appearing throughout the rapid development of Internet technology over the past 30 years, such as Napster, PPLive, and BitTorrent. In addition, 2006 to 2008 was the golden age of P2P research in Chinese academia, and universities produced a great deal of work: GridMedia from Tsinghua University, CoolStreaming from Hong Kong University of Science and Technology, AnySee from Huazhong University of Science and Technology, and others, which represented the highest level of streaming P2P at that time. After 2008, because of the heavy load P2P traffic placed on the network, operators suppressed it and P2P went into a low period. That lasted until live streaming took off in 2014, when Tencent Cloud XP2P brought P2P back to market as a cloud service.

2. XP2P product functions

2.1 Penetration

2.1.1 P2P NAT penetration

Any discussion of P2P has to start with how peers establish connections with each other. IPv4 addresses are running out, so the outbound address of a home broadband modem or router is usually no longer a public IP address; NAT (network address translation) is what makes this workable. But NAT is a double-edged sword. When we send a request, it is identified inside the LAN by an intranet address and port; when it reaches the Internet, NAT maps it to a public address and port. The problem is that even if we know the other party's public address, we still cannot connect: when our packet arrives, the other party's NAT does not recognize us and blocks it. So the first step is to open up a path for connections between peers. The following figure describes how two peers connect with the help of an intermediate forwarding server.

2.1.2 NAT type

The STUN protocol (in its classic form, RFC 3489) distinguishes seven network types: Open Internet, UDP Blocked, Symmetric Firewall, Full Cone, Restricted Cone, Port Restricted Cone, and Symmetric. They are described on the slide, so I will not go through them one by one. The key point is that the symmetric type is the hardest to deal with: it uses a different, random mapped address for each external peer it connects to, and it also behaves like the port-restricted type, which makes it difficult to penetrate.

2.1.3 STUN protocol

The STUN protocol is used to detect which of the seven types above a NAT belongs to. Its packet format is shown on the left of the figure below and its flow chart on the right. Two steps in the figure deserve special attention. The first is Test II: the client sends a request to the original STUN server address but asks for the response to come back from a different IP address, to see whether it can still be received. The second is Test III: the client again sends a request to the original STUN server address but asks for the response to come from a different port, to see whether it can be received.
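The decision flow can be sketched as a small function. This is a minimal sketch of the classic RFC 3489 flow chart, not XP2P's implementation; the `ProbeResults` container and its field names are hypothetical, and actually sending the STUN requests is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Addr = Tuple[str, int]


@dataclass
class ProbeResults:
    """Outcome of the STUN tests (hypothetical container, filled in by a prober)."""
    local_addr: Addr                   # address the client socket is bound to
    test1_mapped: Optional[Addr]       # mapped address from Test I, None if no reply
    test2_reply: bool                  # Test II: reply arrived from a changed address
    test1_alt_mapped: Optional[Addr]   # Test I repeated against the alternate server address
    test3_reply: bool                  # Test III: reply arrived from a changed port only


def classify_nat(r: ProbeResults) -> str:
    """Walk the classic STUN decision flow and name the network type."""
    if r.test1_mapped is None:
        return "UDP Blocked"
    if r.test1_mapped == r.local_addr:
        # No address translation: either fully open or behind a firewall.
        return "Open Internet" if r.test2_reply else "Symmetric Firewall"
    if r.test2_reply:
        return "Full Cone"
    if r.test1_alt_mapped != r.test1_mapped:
        # The mapping depends on the destination, so the NAT is symmetric.
        return "Symmetric"
    return "Restricted Cone" if r.test3_reply else "Port Restricted Cone"
```

For example, a client whose mapped address stays the same toward both server addresses, but which receives neither the Test II nor the Test III reply, is classified as Port Restricted Cone.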

Most P2P applications today rely on STUN to penetrate NATs and establish a direct channel, but STUN does not solve the whole problem: it cannot establish a connection between a symmetric NAT and a port-restricted NAT, or between two symmetric NATs. The following figure shows the symmetric to port-restricted case. Both peers connect to a hole-punching server on the Internet and learn each other's addresses through it. When the symmetric peer sends a packet to the port-restricted peer, the destination address has changed, so its NAT allocates a new mapped address, creating a new "hole". However, the address the hole-punching server passed to the port-restricted peer is the old "hole" that the symmetric peer created when it talked to the server. The port-restricted peer therefore sends its packets to the old "hole", not to the new "hole" the symmetric peer intends to use for this communication, so the traffic is rejected and hole punching fails. To make matters worse, as security requirements rise, symmetric and port-restricted NATs are becoming more and more common, and STUN can do nothing about them, which is a pity.
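The failure can be reproduced with a toy model. The sketch below is illustrative only: the two NAT classes are simplified assumptions (no timeouts, one internal host each), and the addresses are documentation placeholders.

```python
import random
from typing import Dict, Set, Tuple

Addr = Tuple[str, int]


class PortRestrictedNat:
    """Keeps one public mapping; accepts inbound only from (ip, port) pairs it has sent to."""

    def __init__(self, public: Addr) -> None:
        self.public = public
        self.allowed: Set[Addr] = set()

    def send(self, dest: Addr) -> Addr:
        self.allowed.add(dest)
        return self.public  # source address as seen by the destination

    def accepts(self, src: Addr) -> bool:
        return src in self.allowed


class SymmetricNat:
    """Allocates a fresh random public port (a new "hole") for every destination."""

    def __init__(self, public_ip: str, rng: random.Random) -> None:
        self.public_ip = public_ip
        self.rng = rng
        self.mappings: Dict[Addr, int] = {}  # destination -> public port

    def send(self, dest: Addr) -> Addr:
        if dest not in self.mappings:
            port = self.rng.randrange(1024, 65536)
            while port in self.mappings.values():
                port = self.rng.randrange(1024, 65536)
            self.mappings[dest] = port
        return (self.public_ip, self.mappings[dest])

    def accepts(self, src: Addr, public_port: int) -> bool:
        # A hole only admits traffic from the destination it was opened for.
        return self.mappings.get(src) == public_port


def simulate_hole_punch(seed: int = 0) -> Tuple[bool, bool]:
    """Return (symmetric->restricted delivered, restricted->symmetric delivered)."""
    server: Addr = ("203.0.113.1", 3478)
    sym = SymmetricNat("198.51.100.7", random.Random(seed))
    prc = PortRestrictedNat(("192.0.2.9", 40000))

    old_hole = sym.send(server)     # what the server learns about the symmetric peer
    prc_public = prc.send(server)   # what the server learns about the restricted peer

    new_hole = sym.send(prc_public)  # the new destination creates a new hole
    prc.send(old_hole)               # the restricted peer only knows the old hole

    return prc.accepts(new_hole), sym.accepts(prc_public, old_hole[1])
```

In this model `simulate_hole_punch()` returns `(False, False)`: the restricted side has only whitelisted the old hole, and the old hole on the symmetric side only admits the server.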

2.1.4 Breaking symmetric NAT penetration: port prediction

Breaking through symmetric NAT has therefore become one of the keys of P2P technology. An earlier paper proposed solving it with port prediction. First, the client probes how its own external port changes from one mapping to the next, which is the basis for prediction; the other party probes its own port-change pattern in the same way. Then, through the penetration server, each side learns the other's information, predicts the other side's next port, and sends packets to it, so that hole punching succeeds. This worked at the time, because 95% of devices then allocated ports incrementally. Today, however, most NATs allocate ports randomly, which is another pity. So are we out of options? Most cases of random-port symmetric to port-restricted, and of symmetric to symmetric, are still unsolved!
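As a rough illustration of port prediction, the sketch below guesses the next mapped port when the observed ports change by a constant step. The function name and the three-sample minimum are assumptions for illustration; the paper's actual method is not reproduced here.

```python
from typing import List, Optional


def predict_next_port(observed: List[int]) -> Optional[int]:
    """Predict the next mapped port if the observed ports change by a constant step.

    `observed` holds the public ports seen by successive probes, oldest first.
    Returns None when there are too few samples or no constant step, which is
    what happens with a NAT that allocates ports randomly.
    """
    if len(observed) < 3:
        return None
    steps = {b - a for a, b in zip(observed, observed[1:])}
    if len(steps) != 1:
        return None
    candidate = observed[-1] + steps.pop()
    return candidate if 1 <= candidate <= 65535 else None
```

With probes that observed ports 40000, 40001, and 40002 the guess is 40003, while a random sequence such as 40000, 51234, 1029 yields no prediction, which is exactly the case that port prediction cannot handle.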

2.1.5 Breakthrough of symmetric NAT with random ports: birthday attack

We know that a port-restricted NAT keeps the same mapped address, while a random-port symmetric NAT picks a random mapped address for each destination. How can the two establish a direct channel? Inspired by port prediction, it turns out this can be done at a small cost. The scheme used here is the birthday attack, that is, penetration by random collision. The birthday attack comes from the birthday paradox: with 23 people in a conference room and 365 days in a year, each person's birthday could fall on any day, yet the probability that at least two of them share a birthday is about 50%. Put plainly, with a number of attempts (23 people) far smaller than the sample set (365 possible birthdays), we get a collision (a shared birthday) with high probability. So if the sample set is 65535 ports, how many ports must each side try at random for the probability of a successful collision to reach 80%?

Specifically, it works like this:

1. The random-port (symmetric) side creates new sockets and sends packets from each of them to the Internet address of the port-restricted side. Each packet is mapped to a different random "hole" as it leaves the NAT.

2. The port-restricted side sends packets to randomly chosen ports on the other party's address. If one of those destination ports happens to be a "hole" opened on the random NAT's Internet address in step 1, the two sides match and the connection is established.

With each side sending 400 packets, the penetration success rate exceeds 80%. The cost is those 400 packets. Each contains only an Ethernet frame header, an IP header, a UDP header, and a connection ID, at most 50 bytes per packet, so 400 packets total 20 KB. That is a perfectly acceptable price for a successful connection to a high-quality node.
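The numbers can be checked with a short calculation. This sketch assumes an idealized model that the talk does not spell out: the symmetric NAT opens n distinct holes chosen uniformly from 65535 ports, and the other side probes n distinct ports chosen uniformly at random. The function names are illustrative.

```python
PORT_SPACE = 65535  # sample set size used in the talk


def hit_probability(n: int, ports: int = PORT_SPACE) -> float:
    """Probability that at least one of n random probes lands on one of n random holes.

    P(miss) = C(ports - n, n) / C(ports, n), evaluated as a running product.
    """
    miss = 1.0
    for i in range(n):
        miss *= (ports - n - i) / (ports - i)
    return 1.0 - miss


def probes_needed(target: float, ports: int = PORT_SPACE) -> int:
    """Smallest n (per side) whose hit probability reaches `target`."""
    n = 1
    while hit_probability(n, ports) < target:
        n += 1
    return n


def total_cost_bytes(packets: int = 400, packet_bytes: int = 50) -> int:
    """Traffic spent on one attempt, using the per-packet size quoted above."""
    return packets * packet_bytes
```

Under this model, `probes_needed(0.8)` comes out somewhat above 300 packets per side and `hit_probability(400)` a little over 90%, which is consistent with the "more than 80%" quoted for 400 packets; `total_cost_bytes()` is 20,000 bytes.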

The birthday attack greatly increases the chance that two peers can connect. But it is still not enough: it only handles symmetric to port-restricted penetration, and a connection between two random-port symmetric NATs still cannot be established!

2.1.6 UPnP: Change Symmetric NAT to Full Cone

UPnP (Universal Plug and Play) is a protocol designed to let home devices connect seamlessly with other networked devices and to simplify network setup. It is really more like an all-encompassing Internet of Things protocol. We found that it includes two services, WANIPConnection and WANPPPConnection, through which a host on the internal network can actively add a full-cone port mapping on the external side. So when a symmetric NAT is hard to penetrate, we can use this protocol to turn it into a full cone, further improving the penetration success rate.
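To make the mapping request concrete, the sketch below builds the SOAP message for the `AddPortMapping` action of the WANIPConnection service. It only constructs the headers and body; discovering the gateway's control URL (via SSDP and the device description) and posting the request are not shown, and the helper name and default values are assumptions for illustration.

```python
from typing import Dict, Tuple
from xml.sax.saxutils import escape

SERVICE_TYPE = "urn:schemas-upnp-org:service:WANIPConnection:1"


def build_add_port_mapping(
    external_port: int,
    internal_client: str,
    internal_port: int,
    protocol: str = "UDP",
    description: str = "p2p",
    lease_seconds: int = 0,
) -> Tuple[Dict[str, str], str]:
    """Return (HTTP headers, SOAP envelope) for an AddPortMapping request."""
    args = [
        ("NewRemoteHost", ""),            # empty: accept any remote host (full cone)
        ("NewExternalPort", external_port),
        ("NewProtocol", protocol),
        ("NewInternalPort", internal_port),
        ("NewInternalClient", internal_client),
        ("NewEnabled", 1),
        ("NewPortMappingDescription", description),
        ("NewLeaseDuration", lease_seconds),
    ]
    body_args = "".join(f"<{k}>{escape(str(v))}</{k}>" for k, v in args)
    envelope = (
        '<?xml version="1.0"?>'
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        f'<s:Body><u:AddPortMapping xmlns:u="{SERVICE_TYPE}">'
        f"{body_args}</u:AddPortMapping></s:Body></s:Envelope>"
    )
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        "SOAPAction": f'"{SERVICE_TYPE}#AddPortMapping"',
    }
    return headers, envelope
```

Leaving `NewRemoteHost` empty is what makes the mapping behave like a full cone: any external host may send to the external port.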

At this point we have a close to optimal peer-to-peer penetration scheme, one that we consider better than those of WebRTC and libp2p. It achieves a good connection success rate, and it is the core foundation of our P2P.

2.2 Transmission

2.2.1 Disadvantages of TCP protocol

When it comes to transport protocols, we have to talk about TCP. It has been the foundation of the Internet for two or three decades, but it has weaknesses. First, it starts slowly. With slow start, TCP climbs from a very low initial rate toward a threshold by doubling each round, so reaching the ideal rate takes many RTTs, hence the name "slow" start. In the initial stage it grows more slowly than even a fixed linear ramp with a steep slope. Second, its congestion control is poor. TCP adds one packet to the sending window each round, and as soon as it exceeds what the network can carry it cuts the window in half, so its average sending rate reaches only about 75% of the network's capacity. Third, TCP copes badly with jitter and packet loss. You may have heard the saying that "once the TCP packet loss rate exceeds 20%, the connection is basically useless". Here is an example: suppose the packet loss rate is 30%. The probability of a packet being lost three times in a row is then 2.7%, which means that by the time about 40 more packets have been sent, three duplicate ACKs in a row will have occurred and the rate will have been halved; this repeats, and TCP's rate never climbs back. Finally, there is retransmission ambiguity: when a packet is lost and I receive it in the next round, I cannot tell whether what arrived is the original packet or the retransmission.
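The two figures in this paragraph can be checked with a few lines. This is a back-of-the-envelope sketch under simplifying assumptions (an idealized sawtooth window and independent losses); the function names are illustrative.

```python
def sawtooth_utilization(capacity: int) -> float:
    """Average window / capacity for a window that grows by one packet per
    round from capacity / 2 up to capacity and is then halved again."""
    windows = range(capacity // 2, capacity + 1)
    return sum(windows) / (len(windows) * capacity)


def triple_loss_probability(loss_rate: float) -> float:
    """Probability that the same packet is lost three times in a row,
    assuming independent losses."""
    return loss_rate ** 3


def packets_per_triple_loss(loss_rate: float) -> float:
    """Expected number of packets between such events."""
    return 1.0 / triple_loss_probability(loss_rate)
```

`sawtooth_utilization(1000)` gives 0.75, the 75% figure above. At a 30% loss rate `triple_loss_probability(0.3)` is 0.027, so such an event is expected roughly once every 37 packets, the same order as the 40 packets mentioned in the talk.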

2.2.2 Pacing of XNTP

To address these TCP problems, our protocol makes many optimizations. The first is pacing, which I understand as sending at a uniform rate. In early TCP, all the packets to be sent within one RTT may be handed to the network in a single instant. Packets then queue up in routers, the RTT grows, the sender sends even more, and eventually the router queue overflows, causing packet loss and a jittery rate. A better approach is to send packets evenly. If one RTT is 40 milliseconds and I have 40 packets to send, I send one packet every millisecond. From the router's point of view this is very smooth, so the network can be used without frequent packet loss.
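The difference between the two sending patterns can be sketched as a pair of schedules. This is a minimal illustration, not XNTP code; the function names are hypothetical and times are in milliseconds from the start of the round.

```python
from typing import List


def burst_schedule(packets: int) -> List[float]:
    """Early-TCP style: every packet of the round is handed to the network at t = 0."""
    return [0.0] * packets


def paced_schedule(rtt_ms: float, packets: int) -> List[float]:
    """Pacing: spread the round's packets evenly across one RTT."""
    interval = rtt_ms / packets
    return [i * interval for i in range(packets)]
```

With the numbers from the text, `paced_schedule(40, 40)` yields send times 0, 1, 2, ... 39 ms, one packet per millisecond, whereas `burst_schedule(40)` puts all 40 packets into the router queue at once.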

Pacing itself is nothing unusual. It is simply the basis of our transport protocol, and many later optimizations rely on it.

2.2.3 XNTP anti-jitter

Our anti-jitter strategy is inspired by QUIC. Every QUIC packet carries two sequence numbers: a packet number and a content number. We believe this is the key to making transmission resistant to jitter, so that the sliding window does not stall when congestion occurs. Assume the sliding window is 11 during normal sending. After receiving the 11th packet, the receiver sends an ACK indicating that the second packet was lost. TCP slides the window to [2, 13] and retransmits packets 2 and 11, which is inefficient. When QUIC detects a loss, it retransmits the data under a new packet number while the content number stays the same. Compared with TCP, the effect is like sliding the window straight to [12, 23]: the lost data is sent again under the new numbers 12 and 13, and the sliding window is unaffected. This makes it possible to use the remaining 80% of the network to carry data even at a 20% packet loss rate.
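The two-number idea can be sketched with a tiny sender model. This is an illustrative sketch only; the class and method names are hypothetical and do not reflect XNTP's or QUIC's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Packet:
    packet_no: int   # increases on every transmission and is never reused
    content_no: int  # identifies the payload; unchanged across retransmissions


@dataclass
class DualNumberSender:
    next_packet_no: int = 1
    in_flight: Dict[int, int] = field(default_factory=dict)  # packet_no -> content_no

    def send(self, content_no: int) -> Packet:
        pkt = Packet(self.next_packet_no, content_no)
        self.in_flight[pkt.packet_no] = content_no
        self.next_packet_no += 1
        return pkt

    def on_ack(self, packet_no: int) -> None:
        self.in_flight.pop(packet_no, None)

    def on_loss(self, packet_no: int) -> Packet:
        """Retransmit the lost content under a brand-new packet number."""
        content_no = self.in_flight.pop(packet_no)
        return self.send(content_no)
```

After sending contents 1 to 11, a loss of packet 2 leads to `on_loss(2)` returning packet number 12 carrying content 2. Because the retransmission has its own packet number, an ACK for 12 can never be confused with an ACK for the original packet 2, which also removes the retransmission ambiguity described for TCP.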

2.2.4 "Quick Start" of XNTP

Let's first look at Google's BBR. It uses two key measurements, the minimum RTT observed and the maximum bandwidth, to determine the optimal amount to send: sending amount = minimum RTT * maximum bandwidth. Can the same idea be borrowed for a "quick start"? We said earlier that pacing solves the queuing delay caused by TCP's bursts, but pacing still has a weakness when it comes to initial speed. So is bursting useless? The point of a burst is that the sender's rate is momentarily unlimited, and our solution builds on exactly that. We treat the RTT of the first packet as the minimum RTT and send the first few packets in burst mode. For example, of the 10 packets sent in the first round, the first four are sent as a burst, meaning their sending rate is unlimited. The receiver can then measure the rate at which those four packets arrive, which we take as an approximation of the maximum bandwidth. When the ACK comes back after one RTT, we have both the minimum RTT and the maximum bandwidth, so we can compute an "optimal sending amount" the BBR way, end slow start, and enter congestion control.
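The calculation can be sketched as follows. This is a minimal sketch of the idea described above, not XNTP's code; the function names, the packet size, and the way arrival times are reported back are all assumptions.

```python
from typing import Sequence


def estimate_bandwidth(arrival_times_s: Sequence[float], packet_bytes: int) -> float:
    """Estimate bottleneck bandwidth (bytes/s) from the arrival times of a burst.

    The k packets of the burst left the sender back to back, so the spacing
    between their arrivals reflects the bottleneck rate: k - 1 packets were
    delivered during the span between the first and the last arrival.
    """
    if len(arrival_times_s) < 2:
        raise ValueError("need at least two packets in the burst")
    span = arrival_times_s[-1] - arrival_times_s[0]
    if span <= 0:
        raise ValueError("arrival times must be increasing")
    return (len(arrival_times_s) - 1) * packet_bytes / span


def initial_send_amount(min_rtt_s: float, max_bandwidth: float) -> float:
    """BBR-style sending amount in bytes: minimum RTT * maximum bandwidth."""
    return min_rtt_s * max_bandwidth
```

For example, if four 1200-byte burst packets arrive 1 ms apart, the estimate is about 1.2 MB/s; with a first-packet RTT of 40 ms the sender can go straight to roughly 48 KB in flight instead of ramping up over many rounds.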

Quick Start is one example of our innovations in network transmission; we have many other systematic insights into transport protocols. Once they have been validated against data from several angles, we will also try to share them with colleagues in academia and industry. For lack of time, I will not go into them here.

2.3 Application

The reliable transport provided by XNTP plays the role of "TCP". Following the trend, we built HTTP semantics on top of XNTP. We implement not only basic HTTP semantics but also HTTP/2-style multiplexing, that is, multiple streams on the same connection. We also implement an HTTP client and an HTTP server. In other words, building on P2P's high connection success rate and XNTP's good transmission performance, services can be built on the HTTP protocol. Unix's simple unifying philosophy is "everything is a file"; our framework can likewise be understood simply as "everything is a service". We will keep following this simple idea, and XP2P/PCDN naturally becomes the first application service of the framework.

2.4 How to use P2P technology?

2.4.1 Necessary condition - slice

To apply P2P, all nodes across the network must first agree on the data: a given piece of your data must be identical to the same piece of mine, otherwise sharing is meaningless.

For on-demand video, the file length and playback duration are fixed, so the data-consistency algorithm is simple and easy to implement. Live content, however, is fleeting, and every viewer joins at a different point in time, so a live stream cannot be split by offset the way an on-demand file can.

2.4.2 P2P slice for live broadcast

Traditional ways of slicing a live stream include slicing by DTS, or having a central server cut the stream into small files so that everyone agrees on the data, as HLS and DASH do. What we currently do is slice directly on the original live stream. FLV and fMP4 have clear boundary information built into the format: the FLV tag and the fMP4 box. So we only need to agree with other peers on the starting point and the boundary information, which removes the obstacle to doing P2P on the original live stream.
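To show what "boundary information" looks like in practice, the sketch below walks the tags of an FLV byte stream and reports where each one starts and ends. It follows the public FLV tag layout (11-byte tag header, payload, 4-byte PreviousTagSize); how XP2P actually negotiates slices between peers is not described in the talk, and the helper names here are hypothetical.

```python
from typing import Iterator, Tuple

TAG_HEADER_LEN = 11     # type(1) + data size(3) + timestamp(3) + ts ext(1) + stream id(3)
PREV_TAG_SIZE_LEN = 4   # trailing PreviousTagSize field


def iter_flv_tags(buf: bytes, offset: int = 0) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (start, end, tag_type, timestamp_ms) for every complete tag.

    `offset` must point at the first byte of a tag header, i.e. past the
    9-byte FLV file header and the initial PreviousTagSize0 field.
    """
    while offset + TAG_HEADER_LEN <= len(buf):
        tag_type = buf[offset] & 0x1F  # 8 = audio, 9 = video, 18 = script data
        data_size = int.from_bytes(buf[offset + 1:offset + 4], "big")
        timestamp = int.from_bytes(buf[offset + 4:offset + 7], "big") | (buf[offset + 7] << 24)
        end = offset + TAG_HEADER_LEN + data_size + PREV_TAG_SIZE_LEN
        if end > len(buf):
            break  # incomplete tag: wait for more data
        yield offset, end, tag_type, timestamp
        offset = end


def make_flv_tag(tag_type: int, timestamp_ms: int, payload: bytes) -> bytes:
    """Build one tag (test helper), including its trailing PreviousTagSize."""
    header = (
        bytes([tag_type])
        + len(payload).to_bytes(3, "big")
        + (timestamp_ms & 0xFFFFFF).to_bytes(3, "big")
        + bytes([(timestamp_ms >> 24) & 0xFF])
        + b"\x00\x00\x00"
    )
    return header + payload + (TAG_HEADER_LEN + len(payload)).to_bytes(4, "big")
```

Because every peer receiving the same stream sees the same tag boundaries and timestamps, peers could in principle identify a slice by a starting tag plus a tag count, without any central server cutting the stream into files.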

2.4.3 P2P adaptive bit rate

A smooth viewing experience is very important in live streaming. HLS and DASH were designed for adaptive bitrate: they automatically switch to a lower bitrate when network conditions are poor. HLS and DASH slice the live stream into files, but FLV and fMP4 do not need to: adaptive bitrate can be achieved simply by delivering the new decoding parameters to the client (player) within a single HTTP connection. What is the advantage over HLS and DASH? As we all know, the biggest improvement of HTTP/2 over HTTP/1.1 is that multiple HTTP requests share a single TCP connection, because the congestion control and in-order delivery of one TCP connection work better than several TCP connections competing with each other. HLS and DASH, however, turn a live stream request into file requests over multiple connections, which is actually a step backwards. Because we implement adaptive bitrate directly on a single live stream, we consider it a more advanced approach than HLS and DASH.

2.4.4 P2P network topology

The following are three common network topologies: the mesh model, the tree model, and the evenly split flow model. Each has its advantages and disadvantages. If you ask which one Tencent Cloud uses, the answer is none of them as is! Tencent Cloud currently uses a hybrid of the mesh and tree models. It combines the advantages of both while keeping the tree relatively shallow, which offsets the drawbacks of the tree model. This is also why Tencent Cloud P2P performs well and achieves a good sharing rate.

3. XP2P Application Scenario

When it comes to P2P application scenarios, the first that comes to mind is accelerating traffic distribution, as in live streaming, on-demand video, and file download. In fact, today's P2P no longer causes high network load; it is friendlier to the network and helps increase its capacity. We believe P2P in effect implements a kind of multicast, which optimizes bandwidth use and reduces network load. For example, when two people in the same residential community watch the same video, only one of them needs to pull data from the remote server; the other can obtain it through sharing within the local network, which reduces the number of packets crossing the wider network and therefore the network load. It is also worth mentioning that we have achieved a high P2P penetration success rate, reliable network transmission, and HTTP on top of it. I think that combining this with edge computing and fog computing could well break the limits of today's centrally deployed services and make it possible to deploy simple servers in a truly distributed network.

4. Future and prospect of XP2P

Finally, let's look ahead to the future of XP2P. Tencent Cloud X-P2P implements multicast in a sense, which improves network quality and reduces network load. The arrival of "456" (4K, 5G, IPv6) will let X-P2P do even more and find wider use. The P2P technology underlying blockchains is similar to Tencent Cloud X-P2P, but in libp2p, apart from a pile of unnecessary concepts, we have not seen how it tackles the core technology of penetration. Edge computing will also depend on a robust, secure, and efficient P2P foundation. The XNTP transport protocol will continue to be optimized and may even rival QUIC in the future. It is also worth mentioning that fog computing should be more than applications like traffic sharing; it needs a tailwind to truly redefine it, just as AWS redefined cloud computing in its day. The service framework of the Internet of Everything will keep developing along the simple idea of "everything is connected". It may be the key to redefining fog computing, which is what makes it so exciting!