IM push guarantees and network optimization explained (I): staying alive in the background without hurting user experience

2018/06/28 17:28

IM features are becoming more and more important in mobile apps because they connect people with one another. In social products, communication between users leads to better user stickiness.

In the complex Android ecosystem, a variety of factors can cause the message push to fail to reach the client in time. In addition, the unstable mobile network also adds obstacles to the speed and reliability of data transmission.

This article describes what the NetEase Yunxin IM SDK team has explored and learned in dealing with weak networks, mobile hardware limits, and the complex state of the Android ecosystem. It covers how to stay alive in the background without hurting user experience, an improved scheme that combines a long connection with push, and practical optimizations for transferring large data over weak networks.

Questions to keep in mind while reading:

1. What is IM
2. How the IM SDK can stay alive in the background without affecting user experience
3. How to build a combined long connection plus push scheme
4. How to optimize large data transmission in a weak network environment

Definition of IM

IM consists of two words: Instant and Messaging.
"Instant" means new messages must be received immediately. If the program is in the background, push notifications must arrive immediately.
"Messaging" means communication must be stable and reliable: no system downtime and no program crashes; messages must be delivered securely, without interception or monitoring, and without loss, reordering, or duplication. If audio and video chat is included, latency must be low and playback must be smooth, with no stuttering.

Building a truly stable and reliable commercial-grade IM system is a very big challenge.

The first problem is message push. iOS has APNs for push, which is quite stable. Android can use GCM, but in China the firewall blocks all Google services, GCM included. To achieve instant and stable message push, NetEase has been studying the problem since the days of Yixin. Over time, both the difficulties and the methods have kept changing.

For IM, an app that has moved to the background must still be able to receive new message reminders. What can be done without GCM? At the beginning, the only option was to keep running in the background. This was almost the only way to receive push, and even now it remains the most important one. Android is designed to support true background execution, and that feature is one of the reasons Android has become so successful. On the other hand, Android has long been unable to shake its bad reputation for lag and battery drain, and background execution bears part of the blame. Therefore, the system does not let apps run in the background unchecked.

Four major obstacles faced by apps running in the background

The first obstacle is Android's Low Memory Killer (LMK) mechanism. A phone's memory is limited. As more and more processes run in the background, the remaining memory shrinks accordingly. When a new app wants to start and there is not enough memory, LMK kicks in, selects one of the running processes to clean up, and frees the space so that the new app can run.

LMK weighs two criteria. One is process priority: the lower the priority, the more likely the process is to be cleaned. The other is memory usage: the more memory a process occupies, the more likely it is to be cleaned.

Because of LMK, an app that is allowed to run in the background still faces the risk of being cleaned up at any time. Therefore, the app needs to be restarted promptly after it is cleaned up.
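
As a rough illustration of these two criteria, the following Java sketch picks a cleanup victim from a list of processes. It is a simplified model, not the kernel's implementation: the class `LmkModel`, the `Proc` fields, and the ordering (priority compared first, memory used as the tie-breaker) are assumptions made for the sketch.

```java
import java.util.List;

/** Toy model of the two LMK criteria described above; not the kernel's actual code. */
final class LmkModel {

    /** A running process as the model sees it (all fields are illustrative). */
    static final class Proc {
        final String name;
        final int oomAdj; // assumed scale: higher value = lower priority
        final int memMb;  // memory held by the process, in MB

        Proc(String name, int oomAdj, int memMb) {
            this.name = name;
            this.oomAdj = oomAdj;
            this.memMb = memMb;
        }
    }

    /**
     * Picks the process to clean: lowest priority first (largest oomAdj),
     * and among equal priorities the one holding the most memory.
     * Returns null when the list is empty.
     */
    static Proc pickVictim(List<Proc> running) {
        Proc victim = null;
        for (Proc p : running) {
            if (victim == null
                    || p.oomAdj > victim.oomAdj
                    || (p.oomAdj == victim.oomAdj && p.memMb > victim.memMb)) {
                victim = p;
            }
        }
        return victim;
    }
}
```

Under this model, of two background processes with equal priority, the one holding more memory is selected, while a higher-priority process is spared even if it holds more memory than both.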

The second obstacle is the alarm. There are two types of alarm: repeating alarms and one-shot alarms. When an alarm fires, the component registered for it is started.

The third obstacle concerns Receivers registered statically in the Manifest file. By listening for system events such as boot, network changes, and media mount/unmount, a Receiver can start components when those events occur. Because this can make the system lag whenever such events fire, Android added restrictions in 7.0.

The fourth obstacle is the JobScheduler, which was added in 5.0 and lets an app perform work when certain conditions are met, such as when the device is charging or switches to Wi-Fi.

Whatever you do, the app will eventually be killed, but by working against LMK's evaluation criteria you can reduce the probability of it being cleaned up. The first step is to reduce the memory consumption of the process. In single-process mode, memory usage inevitably stays high because the process contains the UI, WebView, various image caches, and other content. IM software therefore generally adopts a dual-process or even multi-process strategy that separates out the push process. The push process handles only the network connection and push business; it contains no other business logic, let alone any UI.

The following is the architecture of the NetEase Yunxin Android SDK, which is designed as a layered structure. The cyan layer at the bottom is the push layer, which runs as an independent process. It is responsible only for work related to the long network connection, such as encryption, heartbeat, authentication, and packet packing and unpacking. All business logic is handed over to the service module in the UI process. Take a look at the process memory usage of the Yunxin demo. The upper entry is the main process: according to the PSS data in the fourth column, it occupies about 50 MB. The lower entry is the push process, which occupies only about 10 MB. In the background, the push process is therefore much less likely to be cleaned than the main UI process.

(NetEase Yunxin SDK architecture)

The second way to reduce the probability of being cleaned is to raise the process priority. Consider this example first. This is a screenshot of Greenify. At the top is the section "Don't sleep automatically for now": the two apps listed there are both in a working state, and their process priority is "visible process". However, neither app has a desktop widget running, nor does either show the persistent notification that marks a foreground service. In fact, they are simply running in the background. Usually, when a process moves to the background, its priority drops to the lower "background process" level rather than staying at "visible process". How do these apps raise their priority and reduce the probability of being cleaned up?

(Screenshot of Greenify)

Android has a loophole in its design of foreground services: two services working together can create an invisible foreground service. Take two started services, A and B. First, call startForeground in A with a NOTIFY_ID. A becomes a foreground service, and a persistent notification with the ID NOTIFY_ID appears in the notification bar. Then call startForeground in B with the same NOTIFY_ID, and B becomes a foreground service as well. Because the two notification IDs are the same, no new notification is created this time. Next, call stopForeground in A. A loses its foreground status, and the persistent notification is removed along with it. Service B, however, is not affected in any way and remains a foreground service. If you then stop A, the process is left with only foreground service B. The process counts as a foreground process, yet the user perceives nothing.
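
The sequence above can be traced with a small state model. The Java sketch below does not call any Android API; `ForegroundTrickModel` and its methods are hypothetical names that only mimic the behavior described in this article (a notification keyed by ID, and a foreground flag per service), so it says nothing about how any particular Android version actually behaves.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy state model of the two-service sequence described above; no Android APIs are used. */
final class ForegroundTrickModel {
    private final Set<Integer> shownNotifications = new HashSet<>();
    private final Set<String> foregroundServices = new HashSet<>();

    /** Models startForeground: the service becomes foreground and the notification ID is shown. */
    void startForeground(String service, int notifyId) {
        foregroundServices.add(service);
        shownNotifications.add(notifyId); // an ID that is already shown adds no new entry
    }

    /** Models stopForeground with removal: the flag is cleared and the notification is removed. */
    void stopForeground(String service, int notifyId) {
        foregroundServices.remove(service);
        shownNotifications.remove(notifyId);
    }

    /** The process counts as foreground while any of its services is foreground. */
    boolean processIsForeground() {
        return !foregroundServices.isEmpty();
    }

    int visibleNotifications() {
        return shownNotifications.size();
    }

    /** Runs the three calls from the text and returns the resulting state. */
    static ForegroundTrickModel runSequence() {
        final int notifyId = 1; // arbitrary illustrative ID
        ForegroundTrickModel model = new ForegroundTrickModel();
        model.startForeground("A", notifyId); // A is foreground, notification shown
        model.startForeground("B", notifyId); // B is foreground, same ID, nothing new shown
        model.stopForeground("A", notifyId);  // A loses foreground, shared notification removed
        return model;
    }
}
```

After `runSequence()`, the model reports a foreground process with no visible notification, which is exactly the state the loophole relies on.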

Normally, after the above three steps, the process can run stably in the background.

However, in some cases the push process never comes back up. Tracing showed that, besides the system, users can also kill background processes. There are two ways for users to do so. One is to swipe the app away in the recent task list, which has the same effect as the system killing the process. The other is force stop, which is more thorough than system cleanup. Not only are the app's running processes cleaned up, but its services pending restart, registered alarms, event-listening components, and so on are removed as well. Unless the user opens the app again or the system reboots, the app can no longer come back up on its own.

On some domestic ROMs such as MIUI, removing an app from the recent task list has the effect of a force stop. Normally, if the user killed the app deliberately, the app should not restart itself. However, sometimes this is not what the user intended. In addition, IM software must guarantee message push. Otherwise, users who do not know the real cause will conclude that the software is poor and cannot even deliver push messages properly.

Ways to keep an Android app process alive

The first is a double fork followed by exec. After forking twice, the process from the first fork exits, and the process from the second fork is adopted by the init process. If the user then force stops the app, this process is not cleaned up, because its parent is init rather than Zygote. Since the process is still forked from an Android process, it carries the parent's Android runtime environment and resources, so its memory footprint is relatively large. At this point you can use exec to launch a pure Linux executable and start a daemon process. Its memory footprint is only a little over 100 KB, and users will not notice it at all. This daemon can periodically pull the push process back up. The method is only effective on systems below 5.0. On 4.4 and above, SELinux is enforced and exec is denied permission to run. In addition, from 5.0 on, when the ActivityManager performs a force stop or removes a task, it cleans up every process with the same uid, so processes without a virtual machine environment are no longer spared.
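
To make the difference between the two force-stop behaviors concrete, here is a minimal Java model of a process table. It does not fork anything; `ForceStopModel`, the pid and uid values, and the two `survivors...` rules are hypothetical and simply encode the behavior described in the paragraph above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy process-table model of the double fork and the two force-stop rules described above. */
final class ForceStopModel {
    static final int INIT_PID = 1;
    static final int ZYGOTE_PID = 100; // illustrative pid, not a real constant

    static final class Proc {
        final int pid;
        int ppid; // parent pid; changes when the process is orphaned
        final int uid;

        Proc(int pid, int ppid, int uid) {
            this.pid = pid;
            this.ppid = ppid;
            this.uid = uid;
        }
    }

    /** Models a double fork: the intermediate child exits and init adopts the grandchild. */
    static Proc doubleFork(Proc app, int childPid, int grandchildPid) {
        Proc child = new Proc(childPid, app.pid, app.uid);
        Proc grandchild = new Proc(grandchildPid, child.pid, app.uid);
        grandchild.ppid = INIT_PID; // child exited, so the orphan is re-parented to init
        return grandchild;
    }

    /** Older behavior as described: only same-uid processes whose parent is Zygote are cleaned. */
    static List<Proc> survivorsLegacy(List<Proc> procs, int uid) {
        List<Proc> out = new ArrayList<>();
        for (Proc p : procs) {
            boolean killed = p.uid == uid && p.ppid == ZYGOTE_PID;
            if (!killed) {
                out.add(p);
            }
        }
        return out;
    }

    /** 5.0+ behavior as described: every process with the same uid is cleaned. */
    static List<Proc> survivorsByUid(List<Proc> procs, int uid) {
        List<Proc> out = new ArrayList<>();
        for (Proc p : procs) {
            if (p.uid != uid) {
                out.add(p);
            }
        }
        return out;
    }
}
```

With an app process parented by Zygote and a daemon produced by `doubleFork`, `survivorsLegacy` keeps only the daemon, while `survivorsByUid` keeps nothing, which matches why the trick stopped working.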

The last means of staying alive in the background is the heavy weapon. Because none of the methods listed above is fully reliable, developers came up with mutual keep-alive. When an app starts, it scans the list of installed apps for siblings, such as apps from the same vendor or apps that integrate the same SDK. If it finds any, it pulls them all up. This is the well-known "Family Bucket" scheme. Although this method really does bring a high background survival rate, especially for major vendors and widely used SDKs, it is also very harmful to users. If background push is truly necessary and the method does not hurt the user experience too much, it can still be used. If it is used only for advertising, it harms users, and in turn it may lead them to uninstall the app outright.

Today, all kinds of phone management software restrict this family-bucket wake-up method, and on rooted devices in particular these wake-up paths can be cut off completely. Many ROMs also ship their own management software that limits background execution and background wake-up in order to extend battery life. In the current domestic Android ecosystem, it is increasingly difficult to keep running in the background no matter what method is used, so another way to guarantee message push is needed. Moreover, developers have an obligation to give users software with a better experience rather than endlessly wasting their resources in the background.

 
