
The data war behind the online black market and the competition for user information

  

 Browser Cloud Security

In the past week, Facebook was pushed to the brink by the indirect leak of data on more than 50 million users. The U.S. Federal Trade Commission has opened an investigation. If the allegations are confirmed, Facebook could face a fine of up to $2 trillion, and it is mired in a crisis of trust. After the incident, the company's share price fell steadily, and its market value shrank by about $50 billion in two days.

Earlier, late on the night of March 7, Binance, the world's second-largest virtual currency exchange, was hacked, and large amounts of various virtual currencies were sold off for Bitcoin. Prices on exchanges including Binance and Huobi plummeted across the board, with some mainstream currencies falling by more than 5%. Binance subsequently announced that "this is a large-scale phishing operation to obtain user accounts and try to steal coins."

The latest hot topic is that big companies use big data to "rip off regulars", that is, to charge loyal customers more. For example, on Didi, fares for the same origin and destination can differ, and can even differ from one phone to another. Although Didi CTO Zhang Bo denied that Didi engages in this practice, the controversy let users feel the power of big data at close range. Everything depends on the attitude and decisions of the company.

In just one month, several serious incidents around the world have erupted over data. Although they occurred in different places and fields, all of them involve commercial interests, and in each case the victim is users' data security and privacy. Strikingly, by mid-2017 more than 1.5 million people were working in China's online black market, whose size had reached 100 billion yuan.

Undeniably, in the era of the Internet of Everything, the strategic importance of data grows by the day, and the commercial value of big data is widely recognized, yet only a small part of the data can actually be turned into commercial value. Those who operate under the banner of "protecting user privacy" are grabbing data deliberately and indiscriminately. Users, the very prize being fought over, often appear powerless and have no room to resist.

To some extent, this is related to weak regulation. On June 1 last year, two legal instruments on network security came into force. Illegally obtaining or selling at least 50 pieces of citizens' personal information can now be deemed "serious circumstances", which meets the threshold for criminal sentencing. Within three months, the Haidian police in Beijing cracked more than 30 related cases. Previously, even deals involving hundreds of millions of records often could not be prosecuted for lack of a judicial interpretation, and came to nothing.

Those who end up at the top of data power are likely to be the super companies that can truly use data well. Almost all interviewees said that data protection and use in China is still disorderly: the black market has no bottom line, and Internet companies rely on self-discipline.

WeChat, which has 1 billion users, has been accused of "watching users chat every day". Zhang Xiaolong denied this at the 2018 WeChat Open Class. WeChat also stated officially that it does not keep any user's chat records, that chat content is stored only on the user's phone, computer and other devices, and that it does not use any chat content for big data analysis.

Alibaba is one of the companies in China that values data most. Over the past five years, most of Jack Ma's public speeches have mentioned the opportunities and responsibilities of companies in the DT (data technology) era. In 2012, when Alibaba appointed its first CDO (Chief Data Officer), Jack Ma wrote in an internal email that "Alibaba will become a real data company".

Whoever holds data is eager to exercise the power it confers, as if that secured the commanding heights of future strategy. As AI, new retail and other industries take off one after another, data is being used on a large scale, and friction between companies and users, and among companies, is clearly intensifying.

The data black market

Information leakage is invading everyday life on every front. Authorizing an app to use the phone's microphone, interacting with friends on a social platform, or even casually logging in to a website can each expose information in real time.

"Excessive and stupid." Ma Gang, co-founder of Huorong Security, was somewhat indignant. In his view, data can be useful or useless, and most companies use data inefficiently. "It is like breaking into a user's home, searching it and carting off piles of information, only to find nothing useful. The user is hurt, and the company gains nothing."

Huorong is a security vendor focused on PC software. By its monitoring, almost all desktop software infringes on users in some way. "It's crazy. Some software uses as much as 50% of its bandwidth to upload user information. It can not only monitor the data stored on the computer, but also record the user's online accounts."

According to Chuangyu's data, PCs face about 30 billion attacks every day, while normal visits number only about 20 billion, far fewer than the attacks by hackers. Information leakage is most serious in education, medical care, finance, fitness and similar fields.

The problem is clearly more serious on mobile. Tapping a feature or downloading an app by accident can get the phone rooted. "It can bypass any permission. Whether or not the user agrees, it can record everything the user does and do whatever it wants," Fang Ning, vice president of Bangbang Security, told China Entrepreneur.

Unlike Huorong, Bangbang Security is a security provider for mobile and the Internet of Things. It currently provides security services for more than 800,000 mobile apps. By its observation, apart from financial companies and large Internet companies with their own security teams, 70% of apps initially launch with no protection at all.

At least 30% of mobile Internet traffic flows to the black market. Take bike sharing: companies initially acquired users through subsidies, for example 1 yuan per ride. Black market operators simulate phone numbers and user behavior and pocket the 1 yuan subsidy without ever riding. If a company's annual promotion budget is 1 billion yuan, 300 million of it flows to the black market.
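The arithmetic behind that last figure can be checked with a short sketch. The function name and the assumption that fraud takes a fixed share of the budget are illustrative only; the 30% share, the 1 yuan subsidy and the 1 billion yuan budget come from the example above.

```python
def subsidy_lost_to_fraud(budget_yuan: int, fraud_percent: int) -> int:
    """Return the part of a promotion budget captured by fake accounts.

    Hypothetical helper: assumes fraud takes a flat percentage of the budget.
    """
    if not 0 <= fraud_percent <= 100:
        raise ValueError("fraud_percent must be between 0 and 100")
    # Integer arithmetic keeps the result exact for whole-yuan budgets.
    return budget_yuan * fraud_percent // 100


annual_budget_yuan = 1_000_000_000   # 1 billion yuan promotion budget
fraud_percent = 30                   # share of traffic attributed to the black market
subsidy_per_ride_yuan = 1            # subsidy paid for each (real or simulated) ride

lost_yuan = subsidy_lost_to_fraud(annual_budget_yuan, fraud_percent)
fake_rides = lost_yuan // subsidy_per_ride_yuan

print(f"Lost to fraud: {lost_yuan:,} yuan, i.e. {fake_rides:,} simulated rides")
```

At 1 yuan per ride, the lost amount corresponds to the same number of rides that never took place.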

Compared with the crude brutality of the black market, mobile Internet companies take user information with cunning.

Facebook's recent crisis began when a UK company, Cambridge Analytica, reached Facebook users through a personality test app. The test asked users to "authorize the app to access their own and friends' Facebook data information". Although only 270,000 users agreed, through a snowball effect the app eventually obtained information on more than 50 million Facebook users.

What really caused panic was that Cambridge Analytica sold the information of those 50 million users to a third party. Facebook's position is that the company obtained user information with users' permission but sold it to a third party without permission, and that this was the main cause of the leak, although Facebook had been aware of the loophole beforehand.

"Whether the user has consented" is an important standard for judging whether a company uses user information lawfully. A newly installed app usually asks for access to the address book, location and other information, but few companies clearly explain the purpose, duration and manner of access, even though the Network Security Law explicitly requires this.

During the 2018 Spring Festival, Jinri Toutiao spent 1 billion yuan on its "Year of Wealth in China" campaign. Users could receive cash red packets by collecting zodiac cards, catching red packet showers, shooting short New Year greeting videos and so on. It was meant to be a giveaway for users, but the withdrawal agreement provided for sweeping collection of personal data, "including but not limited to identity information, personal information, and account information". More importantly, signing the agreement meant the user agreed that Jinri Toutiao could provide all personal information to third parties, and that after the account was cancelled "the company can still save the relevant information before cancellation".

Just one month before that, Jinri Toutiao, Ant Financial and Baidu had been summoned by the Ministry of Industry and Information Technology for collecting personal information without authorization. The ministry found that the companies had not adequately informed users of the rules and purposes of information collection.

"Excessive collection of user information is very common in Internet companies." Zhao Guodong, secretary-general of the Zhongguancun Big Data Industry Alliance, told reporters that companies piggyback on the permissions they are granted to collect more information than they need.

Facing "unicorns" and giants, Dong Libo of the Haidian police support brigade has very limited options. "They will not clearly cross legal limits. They only walk in gray areas, and key data sits on their own servers, so it is difficult to investigate and collect evidence." In 2017, Dong Libo and his team cracked hundreds of cases, and he spent most of the year on business trips.

Lacking awareness of privacy protection, users can easily sign agreements that disclose their personal information without realizing it.

In early January, Alipay released its annual bill. The line "I agree with the Sesame Service Agreement" at the bottom was not only in small type but also checked by default. The agreement stated that Alipay could provide users' information directly to third parties, could analyze it and push it to partner organizations, and had the right not to support users revoking third parties' authorization to query their information. After users noticed, Alipay apologized and removed the default consent. "In any case, Alipay should not assume the user's permission by default, but whether it is illegal is unclear. It is still a gray area," Ma Gang said.

Such borderline practices are very common in the Internet industry. Dong Libo found that the latest version of the Taobao platform service agreement defines the scope of "Taobao platform" and "Alibaba platform" in detail, "which has never been so detailed before." His desk is piled with books on legal provisions. Agreements are usually full of word games, and Dong Libo has to find the loopholes in them.

Although the law clearly stipulates that lawfully obtained user information cannot be provided to others without the consent of the person it was collected from, the Taobao agreement still says that "the user information will be shared with related companies", without stating the purpose, method or scope of use. Dong Libo explained that under the new Interpretation on Several Issues Concerning the Application of Law in Handling Criminal Cases Involving Infringement of Citizens' Personal Information (hereinafter the Interpretation), data cannot be inherited. For example, data obtained by a parent company cannot be provided directly to its subsidiaries.

Data black hole

Finding ways to obtain user data is only one side of the story. Data competition among companies has also surfaced.

Credential stuffing (literally "database collision") means that hackers collect usernames and passwords already leaked on the Internet, try them in bulk against other websites, and end up with a list of accounts they can log in to. When users reuse the same account name and password on different platforms, the success rate is especially high. The recent dispute between 360 and Bilibili involves credential stuffing.
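Why password reuse makes credential stuffing ("database collision") so effective can be seen in a toy model. The sketch below is a defensive check a platform might run against a known leak list, not a description of any real incident; all account names, passwords and function names are invented, and real platforms store salted password hashes rather than plaintext.

```python
def reused_credentials(leaked_pairs, platform_accounts):
    """Return platform usernames whose exact username/password pair is in a leak.

    leaked_pairs: iterable of (username, password) tuples from a hypothetical leak.
    platform_accounts: dict mapping username -> password on a second platform.
    """
    leaked = set(leaked_pairs)
    return sorted(
        user for user, password in platform_accounts.items()
        if (user, password) in leaked
    )


# Invented data: three leaked pairs and three accounts on another platform.
leaked_pairs = [("alice", "hunter2"), ("bob", "123456"), ("carol", "qwerty")]
platform_accounts = {
    "alice": "hunter2",              # same name and password: exposed
    "bob": "a-different-password",   # same name, new password: safe
    "dave": "123456",                # different name: not matched by this check
}

exposed = reused_credentials(leaked_pairs, platform_accounts)
hit_rate = len(exposed) / len(leaked_pairs)
print(exposed, f"{hit_rate:.0%} of leaked pairs still work")
```

Only the account that reuses both the name and the password is exposed, which is why using a different password per platform defeats this kind of attack.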

Kuai Video ("Fast Video") is a short video product launched by Qihoo 360 last November. In February this year, a large number of Bilibili users found they could log in to Kuai Video directly with their Bilibili usernames and passwords, although they had never registered there. Kuai Video was also criticized because much of its content overlapped with Bilibili's. As of February 22, nearly 5,000 Bilibili accounts not operated by their owners, and more than 16,000 related videos, had been found on Kuai Video.

Although Kuai Video denies using credential stuffing or dumping Bilibili's database, outsiders believe that credential stuffing is an important way to acquire users and information quickly. One industry security practitioner said that "doing so creates an illusion of prosperity: the shadows move, but no one is there."

Lv Guihua, president of Qiniu Cloud, has clearly felt the shift from competing for registered accounts to competing for "account + data". Daily active users are a more important metric than registered accounts, and what sustains daily activity is the data and relationships users leave on the platform. "Companies now know how to hold on to users: keep the users, their data and the relationships formed along the way, and users will naturally return to the platform."

Over the past three years, Lv Guihua has felt companies attaching more importance to data. Qiniu Cloud is an enterprise cloud service provider, and many companies store data on its servers. "In the past, companies would regularly delete data from the servers to save money. Now they keep the data even if they cannot use it in the short term."

On June 1 last year, SF Express (Shunfeng) and Cainiao clashed head-on, and the focus was data. Cainiao said it was upgrading information security for logistics data across its network to protect consumers' privacy and phone numbers, and that SF Express refused to cooperate. SF Express's reason was that Cainiao demanded irrelevant customer privacy data, which belongs to users and cannot be provided without their permission.

A day later, the dispute had split the industry into two camps: the Cainiao camp, represented by the "Four Tongs and One Da" express companies, and companies such as JD, Meituan and NetEase that rushed to SF Express's side. The details of the eventual settlement are unknown, but for both sides it was a matter of life and death, and neither wanted to back down.

At the end of August last year, the Shanghai Intellectual Property Court ruled on Baidu's alleged improper use of Dianping's information. Baidu lost and was ordered to pay Dianping 3.23 million yuan in compensation. Lv Guihua sees this as a typical case of friction caused by data competition.

When users searched for a merchant on Baidu Maps or Baidu Zhidao, the page displayed user reviews of the merchant, most of which came from Dianping. For example, among the 1,055 catering merchants involved, 86,286 reviews came from Dianping.com, and for 784 of those merchants more than 75% of the reviews shown came from Dianping.com.

In the end, the court ruled against Baidu on the ground that "Baidu substantially replaced the plaintiff's website by using the information of Dianping.com, which is illegitimate". In this dispute, Baidu evidently used data that belonged to Dianping and its users without informing either.

Under-the-table data exchange deals are also an open secret in the industry.

Since 2016, Alipay, acting as a credit bureau, has connected its Zhima (Sesame) Credit scores to many online lending platforms to provide them with risk control. The head of one online lending platform's business said in an earlier interview that Alipay would provide him with user risk assessment results. In exchange, once users had completed borrowing on the platform, the platform would "need to send the user-related data for more than 20 days back to Ant Financial", so that Alipay could improve its credit blacklist.

Such behavior is clearly addressed in the Regulations on the Administration of the Credit Reporting Industry. For an online lending platform, "if you provide negative personal information to credit reporting institutions, you should inform the information subject in advance." In the second half of last year, having accumulated a large amount of data, Alipay began to tighten its partnerships.

The Interpretation, which came into force on June 1 last year, states that "providing legally collected citizen information to others without the consent of the person it was collected from" constitutes illegally selling or providing personal information.

Compared with data competition among companies, Zhao Guodong believes the more serious problem is data fiefdoms. Baidu, Alibaba, Tencent and JD (BATJ) each have their own data, but the data is not interconnected. Having realized the importance of data, companies have built fences around it. In later data transactions, the imbalance in scale can easily lead to data hegemony.

In a sense, NetsUnion was created to rebalance the relationship between third-party payment platforms and traditional banks. Before NetsUnion, third-party payment providers bypassed clearing institutions by connecting directly to accounts they had opened at multiple banks. "Banks cannot see the transactions between third-party payment platforms. Over time, the platforms become a data black hole, holding a huge amount of data completely cut off from the outside," Zhao Guodong said.

Zhao Guodong believes the way to break up data hegemony is to establish data rights, that is, ownership. The industry has reached a consensus that users' basic information, such as personal details, shopping records and location, should belong to users, while information and data generated in the course of business should belong to companies. Take Gaode Map (Amap): an individual's track data belongs to the individual, but data such as congestion duration that Gaode derives from road conditions belongs to the company.

Data mining

Although data mining has so far revealed only the tip of the iceberg, Internet giants represented by BAT are clearly moving toward a virtuous cycle.

Zhao Guoliang, a senior technical expert in JD's big data platform and product R&D department, believes the key to applying data is whether there are scenarios to support it. "The richer the scenarios, the more room data has to play a role. Without them, data is useless garbage. Companies of BAT's size have many business scenarios, so we don't worry about data being useless."

So far, JD has accumulated 400 PB of data across commodity purchasing and sales, user purchases, warehousing and distribution, logistics and after-sales.

New retail is a scenario where online data is used offline. 7FRESH is JD's fresh food supermarket, and JD can push 7FRESH products to users based on precise user profiles. What is handed over in this process is not users' past transaction records but the result of an analysis.

Unmanned supermarkets also need to combine data from different scenarios. Alibaba opened its first unmanned store, Tao Cafe, last year. Users log in with their Taobao ID to enter. While they shop, cameras record their movements so that later product displays can better meet their needs. At checkout, cameras complete the settlement automatically and update inventory records, which requires access to data along several dimensions.

One man's honey is another man's arsenic. The same data can play different roles in different scenarios. A user's shopping records are worthless to the user, but companies can base many decisions on them. If a product sells especially well in a certain region, a company can stock more of it in nearby warehouses in advance and shorten delivery time. But this also involves moving data between parties.

Zhao Guoliang believes that what really blocks the flow of data among companies is technology. "Until the problems of desensitizing and anonymizing data are solved, the flow of data among companies will remain blocked."
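As a rough illustration of what "desensitization" can mean in practice, the sketch below masks a phone number for display and replaces it with a keyed hash for joining records. It is a minimal example under stated assumptions, not the approach of any company named here: the function names, the sample number and the key are invented, and keyed hashing is pseudonymization, which by itself does not make data anonymous.

```python
import hashlib
import hmac


def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 characters and mask everything in between."""
    if len(phone) < 8:
        raise ValueError("phone number too short to mask safely")
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with an HMAC-SHA256 digest.

    The same value and key always give the same digest, so records can still
    be joined, but the original value cannot be read back from the digest.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


phone = "13812345678"                         # invented sample number
key = b"example-key-kept-by-the-data-owner"   # hypothetical secret

print(mask_phone(phone))
print(pseudonymize(phone, key))
```

A partner that receives only the digest can match two records for the same user without learning the phone number, provided the key stays with the data owner.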

Unlike the black market, companies compete for data mostly because they want to stake out a position in the data race sooner.

Last year, Huawei and WeChat had a dispute over user data. The facts are clear: Huawei wanted to read users' WeChat data and load relevant information automatically, for example recommending related apps when a chat turned to movies. But WeChat refused to let Huawei pull the data, citing the protection of user information, while Huawei said it had obtained users' permission.

There is no doubt that WeChat data belongs to users, and anyone who obtains and uses it must have user authorization. Huawei wants to use WeChat data to try richer interactive experiences. For WeChat, however, users' chat data is a core asset that cannot be given away lightly.

In Zhao Guodong's view, the data competition among enterprises will only become more and more intense. "Small companies may have no bargaining space with large companies, but large companies are looking for new growth points. Data is regarded as a gold mine, and everyone wants to mine it."

Zhao Guoliang believes the data power currently held by the giants is not as great as imagined. "It is hard to say how it will affect the social order and economic system, but it can help entrepreneurs judge industry trends further ahead."

The government's role in data sharing has not been fully realized. Sun Pishu, chairman of Inspur Group, has submitted a proposal on "government open data sharing" at the Two Sessions for several consecutive years. In his view, the government holds more data, of higher quality, than Internet companies.

Inside Alibaba, Tencent and other Internet companies there is a huge ID mapping table that identifies users along different dimensions, such as name, WeChat ID, Taobao ID, JD ID and Mobike ID. A user's information differs from scenario to scenario, and the ID mapping table matches the same user across those scenarios one by one. As information density increases, the user's profile gradually sharpens until the user has no secrets left and becomes transparent.
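How such an ID mapping table might tie identities together can be sketched with a small union-find structure. This is a hypothetical illustration, not a description of any company's actual system: the class name, the platform labels and every identifier below are invented.

```python
class IdMappingTable:
    """Toy ID mapping: groups identifiers that were observed together."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        # Register unseen identifiers as their own group, then walk to the root.
        self._parent.setdefault(identifier, identifier)
        while self._parent[identifier] != identifier:
            # Path halving keeps later lookups short.
            self._parent[identifier] = self._parent[self._parent[identifier]]
            identifier = self._parent[identifier]
        return identifier

    def link(self, a, b):
        """Record that identifiers a and b belong to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def identities(self, identifier):
        """Return every identifier known to belong to the same person."""
        root = self._find(identifier)
        return sorted(i for i in list(self._parent) if self._find(i) == root)


table = IdMappingTable()
# Each link is one observation, e.g. two accounts registered with one phone number.
table.link(("taobao", "tb_001"), ("phone", "13800000000"))
table.link(("wechat", "wx_abc"), ("phone", "13800000000"))
table.link(("jd", "jd_777"), ("wechat", "wx_abc"))
table.link(("mobike", "mb_9"), ("phone", "13900000000"))  # a different person

profile = table.identities(("jd", "jd_777"))
print(profile)
```

Starting from a single JD ID, three independent observations are enough to recover the user's phone number, Taobao ID and WeChat ID, which is the sense in which growing information density makes a user transparent.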