A massive store of data containing information on about one billion Chinese residents could be one of the biggest data breaches of all time.

Portions of the leaked data appeared last week on a known cybercrime forum, posted by someone selling the cache for 10 bitcoins, or about $200,000. The data is said to have been taken from a Shanghai police database.

Some of the data has been verified as authentic, but it is not clear how it came to be in the hands of an underground seller.

Speech, expression and internet access are all restricted in mainland China.

Questions are being raised about Beijing's ability to keep the data it holds on residents secure, as well as about the sheer scale of China's surveillance state.

Here is what we know so far.

How did the data leak?

The seller claimed on the forum that he downloaded the data from a cloud storage server hosted by the Chinese e-commerce giant Alibaba. When contacted, the company said it was looking into the claims.

Experts say that the database may have been left exposed through human error before it was discovered, though exactly how that happened is not known. Because no passwords were needed to access it, this would seem to rule out the claim that the database's credentials were accidentally published as part of a technical post on a Chinese developer site in 2020 and later used to snatch the billion records from the police database.

The database was exposed through web-based software used to visualize and search huge databases, according to a Ukrainian security researcher. Anyone who knew the web address of the database could have accessed the data.

Security researchers frequently scan the internet for databases that have been accidentally exposed. Threat actors run the same scans, often with the goal of copying data from an exposed database, deleting it and offering the data's return for a ransom payment, an increasingly common tactic among criminals. On this occasion, a malicious actor found, raided and deleted the exposed database, and left a ransom note demanding payment for its return.

It may be that the threat actor sought money elsewhere after the ransom note did not work, or that another malicious actor found the data and decided to sell it.

The seller's identity and the reason the data was dumped online are unknown. It is not uncommon to see large quantities of personal data for sale on the dark web, but rarely in such a large volume.

What does the data look like?

A larger sample of the data uploaded by the seller contained three files, 500 megabytes in total, each containing 250,000 individual records.

The data is formatted in a way that makes it easy to read. The format of the database suggests that it was meticulously maintained before being downloaded, rather than assembled by simply gathering information from multiple data sources, although some data may have been derived from outside sources.

The sheer size of the data and the level of detail make it hard to fake.

The police records were translated from Chinese to English.

The files contain detailed police reports dating back to 1995 and include names, addresses, phone numbers, identity numbers, sex and the reason why the police were called out. The records seen by TechCrunch also include the names of the people who made the police reports, as well as their race and ethnicity. The Chinese government has imprisoned more than a million of its own citizens, mostly from Muslim minority ethnic groups, in a campaign the Biden administration has labeled a "genocide."

The records include reports of credit card fraud, internet scams and gambling, which is illegal in China. The police reports also show a crackdown on the use of virtual private networks, or VPNs, which are used to access sites that are blocked in China and are themselves not allowed there. One record shows that a Shanghai resident was accused of using a VPN to post critical remarks about the government on social media. What happened to the individual is unknown.

The data also contained full web addresses to photos stored on the same server. None of the photos were accessible at the time of writing, but the associated data often indicates what was uploaded, such as a person's residency documentation or their passport. The way in which these web addresses are formatted is consistent with how Alibaba stores files.

Judging by the dates of birth, many of the records we looked at contained information on children.

Without confirmation from the Chinese government, it's difficult to know if the seller's claims are legitimate and whether the data was obtained from the police. The Wall Street Journal, The New York Times and CNN have verified some of the data by calling people whose details were found in the database.

What is the impact?

If the allegations are true, this could be a major problem for Beijing, and it would raise questions about the government's security measures.

China is increasing protection for personal data. The country passed its first comprehensive privacy and data protection legislation last year. The law restricts how businesses can collect personal data and is expected to have a sweeping effect on the ad businesses of the country's biggest tech giants.

China's messaging apps are blocking messages and mentions of the data leak and database breach. The Chinese government has yet to comment on the incident.

It is not the first time that a large set of Chinese residents' data has been exposed to the internet. In an earlier incident, a smart city installation in China spilled the contents of a facial recognition database.
