Making big tech companies share data could do more good than breaking them up


Don’t break up the big tech companies, says Viktor Mayer-Schönberger, a professor of internet governance at the Oxford Internet Institute. Make them share.

To him, the recent push for investigation is a good idea, but splitting up a company like Google will make tools like search worse without making it easier for startups to build good alternatives. Even requiring companies to stop favoring their own services over those of rivals won’t keep the platforms themselves from getting better and dominating the market.

Mayer-Schönberger, who is the coauthor of Reinventing Capitalism in the Age of Big Data, suggests adopting a “progressive data-sharing mandate” that forces companies above a certain size to share some of their data, anonymized for privacy, with smaller competitors. I caught up with Mayer-Schönberger this week to discuss this bold proposal, and what it can and cannot fix.

The following has been edited for length and clarity.

In every market there tends to be a push toward concentration. But for time eternal there was a counterforce, and that was human innovation. So a small startup could come up with a better idea, and that kept markets competitive.

Innovation is moving at least partially away from human ingenuity, toward data-driven machine learning. Those with access to the most data are going to be the most innovative, and because of feedback loops, they are becoming bigger and bigger, undermining competitiveness and innovation. So if we force those that have very large amounts of data to share parts of that data with others, we can reintroduce competitiveness and spread innovation.

You can break up a big company, but that does not address the root cause of concentration unless you change the underlying dynamic of data-driven innovation.

Breaking up a big data company reduces the value generated from data. It is useful to collect and repeatedly use a lot of data. Take Google Street View cars: they capture images, road geometry, Wi-Fi signals. That doesn’t simply help Google Street View; it improves geolocation on Android and helps Waymo’s autonomous driving. If you split Google into smaller silos, you are reducing the ability to use that data. That not only constrains Google’s ability to grow and be innovative; it also keeps everyone else from being innovative, because nobody else has that data.

That’s not necessarily the case, for two reasons. First, the value of any additional data point diminishes as you have more and more, so smaller players benefit more than bigger ones. Secondly, data sharing means smaller companies can get data from Google, Microsoft, and various other players. So they get very diverse data, which provides a more comprehensive picture, and they may actually be better off than Google, which has a very homogeneous data source.
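
The diminishing-returns argument can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration, not something Mayer-Schönberger specifies: the logarithmic value curve, the function names `data_value` and `gain_from_sharing`, and the dataset sizes are all hypothetical.

```python
import math


def data_value(n_points: int) -> float:
    """Hypothetical value of a dataset with n_points records.

    Assumption: value grows logarithmically, so each additional
    data point adds less than the one before it.
    """
    return math.log1p(n_points)


def gain_from_sharing(own_points: int, shared_points: int) -> float:
    """Extra value a company gets from receiving shared_points records."""
    return data_value(own_points + shared_points) - data_value(own_points)


# Illustrative sizes only: a startup with 10,000 records and an
# incumbent with 10,000,000 each receive the same 50,000 shared records.
small_gain = gain_from_sharing(10_000, 50_000)
large_gain = gain_from_sharing(10_000_000, 50_000)

print(f"gain for the small player: {small_gain:.4f}")
print(f"gain for the large player: {large_gain:.4f}")
```

Under this assumed curve, the same batch of shared data is worth far more to the small player than to the large one. A different curve would change the size of the gap, but any curve with diminishing returns gives the same ordering.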

One question is how much data should be shared. We suggest numbers of up to 3 to 5 percent, but those are back-of-the-envelope calculations. You want to provide a sufficient amount, but those numbers aren’t etched in stone.

Another is: How do we know what data these companies have? You could think of an online directory of data holders where they have to publish the kinds of holdings they have, like “search data,” or “autonomous driving data,” and potential competitors could consult that directory of data sources. It can be done relatively easily, but needs to be organized.

Privacy isn’t a single issue but a bundle of problems; one among them is the concentration of information power. And while data sharing doesn’t increase individual control, limiting the concentration of information power might protect us from a Big Brother-like situation.

But make no mistake: the data-sharing mandate isn’t solving privacy issues. That’s not what it can do. But it can help provide alternatives. We’ve had search companies providing internet search that is more privacy conducive, but their results sucked because they didn’t have enough training data. If we help competitors to Google or Facebook offer really good services because they have training data, I think that would help consumers and help privacy.