The amount of data being generated and stored in the cloud has exploded.
Every aspect of the enterprise is being instrumented for data, and new operations are being built on that data, pushing every company to become a data company.
The emergence of the cloud database is one of the most profound shifts driving this. Services such as Amazon S3 have made it practical to store and work with large volumes of data from every available source.
Enterprises want to store everything they can in order to deliver improved customer experiences and new market capabilities.
Over the last 10 years, database companies have raised over $8 billion, and roughly half of that, $4.1 billion, has been raised in the last 24 months, according to CB Insights.
That is not surprising given the high valuations of companies in this market. The market has doubled in the last four years and is expected to double again over the next four, which leaves a huge opportunity to go after.
Here is a list of database financings in 2021.
Database growth is driving spending in the enterprise. (Image credit: Venrock)
Thanks to the cloud, distributed applications, global scale, real-time data and deep learning, new database architectures have emerged to solve for new performance requirements.
There are different systems for fast reads and for fast writes. Some are designed specifically to power ad-hoc analytics, while others hold data used for caching, search and more.
It may come as a surprise, but there are billions of dollars in Oracle instances still powering critical apps today, and they likely aren’t going anywhere.
Each system comes with its own performance needs, which can include high availability, horizontal scale, distributed consistency, failover protection, partition tolerance, and being serverless and fully managed.
The average enterprise stores data across seven or more different databases. For example, you may have ClickHouse for ad-hoc analytics, Timescale for time-series data, Elastic for search data, S3 for logs, Postgres for transactions, Redis for caching or application data, and Dgraph for complex workloads.
And all of that assumes you have built a modern data stack from scratch.
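To make that fragmentation concrete, the example stack above can be written down as a simple routing table. This is only an illustrative sketch: the `Workload` enum, the `STORE_FOR` table and both helper functions are hypothetical names invented here, and the pairings simply mirror the example in the text rather than any recommended architecture.

```python
from enum import Enum


class Workload(Enum):
    """Workload categories taken from the example stack in the text."""
    AD_HOC_ANALYTICS = "ad-hoc analytics"
    TIME_SERIES = "time-series data"
    SEARCH = "search data"
    LOGS = "logs"
    TRANSACTIONS = "transactions"
    CACHING = "caching or application data"
    COMPLEX = "complex workloads"


# Hypothetical routing table: one purpose-built store per workload,
# mirroring the pairings listed in the article's example.
STORE_FOR = {
    Workload.AD_HOC_ANALYTICS: "ClickHouse",
    Workload.TIME_SERIES: "Timescale",
    Workload.SEARCH: "Elastic",
    Workload.LOGS: "S3",
    Workload.TRANSACTIONS: "Postgres",
    Workload.CACHING: "Redis",
    Workload.COMPLEX: "Dgraph",
}


def store_for(workload: Workload) -> str:
    """Return the store that the example stack assigns to a workload."""
    return STORE_FOR[workload]


def distinct_stores() -> int:
    """Count how many separate systems the stack has to operate."""
    return len(set(STORE_FOR.values()))


if __name__ == "__main__":
    for workload in Workload:
        print(f"{workload.value:30s} -> {store_for(workload)}")
    print(f"systems to operate: {distinct_stores()}")
```

Every row in a table like this is a separate system to provision, monitor and keep in sync, which is where the operational overhead comes from.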
The level of performance and guarantees from these services and platforms is vastly different from what was available a decade ago. At the same time, the database layer is becoming more and more fragmented.
That fragmentation creates problems of its own, such as the overhead of managing active-active clustering across so many different systems, or of handling new clusters and systems as they come online. Each of these systems has different requirements.
On top of that, new databases emerge every month, each aiming to solve the next challenge of enterprise scale.
Will the future of the database be defined as it is today?