Large-scale machine learning models are at the center of many technologies. They can generate images and text convincing enough to pass for a human's work. But developing these models takes a great deal of time and computing power. If DALL-E 2 were trained on Amazon Web Services, two weeks of training would cost around $130,000.
Smaller companies struggle to keep up, which is why many turn to "AI-as-a-service" vendors that handle the challenging work of creating models. AssemblyAI is one such vendor, focused on speech-to-text and text analysis.
AssemblyAI's Series B round was led by Insight Partners and included participation from Y Combinator and Stripe. According to Dylan Fox, the founder and CEO of AssemblyAI, the company has raised $64 million so far and is investing to grow its research and engineering teams.
Fox founded AssemblyAI after two years working on machine learning for collaboration products. He also started YouGive1 to reward customers with product offers in exchange for donations.
Fox started the company after seeing how limited and inaccurate the available speech-to-text options were. AssemblyAI wants to research and deploy cutting-edge artificial intelligence models for natural language processing and speech recognition, and to expose those models to developers through very simple software development kits and APIs that are free and easy to use.
AssemblyAI offers services in over 80 languages for automatic transcription, topic detection, and content moderation, as well as auto chapters, which breaks audio and video files down into chapters with a summary for each. At a relatively low cost, developers can use the platform to perform tasks such as identifying the speakers in a conversation or checking podcasts for banned content.
Image Credits: AssemblyAI
AssemblyAI's models are trained on hundreds of GPUs. Larger models are more sophisticated than smaller ones. Fox said that the company continues to improve the accuracy of all of its models and to launch new ones. The models can also learn from a random sample of a customer's data in order to improve over time.
There are other players in the bustling AI-as-a-service sector. Sayso, for example, created an application that could change the accent of English speech in near-real time. Amazon, Microsoft, and others also offer a number of products that target applications like text analysis, image recognition, text-to-speech, and more.
The rise of remote work is one of the reasons that AssemblyAI continues to grow at a rapid rate. Fox notes that audio and video are being incorporated into many products, and that product teams are looking for ways to build high-value features on top of audio and video data.
Such features include tools for trust and safety teams at social media companies, automated content moderation at advertising platforms, and smarter contact center platforms built by telephony companies. According to Fox, AssemblyAI is quickly becoming the go-to platform for product teams that want to ship AI-infused features on top of the audio and video data in their products.
According to Fox, AssemblyAI now has hundreds of paying customers. Its user base and its revenue have each grown 3x.
Fox said that the company is processing millions of calls a day. Over the next six months, it plans to triple its research team and invest millions of dollars in hardware to train larger and more complex artificial intelligence models.
AssemblyAI will be well positioned for the coming year, according to Fox. At a time when layoffs are becoming a regular occurrence and financing is hard to come by, the company will nearly double the size of its team by the end of the year.
Fox said that the company closed its Series A funding just a few months ago and was not actively raising money. It had, however, been in touch with Rebecca from Insight, who he said would help the company further. As the market opens up, he added, AssemblyAI needs to both establish itself as the dominant provider in this space and support the growing expectations of its customers.