Databricks, a San Francisco-based software maker, announced its acquisition of MosaicML, a three-year-old startup that focuses on taking AI beyond the lab. The $1.3 billion deal is indicative of the fervor for assets in the white-hot generative artificial intelligence market, and it demonstrates the changing nature of the modern cloud database market.
MosaicML, staffed with semiconductor veterans, has built a program called Composer that makes it easy and affordable to take a standard AI program such as OpenAI’s GPT and dramatically speed up its development, the beginning phase known as the training of a neural network. The company this year introduced cloud-based commercial services where businesses can, for a fee, both train a neural network and perform inference, the rendering of predictions in response to user queries.
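To get a feel for what that looks like in practice, here is a minimal sketch of a training run using Composer’s open-source Trainer; the toy model, synthetic data, and choice of speed-up algorithm are illustrative assumptions, not MosaicML’s actual recipes.

```python
# A minimal sketch, assuming Composer's open-source Trainer API.
# The model and data below are toy stand-ins, not MosaicML's recipes.
import torch
from torch.utils.data import DataLoader, TensorDataset
from composer import Trainer
from composer.models import ComposerClassifier
from composer.algorithms import LabelSmoothing

# Toy classifier and synthetic data standing in for a real workload.
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
model = ComposerClassifier(module=net, num_classes=10)
data = TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))

trainer = Trainer(
    model=model,
    train_dataloader=DataLoader(data, batch_size=32),
    max_duration="1ep",  # train for one epoch
    algorithms=[LabelSmoothing(smoothing=0.1)],  # one of Composer's built-in methods
)
trainer.fit()
```

The point of the algorithms list is that Composer layers such training modifications onto an ordinary PyTorch loop, which is where the speed and cost gains come from.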
The more profound implication of MosaicML’s approach is that whole areas of working with data, such as the traditional relational database, could be completely reinvented. “Neural network models can actually be thought of almost as a database of sorts, especially when we’re talking about generative models,” said Naveen Rao, co-founder and CEO of MosaicML. Rao explained that a database is a set of endpoints that are typically very structured, usually rows and columns of data organized according to a schema. Unlike a traditional relational database, such as Oracle, or a document database, such as MongoDB, where the schema is preordained, with a large language model, “the schema is discovered from [the data], it produces a latent representation based upon the data, it’s flexible.”
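To make the contrast concrete, consider a rough Python sketch: the relational store must declare its schema before a single row is inserted, while a generative model answers the same question from a latent representation it discovered on its own. The ask_llm function below is a purely hypothetical stand-in for an inference endpoint.

```python
# Schema-first retrieval versus schema-discovered retrieval (illustrative).
import sqlite3

# Relational: columns and types are preordained before any data arrives.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer TEXT, item TEXT, total REAL)")
con.execute("INSERT INTO orders VALUES ('Acme', 'widgets', 1200.0)")
total = con.execute(
    "SELECT SUM(total) FROM orders WHERE customer = 'Acme'"
).fetchone()[0]

# Generative: no declared schema; a model trained on the same records
# answers free-form questions from its latent representation.
def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in for a call to an LLM inference service.
    return "Acme has spent $1,200 in total."

answer = ask_llm("How much has Acme spent across all orders?")
```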
The MosaicML work is part of a broad movement to make so-called generative AI programs like ChatGPT more relevant for practical business purposes. For example, Snorkel, a three-year-old AI startup based in San Francisco, offers tools that let companies write functions that automatically create labeled training data for so-called foundation models — the largest neural nets that exist, such as OpenAI’s GPT-4. Another startup, OctoML, last week unveiled a service to smooth the work of serving up inference.
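Snorkel’s labeling-function idea can be shown in a few lines. Here is a minimal sketch using Snorkel’s labeling API; the heuristics, labels, and toy messages are assumptions for illustration.

```python
# A minimal sketch of Snorkel-style labeling functions; heuristics are illustrative.
import pandas as pd
from snorkel.labeling import PandasLFApplier, labeling_function

SPAM, HAM, ABSTAIN = 1, 0, -1

@labeling_function()
def lf_contains_offer(x):
    # Heuristic: marketing language suggests spam.
    return SPAM if "limited offer" in x.text.lower() else ABSTAIN

@labeling_function()
def lf_short_reply(x):
    # Heuristic: very short messages tend to be ordinary replies.
    return HAM if len(x.text.split()) < 5 else ABSTAIN

df = pd.DataFrame({"text": ["Limited offer, click now!", "See you at noon."]})
applier = PandasLFApplier(lfs=[lf_contains_offer, lf_short_reply])
label_matrix = applier.apply(df)  # one weak label per function per example
```

Functions like these produce noisy labels cheaply, which can then be aggregated into training data, sparing humans from hand-labeling every example.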
MosaicML, which raised $64 million prior to the deal, appealed to businesses with language models that would be not so much generalists of the ChatGPT sort as models focused on domain-specific business use cases, what Rao called “building experts.” That cuts against the prevailing trend in artificial intelligence, including generative AI, which has been to build ever more general programs capable of handling tasks in all sorts of domains, from playing video games to engaging in chat to writing poems, captioning pictures, writing code, and even controlling a robotic arm stacking blocks.
The use of AI in the wild, by individuals and institutions, is likely to be dominated by far more focused approaches, because they can be far more efficient. “I can build a smaller model for a particular domain that greatly outperforms a larger model,” Rao told ZDNET.
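In code, “building experts” amounts to fine-tuning: start from a modest pretrained model and continue training it on a company’s own domain data. Below is a minimal sketch using the Hugging Face transformers and datasets libraries; the model choice and the two-example corpus are assumptions for illustration.

```python
# A minimal fine-tuning sketch: turning a small pretrained model into a
# domain "expert" on toy company data. Model and data are illustrative.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

# Tiny stand-in for a domain-specific corpus (e.g., insurance claims).
ds = Dataset.from_dict({
    "text": ["claim approved per policy 12-B", "escalate: coverage dispute"],
    "label": [0, 1],
}).map(lambda x: tok(x["text"], truncation=True, padding="max_length",
                     max_length=32), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-expert", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ds,
)
trainer.train()
```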
MosaicML had made a name for itself by demonstrating its prowess in the MLPerf benchmark tests, which measure how fast a neural network can be trained. Among the secrets to speeding up AI is the observation that smaller neural networks, built with greater focus, can be more efficient. That idea was explored extensively in a 2019 paper by MIT scientists Jonathan Frankle and Michael Carbin, which won a best paper award that year at the International Conference on Learning Representations. The paper introduced the “lottery ticket hypothesis,” the notion that every big neural net contains “sub-networks” that can be just as accurate as the total network but with less compute effort.
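The lottery-ticket recipe is simple to state: train, prune the smallest-magnitude weights, rewind the survivors to their initial values, and retrain the sparse sub-network. Below is a rough PyTorch sketch of that loop; the model and the empty train function are placeholders, not the paper’s exact setup.

```python
# A rough sketch of iterative magnitude pruning with weight rewinding,
# in the spirit of Frankle & Carbin (2019). Model and training loop are
# placeholders.
import copy
import torch
import torch.nn.utils.prune as prune

model = torch.nn.Sequential(torch.nn.Linear(784, 300), torch.nn.ReLU(),
                            torch.nn.Linear(300, 10))
init_state = copy.deepcopy(model.state_dict())  # remember the initialization

def train(model):
    pass  # stand-in for a full training loop

for _ in range(3):  # iterative pruning rounds
    train(model)
    for module in model:
        if isinstance(module, torch.nn.Linear):
            # Prune 20% of the remaining smallest-magnitude weights.
            prune.l1_unstructured(module, name="weight", amount=0.2)
    # Rewind surviving weights to their original initial values.
    with torch.no_grad():
        for name, param in model.named_parameters():
            key = name.replace("_orig", "")
            if key in init_state:
                param.copy_(init_state[key])
train(model)  # final training of the sparse "winning ticket"
```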
The acquisition by Databricks brings MosaicML into a vibrant non-relational database market that has for several years been shifting the paradigm of the data store beyond rows and columns. That includes the data lake of Hadoop, techniques to operate on it, and the map-and-reduce paradigm of Apache Spark, of which Databricks is the leading proponent. The market also includes streaming data technologies, where the store of data can in some sense be the flow of data itself, known as “data in motion,” such as the Apache Kafka software promoted by Confluent.
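As a taste of the “data in motion” model, here is a minimal sketch using the confluent-kafka Python client: rather than resting in a table, each record is published onto a stream that consumers process as it flows. The broker address and topic name are illustrative assumptions.

```python
# A minimal "data in motion" sketch, assuming the confluent-kafka client.
# Broker address and topic name are illustrative.
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

event = {"customer": "Acme", "item": "widgets", "total": 1200.0}
# The store is, in effect, the stream: records are appended to a topic
# and read by downstream consumers as they arrive.
producer.produce("orders", value=json.dumps(event).encode("utf-8"))
producer.flush()  # block until the broker confirms delivery
```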