Why LLMs Might Not Be a Good Business
TLDR
Large language models (LLMs) may not be a sustainable business for several reasons. While data, math, compute, and interface are all crucial to an LLM business, none is unique to any single company, and each can be replicated by competitors. Furthermore, the potential for competitors like Meta or DeepSeek to open-source capable models could further undermine the profitability of LLM businesses.
The Investment Must Flow
Tens of billions of dollars have been invested in the artificial intelligence gold rush. And, at least until very recently[1], we assumed that up to hundreds of billions of dollars of additional investment[2] might be required to train the next generation of large language models (LLMs).
The Core Question: Is there value in LLMs?
Cal Paterson recently published a fantastic essay in which he dissects key aspects of the business of LLMs. In this piece I will attempt to extend his framework and make an even stronger case that producing frontier LLMs might not be a good business.
I will argue that LLMs are only a good business if they can be monetized, achieve sufficient profit margins, and defend their market position.
What are the pillars of an LLM business?
Although we could identify many core areas of an LLM business, I will focus on the four that are, in my view, the most critical and the most likely to be within the business's control to execute:
- Data: LLMs need a lot of data to train. Historically, models have been viewed as only as good as the data they are trained on.
- Math: Developing the right mathematical, statistical, or other models to train an LLM is what makes your data useful.
- Compute: Access to significant computational resources is essential to train an LLM, especially one with billions of parameters trained on petabytes of data.
- Interface: The distribution required to monetize an LLM is expensive and difficult to acquire, which is a significant barrier to entry for new entrants.
Let's treat each one of these in turn to see if it satisfies the criteria of being a good business.
Data
The ability to accumulate massive amounts of data is limited mainly by the kind of data you seek:
- Data that is freely and publicly available. This is most likely data from the open web, purposely posted for consumption by anyone who can resolve the web address. Many companies maintain vast stores of this data, and the only real barriers to accumulating it are the costs of storage and processing, both of which are relatively low (see the sketch below).
- Data that is publicly viewable but restricted from downloading. Large portions of the internet, including some of the most useful data, are managed by companies that restrict mass downloading from their sites. Reddit is a good example of a site that has taken steps to prevent mass downloading of its data, especially by parties who would use it to train models.
- Data that is restricted from public view and access. This data lives behind paywalls or login screens, such as much of the data on Facebook, LinkedIn, and other social media sites. It is the most difficult to acquire and often the most valuable. I also include here any protected personal information or other data that, for regulatory or other reasons, is restricted from inclusion in a training set.
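To illustrate how low the barrier is for the first category, here is a minimal sketch of fetching a public page. The URL and user-agent string are placeholders, and a real crawler would also honor robots.txt, rate limits, and each site's terms of service:

```python
# A minimal sketch of open-web data collection. The URL and user-agent
# are hypothetical; a production crawler would honor robots.txt, rate
# limits, and terms of service.
import requests

def fetch_page(url: str) -> str | None:
    resp = requests.get(
        url,
        headers={"User-Agent": "research-crawler/0.1"},  # placeholder name
        timeout=10,
    )
    if resp.ok and "text/html" in resp.headers.get("Content-Type", ""):
        return resp.text
    return None

html = fetch_page("https://example.com/some-public-page")
```

The hard parts of collection at scale are storage and deduplication, not access, which is exactly why this category offers so little defensibility.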
So while certain LLM companies may have a competitive advantage in the data they have access to, it does not appear to be sustainable, as the data is not unique to the company and can be acquired by others.
Furthermore, we are learning, almost in real time, the role distillation plays in training smaller models from larger ones. Distillation gives the new model much of the original model's performance without the need for the original data, reducing both the value of original data and the need to acquire your own.
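For the curious, here is a minimal sketch of the classic distillation setup, assuming a generic PyTorch training loop. The student and teacher are assumed to be callables that return raw logits; everything here is illustrative, not any lab's actual pipeline:

```python
# A minimal knowledge-distillation sketch (illustrative only). The student
# learns to match the frozen teacher's output distribution, so the teacher's
# original training data is never needed.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions, then push the student toward the teacher.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2

def train_step(student, teacher, input_ids, optimizer):
    with torch.no_grad():                  # the teacher is frozen
        teacher_logits = teacher(input_ids)
    student_logits = student(input_ids)
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that the only "data" the student consumes is whatever prompts you feed both models; the teacher's curated corpus never changes hands.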
Math
Is math discovered or is it invented? There are many opinions on this question, which may or may not have an answer. But one thing we can answer: math is published. Every significant organization working on artificial intelligence or large language models publishes significant portions of its work. Google publishes their work. Meta publishes their work. Even DeepSeek publishes a significant amount of their work. DeepSeek's CEO said (via Stratechery):
"In the face of disruptive technologies, moats created by closed source are temporary. Even OpenAI’s closed source approach can’t prevent others from catching up. So we anchor our value in our team — our colleagues grow through this process, accumulate know-how, and form an organization and culture capable of innovation. That’s our moat.
Open source, publishing papers, in fact, do not cost us anything. For technical talent, having others follow your innovation gives a great sense of accomplishment. In fact, open source is more of a cultural behavior than a commercial one, and contributing to it earns us respect. There is also a cultural attraction for a company to do this."
Liang Wenfeng, DeepSeek CEO
And I have yet to mention the thousands of academic researchers who carry even stronger incentives to publish their work, especially the most impactful results and methods they can uncover.
Building out mathematical or statistical models does not appear to be a sustainable competitive advantage. If you believe your model is the differentiator, you will likely be disappointed.
Compute
To build an advanced model you must have sufficient computational resources to train it. Access to these resources is gated by two key constraints: who supplies the chips, and access to sufficient capital to buy them.
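To make the scale concrete, a common back-of-envelope estimate puts training compute at roughly C ≈ 6ND floating-point operations for a model with N parameters trained on D tokens, an approximation popularized by the scaling-law literature. A rough sketch, with entirely illustrative numbers:

```python
# Back-of-envelope training cost using the common C ~= 6 * N * D FLOPs
# approximation. Every number below is an illustrative assumption, not
# any lab's actual figure.

params = 70e9    # N: 70B parameters (hypothetical model)
tokens = 15e12   # D: 15T training tokens (hypothetical dataset)
flops = 6 * params * tokens            # ~6.3e24 FLOPs

gpu_flops = 1e15      # assume ~1 PFLOP/s peak per accelerator
utilization = 0.40    # assume 40% of peak is actually sustained
gpu_hours = flops / (gpu_flops * utilization) / 3600   # ~4.4 million

cost_per_gpu_hour = 2.50   # assumed rental price in USD
print(f"GPU-hours: {gpu_hours:,.0f}")
print(f"Estimated cost: ${gpu_hours * cost_per_gpu_hour:,.0f}")   # ~$11M
```

Even these hypothetical numbers land in the millions of dollars per training run, which is why access to chips and capital dominates the conversation.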
Everyone knows Nvidia is the market leader in supplying the chips for training large language models, which means everyone is playing Nvidia's game. If Nvidia decides to raise prices, or to limit the number of chips it sells to any one company, that company is at a significant disadvantage. This concentration of supply is a significant risk to any company trying to build a model.
For this and other reasons, companies that are serious about artificial intelligence are investing in their own chip design and manufacturing. Google, Microsoft, Amazon, and Apple all design or produce custom silicon for artificial intelligence workloads. OpenAI has even explored designs of its own.
DeepSeek is making headlines because they achieved such outstanding results despite an export ban that prevented them from using the most capable chips. Their engineers responded by customizing their use of the chips they could get to work around those limitations. And of course, DeepSeek turned the world upside down by training their model at such a relatively low cost that it calls into question the very idea that massive amounts of compute are required at all.
If compute is a commodity, then it is not a sustainable competitive advantage.
Interface
Building a high-performing model is not enough; people need a way to interact with it. This, of course, was the major innovation of OpenAI. By creating a chat interface, it gave people a way to interact with the model that was not possible before. However, most, if not all, subsequent entrants to the market have copied this interface. There is no sustainable advantage when competitors can simply mimic your designs.
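To see just how thin this layer is, here is a minimal sketch of a complete chat loop, assuming the OpenAI Python SDK and an API key in the environment; swap the client and model name and the same few lines front a competitor's model instead:

```python
# A minimal chat interface sketch (assumes `pip install openai` and an
# OPENAI_API_KEY in the environment). The entire "interface" reduces to
# a message list and a loop.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
history = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input("you> ")
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # any chat-completions model works here
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    print(f"bot> {reply}")
```

If the defensible part of your product can be reimplemented in twenty lines, it is not the defensible part.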
This has played out in real time over the last few days as DeepSeek has gained notoriety. Their app shot to the top of the App Store in what seemed like an instant.
Consumers do not appear to be loyal to any model or to the organizations creating LLMs. There does not appear to be any stickiness to any one LLM app or company.
The interface is not a sustainable competitive advantage.
Another Thought on Open Source Models
If it is part of your value proposition, keep it secret. If it is part of your competitors' value proposition, open source it.[3]
The business of LLMs will struggle if major competitors, like Meta or DeepSeek, build sufficiently capable models and then open source them. This cuts the floor right out from underneath the business model of any company that is trying to build a model and then monetize it.
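The floor-cutting is practical, not theoretical: anyone can serve an openly released model with a few lines of code. A minimal sketch, assuming the Hugging Face transformers library and a machine with enough memory (the model name is one example of an openly released model; some require accepting a license first):

```python
# A minimal sketch of running an open-weights model locally
# (assumes `pip install transformers torch` and sufficient memory).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # example open-weights model
)
out = generator("Why might open weights undercut paid APIs?", max_new_tokens=200)
print(out[0]["generated_text"])
```

Once weights are public, the marginal price of a "good enough" model trends toward the cost of the hardware that runs it.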
One Thought on Apple and Meta's Positions
Artificial intelligence gains usefulness the more a model can interact with your own data. Apple and Meta both maintain access to large troves of personal data that only they can reach. Apple's data resides on the devices of its users, where it is difficult for any other player to unify and analyze it without deep integration into the operating system. Meta's data is locked behind the login screens of its apps and services. Both appear better positioned than any other major company to leverage the power of large language models and to feed them into activities that drive value for their firms.
Conclusion
The business of LLMs is not a good business. The data, math, compute, and interface are not sustainable competitive advantages. Turning a business that builds frontier models into a profitable enterprise will likely hinge on some other aspect of the business and not its ability to produce the model itself.
[1] USA Today: "DeepSeek says it costs less than $6 million to train its DeepSeek-V3 model. OpenAI, in comparison, spent more than $100 million to train the latest version of ChatGPT, according to Wired." ↩︎
[2] Of course, there is a significant amount of debate on whether this announcement constitutes an accurate forecast or is more of a marketing headline. I'll leave it up to the reader to decide. ↩︎
[3] I could not find a source for the original quote or its closest variation. If you know it, please let me know. ↩︎