The AI revolution heralded by the launch of ChatGPT in November 2022, which suddenly opened a bewildering world of possibilities, now seems to have plateaued. While the technology remains uniquely world-altering, the next couple of years will likely see the space develop incrementally rather than through the exponential advances that have become the norm. There are five main reasons for this.
Running out of food for AI
First, we are quickly running out of data to train newer models. The voracious appetite of existing models has already consumed most of the publicly available text on the open internet, and it is estimated that the stock of high-quality textual data online will be exhausted by 2028, creating a so-called "data wall" that might be difficult to breach. While there are still abundant sources of video, audio and images that can be tapped, these are far more difficult to use for training than text and are subject to greater intellectual property protections.
Second, the quality of the public data that remains is a matter of serious concern. With much of the high-quality data already accounted for, only subpar sources are left, which necessitates spending more time "cleaning" available data to make it fit for consumption. Good-quality data is an absolute necessity for training capable LLMs. Finding new, untapped sources of such data will be difficult, forcing developers and companies to improve the quality of existing datasets and draw more utility from them.
A sourcing issue
Third, any further expansion of training data for LLMs would need to come from two sources: either proprietary data protected by intellectual property, or "synthetic data" generated by AI systems themselves. Access to proprietary data can legally be acquired only on a case-by-case basis, subject to specific agreements with the data owners, which slows such access. While there has already been some movement on this front, with publishers such as the Financial Times and the Associated Press signing data access agreements with OpenAI for their content, how widely this model can be replicated globally is up for debate. Synthetic data, on the other hand, could provide a more readily available alternative to public online sources: training AI models on data produced by other AI systems would bypass the "data wall" problem.
The growing interest in synthetic data and its theoretical possibilities has led firms like Nvidia and IBM to launch their own synthetic data generators. There is, however, a great deal of scepticism about the efficacy of such data compared to human-generated sources. A recent study in the journal Nature suggests that models trained on synthetic data are more likely to "hallucinate", or produce nonsensical outputs, because training on AI-generated data amplifies the errors of previous generations of models, eventually causing the models to "collapse" on themselves. AI offerings trained on synthetic data are therefore likely to see less commercial uptake than models trained on human-generated data.
Fourth, an honest assessment would show that, despite the hype of the last year and a half, AI adoption has been relatively slow and narrow, creating doubts about the short-term commercial viability of prevalent models. Alphabet's most recent quarterly financial report, for example, shows that the cost of training and deploying its models has far outweighed any immediate commercial returns, leading to a hammering of its share price after 18 months of giddy ascent. Other AI-focused Big Tech firms like Meta and Microsoft are likely to show similar results. In essence, most of the world still does not know what to do with existing AI models, let alone any new ones.
Still a niche
Beyond a small niche, AI as a product has not really taken off as expected. Big Tech's focus now is on creating more use cases for existing AI, either by integrating it at the backend of existing products, as Apple is doing, or by cannibalising rivals' businesses, as OpenAI is attempting with SearchGPT.
Fifth, increased regulatory scrutiny globally, coupled with rising geopolitical tensions, has circumscribed the area within which AI models can operate and develop. Unlike the internet, which enjoyed a relatively long period of unregulated growth across the world, AI has from the outset been squarely in the line of sight of governments. A wide range of policy concerns attributed to AI, from copyright infringement and potential job losses to environmental and national security risks, will ensure that any further leaps in capability are subject to the goodwill of individual states, reducing both the speed and scale of advancement.
The task, for now
The primary question the AI and tech community faces at this point is not technological but commercial: does good technology translate into good business? After all, developing a technology and commercialising it are two very different problems. Will AI change the world? Undoubtedly, but this will take time and will be subject to the very real constraints of capital, regulation, and market forces. Significant work also remains on the physical side of the AI ecosystem, such as designing more efficient and environmentally sustainable data centres, creating more robust chip supply chains, and building new power plants, amongst a host of other issues.
Therefore, until significant AI-attributable revenues start coming in, AI and tech companies will be forced to focus on the incremental tinkering needed to make existing models better, more sustainable and, importantly, more commercially viable. This phase of consolidation is not only healthy but necessary to lay the groundwork for the next generation of frontier models to be more easily accepted globally.
The writer is Managing Partner, Evam Law & Policy