Saturday, February 1, 2025

After DeepSeek: For India, time for the AI leap


The launch of DeepSeek’s model has set off a global AI race. Where does India stand? (Illustration by C R Sasikumar)


B Ravindran

Krishnan Narayanan

Jan 31, 2025 14:32 IST First published on: Jan 31, 2025 at 07:09 IST

DeepSeek, a Chinese startup, recently released its AI model (R1), designed for advanced reasoning tasks, and it has raised a virtual storm worldwide. The models, which have been open-sourced, are reported to have been built using just 2,000 Nvidia H800 GPUs and to match the performance of leading systems like OpenAI’s GPT-4, but at a fraction of the cost (just $6 million for the final training run). These figures for AI infrastructure and model-development cost are an order of magnitude lower than those reported for leading frontier AI models.

Some have hailed DeepSeek’s emergence as “AI’s Sputnik moment”, while others have expressed scepticism about the origins and actual costs of its rapid advancement. The stock markets are in a tizzy. Startups/researchers worldwide have begun testing, even locally installing, and trying to replicate the results of DeepSeek’s models. The dust is settling. One thing is clear: This moment can catalyse new AI developments in the world. But what does it mean for India?


Chinese engineers looking to develop foundation models/LLMs faced significant challenges in acquiring Nvidia’s latest GPUs in large quantities. Given these constraints, they cleverly combined several known AI engineering techniques, while making some unique contributions as well, to radically improve the efficiency and lower the cost of AI-model training and inferencing.

For instance, DeepSeek claims that it uniquely leveraged “reinforcement learning” techniques to create an AI model that autonomously develops advanced reasoning behaviours like self-verification and complex chains of thought. It uses a “mixture of experts” technique to route different parts of a task to specialised units or “experts” within the model, ensuring that only the most relevant sections are active at any given time. To make the system even more efficient, DeepSeek uses other optimisation techniques to quickly find and process information without using much memory, and also predicts two words at a time instead of one. All these AI engineering methods make the system faster and more resource-efficient while still handling complex problems. The lower cost encourages more startups to use DeepSeek in their real-world applications.
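To make the “mixture of experts” idea concrete, here is a minimal sketch of generic top-k gating in plain Python. It is an illustration only and not DeepSeek’s implementation: the function names (`route_top_k`, `moe_forward`, `make_expert`), the toy “experts” that merely scale their input, and the gate weights are all assumptions made for this example.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by `scale`."""
    return lambda x: [scale * xi for xi in x]

def moe_forward(x, experts, gate_weights, k=2):
    """Run only the k selected experts and mix their outputs by gate weight."""
    # One gate score per expert: dot product of the input with that expert's gate vector.
    gate_scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    chosen = route_top_k(gate_scores, k)
    out = [0.0] * len(x)
    for idx, weight in chosen:
        y = experts[idx](x)  # the other experts are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in chosen]

# Four toy experts, of which only two run for this input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
output, used = moe_forward([1.0, 0.0], experts, gate_weights, k=2)
print(used, output)
```

With four experts and k=2, half of the expert computation is skipped for each input. This is the sense in which only the most relevant sections of the model are used at any given time.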

Several questions arise with respect to DeepSeek’s implications for India. Why didn’t we create this here? Is there an opportunity to create newer models in the future? Will our developers use models like this and benefit from them?


Let us start with the implications for developing AI applications first. The most significant aspects of DeepSeek models are their cost-effectiveness and open access. These models achieve performance that matches existing models, like GPT-4, but at a fraction of the cost. The API access is roughly one-tenth to one-twentieth the price of global AI models. This price reduction is a game changer for the Indian AI industry. It means that high-quality language models become much more accessible and affordable for a wide range of applications and users.
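A rough back-of-the-envelope calculation shows what a one-tenth to one-twentieth price means for a startup’s monthly bill. The baseline price and token volume below are hypothetical placeholders, not quoted prices from any provider, and `discounted_cost_range` is a name introduced only for this sketch.

```python
def monthly_cost(price_per_million_tokens, tokens_per_month):
    """Monthly API bill for a given per-million-token price and usage."""
    return price_per_million_tokens * tokens_per_month / 1_000_000

def discounted_cost_range(baseline_cost):
    """Bill at one-twentieth and one-tenth of the baseline, as (low, high)."""
    return baseline_cost / 20, baseline_cost / 10

# Hypothetical inputs: 10 currency units per million tokens, 500 million tokens a month.
baseline = monthly_cost(price_per_million_tokens=10.0, tokens_per_month=500_000_000)
low, high = discounted_cost_range(baseline)
print(f"baseline: {baseline:.0f}, discounted: {low:.0f} to {high:.0f}")
```

Under these assumed numbers, a bill of 5,000 units falls to between 250 and 500 units, which is the kind of difference that can move an application from unaffordable to viable.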

DeepSeek is open source, which is very important, as it allows users to download the models and run them on their own hardware if they have the capacity. We are already seeing others create local installations of DeepSeek models — even without GPUs. This means Indian startups don’t need to rely on servers located in China and can create their own version of the DeepSeek service, much like Perplexity has already done.

Second is the issue of AI research. India has a strong AI talent pool, but it is mostly focused on building applications on top of existing AI systems. While India can use existing LLMs very well for this purpose, we need to focus on fundamental research in order to create our own cutting-edge AI foundation models. There is a strong need for increased AI research funding and a shift in our approach to AI development. To start with, we expect that multiple efforts will be undertaken in India (in universities and companies) where existing models such as DeepSeek’s or Meta’s Llama will be installed locally and fine-tuned with India-specific or domain-specific data. Remember, DeepSeek did not happen overnight: it involved the efforts of hundreds of researchers and engineers over a period of just under two years.

The lower costs of training and inference mean that researchers can perform many more experiments. Andrej Karpathy, a founding member of OpenAI and former director of AI at Tesla, has suggested establishing a global “RL-gym” to create a wide range of RL environments to understand how LLMs think and make decisions. This may spur research towards developing AGI. At the same time, let us not forget that there are several other areas of AI to research, such as predictive AI and physical AI.

There are only a few efforts in India to create our own LLMs. We must use the DeepSeek moment to catalyse multiple and competing mission-mode projects to develop our own foundation models. Besides the government, private sector companies and philanthropists can also fund some of these AI grand challenges. The IndiaAI Mission’s GPU cluster will come in handy for these projects.


Multi-disciplinary teams should be put in place. The projects require expertise in AI frameworks like PyTorch, advanced attention mechanisms, efficient model training techniques and reinforcement learning. Engineers need skills in optimising AI performance using low-precision computing and specialised processing methods. Teams should also have hardware expertise in GPU acceleration, distributed computing and high-speed networking.

India has the talent. It has the resolve. The time for collective AI action is now.

Ravindran is Professor and Head of the Wadhwani School of Data Science and AI, IIT Madras. Narayanan is co-founder and president of itihaasa Research and Digital.
