India must lead in AI for the South Asian Region
Vinit Utpal
At the AI Action Summit in Paris, Indian Prime Minister Narendra Modi urged global leaders to collaborate on setting AI standards that benefit everyone, particularly the Global South. He emphasized the need for collective global efforts to establish governance and standards that reflect shared values, mitigate risks, and foster trust. On this occasion, Modi also announced that India is developing AI applications for the public good and is building its own large language model (LLM) to address the country’s diversity across various sectors.
In January, Ashwini Vaishnaw, Union Minister of Electronics and Information Technology, announced at the Utkarsh Odisha Conclave that India would build native generative AI models within six to eight months. He revealed that India had secured more than 18,600 high-end GPUs to build its AI infrastructure and scale its computing capacity. Among these are 12,896 Nvidia H100 GPUs, 1,480 Nvidia H200 GPUs, and 742 AMD MI325 and MI325X GPUs. By comparison, DeepSeek’s AI was trained on about 2,000 GPUs, while ChatGPT was trained on roughly 25,000.
GPUs, or Graphics Processing Units, are specialized hardware designed to accelerate the rendering of images and videos, but their parallel processing capabilities also make them ideal for a range of complex computational tasks, including machine learning, scientific simulations, and cryptocurrency mining. Leading manufacturers include NVIDIA, AMD, and Intel, with NVIDIA especially dominant in AI and machine learning through its CUDA platform and hardware such as the A100 and Tesla series. India has already deployed the DeepSeek-R1 AI model, and experts are currently evaluating its technical report to better understand its capabilities.
Meanwhile, Large Language Models (LLMs) have become foundational for AI systems. These models enable machines to exhibit human-like language understanding, facilitating more diverse and sophisticated interactions between machines and humans. LLMs hold immense potential across governance sectors, including health, education, transport, defence, and security. However, building LLMs requires massive funding, along with strategic resources such as semiconductors, which complicates the process.
The race for LLMs began with the launch of OpenAI’s models in the USA, followed by Google’s Bard, Meta’s AI, and Microsoft’s Copilot, initiatives backed by Silicon Valley tech giants with substantial funding. Yet a notable disruption in the LLM market has come from the Chinese start-up DeepSeek, which created an affordable open-source LLM that has challenged the dominance of Western tech giants and sparked a global reassessment of AI research policies. This shift is especially relevant for smaller countries in South Asia, including India’s neighbours Nepal, Bhutan, Bangladesh, Pakistan, Myanmar, and Sri Lanka.
DeepSeek’s success demonstrates that the AI race is not limited to countries with vast technological and financial resources. However, DeepSeek’s AI models are not entirely unconstrained in what they will analyze or discuss. Owing to China’s strict internet regulations and content controls, these models are designed to avoid politically sensitive topics such as democracy, human rights, or global conflicts. As a result, China’s AI models often reflect the country’s geopolitical interests, shaping user behavior, especially in countries that lack their own AI infrastructure.
India, with its 22 officially recognized languages and numerous dialects, faces both opportunities and challenges in AI development. The country’s linguistic diversity offers a chance to create more inclusive AI systems, but it also poses the challenge of ensuring that AI models can effectively support a wide range of languages. Several years ago, the Indian Institute of Science (IISc), Bangalore, launched the RESPIN project, an initiative to build speech recognition for agriculture and finance for the poor and to make the resulting resources available as digital public goods in the open-source domain. The project covered speech recognition in nine Indian languages: Hindi, Bengali, Marathi, Telugu, Bhojpuri, Kannada, Magadhi, Chhattisgarhi, and Maithili. It required substantial expert input, including assessment of sentence quality in each language and of the pronunciation of the speakers whose voices were recorded.
To create indigenous LLMs that cater to India’s linguistic pluralism, it is crucial to ensure that AI systems are culturally relevant and accessible to a broader population. By providing services in multiple languages, such models could improve public services, enhance educational resources, and facilitate more effective communication. Furthermore, developing homegrown LLMs would help preserve linguistic heritage and promote digital inclusivity. Geopolitically, investing in domestic LLMs would reduce dependence on foreign technologies, thereby bolstering national security and technological sovereignty.
Many global AI models struggle to support smaller languages. While Chinese models prioritize Mandarin and regional dialects, and Western models mainly focus on English and other widely spoken global languages, an AI model developed in India would likely include languages like Nepali, Maithili, Bhojpuri, Awadhi, and Newar, among others. Such a model would increase the contextual relevance and accuracy of voice recognition, natural language processing (NLP), machine translation, and AI-driven content creation.
At this juncture, South Asian countries need to collaborate under Indian leadership to develop autonomous LLMs that respect their unique languages and governance systems, ensuring a fair and open AI ecosystem. Relying on censored AI models risks distorting objective knowledge, deepening foreign digital dependencies, and eroding regional technological sovereignty.
(The writer is an Assistant Professor and Course Coordinator of Digital Media at IIMC, Jammu)