Sarvam AI: Powering India's Generative AI Revolution with Homegrown LLMs

India is rapidly emerging as a significant player in the global artificial intelligence landscape, and at the forefront of this revolution is Sarvam AI. Founded with a vision to build full-stack generative AI solutions tailored for India, Sarvam AI is not just another tech startup; it represents India's ambition to create its own sovereign AI capabilities, deeply rooted in the country's linguistic and cultural diversity.

The Genesis of Sarvam AI

Sarvam AI Founders: Vivek Raghavan and Pratyush Kumar

Sarvam AI was co-founded in August 2023 by Dr. Vivek Raghavan and Dr. Pratyush Kumar [3, 4]. Raghavan brings extensive experience in building India's digital public infrastructure, a crucial background for a company aiming to create AI for population-scale applications [3]. Kumar has emphasized the need for India to develop its own sovereign Large Language Models (LLMs) [2]. Their combined expertise laid the foundation for Sarvam AI's mission: AI solutions that are technologically advanced and also culturally and linguistically relevant to India.

A Focus on Indic Languages and Local Context

One of Sarvam AI's most distinctive features is its profound focus on Indic languages and the unique linguistic diversity of India. Unlike global AI models that are often designed for generic use and primarily trained on English data, Sarvam AI is building models from the ground up to support a multitude of Indian languages alongside English [9, 10]. This approach is crucial for enabling AI to penetrate deeper into Indian society and serve its vast, multilingual population.

Sarvam AI's commitment to Indic languages is evident in its various models and platforms. The following table summarizes the key language models released by the company:

| Model Name | Parameters | Primary Focus | Key Features |
| --- | --- | --- | --- |
| OpenHathi | 7B | Hindi, English, Hinglish | Sarvam AI's first model, built on Llama 2; handles code-mixing [7, 8]. |
| Sarvam-1 | 2B / 3B | 10 Indian languages | High-performance small model for mobile/edge use [9, 10]. |
| Sarvam-30B | 30B | Indic + reasoning | Mixture-of-Experts (MoE) for cost-efficiency [13, 15]. |
| Sarvam-105B | 105B | Enterprise / flagship | SOTA reasoning and multilingual capabilities [12, 14]. |
| Sarvam-M | Variable | Open-source hybrid | Cross-lingual functionality for search and retrieval [6]. |

Model Breakdown

  • OpenHathi: A 7B model built on Llama 2, OpenHathi targets code-mixed text, in which speakers switch between Hindi and English within a single sentence (Hinglish), as is common in India [7, 8].
  • Sarvam-1: Launched in October 2024, Sarvam-1 was built to support 10 major Indian languages alongside English. The company highlights that careful curation of training data is key to its performance across these diverse languages [9].
  • Sarvam-M: An open-source hybrid Indic LLM, Sarvam-M functions cross-lingually, allowing it to generate English Wikipedia queries regardless of the input language [6].
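To make "code-mixing" concrete, here is a minimal Python sketch that flags a sentence as code-mixed when its words use more than one script. It is an illustration only and is not how OpenHathi or any Sarvam model processes text; the function names `script_of` and `is_code_mixed` are hypothetical. It also has a clear limitation: Hinglish typed entirely in Latin letters looks like a single script and is not flagged.

```python
import unicodedata


def script_of(token: str) -> str:
    """Return a coarse script label for a token, based on its letters only."""
    scripts = set()
    for ch in token:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and combining vowel signs
        name = unicodedata.name(ch, "")
        if name.startswith("DEVANAGARI"):
            scripts.add("devanagari")
        elif name.startswith("LATIN"):
            scripts.add("latin")
        else:
            scripts.add("other")
    if not scripts:
        return "none"
    return scripts.pop() if len(scripts) == 1 else "mixed"


def is_code_mixed(sentence: str) -> bool:
    """True if the sentence contains words written in more than one script."""
    labels = {script_of(tok) for tok in sentence.split()}
    labels.discard("none")
    return len(labels) > 1 or "mixed" in labels


hindi_english = "मुझे यह movie बहुत पसंद आई"
english_only = "I liked this movie"
print(is_code_mixed(hindi_english), is_code_mixed(english_only))
```

A model trained for code-mixed input has to handle both cases, including the romanized one that this script check misses.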

Breakthrough Models: Sarvam-30B and Sarvam-105B

Sarvam AI has more recently introduced 30B- and 105B-parameter models, which it presents as a pivotal moment in India's sovereign AI journey [11, 12].

  • Sarvam-30B: Designed as a lightweight, cost-efficient model, Sarvam-30B uses a Mixture-of-Experts (MoE) design and supports a context length of up to 32,000 tokens [13].
  • Sarvam-105B: This flagship large language model is engineered for enterprise-grade applications and targets state-of-the-art performance across Indian languages. It uses a mixture-of-experts (MoE) architecture, in which only a subset of the model's parameters is active for each token, for higher efficiency [14, 15]. Notably, Sarvam AI claims that its 105B model outperforms the 671-billion-parameter DeepSeek R1 model and Google's Gemini 2.5 on certain benchmarks, particularly in reasoning, coding, and multilingual tasks [12, 15]. If these claims hold up, they suggest that Sarvam AI can compete with global giants on performance even with a smaller parameter count.
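The efficiency argument behind a mixture-of-experts design can be sketched in a few lines of Python. This is a generic top-k routing toy, not Sarvam AI's architecture: the number of experts, the value of `k`, the gate scores, and the scalar "experts" are all assumptions chosen for illustration. The point it shows is that only the `k` selected experts run for a given input, so compute per token grows with `k` rather than with the total number of experts.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_forward(x, experts, gate_scores, k=2):
    """Weighted sum of the outputs of only the k selected experts."""
    weights = route_top_k(gate_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Toy experts: scalar functions standing in for feed-forward networks.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate scores; in a real model a learned layer produces these.
gate_scores = [0.1, 2.0, -1.0, 1.5]

weights = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(sorted(weights), output)
```

With four experts and `k=2`, half of the expert parameters sit idle for this input, which is the source of the cost savings that MoE models advertise.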

Voice AI and Developer Platform

Beyond LLMs, Sarvam AI also builds voice AI products, pointing to India's booming voice AI adoption [10]. Its speech APIs include:

  • Text-to-Speech (TTS) API: Powered by Bulbul v3, this API supports 11 languages, offering natural-sounding voices in Hindi, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Odia, and English [16].
  • Speech-to-Text (STT) API: Sarvam's STT API supports 22 Indian languages, including Hindi, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, and Odia, as well as English. It is built on a 2B-parameter architecture and can handle code-mixing, which is common in Indian speech [17, 18].

Sarvam AI also provides a developer platform with API documentation and SDKs, such as a Python SDK, so that developers can integrate its models into their own applications [19, 20]. This focus on an accessible developer ecosystem is vital for fostering innovation and widespread adoption of its AI solutions.
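As a small illustration of the kind of glue code an integration might need, the Python sketch below maps the 11 TTS languages named in this article to standard BCP-47 language tags and validates a language choice before any request is made. It deliberately calls no Sarvam API: the helper name `language_tag` is hypothetical, and whether the service expects exactly these tag strings is an assumption to check against the official documentation.

```python
# Languages named for the TTS API in this article, mapped to BCP-47 tags.
# The tags are standard, but the exact strings the API accepts are an assumption.
TTS_LANGUAGES = {
    "Hindi": "hi-IN",
    "Bengali": "bn-IN",
    "Tamil": "ta-IN",
    "Telugu": "te-IN",
    "Gujarati": "gu-IN",
    "Kannada": "kn-IN",
    "Malayalam": "ml-IN",
    "Marathi": "mr-IN",
    "Punjabi": "pa-IN",
    "Odia": "or-IN",
    "English": "en-IN",
}


def language_tag(name: str) -> str:
    """Return the language tag for a language name, or raise ValueError."""
    try:
        return TTS_LANGUAGES[name.strip().title()]
    except KeyError:
        raise ValueError(f"{name!r} is not in the TTS language list") from None


print(language_tag("tamil"))
```

Failing fast on an unsupported language keeps the error close to the user's input instead of surfacing later as a rejected request.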

Funding and Future Outlook

Sarvam AI has attracted significant investment, underscoring investor confidence in its vision and capabilities. The company secured $41 million in Series A funding in December 2023, led by Lightspeed Venture Partners, with participation from Peak XV Partners and Khosla Ventures [11, 21]. Overall, Sarvam AI has raised more than $40 million since its founding in 2023 [11].

Sarvam AI's journey is a testament to India's growing prowess in the AI domain. By focusing on indigenous language support, developing high-performance models, and building a comprehensive AI ecosystem, Sarvam AI is not just creating technology; it is empowering India to build its own digital future, ensuring that the benefits of generative AI are accessible and relevant to every Indian.
