India is emerging as a significant player in the global artificial intelligence landscape, and Sarvam AI is one of the companies leading that push. Founded to build full-stack generative AI solutions tailored for India, the startup embodies the country's ambition to create sovereign AI capabilities rooted in its linguistic and cultural diversity.
The Genesis of Sarvam AI
Sarvam AI was co-founded in August 2023 by Dr. Vivek Raghavan and Dr. Pratyush Kumar [3, 4]. Raghavan brings extensive experience in building India's digital public infrastructure, a relevant background for a company that aims to build AI for population-scale applications [3]. Kumar has emphasized the need for India to develop its own sovereign Large Language Models (LLMs) [2]. Together, they set Sarvam AI on a mission to develop AI that is both technologically advanced and culturally and linguistically relevant to India.
A Focus on Indic Languages and Local Context
Sarvam AI's most distinctive feature is its focus on Indic languages. Where most global AI models are designed for generic use and trained primarily on English data, Sarvam AI builds models that support many Indian languages alongside English [9, 10]. This approach is key to making AI useful to India's vast, multilingual population.
Sarvam AI's commitment to Indic languages is evident in its various models and platforms. The following table summarizes the key language models released by the company:
| Model Name | Parameters | Primary Focus | Key Features |
|---|---|---|---|
| OpenHathi | 7B | Hindi, English, Hinglish | Sarvam AI's first model, built on Llama 2; handles code-mixing [7, 8]. |
| Sarvam-1 | 2B / 3B | 10 Indian Languages | High-performance small model for mobile/edge use [9, 10]. |
| Sarvam-30B | 30B | Indic + Reasoning | Mixture-of-Experts (MoE) for cost-efficiency [13, 15]. |
| Sarvam-105B | 105B | Enterprise / Flagship | SOTA reasoning and multilingual capabilities [12, 14]. |
| Sarvam-M | Variable | Open Source Hybrid | Cross-lingual functionality for search and retrieval [6]. |
Model Breakdown
- OpenHathi: This model is a step towards LLMs that understand and generate code-mixed text, such as Hinglish, which is common in India [7, 8].
- Sarvam-1: Launched in October 2024, Sarvam-1 was built to support 10 major Indian languages alongside English. The company highlights that careful curation of training data is key to its performance across these diverse languages [9].
- Sarvam-M: An open-source hybrid Indic LLM, Sarvam-M functions cross-lingually, allowing it to generate English Wikipedia queries regardless of the input language [6].
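Code-mixing, mentioned above for OpenHathi, is easier to picture with an example. The following is a minimal sketch, not anything Sarvam AI ships: it flags text that mixes Devanagari and Latin scripts by inspecting Unicode character names. The function names are hypothetical, and the check is deliberately naive, since romanized Hinglish is code-mixed but uses only one script.

```python
import unicodedata
from typing import Optional, Set


def script_of(ch: str) -> Optional[str]:
    """Return a coarse script label for a letter, or None for non-letters."""
    if not ch.isalpha():
        return None  # spaces, digits, punctuation and vowel signs are ignored
    name = unicodedata.name(ch, "")
    if name.startswith("DEVANAGARI"):
        return "Devanagari"
    if name.startswith("LATIN"):
        return "Latin"
    return "Other"


def scripts_in(text: str) -> Set[str]:
    """Collect the set of scripts used by the letters in the text."""
    return {label for label in map(script_of, text) if label is not None}


def is_script_mixed(text: str) -> bool:
    """True when the text contains letters from more than one script."""
    return len(scripts_in(text)) > 1


print(is_script_mixed("मुझे यह movie बहुत पसंद आई"))   # Devanagari plus Latin
print(is_script_mixed("mujhe yeh movie pasand aayi"))  # romanized, Latin only
```

The second sentence shows the limit of a script-based check: it is Hinglish, yet every letter is Latin, which is part of why handling code-mixing calls for a language model rather than a rule.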
Breakthrough Models: Sarvam-30B and Sarvam-105B
Sarvam AI has recently made significant strides with the introduction of its 30B and 105B parameter models, marking a pivotal moment in India's sovereign AI journey [11, 12].
- Sarvam-30B: Designed as a lightweight, cost-efficient model, Sarvam-30B supports a context length of up to 32,000 tokens and uses a Mixture-of-Experts (MoE) design to keep inference costs down [13, 15].
- Sarvam-105B: This flagship large language model is engineered for enterprise-grade applications and targets state-of-the-art performance across Indian languages. It also uses an MoE architecture for efficiency [14, 15]. Sarvam AI claims that the 105B model outperforms DeepSeek R1, a 671-billion-parameter model, and Google's Gemini 2.5 on certain benchmarks, particularly in reasoning, coding, and multilingual tasks [12, 15]. If those results hold up under independent evaluation, they suggest Sarvam AI can compete with much larger models.
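To illustrate why a Mixture-of-Experts design can be cheaper to run than a dense model of the same size, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration of the technique, not Sarvam AI's architecture: the number of experts, the value of k, the gate scores, and the toy scalar "experts" are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_logits, k=2):
    """Weighted sum of the outputs of only the selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_logits, k))


# Toy experts: each one scales its scalar input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 1.5]  # hypothetical router scores for one token

selected = route_top_k(gate_logits, k=2)
active_fraction = len(selected) / len(experts)
output = moe_forward(1.0, experts, gate_logits, k=2)
print(selected, active_fraction, output)
```

Only two of the four toy experts run for this token, so the compute per token scales with k rather than with the total number of experts. That is the general sense in which MoE models trade total parameter count against inference cost.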
Voice AI and Developer Platform
Beyond LLMs, Sarvam AI is also investing in Voice AI, in response to India's fast-growing adoption of voice interfaces [10]. Its platforms include:
- Text-to-Speech (TTS) API: Powered by Bulbul v3, this API supports 11 languages, offering natural-sounding voices in Hindi, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, and Odia, as well as English [16].
- Speech-to-Text (STT) API: Sarvam's STT API supports 22 Indian languages, among them Hindi, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, and Odia, alongside English. It is built on a 2B-parameter architecture and can handle code-mixing, a common phenomenon in Indian speech [17, 18].
Sarvam AI also provides a robust developer platform with comprehensive API documentation and SDKs (e.g., Python SDK) to enable developers to integrate their AI capabilities into various applications [19, 20]. This focus on an accessible developer ecosystem is vital for fostering innovation and widespread adoption of their AI solutions.
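As a rough picture of what integrating a TTS service involves, the sketch below validates a requested language against the TTS languages listed in this article and builds an HTTP request without sending it. Everything API-specific here is an assumption for illustration: the endpoint is a placeholder, and the payload fields (`text`, `language`) and the bearer-token header are hypothetical. The actual endpoint, parameters, and authentication scheme are defined in the official API documentation and SDK [19, 20].

```python
import json
import urllib.request

# TTS languages as listed in this article.
TTS_LANGUAGES = {
    "Hindi", "Bengali", "Tamil", "Telugu", "Gujarati", "Kannada",
    "Malayalam", "Marathi", "Punjabi", "Odia", "English",
}

# Placeholder only; not the real Sarvam AI endpoint.
PLACEHOLDER_ENDPOINT = "https://api.example.invalid/text-to-speech"


def build_tts_request(text, language, api_key):
    """Build (but do not send) a JSON POST request for speech synthesis."""
    if language not in TTS_LANGUAGES:
        raise ValueError(f"unsupported TTS language: {language}")
    if not text.strip():
        raise ValueError("text must not be empty")
    # Hypothetical payload shape; field names are assumptions.
    body = json.dumps(
        {"text": text, "language": language}, ensure_ascii=False
    ).encode("utf-8")
    return urllib.request.Request(
        PLACEHOLDER_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # hypothetical auth scheme
        },
        method="POST",
    )


request = build_tts_request("नमस्ते", "Hindi", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Validating the language before building the request keeps obvious mistakes on the client side, where they are cheap to catch. In practice the Python SDK would likely replace this hand-built request.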
Funding and Future Outlook
Sarvam AI has attracted significant investment, a sign of investor confidence in its vision and capabilities. The company announced $41 million in Series A funding in December 2023, led by Lightspeed Venture Partners with participation from Peak XV Partners and Khosla Ventures [11, 21]. In total, Sarvam AI has raised more than $40 million since its founding in 2023 [11].
Sarvam AI's journey is a testament to India's growing prowess in the AI domain. By focusing on indigenous language support, developing high-performance models, and building a comprehensive AI ecosystem, Sarvam AI is not just creating technology; it is empowering India to build its own digital future, ensuring that the benefits of generative AI are accessible and relevant to every Indian.
Sources & References
1. Vivek Raghavan - Building Full Stack GenAI - LinkedIn
2. Exclusive: Dr Pratyush Kumar, Co-Founder, Sarvam AI - YouTube
3. About Us - Sarvam AI
4. India's Sarvam rolls out new AI model: Can it rival ChatGPT, Gemini?
5. sarvam-launch - Sarvam AI
6. Sarvam-M: Open Source Hybrid Indic LLM
7. Analysis of Indic Language Capabilities in LLMs - arXiv
8. sarvamai/OpenHathi-7B-Hi-v0.1-Base - Hugging Face
9. Sarvam 1: The first Indian language LLM
10. Sarvam AI Launches 3B Parameter Model for Indian Languages
11. Indian AI lab Sarvam's new models are a major bet on the viability of open source AI
12. Sarvam rolls out 105-bn parameter AI LLM model - The Times of India
13. Sarvam And The Sovereign AI Dream - Inc42
14. Sarvam Models
15. Sarvam AI launches 30B and 105B models
16. Voices that feel natural across India's languages
17. Speech to Text API
18. Sarvam AI has launched an audio-first model
19. Developer Quickstart | Sarvam API Docs
20. Getting Started with Sarvam AI: A Complete Python SDK Guide
21. Announcing Series A - Sarvam AI