India’s Sarvam AI beats Google Gemini and ChatGPT in latest benchmarks
The big news is that India’s Sarvam AI has beaten Google Gemini and ChatGPT on specific, highly technical benchmarks that matter for the Indian context. The company has released two tools, Sarvam Vision and Bulbul V3, which are outperforming the global giants at reading and speaking Indian languages. This isn’t just a minor victory; it is a signal that local, purpose-built models can be more efficient and accurate than “one-size-fits-all” global systems.
For the longest time, whenever we talked about foundational AI models, the conversation usually started in Silicon Valley and ended somewhere in Beijing. India was often seen as a massive consumer of these technologies or a hub for the talent that builds them elsewhere, but rarely as the source of the core tech itself. That narrative is shifting rapidly. Bengaluru-based startup Sarvam AI is proving that building “sovereign AI” isn’t just a buzzword. It is a necessity for a country with our linguistic complexity.
Reading a printed document might seem like a solved problem in tech, but when you introduce Indian languages, complex layouts, and handwritten notes, most global AI models start to struggle. This is where Sarvam Vision comes in. In recent testing on olmOCR-Bench, Sarvam Vision achieved an accuracy score of 84.3 percent. This is a significant milestone because it puts Sarvam Vision ahead of both global rivals on this benchmark: Gemini 3 Pro scored lower, and ChatGPT ranked further behind still.
The tool also showed its strength on OmniDocBench v1.5, which tests how AI handles real-world documents such as technical tables and mathematical formulas. It scored over 93 percent overall. If you have ever tried to get an AI to read a messy government form or a dense textbook page with equations, you know how easily things can go wrong. Sarvam has made real headway on the “messy formatting” problem that often trips up traditional systems. It turns out that focusing specifically on the quirks of Indic scripts and document styles gives you a massive home-court advantage.
Why global experts are changing their minds
Sarvam AI faced its fair share of skepticism early on. Many tech commentators wondered whether it was worth the effort to build smaller, local models when companies like OpenAI and Google have billions of dollars and massive compute power. Even well-known tech observers like Deedy Das recently admitted they were wrong about the company’s direction.
The value of what Sarvam is doing lies in filling the gaps that the big labs ignored. While a model like GPT-4 is amazing at general knowledge, it often misses the nuance of local dialects or the specific way documents are structured in India. By focusing on these “Indic use cases,” Sarvam has created something that is not only better but also more affordable. Pricing is a huge factor here. Using a global API for millions of pages of Indian language OCR can be prohibitively expensive. Sarvam offers a solution that is both more accurate for our needs and far more reasonable for the Indian pocket.

Bulbul V3: Giving AI a natural Indian voice
Reading text is one thing; speaking it naturally is another challenge entirely. Sarvam recently launched Bulbul V3, its latest text-to-speech model. This is India’s answer to global leaders like ElevenLabs. The goal was to create a voice that doesn’t sound like a robotic translation, but rather a natural, expressive, and production-ready voice for Indian languages.
Currently, Bulbul V3 supports over 35 voices across 11 Indian languages, with plans to expand to all 22 official languages of India. Users have already started comparing it to global alternatives, noting that while ElevenLabs is excellent, its cost for Indic languages often doesn’t make sense for local developers. Bulbul provides a stable, content-accurate alternative that feels “at home.” Whether it is for a customer service bot or an educational app, having a voice that sounds like it belongs here is a game-changer for user experience.
The importance of local foundational models
The buzz around the fact that India’s Sarvam AI beats Google Gemini and ChatGPT comes from what it represents for our digital sovereignty. If the future of our economy and education is going to be powered by AI, we cannot rely solely on black-box models controlled by a handful of companies in the US. We need models that understand the way we speak, the way our documents look, and the specific problems we face.
Sarvam AI is showing that we don’t necessarily need to build a model that knows everything about the entire world to be world class. By mastering the Indic landscape, they have created tools that are technically superior in our backyard. The global attention they are receiving now is proof that when you solve a hard, specific problem better than anyone else, the world takes notice.
What this means for the average user
You might be wondering how this tech actually affects you. For the tech-savvy, it means better APIs and cheaper tools to build local apps. For everyone else, it means that the next time you use an app to translate a document or listen to an automated announcement in your local language, it is far more likely to be correct. It means fewer errors in digitizing old records and more natural interactions with digital services.
The performance of these models suggests that the “Western bias” in AI is finally being challenged. When a startup from Bengaluru can step onto the stage and outperform the biggest names in tech on specialized benchmarks, it proves that innovation isn’t just about the size of your data center. It is about how well you understand the data you are feeding into it.







