NANDA: The Future of Hindi AI and Language Inclusivity
In today’s digital world, artificial intelligence (AI) has the power to bridge cultural and language gaps, and G42’s NANDA, a Hindi-centric large language model (LLM), aims to do exactly that. Developed by G42 in collaboration with Inception and Cerebras Systems, NANDA was unveiled at the India-UAE business forum in Mumbai. This blog post explores the technical features of NANDA, its role in fostering language inclusivity, and its potential impact on the global AI landscape.

What is NANDA?
NANDA is a 13-billion-parameter AI model that has been trained on approximately 2.13 trillion tokens of language data, including Hindi. Named after one of India’s highest mountain peaks, NANDA was developed with the intention of catering specifically to Hindi-speaking users. The AI model was trained using Condor Galaxy, one of the world’s most powerful supercomputers, built by G42 and Cerebras Systems. According to Dr. Andrew Jackson, acting CEO of Inception, NANDA exemplifies the commitment of G42 to AI inclusivity, particularly for underrepresented languages such as Hindi.
Technical Features of NANDA
1. Robust Training on Large Datasets
NANDA’s training on approximately 2.13 trillion tokens gives it a strong foundation for understanding and generating Hindi content. A dataset of this size exposes the model to linguistic nuances, idiomatic expressions, and regional dialects in Hindi, which could give it an edge over general-purpose LLMs on Hindi-language tasks.
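To put these numbers in perspective, here is a minimal Python sketch that divides the token count by the parameter count, using only the two figures quoted in this post. The function name `tokens_per_parameter` is ours, chosen for illustration, and the resulting ratio is a back-of-the-envelope figure rather than an official statistic from G42.

```python
# Figures quoted in this post; the ratio is derived here, not an official statistic.
PARAMETERS = 13e9          # 13 billion parameters
TRAINING_TOKENS = 2.13e12  # approximately 2.13 trillion tokens


def tokens_per_parameter(tokens: float, parameters: float) -> float:
    """Return how many training tokens the model saw per parameter."""
    if parameters <= 0:
        raise ValueError("parameters must be positive")
    return tokens / parameters


if __name__ == "__main__":
    ratio = tokens_per_parameter(TRAINING_TOKENS, PARAMETERS)
    print(f"Tokens per parameter: {ratio:.0f}")
```

With the figures above, this works out to roughly 164 training tokens for every parameter in the model.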
2. Powered by Condor Galaxy Supercomputer
The model was trained using the Condor Galaxy, one of the most powerful AI supercomputers in the world. This computational infrastructure, built by G42 in partnership with Cerebras Systems, provides NANDA with the processing power required to train on such a large dataset effectively and efficiently.
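For a rough sense of why a supercomputer is needed, the sketch below estimates the total training compute from the parameter and token counts quoted in this post. It assumes the widely used rule of thumb of about 6 floating-point operations per parameter per training token for dense transformer models. That constant is our assumption, not a figure published by G42 or Cerebras, and the function name `estimate_training_flops` is hypothetical, so treat the result as an order-of-magnitude estimate only.

```python
def estimate_training_flops(
    parameters: float,
    tokens: float,
    flops_per_param_token: float = 6.0,  # assumed rule of thumb, not an official figure
) -> float:
    """Approximate total training compute as constant * parameters * tokens."""
    return flops_per_param_token * parameters * tokens


if __name__ == "__main__":
    # 13 billion parameters and about 2.13 trillion tokens, as quoted in this post.
    flops = estimate_training_flops(parameters=13e9, tokens=2.13e12)
    print(f"Estimated training compute: {flops:.2e} FLOPs")
```

Under that assumption, the estimate comes to about 1.66e23 floating-point operations, which helps explain why training at this scale calls for dedicated AI infrastructure.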
3. Cutting-Edge Collaboration
NANDA is the result of a collaboration between Inception, a G42 subsidiary, and Cerebras Systems, along with the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in the UAE. This international partnership highlights the global scope of AI research and development in bringing language-specific models to fruition.
Language Inclusivity: The Need for Hindi-Specific AI
1. Underrepresentation of Hindi in AI
While Hindi is the third most spoken language globally, it remains significantly underrepresented in the digital and AI landscape. Most existing language models, including GPT-3 and Google’s BERT, primarily cater to English and other major global languages. By focusing on Hindi, NANDA aims to give Hindi speakers access to AI-driven solutions that reflect their linguistic and cultural needs.
2. Empowering Hindi-Speaking Communities
NANDA plays a pivotal role in empowering the nearly 600 million Hindi speakers worldwide by offering them AI tools in their native language. From chatbots and customer service agents to education and healthcare solutions, Hindi-centric AI opens the door to technological solutions tailored to local needs.
Comparison: NANDA vs. JAIS
G42 is no stranger to the development of language-specific models. In August 2023, it launched JAIS, the world’s first open-source Arabic LLM. The JAIS family, with models ranging from 590 million to 70 billion parameters, set a new benchmark for regional language models, and G42 is seeking to replicate this success with NANDA.
While both models aim to serve underrepresented linguistic communities, NANDA specifically targets the Hindi-speaking population, providing tools and resources for a language that is central to India’s culture and heritage.
Microsoft’s Role in G42’s AI Development
Microsoft’s recent $1.5 billion investment in G42 further highlights the tech giant’s commitment to advancing AI technology in regions beyond the West. In particular, this partnership enables G42 to scale its AI initiatives, including the development of specialized LLMs like NANDA and its healthcare-focused counterpart, Med42. This collaboration ensures that G42’s AI models have the necessary infrastructure and resources to be widely deployed.
Applications of NANDA: Real-World Impact
1. E-Commerce and Customer Support
One of the most immediate applications of NANDA is in the e-commerce sector. As more Indian consumers shift to online shopping, customer support solutions powered by NANDA can cater specifically to Hindi speakers, enhancing their shopping experience by providing personalized assistance in their native language.
2. Education and E-Learning
NANDA’s ability to understand and generate Hindi content can be a game-changer for e-learning platforms. As India expands its digital infrastructure, AI models like NANDA can support educational tools that make learning accessible to millions of students across the country, particularly in rural areas where Hindi is the primary language.
3. Healthcare
Similar to G42’s Med42, which is tailored for clinicians and healthcare professionals, NANDA can play a significant role in improving healthcare access for Hindi-speaking patients. From automated appointment scheduling to providing information about medical conditions, NANDA can assist in creating Hindi-speaking AI assistants that improve patient care.
The Global Impact: NANDA’s Role in AI Inclusivity
NANDA’s introduction marks a significant step toward making AI more inclusive for non-English-speaking populations. As the world becomes more digitally interconnected, the need for AI models that understand and cater to regional languages becomes increasingly critical. NANDA is expected to set a precedent for the creation of language-specific AI models in other underrepresented languages, including Indian languages like Tamil, Telugu, and Bengali.
Challenges and Ethical Considerations
1. Bias in Language Models
Despite their potential, language-specific AI models like NANDA come with ethical challenges. AI models are only as good as the data they are trained on, and biases inherent in the training data can lead to skewed outputs. Ensuring that NANDA is trained on diverse, representative datasets is crucial to avoid reinforcing existing linguistic or cultural biases.
2. Privacy and Data Security
With AI becoming increasingly integrated into public services and industries like healthcare, ensuring data privacy and security is more important than ever. G42’s commitment to data privacy will be key in determining the long-term success and adoption of NANDA.
The Future of Language-Based AI: A New Era of Inclusivity
NANDA sets a powerful example of how AI can be used to foster linguistic inclusivity and cultural representation in the digital age. As AI continues to evolve, more models like NANDA will likely emerge, each focusing on different languages and regions to ensure that technology is accessible to everyone.
1. Expansion to Other Indian Languages
With NANDA focusing on Hindi, it’s likely only a matter of time before similar models are developed for other Indian languages. India’s constitution recognizes 22 official (scheduled) languages, and the creation of AI models that cater to these languages could revolutionize how Indians interact with technology in their day-to-day lives.
2. AI in Governance and Public Services
AI models like NANDA can have a profound impact on e-governance and public service delivery. Government websites, helplines, and public information portals powered by NANDA could make these services more accessible to Hindi-speaking citizens, particularly those in rural and underserved areas.
Conclusion: NANDA and the Future of AI
NANDA represents a significant leap forward in AI inclusivity for Hindi speakers. By prioritizing underrepresented languages, G42 is helping to ensure that AI technologies are more accessible, equitable, and useful for all. As the AI landscape continues to evolve, models like NANDA will play an essential role in bridging the digital divide and ensuring that language is no longer a barrier to technological innovation.