Glocal Evaluation of Models (GEM)

A global, yet local approach to evaluation of AI models for languages

The world's 8 billion people speak more than 7,100 languages, yet AI cannot converse fluently or take action in roughly 98% of them, leaving billions of people poorly served. We aim to fix that.

The Vision for GEM

We want to build a transparent, scalable, and inclusive benchmarking framework that measures and improves the performance of AI across diverse domains and multiple modalities in every language.


Capturing this linguistic diversity is hard for the companies building LLMs, speech recognition, and translation models. The resulting paucity of high-quality, diverse training data leads to poor model performance, which means that incredible tools like GPT-4 and Gemini, trained primarily on Internet content from the world's richest economies, do not work particularly well for writers and speakers of under-resourced languages.

This effort is an attempt to create transparency and competition within the AI community and among its potential users, highlighting and improving the state of AI for every language and speaker. Through the proven power of open, transparent competition, we hope to prod private, government, and public technology makers to build ever-better AI models for global language understanding. We take inspiration from other AI leaderboards such as Hugging Face's Open LLM Leaderboard. As technology makers publish ever-improving AI models each month, we also hope to enable organizations to quickly benchmark new models against their own data so they can make informed price, speed, and performance decisions.

Learn More

Email jankibaat@peopleplus.ai to sign up for our Pilot Cohort & help build this infrastructure together.

Who is this For?

Model Builders

People and organisations building global AI models want to know how good those models are. Extensive evaluation lets model builders demonstrate a model's strengths and identify where to improve it.

Model Users

People and organisations that want to use global language AI models need to know which model fits their particular use case, for example organisations inside and outside India working with low-resource populations that need an AI bot to understand users in their own language.

Research Groups

Research groups want to uncover the big, unsolved gaps that remain in today's models. Our open, Indic-language-specific evaluation methods can also inform better approaches to evaluation in general.

View the Rankings

PARIKSHA

(Others Coming Soon)

Introducing PARIKSHA by Microsoft Research India

Who's behind this initiative?

What’s the evaluation method?

Evaluating AI models is a hard problem, especially for Indic and African languages. Any evaluation framework will be imperfect to start with, so we treat this as an iterative framework built on the following principles:

Initial set of Curated Prompts

Collaboration with experts through Microsoft Research India's PARIKSHA platform has allowed us to create relevant and challenging prompts, ensuring assessments go beyond accuracy to include fairness and diversity.
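
To make this concrete, here is a minimal sketch of how a curated prompt entry could be represented, assuming a simple schema. The `CuratedPrompt` class and its field names (`language`, `domain`, `axes`) are illustrative assumptions, not PARIKSHA's actual format.

```python
# Hypothetical schema for a curated prompt entry -- field names are
# illustrative assumptions, not the actual PARIKSHA format.
from dataclasses import dataclass, field

@dataclass
class CuratedPrompt:
    prompt_id: str
    text: str
    language: str                  # e.g. "hi" (Hindi), "sw" (Swahili)
    domain: str                    # e.g. "agriculture", "healthcare"
    axes: list[str] = field(default_factory=lambda: ["accuracy"])

prompts = [
    CuratedPrompt(
        prompt_id="hi-agri-001",
        text="किसान को मानसून से पहले कौन सी फसल बोनी चाहिए?",
        language="hi",
        domain="agriculture",
        axes=["accuracy", "fairness", "cultural_relevance"],
    ),
]

# Curators can slice the pool by language, domain, or evaluation axis.
hindi_prompts = [p for p in prompts if p.language == "hi"]
```

Tagging each prompt with evaluation axes beyond accuracy is what lets the same prompt pool support fairness and diversity assessments.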

Transparent & Open Methodology

We detail our evaluation criteria and process, allowing for clear understanding and replicability. The ranking may not be perfect, but it will be transparent: we rank models by current performance and publish the outcomes openly.
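
As an illustration of what a transparent, replicable ranking can look like, the sketch below ranks models by win rate over pairwise comparison outcomes. The data and scoring rule are assumed toy examples, not necessarily the exact method GEM or PARIKSHA uses.

```python
# Toy example: rank models by win rate from pairwise comparison outcomes.
# The data and the scoring rule are illustrative, not GEM's actual method.
from collections import defaultdict

# (winner, loser) pairs from hypothetical pairwise evaluations.
battles = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_c", "model_b"),
]

wins: dict[str, int] = defaultdict(int)
games: dict[str, int] = defaultdict(int)
for winner, loser in battles:
    wins[winner] += 1
    games[winner] += 1
    games[loser] += 1

# Sort by win rate, highest first; ties are left unbroken for simplicity.
leaderboard = sorted(games, key=lambda m: wins[m] / games[m], reverse=True)
for rank, model in enumerate(leaderboard, start=1):
    print(f"{rank}. {model}: {wins[model] / games[model]:.2f}")
```

Because both the comparison data and the scoring rule are published, anyone can re-run the computation and verify the ranking.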

Diversity of curation by design

To cater to the different needs of different languages, modalities, and evaluation frameworks, the GEM Leaderboard is managed by multiple curators.

Designed as an Iterative Process

No evaluation is perfect. Acknowledging that benchmarks are imperfect, we incorporate regular updates and feedback loops to continuously refine the evaluation process.

Extensible Evaluation with Remix

The system is designed for customization, allowing organizations to conduct their own evaluations using the latest models from major tech companies, tailored to their specific needs.
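
A minimal sketch of what such a pluggable harness might look like follows. The `ModelClient` protocol, `evaluate` helper, and `EchoModel` are hypothetical names for illustration; GEM's actual interfaces may differ.

```python
# Hedged sketch of an extensible evaluation harness: the caller supplies
# both the model client and the scoring function. All names here are
# hypothetical, not a published GEM API.
from typing import Callable, Iterable, Protocol


class ModelClient(Protocol):
    def complete(self, prompt: str) -> str: ...


def evaluate(
    model: ModelClient,
    prompts: Iterable[str],
    score: Callable[[str, str], float],
) -> float:
    """Average a caller-supplied score over (prompt, response) pairs."""
    scores = [score(p, model.complete(p)) for p in prompts]
    return sum(scores) / len(scores)


# Example plug-in: a trivial model and a placeholder non-empty-response metric.
class EchoModel:
    def complete(self, prompt: str) -> str:
        return prompt


average = evaluate(
    EchoModel(),
    prompts=["Translate 'hello' into Tamil."],
    score=lambda prompt, response: float(bool(response.strip())),
)
print(f"average score: {average:.2f}")
```

Swapping `EchoModel` for a client that calls a commercial API is all an organization would need to benchmark a new model on its own prompts.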

Frequently Asked Questions

What makes our leaderboard different?

How often is the leaderboard updated?

Who will run the evaluations?

Get Involved

Join the Community

Have questions or want to contribute?

Reach out to us at contact@aha-leaderboard.ai and join our mission to elevate Global AI Models.

Enroll and Share Your Big Idea

MODEL BUILDERS

Apply here

Test your Indic LLMs against our benchmarks to gauge and improve their performance.

CURATORS

Contact Here

Contribute your knowledge as individuals or organisations to help refine evaluation & prompts, improving relevance & comprehensiveness.

EVERYONE

Sign up for Event

Sign up for the first Global AI Models workshop happening in the first week of May.

Join the Community

People+ai is a non-profit housed within the EkStep Foundation. Our work is designed around the belief that technology, especially AI, will cause paradigm shifts that can help India & its people reach their potential.

Curated by EkStep Foundation
