
No Language Left Behind: How Meta AI’s System Could Safeguard Endangered Languages


Meta’s No Language Left Behind project (NLLB-200) will make Facebook and Instagram posts available in 200 low-resource languages, including Scottish Gaelic, Welsh, Bosnian, Kamba, Lao, and other languages spoken mostly in Asia, Africa, and Europe. The NLLB-200 project will also make it possible to translate Wikipedia articles into the featured languages, thanks to Meta’s partnership with Wikipedia, the open knowledge platform.

The project aims to foster better connections among people through the power of a shared, mutually understood language. Meta AI’s researchers have described their method for improving translation in low-resource and endangered languages.

This involves crowd-sourcing data from platforms such as local-language Wikipedia editions with fewer than 1 million articles and assembling it into a multilingual dataset. The NLLB-200 model is based on a “Sparse Mixture-of-Experts” architecture and is trained on data obtained with new mining techniques tailored for low-resource languages.
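For readers curious about what a “Sparse Mixture-of-Experts” layer does in general, the idea can be sketched in a few lines of Python. This is a toy illustration of the general technique, not Meta’s implementation: the function names (`sparse_moe`, `make_expert`), the gate weights, and the choice of two active experts per input are all assumptions made for this example. A small gating function scores every expert, only the top-scoring few are run, and their outputs are blended by their gate probabilities.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical stand-in for an expert sub-network: it scales its input."""
    return lambda x: [scale * v for v in x]


def sparse_moe(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only.

    gate_weights holds one weight vector per expert (an assumption of this
    sketch); an expert's gate score is the dot product of its vector with x.
    """
    scores = [sum(w * v for w, v in zip(weights, x)) for weights in gate_weights]
    # Pick the top_k experts by gate score; the others are never evaluated,
    # which is what makes the layer "sparse".
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Renormalize the gate probabilities over the chosen experts only.
    probs = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for p, i in zip(probs, chosen):
        expert_out = experts[i](x)
        output = [o + p * e for o, e in zip(output, expert_out)]
    return output, chosen


if __name__ == "__main__":
    experts = [make_expert(1.0), make_expert(2.0), make_expert(3.0)]
    gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    y, used = sparse_moe([2.0, 1.0], gate_weights, experts)
    print("experts used:", used)
    print("output:", y)
```

Because only a few experts run for any given input, a model built this way can hold a large number of parameters overall while keeping the cost of each translation comparatively low.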

This development would allow for the inclusion of lesser-spoken languages often neglected when training large language models.

Read more about how Meta AI’s NLLB-200 model could safeguard endangered languages.
