
Building language models, one story at a time

One-third of the world's languages are spoken in Africa, but less than 1% of African languages are represented online. This is significant because the language you speak, write or sign shapes your online experience. Language is the cornerstone of your identity, the connection to your past and the key to your future. When we can’t experience the internet in our language, it limits what we can learn, what jobs we can have, what stories we can access, and so much more.

In my home country, Mali, eighty percent of the population speaks Bambara as a first or second language. It is also spoken in Burkina Faso, Ivory Coast, Liberia and Guinea, making it one of West Africa's most widely spoken languages. But if Bambara is your primary language, it can be difficult to have an immersive internet experience. That's why I've set out to make the internet more accessible to Bambara speakers, remove the language barrier, and bring this primarily spoken language online for everyone.

To achieve this goal, we need to build a language model for Bambara. Language models require lots of data, which typically means hours of transcribed recordings of people speaking the language, so that computers can learn it through a process called Natural Language Processing. Unfortunately, Bambara lacks readily available training data. Researchers call this being “low-resourced.” My team at Robots Mali has been trying to solve this challenge for years as part of a collaborative project called Bayɛlɛmabaga. Through collaboration with the Google Research team in Accra, we're closer to accomplishing our goal of building more resources (written and bilingual texts) for Bambara.

To overcome the challenge of being “low-resourced,” we teamed up with those who hold the culture's knowledge, rich history and teachings. Malian griots are the real keepers of the Bambara collective memory, passing on their knowledge only through oral storytelling. So we gathered more than thirty griots and recorded them narrating generational stories. We transcribed and translated each tale to preserve the knowledge for future generations. While griots are traditionally older men, for this project we worked to identify griots of different ages, genders and backgrounds to build a representative group.

Using these recordings, we've been able to build a model for understanding Bambara speech, known as an Automatic Speech Recognition (ASR) model, which also facilitates easy translation to other languages. As a result, we are making the world's information more accessible to millions of Bambara speakers and releasing our findings for the research community and everyone to benefit. Our work has allowed us to uplift traditional practices while building a new future for Bambara speakers. We’re in contact with the National Museum of Mali to donate all of the beautiful stories that the griots have narrated. The rich history and teachings from the griots will be available to the local community and the public. Furthermore, the project has been selected to be showcased next week at the Deep Learning Indaba 2022, the largest machine learning conference in Africa.

Most importantly, we identified oral literature as a viable data resource for low-resourced languages. Many languages are underrepresented online, and this project represents a big step towards bringing more of them online. Of course, there's still a lot of work to do. But by introducing this work to the community, we are giving researchers new tools to keep breaking down the online language barrier.


by Allahsera Tapo, PhD via The Keyword
