Khmer ASR

Building speech recognition for the Khmer language.

About

If you talk to a man in a language he understands, that goes to his head.
If you talk to him in his language, that goes to his heart.

Nelson Mandela

Automatic Speech Recognition (ASR) is no longer the stuff of science fiction, yet not every language has received the same blessing or equal attention from researchers in the field. One such language is Khmer, the official language of Cambodia (where I grew up and come from).

Spoken by about 16 million people worldwide, Khmer is used throughout Cambodia in all walks of life: entertainment, business communication, K-12 and higher education, media, and public hearings. Cambodians are accustomed to, and rely heavily on, the spoken language for working, learning, sharing, and playing. This reliance becomes a hindrance for ordinary Cambodians who cannot use English as an alternative to harness the power of technology and the Internet, since they cannot interact effectively in the mainstream languages of science.

Here I set out to document the detailed steps in building speech recognition for the Khmer language. I hope this helps illustrate how to accomplish the feat and inspires others to start building speech recognition for their own languages.