Audina

Audina is a prototype of an online multilingual forced alignment and audio annotation tool. It provides an easy way to align voice recordings in any language written in the Latin alphabet with their transcriptions.

Audina lets you review the alignment result and correct the timing and text where necessary. After the review, you can download the result as separate audio files, together with a TSV file that lists the audio file names and their transcriptions, and a WebVTT file for displaying timed text tracks (such as subtitles or captions).
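To make the TSV and WebVTT outputs described above more concrete, here is a minimal Python sketch of how aligned segments could be serialized into both formats. It is an illustration only, not Audina's actual code: the segment tuples, the `clip_0001.mp3` naming scheme, the two-column TSV layout, and the helper names `vtt_timestamp`, `to_webvtt`, and `to_tsv` are all assumptions.

```python
import csv
import io

# Hypothetical aligned segments: (start_seconds, end_seconds, text).
segments = [
    (0.0, 2.5, "Selamat pagi."),
    (2.5, 6.04, "Apa kabar?"),
]


def vtt_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments):
    """Build a WebVTT document with one cue per aligned segment."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


def to_tsv(segments, prefix="clip"):
    """Build a TSV with an assumed layout: audio file name, transcription."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for i, (_, _, text) in enumerate(segments, start=1):
        writer.writerow([f"{prefix}_{i:04d}.mp3", text])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_webvtt(segments))
    print(to_tsv(segments))
```

Each WebVTT cue carries the start and end time of one segment, while each TSV row pairs one exported audio clip with its text, so the two files describe the same alignment from different angles.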

We have successfully created the Librivox-Indonesia Dataset using this software. The dataset consists of MP3 audio files and corresponding text files that we generated from public-domain LibriVox audiobooks. It contains around 8 hours of audio in the following languages spoken in Indonesia: Acehnese, Balinese, Buginese, Indonesian, Minangkabau, Javanese, and Sundanese.

We have also successfully tested this software with audiobooks in other languages, such as English, German, Luxembourgish, Spanish, Afrikaans, and Tagalog. The list will grow as we test more languages. Please have a look at the samples of aligned audiobooks.

Audina has the following workflow:
  • Forced Alignment: the user submits the audio file and its transcription.
  • Annotation: once the server has processed the alignment, the web interface redirects the user to the Annotation page, where the user can review, update, and download the result.