CIIL conference highlights 15 newly developed datasets and AI applications for Indian languages

The release of 15 newly developed datasets by the Linguistic Data Consortium for Indian Languages (LDC-IL) was the key highlight of the conference on artificial intelligence (AI) held at the Central Institute of Indian Languages (CIIL) on Thursday.

The datasets, released by Shailendra Mohan, Director of the CIIL, and other dignitaries, mark a significant milestone in LDC-IL’s contributions to linguistic research and technology development, according to CIIL.

The 15 datasets are: Mother Tongue Parallel Text Corpus of India (147 mother tongues), Gold Standard Rajasthani Raw Text Corpus, Gold Standard Chhattisgarhi Raw Text Corpus Vol. II, Gold Standard Kashmiri Raw Text Corpus Vol. II, Gold Standard Maithili Raw Text Corpus Vol. II, Gold Standard Telugu Raw Text Corpus Vol. II, Maithili Raw Speech Corpus Vol. II, Dogri Sentence Aligned Speech Corpus, Maithili Sentence Aligned Speech Corpus (Tirhuta Script), Manipuri Sentence Aligned Speech Corpus (Bengali Script), Manipuri Sentence Aligned Speech Corpus (Meetei Mayek), Punjabi Sentence Aligned Speech Corpus, Telugu Sentence Aligned Speech Corpus, Assamese Text-to-Speech Corpus, and Maithili Text-to-Speech Corpus.

In addition, LDC-IL launched several AI applications designed to serve Indian languages. The applications were introduced by Narayan Choudhary.

These applications, now available for public use at medha.ciil.org, include Anuvadika (Machine Translator), Lipyantara (Transliterator), Lipidha (Optical Character Recognizer), Anulekhika (Automatic Speech Recognition for Indian Languages), Anuvachika (Text-to-Speech for Indian Languages), and Dhvani Parivartka (Media Converter).

