INTRODUCTION
-----------

THUYG20 is an open Uyghur speech database published by the Center for Speech and Language Technology (CSLT) at Tsinghua University, the Signal and Information Processing Lab at Xinjiang University, and the AI Cloud Research Center (AICRC). It includes the full set of speech and language resources required to build a Uyghur speech recognition system and a Uyghur speaker recognition system.

For the speech recognition publication, check http://data.cslt.org/thuyg20/README.html.
For the speaker recognition publication, check http://data.cslt.org/thuyg20-sre/README.html.


PERFORMANCE
----------

We invite researchers to compete on this database. Two challenges have been set up, one for ASR tasks and one for SRE tasks, and researchers are welcome to challenge the current state of the art.



LOCAL DOWNLOAD (not recommended)
---------------

The data can be downloaded from our local server at CSLT@Tsinghua.

data_thuyg20.tar.gz      :    full data package for speech recognition
data_thuyg20_sre.tar.gz  :    full data package for speaker recognition
resource.tar.gz          :    resources, including lexicon and noise signals
test_noise.tar.gz        :    standard noisy data for ASR 0 dB test
test_noise_sre.tar.gz    :    standard noisy data for SRE 0 dB test
about.html               :    about file
info.txt                 :    info file
README.html              :    this file
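
Whichever source is used, the .tar.gz packages listed above need to be unpacked before use. The following is a minimal Python sketch, assuming the archive has already been saved locally. The helper name `unpack_package` and the `data` destination directory are illustrative assumptions and are not part of the THUYG20 distribution.

```python
import tarfile
from pathlib import Path


def unpack_package(archive_path, dest_dir="data"):
    """Extract a .tar.gz package into dest_dir and return its member names.

    Hypothetical helper for illustration; not shipped with THUYG20.
    """
    root = Path(dest_dir).resolve()
    root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        # Refuse members whose paths would land outside the destination.
        for name in names:
            target = (root / name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {name}")
        tar.extractall(root)
    return names
```

For example, `unpack_package("data_thuyg20.tar.gz")` would extract the speech recognition package into `./data`, assuming the file is in the current directory.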

PUBLIC DOWNLOAD (recommended)
---------------

The files above are served from our own web server at Tsinghua University, which may be unstable or slow for some connections. The mirrors on public cloud disks can be used as a backup:

  • OpenSLR: https://openslr.org/22/
  • Baidu: http://pan.baidu.com/s/1hqKwE00
  • Mega: https://mega.nz/#F!idRSjL4A!cnCY0R2NjU77Jr0soe9OgQ

KALDI RECIPE
-----------

THUYG20 Kaldi recipe: https://github.com/wangdong99/kaldi

LICENSE
-----------

All the resources contained in the database are free for research institutes and individuals.

PAPERS
-----------

Askar Rozi, Shi Yin, Zhiyong Zhang, Dong Wang, Askar Hamdulla, Fang Zheng, "THUYG-20: A free Uyghur Speech Database", NCMSSC 2015, Tsinghua Xuebao.
Askar Rozi, Dong Wang, Zhiyong Zhang, "An Open/Free Database and Benchmark for Uyghur Speaker Recognition", O-COCOSDA 2015.

PEOPLE
-----------

Dong Wang, Zhiyong Zhang, Askar Rozi @CSLT, Tsinghua Univ.
Askar Hamdulla @Xinjiang Univ.

CONTACT
-----------

Dong Wang, Xuewei Zhang, Zhiyong Zhang
CSLT, Tsinghua University
wangdong99@mails.tsinghua.edu.cn
{zxw,zhangzy}@cslt.riit.tsinghua.edu.cn
ROOM1-303, BLDG FIT
Tsinghua University
http://cslt.org
http://cslt.riit.tsinghua.edu.cn

Askar Hamdulla
Xinjiang University
askarhamdulla@gmail.com
http://erj1.xju.edu.cn/znxx/index.htm