Welcome to KingLine Data Center!   Contact Phone: 0086-10-62660053   Email: marketing@speechocean.com

Commercial Resources

The US English Mobile Speech Recognition Corpus was collected in the United States. It contains the voices of 2602 different speakers, balanced across age (mainly 16 – 30, 31 – 45, 46+), gender and regional accent (for details, see the technical document). The script contains approximately 829,688 utterances in total. Each speaker was recorded in 2 environments: quiet (office/home) and noisy. Mobile platforms (iOS, Android and Windows) were used for speech collection. Each utterance is stored as a separate, uncompressed waveform file. A pronunciation lexicon with phonemic transcription in SAMPA is available and has been manually checked. All audio files were manually transcribed and annotated by native transcribers. Details are available in the specification.
The Chinese Mandarin Mobile Speech Recognition Corpus was collected in China. It contains the voices of 4062 different speakers (1937 males, 2125 females), balanced across age (mainly 16 – 30, 31 – 45, 46 – 60), gender and regional accent (for details, see the technical document). The script contains approximately 2,125,560 utterances in total (for details of the script structure design, see the specification) and is designed to provide material for both training and testing of many classes of speech recognizers. Each speaker was recorded in a quiet office environment. Mobile platforms (iOS, Android and Windows) were used for speech collection. Each utterance is stored as a separate, uncompressed waveform file. A pronunciation lexicon with phonemic transcription in Pinyin is available and has been manually checked. All audio files were manually transcribed and annotated by native transcribers. Details are available in the specification.
The Chinese Mandarin Mobile Speech Recognition Corpus was collected in China. It contains the voices of 1200 different speakers (602 males, 598 females), balanced across age (mainly 18 – 35, 36 – 45, 46 – 60), gender and regional accent (for details, see the technical document). The script contains approximately 359,451 utterances in total, covering 13 categories and 42 sub-categories (for details of the script structure design, see the specification), and is designed to provide material for both training and testing of many classes of speech recognizers. Each speaker was recorded in 2 environments: quiet (office/home) and noisy (street, restaurant, car). Mobile platforms (iOS, Android, Windows Mobile and Symbian) were used for speech collection. Each utterance is stored as a separate, uncompressed waveform file. A pronunciation lexicon with phonemic transcription in Pinyin is available and has been manually checked. All audio files were manually transcribed and annotated by native transcribers. Details are available in the specification.
The Chinese Mandarin Mobile Speech Recognition Corpus was collected in China. It contains the voices of 5048 different speakers (2584 males, 2464 females), balanced across age (mainly 16 – 35, 31 – 45, >46), gender and regional accent (for details, see the technical document). The script contains approximately 1,512,937 utterances in total, covering 3 categories (for details of the script structure design, see the specification), and is designed to provide material for both training and testing of many classes of speech recognizers. Each speaker was recorded in a quiet office environment. Mobile platforms (iOS, Android and Windows) were used for speech collection. Each utterance is stored as a separate, uncompressed waveform file. A pronunciation lexicon with phonemic transcription in Pinyin is available and has been manually checked. All audio files were manually transcribed and annotated by native transcribers. Details are available in the specification.

Academic Resources

This single-channel Hokkien Speech Recognition Corpus was collected in Fujian and is owned by the Acoustic Signal and Speech Processing Lab at Xiamen University. It covers 40 native speakers in total and contains 10,134 audio files. All the speech data was transcribed and labeled.

Relevant Paper: The Hokkien Isolated Word Recognition System Based on FPGA (or copy this link into your browser: http://pan.baidu.com/s/1nuHoPuh)
Reference: Lin Li, Wenhao Xu, Jiawen Wu, Shan He, Xiaochao Li
                     Department of Electronic Engineering, Xiamen University, Xiamen, China
                     Xiamen Key Lab of Micro-Nano-Electron Devices & Integrated System, Xiamen, China
                     C Design &IT Research Center of Fujian Province, Xiamen University, Xiamen, China
                     E-mail: heshan@xmu.edu.cn, lilin@xmu.edu.cn

Credits: 500.00 or Price: 770 USD

This single-channel Hokkien Speech Recognition Corpus is part of King-ASR-M-001, which was collected in Fujian and is owned by the Acoustic Signal and Speech Processing Lab at Xiamen University. It covers 10 native speakers in total and contains 3500 audio files. All the speech data was transcribed and labeled.

Relevant Paper: The Hokkien Isolated Word Recognition System Based on FPGA (or copy this link into your browser: http://pan.baidu.com/s/1nuHoPuh)
Reference: Lin Li, Wenhao Xu, Jiawen Wu, Shan He, Xiaochao Li
                     Department of Electronic Engineering, Xiamen University, Xiamen, China
                     Xiamen Key Lab of Micro-Nano-Electron Devices & Integrated System, Xiamen, China
                     C Design &IT Research Center of Fujian Province, Xiamen University, Xiamen, China
                     E-mail: heshan@xmu.edu.cn, lilin@xmu.edu.cn

Credits: 0.00

This Chinese Mandarin Speech Recognition Corpus was collected in China and contains the voices of 285 different native speakers (144 males, 141 females), balanced across age (mainly 16-28, 29-45), gender and regional accent (26 provinces and regions are covered). The database contains about 12.2 hours of recordings. A set of 6,140 digit strings was specially designed for both training and testing of speech recognizers. 199 speakers each uttered 30 digit strings, and 86 speakers each uttered 25 digit strings. All the speech data was transcribed and labeled.

Credits: 533.00

This Chinese Mandarin Speech Recognition Corpus was collected in China and contains the voices of 20 different native speakers. Each speaker read a selection of person names, place names, digit strings and stock names in a moving car. It includes 19,198 audio files, totaling about 20.9 hours.

Credits: 667.00

No credits?

Don't worry...

How to gain credits?

The online payment function is coming soon

Thanks for waiting...

Monthly Promotion

The US English Mobile Speech Recognition Corpus was collected in the USA. It contains the voices of 6 different speakers (3 males, 3 females), balanced across age, gender and regional accent (for details, see the technical document). The script contains 3 pairs of conversations in total (for details of the script structure design, see the specification) and is designed to provide material for both training and testing of many classes of speech recognizers. Mobile platforms (iOS, Android and Windows) were used for speech collection. A pronunciation lexicon is available. All audio files were manually transcribed and annotated by native transcribers. Details are available in the specification.

Credits: 1500.00

This single-channel Hokkien Speech Recognition Corpus is part of King-ASR-M-001, which was collected in Fujian and is owned by the Acoustic Signal and Speech Processing Lab at Xiamen University. It covers 10 native speakers in total and contains 3500 audio files. All the speech data was transcribed and labeled.

Relevant Paper: The Hokkien Isolated Word Recognition System Based on FPGA (or copy this link into your browser: http://pan.baidu.com/s/1nuHoPuh)
Reference: Lin Li, Wenhao Xu, Jiawen Wu, Shan He, Xiaochao Li
                     Department of Electronic Engineering, Xiamen University, Xiamen, China
                     Xiamen Key Lab of Micro-Nano-Electron Devices & Integrated System, Xiamen, China
                     C Design &IT Research Center of Fujian Province, Xiamen University, Xiamen, China
                     E-mail: heshan@xmu.edu.cn, lilin@xmu.edu.cn

Credits: 0.00