Welcome to KingLine Data Center!   Contact Phone: 0086-10-62660053   Email: marketing@speechocean.com


Commercial Resources

This English speech recognition database was collected in the USA. It contains the voices of 2,538 native speakers, demographically balanced by age group (16-30, 31-45, 46+), gender, and dialect region. Each speaker recorded 320 simple sentences in a quiet office room, yielding 809,074 audio files saved as uncompressed PCM. All the speech data was transcribed and labeled.
This Chinese Mandarin conversational speech database was collected in China over mobile phones in a quiet environment and is owned by Speechocean. The corpus contains daily spontaneous conversational speech from 4,062 speakers. For the first batch of speakers, whose speaker IDs are under 1000, each audio file contains the voice of only one speaker; the two speakers were in different rooms and talked to each other by telephone. For the second batch, whose speaker IDs are above 2000, each audio file contains both speakers; they were in the same room during the conversation, and the speech was recorded by a mobile device. The total recording time is around 2,000 hours, including reasonably short pauses during the conversation. All the speech data was transcribed and labeled.
The Chinese Mandarin Mobile Speech Recognition database was collected in mainland China and contains the voices of 1,200 native speakers (602 males, 598 females), balanced by age, gender, and regional accent. The script was specially designed to provide material for both training and testing of many classes of speech recognizers; for each speaker it contains 300 utterances covering 14 categories and 47 sub-categories (for the detailed script structure, please see the technical document). Each speaker was recorded in two environments: a quiet setting (office/home) and a noisy setting (garden/roadside of restaurant/bus). A total of 300 utterances were recorded per speaker across the two environments (150 utterances and spontaneous sentences per environment). Popular mobile phones were used to collect the data.
The speech data is stored as 16 kHz, 16-bit uncompressed audio. Each utterance is stored in a separate file, and each signal was transcribed and labeled.
This Mandarin speech database was collected by Speechocean over three mobile operating systems: iOS, Android, and Windows Mobile. A total of 5,048 speakers were recorded, each in one session in a quiet environment. After unqualified utterances were discarded, the corpus contains 1,514,028 utterances of Chinese speech from all the speakers. The pure recording time is about 2,268.0 hours, including leading and trailing silence, and the total size of the database is about 243.1G. All speakers are native speakers from 14 typical dialect cities covering the seven main dialect regions of China, demographically balanced by age group (18-35, 36-45, 46+), gender (2,584 males and 2,464 females), and regional accent. The script was specially designed to provide material for both training and testing of many classes of speech recognizers. Each speaker's script contains 300 sentences randomly selected from a specially designed pool. Each speaker was recorded as naturally as possible in a quiet environment on popular mobile phones such as iPhone, HTC, Samsung, and MOTO devices, which cover the iOS, Android, and Windows Mobile platforms. The speech data is stored as 16 kHz, 16-bit uncompressed PCM. All the speech was manually transcribed and labeled. A pronunciation lexicon with phonemic transcriptions in Pinyin is also included.
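
As a rough sanity check on the figures above, the size of uncompressed PCM audio can be estimated from its duration, sample rate, and bit depth. The sketch below assumes single-channel audio and ignores file headers, transcripts, and the lexicon; neither the channel count nor the meaning of "G" is stated in the listing, and the function name is only illustrative.

```python
def pcm_size_bytes(hours: float, sample_rate_hz: int = 16_000,
                   bits_per_sample: int = 16, channels: int = 1) -> int:
    """Estimate raw PCM payload size; channels=1 is an assumption."""
    seconds = hours * 3600
    bytes_per_sample = bits_per_sample // 8
    return round(seconds * sample_rate_hz * bytes_per_sample * channels)

# Figures taken from the listing: about 2,268 hours at 16 kHz, 16 bit.
size = pcm_size_bytes(2268.0)
print(f"{size / 1e9:.1f} GB (decimal units)")
print(f"{size / 2**30:.1f} GiB (binary units)")
```

Under these assumptions the estimate comes to roughly 243 GiB, which is close to the quoted 243.1G if that figure is read in binary units.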

Academic Resources

This single-channel Hokkien Speech Recognition Corpus was collected in Fujian and is owned by the Acoustic Signal and Speech Processing Lab, Xiamen University. It has 40 native speakers in total and contains 10,134 audio files. All the speech data was transcribed and labeled.

Relevant Paper: The Hokkien Isolated Word Recognition System Based on FPGA (or copy this link into your browser: http://pan.baidu.com/s/1nuHoPuh)
Reference: Lin Li, Wenhao Xu, Jiawen Wu, Shan He, Xiaochao Li
                     Department of Electronic Engineering, Xiamen University, Xiamen, China
                     Xiamen Key Lab of Micro-Nano-Electron Devices & Integrated System, Xiamen, China
                     C Design & IT Research Center of Fujian Province, Xiamen University, Xiamen, China
                     E-mail: heshan@xmu.edu.cn, lilin@xmu.edu.cn

Credits: 500.00 or Price: 770 USD

This single-channel Hokkien Speech Recognition Corpus is part of King-ASR-M-001. It was collected in Fujian and is owned by the Acoustic Signal and Speech Processing Lab, Xiamen University. It has 10 native speakers in total and contains 3,500 audio files. All the speech data was transcribed and labeled.

Relevant Paper: The Hokkien Isolated Word Recognition System Based on FPGA (or copy this link into your browser: http://pan.baidu.com/s/1nuHoPuh)
Reference: Lin Li, Wenhao Xu, Jiawen Wu, Shan He, Xiaochao Li
                     Department of Electronic Engineering, Xiamen University, Xiamen, China
                     Xiamen Key Lab of Micro-Nano-Electron Devices & Integrated System, Xiamen, China
                     C Design & IT Research Center of Fujian Province, Xiamen University, Xiamen, China
                     E-mail: heshan@xmu.edu.cn, lilin@xmu.edu.cn

Credits: 0.00

This Chinese Mandarin Speech Recognition database, collected in China, contains the voices of 285 native speakers (144 males, 141 females), balanced by age (mainly 16-28 and 29-45), gender, and regional accent (26 provinces and regions are covered). The database contains about 12.2 hours of recordings. A set of 6,140 digit strings was specially designed for both training and testing of speech recognizers: 199 speakers uttered 30 digit strings each, and 86 speakers uttered 25 digit strings each. All the speech data was transcribed and labeled.

Credits: 533.00
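
The speaker and utterance counts quoted for this digit-string database can be tallied with a few lines of arithmetic. This is only a sketch: it assumes every prompted digit string was kept as one recording, which the listing does not state, and it does not try to explain how the 6,140 designed strings were distributed across speakers.

```python
# (number of speakers, digit strings uttered by each), as quoted in the listing
groups = [(199, 30), (86, 25)]

speakers = sum(n for n, _ in groups)
recordings = sum(n * per_speaker for n, per_speaker in groups)

# Assumption: the 12.2 hours are spread over all prompted strings.
avg_seconds = 12.2 * 3600 / recordings

print(speakers)               # 285, matching the stated speaker count
print(recordings)             # 8120 prompted digit strings in total
print(f"{avg_seconds:.1f} s per recording on average")
```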

This Chinese Mandarin Speech Recognition database was collected in China and contains the voices of 20 native speakers. Each speaker read person names, place names, digit strings, and stock names in a moving car. It includes 19,198 audio files totaling about 20.9 hours.

Credits: 667.00


Monthly Promotion

This single-channel Hokkien Speech Recognition Corpus is part of King-ASR-M-001. It was collected in Fujian and is owned by the Acoustic Signal and Speech Processing Lab, Xiamen University. It has 10 native speakers in total and contains 3,500 audio files. All the speech data was transcribed and labeled.

Relevant Paper: The Hokkien Isolated Word Recognition System Based on FPGA (or copy this link into your browser: http://pan.baidu.com/s/1nuHoPuh)
Reference: Lin Li, Wenhao Xu, Jiawen Wu, Shan He, Xiaochao Li
                     Department of Electronic Engineering, Xiamen University, Xiamen, China
                     Xiamen Key Lab of Micro-Nano-Electron Devices & Integrated System, Xiamen, China
                     C Design & IT Research Center of Fujian Province, Xiamen University, Xiamen, China
                     E-mail: heshan@xmu.edu.cn, lilin@xmu.edu.cn

Credits: 0.00