Welcome to KingLine Data Center!   Contact Phone: 0086-10-62660053   Email: marketing@speechocean.com


15 Results

King-ASR-034

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The script contains approximately 19,198 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 20 speakers, balanced in age, gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Speech was collected on 2 high-quality audio channels.

A pronunciation lexicon is available. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

Contact Information:
Phone: +86-10-62660053
Email: contact@speechocean.com

King-ASR-120

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The script contains approximately 303,948 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 160 speakers (80 males, 80 females), balanced in age (16–30, 31–45, >45), gender, and regional accent. Each speaker was recorded in 3 environments chosen from 5 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED, etc.).

Recordings were made in 2 kinds of vehicle (FORD FOCUS and MAZDA 6) with 3 kinds of microphone (Shure SM10A / Sennheiser ME104 / AKG Q400). Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available, with phonemic transcriptions in the pinyin phone set. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-122

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The script contains approximately 200,796 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 100 speakers, balanced in age, gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-125

The Japanese Speech Recognition Corpus was collected in Japan.

The script contains approximately 98,825 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 308 speakers, balanced in age, gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-129

The Canadian French Speech Recognition Corpus was collected in Montreal, Canada.

The script contains approximately 90,142 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 304 speakers (152 males, 152 females), balanced in age (mainly 16–30, 31–45, 46+), gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Recordings were made in 3 kinds of vehicle (HONDA ACCORD, CIVIC, TOYOTA) with 3 kinds of microphone (Shure SM10A / Sennheiser ME104 / AKG Q400). Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available, with phonemic transcriptions in the SAMPA phone set. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-131

The American English Speech Recognition Corpus was collected in the USA.

The script contains approximately 383,788 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 304 speakers (152 males, 152 females), balanced in age (mainly 18–30, 31–45, 46–65), gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Recordings were made in 3 kinds of vehicle (MAZDA 6, HONDA ACCORD and HONDA CRV) with 3 kinds of microphone (Shure SM10A / Sennheiser ME104 / AKG C400BL). Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available, with phonemic transcriptions in the CMU phone set. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-132

The France French Speech Recognition Corpus was collected in France.

The script contains approximately 103,480 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 304 speakers (142 males, 162 females), balanced in age (16–30, 31–45, 46–65), gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, etc.).

Recordings were made in 5 kinds of vehicle (PEUGEOT / TOYOTA / ...) with 8 kinds of microphone (SHURE SM10A / SENNHEISER ME104 / ...). Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available, with phonemic transcriptions in the SAMPA phone set. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-134

The Turkish Speech Recognition Corpus was collected in Turkey.

The script contains approximately 398,692 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 316 speakers, balanced in age, gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-135

The American English Speech Recognition Corpus was collected in the USA.

The script contains approximately 395,712 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 300 speakers (161 males, 139 females), balanced in age (mainly 16–30, 31–45, 46–65), gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Recordings were made in 3 kinds of vehicle (FORD FOCUS, VOLKSWAGEN GOLF, VOLKSWAGEN JETTA) with 3 kinds of microphone (Shure SM10A / Sennheiser ME104 / AKG C400BL). Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available, with phonemic transcriptions in the OALD phone set. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.

King-ASR-144

The Mexican Spanish Speech Recognition Corpus was collected in Mexico.

The script contains approximately 401,148 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as a separate, uncompressed waveform file.

The corpus contains the voices of 301 speakers, balanced in age, gender, and regional accent. Each speaker was recorded in 1 or 2 environments chosen from 7 possible environments (STOP_MOTOR_RUNNING, LOW_SPEED_ROUGH_ROAD, HIGH_SPEED_GOOD_ROAD, etc.).

Speech was collected on 4 high-quality audio channels.

A pronunciation lexicon is available. All audio files were manually transcribed and annotated by native transcribers. The corpus follows the general conventions of SpeechDat-Car.

For more details, please check the technical document or ask our sales team.
