Welcome to KingLine Data Center!   Contact Phone: 0086-10-62660053   Email: marketing@speechocean.com

King-Lexicon-001

Entries: 200,000
Phoneme Inventory: Computer-readable IPA (can be converted to other phone sets such as SAMPA or X-SAMPA on request)
Tone: Included
Syllable Boundary: Included

King-ASR-001

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The corpus contains approximately 13,942 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as an uncompressed waveform in a separate file.

This corpus contains the voices of 265 different speakers (134 males, 131 females) covering two age groups (16–28 and 29–45), both genders and a range of regional accents. Each speaker was recorded in a quiet environment.

Speech was collected over a telephone platform. A pronunciation lexicon with phonemic transcriptions in pinyin is available. All data were manually checked, and all audio files were manually transcribed and annotated by native transcribers.

For more details, please consult the technical document or contact our sales team.

Contact Information:
Phone: +86-10-62660053
Email: contact@speechocean.com

King-ASR-002

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The corpus contains approximately 14,492 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as an uncompressed waveform in a separate file.

This corpus contains the voices of 285 different speakers (144 males, 141 females) covering different ages, both genders and a range of regional accents. Each speaker was recorded in a quiet environment.

Speech was collected over a telephone platform. A pronunciation lexicon with phonemic transcriptions in pinyin is available. All data were manually checked, and all audio files were manually transcribed and annotated by native transcribers.

For more details, please consult the technical document or contact our sales team.

King-TTS-003

The Chinese Mandarin Speech Synthesis Corpus contains recordings of one female voice talent. She is a broadcaster who was born and raised in Beijing and was 30 years old when the database was recorded in 2006.

The corpus contains 19,509 utterances in 28 categories. It was recorded in a professional studio over two channels: the speech waveform and the electroglottography (EGG) signal. Speech rate, energy and timbre were strictly controlled during the recording process.

Each utterance was carefully proofread by linguists and stored in Windows uncompressed PCM format. Phonetic tone labeling, prosody labeling and phone boundary labeling are included. All data were manually checked.
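As a rough illustration of working with uncompressed PCM audio, the sketch below reads the header of a WAV file with Python's standard `wave` module and reports its channel count, sample width, sample rate and duration. It assumes the recordings are delivered as standard RIFF/WAV files, which this listing does not confirm; the function name `describe_wav` and the file name in the usage comment are hypothetical.

```python
import wave


def describe_wav(path):
    """Return basic header fields of an uncompressed PCM WAV file.

    Assumption: the file is a standard RIFF/WAV container with PCM data,
    which is what Python's built-in `wave` module can read.
    """
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),            # e.g. 2 for waveform + EGG
            "sample_width_bytes": wf.getsampwidth(),  # bytes per sample
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


# Usage (hypothetical file name, not taken from the corpus):
# info = describe_wav("utterance_0001.wav")
# print(info)
```

The actual file layout, sampling rate and channel order should be taken from the technical document supplied with the corpus.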

For more details, please consult the technical document or contact our sales team.

King-ASR-003

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The corpus contains approximately 7,606 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as an uncompressed waveform in a separate file.

This corpus contains the voices of 265 different speakers (134 males, 131 females) covering different ages, both genders and a range of regional accents. Each speaker was recorded in a quiet environment.

Speech was collected over a telephone platform. A pronunciation lexicon with phonemic transcriptions in pinyin is available. All data were manually checked, and all audio files were manually transcribed and annotated by native transcribers.

For more details, please consult the technical document or contact our sales team.

King-ASR-004

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The corpus contains approximately 8,109 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as an uncompressed waveform in a separate file.

This corpus contains the voices of 285 different speakers (144 males, 141 females) covering different ages, both genders and a range of regional accents. Each speaker was recorded in a quiet environment.

Speech was collected over a telephone platform. A pronunciation lexicon with phonemic transcriptions in pinyin is available. All data were manually checked, and all audio files were manually transcribed and annotated by native transcribers.

For more details, please consult the technical document or contact our sales team.

King-NLP-004

This dataset contains 1,260,000 SMS sentences collected from real-life use by native Chinese speakers. All short-message sentences were manually proofread, duplicate sentences were filtered out at the pure word level, and all sentences were annotated with word segmentation information.
*Available for the domestic market only.

King-ASR-005

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The corpus contains approximately 6,972 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as an uncompressed waveform in a separate file.

This corpus contains the voices of 265 different speakers (134 males, 131 females) covering different ages, both genders and a range of regional accents. Each speaker was recorded in a quiet environment.

Speech was collected over a telephone platform. A pronunciation lexicon with phonemic transcriptions in pinyin is available. All data were manually checked, and all audio files were manually transcribed and annotated by native transcribers.

For more details, please consult the technical document or contact our sales team.

King-ASR-006

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The corpus contains approximately 7,239 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as an uncompressed waveform in a separate file.

This corpus contains the voices of 285 different speakers (144 males, 141 females) covering different ages, both genders and a range of regional accents. Each speaker was recorded in a quiet environment.

Speech was collected over a telephone platform. A pronunciation lexicon with phonemic transcriptions in pinyin is available. All data were manually checked, and all audio files were manually transcribed and annotated by native transcribers.

For more details, please consult the technical document or contact our sales team.

King-ASR-007

The Chinese Mandarin Speech Recognition Corpus was collected in China.

The corpus contains approximately 3,190 utterances in total and is designed to provide material for both training and testing speech recognizers. Each utterance is stored as an uncompressed waveform in a separate file.

This corpus contains the voices of 64 different speakers (52 males, 12 females) covering different ages, both genders and a range of regional accents. Each speaker was recorded in a quiet environment.

Speech was collected over a telephone platform. A pronunciation lexicon with phonemic transcriptions in pinyin is available. All data were manually checked, and all audio files were manually transcribed and annotated by native transcribers.

For more details, please consult the technical document or contact our sales team.
