Welcome to KingLine Data Center!   Contact Phone: 0086-10-62660053   Email: marketing@speechocean.com

NLP-Corpus

113 Results

King-NLP-001

This corpus contains 3,050,000 Chinese characters collected from real emails written by native Chinese speakers. All text was manually proofread, and sensitive words and repeated sentences were filtered out in the pure word layer.
*Only for domestic market.

King-NLP-002

This data set contains 100,000 SMS sentences collected from the real-life messages of native Chinese speakers. All sentences were manually proofread and repeated sentences were filtered out. The data is organized in four layers: Pure Word, Pinyin with Tone, Word Segmentation, and Named Entity.
*Only for domestic market.

King-NLP-003

This data set contains 410,000 SMS sentences collected from the real-life messages of native Chinese speakers. All sentences were manually proofread, repeated sentences were filtered out in the pure word layer, and every sentence was annotated with Pinyin and tone information.
*Only for domestic market.

King-NLP-004

This data set contains 1,260,000 SMS sentences collected from the real-life messages of native Chinese speakers. All sentences were manually proofread, repeated sentences were filtered out in the pure word layer, and every sentence was annotated with word segmentation information.
*Only for domestic market.

King-NLP-005

This data set contains 110,000 SMS sentences collected from the real-life messages of native Chinese speakers. All sentences were manually proofread, repeated sentences were filtered out in the pure word layer, and every sentence was annotated with named entity information.
*Only for domestic market.

King-NLP-006

This data set contains 1,960,000 SMS sentences collected from the real-life messages of native Chinese speakers. All sentences were manually proofread and repeated sentences were filtered out. The data is organized in four layers.
*Only for domestic market.

King-NLP-007

This corpus contains 1,200,000 sentences collected from real instant messages written by native Chinese speakers. All text was manually proofread, and sensitive words and repeated sentences were filtered out in the pure word layer.
*Only for domestic market.

King-NLP-008

This corpus contains 1,450,000 sentences collected from real instant messages written by native Chinese speakers. All text was manually proofread, and sensitive words and repeated sentences were filtered out in the pure word layer.
*Only for domestic market.

King-NLP-009

This corpus contains 350,000 Chinese personal names collected from the real names of native Chinese speakers. All names were manually proofread and repeated names were filtered out.
*Only for domestic market.

King-NLP-010

This corpus contains 6,000,000 Chinese place names collected according to the administrative districts of China. All names were manually proofread.
*Only for domestic market.
