Bibliographic data: WO2005091167 (A2) ― 2005-09-29


SYSTEMS AND METHODS FOR TRANSLATING CHINESE PINYIN TO CHINESE CHARACTERS  

Inventor(s): WU JUN [US]; ZHU HUICAN [US]; ZHU HONGJUN [US]
Applicant(s): GOOGLE INC [US]; WU JUN [US]; ZHU HUICAN [US]; ZHU HONGJUN [US]
Classification:
- international: G06F17/22; G06F17/28; (IPC1-7): G06F17/28
- cooperative:
Application number: WO2005US08863 20050316 
Priority number(s): US20040802479 20040316
Also published as:


Abstract of  WO2005091167 (A2)


Systems and methods to process and translate pinyin to Chinese characters and words are disclosed. A Chinese language model is trained by extracting unknown character strings from Chinese inputs (e.g., documents and/or user inputs or queries), determining valid words from the unknown character strings, and generating a transition matrix based on the Chinese inputs for predicting a word string given the context. A method for translating a pinyin input generally includes generating a set of Chinese character strings from the pinyin input using a Chinese dictionary that includes words derived from the Chinese inputs and a language model trained on the Chinese inputs, each character string having a weight indicating the likelihood that the character string corresponds to the pinyin input. Ambiguous user input may be classified as non-pinyin or pinyin by identifying an ambiguous pinyin/non-pinyin ASCII word in the user input and analyzing the context to classify the user input.
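
The translation step described in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application; it assumes a word-level bigram transition table with add-one smoothing, exhaustive enumeration of candidates, and a log-probability as the weight, none of which the abstract specifies. The corpus, the pinyin dictionary, and the names `train_transitions`, `transition_prob`, and `translate` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict
from itertools import product

# Hypothetical toy data: a pre-segmented corpus and a pinyin-to-word dictionary.
CORPUS = [["我", "爱", "北京"], ["我", "爱", "中国"], ["我", "在", "北京"]]
PINYIN_DICT = {
    "wo": ["我", "握"],
    "ai": ["爱", "矮"],
    "beijing": ["北京", "背景"],
}
VOCAB_SIZE = 6  # distinct words in PINYIN_DICT (assumption used for smoothing)


def train_transitions(corpus):
    """Count word-to-word transitions; "<s>" marks the start of a sentence."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        prev = "<s>"
        for word in sentence:
            counts[prev][word] += 1
            prev = word
    return counts


def transition_prob(counts, prev, word, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(word | prev)."""
    row = counts.get(prev, {})
    total = sum(row.values())
    return (row.get(word, 0) + alpha) / (total + alpha * vocab_size)


def translate(syllables, counts, pinyin_dict, vocab_size):
    """Return (character string, log weight) pairs, most likely first."""
    results = []
    # Enumerate every combination of dictionary candidates (fine for toy input).
    for words in product(*(pinyin_dict[s] for s in syllables)):
        log_weight = 0.0
        prev = "<s>"
        for word in words:
            log_weight += math.log(transition_prob(counts, prev, word, vocab_size))
            prev = word
        results.append(("".join(words), log_weight))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    table = train_transitions(CORPUS)
    for text, weight in translate(["wo", "ai", "beijing"], table, PINYIN_DICT, VOCAB_SIZE):
        print(text, round(weight, 3))
```

Exhaustive enumeration grows exponentially with input length, so a practical system would more plausibly use dynamic programming (for example, a Viterbi-style search) over the same transition table; the abstract does not say which search strategy is used.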