974 resultados para Chinese character recognition


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Chinese language is based on characters which are syllabic in nature. Since languages have syllabotactic rules which govern the construction of syllables and their allowed sequences, Chinese character sequence models can be used as a first level approximation of allowed syllable sequences. N-gram character sequence models were trained on 4.3 billion characters. Characters are used as a first level recognition unit with multiple pronunciations per character. For comparison the CU-HTK Mandarin word based system was used to recognize words which were then converted to character sequences. The character only system error rates for one best recognition were slightly worse than word based character recognition. However combining the two systems using log-linear combination gives better results than either system separately. An equally weighted combination gave consistent CER gains of 0.1-0.2% absolute over the word based standard system. Copyright © 2009 ISCA.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

On-line handwriting recognition has been a frontier area of research for the last few decades under the purview of pattern recognition. Word processing turns to be a vexing experience even if it is with the assistance of an alphanumeric keyboard in Indian languages. A natural solution for this problem is offered through online character recognition. There is abundant literature on the handwriting recognition of western, Chinese and Japanese scripts, but there are very few related to the recognition of Indic script such as Malayalam. This paper presents an efficient Online Handwritten character Recognition System for Malayalam Characters (OHR-M) using K-NN algorithm. It would help in recognizing Malayalam text entered using pen-like devices. A novel feature extraction method, a combination of time domain features and dynamic representation of writing direction along with its curvature is used for recognizing Malayalam characters. This writer independent system gives an excellent accuracy of 98.125% with recognition time of 15-30 milliseconds

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Handwritten character recognition is always a frontier area of research in the field of pattern recognition and image processing and there is a large demand for OCR on hand written documents. Even though, sufficient studies have performed in foreign scripts like Chinese, Japanese and Arabic characters, only a very few work can be traced for handwritten character recognition of Indian scripts especially for the South Indian scripts. This paper provides an overview of offline handwritten character recognition in South Indian Scripts, namely Malayalam, Tamil, Kannada and Telungu

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Optical Character Recognition plays an important role in Digital Image Processing and Pattern Recognition. Even though ambient study had been performed on foreign languages like Chinese and Japanese, effort on Indian script is still immature. OCR in Malayalam language is more complex as it is enriched with largest number of characters among all Indian languages. The challenge of recognition of characters is even high in handwritten domain, due to the varying writing style of each individual. In this paper we propose a system for recognition of offline handwritten Malayalam vowels. The proposed method uses Chain code and Image Centroid for the purpose of extracting features and a two layer feed forward network with scaled conjugate gradient for classification

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several approaches have been proposed to recognize handwritten Bengali characters using different curve fitting algorithms and curvature analysis. In this paper, a new algorithm (Curve-fitting Algorithm) to identify various strokes of a handwritten character is developed. The curve-fitting algorithm helps recognizing various strokes of different patterns (line, quadratic curve) precisely. This reduces the error elimination burden heavily. Implementation of this Modified Syntactic Method demonstrates significant improvement in the recognition of Bengali handwritten characters.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The effectiveness of linear matched filters for improved character discrimination in presence of random noise and poorly defined characters has been investigated. We have found that although the performance of the filter in presence of random noise is reasonably good (16 dB gain in signal-to-noise-ratio) its performance is poor when the unknown character is distorted (linear shift and rotation).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The effectiveness of linear matched filters for improved character discrimination in presence of random noise and poorly defined characters has been investigated. We have found that although the performance of the filter in presence of random noise is reasonably good (16 dB gain in signal-to-noise-ratio) its performance is poor when the unknown character is distorted (linear shift and rotation).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes a technique for artificial generation of learning and test sample sets suitable for character recognition research. Sample sets of English (Latin), Malayalam, Kannada and Tamil characters are generated easily through their prototype specifications by the endpoint co-ordinates, nature of segments and connectivity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The machine replication of human reading has been the subject of intensive research for more than three decades. A large number of research papers and reports have already been published on this topic. Many commercial establishments have manufactured recognizers of varying capabilities. Handheld, desk-top, medium-size and large systems costing as high as half a million dollars are available, and are in use for various applications. However, the ultimate goal of developing a reading machine having the same reading capabilities of humans still remains unachieved. So, there still is a great gap between human reading and machine reading capabilities, and a great amount of further effort is required to narrow-down this gap, if not bridge it. This review is organized into six major sections covering a general overview (an introduction), applications of character recognition techniques, methodologies in character recognition, research work in character recognition, some practical OCRs and the conclusions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a new application of two dimensional Principal Component Analysis (2DPCA) to the problem of online character recognition in Tamil Script. A novel set of features employing polynomial fits and quartiles in combination with conventional features are derived for each sample point of the Tamil character obtained after smoothing and resampling. These are stacked to form a matrix, using which a covariance matrix is constructed. A subset of the eigenvectors of the covariance matrix is employed to get the features in the reduced sub space. Each character is modeled as a separate subspace and a modified form of the Mahalanobis distance is derived to classify a given test character. Results indicate that the recognition accuracy using the 2DPCA scheme shows an approximate 3% improvement over the conventional PCA technique.