4 resultados para vocabulary

em Cochin University of Science


Relevância:

10.00% 10.00%

Publicador:

Resumo:

A new procedure for the classification of lower case English language characters is presented in this work . The character image is binarised and the binary image is further grouped into sixteen smaller areas ,called Cells . Each cell is assigned a name depending upon the contour present in the cell and occupancy of the image contour in the cell. A data reduction procedure called Filtering is adopted to eliminate undesirable redundant information for reducing complexity during further processing steps . The filtered data is fed into a primitive extractor where extraction of primitives is done . Syntactic methods are employed for the classification of the character . A decision tree is used for the interaction of the various components in the scheme . 1ike the primitive extraction and character recognition. A character is recognized by the primitive by primitive construction of its description . Openended inventories are used for including variants of the characters and also adding new members to the general class . Computer implementation of the proposal is discussed at the end using handwritten character samples . Results are analyzed and suggestions for future studies are made. The advantages of the proposal are discussed in detail .

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Statistical Machine Translation (SMT) is one of the potential applications in the field of Natural Language Processing. The translation process in SMT is carried out by acquiring translation rules automatically from the parallel corpora. However, for many language pairs (e.g. Malayalam- English), they are available only in very limited quantities. Therefore, for these language pairs a huge portion of phrases encountered at run-time will be unknown. This paper focuses on methods for handling such out-of-vocabulary (OOV) words in Malayalam that cannot be translated to English using conventional phrase-based statistical machine translation systems. The OOV words in the source sentence are pre-processed to obtain the root word and its suffix. Different inflected forms of the OOV root are generated and a match is looked up for the word variants in the phrase translation table of the translation model. A Vocabulary filter is used to choose the best among the translations of these word variants by finding the unigram count. A match for the OOV suffix is also looked up in the phrase entries and the target translations are filtered out. Structuring of the filtered phrases is done and SMT translation model is extended by adding OOV with its new phrase translations. By the results of the manual evaluation done it is observed that amount of OOV words in the input has been reduced considerably

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper discusses the implementation details of a child friendly, good quality, English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use with little memory. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based, which use standard speech databases available in neutral adult voices.Here reduced memory is achieved by the concatenation of phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis was used to reduce the algorithmic complexity instead of signal processing techniques. Sufficient degree of customization and generalization catering to the needs of the child user had been included through the provision for vocabulary and voice selection to suit the requisites of the child. Prosody had also been incorporated. This inexpensive TTS systemwas implemented inMATLAB, with the synthesis presented by means of a graphical user interface (GUI), thus making it child friendly. This can be used not only as an interesting language learning aid for the normal child but it also serves as a speech aid to the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).