970 results for Malayalam handwritten characters
Abstract:
This dissertation is focused on the taxonomy, phylogeny, and ecology of the vagrant, erratic and allied terricolous and saxicolous species of the genera Aspicilia A. Massal. and Circinaria Link (Megasporaceae), particularly those traditionally referred to as manna lichens. The group has previously been defined on the basis of a few morphological characters. The phylogeny of the family Megasporaceae is inferred from the combined dataset of nuLSU and mtSSU sequences. Five genera, Aspicilia, Circinaria, Lobothallia, Megaspora, and Sagedia, are recognized. Lobothallia is sister to the four other genera, while Aspicilia and Sagedia form the next clade. All these genera have small asci with eight spores. Circinaria is a sister genus of Megaspora, and these two have in common asci with (1–4) 6–8 large spores. Circinaria forms a monophyletic group and sphaerothallioid species form a monophyletic group within Circinaria. The presence of certain morphological characters such as pseudocyphellae, thickness of cortex and medulla layers, as well as ecological differences in sphaerothallioid species distinguish it from some other crustose species, especially those containing aspicilin and characterised by thin cortex and medulla layers, conidium length c. 6–12 µm and absence of pseudocyphellae. If sphaerothallioid species are accepted as a distinct genus, the rest of the Circinaria species would remain as a paraphyletic assemblage. Currently, the genus Circinaria includes all the sphaerothallioid species and its generic position is confirmed and accepted. Thus, it is proposed as a correct generic name also for the manna lichens described originally in other genera. Phylogeny at the species level was studied using nrITS sequence data. Traditionally, morphological characters have been used for the recognition of species. They were re-evaluated in the light of molecular data.
Since characters such as vagrant, erratic and crustose growth forms proved to be misleading for the recognition of some species, a combination of several characters (including molecular data) is recommended. Vagrant growth form seems to have evolved several times among the distantly related lineages and even within a single population. The reasons behind the high plasticity in the external morphology of the sphaerothallioid Circinaria remain, however, unknown. Six new species are recognized: Aspicilia tibetica, Circinaria arida, C. digitata nom. provis., C. gyrosa nom. provis., C. rogeri nom. provis., and C. rostamii nom. provis. Based on an analysis of the nrITS dataset, three new erratic, vagrant and crustose species were also recognized, but these require additional study. The results also reveal that C. elmorei and C. hispida are not monophyletic as currently understood. In addition, 13 new combinations in the genus Circinaria are proposed.
Abstract:
This paper presents the design of a full-fledged OCR system for printed Kannada text. The machine recognition of Kannada characters is difficult due to similarity in the shapes of different characters, script complexity and non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognizing the base characters. Various features are employed for recognising the characters. These include the coefficients of the Discrete Cosine Transform, Discrete Wavelet Transform and Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in the subsequent levels to discriminate confused characters. Use of structural features increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers like Back Propagation and Radial Basis Function (RBF) Networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with RBF using second-level approximation coefficients of Haar wavelets as the features on presegmented base characters.
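The feature-plus-classifier idea in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' system. It assumes NumPy, an orthonormal Haar filter, and a plain Euclidean nearest-neighbour rule. The function names `haar_approximation` and `nearest_neighbour` are hypothetical.

```python
import numpy as np


def haar_approximation(image, levels=2):
    """Return the Haar approximation (low-low) subband after `levels` steps.

    With the orthonormal Haar filter, each 2-D step maps every 2x2 block
    of pixels (a, b, c, d) to (a + b + c + d) / 2.
    """
    approx = np.asarray(image, dtype=float)
    for _ in range(levels):
        rows, cols = approx.shape
        # Drop a trailing odd row/column so 2x2 blocks tile the array exactly.
        approx = approx[: rows - rows % 2, : cols - cols % 2]
        approx = (
            approx[0::2, 0::2] + approx[0::2, 1::2]
            + approx[1::2, 0::2] + approx[1::2, 1::2]
        ) / 2.0
    return approx


def nearest_neighbour(sample, train_features, train_labels):
    """Return the label of the training vector closest to `sample` (Euclidean)."""
    train_features = np.asarray(train_features, dtype=float)
    distances = np.linalg.norm(train_features - np.asarray(sample, dtype=float), axis=1)
    return train_labels[int(np.argmin(distances))]
```

In such a sketch, a character image would be passed through `haar_approximation`, flattened, and compared against stored feature vectors. The abstract's best result used an RBF network rather than this nearest-neighbour rule.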
Abstract:
Feature extraction in bilingual OCR is handicapped by the increase in the number of classes or characters to be handled. This is evident in the case of Indian languages, whose alphabet set is large. It is expected that the complexity of the feature extraction process increases with the number of classes. Though the best set of features cannot be ascertained through any quantitative measure, the characteristics of the scripts can help decide on the feature extraction procedure. This paper describes a hierarchical feature extraction scheme for recognition of printed bilingual (Tamil and Roman) text. The scheme divides the combined alphabet set of both the scripts into subsets by the extraction of certain spatial and structural features. Three features, viz. geometric moments, DCT-based features and wavelet transform based features, are extracted from the grouped symbols, and a linear transformation is performed on them for the purpose of efficient representation in the feature space. The transformation is obtained by the maximization of certain criterion functions. Three techniques have been employed to estimate the transformation matrix: principal component analysis, maximization of Fisher's ratio and maximization of a divergence measure. It has been observed that the proposed hierarchical scheme allows for easier handling of the alphabets and there is an appreciable rise in the recognition accuracy as a result of the transformations.
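One of the linear transformations named above, principal component analysis, can be illustrated with a minimal sketch. It assumes NumPy and a feature matrix with one row per symbol. The function name `pca_transform` is hypothetical, and the Fisher-ratio and divergence criteria are not shown.

```python
import numpy as np


def pca_transform(features, n_components):
    """Project feature vectors onto their leading principal components.

    Returns the projected data and the transformation matrix whose columns
    are the principal directions.
    """
    x = np.asarray(features, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    transform = vt[:n_components].T
    return centered @ transform, transform
```

The returned matrix plays the role of the estimated transformation matrix. New feature vectors would be centred with the training mean and multiplied by it before classification.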
Abstract:
The Hanuman langur is one of the most widely distributed and morphologically variable non-human primates in South Asia. Even though it has been extensively studied, the taxonomic status of this species remains unresolved due to incongruence between various classification schemes. This incongruence, we believe, is largely due to the use of plastic morphological characters such as coat color in classification. Additionally these classification schemes were largely based on reanalysis of the same set of museum specimens. To bring greater resolution in Hanuman langur taxonomy we undertook a field survey to study variation in external morphological characters among Hanuman langurs. The primary objective of this study is to ascertain the number of morphologically recognizable units (morphotypes) of Hanuman langur in peninsular India and to compare our field observations with published classification schemes. We typed five color-independent characters for multiple adults from various populations in South India. We used the presence-absence matrix of these characters to derive the pair-wise distance between individuals and used this to construct a neighbor-joining (NJ) tree. The resulting NJ tree retrieved six distinct clusters, which we assigned to different morphotypes. These morphotypes can be identified in the field by using a combination of five diagnostic characters. We determined the approximate distributions of these morphotypes by plotting the sampling locations of each morphotype on a map using GIS software. Our field observations are largely concordant with some of the earliest classification schemes, but are incongruent with recent classification schemes. Based on these results we recommend Hill (Ceylon Journal of Science, Colombo 21:277-305, 1939) and Pocock (Primates and carnivora (in part) (pp. 97-163). London: Taylor and Francis, 1939) classification schemes for future studies on Hanuman langurs.
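The step from a presence-absence matrix to pair-wise distances can be sketched as below. The abstract does not state which distance was used, so the proportion of mismatching characters (a normalised Hamming distance) is an assumption here. The function name `pairwise_distances` is hypothetical. Building the neighbor-joining tree from the resulting matrix is not shown.

```python
import numpy as np


def pairwise_distances(presence_absence):
    """Return the matrix of mismatch proportions between individuals.

    Each row of `presence_absence` is one individual and each column is one
    binary character (1 = present, 0 = absent).
    """
    m = np.asarray(presence_absence, dtype=int)
    # Compare every row with every other row, character by character.
    mismatches = m[:, None, :] != m[None, :, :]
    return mismatches.mean(axis=2)
```

The output is a symmetric matrix with zeros on the diagonal, which is the usual input form for a neighbor-joining routine.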
Abstract:
We propose a method to encode 3D magnetic resonance image data, together with a decoder, such that fast access to any 2D image is possible by decoding only the corresponding information from each subband image, which minimizes decoding time. This will be of immense use to the medical community, because most PET and MRI data are volumetric. Preprocessing is carried out at every level before wavelet transformation, to enable easier identification of coefficients from each subband image. Inclusion of special characters in the bit stream facilitates access to the corresponding information in the encoded data. Results are obtained by applying Daub4 along the x (row) and y (column) directions and Haar along the z (slice) direction. The results are comparable to those of the existing technique. In addition, decoding time is reduced by a factor of 1.98. Arithmetic coding is used to encode the corresponding information independently.
Abstract:
This paper presents a preliminary analysis of the Kannada WordNet and the set of relevant computational tools. Although the design has been inspired by the famous English WordNet and, to a certain extent, by the Hindi WordNet, the unique features of the Kannada WordNet are graded antonyms and meronymy relationships, nominal as well as verbal compoundings, complex verb constructions and an efficient underlying database design (designed to handle storage and display of Kannada Unicode characters). The Kannada WordNet would not only add to the sparse collection of machine-readable Kannada dictionaries, but also give new insights into the Kannada vocabulary. It provides a sufficient interface for applications involved in Kannada machine translation, spell checking and semantic analysis.
Abstract:
The following topics were dealt with: document analysis and recognition; multimedia document processing; character recognition; document image processing; cheque processing; form processing; music processing; document segmentation; electronic documents; character classification; handwritten character recognition; information retrieval; postal automation; font recognition; Indian language OCR; handwriting recognition; performance evaluation; graphics recognition; oriental character recognition; and word recognition.
Abstract:
A new species of the shrub frog genus Raorchestes Biju, Souche, Dubois, Dutta and Bossuyt is described as Raorchestes kakachi sp. nov. from the Agastyamalai hill region in the southern Western Ghats, India. The small-sized Raorchestes (male: 24.7–25.8 mm, n = 3 and female: 24.3–34.1 mm, n = 3) is distinguished from all other known congeners by the following suite of characters: snout oval in dorsal view; tympanum indistinct; head wider than long; moderate webbing in feet; colour on dorsum varying from ivory to brown, blotches of dark brown on flanks, brown mottling on throat reducing towards vent; inner and outer surface of thigh, inner surface of shank and inner surface of tarsus with a distinct dark brown horizontal band which extends up to the first three toes on the upper surface. A detailed description, advertisement call features, ecology, natural history notes and a comparison with closely related species are provided for the new species.
Abstract:
The Indian region is presently the second region after the Neotropics in terms of diversity of phalangopsid crickets. Yet their study is impeded by the lack of necessary taxonomic tools for taxon identification. In the present paper, all generic diagnoses are clarified, using morphological and genitalic characters; female genitalia are described and illustrated for all genera with known females. New taxa are described from southern India: Kempiola flavipunctatus Desutter-Grandcolas n. sp., Opiliosina meridionalis Desutter-Grandcolas n. gen., n. sp., Phalangopsina bolivari Desutter-Grandcolas n. sp., P. chopardi Desutter-Grandcolas n. sp., P. gravelyi Desutter-Grandcolas n. sp., and Speluncasina Desutter-Grandcolas n. gen. The list of phalangopsid crickets from the Indian Region is updated, and a key to phalangopsid genera proposed. A lectotype and a paralectotype are designated to fix the name of Phalangopsina dubia (Bolivar, 1900). Opilionacris annandalei Chopard, 1928, previously transferred to the African genus Phaeophilacris Walker, 1871, is transferred to the genus Speluncasina Desutter-Grandcolas n. gen., while Larandopsis jharnae Bhowmik, 1981 and L. newguineae Bhowmik, 1981 described from New Guinea are transferred to the eneopterine genus Lebinthus Stal, 1877. Finally Luzaropsis confusa Chopard, 1969 is removed from its synonymy with L. ferruginea Walker, 1871.
Abstract:
A palindrome is a sequence of characters that reads the same forwards and backwards. Since the discovery of palindromic peptide sequences two decades ago, little effort has been made to understand their structural, functional and evolutionary significance. In view of this, an algorithm has been developed to identify all perfect palindromes (excluding the palindromic subset and tandem repeats) in a single protein sequence. The proposed algorithm does not impose any restriction on the number of residues in the input sequence. This algorithm will aid in the identification of palindromic peptide sequences of varying lengths in a single protein sequence.
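A palindrome search of this kind might be sketched as follows. This is not the published algorithm, only a minimal expand-around-centre illustration under stated assumptions. "Palindromic subset" is read as a shorter palindrome nested at the same centre inside a longer one, and "tandem repeat" is read narrowly as a run of a single residue. The function name `find_palindromes` and the `min_length` parameter are hypothetical.

```python
def find_palindromes(sequence, min_length=3):
    """Return (start_index, palindrome) pairs for maximal perfect palindromes.

    Only the longest palindrome around each centre is reported, so shorter
    palindromes nested at the same centre are skipped. Runs of one repeated
    residue (e.g. "AAAA") are also skipped (assumed meaning of tandem repeat).
    """
    seq = sequence.upper()
    n = len(seq)
    found = []
    # Centres 0..2n-2 cover both odd-length and even-length palindromes.
    for center in range(2 * n - 1):
        left = center // 2
        right = left + center % 2
        while left >= 0 and right < n and seq[left] == seq[right]:
            left -= 1
            right += 1
        start, end = left + 1, right  # half-open interval [start, end)
        fragment = seq[start:end]
        if len(fragment) >= min_length and len(set(fragment)) > 1:
            found.append((start, fragment))
    return found
```

Like the algorithm described in the abstract, this sketch places no limit on the input length. Its running time grows quadratically with the number of residues in the worst case.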
Abstract:
This paper describes a semi-automatic tool for annotation of multi-script text from natural scene images. To our knowledge, this is the first tool that deals with multi-script text or arbitrary orientation. The procedure involves manual seed selection followed by a region growing process to segment each word present in the image. The threshold for region growing can be varied by the user so as to ensure pixel-accurate character segmentation. The text present in the image is tagged word by word. A virtual keyboard interface has also been designed for entering the ground truth in ten Indic scripts, besides English. The keyboard interface can easily be generated for any script, thereby expanding the scope of the toolkit. Optionally, each segmented word can further be labeled into its constituent characters/symbols. Polygonal masks are used to split or merge the segmented words into valid characters/symbols. The ground truth is represented by a pixel-level segmented image and a '.txt' file that contains information about the number of words in the image, word bounding boxes, script and ground truth Unicode. The toolkit, developed using MATLAB, can be used to generate ground truth and annotation for any generic document image. Thus, it is useful for researchers in the document image processing community for evaluating the performance of document analysis and recognition techniques. The multi-script annotation toolkit (MAST) is available for free download.
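The seed-and-threshold step described above can be illustrated with a generic region-growing sketch. It is written in Python with NumPy rather than MATLAB and is not the toolkit's code. It assumes a grayscale image, 4-connectivity, and a comparison against the seed pixel's intensity. The function name `grow_region` is hypothetical.

```python
from collections import deque

import numpy as np


def grow_region(image, seed, threshold):
    """Return a boolean mask of pixels connected to `seed` whose intensity
    differs from the seed's intensity by at most `threshold`.

    `seed` is a (row, column) tuple chosen by the user.
    """
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    mask = np.zeros((rows, cols), dtype=bool)
    seed_value = img[seed]
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours (assumed 4-connectivity).
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr, nc]:
                if abs(img[nr, nc] - seed_value) <= threshold:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask
```

Raising `threshold` lets the region absorb more varied pixels and lowering it tightens the segmentation. This is the kind of user adjustment the abstract mentions.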
Abstract:
We propose a set of metrics that evaluate the uniformity, sharpness, continuity, noise, stroke width variance, pulse width ratio, transient pixel density, entropy and variance of components to quantify the quality of a document image. The measures are intended to be used in any optical character recognition (OCR) engine to estimate a priori the expected performance of the OCR. The suggested measures have been evaluated on many document images in different scripts. The quality of each document image is manually annotated by users to create a ground truth. The idea is to correlate the values of the measures with the user-annotated data. If the calculated measure matches the annotated description, then the metric is accepted; otherwise it is rejected. Of the proposed metrics, some are accepted and the rest are rejected. We have defined metrics that are easy to estimate. The metrics proposed in this paper are based on the feedback of home-grown OCR engines for Indic (Tamil and Kannada) languages. The metrics are independent of the scripts, and depend only on the quality and age of the paper and the printing. Experiments and results for each proposed metric are discussed. Actual recognition of the printed text is not performed to evaluate the proposed metrics. Sometimes, a document image containing broken characters is rated as a good document image by the evaluated metrics, which remains an unsolved challenge. The proposed measures work on grayscale document images and fail to provide reliable information on binarized document images.
Abstract:
Recent work on molecular phylogenetics of Scolopendridae from the Western Ghats, Peninsular India, has suggested the presence of six cryptic species of the otostigmine Digitipes Attems, 1930, together with three species described in previous taxonomic work by Jangi and Dass (1984). Digitipes is the correct generic attribution for a monophyletic group of Indian species, these being united with three species from tropical Africa (including the type) that share a distomedial process on the ultimate leg femur of males that is otherwise unknown in Otostigminae. Second maxillary characters previously used in the diagnosis of Digitipes are dismissed because Indian species do not possess the putatively diagnostic character states. Two new species from the Western Ghats that correspond to groupings identified based on monophyly, sequence divergence and coalescent analysis using molecular data are diagnosed based on distinct morphological characters. They are D. jangii and D. periyarensis n. spp. Three species named by Jangi and Dass (Digitipes barnabasi, D. coonoorensis and D. indicus) are revised based on new collections; D. indicus is a junior subjective synonym of Arthrorhabdus jonesii Verhoeff, 1938, the combination becoming Digitipes jonesii (Verhoeff, 1938) n. comb. The presence of Arthrorhabdus in India is accordingly refuted. Three putative species delimited by molecular and ecological data remain cryptic from the perspective of diagnostic morphological characters and are presently retained in D. barnabasi, D. jangii and D. jonesii. A molecularly-delimited species that resolved as sister group to a well-supported clade of Indian Digitipes is identified as Otostigmus ruficeps Pocock, 1890, originally described from a single specimen and revised herein. One Indian species originally assigned to Digitipes, D. gravelyi, deviates from confidently-assigned Digitipes with respect to several characters and is reassigned to Otostigmus, as O. gravelyi (Jangi and Dass, 1984) n. comb.
Abstract:
A new species of montane toad Duttaphrynus is described from Nagaland state of Northeast India. The new species is diagnosable based on the following combination of characters: absence of preorbital, postorbital and orbitotympanic ridges, elongated and broad parotid gland, first finger longer than second and presence of a mid-dorsal line. The tympanum is hidden under a skin fold (in male) or absent (in female). The species is compared with its congeners from India and Indo-China. We propose to consider Duttaphrynus wokhaensis as a junior synonym of Duttaphrynus melanostictus.
Abstract:
Sepsophis punctatus Beddome 1870, the only species of a monotypic genus, was described based on a single specimen from the Eastern Ghats of India. We rediscovered the species based on specimens from Odisha and Andhra Pradesh states, India, after a gap of 137 years, including four specimens from close to the type locality. The holotype was studied in detail, and we present additional morphological characters of the species with details on natural history, habitat and diet. The morphological characters of the holotype, along with two additional specimens collected by Beddome, are compared with the specimens collected by us. We also briefly discuss the distribution of other members of the subfamily Scincinae and their evolutionary affinities.