149 results for Tamil
Abstract:
This study examined the formation and operation of women's microfinance self-help groups in southern India and investigated whether the poorest of the poor women were accepted as members of those groups. The study found that caste was used as a selection criterion. Many eligible women excluded themselves from joining a self-help group because of age, poor health, poverty, lack of education and lack of trust in the system. The research revealed that self-help groups enhanced women's income and education, improved village infrastructure, and reduced household conflict. Factors that might prevent inclusion of the poorest of the poor in future microfinance programs were identified.
Abstract:
OBJECTIVE: The study of ethnically homogeneous populations may help to identify schizophrenia risk loci. The authors conducted a genomewide linkage scan for schizophrenia in an Indian population. METHOD: Participants were 441 individuals (262 affected probands and siblings) who were recruited primarily from one ethnically homogeneous group, the Tamil Brahmin caste, although individuals from other geographically proximal castes also participated. Genotyping of 124 affected sibling pair pedigrees was performed with 402 short tandem repeat polymorphisms. Linkage analyses were conducted using nonparametric exponential LOD (logarithm of the odds ratio for linkage) scores and parametric heterogeneity LOD scores. Parametric heterogeneity scores were calculated using simple dominant and recessive models, correcting for multiple statistics. The data were examined for evidence of consanguinity. Genomewide significance levels were determined using 10,000 gene dropping simulations. RESULTS: These findings revealed genomewide significant linkage to chromosome 1p31.1, through the use of both exponential and heterogeneity LOD scores, incorporating correction for multiple statistics and mild consanguinity. The estimated sibling recurrence risk associated with this putative locus was 1.95. Analysis for heterogeneity LOD scores also detected suggestive linkage to chromosomes 13q22.1 and 16q12.2. Using 117 tag single nucleotide polymorphisms (SNPs), family-based association analyses of phosphodiesterase 4B (PDE4B), the closest schizophrenia candidate gene, detected no convincing evidence of association, suggesting that the chromosome 1 peak represents a novel risk locus. CONCLUSIONS: This is the first study, to the authors' knowledge, to report significant linkage of schizophrenia to chromosome 1p31.1. Further investigation of this chromosome region in diverse populations is warranted to identify underlying sequence variants.
Abstract:
Simple formalized rules are proposed for automatic phonetic transcription of Tamil words into Roman script. These rules are syntax-directed and require only a one-symbol look-ahead facility, and hence are easily automated on a digital computer. Some suggestions are also put forth for the linearization of Tamil script so that it can be handled by modern machinery.
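As an illustration of why one symbol of look-ahead suffices for Tamil script, the sketch below romanizes a handful of characters. It is not the rule set proposed in the paper: the tables are a deliberately tiny, hypothetical subset, the romanization choices (for example "zh" and "aa") are assumptions, and it performs plain transliteration rather than full phonetic transcription. It relies only on the fact that a Tamil consonant carries an inherent "a" unless the next symbol is a pulli (vowel killer) or a dependent vowel sign.

```python
# Hypothetical, minimal tables (assumptions for illustration only).
CONSONANTS = {
    "\u0b95": "k",   # TAMIL LETTER KA
    "\u0ba4": "t",   # TAMIL LETTER TA
    "\u0baa": "p",   # TAMIL LETTER PA
    "\u0bae": "m",   # TAMIL LETTER MA
    "\u0bb2": "l",   # TAMIL LETTER LA
    "\u0bb4": "zh",  # TAMIL LETTER LLLA
}
VOWEL_SIGNS = {
    "\u0bbe": "aa",  # TAMIL VOWEL SIGN AA
    "\u0bbf": "i",   # TAMIL VOWEL SIGN I
    "\u0bc0": "ii",  # TAMIL VOWEL SIGN II
    "\u0bc1": "u",   # TAMIL VOWEL SIGN U
    "\u0bc2": "uu",  # TAMIL VOWEL SIGN UU
}
VOWELS = {
    "\u0b85": "a",   # TAMIL LETTER A
    "\u0b86": "aa",  # TAMIL LETTER AA
    "\u0b87": "i",   # TAMIL LETTER I
}
PULLI = "\u0bcd"     # TAMIL SIGN VIRAMA (suppresses the inherent vowel)


def transliterate(word):
    """Romanize `word` left to right, peeking at most one symbol ahead."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if nxt == PULLI:
                i += 2                      # bare consonant, consume the pulli
            elif nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])
                i += 2                      # consonant plus explicit vowel sign
            else:
                out.append("a")             # inherent vowel
                i += 1
        elif ch in VOWELS:
            out.append(VOWELS[ch])
            i += 1
        else:
            out.append(ch)                  # unknown symbol: pass through unchanged
            i += 1
    return "".join(out)


print(transliterate("\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"))  # the word "Tamil" in Tamil script
```

Because each decision depends only on the current symbol and the one after it, the loop needs no backtracking and runs in a single pass over the word.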
Abstract:
This paper addresses the problem of resolving ambiguities in frequently confused online Tamil character pairs by employing script-specific algorithms as a post-classification step. Robust structural cues and temporal information of the preprocessed character are extensively utilized in the design of these algorithms. The methods are quite robust in automatically extracting the discriminative sub-strokes of confused characters for further analysis. Experimental validation on the IWFHR Database indicates error rates of less than 3% for the confused characters. Thus, these post-processing steps have good potential to improve the performance of online Tamil handwritten character recognition.
Abstract:
In this paper, we propose a novel dexterous technique for fast and accurate recognition of online handwritten Kannada and Tamil characters. Based on the primary classifier output and prior knowledge, the best classifier is chosen from a set of three classifiers for second-stage classification. Prior knowledge is obtained through analysis of the confusion matrix of the primary classifier, which helped in identifying the multiple sets of confused characters. Further, studies were carried out to check the performance of the secondary classifiers in disambiguating among the confusion sets. Using this technique, we have achieved an average accuracy of 92.6% for Kannada characters on the MILE lab dataset and 90.2% for Tamil characters on the HP Labs dataset.
Abstract:
This paper presents a new application of two-dimensional Principal Component Analysis (2DPCA) to the problem of online character recognition in Tamil script. A novel set of features employing polynomial fits and quartiles, in combination with conventional features, is derived for each sample point of the Tamil character obtained after smoothing and resampling. These are stacked to form a matrix, from which a covariance matrix is constructed. A subset of the eigenvectors of the covariance matrix is employed to get the features in the reduced subspace. Each character is modeled as a separate subspace, and a modified form of the Mahalanobis distance is derived to classify a given test character. Results indicate that the recognition accuracy using the 2DPCA scheme shows an approximate 3% improvement over the conventional PCA technique.
Abstract:
This paper describes the efforts at MILE lab, IISc, to create a 100,000-word database each in Kannada and Tamil for the design and development of Online Handwritten Recognition. It has been collected from over 600 users in order to capture the variations in writing style. We describe features of the scripts and how the number of symbols was reduced to allow effective training for recognition. The list of words includes all the characters, Kannada and Indo-Arabic numerals, punctuation marks and other symbols. A semi-automated tool for the annotation of data from stroke to word level is used. It segments each word into stroke groups and also acts as a validation mechanism for segmentation. The tool displays the strokes, stroke groups and aksharas of a word and hence can be used to study the various styles of writing and delayed strokes, and to assign quality tags to the words. The tool is currently being used for annotating Tamil and Kannada data. The output is stored in a standard XML format.
Abstract:
We present a fractal coding method to recognize online handwritten Tamil characters and propose a novel technique to reduce the time taken for coding and decoding. This technique exploits the redundancy in data, thereby achieving better compression and lower memory usage. It also reduces the encoding time and causes little distortion during reconstruction. Experiments have been conducted to use these fractal codes to classify the online handwritten Tamil characters from the IWFHR 2006 competition dataset. In one approach, we use the fractal coding and decoding process. A recognition accuracy of 90% has been achieved by using DTW for distortion evaluation during the classification and encoding processes, as compared to 78% using a nearest neighbor classifier. In other experiments, we use the fractal code, fractal dimensions and features derived from fractal codes as features in separate classifiers. While the fractal code is successful as a feature, the other two features are not able to capture the wide within-class variations.
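For readers unfamiliar with DTW as a distortion measure, a minimal sketch of classical dynamic time warping between two pen trajectories follows. It shows only the generic alignment cost, not the fractal coding scheme of the paper; the function names `dtw_distance` and `nearest_template`, the Euclidean local cost and the toy templates are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classical DTW cost between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])     # Euclidean local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is matched again
                cost[i][j - 1],      # b advances, a[i-1] is matched again
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def nearest_template(sample, templates):
    """Return the label of the template with the smallest DTW cost."""
    label, _ = min(templates, key=lambda item: dtw_distance(sample, item[1]))
    return label


# Toy, made-up templates: a horizontal and a vertical stroke.
templates = [
    ("horizontal", [(0.0, 0.0), (1.0, 0.0)]),
    ("vertical", [(0.0, 0.0), (0.0, 1.0)]),
]
print(nearest_template([(0.0, 0.0), (0.9, 0.1)], templates))
```

Unlike a point-by-point distance, the warping path lets a repeated or slowly drawn segment align with a single template point, which is why two traces of different lengths can still have zero cost.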
Abstract:
In this paper, we compare the experimental results for Tamil online handwritten character recognition using HMM and Statistical Dynamic Time Warping (SDTW) as classifiers. HMM was used for a 156-class problem. Different feature sets and values for the HMM states and mixtures were tried, and the best combination was found to be 16 states and 14 mixtures, giving an accuracy of 85%. The features used in this combination were retained, and an SDTW model with 20 states and a single Gaussian was used as the classifier. Also, the symbol set was increased to include numerals, punctuation marks and special symbols like $, & and #, taking the number of classes to 188. It was found that, with a small addition to the feature set, this simple SDTW classifier performed on par with the more complicated HMM model, giving an accuracy of 84%. Mixture density estimation computations were reduced by a factor of 11. The recognition is writer independent, as the dataset used is quite large, with a variety of handwriting styles.
Abstract:
In this paper, we consider the problem of time series classification. Using piecewise linear interpolation, various novel kernels are obtained that can be used with support vector machines to design classifiers capable of deciding the class of a given time series. The approach is general and is applicable in many scenarios. We apply the method to the task of online Tamil handwritten character recognition with promising results.
Abstract:
N-gram language models and lexicon-based word recognition are popular methods in the literature to improve recognition accuracies of online and offline handwritten data. However, very few works deal with the application of these techniques to online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word (8%) recognition accuracies. Lexicon methods offer much greater improvements (30%) in word recognition, but they depend heavily on choosing the right lexicon. For comparison with lexicon and language model based methods, we have also explored re-evaluation techniques, which involve the use of expert classifiers to improve symbol and word recognition accuracies.
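One common way to apply a symbol-level bigram model is to rescore each symbol's candidate list with a Viterbi search, sketched below. This is a generic illustration rather than the procedure used in the paper: the function name `viterbi_rescore`, the `lm_weight` and `unseen_logp` parameters, and all probabilities in the usage example are made-up assumptions.

```python
import math


def viterbi_rescore(candidates, bigram_logp, lm_weight=1.0,
                    unseen_logp=math.log(1e-6)):
    """Pick the symbol sequence maximizing classifier + bigram log-scores.

    candidates:  one dict per symbol position, mapping symbol -> classifier
                 log-probability.
    bigram_logp: dict mapping (previous, current) -> log P(current | previous);
                 pairs missing from the table fall back to `unseen_logp`.
    """
    def transition(prev, sym):
        return lm_weight * bigram_logp.get((prev, sym), unseen_logp)

    best = dict(candidates[0])   # best path score ending in each symbol
    backpointers = []
    for position in candidates[1:]:
        new_best, pointers = {}, {}
        for sym, score in position.items():
            prev = max(best, key=lambda p: best[p] + transition(p, sym))
            new_best[sym] = best[prev] + transition(prev, sym) + score
            pointers[sym] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace the best path back from the highest-scoring final symbol.
    path = [max(best, key=best.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy example with made-up scores: the classifier slightly prefers "o" in
# the second position, but the bigram model favours "t" followed by "a".
candidates = [
    {"t": math.log(0.9), "n": math.log(0.1)},
    {"a": math.log(0.45), "o": math.log(0.55)},
]
bigram_logp = {("t", "a"): math.log(0.5), ("t", "o"): math.log(0.05)}
print(viterbi_rescore(candidates, bigram_logp))
```

Setting `lm_weight` to 0 recovers the classifier's own top choices, which makes it easy to measure how much of an accuracy change is due to the language model alone.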
Abstract:
The objective of the paper is to estimate the Safe Shutdown Earthquake (SSE) and Operating/Design Basis Earthquake (OBE/DBE) for the Nuclear Power Plant (NPP) site located at Kalpakkam, Tamil Nadu, India. The NPP is located at 12.558 degrees N, 80.175 degrees E, and a 500 km circular area around the NPP site is considered as the 'seismic study area' based on past regional earthquake damage distribution. The geology, seismicity and seismotectonics of the study area are studied, and a seismotectonic map is prepared showing the seismic sources and the past earthquakes. Earthquake data gathered from the literature are homogenized and declustered to form a complete earthquake catalogue for the seismic study area. The conventional maximum magnitude of each source is estimated considering the maximum observed magnitude (M-max(obs)) and/or the addition of 0.3 to 0.5 to M-max(obs). In this study, the maximum earthquake magnitude has been estimated by establishing a region's rupture character based on source length and associated M-max(obs). A final source-specific M-max is selected from the three M-max values by following logical criteria. To estimate the hazard at the NPP site, ten Ground-Motion Prediction Equations (GMPEs) valid for the study area are considered. These GMPEs are ranked based on Log-Likelihood (LLH) values. The top five GMPEs are considered to estimate the peak ground acceleration (PGA) for the site. The maximum PGA is obtained from three faults, which are designated as vulnerable sources and used to decide the magnitudes of the OBE and SSE. The average and normalized site-specific response spectrum is prepared considering the three vulnerable sources and is further used to establish the site-specific design spectrum at the NPP site.
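The abstract does not state how the LLH values are computed. The sketch below assumes one widely used definition, the negative average log2-likelihood of the observations under each model's predicted normal distribution, so that a lower LLH indicates a better fit. The model names, predicted means, sigma and observations are entirely synthetic and serve only to show the ranking step.

```python
import math


def normal_pdf(x, mean, sigma):
    """Density of a normal distribution N(mean, sigma^2) at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def llh(observed, predicted_means, sigma):
    """Assumed LLH: negative mean log2-likelihood of the observations."""
    total = sum(
        math.log2(normal_pdf(obs, mean, sigma))
        for obs, mean in zip(observed, predicted_means)
    )
    return -total / len(observed)


def rank_models(observed, predictions):
    """Sort model names by ascending LLH (best-fitting model first).

    predictions: dict mapping name -> (predicted means, sigma).
    """
    scores = {
        name: llh(observed, means, sigma)
        for name, (means, sigma) in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Synthetic ln(PGA) observations and two hypothetical models.
observed = [-2.0, -1.5, -1.0]
predictions = {
    "model_A": ([-2.1, -1.4, -1.0], 0.5),   # close to the observations
    "model_B": ([-3.0, -2.5, -2.0], 0.5),   # biased low by one unit
}
for name, score in rank_models(observed, predictions):
    print(name, round(score, 3))
```

Under this assumed definition, selecting the top five of ten GMPEs amounts to taking the first five entries of the sorted list.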
Abstract:
In this article, we aim at reducing the error rate of the online Tamil symbol recognition system by employing multiple experts to reevaluate certain decisions of the primary support vector machine classifier. Motivated by the relatively high percentage of occurrence of base consonants in the script, a reevaluation technique has been proposed to correct any ambiguities arising in the base consonants. Secondly, a dynamic time-warping method is proposed to automatically extract the discriminative regions for each set of confused characters. Class-specific features derived from these regions aid in reducing the degree of confusion. Thirdly, statistics of specific features are proposed for resolving any confusions in vowel modifiers. The reevaluation approaches are tested on two databases: (a) the isolated Tamil symbols in the IWFHR test set, and (b) the symbols segmented from a set of 10,000 Tamil words. The recognition rate of the isolated test symbols of the IWFHR database improves by 1.9%. For the word database, the incorporation of the reevaluation step improves the symbol recognition rate by 3.5% (from 88.4 to 91.9%). This, in turn, boosts the word recognition rate by 11.9% (from 65.0 to 76.9%). The reduction in the word error rate has been achieved using a generic approach, without the incorporation of language models.