12 results for Word Sense Disambiguation

at Indian Institute of Science - Bangalore - India


Relevance:

20.00% 20.00%

Abstract:

This article covers two aspects of non-segmented negative-sense RNA viruses. The first section discusses the strategy these viruses use to replicate their genomes, which helps in understanding the later section on their use as vaccine vectors. The replication strategy encompasses genome transcription and genome replication, both carried out by the same RNA-dependent RNA polymerase complex. Chandipura virus, a member of the prototype rhabdovirus family, is used as an example to illustrate the complex nature of the two processes and their regulation. The discussion of these viruses as vectors for vaccine antigen genes emphasizes the progress made in using attenuated viruses as vectors and describes the systems in which the efficiency of immune responses has been tested.

Relevance:

20.00% 20.00%

Abstract:

We describe the on-going design and implementation of a sensor network for agricultural management targeted at resource-poor farmers in India. Our focus on semi-arid regions led us to concentrate on water-related issues. Throughout 2004, we carried out a survey on the information needs of the population living in a cluster of villages in our study area. The results highlighted the potential that environment-related information has for the improvement of farming strategies in the face of highly variable conditions, in particular for risk management strategies (choice of crop varieties, sowing and harvest periods, prevention of pests and diseases, efficient use of irrigation water etc.). This leads us to advocate an original use of Information and Communication Technologies (ICT). We believe our demand-driven approach for the design of appropriate ICT tools that are targeted at the resource-poor to be relatively new. In order to go beyond a pure technocratic approach, we adopted an iterative, participatory methodology.

Relevance:

20.00% 20.00%

Abstract:

In this paper, we propose a novel heuristic approach to segment recognizable symbols from online Kannada word data and perform recognition of the entire word. Two different estimates of the first derivative are extracted from the preprocessed stroke groups and used as features for classification. Estimate 2 proved better, resulting in 88% accuracy, which is 3% more than that achieved with estimate 1. Classification is performed by a statistical dynamic space warping (SDSW) classifier, which uses the X and Y coordinates and their first derivatives as features. The classifier is trained with data from 40 writers. It handles 295 classes covering Kannada aksharas, Kannada numerals, Indo-Arabic numerals, punctuation marks and other special symbols such as $ and #. The classification accuracies obtained are 88% at the akshara level and 80% at the word level, which shows the scope for further improvement in the segmentation algorithm.
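
As a rough illustration of the feature and matching ideas mentioned in this abstract, the sketch below computes one possible first-derivative estimate along a stroke and compares two feature sequences with plain dynamic time warping (DTW). It rests on assumptions: the abstract does not say how its two derivative estimates are defined, so a central difference is used here, and standard DTW stands in for the SDSW classifier, whose details are not given. All function names are hypothetical.

```python
import math

def first_derivative(points):
    """Central-difference estimate of (dx, dy) per point; one-sided at the ends.

    This is an assumed estimate, not one of the two used in the paper.
    """
    n = len(points)
    derivs = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        span = (hi - lo) or 1  # avoid division by zero for a single point
        derivs.append(((points[hi][0] - points[lo][0]) / span,
                       (points[hi][1] - points[lo][1]) / span))
    return derivs

def features(points):
    """Build (x, y, dx, dy) feature vectors for a stroke given as (x, y) pairs."""
    return [(x, y, dx, dy)
            for (x, y), (dx, dy) in zip(points, first_derivative(points))]

def dtw_distance(a, b):
    """Plain DTW cost between two sequences of equal-dimension vectors."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip a point in b
                                 cost[i][j - 1],      # skip a point in a
                                 cost[i - 1][j - 1])  # align both points
    return cost[len(a)][len(b)]
```

A nearest-template classifier would call `dtw_distance(features(stroke), features(template))` for each class template and pick the smallest cost.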

Relevance:

20.00% 20.00%

Abstract:

Parallel sub-word recognition (PSWR) is a new model proposed for language identification (LID) that does not need elaborate phonetic labeling of the speech data in a foreign language. The new approach performs a front-end tokenization in terms of sub-word units, which are designed by automatic segmentation, segment clustering and segment HMM modeling. We develop PSWR-based LID in a framework similar to the parallel phone recognition (PPR) approach in the literature. This includes a front-end tokenizer and a back-end language model for each language to be identified. Considering various combinations of the statistical evaluation scores, we find that PSWR can perform as well as PPR, even with broad acoustic sub-word tokenization, making it an efficient alternative to the PPR system.

Relevance:

20.00% 20.00%

Abstract:

Network Intrusion Detection Systems (NIDS) intercept the traffic at an organization's network periphery to thwart intrusion attempts. A signature-based NIDS compares the intercepted packets against its database of known vulnerabilities and malware signatures to detect such cyber attacks. These signatures are represented using Regular Expressions (REs) and strings. Regular Expressions, because of their higher expressive power, are preferred over simple strings for writing these signatures. We present the Cascaded Automata Architecture to perform memory-efficient Regular Expression pattern matching using existing string matching solutions. The proposed architecture performs two-stage Regular Expression pattern matching. We replace the substring and character class components of the Regular Expression with new symbols, and we address the challenges involved in this approach. We augment the Word-based Automata, obtained from the rewritten Regular Expressions, with counter-based states and length-bound transitions to perform Regular Expression pattern matching. We evaluated our architecture on Regular Expressions taken from Snort rulesets. We were able to reduce the number of automata states by 50% to 85%. Additionally, we could reduce the number of transitions by a factor of 3, leading to a further reduction in memory requirements.
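
The two-stage idea in this abstract can be illustrated with a deliberately simplified sketch. It assumes signatures of one restricted shape only: plain literal substrings separated by bounded gaps written as `.{m,n}`, with no other metacharacters. Stage 1 uses ordinary string search to locate each literal, and stage 2 checks the gap bounds between consecutive literals, loosely mirroring length-bound transitions. This is not the authors' architecture, and all function names are hypothetical.

```python
import re

def parse_signature(sig):
    """Split e.g. 'abc.{0,5}xyz' into literals and (min, max) gap bounds."""
    # re.split with two capture groups yields [lit, lo, hi, lit, lo, hi, ..., lit].
    parts = re.split(r"\.\{(\d+),(\d+)\}", sig)
    literals = parts[0::3]
    gaps = [(int(lo), int(hi)) for lo, hi in zip(parts[1::3], parts[2::3])]
    return literals, gaps

def find_all(text, word):
    """Stage 1 stand-in: start offsets of every (possibly overlapping) occurrence."""
    hits, start = [], text.find(word)
    while start != -1:
        hits.append(start)
        start = text.find(word, start + 1)
    return hits

def matches(sig, text):
    """Stage 2: chain literal hits, enforcing the gap bounds between them."""
    literals, gaps = parse_signature(sig)
    # End offsets reachable after matching the first literal.
    ends = {s + len(literals[0]) for s in find_all(text, literals[0])}
    for word, (lo, hi) in zip(literals[1:], gaps):
        ends = {s + len(word) for s in find_all(text, word)
                if any(lo <= s - e <= hi for e in ends)}
    return bool(ends)
```

For example, `matches("abc.{0,5}xyz", "..abc12xyz")` reports a match because the gap between `abc` and `xyz` is two characters, which lies within the stated bounds.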

Relevance:

20.00% 20.00%

Abstract:

In this paper, we discuss the issues related to word recognition in born-digital word images. We introduce a novel method of power-law transformation on the word image for binarization. We show the improvement in image binarization and the consequent increase in the recognition performance of the OCR engine on the word image. The optimal value of gamma for a word image is chosen automatically by our algorithm with a fixed stroke width threshold. We have tested our algorithm extensively by varying the gamma and stroke width threshold values. By varying the gamma value, we found that our algorithm performed better than the results reported in the literature. On the ICDAR Robust Reading Systems Challenge-1: Word Recognition Task on the born-digital dataset, we achieved 82.9% using Omnipage OCR applied to the images after processing by our algorithm, compared with the recognition rate of 61.5% achieved by TH-OCR after suitable pre-processing by Yang et al. and 63.4% by ABBYY Fine Reader (used as the baseline by the competition organizers without any preprocessing).
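
For readers unfamiliar with the power-law (gamma) transform mentioned above, here is a minimal sketch on an 8-bit grayscale image stored as nested lists. It shows only the transform s = 255 * (r / 255) ** gamma followed by a fixed global threshold. The abstract's automatic gamma selection and stroke width criterion are not described in enough detail to reproduce, so they are omitted, and the threshold of 128 is an arbitrary assumption. Function names are hypothetical.

```python
def power_law(image, gamma):
    """Apply s = 255 * (r / 255) ** gamma to each pixel of an 8-bit image."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in image]

def binarize(image, threshold=128):
    """Fixed global threshold (an assumed stand-in for the paper's binarization)."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]

# Example: gamma < 1 brightens mid-tones, which can change the binarized result.
word_image = [[30, 64, 200],
              [64, 90, 250]]
enhanced = power_law(word_image, 0.5)
binary = binarize(enhanced)
```

A gamma below 1 lifts dark and mid-tone pixels, while a gamma above 1 suppresses them, which is why the choice of gamma affects how cleanly text separates from background.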

Relevance:

20.00% 20.00%

Abstract:

N-gram language models and lexicon-based word recognition are popular methods in the literature for improving recognition accuracies on online and offline handwritten data. However, very few works apply these techniques to online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word recognition (8%) accuracies. Lexicon methods offer much greater improvements (30%) in word recognition, but depend heavily on choosing the right lexicon. For comparison with lexicon and language model based methods, we have also explored re-evaluation techniques which involve the use of expert classifiers to improve symbol and word recognition accuracies.
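
The way a symbol-level bigram model can rescore recognizer output is sketched below. This is an illustrative assumption, not the authors' system: it uses add-one smoothing, treats each character of a string as one symbol (real Tamil symbols may span several code points), and combines the recognizer score and language model score by a simple weighted sum. All names and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count symbol unigrams and bigrams, with start and end markers."""
    unigrams, bigrams = Counter(), Counter()
    for word in corpus:
        symbols = ["<s>"] + list(word) + ["</s>"]
        unigrams.update(symbols[:-1])              # contexts only
        bigrams.update(zip(symbols, symbols[1:]))
    vocab = {s for w in corpus for s in w} | {"<s>", "</s>"}
    return unigrams, bigrams, len(vocab)

def log_prob(word, model):
    """Add-one smoothed bigram log-probability of a symbol sequence."""
    unigrams, bigrams, v = model
    symbols = ["<s>"] + list(word) + ["</s>"]
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + v))
               for a, b in zip(symbols, symbols[1:]))

def rescore(candidates, model, weight=1.0):
    """Pick the best of (word, recognizer_log_score) pairs after adding LM score."""
    return max(candidates,
               key=lambda c: c[1] + weight * log_prob(c[0], model))[0]

model = train_bigram(["abab", "abba", "baba"])
best = rescore([("ab", -1.0), ("zz", -1.0)], model)
```

With equal recognizer scores, the candidate whose symbol transitions were seen in the training corpus wins, which is the effect a bigram model is meant to have on ambiguous recognizer output.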

Relevance:

20.00% 20.00%

Abstract:

We have benchmarked the maximum obtainable recognition accuracy on five publicly available standard word image data sets using semi-automated segmentation and a commercial OCR. These images have been cropped from camera-captured scene images, born-digital images (BDI) and street view images. Using a MATLAB-based tool developed by us, we have annotated more than 3600 word images from the five data sets at the pixel level. The word images binarized by the tool, as well as by our own midline analysis and propagation of segmentation (MAPS) algorithm, are recognized using the trial version of Nuance Omnipage OCR, and these two results are compared with the best reported in the literature. The benchmark word recognition rates obtained on the ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011 data sets are 83.9%, 89.3%, 79.6%, 88.5% and 86.7%, respectively. The results obtained from MAPS-binarized word images without the use of any lexicon are 64.5% and 71.7% for ICDAR 2003 and 2011, respectively, and these values are higher than the best reported values in the literature of 61.1% and 41.2%, respectively. The MAPS result of 82.8% for the BDI 2011 dataset matches the performance of the state-of-the-art method based on the power-law transform.

Relevance:

20.00% 20.00%

Abstract:

Plants produce volatile organic compounds (VOCs) in a variety of contexts that include response to abiotic and biotic stresses, attraction of pollinators and parasitoids, and repulsion of herbivores. Some of these VOCs may also exhibit diel variation in emission. In Ficus racemosa, we examined variation in VOCs released by fig syconia throughout syconium development and between day and night. Syconia are globular enclosed inflorescences that serve as developing nurseries for pollinating and parasitic fig wasps. Syconia are attacked by gallers early in their development, serviced by pollinators in mid phase, and are attractive to parasitoids in response to the development of gallers at later stages. VOC bouquets of the different development phases of the syconium were distinctive, as were their day and night VOC profiles. VOCs such as alpha-muurolene were characteristic of the pollen-receptive diurnal phase, and may serve to attract the diurnally-active pollinating wasps. Diel patterns of release of volatiles could not be correlated with their predicted volatility as determined by Henry's law constants at ambient temperatures. Therefore, factors other than Henry's law constant such as stomatal conductance or VOC synthesis must explain diel variation in VOC emission. A novel use of weighted gene co-expression network analysis (WGCNA) on the volatilome resulted in seven distinct modules of co-emitted VOCs that could be interpreted on the basis of syconium ecology. Some modules were characterized by the response of fig syconia to early galling by parasitic wasps and consisted largely of green leaf volatiles (GLVs). Other modules, that could be characterized by a combination of syconia response to oviposition and tissue feeding by larvae of herbivorous galler pollinators as well as of parasitized wasps, consisted largely of putative herbivore-induced plant volatiles (HIPVs). 
We demonstrated the usefulness of WGCNA analysis of the volatilome in making sense of the scents produced by the syconia at different stages and diel phases of their development.

Relevance:

20.00% 20.00%

Abstract:

In this paper, we report a breakthrough result on the difficult task of segmentation and recognition of coloured text from the word image dataset of ICDAR robust reading competition challenge 2: reading text in scene images. We split the word image into individual colour, gray and lightness planes and enhance the contrast of each of these planes independently by a power-law transform. The discrimination factor of each plane is computed as the maximum between-class variance used in Otsu thresholding. The plane that has the maximum discrimination factor is selected for segmentation. The trial version of Omnipage OCR is then used on the binarized words for recognition. Our recognition results on the ICDAR 2011 and ICDAR 2003 word datasets are compared with those reported in the literature. As a baseline, the images binarized by simple global and local thresholding techniques were also recognized. The word recognition rates obtained by our non-linear enhancement and selection of plane method are 72.8% and 66.2% for the ICDAR 2011 and 2003 word datasets, respectively. We have created ground truth for each image at the pixel level to benchmark these datasets using a toolkit developed by us. The recognition rates of the benchmarked images are 86.7% and 83.9% for the ICDAR 2011 and 2003 datasets, respectively.
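
The plane-selection step described above can be sketched as follows. Each plane is assumed to be a flat list of 8-bit pixel values, and the discrimination factor is taken to be the maximum Otsu between-class variance over all thresholds, as the abstract states. The power-law contrast enhancement is omitted here, and the plane names and function names are hypothetical.

```python
def max_between_class_variance(pixels, levels=256):
    """Return (max between-class variance, threshold) over all Otsu thresholds."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_var, best_t = 0.0, 0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]            # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0          # pixels with value > t
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (total_sum - sum0) / w1
        var = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_var, best_t

def select_plane(planes):
    """Pick the plane name with the largest discrimination factor."""
    return max(planes, key=lambda name: max_between_class_variance(planes[name])[0])

planes = {
    "gray": [100, 110, 100, 110],   # low contrast between the two classes
    "red": [0, 255, 0, 255],        # high contrast between the two classes
}
chosen = select_plane(planes)
```

The chosen plane would then be binarized at the threshold returned by `max_between_class_variance` before being passed to the OCR engine.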

Relevance:

20.00% 20.00%

Abstract:

G-Quadruplexes occupy important regulatory regions in the genome. DNA G-quadruplexes in the promoter regions and RNA quadruplexes in the UTRs (untranslated regions) have been individually studied and variously implicated at different regulatory levels of gene expression. However, the formation of G-quadruplexes in the sense and antisense strands and their corresponding roles in gene regulation have not been studied in much detail. In the present study, we have elucidated the effect of strand asymmetry in this context. Using biophysical methods, we have demonstrated the formation of stable G-quadruplex structure in vitro using CD and UV melting. Additionally, ITC was employed to demonstrate that a previously reported selective G-quadruplex ligand was able to bind and stabilize the G-quadruplex in the present sequence. Further, we have shown using reporter constructs that although the DNA G-quadruplex in either strand can reduce translation efficiency, transcriptional regulation differs when G-quadruplex is present in the sense or antisense strand. We demonstrate that the G-quadruplex motif in the antisense strand substantially inhibits transcription, while when in the sense strand, it does not affect transcription, although it does ultimately reduce translation. Further, it is also shown that the G-quadruplex stabilizing ligand can enhance this asymmetric transcription regulation as a result of the increased stabilization of the G-quadruplex.

Relevance:

20.00% 20.00%

Abstract:

The nonstructural protein NSs, encoded by the S RNA of groundnut bud necrosis virus (GBNV) (genus Tospovirus, family Bunyaviridae), has earlier been shown to possess nucleic-acid-stimulated NTPase and 5′ α phosphatase activity. ATP hydrolysis is an essential function of a true helicase. Therefore, NSs was tested for DNA helicase activity. The results demonstrated that GBNV NSs possesses bidirectional DNA helicase activity. An alanine mutation in the Walker A motif (K189A rNSs) decreased DNA helicase activity substantially, whereas a mutation in the Walker B motif resulted in a marginal decrease in this activity. The parallel loss of the helicase and ATPase activity in the K189A mutant confirms that NSs acts as a non-canonical DNA helicase. Furthermore, both the wild-type and K189A NSs could function as RNA silencing suppressors, demonstrating that the suppressor activity of NSs is independent of its helicase or ATPase activity. This is the first report of a true helicase from a negative-sense RNA virus.