11 results for Text analysis
at Indian Institute of Science - Bangalore - India
Abstract:
A torsional MEMS varactor with a wide dynamic range, lower actuation voltage, and isolation between actuation voltage and signal voltage was proposed in C. Venkatesh et al. (2005). In this paper we address the effects of pull-in, residual stress and continuous cycling on the performance of the torsional MEMS varactor.
Abstract:
Extraction of text areas from document images with complex content and layout is a challenging task. A few texture-based techniques have already been proposed for extraction of such text blocks. Most of these techniques are computationally expensive and hence are far from being realizable for real-time implementation. In this work, we propose a modification to two of the existing texture-based techniques to reduce the computation. This is accomplished with Harris corner detectors. The efficiency of these two texture-based algorithms, one based on Gabor filters and the other on the log-polar wavelet signature, is compared. A combination of Gabor-feature-based texture classification performed on a smaller set of Harris-corner-detected points is observed to deliver both accuracy and efficiency.
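The idea of restricting expensive texture analysis to Harris corner points can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 3x3 summation window, the constant k = 0.04 and the `fraction` threshold are all assumptions chosen for illustration, and the Gabor or log-polar wavelet feature step is omitted.

```python
import numpy as np

def box_sum(values, radius):
    """Sum `values` over a (2*radius+1)-square window, zero-padded at the borders."""
    padded = np.pad(values, radius, mode="constant")
    out = np.zeros_like(values)
    size = 2 * radius + 1
    for dr in range(size):
        for dc in range(size):
            out += padded[dr:dr + values.shape[0], dc:dc + values.shape[1]]
    return out

def harris_response(image, k=0.04, radius=1):
    """Harris corner response R = det(M) - k * trace(M)^2 at every pixel."""
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)          # derivatives along rows and columns
    sxx = box_sum(gx * gx, radius)       # structure tensor entries
    syy = box_sum(gy * gy, radius)
    sxy = box_sum(gx * gy, radius)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

def corner_points(image, fraction=0.5):
    """Return (row, col) positions whose response exceeds a fraction of the maximum."""
    response = harris_response(image)
    return np.argwhere(response > fraction * response.max())

# Example: a bright square on a dark background (hypothetical test image).
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
candidates = corner_points(img)  # texture features would be computed only here
```

In this sketch the response is positive near the corners of the square, negative along its straight edges and zero in flat regions, so only a small subset of pixels is passed on to the costlier texture classifier.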
Abstract:
An experimental charge density study of 2-chloro-4-fluorobenzoic acid and 4-fluorobenzamide has been carried out using high-resolution X-ray diffraction data collected at 100 K and the Hansen-Coppens multipolar formalism of electron density. These compounds display short Cl···F and F···F interactions, respectively. The experimental results are compared with theoretical charge densities derived from theoretical structure factors obtained from periodic quantum calculations at the B3LYP/6-31G** level. The topological features were derived from Bader's "atoms in molecules" (AIM) approach. The intermolecular Cl···F interaction in 2-chloro-4-fluorobenzoic acid is attractive in nature (type II interaction), while the F···F interactions in 4-fluorobenzamide show indications of a minor decrease in repulsion (type I interaction), though the extent of polarization on the fluorine atom is arguably small.
Abstract:
Electron Diffraction Structure Analysis (EDSA) with data from standard selected-area electron diffraction (SAED) is still the method of choice for structure determination of nano-sized single crystals. The recently determined heavy-atom structure α-Ti2Se (Albe & Weirich, 2003) is used as an example to illustrate the procedure developed for structure determination from two-dimensional SAED data via direct methods and kinematical least-squares refinement. Although the investigated crystallite had a relatively large effective thickness of about 230 Å, as determined from dynamical calculations, the structural model obtained from SAED data was found to be in good agreement with the result from an earlier single-crystal X-ray study (Weirich, Pöttgen & Simon, 1996). Arguments that support the validity of the quasi-kinematical approach used are given in the text. The influences of dynamical and secondary scattering on the quality of the data and the structure solution are discussed. Moreover, the usefulness of first-principles calculations for verifying the results from EDSA is demonstrated by two examples, one of which involves a structure that was unattainable by conventional X-ray diffraction.
Abstract:
Structural and charge density distribution studies have been carried out on single-crystal data of an ammonium borate, [C10H26N4][B5O6(OH)4]2, synthesized by a solvothermal method. Further, the experimentally observed geometry is used for theoretical charge density calculations at the B3LYP/6-31G** level of theory, and the results are compared with the experimental values. Topological analysis of the charge density based on the Atoms in Molecules approach shows that the B-O bonds exhibit mixed covalent/ionic character. Detailed analysis of the hydrogen bonds in the crystal structure of the ammonium borate provides insights into the reaction pathways that could result in the formation of borate minerals. The net atomic charges and electrostatic potential isosurfaces also give additional input to evaluate chemical and physical properties in such systems.
Abstract:
We present an irreversible watermarking approach that is robust to affine transform attacks on camera, biomedical and satellite images stored as monochrome bitmap images. The approach is based on image normalisation: both watermark embedding and extraction are carried out with respect to an image normalised to meet a set of predefined moment criteria. The normalisation procedure is invariant to affine transform attacks. The resulting watermarking scheme is suitable for public watermarking applications, where the original image is not available for watermark extraction. Here, a direct-sequence code division multiple access approach is used to embed multibit text information in the DCT and DWT transform domains. The proposed watermarking schemes are robust against various types of attacks such as Gaussian noise, shearing, scaling, rotation, flipping, affine transform, signal processing and JPEG compression. Performance is measured using image processing metrics.
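A minimal sketch of direct-sequence spreading for multibit embedding may help make the idea concrete. It is an illustration under stated assumptions, not the scheme from the paper: a flat vector stands in for the DCT or DWT coefficients, Walsh-Hadamard rows are assumed as the spreading codes, the helper names and the `strength` parameter are hypothetical, and the image normalisation step is left out.

```python
import numpy as np

def walsh_codes(order):
    """Sylvester construction of a 2**order x 2**order Hadamard matrix of +/-1 rows."""
    h = np.array([[1.0]])
    for _ in range(order):
        h = np.block([[h, h], [h, -h]])
    return h

def embed_bits(coeffs, bits, codes, strength):
    """Add one spreading code per bit; bit 1 adds the code, bit 0 subtracts it."""
    symbols = 2.0 * np.asarray(bits, dtype=float) - 1.0   # map 0/1 to -1/+1
    return coeffs + (strength * symbols) @ codes

def extract_bits(coeffs, codes):
    """Blind extraction: correlate with each code and take the sign."""
    return (codes @ coeffs > 0).astype(int)

# Hypothetical example: 8 bits spread over 16 stand-in transform coefficients.
codes = walsh_codes(4)[1:9]          # skip the all-ones row, keep 8 balanced codes
host = np.full(16, 5.0)              # constant stand-in for host coefficients
bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_bits(host, bits, codes, strength=0.5)
recovered = extract_bits(marked, codes)
```

Extraction here does not need the original coefficients, which mirrors the public watermarking setting described above. With real image coefficients the host signal also correlates with the codes, so a practical scheme would need longer codes or a larger embedding strength than this toy example.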
Abstract:
The topological features of a sporadic trifurcated C-H···O interaction region, where an oxygen atom acts as an acceptor of three weak hydrogen bonds, have been investigated by experimental and theoretical charge density analysis of ferulic acid. The interaction energy of the asymmetric molecular dimer formed by the trifurcated C-H···O motif, based on the multipolar model, is shown to be greater than that of the corresponding asymmetric O-H···O dimer in this crystal structure. Further, the hydrogen bond energies associated with these interaction motifs have been estimated from the local kinetic and potential energy densities at the bond critical points. The trends suggest that the interaction energy of the trifurcated C-H···O region is comparable to that of a single O-H···O hydrogen bond.
Abstract:
This paper describes a semi-automatic tool for annotation of multi-script text from natural scene images. To our knowledge, this is the first tool that deals with multi-script text or arbitrary orientation. The procedure involves manual seed selection followed by a region growing process to segment each word present in the image. The threshold for region growing can be varied by the user so as to ensure pixel-accurate character segmentation. The text present in the image is tagged word by word. A virtual keyboard interface has also been designed for entering the ground truth in ten Indic scripts, besides English. The keyboard interface can easily be generated for any script, thereby expanding the scope of the toolkit. Optionally, each segmented word can further be labeled into its constituent characters/symbols. Polygonal masks are used to split or merge the segmented words into valid characters/symbols. The ground truth is represented by a pixel-level segmented image and a '.txt' file that contains information about the number of words in the image, word bounding boxes, script and ground truth Unicode. The toolkit, developed using MATLAB, can be used to generate ground truth and annotation for any generic document image. Thus, it is useful for researchers in the document image processing community for evaluating the performance of document analysis and recognition techniques. The multi-script annotation toolkit (MAST) is available for free download.
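Seeded region growing with a user-adjustable threshold can be sketched as follows. The toolkit itself is written in MATLAB; this Python version is only an illustration, and the 4-connectivity, the comparison against the seed intensity and the function name `region_grow` are assumptions rather than details taken from the paper.

```python
from collections import deque

import numpy as np

def region_grow(image, seed, threshold):
    """Grow a 4-connected region from `seed` (row, col).

    A neighbouring pixel joins the region when its intensity differs from
    the seed intensity by at most `threshold`.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    mask = np.zeros((rows, cols), dtype=bool)
    seed_value = image[seed]
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr, nc]:
                if abs(image[nr, nc] - seed_value) <= threshold:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask

# Hypothetical 3x3 example: a dark blob in the top-left corner.
patch = np.array([[10, 10, 200],
                  [10, 12, 200],
                  [200, 200, 11]])
word_mask = region_grow(patch, seed=(0, 0), threshold=5)
```

Raising the threshold makes the region absorb more dissimilar pixels, which is the knob the annotator would adjust until the mask matches the word boundary. The isolated pixel with value 11 is similar in intensity but not connected to the seed, so it stays outside the mask.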
Abstract:
Scenic word images undergo degradations due to motion blur, uneven illumination, shadows and defocusing, which lead to difficulty in segmentation. As a result, the recognition results reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique in which we choose the middle row of the image as a sub-image and segment it first. The labels from this segmented sub-image are then propagated to the other pixels in the image. This approach, which is distinct from the existing methods, results in improved segmentation. Bayesian classification and max-flow methods have been independently used for label propagation. This midline-based approach limits the impact of degradations on the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition results of 64.5% and 71.6% are better than those of methods in the literature and of methods that competed in the Robust Reading competition. Our method makes an implicit assumption that degradation is not present in the middle row.
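The midline idea can be illustrated with a small sketch: segment the middle row first, then use the resulting classes to label every other pixel. This is only one plausible reading of the Bayesian variant, not the authors' method. It assumes a grayscale image, a two-class k-means split of the middle row, Gaussian class models with equal priors, and a hypothetical `min_std` floor to keep the models well defined; the max-flow variant is not shown.

```python
import numpy as np

def segment_from_midline(image, iterations=10, min_std=1.0):
    """Label every pixel 0 (darker class) or 1 (brighter class) using
    class statistics learned from the middle row only."""
    image = np.asarray(image, dtype=float)
    mid = image[image.shape[0] // 2]

    # Step 1: two-class k-means on the middle-row intensities.
    centers = np.array([mid.min(), mid.max()])
    labels = np.zeros(mid.size, dtype=int)
    for _ in range(iterations):
        labels = np.abs(mid[:, None] - centers[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = mid[labels == k].mean()

    # Step 2: Gaussian model per class, with a floor on the spread.
    stds = np.array([
        mid[labels == k].std() if np.any(labels == k) else 0.0
        for k in (0, 1)
    ])
    stds = np.maximum(stds, min_std)

    # Step 3: propagate labels by maximum likelihood (equal priors assumed).
    z = (image[..., None] - centers) / stds
    log_like = -np.log(stds) - 0.5 * z ** 2
    return log_like.argmax(axis=-1)

# Hypothetical example: a bright stroke in columns 2-3 with mild variation
# away from the middle row.
word = np.full((5, 6), 20.0)
word[:, 2:4] = 200.0
word[0, 0] = 40.0
word[4, 3] = 180.0
segmentation = segment_from_midline(word)
```

Because the class models come only from the middle row, the sketch inherits the assumption stated in the abstract: if the middle row itself is degraded, the propagated labels will be unreliable.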
Abstract:
The practice of Ayurveda, the traditional medicine of India, is based on the concept of three major constitutional types (Vata, Pitta and Kapha) defined as "Prakriti". To the best of our knowledge, no study has convincingly correlated genomic variations with the classification of Prakriti. In the present study, we performed genome-wide SNP (single nucleotide polymorphism) analysis (Affymetrix, 6.0) of 262 well-classified male individuals (after screening 3416 subjects) belonging to three Prakritis. We found that 52 SNPs (p ≤ 1 × 10^-5) were significantly different between Prakritis, without any confounding effect of stratification, after 10^6 permutations. Principal component analysis (PCA) of these SNPs classified the 262 individuals into their respective groups (Vata, Pitta and Kapha) irrespective of their ancestry, which demonstrates the power of these SNPs in categorization. We further validated our finding with 297 Indian population samples with known ancestry. Subsequently, we found that PGM1 correlates with the phenotype of Pitta as described in the ancient text of Caraka Samhita, suggesting that the phenotypic classification of India's traditional medicine has a genetic basis, and that its Prakriti-based practice, in vogue for many centuries, resonates with personalized medicine.
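The role of permutations in assessing a single SNP can be sketched as below. This is a generic two-group permutation test, not the study's pipeline: the 0/1/2 allele-count coding, the difference-in-means statistic, the two-group comparison and the function name are all assumptions, and the study's genome-wide correction and three-group design are not reproduced.

```python
import numpy as np

def permutation_p_value(group_a, group_b, n_permutations=1000, seed=0):
    """Two-sided permutation p-value for a difference in mean allele count."""
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)   # random reassignment of group labels
        stat = abs(shuffled[:a.size].mean() - shuffled[a.size:].mean())
        if stat >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical allele counts (0, 1 or 2 copies of the minor allele).
p_separated = permutation_p_value([2] * 10, [0] * 10)
p_identical = permutation_p_value([1] * 10, [1] * 10)
```

The smallest p-value such a test can report is about 1 / (n_permutations + 1), which is why a threshold of 10^-5 calls for on the order of 10^6 permutations.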
Abstract:
Computer Assisted Assessment (CAA) has existed for several years now. While some forms of CAA do not require sophisticated text understanding (e.g., multiple choice questions), there are also student answers that consist of free text and require analysis of the text in the answer. Research on the latter has to date concentrated on two main sub-tasks: (i) grading of essays, which is done mainly by checking the style, correctness of grammar, and coherence of the essay, and (ii) assessment of short free-text answers. In this paper, we present a structured view of relevant research in automated assessment techniques for short free-text answers. We review papers spanning the last 15 years of research, with emphasis on recent papers. Our main objectives are twofold. First, we present the survey in a structured way by segregating information on dataset, problem formulation, techniques, and evaluation measures. Second, we present a discussion of some potential future directions in this domain, which we hope will be helpful for researchers.