966 results for Prime OCR
Abstract:
Luteal insufficiency affects fertility, and hence the study of mechanisms that regulate corpus luteum (CL) function is of prime importance in overcoming infertility problems. Exploration of the human genome sequence has helped to study the frequency of single nucleotide polymorphisms (SNPs). The clinical benefits of screening SNPs in infertility have been increasingly recognized in recent times. Examining SNPs in genes associated with maintenance and regression of the CL may help to understand unexplained luteal insufficiency and related infertility. Publicly available microarray gene expression databases reveal the global gene expression patterns in primate CL during different functional states. We intend to explore computationally the deleterious SNPs of human genes reported to be common targets of luteolysin and luteotropin in primate CL. Different computational algorithms were used to dissect out the functional significance of SNPs in the luteinizing hormone sensitive genes. The results raise the possibility that screening for SNPs might be integrated into the evaluation of luteal insufficiency associated with human female infertility in future studies. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
This paper describes a semi-automatic tool for annotation of multi-script text from natural scene images. To our knowledge, this is the first tool that deals with multi-script text or arbitrary orientation. The procedure involves manual seed selection followed by a region growing process to segment each word present in the image. The threshold for region growing can be varied by the user so as to ensure pixel-accurate character segmentation. The text present in the image is tagged word-by-word. A virtual keyboard interface has also been designed for entering the ground truth in ten Indic scripts, besides English. The keyboard interface can easily be generated for any script, thereby expanding the scope of the toolkit. Optionally, each segmented word can further be labeled into its constituent characters/symbols. Polygonal masks are used to split or merge the segmented words into valid characters/symbols. The ground truth is represented by a pixel-level segmented image and a '.txt' file that contains information about the number of words in the image, word bounding boxes, script and ground truth Unicode. The toolkit, developed using MATLAB, can be used to generate ground truth and annotation for any generic document image. Thus, it is useful for researchers in the document image processing community for evaluating the performance of document analysis and recognition techniques. The multi-script annotation toolkit (MAST) is available for free download.
Abstract:
We propose a set of metrics that evaluate the uniformity, sharpness, continuity, noise, stroke width variance, pulse width ratio, transient pixel density, entropy and variance of components to quantify the quality of a document image. The measures are intended to be used in any optical character recognition (OCR) engine to estimate a priori the expected performance of the OCR. The suggested measures have been evaluated on many document images in different scripts. The quality of each document image is manually annotated by users to create a ground truth. The idea is to correlate the values of the measures with the user-annotated data. If the calculated measure matches the annotated description, then the metric is accepted; otherwise it is rejected. Of the proposed metrics, some are accepted and the rest are rejected. We have defined metrics that are easy to estimate. The metrics proposed in this paper are based on feedback from home-grown OCR engines for Indic (Tamil and Kannada) languages. The metrics are independent of the script, and depend only on the quality and age of the paper and the printing. Experiments and results for each proposed metric are discussed. Actual recognition of the printed text is not performed to evaluate the proposed metrics. Sometimes, a document image containing broken characters is rated as a good document image by the evaluated metrics, which remains an unsolved challenge. The proposed measures work on gray scale document images and fail to provide reliable information on binarized document images.
Abstract:
Combating stress is one of the prime requirements for any organism. For parasitic microbes, stress levels are highest during growth inside the host. Their survival depends on their ability to acclimatize and adapt to new environmental conditions. Robust cellular machinery for stress response is, therefore, both critical and essential, especially for pathogenic microorganisms. Microbes have cleverly exploited stress proteins as virulence factors for pathogenesis in their hosts. Owing to its ability to sense and respond to stress conditions, heat shock protein 90 (Hsp90) is one of the key stress proteins utilized by parasitic microbes. There is growing evidence for the critical role played by Hsp90 in the growth of pathogenic organisms like Candida, Giardia, Plasmodium, Trypanosoma, and others. This review, therefore, explores the potential of exploiting Hsp90 as a target for the treatment of infectious diseases. This molecular chaperone has already gained attention as an effective anti-cancer drug target. As a result, a lot of research has been done at laboratory, preclinical and clinical levels on several Hsp90 inhibitors as potential anti-cancer drugs. In addition, a lot of data pertaining to toxicity studies, pharmacokinetic and pharmacodynamic studies, dosage regimes, drug-related toxicities, dose-limiting toxicities and adverse drug reactions are available for Hsp90 inhibitors. Therefore, repurposing/repositioning strategies are also being explored for those compounds that have gone through advanced-stage clinical trials. This review presents a comprehensive summary of the current status of development of Hsp90 as a drug target and its inhibitors as candidate anti-infectives. Particular emphasis is laid on the possibility of repositioning strategies, coupled with the pharmaceutical solutions required to meet the needs of the ever-growing infectious disease pharmaceutical market.
Abstract:
In this paper, we discuss the issues related to word recognition in born-digital word images. We introduce a novel method of power-law transformation on the word image for binarization. We show the improvement in image binarization and the consequent increase in the recognition performance of an OCR engine on the word image. The optimal value of gamma for a word image is automatically chosen by our algorithm with a fixed stroke width threshold. We have experimented exhaustively with our algorithm by varying the gamma and stroke width threshold values. By varying the gamma value, we found that our algorithm performed better than the results reported in the literature. On the ICDAR Robust Reading Systems Challenge-1: Word Recognition Task on the born-digital dataset, compared with the recognition rate of 61.5% achieved by TH-OCR after suitable pre-processing by Yang et al. and 63.4% by ABBYY Fine Reader (used as the baseline by the competition organizers without any preprocessing), we achieved 82.9% using Omnipage OCR applied to the images after processing by our algorithm.
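The effect of a power-law (gamma) transform on a subsequent binarization can be illustrated with a minimal Python sketch. The fixed global threshold of 0.5, the toy image, and the function names `power_law` and `binarize` are assumptions made for illustration only. The abstract does not state the thresholding rule or how gamma is selected from the stroke width criterion, so neither is reproduced here.

```python
def power_law(gray, gamma):
    """Apply s = r**gamma to 8-bit gray levels, with r scaled to [0, 1]."""
    return [[(p / 255.0) ** gamma for p in row] for row in gray]


def binarize(norm, threshold=0.5):
    """Global threshold on normalised intensities (assumed, not from the paper)."""
    return [[1 if p >= threshold else 0 for p in row] for row in norm]


# Toy 2x4 "word image": dim strokes (100) on a darker background (40).
image = [[40, 100, 100, 40],
         [40, 100, 40, 40]]

for gamma in (1.0, 0.5):
    # gamma = 1.0 leaves every pixel below the threshold;
    # gamma = 0.5 lifts the strokes above it while the background stays below.
    print(gamma, binarize(power_law(image, gamma)))
```

With gamma below 1 the darker gray levels are stretched apart, which is why the strokes separate from the background in this toy case.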
Abstract:
Scenic word images undergo degradations due to motion blur, uneven illumination, shadows and defocusing, which make segmentation difficult. As a result, the recognition results reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique, where we choose the middle row of the image as a sub-image and segment it first. Then, the labels from this segmented sub-image are propagated to the other pixels in the image. This approach, which is distinct from the existing methods, results in improved segmentation. Bayesian classification and Max-flow methods have been independently used for label propagation. This midline-based approach limits the impact of the degradations that affect the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition results of 64.5% and 71.6% are better than those of methods in the literature and of methods that competed in the Robust Reading competition. Our method makes an implicit assumption that degradation is not present in the middle row.
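The midline idea can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the middle row is split at the midpoint of its intensity range, and labels are propagated by assigning each pixel to the nearer class mean. That nearest-mean rule is a simple stand-in for the Bayesian classification and Max-flow propagation named in the abstract, not a reproduction of either. The function names and the toy image are hypothetical.

```python
def segment_midline(row):
    """Two-class split of the middle row about the midpoint of its range."""
    cut = (min(row) + max(row)) / 2.0
    return [1 if p > cut else 0 for p in row]


def propagate(image):
    """Label every pixel using class means learned from the middle row only."""
    mid = image[len(image) // 2]
    labels = segment_midline(mid)
    means = []
    for cls in (0, 1):
        # Assumes the middle row contains pixels of both classes.
        members = [p for p, lab in zip(mid, labels) if lab == cls]
        means.append(sum(members) / len(members))
    return [[0 if abs(p - means[0]) <= abs(p - means[1]) else 1 for p in row]
            for row in image]


# Toy 3x4 gray image: bright strokes on a dark background.
image = [[20, 30, 210, 25],
         [22, 200, 205, 28],
         [25, 190, 35, 30]]

print(propagate(image))
```

The sketch inherits the limitation stated in the abstract: if the middle row is itself degraded, the class statistics learned from it are unreliable.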
Abstract:
We have benchmarked the maximum obtainable recognition accuracy on five publicly available standard word image data sets using semi-automated segmentation and a commercial OCR. These images have been cropped from camera-captured scene images, born-digital images (BDI) and street view images. Using the MATLAB-based tool developed by us, we have annotated more than 3600 word images from the five data sets at the pixel level. The word images binarized by the tool, as well as by our own midline analysis and propagation of segmentation (MAPS) algorithm, are recognized using the trial version of Nuance Omnipage OCR, and these two results are compared with the best reported in the literature. The benchmark word recognition rates obtained on the ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011 data sets are 83.9%, 89.3%, 79.6%, 88.5% and 86.7%, respectively. The results obtained from MAPS-binarized word images without the use of any lexicon are 64.5% and 71.7% for ICDAR 2003 and 2011, respectively, and these values are higher than the best reported values in the literature of 61.1% and 41.2%, respectively. The MAPS result of 82.8% on the BDI 2011 dataset matches the performance of the state-of-the-art method based on the power-law transform.
Abstract:
In this paper, we report a breakthrough result on the difficult task of segmentation and recognition of coloured text from the word image dataset of ICDAR robust reading competition challenge 2: reading text in scene images. We split the word image into individual colour, gray and lightness planes and enhance the contrast of each of these planes independently by a power-law transform. The discrimination factor of each plane is computed as the maximum between-class variance used in Otsu thresholding. The plane that has the maximum discrimination factor is selected for segmentation. The trial version of Omnipage OCR is then used on the binarized words for recognition. Our recognition results on the ICDAR 2011 and ICDAR 2003 word datasets are compared with those reported in the literature. As a baseline, the images binarized by simple global and local thresholding techniques were also recognized. The word recognition rate obtained by our non-linear enhancement and plane selection method is 72.8% and 66.2% for the ICDAR 2011 and 2003 word datasets, respectively. We have created ground truth for each image at the pixel level to benchmark these datasets using a toolkit developed by us. The recognition rate of the benchmarked images is 86.7% and 83.9% for the ICDAR 2011 and 2003 datasets, respectively.
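The plane selection step can be sketched as follows, assuming each plane is available as a flat list of 8-bit values. The function names `otsu` and `select_plane` and the toy planes are hypothetical, and the power-law enhancement that precedes the selection in the abstract is left out to keep the sketch short.

```python
def otsu(values):
    """Return (maximum between-class variance, threshold) for 8-bit values."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_var, best_t = 0.0, 0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of their gray levels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_var, best_t


def select_plane(planes):
    """Pick the plane whose Otsu between-class variance is largest."""
    return max(planes, key=lambda name: otsu(planes[name])[0])


# Toy planes: one with well separated classes, one with low contrast.
planes = {
    "red": [10, 12, 200, 205, 11, 198],
    "green": [90, 95, 110, 120, 92, 115],
}

print(select_plane(planes), otsu(planes["red"]))
```

The selected plane would then be thresholded at the returned Otsu level before being passed to the recognizer.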
Abstract:
We carry out a series of long atomistic molecular dynamics simulations to study the unfolding of a small protein, chicken villin headpiece (HP-36), in water-ethanol (EtOH) binary mixtures. The prime objective of this work is to explore the sensitivity of protein unfolding dynamics to increasing concentration of the cosolvent and to unravel essential features of the intermediates formed, in search of a dynamical pathway toward unfolding. In water-ethanol binary mixtures, HP-36 is found to unfold partially under ambient conditions, whereas in pure aqueous solvent it requires temperatures as high as ~600 K to denature. However, an interesting pathway is observed to be followed in the process, guided by the formation of unique intermediates. The first step of unfolding is essentially the separation of the cluster formed by three hydrophobic (phenylalanine) residues, namely, Phe-7, Phe-11, and Phe-18, which constitute the hydrophobic core, thereby initiating melting of helix-2 of the protein. The initial steps are similar to temperature-induced unfolding as well as chemical unfolding using DMSO as cosolvent. Subsequent unfolding steps follow a unique path. As water-ethanol shows composition-dependent anomalies, so do the details of the unfolding dynamics. With an increase in cosolvent concentration, different partially unfolded intermediates are found to be formed. This is reflected in a remarkable nonmonotonic composition dependence of several order parameters, including the fraction of native contacts and the protein-solvent interaction energy. The emergence of such partially unfolded states can be attributed to the preferential solvation of the hydrophobic residues by the ethyl groups of ethanol. We further quantify the local dynamics of unfolding by using a Marcus-type theory.
Abstract:
Space-vector-based pulse width modulation (PWM) for a voltage source inverter (VSI) offers flexibility in terms of different switching sequences. Numerical simulation is helpful to assess the performance of a PWM method before actual implementation. A quick-simulation tool to simulate a variety of space-vector-based PWM strategies for a two-level VSI-fed squirrel cage induction motor drive is presented. The simulator is developed using C and Python programming languages, and has a graphical user interface (GUI) also. The prime focus being PWM strategies, the simulator developed is 40 times faster than MATLAB in terms of the actual time taken for a simulation. Simulation and experimental results are presented on a 5-hp ac motor drive.
Abstract:
An aeroelastic analysis is used to investigate the rate-dependent hysteresis in piezoceramic actuators and its effect on helicopter vibration control with trailing edge flaps. Hysteresis in piezoceramic materials can cause considerable complications in the use of smart actuators as prime movers in applications such as helicopter active vibration control. Dynamic hysteresis of the piezoelectric stack actuator is investigated for a range of frequencies (5 Hz (1/rev) to 30 Hz (6/rev)) which are of practical importance for helicopter vibration analysis. Benchtop tests are conducted on a commercially available piezoelectric stack actuator. Frequency-dependent hysteretic behavior is studied experimentally for helicopter operational frequencies. Material hysteresis in the smart actuator is mathematically modeled using the theory of conic sections. Numerical simulations are also performed at an advance ratio of 0.3 for vibration control analysis using a trailing edge flap with an idealized linear actuator and with a hysteretic actuator. The results indicate that dynamic hysteresis has a notable effect on the hub vibration levels. It is found that the theory of conic sections offers a straightforward approach for including hysteresis in aeroelastic analysis.
Abstract:
The Cubic Sieve Method for solving the Discrete Logarithm Problem in prime fields requires a nontrivial solution to the Cubic Sieve Congruence (CSC) $x^3 \equiv y^2 z \pmod{p}$, where $p$ is a given prime number. A nontrivial solution must also satisfy $x^3 \neq y^2 z$ and $1 \le x, y, z < p^{\alpha}$, where $\alpha$ is a given real number such that $1/3 < \alpha \le 1/2$. The CSC problem is to find an efficient algorithm to obtain a nontrivial solution to CSC. CSC can be parametrized as $x \equiv v^2 z \pmod{p}$ and $y \equiv v^3 z \pmod{p}$. In this paper, we give a deterministic polynomial-time ($O(\ln^3 p)$ bit-operations) algorithm to determine, for a given $v$, a nontrivial solution to CSC, if one exists. Previously it took $\tilde{O}(p^{\alpha})$ time in the worst case to determine this. We relate the CSC problem to the gap problem of fractional part sequences, where we need to determine the non-negative integers $N$ satisfying the fractional part inequality $\{\theta N\} < \phi$ ($\theta$ and $\phi$ are given real numbers). The correspondence between the CSC problem and the gap problem is that determining the parameter $z$ in the former problem corresponds to determining $N$ in the latter problem. We also show in the $\alpha = 1/2$ case of CSC that for a certain class of primes the CSC problem can be solved deterministically in $\tilde{O}(p^{1/3})$ time compared to the previous best of $\tilde{O}(p^{1/2})$. It is empirically observed that about one out of three primes is covered by the above class. (C) 2013 Elsevier B.V. All rights reserved.
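The parametrization can be checked on a small case with a brute-force scan over $z$ in Python. This is explicitly not the polynomial-time algorithm of the paper; it is the naive search that takes time proportional to $p^{\alpha}$. The function name `csc_solutions_for_v` and the example values $p = 113$, $v = 23$ are chosen here for illustration.

```python
def csc_solutions_for_v(p, v, alpha=0.5):
    """Naive scan over z for a fixed v (not the paper's fast algorithm).

    Returns all (x, y, z) with x = v^2 z mod p, y = v^3 z mod p,
    1 <= x, y, z < p**alpha and x^3 != y^2 z over the integers.
    """
    bound = p ** alpha
    solutions = []
    z = 1
    while z < bound:
        x = (v * v * z) % p
        y = (v * v * v * z) % p
        if 1 <= x < bound and 1 <= y < bound and x ** 3 != y * y * z:
            # The parametrization guarantees the congruence itself.
            assert (x ** 3 - y * y * z) % p == 0
            solutions.append((x, y, z))
        z += 1
    return solutions


# Example: 5^3 - 2^2 * 3 = 113, so (5, 2, 3) solves the CSC for p = 113.
print(csc_solutions_for_v(113, 23))  # [(5, 2, 3), (10, 4, 6)]
```

The congruence follows directly from the parametrization, since $x^3 \equiv v^6 z^3$ and $y^2 z \equiv v^6 z^3 \pmod{p}$; the scan only has to enforce the size bounds and the nontriviality condition.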
Abstract:
We previously reported interferon gamma secretion by human CD4+ and CD8+ T cells in response to recombinant E. coli-expressed Rv1860 protein of Mycobacterium tuberculosis (MTB), as well as protection of guinea pigs against a challenge with virulent MTB following prime-boost immunization with a DNA vaccine and a poxvirus expressing Rv1860. In contrast, a Statens Serum Institute Mycobacterium bovis BCG (BCG-SSI) recombinant expressing MTB Rv1860 (BCG-TB1860) showed loss of protective ability compared to the parent BCG strain expressing the control GFP protein (BCG-GFP). Since Rv1860 is a secreted mannosylated protein of MTB and BCG, we investigated the effect of BCG-TB1860 on innate immunity. Relative to BCG-GFP, BCG-TB1860 effected a significant, near-total reduction both in secretion of the cytokines IL-2, IL-12p40, IL-12p70, TNF-alpha, IL-6 and IL-10, and in upregulation of the co-stimulatory molecules MHC-II, CD40, CD54, CD80 and CD86 by infected bone marrow derived dendritic cells (BMDC), while leaving secreted levels of TGF-beta unchanged. These effects were mimicked by BCG-TB1860His, which carried a 6-histidine tag at the C-terminus of Rv1860, by killed sonicated preparations of BCG-TB1860, and by purified H37Rv-derived Rv1860 glycoprotein added to BCG-GFP, but not by E. coli-expressed recombinant Rv1860. Most importantly, BMDC exposed to BCG-TB1860 failed to polarize allogeneic as well as syngeneic T cells to secrete IFN-gamma and IL-17 relative to BCG-GFP. Splenocytes from mice infected with BCG-SSI showed significantly less proliferation and secretion of IL-2, IFN-gamma and IL-17, but secreted higher levels of IL-10, in response to in vitro restimulation with BCG-TB1860 compared to BCG-GFP. Spleens from mice infected with BCG-TB1860 also harboured significantly fewer DC expressing MHC-II, IL-12, IL-2 and TNF-alpha compared to mice infected with BCG-GFP.
Glycoproteins of MTB, through their deleterious effects on DC may thus contribute to suppress the generation of a TH1- and TH17-dominated adaptive immune response that is vital for protection against tuberculosis.
Abstract:
Let $\mathbb{Z}_n$ denote the ring of integers modulo $n$. A permutation of $\mathbb{Z}_n$ is a sequence of $n$ distinct elements of $\mathbb{Z}_n$. Addition and subtraction of two permutations are defined element-wise. In this paper we consider two extremal problems on permutations of $\mathbb{Z}_n$, namely, the maximum size of a collection of permutations such that the sum of any two distinct permutations in the collection is again a permutation, and the maximum size of a collection of permutations such that no sum of two distinct permutations in the collection is a permutation. Let the sizes be denoted by $s(n)$ and $t(n)$ respectively. The case when $n$ is even is trivial in both cases, with $s(n) = 1$ and $t(n) = n!$. For $n$ odd, we prove $\frac{n\varphi(n)}{2^k} \le s(n) \le \frac{n! \cdot 2^{-(n-1)/2}}{((n-1)/2)!}$ and $2^{(n-1)/2} \cdot \left(\frac{n-1}{2}\right)! \le t(n) \le \frac{2^k \cdot (n-1)!}{\varphi(n)}$, where $k$ is the number of distinct prime divisors of $n$ and $\varphi$ is Euler's totient function.
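For very small $n$ the two quantities can be computed by exhaustive search, which gives a quick sanity check on the definitions. The Python sketch below treats permutations as vertices of a graph, joins two of them when their element-wise sum modulo $n$ is again a permutation, and finds a largest clique in that graph and in its complement. The helper names and the simple branch-and-bound search are choices made here for illustration and are not taken from the paper.

```python
from itertools import permutations


def is_perm(seq, n):
    """True if seq contains each element of Z_n exactly once."""
    return sorted(seq) == list(range(n))


def add(a, b, n):
    """Element-wise sum of two sequences modulo n."""
    return tuple((x + y) % n for x, y in zip(a, b))


def max_clique(nodes, adjacent):
    """Size of a largest set of pairwise adjacent nodes (branch and bound)."""
    best = 0

    def grow(size, candidates):
        nonlocal best
        best = max(best, size)
        for i, v in enumerate(candidates):
            # Stop when even taking every remaining candidate cannot beat best.
            if size + len(candidates) - i <= best:
                return
            grow(size + 1, [w for w in candidates[i + 1:] if adjacent(v, w)])

    grow(0, list(nodes))
    return best


def s_and_t(n):
    """Brute-force s(n) and t(n); only practical for very small n."""
    perms = list(permutations(range(n)))

    def sums_to_perm(a, b):
        return is_perm(add(a, b, n), n)

    def sum_not_perm(a, b):
        return not sums_to_perm(a, b)

    return max_clique(perms, sums_to_perm), max_clique(perms, sum_not_perm)


print(s_and_t(3))  # (3, 2)
print(s_and_t(4))  # (1, 24)
```

For $n = 3$ the lower and upper bounds stated above coincide ($3 \le s(3) \le 3$ and $2 \le t(3) \le 2$), and the search agrees with them. For $n = 4$ it reproduces the even case, $s(4) = 1$ and $t(4) = 4! = 24$.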
Abstract:
The algebraic formulation for linear network coding in acyclic networks with the links having integer delay is well known. Based on this formulation, for a given set of connections over an arbitrary acyclic network with integer delay assumed for the links, the output symbols at the sink nodes, at any given time instant, are an $\mathbb{F}_{p^m}$-linear combination of the input symbols across different generations, where $\mathbb{F}_{p^m}$ denotes the field over which the network operates ($p$ is prime and $m$ is a positive integer). We use the finite-field discrete Fourier transform to convert the output symbols at the sink nodes, at any given time instant, into an $\mathbb{F}_{p^m}$-linear combination of the input symbols generated during the same generation without making use of memory at the intermediate nodes. We call this transforming the acyclic network with delay into $n$-instantaneous networks ($n$ is sufficiently large). We show that under certain conditions, there exists a network code satisfying sink demands in the usual (nontransform) approach if and only if there exists a network code satisfying sink demands in the transform approach. When the zero-interference conditions are not satisfied, we propose three precoding-based network alignment (PBNA) schemes for three-source three-destination multiple unicast networks with delays (3-S 3-D MUN-D), termed PBNA using transform approach and time-invariant local encoding coefficients (LECs), PBNA using time-varying LECs, and PBNA using transform approach and block time-varying LECs. We derive sets of necessary and sufficient conditions under which throughputs close to $\frac{n'+1}{2n'+1}$, $\frac{n'}{2n'+1}$, and $\frac{n'}{2n'+1}$ are achieved for the three source-destination pairs in a 3-S 3-D MUN-D employing PBNA using transform approach and time-invariant LECs, and PBNA using transform approach and block time-varying LECs, where $n'$ is a positive integer.
For PBNA using time-varying LECs, we obtain a sufficient condition under which a throughput demand of $n_1/n$, $n_2/n$, and $n_3/n$ can be met for the three source-destination pairs in a 3-S 3-D MUN-D, where $n_1$, $n_2$, and $n_3$ are positive integers less than or equal to the positive integer $n$. This condition is also necessary when $n_1 + n_3 = n_1 + n_2 = n$, where $n_1 \ge n_2 \ge n_3$.