984 resultados para Optical character recognition


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Chinese language is based on characters which are syllabic in nature. Since languages have syllabotactic rules which govern the construction of syllables and their allowed sequences, Chinese character sequence models can be used as a first level approximation of allowed syllable sequences. N-gram character sequence models were trained on 4.3 billion characters. Characters are used as a first level recognition unit with multiple pronunciations per character. For comparison the CU-HTK Mandarin word based system was used to recognize words which were then converted to character sequences. The character only system error rates for one best recognition were slightly worse than word based character recognition. However combining the two systems using log-linear combination gives better results than either system separately. An equally weighted combination gave consistent CER gains of 0.1-0.2% absolute over the word based standard system. Copyright © 2009 ISCA.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The objective of the study is to develop a hand written character recognition system that could recognisze all the characters in the mordern script of malayalam language at a high recognition rate

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Garment information tracking is required for clean room garment management. In this paper, we present a camera-based robust system with implementation of Optical Character Reconition (OCR) techniques to fulfill garment label recognition. In the system, a camera is used for image capturing; an adaptive thresholding algorithm is employed to generate binary images; Connected Component Labelling (CCL) is then adopted for object detection in the binary image as a part of finding the ROI (Region of Interest); Artificial Neural Networks (ANNs) with the BP (Back Propagation) learning algorithm are used for digit recognition; and finally the system is verified by a system database. The system has been tested. The results show that it is capable of coping with variance of lighting, digit twisting, background complexity, and font orientations. The system performance with association to the digit recognition rate has met the design requirement. It has achieved real-time and error-free garment information tracking during the testing.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Features derived from the trispectra of DFT magnitude slices are used for multi-font digit recognition. These features are insensitive to translation, rotation, or scaling of the input. They are also robust to noise. Classification accuracy tests were conducted on a common data base of 256× 256 pixel bilevel images of digits in 9 fonts. Randomly rotated and translated noisy versions were used for training and testing. The results indicate that the trispectral features are better than moment invariants and affine moment invariants. They achieve a classification accuracy of 95% compared to about 81% for Hu's (1962) moment invariants and 39% for the Flusser and Suk (1994) affine moment invariants on the same data in the presence of 1% impulse noise using a 1-NN classifier. For comparison, a multilayer perceptron with no normalization for rotations and translations yields 34% accuracy on 16× 16 pixel low-pass filtered and decimated versions of the same data.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

An application of image processing techniques to recognition of hand-drawn circuit diagrams is presented. The scanned image of a diagram is pre-processed to remove noise and converted to bilevel. Morphological operations are applied to obtain a clean, connected representation using thinned lines. The diagram comprises of nodes, connections and components. Nodes and components are segmented using appropriate thresholds on a spatially varying object pixel density. Connection paths are traced using a pixel-stack. Nodes are classified using syntactic analysis. Components are classified using a combination of invariant moments, scalar pixel-distribution features, and vector relationships between straight lines in polygonal representations. A node recognition accuracy of 82% and a component recognition accuracy of 86% was achieved on a database comprising 107 nodes and 449 components. This recogniser can be used for layout “beautification” or to generate input code for circuit analysis and simulation packages

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This work describes an online handwritten character recognition system working in combination with an offline recognition system. The online input data is also converted into an offline image, and parallely recognized by both online and offline strategies. Features are proposed for offline recognition and a disambiguation step is employed in the offline system for the samples for which the confidence level of the classifier is low. The outputs are then combined probabilistically resulting in a classifier out-performing both individual systems. Experiments are performed for Kannada, a South Indian Language, over a database of 295 classes. The accuracy of the online recognizer improves by 11% when the combination with offline system is used.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We describe a 42.6 Gbit/s all-optical pattern recognition system which uses semiconductor optical amplifiers (SOAs). A circuit with three SOA-based logic gates is used to identify the presence of specific port numbers in an optical packet header.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Schiff base, 3-hydroxyquinoxaline-2-carboxalidine-4-aminoantipyrine, was synthesized by the condensation of 3-hydroxyquinoxaline-2-carboxaldehyde with 4-aminoantipyrine. HPLC, FT-IR and NMR spectral data revealed that the compound exists predominantly in the amide tautomeric form and exhibits both absorption and fluorescence solvatochromism, large stokes shift, two electron quasireversible redox behaviour and good thermal stability, with a glass transition temperature of 104oC. The third-order non-linear optical character was studied using open aperture Z-scan methodology employing 7 ns pulses at 532 nm. The third-order non-linear absorption coefficient, b, was 1.48 x 10-6 cm W-1 and the imaginary part of the third-order non-linear optical susceptibility, Im c(3), was 3.36 x10-10 esu. The optical limiting threshold for the compound was found to be 340 MW cm-2.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Schiff base, 3-hydroxyquinoxaline-2-carboxalidine-4-aminoantipyrine, was synthesized by the condensation of 3-hydroxyquinoxaline-2-carboxaldehyde with 4-aminoantipyrine. HPLC, FT-IR and NMR spectral data revealed that the compound exists predominantly in the amide tautomeric form and exhibits both absorption and fluorescence solvatochromism, large stokes shift, two electron quasireversible redox behaviour and good thermal stability, with a glass transition temperature of 104 oC. The third-order non-linear optical character was studied using open aperture Z-scan methodology employing 7 ns pulses at 532 nm. The third-order non-linear absorption coefficient, b, was 1.48 x 10-6 cm W-1 and the imaginary part of the third-order non-linear optical susceptibility, Im c(3), was 3.36x10-10 esu. The optical limiting threshold for the compound was found to be 340 MW cm-2.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we propose a handwritten character recognition system for Malayalam language. The feature extraction phase consists of gradient and curvature calculation and dimensionality reduction using Principal Component Analysis. Directional information from the arc tangent of gradient is used as gradient feature. Strength of gradient in curvature direction is used as the curvature feature. The proposed system uses a combination of gradient and curvature feature in reduced dimension as the feature vector. For classification, discriminative power of Support Vector Machine (SVM) is evaluated. The results reveal that SVM with Radial Basis Function (RBF) kernel yield the best performance with 96.28% and 97.96% of accuracy in two different datasets. This is the highest accuracy ever reported on these datasets

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Die stereoskopische 3-D-Darstellung beruht auf der naturgetreuen Präsentation verschiedener Perspektiven für das rechte und linke Auge. Sie erlangt in der Medizin, der Architektur, im Design sowie bei Computerspielen und im Kino, zukünftig möglicherweise auch im Fernsehen, eine immer größere Bedeutung. 3-D-Displays dienen der zusätzlichen Wiedergabe der räumlichen Tiefe und lassen sich grob in die vier Gruppen Stereoskope und Head-mounted-Displays, Brillensysteme, autostereoskopische Displays sowie echte 3-D-Displays einteilen. Darunter besitzt der autostereoskopische Ansatz ohne Brillen, bei dem N≥2 Perspektiven genutzt werden, ein hohes Potenzial. Die beste Qualität in dieser Gruppe kann mit der Methode der Integral Photography, die sowohl horizontale als auch vertikale Parallaxe kodiert, erreicht werden. Allerdings ist das Verfahren sehr aufwendig und wird deshalb wenig genutzt. Den besten Kompromiss zwischen Leistung und Preis bieten präzise gefertigte Linsenrasterscheiben (LRS), die hinsichtlich Lichtausbeute und optischen Eigenschaften den bereits früher bekannten Barrieremasken überlegen sind. Insbesondere für die ergonomisch günstige Multiperspektiven-3-D-Darstellung wird eine hohe physikalische Monitorauflösung benötigt. Diese ist bei modernen TFT-Displays schon recht hoch. Eine weitere Verbesserung mit dem theoretischen Faktor drei erreicht man durch gezielte Ansteuerung der einzelnen, nebeneinander angeordneten Subpixel in den Farben Rot, Grün und Blau. Ermöglicht wird dies durch die um etwa eine Größenordnung geringere Farbauflösung des menschlichen visuellen Systems im Vergleich zur Helligkeitsauflösung. Somit gelingt die Implementierung einer Subpixel-Filterung, welche entsprechend den physiologischen Gegebenheiten mit dem in Luminanz und Chrominanz trennenden YUV-Farbmodell arbeitet. Weiterhin erweist sich eine Schrägstellung der Linsen im Verhältnis von 1:6 als günstig. Farbstörungen werden minimiert, und die Schärfe der Bilder wird durch eine weniger systematische Vergrößerung der technologisch unvermeidbaren Trennelemente zwischen den Subpixeln erhöht. Der Grad der Schrägstellung ist frei wählbar. In diesem Sinne ist die Filterung als adaptiv an den Neigungswinkel zu verstehen, obwohl dieser Wert für einen konkreten 3-D-Monitor eine Invariante darstellt. Die zu maximierende Zielgröße ist der Parameter Perspektiven-Pixel als Produkt aus Anzahl der Perspektiven N und der effektiven Auflösung pro Perspektive. Der Idealfall einer Verdreifachung wird praktisch nicht erreicht. Messungen mit Hilfe von Testbildern sowie Schrifterkennungstests lieferten einen Wert von knapp über 2. Dies ist trotzdem als eine signifikante Verbesserung der Qualität der 3-D-Darstellung anzusehen. In der Zukunft sind weitere Verbesserungen hinsichtlich der Zielgröße durch Nutzung neuer, feiner als TFT auflösender Technologien wie LCoS oder OLED zu erwarten. Eine Kombination mit der vorgeschlagenen Filtermethode wird natürlich weiterhin möglich und ggf. auch sinnvoll sein.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A new method for the automated selection of colour features is described. The algorithm consists of two stages of processing. In the first, a complete set of colour features is calculated for every object of interest in an image. In the second stage, each object is mapped into several n-dimensional feature spaces in order to select the feature set with the smallest variables able to discriminate the remaining objects. The evaluation of the discrimination power for each concrete subset of features is performed by means of decision trees composed of linear discrimination functions. This method can provide valuable help in outdoor scene analysis where no colour space has been demonstrated as being the most suitable. Experiment results recognizing objects in outdoor scenes are reported