979 results for Optical music recognition


Relevance:

80.00%

Abstract:

The world we live in is well labeled for the benefit of humans but to date robots have made little use of this resource. In this paper we describe a system that allows robots to read and interpret visible text and use it to understand the content of the scene. We use a generative probabilistic model that explains spotted text in terms of arbitrary search terms. This allows the robot to understand the underlying function of the scene it is looking at, such as whether it is a bank or a restaurant. We describe the text spotting engine at the heart of our system that is able to detect and parse wild text in images, and the generative model, and present results from images obtained with a robot in a busy city setting.

Relevance:

80.00%

Abstract:

Objective: To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Method: Scanned images of pathology reports were converted to electronic free text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classifications from a human-amended version of the OCR reports. Results: The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to have minimal impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. Conclusions: The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of free-text pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
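The abstract reports character and word accuracy but does not say how they were computed. A common convention, assumed here, is one minus the edit distance between the OCR output and the corrected reference, divided by the reference length. The following minimal sketch illustrates that convention; the function names `edit_distance`, `accuracy`, and `ocr_accuracy` are hypothetical and not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 1 - edit_distance / reference length, floored at 0."""
    if not ref:
        raise ValueError("reference must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def ocr_accuracy(reference_text, ocr_text):
    """Character-level and word-level accuracy of OCR output against a reference."""
    return {
        "character": accuracy(reference_text, ocr_text),
        "word": accuracy(reference_text.split(), ocr_text.split()),
    }


if __name__ == "__main__":
    # One substituted character ("l" read as "1") in a 25-character reference.
    print(ocr_accuracy("invasive ductal carcinoma", "invasive ducta1 carcinoma"))
```

A single misread character lowers word accuracy much more than character accuracy, which is why reports of this kind usually quote both figures.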

Relevance:

80.00%

Abstract:

We present an approach to automatically de-identify health records. In our approach, personal health information is identified using a Conditional Random Fields machine learning classifier, a large set of linguistic and lexical features, and pattern matching techniques. Identified personal information is then removed from the reports. The de-identification of personal health information is fundamental for the sharing and secondary use of electronic health records, for example for data mining and disease monitoring. The effectiveness of our approach is first evaluated on the 2007 i2b2 Shared Task dataset, a widely adopted dataset for evaluating de-identification techniques. Subsequently, we investigate the robustness of the approach to limited training data; we study its effectiveness on data of different type and quality by evaluating the approach on scanned pathology reports from an Australian institution. This data contains optical character recognition errors, as well as linguistic conventions that differ from those contained in the i2b2 dataset, for example different date formats. The findings suggest that our approach is comparable to the best approach from the 2007 i2b2 Shared Task; in addition, the approach is found to be robust to variations in training size, data type, and quality in the presence of sufficient training data.

Relevance:

80.00%

Abstract:

Objective: Evaluate the effectiveness and robustness of Anonym, a tool for de-identifying free-text health records based on conditional random fields classifiers informed by linguistic and lexical features, as well as features extracted by pattern matching techniques. De-identification of personal health information in electronic health records is essential for the sharing and secondary usage of clinical data. De-identification tools that adapt to different sources of clinical data are attractive as they would require minimal intervention to guarantee high effectiveness. Methods and Materials: The effectiveness and robustness of Anonym are evaluated across multiple datasets, including the widely adopted Integrating Biology and the Bedside (i2b2) dataset, used for evaluation in a de-identification challenge. The datasets used here vary in type of health records, source of data, and quality, with one of the datasets containing optical character recognition errors. Results: Anonym identifies and removes up to 96.6% of personal health identifiers (recall) with a precision of up to 98.2% on the i2b2 dataset, outperforming the best system proposed in the i2b2 challenge. The effectiveness of Anonym across datasets is found to depend on the amount of information available for training. Conclusion: Findings show that Anonym is comparable to the best approach from the 2006 i2b2 shared task. It is easy to retrain Anonym with new datasets; if retrained, the system is robust to variations in training size, data type, and quality in the presence of sufficient training data.
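To make the recall and precision figures concrete, here is a minimal sketch of how the two measures could be computed for a de-identification system. It assumes exact-match scoring over annotated spans represented as `(start, end, label)` tuples; the abstract does not state whether Anonym was scored at span or token level, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(gold, predicted):
    """Exact-match precision and recall over sets of (start, end, label) spans.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold-standard identifiers
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative spans only; these are not data from the paper.
    gold = {(0, 10, "NAME"), (25, 35, "DATE"), (50, 58, "ID"), (70, 80, "NAME")}
    predicted = {(0, 10, "NAME"), (25, 35, "DATE"), (60, 65, "ID")}
    p, r = precision_recall(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f}")
```

For de-identification, recall is usually the more critical figure, since every missed identifier is personal information left in the released record.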

Relevance:

80.00%

Abstract:

The document images that are fed into an Optical Character Recognition system may be skewed, either because the document was fed improperly into the scanner or because the scanner is faulty. In this paper, we propose a skew detection and correction method for document images. We make use of the inherent randomness in the horizontal projection profiles of a text block image, as the skew of the image varies. The proposed algorithm has proved to be very robust and time efficient. The entire process takes less than a second on a 2.4 GHz Pentium IV PC.
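The abstract does not give the algorithm's details. One plausible reading, assumed here, is that the "randomness" of the horizontal projection profile is measured by its Shannon entropy: when the text lines are horizontal, foreground pixels concentrate in a few rows and the entropy is low. The sketch below searches candidate angles for the one that minimises that entropy. It is not the authors' implementation, and all names (`profile_entropy`, `estimate_skew`, `synthetic_text_block`) are hypothetical.

```python
import math
from collections import Counter


def profile_entropy(points, angle_deg):
    """Shannon entropy of the horizontal projection profile obtained after
    rotating the foreground pixel coordinates by angle_deg."""
    theta = math.radians(angle_deg)
    s, c = math.sin(theta), math.cos(theta)
    # Row index of each foreground pixel after rotation.
    rows = Counter(round(x * s + y * c) for x, y in points)
    total = sum(rows.values())
    return -sum((n / total) * math.log2(n / total) for n in rows.values())


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Return the correction angle (degrees) that minimises profile entropy.

    Rotating the image by the returned angle should deskew it, so the value
    is the negative of the skew itself.
    """
    n_steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(n_steps + 1)]
    return min(candidates, key=lambda a: profile_entropy(points, a))


def synthetic_text_block(skew_deg, n_lines=8, line_gap=12, width=200):
    """Foreground pixels of thin horizontal 'text lines', rotated by skew_deg."""
    theta = math.radians(skew_deg)
    s, c = math.sin(theta), math.cos(theta)
    points = []
    for k in range(n_lines):
        y = k * line_gap
        for x in range(width):
            points.append((x * c - y * s, x * s + y * c))
    return points


if __name__ == "__main__":
    block = synthetic_text_block(skew_deg=3.0)
    print("correction angle:", estimate_skew(block))
```

The exhaustive search over a fixed angle grid keeps the sketch simple; a coarse-to-fine search would reduce the number of profiles computed on real scans.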

Relevance:

80.00%

Abstract:

We propose a set of metrics that evaluate the uniformity, sharpness, continuity, noise, stroke width variance, pulse width ratio, transient pixel density, entropy, and variance of components to quantify the quality of a document image. The measures are intended to be used in any optical character recognition (OCR) engine to estimate a priori the expected performance of the OCR. The suggested measures have been evaluated on many document images in different scripts. The quality of each document image is manually annotated by users to create a ground truth. The idea is to correlate the values of the measures with the user-annotated data. If the calculated measure matches the annotated description, the metric is accepted; otherwise it is rejected. Of the proposed metrics, some are accepted and the rest are rejected. We have defined metrics that are easy to estimate. The metrics proposed in this paper are based on feedback from home-grown OCR engines for Indic (Tamil and Kannada) languages. The metrics are independent of the script and depend only on the quality and age of the paper and the printing. Experiments and results for each proposed metric are discussed. Actual recognition of the printed text is not performed to evaluate the proposed metrics. Sometimes a document image containing broken characters is rated as a good document image by the evaluated metrics, which remains an unsolved challenge. The proposed measures work on grayscale document images and fail to provide reliable information on binarized document images.

Relevance:

80.00%

Abstract:

The present study examined individual differences in Absorption and fantasy, as well as in Achievement and achievement striving as possible moderators of the perceptual closure effect found by Snodgrass and Feenan (1990). The study also examined whether different instructions (experiential versus instrumental) interact with the personality variables to moderate the relationship between priming and subsequent performance on a picture completion task. 128 participants completed two sessions, one to fill out the MPQ and NEO personality inventories and the other to complete the experimental task. The experimental task consisted of a priming phase and a test phase, with pictures presented on a computer screen for both phases. Participants were shown 30 pictures in the priming phase, and then shown the 30 primed pictures along with 30 new pictures for the test phase. Participants were randomly assigned to receive one of the two different instruction sets for the task. Two measures of performance were calculated, most fragmented measure and threshold. Results of the present study confirm that a five-second exposure time is long enough to produce the perceptual closure effect. The analysis of the two-way interaction effects indicated a significant quadratic interaction of Absorption with priming level on threshold performance. The results were in the opposite direction of predictions. Possible explanations for the Absorption results include lack of optimal conditions, lack of intrinsic motivation and measurement problems. Primary analyses also revealed two significant between-subject effects of fantasy and achievement striving on performance collapsed across priming levels. These results suggest that fantasy has a beneficial effect on performance at test for pictures primed at all levels, whereas achievement striving seems to have an adverse effect on performance at test for pictures primed at all levels.
Results of the secondary analyses with a revised threshold performance measure indicated a significant quadratic interaction of Absorption, condition and priming level. In the experiential condition, test performance, based on Absorption scores for pictures primed at level 4, showed a positive slope and performance for pictures primed at levels 1 and 7 based on Absorption showed a negative slope. The reverse effect was found in the instrumental condition. The results suggest that Absorption, in combination with experiential involvement, may affect implicit memory. A second significant result of the secondary analyses was a linear three-way interaction of Achievement, condition and priming level on performance. Results suggest that as Achievement scores increased, test performance improved for less fragmented primed pictures in the instrumental condition and test performance improved for more highly fragmented primes in the experiential condition. Results from the secondary analyses suggest that the revised threshold measure may be more sensitive to individual differences. Results of the exploratory analyses with Openness to Experience, Conscientiousness and agentic positive emotionality (PEM-A) measures indicated no significant effects of any of these personality variables. Results suggest that facets of the scales may be more useful with regard to perceptual research, and that future research should examine narrowly focused personality traits as opposed to broader constructs.

Relevance:

80.00%

Abstract:

Congenital amusia is a neurogenetic disorder characterized by an inability to acquire basic musical skills, such as normal music perception and music recognition, despite normal hearing, language development, and intelligence (Ayotte, Peretz & Hyde, 2002). Recently, a family aggregation study showed that 39% of family members of amusic individuals exhibit the disorder, compared with 3% of family members of normal individuals (Peretz et al., 2007). This finding is interesting because it demonstrates a prevalence of congenital amusia in the normal population. Kalmus and Fry (1980) estimated this prevalence at 4% using the Distorted Tunes Test (DTT). However, this test has certain methodological and statistical shortcomings, such as a substantial ceiling effect and the use of folk melodies, which disadvantages amusic individuals because they cannot assimilate these melodies correctly. The present study aimed to re-evaluate the prevalence of congenital amusia using an online test recently validated by Peretz and colleagues (2008). One thousand one hundred participants from a homogeneous sample completed the online test. The results show an overall prevalence of 11.6%, as well as four distinct performance profiles: pitch deafness (1.5%), pitch memory amusia (3.2%), pitch perception amusia (3.3%), and beat deafness (3.3%). The variability of the results obtained with the online test demonstrates the existence of four types of amusia, each with its own prevalence, indicating heterogeneity in the expression of congenital amusia that will need to be explored further.

Relevance:

80.00%

Abstract:

This project aims to use well-known methods such as Viola&Jones (detection) and EigenFaces (recognition) to detect and recognize faces in video images. To accomplish this task, one must start from a set of training data for each of the methods (a database consisting of images and manual annotations). From there, the application must be able to detect faces in new images and recognize them (identify whose face it is).

Relevance:

80.00%

Abstract:

Design, implement, and test a system for classifying images: the design of a system that first learns what the images of a class look like from a set of training images and is then able to classify new images by assigning them the label of one of the "learned" classes. Specifically, CD-ROM covers are analyzed, which must be recognized in order to then automatically play the music of the associated album.

Relevance:

80.00%

Abstract:

We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses the previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides the core of each object, which is used to acquire a more accurate object model. Growing these cores by specific active regions then allows us to obtain an accurate recognition of known regions. Next, a stage of general segmentation provides the segmentation of unknown regions by a bottom-up strategy. Finally, the last stage tries to perform a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Furthermore, experimental results are shown and evaluated to prove the validity of our proposal.

Relevance:

80.00%

Abstract:

When underwater vehicles navigate close to the ocean floor, computer vision techniques can be applied to obtain quite accurate motion estimates. The most crucial step in the vision-based estimation of the vehicle motion consists of detecting matches between image pairs. Here we propose the extensive use of texture analysis as a tool to ameliorate the correspondence problem in underwater images. Once a robust set of correspondences has been found, the three-dimensional motion of the vehicle can be computed with respect to the seabed. Finally, motion estimates allow the construction of a map that could aid the navigation of the robot.

Relevance:

80.00%

Abstract:

Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed to generate a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.