979 results for Optical music recognition


Relevance:

80.00%

Publisher:

Abstract:

Thesis (M.S.)--University of Illinois.

Relevance:

80.00%

Publisher:

Abstract:

ACM Computing Classification System (1998): I.7, I.7.5.

Relevance:

80.00%

Publisher:

Abstract:

This dissertation introduces a novel automated book reader as an assistive technology tool for persons with blindness. The literature shows extensive work in the area of optical character recognition, but the methodologies currently available for the automated reading of books or bound volumes remain inadequate and are severely constrained during document scanning or image acquisition. The goal of the book reader design is to automate and simplify the task of reading a book while providing a user-friendly environment with a realistic but affordable system design. This design responds to the main concerns of (a) providing a method of image acquisition that maintains the integrity of the source, (b) overcoming optical character recognition errors created by inherent imaging issues such as curvature effects and barrel distortion, and (c) determining a suitable method for accurate recognition of characters that yields an interface able to read from any open book with a reading accuracy nearing 98%. The initial aim of this research is to develop an assistive technology tool that helps persons with blindness read books and other bound volumes. Its secondary and broader aim is to find in this design a suitable platform for digitizing bound documentation, in line with the mission of the Open Content Alliance (OCA), a nonprofit alliance aimed at making reading materials available in digital form. The theoretical perspective of this research relates to the mathematical developments made to resolve both the inherent distortions due to the properties of the camera lens and the anticipated distortions of the changing page curvature as one leafs through the book. The effectiveness of these corrections is evidenced by a significant increase in the character recognition rate and a high-accuracy read-out through text-to-speech processing.
This reasonably priced interface, with its high-performance results and its compatibility with any computer or laptop through universal serial bus connectors, greatly extends the prospects for universal accessibility to documentation.
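As a rough illustration of the lens-distortion step mentioned in the abstract above, the sketch below inverts a one-coefficient radial distortion model by fixed-point iteration. The abstract does not say which model the dissertation uses, so the model, the coefficient `k1`, and the function names `distort` and `undistort` are all assumptions made for illustration.

```python
def distort(x, y, k1):
    """Apply an assumed one-coefficient radial model: p_d = p_u * (1 + k1 * r_u^2)."""
    factor = 1.0 + k1 * (x * x + y * y)
    return x * factor, y * factor


def undistort(xd, yd, k1, iterations=20):
    """Invert the radial model by fixed-point iteration.

    Coordinates are normalized and centered on the optical axis.
    In this convention a negative k1 corresponds to barrel distortion.
    """
    xu, yu = xd, yd  # start from the distorted point
    for _ in range(iterations):
        factor = 1.0 + k1 * (xu * xu + yu * yu)
        xu, yu = xd / factor, yd / factor
    return xu, yu
```

The iteration converges quickly when the distortion is mild; a strongly distorted lens or a page-curvature model would likely need a richer model than this one.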

Relevance:

80.00%

Publisher:

Abstract:

With the world of professional sports shifting towards better sport analytics, the demand for vision-based performance analysis has grown in recent years. In addition, the nature of many sports does not allow sensors or other wearable markers to be attached to players to monitor their performance during competitions. This opens a potential application for systematic observation, such as player tracking information, to help coaches develop the visual skills and perceptual awareness needed to make decisions about team strategy or training plans. My PhD project is part of a larger ongoing project between sport scientists and computer scientists that also involves industry partners and sports organisations. The overall idea is to investigate the contribution technology can make to the analysis of sports performance, using the example of team sports such as rugby, football or hockey. A particular focus is on vision-based tracking, so that information about the location and dynamics of the players can be gained without any additional sensors on the players. To start with, prior approaches to visual tracking are extensively reviewed and analysed. This thesis then proposes methods that handle changes in target appearance caused by intrinsic factors (e.g. pose variation) and extrinsic factors (e.g. occlusion). This analysis highlights the importance of the proposed visual tracking algorithms, which reflect these challenges and suggest robust and accurate frameworks to estimate the target state in a complex tracking scenario such as a sports scene, thereby facilitating the tracking process. Next, a framework for continuously tracking multiple targets is proposed. Compared to single-target tracking, multi-target tracking, such as tracking the players on a sports field, poses an additional difficulty that needs to be addressed, namely data association.
Here, the aim is to locate all targets of interest, infer their trajectories, and decide which observation corresponds to which target trajectory. In this thesis, an efficient framework is proposed to handle this particular problem, especially in sport scenes, where the players of the same team tend to look similar and exhibit complex interactions and unpredictable movements, resulting in matching ambiguity between the players. The presented approach is evaluated on different sports datasets and shows promising results. Finally, information from the proposed tracking system is used as the basic input for further higher-level performance analysis, such as tactics and team formations, which can help coaches design a better training plan. Due to the continuous nature of many team sports (e.g. soccer, hockey), it is not straightforward to infer high-level team behaviours, such as players' interactions. The proposed framework relies on two distinct levels of performance analysis: a low-level analysis, such as identifying players' positions on the playing field, and a high-level analysis, where the aim is to estimate the density of player locations or detect their possible interaction groups. The related experiments show that the proposed approach can effectively explore this high-level information, which has many potential applications.

Relevance:

80.00%

Publisher:

Abstract:

We are barely eight years away from the 21st century. Information gains momentum and importance every day, and the accelerating pace of technology and scientific discovery makes the librarian a transmitter of innovation and communication who operates in a competitive world, one in which librarians must be assertive, dynamic, and able to adopt this whole body of technology and science if they want to survive as professionals in the future. Looking back five years, we see that library science is one of the disciplines that has evolved the most in terms of vocabulary related to automated information management. Words such as scanners, videodisc, optical character recognition, CD-ROM, CD-I (Compact Disc Interactive), and so on are part of the library science vocabulary adopted by professionals who work in the complicated world of information.

Relevance:

80.00%

Publisher:

Abstract:

Today one of the main topics in the library literature is the impact and use of new technology in the library environment. Computer-based conversion of information is one of the techniques that has gained the widest acceptance in libraries. Optical character recognition is a sophisticated technique that has become very popular not only in commercial organizations but also in libraries, because its versatility allows printed or typewritten material to be converted by computer without keying in the information. Optical character recognition has been used successfully in library circulation and cataloging systems. It has also been used to record reports, technical documents, tables of contents, journal indexes, abstracts, and so on. Its future and its application in libraries are very promising because it offers solutions to administrative problems and facilitates the technical processing of information.

Relevance:

80.00%

Publisher:

Abstract:

This presentation was given at the Panhandle Library Access Network's (PLAN) Innovation Conference, Digitization: Preserving the Past for the Future, on August 14, 2015. The presentation uses a specific collection of directories as a case study of the complications librarians and archivists face in digitizing older materials that may also be quite large, such as a directory. Prime OCR and Abbyy Fine Reader are discussed and their pros and cons covered. Troubleshooting and editing with Adobe Photoshop are also discussed.

Relevance:

80.00%

Publisher:

Abstract:

This presentation was given at the FLVC regional conference at Broward College on May 7, 2015 and introduced scanning, processing, record creation, dissemination, and preservation in FIU Libraries' Digital Collections Center. The main focus was on processing, specifically employing OCR technology with difficult sources.

Relevance:

80.00%

Publisher:

Abstract:

Bangla OCR (Optical Character Recognition) is long-overdue software for the Bengali community all over the world. Numerous efforts suggest that, due to the inherently complex nature of the Bangla alphabet and its word formation process, developing a high-fidelity OCR that produces reasonably acceptable output remains a challenge. One possible route to improvement is post-processing of the OCR output; algorithms such as edit distance and n-gram statistics have been used to correct misspelled words in language processing. This work presents the first known approach that uses these algorithms to replace misrecognized words produced by Bangla OCR. The assessment is made on a set of fifty documents written in Bangla script and uses a dictionary of 541,167 words. The proposed correction model can correct several words, lowering the recognition error rate by 2.87% and 3.18% for the character-based n-gram and edit distance algorithms respectively. The developed system suggests a list of five alternatives for a misspelled word. In 33.82% of cases the correct word is the topmost of the five suggestions for the n-gram algorithm, while for the edit distance algorithm the first suggestion is correct in 36.31% of cases. This work should open room for thought on possible improvements in character recognition.
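The edit-distance suggestion step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `edit_distance` and `suggest`, the alphabetical tie-break, and the toy word list are assumptions. Python strings are sequences of code points, so a real Bangla system would probably need to compare grapheme clusters instead.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest(word, dictionary, k=5):
    """Return the k dictionary words closest to `word` (ties broken alphabetically)."""
    return sorted(dictionary, key=lambda w: (edit_distance(word, w), w))[:k]
```

For example, `suggest("cat", ["cart", "dog", "cat", "bat"], k=2)` ranks the exact match first, followed by the closest one-edit neighbour. Scanning a dictionary of several hundred thousand words this way is slow; an index over candidate words would be a natural refinement.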

Relevance:

40.00%

Publisher:

Abstract:

In this paper we propose a novel approach to multi-action recognition that performs joint segmentation and classification. The approach models each action with a Gaussian mixture over robust low-dimensional action features. Segmentation is achieved by performing classification on overlapping temporal windows, which are then merged to produce the final result. This approach is considerably less complicated than previous methods, which use dynamic programming or computationally expensive hidden Markov models (HMMs). Initial experiments on a stitched version of the KTH dataset show that the proposed approach achieves an accuracy of 78.3%, outperforming a recent HMM-based approach which obtained 71.2%.
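The window-then-merge idea in the abstract above can be sketched as below. The abstract does not state how the overlapping windows are merged, so the per-frame majority vote used here is an assumption, as are the function name `segment_by_windows`, its parameters, and the `classify` callable standing in for the per-action Gaussian mixture models.

```python
from collections import Counter


def segment_by_windows(frames, classify, window=3, stride=1):
    """Label overlapping windows, then merge them into (label, first, last) segments.

    `classify` is a hypothetical callable mapping a list of frames to a label.
    """
    if not frames:
        return []
    votes = [Counter() for _ in frames]
    for start in range(0, max(len(frames) - window, 0) + 1, stride):
        label = classify(frames[start:start + window])
        for t in range(start, min(start + window, len(frames))):
            votes[t][label] += 1  # every frame in the window gets one vote
    # Majority vote per frame; frames covered by no window get None.
    labels = [v.most_common(1)[0][0] if v else None for v in votes]
    segments = []
    for t, label in enumerate(labels):
        if segments and segments[-1][0] == label:
            segments[-1][2] = t  # extend the current segment
        else:
            segments.append([label, t, t])
    return [tuple(s) for s in segments]
```

With a toy classifier that thresholds the window mean, eight frames `[0, 0, 0, 0, 1, 1, 1, 1]` split into one segment per action. Frames with tied votes take the label that was counted first, which is worth revisiting in a real system.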

Relevance:

40.00%

Publisher:

Abstract:

Urea-based molecular constructs are shown for the first time to be nonlinear optically (NLO) active in solution. We demonstrate self-assembly triggered large amplification and specific anion recognition driven attenuation of the NLO activity. This orthogonal modulation along with an excellent nonlinearity-transparency trade-off makes them attractive NLO probes for studies related to weak self-assembly and anion transportation by second harmonic microscopy.

Relevance:

40.00%

Publisher:

Abstract:

We address the problem of multi-instrument recognition in polyphonic music signals. Individual instruments are modeled within a stochastic framework using Student's-t Mixture Models (tMMs). We impose a mixture of these instrument models on the polyphonic signal model. No a priori knowledge is assumed about the number of instruments in the polyphony. The mixture weights are estimated in a latent variable framework from the polyphonic data using an Expectation Maximization (EM) algorithm derived for the proposed approach. The weights are shown to indicate instrument activity. The output of the algorithm is an Instrument Activity Graph (IAG), from which it is possible to determine which instruments are active at a given time. An average F-ratio of 0.75 is obtained for polyphonies containing 2-5 instruments, on an experimental test set of 8 instruments: clarinet, flute, guitar, harp, mandolin, piano, trombone and violin.
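The weight-estimation step described in the abstract above can be illustrated with the standard EM update for mixture weights when the component models are held fixed. This is a generic sketch rather than the algorithm derived in the paper: the function name `em_mixture_weights` is hypothetical, and the input is assumed to be a table of per-frame likelihoods already computed from trained instrument models.

```python
def em_mixture_weights(likelihoods, n_iter=50):
    """Estimate mixture weights by EM with the component models held fixed.

    likelihoods[n][k] is p(frame n | instrument model k), assumed to come
    from already-trained per-instrument models (hypothetical input).
    """
    n_models = len(likelihoods[0])
    weights = [1.0 / n_models] * n_models  # uniform starting point
    for _ in range(n_iter):
        totals = [0.0] * n_models
        for row in likelihoods:
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # frame explained by no model; skip it
            # E-step: responsibility of each model for this frame
            for k, value in enumerate(joint):
                totals[k] += value / norm
        used = sum(totals)
        if used == 0.0:
            break
        # M-step: each weight becomes the average responsibility
        weights = [t / used for t in totals]
    return weights
```

A weight that stays near zero suggests the corresponding instrument is inactive in the analysed frames, which matches the abstract's reading of the weights as activity indicators.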

Relevance:

40.00%

Publisher:

Abstract:

We propose to develop a human action recognition system based on 3-D optical flow features. Optical flow based features are employed here because, by design, they capture the apparent movement of objects. Moreover, they can represent information hierarchically, from the local pixel level to the global object level. In this work, 3-D optical flow based features are extracted by combining 2-D optical flow based features with the depth flow features obtained from a depth camera. To develop the action recognition system, we employ a Meta-Cognitive Neuro-Fuzzy Inference System (McFIS). The aim of McFIS is to find the decision boundary separating different classes based on their respective optical flow based features. McFIS consists of a neuro-fuzzy inference system (cognitive component) and a self-regulatory learning mechanism (meta-cognitive component). During supervised learning, the self-regulatory learning mechanism monitors the knowledge in the current sample with respect to the existing knowledge in the network and controls learning by choosing among sample deletion, sample learning, and sample reserve strategies. The performance of the proposed action recognition system was evaluated on a proprietary data set consisting of eight subjects. Comparison with a standard support vector machine classifier and an extreme learning machine indicates improved performance of McFIS in recognizing actions based on 3-D optical flow features.