919 results for optical character recognition system
Abstract:
In many science fiction movies, machines are capable of speaking with humans. However, mankind is still far from building such machines, like the famous character C3PO of Star Wars. Over the last six decades, automatic speech recognition systems have been the subject of many studies. Throughout these years, many techniques were developed for use in both software and hardware applications. There are many types of automatic speech recognition systems; the one used in this work is an isolated-word, speaker-independent system that uses Hidden Markov Models for recognition. The goal of this work is to design and synthesize the first two stages of the speech recognition system: speech signal acquisition and signal pre-processing. Both stages were developed on a reprogrammable component, an FPGA, using the VHDL hardware description language, owing to the high performance of this component and the flexibility of the language. This work presents the theory of digital signal processing, such as Fast Fourier Transforms and digital filters, as well as the theory of speech recognition using Hidden Markov Models and the LPC processor. It also presents the results obtained for each of the blocks synthesized and verified in hardware.
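The pre-processing stage mentioned in this abstract can be illustrated in software. The following is a minimal sketch, not the VHDL design of the work: it assumes a conventional chain of pre-emphasis filtering, Hamming-windowed framing, and a radix-2 FFT. The function names, the coefficient 0.97, and the frame sizes are illustrative assumptions that do not come from the abstract.

```python
import cmath
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order FIR high-pass filter: y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a commonly used value, assumed here for illustration.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def hamming(length):
    """Hamming window coefficients for a frame of the given length."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def windowed_frames(signal, size, step):
    """Split the signal into overlapping frames and apply a Hamming window."""
    window = hamming(size)
    return [
        [signal[start + n] * window[n] for n in range(size)]
        for start in range(0, len(signal) - size + 1, step)
    ]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out
```

A frame produced by `windowed_frames` can be passed directly to `fft` as long as the frame size is a power of two. A hardware implementation would normally use fixed-point arithmetic and an iterative FFT instead of recursion.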
Abstract:
Graduate Program in Chemistry - IQ
Abstract:
The human eye is sensitive to visible light. Increasing illumination on the eye causes the pupil to contract, while decreasing illumination causes the pupil to dilate. Visible light causes specular reflections inside the iris ring. On the other hand, the human retina is less sensitive to near-infrared (NIR) radiation in the wavelength range from 800 nm to 1400 nm, but iris detail can still be imaged with NIR illumination. In order to measure the dynamic movement of the human pupil and iris while keeping the light-induced reflexes from affecting the quality of the digitized image, this paper describes a device based on the consensual reflex. This biological phenomenon contracts and dilates the two pupils synchronously when one of the eyes is illuminated by visible light. In this paper, we propose to capture images of the pupil of one eye using NIR illumination while illuminating the other eye with a visible-light pulse. This new approach extracts iris features called "dynamic features" (DFs). The methodology extracts information about the way the human eye reacts to light and uses that information for biometric recognition. The results demonstrate that these features are discriminating: even using the Euclidean distance measure, an average recognition accuracy of 99.1% was obtained. The proposed methodology has the potential to be "fraud-proof," because these DFs can only be extracted from living irises.
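The Euclidean-distance matching mentioned in this abstract can be sketched as a nearest-template search. This is only an illustration: the abstract does not describe the layout of the dynamic-feature vectors, so the gallery structure, the names `euclidean` and `identify`, and the sample values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(probe, gallery):
    """Return (identity, distance) of the enrolled template nearest to probe.

    gallery maps an identity label to one enrolled feature vector
    (a hypothetical structure chosen for this sketch).
    """
    return min(
        ((name, euclidean(probe, template)) for name, template in gallery.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical enrolled templates, for illustration only.
gallery = {"subject_a": [0.0, 0.0], "subject_b": [3.0, 4.0]}
best_match = identify([3.0, 3.0], gallery)
```

In a verification setting, the returned distance would also be compared against a decision threshold before a match is accepted.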
Abstract:
The identification of people by measuring traits of individual anatomy or physiology has led to a specific research area called biometric recognition. This thesis focuses on improving fingerprint recognition systems by addressing three important problems: fingerprint enhancement, fingerprint orientation extraction, and automatic evaluation of fingerprint algorithms. Effective extraction of salient fingerprint features depends on the quality of the input fingerprint. If the fingerprint is very noisy, a reliable set of features cannot be detected. A new fingerprint enhancement method, which is both iterative and contextual, is proposed. This approach detects high-quality regions in fingerprints, selectively applies contextual filtering, and iteratively expands like wildfire toward low-quality regions. A precise estimation of the orientation field would greatly simplify the estimation of other fingerprint features (singular points, minutiae) and improve the performance of a fingerprint recognition system. Fingerprint orientation extraction is improved in two directions. First, after the introduction of a new taxonomy of fingerprint orientation extraction methods, several variants of baseline methods are implemented and, by pointing out the role of pre- and post-processing, we show how to improve the extraction. Second, the introduction of a new hybrid orientation extraction method, which follows an adaptive scheme, significantly improves orientation extraction in noisy fingerprints. Scientific papers typically propose recognition systems that integrate many modules, and therefore an automatic evaluation of fingerprint algorithms is needed to isolate the contributions that represent actual progress in the state of the art.
The lack of a publicly available framework for comparing fingerprint orientation extraction algorithms motivates the introduction of a new benchmark area called FOE (including fingerprints and manually marked orientation ground truth), along with fingerprint matching benchmarks, in the FVC-onGoing framework. The success of this framework is discussed by providing relevant statistics: more than 1450 algorithms submitted and two international competitions.
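To make the notion of orientation extraction concrete, the sketch below estimates the dominant ridge orientation of one image block with the classic gradient-based (averaged squared gradient) estimator. This is a widely used baseline chosen for illustration. The abstract does not say which baseline variants the thesis implements, and the hybrid method it proposes is not reproduced here. The function name and the synthetic test block are assumptions.

```python
import math


def ridge_orientation(block):
    """Estimate the ridge orientation of a grayscale block, in [0, pi).

    block is a list of rows of pixel intensities. Gradients are taken
    with central differences over the interior pixels.
    """
    height, width = len(block), len(block[0])
    gxx = gyy = gxy = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = block[y][x + 1] - block[y][x - 1]
            gy = block[y + 1][x] - block[y - 1][x]
            gxx += gx * gx
            gyy += gy * gy
            gxy += gx * gy
    # Doubling the angle lets opposite gradient directions reinforce each
    # other instead of cancelling; halving recovers the mean direction.
    gradient_angle = 0.5 * math.atan2(2.0 * gxy, gxx - gyy)
    # Ridges run perpendicular to the dominant gradient direction.
    return (gradient_angle + math.pi / 2.0) % math.pi


# Synthetic block with vertical ridges: intensity varies only along x.
vertical_ridges = [[math.sin(x) for x in range(16)] for _ in range(16)]
theta = ridge_orientation(vertical_ridges)  # close to pi / 2
```

In practice the estimate is computed per block across the whole image and then smoothed, which is where the pre- and post-processing discussed in the abstract come into play.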
Abstract:
BACKGROUND: In this paper, we present a new method for the calibration of a microscope and its registration using an active optical tracker. METHODS: In practice, both operations are done simultaneously by moving an active optical marker within the field of view of the two devices. The IR LEDs composing the marker are first segmented from the microscope images. By knowing their corresponding three-dimensional (3D) positions in the optical tracker reference system, it is possible to find the transformation matrix between the reference frames of the two devices. Registration and calibration parameters can be extracted directly from that transformation. In addition, since the zoom and focus can be modified by the surgeon during the operation, we propose a spline-based method to update the camera model to the new setup. RESULTS: The proposed technique is currently being used in an augmented reality system for image-guided surgery in the fields of ear, nose and throat (ENT) and craniomaxillofacial surgery. CONCLUSIONS: The results have proved to be accurate, and the technique is a fast, dynamic and reliable way to calibrate and register the two devices in an OR environment.
Abstract:
Ex vivo porcine retina laser lesions applied with varying laser power (20 mW–2 W, 10 ms pulse, 196 lesions) are manually evaluated by microscopic and optical coherence tomography (OCT) visibility, as well as in histological sections immediately after the deposition of the laser energy. An optical coherence tomography system with 1.78 μm axial resolution, specifically developed to image thin retinal layers simultaneously with laser therapy, is presented, and visibility thresholds of the laser lesions in OCT data and fundus imaging are compared. Optical coherence tomography scans are compared with histological sections to estimate the resolving power for small optical changes in the retinal layers, and real-time time-lapse scans during laser application are shown and analyzed quantitatively. Ultrahigh-resolution OCT inspection features a lesion visibility threshold 40–50 mW (17% reduction) lower than that of visual inspection. With the new measurement system, 42% of the lesions that were invisible using state-of-the-art ophthalmoscopic methods could be detected.
Abstract:
Retinal laser photocoagulation is an established and successful treatment for a variety of retinal diseases. While it is a valuable treatment modality, laser photocoagulation has the drawback of employing high-energy lasers that are capable of physically destroying the neural retina. For reliable therapy, it is therefore crucial to closely monitor the effects of the therapy on the retinal tissue. A depth-resolved representation of optical tissue properties, as provided by optical coherence tomography, may provide valuable information about the treatment effects in the retinal layers if recorded simultaneously with laser coagulation. Therefore, in this work, the use of ultra-high resolution optical coherence tomography to represent tissue changes caused by conventional and selective retinal photocoagulation is investigated. Laser lesions were placed on porcine retina ex vivo using a 577 nm laser as well as a pulsed laser at 527 nm built for selective treatment of the retinal pigment epithelium. Applied energies were varied to generate lesions representing the span from under- to overtreatment. The lesions were examined using a custom-designed optical coherence tomography system with an axial resolution of 1.78 μm and a 70 kHz A-scan rate. Optical coherence tomography scans included volume scans before and after irradiation, as well as time-lapse scans (M-scans) of the lesions. Results show OCT lesion visibility thresholds to be below the thresholds of ophthalmoscopic inspection. With the ultra-high resolution OCT, 42% to 44% of ophthalmoscopically invisible lesions could be detected, and lesions that were under- or overexposed could be distinguished using the OCT data.
Abstract:
Smart homes for the aging population have recently started attracting the attention of the research community. The "health state" of smart homes comprises many different levels; starting with the physical health of citizens, it also includes longer-term health norms and outcomes, as well as the arena of positive behavior changes. One of the problems of interest is to monitor the activities of daily living (ADL) of the elderly, aiming at their protection and well-being. For this purpose, we installed passive infrared (PIR) sensors to detect motion in a specific area inside a smart apartment and used them to collect a set of ADL. In a novel approach, we describe a technology that allows the ground truth collected in one smart home to train activity recognition systems for other smart homes. We asked the users to label all instances of all ADL only once and subsequently applied data mining techniques to cluster in-home sensor firings. Each cluster would therefore represent the instances of the same activity. Once the clusters were associated with their corresponding activities, our system was able to recognize future activities. To improve the activity recognition accuracy, our system preprocessed raw sensor data by identifying overlapping activities. To evaluate the recognition performance on a 200-day dataset, we implemented three different active learning classification algorithms and compared their performance: naive Bayesian (NB), support vector machine (SVM) and random forest (RF). Based on our results, the RF classifier recognized activities with an average specificity of 96.53%, a sensitivity of 68.49%, a precision of 74.41% and an F-measure of 71.33%, outperforming both the NB and SVM classifiers. Further clustering markedly improved the results of the RF classifier. An activity recognition system based on PIR sensors in conjunction with a clustering classification approach was able to detect ADL from datasets collected from different homes.
Thus, our PIR-based smart home technology could improve care and provide valuable information to better understand the functioning of our societies, as well as to inform both individual and collective action in a smart city scenario.
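The F-measure reported for the RF classifier can be checked from the quoted precision and sensitivity, since the F-measure (F1) is the harmonic mean of precision and recall, and sensitivity is another name for recall. The short sketch below redoes that arithmetic. The function name is an illustrative choice, and the inputs are the rounded percentages from the abstract.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values quoted in the abstract for the random forest classifier.
rf_f = f_measure(precision=0.7441, recall=0.6849)
print(f"{rf_f:.4f}")  # prints 0.7133
```

The result agrees with the reported 71.33%, so the four quoted figures are mutually consistent.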
Abstract:
Many mobile devices nowadays embed inertial sensors. This enables new forms of human-computer interaction through the use of gestures (movements performed with the mobile device) as a means of communication. This paper presents an accelerometer-based gesture recognition system for mobile devices that is able to recognize a collection of 10 different hand gestures. The system was conceived to be lightweight and to operate in a user-independent manner in real time. The recognition system was implemented on a smartphone and evaluated through a collection of user tests, which showed a recognition accuracy similar to that of other state-of-the-art techniques and a lower computational complexity. The system was also used to build a human-robot interface that enables controlling a wheeled robot with gestures made with the mobile phone.
Abstract:
The availability of inertial sensors embedded in mobile devices has enabled a new type of interaction based on the movements or “gestures” made by users when holding the device. In this paper we propose a gesture recognition system for mobile devices based on accelerometer and gyroscope measurements. The system is capable of recognizing a set of predefined gestures in a user-independent way, without the need for a training phase. Furthermore, it was designed to run in real time on resource-constrained devices, and therefore has a low computational complexity. The performance of the system is evaluated offline using a dataset of gestures, and also online, through user tests with the system running on a smartphone.
Abstract:
A method for improving template size in an iris-recognition system is reported. To this end, the biological characteristics of the human iris have been studied. Image processing techniques are used to isolate the iris and enhance the area of study, after which multiresolution analysis is performed. The resulting pattern is then reduced through a statistical study.
Abstract:
New forms of natural interaction between human operators and UAVs (Unmanned Aerial Vehicles) are demanded by the military industry to achieve a better balance between UAV control and the burden on the human operator. In this work, a human-machine interface (HMI) based on a novel gesture recognition system using depth imagery is proposed for the control of UAVs. Hand gesture recognition based on depth imagery is a promising approach for HMIs because it is more intuitive, natural, and non-intrusive than alternatives that use complex controllers. The proposed system is based on a Support Vector Machine (SVM) classifier that uses spatio-temporal depth descriptors as input features. The designed descriptor is based on a variation of the Local Binary Pattern (LBP) technique that works efficiently with depth video sequences. Another major consideration is the special hand sign language used for UAV control. A tradeoff between the use of natural hand signs and the minimization of inter-sign interference has been established. Promising results have been achieved on a depth-based database of hand gestures specially developed for the validation of the proposed system.
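For readers unfamiliar with Local Binary Patterns, the sketch below shows the basic 8-neighbour spatial LBP operator applied to a single depth frame, followed by the 256-bin histogram that is typically used as a feature vector. The abstract does not specify the spatio-temporal variation used in the paper, so this is the textbook operator and not the proposed descriptor. The bit ordering and the function names are assumptions, and the SVM classification step is omitted.

```python
def lbp_code(patch):
    """Basic LBP code for the centre pixel of a 3x3 patch.

    Each neighbour contributes one bit, set when the neighbour is greater
    than or equal to the centre. Neighbours are visited clockwise starting
    at the top-left corner (an arbitrary but fixed convention).
    """
    center = patch[1][1]
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2], patch[1][2],
        patch[2][2], patch[2][1], patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over the interior pixels of an image."""
    histogram = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            patch = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            histogram[lbp_code(patch)] += 1
    return histogram
```

A histogram of this kind, computed per frame or per region, is the sort of fixed-length vector an SVM can take as input. Extending it across time would require a temporal neighbourhood, which the abstract does not detail.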
Abstract:
This paper introduces APA ("Artificial Prion Assembly"): a pattern recognition system based on artificial prion crystallization. Specifically, the system classifies patterns according to the resulting prion self-assembly, simulated with cellular automata. Our approach is inspired by the biological aggregation of proteins known as prions, which assemble into amyloid fibers associated with neurodegenerative disorders.
Abstract:
MFCC coefficients extracted from the power spectral density of speech as a whole seem to have become the de facto standard in the area of speaker recognition, as demonstrated by their use in almost all systems submitted to the 2013 Speaker Recognition Evaluation (SRE) in Mobile Environment [1], thus relegating this component of recognition systems to the background. However, in this article we show that selecting an adequate speaker characterization system is as important as selecting the classifier. To this end, we compare the recognition rates achieved by different recognition systems that rely on the same classifier (GMM-UBM) but are connected to different feature extraction systems (based on both classical and biometric parameters). As a result, we show that a gender-dependent biometric parameterization with a simple recognition system based on the GMM-UBM paradigm provides very competitive or even better recognition rates when compared to more complex classification systems based on classical features.
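The GMM-UBM classifier mentioned in this abstract scores a test utterance by comparing its likelihood under a speaker model with its likelihood under a universal background model (UBM). The sketch below illustrates that scoring step with diagonal-covariance Gaussian mixtures. The model parameters are made-up toy values, the function names are assumptions, and the training stage (in GMM-UBM systems the speaker model is usually adapted from the UBM) is omitted.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)
    # Log-sum-exp keeps the computation numerically stable.
    return top + math.log(sum(math.exp(t - top) for t in terms))


def llr_score(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio: speaker model versus UBM."""
    total = sum(
        gmm_log_likelihood(f, speaker_gmm) - gmm_log_likelihood(f, ubm)
        for f in frames
    )
    return total / len(frames)


# Toy one-dimensional models, for illustration only.
ubm = [(0.5, [0.0], [1.0]), (0.5, [4.0], [1.0])]
speaker = [(1.0, [4.0], [1.0])]
target_score = llr_score([[4.0], [3.8]], speaker, ubm)    # positive
impostor_score = llr_score([[0.0], [0.3]], speaker, ubm)  # negative
```

Because the classifier is held fixed in the comparison described by the abstract, any difference in such scores between systems comes from the feature vectors passed in as `frames`.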
Abstract:
This paper describes the GTH-UPM system for the Albayzin 2014 Search on Speech Evaluation. The evaluation task consists of searching for a list of terms/queries in audio files. The GTH-UPM system we present is based on an LVCSR (Large Vocabulary Continuous Speech Recognition) system. We have used the MAVIR corpus and the Spanish partition of the EPPS (European Parliament Plenary Sessions) database to train both the acoustic and language models. The main effort has focused on lexicon preparation and text selection for language model construction. The system uses different lexicons and language models depending on the task being performed. For the best configuration of the system on the development set, we have obtained a FOM of 75.27 for the keyword spotting task.