919 results for optical character recognition system
Abstract:
A near-field scanning optical microscopy (NSOM) system employing a very-small-aperture laser (VSAL) as an active probe is reported in this Letter. The VSAL in our experiment has an aperture size of 300 nm × 300 nm and a near-field spot size of about 600 nm. The resolution of the NSOM system with the VSAL can reach about 600 nm, and even 400 nm. Considering the high output power of the VSAL, such an NSOM system is a potentially useful tool for nanodetection, data storage, nanolithography, and nanobiology.
Abstract:
This paper describes a high-performance multiplexed vibration sensor system using fiber lasers. A serial vibration sensor array consists of four short-cavity fiber lasers. The system employs a single, polarization-insensitive, unbalanced Michelson interferometer to translate the individual laser wavelength shifts induced by vibration signals into interferometer phase shifts. A dense wavelength division demultiplexer (DWDM) with high channel isolation serves as a wavelength filter to demultiplex each laser signal. Finally, a digital phase demodulator based on the phase generated carrier technique is used to achieve high-resolution interrogation. Experimental results show no observable crosstalk on the output channels, and the minimum detectable acceleration of this system is approximately 200 ng/√Hz at 250 Hz, which is fundamentally limited by the frequency noise of the lasers.
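The wavelength-to-phase conversion performed by the unbalanced interferometer can be illustrated with a short sketch. It uses the standard small-signal relation for an interferometer with optical path difference OPD, where the phase is 2π·OPD/λ and a small wavelength shift Δλ therefore changes it by about −2π·OPD·Δλ/λ². The function name and the numeric values are illustrative assumptions and are not taken from the paper.

```python
import math

def phase_shift(opd_m, wavelength_m, delta_wavelength_m):
    """Approximate phase change (radians) of an unbalanced interferometer
    for a small wavelength shift, from phi = 2*pi*OPD / lambda."""
    return -2.0 * math.pi * opd_m * delta_wavelength_m / wavelength_m ** 2

# Illustrative values only (assumed, not from the paper):
# 1 m optical path difference, 1550 nm laser, 1 fm wavelength shift.
dphi = phase_shift(opd_m=1.0, wavelength_m=1550e-9, delta_wavelength_m=1e-15)
print(f"phase shift: {dphi:.3e} rad")
```

A larger path difference gives a larger phase shift for the same wavelength shift, which is why the interferometer is deliberately unbalanced.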
Abstract:
The electronic band structures and optical gains of InAs1-xNx/GaAs pyramid quantum dots (QDs) are calculated using the ten-band k·p model and the valence force field method. The optical gains are calculated using the zero-dimensional optical gain formula, taking into account both homogeneous and inhomogeneous broadening due to the size fluctuation of the quantum dots, which follows a normal distribution. By varying the QD size and nitrogen composition, it is shown that the nitrogen composition and the strain can significantly affect the energy levels, especially the conduction band, which undergoes a repulsive interaction with the nitrogen resonant state due to band anticrossing. This facilitates laser emission at longer wavelengths (1.33 or 1.55 μm) for optical fiber communication systems. A QD with higher nitrogen composition has a longer emission wavelength and is less affected by detrimental higher-excited-state transitions, but the nitrogen composition can affect the maximum gain through the transition matrix element and the Fermi-Dirac distributions of electrons in the conduction bands and holes in the valence bands. A larger QD has a greater maximum optical gain at lower carrier density, but it is gradually surpassed by a smaller QD as the carrier concentration increases. A larger QD reaches its saturation gain faster, but this saturation gain is smaller than that of a smaller QD. The trade-off among longer wavelength, maximum optical gain, saturation gain, and differential gain must therefore be considered when selecting the appropriate QD size for a specific application. © 2009 American Institute of Physics. [DOI: 10.1063/1.3143025]
Abstract:
In this paper, we propose a new scheme for omnidirectional object recognition in free space. The proposed scheme divides the problem into several omnidirectional object-recognition tasks with different depression angles. An omnidirectional object-recognition system with oblique observation directions, based on a new recognition theory, Biomimetic Pattern Recognition (BPR), is discussed in detail. From it, we can obtain the number of training samples needed by the omnidirectional object-recognition system in free space. Omnidirectional recognition tests were performed on various kinds of animal models with rather similar shapes. Over a total of 8400 tests, the correct recognition rate is 99.89% and the rejection rate is 0.11%, under the condition of a zero error rate. Experimental results are presented to show that the proposed approach outperforms three types of SVMs with either a third-degree polynomial kernel or a radial basis function kernel.
Abstract:
The Double Synapse Weighted Neuron (DSWN) is a general-purpose neuron model capable of configuring a Hyper-sausage neuron (HSN). After introducing the design method of the hardware DSWN synapse, this paper proposes a DSWN-based special-purpose neural computing device, CASSANN-IIspr. As an application, a rigid body recognition system was developed on CASSANN-IIspr, which achieved better performance than the RIBF-SVMs system.
Abstract:
The optical storage characteristics of a new kind of organic photochromic material, pyrrylfulgide, were experimentally investigated in an established parallel optical data storage system. Using the pyrrylfulgide/PMMA film as a photon-mode recording medium, micro-images and encoded binary digital data were recorded, read out, and erased in this parallel system. The storage density currently reaches 3 × 10^7 bit/cm^2. The information recorded on the film can be kept for years in darkness at room temperature.
Abstract:
Planar punch-through heterojunction phototransistors with a novel emitter control electrode and ion-implanted isolation (CE-PTHPT) are investigated. The phototransistors have a working voltage of 3-10 V and high sensitivity at low input power. The base of the transistor is completely depleted under operating conditions, and the base current is zero. The CE-PTHPT has increased speed and decreased noise. The novel CE-PTHPT has been fabricated in this work. The optical gain of the GaAlAs/GaAs CE-PTHPT reached 1260 and 8108 at incident light powers of 1.3 and 43 nW, respectively, at a wavelength of 0.8 μm. The calculated input noise current is 5.46 × 10^-16 A/Hz^(1/2). For the polysilicon-emitter CE-PTHPT, the optical gain is 3083 at an input power of 0.174 μW. The optical gain of the InGaAs/InP CE-PTHPT reaches 350 for an incident power of 0.3 μW at a wavelength of 1.55 μm. CE-PTHPT detectors are promising as photodetectors for optical fiber communication systems.
Abstract:
Video-based facial expression recognition is a challenging problem in computer vision and human-computer interaction. To address this problem, texture features have been extracted and widely used, because they can capture image intensity changes caused by skin deformation. However, existing texture features encounter problems with albedo and lighting variations. To solve both problems, we propose a new texture feature called image ratio features. Compared with previously proposed texture features, e.g., high gradient component features, image ratio features are more robust to albedo and lighting variations. In addition, to further improve facial expression recognition accuracy based on image ratio features, we combine image ratio features with facial animation parameters (FAPs), which describe the geometric motions of facial feature points. The performance evaluation is based on the Carnegie Mellon University Cohn-Kanade database, our own database, and the Japanese Female Facial Expression database. Experimental results show that the proposed image ratio feature is more robust to albedo and lighting variations, and that the combination of image ratio features and FAPs outperforms each feature alone. In addition, we study asymmetric facial expressions based on our own facial expression database and demonstrate the superior performance of our combined expression recognition system.
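Abstract:
This paper describes a robot object-sorting system that uses optical array sensors. The sensors are mounted directly on the robot's two fingers. The shadow of the grasped object is transmitted through optical fibers to photosensitive elements placed in a "safe zone". After the computer recognizes the object's contour, it commands the robot to grasp the object and deliver it to a designated location, thereby sorting the objects.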
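Abstract:
This paper presents the design and implementation of a license plate recognition system based on the TMS320DM642, and describes in detail the system's hardware composition, software flow, detection algorithms, and the system optimizations made for the DSP processor. The system captures license plate images through a camera and, with the TMS320DM642 processor at the core of its hardware platform, performs a series of algorithms including plate localization, tilt-angle correction, character segmentation, and character recognition. Experimental results show that the TMS320DM642-based license plate recognition system is accurate and effective and has broad application prospects.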
Abstract:
Handheld communication devices have recently been developing rapidly, reaching more users and more application fields, and have a promising future. This study investigated the acceptance of the multimodal text entry method and users' behavioral characteristics when using it. Based on the general information-processing model of a bimodal system and human factors studies of the multimodal map system, the present study focused mainly on the hand-speech bimodal text entry method. For acceptance, the study investigated the subjective perception of speech recognition accuracy using a Wizard of Oz (WOz) experiment and a questionnaire. Results showed a linear relationship between speech recognition accuracy and subjective accuracy. Furthermore, as familiarity increased, the difference between the acceptable accuracy and the subjective accuracy gradually decreased. In addition, the similarity in meaning between the speech recognition output and the correct sentences was an important reference criterion. The second study investigated three aspects of the bimodal text entry method: input, error recovery, and modal shifts. The first experiment aimed to identify users' behavioral characteristics when performing an error recovery task. Results indicated that participants preferred to correct errors by handwriting, regardless of the input modality. The second experiment aimed to identify users' behavioral characteristics when entering various types of text. Results showed that users preferred speech input in both the word and sentence conditions, a preference that was highly consistent among individuals, while no significant difference was found between handwriting and speech input in the character condition. Participants used the direct strategy more than the jumping strategy to deal with mixed text, especially for the Chinese-English mixed type.
The third experiment examined the cognitive load of the different modal shifts; the results suggested significant differences between shifts. Moreover, relatively little time was needed for the shift from speech input to hand input. Based on the main findings, the following implications were discussed. First, when evaluating a speech recognition system, attention should be paid to the fact that speech recognition accuracy is not equal to subjective accuracy. Second, to make a speech input system more acceptable, a good method is to train users and provide accuracy feedback during training, which improves familiarity with and sensitivity to the system. Third, both universal and individual behavioral patterns should be taken into consideration to improve the error recovery method. Fourth, to make speech input easier to learn and use, its operations should be simpler. Fifth, a more convenient text input method should be provided for non-Chinese text entry. Finally, the shifting time between hand input and speech input provides an important parameter for the design of automatically evoked speech recognition systems.
Abstract:
This paper presents a new method of grouping edges in order to recognize objects. This grouping method succeeds on images of both two- and three-dimensional objects. So that the recognition system can consider first the collections of edges most likely to lead to the correct recognition of objects, we order groups of edges based on the likelihood that a single object produced them. The grouping module estimates this likelihood using the distance that separates edges and their relative orientation. This ordering greatly reduces the amount of computation required to locate objects and improves the system's robustness to error.
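The ordering idea described above (rank edge groups by how likely one object produced them, using separation and relative orientation) can be sketched as follows. The abstract does not give the paper's likelihood estimate, so the scoring function here is a stand-in heuristic: it is assumed that nearby, nearly parallel segments score highest. `pair_score`, `gap_scale`, and the segment representation are all hypothetical.

```python
import math
from itertools import combinations

def orientation(seg):
    """Undirected orientation of a segment ((x1, y1), (x2, y2)), in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def endpoint_gap(a, b):
    """Smallest distance between any endpoint of a and any endpoint of b."""
    return min(math.dist(p, q) for p in a for q in b)

def pair_score(a, b, gap_scale=10.0):
    """Heuristic score in [0, 1]; assumed form, not the paper's estimate."""
    diff = abs(orientation(a) - orientation(b))
    diff = min(diff, math.pi - diff)            # angle difference in [0, pi/2]
    proximity = math.exp(-endpoint_gap(a, b) / gap_scale)
    alignment = math.cos(diff)                  # 1 when parallel, 0 when perpendicular
    return proximity * alignment

def ranked_pairs(segments):
    """Index pairs of segments, most promising pair first."""
    pairs = combinations(range(len(segments)), 2)
    return sorted(
        pairs,
        key=lambda ij: pair_score(segments[ij[0]], segments[ij[1]]),
        reverse=True,
    )

segments = [((0, 0), (10, 0)), ((11, 0), (20, 0)), ((50, 50), (50, 60))]
print(ranked_pairs(segments))
```

A recognizer would then try the top-ranked groups first and stop once an object is found, which is where the computational saving comes from.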
Abstract:
This research project is a study of the role of fixation and visual attention in object recognition. In this project, we build an active vision system which can recognize a target object in a cluttered scene efficiently and reliably. Our system integrates visual cues like color and stereo to perform figure/ground separation, yielding candidate regions on which to focus attention. Within each image region, we use stereo to extract features that lie within a narrow disparity range about the fixation position. These selected features are then used as input to an alignment-style recognition system. We show that visual attention and fixation significantly reduce the complexity and the false identifications in model-based recognition using Alignment methods. We also demonstrate that stereo can be used effectively as a figure/ground separator without the need for accurate camera calibration.
Abstract:
The key to understanding a program is recognizing familiar algorithmic fragments and data structures in it. Automating this recognition process will make it easier to perform many tasks which require program understanding, e.g., maintenance, modification, and debugging. This report describes a recognition system, called the Recognizer, which automatically identifies occurrences of stereotyped computational fragments and data structures in programs. The Recognizer is able to identify these familiar fragments and structures, even though they may be expressed in a wide range of syntactic forms. It does so systematically and efficiently by using a parsing technique. Two important advances have made this possible. The first is a language-independent graphical representation for programs and programming structures which canonicalizes many syntactic features of programs. The second is an efficient graph parsing algorithm.
Abstract:
Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. 
The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
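The general idea of embedding-based retrieval described above can be illustrated with a generic filter-and-refine sketch. This is not BoostMap: the embedding is a plain reference-object embedding (each object is mapped to its distances to a few fixed reference objects), with no learning step. Edit distance stands in for the "computationally expensive" measure, and all function names, the toy database, and the choice of references are assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance; stands in for an expensive exact distance measure."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def embed(obj, references, exact_distance):
    """Map an object to the vector of its distances to the reference objects."""
    return [exact_distance(obj, r) for r in references]

def l1(u, v):
    """Cheap distance used in the embedded vector space."""
    return sum(abs(x - y) for x, y in zip(u, v))

def filter_and_refine(query, database, db_vectors, references, exact_distance, p):
    """Filter: keep the p closest objects under the cheap L1 distance.
    Refine: return the index of the best candidate under the exact distance."""
    q_vec = embed(query, references, exact_distance)
    by_cheap = sorted(range(len(database)), key=lambda i: l1(q_vec, db_vectors[i]))
    return min(by_cheap[:p], key=lambda i: exact_distance(query, database[i]))

database = ["kitten", "sitting", "mitten", "flask", "glass"]
references = ["kitten", "glass"]
# Database vectors are computed once, offline.
db_vectors = [embed(s, references, levenshtein) for s in database]

best = filter_and_refine("flash", database, db_vectors, references, levenshtein, p=2)
print(database[best])
```

At query time the exact distance is evaluated only against the references and the `p` surviving candidates, not against the whole database. The result is approximate: a true nearest neighbor that the cheap distance ranks below position `p` is missed, so `p` trades accuracy against speed.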