979 resultados para Optical music recognition


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In many parts of the world, uncontrolled fires in sparsely populated areas are a major concern as they can quickly grow into large and destructive conflagrations in short time spans. Detecting these fires has traditionally been a job for trained humans on the ground, or in the air. In many cases, these manned solutions are simply not able to survey the amount of area necessary to maintain sufficient vigilance and coverage. This paper investigates the use of unmanned aerial systems (UAS) for automated wildfire detection. The proposed system uses low-cost, consumer-grade electronics and sensors combined with various airframes to create a system suitable for automatic detection of wildfires. The system employs automatic image processing techniques to analyze captured images and autonomously detect fire-related features such as fire lines, burnt regions, and flammable material. This image recognition algorithm is designed to cope with environmental occlusions such as shadows, smoke and obstructions. Once the fire is identified and classified, it is used to initialize a spatial/temporal fire simulation. This simulation is based on occupancy maps whose fidelity can be varied to include stochastic elements, various types of vegetation, weather conditions, and unique terrain. The simulations can be used to predict the effects of optimized firefighting methods to prevent the future propagation of the fires and greatly reduce time to detection of wildfires, thereby greatly minimizing the ensuing damage. This paper also documents experimental flight tests using a SenseFly Swinglet UAS conducted in Brisbane, Australia as well as modifications for custom UAS.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we use optical flow based complex-valued features extracted from video sequences to recognize human actions. The optical flow features between two image planes can be appropriately represented in the Complex plane. Therefore, we argue that motion information that is used to model the human actions should be represented as complex-valued features and propose a fast learning fully complex-valued neural classifier to solve the action recognition task. The classifier, termed as, ``fast learning fully complex-valued neural (FLFCN) classifier'' is a single hidden layer fully complex-valued neural network. The neurons in the hidden layer employ the fully complex-valued activation function of the type of a hyperbolic secant function. The parameters of the hidden layer are chosen randomly and the output weights are estimated as the minimum norm least square solution to a set of linear equations. The results indicate the superior performance of FLFCN classifier in recognizing the actions compared to real-valued support vector machines and other existing results in the literature. Complex valued representation of 2D motion and orthogonal decision boundaries boost the classification performance of FLFCN classifier. (c) 2012 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present a fast learning neural network classifier for human action recognition. The proposed classifier is a fully complex-valued neural network with a single hidden layer. The neurons in the hidden layer employ the fully complex-valued hyperbolic secant as an activation function. The parameters of the hidden layer are chosen randomly and the output weights are estimated analytically as a minimum norm least square solution to a set of linear equations. The fast leaning fully complex-valued neural classifier is used for recognizing human actions accurately. Optical flow-based features extracted from the video sequences are utilized to recognize 10 different human actions. The feature vectors are computationally simple first order statistics of the optical flow vectors, obtained from coarse to fine rectangular patches centered around the object. The results indicate the superior performance of the complex-valued neural classifier for action recognition. The superior performance of the complex neural network for action recognition stems from the fact that motion, by nature, consists of two components, one along each of the axes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present a machine learning approach for subject independent human action recognition using depth camera, emphasizing the importance of depth in recognition of actions. The proposed approach uses the flow information of all 3 dimensions to classify an action. In our approach, we have obtained the 2-D optical flow and used it along with the depth image to obtain the depth flow (Z motion vectors). The obtained flow captures the dynamics of the actions in space time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion. These hierarchical fine to coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, referred to as PBL-McRBFN, henceforth. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBLMcRBFN uses the sample overlapping conditions, and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of a Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers with representation of every person and action in the training and testing datasets. Performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted by leave-one-subject-out strategy and its generalization performance is tested. It is observed from the subject-independent study that McRBFN is capable of generalizing actions accurately. The performance of the proposed approach is benchmarked with Video Analytics Lab (VAL) dataset and Berkeley Multimodal Human Action Database (MHAD). (C) 2013 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we report a breakthrough result on the difficult task of segmentation and recognition of coloured text from the word image dataset of ICDAR robust reading competition challenge 2: reading text in scene images. We split the word image into individual colour, gray and lightness planes and enhance the contrast of each of these planes independently by a power-law transform. The discrimination factor of each plane is computed as the maximum between-class variance used in Otsu thresholding. The plane that has maximum discrimination factor is selected for segmentation. The trial version of Omnipage OCR is then used on the binarized words for recognition. Our recognition results on ICDAR 2011 and ICDAR 2003 word datasets are compared with those reported in the literature. As baseline, the images binarized by simple global and local thresholding techniques were also recognized. The word recognition rate obtained by our non-linear enhancement and selection of plance method is 72.8% and 66.2% for ICDAR 2011 and 2003 word datasets, respectively. We have created ground-truth for each image at the pixel level to benchmark these datasets using a toolkit developed by us. The recognition rate of benchmarked images is 86.7% and 83.9% for ICDAR 2011 and 2003 datasets, respectively.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Peripherally triarylborane decorated porphyrin (2) and its Zn(II) complex (3) have been synthesized. Compound 3 contains of two different Lewis acidic binding sites (Zn(II) and boron center). Unlike all previously known triarylborane based sensors, the optical responses of 3 toward fluoride and cyanide are distinctively different, thus enabling the discrimination of these two interfering anions. Metalloporphyrin 3 shows a multiple channel fluorogenic response toward fluoride and cyanide and also a selective visual colorimetric response toward cyanide. By comparison with model systems and from detailed photophysical studies on 2 and 3, we conclude that the preferential binding of fluoride occurs at the peripheral borane moieties resulting in the cessation of the EET (electronic energy transfer) process from borane to porphyrin core and with negligible negetive cooperative effects. On the other hand, cyanide binding occurs at the Zn(II) core leading to drastic changes in its absorption behavior which can be followed by the naked eye. Such changes are not observed when the boryl substituent is absent (e.g., Zn-TPP and TPP). Compounds 2 and 3 were also found to be capable of extracting fluoride from aqueous medium.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The tonic is a fundamental concept in Indian art music. It is the base pitch, which an artist chooses in order to construct the melodies during a rg(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rg recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we also show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fringe tracking and fringe order assignment have become the central topics of current research in digital photoelasticity. Isotropic points (IPs) appearing in low fringe order zones are often either overlooked or entirely missed in conventional as well as digital photoelasticity. We aim to highlight image processing for characterizing IPs in an isochromatic fringe field. By resorting to a global analytical solution of a circular disk, sensitivity of IPs to small changes in far-field loading on the disk is highlighted. A local theory supplements the global closed-form solutions of three-, four-, and six-point loading configurations of circular disk. The local theoretical concepts developed in this paper are demonstrated through digital image analysis of isochromatics in circular disks subjected to three-and four-point loads. (C) 2015 Society of Photo-Optical Instrumentation Engineers (SPIE)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a new method for local key and chord estimation from audio signals. This method relies primarily on principles from music theory, and does not require any training on a corpus of labelled audio files. A harmonic content of the musical piece is first extracted by computing a set of chroma vectors. A set of chord/key pairs is selected for every frame by correlation with fixed chord and key templates. An acyclic harmonic graph is constructed with these pairs as vertices, using a musical distance to weigh its edges. Finally, the sequences of chords and keys are obtained by finding the best path in the graph using dynamic programming. The proposed method allows a mutual chord and key estimation. It is evaluated on a corpus composed of Beatles songs for both the local key estimation and chord recognition tasks, as well as a larger corpus composed of songs taken from the Billboard dataset.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A visual pattern recognition network and its training algorithm are proposed. The network constructed of a one-layer morphology network and a two-layer modified Hamming net. This visual network can implement invariant pattern recognition with respect to image translation and size projection. After supervised learning takes place, the visual network extracts image features and classifies patterns much the same as living beings do. Moreover we set up its optoelectronic architecture for real-time pattern recognition. (C) 1996 Optical Society of America

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An ordered gray-scale erosion is suggested according to the definition of hit-miss transform. Instead of using three operations, two images, and two structuring elements, the developed operation requires only one operation and one structuring element, but with three gray-scale levels. Therefore, a union of the ordered gray-scale erosions with different structuring elements can constitute a simple image algebra to program any combined image processing function. An optical parallel ordered gray-scale erosion processor is developed based on the incoherent correlation in a single channel. Experimental results are also given for an edge detection and a pattern recognition. (C) 1998 Society of Photo-Optical Instrumentation Engineers. [S0091-3286(98)00306-7].

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Ultrafast temporal pattern generation and recognition with femtosecond laser technology is presented, analyzed, and experimentally implemented. Ultrafast temporal pattern generation and recognition are realized by taking advantage of two well-known techniques: the space-time conversion technique and the ultrafast pulse measurement technique. Here the temporal pattern for the designed multiple pulses, optimized with a preassumed Gaussian spectral distribution of an ultrashort pulse, is described. With the simulation of a Gaussian spectral distribution, we realize that the uniformity of the generated multiple ultrafast temporal pulses is relevant to the repeated number of modulation periods in the mask in the spectral plane. Moreover, the change of Gaussian spectral phases with the wavelengths in the modulated phase plate is considered. Experiments of ultrafast temporal pattern recognition by the frequency-resolved optical gating (FROG) characterization technique are also given. (C) 2004 Society of Photo-Optical Instrumentation Engineers.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The objective of the work was to develop a non-invasive methodology for image acquisition, processing and nonlinear trajectory analysis of the collective fish response to a stochastic event. Object detection and motion estimation were performed by an optical flow algorithm in order to detect moving fish and simultaneously eliminate background, noise and artifacts. The Entropy and the Fractal Dimension (FD) of the trajectory followed by the centroids of the groups of fish were calculated using Shannon and permutation Entropy and the Katz, Higuchi and Katz-Castiglioni's FD algorithms respectively. The methodology was tested on three case groups of European sea bass (Dicentrarchus labrax), two of which were similar (C1 control and C2 tagged fish) and very different from the third (C3, tagged fish submerged in methylmercury contaminated water). The results indicate that Shannon entropy and Katz-Castiglioni were the most sensitive algorithms and proved to be promising tools for the non-invasive identification and quantification of differences in fish responses. In conclusion, we believe that this methodology has the potential to be embedded in online/real time architecture for contaminant monitoring programs in the aquaculture industry.