10 results for ASR
at Indian Institute of Science - Bangalore - India
Abstract:
We address the problem of jointly using multiple noisy speech patterns for automatic speech recognition (ASR), given that they come from the same class. If the user utters a word K times, the ASR system should use the information in all K patterns of the word simultaneously and improve its recognition accuracy over single-pattern recognition. To address this problem, we recently proposed a Multi Pattern Dynamic Time Warping (MPDTW) algorithm that aligns the K patterns by finding the least-distortion path between them. A Constrained Multi Pattern Viterbi algorithm was used on this aligned path for isolated word recognition (IWR). In this paper, we explore the possibility of using only the MPDTW algorithm for IWR. We also study the properties of the MPDTW algorithm. We show that using only 2 noisy test patterns (10 percent burst noise at -5 dB SNR) reduces the noisy speech recognition error rate by 37.66 percent compared to single-pattern recognition using the Dynamic Time Warping algorithm.
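For orientation, the single-pattern baseline named in this abstract can be sketched as follows. This is a minimal sketch of classic two-pattern Dynamic Time Warping used for nearest-template isolated word recognition, not the MPDTW algorithm itself. The function names, the Euclidean local distortion, and the symmetric step pattern are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Accumulated distortion of the best warping path between two patterns.

    a and b are sequences of equal-dimension feature vectors (tuples of floats).
    Assumption: Euclidean local distortion and the basic three-way step pattern.
    """
    n, m = len(a), len(b)
    # cost[i][j] = least accumulated distortion aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame-to-frame distortion
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def recognize(test, templates):
    """Return the label of the reference template closest to the test pattern.

    templates is a hypothetical dict mapping a word label to one reference pattern.
    """
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))
```

MPDTW, as described above, generalizes the alignment from two patterns to K patterns; the sketch covers only the two-pattern case against which the abstract reports its gain.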
Abstract:
High mass X-ray binary (HMXB) pulsars are of two types, persistent and transient. 4U1538-52 is a persistent HMXB whose orbit was previously measured to be circular, but the RXTE observations revealed an eccentric orbit. We observed this system with RXTE-PCA in August 2003 and our timing analysis supports the eccentric orbit of the system. However, we do not find any evidence for orbital evolution. Rotational and tidal interactions between the stars of a close binary system result in apsidal motion, which can be measured in systems with eccentric orbits. 4U0115+63 is a Be-transient HMXB whose eccentric orbit was well determined during its 1978 outburst. We report preliminary results from analysis of data obtained during the 1999 outburst of this source with the RXTE-PCA.
Abstract:
High mass X-ray binary (HMXB) pulsars are of two types, persistent and transient. 4U1538-52 is a persistent HMXB whose orbit was previously measured to be circular, but the RXTE observations revealed an eccentric orbit. We observed this system with RXTE-PCA in August 2003 and our timing analysis supports the eccentric orbit of the system. However, we do not find any evidence for orbital evolution. Rotational and tidal interactions between the stars of a close binary system result in apsidal motion, which can be measured in systems with eccentric orbits. 4U0115+63 is a Be-transient HMXB whose eccentric orbit was well determined during its 1978 outburst. We report preliminary results from analysis of data obtained during the 1999 outburst of this source with the RXTE-PCA.
Abstract:
Joint decoding of multiple speech patterns so as to improve speech recognition performance is important, especially in the presence of noise. In this paper, we propose a Multi-Pattern Viterbi algorithm (MPVA) to jointly decode and recognize multiple speech patterns for automatic speech recognition (ASR). The MPVA is a generalization of the Viterbi Algorithm to jointly decode multiple patterns given a Hidden Markov Model (HMM). Unlike the previously proposed two-stage Constrained Multi-Pattern Viterbi Algorithm (CMPVA), the MPVA is a single-stage algorithm. MPVA has the advantage that it can be extended to connected word recognition (CWR) and continuous speech recognition (CSR) problems. MPVA is shown to provide better speech recognition performance than the earlier techniques: using only two repetitions of noisy speech patterns (-5 dB SNR, 10% burst noise), the word error rate using MPVA decreased by 28.5% compared to individual decoding. (C) 2010 Elsevier B.V. All rights reserved.
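As background for the generalization described in this abstract, the standard single-pattern Viterbi algorithm can be sketched as below. This is a minimal log-domain sketch for a discrete-observation HMM, not the MPVA. The function name, the argument layout, and the use of discrete symbols rather than acoustic feature vectors are assumptions made for illustration.

```python
import math


def viterbi(obs, log_pi, log_A, log_B):
    """Most likely state sequence of an HMM for one observation sequence.

    obs    : list of observation symbol indices
    log_pi : log_pi[s]    = log initial probability of state s
    log_A  : log_A[r][s]  = log transition probability from state r to state s
    log_B  : log_B[s][o]  = log probability of emitting symbol o in state s
    Returns (state path, log-probability of that path).
    """
    n_states = len(log_pi)
    # delta[s] = best log score of any partial path ending in state s
    delta = [log_pi[s] + log_B[s][obs[0]] for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        prev = delta
        delta, ptr = [], []
        for s in range(n_states):
            # best predecessor of state s at this time step
            best = max(range(n_states), key=lambda r: prev[r] + log_A[r][s])
            ptr.append(best)
            delta.append(prev[best] + log_A[best][s] + log_B[s][o])
        backptr.append(ptr)
    # backtrack from the best final state
    state = max(range(n_states), key=lambda s: delta[s])
    best_score = delta[state]
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score
```

The abstract's MPVA decodes several repetitions of a word jointly against the same HMM; the sketch decodes one pattern at a time, which corresponds to the "individual decoding" baseline.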
Abstract:
The 1300-km rupture of the 2004 interplate earthquake terminated at around 15 degrees N, in the northernmost segment of the Andaman-Nicobar subduction zone. This part of the plate boundary is noted for its generally lower level of seismicity compared with the southern segments. Based on the Global Centroid Moment Tensor (CMT) and National Earthquake Information Center (NEIC) data, most of the earthquakes of Mw ≥ 4.5 prior to 2004 were associated with the Andaman Spreading Ridge (ASR), and a few events were located within the forearc basin. The 2004 event was followed by an upward migration of hypocenters along the subducting plate, and the Andaman segment experienced a surge of aftershock activity. The continuing extensional faulting events, including the most recent earthquake (10 August 2009; Mw 7.5) at the northern end of the 2004 rupture, suggest the reduction of compressional strain associated with the interplate event. The style of faulting of the intraplate events before and after a great plate boundary earthquake reflects the relative influences of the plate-driving forces. Here we discuss the pattern of earthquakes in the Andaman segment before and after the 2004 event to appraise the spatial and temporal relation between large interplate thrust events and intraplate deformation. This study suggests that faulting mechanisms in the outer-ridge and outer-rise regions could be indicative of the maturity of interplate seismic cycles.
Abstract:
Fly ash and silica fume are two pozzolans that have been widely used for improved concrete strength and durability. Silica fume displays a greater pozzolanic reactivity than fly ash, primarily due to its finer particle size. The reactivity of fly ash can be improved by reducing its particle size distribution. This paper discusses the fresh and hardened properties of concrete made with an ultra-fine fly ash (UFFA) produced by air classification. Durability testing for chloride diffusivity, rapid chloride permeability, alkali-silica reaction (ASR), and sulfate attack was also conducted. It was found that, at a given workability and water content, concrete containing UFFA could be produced with only 50% of the high-range water-reducer dosage required for comparable silica fume concrete. Early strengths and durability measures similar to those of silica fume concrete were observed when a slightly higher dosage of UFFA was used with a small reduction (10%) in water content.
Abstract:
Effective feature extraction for robust speech recognition is a widely addressed topic, and currently there is much effort to invoke non-stationary signal models instead of the quasi-stationary signal models that lead to standard features such as LPC or MFCC. Joint amplitude modulation and frequency modulation (AM-FM) is a classical non-parametric approach to non-stationary signal modeling, and recently new feature sets for automatic speech recognition (ASR) have been derived based on a multi-band AM-FM representation of the signal. We consider several of these representations and compare their performance for robust speech recognition in noise, using the AURORA-2 database. We show that the proposed FEPSTRUM representation is more effective than the others. We also propose an improvement to FEPSTRUM based on the Teager energy operator (TEO) and show that it can selectively outperform even FEPSTRUM.
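The Teager energy operator mentioned in this abstract has a simple standard discrete-time form, psi[x(n)] = x(n)^2 - x(n-1) x(n+1). The sketch below shows only that operator; how it is combined with FEPSTRUM is not stated in the abstract and is not attempted here. The function name is an assumption made for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator applied to a list of samples.

    Returns psi[n] = x[n]**2 - x[n-1] * x[n+1] for the interior samples,
    so the output is two samples shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# For a pure sinusoid A*cos(w*n + phi) the operator yields the constant
# A**2 * sin(w)**2, which is why it tracks amplitude and frequency jointly.
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n + 0.1) for n in range(50)]
energy = teager_energy(signal)
```

Because the output depends on both amplitude and frequency of a narrowband component, the operator is a natural fit for the multi-band AM-FM representations the abstract compares.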
Abstract:
This paper discusses an approach for river mapping and flood evaluation based on multi-temporal time series analysis of satellite images, utilizing pixel spectral information for image classification and region-based segmentation for extracting water-covered regions. Analysis of MODIS satellite images is applied in three stages: before flood, during flood and after flood. Water regions are extracted from the MODIS images using image classification (based on spectral information) and image segmentation (based on spatial information). Multi-temporal MODIS images from "normal" (non-flood) and flood time periods are processed in two steps. In the first step, image classifiers such as Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs) separate the image pixels into water and non-water groups based on their spectral features. In the second step, the classified image is segmented using spatial features of the water pixels to remove the misclassified water. From the results obtained, we evaluate the performance of the method and conclude that the use of image classification (SVM and ANN) and region-based image segmentation is an accurate and reliable approach for the extraction of water-covered regions. (c) 2012 COSPAR. Published by Elsevier Ltd. All rights reserved.
Abstract:
A Variable Endmember Constrained Least Square (VECLS) technique is proposed to account for endmember variability in the linear mixture model by incorporating the variance for each class, the signals of which vary from pixel to pixel due to changes in urban land cover (LC) structures. VECLS is first tested with computer-simulated three-class endmembers over four bands, with small, medium and large variability at three different spatial resolutions. The technique is next validated with real datasets from IKONOS, Landsat ETM+ and MODIS. The results show that the correlation between actual and estimated proportions is higher by an average of 0.25 for the artificial datasets compared to a situation where variability is not considered. With IKONOS, Landsat ETM+ and MODIS data, the average correlation increased by 0.15 for 2 and 3 classes and by 0.19 for 4 classes, compared to a single endmember per class. (C) 2013 COSPAR. Published by Elsevier Ltd. All rights reserved.
Abstract:
Acoustic feature based speech (syllable) rate estimation and syllable nuclei detection are important problems in automatic speech recognition (ASR), computer assisted language learning (CALL) and fluency analysis. A typical solution for both problems consists of two stages. The first stage involves computing a short-time feature contour such that most of the peaks of the contour correspond to the syllabic nuclei. In the second stage, the peaks corresponding to the syllable nuclei are detected. In this work, instead of peak detection, we perform a mode-shape classification, which is formulated as a supervised binary classification problem: mode-shapes representing the syllabic nuclei form one class and the remaining mode-shapes form the other. We use the temporal correlation and selected sub-band correlation (TCSSBC) feature contour, and the mode-shapes in the TCSSBC feature contour are converted into a set of feature vectors using an interpolation technique. A support vector machine classifier is used for the classification. Experiments are performed separately using the Switchboard, TIMIT and CTIMIT corpora in a five-fold cross validation setup. The average correlation coefficients for the syllable rate estimation turn out to be 0.6761, 0.6928 and 0.3604 for the three corpora, respectively, which outperform those obtained by the best of the existing peak detection techniques. Similarly, the average F-scores (syllable level) for the syllable nuclei detection are 0.8917, 0.8200 and 0.7637 for the three corpora, respectively. (C) 2016 Elsevier B.V. All rights reserved.