Biblioteca Digital

974 results for Speech Processing

H.264 COMPRESSED VIDEO CLASSIFICATION USING HISTOGRAM OF ORIENTED MOTION VECTORS (HOMV)

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we propose a simple and effective approach to classifying H.264 compressed videos by capturing orientation information from the motion vectors. Our major contribution is computing the Histogram of Oriented Motion Vectors (HOMV) for partially overlapping hierarchical space-time cubes. HOMV is found to be very effective in describing the motion characteristics of these cubes. We then use a Bag of Features (BoF) approach to represent the video as a histogram of HOMV keywords, obtained using k-means clustering. The video feature thus computed is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video databases.
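As an illustration of the orientation-histogram idea, here is a minimal sketch for a single space-time cube. The function name `homv`, the default of 8 bins, and the magnitude weighting with L1 normalization are assumptions made for this example; the abstract does not specify them.

```python
import math

def homv(motion_vectors, n_bins=8):
    """Orientation histogram of (dx, dy) motion vectors for one cube.

    Assumption: each vector votes with its magnitude and the histogram
    is normalized to sum to 1. The abstract does not state the weighting.
    """
    hist = [0.0] * n_bins
    bin_width = 2 * math.pi / n_bins
    for dx, dy in motion_vectors:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # a zero vector has no orientation
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        idx = min(int(angle / bin_width), n_bins - 1)
        hist[idx] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist
```

In a Bag of Features pipeline, histograms like this one would be clustered with k-means, and each video would then be described by how often each cluster (keyword) occurs.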

A NEAR OPTIMAL PROJECTION FOR SPARSE REPRESENTATION BASED CLASSIFICATION

Relevance:

20.00%

Publisher:

Abstract:

Sparse representation based classification (SRC) is one of the most successful methods developed in recent times for face recognition. Optimal projection for sparse representation based classification (OPSRC) [1] provides a dimensionality reduction map that is supposed to give optimum performance for the SRC framework. However, the computational complexity of this method is too high. Here, we propose a new projection technique using the data scatter matrix, which is computationally superior to the optimal projection method and achieves classification accuracy comparable to that of OPSRC. The performance of the proposed approach is benchmarked on various publicly available face databases.

FUSION OF ALGORITHMS FOR COMPRESSED SENSING

Relevance:

20.00%

Publisher:

Abstract:

Numerous algorithms have been proposed recently for sparse signal recovery in Compressed Sensing (CS). In practice, the number of measurements can be very limited due to the nature of the problem, and/or the underlying statistical distribution of the non-zero elements of the sparse signal may not be known a priori. It has been observed that the performance of any sparse signal recovery algorithm depends on these factors, which makes the selection of a suitable sparse recovery algorithm difficult. To take advantage in such situations, we propose a fusion framework in which we employ multiple sparse signal recovery algorithms and fuse their estimates to obtain a better estimate. Theoretical results justifying the performance improvement are shown. The efficacy of the proposed scheme is demonstrated by Monte Carlo simulations using synthetic sparse signals and ECG signals selected from the MIT-BIH database.
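The abstract does not say how the estimates are fused. The sketch below assumes one plausible rule: take the union of the supports returned by the individual algorithms, fit the measurements by least squares on that union, keep the largest coefficients, and refit. The names `solve_least_squares` and `fuse_estimates` are hypothetical, and the rule should not be read as the paper's method. It also assumes the union of supports is no larger than the number of measurements and that the selected columns are linearly independent.

```python
def solve_least_squares(A, y, support):
    """Least-squares fit of y using only the columns of A listed in support."""
    cols = sorted(support)
    k, m = len(cols), len(A)
    # Normal equations G c = b, with G = As^T As and b = As^T y.
    G = [[sum(A[r][ci] * A[r][cj] for r in range(m)) for cj in cols] for ci in cols]
    b = [sum(A[r][ci] * y[r] for r in range(m)) for ci in cols]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(G[r][i]))
        G[i], G[p] = G[p], G[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = G[r][i] / G[i][i]
            for c in range(i, k):
                G[r][c] -= f * G[i][c]
            b[r] -= f * b[i]
    # Back substitution.
    coef = [0.0] * k
    for i in reversed(range(k)):
        s = sum(G[i][c] * coef[c] for c in range(i + 1, k))
        coef[i] = (b[i] - s) / G[i][i]
    return dict(zip(cols, coef))

def fuse_estimates(A, y, estimates, sparsity):
    """Fuse several sparse estimates of x in y = A x (assumed fusion rule)."""
    n = len(A[0])
    # Union of the supports proposed by the individual algorithms.
    union = {j for est in estimates for j, v in enumerate(est) if v != 0}
    coef = solve_least_squares(A, y, union)
    # Keep the `sparsity` largest-magnitude coefficients, then refit on them.
    keep = sorted(coef, key=lambda j: abs(coef[j]), reverse=True)[:sparsity]
    refined = solve_least_squares(A, y, keep)
    x = [0.0] * n
    for j, v in refined.items():
        x[j] = v
    return x
```

The intuition is that each algorithm may find only part of the true support, so the union is more likely to contain all of it, and the refit discards the spurious indices.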

Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging

Relevance:

20.00%

Publisher:

Abstract:

This paper describes a spatio-temporal registration approach for speech articulation data obtained from electromagnetic articulography (EMA) and real-time Magnetic Resonance Imaging (rtMRI). This is motivated by the potential for combining the complementary advantages of both types of data. The registration method is validated on EMA and rtMRI datasets obtained at different times, but using the same stimuli. The aligned corpus offers the advantages of high temporal resolution (from EMA) and a complete mid-sagittal view (from rtMRI). The co-registration also yields optimum placement of EMA sensors as articulatory landmarks on the magnetic resonance images, thus providing richer spatio-temporal information about articulatory dynamics. (C) 2014 Acoustical Society of America

Threshold-Independent QRS Detection Using the Dynamic Plosion Index

Relevance:

20.00%

Publisher:

Abstract:

Detection of the QRS complex serves as a first step in many automated ECG analysis techniques. Motivated by the strong similarities between the signal structures of an ECG signal and the integrated linear prediction residual (ILPR) of voiced speech, an algorithm proposed earlier for epoch detection from the ILPR is extended to the problem of QRS detection. The ECG signal is pre-processed by high-pass filtering to remove baseline wander and by half-wave rectification to reduce ambiguities. The initial estimates of the QRS locations are obtained iteratively using a non-linear temporal feature, named the dynamic plosion index, which is suitable for detecting transients in a signal. These estimates are further refined to obtain higher temporal accuracy. Unlike most high-performance algorithms, this technique does not use any threshold or differencing operation. The proposed algorithm is validated on the MIT-BIH database using the standard metrics, and its performance is found to be comparable to that of state-of-the-art algorithms, despite its threshold independence and simple decision logic.
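The two pre-processing steps named in the abstract can be sketched as follows. Subtracting a centered moving average is an assumption that stands in for the high-pass filter, whose design the abstract does not give, and the function names and `window` parameter are hypothetical. The dynamic plosion index itself is not reproduced here because the abstract does not define it.

```python
def highpass_moving_average(signal, window):
    """Suppress slow baseline drift by subtracting a centered moving average.

    Assumption: a simple stand-in for the paper's high-pass filter.
    """
    n = len(signal)
    half = window // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        baseline = sum(signal[lo:hi]) / (hi - lo)
        out.append(signal[i] - baseline)
    return out

def half_wave_rectify(signal):
    """Keep positive samples and set negative samples to zero."""
    return [max(0.0, v) for v in signal]

def preprocess_ecg(signal, window):
    """High-pass filter, then half-wave rectify, as described in the abstract."""
    return half_wave_rectify(highpass_moving_average(signal, window))
```

For example, `preprocess_ecg([1.0] * 5 + [11.0] + [1.0] * 5, window=5)` leaves a single positive peak at the position of the spike and zeros elsewhere.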

A Communication-Theoretic Framework for 2-DMR Channel Modeling: Performance Evaluation of Coding and Signal Processing Methods

Relevance:

20.00%

Publisher:

Abstract:

We develop a communication-theoretic framework for modeling 2-D magnetic recording channels. Using the model, we define the signal-to-noise ratio (SNR) for the channel considering several physical parameters, such as the channel bit density, code rate, bit aspect ratio, and noise parameters. We analyze the problem of optimizing the bit aspect ratio to maximize SNR. The read channel architecture comprises a novel 2-D joint self-iterating equalizer and detection system with noise prediction capability. We evaluate the system performance based on our channel model through simulations. The coded performance with the 2-D equalizer detector indicates approximately 5.5 dB of SNR gain over uncoded data.

Semisolid Processing of A380 Al Alloy Using Cooling Slope

Relevance:

20.00%

Publisher:

Abstract:

This study aims to obtain near-spherical microstructural features in rheocast A380 aluminum alloy. The cooling slope (CS) technique has been used to generate semisolid slurry from the superheated alloy melt. Spheroidization of primary grains is at the heart of semisolid processing and significantly improves the mechanical properties of parts cast from the semisolid state compared with conventional casting processes. With the desired microstructural morphology in view, i.e., a rosette or spherical shape of the primary alpha-Al phase, successive slurry samples have been collected during melt flow and oil quenched to investigate the microstructure evolution mechanism. A conventionally cast A380 Al alloy sample shows dendritic grains surrounded by a large eutectic phase, whereas finer, near-spherical grains are observed in the cooling slope processed slurry and in the solidified castings, which confirms the effectiveness of semisolid processing of the alloy by the cooling slope technique. Adding grain refiner to the alloy melt is found to have a favorable effect, generating finer primary grains with a higher degree of sphericity within the slurry.

Capacity of Gaussian Channels With Energy Harvesting and Processing Cost

Relevance:

20.00%

Publisher:

Abstract:

Energy harvesting sensor nodes are gaining popularity due to their ability to improve network lifetime and are becoming a preferred choice for supporting green communication. In this paper, we focus on communicating reliably over an additive white Gaussian noise channel using such an energy harvesting sensor node. An important part of this paper involves appropriate modeling of energy harvesting, as done via various practical architectures. Our main result is the characterization of the Shannon capacity of the communication system. The key technical challenge involves dealing with the dynamic (and stochastic) nature of the (quadratic) cost of the input to the channel. As a corollary, we find close connections between the capacity-achieving energy management policies and the queueing-theoretic throughput-optimal policies.

Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition

Relevance:

20.00%

Publisher:

Abstract:

We develop noise-robust features using Gammatone wavelets derived from the popular Gammatone functions. These wavelets incorporate the characteristics of the human peripheral auditory system, in particular the spatially varying frequency response of the basilar membrane. We refer to the new features as Gammatone Wavelet Cepstral Coefficients (GWCC). The procedure for extracting GWCC from a speech signal is similar to that of the conventional Mel-Frequency Cepstral Coefficients (MFCC) technique, the difference being the type of filterbank used. We replace the conventional mel filterbank in MFCC with a Gammatone wavelet filterbank, which we construct using Gammatone wavelets. We also explore the effect of Gammatone filterbank based features (Gammatone Cepstral Coefficients (GCC)) on robust speech recognition. On the AURORA 2 database, a comparison of GWCCs and GCCs with MFCCs shows that Gammatone-based features yield better recognition performance at low SNRs.
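Because the pipeline differs from MFCC only in the filterbank, the final step can be sketched independently of it: take the logarithm of the filterbank energies and apply a discrete cosine transform. The sketch below assumes an unnormalized DCT-II and a small flooring constant. The function name is hypothetical, and the Gammatone wavelet filterbank that would produce the energies is not shown because the abstract does not define it.

```python
import math

def cepstral_coefficients(filterbank_energies, n_coeffs):
    """Cepstral coefficients from per-band energies of one speech frame.

    Assumptions: energies are floored at 1e-10 before the logarithm, and
    an unnormalized DCT-II is used. Any filterbank (mel, Gammatone, or
    Gammatone wavelet) could supply the input energies.
    """
    logs = [math.log(max(e, 1e-10)) for e in filterbank_energies]
    m = len(logs)
    return [
        sum(logs[j] * math.cos(math.pi * k * (j + 0.5) / m) for j in range(m))
        for k in range(n_coeffs)
    ]
```

Swapping the filterbank changes only the input to this function, which is why GWCC, GCC, and MFCC features can be compared on equal terms.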

Effect of boron addition and processing of Ti-6Al-4V on corrosion behaviour and biocompatibility

Relevance:

20.00%

Publisher:

Abstract:

Ti-6Al-4V is widely used to prepare biomedical implants for orthopaedic and dental applications, but it is an expensive choice relative to other implant materials such as stainless steels and Co-Cr alloys, in large part due to the high manufacturing cost. Adding boron to refine the as-cast microstructure of Ti-6Al-4V can eliminate the need for extensive hot working and thereby reduce processing costs. The effects of 0.1 wt-% boron addition and of the choice of processing route (forging or extrusion) were studied in the context of potential biomedical applications. Corrosion tests in simulated body fluid indicated that the presence of boron increased the corrosion rate of Ti-6Al-4V and that the increase was higher for forged alloys than for extruded alloys. Boron addition and processing route were found to have a minimal effect on the viability of osteoblasts on the alloy surfaces. It is concluded that the addition of boron could offer advantages during the processing of Ti-6Al-4V for biomedical applications.

Processing-microstructure-yield strength correlation in a near beta Ti alloy, Ti-5Al-5Mo-5V-3Cr

Relevance:

20.00%

Publisher:

Abstract:

The thermo-mechanical steps recommended for high-strength beta Ti alloys are, in sequence, homogenization, deformation, recrystallization, annealing and ageing. Recrystallization carried out above or below the beta transus temperature generates either a beta annealed (lath type morphology of alpha) or a bimodal (lath+globular morphology of alpha) microstructure, respectively. Through variations in heat treatment parameters at these processing steps, wide ranges of feature length scales have been generated in both types of microstructure in a near beta Ti alloy, Ti-5Al-5Mo-5V-3Cr (Ti-5553). The 0.2% yield strength (YS) has been correlated with various microstructural features and the associated heat treatment parameters. The relative importance of the microstructural features in influencing YS has been identified. Process parameters at the different steps have been identified and recommended for attaining different levels of YS in this near beta Ti alloy. (C) 2014 Elsevier B.V. All rights reserved.

Phase Field Simulation of Equiaxed Microstructure Formation during Semi-solid Processing of A380 Al Alloy

Relevance:

20.00%

Publisher:

Abstract:

A phase field modelling approach is implemented in the present study to simulate microstructure evolution during cooling slope semi-solid slurry generation of A380 aluminium alloy. First, experiments are performed to evaluate the number of seeds required within the simulation domain to simulate the near-spherical microstructure formation that occurs during cooling slope processing of the melt. Subsequently, microstructure evolution is studied using a phase field method. Simulations are performed to understand the effect of cooling rate on the slurry microstructure. Encouraging results are obtained from the simulation studies and are validated by experimental observations. The mesoscopic phase field simulations yield the grain size, grain density, and degree of sphericity of the evolving primary Al phase, as well as the solid fraction present within the slurry at different time frames. The effect of grain refinement has also been studied with the aim of further improving the slurry microstructure. The numerical findings provide insight into the process and are found to be useful for process control.

Hot deformation behavior of Ni-Fe-Ga-based ferromagnetic shape memory alloy - A study using processing map

Relevance:

20.00%

Publisher:

Abstract:

Ni-Fe-Ga-based alloys form a new class of ferromagnetic shape memory alloys (FSMAs) that show considerable formability because of the presence of a disordered fcc gamma-phase. The current study explores the deformation processing of this alloy using an off-stoichiometric Ni55Fe19Ga26 alloy that contains the ductile gamma-phase. The hot deformation behavior of this alloy has been characterized on the basis of its flow stress variation, obtained by isothermal constant true strain rate compression tests in the 1123-1323 K temperature range and the strain rate range of 10^-3 to 10 s^-1, and using a combination of constitutive modeling and processing maps. The dynamic recrystallization (DRX) regime for thermomechanical processing has been identified for this Heusler alloy on the basis of the processing maps and the deformed microstructures. This alloy also shows evidence of a dynamic strain-aging (DSA) effect, which has not been reported so far for any Heusler FSMA. A similar effect is also noticed in a Ni-Mn-Ga-based Heusler alloy that is devoid of any gamma-phase. (C) 2014 Elsevier Ltd. All rights reserved.

Estimation of voice-onset time in continuous speech using temporal measures

Relevance:

20.00%

Publisher:

Abstract:

This paper proposes an automatic acoustic-phonetic method for estimating voice-onset time of stops. This method requires neither transcription of the utterance nor training of a classifier. It makes use of the plosion index for the automatic detection of burst onsets of stops. Having detected the burst onset, the onset of the voicing following the burst is detected using the epochal information and a temporal measure named the maximum weighted inner product. For validation, several experiments are carried out on the entire TIMIT database and two of the CMU Arctic corpora. The performance of the proposed method compares well with three state-of-the-art techniques. (C) 2014 Acoustical Society of America

Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC)

Relevance:

20.00%

Publisher:

Abstract:

USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have so far also been collected from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460-sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community. (C) 2014 Acoustical Society of America.
