Digital Library

974 results for Speech Processing

A generalized Stein's estimation approach to speech enhancement based on perceptual criteria

Relevance: 20.00%

Abstract:

We address the problem of speech enhancement using a risk-estimation approach. In particular, we propose the use of Stein’s unbiased risk estimator (SURE) for solving the problem. The need for a suitable finite-sample risk estimator arises because the actual risks invariably depend on the unknown ground truth. We consider the popular mean-squared error (MSE) criterion first, and then compare it against the perceptually motivated Itakura-Saito (IS) distortion, by deriving unbiased estimators of the corresponding risks. We use a generalized SURE (GSURE) development, recently proposed by Eldar for MSE. We consider dependent observation models from the exponential family with an additive noise model, and derive an unbiased estimator for the risk corresponding to the IS distortion, which is non-quadratic. This serves to address the speech enhancement problem in a more general setting. Experimental results illustrate that the IS metric is efficient in suppressing musical noise, which affects the MSE-enhanced speech. However, in terms of global signal-to-noise ratio (SNR), the minimum MSE solution gives better results.
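To make the idea of an unbiased risk estimate concrete, here is a minimal sketch for the simplest case only: the MSE criterion with i.i.d. Gaussian noise of known standard deviation and a single scalar gain applied to the noisy signal. It does not cover the GSURE or Itakura-Saito derivations of the paper; the function names `sure_linear` and `best_gain` are illustrative assumptions.

```python
def sure_linear(y, sigma, a):
    """SURE for the estimate f(y) = a*y when y = x + noise, noise ~ N(0, sigma^2 I).

    Returns an unbiased estimate of E||a*y - x||^2 without using the clean x.
    """
    n = len(y)
    energy = sum(v * v for v in y)
    # ||f(y) - y||^2 + 2*sigma^2*div f(y) - n*sigma^2, with div f(y) = a*n
    return (1.0 - a) ** 2 * energy + 2.0 * sigma ** 2 * a * n - n * sigma ** 2


def best_gain(y, sigma):
    """Gain a that minimizes sure_linear (set the derivative in a to zero)."""
    n = len(y)
    energy = sum(v * v for v in y)
    if energy == 0.0:
        return 0.0  # all-zero observation: nothing to scale
    return 1.0 - n * sigma ** 2 / energy
```

With a = 1 (no enhancement) the estimate reduces to n*sigma^2, which is exactly the risk of leaving the noisy signal untouched. The returned gain can be negative when the observation energy is below the expected noise energy; clamping it at zero is a common practical choice.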

Sub-band envelope approach to obtain instants of significant excitation in speech

Relevance: 20.00%

Abstract:

In this paper, we propose a new sub-band approach to estimate glottal activity. The method is based on the spectral harmonicity and the sub-band temporal properties of voiced speech. We propose a method to represent the glottal excitation signal using the sub-band temporal envelope. Instants of maximum glottal excitation, or Glottal Closure Instants (GCI), are extracted from the estimated glottal excitation pattern, and the result is compared with a standard GCI computation method, DYPSA [1]. The performance of the algorithm is also evaluated on noisy signals, and the proposed method is shown to vary less in its GCI estimates under noisy conditions than DYPSA. The algorithm is evaluated on the CMU-ARCTIC database.

Processing of DNA double-stranded breaks and intermediates of recombination and repair by Saccharomyces cerevisiae Mre11 and its stimulation by Rad50, Xrs2, and Sae2 proteins

Relevance: 20.00%

Abstract:

Saccharomyces cerevisiae RAD50, MRE11, and XRS2 genes are essential for telomere length maintenance, cell cycle checkpoint signaling, meiotic recombination, and DNA double-stranded break (DSB) repair via nonhomologous end joining and homologous recombination. The DSB repair pathways that draw upon Mre11-Rad50-Xrs2 subunits are complex, so their mechanistic features remain poorly understood. Moreover, the molecular basis of DSB end resection in yeast mre11 nuclease-deficient mutants and Mre11 nuclease-independent activation of ATM in mammals remains unknown and adds a new dimension to many unanswered questions about the mechanism of DSB repair. Here, we demonstrate that S. cerevisiae Mre11 (ScMre11) exhibits higher binding affinity for single- over double-stranded DNA and intermediates of recombination and repair, and catalyzes robust unwinding of substrates possessing a 3' single-stranded DNA overhang but not of 5' overhangs or blunt-ended DNA fragments. Additional evidence disclosed that ScMre11 nuclease activity is dispensable for its DNA binding and unwinding activity, thus uncovering the molecular basis underlying DSB end processing in mre11 nuclease-deficient mutants. Significantly, Rad50, Xrs2, and Sae2 potentiate the DNA unwinding activity of Mre11, thus underscoring functional interaction among the components of the DSB end repair machinery. Our results also show that ScMre11 by itself binds to DSB ends, then promotes end bridging of duplex DNA, and directly interacts with Sae2. We discuss the implications of these results in the context of an alternative mechanism for DSB end processing and the generation of single-stranded DNA for DNA repair and homologous recombination.

Performance improvement of short-length regular low-density parity-check codes with low-complexity post-processing

Relevance: 20.00%

Abstract:

Although it is well known that extremely long low-density parity-check (LDPC) codes perform exceptionally well for error correction, short-length codes are preferable in practical applications. However, short-length LDPC codes suffer from performance degradation owing to graph-based impairments such as short cycles, trapping sets and stopping sets in the bipartite graph of the LDPC matrix. In particular, performance degradation at moderate to high E_b/N_0 is caused by the oscillations in bit node a posteriori probabilities induced by short cycles and trapping sets in bipartite graphs. In this study, a computationally efficient algorithm is proposed to improve the performance of short-length LDPC codes at moderate to high E_b/N_0. This algorithm makes use of the information generated by the belief propagation (BP) algorithm in previous iterations before a decoding failure occurs. Using this information, a reliability-based estimation is performed on each bit node to supplement the BP algorithm. The proposed algorithm gives an appreciable coding gain as compared with BP decoding for LDPC codes with a code rate of 1/2 or less. The coding gains are modest to significant in the case of optimised (for bipartite graph conditioning) regular LDPC codes, whereas the coding gains are huge in the case of unoptimised codes. Hence, this algorithm is useful for relaxing some stringent constraints on the graphical structure of the LDPC code and for developing hardware-friendly designs.

Microstructure and texture evolution during sub-transus thermomechanical processing of Ti-6Al-4V-0.1B alloy: part I. hot rolling in (alpha plus beta) phase field

Relevance: 20.00%

Abstract:

In the current study, the evolution of microstructure and texture has been studied for Ti-6Al-4V-0.1B alloy during sub-transus thermomechanical processing. This part of the work deals with the deformation response of the alloy by rolling in the (alpha + beta) phase field. The (alpha + beta) annealing behavior of the rolled specimen is communicated in part II. Rolled microstructures of the alloys exhibit either kinked or straight alpha colonies depending on their orientations with respect to the principal rolling directions. The Ti-6Al-4V-0.1B alloy shows an improved rolling response compared with the alloy Ti-6Al-4V because of smaller alpha lamellae size, coherency of alpha/beta interfaces, and multiple slip due to orientation factors. Accelerated dynamic globularization for this alloy is similarly caused by the intralamellar transverse boundary formation via multiple slip and strain accumulation at TiB particles. The (0002)_alpha pole figures of rolled Ti-6Al-4V alloy show "TD splitting" at lower rolling temperatures because of strong initial texture. Substantial beta phase mitigates the effect of starting texture at higher temperature so that "RD splitting" characterizes the basal pole figure. Weak starting texture and easy slip transfer for Ti-6Al-4V-0.1B alloy produce simultaneous TD and RD splittings in basal pole figures at all rolling temperatures.

Microstructure and texture evolution during sub-transus thermo-mechanical processing of Ti-6Al-4V-0.1B alloy: part II. static annealing in (alpha plus beta) regime

Relevance: 20.00%

Abstract:

The first part of this study describes the evolution of microstructure and texture in Ti-6Al-4V-0.1B alloy during sub-transus rolling vis-à-vis the control alloy Ti-6Al-4V. In the second part, the static annealing response of the two alloys under identical conditions is compared and the principal micromechanisms are analyzed. Faster globularization kinetics has been observed in the Ti-6Al-4V-0.1B alloy for equivalent annealing conditions. This is primarily attributed to the alpha colonies, which lead to easy boundary splitting via multiple slip activation in this alloy. The other mechanisms facilitating lamellar to equiaxed morphological transformations, e.g., termination migration and cylinderization, also start early in the boron-modified alloy due to small alpha colony size, small aspect ratio of the alpha lamellae, and the presence of TiB particles in the microstructure. Both the alloys exhibit weakening of basal fiber (ND || <0001>) and strengthening of prism fiber (RD || <...>) upon annealing. A close proximity between the orientations of fully globularized primary alpha and secondary alpha phases during alpha -> beta -> alpha transformation has accounted for such a texture modification.

Processing of enriched elemental boron (B-10 ~65 at. %)

Relevance: 20.00%

Abstract:

Procedures were developed for purification and processing of electrodeposited enriched boron powder for control rod application in India's first commercial Prototype Fast Breeder Reactor (PFBR). Methodology for removal of anionic (F-, Cl-, BF4-) and cationic (Fe2+, Fe3+, Ni2+) impurities was developed. Parameters for grinding boron flakes obtained after electrodeposition were optimized to obtain boron powder with a particle size of less than 100 µm. The rate of removal of impurities was studied with respect to time and concentration of the reagents used for purification. Process parameters for grinding and removal of impurities were optimized. A flowsheet was proposed which helps in minimizing the purification time and concentration of the reagent used for the effective removal of impurities. The purification methodology developed in this work could produce boron that meets the technical specifications for control rod application in a fast reactor.

Joint pitch-analysis formant-synthesis framework for CS recovery of speech

Relevance: 20.00%

Abstract:

A joint analysis-synthesis framework is developed for the compressive sensing (CS) recovery of speech signals. The signal is assumed to be sparse in the residual domain, with the linear prediction filter used as the sparse transformation. Importantly, this transform is not known a priori, since estimating the predictor filter requires knowledge of the signal. Two prediction filters, a comb filter for pitch and an all-pole formant filter, are needed to induce maximum sparsity. An iterative method is proposed for the estimation of both the prediction filters and the signal itself. The formant prediction filter is used as the synthesis transform, while the pitch filter is used in the analysis mode to model the periodicity in the residual excitation signal. Significant improvement in the LLR measure is seen over the previously reported formant filter estimation.

Automatic speech segmentation using probabilistic latent component modeling

Relevance: 20.00%

Abstract:

Latent variable methods, such as PLCA (Probabilistic Latent Component Analysis), have been successfully used for analysis of non-negative signal representations. In this paper, we formulate PLCS (Probabilistic Latent Component Segmentation), which models each time frame of a spectrogram as a spectral distribution. Given the signal spectrogram, the segmentation boundaries are estimated using a maximum-likelihood (ML) approach. For an efficient solution, the algorithm imposes a hard constraint that each segment is modelled by a single latent component. The hard constraint facilitates the solution of ML boundary estimation using dynamic programming. Unlike earlier ML segmentation techniques, the PLCS framework does not impose a parametric assumption. PLCS can be naturally extended to model coarticulation between successive phones. Experiments on the TIMIT corpus show that the proposed technique is promising compared to most state-of-the-art speech segmentation algorithms.
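The combination of a one-component-per-segment constraint with dynamic programming can be illustrated with a generic sketch. It is not the PLCS algorithm itself: it assumes the number of segments is known, treats each frame as a vector of non-negative spectral magnitudes, and scores a segment by the multinomial log-likelihood of its frames under the single distribution obtained by pooling them. The names `segment_cost` and `segment` are hypothetical.

```python
import math


def segment_cost(frames, start, end):
    """Negative log-likelihood of frames[start:end] under one shared distribution."""
    n_bins = len(frames[0])
    counts = [sum(frames[t][f] for t in range(start, end)) for f in range(n_bins)]
    total = sum(counts)
    # The ML estimate of the shared distribution is counts/total;
    # bins with zero mass contribute nothing (0 * log 0 is taken as 0).
    return -sum(c * math.log(c / total) for c in counts if c > 0)


def segment(frames, n_segments):
    """Return the interior boundaries of the ML split into n_segments segments."""
    n = len(frames)
    inf = float("inf")
    # best[k][t]: minimum cost of splitting frames[:t] into k segments
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):
                cost = best[k - 1][s] + segment_cost(frames, s, t)
                if cost < best[k][t]:
                    best[k][t] = cost
                    back[k][t] = s
    # Trace back the start index of every segment except the first
    bounds, t = [], n
    for k in range(n_segments, 1, -1):
        t = back[k][t]
        bounds.append(t)
    return bounds[::-1]
```

For example, four frames whose energy moves from the first bin to the second after frame 2, split into two segments, yield a single boundary at index 2. The sketch recomputes each segment cost from scratch for clarity; cumulative sums would reduce the cost for long signals.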

Eigen-profiles of spatio-temporal fragments for adaptive region-based tracking

Relevance: 20.00%

Abstract:

We propose a novel space-time descriptor for region-based tracking which is very concise and efficient. The regions represented by covariance matrices within a temporal fragment are used to estimate this space-time descriptor, which we call the Eigenprofiles (EP). The EP so obtained is used in estimating the covariance matrix of features over spatio-temporal fragments. The second-order statistics of spatio-temporal fragments form our target model, which can be adapted for variations across the video. The conciseness of the model also allows the use of multiple spatially overlapping fragments to represent the target. We demonstrate good tracking results on very challenging datasets, shot under insufficient illumination conditions.

Emotiphons: emotion markers in conversational speech - comparison across Indian languages

Relevance: 20.00%

Epoch extraction based on integrated linear prediction residual using plosion index

Relevance: 20.00%

Abstract:

An epoch is defined as the instant of significant excitation within a pitch period of voiced speech. Epoch extraction continues to attract the interest of researchers because of its significance in speech analysis. Existing high-performance epoch extraction algorithms require either dynamic programming techniques or a priori information about the average pitch period. An algorithm without such requirements is proposed based on the integrated linear prediction residual (ILPR), which resembles the voice source signal. The half-wave rectified and negated ILPR (or the Hilbert transform of the ILPR) is used as the pre-processed signal. A new non-linear temporal measure named the plosion index (PI) has been proposed for detecting 'transients' in the speech signal. An extension of PI, called the dynamic plosion index (DPI), is applied to the pre-processed signal to estimate the epochs. The proposed DPI algorithm is validated using six large databases which provide simultaneous EGG recordings. Creaky and singing voice samples are also analyzed. The algorithm has been tested for its robustness in the presence of additive white and babble noise and on simulated telephone-quality speech. The performance of the DPI algorithm is found to be comparable to or better than that of five state-of-the-art techniques for the experiments considered.

Data acquisition and processing at ocean bottom for a Tsunami warning system

Relevance: 20.00%

Abstract:

The design and development of a Bottom Pressure Recorder for a Tsunami Early Warning System is described here. The special requirements it must satisfy for deployment on the ocean bed and pressure monitoring of the water column above are discussed; high-resolution data digitization and low circuit power consumption are typical examples. The implementation details of the data sensing and acquisition part that meet these requirements are also brought out. The data processing part typically encompasses a Tsunami detection algorithm that should detect an event of significance against a background of a variety of periodic and aperiodic noise signals. Such an algorithm and its simulation are presented. Further, the results of sea trials carried out on the system off the Chennai coast are presented. The high quality and fidelity of the data prove that the system design is robust despite its low cost and, with suitable augmentations, is ready for a full-fledged deployment on the ocean bed. (C) 2013 Elsevier Ltd. All rights reserved.

STUDENT'S-t MIXTURE MODEL BASED MULTI-INSTRUMENT RECOGNITION IN POLYPHONIC MUSIC

Relevance: 20.00%

Abstract:

We address the problem of multi-instrument recognition in polyphonic music signals. Individual instruments are modeled within a stochastic framework using Student's-t Mixture Models (tMMs). We impose a mixture of these instrument models on the polyphonic signal model. No a priori knowledge is assumed about the number of instruments in the polyphony. The mixture weights are estimated in a latent variable framework from the polyphonic data using an Expectation Maximization (EM) algorithm, derived for the proposed approach. The weights are shown to indicate instrument activity. The output of the algorithm is an Instrument Activity Graph (IAG), from which it is possible to determine the instruments that are active at a given time. An average F-ratio of 0.75 is obtained for polyphonies containing 2-5 instruments, on an experimental test set of 8 instruments: clarinet, flute, guitar, harp, mandolin, piano, trombone and violin.
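The weight-estimation step can be sketched in its generic textbook form: EM for the weights of a mixture whose component models are held fixed. The sketch assumes that the likelihood of every frame under every instrument model has already been computed (by tMMs or any other model), and the name `em_mixture_weights` is hypothetical. It is not the EM derivation from the paper.

```python
def em_mixture_weights(likelihoods, n_iter=50):
    """Estimate mixture weights by EM with the component models held fixed.

    likelihoods[t][k] is the likelihood of frame t under model k (assumed given).
    """
    n_models = len(likelihoods[0])
    weights = [1.0 / n_models] * n_models  # uniform initialisation
    for _ in range(n_iter):
        totals = [0.0] * n_models
        for row in likelihoods:
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # frame explained by no model; skip it
            # E-step: posterior responsibility of each model for this frame
            for k, j in enumerate(joint):
                totals[k] += j / norm
        total = sum(totals)
        if total == 0.0:
            break  # no usable frames; keep the current weights
        # M-step: new weight is the average responsibility
        weights = [t / total for t in totals]
    return weights
```

Running this over a sliding window of frames and plotting the weights over time would give a rough analogue of an activity graph, with larger weights suggesting active instruments.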

BILATERAL EDGE DETECTORS

Relevance: 20.00%

Abstract:

We propose to employ bilateral filters to solve the problem of edge detection. The proposed methodology presents an efficient and noise-robust method for detecting edges. Classical bilateral filters smooth images without distorting edges. In this paper, we modify the bilateral filter to perform edge detection, which is the opposite of bilateral smoothing. The Gaussian domain kernel of the bilateral filter is replaced with an edge detection mask, and the Gaussian range kernel is replaced with an inverted Gaussian kernel. The modified range kernel serves to emphasize dissimilar regions. The resulting approach effectively adapts the detection mask according to the pixel intensity differences. The results of the proposed algorithm are compared with those of standard edge detection masks. Comparisons of the bilateral edge detector with the Canny edge detection algorithm, both after non-maximal suppression, are also provided. The results of our technique are observed to be better and more noise-robust than those offered by methods employing masks alone, and are also comparable to the results from the Canny edge detector, outperforming it in certain cases.
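One possible reading of the two kernel substitutions is sketched below. The abstract does not specify the mask, the normalization, or the range parameter, so the horizontal Sobel mask, the absence of normalization, the default `sigma_r`, and the name `bilateral_edge` are all assumptions made for illustration.

```python
import math

# Horizontal Sobel mask, standing in for "an edge detection mask" (assumption)
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]


def bilateral_edge(image, mask=SOBEL_X, sigma_r=10.0):
    """Edge response using a mask as domain kernel and an inverted Gaussian range kernel.

    image is a list of rows of grey levels; border pixels are left at 0.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    value = image[r + dr][c + dc]
                    diff = value - centre
                    # Inverted Gaussian: near 0 for similar pixels, near 1 for dissimilar ones
                    rng = 1.0 - math.exp(-diff * diff / (2.0 * sigma_r ** 2))
                    acc += mask[dr + 1][dc + 1] * rng * value
            out[r][c] = acc
    return out
```

On a flat image every range weight is zero, so the response vanishes everywhere; across a vertical step the neighbours on the far side of the step receive weights close to one and dominate the sum. A full detector along the lines of the abstract would add a second mask orientation and non-maximal suppression.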
