842 results for PSWT Based Linear Predictive Coding
Abstract:
During the 1990s, the Wavelet Transform emerged as an important signal processing tool with potential applications in time-frequency analysis and non-stationary signal processing. Wavelets have gained popularity in a broad range of disciplines such as signal/image compression, medical diagnostics, boundary value problems, geophysical signal processing, statistical signal processing, pattern recognition, and underwater acoustics. In 1993, G. Evangelista introduced the Pitch-Synchronous Wavelet Transform, which is particularly suited to pseudo-periodic signal processing. The work presented in this thesis concentrates on two interrelated topics in signal processing, viz. Wavelet Transform based signal compression and the computation of the Discrete Wavelet Transform. A new compression scheme is described in which the Pitch-Synchronous Wavelet Transform technique is combined with the popular Linear Predictive Coding method for pseudo-periodic signal processing. Subsequently, a novel Parallel Multiple Subsequence structure is presented for the efficient computation of the Wavelet Transform. Case studies are also presented to highlight the potential applications.
Abstract:
Speech signals are one of the most important means of communication among human beings. In this paper, a comparative study of two feature extraction techniques is carried out for recognizing speaker-independent spoken isolated words. The first is a hybrid approach combining Linear Predictive Coding (LPC) and Artificial Neural Networks (ANN), and the second uses a combination of Wavelet Packet Decomposition (WPD) and Artificial Neural Networks. Voice signals are sampled directly from the microphone and then processed using these two techniques to extract the features. Words from Malayalam, one of the four major Dravidian languages of southern India, are chosen for recognition. Training, testing and pattern recognition are performed using Artificial Neural Networks. The back propagation method is used to train the ANN. The proposed method is implemented for 50 speakers uttering 20 isolated words each. Both methods produce good recognition accuracy, but Wavelet Packet Decomposition is found to be more suitable for recognizing speech because of its multi-resolution characteristics and efficient time-frequency localization.
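As a rough illustration of the wavelet packet features mentioned in the abstract above, the following sketch splits a frame into sub-bands and returns the energy of each band. The abstract does not state which wavelet or which features were used; the Haar filter, the energy features, and the function names `haar_split` and `wavelet_packet_energies` are assumptions made only for this example.

```python
import math

def haar_split(x):
    """Split a sequence of even length into Haar approximation and detail halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_energies(signal, levels):
    """Full wavelet packet tree: every node is split again at each level.

    Returns one energy value per leaf, in the order produced by the
    splitting (not sorted by frequency). Assumes len(signal) is a
    multiple of 2**levels.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [band for node in nodes for band in haar_split(node)]
    return [sum(v * v for v in node) for node in nodes]

# Hypothetical frame of 8 samples, decomposed to depth 2 (4 sub-bands).
features = wavelet_packet_energies([1, 2, 3, 4, 5, 6, 7, 8], levels=2)
```

Because the Haar split is orthonormal, the band energies sum to the energy of the input frame, which makes them a convenient fixed-length feature vector for a classifier such as an ANN.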
Abstract:
This paper deals with non-linear transformations for improving the performance of an entropy-based voice activity detector (VAD). The idea of using a non-linear transformation has already been applied in the field of speech linear prediction, or linear predictive coding (LPC), based on source separation techniques, where a score function is added to the classical equations in order to take into account the true distribution of the signal. We explore the possibility of estimating the entropy of frames after calculating their score function, instead of using the original frames. We observe that if the signal is clean, the estimated entropy is essentially the same; if the signal is noisy, however, the frames transformed using the score function may give an entropy that differs between voiced and non-voiced frames. Experimental evidence is given to show that this fact enables voice activity detection under high noise, where the simple entropy method fails.
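To make the per-frame entropy measure concrete, here is a minimal sketch of a histogram-based Shannon entropy estimate for one frame. The abstract does not specify the estimator, so the histogram approach, the bin count, and the name `frame_entropy` are assumptions; the score-function transformation itself is not shown.

```python
import math

def frame_entropy(frame, bins=16):
    """Histogram estimate of the Shannon entropy (in bits) of one frame."""
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return 0.0  # constant frame: all samples fall into a single bin
    counts = [0] * bins
    for x in frame:
        # Map the sample to a bin index; the maximum goes into the last bin.
        idx = min(int((x - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(frame)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# Four samples spread evenly over four bins give the maximum of 2 bits.
h = frame_entropy([0.0, 1.0, 2.0, 3.0], bins=4)
```

A detector built on this would compare the entropy of each frame against a threshold; how that threshold is chosen is outside what the abstract describes.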
Abstract:
This thesis investigates the potential use of Linear Predictive Coding in speech communication applications. A Modified Block Adaptive Predictive Coder is developed, which reduces the computational burden and complexity without sacrificing speech quality, as compared to the conventional adaptive predictive coding (APC) system. For this, changes in the evaluation methods have been developed. The method differs from the usual APC system in that the difference between the true and the predicted value is not transmitted. This allows the high-order predictor in the transmitter section of a predictive coding system to be replaced by a simple delay unit, which makes the transmitter quite simple. Also, the block length used in processing the speech signal is adjusted relative to the pitch period of the signal being processed, rather than being fixed at a constant length as done hitherto by other researchers. The efficiency of the newly proposed coder is supported by results of computer simulation using real speech data. Three methods for voiced/unvoiced/silent/transition classification are presented. The first is based on energy, zero-crossing rate and the periodicity of the waveform. The second method uses the normalised correlation coefficient as the main parameter, while the third method utilizes a pitch-dependent correlation factor. The third algorithm, which gives the minimum error probability, is chosen in a later chapter to design the modified coder. The thesis also presents a comparative study between the autocorrelation and the covariance methods used in the evaluation of the predictor parameters. It is shown that the autocorrelation method is superior to the covariance method with respect to filter stability and also in an SNR sense, though the increase in gain is only small.
The Modified Block Adaptive Coder switches from pitch prediction to spectrum prediction when the speech segment changes from a voiced or transition region to an unvoiced region. The experiments conducted in coding, transmission and simulation used speech samples from Malayalam and English phrases. Proposals for a speaker recognition system and a phoneme identification system are also outlined towards the end of the thesis.
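The autocorrelation method for evaluating predictor parameters, mentioned in the abstract above, is commonly solved with the Levinson-Durbin recursion. The sketch below is a generic textbook version, not the thesis's own implementation; the function names and the sign convention (prediction = sum of `a[k-1] * x[n-k]`) are assumptions for illustration.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of a frame (no windowing, no normalisation)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for the predictor coefficients.

    Returns (a, err): a[k-1] weights x[n-k] in the prediction, and err is
    the final prediction error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # degenerate input: nothing left to predict
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err

# Autocorrelation of an ideal first-order process with coefficient 0.5.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For a valid autocorrelation sequence, each reflection coefficient has magnitude below one, which is the usual reason given for the stability advantage of the autocorrelation method.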
Abstract:
Environmental tobacco smoke (ETS) is recognized as an occupational hazard in the hospitality industry. Although Portuguese legislation banned smoking in most indoor public spaces, it is still allowed in some restaurants/bars, representing a potential risk to the workers' health, particularly for chronic respiratory diseases. The aims of this work were to characterize biomarkers of early genetic effects and to disclose proteomic signatures associated with occupational exposure to ETS that have the potential to predict the development of respiratory diseases. A detailed lifestyle survey and clinical evaluation (including spirometry) were performed in 81 workers from Lisbon restaurants. ETS exposure was assessed through the level of PM 2.5 in indoor air and the urinary level of cotinine. The plasma samples were immunodepleted and analysed by 2D-SDS-PAGE followed by in-gel digestion and LC-MS/MS. DNA lesions and chromosome damage were analysed in lymphocytes and in exfoliated buccal cells from 19 cigarette smokers, 29 involuntary smokers, and 33 non-smokers not exposed to tobacco smoke. Also, the DNA repair capacity was evaluated using an ex vivo challenge comet assay with an alkylating agent (EMS). All workers were considered healthy and recorded normal lung function. Interestingly, following 2D-DIGE-MS (MALDI-TOF/TOF), 61 plasma proteins were found to be differentially expressed in ETS-exposed subjects, including 38 involved in metabolism, acute-phase respiratory inflammation, and immune or vascular functions. On the other hand, the involuntary smokers showed neither an increased level of DNA/chromosome damage in lymphocytes nor an increased number of micronuclei in buccal cells when compared to non-exposed non-smokers. Notably, challenging lymphocytes with EMS resulted in a significantly lower level of DNA breaks in ETS-exposed as compared to non-exposed workers (P<0.0001), suggestive of an adaptive response elicited by the previous exposure to low levels of ETS.
Overall, changes in the proteome may be promising early biomarkers of exposure to ETS. Likewise, the alterations in DNA repair competence observed upon ETS exposure deserve to be further understood. Work supported by Fundação Calouste Gulbenkian, ACSS and FCT/Polyannual Funding Program.
Abstract:
This report describes a new approach to the problem of scheduling highway construction type projects. The technique can accurately model linear activities and identify the controlling activity path on a linear schedule. Current scheduling practices are unable to accomplish these two tasks with any accuracy for linear activities, leaving planners and managers suspicious of the information they provide. Basic linear scheduling is not a new technique, and many attempts have been made to apply it to various types of work in the past. However, the technique has never been widely used because it lacked an analytical approach to activity relationships and to determining controlling activities. The Linear Scheduling Model (LSM) developed in this report completes the linear scheduling technique by adding to linear scheduling all of the analytical capabilities, including computer applications, present in CPM scheduling today. The LSM has tremendous potential, and will likely have a significant impact on the way linear construction is scheduled in the future.
Abstract:
This paper proposes a three-shot improvement scheme for the hard-decision based method (HDM), an implementation solution for the linear decorrelating detector (LDD) in asynchronous DS/CDMA systems. By taking advantage of the preceding (already reconstructed) bit and the matched filter output for the following two bits, the coupling between temporally adjacent bits (TABs), which always exists in asynchronous systems, is greatly suppressed and the performance of the original HDM is substantially improved. This new scheme requires no signaling overhead yet offers nearly the same performance as more complicated methods. Also, it can easily accommodate changes in the number of active users in the channel, as no symbol/bit grouping is involved. Finally, the influence of synchronisation errors is investigated.
Abstract:
Forensic speaker comparison exams have complex characteristics and demand a long time for manual analysis. A method for the automatic recognition of vowels, providing feature extraction for acoustic analysis, is proposed, aiming to serve as a support tool in these exams. The proposal is based on formant measurements by LPC (Linear Predictive Coding), selected by fundamental frequency detection, zero-crossing rate, bandwidth and continuity, with the clustering being done by the k-means method. Experiments using samples from three different databases have shown promising results, in which the regions corresponding to five of the Brazilian Portuguese vowels were successfully located, providing visualization of a speaker's vocal tract behavior, as well as the detection of segments corresponding to target vowels.
Abstract:
In this thesis, a tube-based Distributed Economic Predictive Control (DEPC) scheme is presented for a group of dynamically coupled linear subsystems. These subsystems are components of a large-scale system, and control inputs are computed by optimizing a local economic objective. Each subsystem interacts with its neighbors by sending its future reference trajectory at each sampling time. It solves a local optimization problem in parallel, based on the received future reference trajectories of the other subsystems. To ensure recursive feasibility and a performance bound, each subsystem is constrained not to deviate too much from its communicated reference trajectory. The difference between the planned trajectory and the communicated one is interpreted as a disturbance at the local level. Then, to ensure the satisfaction of both state and input constraints, these constraints are tightened by explicitly considering the effect of the local disturbances. The proposed approach averages over all possible disturbances and handles tightened state and input constraints, while satisfying the compatibility constraints to guarantee that the actual trajectory lies within a certain bound in the neighborhood of the reference one. Each subsystem optimizes an arbitrary local economic objective function in parallel while considering a local terminal constraint to guarantee recursive feasibility. In this framework, economic performance guarantees for a tube-based distributed predictive control (DPC) scheme are developed rigorously. It is shown that the closed-loop nominal subsystem has a local robust average performance bound which is no worse than that of a local robust steady state. Since a robust algorithm is applied to the states of the real subsystems (with disturbances), this bound can be interpreted as an average performance result for the real closed-loop system. Finally, we present our results on local and global performance, illustrated by a numerical example.
Abstract:
Video coding technologies have played a major role in the explosion of large-market digital video applications and services. In this context, the very popular MPEG-x and H.26x video coding standards adopted a predictive coding paradigm, where complex encoders exploit the data redundancy and irrelevancy to 'control' much simpler decoders. This codec paradigm suits applications and services such as digital television and video storage, where the decoder complexity is critical, but does not match well the requirements of emerging applications such as visual sensor networks, where the encoder complexity is more critical. The Slepian-Wolf and Wyner-Ziv theorems made it possible to develop the so-called Wyner-Ziv video codecs, following a different coding paradigm where it is the task of the decoder, and no longer of the encoder, to (fully or partly) exploit the video redundancy. Theoretically, Wyner-Ziv video coding does not incur any compression performance penalty with respect to the more traditional predictive coding paradigm (at least under certain conditions). In the context of Wyner-Ziv video codecs, the so-called side information, which is a decoder estimate of the original frame to code, plays a critical role in the overall compression performance. For this reason, much research effort has been invested in the past decade to develop increasingly efficient side information creation methods. The main objective of this paper is to review and evaluate the available side information methods after proposing a classification taxonomy to guide this review, which allows more solid conclusions to be drawn and the next relevant research challenges to be better identified.
After classifying the side information creation methods into four classes, notably guess, try, hint and learn, the review of the most important techniques in each class and the evaluation of some of them lead to the important conclusion that the rate-distortion (RD) performance of the side information creation methods depends on the amount of temporal correlation in each video sequence. It also became clear that the best available Wyner-Ziv video coding solutions are almost systematically based on the learn approach. The best solutions are already able to systematically outperform the H.264/AVC Intra, and also the H.264/AVC zero-motion standard solutions for specific types of content. (C) 2013 Elsevier B.V. All rights reserved.
Abstract:
This paper introduces local distance-based generalized linear models. These models extend (weighted) distance-based linear models, first with the generalized linear model concept and then by localizing. Distances between individuals are the only predictor information needed to fit these models. Therefore, they are applicable to mixed (qualitative and quantitative) explanatory variables or when the regressor is of functional type. Models can be fitted and analysed with the R package dbstats, which implements several distance-based prediction methods.
Abstract:
Prediction filters are well-known models for signal estimation in communications, control and many other areas. The classical method for deriving linear prediction coding (LPC) filters is often based on the minimization of a mean square error (MSE). Consequently, only second-order statistics are required, but the estimation is optimal only if the residue is independent and identically distributed (iid) Gaussian. In this paper, we derive the ML estimate of the prediction filter. Relationships with robust estimation of auto-regressive (AR) processes, with blind deconvolution and with source separation based on mutual information minimization are then detailed. The algorithm, based on the minimization of a high-order statistics criterion, uses on-line estimation of the residue statistics. Experimental results emphasize the interest of this approach.
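The remark that the MSE criterion needs only second-order statistics can be seen from the standard derivation, sketched here. The symbols (signal x[n], order p, coefficients a_k, autocorrelation r) are notation assumed for this illustration and do not come from the abstract.

```latex
% Prediction residue for an order-p linear predictor
e[n] = x[n] - \sum_{k=1}^{p} a_k \, x[n-k]

% Mean square error criterion
J(a_1,\dots,a_p) = \mathrm{E}\!\left[ e[n]^2 \right]

% Setting \partial J / \partial a_j = 0 for j = 1,\dots,p gives the normal equations,
% which involve the signal only through its autocorrelation r(m) = \mathrm{E}[x[n]\,x[n-m]]
\sum_{k=1}^{p} a_k \, r(|j-k|) = r(j), \qquad j = 1,\dots,p
```

Only r(0), ..., r(p) appear in the solution, so any information carried by higher-order statistics of a non-Gaussian residue is ignored by this criterion.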
Abstract:
The linear prediction coding of speech is based on the assumption that the generation model is autoregressive. In this paper we propose a structure to cope with the nonlinear effects present in the generation of the speech signal. This structure consists of two stages: the first is a classical linear prediction filter, and the second models the residual signal by means of two nonlinearities and a linear filter. The coefficients of this filter are computed by means of a gradient search on the score function. This is done in order to deal with the fact that the probability distribution of the residual signal is still not Gaussian. This fact is taken into account when the coefficients are computed by an ML estimate. The algorithm, based on the minimization of a high-order statistics criterion, uses on-line estimation of the residue statistics and is based on blind deconvolution of Wiener systems [1]. Improvements in the experimental results with speech signals emphasize the interest of this approach.