880 results for Speech signals
Abstract:
A speech by Sean O'Sullivan, given in the House of Commons, "For the Recognition of the Beaver as a Symbol of the Sovereignty of the Dominion of Canada".
Abstract:
UANL
Abstract:
Thesis written in co-mentorship with Richard Chase Smith, Ph.D., of El Instituto del Bien Comun (IBC) in Peru. The attached file is a PDF created in Word. The PDF format preserves the accuracy of the many linguistic symbols found in the text.
Abstract:
We study the application of matrix decomposition algorithms, such as Non-negative Matrix Factorization (NMF), to frequency-domain representations of musical audio signals. These algorithms, driven by a reconstruction error function, learn a set of basis functions and a set of corresponding coefficients that approximate the input signal. We compare the use of three reconstruction error functions when NMF is applied to monophonic and harmonized scales: least squares, Kullback-Leibler divergence, and a recently introduced phase-dependent divergence measure. New methods for interpreting the resulting decompositions are presented and compared with previously used methods that require knowledge of the acoustic domain. Finally, we analyze the generalization ability of the learned basis functions with respect to three musical parameters: amplitude, duration and instrument type. To this end, we introduce two algorithms for labeling the basis functions that outperform the previous approach in the majority of our tests, the instrument task with monophonic audio being the only notable exception.
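As an illustration of the decomposition described in this abstract, the following is a minimal sketch of NMF with the least-squares error function only, using the well-known multiplicative update rules. It is not the thesis implementation: the function names, the random initialisation, the iteration count and the small constant `eps` are all assumptions made for the example, and the Kullback-Leibler and phase-dependent cost functions are not shown.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(V, approx):
    """Least-squares reconstruction error between V and its approximation."""
    return sum((v - a) ** 2 for rv, ra in zip(V, approx) for v, a in zip(rv, ra))


def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (e.g. a magnitude spectrogram) as W @ H.

    Columns of W play the role of basis functions, rows of H the
    corresponding coefficients. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Toy "spectrogram" of rank 1: every column is a scaled copy of the same spectrum.
V = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
W, H = nmf(V, rank=1, iterations=50)
error = squared_error(V, matmul(W, H))
```

Because the updates only multiply by non-negative ratios, `W` and `H` stay non-negative throughout, which is what makes the learned basis functions interpretable as spectra.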
Abstract:
This thesis studies models of high-dimensional sequences based on recurrent neural networks (RNNs) and their application to music and speech. Although in principle RNNs can represent the long-term dependencies and complex temporal dynamics of sequences of interest such as video, audio and natural language, they have not been used to their full potential since their introduction by Rumelhart et al. (1986a) because of the difficulty of training them efficiently by gradient descent. Recently, the successful application of Hessian-free optimization and other advanced training techniques has led to a resurgence of their use in several state-of-the-art systems. The work of this thesis is part of this development. The central idea is to exploit the flexibility of RNNs to learn a probabilistic description of sequences of symbols, i.e. high-level information associated with the observed signals, which in turn can serve as a prior to improve the accuracy of information retrieval. For example, by modeling the evolution of groups of notes in polyphonic music, of chords in a harmonic progression, of phonemes in a spoken utterance, or of individual sources in an audio mixture, we can significantly improve methods for polyphonic transcription, chord recognition, speech recognition and audio source separation, respectively. The practical application of our models to these tasks is detailed in the last four articles presented in this thesis. In the first article, we replace the output layer of an RNN with conditional restricted Boltzmann machines to describe much richer multimodal output distributions. In the second article, we evaluate and propose advanced methods for training RNNs.
In the last four articles, we examine different ways of combining our symbolic models with deep networks and non-negative matrix factorization, notably through products of experts, input/output architectures and generative frameworks that generalize hidden Markov models. We also propose and analyze efficient inference methods for these models, such as chronological greedy search, high-dimensional beam search, pruned beam search and gradient descent. Finally, we address the issues of label bias, teacher forcing, temporal smoothing, regularization and pre-training.
Abstract:
Rapamycin is an immunosuppressant used to treat several types of disease, including kidney cancer. Its action through inhibition of the Tor pathway leads to changes in physiological processes, including the cell cycle. In Saccharomyces cerevisiae, rapamycin causes a rapid and global alteration of gene expression, triggering chromatin remodeling. We propose that histone modifications may play a crucial role in chromatin remodeling in response to rapamycin. Our main objective is to identify, from a library of histone mutants, the variants that fail to respond to rapamycin, in order to characterize the histone modifications critical for the response to this drug. We therefore screened a library of histone mutants and identified several histone mutants with altered resistance to rapamycin. We characterized one of these histone variants, namely H2B carrying a substitution of arginine by alanine at position 95 (H2B-R95A), and showed that this mutant is extremely resistant to rapamycin but not to other drugs. Immunoprecipitations showed that H2B-R95A is defective in forming a complex with Spt16, a factor essential for the dissociation of H2A and H2B from chromatin, which allows replication and transcription by DNA and RNA polymerases, respectively. ChIP-chip and microarray experiments showed that arginine 95 of H2B is required to recruit Spt16 in order to allow the expression of a multitude of genes, some of which belong to the pheromone pathway. Evidence is presented for the first time showing that rapamycin can activate the pheromone pathway and that a defect in this pathway causes resistance to this drug.
Abstract:
A forward-biased point-contact germanium signal diode placed inside a waveguide section along the E-vector is found to introduce a significant phase shift in microwave signals. The usefulness of the arrangement as a phase modulator for microwave carriers is demonstrated. While a less significant amplitude modulation accompanies the phase modulation, the insertion losses are found to be negligible. The observations can be explained by the variation of the barrier-layer capacitance with forward current in the diode.
Abstract:
Fine magnetic particles (size≅100 Å) belonging to the series ZnxFe1−xFe2O4 were synthesized by cold co-precipitation methods and their structural properties were evaluated using X-ray diffraction. Magnetization studies carried out using vibrating sample magnetometry (VSM) showed near-zero-loss loop characteristics. Ferrofluids were then prepared from these fine magnetic powders, using oleic acid as surfactant and kerosene as carrier liquid, by modifying the usually reported synthesis technique in order to induce anisotropy and enhance the magneto-optical signals. Liquid thin films of these fluids were prepared and field-induced laser transmission through these films was studied. The transmitted light intensity at the centre decreases linearly with applied magnetic field at low fields and saturates at higher fields. This is consistent with the saturation of cluster formation. The patterns exhibited by these films in the presence of different magnetic fields were observed with a CCD camera and recorded photographically.
Abstract:
In this paper we show that time series obtained from biological systems such as the human brain are invariably nonstationary because of the different time scales involved in the dynamical process. This makes the invariant parameters time dependent. We carried out a global analysis of EEG data obtained from eight locations on the skull and simultaneously studied the dynamical characteristics of various parts of the brain. We have proved that the dynamical parameters are sensitive to the time scales; hence, in studying the brain, one must identify all relevant time scales involved in the process to gain insight into its working.
Abstract:
Sonar signal processing comprises a large number of signal processing algorithms for implementing functions such as target detection, localisation, classification, tracking and parameter estimation. Current implementations of these functions rely on conventional techniques largely based on Fourier methods, which are primarily meant for stationary signals. However, the signals received by sonar sensors are often non-stationary, so processing methods capable of handling non-stationarity can be expected to fare better than Fourier transform based methods. Time-frequency methods (TFMs) are known as one of the best DSP tools for non-stationary signal processing, with which one can analyze signals in the time and frequency domains simultaneously. Other than the STFT, however, TFMs have been largely limited to academic research because of the complexity of the algorithms and the limitations of computing power. With the availability of fast processors, many applications of TFMs have been reported in speech processing, image processing and biomedical applications, but not many in sonar processing. This thesis is a structured effort to fill that gap by exploring the potential of TFMs in sonar applications. To this end, four TFMs have been explored in detail, viz. the Wavelet Transform, the Fractional Fourier Transform, the Wigner-Ville Distribution and the Ambiguity Function, and their potential in implementing five major sonar functions has been demonstrated with very promising results. What this thesis conclusively brings out is that there is no "one best TFM" for all applications, but there is "one best TFM" for each application. Accordingly, the TFM has to be adapted and tailored in many ways in order to develop specific algorithms for each of the applications.
Abstract:
This thesis presents applications of recurrence quantification analysis (RQA) to metal cutting in a lathe, with the specific objective of detecting tool wear and chatter. The study is based on the discovery that the process dynamics in a lathe are low-dimensional chaotic, which implies that the machine dynamics are controllable using principles of chaos theory. This understanding promises to revolutionize the feature extraction methodologies used in condition monitoring systems, as conventional linear methods or models are incapable of capturing the critical and strange behaviors associated with the metal cutting process. As sensor-based approaches provide an automated and cost-effective way to monitor and control, an efficient feature extraction methodology based on nonlinear time series analysis is very much in demand. The task is more complex when the information has to be deduced solely from sensor signals, since traditional methods do not address how to treat the noise present in real-world processes or its non-stationarity. To overcome these two issues as far as possible, this thesis adopts recurrence quantification analysis, since this feature extraction technique is found to be robust against noise and non-stationarity in the signals. The work consists of two sets of experiments in a lathe, set-1 and set-2. Set-1 studies the influence of tool wear on the RQA variables, whereas set-2 is carried out to identify the RQA variables sensitive to machine tool chatter, followed by validation in actual cutting. To obtain the bounds of the spectrum of the significant RQA variable values, a fresh tool and a worn tool are used for cutting in set-1. The first part of the set-2 experiments uses a stepped shaft in order to create chatter at a known location.
The second part uses a conical section with a uniform taper along the axis, so that chatter sets in at some distance from the smaller end as the depth of cut gradually increases while the spindle speed and feed rate are kept constant. The study concludes by revealing, unambiguously, the dependence of certain RQA variables (percent determinism, percent recurrence and entropy) on tool wear and chatter. The results establish this methodology as viable for the detection of tool wear and chatter in metal cutting in a lathe. The key reason is that the dynamics of the system under study are nonlinear and recurrence quantification analysis can characterize them adequately. This work establishes that the principles and practice of machining can benefit and advance considerably from the use of nonlinear dynamics and chaos theory.
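The three RQA variables named in this abstract can be sketched as follows. This is a simplified illustration rather than the procedure used in the thesis: it works directly on scalar samples (no time-delay embedding), and the threshold `eps`, the minimum line length `lmin` and all function names are assumptions introduced for the example.

```python
import math


def recurrence_matrix(x, eps):
    """R[i][j] = 1 when samples i and j lie within eps of each other."""
    n = len(x)
    return [[1 if abs(x[i] - x[j]) <= eps else 0 for j in range(n)] for i in range(n)]


def percent_recurrence(R):
    """Share of recurrent points, excluding the trivial main diagonal."""
    n = len(R)
    hits = sum(R[i][j] for i in range(n) for j in range(n) if i != j)
    return 100.0 * hits / (n * (n - 1))


def diagonal_line_lengths(R):
    """Lengths of all runs of recurrent points along the off-main diagonals."""
    n = len(R)
    lengths = []
    for k in range(1, n):
        for upper in (True, False):
            run = 0
            for i in range(n - k):
                point = R[i][i + k] if upper else R[i + k][i]
                if point:
                    run += 1
                elif run:
                    lengths.append(run)
                    run = 0
            if run:
                lengths.append(run)
    return lengths


def percent_determinism(R, lmin=2):
    """Share of recurrent points that belong to diagonal lines of length >= lmin."""
    lengths = diagonal_line_lengths(R)
    total = sum(lengths)
    if total == 0:
        return 0.0
    return 100.0 * sum(l for l in lengths if l >= lmin) / total


def line_length_entropy(R, lmin=2):
    """Shannon entropy (in nats) of the diagonal line length distribution."""
    lengths = [l for l in diagonal_line_lengths(R) if l >= lmin]
    if not lengths:
        return 0.0
    counts = {}
    for l in lengths:
        counts[l] = counts.get(l, 0) + 1
    n = len(lengths)
    return -sum(c / n * math.log(c / n) for c in counts.values())


# A strictly periodic toy signal: every recurrent point lies on a long diagonal.
signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
R = recurrence_matrix(signal, eps=0.1)
rec = percent_recurrence(R)
det = percent_determinism(R)
ent = line_length_entropy(R)
```

For a periodic signal such as the toy example, all recurrent points fall on diagonal lines, so percent determinism is at its maximum; noisy or irregular signals break these lines up and lower it.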
Abstract:
Natural systems are inherently nonlinear, and recurrent behaviours are typical of them. Recurrence is a fundamental property of nonlinear dynamical systems that can be exploited to characterize system behaviour effectively. This thesis presents a cross recurrence based analysis of sensor signals from nonlinear dynamical systems. The mutual dependency among relatively independent components of a system is referred to as coupling. The analysis is carried out for a mechanically coupled system specifically designed for the experiments. The cross recurrence method is then extended to the actual machining process in a lathe to characterize chatter during turning. Conventional linear methods or models are incapable of capturing the critical and strange behaviours associated with the dynamical process. Hence any effective feature extraction methodology should gather information through nonlinear time series analysis. The sensor signals from the dynamical system normally contain noise and non-stationarity. To overcome these two issues as far as possible, this work adopts the cross recurrence quantification analysis (CRQA) methodology, since it is found to be robust against noise and non-stationarity in the signals. The study reveals that CRQA is capable of characterizing even weak coupling among system signals. It also reveals, unambiguously, the dependence of certain CRQA variables, such as percent determinism, percent recurrence and entropy, on chatter. The surrogate data test shows that the results obtained by CRQA are true properties of the temporal evolution of the dynamics and contain a degree of deterministic structure. The results are verified using permutation entropy (PE) to detect the onset of chatter from the time series.
The present study ascertains that this CRP based methodology is capable of recognizing the transition from regular cutting to chatter cutting irrespective of the machining parameters or workpiece material. The results establish this methodology as feasible for the detection of chatter in metal cutting in a lathe.
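Permutation entropy, which this abstract uses as an independent check, can be computed from ordinal patterns of consecutive samples. The sketch below follows the standard definition; the default `order` and `delay` values and the function name are assumptions for illustration and are not taken from the thesis.

```python
import math


def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Entropy of the distribution of ordinal patterns in the series x.

    Each window of `order` samples (spaced `delay` apart) is replaced by the
    permutation that sorts it; the entropy of the pattern frequencies is
    returned, optionally divided by log(order!) so it lies in [0, 1].
    """
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("series too short for the chosen order and delay")
    counts = {}
    for i in range(n):
        window = [x[i + j * delay] for j in range(order)]
        pattern = tuple(sorted(range(order), key=lambda j: window[j]))
        counts[pattern] = counts.get(pattern, 0) + 1
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    if normalize:
        h /= math.log(math.factorial(order))
    return h


# A monotone ramp has a single ordinal pattern, hence zero entropy.
pe_ramp = permutation_entropy([1, 2, 3, 4, 5, 6], order=3)

# Short irregular series: four rising pairs and two falling pairs for order 2.
pe_mixed = permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2)
```

A regular signal visits few ordinal patterns and gives a low value, while an irregular one spreads over many patterns and gives a value close to 1 after normalization, which is why a change in this quantity over time can flag a change in the dynamics.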
Abstract:
This thesis investigates the potential use of zerocrossing information for speech sample estimation. It provides a new method to estimate speech samples using composite zerocrossings. A simple linear interpolation technique is developed for this purpose. By using this method, the A/D converter can be avoided in a speech coder. The newly proposed zerocrossing sampling theory is supported by results of computer simulations using real speech data. The thesis also presents two methods for voiced/unvoiced classification. One of these methods is based on a distance measure that is a function of the short-time zerocrossing rate and the short-time energy of the signal. The other is based on the attractor dimension and entropy of the signal. Of these two methods, the first is simpler and requires far fewer computations than the other. This method is used in a later chapter to design an enhanced Adaptive Transform Coder. The later part of the thesis addresses a few problems in Adaptive Transform Coding and presents an improved ATC. The transform coefficient with maximum amplitude is considered as 'side information'. This enables more accurate bit assignment and step-size computation. A new bit reassignment scheme is also introduced in this work. Finally, an ATC that switches between the Discrete Cosine Transform and the Discrete Walsh-Hadamard Transform for voiced and unvoiced speech segments, respectively, is presented. Simulation results are provided to show the improved performance of the coder.
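The two short-time features used by the first voiced/unvoiced classifier can be sketched as below. The abstract does not give the distance measure itself, so this example substitutes a plain two-threshold rule; the thresholds, the frame length, the sampling rate and the function names are all hypothetical.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def classify_frame(frame, zcr_threshold=0.25, energy_threshold=0.01):
    """Label a frame with a simple rule (an assumption, not the thesis measure).

    Voiced speech tends to have high energy and few zero crossings;
    unvoiced speech tends to have low energy and many zero crossings.
    """
    zcr = zero_crossing_rate(frame)
    energy = short_time_energy(frame)
    if energy > energy_threshold and zcr < zcr_threshold:
        return "voiced"
    return "unvoiced"


# Toy frames of 240 samples at an assumed 8 kHz sampling rate.
tone = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(240)]
hiss = [0.01 * (-1) ** n for n in range(240)]

tone_label = classify_frame(tone)
hiss_label = classify_frame(hiss)
```

Both features need only additions, multiplications and sign comparisons per sample, which is consistent with the abstract's remark that this classifier is much cheaper than the one based on attractor dimension and entropy.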
Abstract:
Biometrics deals with the physiological and behavioral characteristics of an individual to establish identity. Fingerprint based authentication is the most advanced biometric authentication technology. The minutiae based fingerprint identification method offers a reasonable identification rate. The minutiae feature map consists of about 70-100 minutia points, and matching accuracy drops as the database grows. Hence the fingerprint feature code must be made as small as possible so that identification becomes easier. In this research, a novel global singularity based fingerprint representation is proposed. The fingerprint baseline, which is the joint line between the distal and intermediate phalanges in the fingerprint, is taken as the reference line. A polygon is formed with the singularities and the fingerprint baseline. The feature vectors are the polygonal angle, sides, area, type and the ridge counts between the singularities. A 100% recognition rate is achieved with this method. The method is compared with the conventional minutiae based recognition method in terms of computation time, receiver operating characteristic (ROC) and feature vector length. Speech is a behavioural biometric modality and can be used for identification of a speaker. In this work, the MFCCs of text-dependent speech are computed and clustered using the k-means algorithm. A backpropagation based Artificial Neural Network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with that of the VQ based Euclidean minimum classifier. Biometric systems that use a single modality are usually affected by problems such as noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. Multifinger feature level fusion based fingerprint recognition is developed and its performance is measured in terms of the ROC curve.
Score level fusion of the fingerprint and speech based recognition systems is carried out, and 100% accuracy is achieved for a considerable range of matching thresholds.
Abstract:
This thesis investigates the potential use of Linear Predictive Coding in speech communication applications. A Modified Block Adaptive Predictive Coder is developed, which reduces the computational burden and complexity without sacrificing speech quality, as compared to the conventional adaptive predictive coding (APC) system. To this end, changes to the evaluation methods have been developed. The method differs from the usual APC system in that the difference between the true and the predicted value is not transmitted. This allows the high order predictor in the transmitter section of a predictive coding system to be replaced by a simple delay unit, which makes the transmitter quite simple. Also, the block length used in processing the speech signal is adjusted relative to the pitch period of the signal being processed, rather than being fixed at a constant length as done hitherto by other researchers. The efficiency of the newly proposed coder is supported by results of computer simulation using real speech data. Three methods for voiced/unvoiced/silent/transition classification are presented. The first is based on energy, zerocrossing rate and the periodicity of the waveform. The second method uses the normalised correlation coefficient as the main parameter, while the third method utilizes a pitch-dependent correlation factor. The third algorithm, which gives the minimum error probability, is chosen in a later chapter to design the modified coder. The thesis also presents a comparative study between the autocorrelation and the covariance methods used in the evaluation of the predictor parameters. It has been proved that the autocorrelation method is superior to the covariance method with respect to filter stability and also in an SNR sense, though the increase in gain is only small.
The Modified Block Adaptive Coder switches from pitch prediction to spectrum prediction when the speech segment changes from a voiced or transition region to an unvoiced region. The experiments conducted in coding, transmission and simulation used speech samples from Malayalam and English phrases. Proposals for a speaker recognition system and a phoneme identification system are also outlined towards the end of the thesis.
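The autocorrelation method mentioned in this abstract is commonly solved with the Levinson-Durbin recursion, sketched below. This is the textbook recursion, not the coder developed in the thesis; the function names and the toy autocorrelation sequence are assumptions made for the example.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of a (windowed) frame for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for a linear predictor.

    Returns (a, err, ks) where the prediction is
    x_hat[n] = a[0]*x[n-1] + ... + a[order-1]*x[n-order],
    err is the final prediction error energy and ks are the reflection
    coefficients. Assumes r[0] > 0 and a non-degenerate signal.
    """
    a = [0.0] * (order + 1)  # a[0] is unused; a[j] is the weight on x[n-j]
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        ks.append(k)
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err, ks


# Autocorrelation of an idealised first-order process: r[k] = 0.9 ** k.
r = [0.9 ** k for k in range(3)]
coeffs, residual, reflections = levinson_durbin(r, order=2)
```

When the autocorrelation values come from a real windowed frame, every reflection coefficient has magnitude below one, which corresponds to a stable synthesis filter; this is the stability property the abstract credits to the autocorrelation method.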