10 results for speech signals
Abstract:
Time variability of the signals scattered from wind turbines may degrade communication systems operating in the UHF band, especially under near-field conditions. In order to analyze the variability due to the rotation of the blades, this paper characterizes empirical Doppler spectra obtained from real samples of signals scattered by wind turbines with rotating blades under near-field conditions. A new Doppler spectrum model is proposed to fit the spectral characteristics of these signals, providing notable goodness of fit. Finally, the effect of this kind of time variability on the degradation of OFDM signals is studied.
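As a toy illustration of how an empirical Doppler spectrum can be estimated from sampled scattered-signal data, the sketch below computes a DFT periodogram of a synthetic complex baseband record. This is a minimal stand-in, not the authors' estimator: the sampling rate, record length, and the single 10 Hz Doppler line are all invented for illustration.

```python
import cmath
import math

def periodogram(x, fs):
    """Magnitude-squared DFT of a short complex baseband record,
    a basic estimate of the Doppler power spectrum."""
    n = len(x)
    freqs = [fs * k / n for k in range(n)]
    power = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        power.append(abs(s) ** 2 / n)
    return freqs, power

# toy record: one Doppler line at 10 Hz, fs = 64 Hz, 64 samples
fs, n = 64.0, 64
x = [cmath.exp(2j * math.pi * 10 * m / fs) for m in range(n)]
freqs, power = periodogram(x, fs)
peak_hz = freqs[power.index(max(power))]  # → 10.0
```

A real analysis would average periodograms of many blade-rotation records (Welch's method) before fitting a spectrum model to the result.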
Abstract:
Accurate and fast decoding of speech imagery from electroencephalographic (EEG) data could serve as a basis for a new generation of brain-computer interfaces (BCIs) that are more portable and easier to use. However, decoding speech imagery from EEG is a hard problem due to many factors. In this paper we focus on the analysis of the classification step of speech imagery decoding for a three-class vowel speech imagery recognition problem. We show empirically that different classification subtasks may require different classifiers for accurate decoding, and we obtain a classification accuracy that improves on the best previously published results. We further investigate the relationship between the classifiers and different sets of features selected by the common spatial patterns method. Our results indicate that further improvement in BCIs based on speech imagery could be achieved by carefully selecting an appropriate combination of classifiers for the subtasks involved.
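The idea of per-subtask classifiers can be pictured with a one-vs-one decomposition of a three-class problem, where each pairwise subtask could in principle receive its own classifier. The sketch below is a minimal illustration using a toy nearest-mean classifier on 1-D features; the data, the classifier, and the vowel labels are invented for illustration and are not the paper's EEG pipeline.

```python
from itertools import combinations
from statistics import mean

def train_nearest_mean(xs, ys):
    """Toy binary classifier: assign a sample to the closer class mean."""
    means = {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

def one_vs_one(xs, ys, make_clf):
    """Split a multi-class task into pairwise subtasks; each subtask
    gets its own classifier, and predictions are made by voting."""
    classes = sorted(set(ys))
    clfs = []
    for a, b in combinations(classes, 2):
        pxs = [x for x, y in zip(xs, ys) if y in (a, b)]
        pys = [y for y in ys if y in (a, b)]
        clfs.append(make_clf(pxs, pys))
    def predict(x):
        votes = [clf(x) for clf in clfs]
        return max(set(votes), key=votes.count)
    return predict

# toy 1-D "vowel imagery" features: three well-separated clusters
xs = [0.1, 0.2, 1.0, 1.1, 2.0, 2.1]
ys = ['a', 'e', 'e', 'o', 'o', 'a'][:0] or ['a', 'a', 'e', 'e', 'o', 'o']
predict = one_vs_one(xs, ys, train_nearest_mean)
```

In the paper's setting, `make_clf` would differ per subtask (the point of the study); here a single toy classifier is reused only to keep the sketch short.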
Abstract:
The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer's disease (AD) detection, and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to contribute to improving the diagnosis of AD and its degree of severity. To this end, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD patients and control subjects). Two human dimensions have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non-invasive, low cost, and free of side effects. The experimental results obtained were very satisfactory and promising for early diagnosis and classification of AD patients.
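One simple non-linear speech feature of the kind mentioned is a fractal dimension estimate. The sketch below implements the Katz estimator, one of several FD definitions (the abstract does not specify which estimator was used), applied to two synthetic signals: a smooth ramp scores near 1, while an irregular oscillation scores higher.

```python
import math

def katz_fd(y):
    """Katz fractal dimension of a 1-D sample sequence:
    FD = log10(n) / (log10(n) + log10(d / L)), where L is the total
    path length and d the maximal excursion from the first sample."""
    n = len(y) - 1                                  # number of steps
    length = sum(abs(y[i + 1] - y[i]) for i in range(n))
    dist = max(abs(v - y[0]) for v in y[1:])
    return math.log10(n) / (math.log10(n) + math.log10(dist / length))

ramp = [i / 99 for i in range(100)]                 # smooth signal, FD ≈ 1
osc = [math.sin(0.5 * i) for i in range(100)]       # irregular signal, FD > 1
fd_ramp, fd_osc = katz_fd(ramp), katz_fd(osc)
```

For real speech one would compute such features frame by frame and feed them, together with the linear features, to the ANN classifier.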
Abstract:
Feature-based vocoders, e.g., STRAIGHT, offer a way to manipulate the perceived characteristics of the speech signal in speech transformation and synthesis. For the harmonic model, which provides excellent perceived quality, features for the amplitude parameters already exist (e.g., Line Spectral Frequencies (LSF), Mel-Frequency Cepstral Coefficients (MFCC)). However, because of the wrapping of the phase parameters, phase features are more difficult to design. To randomize the phase of the harmonic model during synthesis, a voicing feature is commonly used, which distinguishes voiced and unvoiced segments. However, voice production allows smooth transitions between voiced and unvoiced states, which can make voicing segmentation tricky to estimate. In this article, two phase features are suggested to represent the phase of the harmonic model in a uniform way, without a voicing decision. The synthesis quality of the resulting vocoder has been evaluated, using subjective listening tests, in the context of resynthesis, pitch scaling, and Hidden Markov Model (HMM)-based synthesis. The experiments show that the suggested signal model is comparable to STRAIGHT, or even better in some scenarios. They also reveal some limitations of the harmonic framework itself in the case of high fundamental frequencies.
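The harmonic model underlying this work represents a stationary frame as a sum of sinusoids at integer multiples of the fundamental frequency, each with its own amplitude and phase. The sketch below resynthesizes one such frame; the f0, amplitudes, and zero phases are illustrative values, not the paper's phase features.

```python
import math

def harmonic_synth(f0, amps, phases, fs, n):
    """Resynthesize one stationary frame of the harmonic model:
    s[m] = sum_k a_k * cos(2*pi*k*f0*m/fs + phi_k), k = 1..K."""
    out = []
    for m in range(n):
        t = m / fs
        s = sum(a * math.cos(2 * math.pi * (k + 1) * f0 * t + p)
                for k, (a, p) in enumerate(zip(amps, phases)))
        out.append(s)
    return out

# toy frame: 100 Hz fundamental, three harmonics, zero phases
frame = harmonic_synth(100.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0], 8000.0, 80)
```

At t = 0 all cosines equal 1, so the first sample is simply the sum of the harmonic amplitudes (1.75 here); the phase features proposed in the article would control the `phases` vector without an explicit voiced/unvoiced decision.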
Abstract:
The study of emotions in human-computer interaction is a growing research area. This paper presents an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to seek the most relevant feature subset. The three-phase approach was chosen to check the validity of the proposed approach. The results achieved show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm for automatic emotion recognition across all feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) has also been applied, and a comparison between the two is provided. Based on the results achieved, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested.
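The greedy FSS-Forward baseline mentioned above adds, at each step, the candidate feature that most improves a score, stopping when no candidate helps. The sketch below is a minimal version; the additive toy scoring function and the feature names are invented for illustration, standing in for a real classifier-based evaluation.

```python
def forward_selection(features, score):
    """Greedy forward feature subset selection (FSS-Forward style):
    repeatedly add the feature with the largest score gain."""
    selected, best = [], score([])
    while True:
        cands = [f for f in features if f not in selected]
        if not cands:
            break
        top, f = max((score(selected + [f]), f) for f in cands)
        if top <= best:          # no candidate improves the score: stop
            break
        selected, best = selected + [f], top
    return selected, best

# toy additive scoring: each feature has a fixed value, one is useless
values = {'pitch': 3, 'energy': 2, 'mfcc': 4, 'noise': 0}
subset, sc = forward_selection(list(values),
                               lambda s: sum(values[f] for f in s))
```

With a real wrapper evaluation the score would be cross-validated classification accuracy, and feature gains would not be additive, which is exactly why the paper compares this greedy search against evolutionary subset selection.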
Abstract:
A central question in neuroscience is how the nervous system generates the spatiotemporal commands needed to realize complex gestures, such as handwriting. A key postulate is that the central nervous system (CNS) builds up complex movements from a set of simpler motor primitives or control modules. In this study we examined the control modules underlying the generation of muscle activations when performing different types of movement: discrete, point-to-point movements in eight different directions, and continuous figure-eight movements in both the normal, upright orientation and rotated 90 degrees. To test for the effects of biomechanical constraints, movements were performed in the frontal-parallel or sagittal planes, corresponding to two different nominal flexion/abduction postures of the shoulder. In all cases we measured limb kinematics and surface electromyographic (EMG) signals for seven different muscles acting around the shoulder. We first performed principal component analysis (PCA) of the EMG signals on a movement-by-movement basis. We found a surprisingly consistent pattern of muscle groupings across movement types and movement planes, although we could detect systematic differences between the PCs derived from movements performed in each shoulder posture and between the principal components associated with the different orientations of the figure. Unexpectedly, we found no systematic differences between the figure eights and the point-to-point movements. The first three principal components could be associated with a general co-contraction of all seven muscles plus two patterns of reciprocal activation.
From these results, we surmise that both "discrete-rhythmic movements" such as the figure eight and discrete point-to-point movements may be constructed from three fundamental modules: one regulating the impedance of the limb over the time span of the movement and two others operating to generate movement, one aligned with the vertical and the other with the horizontal.
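The PCA step described above extracts directions of correlated activity across muscle channels. The sketch below recovers the leading principal component of a centered multichannel record via power iteration on the covariance matrix, a minimal stand-in for the full PCA used on the seven EMG channels; the two-channel toy data are invented for illustration.

```python
import math

def first_pc(rows, iters=500):
    """Leading principal component of a samples-by-channels matrix:
    center each channel, form the covariance matrix, then power-iterate."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    x = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

# toy two-"muscle" recording with strongly co-varying activity
levels = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
rows = [[t, t + 0.01 * ((-1) ** i)] for i, t in enumerate(levels)]
pc1 = first_pc(rows)
```

Because the two channels co-vary almost perfectly, the leading component weights them nearly equally, the 1-D analogue of the co-contraction pattern loading on all seven muscles.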
Abstract:
We wished to replicate evidence that an experimental paradigm of speech illusions is associated with psychotic experiences. Fifty-four patients with a first episode of psychosis (FEP) and 150 healthy subjects were examined in an experimental paradigm assessing the presence of speech illusions in neutral white noise. Socio-demographic, cognitive function, and family history data were collected. The Positive and Negative Syndrome Scale (PANSS) was administered in the patient group, and the Structured Interview for Schizotypy-Revised (SIS-R) and the Community Assessment of Psychic Experiences (CAPE) in the control group. Patients had a much higher rate of speech illusions (33.3% versus 8.7%; adjusted OR: 5.1, 95% CI: 2.3-11.5), which was only partly explained by differences in IQ (adjusted OR: 3.4, 95% CI: 1.4-8.3). Differences were particularly marked for signals in random noise that were perceived as affectively salient (adjusted OR: 9.7, 95% CI: 1.8-53.9). Speech illusions tended to be associated with positive symptoms in patients (adjusted OR: 3.3, 95% CI: 0.9-11.6), particularly affectively salient illusions (adjusted OR: 8.3, 95% CI: 0.7-100.3). In controls, speech illusions were not associated with positive schizotypy (adjusted OR: 1.1, 95% CI: 0.3-3.4) or self-reported psychotic experiences (adjusted OR: 1.4, 95% CI: 0.4-4.6). Experimental paradigms indexing the tendency to detect affectively salient signals in noise may be used to identify liability to psychosis.
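The headline effect size can be reproduced approximately from the raw rates. The sketch below computes a crude (unadjusted) odds ratio with a Woolf 95% confidence interval from the counts implied by the reported percentages (18/54 patients, roughly 13/150 controls); the published 5.1 is an adjusted estimate, so only rough agreement is expected.

```python
import math

def odds_ratio(a, b, c, d):
    """Crude odds ratio with Woolf 95% CI for a 2x2 table:
    a/b = exposed cases/non-cases, c/d = unexposed cases/non-cases."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(OR)
    lo, hi = (math.exp(math.log(or_) + s * 1.96 * se) for s in (-1, 1))
    return or_, lo, hi

# counts implied by the reported rates: 18 of 54 patients,
# ~13 of 150 controls showed a speech illusion (13 is an assumption
# rounded from 8.7% of 150)
or_, lo, hi = odds_ratio(18, 36, 13, 137)
```

The crude OR comes out near 5.3 with a CI spanning the reported adjusted value of 5.1, consistent with adjustment having only a modest effect here.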
Abstract:
We distinguish two general approaches to inner speech (IS), the "format" and the "activity" views, and defend the activity view. The format view grounds the utility of IS in features of the representational format of language, and is related to the thesis that the proper function of IS is to make conscious thinking possible. IS appears typically as a product constituted by representations of phonological features. The view also has implications for the idea that passivity phenomena in cognition may be misattributed IS. The activity view sees IS as a speaking activity that does not have a proper function in cognition. It simply inherits the array of functions of outer speech. We argue that it is methodologically advisable to start from this variety of uses, which suggests commonalities between internal and external activities. The format view has several problems: it has to deny "unsymbolized thinking"; it cannot easily explain how IS makes thoughts available to consciousness; and it cannot explain those uses of IS where its format features apparently play no role. The activity view not only avoids these problems but also has explanatory advantages: construing IS as an activity allows it to be integrally constituted by its content; the view is able to construe unsymbolized thinking as part of a continuum of phenomena that exploit the same mechanisms; and it offers a simple explanation for the variety of uses of IS.
Abstract:
In multisource industrial scenarios (MSIS), NOAA-generating activities coexist with other productive sources of airborne particles, such as parallel manufacturing processes or electrical and diesel machinery. A distinctive characteristic of MSIS is the spatially complex distribution of aerosol sources, as well as their potential differences in dynamics, due to the feasibility of multi-task configurations at a given time. Thus, the background signal is expected to challenge the aerosol analyzers over a probably wide range of concentrations and size distributions, depending on the multisource configuration at a given time. Monitoring and prediction using statistical analysis of time series captured by on-line particle analyzers in industrial scenarios have been proven feasible in predicting PNC evolution, provided a given quality of the net signal (the difference between the signal at the source and the background). However, the analysis and modelling of non-consistent time series, influenced by low signal-to-noise ratios (SNR), could build a misleading basis for decision making. In this context, this work explores the use of stochastic models based on the ARIMA methodology to monitor and predict exposure values (PNC). The study was carried out in an MSIS, where a case study focused on the manufacture of perforated tablets of nano-TiO2 by cold pressing was performed.
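As a minimal stand-in for the ARIMA modelling described, the sketch below fits an AR(1) model by least squares to a synthetic PNC-like series and produces a one-step forecast; a full ARIMA treatment would add differencing and moving-average terms, and the generating constants here are invented for illustration.

```python
def fit_ar1(x):
    """Least-squares AR(1) fit, x[t] ≈ c + phi * x[t-1]; the simplest
    member of the ARIMA family, fitted by regressing x[t] on x[t-1]."""
    pairs = list(zip(x[:-1], x[1:]))
    n = len(pairs)
    mx = sum(p for p, _ in pairs) / n
    my = sum(q for _, q in pairs) / n
    cov = sum((p - mx) * (q - my) for p, q in pairs)
    var = sum((p - mx) ** 2 for p, _ in pairs)
    phi = cov / var
    c = my - phi * mx
    return c, phi

# synthetic noise-free "PNC" series generated by x[t] = 10 + 0.5 * x[t-1]
x = [5.0]
for _ in range(50):
    x.append(10 + 0.5 * x[-1])
c, phi = fit_ar1(x)
forecast = c + phi * x[-1]    # one-step-ahead prediction
```

On this noise-free series the fit recovers the generating constants exactly, and the forecast sits at the process mean of 20; with real, noisy PNC data one would also diagnose residuals and select the ARIMA order before trusting predictions.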