967 results for Speaker Recognition, Text-constrained, Multilingual, Speaker Verification, HMMs


Relevance:

30.00%

Abstract:

While humans can easily segregate and track a speaker's voice in a loud, noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans are not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the computational model, we conduct a series of electroencephalography (EEG) experiments using both simple tone-based stimuli and more natural speech stimuli. Most source segregation algorithms utilize some form of prior information about the target speaker or use more than one simultaneous recording of the noisy speech mixtures. Other methods develop models of the noise characteristics. Source segregation of simultaneous speech mixtures with a single microphone recording and no knowledge of the target speaker is still a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits the difference in the temporal evolution of features that belong to different sources to perform unsupervised monaural source segregation. While using no prior information about the target speaker, this method can gracefully incorporate knowledge about the target speaker to further enhance the segregation. Through a series of EEG experiments we collect neurological evidence to support the principle behind the model. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses of the physiological mechanisms of the remarkable perceptual ability of humans to segregate acoustic sources, and of its psychophysical manifestations in navigating complex sensory environments.
Results from EEG experiments provide further insights into the assumptions behind the model and provide motivation for future single unit studies that can provide more direct evidence for the principle of temporal coherence.

Relevance:

30.00%

Abstract:

Value and reasons for action are often cited by rationalists and moral realists as providing a desire-independent foundation for normativity. Those maintaining instead that normativity is dependent upon motivation often deny that anything called "value" or "reasons" exists. According to the interest-relational theory, something has value relative to some perspective of desire just in case it satisfies those desires, and a consideration is a reason for some action just in case it indicates that something of value will be accomplished by that action. Value judgements therefore describe real properties of objects and actions, but have no normative significance independent of desires. It is argued that only the interest-relational theory can account for the practical significance of value and reasons for action. Against the Kantian hypothesis of prescriptive rational norms, I attack the alleged instrumental norm or hypothetical imperative, showing that the normative force for taking the means to our ends is explicable in terms of our desire for the end, and not as a command of reason. This analysis also provides a solution to the puzzle concerning the connection between value judgement and motivation. While it is possible to hold value judgements without motivation, the connection is more than accidental. This is because value judgements are usually, but not always, made from the perspective of desires that actually motivate the speaker. In the normal case, judgement entails motivation. But often we conversationally borrow external perspectives of desire, and the resulting judgements do not entail motivation. This analysis drives a critique of a common practice as a misuse of normative language. The "absolutist" attempts to use and, as a philosopher, to analyze normative language in such a way as to justify the imposition of certain interests over others.
But these uses and analyses are incoherent - in denying relativity to particular desires they conflict with the actual meaning of these utterances, which is always indexed to some particular set of desires.

Relevance:

30.00%

Abstract:

Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. 
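The text-based image feature described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the two-dimensional visual descriptors, the Euclidean distance, the tag vocabulary, the value of k and the function name `knn_tag_feature` are all hypothetical and are not taken from the dissertation.

```python
from collections import Counter
from math import dist


def knn_tag_feature(query, auxiliary, vocabulary, k=3):
    """Build a tag histogram from the k auxiliary images closest to `query`.

    `auxiliary` is a list of (visual_descriptor, tags) pairs; the descriptors
    and the Euclidean distance are placeholders for real visual features.
    """
    neighbours = sorted(auxiliary, key=lambda item: dist(query, item[0]))[:k]
    counts = Counter(tag for _, tags in neighbours for tag in tags)
    # Fraction of the k neighbours that carry each vocabulary tag.
    return [counts[tag] / k for tag in vocabulary]


# Hypothetical auxiliary collection of tagged images (toy 2-D descriptors).
auxiliary = [
    ((0.10, 0.20), {"cat", "pet"}),
    ((0.20, 0.10), {"cat"}),
    ((0.15, 0.15), {"kitten", "cat"}),
    ((0.90, 0.80), {"car", "road"}),
    ((0.80, 0.90), {"car"}),
]
vocabulary = ["cat", "car", "pet"]

# An unannotated image inherits a text feature from its visual neighbours.
feature = knn_tag_feature((0.12, 0.18), auxiliary, vocabulary)
```

Because the feature is pooled over several neighbours, a single noisy tag has limited influence, which matches the abstract's remark that tags are noisy but stable under appearance changes.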
With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.

Relevance:

30.00%

Abstract:

The study of ichthyoplankton stages and their relations with the environment and other organisms is therefore crucial for the correct use of fishery resources. In this context, the extraction and analysis of the contents of the digestive tract is a key method for identifying the diet in early larval stages, determining the resources the larvae rely on and, possibly, comparing their diet with that of other species. Additionally, this approach could be useful in determining whether competition between species occurs. This technique is preceded by the analysis of morphometric data (Blackith & Reyment, 1971; Marcus, 1990), that is, the acquisition of quantitative variables measured from the morphology of the object of study: linear distances, counts, angles and ratios. The subsequent application of multivariate statistical methods aims to quantify the changes in morphological measures between and within groups, to relate them to the type and size of prey, and to evaluate whether food choices change as the larvae grow.

Relevance:

30.00%

Abstract:

Anti-signal recognition particle (SRP) myopathy is a rare idiopathic inflammatory myositis that usually affects middle-aged women and is characterized by rapidly progressive proximal and symmetrical muscle weakness, elevated creatine kinase levels, severe necrotizing immune-mediated myopathy, the presence of anti-SRP autoantibodies and a poor response to steroid therapy. We report a geriatric case of a previously independent patient presenting with slow onset of proximal paraparesis, myalgia and severe gait impairment. The patient was treated with steroids and azathioprine, with a laboratory and pain response but only modest improvement in muscle strength. The clinical presentation of this unusual patient was atypical, which hampered the correct diagnosis.

Relevance:

30.00%

Abstract:

Forensic speaker comparison exams have complex characteristics and demand a long time for manual analysis. A method for automatic recognition of vowels, providing feature extraction for acoustic analysis, is proposed as a support tool for these exams. The proposal is based on formant measurements by LPC (Linear Predictive Coding), applied selectively according to fundamental frequency detection, zero-crossing rate, bandwidth and continuity, with clustering done by the k-means method. Experiments using samples from three different databases have shown promising results, in which the regions corresponding to five of the Brazilian Portuguese vowels were successfully located, providing visualization of a speaker’s vocal tract behavior, as well as the detection of segments corresponding to target vowels.
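The clustering step mentioned in this abstract can be sketched as plain k-means over pairs of formant frequencies. This is a minimal illustration, not the authors' implementation: the (F1, F2) values, the choice of three clusters and the seed centroids are invented for the example, and the LPC formant extraction itself is assumed to have happened upstream.

```python
from math import dist


def kmeans(points, seeds, iterations=20):
    """Lloyd's k-means; `seeds` are the initial centroid guesses."""
    centroids = [tuple(s) for s in seeds]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(len(centroids)), key=lambda i: dist(p, centroids[i])) for p in points]
    return centroids, labels


# Hypothetical (F1, F2) measurements in Hz for voiced frames.
frames = [(300, 2300), (320, 2250), (700, 1200), (720, 1250), (310, 800), (330, 850)]
seeds = [frames[0], frames[2], frames[4]]
centroids, labels = kmeans(frames, seeds)
```

Each resulting centroid marks a region of the F1/F2 plane, and frames sharing a label are candidates for the same vowel.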

Relevance:

30.00%

Abstract:

Close similarities have been found between the otoliths of sea-caught and laboratory-reared larvae of the common sole Solea solea (L.), given appropriate temperatures and nourishment of the latter. But from hatching to mouth formation, and during metamorphosis, sole otoliths have proven difficult to read because the increments may be less regular and of low contrast. In this study, the growth increments in otoliths of larvae reared at 12 °C were counted by light microscopy to test the hypothesis of daily deposition, with some results verified using scanning electron microscopy (SEM), and by image analysis in order to compare the reliability of the two methods in age estimation. Age was first estimated (in days posthatch) from light micrographs of whole mounted otoliths. Counts were initiated from the increment formed at the time of mouth opening (Day 4). The average incremental deposition rate was consistent with the daily hypothesis. However, the light-micrograph readings tended to underestimate the mean ages of the larvae. Errors were probably associated with the low-contrast increments: those deposited after mouth formation during the transition to first feeding, and those deposited from the onset of eye migration (about 20 d posthatch) during metamorphosis. SEM failed to resolve these low-contrast areas accurately because of poor etching. A method using image analysis was applied to a subsample of micrograph-counted otoliths. The image analysis was supported by a pattern recognition algorithm (Growth Demodulation Algorithm, GDA). On each otolith, the GDA method integrated the growth pattern of these larval otoliths to averaged data from different radial profiles, in order to demodulate the exponential trend of the signal before spectral analysis (Fast Fourier Transformation, FFT).
This second method allowed both more precise designation of increments, particularly for low-contrast areas, and more accurate readings, but it increased the error in mean age estimation. The variability is probably due to a still rough perception of otolith increments by the GDA method, counting being achieved through a theoretical exponential pattern and mean estimates being given by FFT. Although this error variability was greater than expected, the method provides for improvement in both speed and accuracy in otolith readings.
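The general idea of removing an exponential trend before spectral analysis can be sketched as follows. This is not the Growth Demodulation Algorithm itself, whose details the abstract does not give: the synthetic profile, the log-linear least-squares fit and the plain discrete Fourier transform (used here in place of an FFT routine) are all assumptions made for illustration.

```python
import cmath
import math


def remove_exponential_trend(signal):
    """Fit log(signal) ~ a + b*n by least squares and divide the trend out."""
    n = len(signal)
    logs = [math.log(v) for v in signal]  # requires a strictly positive signal
    mean_x = (n - 1) / 2
    mean_y = sum(logs) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(logs)) / sum(
        (x - mean_x) ** 2 for x in range(n)
    )
    intercept = mean_y - slope * mean_x
    # Dividing by the fitted trend leaves a signal that oscillates around zero.
    return [v / math.exp(intercept + slope * x) - 1.0 for x, v in enumerate(signal)]


def dominant_period(signal):
    """Return the period (in samples) of the strongest non-constant DFT component."""
    n = len(signal)
    best_k, best_power = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(signal))
        if abs(coeff) > best_power:
            best_k, best_power = k, abs(coeff)
    return n / best_k


# Synthetic radial profile: exponential trend times banding with a period of 8 samples.
profile = [math.exp(0.02 * i) * (1.0 + 0.3 * math.sin(2 * math.pi * i / 8)) for i in range(64)]
period = dominant_period(remove_exponential_trend(profile))
```

On real otolith profiles the increment spacing is not constant, so a single dominant period is only a rough summary, which may be why the abstract describes the resulting estimates as means.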

Relevance:

30.00%

Abstract:

Several studies have reported impairments in decoding emotional facial expressions in intimate partner violence (IPV) perpetrators. However, the mechanisms that underlie these impaired skills are not well known. Given this gap in the literature, we aimed to establish whether IPV perpetrators (n = 18) differ in their emotion decoding process, attentional skills, and testosterone (T), cortisol (C) levels and T/C ratio in comparison with controls (n = 20), and also to examine the moderating role of the group and hormonal parameters in the relationship between attention skills and the emotion decoding process. Our results demonstrated that IPV perpetrators showed poorer emotion recognition and higher attention switching costs than controls. Nonetheless, they did not differ in attention to detail and hormonal parameters. Finally, the slope predicting emotion recognition from deficits in attention switching became steeper as T levels increased, especially in IPV perpetrators, although the basal C and T/C ratios were unrelated to emotion recognition and attention deficits for both groups. These findings contribute to a better understanding of the mechanisms underlying emotion recognition deficits. These factors therefore constitute the target for future interventions.

Relevance:

30.00%

Abstract:

The purpose of this article is to examine the factors that affect the inclusion of pupils in programmes for children with special needs from the perspective of the theory of recognition. The concept of recognition, which includes three aspects of social justice (economic, cultural and political), argues that the institutional arrangements that prevent ‘parity of participation’ in the school social life of the children with special needs are affected not only by economic distribution but also by the patterns of cultural values. A review of the literature shows that the arrangements of education of children with special needs are influenced primarily by the patterns of cultural values of capability and inferiority, as well as stereotypical images of children with special needs. Due to the significant emphasis on learning skills for academic knowledge and grades, less attention is dedicated to factors of recognition and representational character, making it impossible to improve some meaningful elements of inclusion. Any participation of pupils in activities, the voices of the children, visibility of the children due to achievements and the problems of arbitrariness in determining boundaries between programmes are some such elements. Moreover, aided by theories, the actions that could contribute to better inclusion are reviewed. An effective approach to changes would be the creation of transformative conditions for the recognition and balancing of redistribution, recognition, and representation. (DIPF/Orig.)

Relevance:

30.00%

Abstract:

The comparative analysis of Polish and Spanish political discourse in the multilingual context of European institutions is challenging not only because of linguistic, cultural, geopolitical and social differences, but also because of the relatively short history of such contacts in the EU framework. Intercultural communication, as a dynamic social practice, is a fascinating object of investigation. Bidirectional comparative analysis of Polish and Spanish oral texts makes it possible to define the barriers to such communication. It encompasses the discursive act together with its objectives, strategies and consequences, and also its raison d’être. It explains why different strategies, reflected through discursive categories, were used. Consequently, it describes both the conditions and the outcomes of identity negotiation. The latter is a political competence perceived and evaluated directly by the interlocutors, the participants in the political debate, and indirectly by public opinion in the European Union, which shows that this is two-level communication. The negotiation of political identity through discourse, according to the Ting-Toomey theory, can lead to maintaining, losing, recovering or reinforcing it. The Identity Negotiation Theory includes the construction and development of personal, relational, role and desired identity and is one of the methodological axes of this investigation. Political identity consists of exhibiting the competences necessary to participate efficiently in the legislative process, for example, in order to present amendments, promote a given ideology, participate in controversial discussions and manage conflicts, and, finally, gain the support of public opinion. The analysis of the creation, negotiation, maintenance, recovery and promotion of political identity is performed through the identification and description of discursive categories proposed by Van Dijk and adapted to the needs of this study. This is the second methodological axis of the investigation.
The following questions arise: which discursive strategies, used by Polish and Spanish politicians, will be communication facilitators and which will be barriers hampering communication? Which strategies show political competencies of the speaker, his or her influence in the legal EU reality through discourse?...

Relevance:

30.00%

Abstract:

The main objectives of this thesis are: to validate an improved principal components analysis (IPCA) algorithm on images; to design and simulate a digital model for image compression, face recognition and image detection using a principal components analysis (PCA) algorithm and the IPCA algorithm; to design and simulate an optical model for face recognition and object detection using the joint transform correlator (JTC); to establish detection and recognition thresholds for each model; to compare the performance of the PCA algorithm with that of the IPCA algorithm in compression, recognition, and detection; and to compare the performance of the digital model with that of the optical model in recognition and detection. MATLAB software was used for simulating the models. PCA is a technique used for identifying patterns in data and representing the data in order to highlight any similarities or differences. Identifying patterns in high-dimensional data (more than three dimensions) is difficult because the data cannot be represented graphically; PCA is therefore a powerful method for analyzing such data. IPCA is another statistical tool for identifying patterns in data. It uses information theory to improve PCA. The joint transform correlator (JTC) is an optical correlator used for synthesizing a frequency plane filter for coherent optical systems. The IPCA algorithm, in general, behaves better than the PCA algorithm in most of the applications. It is better than the PCA algorithm in image compression because it obtains higher compression, more accurate reconstruction, and faster processing speed with acceptable errors; in addition, it is better than the PCA algorithm in real-time image detection because it achieves the smallest error rate as well as remarkable speed.
On the other hand, the PCA algorithm performs better than the IPCA algorithm in face recognition because it offers an acceptable error rate, easy calculation, and a reasonable speed. Finally, in detection and recognition, the performance of the digital model is better than the performance of the optical model.
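As a rough illustration of how PCA compresses data, the sketch below projects points onto their first principal component and reconstructs them. It shows ordinary PCA only, since the abstract does not describe how IPCA modifies it; the toy data, the power-iteration solver and all function names are assumptions for the example, and the thesis itself used MATLAB rather than Python.

```python
def first_principal_component(rows, iterations=100):
    """Return the data mean and the leading eigenvector of the covariance matrix."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)] for i in range(d)]
    # Power iteration: repeated multiplication converges to the top eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


def compress(rows, mean, v):
    """Keep one score per row: its coordinate along the principal axis."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]


def reconstruct(scores, mean, v):
    """Map the one-dimensional scores back into the original space."""
    return [[mean[j] + s * v[j] for j in range(len(v))] for s in scores]


# Toy data lying close to the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
mean, axis = first_principal_component(data)
scores = compress(data, mean, axis)
restored = reconstruct(scores, mean, axis)
max_error = max(abs(a - b) for row, back in zip(data, restored) for a, b in zip(row, back))
```

Storing one score per point instead of two coordinates halves the data here; the reconstruction error stays small because almost all of the variance lies along the first component.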

Relevance:

30.00%

Abstract:

Computer simulation programs are essential tools for scientists and engineers to understand a particular system of interest. As expected, the complexity of the software increases with the depth of the model used. In addition to the exigent demands of software engineering, verification of simulation programs is especially challenging because the models represented are complex and ridden with unknowns that will be discovered by developers in an iterative process. To manage such complexity, advanced verification techniques for continually matching the intended model to the implemented model are necessary. Therefore, the main goal of this research work is to design a useful verification and validation framework that is able to identify model representation errors and is applicable to generic simulators. The framework that was developed and implemented consists of two parts. The first part is First-Order Logic Constraint Specification Language (FOLCSL) that enables users to specify the invariants of a model under consideration. From the first-order logic specification, the FOLCSL translator automatically synthesizes a verification program that reads the event trace generated by a simulator and signals whether all invariants are respected. The second part consists of mining the temporal flow of events using a newly developed representation called State Flow Temporal Analysis Graph (SFTAG). While the first part seeks an assurance of implementation correctness by checking that the model invariants hold, the second part derives an extended model of the implementation and hence enables a deeper understanding of what was implemented. The main application studied in this work is the validation of the timing behavior of micro-architecture simulators. The study includes SFTAGs generated for a wide set of benchmark programs and their analysis using several artificial intelligence algorithms. 
This work improves the computer architecture research and verification processes as shown by the case studies and experiments that have been conducted.
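The first part of the framework, checking model invariants against a simulator's event trace, can be sketched in a few lines. This is not FOLCSL or its translator: the event format (`insn`, `stage`, `cycle`), the two invariants and all function names are hypothetical, and the invariants are written by hand here rather than synthesized from a first-order logic specification.

```python
from collections import defaultdict


def stage_cycles(trace):
    """Group events by instruction id: {insn: {stage: cycle}}."""
    cycles = defaultdict(dict)
    for event in trace:
        cycles[event["insn"]][event["stage"]] = event["cycle"]
    return cycles


def fetch_before_commit(trace):
    # For every instruction i: cycle(fetch(i)) < cycle(commit(i)).
    return all(
        s["fetch"] < s["commit"]
        for s in stage_cycles(trace).values()
        if "fetch" in s and "commit" in s
    )


def commits_in_program_order(trace):
    # For all instructions i < j: cycle(commit(i)) <= cycle(commit(j)).
    commits = sorted((e["insn"], e["cycle"]) for e in trace if e["stage"] == "commit")
    return all(a[1] <= b[1] for a, b in zip(commits, commits[1:]))


INVARIANTS = {
    "fetch_before_commit": fetch_before_commit,
    "commits_in_program_order": commits_in_program_order,
}


def violated(trace, invariants=INVARIANTS):
    """Return the names of the invariants that the trace does not respect."""
    return [name for name, holds in invariants.items() if not holds(trace)]


# Hypothetical traces from a micro-architecture simulator.
good_trace = [
    {"insn": 1, "stage": "fetch", "cycle": 0},
    {"insn": 2, "stage": "fetch", "cycle": 1},
    {"insn": 1, "stage": "commit", "cycle": 4},
    {"insn": 2, "stage": "commit", "cycle": 5},
]
bad_trace = good_trace[:3] + [{"insn": 2, "stage": "commit", "cycle": 3}]
```

Calling `violated(good_trace)` returns an empty list, while `violated(bad_trace)` reports the out-of-order commit.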

Relevance:

30.00%

Abstract:

Modern power networks incorporate communications and information technology infrastructure into the electrical power system to create a smart grid in terms of control and operation. The smart grid enables real-time communication and control between consumers and utility companies, allowing suppliers to optimize energy usage based on price preference and system technical issues. The smart grid design aims to provide overall power system monitoring and to create protection and control strategies that maintain system performance, stability and security. This dissertation contributed to the development of a unique and novel smart grid test-bed laboratory with integrated monitoring, protection and control systems. This test-bed was used as a platform to test the smart grid operational ideas developed here. The implementation of this system in real-time software creates an environment for studying, implementing and verifying the novel control and protection schemes developed in this dissertation. Phasor measurement techniques were developed using the available Data Acquisition (DAQ) devices in order to monitor all points in the power system in real time. This provides a practical view of system parameter changes, abnormal system conditions, and system stability and security information. These developments provide valuable measurements for power system operators in energy control centers. Phasor measurement technology is an excellent solution for improving system planning, operation and energy trading, in addition to enabling advanced applications in Wide Area Monitoring, Protection and Control (WAMPAC). Moreover, a virtual protection system was developed and implemented in the smart grid laboratory with integrated functionality for wide area applications. Experiments and procedures were developed in the system in order to detect abnormal system conditions and apply proper remedies to heal the system.
A DC microgrid design was developed and integrated into the AC system with appropriate control capability. This system represents realistic hybrid AC/DC microgrid connectivity to the AC side and is used to study how such an architecture can help remedy abnormal system conditions during operation. In addition, this dissertation explored the challenges and feasibility of implementing real-time system analysis features in order to monitor system security and stability measures. These indices are measured experimentally during the operation of the developed hybrid AC/DC microgrids. Furthermore, a real-time optimal power flow system was implemented to optimally manage the power sharing between AC generators and DC side resources. A study of a real-time energy management algorithm in hybrid microgrids was performed to evaluate the effects of using energy storage resources and their use in mitigating heavy load impacts on system stability and operational security.

Relevance:

30.00%

Abstract:

Physiological signals, which are controlled by the autonomic nervous system (ANS), could be used to detect the affective state of computer users and therefore find applications in medicine and engineering. The Pupil Diameter (PD) seems to provide a strong indication of the affective state, as found by previous research, but it has not yet been fully investigated. In this study, new approaches based on monitoring and processing the PD signal for off-line and on-line affective assessment (“relaxation” vs. “stress”) are proposed. Wavelet denoising and Kalman filtering methods are first used to remove abrupt changes in the raw PD signal. Then three features (PDmean, PDmax and PDWalsh) are extracted from the preprocessed PD signal for the affective state classification. In order to select more relevant and reliable physiological data for further analysis, two types of data selection methods are applied, which are based on the paired t-test and subject self-evaluation, respectively. In addition, five different kinds of classifiers are applied to the selected data, achieving average accuracies of up to 86.43% and 87.20%, respectively. Finally, the receiver operating characteristic (ROC) curve is utilized to investigate the discriminating potential of each individual feature by evaluating the area under the ROC curve, which reaches values above 0.90. For the on-line affective assessment, a hard threshold is implemented first in order to remove the eye blinks from the PD signal, and then a moving average window is utilized to obtain the representative value PDr for every one-second time interval of PD. The on-line affective assessment algorithm has three main steps: preparation, feature-based decision voting and affective determination.
The final results show that the accuracies are 72.30% and 73.55% for the data subsets, which were respectively chosen using two types of data selection methods (paired t-test and subject self-evaluation). In order to further analyze the efficiency of affective recognition through the PD signal, the Galvanic Skin Response (GSR) was also monitored and processed. The highest affective assessment classification rate obtained from GSR processing is only 63.57% (based on the off-line processing algorithm). The overall results confirm that the PD signal should be considered as one of the most powerful physiological signals to involve in future automated real-time affective recognition systems, especially for detecting the “relaxation” vs. “stress” states.
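The on-line preprocessing described in this abstract, a hard threshold to discard eye blinks followed by a representative value PDr per one-second interval, can be sketched as below. The sampling rate, the threshold value, the assumption that blinks show up as near-zero readings and the function names are all hypothetical; the abstract does not specify them.

```python
def remove_blinks(samples, min_diameter):
    """Hard threshold: drop readings too small to be a real pupil diameter."""
    return [s for s in samples if s >= min_diameter]


def representative_values(samples, rate_hz, min_diameter):
    """Return one averaged value (PDr) per one-second interval.

    An interval made up entirely of blink samples yields None.
    """
    values = []
    for start in range(0, len(samples), rate_hz):
        window = remove_blinks(samples[start:start + rate_hz], min_diameter)
        values.append(sum(window) / len(window) if window else None)
    return values


# Hypothetical PD readings in millimetres, sampled at 4 Hz; 0.0 marks a blink.
pd_signal = [4.0, 4.2, 0.0, 4.4, 5.0, 0.0, 0.0, 5.2, 0.0, 0.0, 0.0, 0.0]
pdr = representative_values(pd_signal, rate_hz=4, min_diameter=1.0)
```

Averaging only the samples that survive the threshold keeps a blink from dragging the one-second value toward zero.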