924 results for acoustic speech recognition system
Abstract:
In drilling processes, and especially in deep-hole drilling, a monitoring system and control over mechanical parameters (e.g. force, torque, vibration and acoustic emission) are essential. The main focus of this thesis is to study the characteristics of the deep-hole drilling process and to optimize the monitoring system used to control it. Vibration is considered a major problem area in deep-hole drilling and often leads to breakage of the drill; it is therefore studied through vibration analysis and optimization of the workpiece fixture using the finite element method, and suggestions are presented. Based on a study of the present monitoring system and a survey of new sensor products, modifications and recommendations are proposed to optimize the present monitoring system for deep-hole drilling research and measurements.
Abstract:
In order to spare functional areas during the removal of brain tumours, electrical stimulation mapping was used in 90 patients (77 in the left hemisphere and 13 in the right; 2754 cortical sites tested). Language functions were studied with a special focus on comprehension of auditory and visual words and the semantic system. In addition to naming, patients were asked to perform pointing tasks from auditory and visual stimuli (using sets of 4 different images controlled for familiarity), and also auditory object (sound recognition) and Token test tasks. Ninety-two auditory comprehension interference sites were observed. We found that the process of auditory comprehension involved a few, fine-grained, sub-centimetre cortical territories. Early stages of speech comprehension seem to relate to two posterior regions in the left superior temporal gyrus. Downstream lexical-semantic speech processing and sound analysis involved 2 pathways, along the anterior part of the left superior temporal gyrus, and posteriorly around the supramarginal and middle temporal gyri. Electrostimulation experimentally dissociated perceptual consciousness attached to speech comprehension. The initial word discrimination process can be considered as an "automatic" stage, the attention feedback not being impaired by stimulation as would be the case at the lexical-semantic stage. Multimodal organization of the superior temporal gyrus was also detected since some neurones could be involved in comprehension of visual material and naming. These findings demonstrate a fine graded, sub-centimetre, cortical representation of speech comprehension processing mainly in the left superior temporal gyrus and are in line with those described in dual stream models of language comprehension processing.
Abstract:
We report two unrelated patients with a multisystem disease involving liver, eye, immune system, connective tissue, and bone, caused by biallelic mutations in the neuroblastoma amplified sequence (NBAS) gene. Both presented as infants with recurrent episodes triggered by fever with vomiting, dehydration, and elevated transaminases. They had frequent infections, hypogammaglobulinemia, reduced natural killer cells, and the Pelger-Huët anomaly of their granulocytes. Their facial features were similar with a pointed chin and proptosis; loose skin and reduced subcutaneous fat gave them a progeroid appearance. Skeletal features included short stature, slender bones, epiphyseal dysplasia with multiple phalangeal pseudo-epiphyses, and small C1-C2 vertebrae causing cervical instability and myelopathy. Retinal dystrophy and optic atrophy were present in one patient. NBAS is a component of the syntaxin-18 complex and is involved in nonsense-mediated mRNA decay control. Putative loss-of-function mutations in NBAS are already known to cause disease in humans. A specific founder mutation has been associated with short stature, optic nerve atrophy and Pelger-Huët anomaly of granulocytes (SOPH) in the Siberian Yakut population. A more recent report associates NBAS mutations with recurrent acute liver failure in infancy in a group of patients of European descent. Our observations indicate that the phenotypic spectrum of NBAS deficiency is wider than previously known and includes skeletal, hepatic, metabolic, and immunologic aspects. Early recognition of the skeletal phenotype is important for preventive management of cervical instability. © 2015 Wiley Periodicals, Inc.
Abstract:
INTRODUCTION: Dispatch-assisted cardiopulmonary resuscitation (DA-CPR) plays a key role in out-of-hospital cardiac arrests. We sought to measure dispatchers' performances in a criteria-based system in recognizing cardiac arrest and delivering DA-CPR. Our secondary purpose was to identify the factors that hampered dispatchers' identification of cardiac arrests, the factors that prevented them from proposing DA-CPR, and the factors that prevented bystanders from performing CPR. METHODS AND RESULTS: We reviewed dispatch recordings for 1254 out-of-hospital cardiac arrests occurring between January 1, 2011 and December 31, 2013. Dispatchers correctly identified cardiac arrests in 71% of the reviewed cases and 84% of the cases in which they were able to assess for patient consciousness and breathing. The median time to recognition of the arrest was 60s. The median time to start chest compression was 220s. CONCLUSIONS: This study demonstrates that performances from a criteria-based dispatch system can be similar to those from a medical-priority dispatch system regarding out-of-hospital cardiac arrest (OHCA) time recognition and DA-CPR delivery. Agonal breathing recognition remains the weakest link in this sensitive task in both systems. It is of prime importance that all dispatch centers tend not only to implement DA-CPR but also to have tools to help them reach this objective, as today it should be mandatory to offer this service to the community. In order to improve benchmarking opportunities, we completed previously proposed performance standards as propositions.
Abstract:
Top-down contextual influences play a major part in speech understanding, especially in hearing-impaired patients with deteriorated auditory input. Those influences are most obvious in difficult listening situations, such as listening to sentences in noise but can also be observed at the word level under more favorable conditions, as in one of the most commonly used tasks in audiology, i.e., repeating isolated words in silence. This study aimed to explore the role of top-down contextual influences and their dependence on lexical factors and patient-specific factors using standard clinical linguistic material. Spondaic word perception was tested in 160 hearing-impaired patients aged 23-88 years with a four-frequency average pure-tone threshold ranging from 21 to 88 dB HL. Sixty spondaic words were randomly presented at a level adjusted to correspond to a speech perception score ranging between 40 and 70% of the performance intensity function obtained using monosyllabic words. Phoneme and whole-word recognition scores were used to calculate two context-influence indices (the j factor and the ratio of word scores to phonemic scores) and were correlated with linguistic factors, such as the phonological neighborhood density and several indices of word occurrence frequencies. Contextual influence was greater for spondaic words than in similar studies using monosyllabic words, with an overall j factor of 2.07 (SD = 0.5). For both indices, context use decreased with increasing hearing loss once the average hearing loss exceeded 55 dB HL. In right-handed patients, significantly greater context influence was observed for words presented in the right ears than for words presented in the left, especially in patients with many years of education. 
The correlations between raw word scores (and context influence indices) and word occurrence frequencies showed a significant age-dependent effect, with a stronger correlation between perception scores and word occurrence frequencies when the occurrence frequencies were based on the years corresponding to the patients' youth, showing a "historic" word frequency effect. This effect was still observed for patients with few years of formal education, but recent occurrence frequencies based on current word exposure had a stronger influence for those patients, especially for younger ones.
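The two context-influence indices named in this abstract can be computed directly from recognition scores. The following is a minimal Python sketch, assuming the usual definition of the j factor as the exponent relating whole-word to phoneme recognition probability (p_word = p_phoneme ** j). The function names and the example scores are hypothetical and are not taken from the study.

```python
import math


def j_factor(word_score: float, phoneme_score: float) -> float:
    """Return the exponent j such that word_score == phoneme_score ** j.

    Both scores are proportions strictly between 0 and 1.
    """
    if not (0.0 < word_score < 1.0 and 0.0 < phoneme_score < 1.0):
        raise ValueError("scores must be proportions strictly between 0 and 1")
    return math.log(word_score) / math.log(phoneme_score)


def word_to_phoneme_ratio(word_score: float, phoneme_score: float) -> float:
    """Second index from the abstract: ratio of word score to phoneme score."""
    if phoneme_score <= 0.0:
        raise ValueError("phoneme score must be positive")
    return word_score / phoneme_score


# Hypothetical scores, for illustration only.
j = j_factor(word_score=0.49, phoneme_score=0.70)
ratio = word_to_phoneme_ratio(word_score=0.49, phoneme_score=0.70)
```

Under this definition, a j equal to the number of phonemes in the word means the phonemes were perceived independently of one another, and smaller values indicate stronger use of context.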
Abstract:
This Master's thesis addresses the design and implementation of the optical character recognition (OCR) system for a mobile device working on the Symbian operating system. The developed OCR system, named OCRCapriccio, emphasizes the modularity, effective extensibility and reuse. The system consists of two parts which are the graphical user interface and the OCR engine that was implemented as a plug-in. In fact, the plug-in includes two implementations of the OCR engine for enabling two types of recognition: the bitmap comparison based recognition and statistical recognition. The implementation results have shown that the approach based on bitmap comparison is more suitable for the Symbian environment because of its nature. Although the current implementation of bitmap comparison is lacking in accuracy, further development should be done in its direction. The biggest challenges of this work were related to developing an OCR scheme that would be suitable for Symbian OS Smartphones that have limited computational power and restricted resources.
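The bitmap-comparison approach mentioned in this abstract can be illustrated with a small sketch. This is not the OCRCapriccio implementation, which the abstract does not describe in detail. It assumes that each glyph has already been segmented and scaled to a fixed-size binary bitmap, and that recognition picks the stored template with the fewest differing pixels. The templates and names are hypothetical.

```python
from typing import Dict, List

Bitmap = List[str]  # rows of '0'/'1' characters, all bitmaps the same size


def hamming_distance(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels at which two equally sized bitmaps differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have identical dimensions")
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph: Bitmap, templates: Dict[str, Bitmap]) -> str:
    """Return the label of the template with the fewest differing pixels."""
    return min(templates, key=lambda label: hamming_distance(glyph, templates[label]))


# Hypothetical 3x3 templates, far smaller than real glyph bitmaps.
TEMPLATES: Dict[str, Bitmap] = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}

noisy_l = ["100", "110", "111"]  # an "L" with one flipped pixel
result = recognize(noisy_l, TEMPLATES)
```

A comparison of this kind needs only integer counting and no floating-point arithmetic, which is one plausible reason such an approach could suit a device with limited computational power.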
Abstract:
Today's organizations must have the ability to react to rapid changes in the market. These rapid changes cause pressure to continuously find new, efficient ways to organize work practices. Increased competition requires businesses to become more effective, to pay attention to the quality of management, and to make people understand their work's impact on the final result. The fundamentals of continuous improvement are the systematic and agile tackling of identified individual process constraints and the fact that nothing finally improves without changes. Successful continuous improvement requires management commitment, education, implementation, measurement, recognition and regeneration. These ingredients form the foundation both for breakthrough projects and for small-step ongoing improvement activities. One part of the organization's management system is the set of quality tools, which provide systematic methodologies for identifying problems, defining their root causes, finding solutions, gathering and sorting data, supporting decision making, implementing changes, and many other management tasks. Organizational change management includes processes and tools for managing the people in an organizational-level change. These tools include a structured approach, which can be used for effective transition of organizations through change. When combined with the understanding of change management of individuals, these tools provide a framework for managing people in change,
Abstract:
In this paper, we present the Melodic Analysis of Speech method (MAS) that enables us to carry out complete and objective descriptions of a language's intonation, from a phonetic (melodic) point of view as well as from a phonological point of view. It is based on the acoustic-perceptive method by Cantero (2002), which has already been used in research on prosody in different languages. In this case, we present the results of its application in Spanish and Catalan.
Abstract:
This dissertation considers the segmental durations of speech from the viewpoint of speech technology, especially speech synthesis. The idea is that better models of segmental durations lead to higher naturalness and better intelligibility. These features are the key factors for better usability and generality of synthesized speech technology. Even though the studies are based on a Finnish corpus, the approaches apply to all other languages as well. This is possible because most of the studies included in this dissertation are about universal effects taking place at utterance boundaries. The methods invented and used here are also suitable for studies of other languages. This study is based on two corpora of news reading speech and sentences read aloud. One corpus is read aloud by a 39-year-old male, whilst the other consists of several speakers in various situations. The purpose of using two corpora is twofold: it allows a comparison of the corpora and gives a broader view of the matters of interest. The dissertation begins with an overview of the phonemes and the quantity system in the Finnish language. In particular, we cover the intrinsic durations of phonemes and phoneme categories, as well as the difference in duration between short and long phonemes. The phoneme categories are presented to ease the problem of variability of speech segments. In this dissertation we cover the boundary-adjacent effects on segmental durations. In initial positions of utterances we find that there seems to be initial shortening in Finnish, but the result depends on the level of detail and on the individual phoneme. On the phoneme level we find that the shortening or lengthening only affects the very first phonemes at the beginning of an utterance. However, on average, the effect seems to shorten the whole first word on the word level. We establish the effect of final lengthening in Finnish. 
The effect in Finnish has been an open question for a long time, whilst Finnish has been the last missing piece for it to be a universal phenomenon. Final lengthening is studied from various angles and it is also shown that it is not a mere effect of prominence or an effect of speech corpus with high inter- and intra-speaker variation. The effect of final lengthening seems to extend from the final to the penultimate word. On a phoneme level it reaches a much wider area than the initial effect. We also present a normalization method suitable for corpus studies on segmental durations. The method uses an utterance-level normalization approach to capture the pattern of segmental durations within each utterance. This prevents the impact of various problematic variations within the corpora. The normalization is used in a study on final lengthening to show that the results on the effect are not caused by variation in the material. The dissertation shows an implementation and prowess of speech synthesis on a mobile platform. We find that the rule-based method of speech synthesis is a real-time software solution, but the signal generation process slows down the system beyond real time. Future aspects of speech synthesis on limited platforms are discussed. The dissertation considers ethical issues on the development of speech technology. The main focus is on the development of speech synthesis with high naturalness, but the problems and solutions are applicable to any other speech technology approaches.
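The abstract does not give the formula of its utterance-level normalization, so the following Python sketch is only one possible reading: it assumes a z-score computed over the segment durations within a single utterance. The function name and the example durations are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def normalize_utterance(durations_ms: List[float]) -> List[float]:
    """Z-score the segment durations of one utterance.

    Assumption: normalization uses the utterance's own mean and
    (population) standard deviation; the dissertation may differ.
    """
    if not durations_ms:
        return []
    mu = mean(durations_ms)
    sigma = pstdev(durations_ms)
    if sigma == 0:
        # All segments equally long: no within-utterance pattern to express.
        return [0.0 for _ in durations_ms]
    return [(d - mu) / sigma for d in durations_ms]


# The same duration pattern spoken at two different rates (hypothetical data).
slow = [80.0, 120.0, 100.0, 200.0]
fast = [40.0, 60.0, 50.0, 100.0]

slow_norm = normalize_utterance(slow)
fast_norm = normalize_utterance(fast)
```

With this choice the slow and the fast utterance yield the same normalized pattern, with the final segment standing out in both, which illustrates how a within-utterance normalization can remove speaking-rate variation while keeping effects such as final lengthening visible.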
Abstract:
Speaker diarization is the process of sorting speech according to the speaker. Diarization helps to search for and retrieve what a certain speaker uttered in a meeting. Applications of diarization systems extend to domains other than meetings, for example lectures, telephone, television, and radio. In addition, diarization enhances the performance of several speech technologies such as speaker recognition, automatic transcription, and speaker tracking. Methodologies previously used in developing diarization systems are discussed. Prior results and techniques are studied and compared. Methods such as Hidden Markov Models and Gaussian Mixture Models that are used in speaker recognition and other speech technologies are also used in speaker diarization. The objective of this thesis is to develop a speaker diarization system in the meeting domain. The experimental part of this work indicates that zero-crossing rate can be used effectively to break down the audio stream into segments, and that adaptive Gaussian Models adequately fit short audio segments. Results show that 35 Gaussian Models and an average segment length of one second are optimum values for building a diarization system for the tested data. Uniting the segments uttered by the same speaker is done in a bottom-up clustering by a new approach of categorizing the mixture weights.
Abstract:
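The use of zero-crossing rate for segmentation can be sketched as follows. This Python example is illustrative only: the abstract does not state how the thesis turns the rate into segment boundaries, so the frame length, the threshold and the rule of marking a boundary where the rate jumps between adjacent frames are all assumptions, and the signal is synthetic.

```python
from typing import List, Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_rates(signal: Sequence[float], frame_len: int) -> List[float]:
    """Zero-crossing rate of each non-overlapping frame of the signal."""
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def change_points(rates: Sequence[float], threshold: float) -> List[int]:
    """Indices of frames whose rate differs from the previous frame's
    rate by more than the threshold (assumed boundary rule)."""
    return [
        i for i in range(1, len(rates)) if abs(rates[i] - rates[i - 1]) > threshold
    ]


# Synthetic signal: 8 rapidly alternating samples, then 8 constant samples.
signal = [1.0, -1.0] * 4 + [1.0] * 8
rates = frame_rates(signal, frame_len=8)
boundaries = change_points(rates, threshold=0.5)
```

In a real system the frames would be taken from recorded audio and the resulting segments would then be modelled and clustered by speaker, as the abstract describes.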
The flow of information within modern information society has increased rapidly over the last decade. The major part of this information flow relies on the individual’s abilities to handle text or speech input. For the majority of us it presents no problems, but there are some individuals who would benefit from other means of conveying information, e.g. signed information flow. During the last decades new results from various disciplines have all pointed towards a common background and processing for sign and speech, and this was one of the key issues that I wanted to investigate further in this thesis. The basis of this thesis is firmly within speech research and that is why I wanted to design analogous test batteries for widely used speech perception tests for signers – to find out whether the results for signers would be the same as in speakers’ perception tests. One of the key findings within biology – and more precisely its effects on speech and communication research – is the mirror neuron system. That finding has enabled us to form new theories about the evolution of communication, and it all seems to converge on the hypothesis that all communication has a common core within humans. In this thesis speech and sign are discussed as equal and analogous counterparts of communication and all research methods used in speech are modified for sign. Both speech and sign are thus investigated using similar test batteries. Furthermore, both production and perception of speech and sign are studied separately. An additional framework for studying production is given by gesture research using cry sounds. Results of cry sound research are then compared to results from children acquiring sign language. These results show that individuality manifests itself from very early on in human development. Articulation in adults, both in speech and sign, is studied from two perspectives: normal production and re-learning production when the apparatus has been changed. 
Normal production is studied both in speech and sign and the effects of changed articulation are studied with regard to speech. Both these studies are done by using carrier sentences. Furthermore, sign production is studied by giving the informants the possibility of spontaneous speech. The production data from the signing informants is also used as the basis for input in the sign synthesis stimuli used in the sign perception test battery. Speech and sign perception were studied using the informants’ answers to questions using forced choice in identification and discrimination tasks. These answers were then compared across language modalities. Three different informant groups participated in the sign perception tests: native signers, sign language interpreters and Finnish adults with no knowledge of any signed language. This gave a chance to investigate which of the characteristics found in the results were due to the language per se and which were due to the changes in modality itself. As the analogous test batteries yielded similar results over different informant groups, some common threads of results could be observed. Starting from very early on in acquiring speech and sign the results were highly individual. However, the results were the same within one individual when the same test was repeated. This individuality of results followed the same patterns across different language modalities and, on some occasions, across language groups. As both modalities yield similar answers to analogous study questions, this has led us to provide methods for basic input for sign language applications, i.e. signing avatars. This has also given us answers to questions on precision of the animation and intelligibility for the users – what are the parameters that govern intelligibility of synthesised speech or sign and how precise must the animation or synthetic speech be in order for it to be intelligible. 
The results also give additional support to the well-known fact that intelligibility in fact is not the same as naturalness. In some cases, as shown within the sign perception test battery design, naturalness decreases intelligibility. This also has to be taken into consideration when designing applications. All in all, results from each of the test batteries, be they for signers or speakers, yield strikingly similar patterns, which would indicate yet further support for the common core for all human communication. Thus, we can modify and deepen the phonetic framework models for human communication based on the knowledge obtained from the results of the test batteries within this thesis.
Abstract:
The pulsed electroacoustic (PEA) method is a commonly used non-destructive technique for investigating space charges. It has been under development since the early 1980s. There is continuing interest in better understanding the influence of space charge on the reliability of solid electrical insulation under high electric field. The PEA method is widely used for space charge profiling because it is robust and relatively inexpensive. The PEA technique relies on a voltage impulse used to temporarily disturb the space charge equilibrium in a dielectric. The acoustic wave is generated by charge movement in the sample and detected by means of a piezoelectric film. The spatial distribution of the space charge is contained within the detected signal. The principle of such a system is already well established, and several kinds of setups have been constructed for different measurement needs. This thesis presents the design of a PEA measurement system as a systems engineering project. The operating principle and some recent developments are summarised. The steps of electrical and mechanical design of the instrument are discussed. A common procedure for measuring space charges is explained and applied to verify the functionality of the system. The measurement system is provided as an additional basic research tool for the Corporate Research Centre of ABB (China) Ltd. It can be used to characterise flat samples with a thickness of 0.2–0.5 mm under DC stress. The spatial resolution of the measurement is 20 μm.
Abstract:
Asthma and allergy are common diseases and their prevalence is increasing. One of the hypotheses that explains this trend is exposure to inhalable chemicals such as traffic-related air pollution. Epidemiological research supports this theory, as a correlation between environmental chemicals and allergic respiratory diseases has been found. In addition to ambient airborne particles, one may be exposed to engineered nanosized materials that are actively produced due to their favorable physico-chemical properties compared to their bulk size counterparts. On the cellular level, improper activity of T helper (Th) cells has been connected to allergic reactions. Th cells can differentiate into functionally different effector subsets, which are identified according to their characteristic cytokine profiles resulting in a specific ability to communicate with other cells. Th2 cells activate humoral immunity and stimulate eradication of extracellular pathogens. However, persistent predominance of Th2 cells is involved in the development of a number of allergic diseases. The cytokine environment at the time of antigen recognition is the major factor determining the polarization of a naïve Th cell. Th2 cell differentiation is initiated by IL4, which signals via transcription factor STAT6. Although the importance of this pathway has been evaluated in mouse studies, the signaling components involved have been largely unknown. The aim of this thesis was to identify molecules which are under the control of IL4 and STAT6 in Th cells. This was done by using system-level analysis of STAT6 target genes at genome, mRNA and protein level, resulting in identification of various genes previously not connected to Th2 cell phenotype acquisition. In the study, STAT6-mediated primary and secondary target genes were dissected from each other and detailed transcriptional kinetics of Th2 cell polarization of naïve human CD4+ T cells were collected. 
Integration of these data revealed the hierarchy of molecular events that mediates the differentiation towards Th2 cell phenotype. In addition, the results highlighted the importance of exploiting proteomics tools to complement the studies on STAT6 target genes identified through transcriptional profiling. In the last subproject, the effects of exposure to ZnO and TiO2 nanoparticles were analyzed in the Jurkat T cell line and in primary human monocyte-derived macrophages and dendritic cells to evaluate their toxicity and potential to cause inflammation. Identification of ZnO-derived gene expression showed that the same nanoparticles may elicit markedly distinctive responses in different cell types, thus underscoring the need for unbiased profiling of target genes and pathways affected. The results gave additional proof that the cellular response to nanosized ZnO is due to leached Zn2+ ions. The approach used in the ZnO and TiO2 nanoparticle study demonstrated the value of assessing nanoparticle responses through a toxicogenomics approach. The increased knowledge of Th2 cell signaling will hopefully reveal new therapeutic nodes and eventually improve our possibilities to prevent and tackle allergic inflammatory diseases.
Abstract:
This study concentrates on Product Data Management (PDM) systems and on sheet metal design features and classification. In this thesis, PDM is seen as an individual system which handles all product-related data and information. The purpose of relevant data is to move the manufacturing process forward with fewer errors. Sheet metal features give more information and value to the designed models. The possibility of implementing PDM together with sheet metal feature recognition is the core of this study. Their integration should make the design process faster and manufacturing-friendly products easier to design. The triangulation method is the basis for this research. The sections of this triangle are: a scientific literature review, interviews using the Delphi method, and the author’s experience and observations. The key findings of this study are: (1) the area of focus in the triangle (the triangle of three different points of view: business, information exchange and technical) depends on the person’s background and their role in the company; (2) the classification in the PDM system (and also in the CAD system) should be done using the materials, tools and machines that are in use in the company; (3) the design process has to be more effective because of the increase of industrial production, sheet metal blank production and the designer’s time spent on actual design; and (4) because Design For Manufacture (DFM) integration can be done with CAD programs, DFM integration with the PDM system should also be possible.
Abstract:
The target of any immunization is to activate and expand lymphocyte clones with the desired recognition specificity and the necessary effector functions. In gene, recombinant and peptide vaccines, the immunogen is a single protein or a small assembly of epitopes from antigenic proteins. Since most immune responses against protein and peptide antigens are T-cell dependent, the molecular target of such vaccines is to generate at least 50-100 complexes between MHC molecule and the antigenic peptide per antigen-presenting cell, sensitizing a T cell population of appropriate clonal size and effector characteristics. Thus, the immunobiology of antigen recognition by T cells must be taken into account when designing new generation peptide- or gene-based vaccines. Since T cell recognition is MHC-restricted, and given the wide polymorphism of the different MHC molecules, distinct epitopes may be recognized by different individuals in the population. Therefore, the issue of whether immunization will be effective in inducing a protective immune response, covering the entire target population, becomes an important question. Many pathogens have evolved molecular mechanisms to escape recognition by the immune system by variation of antigenic protein sequences. In this short review, we will discuss the several concepts related to selection of amino acid sequences to be included in DNA and peptide vaccines.