975 resultados para speech interference level
Resumo:
Industrial ecology is an important field of sustainability science. It can be applied to study environmental problems in a policy relevant manner. Industrial ecology uses ecosystem analogy; it aims at closing the loop of materials and substances and at the same time reducing resource consumption and environmental emissions. Emissions from human activities are related to human interference in material cycles. Carbon (C), nitrogen (N) and phosphorus (P) are essential elements for all living organisms, but in excess have negative environmental impacts, such as climate change (CO2, CH4 N2O), acidification (NOx) and eutrophication (N, P). Several indirect macro-level drivers affect emissions change. Population and affluence (GDP/capita) often act as upward drivers for emissions. Technology, as emissions per service used, and consumption, as economic intensity of use, may act as drivers resulting in a reduction in emissions. In addition, the development of country-specific emissions is affected by international trade. The aim of this study was to analyse changes in emissions as affected by macro-level drivers in different European case studies. ImPACT decomposition analysis (IPAT identity) was applied as a method in papers I III. The macro-level perspective was applied to evaluate CO2 emission reduction targets (paper II) and the sharing of greenhouse gas emission reduction targets (paper IV) in the European Union (EU27) up to the year 2020. Data for the study were mainly gathered from official statistics. In all cases, the results were discussed from an environmental policy perspective. The development of nitrogen oxide (NOx) emissions was analysed in the Finnish energy sector during a long time period, 1950 2003 (paper I). Finnish emissions of NOx began to decrease in the 1980s as the progress in technology in terms of NOx/energy curbed the impact of the growth in affluence and population. Carbon dioxide (CO2) emissions related to energy use during 1993 2004 (paper II) were analysed by country and region within the European Union. Considering energy-based CO2 emissions in the European Union, dematerialization and decarbonisation did occur, but not sufficiently to offset population growth and the rapidly increasing affluence during 1993 2004. The development of nitrogen and phosphorus load from aquaculture in relation to salmonid consumption in Finland during 1980 2007 was examined, including international trade in the analysis (paper III). A regional environmental issue, eutrophication of the Baltic Sea, and a marginal, yet locally important source of nutrients was used as a case. Nutrient emissions from Finnish aquaculture decreased from the 1990s onwards: although population, affluence and salmonid consumption steadily increased, aquaculture technology improved and the relative share of imported salmonids increased. According to the sustainability challenge in industrial ecology, the environmental impact of the growing population size and affluence should be compensated by improvements in technology (emissions/service used) and with dematerialisation. In the studied cases, the emission intensity of energy production could be lowered for NOx by cleaning the exhaust gases. Reorganization of the structure of energy production as well as technological innovations will be essential in lowering the emissions of both CO2 and NOx. Regarding the intensity of energy use, making the combustion of fuels more efficient and reducing energy use are essential. In reducing nutrient emissions from Finnish aquaculture to the Baltic Sea (paper III) through technology, limits of biological and physical properties of cultured fish, among others, will eventually be faced. Regarding consumption, salmonids are preferred to many other protein sources. Regarding trade, increasing the proportion of imports will outsource the impacts. Besides improving technology and dematerialization, other viewpoints may also be needed. Reducing the total amount of nutrients cycling in energy systems and eventually contributing to NOx emissions needs to be emphasized. Considering aquaculture emissions, nutrient cycles can be partly closed through using local fish as feed replacing imported feed. In particular, the reduction of CO2 emissions in the future is a very challenging task when considering the necessary rates of dematerialisation and decarbonisation (paper II). Climate change mitigation may have to focus on other greenhouse gases than CO2 and on the potential role of biomass as a carbon sink, among others. The global population is growing and scaling up the environmental impact. Population issues and growing affluence must be considered when discussing emission reductions. Climate policy has only very recently had an influence on emissions, and strong actions are now called for climate change mitigation. Environmental policies in general must cover all the regions related to production and impacts in order to avoid outsourcing of emissions and leakage effects. The macro-level drivers affecting changes in emissions can be identified with the ImPACT framework. Statistics for generally known macro-indicators are currently relatively well available for different countries, and the method is transparent. In the papers included in this study, a similar method was successfully applied in different types of case studies. Using transparent macro-level figures and a simple top-down approach are also appropriate in evaluating and setting international emission reduction targets, as demonstrated in papers II and IV. The projected rates of population and affluence growth are especially worth consideration in setting targets. However, sensitivities in calculations must be carefully acknowledged. In the basic form of the ImPACT model, the economic intensity of consumption and emission intensity of use are included. In seeking to examine consumption but also international trade in more detail, imports were included in paper III. This example demonstrates well how outsourcing of production influences domestic emissions. Country-specific production-based emissions have often been used in similar decomposition analyses. Nevertheless, trade-related issues must not be ignored.
Resumo:
Speech has both auditory and visual components (heard speech sounds and seen articulatory gestures). During all perception, selective attention facilitates efficient information processing and enables concentration on high-priority stimuli. Auditory and visual sensory systems interact at multiple processing levels during speech perception and, further, the classical motor speech regions seem also to participate in speech perception. Auditory, visual, and motor-articulatory processes may thus work in parallel during speech perception, their use possibly depending on the information available and the individual characteristics of the observer. Because of their subtle speech perception difficulties possibly stemming from disturbances at elemental levels of sensory processing, dyslexic readers may rely more on motor-articulatory speech perception strategies than do fluent readers. This thesis aimed to investigate the neural mechanisms of speech perception and selective attention in fluent and dyslexic readers. We conducted four functional magnetic resonance imaging experiments, during which subjects perceived articulatory gestures, speech sounds, and other auditory and visual stimuli. Gradient echo-planar images depicting blood oxygenation level-dependent contrast were acquired during stimulus presentation to indirectly measure brain hemodynamic activation. Lip-reading activated the primary auditory cortex, and selective attention to visual speech gestures enhanced activity within the left secondary auditory cortex. Attention to non-speech sounds enhanced auditory cortex activity bilaterally; this effect showed modulation by sound presentation rate. A comparison between fluent and dyslexic readers' brain hemodynamic activity during audiovisual speech perception revealed stronger activation of predominantly motor speech areas in dyslexic readers during a contrast test that allowed exploration of the processing of phonetic features extracted from auditory and visual speech. The results show that visual speech perception modulates hemodynamic activity within auditory cortex areas once considered unimodal, and suggest that the left secondary auditory cortex specifically participates in extracting the linguistic content of seen articulatory gestures. They are strong evidence for the importance of attention as a modulator of auditory cortex function during both sound processing and visual speech perception, and point out the nature of attention as an interactive process (influenced by stimulus-driven effects). Further, they suggest heightened reliance on motor-articulatory and visual speech perception strategies among dyslexic readers, possibly compensating for their auditory speech perception difficulties.
Resumo:
A study has been carried out on the non-specific interference due to serum in the avidin biotin micro-ELISA for monkey chorionic gonadotropin. Results suggest that it is not due to any proteolytic activity in the serum, but immunoglobulin or associated factors interfering at the level of antigen-antibody interaction. This interference was eliminated by heating samples at 60°C for 30 min.
Resumo:
Higher level of inversion is achieved with a less number of switches in the proposed scheme. The scheme proposes a five-level inverter for an open-end winding induction motor which uses only two DC-link rectifiers of voltage rating of Vdc/4, a neutral-point clamped (NPC) three-level inverter and a two-level inverter. Even though the two-level inverter is connected to the high-voltage side, it is always in square-wave operation. Since the two-level inverter is not switching in a pulse width modulated fashion and the magnitude of switching transient is only half compared to the convention three-level NPC inverter, the switching losses and electromagnetic interference is not so high. The scheme is experimentally verified on a 2.5 kW induction machine.
Resumo:
This paper considers the high-rate performance of source coding for noisy discrete symmetric channels with random index assignment (IA). Accurate analytical models are developed to characterize the expected distortion performance of vector quantization (VQ) for a large class of distortion measures. It is shown that when the point density is continuous, the distortion can be approximated as the sum of the source quantization distortion and the channel-error induced distortion. Expressions are also derived for the continuous point density that minimizes the expected distortion. Next, for the case of mean squared error distortion, a more accurate analytical model for the distortion is derived by allowing the point density to have a singular component. The extent of the singularity is also characterized. These results provide analytical models for the expected distortion performance of both conventional VQ as well as for channel-optimized VQ. As a practical example, compression of the linear predictive coding parameters in the wideband speech spectrum is considered, with the log spectral distortion as performance metric. The theory is able to correctly predict the channel error rate that is permissible for operation at a particular level of distortion.
Resumo:
This work derives inner and outer bounds on the generalized degrees of freedom (GDOF) of the K-user symmetric MIMO Gaussian interference channel. For the inner bound, an achievable GDOF is derived by employing a combination of treating interference as noise, zero-forcing at the receivers, interference alignment (IA), and extending the Han-Kobayashi (HK) scheme to K users, depending on the number of antennas and the INR/SNR level. An outer bound on the GDOF is derived, using a combination of the notion of cooperation and providing side information to the receivers. Several interesting conclusions are drawn from the bounds. For example, in terms of the achievable GDOF in the weak interference regime, when the number of transmit antennas (M) is equal to the number of receive antennas (N), treating interference as noise performs the same as the HK scheme and is GDOF optimal. For K >; N/M+1, a combination of the HK and IA schemes performs the best among the schemes considered. However, for N/M <; K ≤ N/M+1, the HK scheme is found to be GDOF optimal.
Resumo:
Accurately characterizing the time-varying interference caused to the primary users is essential in ensuring a successful deployment of cognitive radios (CR). We show that the aggregate interference at the primary receiver (PU-Rx) from multiple, randomly located cognitive users (CUs) is well modeled as a shifted lognormal random process, which is more accurate than the lognormal and the Gaussian process models considered in the literature, even for a relatively dense deployment of CUs. It also compares favorably with the asymptotically exact stable and symmetric truncated stable distribution models, except at high CU densities. Our model accounts for the effect of imperfect spectrum sensing, which depends on path-loss, shadowing, and small-scale fading of the link from the primary transmitter to the CU; the interweave and underlay modes or CR operation, which determine the transmit powers of the CUs; and time-correlated shadowing and fading of the links from the CUs to the PU-Rx. It leads to expressions for the probability distribution function, level crossing rate, and average exceedance duration. The impact of cooperative spectrum sensing is also characterized. We validate the model by applying it to redesign the primary exclusive zone to account for the time-varying nature of interference.
Resumo:
The K-user multiple input multiple output (MIMO) Gaussian symmetric interference channel where each transmitter has M antennas and each receiver has N antennas is studied from a generalized degrees of freedom (GDOF) perspective. An inner bound on the GDOF is derived using a combination of techniques such as treating interference as noise, zero forcing (ZF) at the receivers, interference alignment (IA), and extending the Han-Kobayashi (HK) scheme to K users, as a function of the number of antennas and the log INR/log SNR level. Several interesting conclusions are drawn from the derived bounds. It is shown that when K > N/M + 1, a combination of the HK and IA schemes performs the best among the schemes considered. When N/M < K <= N/M + 1, the HK-scheme outperforms other schemes and is found to be GDOF optimal in many cases. In addition, when the SNR and INR are at the same level, ZF-receiving and the HK-scheme have the same GDOF performance.
Resumo:
We propose apractical, feature-level and score-level fusion approach by combining acoustic and estimated articulatory information for both text independent and text dependent speaker verification. From a practical point of view, we study how to improve speaker verification performance by combining dynamic articulatory information with the conventional acoustic features. On text independent speaker verification, we find that concatenating articulatory features obtained from measured speech production data with conventional Mel-frequency cepstral coefficients (MFCCs) improves the performance dramatically. However, since directly measuring articulatory data is not feasible in many real world applications, we also experiment with estimated articulatory features obtained through acoustic-to-articulatory inversion. We explore both feature level and score level fusion methods and find that the overall system performance is significantly enhanced even with estimated articulatory features. Such a performance boost could be due to the inter-speaker variation information embedded in the estimated articulatory features. Since the dynamics of articulation contain important information, we included inverted articulatory trajectories in text dependent speaker verification. We demonstrate that the articulatory constraints introduced by inverted articulatory features help to reject wrong password trials and improve the performance after score level fusion. We evaluate the proposed methods on the X-ray Microbeam database and the RSR 2015 database, respectively, for the aforementioned two tasks. Experimental results show that we achieve more than 15% relative equal error rate reduction for both speaker verification tasks. (C) 2015 Elsevier Ltd. All rights reserved.
Resumo:
Acoustic feature based speech (syllable) rate estimation and syllable nuclei detection are important problems in automatic speech recognition (ASR), computer assisted language learning (CALL) and fluency analysis. A typical solution for both the problems consists of two stages. The first stage involves computing a short-time feature contour such that most of the peaks of the contour correspond to the syllabic nuclei. In the second stage, the peaks corresponding to the syllable nuclei are detected. In this work, instead of the peak detection, we perform a mode-shape classification, which is formulated as a supervised binary classification problem - mode-shapes representing the syllabic nuclei as one class and remaining as the other. We use the temporal correlation and selected sub-band correlation (TCSSBC) feature contour and the mode-shapes in the TCSSBC feature contour are converted into a set of feature vectors using an interpolation technique. A support vector machine classifier is used for the classification. Experiments are performed separately using Switchboard, TIMIT and CTIMIT corpora in a five-fold cross validation setup. The average correlation coefficients for the syllable rate estimation turn out to be 0.6761, 0.6928 and 0.3604 for three corpora respectively, which outperform those obtained by the best of the existing peak detection techniques. Similarly, the average F-scores (syllable level) for the syllable nuclei detection are 0.8917, 0.8200 and 0.7637 for three corpora respectively. (C) 2016 Elsevier B.V. All rights reserved.
Discriminative language model adaptation for Mandarin broadcast speech transcription and translation
Resumo:
This paper investigates unsupervised test-time adaptation of language models (LM) using discriminative methods for a Mandarin broadcast speech transcription and translation task. A standard approach to adapt interpolated language models to is to optimize the component weights by minimizing the perplexity on supervision data. This is a widely made approximation for language modeling in automatic speech recognition (ASR) systems. For speech translation tasks, it is unclear whether a strong correlation still exists between perplexity and various forms of error cost functions in recognition and translation stages. The proposed minimum Bayes risk (MBR) based approach provides a flexible framework for unsupervised LM adaptation. It generalizes to a variety of forms of recognition and translation error metrics. LM adaptation is performed at the audio document level using either the character error rate (CER), or translation edit rate (TER) as the cost function. An efficient parameter estimation scheme using the extended Baum-Welch (EBW) algorithm is proposed. Experimental results on a state-of-the-art speech recognition and translation system are presented. The MBR adapted language models gave the best recognition and translation performance and reduced the TER score by up to 0.54% absolute. © 2007 IEEE.
Resumo:
The Chinese language is based on characters which are syllabic in nature. Since languages have syllabotactic rules which govern the construction of syllables and their allowed sequences, Chinese character sequence models can be used as a first level approximation of allowed syllable sequences. N-gram character sequence models were trained on 4.3 billion characters. Characters are used as a first level recognition unit with multiple pronunciations per character. For comparison the CU-HTK Mandarin word based system was used to recognize words which were then converted to character sequences. The character only system error rates for one best recognition were slightly worse than word based character recognition. However combining the two systems using log-linear combination gives better results than either system separately. An equally weighted combination gave consistent CER gains of 0.1-0.2% absolute over the word based standard system. Copyright © 2009 ISCA.
Resumo:
We analyse the physical origin of population inversion via continuous wave two-colour coherent excitation in three-level systems by dressing the inverted transition. Two different mechanisms are identified as being responsible for the population inversion. For V-configured systems and cascade (E) configured systems with inversion on the lower transition, the responsible mechanism is the selective trapping of dressed states, and the population inversion approaches the ideal value of 1. For Lambda-configured systems and Xi-configured systems with inversion on the upper transition, population inversion is based on the selective excitation of dressed states, with the population inversion tending towards 0.5. As the essential difference between these two mechanisms, the selective trapping of dressed states occurs in systems with strong decay into dressed states while the selective excitation appears in systems with strong decay out of dressed states.
Resumo:
A five-level tripod scheme is proposed for obtaining a high efficiency four-wave-mixing (FWM) process. The existence of double-dark resonances leads to a strong modification of the absorption and dispersion properties against a pump wave at two transparency windows. We show that both of them can be used to open the four-wave mixing channel and produce efficient mixing waves. In particular, higher FWM efficiency is always produced at the transparent window corresponding to the relatively weak-coupling field. By manipulating the intensity of the two coupling fields, the conversion efficiency of FWM can be controlled.