931 results for Speech-processing technologies


Relevance:

90.00%

Publisher:

Abstract:

The use of visual cues during the processing of audiovisual (AV) speech is known to be less efficient in children and adults with language difficulties, and such difficulties are known to be more prevalent in children from low-income populations. In the present study, we followed an economically diverse group of thirty-seven infants longitudinally from 6–9 months to 14–16 months of age. We used eye-tracking to examine whether individual differences in visual attention during AV processing of speech in 6–9-month-old infants, particularly when processing congruent and incongruent auditory and visual speech cues, might be indicative of their later language development. Twenty-two of these 6–9-month-old infants also participated in an event-related potential (ERP) AV task within the same experimental session. Language development was then followed up at the age of 14–16 months using two measures, the Preschool Language Scale and the Oxford Communicative Development Inventory. The results show that those infants who were less efficient in auditory speech processing at the age of 6–9 months had lower receptive language scores at 14–16 months. A correlational analysis revealed that the pattern of face scanning and ERP responses to audiovisually incongruent stimuli at 6–9 months were both significantly associated with language development at 14–16 months. These findings add to the understanding of individual differences in neural signatures of AV processing and associated looking behavior in infants.

Relevance:

90.00%

Publisher:

Abstract:

Research on audiovisual speech integration has reported high levels of individual variability, especially among young infants. In the present study we tested the hypothesis that this variability results from individual differences in the maturation of audiovisual speech processing during infancy. A developmental shift in selective attention to audiovisual speech has been demonstrated between 6 and 9 months, with an increase in the time spent looking to articulating mouths as compared to eyes (Lewkowicz & Hansen-Tift (2012) Proc. Natl Acad. Sci. USA, 109, 1431–1436; Tomalski et al. (2012) Eur. J. Dev. Psychol., 1–14). In the present study we tested whether these changes in behavioural maturational level are associated with differences in brain responses to audiovisual speech across this age range. We measured high-density event-related potentials (ERPs) in response to videos of audiovisually matching and mismatched syllables /ba/ and /ga/, and subsequently examined visual scanning of the same stimuli with eye-tracking. There were no clear age-specific changes in ERPs, but the amplitude of the audiovisual mismatch response (AVMMR) to the combination of visual /ba/ and auditory /ga/ was strongly negatively associated with looking time to the mouth in the same condition. These results have significant implications for our understanding of individual differences in neural signatures of audiovisual speech processing in infants, suggesting that they are not strictly related to chronological age but are instead associated with the maturation of looking behaviour, and develop at individual rates in the second half of the first year of life.

Relevance:

90.00%

Publisher:

Abstract:

Speech processing and the consequent recognition are important areas of digital signal processing, since speech allows people to communicate more naturally and efficiently. In this work, a speech recognition system is developed for recognizing digits in Malayalam. Recognizing speech requires features to be extracted from it, so the feature extraction method plays an important role in speech recognition. Here, front-end processing for extracting the features is performed using two wavelet-based methods, namely Discrete Wavelet Transforms (DWT) and Wavelet Packet Decomposition (WPD). A Naive Bayes classifier is used for classification. With the Naive Bayes classifier, DWT produced a recognition accuracy of 83.5% and WPD produced an accuracy of 80.7%. This paper aims to devise a new feature extraction method that improves the recognition accuracy. To that end, a new method called Discrete Wavelet Packet Decomposition (DWPD) is introduced, which utilizes the hybrid features of both DWT and WPD. The performance of this new approach is evaluated, and it produced an improved recognition accuracy of 86.2% with the Naive Bayes classifier.
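
To make the DWT front end described above more concrete, the following is a minimal sketch of wavelet-based feature extraction. It assumes a Haar wavelet and sub-band energies as features; neither choice is stated in the abstract, and the function names `haar_dwt_step` and `dwt_energy_features` are hypothetical. In a DWT only the approximation band is split again at each level, whereas a wavelet packet decomposition would also split the detail bands. The hybrid DWPD method is not specified in the abstract and is not reproduced here.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample (an assumption).
        signal.append(signal[-1])
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def dwt_energy_features(signal, levels=3):
    """Energy of each detail band (finest first), then of the final approximation."""
    approx = list(signal)
    features = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


if __name__ == "__main__":
    # A toy "frame"; real input would be sampled speech.
    print(dwt_energy_features([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]))
```

A feature vector of this kind, one per utterance or per frame, is what a classifier such as Naive Bayes would then be trained on.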

Relevance:

90.00%

Publisher:

Abstract:

Speech is the most natural means of communication among human beings, and speech processing and recognition have been intensive areas of research for the last five decades. Since speech recognition is a pattern recognition problem, classification is an important part of any speech recognition system. In this work, a speech recognition system is developed for recognizing speaker-independent spoken digits in Malayalam. Voice signals are sampled directly from the microphone. The proposed method is implemented for 1000 speakers uttering 10 digits each. Since the speech signals are affected by background noise, the noise is removed from them using a wavelet denoising method based on soft thresholding. The features are then extracted from the signals using Discrete Wavelet Transforms (DWT), which are well suited to processing non-stationary signals like speech because of their multi-resolution, multi-scale analysis characteristics. Speech recognition is a multiclass classification problem, so the resulting feature vectors are classified using three classifiers capable of handling multiple classes, namely Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Naive Bayes classifiers. During the classification stage, the classifiers are trained on feature vectors of known patterns and then tested on the test data set. The performance of each classifier is evaluated based on recognition accuracy. All three methods produced good recognition accuracy: DWT with ANN produced a recognition accuracy of 89%, DWT with SVM produced 86.6%, and DWT with Naive Bayes produced 83.5%. ANN is found to be the best of the three methods.
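
The soft-thresholding step mentioned above can be sketched as follows. Soft thresholding shrinks every wavelet coefficient toward zero by a threshold and zeroes those whose magnitude falls below it. The abstract does not say how the threshold is chosen, so the `universal_threshold` helper below (the widely used sigma * sqrt(2 ln N) rule with a median-based noise estimate) is an assumption for illustration, and both function names are hypothetical.

```python
import math
import statistics


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`, keeping its sign."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def universal_threshold(detail_coeffs):
    """Universal threshold sigma * sqrt(2 ln N), an assumed choice.

    sigma is estimated from the median absolute value of the detail
    coefficients divided by 0.6745.
    """
    n = len(detail_coeffs)
    sigma = statistics.median(abs(c) for c in detail_coeffs) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))


if __name__ == "__main__":
    detail = [3.0, -2.0, 0.5, -0.1]
    print(soft_threshold(detail, 1.0))
```

In a full denoising pipeline the thresholded detail coefficients would be passed to the inverse wavelet transform to reconstruct the cleaned signal.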

Relevance:

90.00%

Publisher:

Abstract:

The date palm Phoenix dactylifera has played an important role in people's day-to-day life for the last 7000 years. Today, worldwide production, utilization and industrialization of dates are continuously increasing, since date fruits have earned great importance in human nutrition owing to their rich content of essential nutrients. Tons of date palm fruit wastes are discarded daily by the date processing industries, leading to environmental problems. Wastes such as date pits represent an average of 10% of the date fruits. Thus, there is an urgent need to find suitable applications for this waste. In spite of several studies on date palm cultivation, date utilization and the scope for using date fruit in therapeutic applications, very few reviews are available, and they are limited to the chemistry and pharmacology of the date fruits and to the phytochemical composition, nutritional significance and potential health benefits of date fruit consumption. In this context, the present review discusses the prospects of valorizing these date fruit processing by-products and wastes by employing fermentation and enzyme processing technologies, towards total utilization of this valuable commodity for the production of biofuels, biopolymers, biosurfactants, organic acids, antibiotics, industrial enzymes and other possible industrial chemicals.

Relevance:

90.00%

Publisher:

Abstract:

Cooking banana is one of the most important crops in Uganda; it is a staple food and a source of household income in rural areas. The most common cooking banana is locally called matooke, a Musa sp. triploid acuminate genome group (AAA-EAHB). It is perishable and traded in fresh form, leading to very high postharvest losses (22-45%). This is attributed to a non-uniform level of harvest maturity, poor handling, bulk transportation and a lack of value addition/processing technologies, which are currently the main challenges for trade and export and for diversified utilization of matooke. Drying is one of the oldest technologies employed in the processing of agricultural produce. A lot of research has been carried out on the drying of fruits and vegetables, but little information is available on matooke. Drying matooke and milling it into flour extends its shelf life and is an important means of overcoming the above challenges. Raw matooke flour is a generic flour developed to improve the shelf stability of the fruit and to find alternative uses. It is rich in starch (80-85% db) and consequently has high potential as a calorie resource base. It possesses good properties for both food and non-food industrial use. Some effort has been made to commercialize the processing of matooke, but there is still limited information on its processing into flour.
It was imperative to carry out an in-depth study to bridge the following gaps: lack of accurate information on the maturity window within which matooke for processing into flour can be harvested leading to non-uniform quality of matooke flour; there is no information on moisture sorption isotherm for matooke from which the minimum equilibrium moisture content in relation to temperature and relative humidity is obtainable, below which the dry matooke would be microbiologically shelf-stable; and lack of information on drying behavior of matooke and standardized processing parameters for matooke in relation to physicochemical properties of the flour. The main objective of the study was to establish the optimum harvest maturity window and optimize the processing parameters for obtaining standardized microbiologically shelf-stable matooke flour with good starch quality attributes. This research was designed to: i) establish the optimum maturity harvest window within which matooke can be harvested to produce a consistent quality of matooke flour, ii) establish the sorption isotherms for matooke, iii) establish the effect of process parameters on drying characteristics of matooke, iv) optimize the drying process parameters for matooke, v) validate the models of maturity and optimum process parameters and vi) standardize process parameters for commercial processing of matooke. Samples were obtained from a banana plantation at Presidential Initiative on Banana Industrial Development (PIBID), Technology Business Incubation Center (TBI) at Nyaruzunga – Bushenyi in Western Uganda. A completely randomized design (CRD) was employed in selecting the banana stools from which samples for the experiments were picked. The cultivar Mbwazirume which is soft cooking and commonly grown in Bushenyi was selected for the study. The static gravitation method recommended by COST 90 Project (Wolf et al., 1985), was used for determination of moisture sorption isotherms. 
A research dryer was developed for this research. All experiments were carried out in laboratories at TBI. The physiological maturity of matooke cv. mbwazirume at Bushenyi is 21 weeks. The optimum harvest maturity window for commercial processing of matooke flour (Raw Tooke Flour - RTF) at Bushenyi is between 15 and 21 weeks. The finger weight model is recommended for farmers to estimate harvest maturity for matooke, and the combined model of finger weight and pulp peel ratio is recommended for commercial processors. Matooke isotherms exhibited type II curve behavior, which is characteristic of foodstuffs. The GAB model best described all the adsorption and desorption moisture isotherms. For commercial processing of matooke, in order to obtain a microbiologically shelf-stable dry product, it is recommended to dry it to a moisture content below or equal to 10% (wb). The hysteresis phenomenon was exhibited by the moisture sorption isotherms for matooke. The isosteric heat of sorption for both adsorption and desorption isotherms increased with decreasing moisture content. The total isosteric heat of sorption for matooke ranged from 4,586 – 2,386 kJ/kg for the adsorption isotherm and from 18,194 – 2,391 kJ/kg for the desorption isotherm, for equilibrium moisture content from 0.3 – 0.01 (db) respectively. The minimum energy required for drying matooke from 80 – 10% (wb) is 8,124 kJ/kg of water removed, implying that the minimum energy required for drying 1 kg of fresh matooke from 80 – 10% (wb) is 5,793 kJ. The drying of matooke takes place in three steps: the warm-up and the two falling rate periods. The drying rate constant for all processing parameters ranged from 5,793 kJ and effective diffusivity ranged from 1.5E-10 - 8.27E-10 m²/s. The activation energy (Ea) for matooke was 16.3 kJ/mol (1,605 kJ/kg).
Comparing the activation energy (Ea) with the net isosteric heat of sorption for the desorption isotherm (qst) (1,297.62) at 0.1 (kg water/kg dry matter) indicated that Ea was higher than qst, suggesting that moisture molecules travel in liquid form in matooke slices. The total color difference (ΔE*) between the fresh and dry samples was lowest for a thickness of 7 mm, followed by an air velocity of 6 m/s, and then a drying air temperature of 70˚C. The drying system controlled by a set surface product temperature reduced the drying time by 50% compared to a drying system controlled by a set drying air temperature. The processing parameters did not have a significant effect on physicochemical and quality attributes, suggesting that any drying air temperature can be used in the initial stages of drying as long as the product temperature does not exceed the gelatinization temperature of matooke (72˚C). The optimum processing parameters for single-layer drying of matooke are: thickness 3 mm, air temperature 70˚C, dew point temperature 18˚C and air velocity 6 m/s in overflow mode. From a practical point of view, it is recommended that commercial processing of matooke employ multi-layer drying with a loading capacity equal to or less than 7 kg/m², thickness 3 mm, air temperature 70˚C, dew point temperature 18˚C and air velocity 6 m/s in overflow mode.
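
The abstract above reports that the GAB model best described the sorption isotherms and quotes moisture contents on both wet basis (wb) and dry basis (db). As a minimal sketch, the standard GAB equation M = M0·C·K·aw / ((1 − K·aw)(1 − K·aw + C·K·aw)) and the wb-to-db conversion can be written as below. The function names and the parameter values in the usage lines are hypothetical and are not the fitted values from the study, which the abstract does not give.

```python
def wb_to_db(m_wb):
    """Convert moisture content from wet basis to dry basis (both as fractions)."""
    if not 0.0 <= m_wb < 1.0:
        raise ValueError("wet-basis moisture must be in [0, 1)")
    return m_wb / (1.0 - m_wb)


def gab_moisture(aw, m0, c, k):
    """Equilibrium moisture content (db) from the GAB isotherm.

    aw: water activity (0..1); m0: monolayer moisture content (db);
    c, k: GAB energy constants. All parameter values are user-supplied.
    """
    ka = k * aw
    return m0 * c * ka / ((1.0 - ka) * (1.0 - ka + c * ka))


if __name__ == "__main__":
    # The 10% (wb) target from the abstract, expressed on a dry basis.
    print(round(wb_to_db(0.10), 4))
    # Hypothetical GAB parameters, for illustration only.
    for aw in (0.2, 0.5, 0.8):
        print(aw, gab_moisture(aw, m0=0.1, c=10.0, k=0.8))
```

With fitted parameters, inverting this curve at the target moisture content gives the water activity below which the dried product is expected to stay shelf-stable.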

Relevance:

90.00%

Publisher:

Abstract:

This paper examines the visual speech processing abilities of older adults and the age-related effects on speechreading abilities.

Relevance:

90.00%

Publisher:

Abstract:

Parkinson’s disease (PD) is an increasing neurological disorder in an aging society. The motor and non-motor symptoms of PD advance with the disease progression and occur in varying frequency and duration. In order to affirm the full extent of a patient’s condition, repeated assessments are necessary to adjust medical prescription. In clinical studies, symptoms are assessed using the unified Parkinson’s disease rating scale (UPDRS). On one hand, the subjective rating using UPDRS relies on clinical expertise. On the other hand, it requires the physical presence of patients in clinics which implies high logistical costs. Another limitation of clinical assessment is that the observation in hospital may not accurately represent a patient’s situation at home. For such reasons, the practical frequency of tracking PD symptoms may under-represent the true time scale of PD fluctuations and may result in an overall inaccurate assessment. Current technologies for at-home PD treatment are based on data-driven approaches for which the interpretation and reproduction of results are problematic.  The overall objective of this thesis is to develop and evaluate unobtrusive computer methods for enabling remote monitoring of patients with PD. It investigates first-principle data-driven model based novel signal and image processing techniques for extraction of clinically useful information from audio recordings of speech (in texts read aloud) and video recordings of gait and finger-tapping motor examinations. The aim is to map between PD symptoms severities estimated using novel computer methods and the clinical ratings based on UPDRS part-III (motor examination). A web-based test battery system consisting of self-assessment of symptoms and motor function tests was previously constructed for a touch screen mobile device. 
A comprehensive speech framework has been developed for this device to analyze text-dependent running speech by: (1) extracting novel signal features that are able to represent PD deficits in each individual component of the speech system, (2) mapping between clinical ratings and feature estimates of speech symptom severity, and (3) classifying between UPDRS part-III severity levels using speech features and statistical machine learning tools. A novel speech processing method called cepstral separation difference showed stronger ability to classify between speech symptom severities as compared to existing features of PD speech. In the case of finger tapping, the recorded videos of rapid finger tapping examination were processed using a novel computer-vision (CV) algorithm that extracts symptom information from video-based tapping signals using motion analysis of the index-finger which incorporates a face detection module for signal calibration. This algorithm was able to discriminate between UPDRS part III severity levels of finger tapping with high classification rates. Further analysis was performed on novel CV based gait features constructed using a standard human model to discriminate between a healthy gait and a Parkinsonian gait. The findings of this study suggest that the symptom severity levels in PD can be discriminated with high accuracies by involving a combination of first-principle (features) and data-driven (classification) approaches. The processing of audio and video recordings on one hand allows remote monitoring of speech, gait and finger-tapping examinations by the clinical staff. On the other hand, the first-principles approach eases the understanding of symptom estimates for clinicians. We have demonstrated that the selected features of speech, gait and finger tapping were able to discriminate between symptom severity levels, as well as, between healthy controls and PD patients with high classification rates. 
The findings support suitability of these methods to be used as decision support tools in the context of PD assessment.

Relevance:

90.00%

Publisher:

Abstract:

Speech signals degraded by additive noise can affect different applications in telecommunication. The noise may degrade both the intelligibility of the speech signals and their waveforms. In some applications, such as speech coding, both intelligibility and waveform quality are important, but recent work has focused only on intelligibility. As a result, modern speech quality measurement techniques such as PESQ (Perceptual Evaluation of Speech Quality) have been adopted, and classical distortion measurement techniques such as Cepstral Distance are falling out of use. In this paper it is shown that some classical distortion measures are still important in applications where speech corrupted by additive noise has to be evaluated.
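
As an illustration of the kind of classical measure discussed above, the sketch below computes a cepstral distance in decibels between two frames that are already represented by their cepstral coefficients. It uses one common truncated form, (10 / ln 10) · sqrt(2 · Σ (c_k − c'_k)²) over k ≥ 1, which skips the gain term c_0. Whether the paper uses this exact variant is not stated in the abstract, and the function name is hypothetical.

```python
import math


def cepstral_distance_db(c_ref, c_test):
    """Truncated cepstral distance in dB between two cepstral vectors.

    Index 0 (the gain term) is ignored; the remaining coefficients are compared.
    """
    if len(c_ref) != len(c_test):
        raise ValueError("cepstral vectors must have the same length")
    squared = sum((a - b) ** 2 for a, b in zip(c_ref[1:], c_test[1:]))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared)


if __name__ == "__main__":
    clean = [1.2, 0.50, -0.30, 0.10]   # hypothetical cepstra of a clean frame
    noisy = [1.0, 0.35, -0.10, 0.05]   # hypothetical cepstra of the noisy frame
    print(cepstral_distance_db(clean, noisy))
```

Averaging this value over all frames of an utterance gives a single distortion score, with 0 dB meaning the two cepstral envelopes are identical.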

Relevance:

90.00%

Publisher:

Abstract:

This paper describes a speech enhancement system (SES) based on a TMS320C31 digital signal processor (DSP) for real-time application. The SES algorithm is based on a modified spectral subtraction method, and a new speech activity detector (SAD) is used. The system presents a medium computational load, and a sampling rate of up to 18 kHz can be used. The goal is to use it to reduce noise in an analog telephone line.
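
The abstract does not describe how its spectral subtraction method is modified, so the following is only a generic sketch of the basic idea: subtract an estimate of the noise magnitude spectrum from the noisy magnitude spectrum, bin by bin, and clamp the result to a small spectral floor so that no bin goes negative. The over-subtraction factor `alpha`, the `floor` value and the function name are assumptions for illustration, not the paper's algorithm.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.02):
    """Basic magnitude spectral subtraction for one frame.

    noisy_mag: magnitude spectrum of the noisy frame
    noise_mag: estimated noise magnitude spectrum (e.g. averaged over
               frames that a speech activity detector marks as silence)
    alpha:     over-subtraction factor (assumed parameter)
    floor:     fraction of the noisy magnitude kept as a lower bound
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    return [max(y - alpha * n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


if __name__ == "__main__":
    noisy = [1.0, 0.5, 0.1]
    noise = [0.2, 0.2, 0.2]
    print(spectral_subtract(noisy, noise))
```

In a complete system the enhanced magnitudes would be recombined with the phase of the noisy frame and transformed back to the time domain, and the noise estimate would be updated whenever the activity detector reports no speech.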

Relevance:

90.00%

Publisher:

Abstract:

Consumer demand for natural, minimally processed, fresh-like and functional food has led to an increasing interest in emerging technologies. The aim of this PhD project was to study three innovative food processing technologies currently used in the food sector. Ultrasound-assisted freezing, vacuum impregnation and pulsed electric fields were investigated through laboratory-scale systems and semi-industrial pilot plants. Furthermore, analytical and sensory techniques were developed to evaluate the quality of food and vegetable matrices obtained by traditional and emerging processes. Ultrasound was found to be a valuable technique for improving the freezing process of potatoes, bringing forward the onset of nucleation, mainly when applied during the supercooling phase. A study of the effects of pulsed electric fields on the phenol and enzymatic profile of melon juice was carried out, and the data were treated statistically using a response surface method. Next, flavour enrichment of apple sticks was achieved by applying different techniques, such as atmospheric, vacuum and ultrasound technologies and their combinations. The second section of the thesis deals with the development of analytical methods for the discrimination and quantification of phenol compounds in vegetable matrices, such as chestnut bark extracts and olive mill waste water. The management of waste disposal in the mill sector was approached with the aim of reducing the amount of waste and, at the same time, recovering valuable by-products to be used in different industrial sectors. Finally, the sensory analysis of boiled potatoes was carried out through the development of a quantitative descriptive procedure for the study of Italian and Mexican potato varieties.
An update on flavour development in fresh and cooked potatoes was produced, and a sensory glossary, including general and specific definitions related to organic products, used in the European project Ecropolis, was drafted.

Relevance:

90.00%

Publisher:

Abstract:

Over the past years, the fruit and vegetable industry has become interested in the application of both osmotic dehydration and vacuum impregnation as mild technologies because of their low temperature and energy requirements. Osmotic dehydration is a partial dewatering process by immersion of cellular tissue in a hypertonic solution. The diffusion of water from the vegetable tissue to the solution is usually accompanied by the simultaneous counter-diffusion of solutes into the tissue. Vacuum impregnation is a unit operation in which porous products are immersed in a solution and subjected to a two-step pressure change. In the first step (vacuum increase), the pressure in the solid-liquid system is reduced and the gas in the product pores expands, partially flowing out. When atmospheric pressure is restored (second step), the residual gas in the pores compresses and the external liquid flows into the pores. This unit operation allows specific solutes to be introduced into the tissue, e.g. antioxidants, pH regulators, preservatives and cryoprotectants. Fruit and vegetables interact dynamically with the environment, and the present study attempts to enhance our understanding of the structural, physico-chemical and metabolic changes of plant tissues upon the application of technological processes (osmotic dehydration and vacuum impregnation) by following a multianalytical approach. Macro (low-frequency nuclear magnetic resonance), micro (light microscopy) and ultrastructural (transmission electron microscopy) measurements, combined with textural and differential scanning calorimetry analysis, made it possible to evaluate the effects of individual osmotic dehydration or vacuum impregnation processes on (i) the interaction between air and liquid in real plant tissues, (ii) the plant tissue water state and (iii) the cell compartments.
Isothermal calorimetry, respiration and photosynthesis determinations made it possible to investigate the metabolic changes upon the application of osmotic dehydration or vacuum impregnation. The proposed multianalytical approach should enable both better designs of processing technologies and estimations of their effects on tissue.

Relevance:

90.00%

Publisher:

Abstract:

We present a novel approach for detecting severe obstructive sleep apnea (OSA) cases by introducing non-linear analysis into sustained speech characterization. The proposed scheme was designed to provide additional information to our baseline system, built on top of state-of-the-art cepstral domain modeling techniques, with the aim of improving accuracy rates. This new information is weakly correlated with our previous MFCC modeling of sustained speech and uncorrelated with the information in our continuous speech modeling scheme. Tests have been performed to evaluate the improvement for our detection task, based on sustained speech alone as well as combined with a continuous speech classifier, resulting in a 10% relative reduction in classification error for the former and a 33% relative reduction for the fused scheme. The results encourage us to consider the existence of non-linear effects in OSA patients' voices, and to think about tools which could be used to improve short-time analysis.

Relevance:

90.00%

Publisher:

Abstract:

Profiting from the increasing availability of laser sources delivering intensities above 10⁹ W/cm² with pulse energies in the range of several joules and pulse widths in the range of nanoseconds, laser shock processing (LSP) is being consolidated as an effective technology for improving the surface mechanical and corrosion resistance properties of metals, and is being developed as a practical process amenable to production engineering. The main acknowledged advantage of the laser shock processing technique is its capability of inducing a relatively deep compressive residual stress field in metallic alloy pieces, allowing improved mechanical behaviour, specifically an improved life of the treated specimens against wear, crack growth and stress corrosion cracking. Following a short description of the theoretical/computational and experimental methods developed by the authors for the predictive assessment and experimental implementation of LSP treatments, experimental results are presented on the residual stress profiles and associated surface property modifications successfully achieved in typical materials (specifically steels and Al and Ti alloys) under different LSP irradiation conditions.

Relevance:

90.00%

Publisher:

Abstract:

Syntax denotes a rule system that allows one to predict the sequencing of communication signals. Despite its significance for both human speech processing and animal acoustic communication, the representation of syntactic structure in the mammalian brain has not been studied electrophysiologically at the single-unit level. In the search for a neuronal correlate for syntax, we used playback of natural and temporally destructured complex species-specific communication calls—so-called composites—while recording extracellularly from neurons in a physiologically well defined area (the FM–FM area) of the mustached bat’s auditory cortex. Even though this area is known to be involved in the processing of target distance information for echolocation, we found that units in the FM–FM area were highly responsive to composites. The finding that neuronal responses were strongly affected by manipulation in the time domain of the natural composite structure lends support to the hypothesis that syntax processing in mammals occurs at least at the level of the nonprimary auditory cortex.