993 resultados para SPEECH-PERCEPTION
Resumo:
Development of continuous shrimp cell lines for effective investigation on shrimp viruses remains elusive with an arduous history of over 25 years. Despite presenting challenges to researchers in developing a cell line, the billion dollar aquaculture industry is under viral threat. Advances in molecular biology and various gene transfer technologies for immortalization of cells have resulted in the development of hundreds of cell lines from insects and mammals, but yet not a single cell line has been developed from shrimp and other marine invertebrates. Though improved growth and longevity of shrimp cells in vitro could be achieved by using modified growth media this did not make any leap to spontaneous transformation; probably due to the fact that shrimp cells inhibited neoplastic transformations. Oncogenic induction and immortalization are considered as the possible ways, and an exclusive medium for shrimp cell culture and an appropriate mode of transformation are crucial. In this review status of shrimp cell line development and its future orientation are discussed
Resumo:
Digit speech recognition is important in many applications such as automatic data entry, PIN entry, voice dialing telephone, automated banking system, etc. This paper presents speaker independent speech recognition system for Malayalam digits. The system employs Mel frequency cepstrum coefficient (MFCC) as feature for signal processing and Hidden Markov model (HMM) for recognition. The system is trained with 21 male and female voices in the age group of 20 to 40 years and there was 98.5% word recognition accuracy (94.8% sentence recognition accuracy) on a test set of continuous digit recognition task.
Resumo:
Performance of any continuous speech recognition system is dependent on the accuracy of its acoustic model. Hence, preparation of a robust and accurate acoustic model lead to satisfactory recognition performance for a speech recognizer. In acoustic modeling of phonetic unit, context information is of prime importance as the phonemes are found to vary according to the place of occurrence in a word. In this paper we compare and evaluate the effect of context dependent tied (CD tied) models, context dependent (CD) and context independent (CI) models in the perspective of continuous speech recognition of Malayalam language. The database for the speech recognition system has utterance from 21 speakers including 11 female and 10 males. Our evaluation results show that CD tied models outperforms CI models over 21%.
Resumo:
A connected digit speech recognition is important in many applications such as automated banking system, catalogue-dialing, automatic data entry, automated banking system, etc. This paper presents an optimum speaker-independent connected digit recognizer forMalayalam language. The system employs Perceptual Linear Predictive (PLP) cepstral coefficient for speech parameterization and continuous density Hidden Markov Model (HMM) in the recognition process. Viterbi algorithm is used for decoding. The training data base has the utterance of 21 speakers from the age group of 20 to 40 years and the sound is recorded in the normal office environment where each speaker is asked to read 20 set of continuous digits. The system obtained an accuracy of 99.5 % with the unseen data.
Resumo:
Modeling nonlinear systems using Volterra series is a century old method but practical realizations were hampered by inadequate hardware to handle the increased computational complexity stemming from its use. But interest is renewed recently, in designing and implementing filters which can model much of the polynomial nonlinearities inherent in practical systems. The key advantage in resorting to Volterra power series for this purpose is that nonlinear filters so designed can be made to work in parallel with the existing LTI systems, yielding improved performance. This paper describes the inclusion of a quadratic predictor (with nonlinearity order 2) with a linear predictor in an analog source coding system. Analog coding schemes generally ignore the source generation mechanisms but focuses on high fidelity reconstruction at the receiver. The widely used method of differential pnlse code modulation (DPCM) for speech transmission uses a linear predictor to estimate the next possible value of the input speech signal. But this linear system do not account for the inherent nonlinearities in speech signals arising out of multiple reflections in the vocal tract. So a quadratic predictor is designed and implemented in parallel with the linear predictor to yield improved mean square error performance. The augmented speech coder is tested on speech signals transmitted over an additive white gaussian noise (AWGN) channel.
Resumo:
This paper discusses the implementation details of a child friendly, good quality, English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use with little memory. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based, which use standard speech databases available in neutral adult voices.Here reduced memory is achieved by the concatenation of phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis was used to reduce the algorithmic complexity instead of signal processing techniques. Sufficient degree of customization and generalization catering to the needs of the child user had been included through the provision for vocabulary and voice selection to suit the requisites of the child. Prosody had also been incorporated. This inexpensive TTS systemwas implemented inMATLAB, with the synthesis presented by means of a graphical user interface (GUI), thus making it child friendly. This can be used not only as an interesting language learning aid for the normal child but it also serves as a speech aid to the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).
Resumo:
This paper describes certain findings of intonation and intensity study of emotive speech with the minimal use of signal processing algorithms. This study was based on six basic emotions and the neutral, elicited from 1660 English utterances obtained from the speech recordings of six Indian women. The correctness of the emotional content was verified through perceptual listening tests. Marked similarity was noted among pitch contours of like-worded, positive valence emotions, though no such similarity was observed among the four negative valence emotional expressions. The intensity patterns were also studied. The results of the study were validated using arbitrary television recordings for four emotions. The findings are useful to technical researchers, social psychologists and to the common man interested in the dynamics of vocal expression of emotions
Resumo:
This study is an attempt to situate the quality of life and standard of living of local communities in ecotourism destinations inter alia their perception on forest conservation and the satisfaction level of the local community. 650 EDC/VSS members from Kerala demarcated into three zones constitute the data source. Four variables have been considered for evaluating the quality of life of the stakeholders of ecotourism sites, which is then funneled to the income-education spectrum for hypothesizing into the SLI framework. Zone-wise analysis of the community members working in tourism sector shows that the community members have benefited totally from tourism development in the region as they have got both employments as well as secured livelihood options. Most of the quality of life-indicators of the community in the eco-tourist centres show a promising position. The community perception does not show any negative impact on environment as well as on their local culture.
Resumo:
Speech is the primary, most prominent and convenient means of communication in audible language. Through speech, people can express their thoughts, feelings or perceptions by the articulation of words. Human speech is a complex signal which is non stationary in nature. It consists of immensely rich information about the words spoken, accent, attitude of the speaker, expression, intention, sex, emotion as well as style. The main objective of Automatic Speech Recognition (ASR) is to identify whatever people speak by means of computer algorithms. This enables people to communicate with a computer in a natural spoken language. Automatic recognition of speech by machines has been one of the most exciting, significant and challenging areas of research in the field of signal processing over the past five to six decades. Despite the developments and intensive research done in this area, the performance of ASR is still lower than that of speech recognition by humans and is yet to achieve a completely reliable performance level. The main objective of this thesis is to develop an efficient speech recognition system for recognising speaker independent isolated words in Malayalam.
Resumo:
Aiming at consensus building in environmental management with democratic procedure, a communication tool for mutual understanding is profoundly needed. The main purpose of this research is the establishment of a practical methodology to understand vernacular meanings of the environment in terms of resident landscape perception as a precondition for environmental discussion.
Resumo:
In East Africa, Uganda is one of the major producers of organic pineapples for export. These pineapples are mainly produced in central Uganda and have to meet stringent quality standards before they can be allowed on international markets. These quality standards may put considerable strain on farmers and may not be wholly representative of their quality interpretation. The aim of this paper is therefore, to determine the Ugandan organic pineapple farmers’ quality perception, the activities they carry out in order to attain that quality and challenges (production, postharvest & marketing) faced on the same. Qualitative semi-structured interviews were carried out among 28 organic pineapple farmers in Kayunga district, central Uganda. Findings suggest that quality of organic pineapples is mainly perceived in terms of product attributes particularly appearance followed by food security provision. Certification plays a minor role in what farmers describe as organic quality. High production input costs (labour and coffee husks) coupled with a stagnant premium are some of the major challenges faced by farmers in attaining organic quality. The paper argues that currently there are concealed negative food security effects embroiled in these pineapple schemes. It is recommended that the National Organic Agricultural Movement of Uganda (NOGAMU) works with all relevant stakeholders to have the farmer premium price raised and an official organic policy enacted.
Resumo:
Sketches are commonly used in the early stages of design. Our previous system allows users to sketch mechanical systems that the computer interprets. However, some parts of the mechanical system might be too hard or too complicated to express in the sketch. Adding speech recognition to create a multimodal system would move us toward our goal of creating a more natural user interface. This thesis examines the relationship between the verbal and sketch input, particularly how to segment and align the two inputs. Toward this end, subjects were recorded while they sketched and talked. These recordings were transcribed, and a set of rules to perform segmentation and alignment was created. These rules represent the knowledge that the computer needs to perform segmentation and alignment. The rules successfully interpreted the 24 data sets that they were given.
Resumo:
We present a novel scheme ("Categorical Basis Functions", CBF) for object class representation in the brain and contrast it to the "Chorus of Prototypes" scheme recently proposed by Edelman. The power and flexibility of CBF is demonstrated in two examples. CBF is then applied to investigate the phenomenon of Categorical Perception, in particular the finding by Bulthoff et al. (1998) of categorization of faces by gender without corresponding Categorical Perception. Here, CBF makes predictions that can be tested in a psychophysical experiment. Finally, experiments are suggested to further test CBF.
Resumo:
We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.