887 results for low rate speech coding
Abstract:
Medical practice requires fast, simple and noninvasive diagnostic methods. Several such methods have become possible because advances in technology provide the necessary means of collecting and processing signals. The present thesis details work done in the field of voice signals. New methods of analysis, such as nonlinear dynamics, have been developed to explore the dynamic nature and complexity of voice signals. The purpose of this thesis is to distinguish pathological voice signals from healthy ones by their complexity, and to differentiate stuttering signals from healthy signals. The efficiency of various acoustic and nonlinear time series methods is analysed. Three groups of samples are used: healthy individuals, subjects with vocal pathologies and stuttering subjects. Individual vowels and continuous speech data for the Malayalam sentence "iruvarum changatimaranu" (in English, "Both are good friends") are recorded using a microphone. The recorded audio is converted to digital signals and subjected to analysis. Acoustic perturbation measures such as fundamental frequency (F0), jitter, shimmer and zero crossing rate (ZCR) are computed, and nonlinear measures such as the maximum Lyapunov exponent (λmax), correlation dimension (D2), Kolmogorov exponent (K2) and a new measure of entropy, permutation entropy (PE), are evaluated for all three groups of subjects. Permutation entropy is a nonlinear complexity measure that can efficiently distinguish the regular or complex nature of any signal and that reveals a change in the dynamics of the process through a sudden change in its value. The results show that nonlinear dynamical methods are a suitable technique for voice signal analysis, owing to the chaotic component of the human voice.
Permutation entropy is well suited because of its sensitivity to uncertainties, since the pathologies are characterized by an increase in signal complexity and unpredictability. Pathological groups have higher entropy values than the normal group. The stuttering signals have lower entropy values than the normal signals. PE is effective in characterising the level of improvement after two weeks of speech therapy in stuttering subjects. PE is also effective in characterizing the dynamical difference between healthy and pathological subjects. This suggests that PE can improve and complement the voice analysis methods currently available to clinicians. The work establishes the application of the simple, inexpensive and fast PE algorithm to the diagnosis of vocal disorders and stuttering.
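The permutation entropy measure described above can be sketched as follows. This is a minimal illustration of the standard ordinal-pattern (Bandt-Pompe) definition, not the thesis's own implementation; the function name and the default `order` and `delay` values are assumptions, since the abstract does not state the embedding parameters used.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Permutation entropy of a 1-D sequence (ordinal-pattern definition).

    order and delay are hypothetical defaults; the source does not give them.
    """
    n_patterns = len(signal) - (order - 1) * delay
    if order < 2 or n_patterns < 1:
        raise ValueError("signal too short for the chosen order and delay")

    counts = Counter()
    for start in range(n_patterns):
        window = [signal[start + k * delay] for k in range(order)]
        # Ordinal pattern: the indices that sort the window
        # (sorted() is stable, so ties are broken by position).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1

    # Shannon entropy (in bits) of the pattern distribution.
    entropy = -sum(
        (c / n_patterns) * math.log2(c / n_patterns) for c in counts.values()
    )
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy
```

With normalization, a strictly increasing ramp produces a single ordinal pattern and therefore an entropy of zero, while an irregular signal spreads its mass over many patterns and approaches one. That is the sense in which a higher value indicates a more complex, less predictable signal.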
Abstract:
LLDPE was blended with poly(vinyl alcohol) (PVA), and the mechanical, thermal and spectroscopic properties and the biodegradability of the blends were investigated. The biodegradability of LLDPE/PVA blends was studied in two environments, viz. (1) a culture medium containing Vibrio sp. and (2) a soil environment over a period of 15 weeks. Nanoanatase with photocatalytic activity was synthesized by a hydrothermal method using titanium isopropoxide. The synthesized TiO2 was characterized by X-ray diffraction (XRD), BET studies, FTIR studies and scanning electron microscopy (SEM). The crystallite size of the titania was calculated to be ≈ 6 nm from the XRD results, and the surface area was found to be about 310 m2/g by the BET method. SEM shows that nanoanatase particles prepared by this method are spherical. Linear low density polyethylene films containing polyvinyl alcohol and a pro-oxidant (TiO2 or cobalt stearate, with or without vegetable oil) were prepared. The films were then subjected to natural weathering and UV exposure, followed by biodegradation in culture medium as well as in a soil environment. The degradation was monitored by mechanical property measurements, thermal studies, rate of weight loss, and FTIR and SEM studies. Higher weight loss, texture change and greater increments in carbonyl index values were observed in samples containing cobalt stearate and vegetable oil. The present study demonstrates that combining LLDPE/PVA blends with (i) nanoanatase/vegetable oil and (ii) cobalt stearate/vegetable oil leads to extensive photodegradation. These samples show substantial degradation on subsequent exposure to Vibrio sp. Thus a combined photodegradation and biodegradation process is a promising step towards obtaining a biodegradable grade of LLDPE.
Abstract:
Biometrics deals with the physiological and behavioural characteristics of an individual to establish identity. Fingerprint-based authentication is the most advanced biometric authentication technology. The minutiae-based fingerprint identification method offers a reasonable identification rate. The minutiae feature map consists of about 70-100 minutia points, and matching accuracy drops as the size of the database grows. Hence it is essential to make the fingerprint feature code as small as possible so that identification becomes easier. In this research, a novel global singularity-based fingerprint representation is proposed. The fingerprint baseline, which is the joint line between the distal and intermediate phalanges in the fingerprint, is taken as the reference line. A polygon is formed with the singularities and the fingerprint baseline. The feature vectors are the polygonal angle, sides, area, type and the ridge counts between the singularities. A 100% recognition rate is achieved with this method. The method is compared with the conventional minutiae-based recognition method in terms of computation time, receiver operating characteristic (ROC) and feature vector length. Speech is a behavioural biometric modality and can be used to identify a speaker. In this work, the MFCCs of text-dependent speech are computed and clustered using the k-means algorithm. A backpropagation-based artificial neural network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with that of the VQ-based Euclidean minimum classifier. Biometric systems that use a single modality are usually affected by problems such as noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. Multifinger feature-level fusion based fingerprint recognition is developed and its performance is measured in terms of the ROC curve.
Score-level fusion of the fingerprint and speech based recognition systems is performed, and 100% accuracy is achieved for a considerable range of matching thresholds.
Abstract:
In recent years, reversible logic has emerged as one of the most important approaches for power optimization, with applications in low power CMOS, quantum computing and nanotechnology. Low power circuits implemented using reversible logic that provide single error correction and double error detection (SEC-DED) are proposed in this paper. The design uses a new 4 x 4 reversible gate called ‘HCG’ to implement Hamming error coding and detection circuits. A parity preserving HCG (PPHCG), which preserves the input parity at the output bits, is used to achieve fault tolerance in the Hamming error coding and detection circuits.
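The SEC-DED behaviour mentioned above can be illustrated in software. The sketch below uses the textbook extended Hamming (8,4) code, that is, Hamming(7,4) plus one overall parity bit. It is an assumption chosen for illustration: it says nothing about the HCG or PPHCG gate designs, and the word length used in the paper is not stated. All function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a Hamming(7,4) codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def secded_encode(data):
    """Append an overall parity bit so the 8-bit word has even parity."""
    word = hamming74_encode(data)
    overall = 0
    for bit in word:
        overall ^= bit
    return word + [overall]


def secded_decode(word):
    """Return (data_bits, status); status is 'ok', 'corrected' or 'double_error'."""
    bits = list(word[:7])
    # The syndrome is the XOR of the positions of all set bits;
    # it is zero for a valid Hamming(7,4) codeword.
    syndrome = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            syndrome ^= position
    parity_even = sum(word) % 2 == 0

    if syndrome == 0 and parity_even:
        status = "ok"
    elif not parity_even:
        # Odd overall parity: assume a single flipped bit.
        # A zero syndrome means the overall parity bit itself was hit.
        if syndrome:
            bits[syndrome - 1] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "double_error"
    return [bits[2], bits[4], bits[5], bits[6]], status
```

The overall parity bit is what separates the two cases: a single error makes the total parity odd and is corrected at the position given by the syndrome, while a double error leaves the parity even but the syndrome non-zero, so it is reported rather than miscorrected.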
Abstract:
Speech is one of the most important means of communication among human beings. In this paper, a comparative study of two feature extraction techniques is carried out for recognizing speaker-independent spoken isolated words. The first is a hybrid approach combining Linear Predictive Coding (LPC) and Artificial Neural Networks (ANN), and the second uses a combination of Wavelet Packet Decomposition (WPD) and Artificial Neural Networks. Voice signals are sampled directly from the microphone and then processed using these two techniques to extract the features. Words from Malayalam, one of the four major Dravidian languages of southern India, are chosen for recognition. Training, testing and pattern recognition are performed using Artificial Neural Networks. The back propagation method is used to train the ANN. The proposed method is implemented for 50 speakers uttering 20 isolated words each. Both methods produce good recognition accuracy, but Wavelet Packet Decomposition is found to be more suitable for recognizing speech because of its multi-resolution characteristics and efficient time-frequency localization.
Abstract:
While channel coding is a standard method of improving a system’s energy efficiency in digital communications, its practice does not extend to high-speed links. Increasing demands on network speeds are placing a large burden on the energy efficiency of high-speed links and make the benefit of channel coding for these systems a timely subject. The low error rates of interest and the presence of residual intersymbol interference (ISI) caused by hardware constraints impede the analysis and simulation of coded high-speed links. Focusing on the residual ISI and combined noise as the dominant error mechanisms, this paper analyses error correlation through the concepts of error region, channel signature, and correlation distance. This framework provides deeper insight into joint error behaviours in high-speed links, extends the range of statistical simulation for coded high-speed links, and provides a case against the use of biased Monte Carlo methods in this setting.
Abstract:
Modeling nonlinear systems using Volterra series is a century-old method, but practical realizations were hampered by hardware inadequate to handle the increased computational complexity stemming from its use. Interest has recently been renewed in designing and implementing filters that can model much of the polynomial nonlinearity inherent in practical systems. The key advantage of resorting to Volterra power series for this purpose is that nonlinear filters so designed can be made to work in parallel with existing LTI systems, yielding improved performance. This paper describes the inclusion of a quadratic predictor (with nonlinearity order 2) alongside a linear predictor in an analog source coding system. Analog coding schemes generally ignore the source generation mechanisms and focus on high fidelity reconstruction at the receiver. The widely used method of differential pulse code modulation (DPCM) for speech transmission uses a linear predictor to estimate the next possible value of the input speech signal. But this linear system does not account for the inherent nonlinearities in speech signals arising from multiple reflections in the vocal tract. So a quadratic predictor is designed and implemented in parallel with the linear predictor to yield improved mean square error performance. The augmented speech coder is tested on speech signals transmitted over an additive white Gaussian noise (AWGN) channel.
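A DPCM loop with a linear predictor and a parallel quadratic (second-order Volterra) term might be sketched as below. This is only an illustration of the idea: the coefficient values, the uniform quantizer, the step size and every function name are assumptions, and the paper's actual predictor design and its AWGN channel test are not reproduced.

```python
def volterra_predict(history, linear, quadratic):
    """Predict the next sample from past reconstructed samples.

    history[0] is the most recent sample. `linear` holds first-order
    coefficients; `quadratic` is a square matrix of second-order
    coefficients (only the upper triangle is used).
    """
    prediction = sum(a * x for a, x in zip(linear, history))
    order = len(quadratic)
    for i in range(order):
        for j in range(i, order):
            prediction += quadratic[i][j] * history[i] * history[j]
    return prediction


def dpcm_encode(samples, linear, quadratic, step):
    """Quantize the prediction error with a uniform quantizer of size `step`."""
    history = [0.0] * max(len(linear), len(quadratic))
    codes = []
    for x in samples:
        prediction = volterra_predict(history, linear, quadratic)
        code = round((x - prediction) / step)
        codes.append(code)
        # Predict from the reconstructed value, exactly as the decoder will.
        reconstructed = prediction + code * step
        history = [reconstructed] + history[:-1]
    return codes


def dpcm_decode(codes, linear, quadratic, step):
    """Rebuild the signal by mirroring the encoder's prediction loop."""
    history = [0.0] * max(len(linear), len(quadratic))
    output = []
    for code in codes:
        prediction = volterra_predict(history, linear, quadratic)
        reconstructed = prediction + code * step
        output.append(reconstructed)
        history = [reconstructed] + history[:-1]
    return output
```

Because the encoder predicts from reconstructed rather than original samples, encoder and decoder stay in step over an error-free channel, and each reconstructed sample lies within half a quantizer step of the input. A better predictor, linear or quadratic, shows up as smaller code magnitudes rather than as a smaller reconstruction error.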
Abstract:
This paper discusses the implementation details of a child-friendly, good quality English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use, and has a small memory footprint. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based and use standard speech databases available in neutral adult voices. Here, reduced memory is achieved by concatenating phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis, rather than signal processing techniques, was used to reduce the algorithmic complexity. A sufficient degree of customization and generalization catering to the needs of the child user was included through provision for vocabulary and voice selection. Prosody was also incorporated. This inexpensive TTS system was implemented in MATLAB, with the synthesis presented by means of a graphical user interface (GUI), making it child friendly. It can be used not only as an interesting language learning aid for the normal child but also as a speech aid for the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).
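Replacing a waveform with its LPC coefficients relies on estimating those coefficients from a frame of samples. A minimal sketch using the autocorrelation method with the Levinson-Durbin recursion is given below. It is written in Python for illustration rather than in the MATLAB the paper used, and the function names and the choice of this particular estimation method are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of a frame of samples."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate coefficients a_1..a_p with x[n] ~ sum_k a_k * x[n-k].

    Returns (coefficients, residual_energy) using Levinson-Durbin.
    """
    r = autocorrelation(frame, order)
    coeffs = []
    error = r[0]
    for m in range(1, order + 1):
        if error <= 0.0:
            # Silent or perfectly predictable frame: pad with zeros.
            coeffs = coeffs + [0.0] * (order - len(coeffs))
            break
        acc = r[m] - sum(coeffs[k] * r[m - 1 - k] for k in range(m - 1))
        reflection = acc / error
        # Update the lower-order coefficients, then append the new one.
        coeffs = [
            coeffs[k] - reflection * coeffs[m - 2 - k] for k in range(m - 1)
        ] + [reflection]
        error *= 1.0 - reflection * reflection
    return coeffs, error
```

Storing `order` coefficients (plus gain and excitation information) per frame in place of the raw samples is where the memory saving comes from; the order itself is a tuning choice that the abstract does not specify.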
Low-altitude aerial photography for optimum N fertilization of winter wheat on the North China Plain
Abstract:
Previous research has shown that site-specific nitrogen (N) fertilizer recommendations based on an assessment of a soil’s N supply (mineral N testing) and the crop’s N status (sap nitrate analysis) can help to decrease excessive N inputs for winter wheat on the North China Plain. However, the costs to derive such recommendations based on multiple sampling of a single field hamper the use of this approach at the on-farm level. In this study low-altitude aerial true-color photographs were used to examine the relationship between image-derived reflectance values and soil–plant data in an on-station experiment. Treatments comprised a conventional N treatment (typical farmers’ practice), an optimum N treatment (N application based on soil–plant testing) and six treatments without N (one to six cropping seasons without any N fertilizer input). Normalized intensities of the red, green and blue color bands on the photographs were highly correlated with total N concentrations, SPAD readings and stem sap nitrate of winter wheat. The results indicate the potential of aerial photography to determine in combination with on site soil–plant testing the optimum N fertilizer rate for larger fields and to thereby decrease the costs for N need assessments.
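The relationship between normalized colour-band intensities and plant N measurements could be examined along the following lines. The abstract does not define its normalization, so dividing each band by the sum of the three bands is an assumption, as are the function names; the Pearson coefficient is used here as a generic correlation measure.

```python
import math


def normalized_rgb(red, green, blue):
    """Scale each band by the total intensity (assumed normalization)."""
    total = red + green + blue
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (red / total, green / total, blue / total)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

In use, one would compute the normalized band values per plot from the photograph and correlate each band with the corresponding plot's total N concentration, SPAD reading or sap nitrate value.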
Abstract:
Information display technology is a rapidly growing research and development field. Using state-of-the-art technology, optical resolution can be increased dramatically with organic light-emitting diodes (OLEDs), since the light-emitting layer is very thin, under 100 nm. The main question is what pixel size is technologically achievable. The next generation of displays will address three-dimensional image display. In 2D, one considers vertical and horizontal resolution. In 3D or holographic images, there is another dimension: depth. The major requirement is high resolution in the horizontal dimension in order to sustain the third dimension using special lenticular glass or barrier masks, which provide separate views for each eye. A high-resolution 3D display offers hundreds more different views of objects or landscapes. OLEDs have the potential to be a key technology for information displays in the future. The display technology presented in this work promises to bring bright colour 3D flat panel displays into use in a unique way. Unlike the conventional TFT matrix, OLED displays have constant brightness and colour, independent of the viewing angle, i.e. the observer's position in front of the screen. A sandwich (just 0.1 micron thick) of organic thin films between two conductors makes an OLED display device. These special materials are named electroluminescent organic semiconductors (or organic photoconductors (OPC)). When electrical current is applied, a bright light is emitted (electrophosphorescence) from the resulting organic light-emitting diode. An ITO layer is usually used as the transparent electrode in an OLED. Such types of displays were the first for volume manufacture and only a few products are available in the market at present. The key challenges that OLED technology faces in the application areas are: producing high-quality white light, achieving low manufacturing costs, and increasing efficiency and lifetime at high brightness.
Looking towards the future, by combining OLED with specially constructed surface lenses and proper image management software it will be possible to achieve 3D images.
Abstract:
Type and rate of fertilizers influence the level of soil organic carbon (Corg) and total nitrogen (Nt) markedly, but the effect on C and N partitioning into different pools is open to question. The objectives of the present work were to: (i) quantify the impact of fertilizer type and rate on labile, intermediate and passive C and N pools by using a combination of biological, chemical and mathematical methods; (ii) explain previously reported differences in the soil organic matter (SOM) levels between soils receiving farmyard manure with or without biodynamic preparations by using Corg time series and information on SOM partitioning; and (iii) quantify the long-term and short-term dynamics of SOM in density fractions and microbial biomass as affected by fertilizer type and rate and determine the incorporation of crop residues into labile SOM fractions. Samples were taken from a sandy Cambisol from the long-term fertilization trial in Darmstadt, Germany, founded in 1980. The nine treatments (four field replicates) were: straw incorporation plus application of mineral fertilizer (MSI) and application of rotted farmyard manure with (DYN) or without (FYM) addition of biodynamic preparations, each at high (140 – 150 kg N ha-1 year-1; MSIH, DYNH, FYMH), medium (100 kg N ha-1 year-1; MSIM, DYNM, FYMM) and low (50 – 60 kg N ha-1 year-1; MSIL, DYNL, FYML) rates. The main findings were: (i) The stocks of Corg (t ha-1) were affected by fertilizer type and rate and increased in the order MSIL (23.6), MSIM (23.7), MSIH (24.2) < FYML (25.3) < FYMM (28.1), FYMH (28.1). Stocks of Nt were affected in the same way (C/N ratio: 11). Storage of C and N in the modelled labile pools (turnover times: 462 and 153 days for C and N, respectively) was not influenced by the type of fertilizer (FYM and MSI) but depended significantly (p ≤ 0.05) on the application rate and ranged from 1.8 to 3.2 t C ha-1 (7 – 13% of Corg) and from 90 to 140 kg N ha-1 (4 – 5% of Nt).
In the calculated intermediate pool (C/N ratio 7), stocks of C were markedly higher in FYM treatments (15-18 t ha-1) compared to MSI treatments (12-14 t ha-1). This showed that differences in SOM stocks in the sandy Cambisol induced by fertilizer rate may be short-lived in case of changing management, but differences induced by fertilizer type may persist for decades. (ii) Crop yields, estimated C inputs (1.5 t ha-1 year-1) with crop residue, microbial biomass C (Cmic, 118 – 150 mg kg-1), microbial biomass N (17 – 20 mg kg-1) and labile C and N pools did not differ significantly between FYM and DYN treatments. However, labile C increased linearly with application rate (R2 = 0.53) from 7 to 11% of Corg. This also applied for labile N (3.5 to 4.9% of Nt). The higher contents of Corg in DYN treatments existed since 1982, when the first sampling was conducted for all individual treatments. Contents of Corg between DYN and FYM treatments converged slightly since then. Furthermore, at least 30% of the difference in Corg was located in the passive pool where a treatment effect could be excluded. Therefore, the reported differences in Corg contents existed most likely since the beginning of the experiment and, as a single factor of biodynamic agriculture, application of biodynamic preparations had no effect on SOM stocks. (iii) Stocks of SOM, light fraction organic C (LFOC, ρ ≤ 2.0 g cm-3), light fraction organic N and Cmic decreased in the order FYMH > FYML > MSIH, MSIL for all sampling dates in 2008 (March, May, September, December). However, statistical significance of treatment effects differed between the dates, probably due to differences in the spatial variation throughout the year. The high proportion of LFOC on total Corg stocks (45 – 55%) highlighted the importance of selective preservation of OM as a stabilization mechanism in this sandy Cambisol.
The apparent turnover time of LFOC was between 21 and 32 years, which agreed very well with studies with substantially longer vegetation change compared to our study. Overall, both approaches, (I) the combination of incubation, chemical fractionation and simple modelling and (II) the density fractionation, provided complementary information on the partitioning of SOM into pools of different stability. The density fractionation showed that differences in Corg stocks between FYM and MSI treatments were mainly located in the light fraction, i.e. induced by higher recalcitrance of the organic input in the FYM treatments. Moreover, the combination of biological, chemical and mathematical methods indicated that effects of fertilizer rate on total Corg and Nt stocks may be short-lived, but that the effect of fertilizer type may persist for longer time spans in the sandy Cambisol.
Abstract:
We present MikeTalk, a text-to-audiovisual speech synthesizer which converts input text into an audiovisual speech stream. MikeTalk is built using visemes, which are a small set of images spanning a large range of mouth shapes. The visemes are acquired from a recorded visual corpus of a human subject which is specifically designed to elicit one instantiation of each viseme. Using optical flow methods, correspondence from every viseme to every other viseme is computed automatically. By morphing along this correspondence, a smooth transition between viseme images may be generated. A complete visual utterance is constructed by concatenating viseme transitions. Finally, phoneme and timing information extracted from a text-to-speech synthesizer is exploited to determine which viseme transitions to use, and the rate at which the morphing process should occur. In this manner, we are able to synchronize the visual speech stream with the audio speech stream, and hence give the impression of a photorealistic talking face.
Abstract:
Synechocystis PCC 6803 is a photosynthetic bacterium that has the potential to make bioproducts from carbon dioxide and light. Biochemical production from photosynthetic organisms is attractive because it replaces the typical bioprocessing steps of crop growth, milling, and fermentation, with a one-step photosynthetic process. However, low yields and slow growth rates limit the economic potential of such endeavors. Rational metabolic engineering methods are hindered by limited cellular knowledge and inadequate models of Synechocystis. Instead, inverse metabolic engineering, a scheme based on combinatorial gene searches which does not require detailed cellular models, but can exploit sequence data and existing molecular biological techniques, was used to find genes that (1) improve the production of the biopolymer poly-3-hydroxybutyrate (PHB) and (2) increase the growth rate. A fluorescence activated cell sorting assay was developed to screen for high PHB producing clones. Separately, serial sub-culturing was used to select clones that improve growth rate. Novel gene knock-outs were identified that increase PHB production and others that increase the specific growth rate. These improvements make this system more attractive for industrial use and demonstrate the power of inverse metabolic engineering to identify novel phenotype-associated genes in poorly understood systems.
Abstract:
Considers channel capacity, coding rate, repetition code, Hamming code, Hamming distance
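Three of the listed concepts (coding rate, repetition code and Hamming distance) can be illustrated with a few lines of code. This is a generic sketch with hypothetical function names; it is not taken from the listed work.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def code_rate(k, n):
    """Fraction of transmitted bits that carry information (k data bits in n)."""
    return k / n


def repetition_encode(bits, n=3):
    """Repeat every bit n times."""
    return [bit for bit in bits for _ in range(n)]


def repetition_decode(coded, n=3):
    """Majority vote over each block of n received bits (n should be odd)."""
    return [
        1 if 2 * sum(coded[i:i + n]) > n else 0
        for i in range(0, len(coded), n)
    ]
```

A 3-fold repetition code has rate 1/3, and its two codewords 000 and 111 are at Hamming distance 3, so majority voting corrects any single bit error per block. A Hamming(7,4) code also has minimum distance 3 but at the higher rate 4/7.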
Abstract:
Low take-up of stigma-free social benefits is often blamed on information asymmetries or administrative barriers. There is limited evidence on which of these potential channels is more salient in which contexts. We designed and implemented a randomized controlled trial to assess the extent to which informational barriers are responsible for the prevalent low take-up of government benefits among Colombian conflict-driven internal refugees. We provide timely information on benefits eligibility via SMS to a random half of the displaced households that migrated to Bogotá over a 6-month period. We show that improving information increases benefits’ take-up. However, the effect is small and holds only for certain types of benefits. Hence, consistent with previous experimental literature, the availability of timely information explains only part of the low take-up rates, and the role of administrative barriers and bureaucratic processes should be tackled to increase the well-being of internal refugees in Colombia.