309 results for Spectral Characterization


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product-code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification.
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
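As a rough illustration of the coding idea described in this abstract, the following minimal sketch quantizes toy vectors against a small codebook, then estimates how many bits per index a lossless coder could approach when the index distribution is skewed. The codebook, the frames, and the function names are hypothetical and are not taken from the thesis; a full-search codebook stands in for the product-code structure, and first-order empirical entropy stands in for the statistical index model.

```python
import math
from collections import Counter

def nearest_codeword(vector, codebook):
    """Full-search VQ: index of the codeword with the smallest squared distance."""
    best_index, best_dist = 0, float("inf")
    for i, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

def empirical_entropy_bits(indices):
    """First-order empirical entropy of an index stream, in bits per index."""
    counts = Counter(indices)
    total = len(indices)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def bit_rate(bits_per_frame, frame_ms):
    """Convert bits per frame into bits per second."""
    return bits_per_frame * 1000.0 / frame_ms

# Hypothetical 4-entry codebook: 2 bits per index with fixed-length coding.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Hypothetical input vectors, deliberately clustered near the first codeword.
frames = [(0.1, 0.1), (0.9, 0.2), (0.0, 0.2), (0.2, 0.0),
          (0.1, 0.9), (0.1, 0.0), (0.0, 0.1), (0.2, 0.1)]

indices = [nearest_codeword(f, codebook) for f in frames]
entropy = empirical_entropy_bits(indices)

print("indices:", indices)
print("entropy (bits/index):", round(entropy, 3), "vs fixed-length 2 bits")
print("24 bits per 20 ms frame:", bit_rate(24, 20), "bits/s")
print("20 bits per 20 ms frame:", bit_rate(20, 20), "bits/s")
```

The skewed index stream has an entropy below the 2 bits that fixed-length coding would spend, which is the kind of gain that coupling lossy and lossless stages aims for. The last two lines only restate the frame figures quoted in the abstract as bits per second.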

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis presents an original approach to parametric speech coding at rates below 1 kbits/sec, primarily for speech storage applications. Essential processes considered in this research encompass efficient characterization of the evolving configuration of the vocal tract to follow phonemic features with high fidelity, representation of speech excitation using minimal parameters with minor degradation in naturalness of synthesized speech, and finally, quantization of the resulting parameters at the nominated rates. For encoding speech spectral features, a new method relying on Temporal Decomposition (TD) is developed which efficiently compresses spectral information through interpolation between the most steady points over time trajectories of spectral parameters using a new basis function. The compression ratio provided by the method is independent of the updating rate of the feature vectors, and hence allows high resolution in tracking significant temporal variations of speech formants with no effect on the spectral data rate. Accordingly, regardless of the quantization technique employed, the method yields a high compression ratio without sacrificing speech intelligibility. Several new techniques for improving performance of the interpolation of spectral parameters through phonetically-based analysis are proposed and implemented in this research, comprising event approximated TD, near-optimal shaping of event approximating functions, efficient speech parametrization for TD on the basis of an extensive investigation originally reported in this thesis, and a hierarchical error minimization algorithm for decomposition of feature parameters which significantly reduces the complexity of the interpolation process.
Speech excitation in this work is characterized based on a novel Multi-Band Excitation paradigm which accurately determines the harmonic structure in the LPC (linear predictive coding) residual spectra, within individual bands, using the concept of Instantaneous Frequency (IF) estimation in the frequency domain. The model yields an effective two-band approximation to excitation and computes pitch and voicing with high accuracy as well. New methods for interpolative coding of pitch and gain contours are also developed in this thesis. For pitch, relying on the correlation between phonetic evolution and pitch variations during voiced speech segments, TD is employed to interpolate the pitch contour between critical points introduced by event centroids. This compresses the pitch contour in the ratio of about 1/10 with negligible error. To approximate the gain contour, a set of uniformly-distributed Gaussian event-like functions is used, which reduces the amount of gain information to about 1/6 with acceptable accuracy. The thesis also addresses a new quantization method applied to spectral features on the basis of statistical properties and spectral sensitivity of spectral parameters extracted from TD-based analysis. The experimental results show that good quality speech, comparable to that of conventional coders at rates over 2 kbits/sec, can be achieved at rates of 650-990 bits/sec.
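The interpolation idea behind Temporal Decomposition can be sketched as follows: a per-frame parameter trajectory is rebuilt as a weighted sum of a few event targets, each spread over time by an overlapping event function. This is only an illustrative sketch. The triangular event function, the targets, and the event positions below are assumptions chosen for simplicity; the thesis uses its own basis function, which the abstract does not specify.

```python
def triangular_event(n, left, center, right):
    """Hypothetical event function: 1 at its center, falling linearly to 0
    at the neighbouring event centers on either side."""
    if n == center:
        return 1.0
    if left < n < center:
        return (n - left) / (center - left)
    if center < n < right:
        return (right - n) / (right - center)
    return 0.0

def reconstruct(targets, centers, num_frames):
    """Rebuild a per-frame trajectory as a sum of targets weighted by event functions."""
    trajectory = []
    for n in range(num_frames):
        value = 0.0
        for k, (target, center) in enumerate(zip(targets, centers)):
            # The first and last events have no neighbour on one side.
            left = centers[k - 1] if k > 0 else center
            right = centers[k + 1] if k + 1 < len(centers) else center
            value += target * triangular_event(n, left, center, right)
        trajectory.append(value)
    return trajectory

# Hypothetical example: 9 frames described by 3 event targets.
targets = [1.0, 3.0, 2.0]
centers = [0, 4, 8]
trajectory = reconstruct(targets, centers, 9)
print(trajectory)
print("frames per stored target:", len(trajectory) / len(targets))
```

In this toy setup only the three targets and their positions are stored. Doubling the frame rate would change the number of reconstructed frames but not the number of stored targets, which mirrors the abstract's point that the compression ratio does not depend on the updating rate of the feature vectors.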

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Analysis by enzyme-linked immunosorbent assay showed that Rice tungro bacilliform virus (RTBV) accumulated in a cyclic pattern from early to late stages of infection in the tungro-susceptible variety, Taichung Native 1 (TN1), and the resistant variety, Balimau Putih, singly infected with RTBV or co-infected with RTBV+Rice tungro spherical virus (RTSV). These changes in virus accumulation resulted in differences in RTBV levels and incidence of infection. The virus levels were expressed relative to those of the susceptible variety and the incidence of infection was assessed at different weeks after inoculation. At a particular time point, RTBV levels in TN1 or Balimau Putih singly infected with RTBV were not significantly different from the virus level in plants co-infected with RTBV+RTSV. The relative RTBV levels in Balimau Putih either singly infected with RTBV or co-infected with RTBV+RTSV were significantly lower than those in TN1. The incidence of RTBV infection varied at different times in Balimau Putih but not in TN1, and to determine the actual infection, the number of plants that became infected at least once at any time during the 4-week observation period was considered. Considering the changes in RTBV accumulation, new parameters for analyzing RTBV resistance were established. Based on these parameters, Balimau Putih was characterized as having resistance to virus accumulation although the actual incidence of infection was >75%.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In the field of tissue engineering, new polymers are needed to fabricate scaffolds with specific properties depending on the targeted tissue. This work aimed at designing and developing a 3D scaffold with variable mechanical strength, a fully interconnected porous network, and controllable hydrophilicity and degradability. For this, a desktop-robot-based melt-extrusion rapid prototyping technique was applied to a novel tri-block co-polymer, namely poly(ethylene glycol)-block-poly(ε-caprolactone)-block-poly(DL-lactide), PEG-PCL-P(DL)LA. This co-polymer was melted by electrical heating and directly extruded using computer-controlled rapid prototyping by means of compressed purified air to build porous scaffolds. Various lay-down patterns (0/30/60/90/120/150°, 0/45/90/135°, 0/60/120° and 0/90°) were produced by appropriate positioning of the robotic control system. Scanning electron microscopy and micro-computed tomography were used to show that the 3D scaffold architectures were honeycomb-like with completely interconnected and controlled channel characteristics. Compression tests were performed and the data obtained agreed well with the typical behavior of a porous material undergoing deformation. Preliminary cell response to the as-fabricated scaffolds has been studied with primary human fibroblasts. The results demonstrated the suitability of the process and the cell biocompatibility of the polymer, two important properties among the many required for effective clinical use and efficient tissue-engineering scaffolding.