910 results for low rate speech coding


Relevance:

50.00% 50.00%

Publisher:

Abstract:

Motion compensated frame interpolation (MCFI) is one of the most efficient solutions to generate side information (SI) in the context of distributed video coding. However, it creates SI with rather significant motion compensated errors for some frame regions and rather small errors for others, depending on the video content. In this paper, a low complexity Intra mode selection algorithm is proposed to select the most 'critical' blocks in the Wyner-Ziv (WZ) frame and help the decoder with some reliable data for those blocks. For each block, the novel coding mode selection algorithm estimates the encoding rate for the Intra based and WZ coding modes and determines the best coding mode while maintaining a low encoder complexity. The proposed solution is evaluated in terms of rate-distortion performance, with improvements of up to 1.2 dB relative to a solution using only the WZ coding mode.

Relevance:

50.00% 50.00%

Publisher:

Abstract:

Low-density parity-check (LDPC) codes are nowadays one of the hottest topics in coding theory, notably due to their advantages in terms of bit error rate performance and low complexity. In order to exploit the potential of the Wyner-Ziv coding paradigm, practical distributed video coding (DVC) schemes should use powerful error correcting codes with near-capacity performance. In this paper, new ways to design LDPC codes for the DVC paradigm are proposed and studied. The new LDPC solutions rely on merging parity-check nodes, which corresponds to reducing the number of rows in the parity-check matrix. This allows the compression ratio of the source (DCT coefficient bitplane) to be changed gracefully according to the correlation between the original and the side information. The proposed LDPC codes reach a good performance for a wide range of source correlations and achieve a better rate-distortion (RD) performance than the popular turbo codes.
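
The row-merging idea can be illustrated with a minimal sketch. It assumes a syndrome-based scheme in which the encoder sends s = Hx (mod 2) for a source bit vector x; the abstract does not spell this out. The matrix H, the vector x and the helper names below are made up for illustration and are not taken from the paper. Merging two check nodes is modelled as adding two rows of H modulo 2, so the merged syndrome bit equals the XOR of the two original syndrome bits and one fewer bit is transmitted.

```python
# Toy illustration: merging two parity-check nodes = XOR-ing two rows of H.
# H and x are hypothetical examples, not data from the paper.

def syndrome(H, x):
    """Syndrome s = H x over GF(2)."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]

def merge_checks(H, i, j):
    """Return a new matrix where rows i and j are replaced by their mod-2 sum.

    The merged row is appended last; the remaining rows keep their order.
    """
    merged = [(a + b) % 2 for a, b in zip(H[i], H[j])]
    rest = [row for k, row in enumerate(H) if k not in (i, j)]
    return rest + [merged]

H = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0],
]
x = [1, 0, 1, 1, 0, 1]  # hypothetical source bits

s_full = syndrome(H, x)            # 4 syndrome bits for 6 source bits
H_merged = merge_checks(H, 0, 1)   # rows 0 and 1 share no variable
s_merged = syndrome(H_merged, x)   # 3 syndrome bits for 6 source bits

ratio_full = len(H) / len(H[0])
ratio_merged = len(H_merged) / len(H_merged[0])
print(s_full, s_merged, ratio_full, ratio_merged)
```

In this sketch the merged rows do not share a variable node; if they did, the shared entries would cancel in the mod-2 sum, which is one reason the choice of which nodes to merge matters in a real design.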

Relevance:

50.00% 50.00%

Publisher:

Abstract:

A new idea for waveform coding using vector quantisation (VQ) is introduced. This idea makes it possible to deal with codevectors much larger than before for a fixed bit per sample rate. Also a solution to the matching problem (inherent in the present context) in the &-norm describing a measure of nearness is presented. The overall computational complexity of this solution is O(n^3 log n). Sample results are presented to demonstrate the advantage of using this technique in the context of coding of speech waveforms.

Relevance:

50.00% 50.00%

Publisher:

Abstract:

Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.

Relevance:

50.00% 50.00%

Publisher:

Abstract:

Backscatter communication is an emerging wireless technology that has recently gained increasing attention from both academic and industry circles. The key innovation of the technology is the ability of ultra-low power devices to utilize nearby existing radio signals to communicate. As there is no need to generate their own energetic radio signal, the devices can benefit from a simple design, are very inexpensive and are extremely energy efficient compared with traditional wireless communication. These benefits have made backscatter communication a desirable candidate for distributed wireless sensor network applications with energy constraints.

The backscatter channel presents a unique set of challenges. Unlike a conventional one-way communication (in which the information source is also the energy source), the backscatter channel experiences strong self-interference and spread Doppler clutter that mask the information-bearing (modulated) signal scattered from the device. Both of these sources of interference arise from the scattering of the transmitted signal off of objects, both stationary and moving, in the environment. Additionally, the measurement of the location of the backscatter device is negatively affected by both the clutter and the modulation of the signal return.

This work proposes a channel coding framework for the backscatter channel consisting of a bi-static transmitter/receiver pair and a quasi-cooperative transponder. It proposes to use run-length limited coding to mitigate the background self-interference and spread-Doppler clutter with only a small decrease in communication rate. The proposed method applies to both binary phase-shift keying (BPSK) and quadrature-amplitude modulation (QAM) schemes and provides an increase in rate by up to a factor of two compared with previous methods.

Additionally, this work analyzes the use of frequency modulation and bi-phase waveform coding for the transmitted (interrogating) waveform for high precision range estimation of the transponder location. Compared to previous methods, optimal lower range sidelobes are achieved. Moreover, since both the transmitted (interrogating) waveform coding and the transponder communication coding result in instantaneous phase modulation of the signal, cross-interference between the localization and communication tasks exists. A phase-discriminating algorithm is proposed to make it possible to separate the waveform coding from the communication coding upon reception, and to achieve localization with signal energy increased by up to 3 dB compared with previously reported results.

The joint communication-localization framework also enables a low-complexity receiver design because the same radio is used both for localization and communication.

Simulations comparing the performance of different codes corroborate the theoretical results and offer a possible trade-off between information rate and clutter mitigation, as well as a trade-off in the choice of waveform-channel coding pairs. Experimental results from a brass-board microwave system in an indoor environment are also presented and discussed.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The purpose of this study was to investigate the effects of a short-term low- or high-carbohydrate (CHO) diet consumed after exercise on sympathetic nervous system activity. Twelve healthy males underwent a progressive incremental test; a control measurement of plasma catecholamines and heart rate variability (HRV); an exercise protocol to reduce endogenous CHO stores; a low- or high-CHO diet (counterbalanced order) consumed for 2 days, beginning immediately after the exercise protocol; and a second resting plasma catecholamine and HRV measurement. The exercise and diet protocols and the second round of measurements were performed again after a 1-week washout period. The mean (± SD) values of the standard deviation of R-R intervals were similar between conditions (control, 899.0 ± 146.1 ms; low-CHO diet, 876.8 ± 115.8 ms; and high-CHO diet, 878.7 ± 127.7 ms). The absolute high- and low-frequency (HF and LF, respectively) densities of the HRV power spectrum were also not different between conditions. However, normalized HF and LF (i.e., relative to the total power spectrum) were lower and higher, respectively, in the low-CHO diet than in the control diet (mean ± SD, 17 ± 9 normalized units (NU) and 83 ± 9 NU vs. 27 ± 11 NU and 73 ± 17 NU, respectively; p < 0.05). The LF/HF ratio was higher with the low-CHO diet than with the control diet (mean ± SD, 7.2 ± 6.2 and 4.2 ± 3.2, respectively; p < 0.05). The mean values of plasma catecholamines were not different between diets. These results suggest that the autonomic control of the heart rate was modified after a short-term low-CHO diet, but plasma catecholamine levels were not altered.
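
The normalized HF and LF values and the LF/HF ratio above are simple arithmetic on band powers, which the following sketch makes concrete. It assumes normalization by the sum LF + HF (the reported pairs, 17 + 83 and 27 + 73, each sum to 100, which is consistent with that choice), but the abstract does not state the exact denominator used. The function name and the input powers are hypothetical and are not data from the study.

```python
# Sketch of HRV normalized units, assuming normalization by LF + HF.
# Input powers are hypothetical values in ms^2, not data from the study.

def normalized_units(lf_power, hf_power):
    """Return (LF in NU, HF in NU, LF/HF ratio) for one recording."""
    band_total = lf_power + hf_power
    lf_nu = 100.0 * lf_power / band_total
    hf_nu = 100.0 * hf_power / band_total
    return lf_nu, hf_nu, lf_power / hf_power

lf_nu, hf_nu, lf_hf = normalized_units(lf_power=830.0, hf_power=170.0)
print(lf_nu, hf_nu, round(lf_hf, 2))
```

A group mean of per-subject LF/HF ratios need not equal the ratio of the group mean LF and HF values, so the reported mean ratios cannot be recomputed from the mean NU values alone.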

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The primary objective of this study was to assess the lingual kinematic strategies used by younger and older adults to increase rate of speech. It was hypothesised that the strategies used by the older adults would differ from those of the young adults either as a direct result of, or in response to a need to compensate for, age-related changes in the tongue. Electromagnetic articulography was used to examine the tongue movements of eight young (M = 26.7 years) and eight older (M = 67.1 years) females during repetitions of /ta/ and /ka/ at a controlled moderate rate and then as fast as possible. The younger and older adults were found to significantly reduce consonant durations and increase syllable repetition rate by similar proportions. To achieve these reduced durations both groups appeared to use the same strategy, that of reducing the distances travelled by the tongue. Further comparisons at each rate, however, suggested a speed-accuracy trade-off and increased speech monitoring in the older adults. The results may assist in differentiating articulatory changes associated with normal aging from pathological changes found in disorders that affect the older population.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The purpose of this paper is to provide a cross-linguistic survey of the variation of coding strategies that are available for the grammatical distinction between direct and indirect speech representation with a particular focus on the expression of indirect reported speech. Cross-linguistic data from a sample of 42 languages will be provided to illustrate the range of available grammatical coding strategies.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Visual system abnormalities are commonly encountered in the fetal alcohol syndrome, although the level of exposure at which they become manifest is uncertain. In this study we have examined the effects of either low dose (ETLD) or high dose (ETHD) ethanol, given between postnatal days 4-9, on the axons of the rat optic nerve. Rats were exposed to ethanol vapour in a special chamber for a period of 3 h per day during the treatment period. The blood alcohol concentration averaged ~171 mg/dl in the ETLD animals and ~430 mg/dl in the ETHD animals at the end of the treatment on any given day. Groups of 10- and 30-d-old mother-reared control (MRC), separation control (SC), ETLD and ETHD rats were anaesthetised with an intraperitoneal injection of ketamine and xylazine, and killed by intracardiac perfusion with phosphate-buffered glutaraldehyde. In the 10-d-old rat optic nerves there was a total of ~145000-165000 axons in MRC, SC and ETLD animals. About 4 % of these fibres were myelinated. The differences between these groups were not statistically significant. However, the 10-d-old ETHD animals had only about 75000 optic nerve axons (P < 0.05), of which about 2.8 % were myelinated. By 30 d of age there was a total of between 75000 and 90000 optic nerve axons, irrespective of the group examined. The proportion of axons which were myelinated at this age was still significantly lower (P < 0.001) in the ETHD animals (~77 %) than in the other groups (about 98 %). It is concluded that the normal stages of development and maturation of the rat optic nerve axons, as assessed in this study, can be severely compromised by exposure to a relatively high (but not low) dose of ethanol between postnatal d 4 and 9.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Breast cancer accounts for approximately one quarter of all cancers in females. HER2 gene amplification or HER2 protein overexpression, detected in about 20% of breast carcinomas, predicts a more aggressive clinical course and determines eligibility for targeted therapy with trastuzumab. HER2 testing has become an essential part of the clinical evaluation of all breast carcinoma patients, and accurate HER2 results are critical in identifying patients who may benefit from targeted therapy. This study investigated the concordance in the results of HER2 immunohistochemistry assays performed in 500 invasive breast carcinomas between a reference laboratory and 149 local laboratories from all geographic regions of Brazil. Our results showed an overall poor concordance (171 of 500 cases, 34.2%) regarding HER2 results between local and reference laboratories, which may be related to the low-volume load of HER2 assays, inexperience with the HER2 scoring system, and/or technical issues related to immunohistochemistry in local laboratories. Standardization of HER2 testing with rigorous quality control measures by local laboratories is highly recommended to avoid erroneous treatment of breast cancer patients.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This study reexamined the association between speech rate and memory span in children from kindergarten to sixth grade (N = 152) in order to potentially account for the inconsistencies within the published literature on this topic. Some of the inconsistencies in past research may reflect the different methods adopted in assessing speech rate. In particular, repeating word triples may itself involve memory demands, contaminating the correlation between speech rate and memory span in younger children. Analyses using composite speech rate and memory span measures showed that speech rate for word triples shared variance with memory span that was independent of speech rate for single words. Moreover, speech rate for word triples was largely redundant with age in explaining additional variation in memory span once the effects of speech rate for single words were controlled. (C) 2002 Elsevier Science.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

A major limitation in any high-performance digital communication system is the linearity region of the transmitting amplifier. Nonlinearities typically lead to signal clipping. Efficient communication in such conditions requires maintaining a low peak-to-average power ratio (PAR) in the transmitted signal while achieving a high throughput of data. Excessive PAR leads either to frequent clipping or to inadequate resolution in the analog-to-digital or digital-to-analog converters. Currently proposed signaling schemes for future generation wireless communications suffer from a high PAR. This paper presents a new signaling scheme for channels with clipping which achieves a PAR as low as 3. For a given linear range in the transmitter's digital-to-analog converter, this scheme achieves a lower bit-error rate than existing multicarrier schemes, owing to increased separation between constellation points. We present the theoretical basis for this new scheme, approximations for the expected bit-error rate, and simulation results. (C) 2002 Elsevier Science (USA).
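
For reference, the peak-to-average power ratio mentioned above can be computed from a block of samples as the peak instantaneous power divided by the mean power. The sketch below uses that standard definition; the function names and the two toy signals are illustrative assumptions, and the abstract does not say whether its figure of 3 is a linear ratio or a value in dB.

```python
import math

# Standard definition: PAR = max |x|^2 / mean |x|^2 over a block of samples.
# The signals below are toy examples, not the scheme from the paper.

def par(samples):
    """Peak-to-average power ratio (linear) of real or complex samples."""
    powers = [abs(s) ** 2 for s in samples]
    return max(powers) / (sum(powers) / len(powers))

def par_db(samples):
    """Peak-to-average power ratio expressed in decibels."""
    return 10 * math.log10(par(samples))

constant_envelope = [1, -1, 1, -1]   # every sample has the same power
spiky = [0, 0, 0, 2]                 # all energy concentrated in one sample

print(par(constant_envelope), par(spiky), par_db(spiky))
```

A constant-envelope signal has the minimum possible PAR of 1 (0 dB), while concentrating energy in a few samples raises the ratio, which is what pushes an amplifier or converter with a limited linear range toward clipping.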

Relevance:

40.00% 40.00%

Publisher:

Abstract:

As high dynamic range (HDR) video is gaining popularity, video coding solutions able to efficiently provide both low and high dynamic range video, notably with a single bitstream, are increasingly important. While simulcasting can provide both dynamic range videos at the cost of some compression efficiency penalty, bit-depth scalable video coding can provide a better trade-off between compression efficiency, adaptation flexibility and computational complexity. Considering the widespread use of H.264/AVC video, this paper proposes an H.264/AVC backward compatible bit-depth scalable video coding solution offering a low dynamic range base layer and two high dynamic range enhancement layers with different qualities, at low complexity. Experimental results show that the proposed solution has an acceptable rate-distortion performance penalty relative to the HDR H.264/AVC single-layer coding solution.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

As the wireless cellular market reaches competitive levels never seen before, network operators need to keep Quality of Service (QoS) a main priority if they wish to attract new subscribers while keeping existing customers satisfied. Speech quality as perceived by the end user is one major example of a characteristic in constant need of maintenance and improvement. It is in this topic that this Master's thesis project fits. It makes use of an intrusive method of speech quality evaluation to further study and characterize the performance of speech codecs in second-generation (2G) and third-generation (3G) technologies, looking for further correlation between codecs with similar bit rates and exploring certain transmission parameters which may aid in the assessment of speech quality. Due to some limitations concerning the audio analyzer equipment that was to be employed, a different system for recording the test samples was sought. Although the newly designed system is not standard, after extensive testing and optimization of the system's parameters the final results were found to be reliable and satisfactory. Tests include a set of high and low bit rate codecs for both 2G and 3G, whose values were compared and analysed, leading to the outcome that 3G speech codecs perform better than 2G codecs under approximately the same conditions. This reinforces the idea that 3G is, without doubt, the best choice if the customer looks for the best possible listening speech quality. The transmission parameters chosen for the experiment, the Receiver Quality (RxQual) and the Received Energy per Chip to the Power Density Ratio (Ec/N0), were subject to speech quality correlation tests. Final results for RxQual were compared to those of prior studies from different researchers and are considered to be of significant relevance, leading to the confirmation of RxQual as a reliable indicator of speech quality.
As for Ec/N0, it is not possible to state that it is a speech quality indicator; however, it shows clear thresholds at which the mean opinion score (MOS) values decrease significantly. The studied transmission parameters can be used not only for network management purposes but also to give the communications engineer (or technician) an idea of the expected end-to-end speech quality consequences. With the conclusion of the work, new ideas for future studies come to mind. Considering that fourth-generation (4G) cellular technologies are now beginning to take an important place in the global market, as the first all-IP network structure, it seems of great relevance that 4G speech quality should be subject to evaluation, comparing it to 3G not only in narrowband but also in wideband scenarios with the most recent standard objective method of speech quality assessment, POLQA. Also, new data found in the Ec/N0 tests justify further research with the intention of validating the assumptions made in this work.