909 results for coding complexity
Abstract:
This thesis investigated the potential use of Linear Predictive Coding in speech communication applications. A Modified Block Adaptive Predictive Coder is developed, which reduces the computational burden and complexity without sacrificing speech quality, as compared to the conventional adaptive predictive coding (APC) system. To this end, changes to the evaluation methods have been developed. The method differs from the usual APC system in that the difference between the true and the predicted value is not transmitted. This allows the high-order predictor in the transmitter section of a predictive coding system to be replaced by a simple delay unit, which makes the transmitter quite simple. Also, the block length used in processing the speech signal is adjusted relative to the pitch period of the signal being processed, rather than being fixed at a constant length as has hitherto been done by other researchers. The efficiency of the newly proposed coder is supported by results of computer simulation using real speech data. Three methods for voiced/unvoiced/silent/transition classification are presented. The first is based on energy, zero-crossing rate and the periodicity of the waveform. The second method uses the normalised correlation coefficient as the main parameter, while the third method utilizes a pitch-dependent correlation factor. The third algorithm, which gives the minimum error probability, is chosen in a later chapter to design the modified coder. The thesis also presents a comparative study between the autocorrelation and the covariance methods used in the evaluation of the predictor parameters. It is shown that the autocorrelation method is superior to the covariance method with respect to filter stability and also in an SNR sense, though the increase in gain is only small.
The Modified Block Adaptive Coder switches from pitch prediction to spectrum prediction when the speech segment changes from a voiced or transition region to an unvoiced region. The experiments conducted in coding, transmission and simulation used speech samples from Malayalam and English phrases. Proposals for a speaker recognition system and a phoneme identification system are also outlined towards the end of the thesis.
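As a rough illustration of the autocorrelation method mentioned in this abstract, the sketch below estimates predictor coefficients for one frame with the Levinson-Durbin recursion. It is not the thesis's coder: the function names, the frame values and the predictor order are assumptions chosen for the example. For a valid autocorrelation sequence the recursion yields reflection coefficients with magnitude below one, which is the usual reason the autocorrelation method gives a stable synthesis filter.

```python
def autocorrelation(frame, max_lag):
    """Unnormalised short-time autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations recursively.

    Returns (coeffs, reflection, error), where the prediction is
    x_hat[n] = sum(coeffs[k] * x[n - 1 - k] for k in range(order)).
    """
    coeffs = [0.0] * order
    reflection = []
    error = float(r[0])
    for i in range(order):
        if error <= 0.0:
            break  # degenerate frame (e.g. all zeros): stop early
        acc = r[i + 1] - sum(coeffs[j] * r[i - j] for j in range(i))
        k = acc / error
        updated = coeffs[:]
        updated[i] = k
        for j in range(i):
            updated[j] = coeffs[j] - k * coeffs[i - 1 - j]
        coeffs = updated
        reflection.append(k)
        error *= 1.0 - k * k  # residual prediction error energy
    return coeffs, reflection, error


# Hypothetical five-sample frame, order-2 predictor.
frame = [1.0, 2.0, 3.0, 2.0, 1.0]
r = autocorrelation(frame, 2)
coeffs, reflection, error = levinson_durbin(r, 2)
```

The covariance method solves a different, non-Toeplitz system and does not carry the same stability guarantee, which is consistent with the comparison reported above.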
Abstract:
In recent years, reversible logic has emerged as one of the most important approaches for power optimization, with applications in low power CMOS, quantum computing and nanotechnology. Low power circuits implemented using reversible logic that provide single error correction and double error detection (SEC-DED) are proposed in this paper. The design uses a new 4 x 4 reversible gate called 'HCG' for implementing Hamming error coding and detection circuits. A parity preserving HCG (PPHCG), which preserves the input parity at the output bits, is used to achieve fault tolerance for the Hamming error coding and detection circuits.
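The SEC-DED behaviour that these circuits implement can be sketched in software. The example below is a minimal extended Hamming (8,4) encoder and decoder; it models only the code itself, not the HCG or PPHCG gates, and the bit layout and function names are assumptions made for illustration.

```python
def secded_encode(data):
    """Encode 4 data bits as an extended Hamming (8,4) codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7
    overall = 0
    for bit in word:
        overall ^= bit
    return word + [overall]    # overall parity bit appended last


def secded_decode(received):
    """Return (data_bits, status); status is 'ok', 'corrected' or 'double'."""
    word = list(received[:7])
    syndrome = 0
    for position in range(1, 8):
        if word[position - 1]:
            syndrome ^= position
    overall = 0
    for bit in received:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flipped bits: assume one and correct it.
        if syndrome != 0:
            word[syndrome - 1] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two errors.
        status = "double"
    data = [word[2], word[4], word[5], word[6]]
    return data, status


codeword = secded_encode([1, 0, 1, 1])
```

Flipping any single bit of `codeword` is corrected by `secded_decode`, while flipping any two bits is reported as `'double'` without attempting a correction.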
Abstract:
The modern telecommunication industry demands higher capacity networks with high data rates. Orthogonal frequency division multiplexing (OFDM) is a promising technique for high data rate wireless communications at reasonable complexity in wireless channels. OFDM has been adopted for many types of wireless systems, such as wireless local area networks (for example IEEE 802.11a) and digital audio/video broadcasting (DAB/DVB). The proposed research focuses on a concatenated coding scheme that improves the performance of OFDM based wireless communications. It uses a Redundant Residue Number System (RRNS) code as the outer code and a convolutional code as the inner code. The bit error rate (BER) performance of the proposed system under different channel conditions is investigated. These conditions include additive white Gaussian noise (AWGN), multipath delay spread, peak power clipping and frame start synchronization error. The simulation results show that the proposed RRNS-Convolutional concatenated coding (RCCC) scheme provides significant improvement in system performance by exploiting the inherent properties of RRNS.
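To make the outer code more concrete, here is a minimal sketch of an RRNS code with single-residue error correction. The moduli (3, 5, 7 as information moduli and 11, 13 as redundant moduli) and the drop-one decoding strategy are assumptions chosen for illustration; the abstract does not state which moduli or decoder the proposed scheme uses.

```python
from math import prod

INFO_MODULI = (3, 5, 7)          # hypothetical information moduli
REDUNDANT_MODULI = (11, 13)      # hypothetical redundant moduli
MODULI = INFO_MODULI + REDUNDANT_MODULI
LEGIT_RANGE = prod(INFO_MODULI)  # valid messages are 0..104


def rrns_encode(value):
    """Represent value by its residues with respect to every modulus."""
    if not 0 <= value < LEGIT_RANGE:
        raise ValueError("value outside the legitimate range")
    return [value % m for m in MODULI]


def crt(residues, moduli):
    """Chinese remainder reconstruction for pairwise coprime moduli."""
    total = prod(moduli)
    value = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        value += r * partial * pow(partial, -1, m)
    return value % total


def rrns_decode(residues):
    """Return the message, correcting at most one wrong residue."""
    value = crt(residues, MODULI)
    if value < LEGIT_RANGE:
        return value                      # no error detected
    for skip in range(len(MODULI)):
        rs = [r for i, r in enumerate(residues) if i != skip]
        ms = [m for i, m in enumerate(MODULI) if i != skip]
        candidate = crt(rs, ms)
        if candidate < LEGIT_RANGE:
            return candidate              # residue `skip` was in error
    return None                           # more errors than can be corrected


received = rrns_encode(100)
received[3] = (received[3] + 4) % 11      # corrupt one residue
recovered = rrns_decode(received)
```

An error is detected because a corrupted residue vector reconstructs to a value outside the legitimate range; discarding the faulty residue brings the reconstruction back inside it.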
Abstract:
Speech signals are one of the most important means of communication among human beings. In this paper, a comparative study of two feature extraction techniques is carried out for recognizing speaker-independent spoken isolated words. The first is a hybrid approach with Linear Predictive Coding (LPC) and Artificial Neural Networks (ANN), and the second uses a combination of Wavelet Packet Decomposition (WPD) and Artificial Neural Networks. Voice signals are sampled directly from the microphone and then processed using these two techniques to extract the features. Words from Malayalam, one of the four major Dravidian languages of southern India, are chosen for recognition. Training, testing and pattern recognition are performed using Artificial Neural Networks. The back propagation method is used to train the ANN. The proposed method is implemented for 50 speakers uttering 20 isolated words each. Both methods produce good recognition accuracy, but Wavelet Packet Decomposition is found to be more suitable for recognizing speech because of its multi-resolution characteristics and efficient time-frequency localization.
Abstract:
While channel coding is a standard method of improving a system's energy efficiency in digital communications, its practice does not extend to high-speed links. Increasing demands in network speeds are placing a large burden on the energy efficiency of high-speed links and render the benefit of channel coding for these systems a timely subject. The low error rates of interest and the presence of residual intersymbol interference (ISI) caused by hardware constraints impede the analysis and simulation of coded high-speed links. Focusing on the residual ISI and combined noise as the dominant error mechanisms, this paper analyses error correlation through concepts of error region, channel signature, and correlation distance. This framework provides a deeper insight into joint error behaviours in high-speed links, extends the range of statistical simulation for coded high-speed links, and provides a case against the use of biased Monte Carlo methods in this setting.
Abstract:
Modeling nonlinear systems using Volterra series is a century-old method, but practical realizations were hampered by hardware inadequate to handle the increased computational complexity stemming from its use. Interest has recently been renewed in designing and implementing filters which can model much of the polynomial nonlinearity inherent in practical systems. The key advantage of resorting to Volterra power series for this purpose is that nonlinear filters so designed can be made to work in parallel with existing LTI systems, yielding improved performance. This paper describes the inclusion of a quadratic predictor (with nonlinearity order 2) with a linear predictor in an analog source coding system. Analog coding schemes generally ignore the source generation mechanisms and focus on high fidelity reconstruction at the receiver. The widely used method of differential pulse code modulation (DPCM) for speech transmission uses a linear predictor to estimate the next value of the input speech signal. But this linear system does not account for the inherent nonlinearities in speech signals arising from multiple reflections in the vocal tract. So a quadratic predictor is designed and implemented in parallel with the linear predictor to yield improved mean square error performance. The augmented speech coder is tested on speech signals transmitted over an additive white Gaussian noise (AWGN) channel.
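The structure described here, a quadratic predictor running in parallel with a linear predictor inside a DPCM loop, can be sketched as follows. The two-tap predictor, the coefficient values, the uniform quantiser step and the noiseless channel are all assumptions made for illustration; the paper's actual predictor design and its AWGN channel are not reproduced.

```python
import math

LINEAR = (1.2, -0.4)              # hypothetical linear predictor taps
QUADRATIC = (0.05, -0.03, 0.02)   # hypothetical second-order Volterra taps
STEP = 0.05                       # hypothetical quantiser step size


def predict(prev1, prev2):
    """Sum of the linear and quadratic predictor outputs."""
    linear = LINEAR[0] * prev1 + LINEAR[1] * prev2
    quadratic = (QUADRATIC[0] * prev1 * prev1
                 + QUADRATIC[1] * prev1 * prev2
                 + QUADRATIC[2] * prev2 * prev2)
    return linear + quadratic


def dpcm_encode(samples):
    """Quantise the prediction error; predict from reconstructed samples."""
    prev1 = prev2 = 0.0
    indices = []
    for x in samples:
        estimate = predict(prev1, prev2)
        index = round((x - estimate) / STEP)
        indices.append(index)
        prev2, prev1 = prev1, estimate + index * STEP
    return indices


def dpcm_decode(indices):
    """Rebuild the signal by running the same predictor at the receiver."""
    prev1 = prev2 = 0.0
    output = []
    for index in indices:
        reconstructed = predict(prev1, prev2) + index * STEP
        output.append(reconstructed)
        prev2, prev1 = prev1, reconstructed
    return output


signal = [math.sin(0.2 * n) + 0.3 * math.sin(0.05 * n) ** 2 for n in range(200)]
decoded = dpcm_decode(dpcm_encode(signal))
```

Because the encoder predicts from its own reconstructed samples, the decoder tracks it exactly on an error-free channel and the reconstruction error stays within half a quantiser step.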
Abstract:
This paper discusses the implementation details of a child-friendly, good quality English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use, and requires little memory. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based and use standard speech databases available in neutral adult voices. Here, reduced memory is achieved by concatenating phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis was used to reduce the algorithmic complexity instead of signal processing techniques. A sufficient degree of customization and generalization catering to the needs of the child user has been included through provision for vocabulary and voice selection. Prosody has also been incorporated. This inexpensive TTS system was implemented in MATLAB, with the synthesis presented by means of a graphical user interface (GUI), thus making it child-friendly. It can be used not only as an interesting language learning aid for the normal child but also as a speech aid for the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).
Abstract:
The presence of microcalcifications in mammograms can be considered an early indication of breast cancer. A fast fractal block coding method to model mammograms for detecting the presence of microcalcifications is presented in this paper. The conventional fractal image coding method takes an enormous amount of time during the fractal block encoding procedure. In the proposed method, the image is divided into shade and non-shade blocks based on the dynamic range, and only non-shade blocks are encoded using the fractal encoding technique. Since the number of image blocks is considerably reduced in the matching domain search pool, a saving of 97.996% of the encoding time is obtained as compared to the conventional fractal coding method for modeling mammograms. The modelled mammograms are used for detecting microcalcifications, and a diagnostic efficiency of 85.7% is obtained for the 28 mammograms used.
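The block classification step can be sketched as below: each block is labelled shade or non-shade by comparing its dynamic range (maximum minus minimum pixel value) with a threshold, and only non-shade blocks would go on to the fractal domain search. The block size, the threshold and the tiny test image are assumptions for illustration; the paper's values are not given in the abstract.

```python
def split_blocks(image, size):
    """Yield (row, col, block) for non-overlapping size x size blocks."""
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]


def dynamic_range(block):
    """Difference between the brightest and darkest pixel in a block."""
    pixels = [p for row in block for p in row]
    return max(pixels) - min(pixels)


def classify_blocks(image, size, threshold):
    """Return (shade, non_shade) lists of block origins."""
    shade, non_shade = [], []
    for r, c, block in split_blocks(image, size):
        if dynamic_range(block) <= threshold:
            shade.append((r, c))       # nearly flat: skip fractal search
        else:
            non_shade.append((r, c))   # detailed: encode with fractal coding
    return shade, non_shade


# Hypothetical 4 x 4 grey-level image with one high-contrast block.
image = [
    [10, 10, 10, 200],
    [10, 11, 12, 10],
    [50, 50, 90, 90],
    [50, 50, 90, 91],
]
shade, non_shade = classify_blocks(image, size=2, threshold=2)
```

In this toy image three of the four blocks are shade blocks, so only one block would enter the costly matching search, which is the mechanism behind the reported time saving.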
Abstract:
Analysis by reduction is a method used in linguistics for checking the correctness of sentences of natural languages. This method is modelled by restarting automata. All types of restarting automata considered in the literature up to now accept at least the deterministic context-free languages. Here we introduce and study a new type of restarting automaton, the so-called t-RL-automaton, which is an RL-automaton that is rather restricted in that it has a window of size one only, and that it works under a minimal acceptance condition. On the other hand, it is allowed to perform up to t rewrite (that is, delete) steps per cycle. Here we study the gap-complexity of these automata. The membership problem for a language that is accepted by a t-RL-automaton with a bounded number of gaps can be solved in polynomial time. On the other hand, t-RL-automata with an unbounded number of gaps accept NP-complete languages.
Abstract:
Analysis by reduction is a method used in linguistics for checking the correctness of sentences of natural languages. This method is modelled by restarting automata. Here we study a new type of restarting automaton, the so-called t-sRL-automaton, which is an RL-automaton that is rather restricted in that it has a window of size 1 only, and that it works under a minimal acceptance condition. On the other hand, it is allowed to perform up to t rewrite (that is, delete) steps per cycle. We focus on the descriptional complexity of these automata, establishing two complexity measures that are both based on the description of t-sRL-automata in terms of so-called meta-instructions. We present some hierarchy results as well as a non-recursive trade-off between deterministic 2-sRL-automata and finite-state acceptors.
Abstract:
Information display technology is a rapidly growing research and development field. Using state-of-the-art technology, optical resolution can be increased dramatically with organic light-emitting diodes (OLEDs), since the light-emitting layer is very thin, under 100 nm. The main question is what pixel size is technologically achievable. The next generation of displays will consider three-dimensional image display. In 2D, one considers vertical and horizontal resolution. In 3D or holographic images, there is another dimension: depth. The major requirement is high resolution in the horizontal dimension in order to sustain the third dimension using special lenticular glass or barrier masks that provide separate views for each eye. A high-resolution 3D display offers hundreds more different views of objects or landscapes. OLEDs have the potential to be a key technology for information displays in the future. The display technology presented in this work promises to bring into use bright colour 3D flat panel displays in a unique way. Unlike the conventional TFT matrix, OLED displays have constant brightness and colour, independent of the viewing angle, i.e. the observer's position in front of the screen. A sandwich (just 0.1 micron thick) of organic thin films between two conductors makes an OLED display device. These special materials are named electroluminescent organic semiconductors (or organic photoconductors, OPC). When electrical current is applied, a bright light is emitted (electrophosphorescence) from the formed organic light-emitting diode. Usually for OLEDs an ITO layer is used as a transparent electrode. Such types of displays were the first for volume manufacture and only a few products are available in the market at present. The key challenges that OLED technology faces in the application areas are producing high-quality white light, achieving low manufacturing costs, and increasing efficiency and lifetime at high brightness.
Looking towards the future, by combining OLEDs with specially constructed surface lenses and proper image management software it will be possible to achieve 3D images.
Abstract:
This work deals with the influence of visually perceived motion features on an observer's action control. Specifically, it examines how motion direction and motion speed, as task-irrelevant stimuli, affect the execution of motor responses to colour stimuli and thereby produce faster or delayed reaction times. Previous studies on this topic were limited to linear movements (from right to left and vice versa) and very simple stimulus environments (movements of simple geometric symbols, point clouds, point-light walkers, etc.) (e.g. Ehrenstein, 1994; Bosbach, 2004; Wittfoth, Buck, Fahle & Herrmann, 2006). In this dissertation, the validity of these findings was extended to rotational and depth movements as well as complex forms of movement (human movement sequences in sport), worked through theoretically, and tested empirically in a series of six reaction-time experiments using the Simon paradigm. Common to all experiments was that participants at a computer monitor were to respond to a colour change within the dynamic visual stimulus by pressing a key (positioned left, right, proximally or distally), with the speed and direction of the movements being irrelevant to the responses. Regarding the influence of rotational movements of geometric symbols (Exp. 1 and 1a) and of human rotational movements (Exp. 2), the results show that participants respond significantly faster when the directional information of a rotational movement is compatible with the spatial features of the required key response. The degree of complexity of the visual event plays no role here. For the cognitive processing of the motion stimulus, the decisive spatial criterion is not the sense of rotation but the relative direction of motion above and below the axis of rotation. Regarding the influence of spatial depth movements of a sphere (Exp. 3) and of a walking person (Exp. 4), our findings show that participants respond significantly faster when the stimulus moves towards the observer and a proximal rather than a distal key press is required, and vice versa. Here, too, the degree of complexity of the visual event plays no role. In both experiments, perceiving the direction of motion leads to an action induction that produces fast action execution in the compatible case and delayed execution in the incompatible case. Experiments 5 and 6 examined the influence of perceived human running movements (free running vs. treadmill running), which took place with and without a change of position. The results showed that, regardless of the change of position, running speed does not modulate the direction-based Simon effect. In summary, the study results fit well into effect-based concepts of action control (e.g. the Theory of Event Coding of Hommel et al., 2001). Further studies are needed to transfer these results to gross-motor responses and to displays that are modelled more closely on visually perceivable events in sport.
Abstract:
Signalling off-chip requires significant current. As a result, a chip's power-supply current changes drastically during certain output-bus transitions. These current fluctuations cause a voltage drop between the chip and circuit board due to the parasitic inductance of the power-supply package leads. Digital designers often go to great lengths to reduce this "transmitted" noise. Cray, for instance, carefully balances output signals using a technique called differential signalling to guarantee a chip has constant output current. Transmitted-noise reduction costs Cray a factor of two in output pins and wires. Coding achieves similar results at smaller costs.
Abstract:
This thesis attempts to quantify the amount of information needed to learn certain tasks. The tasks chosen vary from learning functions in a Sobolev space using radial basis function networks to learning grammars in the principles and parameters framework of modern linguistic theory. These problems are analyzed from the perspective of computational learning theory and certain unifying perspectives emerge.
Abstract:
The goal of this article is to reveal the computational structure of modern principle-and-parameter (Chomskian) linguistic theories: what computational problems do these informal theories pose, and what is the underlying structure of those computations? To do this, I analyze the computational complexity of human language comprehension: what linguistic representation is assigned to a given sound? This problem is factored into smaller, interrelated (but independently statable) problems. For example, in order to understand a given sound, the listener must assign a phonetic form to the sound; determine the morphemes that compose the words in the sound; and calculate the linguistic antecedent of every pronoun in the utterance. I prove that these and other subproblems are all NP-hard, and that language comprehension is itself PSPACE-hard.