943 resultados para Textual complexity for Romanian language
Resumo:
We have developed two reduced complexity bit-allocation algorithms for MP3/AAC based audio encoding, which can be useful at low bit-rates. One algorithm derives optimum bit-allocation using constrained optimization of weighted noise-to-mask ratio and the second algorithm uses decoupled iterations for distortion control and rate control, with convergence criteria. MUSHRA based evaluation indicated that the new algorithm would be comparable to AAC but requiring only about 1/10 th the complexity.
Resumo:
We present a improved language modeling technique for Lempel-Ziv-Welch (LZW) based LID scheme. The previous approach to LID using LZW algorithm prepares the language pattern table using LZW algorithm. Because of the sequential nature of the LZW algorithm, several language specific patterns of the language were missing in the pattern table. To overcome this, we build a universal pattern table, which contains all patterns of different length. For each language it's corresponding language specific pattern table is constructed by retaining the patterns of the universal table whose frequency of appearance in the training data is above the threshold.This approach reduces the classification score (Compression Ratio [LZW-CR] or the weighted discriminant score[LZW-WDS]) for non native languages and increases the LID performance considerably.
Resumo:
We present a new approach to spoken language modeling for language identification (LID) using the Lempel-Ziv-Welch (LZW) algorithm. The LZW technique is applicable to any kind of tokenization of the speech signal. Because of the efficiency of LZW algorithm to obtain variable length symbol strings in the training data, the LZW codebook captures the essentials of a language effectively. We develop two new deterministic measures for LID based on the LZW algorithm namely: (i) Compression ratio score (LZW-CR) and (ii) weighted discriminant score (LZW-WDS). To assess these measures, we consider error-free tokenization of speech as well as artificially induced noise in the tokenization. It is shown that for a 6 language LID task of OGI-TS database with clean tokenization, the new model (LZW-WDS) performs slightly better than the conventional bigram model. For noisy tokenization, which is the more realistic case, LZW-WDS significantly outperforms the bigram technique
Resumo:
Precoding for multiple-input multiple-output (MIMO) antenna systems is considered with perfect channel knowledge available at both the transmitter and the receiver. For two transmit antennas and QAM constellations, a real-valued precoder which is approximately optimal (with respect to the minimum Euclidean distance between points in the received signal space) among real-valued precoders based on the singular value decomposition (SVD) of the channel is proposed. The proposed precoder is obtainable easily for arbitrary QAM constellations, unlike the known complex-valued optimal precoder by Collin et al. for two transmit antennas which is in existence for 4-QAM alone and is extremely hard to obtain for larger QAM constellations. The proposed precoding scheme is extended to higher number of transmit antennas on the lines of the E - d(min) precoder for 4-QAM by Vrigneau et al. which is an extension of the complex-valued optimal precoder for 4-QAM. The proposed precoder's ML-decoding complexity as a function of the constellation size M is only O(root M)while that of the E - d(min) precoder is O(M root M)(M = 4). Compared to the recently proposed X- and Y-precoders, the error performance of the proposed precoder is significantly better while being only marginally worse than that of the E - d(min) precoder for 4-QAM. It is argued that the proposed precoder provides full-diversity for QAM constellations and this is supported by simulation plots of the word error probability for 2 x 2, 4 x 4 and 8 x 8 systems.
Resumo:
In this paper, we deal with low-complexity near-optimal detection/equalization in large-dimension multiple-input multiple-output inter-symbol interference (MIMO-ISI) channels using message passing on graphical models. A key contribution in the paper is the demonstration that near-optimal performance in MIMO-ISI channels with large dimensions can be achieved at low complexities through simple yet effective simplifications/approximations, although the graphical models that represent MIMO-ISI channels are fully/densely connected (loopy graphs). These include 1) use of Markov random field (MRF)-based graphical model with pairwise interaction, in conjunction with message damping, and 2) use of factor graph (FG)-based graphical model with Gaussian approximation of interference (GAI). The per-symbol complexities are O(K(2)n(t)(2)) and O(Kn(t)) for the MRF and the FG with GAI approaches, respectively, where K and n(t) denote the number of channel uses per frame, and number of transmit antennas, respectively. These low-complexities are quite attractive for large dimensions, i.e., for large Kn(t). From a performance perspective, these algorithms are even more interesting in large-dimensions since they achieve increasingly closer to optimum detection performance for increasing Kn(t). Also, we show that these message passing algorithms can be used in an iterative manner with local neighborhood search algorithms to improve the reliability/performance of M-QAM symbol detection.
Resumo:
In this paper, we give a new framework for constructing low ML decoding complexity space-time block codes (STBCs) using codes over the Klein group K. Almost all known low ML decoding complexity STBCs can be obtained via this approach. New full- diversity STBCs with low ML decoding complexity and cubic shaping property are constructed, via codes over K, for number of transmit antennas N = 2(m), m >= 1, and rates R > 1 complex symbols per channel use. When R = N, the new STBCs are information- lossless as well. The new class of STBCs have the least knownML decoding complexity among all the codes available in the literature for a large set of (N, R) pairs.
Resumo:
Current scientific research is characterized by increasing specialization, accumulating knowledge at a high speed due to parallel advances in a multitude of sub-disciplines. Recent estimates suggest that human knowledge doubles every two to three years – and with the advances in information and communication technologies, this wide body of scientific knowledge is available to anyone, anywhere, anytime. This may also be referred to as ambient intelligence – an environment characterized by plentiful and available knowledge. The bottleneck in utilizing this knowledge for specific applications is not accessing but assimilating the information and transforming it to suit the needs for a specific application. The increasingly specialized areas of scientific research often have the common goal of converting data into insight allowing the identification of solutions to scientific problems. Due to this common goal, there are strong parallels between different areas of applications that can be exploited and used to cross-fertilize different disciplines. For example, the same fundamental statistical methods are used extensively in speech and language processing, in materials science applications, in visual processing and in biomedicine. Each sub-discipline has found its own specialized methodologies making these statistical methods successful to the given application. The unification of specialized areas is possible because many different problems can share strong analogies, making the theories developed for one problem applicable to other areas of research. It is the goal of this paper to demonstrate the utility of merging two disparate areas of applications to advance scientific research. The merging process requires cross-disciplinary collaboration to allow maximal exploitation of advances in one sub-discipline for that of another. We will demonstrate this general concept with the specific example of merging language technologies and computational biology.
Resumo:
Parallel sub-word recognition (PSWR) is a new model that has been proposed for language identification (LID) which does not need elaborate phonetic labeling of the speech data in a foreign language. The new approach performs a front-end tokenization in terms of sub-word units which are designed by automatic segmentation, segment clustering and segment HMM modeling. We develop PSWR based LID in a framework similar to the parallel phone recognition (PPR) approach in the literature. This includes a front-end tokenizer and a back-end language model, for each language to be identified. Considering various combinations of the statistical evaluation scores, it is found that PSWR can perform as well as PPR, even with broad acoustic sub-word tokenization, thus making it an efficient alternative to the PPR system.
Resumo:
Since the days of Digital Subscriber Links (DSL), time domain equalizers (TEQ's) have been used to combat time dispersive channels in Multicarrier Systems. In this paper, we propose computationally inexpensive techniques to recompute TEQ weights in the presence of changes in the channel, especially over fast fading channels. The techniques use no extra information except the perturbation to the channel itself, and provide excellent approximations to the new TEQ weights. Adaptation methods for two existing Channel shortening algorithms are proposed and their performance over randomly varying, randomly perturbed channels is studied. The proposed adaptation techniques are shown to perform admirably well for small changes in channels for OFDM systems. (C) 2012 Elsevier GmbH. All rights reserved.
Resumo:
A new solvatomorph of gallic acid was generated using chiral additive technique and characterized by single crystal and powder X-ray diffraction, C-13 NMR, IR spectroscopic techniques and thermal analysis. The supramolecular channels formed by hexameric motifs of gallic acid and solvent molecules contain highly disordered solvent molecules with fractional occupancies. © 2012 Elsevier B.V.
Suite of tools for statistical N-gram language modeling for pattern mining in whole genome sequences
Resumo:
Genome sequences contain a number of patterns that have biomedical significance. Repetitive sequences of various kinds are a primary component of most of the genomic sequence patterns. We extended the suffix-array based Biological Language Modeling Toolkit to compute n-gram frequencies as well as n-gram language-model based perplexity in windows over the whole genome sequence to find biologically relevant patterns. We present the suite of tools and their application for analysis on whole human genome sequence.
Resumo:
Suppose G = (V, E) is a simple graph and k is a fixed positive integer. A subset D subset of V is a distance k-dominating set of G if for every u is an element of V. there exists a vertex v is an element of D such that d(G)(u, v) <= k, where d(G)(u, v) is the distance between u and v in G. A set D subset of V is a distance k-paired-dominating set of G if D is a distance k-dominating set and the induced subgraph GD] contains a perfect matching. Given a graph G = (V, E) and a fixed integer k > 0, the MIN DISTANCE k-PAIRED-DOM SET problem is to find a minimum cardinality distance k-paired-dominating set of G. In this paper, we show that the decision version of MIN DISTANCE k-PAIRED-DOM SET iS NP-complete for undirected path graphs. This strengthens the complexity of decision version Of MIN DISTANCE k-PAIRED-DOM SET problem in chordal graphs. We show that for a given graph G, unless NP subset of DTIME (n(0)((log) (log) (n)) MIN DISTANCE k-PAIRED-Dom SET problem cannot be approximated within a factor of (1 -epsilon ) In n for any epsilon > 0, where n is the number of vertices in G. We also show that MIN DISTANCE k-PAIRED-DOM SET problem is APX-complete for graphs with degree bounded by 3. On the positive side, we present a linear time algorithm to compute the minimum cardinality of a distance k-paired-dominating set of a strongly chordal graph G if a strong elimination ordering of G is provided. We show that for a given graph G, MIN DISTANCE k-PAIRED-DOM SET problem can be approximated with an approximation factor of 1 + In 2 + k . In(Delta(G)), where Delta(G) denotes the maximum degree of G. (C) 2012 Elsevier B.V All rights reserved.