210 resultados para k-Error linear complexity
em Indian Institute of Science - Bangalore - Índia
Resumo:
In this paper, we look at the problem of scheduling expression trees with reusable registers on delayed load architectures. Reusable registers come into the picture when the compiler has a data-flow analyzer which is able to estimate the extent of use of the registers. Earlier work considered the same problem without allowing for register variables. Subsequently, Venugopal considered non-reusable registers in the tree. We further extend these efforts to consider a much more general form of the tree. We describe an approximate algorithm for the problem. We formally prove that the code schedule produced by this algorithm will, in the worst case, generate one interlock and use just one more register than that used by the optimal schedule. Spilling is minimized. The approximate algorithm is simple and has linear complexity.
Resumo:
Enhancement of the photoacoustic signal from condensed materials by several folds is achieved by the introduction of a liquid with high vapor pressure in the photoacoustic cell. The enhancement is especially marked for low absorption coefficients and high chopping frequencies. Typically the enhancement is two to nine times in the presence of diethyl ether at 293 K. A linear relationship is observed between the enhancement and the vapor pressure of the liquid.
Resumo:
A Field Programmable Gate Array (FPGA) based hardware accelerator for multi-conductor parasitic capacitance extraction, using Method of Moments (MoM), is presented in this paper. Due to the prohibitive cost of solving a dense algebraic system formed by MoM, linear complexity fast solver algorithms have been developed in the past to expedite the matrix-vector product computation in a Krylov sub-space based iterative solver framework. However, as the number of conductors in a system increases leading to a corresponding increase in the number of right-hand-side (RHS) vectors, the computational cost for multiple matrix-vector products present a time bottleneck, especially for ill-conditioned system matrices. In this work, an FPGA based hardware implementation is proposed to parallelize the iterative matrix solution for multiple RHS vectors in a low-rank compression based fast solver scheme. The method is applied to accelerate electrostatic parasitic capacitance extraction of multiple conductors in a Ball Grid Array (BGA) package. Speed-ups up to 13x over equivalent software implementation on an Intel Core i5 processor for dense matrix-vector products and 12x for QR compressed matrix-vector products is achieved using a Virtex-6 XC6VLX240T FPGA on Xilinx's ML605 board.
Resumo:
3-D full-wave method of moments (MoM) based electromagnetic analysis is a popular means toward accurate solution of Maxwell's equations. The time and memory bottlenecks associated with such a solution have been addressed over the last two decades by linear complexity fast solver algorithms. However, the accurate solution of 3-D full-wave MoM on an arbitrary mesh of a package-board structure does not guarantee accuracy, since the discretization may not be fine enough to capture spatial changes in the solution variable. At the same time, uniform over-meshing on the entire structure generates a large number of solution variables and therefore requires an unnecessarily large matrix solution. In this paper, different refinement criteria are studied in an adaptive mesh refinement platform. Consequently, the most suitable conductor mesh refinement criterion for MoM-based electromagnetic package-board extraction is identified and the advantages of this adaptive strategy are demonstrated from both accuracy and speed perspectives. The results are also compared with those of the recently reported integral equation-based h-refinement strategy. Finally, a new methodology to expedite each adaptive refinement pass is proposed.
Resumo:
In this article, a Field Programmable Gate Array (FPGA)-based hardware accelerator for 3D electromagnetic extraction, using Method of Moments (MoM) is presented. As the number of nets or ports in a system increases, leading to a corresponding increase in the number of right-hand-side (RHS) vectors, the computational cost for multiple matrix-vector products presents a time bottleneck in a linear-complexity fast solver framework. In this work, an FPGA-based hardware implementation is proposed toward a two-level parallelization scheme: (i) matrix level parallelization for single RHS and (ii) pipelining for multiple-RHS. The method is applied to accelerate electrostatic parasitic capacitance extraction of multiple nets in a Ball Grid Array (BGA) package. The acceleration is shown to be linearly scalable with FPGA resources and speed-ups over 10x against equivalent software implementation on a 2.4GHz Intel Core i5 processor is achieved using a Virtex-6 XC6VLX240T FPGA on Xilinx's ML605 board with the implemented design operating at 200MHz clock frequency. (c) 2016 Wiley Periodicals, Inc. Microwave Opt Technol Lett 58:776-783, 2016
Resumo:
Partitional clustering algorithms, which partition the dataset into a pre-defined number of clusters, can be broadly classified into two types: algorithms which explicitly take the number of clusters as input and algorithms that take the expected size of a cluster as input. In this paper, we propose a variant of the k-means algorithm and prove that it is more efficient than standard k-means algorithms. An important contribution of this paper is the establishment of a relation between the number of clusters and the size of the clusters in a dataset through the analysis of our algorithm. We also demonstrate that the integration of this algorithm as a pre-processing step in classification algorithms reduces their running-time complexity.
Resumo:
An n-length block code C is said to be r-query locally correctable, if for any codeword x ∈ C, one can probabilistically recover any one of the n coordinates of the codeword x by querying at most r coordinates of a possibly corrupted version of x. It is known that linear codes whose duals contain 2-designs are locally correctable. In this article, we consider linear codes whose duals contain t-designs for larger t. It is shown here that for such codes, for a given number of queries r, under linear decoding, one can, in general, handle a larger number of corrupted bits. We exhibit to our knowledge, for the first time, a finite length code, whose dual contains 4-designs, which can tolerate a fraction of up to 0.567/r corrupted symbols as against a maximum of 0.5/r in prior constructions. We also present an upper bound that shows that 0.567 is the best possible for this code length and query complexity over this symbol alphabet thereby establishing optimality of this code in this respect. A second result in the article is a finite-length bound which relates the number of queries r and the fraction of errors that can be tolerated, for a locally correctable code that employs a randomized algorithm in which each instance of the algorithm involves t-error correction.
Resumo:
The paper presents two new algorithms for the direct parallel solution of systems of linear equations. The algorithms employ a novel recursive doubling technique to obtain solutions to an nth-order system in n steps with no more than 2n(n −1) processors. Comparing their performance with the Gaussian elimination algorithm (GE), we show that they are almost 100% faster than the latter. This speedup is achieved by dispensing with all the computation involved in the back-substitution phase of GE. It is also shown that the new algorithms exhibit error characteristics which are superior to GE. An n(n + 1) systolic array structure is proposed for the implementation of the new algorithms. We show that complete solutions can be obtained, through these single-phase solution methods, in 5n−log2n−4 computational steps, without the need for intermediate I/O operations.
Resumo:
In this paper, we generalize the existing rate-one space frequency (SF) and space-time frequency (STF) code constructions. The objective of this exercise is to provide a systematic design of full-diversity STF codes with high coding gain. Under this generalization, STF codes are formulated as linear transformations of data. Conditions on these linear transforms are then derived so that the resulting STF codes achieve full diversity and high coding gain with a moderate decoding complexity. Many of these conditions involve channel parameters like delay profile (DP) and temporal correlation. When these quantities are not available at the transmitter, design of codes that exploit full diversity on channels with arbitrary DIP and temporal correlation is considered. Complete characterization of a class of such robust codes is provided and their bit error rate (BER) performance is evaluated. On the other hand, when channel DIP and temporal correlation are available at the transmitter, linear transforms are optimized to maximize the coding gain of full-diversity STF codes. BER performance of such optimized codes is shown to be better than those of existing codes.
Resumo:
In this paper, we present a low-complexity algorithm for detection in high-rate, non-orthogonal space-time block coded (STBC) large-multiple-input multiple-output (MIMO) systems that achieve high spectral efficiencies of the order of tens of bps/Hz. We also present a training-based iterative detection/channel estimation scheme for such large STBC MIMO systems. Our simulation results show that excellent bit error rate and nearness-to-capacity performance are achieved by the proposed multistage likelihood ascent search (M-LAS) detector in conjunction with the proposed iterative detection/channel estimation scheme at low complexities. The fact that we could show such good results for large STBCs like 16 X 16 and 32 X 32 STBCs from Cyclic Division Algebras (CDA) operating at spectral efficiencies in excess of 20 bps/Hz (even after accounting for the overheads meant for pilot based training for channel estimation and turbo coding) establishes the effectiveness of the proposed detector and channel estimator. We decode perfect codes of large dimensions using the proposed detector. With the feasibility of such a low-complexity detection/channel estimation scheme, large-MIMO systems with tens of antennas operating at several tens of bps/Hz spectral efficiencies can become practical, enabling interesting high data rate wireless applications.
Resumo:
A half-duplex constrained non-orthogonal cooperative multiple access (NCMA) protocol suitable for transmission of information from N users to a single destination in a wireless fading channel is proposed. Transmission in this protocol comprises of a broadcast phase and a cooperation phase. In the broadcast phase, each user takes turn broadcasting its data to all other users and the destination in an orthogonal fashion in time. In the cooperation phase, each user transmits a linear function of what it received from all other users as well as its own data. In contrast to the orthogonal extension of cooperative relay protocols to the cooperative multiple access channels wherein at any point of time, only one user is considered as a source and all the other users behave as relays and do not transmit their own data, the NCMA protocol relaxes the orthogonality built into the protocols and hence allows for a more spectrally efficient usage of resources. Code design criteria for achieving full diversity of N in the NCMA protocol is derived using pair wise error probability (PEP) analysis and it is shown that this can be achieved with a minimum total time duration of 2N - 1 channel uses. Explicit construction of full diversity codes is then provided for arbitrary number of users. Since the Maximum Likelihood decoding complexity grows exponentially with the number of users, the notion of g-group decodable codes is introduced for our setup and a set of necesary and sufficient conditions is also obtained.
Resumo:
In recent years a large number of investigators have devoted their efforts to the study of flow and heat transfer in rarefied gases, using the BGK [1] model or the Boltzmann kinetic equation. The velocity moment method which is based on an expansion of the distribution function as a series of orthogonal polynomials in velocity space, has been applied to the linearized problem of shear flow and heat transfer by Mott-Smith [2] and Wang Chang and Uhlenbeck [3]. Gross, Jackson and Ziering [4] have improved greatly upon this technique by expressing the distribution function in terms of half-range functions and it is this feature which leads to the rapid convergence of the method. The full-range moments method [4] has been modified by Bhatnagar [5] and then applied to plane Couette flow using the B-G-K model. Bhatnagar and Srivastava [6] have also studied the heat transfer in plane Couette flow using the linearized B-G-K equation. On the other hand, the half-range moments method has been applied by Gross and Ziering [7] to heat transfer between parallel plates using Boltzmann equation for hard sphere molecules and by Ziering [83 to shear and heat flow using Maxwell molecular model. Along different lines, a moment method has been applied by Lees and Liu [9] to heat transfer in Couette flow using Maxwell's transfer equation rather than the Boltzmann equation for distribution function. An iteration method has been developed by Willis [10] to apply it to non-linear heat transfer problems using the B-G-K model, with the zeroth iteration being taken as the solution of the collisionless kinetic equation. Krook [11] has also used the moment method to formulate the equivalent continuum equations and has pointed out that if the effects of molecular collisions are described by the B-G-K model, exact numerical solutions of many rarefied gas-dynamic problems can be obtained. Recently, these numerical solutions have been obtained by Anderson [12] for the non-linear heat transfer in Couette flow,
Resumo:
The problem of determining whether a Tanner graph for a linear block code has a stopping set of a given size is shown to be NT-complete.
Resumo:
A linear time approximate maximum likelihood decoding algorithm on tail-biting trellises is presented, that requires exactly two rounds on the trellis. This is an adaptation of an algorithm proposed earlier with the advantage that it reduces the time complexity from O(m log m) to O(m) where m is the number of nodes in the tail-biting trellis. A necessary condition for the output of the algorithm to differ from the output of the ideal ML decoder is deduced and simulation results on an AWGN channel using tail-biting trellises for two rate 1/2 convolutional codes with memory 4 and 6 respectively, are reported.
A Low ML-Decoding Complexity, High Coding Gain, Full-Rate, Full-Diversity STBC for 4 x 2 MIMO System
Resumo:
This paper proposes a full-rate, full-diversity space-time block code(STBC) with low maximum likelihood (ML) decoding complexity and high coding gain for the 4 transmit antenna, 2 receive antenna (4 x 2) multiple-input multiple-output (MIMO) system that employs 4/16-QAM. For such a system, the best code known is the DjABBA code and recently, Biglieri, Hong and Viterbo have proposed another STBC (BHV code) for 4-QAM which has lower ML-decoding complexity than the DjABBA code but does not have full-diversity like the DjABBA code. The code proposed in this paper has the same ML-decoding complexity as the BHV code for any square M-QAM but has full-diversity for 4- and 16-QAM. Compared with the DjABBA code, the proposed code has lower ML-decoding complexity for square M-QAM constellation, higher coding gain for 4- and 16-QAM, and hence a better codeword error rate (CER) performance. Simulation results confirming this are presented.