18 results for data complexity
at Indian Institute of Science - Bangalore - India
Abstract:
Several techniques are known for searching an ordered collection of data. The techniques and analyses of retrieval methods based on primary attributes are straightforward, whereas retrieval using secondary attributes depends on several factors. For secondary-attribute retrieval, the linear structures (inverted lists, multilists, doubly linked lists) and the recently proposed nonlinear tree structures, the multiple attribute tree (MAT) and the k-d tree (kdT), have their individual merits. It is shown in this paper that, of the two tree structures, MAT possesses several features of a systematic data structure for external file organisation which make it superior to kdT. Analytic estimates for the complexity of node searches in MAT and kdT are developed and compared for several types of queries.
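As a concrete point of reference for the kdT side of this comparison, here is a minimal in-memory k-d tree with an exact-match search that counts visited nodes. It is purely illustrative: the paper's MAT and kdT are external file organisations, and all names below are hypothetical; the sketch only shows the node-search count that the analytic estimates characterise.

```python
# Minimal k-d tree over 2-attribute records with an exact-match search.
# Illustrative only; the paper analyses disk-based file structures.

class Node:
    def __init__(self, record, depth):
        self.record = record           # tuple of attribute values
        self.depth = depth             # discriminator = depth % k
        self.left = self.right = None

def insert(root, record, depth=0, k=2):
    if root is None:
        return Node(record, depth)
    d = depth % k
    if record[d] < root.record[d]:
        root.left = insert(root.left, record, depth + 1, k)
    else:
        root.right = insert(root.right, record, depth + 1, k)
    return root

def search(root, query, depth=0, k=2, visited=None):
    """Exact-match query; `visited` counts node searches."""
    if visited is None:
        visited = [0]
    if root is None:
        return None, visited[0]
    visited[0] += 1
    if root.record == query:
        return root, visited[0]
    d = depth % k
    branch = root.left if query[d] < root.record[d] else root.right
    return search(branch, query, depth + 1, k, visited)

root = None
for rec in [(5, 3), (2, 7), (8, 1), (6, 4)]:
    root = insert(root, rec)
node, visits = search(root, (6, 4))
print(visits)  # number of nodes visited for this query
```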
Abstract:
Research in software science has so far concentrated on three measures of program complexity: (a) software effort; (b) cyclomatic complexity; and (c) program knots. In this paper we propose a measure of the logical complexity of programs in terms of the variable dependencies among sequences of computations, the inductive effort in writing loops, and the complexity of data structures. The proposed complexity measure is described with the aid of a graph which exhibits diagrammatically the dependence of a computation at a node upon the computations at other (earlier) nodes. Complexity measures of several example programs have been computed, and the related issues are discussed. The paper also describes the role played by data structures in determining program complexity.
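The dependency-graph idea can be pictured with a toy example. The sketch below is not the paper's measure, which the abstract does not fully specify; it merely shows a computation-dependency graph for a hypothetical three-statement program and two simple counts one might take over it.

```python
# Illustrative sketch only: each node is a computation; its edge list names
# the earlier computations (and loop-carried self-dependencies) it reads.

deps = {
    "s=0":      [],
    "s=s+a[i]": ["s=0", "a[i]", "s=s+a[i]"],  # self-edge: inductive (loop) dependency
    "m=s/n":    ["s=s+a[i]", "n"],
}

edges = sum(len(v) for v in deps.values())
inductive = sum(1 for node, vs in deps.items() if node in vs)
print("dependency edges:", edges, "| loop-carried dependencies:", inductive)
```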
Abstract:
In this paper, we present an approach to estimate the fractal complexity of discrete-time signal waveforms based on computing the area bounded by the sample points of the signal at different time resolutions. The slope of the best straight-line fit to the graph of log(A(r_k)/r_k^2) versus log(1/r_k) is estimated, where A(r_k) is the area computed at time resolution r_k and the r_k are the time resolutions at which the areas have been computed. The slope quantifies the complexity of the signal and is taken as an estimate of the fractal dimension (FD). The proposed approach is used to estimate the fractal dimension of parametric fractal signals with known fractal dimensions, and the method has given accurate results. The estimation accuracy of the method is compared with that of Higuchi's and Sevcik's methods. The proposed method has given more accurate results than Sevcik's method, and its results are comparable to those of Higuchi's method. The practical application of the complexity measure in detecting changes in signal complexity is discussed using real sleep electroencephalogram recordings from eight different subjects. The FD-based approach has shown good performance in discriminating different stages of sleep.
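The slope-fit machinery can be sketched directly from the formula above. In the sketch below, the trapezoidal sum is only an assumed stand-in for the paper's area A(r_k), and the resolutions r_k are taken as plain decimation factors; the point is the log-log regression, not the exact area definition.

```python
# Fit log(A(r_k)/r_k^2) against log(1/r_k); the slope is the FD estimate.

import numpy as np

def area_at_resolution(x, r):
    """Assumed proxy for A(r): trapezoidal area of the signal sampled every r points."""
    xs = x[::r]
    return r * np.sum(np.abs(xs[:-1] + xs[1:]) / 2.0)

def fd_estimate(x, resolutions=(1, 2, 4, 8, 16)):
    lx = [np.log(1.0 / r) for r in resolutions]
    ly = [np.log(area_at_resolution(x, r) / r**2) for r in resolutions]
    slope, _ = np.polyfit(lx, ly, 1)   # best straight-line fit
    return slope

rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(4096))  # Brownian-like test waveform
print(fd_estimate(signal))
```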
Abstract:
In this paper, we present a low-complexity algorithm for detection in high-rate, non-orthogonal space-time block coded (STBC) large multiple-input multiple-output (MIMO) systems that achieve high spectral efficiencies of the order of tens of bps/Hz. We also present a training-based iterative detection/channel estimation scheme for such large STBC MIMO systems. Our simulation results show that excellent bit error rate and nearness-to-capacity performance are achieved by the proposed multistage likelihood ascent search (M-LAS) detector in conjunction with the proposed iterative detection/channel estimation scheme at low complexities. The fact that we could show such good results for large STBCs, like the 16×16 and 32×32 STBCs from Cyclic Division Algebras (CDA) operating at spectral efficiencies in excess of 20 bps/Hz (even after accounting for the overheads of pilot-based training for channel estimation and turbo coding), establishes the effectiveness of the proposed detector and channel estimator. We decode perfect codes of large dimensions using the proposed detector. With the feasibility of such a low-complexity detection/channel estimation scheme, large-MIMO systems with tens of antennas operating at several tens of bps/Hz spectral efficiencies can become practical, enabling interesting high data rate wireless applications.
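The core of a likelihood ascent search is a greedy descent on the ML cost ||y - Hx||^2 over one-symbol flips. The sketch below shows this basic single-stage ascent for BPSK with a zero-forcing initial vector; it is a simplified stand-in for the paper's multistage M-LAS detector, and all sizes and noise levels are illustrative assumptions.

```python
# Minimal likelihood ascent search (LAS) for BPSK over y = Hx + n.

import numpy as np

def las_detect(y, H, x0):
    x = x0.copy()
    cost = np.sum((y - H @ x) ** 2)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] = -x[i]                     # flip one BPSK symbol
            c = np.sum((y - H @ x) ** 2)
            if c < cost:
                cost, improved = c, True     # keep the flip
            else:
                x[i] = -x[i]                 # undo
    return x

rng = np.random.default_rng(1)
n_tx = 16
H = rng.standard_normal((n_tx, n_tx)) / np.sqrt(n_tx)
x_true = rng.choice([-1.0, 1.0], n_tx)
y = H @ x_true + 0.05 * rng.standard_normal(n_tx)
x0 = np.sign(np.linalg.pinv(H) @ y)              # zero-forcing initial vector
print(np.mean(las_detect(y, H, x0) != x_true))   # symbol error rate
```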
Abstract:
A half-duplex constrained non-orthogonal cooperative multiple access (NCMA) protocol suitable for transmission of information from N users to a single destination in a wireless fading channel is proposed. Transmission in this protocol comprises a broadcast phase and a cooperation phase. In the broadcast phase, each user takes turns broadcasting its data to all other users and the destination in an orthogonal fashion in time. In the cooperation phase, each user transmits a linear function of what it received from all other users as well as its own data. In contrast to the orthogonal extension of cooperative relay protocols to cooperative multiple access channels, wherein at any point in time only one user is considered a source and all the other users behave as relays and do not transmit their own data, the NCMA protocol relaxes the orthogonality built into those protocols and hence allows for a more spectrally efficient usage of resources. Code design criteria for achieving full diversity of N in the NCMA protocol are derived using pairwise error probability (PEP) analysis, and it is shown that full diversity can be achieved with a minimum total time duration of 2N - 1 channel uses. Explicit construction of full-diversity codes is then provided for an arbitrary number of users. Since the maximum-likelihood decoding complexity grows exponentially with the number of users, the notion of g-group decodable codes is introduced for our setup and a set of necessary and sufficient conditions is also obtained.
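One consistent reading of the 2N - 1 figure is N orthogonal broadcast slots followed by N - 1 cooperation slots. The toy schedule below illustrates that accounting for N = 4 users; slot labels are hypothetical and carry no protocol detail beyond the two phases named above.

```python
# Toy slot accounting for the 2N - 1 channel uses claim.

def ncma_schedule(n_users):
    slots = [f"slot {t}: user {t} broadcasts its own data"
             for t in range(1, n_users + 1)]
    slots += [f"slot {n_users + t}: users transmit linear functions of received data"
              for t in range(1, n_users)]
    return slots

schedule = ncma_schedule(4)                    # N = 4 users
for s in schedule:
    print(s)
print("total channel uses:", len(schedule))    # -> 7 = 2*4 - 1
```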
Abstract:
Hole-doped perovskites such as La1-xCaxMnO3 present special magnetic and magnetotransport properties, and it is commonly accepted that the local atomic structure around the Mn ions plays a crucial role in determining these peculiar features. Therefore, experimental techniques that directly probe the local atomic structure, like x-ray absorption spectroscopy (XAS), have been widely exploited to understand the physics of these compounds in depth. Quantitative XAS analysis usually concerns the extended region [extended x-ray absorption fine structure (EXAFS)] of the absorption spectra. The near-edge region [x-ray absorption near-edge spectroscopy (XANES)] of XAS spectra can provide detailed complementary information on the electronic structure and local atomic topology around the absorber. However, the complexity of XANES analysis usually prevents a quantitative understanding of the data. This work exploits the recently developed MXAN code to achieve a quantitative structural refinement of the Mn K-edge XANES of the LaMnO3 and CaMnO3 compounds, the end members of the doped manganite series La1-xCaxMnO3. The results derived from the EXAFS and XANES analyses are in good agreement, demonstrating that a quantitative picture of the local structure can be obtained from XANES in these crystalline compounds. Moreover, the quantitative XANES analysis provides topological information not directly obtainable from EXAFS data analysis. This work demonstrates that combining the analysis of the extended and near-edge regions of Mn K-edge XAS spectra can provide a complete and accurate description of the local atomic environment of Mn in these compounds.
Abstract:
Non-orthogonal space-time block codes (STBC) from cyclic division algebras (CDA) are attractive because they can simultaneously achieve both high spectral efficiencies (the same spectral efficiency as V-BLAST for a given number of transmit antennas) and full transmit diversity. Decoding of non-orthogonal STBCs with hundreds of dimensions has been a challenge. In this paper, we present a probabilistic data association (PDA) based algorithm for decoding non-orthogonal STBCs with large dimensions. Our simulation results show that the proposed PDA-based algorithm achieves near-SISO-AWGN uncoded BER as well as near-capacity coded BER (within 5 dB of the theoretical capacity) for large non-orthogonal STBCs from CDA. We study the effect of spatial correlation on the BER and show that the performance loss due to spatial correlation can be alleviated by providing more receive spatial dimensions. We report good BER performance when a training-based iterative decoding/channel estimation scheme is used (instead of assuming perfect channel knowledge) in channels with large coherence times. A comparison of the performance of the PDA algorithm and the likelihood ascent search (LAS) algorithm (reported in our recent work) is also presented.
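At its core, PDA detection maintains a probability per symbol and iteratively refines it after cancelling the soft interference of the other symbols. The sketch below is a heavily simplified PDA-style update for BPSK on a generic linear model y = Hx + n, not the paper's STBC decoder; the LLR scaling and all parameters are assumptions.

```python
# Simplified PDA-style soft-interference-cancellation update for BPSK.

import numpy as np

def pda_detect(y, H, sigma2, iters=10):
    n = H.shape[1]
    p = np.full(n, 0.5)                       # P(x_i = +1)
    for _ in range(iters):
        for i in range(n):
            soft = 2 * p - 1                  # soft symbol means
            soft[i] = 0.0                     # exclude own contribution
            resid = y - H @ soft              # cancel soft interference
            llr = 2.0 * (H[:, i] @ resid) / sigma2
            p[i] = 1.0 / (1.0 + np.exp(-np.clip(llr, -30, 30)))
    return np.where(p > 0.5, 1.0, -1.0)

rng = np.random.default_rng(2)
n = 8
H = rng.standard_normal((n, n)) / np.sqrt(n)
x = rng.choice([-1.0, 1.0], n)
sigma2 = 0.01
y = H @ x + np.sqrt(sigma2) * rng.standard_normal(n)
print(np.mean(pda_detect(y, H, sigma2) != x))  # symbol error rate
```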
Abstract:
Large MIMO systems with tens of antennas in each communication terminal, using full-rate non-orthogonal space-time block codes (STBC) from Cyclic Division Algebras (CDA), can achieve the benefits of both transmit diversity and high spectral efficiency. Maximum-likelihood (ML) or near-ML decoding of these large-sized STBCs at low complexity, however, has been a challenge. In this paper, we establish that near-ML decoding of these large STBCs is possible at practically affordable low complexities. We show that the likelihood ascent search (LAS) detector, reported earlier by us for V-BLAST, is able to achieve near-ML uncoded BER performance in decoding a 32×32 STBC from CDA, which employs 32 transmit antennas and sends 32^2 = 1024 complex data symbols in 32 time slots in one STBC matrix (i.e., 32 data symbols sent per channel use). In terms of coded BER, with a 16×16 STBC, a rate-3/4 turbo code and 4-QAM (i.e., 24 bps/Hz), the LAS detector performs within just about 4 dB of the theoretical MIMO capacity. Our results further show that, with LAS detection, information lossless (ILL) STBCs perform almost as well as full-diversity ILL (FD-ILL) STBCs. Such low-complexity detectors can potentially enable implementation of high-spectral-efficiency large MIMO systems that could be considered in wireless standards.
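The 24 bps/Hz figure quoted above can be checked with a line of arithmetic: a full-rate n×n STBC sends n^2 symbols in n channel uses, i.e. n symbols per channel use.

```python
# Worked check of the 24 bps/Hz spectral efficiency for the 16x16 case.

n_tx = 16                            # transmit antennas
symbols_per_use = n_tx**2 / n_tx     # full-rate STBC: n^2 symbols in n slots
bits_per_symbol = 2                  # 4-QAM
code_rate = 3 / 4                    # turbo code rate
print(symbols_per_use * bits_per_symbol * code_rate)  # -> 24.0 bps/Hz
```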
Abstract:
This paper presents a fast algorithm for data exchange in a network of processors organized as a reconfigurable tree structure. For a given data exchange table, the algorithm generates a sequence of tree configurations in which the data exchanges are to be executed. A significant feature of the algorithm is that each exchange is executed in a tree configuration in which the source and destination nodes are adjacent to each other. It is proved in a theorem that for every pair of nodes in the reconfigurable tree structure there always exist exactly two configurations in which the two nodes are adjacent to each other. The algorithm exploits this fact and determines the solution so as to optimize both the number of configurations required and the time to perform the data exchanges. Analysis shows that the algorithm has linear time complexity and provides a large reduction in run-time compared with a previously proposed algorithm. This is confirmed by experimental results obtained by executing a large number of randomly generated data exchange tables. Another significant feature of the algorithm is that the routing information code is always two bits, irrespective of the number of nodes in the tree. This not only increases the speed of the algorithm but also results in simpler hardware inside each node.
Abstract:
Very Long Instruction Word (VLIW) architectures exploit instruction-level parallelism (ILP) with the help of the compiler to achieve higher instruction throughput with minimal hardware. However, control and data dependencies between operations limit the available ILP, which not only hinders the scalability of VLIW architectures but also results in code size expansion. Although speculation and predicated execution mitigate the ILP limitations due to control dependencies to a certain extent, they increase hardware cost and exacerbate code size expansion. Simultaneous multistreaming (SMS) can significantly improve operation throughput by allowing interleaved execution of operations from multiple instruction streams. In this paper we study SMS for VLIW architectures and quantify its benefits using a case study of the MPEG-2 video decoder. We also propose the notion of virtual resources for VLIW architectures, which decouple architectural resources (resources exposed to the compiler) from microarchitectural resources, to limit code size expansion. Our results for a VLIW architecture demonstrate that: (1) SMS delivers much higher throughput than that achieved by speculation and predicated execution; (2) the increase in performance due to the addition of speculation and predicated execution support over SMS averages around 12%, a minor increase that might not warrant the additional hardware complexity involved; and (3) the notion of virtual resources is very effective in reducing no-operations (NOPs) and consequently reduces code size with little or no impact on performance.
Abstract:
Background: Temporal analysis of gene expression data has been limited to identifying genes whose expression varies with time and/or correlations between genes that have similar temporal profiles. Often, these methods do not consider the underlying network constraints that connect the genes. It is becoming increasingly evident that interactions change substantially with time. Thus far, there has been no systematic method to relate the temporal changes in gene expression to the dynamics of the interactions between them. Information on interaction dynamics would open up possibilities for discovering new mechanisms of regulation by providing valuable insight into identifying time-sensitive interactions, as well as permitting studies on the effect of a genetic perturbation. Results: We present NETGEM, a tractable model rooted in Markov dynamics, for analyzing the dynamics of the interactions between proteins based on the dynamics of the expression changes of the genes that encode them. The model treats the interaction strengths as random variables which are modulated by suitable priors. This approach is necessitated by the extremely small sample size of the datasets relative to the number of interactions. The model is amenable to a linear-time algorithm for efficient inference. Using temporal gene expression data, NETGEM was successful in identifying (i) temporal interactions and determining their strength, (ii) functional categories of the actively interacting partners, and (iii) dynamics of interactions in perturbed networks. Conclusions: NETGEM represents an optimal trade-off between model complexity and data requirements. It was able to deduce actively interacting genes and functional categories from temporal gene expression data. It permits inference by incorporating the information available in perturbed networks. Given that the inputs to NETGEM are only the network and the temporal variation of the nodes, this algorithm promises to have widespread applications beyond biological systems. The source code for NETGEM is available from https://github.com/vjethava/NETGEM
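The modelling idea of a time-varying interaction strength can be pictured with a toy simulation. The sketch below is not NETGEM's actual model or API; it only simulates one edge weight evolving as a discrete Markov chain over assumed states and transition probabilities, the kind of quantity the model infers from expression data.

```python
# Illustrative only: one interaction strength as a discrete Markov chain.

import numpy as np

states = np.array([-1, 0, 1])            # repressing / absent / activating
P = np.array([[0.80, 0.15, 0.05],        # assumed transition probabilities
              [0.10, 0.80, 0.10],
              [0.05, 0.15, 0.80]])

rng = np.random.default_rng(3)
s = 1                                    # start in state 0 (index 1)
trajectory = []
for _ in range(10):                      # 10 time points
    trajectory.append(int(states[s]))
    s = rng.choice(3, p=P[s])
print(trajectory)
```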
Abstract:
The rapid growth in the field of data mining has led to the development of various methods for outlier detection. Though the detection of outliers has been well explored in the context of numerical data, dealing with categorical data is still evolving. In this paper, we propose a two-phase algorithm for detecting outliers in categorical data based on a novel definition of outliers. In the first phase, the algorithm computes a clustering of the given data, followed by a ranking phase that determines the set of most likely outliers. The proposed algorithm is expected to perform better as it can identify different types of outliers, employing two independent ranking schemes based on the attribute value frequencies and the inherent clustering structure of the given data. Unlike some existing methods, the computational complexity of this algorithm is not affected by the number of outliers to be detected. The efficacy of the algorithm is demonstrated through experiments on various public-domain categorical data sets.
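The frequency-based half of the ranking idea can be sketched in a few lines: score each record by the rarity of its attribute values, so records composed of infrequent values rank as likely outliers. This simplifies away the clustering phase and is not the paper's exact algorithm; the data values are made up.

```python
# Frequency-based outlier ranking for categorical records (illustrative).

from collections import Counter

def frequency_scores(records):
    n_attrs = len(records[0])
    freq = [Counter(rec[a] for rec in records) for a in range(n_attrs)]
    # Lower total frequency of a record's values -> more outlying.
    return [sum(freq[a][rec[a]] for a in range(n_attrs)) for rec in records]

data = [("red", "small"), ("red", "small"), ("red", "large"),
        ("blue", "small"), ("green", "tiny")]
ranked = sorted(zip(frequency_scores(data), data))  # most likely outliers first
print(ranked[0])                                    # -> (2, ('green', 'tiny'))
```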
Abstract:
Large software systems are developed by composing multiple programs. If the programs manipulate and exchange complex data, such as network packets or files, it is essential to establish that they follow compatible data formats. Most of the complexity of data formats is associated with the headers. In this paper, we address compatibility of programs operating over headers of network packets, files, images, etc. As format specifications are rarely available, we infer the format associated with headers by a program as a set of guarded layouts. In terms of these formats, we define and check compatibility of (a) producer-consumer programs and (b) different versions of producer (or consumer) programs. A compatible producer-consumer pair is free of type mismatches and logical incompatibilities such as the consumer rejecting valid outputs generated by the producer. A backward-compatible producer (resp. consumer) is guaranteed to be compatible with consumers (resp. producers) that were compatible with its older version. With our prototype tool, we identified 5 known bugs and 1 potential bug in (a) sender-receiver modules of Linux network drivers from 3 vendors and (b) different versions of a TIFF image library.
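The compatibility check can be pictured as set containment over guarded layouts: every layout the producer can emit must be accepted, with the same layout, by the consumer. The guards, field layouts, and format contents below are hypothetical, not from the paper or its tool.

```python
# Illustrative guarded-layout compatibility check (all formats hypothetical).

producer_format = {
    "version == 1": ("u16 length", "u8 flags"),
    "version == 2": ("u32 length", "u8 flags"),
}
consumer_format = {
    "version == 1": ("u16 length", "u8 flags"),
    # consumer was never updated for version 2 headers
}

def incompatibilities(producer, consumer):
    # Guards whose produced layout the consumer rejects or lays out differently.
    return [g for g, layout in producer.items() if consumer.get(g) != layout]

print(incompatibilities(producer_format, consumer_format))  # -> ['version == 2']
```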
Abstract:
It is well known that the impulse response of a wide-band wireless channel is approximately sparse, in the sense that it has a small number of significant components relative to the channel delay spread. In this paper, we consider the estimation of the unknown channel coefficients and their support in OFDM systems using a sparse Bayesian learning (SBL) framework for exact inference. In a quasi-static, block-fading scenario, we employ the SBL algorithm for channel estimation and propose a joint SBL (J-SBL) and a low-complexity recursive J-SBL algorithm for joint channel estimation and data detection. In a time-varying scenario, we use a first-order autoregressive model for the wireless channel and propose a novel, recursive, low-complexity Kalman-filtering-based SBL (KSBL) algorithm for channel estimation. We generalize the KSBL algorithm to obtain the recursive joint KSBL algorithm, which performs joint channel estimation and data detection. Our algorithms can efficiently recover a group of approximately sparse vectors even when the measurement matrix is partially unknown due to the presence of unknown data symbols. Moreover, the algorithms can fully exploit the correlation structure in the multiple measurements. Monte Carlo simulations illustrate the efficacy of the proposed techniques in terms of mean-square error and bit error rate performance.
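The core SBL iteration alternates between the channel posterior and an EM update of per-tap prior variances, which drives insignificant taps toward zero. The sketch below covers only this pilot-only core with a known measurement matrix; the paper's J-SBL/KSBL algorithms go further (unknown data symbols, channel time variation), and the sizes and noise level here are assumptions.

```python
# Minimal SBL channel estimate for y = Phi h + n with known Phi.

import numpy as np

def sbl_estimate(y, Phi, sigma2, iters=50):
    n = Phi.shape[1]
    gamma = np.ones(n)                                 # per-tap prior variances
    for _ in range(iters):
        # Posterior over the channel given current hyperparameters
        Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(1.0 / gamma))
        mu = Sigma @ Phi.T @ y / sigma2
        # EM update; small taps are driven toward zero (sparsity)
        gamma = np.maximum(mu**2 + np.diag(Sigma), 1e-12)
    return mu

rng = np.random.default_rng(4)
m, n = 40, 64                                          # pilots, channel taps
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
h = np.zeros(n)
h[rng.choice(n, 4, replace=False)] = rng.standard_normal(4)  # sparse channel
sigma2 = 1e-3
y = Phi @ h + np.sqrt(sigma2) * rng.standard_normal(m)
print(np.linalg.norm(sbl_estimate(y, Phi, sigma2) - h))  # small recovery error
```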
Abstract:
H.264/Advanced Video Coding surveillance video encoders use the Skip mode specified by the standard to reduce bandwidth. They also use multiple frames as reference for motion-compensated prediction. In this paper, we propose two techniques to reduce the bandwidth and computational cost of static-camera surveillance video encoders without affecting detection and recognition performance. A spatial sampler is proposed to sample pixels that are segmented using a Gaussian mixture model. Modified weight updates are derived for the parameters of the mixture model to reduce floating-point computations. The storage pattern of the parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. The second contribution is a low-computational-cost algorithm to choose the reference frames. The proposed reference frame selection algorithm reduces the cost of coding uncovered background regions. We also study the number of reference frames required to achieve good coding efficiency. Distortion over foreground pixels is measured to quantify the performance of the proposed techniques. Experimental results show bit rate savings of up to 94.5% over methods proposed in the literature on video surveillance data sets. The proposed techniques also provide up to a 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.
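For context, the segmentation that Skip selection relies on is the standard Gaussian-mixture background model; the paper derives modified, lower-cost weight updates, while the sketch below shows the textbook Stauffer-Grimson form for a single grayscale pixel, with all thresholds and learning rates assumed.

```python
# Textbook per-pixel GMM background update (not the paper's modified updates).

import numpy as np

def gmm_update(pixel, means, variances, weights, alpha=0.01, thresh=2.5):
    d = np.abs(pixel - means) / np.sqrt(variances)
    match = int(np.argmin(d)) if np.min(d) < thresh else None
    for k in range(len(weights)):
        m = 1.0 if k == match else 0.0
        weights[k] = (1 - alpha) * weights[k] + alpha * m   # weight update
        if m:
            means[k] += alpha * (pixel - means[k])           # mean update
            variances[k] += alpha * ((pixel - means[k]) ** 2 - variances[k])
    if match is None:                     # no match: replace weakest component
        k = int(np.argmin(weights))
        means[k], variances[k], weights[k] = pixel, 15.0**2, alpha
    weights /= weights.sum()
    return match is not None              # True -> pixel matches background

means = np.array([120.0, 30.0])
variances = np.array([25.0, 25.0])
weights = np.array([0.7, 0.3])
print(gmm_update(118.0, means, variances, weights))  # matched background -> True
```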