7 resultados para Data processing Computer science
em University of Queensland eSpace - Australia
Resumo:
Quantile computation has many applications including data mining and financial data analysis. It has been shown that an is an element of-approximate summary can be maintained so that, given a quantile query d (phi, is an element of), the data item at rank [phi N] may be approximately obtained within the rank error precision is an element of N over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different phi and is an element of poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
Resumo:
A specialised reconfigurable architecture is targeted at wireless base-band processing. It is built to cater for multiple wireless standards. It has lower power consumption than the processor-based solution. It can be scaled to run in parallel for processing multiple channels. Test resources are embedded on the architecture and testing strategies are included. This architecture is functionally partitioned according to the common operations found in wireless standards, such as CRC error correction, convolution and interleaving. These modules are linked via Virtual Wire Hardware modules and route-through switch matrices. Data can be processed in any order through this interconnect structure. Virtual Wire ensures the same flexibility as normal interconnects, but the area occupied and the number of switches needed is reduced. The testing algorithm scans all possible paths within the interconnection network exhaustively and searches for faults in the processing modules. The testing algorithm starts by scanning the externally addressable memory space and testing the master controller. The controller then tests every switch in the route-through switch matrix by making loops from the shared memory to each of the switches. The local switch matrix is also tested in the same way. Next the local memory is scanned. Finally, pre-defined test vectors are loaded into local memory to check the processing modules. This paper compares various base-band processing solutions. It describes the proposed platform and its implementation. It outlines the test resources and algorithm. It concludes with the mapping of Bluetooth and GSM base-band onto the platform.
Resumo:
The schema of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. Obtaining quickly the appropriate data increases the likelihood that an organization will make good decisions and respond adeptly to challenges. This research presents and validates a methodology for evaluating, ex ante, the relative desirability of alternative instantiations of a model of data. In contrast to prior research, each instantiation is based on a different formal theory. This research theorizes that the instantiation that yields the lowest weighted average query complexity for a representative sample of information requests is the most desirable instantiation for end-user queries. The theory was validated by an experiment that compared end-user performance using an instantiation of a data structure based on the relational model of data with performance using the corresponding instantiation of the data structure based on the object-relational model of data. Complexity was measured using three different Halstead metrics: program length, difficulty, and effort. For a representative sample of queries, the average complexity using each instantiation was calculated. As theorized, end users querying the instantiation with the lower average complexity made fewer semantic errors, i.e., were more effective at composing queries. (c) 2005 Elsevier B.V. All rights reserved.
Resumo:
Multiresolution Triangular Mesh (MTM) models are widely used to improve the performance of large terrain visualization by replacing the original model with a simplified one. MTM models, which consist of both original and simplified data, are commonly stored in spatial database systems due to their size. The relatively slow access speed of disks makes data retrieval the bottleneck of such terrain visualization systems. Existing spatial access methods proposed to address this problem rely on main-memory MTM models, which leads to significant overhead during query processing. In this paper, we approach the problem from a new perspective and propose a novel MTM called direct mesh that is designed specifically for secondary storage. It supports available indexing methods natively and requires no modification to MTM structure. Experiment results, which are based on two real-world data sets, show an average performance improvement of 5-10 times over the existing methods.
Resumo:
In many advanced applications, data are described by multiple high-dimensional features. Moreover, different queries may weight these features differently; some may not even specify all the features. In this paper, we propose our solution to support efficient query processing in these applications. We devise a novel representation that compactly captures f features into two components: The first component is a 2D vector that reflects a distance range ( minimum and maximum values) of the f features with respect to a reference point ( the center of the space) in a metric space and the second component is a bit signature, with two bits per dimension, obtained by analyzing each feature's descending energy histogram. This representation enables two levels of filtering: The first component prunes away points that do not share similar distance ranges, while the bit signature filters away points based on the dimensions of the relevant features. Moreover, the representation facilitates the use of a single index structure to further speed up processing. We employ the classical B+-tree for this purpose. We also propose a KNN search algorithm that exploits the access orders of critical dimensions of highly selective features and partial distances to prune the search space more effectively. Our extensive experiments on both real-life and synthetic data sets show that the proposed solution offers significant performance advantages over sequential scan and retrieval methods using single and multiple VA-files.