921 results for Processing wikipedia data
Abstract:
Quantile computation has many applications including data mining and financial data analysis. It has been shown that an ε-approximate summary can be maintained so that, given a quantile query (φ, ε), the data item at rank ⌈φN⌉ may be approximately obtained within the rank error precision εN over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different φ and ε poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
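The idea of answering several quantile queries with one shared result can be illustrated with a minimal sketch. It assumes a fixed N and treats each query (φ, ε) as an interval of acceptable ranks around ⌈φN⌉; queries whose intervals share a rank are grouped by classic greedy interval stabbing. This is not the paper's algorithm, which handles continuously updated summaries, and the names `rank_interval` and `cluster_queries` are hypothetical.

```python
import math


def rank_interval(phi, eps, n):
    """Return the (lowest, highest) rank acceptable for a (phi, eps) query over n items."""
    target = math.ceil(phi * n)
    lo = max(1, math.ceil(target - eps * n))
    hi = min(n, math.floor(target + eps * n))
    return lo, hi


def cluster_queries(queries, n):
    """Group queries so that one rank satisfies every query in a group.

    Greedy interval stabbing: sort intervals by upper rank, open a new
    cluster at that upper rank, and add later queries while their lower
    rank does not exceed the cluster's shared rank.
    """
    intervals = sorted(
        ((rank_interval(phi, eps, n), (phi, eps)) for phi, eps in queries),
        key=lambda item: item[0][1],
    )
    clusters = []  # each entry: (shared_rank, [queries answered by that rank])
    for (lo, hi), query in intervals:
        if clusters and lo <= clusters[-1][0]:
            clusters[-1][1].append(query)
        else:
            clusters.append((hi, [query]))
    return clusters


# Illustrative queries only; the values are made up.
queries = [(0.5, 0.01), (0.51, 0.01), (0.9, 0.005)]
clusters = cluster_queries(queries, n=1000)
for shared_rank, members in clusters:
    print(shared_rank, members)
```

In this example the first two queries tolerate overlapping rank ranges and end up in one cluster, while the third needs its own. For a static set of intervals the greedy pass yields the fewest groups; keeping the grouping valid as N grows is the harder problem the abstract describes.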
Abstract:
A progressive spatial query retrieves spatial data based on previous queries (e.g., to fetch data in a more restricted area with higher resolution). A direct query, on the other hand, is defined as an isolated window query. A multi-resolution spatial database system should support both progressive queries and traditional direct queries. It is conceptually challenging to support both types of query at the same time, as direct queries favour location-based data clustering, whereas progressive queries require fragmented data clustered by resolution. Two new scaleless data structures are proposed in this paper. Experimental results using both synthetic and real-world datasets demonstrate that the query processing time of the new multi-resolution approaches is comparable to, and often better than, that of multi-representation data structures for both types of queries.
Abstract:
Identifying and predicting non-technical losses (NTL) are important tasks for many utilities. Data from a customer information system (CIS) can be used for NTL analysis. However, to perform NTL analysis accurately and efficiently, the original CIS data need to be pre-processed before any detailed analysis can be carried out. In this paper, we propose a feature-selection-based method for CIS data pre-processing that extracts the most relevant information for further analysis such as clustering and classification. Feature selection is an essential step in the data mining process: by removing irrelevant and redundant features, it finds an optimal subset of features that improves the quality of results, giving faster processing, higher accuracy and simpler results with fewer features. A detailed feature selection analysis is presented in the paper. Time-domain and load-shape data are compared based on accuracy, consistency and statistical dependencies between features.
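A minimal sketch of what removing irrelevant and redundant features can look like is given here. It is an assumption for illustration, not the method proposed in the paper: constant columns are treated as irrelevant, and a column is treated as redundant when its absolute Pearson correlation with an already kept column reaches a threshold. The column names and the 0.95 threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, threshold=0.95):
    """Return the names of columns kept after dropping constant and redundant ones."""
    kept = []
    for name, values in columns.items():
        if len(set(values)) <= 1:
            continue  # constant column carries no information
        if any(abs(pearson(values, columns[k])) >= threshold for k in kept):
            continue  # nearly a linear copy of a column already kept
        kept.append(name)
    return kept


# Hypothetical customer records; the values are made up.
columns = {
    "kwh_jan": [1, 2, 3, 4],
    "kwh_jan_scaled": [2, 4, 6, 8],  # exact multiple of kwh_jan
    "tariff_code": [1, 1, 1, 1],     # constant
    "peak_hour": [4, 1, 3, 2],
}
selected = select_features(columns)
print(selected)
```

The order of the columns matters in this sketch: the first of two correlated columns is the one kept. A correlation filter only captures linear dependence, so it is a rough stand-in for the statistical dependency analysis the abstract mentions.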
Abstract:
Photonic technologies for data processing in the optical domain are expected to play a major role in future high-speed communications. Nonlinear effects in optical fibres have many attractive features and great, but not yet fully explored, potential for optical signal processing. Here we provide an overview of our recent advances in developing novel techniques and approaches to all-optical processing based on fibre nonlinearities.
Abstract:
Recent advances in technology have produced a significant increase in the availability of free sensor data over the Internet. With affordable weather monitoring stations now available to individual meteorology enthusiasts, a reservoir of real-time data such as temperature, rainfall and wind speed can now be obtained for most of the United States and Europe. Despite the abundance of available data, obtaining usable information about the weather in your local neighbourhood requires complex processing that poses several challenges. This paper discusses a collection of technologies and applications that harvest, refine and process these data, culminating in information that has been tailored toward the user. In this case we are particularly interested in allowing a user to make direct queries about the weather at any location, even when it is not directly instrumented, using interpolation methods. We also consider how the uncertainty that the interpolation introduces can then be communicated to the user of the system, using UncertML, a developing standard for uncertainty representation.
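To make the idea of querying an uninstrumented location concrete, the sketch below uses inverse distance weighting (IDW). The abstract does not say which interpolation methods are used, so IDW is an assumption chosen for brevity. The weighted standard deviation it returns is only a rough indication of spread among nearby stations, not a calibrated uncertainty, and no UncertML encoding is attempted. The function name `idw_estimate` and the station data are hypothetical.

```python
import math


def idw_estimate(stations, x, y, power=2.0):
    """Estimate a value at (x, y) from (station_x, station_y, value) tuples.

    Returns (estimate, spread), where spread is the weighted standard
    deviation of station values around the estimate.
    """
    weights, values = [], []
    for sx, sy, value in stations:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            return value, 0.0  # query point coincides with a station
        weights.append(distance ** -power)
        values.append(value)
    total = sum(weights)
    estimate = sum(w * v for w, v in zip(weights, values)) / total
    variance = sum(w * (v - estimate) ** 2 for w, v in zip(weights, values)) / total
    return estimate, math.sqrt(variance)


# Two made-up stations reporting temperature, on a flat coordinate grid.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
estimate, spread = idw_estimate(stations, 1.0, 0.0)
print(estimate, spread)
```

A geostatistical method such as kriging would give a model-based prediction variance instead of this ad hoc spread, at the cost of fitting a spatial covariance model.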
Abstract:
We present the first experimental implementation of a recently designed quasi-lossless fiber span with strongly reduced signal power excursion. The resulting fiber waveguide medium can be advantageously used both in lightwave communications and in all-optical nonlinear data processing.
Abstract:
We present the first experimental implementation of a recently designed quasi-lossless fibre span with strongly reduced signal power excursion. The resulting fibre waveguide medium can be advantageously used both in lightwave communications and in all-optical nonlinear data processing.
Abstract:
We present the first experimental implementation of a recently designed quasi-lossless fiber span with strongly reduced signal power excursion. The resulting fiber waveguide medium can be advantageously used both in lightwave communications and in all-optical nonlinear data processing. © 2005 IEEE.
Abstract:
This paper addresses the problem of Grid system decomposition by developing an object model of the system. The Unified Modelling Language (UML) is used as the formalization tool. This approach is motivated by the complexity of the system being analysed and the need to design a simulation model.