96 results for Database management -- Computer programs


Relevance:

100.00%

Publisher:

Relevance:

100.00%

Publisher:

Abstract:

Many queries sent to search engines refer to specific locations in the world. Location-based queries try to find local services and facilities around the user's environment or in a particular area. This paper reviews the specifications of geospatial queries and discusses the similarities and differences between location-based queries and other queries. We introduce nine patterns for location-based queries, containing either a service name alone or a service name accompanied by a location name. Our survey indicates that at least 22% of Web queries have a geospatial dimension, and most of these can be considered location-based queries. We propose that location-based queries should be treated differently from general queries to produce more relevant results.
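
The abstract does not enumerate the nine patterns, but the two query shapes it names can be illustrated with a rough classifier. A minimal sketch, assuming hypothetical SERVICES and LOCATIONS vocabularies (in practice these would come from service taxonomies and gazetteers):

```python
import re

# Hypothetical vocabularies; the paper's nine patterns are not given in
# the abstract, so this sketch only covers the two shapes it mentions.
SERVICES = {"pizza", "hotel", "dentist", "pharmacy"}
LOCATIONS = {"sydney", "melbourne", "berlin"}

def classify(query: str) -> str:
    tokens = query.lower().split()
    has_service = any(t in SERVICES for t in tokens)
    has_location = any(t in LOCATIONS for t in tokens)
    near_me = bool(re.search(r"\bnear me\b|\bnearby\b", query.lower()))
    if has_service and (has_location or near_me):
        return "location-based: service + location"
    if has_service:
        return "location-based: service only (resolve against user position)"
    return "general"

print(classify("pizza near me"))       # service + implicit location
print(classify("dentist melbourne"))   # service + explicit location
print(classify("python decorators"))   # general query
```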

Relevance:

100.00%

Publisher:

Abstract:

In recent years, many real-time applications have needed to handle data streams. We consider distributed environments in which remote data sources continuously collect data from the real world, or from other data sources, and push them to a central stream processor. In such environments, transmitting rapid, high-volume, time-varying data streams induces significant communication costs, and additional computing overhead is incurred at the central processor. In this paper, we develop a novel filtering approach, called DTFilter, for evaluating windowed distinct queries in such a distributed system. DTFilter is based on a search algorithm over a data structure of two height-balanced trees; it avoids transmitting duplicate items in data streams, saving substantial network resources. In addition, we provide a theoretical analysis of the time spent performing the search and of the amount of memory needed. Extensive experiments also show that DTFilter achieves high performance.
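
The abstract specifies two height-balanced trees but not their exact roles; a plausible reading is one index keyed by item value and one by timestamp. A minimal sketch under that assumption, substituting a dict plus a min-heap for the trees; the filtering behaviour (suppress duplicates inside the window, re-transmit once the earlier occurrence expires) is the same:

```python
import heapq

class WindowedDistinctFilter:
    """Suppress duplicate stream items inside a sliding time window."""

    def __init__(self, window: float):
        self.window = window
        self.latest = {}     # value -> most recent arrival time
        self.expiry = []     # min-heap of (time, value), lazily purged

    def offer(self, t: float, value) -> bool:
        """Return True if `value` should be transmitted upstream."""
        # Purge entries that fell out of the window.
        while self.expiry and self.expiry[0][0] <= t - self.window:
            t0, v0 = heapq.heappop(self.expiry)
            if self.latest.get(v0) == t0:   # lazy deletion check
                del self.latest[v0]
        duplicate = value in self.latest
        self.latest[value] = t
        heapq.heappush(self.expiry, (t, value))
        return not duplicate

f = WindowedDistinctFilter(window=10.0)
print(f.offer(1.0, "a"))    # True  -> transmit
print(f.offer(3.0, "a"))    # False -> duplicate inside window, filtered
print(f.offer(15.0, "a"))   # True  -> earlier occurrence has expired
```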

Relevance:

100.00%

Publisher:

Abstract:

Web transaction data between Web visitors and Web functionalities usually convey users' task-oriented behavior patterns. Mining this type of click-stream data captures usage-pattern information. Web usage mining has become one of the most widely used methods for Web recommendation, which customizes Web content to a user's preferred style. Traditional Web usage mining techniques, such as Web user session or Web page clustering, association rules, and frequent navigational path mining, can only discover usage patterns explicitly; they cannot reveal the underlying navigational activities or identify the latent relationships among Web users and Web pages that are associated with those patterns. In this work, we propose a Web recommendation framework that incorporates Web usage mining based on the Probabilistic Latent Semantic Analysis (PLSA) model. The main advantage of this method is that it not only discovers usage-based access patterns but also reveals the underlying latent factors. With the discovered user access patterns, we then present users with content of greater interest via collaborative recommendation. To validate the effectiveness of the proposed approach, we conduct experiments on real-world datasets and compare it with existing traditional techniques. The preliminary experimental results demonstrate the usability of the proposed approach.
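
PLSA itself is standard, and a compact EM fit on a session-page count matrix shows the latent-factor idea the framework builds on. This is a generic PLSA sketch, not the authors' recommendation framework; the toy matrix and topic count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def plsa(N, n_topics, iters=50):
    """Fit P(z), P(u|z), P(p|z) to a session-page count matrix N by EM."""
    U, P = N.shape
    Pz  = np.full(n_topics, 1.0 / n_topics)
    Puz = rng.random((U, n_topics)); Puz /= Puz.sum(0)
    Ppz = rng.random((P, n_topics)); Ppz /= Ppz.sum(0)
    for _ in range(iters):
        # E-step: posterior P(z | u, p), shape (U, P, Z)
        joint = Pz[None, None, :] * Puz[:, None, :] * Ppz[None, :, :]
        post = joint / joint.sum(2, keepdims=True)
        # M-step: re-estimate parameters from expected counts
        w = N[:, :, None] * post
        Puz = w.sum(1); Puz /= Puz.sum(0)
        Ppz = w.sum(0); Ppz /= Ppz.sum(0)
        Pz = w.sum((0, 1)); Pz /= Pz.sum()
    return Pz, Puz, Ppz

# Toy session-page count matrix: 4 user sessions x 5 pages.
N = np.array([[3, 1, 0, 0, 0],
              [2, 2, 1, 0, 0],
              [0, 0, 1, 4, 2],
              [0, 0, 0, 3, 3]], dtype=float)
Pz, Puz, Ppz = plsa(N, n_topics=2)
# Recommend: score pages for session u by sum_z P(z|u) * P(p|z).
Pzu = Puz * Pz; Pzu /= Pzu.sum(1, keepdims=True)   # Bayes: P(z|u)
scores = Pzu @ Ppz.T
print(scores[0].round(3))   # page affinities for session 0
```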

Relevance:

100.00%

Publisher:

Abstract:

Spatial data mining has recently emerged from a number of real applications, such as real-estate marketing, urban planning, weather forecasting, medical image analysis, and road traffic accident analysis. It demands efficient solutions to many new, expensive, and complicated problems. In this paper, we investigate the problem of evaluating the top k distinguished “features” for a “cluster” based on weighted proximity relationships between the cluster and the features. We measure proximity in an average fashion to address possibly nonuniform data distribution within a cluster. Combining a standard multi-step paradigm with new lower and upper proximity bounds, we present an efficient algorithm to solve the problem. The algorithm is implemented in several different modes. Our experimental results not only compare these modes but also illustrate the efficiency of the algorithm.
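
The filter-and-refine idea behind the multi-step paradigm can be sketched for unweighted point features: cheap bounds prune candidates before the expensive exact average-distance computation. The bounding-box lower bound and centroid-plus-radius upper bound below are illustrative stand-ins, not the paper's bounds:

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def top_k_features(cluster, features, k):
    """Rank point features by average distance to a cluster, pruning
    with cheap lower/upper bounds before the exact average."""
    xs, ys = [p[0] for p in cluster], [p[1] for p in cluster]
    box = (min(xs), min(ys), max(xs), max(ys))
    c = (sum(xs) / len(xs), sum(ys) / len(ys))   # centroid
    r = max(dist(c, p) for p in cluster)         # cluster radius

    def lower(f):   # distance to bounding box <= distance to every point
        dx = max(box[0] - f[0], 0, f[0] - box[2])
        dy = max(box[1] - f[1], 0, f[1] - box[3])
        return math.hypot(dx, dy)

    def upper(f):   # triangle inequality: avg distance <= d(f, c) + r
        return dist(f, c) + r

    def exact(f):   # expensive step: true average distance
        return sum(dist(f, p) for p in cluster) / len(cluster)

    # Filter: a feature whose lower bound exceeds the k-th smallest
    # upper bound can never make the top k.
    kth = sorted(upper(f) for f in features)[k - 1]
    survivors = [f for f in features if lower(f) <= kth]
    # Refine: exact evaluation only for the survivors.
    return sorted(survivors, key=exact)[:k]

cluster = [(0, 0), (1, 0), (0, 1), (1, 1)]
features = [(2, 2), (10, 10), (0.5, 0.5), (-5, 4)]
print(top_k_features(cluster, features, k=2))   # far features are pruned
```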

Relevance:

100.00%

Publisher:

Abstract:

Software simulation models are computer programs that need to be verified and debugged like any other software. In previous work, a method for error isolation in simulation models was proposed. The method relies on a set of feature matrices that can be used to determine which part of the model implementation is responsible for deviations in the model's output. Currently, these feature matrices have to be generated by hand from the model implementation, which is a tedious and error-prone task. In this paper, a method based on mutation analysis, together with prototype tool support, is presented for verifying the manually generated feature matrices. Applying the method and tool to a model for wastewater treatment shows that the feature matrices can be verified effectively using a minimal number of mutants.
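
The verification idea can be sketched on a toy model: generate one mutant per implementation part, observe which output features deviate from the unmutated run, and compare against the corresponding feature-matrix row. Everything named below (the model, its parts, its features) is a hypothetical stand-in, not the paper's tooling:

```python
# Toy stand-in for a simulation model with two implementation parts
# ("growth", "decay") and two output features ("peak", "final").
def simulate(growth=0.5, decay=0.1, steps=50):
    x, peak = 1.0, 1.0
    for _ in range(steps):
        x += growth * x * (1 - x / 10.0) - decay * x
        peak = max(peak, x)
    return {"peak": peak, "final": x}

# Manually written feature matrix: which output features are expected
# to deviate when the corresponding model part is wrong.
feature_matrix = {
    "growth": {"peak": True, "final": True},
    "decay":  {"peak": True, "final": True},
}

def verify(feature_matrix, tol=1e-6):
    defaults = {"growth": 0.5, "decay": 0.1}
    base = simulate(**defaults)
    for part, expected in feature_matrix.items():
        # One mutant per part: perturb that parameter by 20%.
        mutant = simulate(**{**defaults, part: 0.8 * defaults[part]})
        observed = {f: abs(mutant[f] - base[f]) > tol for f in base}
        print(part, "ok" if observed == expected else "MISMATCH", observed)

verify(feature_matrix)
```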

Relevance:

100.00%

Publisher:

Abstract:

Summarizing topological relations is fundamental to many spatial applications, including spatial query optimization. In this paper, we present several novel techniques for effectively constructing cell-density-based spatial histograms for range (window) summarizations restricted to the four most important topological relations: contains, contained, overlap, and disjoint. We first present a novel framework for constructing a multiscale histogram composed of multiple Euler histograms, with the guarantee of exact summarization results for aligned windows in constant time. We then present an approximate algorithm, with approximation ratio 19/12, to minimize the storage space of such multiscale Euler histograms, although the problem is generally NP-hard. To conform to a limited storage space where only k Euler histograms are allowed, an effective algorithm is presented to construct multiscale histograms that achieve high accuracy. Finally, we present a new constant-time approximate algorithm for querying an Euler histogram when exact answers cannot be guaranteed. Our extensive experiments against both synthetic and real-world datasets demonstrate that the approximate multiscale histogram techniques may improve the accuracy of existing techniques by several orders of magnitude while retaining cost efficiency, and that the exact multiscale histogram technique requires storage space only linearly proportional to the number of cells for the real datasets.
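
A single-resolution Euler histogram for the simplest such summary, counting rectangles that intersect an aligned window, can be sketched as follows; the paper's multiscale construction and per-relation breakdown build on this primitive. The trick is that for any non-empty rectangular intersection, faces minus edges plus vertices (the Euler characteristic) equals 1:

```python
import numpy as np

class EulerHistogram:
    """Single-resolution Euler histogram over an n x n grid of cells.

    Counts live on a (2n-1) x (2n-1) lattice: even/even indices are
    cells, odd indices are the edges and vertices between them. For an
    aligned window, the alternating sum below returns the exact number
    of stored rectangles intersecting it, since each contributes
    Euler characteristic 1.
    """
    def __init__(self, n):
        self.h = np.zeros((2 * n - 1, 2 * n - 1), dtype=np.int64)

    def insert(self, r1, c1, r2, c2):
        """Add an aligned rectangle covering cell rows r1..r2, cols c1..c2."""
        self.h[2 * r1: 2 * r2 + 1, 2 * c1: 2 * c2 + 1] += 1

    def intersecting(self, r1, c1, r2, c2):
        """Exact count of rectangles intersecting the aligned window."""
        sub = self.h[2 * r1: 2 * r2 + 1, 2 * c1: 2 * c2 + 1]
        a = np.arange(2 * r1, 2 * r2 + 1)[:, None] % 2
        b = np.arange(2 * c1, 2 * c2 + 1)[None, :] % 2
        sign = np.where((a + b) % 2 == 0, 1, -1)   # +faces -edges +vertices
        return int((sub * sign).sum())

eh = EulerHistogram(4)
eh.insert(0, 0, 1, 1)   # rectangle over cells [0..1] x [0..1]
eh.insert(2, 2, 3, 3)
print(eh.intersecting(0, 0, 3, 3))   # 2: both rectangles intersect
print(eh.intersecting(0, 0, 0, 0))   # 1: only the first one
```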

Relevance:

100.00%

Publisher:

Relevance:

40.00%

Publisher:

Abstract:

With the proliferation of relational database programs for PCs and other platforms, many business end-users are creating, maintaining, and querying their own databases. More importantly, business end-users use the output of these queries as the basis for operational, tactical, and strategic decisions. Inaccurate data reduce the expected quality of these decisions. Implementing various input validation controls, including higher levels of normalisation, can reduce the number of data anomalies entering the databases. Even in well-maintained databases, however, data anomalies will still accumulate. To improve the quality of data, databases can be queried periodically to locate and correct anomalies. This paper reports the results of two experiments that investigated the effects of different data structures on business end-users' ability to detect data anomalies in a relational database. The results demonstrate that both unnormalised and more highly normalised structures lower the effectiveness and efficiency of queries relative to first normal form. First normal form databases appear to provide the most effective and efficient data structure for business end-users formulating queries to detect data anomalies.
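
The kind of anomaly-detection query at issue can be illustrated against a first-normal-form table. The schema is hypothetical (the abstract does not give the experimental tasks); the query hunts for violations of the functional dependency customer_id -> customer_name, a typical data anomaly an end-user might look for:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL,
    customer_name TEXT    NOT NULL,
    amount        REAL    NOT NULL
);
INSERT INTO orders VALUES
    (1, 101, 'Acme Pty Ltd',  250.0),
    (2, 101, 'Acme Pty Ltd',  120.0),
    (3, 102, 'Bayside Books',  75.5),
    (4, 102, 'Bayside Book',   42.0);  -- inconsistent spelling: anomaly
""")
rows = con.execute("""
    SELECT customer_id, COUNT(DISTINCT customer_name) AS variants
    FROM orders
    GROUP BY customer_id
    HAVING COUNT(DISTINCT customer_name) > 1;
""").fetchall()
print(rows)   # [(102, 2)] -> customer 102 has conflicting names
```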

Relevance:

40.00%

Publisher:

Abstract:

Seasonal climate forecasting offers potential for improving the management of crop production risks in the cropping systems of NE Australia. But how is this capability best connected to management practice? Over the past decade, we have pursued participative systems approaches involving simulation-aided discussion with advisers and decision-makers. This has led to the development of discussion support software as a key vehicle for facilitating the infusion of forecasting capability into practice. In this paper, we set out the basis of our approach, its implementation, and a preliminary evaluation. We outline the development of the discussion support software Whopper Cropper, which was designed for, and in close consultation with, public and private advisers. Whopper Cropper consists of a database of simulation output and a graphical user interface for generating analyses of the risks associated with crop management options. The charts produced provide conversation pieces for advisers to use with their farmer clients in relation to the significant decisions they face. An example application, details of the software development process, and an initial survey of user needs are presented. We suggest that discussion support software moves beyond traditional notions of supply-driven decision support systems: it is largely demand-driven and can complement participatory action research programs by providing cost-effective, general delivery of simulation-aided discussions about relevant management actions. The critical role of farm management advisers, and of dialogue among key players, is highlighted. We argue that the discussion support concept, as exemplified by the software tool Whopper Cropper and the group processes surrounding it, provides an effective means to infuse innovations, like seasonal climate forecasting, into farming practice. Crown Copyright (C) 2002 Published by Elsevier Science Ltd. All rights reserved.
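
The core of the tool, summarizing a database of pre-computed simulation output into per-option risk figures for discussion, can be sketched briefly. The yields and option names below are invented stand-ins, not Whopper Cropper data:

```python
import statistics

# Hypothetical simulated yields (t/ha) per management option across
# historical seasons -- a stand-in for the database of simulation output.
yields = {
    "sow early, high N": [2.1, 3.4, 1.2, 4.0, 2.8, 3.1, 0.9, 3.6],
    "sow late, low N":   [1.8, 2.2, 1.6, 2.5, 2.0, 2.3, 1.4, 2.4],
}

def risk_summary(samples):
    """Crude percentile summary of the kind a risk chart would show."""
    s = sorted(samples)
    pick = lambda q: s[min(int(q * len(s)), len(s) - 1)]
    return {"10%": pick(0.1), "median": statistics.median(s), "90%": pick(0.9)}

for option, ys in yields.items():
    print(option, risk_summary(ys))
```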