982 resultados para query rewriting


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The primary objective of this project, “the Assessment of Existing Information on Atlantic Coastal Fish Habitat”, is to inform conservation planning for the Atlantic Coastal Fish Habitat Partnership (ACFHP). ACFHP is recognized as a Partnership by the National Fish Habitat Action Plan (NFHAP), whose overall mission is to protect, restore, and enhance the nation’s fish and aquatic communities through partnerships that foster fish habitat conservation. This project is a cooperative effort of NOAA/NOS Center for Coastal Monitoring and Assessment (CCMA) Biogeography Branch and ACFHP. The Assessment includes three components; 1. a representative bibliographic and assessment database, 2. a Geographical Information System (GIS) spatial framework, and 3. a summary document with description of methods, analyses of habitat assessment information, and recommendations for further work. The spatial bibliography was created by linking the bibliographic table developed in Microsoft Excel and exported to SQL Server, with the spatial framework developed in ArcGIS and exported to GoogleMaps. The bibliography is a comprehensive, searchable database of over 500 selected documents and data sources on Atlantic coastal fish species and habitats. Key information captured for each entry includes basic bibliographic data, spatial footprint (e.g. waterbody or watershed), species and habitats covered, and electronic availability. Information on habitat condition indicators, threats, and conservation recommendations are extracted from each entry and recorded in a separate linked table. The spatial framework is a functional digital map based on polygon layers of watersheds, estuarine and marine waterbodies derived from NOAA’s Coastal Assessment Framework, MMS/NOAA’s Multipurpose Marine Cadastre, and other sources, providing spatial reference for all of the documents cited in the bibliography. Together, the bibliography and assessment tables and their spatial framework provide a powerful tool to query and assess available information through a publicly available web interface. They were designed to support the development of priorities for ACFHP’s conservation efforts within a geographic area extending from Maine to Florida, and from coastal watersheds seaward to the edge of the continental shelf. The Atlantic Coastal Fish Habitat Partnership has made initial use of the Assessment of Existing Information. Though it has not yet applied the AEI in a systematic or structured manner, it expects to find further uses as the draft conservation strategic plan is refined, and as regional action plans are developed. It also provides a means to move beyond an “assessment of existing information” towards an “assessment of fish habitat”, and is being applied towards the National Fish Habitat Action Plan (NFHAP) 2010 Assessment. Beyond the scope of the current project, there may be application to broader initiatives such as Integrated Ecosystem Assessments (IEAs), Ecosystem Based Management (EBM), and Marine Spatial Planning (MSP).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Modelo de dados e consultas Twing. Evolução dos algoritmos para consultas Twing. Avaliação dos algoritmos apresentados. Novos desafios. Considerações finais. Conclusões.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

To construct high performance Web servers, system builders are increasingly turning to distributed designs. An important challenge that arises in distributed Web servers is the need to direct incoming connections to individual hosts. Previous methods for connection routing have employed a centralized node which handles all incoming requests. In contrast, we propose a distributed approach, called Distributed Packet Rewriting (DPR), in which all hosts of the distributed system participate in connection routing. We argue that this approach promises better scalability and fault-tolerance than the centralized approach. We describe our implementation of four variants of DPR and compare their performance. We show that DPR provides performance comparable to centralized alternatives, measured in terms of throughput and delay under the SPECweb96 benchmark. Finally, we argue that DPR is particularly attractive both for small scale systems and for systems following the emerging trend toward increasingly intelligent I/O subsystems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose and evaluate an implementation of a prototype scalable web server. The prototype consists of a load-balanced cluster of hosts that collectively accept and service TCP connections. The host IP addresses are advertised using the Round Robin DNS technique, allowing any host to receive requests from any client. Once a client attempts to establish a TCP connection with one of the hosts, a decision is made as to whether or not the connection should be redirected to a different host---namely, the host with the lowest number of established connections. We use the low-overhead Distributed Packet Rewriting (DPR) technique to redirect TCP connections. In our prototype, each host keeps information about connections in hash tables and linked lists. Every time a packet arrives, it is examined to see if it has to be redirected or not. Load information is maintained using periodic broadcasts amongst the cluster hosts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A common problem in many types of databases is retrieving the most similar matches to a query object. Finding those matches in a large database can be too slow to be practical, especially in domains where objects are compared using computationally expensive similarity (or distance) measures. This paper proposes a novel method for approximate nearest neighbor retrieval in such spaces. Our method is embedding-based, meaning that it constructs a function that maps objects into a real vector space. The mapping preserves a large amount of the proximity structure of the original space, and it can be used to rapidly obtain a short list of likely matches to the query. The main novelty of our method is that it constructs, together with the embedding, a query-sensitive distance measure that should be used when measuring distances in the vector space. The term "query-sensitive" means that the distance measure changes depending on the current query object. We report experiments with an image database of handwritten digits, and a time-series database. In both cases, the proposed method outperforms existing state-of-the-art embedding methods, meaning that it provides significantly better trade-offs between efficiency and retrieval accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Personal communication devices are increasingly equipped with sensors that are able to collect and locally store information from their environs. The mobility of users carrying such devices, and hence the mobility of sensor readings in space and time, opens new horizons for interesting applications. In particular, we envision a system in which the collective sensing, storage and communication resources, and mobility of these devices could be leveraged to query the state of (possibly remote) neighborhoods. Such queries would have spatio-temporal constraints which must be met for the query answers to be useful. Using a simplified mobility model, we analytically quantify the benefits from cooperation (in terms of the system's ability to satisfy spatio-temporal constraints), which we show to go beyond simple space-time tradeoffs. In managing the limited storage resources of such cooperative systems, the goal should be to minimize the number of unsatisfiable spatio-temporal constraints. We show that Data Centric Storage (DCS), or "directed placement", is a viable approach for achieving this goal, but only when the underlying network is well connected. Alternatively, we propose, "amorphous placement", in which sensory samples are cached locally, and shuffling of cached samples is used to diffuse the sensory data throughout the whole network. We evaluate conditions under which directed versus amorphous placement strategies would be more efficient. These results lead us to propose a hybrid placement strategy, in which the spatio-temporal constraints associated with a sensory data type determine the most appropriate placement strategy for that data type. We perform an extensive simulation study to evaluate the performance of directed, amorphous, and hybrid placement protocols when applied to queries that are subject to timing constraints. Our results show that, directed placement is better for queries with moderately tight deadlines, whereas amorphous placement is better for queries with looser deadlines, and that under most operational conditions, the hybrid technique gives the best compromise.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Query processing over the Internet involving autonomous data sources is a major task in data integration. It requires the estimated costs of possible queries in order to select the best one that has the minimum cost. In this context, the cost of a query is affected by three factors: network congestion, server contention state, and complexity of the query. In this paper, we study the effects of both the network congestion and server contention state on the cost of a query. We refer to these two factors together as system contention states. We present a new approach to determining the system contention states by clustering the costs of a sample query. For each system contention state, we construct two cost formulas for unary and join queries respectively using the multiple regression process. When a new query is submitted, its system contention state is estimated first using either the time slides method or the statistical method. The cost of the query is then calculated using the corresponding cost formulas. The estimated cost of the query is further adjusted to improve its accuracy. Our experiments show that our methods can produce quite accurate cost estimates of the submitted queries to remote data sources over the Internet.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A rapidly increasing number of Web databases are now become accessible via
their HTML form-based query interfaces. Query result pages are dynamically generated
in response to user queries, which encode structured data and are displayed for human
use. Query result pages usually contain other types of information in addition to query
results, e.g., advertisements, navigation bar etc. The problem of extracting structured data
from query result pages is critical for web data integration applications, such as comparison
shopping, meta-search engines etc, and has been intensively studied. A number of approaches
have been proposed. As the structures of Web pages become more and more complex, the
existing approaches start to fail, and most of them do not remove irrelevant contents which
may a®ect the accuracy of data record extraction. We propose an automated approach for
Web data extraction. First, it makes use of visual features and query terms to identify data
sections and extracts data records in these sections. We also represent several content and
visual features of visual blocks in a data section, and use them to ¯lter out noisy blocks.
Second, it measures similarity between data items in di®erent data records based on their
visual and content features, and aligns them into di®erent groups so that the data in the
same group have the same semantics. The results of our experiments with a large set of
Web query result pages in di®erent domains show that our proposed approaches are highly
e®ective.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In a human-computer dialogue system, the dialogue strategy can range from very restrictive to highly flexible. Each specific dialogue style has its pros and cons and a dialogue system needs to select the most appropriate style for a given user. During the course of interaction, the dialogue style can change based on a user’s response and the system observation of the user. This allows a dialogue system to understand a user better and provide a more suitable way of communication. Since measures of the quality of the user’s interaction with the system can be incomplete and uncertain, frameworks for reasoning with uncertain and incomplete information can help the system make better decisions when it chooses a dialogue strategy. In this paper, we investigate how to select a dialogue strategy based on aggregating the factors detected during the interaction with the user. For this purpose, we use probabilistic logic programming (PLP) to model probabilistic knowledge about how these factors will affect the degree of freedom of a dialogue. When a dialogue system needs to know which strategy is more suitable, an appropriate query can be executed against the PLP and a probabilistic solution with a degree of satisfaction is returned. The degree of satisfaction reveals how much the system can trust the probability attached to the solution.