986 results for Query paging


Relevance:

20.00% 20.00%

Publisher:

Abstract:

To effectively support today’s global economy, database systems need to manage data in multiple languages simultaneously. While current database systems do support the storage and management of multilingual data, they are not capable of querying across different natural languages. To address this lacuna, we have recently proposed two cross-lingual functionalities, LexEQUAL [13] and SemEQUAL [14], for matching multilingual names and concepts, respectively. In this paper, we investigate the native implementation of these multilingual functionalities as first-class operators on relational engines. Specifically, we propose a new multilingual storage datatype and an associated algebra of multilingual operators on this datatype. These components have been successfully implemented in the PostgreSQL database system, including integration of the algebra with the query optimizer and inclusion of a metric index in the access layer. Our experiments demonstrate that the native implementation is up to two orders of magnitude faster than the corresponding outside-the-server implementation. Further, these multilingual additions do not adversely impact existing functionality and performance. To the best of our knowledge, our prototype represents the first practical implementation of a cross-lingual database query engine.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Query-focused summarization is the task of producing a compressed text from an original set of documents based on a query. Documents can be viewed as a graph with sentences as nodes, and edges can be added based on sentence similarity. Graph-based ranking algorithms that use a 'biased random surfer model', such as topic-sensitive LexRank, have been successfully applied to query-focused summarization. In these algorithms, the random walk is biased towards the sentences that contain query-relevant words. Specifically, it is assumed that the random surfer knows the query relevance score of the sentence to which he jumps. However, neighbourhood information of that sentence is completely ignored. In this paper, we propose a look-ahead version of topic-sensitive LexRank. We assume that the random surfer not only knows the query relevance of the sentence to which he jumps but can also look N steps ahead from that sentence to find the query relevance scores of future sentences. Using this look-ahead information, we identify the sentences that are indirectly related to the query by looking at the number of hops needed to reach a sentence that has query-relevant words. We then bias the random walk towards these indirectly query-relevant sentences as well as towards the sentences that have query-relevant words. Experimental results show a 20.2% increase in ROUGE-2 score compared to topic-sensitive LexRank on the DUC 2007 data set. Further, our system outperforms the best systems in DUC 2006, and its results are comparable to state-of-the-art systems.
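For orientation, the biased random walk that this abstract builds on can be sketched as a power iteration. This is a minimal sketch of a generic topic-sensitive random walk, not the look-ahead variant the paper proposes. The function name `biased_random_walk`, the `jump` parameter, and the toy similarity and relevance values are all assumptions made for illustration.

```python
def biased_random_walk(sim, rel, jump=0.3, iters=100):
    """Score sentences with a query-biased random walk (illustrative sketch).

    sim  : n x n list of non-negative sentence similarities (assumed input)
    rel  : per-sentence query relevance scores, not all zero (assumed input)
    jump : probability of jumping to a sentence in proportion to its relevance
    """
    n = len(rel)
    rel_total = sum(rel)
    bias = [r / rel_total for r in rel]

    # Row-normalize similarities into transition probabilities.
    trans = []
    for row in sim:
        row_total = sum(row)
        trans.append([v / row_total for v in row])

    # Power iteration starting from the uniform distribution.
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [
            jump * bias[j] + (1 - jump) * sum(p[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return p


# Toy example: sentences 0 and 2 are equally similar to sentence 1,
# but only sentence 0 contains query-relevant words.
toy_sim = [
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.5],
    [0.1, 0.5, 1.0],
]
toy_rel = [1.0, 0.0, 0.0]
scores = biased_random_walk(toy_sim, toy_rel)
```

In the toy example the scores form a probability distribution, and sentence 0 ranks above sentence 2 because the walk is pulled towards the query-relevant sentence.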

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An n-length block code C is said to be r-query locally correctable if, for any codeword x ∈ C, one can probabilistically recover any one of the n coordinates of the codeword x by querying at most r coordinates of a possibly corrupted version of x. It is known that linear codes whose duals contain 2-designs are locally correctable. In this article, we consider linear codes whose duals contain t-designs for larger t. It is shown here that for such codes, for a given number of queries r, under linear decoding, one can, in general, handle a larger number of corrupted bits. We exhibit, to our knowledge for the first time, a finite-length code whose dual contains 4-designs and which can tolerate a fraction of up to 0.567/r corrupted symbols, as against a maximum of 0.5/r in prior constructions. We also present an upper bound showing that 0.567 is the best possible for this code length and query complexity over this symbol alphabet, thereby establishing the optimality of this code in this respect. A second result in the article is a finite-length bound that relates the number of queries r and the fraction of errors that can be tolerated, for a locally correctable code that employs a randomized algorithm in which each instance of the algorithm involves t-error correction.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Query suggestion is an important feature of search engines given the explosive and diverse growth of web content. Different kinds of suggestions, such as queries, images, movies, music, and books, are used every day. Various types of data sources are used for the suggestions. If we model the data as various kinds of graphs, we can build a general method for any kind of suggestion. In this paper, we propose a general method for query suggestion by combining two graphs: (1) a query click graph, which captures the relationship between queries frequently clicked on common URLs, and (2) a query text similarity graph, which measures the similarity between two queries using Jaccard similarity. The proposed method provides literally as well as semantically relevant queries for users' needs. Simulation results show that the proposed algorithm outperforms the heat diffusion method by providing a larger number of relevant queries. It can be used for recommendation tasks such as query, image, and product suggestion.
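The Jaccard similarity used for the query text similarity graph can be illustrated in a few lines. This is a minimal sketch that assumes queries are compared as sets of lowercase, whitespace-separated words. The abstract does not state the tokenization, and the function name `jaccard_similarity` is hypothetical.

```python
def jaccard_similarity(query_a, query_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| over the word sets of two queries.

    Tokenization by lowercase whitespace splitting is an assumption made
    for this sketch.
    """
    words_a = set(query_a.lower().split())
    words_b = set(query_b.lower().split())
    if not words_a and not words_b:
        return 0.0  # convention chosen here for two empty queries
    return len(words_a & words_b) / len(words_a | words_b)


# Two of the four distinct words are shared.
example = jaccard_similarity("cheap flights paris", "cheap hotels paris")
```

An edge in a text similarity graph could then connect two queries whenever this value exceeds a chosen threshold; the threshold itself would be a tuning decision not specified in the abstract.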

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. It has recently received much interest due to the high volume of information stored in audio or audiovisual format. QbE STD differs from automatic speech recognition (ASR) and keyword spotting (KWS)/spoken term detection (STD), since ASR is interested in all the terms/words that appear in the speech signal and KWS/STD relies on a textual transcription of the search term to retrieve the speech data. This paper presents the systems submitted to the ALBAYZIN 2012 QbE STD evaluation, held as a part of the ALBAYZIN 2012 evaluation campaign within the context of the IberSPEECH 2012 Conference. The evaluation consists of retrieving the speech files that contain the input queries, indicating their start and end timestamps within the appropriate speech file. Evaluation is conducted on a Spanish spontaneous speech database containing a set of talks from MAVIR workshops, which amount to about 7 h of speech in total. We present the database, the metric, and the systems submitted, along with all results and some discussion. Four different research groups took part in the evaluation. Evaluation results show the difficulty of this task, and the limited performance indicates that there is still a lot of room for improvement. The best result is achieved by a dynamic time warping-based search over Gaussian posteriorgrams/posterior phoneme probabilities. This paper also compares the systems, aiming to establish the best technique for this difficult task and to identify promising directions for this relatively novel task.
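The dynamic time warping (DTW) mentioned as the basis of the best-performing search can be sketched as the classic dynamic program below. This is an illustrative textbook DTW over sequences of scalars, not the evaluated system: the submitted systems compare posteriorgram frames rather than single numbers, and the function name `dtw_distance` and the absolute-difference local cost are assumptions.

```python
def dtw_distance(seq_a, seq_b, dist=lambda x, y: abs(x - y)):
    """Classic DTW alignment cost between two sequences (illustrative sketch).

    dist is the local frame distance; absolute difference on scalars is an
    assumption here, standing in for a distance between posteriorgram frames.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame repeats
                cost[i][j - 1],      # seq_b advances, seq_a frame repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A time-stretched copy aligns at zero cost; a shifted sequence does not.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

This version aligns two whole sequences. Locating a short query inside a long recording, as the task requires, would need a subsequence variant that lets the match start and end anywhere in the longer sequence.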

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The primary objective of this project, “the Assessment of Existing Information on Atlantic Coastal Fish Habitat”, is to inform conservation planning for the Atlantic Coastal Fish Habitat Partnership (ACFHP). ACFHP is recognized as a Partnership by the National Fish Habitat Action Plan (NFHAP), whose overall mission is to protect, restore, and enhance the nation’s fish and aquatic communities through partnerships that foster fish habitat conservation. This project is a cooperative effort of NOAA/NOS Center for Coastal Monitoring and Assessment (CCMA) Biogeography Branch and ACFHP. The Assessment includes three components: (1) a representative bibliographic and assessment database, (2) a Geographical Information System (GIS) spatial framework, and (3) a summary document with a description of methods, analyses of habitat assessment information, and recommendations for further work. The spatial bibliography was created by linking the bibliographic table, developed in Microsoft Excel and exported to SQL Server, with the spatial framework, developed in ArcGIS and exported to GoogleMaps. The bibliography is a comprehensive, searchable database of over 500 selected documents and data sources on Atlantic coastal fish species and habitats. Key information captured for each entry includes basic bibliographic data, spatial footprint (e.g. waterbody or watershed), species and habitats covered, and electronic availability. Information on habitat condition indicators, threats, and conservation recommendations is extracted from each entry and recorded in a separate linked table. The spatial framework is a functional digital map based on polygon layers of watersheds and estuarine and marine waterbodies derived from NOAA’s Coastal Assessment Framework, MMS/NOAA’s Multipurpose Marine Cadastre, and other sources, providing spatial reference for all of the documents cited in the bibliography.
Together, the bibliography and assessment tables and their spatial framework provide a powerful tool to query and assess available information through a publicly available web interface. They were designed to support the development of priorities for ACFHP’s conservation efforts within a geographic area extending from Maine to Florida, and from coastal watersheds seaward to the edge of the continental shelf. The Atlantic Coastal Fish Habitat Partnership has made initial use of the Assessment of Existing Information. Though it has not yet applied the AEI in a systematic or structured manner, it expects to find further uses as the draft conservation strategic plan is refined, and as regional action plans are developed. It also provides a means to move beyond an “assessment of existing information” towards an “assessment of fish habitat”, and is being applied towards the National Fish Habitat Action Plan (NFHAP) 2010 Assessment. Beyond the scope of the current project, there may be application to broader initiatives such as Integrated Ecosystem Assessments (IEAs), Ecosystem Based Management (EBM), and Marine Spatial Planning (MSP).

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Data model and Twing queries. Evolution of the algorithms for Twing queries. Evaluation of the algorithms presented. New challenges. Final remarks. Conclusions.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A common problem in many types of databases is retrieving the most similar matches to a query object. Finding those matches in a large database can be too slow to be practical, especially in domains where objects are compared using computationally expensive similarity (or distance) measures. This paper proposes a novel method for approximate nearest neighbor retrieval in such spaces. Our method is embedding-based, meaning that it constructs a function that maps objects into a real vector space. The mapping preserves a large amount of the proximity structure of the original space, and it can be used to rapidly obtain a short list of likely matches to the query. The main novelty of our method is that it constructs, together with the embedding, a query-sensitive distance measure that should be used when measuring distances in the vector space. The term "query-sensitive" means that the distance measure changes depending on the current query object. We report experiments with an image database of handwritten digits, and a time-series database. In both cases, the proposed method outperforms existing state-of-the-art embedding methods, meaning that it provides significantly better trade-offs between efficiency and retrieval accuracy.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Personal communication devices are increasingly equipped with sensors that are able to collect and locally store information from their environs. The mobility of users carrying such devices, and hence the mobility of sensor readings in space and time, opens new horizons for interesting applications. In particular, we envision a system in which the collective sensing, storage and communication resources, and mobility of these devices could be leveraged to query the state of (possibly remote) neighborhoods. Such queries would have spatio-temporal constraints which must be met for the query answers to be useful. Using a simplified mobility model, we analytically quantify the benefits from cooperation (in terms of the system's ability to satisfy spatio-temporal constraints), which we show to go beyond simple space-time tradeoffs. In managing the limited storage resources of such cooperative systems, the goal should be to minimize the number of unsatisfiable spatio-temporal constraints. We show that Data Centric Storage (DCS), or "directed placement", is a viable approach for achieving this goal, but only when the underlying network is well connected. Alternatively, we propose "amorphous placement", in which sensory samples are cached locally, and shuffling of cached samples is used to diffuse the sensory data throughout the whole network. We evaluate conditions under which directed versus amorphous placement strategies would be more efficient. These results lead us to propose a hybrid placement strategy, in which the spatio-temporal constraints associated with a sensory data type determine the most appropriate placement strategy for that data type. We perform an extensive simulation study to evaluate the performance of directed, amorphous, and hybrid placement protocols when applied to queries that are subject to timing constraints.
Our results show that directed placement is better for queries with moderately tight deadlines, whereas amorphous placement is better for queries with looser deadlines, and that under most operational conditions, the hybrid technique gives the best compromise.