22 resultados para Similarity Query
em Universidad Politécnica de Madrid
Resumo:
Se definen conceptos y se aplica el teorema de Valverde para escribir un algoritmo que computa bases de similaridades. This paper studies sorne theory and methods to build a representation theorem basis of a similarity from the basis of its subsimilarities, providing an alternative recursive method to compute the basis of a similarity.
Resumo:
This paper describes the participation of DAEDALUS at the LogCLEF lab in CLEF 2011. This year, the objectives of our participation are twofold. The first topic is to analyze if there is any measurable effect on the success of the search queries if the native language and the interface language chosen by the user are different. The idea is to determine if this difference may condition the way in which the user interacts with the search application. The second topic is to analyze the user context and his/her interaction with the system in the case of successful queries, to discover out any relation among the user native language, the language of the resource involved and the interaction strategy adopted by the user to find out such resource. Only 6.89% of queries are successful out of the 628,607 queries in the 320,001 sessions with at least one search query in the log. The main conclusion that can be drawn is that, in general for all languages, whether the native language matches the interface language or not does not seem to affect the success rate of the search queries. On the other hand, the analysis of the strategy adopted by users when looking for a particular resource shows that people tend to use the simple search tool, frequently first running short queries build up of just one specific term and then browsing through the results to locate the expected resource
Resumo:
Recommender systems play an important role in reducing the negative impact of informa- tion overload on those websites where users have the possibility of voting for their prefer- ences on items. The most normal technique for dealing with the recommendation mechanism is to use collaborative filtering, in which it is essential to discover the most similar users to whom you desire to make recommendations. The hypothesis of this paper is that the results obtained by applying traditional similarities measures can be improved by taking contextual information, drawn from the entire body of users, and using it to cal- culate the singularity which exists, for each item, in the votes cast by each pair of users that you wish to compare. As such, the greater the measure of singularity result between the votes cast by two given users, the greater the impact this will have on the similarity. The results, tested on the Movielens, Netflix and FilmAffinity databases, corroborate the excellent behaviour of the singularity measure proposed.
Resumo:
Collaborative filtering recommender systems contribute to alleviating the problem of information overload that exists on the Internet as a result of the mass use of Web 2.0 applications. The use of an adequate similarity measure becomes a determining factor in the quality of the prediction and recommendation results of the recommender system, as well as in its performance. In this paper, we present a memory-based collaborative filtering similarity measure that provides extremely high-quality and balanced results; these results are complemented with a low processing time (high performance), similar to the one required to execute traditional similarity metrics. The experiments have been carried out on the MovieLens and Netflix databases, using a representative set of information retrieval quality measures.
Resumo:
RDB2RDF systems generate RDF from relational databases, operating in two dierent manners: materializing the database content into RDF or acting as virtual RDF datastores that transform SPARQL queries into SQL. In the former, inferences on the RDF data (taking into account the ontologies that they are related to) are normally done by the RDF triple store where the RDF data is materialised and hence the results of the query answering process depend on the store. In the latter, existing RDB2RDF systems do not normally perform such inferences at query time. This paper shows how the algorithm used in the REQUIEM system, focused on handling run-time inferences for query answering, can be adapted to handle such inferences for query answering in combination with RDB2RDF systems.
Resumo:
RDB2RDF systems generate RDF from relational databases, operating in two di�erent manners: materializing the database content into RDF or acting as virtual RDF datastores that transform SPARQL queries into SQL. In the former, inferences on the RDF data (taking into account the ontologies that they are related to) are normally done by the RDF triple store where the RDF data is materialised and hence the results of the query answering process depend on the store. In the latter, existing RDB2RDF systems do not normally perform such inferences at query time. This paper shows how the algorithm used in the REQUIEM system, focused on handling run-time inferences for query answering, can be adapted to handle such inferences for query answering in combination with RDB2RDF systems.
Resumo:
Linked data offers a promising setting to encode, publish and share metadata of resources. As the matter of fact, it is already adopted by data producers such as European Environment Agency, US and some EU Governs, whose first ambition is to share (meta)data making their processes more effective and transparent. Such as an increasing interest and involvement of data providers surely represents a genuine witness of the web of data success, but in a longer perspective, frameworks supporting linked data consumers in their decision making processes will be a compelling need. In this respect, the talk is introducing SSONDE, a framework enabling in detailed comparison, ranking and selection of linked data resources through the analysis of their RDF ontology driven metadata. SSONDE implements an instance similarity especially designed to support in resource selection, namely the process stakeholders engage to choose a set of resources suitable for a given analysis purpose: (i) it deploys an asymmetric similarity assessment to emphasize information about gains and losses the stakeholders get adopting a resource in place of another; (ii) it relies on an explicit formalization of contexts to tailor the similarity assessment with respect to specific user-defined selection goals. The talk aims at providing an insight on SSONDE instance similarity and it will briefly describe some examples of SSONDE deployment in the context of linked data consumption.
Resumo:
Sensor networks are increasingly being deployed in the environment for many different purposes. The observations that they produce are made available with heterogeneous schemas, vocabularies and data formats, making it difficult to share and reuse this data, for other purposes than those for which they were originally set up. The authors propose an ontology-based approach for providing data access and query capabilities to streaming data sources, allowing users to express their needs at a conceptual level, independent of implementation and language-specific details. In this article, the authors describe the theoretical foundations and technologies that enable exposing semantically enriched sensor metadata, and querying sensor observations through SPARQL extensions, using query rewriting and data translation techniques according to mapping languages, and managing both pull and push delivery modes.
Resumo:
Testbeds proposed so far to evaluate, compare, and eventually improve SPARQL query federation systems have still some limitations. Some variables and con�gurations that may have an impact on the behavior of these systems (e.g., network latency, data partitioning and query properties) are not su�ciently de�ned; this a�ects the results and repeatability of independent evaluation studies, and hence the insights that can be obtained from them. In this paper we evaluate FedBench, the most comprehensive testbed up to now, and empirically probe the need of considering additional dimensions and variables. The evaluation has been conducted on three SPARQL query federation systems, and the analysis of these results has allowed to uncover properties of these systems that would normally be hidden with the original testbeds.
Resumo:
Numerous authors have proposed functions to quantify the degree of similarity between two fuzzy numbers using various descriptive parameters, such as the geometric distance, the distance between the centers of gravity or the perimeter. However, these similarity functions have drawback for specific situations. We propose a new similarity measure for generalized trapezoidal fuzzy numbers aimed at overcoming such drawbacks. This new measure accounts for the distance between the centers of gravity and the geometric distance but also incorporates a new term based on the shared area between the fuzzy numbers. The proposed measure is compared against other measures in the literature.
Resumo:
The physical appearance of granular media suggests the existence of geometrical scale invariance. The paper discuss how this physico-empirical property can be mathematically encoded leading to different generative models: a smooth one encoded by a differential equation and another encoded by an equation coming from a measure theoretical property.
Resumo:
Query rewriting is one of the fundamental steps in ontologybased data access (OBDA) approaches. It takes as inputs an ontology and a query written according to that ontology, and produces as an output a set of queries that should be evaluated to account for the inferences that should be considered for that query and ontology. Different query rewriting systems give support to different ontology languages with varying expressiveness, and the rewritten queries obtained as an output do also vary in expressiveness. This heterogeneity has traditionally made it difficult to compare different approaches, and the area lacks in general commonly agreed benchmarks that could be used not only for such comparisons but also for improving OBDA support. In this paper we compile data, dimensions and measurements that have been used to evaluate some of the most recent systems, we analyse and characterise these assets, and provide a unified set of them that could be used as a starting point towards a more systematic benchmarking process for such systems. Finally, we apply this initial benchmark with some of the most relevant OBDA approaches in the state of the art.
Resumo:
Aims of study: The goals of this paper are to summarize and to compare plant species richness and floristic similarity at two spatial scales; mesohabitat (normal, eutrophic, and oligotrophic dehesas) and dehesa habitat; and to establish guidelines for conserving species diversity in dehesas. Area of study: We considered four dehesa sites in the western Peninsular Spain, located along a climatic and biogeographic gradient from north to south. Main results: Average alpha richness for mesohabitats was 75.6 species, and average alpha richness for dehesa sites was 146.3. Gamma richness assessed for the overall dehesa habitat was 340.0 species. The species richness figures of normal dehesa mesohabitat were significantly lesser than of the eutrophic mesohabitat and lesser than the oligotrophic mesohabitat too. No significant differences were found for species richness among dehesa sites. We have found more dissimilarity at local scale (mesohabitat) than at regional scale (habitat). Finally, the results of the similarity assessment between dehesa sites reflected both climatic and biogeographic gradients. Research highlights: An effective conservation of dehesas must take into account local and regional conditions all along their distribution range for ensuring the conservation of the main vascular plant species assemblages as well as the associated fauna
Resumo:
There is controversy regarding the use of the similarity functions proposed in the literature to compare generalized trapezoidal fuzzy numbers since conflicting similarity values are sometimes output for the same pair of fuzzy numbers. In this paper we propose a similarity function aimed at establishing a consensus. It accounts for the different approaches of all the similarity functions. It also has better properties and can easily incorporate new parameters for future improvements. The analysis is carried out on the basis of a large and representative set of pairs of trapezoidal fuzzy numbers.
Resumo:
In this paper we study query answering and rewriting in ontologybased data access. Specifically, we present an algorithm for computing a perfect rewriting of unions of conjunctive queries posed over ontologies expressed in the description logic ELHIO, which covers the OWL 2 QL and OWL 2 EL profiles. The novelty of our algorithm is the use of a set of ABox dependencies, which are compiled into a so-called EBox, to limit the expansion of the rewriting. So far, EBoxes have only been used in query rewriting in the case of DL-Lite, which is less expressive than ELHIO. We have extensively evaluated our new query rewriting technique, and in this paper we discuss the tradeoff between the reduction of the size of the rewriting and the computational cost of our approach.