130 resultados para Ranking fuzzy numbers


Relevância:

20.00% 20.00%

Publicador:

Resumo:

A known limitation of the Probability Ranking Principle (PRP) is that it does not cater for dependence between documents. Recently, the Quantum Probability Ranking Principle (QPRP) has been proposed, which implicitly captures dependencies between documents through “quantum interference”. This paper explores whether this new ranking principle leads to improved performance for subtopic retrieval, where novelty and diversity is required. In a thorough empirical investigation, models based on the PRP, as well as other recently proposed ranking strategies for subtopic retrieval (i.e. Maximal Marginal Relevance (MMR) and Portfolio Theory(PT)), are compared against the QPRP. On the given task, it is shown that the QPRP outperforms these other ranking strategies. And unlike MMR and PT, one of the main advantages of the QPRP is that no parameter estimation/tuning is required; making the QPRP both simple and effective. This research demonstrates that the application of quantum theory to problems within information retrieval can lead to significant improvements.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we consider the problem of document ranking in a non-traditional retrieval task, called subtopic retrieval. This task involves promoting relevant documents that cover many subtopics of a query at early ranks, providing thus diversity within the ranking. In the past years, several approaches have been proposed to diversify retrieval results. These approaches can be classified into two main paradigms, depending upon how the ranks of documents are revised for promoting diversity. In the first approach subtopic diversification is achieved implicitly, by choosing documents that are different from each other, while in the second approach this is done explicitly, by estimating the subtopics covered by documents. Within this context, we compare methods belonging to the two paradigms. Furthermore, we investigate possible strategies for integrating the two paradigms with the aim of formulating a new ranking method for subtopic retrieval. We conduct a number of experiments to empirically validate and contrast the state-of-the-art approaches as well as instantiations of our integration approach. The results show that the integration approach outperforms state-of-the-art strategies with respect to a number of measures.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work, we summarise the development of a ranking principle based on quantum probability theory, called the Quantum Probability Ranking Principle (QPRP), and we also provide an overview of the initial experiments performed employing the QPRP. The main difference between the QPRP and the classic Probability Ranking Principle, is that the QPRP implicitly captures the dependencies between documents by means of quantum interference". Subsequently, the optimal ranking of documents is not based solely on documents' probability of relevance but also on the interference with the previously ranked documents. Our research shows that the application of quantum theory to problems within information retrieval can lead to consistently better retrieval effectiveness, while still being simple, elegant and tractable.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the last years several works have investigated a formal model for Information Retrieval (IR) based on the mathematical formalism underlying quantum theory. These works have mainly exploited geometric and logical–algebraic features of the quantum formalism, for example entanglement, superposition of states, collapse into basis states, lattice relationships. In this poster I present an analogy between a typical IR scenario and the double slit experiment. This experiment exhibits the presence of interference phenomena between events in a quantum system, causing the Kolmogorovian law of total probability to fail. The analogy allows to put forward the routes for the application of quantum probability theory in IR. However, several questions need still to be addressed; they will be the subject of my PhD research

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP’s limitations. All alternatives deviate from the PRP by incorporating dependencies. This results in a re-ranking that promotes or demotes documents depending upon their relationship with the documents that have been already ranked. In this paper, we compare and contrast the behaviour of state-of-the-art ranking strategies and principles. To do so, we tease out analytical relationships between the ranking approaches and we investigate the document kinematics to visualise the effects of the different approaches on document ranking.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Quantum-inspired models have recently attracted increasing attention in Information Retrieval. An intriguing characteristic of the mathematical framework of quantum theory is the presence of complex numbers. However, it is unclear what such numbers could or would actually represent or mean in Information Retrieval. The goal of this paper is to discuss the role of complex numbers within the context of Information Retrieval. First, we introduce how complex numbers are used in quantum probability theory. Then, we examine van Rijsbergen’s proposal of evoking complex valued representations of informations objects. We empirically show that such a representation is unlikely to be effective in practice (confuting its usefulness in Information Retrieval). We then explore alternative proposals which may be more successful at realising the power of complex numbers.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For TREC Crowdsourcing 2011 (Stage 2) we propose a networkbased approach for assigning an indicative measure of worker trustworthiness in crowdsourced labelling tasks. Workers, the gold standard and worker/gold standard agreements are modelled as a network. For the purpose of worker trustworthiness assignment, a variant of the PageRank algorithm, named TurkRank, is used to adaptively combine evidence that suggests worker trustworthiness, i.e., agreement with other trustworthy co-workers and agreement with the gold standard. A single parameter controls the importance of co-worker agreement versus gold standard agreement. The TurkRank score calculated for each worker is incorporated with a worker-weighted mean label aggregation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

At Eurocrypt’04, Freedman, Nissim and Pinkas introduced a fuzzy private matching problem. The problem is defined as follows. Given two parties, each of them having a set of vectors where each vector has T integer components, the fuzzy private matching is to securely test if each vector of one set matches any vector of another set for at least t components where t < T. In the conclusion of their paper, they asked whether it was possible to design a fuzzy private matching protocol without incurring a communication complexity with the factor (T t ) . We answer their question in the affirmative by presenting a protocol based on homomorphic encryption, combined with the novel notion of a share-hiding error-correcting secret sharing scheme, which we show how to implement with efficient decoding using interleaved Reed-Solomon codes. This scheme may be of independent interest. Our protocol is provably secure against passive adversaries, and has better efficiency than previous protocols for certain parameter values.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We introduce a function Z(k) which measures the number of distinct ways in which a number can be expressed as the sum of Fibonacci numbers. Using a binary table and other devices, we explore the values that Z(k) can take and reveal a surprising relationship between the values of Z(k) and the Fibonacci numbers from which they were derived. The article shows the way in which standard spreadsheet functionalities makes it possible to reveal quite striking patterns in data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The television quiz program Letters and Numbers, broadcast on the SBS network, has recently become quite popular in Australia. This paper considers an implementation in Excel 2010 and its potential as a vehicle to showcase a range of mathematical and computing concepts and principles.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The television quiz program Letters and Numbers, broadcast on the SBS network, has recently become quite popular in Australia. This paper explores the potential of this game to illustrate and engage student interest in a range of fundamental concepts of computer science and mathematics. The Numbers Game in particular has a rich mathematical structure whose analysis and solution involves concepts of counting and problem size, discrete (tree) structures, language theory, recurrences, computational complexity, and even advanced memory management. This paper presents an analysis of these games and their teaching applications, and presents some initial results of use in student assignments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the growing size and variety of social media files on the web, it’s becoming critical to efficiently organize them into clusters for further processing. This paper presents a novel scalable constrained document clustering method that harnesses the power of search engines capable of dealing with large text data. Instead of calculating distance between the documents and all of the clusters’ centroids, a neighborhood of best cluster candidates is chosen using a document ranking scheme. To make the method faster and less memory dependable, the in-memory and in-database processing are combined in a semi-incremental manner. This method has been extensively tested in the social event detection application. Empirical analysis shows that the proposed method is efficient both in computation and memory usage while producing notable accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A common problem with the use of tensor modeling in generating quality recommendations for large datasets is scalability. In this paper, we propose the Tensor-based Recommendation using Probabilistic Ranking method that generates the reconstructed tensor using block-striped parallel matrix multiplication and then probabilistically calculates the preferences of user to rank the recommended items. Empirical analysis on two real-world datasets shows that the proposed method is scalable for large tensor datasets and is able to outperform the benchmarking methods in terms of accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP’s limitations. All alternatives deviate from the PRP by incorporating dependencies. This results in a re-ranking that promotes or demotes documents depending upon their relationship with the documents that have been already ranked. In this paper, we compare and contrast the behaviour of state-of-the-art ranking strategies and principles. To do so, we tease out analytical relationships between the ranking approaches and we investigate the document kinematics to visualise the effects of the different approaches on document ranking.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we consider the problem of document ranking in a non-traditional retrieval task, called subtopic retrieval. This task involves promoting relevant documents that cover many subtopics of a query at early ranks, providing thus diversity within the ranking. In the past years, several approaches have been proposed to diversify retrieval results. These approaches can be classified into two main paradigms, depending upon how the ranks of documents are revised for promoting diversity. In the first approach subtopic diversification is achieved implicitly, by choosing documents that are different from each other, while in the second approach this is done explicitly, by estimating the subtopics covered by documents. Within this context, we compare methods belonging to the two paradigms. Furthermore, we investigate possible strategies for integrating the two paradigms with the aim of formulating a new ranking method for subtopic retrieval. We conduct a number of experiments to empirically validate and contrast the state-of-the-art approaches as well as instantiations of our integration approach. The results show that the integration approach outperforms state-of-the-art strategies with respect to a number of measures.