948 results for Re-ranking methods
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Huge image collections are becoming available. In this scenario, Content-Based Image Retrieval (CBIR) systems have emerged as a promising approach to support image searches. The objective of CBIR systems is to retrieve the images in a collection that are most similar to a given query image, taking into account visual properties such as texture, color, and shape. In these systems, the effectiveness of the retrieval process depends heavily on the accuracy of ranking approaches. Recently, re-ranking approaches have been proposed to improve the effectiveness of CBIR systems by taking into account the relationships among all images in a given dataset. These approaches typically demand a huge amount of computational power, which hampers their use in practical situations. On the other hand, these methods can be massively parallelized. In this paper, we propose to speed up the computation of the RL-Sim algorithm, a recently proposed image re-ranking approach, by using the computational power of Graphics Processing Units (GPUs). GPUs are emerging as relatively inexpensive parallel processors that are becoming available on a wide range of computer systems. We address the image re-ranking performance challenges by proposing a parallel solution designed to fit the computational model of GPUs. We conducted an experimental evaluation considering different implementations and devices. Experimental results demonstrate that significant performance gains can be obtained. Our approach achieves speedups of 7x over the serial implementation for the overall algorithm and up to 36x on its core steps.
Abstract:
Document ranking is an important process in information retrieval (IR). It presents retrieved documents in order of their estimated degree of relevance to the query. Traditional document ranking methods are mostly based on similarity computations between documents and the query. In this paper we argue that similarity-based document ranking is insufficient in some cases, for two reasons. The first is the increased variety of information: far too many different types of documents are now available for users to search. The second is the variety of users: in many cases a user may want to retrieve documents that are not only similar but also general or broad regarding a certain topic. This is particularly the case in domains such as bio-medical IR. We propose a novel approach that re-ranks the retrieved documents by combining their similarity with their generality. Document generality is quantified by an ontology-based analysis of the semantic cohesion of the text. The retrieved documents are then re-ranked by a combined score of their similarity and the closeness of each document's generality to the query's. Our experiments show encouraging performance on a large bio-medical document collection, OHSUMED, containing 348,566 medical journal references and 101 test queries.
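The combined-score re-ranking described in this abstract can be illustrated with a minimal sketch. The abstract does not state how similarity and generality closeness are combined, so the linear blend, the weight `alpha`, the closeness measure `1 - |g_doc - g_query|`, and all names and values below are assumptions made for illustration only.

```python
def rerank(docs, query_generality, alpha=0.7):
    """Re-rank documents by a blend of similarity and generality closeness.

    docs: list of (doc_id, similarity, generality), each value in [0, 1].
    alpha: hypothetical weight on similarity (not taken from the paper).
    """
    def combined(doc):
        _, similarity, generality = doc
        # Closeness is 1 when the document is exactly as general as the query.
        closeness = 1.0 - abs(generality - query_generality)
        return alpha * similarity + (1.0 - alpha) * closeness

    return [doc[0] for doc in sorted(docs, key=combined, reverse=True)]


# Illustrative values: d1 is the most similar but far more specific than the query.
docs = [("d1", 0.90, 0.10), ("d2", 0.85, 0.80), ("d3", 0.40, 0.75)]
ranking = rerank(docs, query_generality=0.8, alpha=0.5)
similarity_only = rerank(docs, query_generality=0.8, alpha=1.0)
```

With `alpha=1.0` the function reduces to plain similarity ranking, which makes the effect of the generality term easy to compare.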
Abstract:
The paper provides an axiomatic analysis of some scoring procedures based on paired comparisons. Several methods have been proposed for these generalized tournaments, and some of them have also been characterized by a set of properties. Even so, choosing an appropriate method is not simple, and a discussion of their theoretical properties supports this choice. We focus on the connections between self-consistency and self-consistent monotonicity, two axioms based on comparisons of the objects' performance. Some possible weakenings and extensions of these properties are presented, together with an impossibility result showing that self-consistency contradicts independence of irrelevant matches. Their satisfiability is examined for three scoring procedures, the score, generalised row sum and least squares methods, each of which is calculated as the solution of a system of linear equations. Our results contribute to the problem of finding a proper scoring method based on paired comparisons.
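As a rough illustration of two of the procedures named above, the sketch below computes the score (row sum) vector s and the least squares rating q. It assumes the formulation commonly used in this literature: results are skew-symmetric values in [-1, 1], L is the Laplacian of the comparison graph, and q solves L q = s with the ratings normalised to sum to zero. These conventions, the helper names, and the example matches are assumptions, not details from the abstract. The generalised row sum method is omitted.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: is the comparison graph connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def score_and_least_squares(n, matches):
    """matches: list of (i, j, r), where r in [-1, 1] is the result of i against j."""
    scores = [0.0] * n
    laplacian = [[0.0] * n for _ in range(n)]
    for i, j, r in matches:
        scores[i] += r
        scores[j] -= r
        laplacian[i][i] += 1.0
        laplacian[j][j] += 1.0
        laplacian[i][j] -= 1.0
        laplacian[j][i] -= 1.0
    # L is singular (its rows sum to zero), so the last equation is
    # replaced by the normalisation sum(q) = 0.
    system = [row[:] for row in laplacian[:-1]] + [[1.0] * n]
    rhs = scores[:-1] + [0.0]
    return scores, solve_linear_system(system, rhs)


# Hypothetical chain of matches: 0 beats 1, 2 beats 3, 1 beats 2.
scores, ratings = score_and_least_squares(4, [(0, 1, 1.0), (2, 3, 1.0), (1, 2, 1.0)])
```

In this example objects 1 and 2 are tied on score, while the least squares rating separates them because it accounts for the strength of their opponents.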
Abstract:
Ranking alternatives that have been compared in pairs, or selecting the best one, is a fundamental issue in social choice theory, statistics, scientometrics, psychology and sport. Different solution concepts and various mathematical models of applications are reviewed based on the international literature. We focus on the definition of the paired comparison matrix, on the main scoring procedures and on their relation. The paper gives a theoretical analysis of the invariant, fair bets and PageRank methods, which are founded on the Perron-Frobenius theorem, as well as the internal slackening and positional power procedures used for ranking the nodes of a directed graph. An axiomatic approach is proposed for the choice of an appropriate method. Besides some known characterizations of the invariant and fair bets methods, we also discuss the violation of some properties, which is their main weakness.
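Of the methods listed, PageRank is the easiest to sketch. The following is a generic power-iteration version on a small directed graph, not the paper's own formulation. The damping factor, the convention that each loser links to the alternatives that beat it, and the example graph are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Generic PageRank by power iteration.

    links: dict mapping each node to the list of nodes it points to.
    Nodes without outgoing links spread their rank uniformly.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new_rank[t] += damping * rank[v] / len(targets)
        rank = new_rank
    return rank


# Hypothetical comparisons: a beat b and c, b beat c (losers point to winners).
links = {"a": [], "b": ["a"], "c": ["a", "b"]}
rank = pagerank(links)
ordering = sorted(rank, key=rank.get, reverse=True)
```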
Abstract:
The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP’s limitations. All alternatives deviate from the PRP by incorporating dependencies. This results in a re-ranking that promotes or demotes documents depending upon their relationship with the documents that have been already ranked. In this paper, we compare and contrast the behaviour of state-of-the-art ranking strategies and principles. To do so, we tease out analytical relationships between the ranking approaches and we investigate the document kinematics to visualise the effects of the different approaches on document ranking.
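A dependency-aware re-ranking of the kind described here can be sketched with Maximal Marginal Relevance (MMR), a well-known strategy that demotes documents similar to those already ranked. The abstract does not name the strategies it compares, so MMR is used purely as an example. The trade-off weight, relevance scores and similarity values are hypothetical.

```python
def mmr_rerank(relevance, similarity, trade_off=0.5):
    """Greedy MMR: pick the document with the best marginal score at each rank.

    relevance: dict doc_id -> relevance estimate.
    similarity: function (doc_id, doc_id) -> value in [0, 1].
    trade_off: weight on relevance; 1.0 reproduces a pure relevance ordering.
    """
    remaining = set(relevance)
    ranked = []
    while remaining:
        def marginal(doc):
            # Redundancy with respect to the documents already ranked.
            redundancy = max((similarity(doc, chosen) for chosen in ranked), default=0.0)
            return trade_off * relevance[doc] - (1.0 - trade_off) * redundancy

        best = max(sorted(remaining), key=marginal)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Illustrative data: d1 and d2 are near-duplicates of each other.
relevance = {"d1": 0.90, "d2": 0.85, "d3": 0.60}
pair_similarity = {frozenset(("d1", "d2")): 0.95}


def similarity(a, b):
    return pair_similarity.get(frozenset((a, b)), 0.1)


diversified = mmr_rerank(relevance, similarity, trade_off=0.5)
relevance_only = mmr_rerank(relevance, similarity, trade_off=1.0)
```

Here the near-duplicate d2 is demoted below d3 once d1 has been ranked, whereas the relevance-only ordering keeps it in second place.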
Abstract:
Each item in a given collection is characterized by a set of possible performances. A (ranking) method is a function that assigns an ordering of the items to every performance profile. Ranking by Rating consists in evaluating each item’s performance by using an exogenous rating function, and ranking items according to their performance ratings. Any such method is separable: the ordering of two items does not depend on the performances of the remaining items. We prove that every separable method must be of the ranking-by-rating type if (i) the set of possible performances is the same for all items and the method is anonymous, or (ii) the set of performances of each item is ordered and the method is monotonic. When performances are m-dimensional vectors, a separable, continuous, anonymous, monotonic, and invariant method must rank items according to a weighted geometric mean of their performances along the m dimensions.
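The ranking-by-rating idea, with a weighted geometric mean as the rating function, can be sketched as follows. The weights and the performance profile are made-up values, and performances are assumed to be strictly positive so that logarithms are defined.

```python
import math


def weighted_geometric_mean(performance, weights):
    """Product of x_k ** w_k, computed in log space; requires positive x_k."""
    return math.exp(sum(w * math.log(x) for x, w in zip(performance, weights)))


def rank_by_rating(profile, weights):
    """Rate each item on its own performance only, then sort by rating."""
    return sorted(
        profile,
        key=lambda item: weighted_geometric_mean(profile[item], weights),
        reverse=True,
    )


# Hypothetical two-dimensional performances and equal weights.
profile = {"a": (4.0, 1.0), "b": (2.0, 3.0), "c": (1.0, 2.0)}
weights = (0.5, 0.5)
full_ranking = rank_by_rating(profile, weights)
without_c = rank_by_rating({k: profile[k] for k in ("a", "b")}, weights)
```

Because each rating depends only on the item's own performance, dropping item c leaves the relative order of a and b unchanged, which is the separability property the abstract refers to.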
Abstract:
Background: Selecting the highest quality 3D model of a protein structure from a number of alternatives remains an important challenge in the field of structural bioinformatics. Many Model Quality Assessment Programs (MQAPs) have been developed which adopt various strategies to tackle this problem, ranging from the so-called "true" MQAPs, capable of producing a single energy score based on a single model, to methods which rely on structural comparisons of multiple models or additional information from meta-servers. However, it is clear that no current method can consistently separate the highest accuracy models from the lowest. In this paper, a number of the top performing MQAP methods are benchmarked in the context of the potential value that they add to protein fold recognition. Two novel methods are also described: ModSSEA, which is based on the alignment of predicted secondary structure elements, and ModFOLD, which combines several true MQAP methods using an artificial neural network. Results: The ModSSEA method is found to be an effective model quality assessment program for ranking multiple models from many servers; however, further accuracy can be gained by using the consensus approach of ModFOLD. The ModFOLD method is shown to significantly outperform the true MQAPs tested and is competitive with methods which make use of clustering or additional information from multiple servers. Several of the true MQAPs are also shown to add value to most individual fold recognition servers by improving model selection, when applied as a post filter to re-rank models. Conclusion: MQAPs should be benchmarked appropriately for the practical context in which they are intended to be used. Clustering based methods are the top performing MQAPs where many models are available from many servers; however, they often do not add value to individual fold recognition servers when limited models are available. Conversely, the true MQAP methods tested can often be used as effective post filters for re-ranking few models from individual fold recognition servers, and further improvements can be achieved using a consensus of these methods.
Abstract:
Motivation: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, this may also result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and provide an indication of when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2, to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method. Results: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence to structure alignment methods tested. In addition, using accurate global model quality for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS.
Abstract:
Web services are one of the most fundamental technologies for implementing service oriented architecture (SOA) based applications. One essential challenge is finding suitable candidates for a web service consumer's request, which is normally called web service discovery. During web service discovery, the consumer is expected to find it hard to distinguish which services in the retrieval set are more suitable, which makes the selection of web services a critical task. In this paper, inspired by the idea that the service composition pattern is a significant hint for service selection, a personal profiling mechanism is proposed to improve ranking and recommendation performance. Since service selection is highly dependent on the composition process, personal knowledge is accumulated from previous service composition processes and shared via collaborative filtering, in which a set of users with similar interests is first identified. Afterwards, a web service re-ranking mechanism is employed for personalised recommendation. Experimental studies are conducted and analysed to demonstrate the promising potential of this research.
Abstract:
Simulation-based assessment is a popular and frequently necessary approach to evaluation of statistical procedures. Sometimes overlooked is the ability to take advantage of underlying mathematical relations and we focus on this aspect. We show how to take advantage of large-sample theory when conducting a simulation using the analysis of genomic data as a motivating example. The approach uses convergence results to provide an approximation to smaller-sample results, results that are available only by simulation. We consider evaluating and comparing a variety of ranking-based methods for identifying the most highly associated SNPs in a genome-wide association study, derive integral equation representations of the pre-posterior distribution of percentiles produced by three ranking methods, and provide examples comparing performance. These results are of interest in their own right and set the framework for a more extensive set of comparisons.