996 resultados para re-ranking
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Huge image collections are becoming available lately. In this scenario, the use of Content-Based Image Retrieval (CBIR) systems has emerged as a promising approach to support image searches. The objective of CBIR systems is to retrieve the most similar images in a collection, given a query image, by taking into account image visual properties such as texture, color, and shape. In these systems, the effectiveness of the retrieval process depends heavily on the accuracy of ranking approaches. Recently, re-ranking approaches have been proposed to improve the effectiveness of CBIR systems by taking into account the relationships among images. The re-ranking approaches consider the relationships among all images in a given dataset. These approaches typically demands a huge amount of computational power, which hampers its use in practical situations. On the other hand, these methods can be massively parallelized. In this paper, we propose to speedup the computation of the RL-Sim algorithm, a recently proposed image re-ranking approach, by using the computational power of Graphics Processing Units (GPU). GPUs are emerging as relatively inexpensive parallel processors that are becoming available on a wide range of computer systems. We address the image re-ranking performance challenges by proposing a parallel solution designed to fit the computational model of GPUs. We conducted an experimental evaluation considering different implementations and devices. Experimental results demonstrate that significant performance gains can be obtained. Our approach achieves speedups of 7x from serial implementation considering the overall algorithm and up to 36x on its core steps.
Resumo:
Document ranking is an important process in information retrieval (IR). It presents retrieved documents in an order of their estimated degrees of relevance to query. Traditional document ranking methods are mostly based on the similarity computations between documents and query. In this paper we argue that the similarity-based document ranking is insufficient in some cases. There are two reasons. Firstly it is about the increased information variety. There are far too many different types documents available now for user to search. The second is about the users variety. In many cases user may want to retrieve documents that are not only similar but also general or broad regarding a certain topic. This is particularly the case in some domains such as bio-medical IR. In this paper we propose a novel approach to re-rank the retrieved documents by incorporating the similarity with their generality. By an ontology-based analysis on the semantic cohesion of text, document generality can be quantified. The retrieved documents are then re-ranked by their combined scores of similarity and the closeness of documents’ generality to the query’s. Our experiments have shown an encouraging performance on a large bio-medical document collection, OHSUMED, containing 348,566 medical journal references and 101 test queries.
Resumo:
The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP’s limitations. All alternatives deviate from the PRP by incorporating dependencies. This results in a re-ranking that promotes or demotes documents depending upon their relationship with the documents that have been already ranked. In this paper, we compare and contrast the behaviour of state-of-the-art ranking strategies and principles. To do so, we tease out analytical relationships between the ranking approaches and we investigate the document kinematics to visualise the effects of the different approaches on document ranking.
Resumo:
The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP’s limitations. All alternatives deviate from the PRP by incorporating dependencies. This results in a re-ranking that promotes or demotes documents depending upon their relationship with the documents that have been already ranked. In this paper, we compare and contrast the behaviour of state-of-the-art ranking strategies and principles. To do so, we tease out analytical relationships between the ranking approaches and we investigate the document kinematics to visualise the effects of the different approaches on document ranking.
Resumo:
Web service is one of the most fundamental technologies in implementing service oriented architecture (SOA) based applications. One essential challenge related to web service is to find suitable candidates with regard to web service consumer’s requests, which is normally called web service discovery. During a web service discovery protocol, it is expected that the consumer will find it hard to distinguish which ones are more suitable in the retrieval set, thereby making selection of web services a critical task. In this paper, inspired by the idea that the service composition pattern is significant hint for service selection, a personal profiling mechanism is proposed to improve ranking and recommendation performance. Since service selection is highly dependent on the composition process, personal knowledge is accumulated from previous service composition process and shared via collaborative filtering where a set of users with similar interest will be firstly identified. Afterwards a web service re-ranking mechanism is employed for personalised recommendation. Experimental studies are conduced and analysed to demonstrate the promising potential of this research.
Resumo:
Commercial environments may receive only a fraction of expected genetic gains for growth rate as predicted from the selection environment This fraction is the result of undesirable genotype-by-environment interactions (G x E) and measured by the genetic correlation (r(g)) of growth between environments. Rapid estimates of genetic correlation achieved in one generation are notoriously difficult to estimate with precision. A new design is proposed where genetic correlations can be estimated by utilising artificial mating from cryopreserved semen and unfertilised eggs stripped from a single female. We compare a traditional phenotype analysis of growth to a threshold model where only the largest fish are genotyped for sire identification. The threshold model was robust to differences in family mortality differing up to 30%. The design is unique as it negates potential re-ranking of families caused by an interaction between common maternal environmental effects and growing environment. The design is suitable for rapid assessment of G x E over one generation with a true 0.70 genetic correlation yielding standard errors as low as 0.07. Different design scenarios were tested for bias and accuracy with a range of heritability values, number of half-sib families created, number of progeny within each full-sib family, number of fish genotyped, number of fish stocked, differing family survival rates and at various simulated genetic correlation levels
Resumo:
Commercial environments may receive only a fraction of expected genetic gains for growth rate as predicted from the selection environment. This fraction is result of undesirable genotype-by-environment interactions (GxE) and measured by the genetic correlation (rg) of growth between environments. Rapid estimates of genetic correlation achieved in one generation are notoriously difficult to estimate with precision. A new design is proposed where genetic correlations can be estimated by utilising artificial mating from cryopreserved semen and unfertilised eggs stripped from a single female. We compare a traditional phenotype analysis of growth to a threshold model where only the largest fish are genotyped for sire identification. The threshold model was robust to differences in family mortality differing up to 30%. The design is unique as it negates potential re-ranking of families caused by an interaction between common maternal environmental effects and growing environment. The design is suitable for rapid assessment of GxE over one generation with a true 0.70 genetic correlation yielding standard errors as low as 0.07. Different design scenarios were tested for bias and accuracy with a range of heritability values, number of half-sib families created, number of progeny within each full-sib family, number of fish genotyped, number of fish stocked, differing family survival rates and at various simulated genetic correlation levels.
Resumo:
How to refine a near-native structure to make it closer to its native conformation is an unsolved problem in protein-structure and protein-protein complex-structure prediction. In this article, we first test several scoring functions for selecting locally resampled near-native protein-protein docking conformations and then propose a computationally efficient protocol for structure refinement via local resampling and energy minimization. The proposed method employs a statistical energy function based on a Distance-scaled Ideal-gas REference state (DFIRE) as an initial filter and an empirical energy function EMPIRE (EMpirical Protein-InteRaction Energy) for optimization and re-ranking. Significant improvement of final top-1 ranked structures over initial near-native structures is observed in the ZDOCK 2.3 decoy set for Benchmark 1.0 (74% whose global rmsd reduced by 0.5 angstrom or more and only 7% increased by 0.5 angstrom or more). Less significant improvement is observed for Benchmark 2.0 (38% versus 33%). Possible reasons are discussed.
Resumo:
Background: Selecting the highest quality 3D model of a protein structure from a number of alternatives remains an important challenge in the field of structural bioinformatics. Many Model Quality Assessment Programs (MQAPs) have been developed which adopt various strategies in order to tackle this problem, ranging from the so called "true" MQAPs capable of producing a single energy score based on a single model, to methods which rely on structural comparisons of multiple models or additional information from meta-servers. However, it is clear that no current method can separate the highest accuracy models from the lowest consistently. In this paper, a number of the top performing MQAP methods are benchmarked in the context of the potential value that they add to protein fold recognition. Two novel methods are also described: ModSSEA, which based on the alignment of predicted secondary structure elements and ModFOLD which combines several true MQAP methods using an artificial neural network. Results: The ModSSEA method is found to be an effective model quality assessment program for ranking multiple models from many servers, however further accuracy can be gained by using the consensus approach of ModFOLD. The ModFOLD method is shown to significantly outperform the true MQAPs tested and is competitive with methods which make use of clustering or additional information from multiple servers. Several of the true MQAPs are also shown to add value to most individual fold recognition servers by improving model selection, when applied as a post filter in order to re-rank models. Conclusion: MQAPs should be benchmarked appropriately for the practical context in which they are intended to be used. Clustering based methods are the top performing MQAPs where many models are available from many servers; however, they often do not add value to individual fold recognition servers when limited models are available. Conversely, the true MQAP methods tested can often be used as effective post filters for re-ranking few models from individual fold recognition servers and further improvements can be achieved using a consensus of these methods.
Resumo:
Motivation: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, this may also result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and provide an indication of when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2, to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method. Results: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence to structure alignment methods tested. In addition, using accurate global model quality for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
The objective of the present study was to determine the presence of genotype by environment interaction (G × E) and to characterize the phenotypic plasticity of birth weight (BW), weaning weight (WW), postweaning weight gain (PWG) and yearling scrotal circumference (SC) in composite beef cattle using the reaction norms model with unknown covariate. The animals were born between 1995 and 2008 on 33 farms located throughout all Brazilian biomes between latitude -7 and -31, longitude -40 and -63. The contemporary group was chosen as the environmental descriptor, that is, the environmental covariate of the reaction norms. In general, higher estimates of direct heritability were observed in extreme favorable environments. The mean of direct heritability across the environmental gradient ranged from 0.05 to 0.51, 0.09 to 0.43, 0.01 to 0.43 and from 0.12 to 0.26 for BW, WW, PWG and SC, respectively. The variation in direct heritability observed indicates a different response to selection according to the environment in which the animals of the population are evaluated. The correlation between the level and slope of the reaction norm for BW and PWG was high, indicating that animals with higher average breeding values responded better to improvement in environmental conditions, a fact characterizing a scale of G × E. Low correlation between the intercept and slope was obtained for WW and SC, implying re-ranking of animals in different environments. Genetic variation exists in the sensitivity of animals to the environment, a fact that permits the selection of more plastic or robust genotypes in the population studied. Thus, the G × E is an important factor that should be considered in the genetic evaluation of the present population of composite beef cattle. © The Animal Consortium 2012.