4 results for the SIMPLE algorithm
in DigitalCommons@The Texas Medical Center
Abstract:
Information overload is a significant problem for modern medicine. Searching MEDLINE for common topics often retrieves more relevant documents than users can review. Therefore, we must identify documents that are not only relevant, but also important. Our system ranks articles using citation counts and the PageRank algorithm, incorporating data from the Science Citation Index. However, citation data is usually incomplete. Therefore, we explore the relationship between the quantity of citation information available to the system and the quality of the result ranking. Specifically, we test the ability of citation count and PageRank to identify "important articles" as defined by experts from large result sets with decreasing citation information. We found that PageRank performs better than simple citation counts, but both algorithms are surprisingly robust to information loss. We conclude that even an incomplete citation database is likely to be effective for importance ranking.
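The comparison described in this abstract can be illustrated with a minimal sketch, assuming a citation graph stored as (citing, cited) pairs. The function names, the damping factor of 0.85, and the random edge-dropping scheme are illustrative assumptions, not details of the authors' system.

```python
import random


def citation_counts(edges, nodes):
    """Count how many times each article is cited."""
    counts = {node: 0 for node in nodes}
    for _citing, cited in edges:
        counts[cited] += 1
    return counts


def pagerank(edges, nodes, damping=0.85, iterations=50):
    """Power-iteration PageRank over a citation graph (assumed parameters)."""
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by articles that cite nothing is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node in nodes:
            if out_links[node]:
                share = damping * rank[node] / len(out_links[node])
                for target in out_links[node]:
                    new_rank[target] += share
        rank = new_rank
    return rank


def drop_citations(edges, keep_fraction, seed=0):
    """Simulate an incomplete citation database by keeping a random subset."""
    rng = random.Random(seed)
    return [edge for edge in edges if rng.random() < keep_fraction]


# Toy graph: A, B and C cite D; D cites E.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "D"), ("B", "D"), ("C", "D"), ("D", "E")]
counts = citation_counts(edges, nodes)
ranks = pagerank(edges, nodes)
partial_ranks = pagerank(drop_citations(edges, 0.5), nodes)
```

Comparing the ordering from `ranks` with the ordering from `partial_ranks` at several values of `keep_fraction` is one way to probe how a ranking degrades as citation information is removed.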
Abstract:
In this paper, we present the Cellular Dynamic Simulator (CDS) for simulating diffusion and chemical reactions within crowded molecular environments. CDS is based on a novel event-driven algorithm specifically designed for precise calculation of the timing of collisions, reactions, and other events for each individual molecule in the environment. Generic mesh-based compartments allow the creation or importation of very simple or detailed cellular structures that exist in a 3D environment. Multiple levels of compartments and static obstacles can be used to create a dense environment that mimics cellular boundaries and the intracellular space. The CDS algorithm takes into account volume exclusion and molecular crowding, which may impact signaling cascades in small sub-cellular compartments such as dendritic spines. With the CDS, we can simulate simple enzyme reactions, aggregation, and channel transport, as well as highly complicated chemical reaction networks of both freely diffusing and membrane-bound multi-protein complexes. Components of the CDS are defined generically so that the simulator can be applied to a wide range of environments in terms of scale and level of detail. Through an initialization GUI, a simple simulation environment can be created and populated within minutes, yet the tool is powerful enough to design complex 3D cellular architecture. The initialization tool allows visual confirmation of the environment construction prior to execution by the simulator. This paper describes the CDS algorithm and its design and implementation, provides an overview of the types of features available, and highlights the utility of those features in demonstrations.
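The idea of computing event times exactly, rather than checking for overlaps at fixed time steps, can be sketched as follows. This is a generic hard-sphere illustration, assuming molecules that move in straight lines between events and do not overlap initially; it is not the CDS algorithm itself, and all names are hypothetical.

```python
import heapq
import math


def collision_time(p1, v1, r1, p2, v2, r2):
    """Time at which two moving spheres first touch, or None if they never do.

    Solves |dp + dv * t| = r1 + r2 for the smallest t, assuming constant
    velocities and no initial overlap.
    """
    dp = [a - b for a, b in zip(p1, p2)]
    dv = [a - b for a, b in zip(v1, v2)]
    a = sum(c * c for c in dv)
    b = 2.0 * sum(x * y for x, y in zip(dp, dv))
    c = sum(x * x for x in dp) - (r1 + r2) ** 2
    if a == 0.0 or b >= 0.0:
        return None  # no relative motion, or the spheres are moving apart
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # closest approach is still farther than r1 + r2
    return (-b - math.sqrt(disc)) / (2.0 * a)


def collision_events(molecules):
    """Build a min-heap of (time, i, j) collision events for all pairs."""
    events = []
    for i in range(len(molecules)):
        for j in range(i + 1, len(molecules)):
            (p1, v1, r1), (p2, v2, r2) = molecules[i], molecules[j]
            t = collision_time(p1, v1, r1, p2, v2, r2)
            if t is not None:
                heapq.heappush(events, (t, i, j))
    return events


# Each molecule is (position, velocity, radius); values are made up.
molecules = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 1.0),
    ((10.0, 0.0, 0.0), (-1.0, 0.0, 0.0), 1.0),
    ((0.0, 50.0, 0.0), (0.0, 1.0, 0.0), 1.0),
]
events = collision_events(molecules)
next_event = events[0] if events else None
```

An event-driven loop would pop the earliest event, advance the affected molecules to that time, resolve the collision or reaction, and then recompute only the events involving those molecules.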
Abstract:
SNP genotyping arrays have been developed to characterize single-nucleotide polymorphisms (SNPs) and DNA copy number variations (CNVs). The quality of the inferences about copy number can be affected by many factors including batch effects, DNA sample preparation, signal processing, and analytical approach. Nonparametric and model-based statistical algorithms have been developed to detect CNVs from SNP genotyping data. However, these algorithms lack specificity to detect small CNVs due to the high false positive rate when calling CNVs based on the intensity values. Association tests based on detected CNVs therefore lack power even if the CNVs affecting disease risk are common. In this research, by combining an existing Hidden Markov Model (HMM) and the logistic regression model, a new genome-wide logistic regression algorithm was developed to detect CNV associations with diseases. We showed that the new algorithm is more sensitive and can be more powerful in detecting CNV associations with diseases than an existing popular algorithm, especially when the CNV association signal is weak and a limited number of SNPs are located in the CNV.