5 resultados para STATISTICAL INFORMATION
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Resumo:
We explore the meaning of information about quantities of interest. Our approach is divided in two scenarios: the analysis of observations and the planning of an experiment. First, we review the Sufficiency, Conditionality and Likelihood principles and how they relate to trivial experiments. Next, we review Blackwell Sufficiency and show that sampling without replacement is Blackwell Sufficient for sampling with replacement. Finally, we unify the two scenarios presenting an extension of the relationship between Blackwell Equivalence and the Likelihood Principle.
Resumo:
Abstract Background One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using microarray technique. Due to important differences in the probabilistic models of microarray and SAGE technologies, it is important to develop suitable techniques to select specific genes from SAGE measurements. Results A new framework to select specific genes that distinguish different biological states based on the analysis of SAGE data is proposed. The new framework applies the bolstered error for the identification of strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify the genes that distinguish the different states with more reliability among all gene groups selected by the strong genes method. A score taking into account the credibility and the bolstered error values in order to rank the groups of considered genes is proposed. Results obtained using SAGE data from gliomas are presented, thus corroborating the introduced methodology. Conclusion The model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis. The additional statistical information provided by the probabilistic model is incorporated in the methodology described in the paper. The introduced method is suitable to identify signature genes that lead to a good separation of the biological states using SAGE and may be adapted for other counting methods such as Massive Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of such genes identified by the proposed method may be useful to generate classifiers.
Resumo:
In this work we measured X-ray scatter spectra from normal and neoplastic breast tissues using photon energy of 17.44 key and a scattering angle of 90 degrees, in order to study the shape (FWHM) of the Compton peaks. The obtained results for FWHM were discussed in terms of composition and histological characteristics of each tissue type. The statistical analysis shows that the distribution of FWHM of normal adipose breast tissue clearly differs from all other investigated tissues. Comparison between experimental values of FWHM and effective atomic number revealed a strong correlation between them, showing that the FWHM values can be used to provide information about elemental composition of the tissues. (C) 2012 Elsevier Ltd. All rights reserved.
Resumo:
In the past decades, all of the efforts at quantifying systems complexity with a general tool has usually relied on using Shannon's classical information framework to address the disorder of the system through the Boltzmann-Gibbs-Shannon entropy, or one of its extensions. However, in recent years, there were some attempts to tackle the quantification of algorithmic complexities in quantum systems based on the Kolmogorov algorithmic complexity, obtaining some discrepant results against the classical approach. Therefore, an approach to the complexity measure is proposed here, using the quantum information formalism, taking advantage of the generality of the classical-based complexities, and being capable of expressing these systems' complexity on other framework than its algorithmic counterparts. To do so, the Shiner-Davison-Landsberg (SDL) complexity framework is considered jointly with linear entropy for the density operators representing the analyzed systems formalism along with the tangle for the entanglement measure. The proposed measure is then applied in a family of maximally entangled mixed state.
Resumo:
Background: Genome-wide association studies (GWAS) require large sample sizes to obtain adequate statistical power, but it may be possible to increase the power by incorporating complementary data. In this study we investigated the feasibility of automatically retrieving information from the medical literature and leveraging this information in GWAS. Methods: We developed a method that searches through PubMed abstracts for pre-assigned keywords and key concepts, and uses this information to assign prior probabilities of association for each single nucleotide polymorphism (SNP) with the phenotype of interest - the Adjusting Association Priors with Text (AdAPT) method. Association results from a GWAS can subsequently be ranked in the context of these priors using the Bayes False Discovery Probability (BFDP) framework. We initially tested AdAPT by comparing rankings of known susceptibility alleles in a previous lung cancer GWAS, and subsequently applied it in a two-phase GWAS of oral cancer. Results: Known lung cancer susceptibility SNPs were consistently ranked higher by AdAPT BFDPs than by p-values. In the oral cancer GWAS, we sought to replicate the top five SNPs as ranked by AdAPT BFDPs, of which rs991316, located in the ADH gene region of 4q23, displayed a statistically significant association with oral cancer risk in the replication phase (per-rare-allele log additive p-value [p(trend)] = 2.5 x 10(-3)). The combined OR for having one additional rare allele was 0.83 (95% CI: 0.76-0.90), and this association was independent of previously identified susceptibility SNPs that are associated with overall UADT cancer in this gene region. We also investigated if rs991316 was associated with other cancers of the upper aerodigestive tract (UADT), but no additional association signal was found. Conclusion: This study highlights the potential utility of systematically incorporating prior knowledge from the medical literature in genome-wide analyses using the AdAPT methodology. AdAPT is available online (url: http://services.gate.ac.uk/lld/gwas/service/config).