205 results for R-Statistical computing
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test whether two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov-Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow problem over a related graph, which can be solved using a Ford-Fulkerson algorithm in time polynomial in that number. We apply the test to 10 randomly chosen protein domain families from the seed of the Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton-Watson related processes.
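The max-flow step mentioned in this abstract can be illustrated generically. The following is a minimal Python sketch of a Ford-Fulkerson-style solver (the Edmonds-Karp variant, which chooses augmenting paths by breadth-first search). The function name `max_flow`, the dict-of-dicts graph encoding and the toy network are assumptions made for illustration; this is not the authors' construction over context trees.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: Ford-Fulkerson with BFS-chosen augmenting paths.

    capacity is a dict of dicts: capacity[u][v] is the capacity of edge u -> v.
    """
    # Build residual capacities, adding zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    residual.setdefault(source, {})
    residual.setdefault(sink, {})
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: the flow is maximal

        # Find the bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck along the path and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Toy network (hypothetical): the cut around "s" has capacity 3 + 2.
toy = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
print(max_flow(toy, "s", "t"))
```

With breadth-first path selection the number of augmentations is bounded polynomially in the graph size, which is the property the abstract relies on when it speaks of a polynomial-time solution.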
Abstract:
This work presents a statistical study on the variability of the mechanical properties of hardened self-compacting concrete, including the compressive strength, splitting tensile strength and modulus of elasticity. The comparison of the experimental results with those derived from several codes and recommendations allows evaluating whether the hardened behaviour of self-compacting concrete can be appropriately predicted by the existing formulations. The variables analyzed include the maximum aggregate size and the paste and gravel content. Results from the analyzed self-compacting concretes presented variability measures in the same range as those expected for conventional vibrated concrete, with all the results within a confidence level of 95%. From several formulations for conventional concrete considered in this study, it was observed that a safe estimation of the modulus of elasticity can be obtained from the value of compressive strength, with lower-strength self-compacting concretes presenting higher safety margins. However, most codes overestimate the material tensile strength. (C) 2010 Elsevier Ltd. All rights reserved.
Abstract:
Hydrodynamic studies were conducted in a semi-cylindrical spouted bed column of diameter 150 mm and height 1000 mm, with a conical base of included angle 60 degrees and an inlet orifice of diameter 25 mm. Pressure transducers at several axial positions were used to obtain pressure fluctuation time series with 1.2 and 2.4 mm glass beads at U/U_ms from 0.3 to 1.6, and static bed depths from 150 to 600 mm. The conditions covered several flow regimes (fixed bed, incipient spouting, stable spouting, pulsating spouting, slugging, bubble spouting and fluidization). Images of the system dynamics were also acquired through the transparent walls with a digital camera. The data were analyzed via statistical, mutual information theory, spectral and Hurst's rescaled range methods to assess the potential of these methods to characterize the spouting quality. The results indicate that these methods have potential for monitoring spouted bed operation.
Abstract:
The objective of this study was to show the association patterns among seven types of dental anomalies (second premolar agenesis, reduced-size upper lateral incisors, lower first molar infraocclusion, enamel hypoplasia, first molar ectopic eruption, supernumerary teeth and upper canine ectopic eruption) in a population sample without dental treatment, ranging in age from 7 to 14 years. A total of 172 patients were seen and underwent clinical examination at the Clinica Infantil da Fundacao Educacional de Barretos. Eleven patients from this total were selected according to a first dental anomaly diagnosis and submitted to panoramic radiography. A significant association (p < 0.05) was detected among six pairs of anomalies (second premolar agenesis x first premolar ectopic eruption; second premolar agenesis x lower first molar infraocclusion; second premolar agenesis x reduced-size upper lateral incisors; supernumerary teeth x reduced-size upper lateral incisors; first premolar ectopic eruption x enamel hypoplasia; lower first molar infraocclusion x reduced-size upper lateral incisors), suggesting a common genetic origin for these conditions. The association was not significant in only one case in which the patients shared anomalies. The presence of one anomaly is clinically relevant for the early diagnosis of a possible association, since it can indicate an increased risk of other anomalies.
Abstract:
Objective: The aim of this article is to propose an integrated framework for extracting and describing patterns of disorders from medical images using a combination of linear discriminant analysis and active contour models. Methods: A multivariate statistical methodology was first used to identify the most discriminating hyperplane separating two groups of images (from healthy controls and patients with schizophrenia) contained in the input data. The present work then makes explicit the differences found by the multivariate statistical method by subtracting the discriminant models of controls and patients, weighted by the pooled variance between the two groups. A variational level-set technique was used to segment clusters of these differences. We obtain a label for each anatomical change using the Talairach atlas. Results: In this work all the data were analysed simultaneously rather than assuming a priori regions of interest. As a consequence, by using active contour models, we were able to obtain regions of interest that emerged from the data. The results were evaluated using, as gold standard, well-known facts about the neuroanatomical changes related to schizophrenia. Most of the items in the gold standard were covered in our result set. Conclusions: We argue that such investigation provides a suitable framework for characterising the high complexity of magnetic resonance images in schizophrenia, as the results obtained indicate a high sensitivity rate with respect to the gold standard. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
The Direct Assessment of Functional Status-Revised (DAFS-R) is an instrument developed to objectively measure functional capacities required for independent living. The objective of this study was to translate and culturally adapt the DAFS-R for Brazilian Portuguese (DAFS-BR) and to evaluate its reliability and validity. The DAFS-BR was administered to 89 older patients previously classified as normal controls or as having mild cognitive impairment (MCI) or Alzheimer's disease (AD). The results indicated good internal consistency (Cronbach's alpha = 0.78) in the total sample. The DAFS-BR showed high interobserver reliability (0.996; p < .001) as well as test-retest stability over a 1-week interval (0.995; p < .001). Correlation between the DAFS-BR total score and the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) was moderate and significant (r = -.65, p < .001) in the total sample, whereas it did not reach statistical significance within each diagnostic group. Receiver operating characteristic curve analyses suggested that DAFS-BR has good sensitivity and specificity to identify MCI and AD. Results suggest that DAFS-BR can document degrees of severity of functional impairment among Brazilian older adults.
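The internal-consistency figure reported in this abstract is Cronbach's alpha, which for k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. The sketch below, in Python, shows that calculation under stated assumptions: the function name `cronbach_alpha` and the small score matrix are hypothetical and are not DAFS-BR data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    items = list(zip(*scores))                           # one tuple of scores per item
    item_var = sum(variance(col) for col in items)       # sum of sample item variances
    total_var = variance([sum(row) for row in scores])   # variance of the total scores
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical scores: 4 respondents, 3 items.
example = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [3, 3, 4]]
print(round(cronbach_alpha(example), 3))
```

Alpha approaches 1 when the items move together across respondents, which is why it is read as a measure of internal consistency.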
Abstract:
One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to the initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms of interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets. (C) 2010 Elsevier B.V. All rights reserved.
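The "systematic" baseline described in this abstract, rerunning k-means from different initial prototypes and keeping the best result, can be sketched as follows. This is a minimal Python illustration on one-dimensional data, not the evolutionary algorithm the paper studies; the function names `kmeans_1d` and `repeated_kmeans`, the restart count and the toy data are assumptions.

```python
import random

def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    # Sum of squared distances from each point to its nearest centroid.
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse

def repeated_kmeans(points, k, restarts=10, seed=0):
    """Run k-means from several random initializations; keep the lowest-SSE run."""
    rng = random.Random(seed)
    runs = [kmeans_1d(points, rng.sample(points, k)) for _ in range(restarts)]
    return min(runs, key=lambda run: run[1])

centroids, sse = repeated_kmeans([1.0, 2.0, 10.0, 11.0], k=2)
print(sorted(centroids), sse)
```

A full systematic search would wrap `repeated_kmeans` in a further loop over candidate values of k, which is the repeated cost that the abstract contrasts with the evolutionary approach.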
Abstract:
The use of inter-laboratory test comparisons to determine the performance of individual laboratories for specific tests (or for calibration) [ISO/IEC Guide 43-1, 1997. Proficiency testing by interlaboratory comparisons - Part 1: Development and operation of proficiency testing schemes] is called Proficiency Testing (PT). In this paper we propose the use of the generalized likelihood ratio test to compare the performance of the group of laboratories for specific tests relative to the assigned value, and we illustrate the procedure with actual data from the PT program in the area of volume. The proposed test extends the test criteria in use by making it possible to test for the consistency of the group of laboratories. Moreover, the class of elliptical distributions is considered for the obtained measurements. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
The crystal structures of an aspartic proteinase from Trichoderma reesei (TrAsP) and of its complex with a competitive inhibitor, pepstatin A, were solved and refined to crystallographic R-factors of 17.9% (R(free)=21.2%) at 1.70 angstrom resolution and 15.81% (R(free) = 19.2%) at 1.85 angstrom resolution, respectively. The three-dimensional structure of TrAsP is similar to structures of other members of the pepsin-like family of aspartic proteinases. Each molecule is folded in a predominantly beta-sheet bilobal structure with the N-terminal and C-terminal domains of about the same size. Structural comparison of the native structure and the TrAsP-pepstatin complex reveals that the enzyme undergoes an induced-fit, rigid-body movement upon inhibitor binding, with the N-terminal and C-terminal lobes tightly enclosing the inhibitor. Upon recognition and binding of pepstatin A, amino acid residues of the enzyme active site form a number of short hydrogen bonds to the inhibitor that may play an important role in the mechanism of catalysis and inhibition. The structures of TrAsP were used as a template for performing statistical coupling analysis of the aspartic protease family. This approach permitted, for the first time, the identification of a network of structurally linked residues putatively mediating conformational changes relevant to the function of this family of enzymes. Statistical coupling analysis reveals coevolved continuous clusters of amino acid residues that extend from the active site into the hydrophobic cores of each of the two domains and include amino acid residues from the flap regions, highlighting the importance of these parts of the protein for its enzymatic activity. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
The Birnbaum-Saunders (BS) model is a positively skewed statistical distribution that has received great attention in recent decades. A generalized version of this model, named the generalized BS (GBS) distribution, was derived based on symmetrical distributions on the real line. The R package named gbs was developed to analyze data from GBS models. This package contains probabilistic and reliability indicators and random number generators from GBS distributions. Parameter estimates for censored and uncensored data can also be obtained by means of likelihood methods from the gbs package. Goodness-of-fit and diagnostic methods were also implemented in this package in order to check the suitability of the GBS models. In this article, the capabilities and features of the gbs package are illustrated by using simulated and real data sets. Shape and reliability analyses for GBS models are presented. A simulation study for evaluating the quality and sensitivity of the estimation method developed in the package is provided and discussed. (C) 2008 Elsevier B.V. All rights reserved.
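Random number generation for the classical BS distribution rests on a standard transformation: if Z is standard normal, then T = beta * (alpha*Z/2 + sqrt((alpha*Z/2)^2 + 1))^2 follows a BS distribution with shape alpha and scale beta. The Python sketch below illustrates only that transformation. It is not the gbs package's API (the package is written in R and its function signatures are not given here), and the names `bs_from_normal` and `sample_bs` are hypothetical.

```python
import math
import random

def bs_from_normal(z, alpha, beta):
    """Map a standard normal value z to a Birnbaum-Saunders(alpha, beta) value."""
    half = alpha * z / 2.0
    return beta * (half + math.sqrt(half * half + 1.0)) ** 2

def sample_bs(n, alpha, beta, seed=0):
    """Draw n classical BS variates by transforming standard normal draws."""
    rng = random.Random(seed)
    return [bs_from_normal(rng.gauss(0.0, 1.0), alpha, beta) for _ in range(n)]

# z = 0 maps to beta, which is the median of the BS distribution.
print(bs_from_normal(0.0, 0.5, 2.0))
print(sample_bs(3, alpha=0.5, beta=2.0))
```

As the abstract describes it, the generalized family is built on symmetrical distributions, so an analogous sketch for a GBS model would presumably replace the normal draw with a draw from another symmetric distribution; that extension is not shown here.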
Abstract:
We discuss the connection between information and copula theories by showing that a copula can be employed to decompose the information content of a multivariate distribution into marginal and dependence components, with the latter quantified by the mutual information. We define the information excess as a measure of deviation from a maximum-entropy distribution. The idea of marginal invariant dependence measures is also discussed and used to show that empirical linear correlation underestimates the amplitude of the actual correlation in the case of non-Gaussian marginals. The mutual information is shown to provide an upper bound for the asymptotic empirical log-likelihood of a copula. An analytical expression for the information excess of T-copulas is provided, allowing for simple model identification within this family. We illustrate the framework on a financial data set. Copyright (C) EPLA, 2009
Abstract:
We propose a likelihood ratio test (LRT) with Bartlett correction in order to identify Granger causality between sets of time series gene expression data. The performance of the proposed test is compared to a previously published bootstrap-based approach. LRT is shown to be significantly faster and statistically powerful even under non-normal distributions. An R package named gGranger containing an implementation for both Granger causality identification tests is also provided.
Abstract:
OBJECTIVE: Removable partial dentures (RPD) require different hygiene care, and the combination of brushing and chemical cleansing is the most recommended to control biofilm formation. However, the effect of cleansers has not been evaluated on RPD metallic components. The aim of this study was to evaluate in vitro the effect of different denture cleansers on the weight and ion release of RPD. MATERIAL AND METHODS: Five specimens (12x3 mm metallic disc positioned in a 38x18x4 mm mould filled with resin), 7 cleanser agents [Periogard (PE), Cepacol (CE), Corega Tabs (CT), Medical Interporous (MI), Polident (PO), 0.05% sodium hypochlorite (NaOCl), and distilled water (DW) (control)] and 2 cobalt-chromium alloys [DeguDent (DD), and VeraPDI (VPDI)] were used for each experimental situation. One hundred and eighty immersions were performed and the weight was analyzed with a high-precision analytical balance. Data were recorded before and after the immersions. The ion release was analyzed using inductively coupled plasma mass spectrometry. Data were analyzed by two-way ANOVA and Tukey HSD post hoc test at the 5% significance level. RESULTS: Statistical analysis showed that CT and MI had higher values of weight loss, with a greater change in the VPDI alloy compared to DD. The solutions that caused more ion release were NaOCl and MI. CONCLUSIONS: It may be concluded that 0.05% NaOCl and Medical Interporous tablets are not suitable as auxiliary chemical solutions for RPD care.
Abstract:
The objective of this study was to evaluate the flexural strength (σf) and hardness (H) of direct and indirect composites, testing the hypotheses that direct resin composites produce higher σf and H values than indirect composites and that these properties are positively related. Ten bar-shaped specimens (25 mm x 2 mm x 2 mm) were fabricated for each direct [D250 - Filtek Z250 (3M-Espe) and D350 - Filtek Z350 (3M-Espe)] and indirect [ISin - Sinfony (3M-Espe) and IVM - VitaVM LC (Vita Zahnfabrik)] material, according to the manufacturer's instructions and ISO4049 specifications. The σf was tested in three-point bending using a universal testing machine (EMIC DL 2000) at a crosshead speed of 0.5 mm/min (ISO4049). Knoop hardness (H) was measured on the specimen fragments resulting from the σf test and calculated as H = 14.2P/l², where P is the applied load (0.1 kg; dwell time = 15 s) and l is the longest diagonal of the diamond-shaped indent (ASTM E384). The data were statistically analyzed using ANOVA and Tukey tests (α = 0.05). The mean σf and standard deviation values (MPa) and statistical grouping were: D250 - 135.4 ± 17.6a; D350 - 123.7 ± 11.1b; ISin - 98.4 ± 6.4c; IVM - 73.1 ± 4.9d. The mean H and standard deviation values (kg/mm²) and statistical grouping were: D250 - 98.12 ± 1.8a; D350 - 86.5 ± 1.9b; ISin - 28.3 ± 0.9c; IVM - 30.8 ± 1.0c. The direct composite systems examined produce higher mean σf and H values than the indirect composites, and the mean values of these properties were positively correlated (r = 0.91), confirming the study hypotheses.
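The hardness formula quoted in this abstract, H = 14.2P/l², is simple enough to check by hand. The short Python sketch below evaluates it and its inverse; the function names and the 0.12 mm diagonal are hypothetical, and the study's raw indent measurements are not reproduced here.

```python
import math

def knoop_hardness(load_kg, long_diagonal_mm):
    """Knoop hardness in kg/mm², using H = 14.2 * P / l² as given in the abstract."""
    return 14.2 * load_kg / long_diagonal_mm ** 2

def diagonal_for_hardness(load_kg, hardness):
    """Invert the formula: long diagonal (mm) that yields a given hardness."""
    return math.sqrt(14.2 * load_kg / hardness)

# With the 0.1 kg load used in the study and a hypothetical 0.12 mm diagonal:
print(knoop_hardness(0.1, 0.12))
# Diagonal implied by the reported mean hardness of D250 (98.12 kg/mm²):
print(diagonal_for_hardness(0.1, 98.12))
```

Because hardness scales with the inverse square of the diagonal, small errors in measuring l have a doubled relative effect on H.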
Abstract:
An analysis of the geographic distribution of Schefflera in extra-Amazonian Brazil was carried out based on updated maps plotting the known occurrences of the 26 species of the genus found in this large area: S. angustissima (Marchal) Frodin, S. aurata Fiaschi, S. botumirimensis Fiaschi & Pirani, S. burchellii (Seem.) Frodin & Fiaschi, S. calva (Cham.) Frodin & Fiaschi, S. capixaba Fiaschi, S. cephalantha (Harms) Frodin, S. cordata (Taub.) Frodin & Fiaschi, S. distractiflora (Harms) Frodin, S. fruticosa Fiaschi & Pirani, S. gardneri (Seem.) Frodin & Fiaschi, S. glaziovii (Taub.) Frodin & Fiaschi, S. grandigemma Fiaschi, S. kollmannii Fiaschi, S. longipetiolata (Pohl ex DC.) Frodin & Fiaschi, S. lucumoides (Decne. & Planch. ex Marchal) Frodin & Fiaschi, S. macrocarpa (Cham. & Schltdl.) Frodin, S. malmei (Harms) Frodin, S. morototoni (Aubl.) Maguire, Steyermark & Frodin, S. racemifera Fiaschi & Frodin, S. ruschiana Fiaschi & Pirani, S. selloi (Marchal) Frodin & Fiaschi, S. succinea Frodin & Fiaschi, S. villosissima Fiaschi & Pirani, S. vinosa (Cham. & Schltdl.) Frodin & Fiaschi and S. aff. varisiana Frodin. Two centers of endemism associated with high-elevation areas were recognized: the Cadeia do Espinhaço in Minas Gerais and the montane forests of the state of Espírito Santo. The geographic distribution patterns illustrated are discussed on the basis of data obtained for other angiosperm groups and of phytogeographic studies of the main phytochoria of extra-Amazonian Brazil. Hypotheses about probable phylogenetic relationships among some taxa are also presented, with a view to finding possible correlations between these relationships and the biogeography of the group.