82 results for Error estimator
Abstract:
The crystalline structure of transition metals (TM) has been widely known for several decades; however, our knowledge of the atomic structure of TM clusters is still far from satisfactory, which compromises an atomistic understanding of the reactivity of TM clusters. For example, almost all density functional theory (DFT) calculations for TM clusters have been based on local (local density approximation, LDA) and semilocal (generalized gradient approximation, GGA) exchange-correlation functionals; however, it is well known that plain DFT fails to correct the self-interaction error, which affects the properties of several systems. To improve our basic understanding of the atomic and electronic properties of TM clusters, we report a DFT study, using two nonlocal functionals, namely the hybrid HSE (Heyd, Scuseria, and Ernzerhof) and GGA + U functionals, of the structural and electronic properties of the Co(13), Rh(13), and Hf(13) clusters. For Co(13) and Rh(13), we found that improved exchange-correlation functionals decrease the stability of open structures such as the hexagonal bilayer (HBL) and double simple-cubic (DSC) compared with the compact icosahedron (ICO) structure; however, DFT-GGA, DFT-GGA + U, and DFT-HSE yield very similar results for Hf(13). Thus, our results suggest that the DSC structure predicted for Rh(13) by several plain DFT calculations can be improved upon by using better functionals. Using the sd hybridization analysis, we found that a strong hybridization favors compact structures, and hence a correct description of the sd hybridization is crucial for the relative energy stability. For example, the sd hybridization decreases for HBL and DSC and increases for ICO in the case of Co(13) and Rh(13), while for Hf(13) the sd hybridization decreases for all configurations and therefore does not affect the relative stability of open and compact configurations.
Abstract:
Background: Identifying local similarity between two or more sequences, or identifying repeats occurring at least twice in a sequence, is an essential part of the analysis of biological sequences and of their phylogenetic relationships. Finding such fragments while allowing for a certain number of insertions, deletions, and substitutions is, however, known to be a computationally expensive task, and consequently exact methods usually cannot be applied in practice. Results: The filter TUIUIU that we introduce in this paper provides a possible solution to this problem. It can be used as a preprocessing step for any multiple alignment or repeat inference method, eliminating a possibly large fraction of the input that is guaranteed not to contain any approximate repeat. It consists of verifying several strong necessary conditions that can be checked quickly. We implemented three versions of the filter. The first is a straightforward extension, to the case of multiple sequences, of conditions already existing in the literature. The second uses a stronger condition which, as our results show, enables filtering substantially more of the input with negligible (if any) additional time. The third version uses an additional condition and pushes the filtering power even further, at a non-negligible additional time cost in many circumstances; our experiments show that it is particularly useful with large error rates. The latter version was applied as a preprocessing step for a multiple alignment tool, yielding an overall time (filter plus alignment) on average 63 times and at best 530 times smaller than with direct alignment, and in most cases a better-quality alignment. Conclusion: To the best of our knowledge, TUIUIU is the first filter designed for multiple repeats and for dealing with error rates greater than 10% of the repeat length.
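The abstract above does not say which necessary conditions the filter checks. As a minimal sketch of what a "necessary condition" filter can look like, the code below uses the classic q-gram lemma, which is an assumption for illustration and not a description of TUIUIU itself: two strings within e edit operations must share at least max(|x|, |y|) - q + 1 - q*e q-grams. The function names are hypothetical.

```python
from collections import Counter


def shared_qgrams(x: str, y: str, q: int) -> int:
    """Count q-grams common to x and y, respecting multiplicity."""
    grams_x = Counter(x[i:i + q] for i in range(len(x) - q + 1))
    grams_y = Counter(y[i:i + q] for i in range(len(y) - q + 1))
    # Counter intersection keeps the minimum count of each q-gram.
    return sum((grams_x & grams_y).values())


def may_be_similar(x: str, y: str, q: int, max_errors: int) -> bool:
    """Necessary condition from the q-gram lemma (illustrative helper).

    If x and y are within max_errors edit operations, they share at least
    max(len(x), len(y)) - q + 1 - q * max_errors q-grams. Returning False
    proves the pair cannot match; returning True proves nothing.
    """
    threshold = max(len(x), len(y)) - q + 1 - q * max_errors
    return shared_qgrams(x, y, q) >= threshold
```

A pair that fails the check can be discarded safely; a pair that passes still has to be verified by the downstream alignment or repeat inference method.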
Abstract:
We consider the problem of interaction neighborhood estimation from the partial observation of a finite number of realizations of a random field. We introduce a model selection rule to choose estimators of conditional probabilities among natural candidates. Our main result is an oracle inequality satisfied by the resulting estimator. We then use this selection rule in a two-step procedure to evaluate the interacting neighborhoods. The selection rule selects a small prior set of possible interacting points, and a cutting step removes the irrelevant points from this prior set. We also prove that the Ising models satisfy the assumptions of the main theorems, without restrictions on the temperature, on the structure of the interacting graph, or on the range of the interactions. This therefore provides a large class of applications for our results. We give a computationally efficient procedure for these models. Finally, we show the practical efficiency of our approach in a simulation study.
Abstract:
Alternative splicing of gene transcripts greatly expands the functional capacity of the genome, and certain splice isoforms may indicate specific disease states such as cancer. Splice junction microarrays interrogate thousands of splice junctions, but data analysis is difficult and error prone because of the increased complexity compared to differential gene expression analysis. We present Rank Change Detection (RCD) as a method to identify differential splicing events based upon a straightforward probabilistic model comparing the over- or underrepresentation of two or more competing isoforms. RCD has advantages over commonly used methods because it is robust to false positive errors due to nonlinear trends in microarray measurements. Further, RCD does not depend on prior knowledge of splice isoforms, yet it takes advantage of the inherent structure of mutually exclusive junctions, and it is conceptually generalizable to other types of splicing arrays or RNA-Seq. RCD specifically identifies the biologically important cases in which a splice junction becomes more or less prevalent compared to other mutually exclusive junctions. The example data are from different cell lines of glioblastoma tumors assayed with Agilent microarrays.
Abstract:
Extensive ab initio calculations using a complete active space second-order perturbation theory wavefunction, including scalar and spin-orbit relativistic effects, with a quadruple-zeta quality basis set, were used to construct an analytical potential energy surface (PES) of the ground state of the [H, O, I] system. A total of 5344 points were fit to a three-dimensional function of the internuclear distances, with a global root-mean-square error of 1.26 kcal mol⁻¹. The resulting PES accurately describes the main features of this system: the HOI and HIO isomers, the transition state between them, and all dissociation asymptotes. After a small adjustment, using a scaling factor on the internal coordinates of HOI, the frequencies calculated in this work agree with the available experimental data to within 10 cm⁻¹. © 2011 American Institute of Physics. [doi: 10.1063/1.3615545]
Abstract:
Background: Mutations in TP53 are common events during carcinogenesis. In addition to gene mutations, several reports have focused on TP53 polymorphisms as risk factors for malignant disease. Many studies have highlighted that the status of the TP53 codon 72 polymorphism could influence cancer susceptibility. However, the results have been inconsistent, and various methodological features can contribute to departures from Hardy-Weinberg equilibrium, a condition that may influence the disease risk estimates. The most widely accepted method of detecting genotyping error is to confirm genotypes by sequencing and/or via a separate method. Results: We developed two new genotyping methods for TP53 codon 72 polymorphism detection: Denaturing High Performance Liquid Chromatography (DHPLC) and Dot Blot hybridization. These methods were compared with Restriction Fragment Length Polymorphism (RFLP) using two different restriction enzymes. We observed high agreement among all methodologies assayed. Dot Blot hybridization and DHPLC results were more concordant with each other than either was with RFLP. Conclusions: Although variations may occur, our results indicate that DHPLC and Dot Blot hybridization can be used as reliable screening methods for TP53 codon 72 polymorphism detection, especially in molecular epidemiologic studies, where high-throughput methodologies are required.
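The abstract above mentions departures from Hardy-Weinberg equilibrium without showing how they are checked. A minimal sketch of the usual Pearson chi-square check for a biallelic locus follows; the function name and the genotype counts in the usage note are illustrative and are not taken from the study.

```python
def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Pearson chi-square statistic for departure from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote at a biallelic locus (hypothetical inputs).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

With one degree of freedom, a statistic above 3.841 indicates a departure at the 5% level. For example, counts of 25, 50, and 25 match the expected proportions exactly and give a statistic of 0.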
Abstract:
Soil bulk density values are needed to convert organic carbon content to mass of organic carbon per unit area. However, field sampling and measurement of soil bulk density are labour-intensive, costly and tedious. Near-infrared reflectance spectroscopy (NIRS) is a physically non-destructive, rapid, reproducible and low-cost method that characterizes materials according to their reflectance in the near-infrared spectral region. The aim of this paper was to investigate the ability of NIRS to predict soil bulk density and to compare its performance with published pedotransfer functions. The study was carried out on a dataset of 1184 soil samples originating from a reforestation area in the Brazilian Amazon basin, and conventional soil bulk density values were obtained with metallic "core cylinders". The results indicate that the modified partial least squares regression used on spectral data is an alternative method for soil bulk density predictions to the published pedotransfer functions tested in this study. The NIRS method presented the closest-to-zero accuracy error (-0.002 g cm⁻³) and the lowest prediction error (0.13 g cm⁻³), and the coefficient of variation of the validation sets ranged from 8.1 to 8.9% of the mean reference values. Further research is required to assess the limits and specificities of the NIRS method, but it may have advantages for soil bulk density predictions, especially in environments such as the Amazon forest.
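The abstract above does not define its two error measures. The sketch below assumes that "accuracy error" means the mean residual (bias) and "prediction error" means the root-mean-square error; both readings are assumptions. It also shows the standard conversion from organic carbon content to carbon mass per unit area, ignoring coarse fragments. All function names and input values are hypothetical.

```python
import math


def bias_and_rmse(predicted, reference):
    """Return (mean residual, root-mean-square error) of predictions."""
    residuals = [p - r for p, r in zip(predicted, reference)]
    n = len(residuals)
    bias = sum(residuals) / n
    rmse = math.sqrt(sum(d * d for d in residuals) / n)
    return bias, rmse


def carbon_stock_kg_m2(oc_g_per_kg, bulk_density_g_cm3, depth_m):
    """Organic carbon mass per unit area for one soil layer.

    Assumes no coarse fragments. 1 g cm^-3 equals 1000 kg m^-3.
    """
    oc_fraction = oc_g_per_kg / 1000.0            # kg C per kg soil
    density_kg_m3 = bulk_density_g_cm3 * 1000.0   # kg soil per m^3
    return oc_fraction * density_kg_m3 * depth_m  # kg C per m^2
```

For instance, a layer 0.3 m deep with 20 g kg⁻¹ organic carbon and a bulk density of 1.2 g cm⁻³ holds about 7.2 kg of carbon per square metre, so an error in bulk density carries over proportionally to the carbon stock.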