136 resultados para Robust Convergence
Resumo:
We propose a novel method for scoring the accuracy of protein binding site predictions – the Binding-site Distance Test (BDT) score. Recently, the Matthews Correlation Coefficient (MCC) has been used to evaluate binding site predictions, both by developers of new methods and by the assessors for the community wide prediction experiment – CASP8. Whilst being a rigorous scoring method, the MCC does not take into account the actual 3D location of the predicted residues from the observed binding site. Thus, an incorrectly predicted site that is nevertheless close to the observed binding site will obtain an identical score to the same number of nonbinding residues predicted at random. The MCC is somewhat affected by the subjectivity of determining observed binding residues and the ambiguity of choosing distance cutoffs. By contrast the BDT method produces continuous scores ranging between 0 and 1, relating to the distance between the predicted and observed residues. Residues predicted close to the binding site will score higher than those more distant, providing a better reflection of the true accuracy of predictions. The CASP8 function predictions were evaluated using both the MCC and BDT methods and the scores were compared. The BDT was found to strongly correlate with the MCC scores whilst also being less susceptible to the subjectivity of defining binding residues. We therefore suggest that this new simple score is a potentially more robust method for future evaluations of protein-ligand binding site predictions.
Resumo:
Despite decades of research, it remains controversial whether ecological communities converge towards a common structure determined by environmental conditions irrespective of assembly history. Here, we show experimentally that the answer depends on the level of community organization considered. In a 9-year grassland experiment, we manipulated initial plant composition on abandoned arable land and subsequently allowed natural colonization. Initial compositional variation caused plant communities to remain divergent in species identities, even though these same communities converged strongly in species traits. This contrast between species divergence and trait convergence could not be explained by dispersal limitation or community neutrality alone. Our results show that the simultaneous operation of trait-based assembly rules and species-level priority effects drives community assembly, making it both deterministic and historically contingent, but at different levels of community organization.
Resumo:
Estimation of population size with missing zero-class is an important problem that is encountered in epidemiological assessment studies. Fitting a Poisson model to the observed data by the method of maximum likelihood and estimation of the population size based on this fit is an approach that has been widely used for this purpose. In practice, however, the Poisson assumption is seldom satisfied. Zelterman (1988) has proposed a robust estimator for unclustered data that works well in a wide class of distributions applicable for count data. In the work presented here, we extend this estimator to clustered data. The estimator requires fitting a zero-truncated homogeneous Poisson model by maximum likelihood and thereby using a Horvitz-Thompson estimator of population size. This was found to work well, when the data follow the hypothesized homogeneous Poisson model. However, when the true distribution deviates from the hypothesized model, the population size was found to be underestimated. In the search of a more robust estimator, we focused on three models that use all clusters with exactly one case, those clusters with exactly two cases and those with exactly three cases to estimate the probability of the zero-class and thereby use data collected on all the clusters in the Horvitz-Thompson estimator of population size. Loss in efficiency associated with gain in robustness was examined based on a simulation study. As a trade-off between gain in robustness and loss in efficiency, the model that uses data collected on clusters with at most three cases to estimate the probability of the zero-class was found to be preferred in general. In applications, we recommend obtaining estimates from all three models and making a choice considering the estimates from the three models, robustness and the loss in efficiency. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
Resumo:
In this paper, we list some new orthogonal main effects plans for three-level designs for 4, 5 and 6 factors in IS runs and compare them with designs obtained from the existing L-18 orthogonal array. We show that these new designs have better projection properties and can provide better parameter estimates for a range of possible models. Additionally, we study designs in other smaller run-sizes when there are insufficient resources to perform an 18-run experiment. Plans for three-level designs for 4, 5 and 6 factors in 13 to 17 runs axe given. We show that the best designs here are efficient and deserve strong consideration in many practical situations.
Resumo:
In this paper new robust nonlinear model construction algorithms for a large class of linear-in-the-parameters models are introduced to enhance model robustness, including three algorithms using combined A- or D-optimality or PRESS statistic (Predicted REsidual Sum of Squares) with regularised orthogonal least squares algorithm respectively. A common characteristic of these algorithms is that the inherent computation efficiency associated with the orthogonalisation scheme in orthogonal least squares or regularised orthogonal least squares has been extended such that the new algorithms are computationally efficient. A numerical example is included to demonstrate effectiveness of the algorithms. Copyright (C) 2003 IFAC.
Resumo:
In this work the G(A)(0) distribution is assumed as the universal model for amplitude Synthetic Aperture (SAR) imagery data under the Multiplicative Model. The observed data, therefore, is assumed to obey a G(A)(0) (alpha; gamma, n) law, where the parameter n is related to the speckle noise, and (alpha, gamma) are related to the ground truth, giving information about the background. Therefore, maps generated by the estimation of (alpha, gamma) in each coordinate can be used as the input for classification methods. Maximum likelihood estimators are derived and used to form estimated parameter maps. This estimation can be hampered by the presence of corner reflectors, man-made objects used to calibrate SAR images that produce large return values. In order to alleviate this contamination, robust (M) estimators are also derived for the universal model. Gaussian Maximum Likelihood classification is used to obtain maps using hard-to-deal-with simulated data, and the superiority of robust estimation is quantitatively assessed.