968 results for statistical inference


Relevance:

60.00%

Publisher:

Abstract:

The controversy over the interpretation of DNA profile evidence in forensic identification can be attributed in part to confusion over the mode(s) of statistical inference appropriate to this setting. Although there has been substantial discussion in the literature of, for example, the role of population genetics issues, few authors have made explicit the inferential framework which underpins their arguments. This lack of clarity has led both to unnecessary debates over ill-posed or inappropriate questions and to the neglect of some issues which can have important consequences. We argue that the mode of statistical inference which seems to underlie the arguments of some authors, based on a hypothesis testing framework, is not appropriate for forensic identification. We propose instead a logically coherent framework in which, for example, the roles both of the population genetics issues and of the nonscientific evidence in a case are incorporated. Our analysis highlights several widely held misconceptions in the DNA profiling debate. For example, the profile frequency is not directly relevant to forensic inference. Further, very small match probabilities may in some settings be consistent with acquittal. Although DNA evidence is typically very strong, our analysis of the coherent approach highlights situations which can arise in practice where alternative methods for assessing DNA evidence may be misleading.

Relevance:

60.00%

Publisher:

Abstract:

Many long-lived marine species exhibit life history traits that make them more vulnerable to overexploitation. Accurate population trend analysis is essential for development and assessment of management plans for these species. However, because many of these species disperse over large geographic areas, have life stages inaccessible to human surveyors, and/or undergo complex developmental migrations, data on trends in abundance are often available for only one stage of the population, usually breeding adults. The green turtle (Chelonia mydas) is one of these long-lived species for which population trends are based almost exclusively on either numbers of females that emerge to nest or numbers of nests deposited each year on geographically restricted beaches. In this study, we generated estimates of annual abundance for juvenile green turtles at two foraging grounds in the Bahamas based on long-term capture-mark-recapture (CMR) studies at Union Creek (24 years) and Conception Creek (13 years), using a two-stage approach. First, we estimated recapture probabilities from CMR data using the Cormack-Jolly-Seber models in the software program MARK; second, we estimated annual abundance of green turtles at both study sites using the recapture probabilities in a Horvitz-Thompson type estimation procedure. Green turtle abundance did not change significantly in Conception Creek, but, in Union Creek, green turtle abundance had successive phases of significant increase, significant decrease, and stability. These changes in abundance resulted from changes in immigration, not survival or emigration. The trends in abundance on the foraging grounds did not conform to the significantly increasing trend for the major nesting population at Tortuguero, Costa Rica. This disparity highlights the challenges of assessing population-wide trends of green turtles and other long-lived species.
The best approach for monitoring population trends may be a combination of (1) extensive surveys to provide data for large-scale trends in relative population abundance, and (2) intensive surveys, using CMR techniques, to estimate absolute abundance and evaluate the demographic processes driving the trends.
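As a rough illustration of the second stage described in this abstract, a Horvitz-Thompson type estimate divides the number of animals caught on an occasion by the estimated probability of catching each one. The sketch below assumes a single recapture probability per year; the function names and all numbers are hypothetical and are not taken from the study.

```python
def horvitz_thompson_abundance(n_caught, p_capture):
    """Estimate abundance as (animals caught) / (estimated capture probability)."""
    if not 0.0 < p_capture <= 1.0:
        raise ValueError("capture probability must lie in (0, 1]")
    return n_caught / p_capture


def annual_abundance(counts, probabilities):
    """Apply the estimator year by year to paired counts and probabilities."""
    if len(counts) != len(probabilities):
        raise ValueError("counts and probabilities must have the same length")
    return [horvitz_thompson_abundance(n, p) for n, p in zip(counts, probabilities)]


# Hypothetical inputs: turtles caught per year and recapture probabilities
# that would, in practice, come from a fitted Cormack-Jolly-Seber model.
example_counts = [30, 45, 40]
example_probabilities = [0.5, 0.75, 0.8]
example_estimates = annual_abundance(example_counts, example_probabilities)
```

A lower recapture probability inflates the estimate for the same count, which is why the first-stage probabilities matter as much as the raw captures.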

Relevance:

60.00%

Publisher:

Abstract:

Vector error-correction models (VECMs) have become increasingly important in their application to financial markets. Standard full-order VECM models assume non-zero entries in all their coefficient matrices. However, applications of VECM models to financial market data have revealed that zero entries are often a necessary part of efficient modelling. In such cases, the use of full-order VECM models may lead to incorrect inferences. Specifically, if indirect causality or Granger non-causality exists among the variables, the use of over-parameterised full-order VECM models may weaken the power of statistical inference. In this paper, it is argued that the zero–non-zero (ZNZ) patterned VECM is a more straightforward and effective means of testing for both indirect causality and Granger non-causality. For a ZNZ patterned VECM framework for time series of integrated order two, we provide a new algorithm to select cointegrating and loading vectors that can contain zero entries. Two case studies are used to demonstrate the usefulness of the algorithm in tests of purchasing power parity and a three-variable system involving the stock market.

Relevance:

60.00%

Publisher:

Abstract:

Statistics is known to be an art as well as a science. The training of mathematical physicists predisposes them towards hypothesising plausible Bayesian priors. Tony Bracken and I were of that mind [1], but in our discussions we also recognised the Bayesian will-o'-the-wisp illustrated below.

Relevance:

60.00%

Publisher:

Abstract:

Neural networks can be regarded as statistical models, and can be analysed in a Bayesian framework. Generalisation is measured by the performance on independent test data drawn from the same distribution as the training data. Such performance can be quantified by the posterior average of the information divergence between the true and the model distributions. Averaging over the Bayesian posterior guarantees internal coherence; using information divergence guarantees invariance with respect to representation. The theory generalises the least mean squares theory for linear Gaussian models to general problems of statistical estimation. The main results are: (1) the ideal optimal estimate is always given by the average over the posterior; (2) the optimal estimate within a computational model is given by the projection of the ideal estimate to the model. This incidentally shows that some currently popular methods dealing with hyperpriors are in general unnecessary and misleading. The extension of information divergence to positive normalisable measures reveals a remarkable relation between the δ-dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd. It therefore offers conceptual simplification to information geometry. The general conclusion on the issue of evaluating neural network learning rules and other statistical inference methods is that such evaluations are only meaningful under three assumptions: the prior P(p), describing the environment of all the problems; the divergence Dd, specifying the requirement of the task; and the model Q, specifying available computing resources.

Relevance:

60.00%

Publisher:

Abstract:

The problem of evaluating different learning rules and other statistical estimators is analysed. A new general theory of statistical inference is developed by combining Bayesian decision theory with information geometry. It is coherent and invariant. For each sample a unique ideal estimate exists and is given by an average over the posterior. An optimal estimate within a model is given by a projection of the ideal estimate. The ideal estimate is a sufficient statistic of the posterior, so practical learning rules are functions of the ideal estimator. If the sole purpose of learning is to extract information from the data, the learning rule must also approximate the ideal estimator. This framework is applicable to both Bayesian and non-Bayesian methods, with arbitrary statistical models, and to supervised, unsupervised and reinforcement learning schemes.

Relevance:

60.00%

Publisher:

Abstract:

Online learning is discussed from the viewpoint of Bayesian statistical inference. By replacing the true posterior distribution with a simpler parametric distribution, one can define an online algorithm by a repetition of two steps: an update of the approximate posterior when a new example arrives, and an optimal projection into the parametric family. Choosing this family to be Gaussian, we show that the algorithm achieves asymptotic efficiency. An application to learning in single-layer neural networks is given.

Relevance:

60.00%

Publisher:

Abstract:

Accurate protein structure prediction remains an active objective of research in bioinformatics. Membrane proteins comprise approximately 20% of most genomes. They are, however, poorly tractable targets of experimental structure determination. Their analysis using bioinformatics thus makes an important contribution to their on-going study. Using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we have addressed the alignment-free discrimination of membrane from non-membrane proteins. The method successfully identifies prokaryotic and eukaryotic α-helical membrane proteins at 94.4% accuracy, β-barrel proteins at 72.4% accuracy, and distinguishes assorted non-membranous proteins with 85.9% accuracy. The method here is an important potential advance in the computational analysis of membrane protein structure. It represents a useful tool for the characterisation of membrane proteins with a wide variety of potential applications.

Relevance:

60.00%

Publisher:

Abstract:

Membrane proteins, which constitute approximately 20% of most genomes, are poorly tractable targets for experimental structure determination; analysis by prediction and modelling thus makes an important contribution to their on-going study. Membrane proteins form two main classes: alpha-helical and beta-barrel transmembrane proteins. By using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we addressed alpha-helical topology prediction. This method has accuracies of 77.4% for prokaryotic proteins and 61.4% for eukaryotic proteins. The method described here represents an important advance in the computational determination of membrane protein topology and offers a useful and complementary tool for the analysis of membrane proteins for a range of applications.

Relevance:

60.00%

Publisher:

Abstract:

Membrane proteins, which constitute approximately 20% of most genomes, form two main classes: alpha-helical and beta-barrel transmembrane proteins. Using methods based on Bayesian Networks, a powerful approach for statistical inference, we have sought to address beta-barrel topology prediction. The beta-barrel topology predictor reports individual strand accuracies of 88.6%. The method outlined here represents a potentially important advance in the computational determination of membrane protein topology.

Relevance:

60.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 60J80.

Relevance:

60.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62G32, 62G20.

Relevance:

60.00%

Publisher:

Abstract:

The multiple linear regression model plays a key role in statistical inference and has extensive applications in the business, environmental, physical and social sciences. Multicollinearity has been a considerable problem in multiple regression analysis. When the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. Several statistical methods can be used in this situation; those discussed in this thesis are the ridge regression, Liu, two-parameter biased and LASSO estimators. First, an analytical comparison on the basis of risk was made among the ridge, Liu and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge and Liu estimators over a significant portion of the parameter space for large dimension. Second, a simulation study was conducted to compare the performance of the ridge, Liu and two-parameter biased estimators by the mean squared error criterion. I found that the two-parameter biased estimator performs better than the corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
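Under an orthonormal design (X'X = I), the estimators compared in this abstract reduce to simple coordinate-wise transformations of the least squares coefficients. The sketch below shows those textbook closed forms; it is not the thesis's own code. The function names and tuning values are hypothetical, and the LASSO form assumes the objective is half the residual sum of squares plus `lam` times the L1 norm of the coefficients.

```python
import math


def ridge_orthonormal(b_ols, k):
    """Ridge with X'X = I: (X'X + kI)^{-1} X'y = b_ols / (1 + k)."""
    return [b / (1.0 + k) for b in b_ols]


def liu_orthonormal(b_ols, d):
    """Liu with X'X = I: (X'X + I)^{-1} (X'X + dI) b_ols = (1 + d) / 2 * b_ols."""
    return [(1.0 + d) / 2.0 * b for b in b_ols]


def lasso_orthonormal(b_ols, lam):
    """LASSO with X'X = I: soft-threshold each coefficient at lam.

    Assumes the objective 0.5 * ||y - Xb||^2 + lam * ||b||_1.
    """
    return [math.copysign(max(abs(b) - lam, 0.0), b) for b in b_ols]


# Hypothetical least squares coefficients and tuning parameters.
b_ols = [2.0, -0.5, 0.25]
ridge_fit = ridge_orthonormal(b_ols, k=1.0)    # uniform shrinkage
liu_fit = liu_orthonormal(b_ols, d=0.5)        # uniform shrinkage
lasso_fit = lasso_orthonormal(b_ols, lam=1.0)  # small coefficients set to zero
```

Ridge and Liu rescale every coefficient by the same factor, whereas LASSO sets small coefficients exactly to zero, which is one intuition for why their risks can differ when many true coefficients are near zero.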

Relevance:

60.00%

Publisher:

Abstract:

The objective of this work is to develop a methodology for representing an unknown soil through a horizontally stratified multilayer soil model, from which the engineer may carry out electrical grounding projects with high precision. The methodology uses the experimental apparent electrical resistivity curve, obtained from field measurements with a 4-wire earth ground resistance tester kit, along with calculations involving the measured resistance. This curve is then compared with the theoretical apparent resistivity curve, calculated for a horizontally stratified soil whose parameters are conjectured. These soil model parameters (the number of layers and the resistivity and thickness of each layer) are optimized by the Differential Evolution method, with performance enhanced through parallel computing, so that the two apparent resistivity curves come close enough for the unknown soil to be represented by the horizontal multilayer soil model fitted with the optimized parameters. To assist the Differential Evolution method when it stagnates for an arbitrary number of generations, a methodology for unsticking the optimization process is proposed; it expands the search space and tests new combinations, allowing the algorithm to find a better solution and/or leave local minima. An error improvement methodology is also proposed to smooth the error peaks between the apparent resistivity curves by giving other, more uniform solutions the opportunity to excel, improving the precision of the whole algorithm by minimizing the maximum error. Methodologies to verify the polynomial approximation of the soil characteristic function and the theoretical apparent resistivity calculations are also proposed, by including midpoints between the approximated ones in the verification.
Finally, a statistical evaluation procedure is presented to enable the classification of soil samples. The soil stratification methodology is applied to a control group formed by horizontally stratified soils. Using statistical inference, one may calculate the number of soils that, within an error margin, do not follow the horizontal multilayer model.
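For readers unfamiliar with the optimizer named in this abstract, the following is a minimal sketch of a common Differential Evolution variant (rand/1/bin). It is not the author's implementation: it omits the parallelism, the stagnation recovery, and the error smoothing described above, and it minimizes a toy quadratic instead of the mismatch between resistivity curves. All names and parameter values are assumptions chosen for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimize `objective` over box `bounds` with the rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Start from a population drawn uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for j in range(dim):
                if j == forced or rng.random() < cr:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_misfit(params):
    """Hypothetical stand-in for a curve-mismatch error; minimum at (1, -2)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


best_params, best_error = differential_evolution(
    toy_misfit, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In a soil-model fit, the objective would instead measure the distance between the measured and theoretical apparent resistivity curves, and the parameter vector would hold the layer resistivities and thicknesses.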

Relevance:

60.00%

Publisher:

Abstract:

Dengue is an important vector-borne virus that infects on the order of 400 million individuals per year. Infection with one of the virus's four serotypes (denoted DENV-1 to 4) may be silent, result in symptomatic dengue 'breakbone' fever, or develop into the more severe dengue hemorrhagic fever/dengue shock syndrome (DHF/DSS). Extensive research has therefore focused on identifying factors that influence dengue infection outcomes. It has been well-documented through epidemiological studies that DHF is most likely to result from a secondary heterologous infection, and that individuals experiencing a DENV-2 or DENV-3 infection typically are more likely to present with more severe dengue disease than those individuals experiencing a DENV-1 or DENV-4 infection. However, a mechanistic understanding of how these risk factors affect disease outcomes, and further, how the virus's ability to evolve these mechanisms will affect disease severity patterns over time, is lacking. In the second chapter of my dissertation, I formulate mechanistic mathematical models of primary and secondary dengue infections that describe how the dengue virus interacts with the immune response and the results of this interaction on the risk of developing severe dengue disease. I show that only the innate immune response is needed to reproduce characteristic features of a primary infection whereas the adaptive immune response is needed to reproduce characteristic features of a secondary dengue infection. I then add to these models a quantitative measure of disease severity that assumes immunopathology, and analyze the effectiveness of virological indicators of disease severity. In the third chapter of my dissertation, I then statistically fit these mathematical models to viral load data of dengue patients to understand the mechanisms that drive variation in viral load. 
I specifically consider the roles that immune status, clinical disease manifestation, and serotype may play in explaining viral load variation observed across the patients. With this analysis, I show that there is statistical support for the theory of antibody-dependent enhancement in the development of severe disease in secondary dengue infections and that there is statistical support for serotype-specific differences in viral infectivity rates, with infectivity rates of DENV-2 and DENV-3 exceeding those of DENV-1. In the fourth chapter of my dissertation, I integrate these within-host models with a vector-borne epidemiological model to understand the potential for virulence evolution in dengue. Critically, I show that dengue is expected to evolve towards intermediate virulence, and that the optimal virulence of the virus depends strongly on the number of serotypes that co-circulate. Together, these dissertation chapters show that dengue viral load dynamics provide insight into the within-host mechanisms driving differences in dengue disease patterns and that these mechanisms have important implications for dengue virulence evolution.