30 resultados para score test information matrix artificial regression
em CentAUR: Central Archive University of Reading - UK
The sequential analysis of repeated binary responses: a score test for the case of three time points
Resumo:
In this paper a robust method is developed for the analysis of data consisting of repeated binary observations taken at up to three fixed time points on each subject. The primary objective is to compare outcomes at the last time point, using earlier observations to predict this for subjects with incomplete records. A score test is derived. The method is developed for application to sequential clinical trials, as at interim analyses there will be many incomplete records occurring in non-informative patterns. Motivation for the methodology comes from experience with clinical trials in stroke and head injury, and data from one such trial is used to illustrate the approach. Extensions to more than three time points and to allow for stratification are discussed. Copyright © 2005 John Wiley & Sons, Ltd.
Resumo:
A score test is developed for binary clinical trial data, which incorporates patient non-compliance while respecting randomization. It is assumed in this paper that compliance is all-or-nothing, in the sense that a patient either accepts all of the treatment assigned as specified in the protocol, or none of it. Direct analytic comparisons of the adjusted test statistic for both the score test and the likelihood ratio test are made with the corresponding test statistics that adhere to the intention-to-treat principle. It is shown that no gain in power is possible over the intention-to-treat analysis, by adjusting for patient non-compliance. Sample size formulae are derived and simulation studies are used to demonstrate that the sample size approximation holds. Copyright © 2003 John Wiley & Sons, Ltd.
Resumo:
There is growing interest, especially for trials in stroke, in combining multiple endpoints in a single clinical evaluation of an experimental treatment. The endpoints might be repeated evaluations of the same characteristic or alternative measures of progress on different scales. Often they will be binary or ordinal, and those are the cases studied here. In this paper we take a direct approach to combining the univariate score statistics for comparing treatments with respect to each endpoint. The correlations between the score statistics are derived and used to allow a valid combined score test to be applied. A sample size formula is deduced and application in sequential designs is discussed. The method is compared with an alternative approach based on generalized estimating equations in an illustrative analysis and replicated simulations, and the advantages and disadvantages of the two approaches are discussed.
Resumo:
This paper considers methods for testing for superiority or non-inferiority in active-control trials with binary data, when the relative treatment effect is expressed as an odds ratio. Three asymptotic tests for the log-odds ratio based on the unconditional binary likelihood are presented, namely the likelihood ratio, Wald and score tests. All three tests can be implemented straightforwardly in standard statistical software packages, as can the corresponding confidence intervals. Simulations indicate that the three alternatives are similar in terms of the Type I error, with values close to the nominal level. However, when the non-inferiority margin becomes large, the score test slightly exceeds the nominal level. In general, the highest power is obtained from the score test, although all three tests are similar and the observed differences in power are not of practical importance. Copyright (C) 2007 John Wiley & Sons, Ltd.
Resumo:
Hydrophobic chemicals are known to associate with sediment particles including those from both suspended particulate matter and bottom deposits. The complex and variable composition of natural particles makes it very difficult therefore, to predict the bioavailability of sediment-bound contaminants. To overcome these problems we have previously devised a test system using artificial particles, with or without humic acids, for use as an experimental model of natural sediments. In the present work we have applied this experimental technique to investigate the bioavailability and bioaccumulation of pyrene by the freshwater fingernail clam Sphaerium corneum. The uptake and accumulation of pyrene in clams exposed to the chemical in the presence of a sample of natural sediment was also investigated. According to the results obtained, particle surface properties and organic matter content are the key factors for assessing the bioavailability and bioaccumulation of pyrene by clams. (C) 2002 Elsevier Science Ltd. All rights reserved.
Resumo:
Multi-agent systems have been adopted to build intelligent environment in recent years. It was claimed that energy efficiency and occupants' comfort were the most important factors for evaluating the performance of modem work environment, and multi-agent systems presented a viable solution to handling the complexity of dynamic building environment. While previous research has made significant advance in some aspects, the proposed systems or models were often not applicable in a "shared environment". This paper introduces an ongoing project on multi-agent for building control, which aims to achieve both energy efficiency and occupants' comfort in a shared environment.
Resumo:
Motivation: In order to enhance genome annotation, the fully automatic fold recognition method GenTHREADER has been improved and benchmarked. The previous version of GenTHREADER consisted of a simple neural network which was trained to combine sequence alignment score, length information and energy potentials derived from threading into a single score representing the relationship between two proteins, as designated by CATH. The improved version incorporates PSI-BLAST searches, which have been jumpstarted with structural alignment profiles from FSSP, and now also makes use of PSIPRED predicted secondary structure and bi-directional scoring in order to calculate the final alignment score. Pairwise potentials and solvation potentials are calculated from the given sequence alignment which are then used as inputs to a multi-layer, feed-forward neural network, along with the alignment score, alignment length and sequence length. The neural network has also been expanded to accommodate the secondary structure element alignment (SSEA) score as an extra input and it is now trained to learn the FSSP Z-score as a measurement of similarity between two proteins. Results: The improvements made to GenTHREADER increase the number of remote homologues that can be detected with a low error rate, implying higher reliability of score, whilst also increasing the quality of the models produced. We find that up to five times as many true positives can be detected with low error rate per query. Total MaxSub score is doubled at low false positive rates using the improved method.
Resumo:
Conventional economic theory, applied to information released by listed companies, equates ‘useful’ with ‘price-sensitive’. Stock exchange rules accordingly prohibit the selec- tive, private communication of price-sensitive information. Yet, even in the absence of such communication, UK equity fund managers routinely meet privately with the senior execu- tives of the companies in which they invest. Moreover, they consider these brief, formal and formulaic meetings to be their most important sources of investment information. In this paper we ask how that can be. Drawing on interview and observation data with fund managers and CFOs, we find evidence for three, non-mutually exclusive explanations: that the characterisation of information in conventional economic theory is too restricted, that fund managers fail to act with the rationality that conventional economic theory assumes, and/or that the primary value of the meetings for fund managers is not related to their investment decision making but to the claims of superior knowledge made to clients in marketing their active fund management expertise. Our findings suggest a disconnect between economic theory and economic policy based on that theory, as well as a corre- sponding limitation in research studies that test information-usefulness by assuming it to be synonymous with price-sensitivity. We draw implications for further research into the role of tacit knowledge in equity investment decision-making, and also into the effects of the principal–agent relationship between fund managers and their clients.
Resumo:
This paper presents an approximate closed form sample size formula for determining non-inferiority in active-control trials with binary data. We use the odds-ratio as the measure of the relative treatment effect, derive the sample size formula based on the score test and compare it with a second, well-known formula based on the Wald test. Both closed form formulae are compared with simulations based on the likelihood ratio test. Within the range of parameter values investigated, the score test closed form formula is reasonably accurate when non-inferiority margins are based on odds-ratios of about 0.5 or above and when the magnitude of the odds ratio under the alternative hypothesis lies between about 1 and 2.5. The accuracy generally decreases as the odds ratio under the alternative hypothesis moves upwards from 1. As the non-inferiority margin odds ratio decreases from 0.5, the score test closed form formula increasingly overestimates the sample size irrespective of the magnitude of the odds ratio under the alternative hypothesis. The Wald test closed form formula is also reasonably accurate in the cases where the score test closed form formula works well. Outside these scenarios, the Wald test closed form formula can either underestimate or overestimate the sample size, depending on the magnitude of the non-inferiority margin odds ratio and the odds ratio under the alternative hypothesis. Although neither approximation is accurate for all cases, both approaches lead to satisfactory sample size calculation for non-inferiority trials with binary data where the odds ratio is the parameter of interest.
Resumo:
This paper presents an efficient construction algorithm for obtaining sparse kernel density estimates based on a regression approach that directly optimizes model generalization capability. Computational efficiency of the density construction is ensured using an orthogonal forward regression, and the algorithm incrementally minimizes the leave-one-out test score. A local regularization method is incorporated naturally into the density construction process to further enforce sparsity. An additional advantage of the proposed algorithm is that it is fully automatic and the user is not required to specify any criterion to terminate the density construction procedure. This is in contrast to an existing state-of-art kernel density estimation method using the support vector machine (SVM), where the user is required to specify some critical algorithm parameter. Several examples are included to demonstrate the ability of the proposed algorithm to effectively construct a very sparse kernel density estimate with comparable accuracy to that of the full sample optimized Parzen window density estimate. Our experimental results also demonstrate that the proposed algorithm compares favorably with the SVM method, in terms of both test accuracy and sparsity, for constructing kernel density estimates.
Resumo:
Using the classical Parzen window (PW) estimate as the target function, the sparse kernel density estimator is constructed in a forward constrained regression manner. The leave-one-out (LOO) test score is used for kernel selection. The jackknife parameter estimator subject to positivity constraint check is used for the parameter estimation of a single parameter at each forward step. As such the proposed approach is simple to implement and the associated computational cost is very low. An illustrative example is employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with comparable accuracy to that of the classical Parzen window estimate.
Resumo:
We propose a novel method for scoring the accuracy of protein binding site predictions – the Binding-site Distance Test (BDT) score. Recently, the Matthews Correlation Coefficient (MCC) has been used to evaluate binding site predictions, both by developers of new methods and by the assessors for the community wide prediction experiment – CASP8. Whilst being a rigorous scoring method, the MCC does not take into account the actual 3D location of the predicted residues from the observed binding site. Thus, an incorrectly predicted site that is nevertheless close to the observed binding site will obtain an identical score to the same number of nonbinding residues predicted at random. The MCC is somewhat affected by the subjectivity of determining observed binding residues and the ambiguity of choosing distance cutoffs. By contrast the BDT method produces continuous scores ranging between 0 and 1, relating to the distance between the predicted and observed residues. Residues predicted close to the binding site will score higher than those more distant, providing a better reflection of the true accuracy of predictions. The CASP8 function predictions were evaluated using both the MCC and BDT methods and the scores were compared. The BDT was found to strongly correlate with the MCC scores whilst also being less susceptible to the subjectivity of defining binding residues. We therefore suggest that this new simple score is a potentially more robust method for future evaluations of protein-ligand binding site predictions.
Resumo:
The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis - the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any sub-set of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system. Copyright © 2004 Royal Meteorological Society