127 resultados para Test data generation
The sequential analysis of repeated binary responses: a score test for the case of three time points
Resumo:
In this paper a robust method is developed for the analysis of data consisting of repeated binary observations taken at up to three fixed time points on each subject. The primary objective is to compare outcomes at the last time point, using earlier observations to predict this for subjects with incomplete records. A score test is derived. The method is developed for application to sequential clinical trials, as at interim analyses there will be many incomplete records occurring in non-informative patterns. Motivation for the methodology comes from experience with clinical trials in stroke and head injury, and data from one such trial is used to illustrate the approach. Extensions to more than three time points and to allow for stratification are discussed. Copyright © 2005 John Wiley & Sons, Ltd.
Resumo:
The proportional odds model provides a powerful tool for analysing ordered categorical data and setting sample size, although for many clinical trials its validity is questionable. The purpose of this paper is to present a new class of constrained odds models which includes the proportional odds model. The efficient score and Fisher's information are derived from the profile likelihood for the constrained odds model. These results are new even for the special case of proportional odds where the resulting statistics define the Mann-Whitney test. A strategy is described involving selecting one of these models in advance, requiring assumptions as strong as those underlying proportional odds, but allowing a choice of such models. The accuracy of the new procedure and its power are evaluated.
Resumo:
The order Fabales, including Leguminosae, Polygalaceae, Quillajaceae and Surianaceae, represents a novel hypothesis emerging from angiosperm molecular phylogenies. Despite good support for the order, molecular studies to date have suggested contradictory, poorly supported interfamilial relationships. Our reappraisal of relationships within Fabales addresses past taxon sampling deficiencies, and employs parsimony and Bayesian approaches using sequences from the plastid regions rbcL (166 spp.) and matK (78 spp.). Five alternative hypotheses for interfamilial relationships within Fabales were recovered. The Shimodaira-Hasegawa test found the likelihood of a resolved topology significantly higher than the one calculated for a polytomy, but did not favour any of the alternative hypotheses of relationship within Fabales. In the light of the morphological evidence available and the comparative behavior of rbcL and matK, the topology recovering Polygalaceae as sister to the rest of the order Fabales with Leguminosae more closely related to Quillajaceae + Surianaceae, is considered the most likely hypothesis of interfamilial relationships of the order. Dating of selected crown clades in the Fabales phylogeny using penalized likelihood suggests rapid radiation of the Leguminosae, Polygalaceae, and (Quillajaceae + Surianaceae) crown clades.
Resumo:
The article considers screening human populations with two screening tests. If any of the two tests is positive, then full evaluation of the disease status is undertaken; however, if both diagnostic tests are negative, then disease status remains unknown. This procedure leads to a data constellation in which, for each disease status, the 2 x 2 table associated with the two diagnostic tests used in screening has exactly one empty, unknown cell. To estimate the unobserved cell counts, previous approaches assume independence of the two diagnostic tests and use specific models, including the special mixture model of Walter or unconstrained capture-recapture estimates. Often, as is also demonstrated in this article by means of a simple test, the independence of the two screening tests is not supported by the data. Two new estimators are suggested that allow associations of the screening test, although the form of association must be assumed to be homogeneous over disease status. These estimators are modifications of the simple capture-recapture estimator and easy to construct. The estimators are investigated for several screening studies with fully evaluated disease status in which the superior behavior of the new estimators compared to the previous conventional ones can be shown. Finally, the performance of the new estimators is compared with maximum likelihood estimators, which are more difficult to obtain in these models. The results indicate the loss of efficiency as minor.
Resumo:
We describe a simple comparative method for determining whether rates of diversification are correlated with continuous traits in species-level phylogenies. This involves comparing traits of species with net speciation rate (number of nodes linking extant species with the root divided by the root to tip evolutionary distance), using a phylogenetically corrected correlation. We use simulations to examine the power of this test. We find that the approach has acceptable power to uncover relationships between speciation and a continuous trait and is robust to background random extinction; however, the power of the approach is reduced when the rate of trait evolution is decreased. The test has low power to relate diversification to traits when extinction rate is correlated with the trait. Clearly, there are inherent limitations in using only data on extant species to infer correlates of extinction; however, this approach is potentially a powerful tool in analyzing correlates of speciation.
Resumo:
This paper considers methods for testing for superiority or non-inferiority in active-control trials with binary data, when the relative treatment effect is expressed as an odds ratio. Three asymptotic tests for the log-odds ratio based on the unconditional binary likelihood are presented, namely the likelihood ratio, Wald and score tests. All three tests can be implemented straightforwardly in standard statistical software packages, as can the corresponding confidence intervals. Simulations indicate that the three alternatives are similar in terms of the Type I error, with values close to the nominal level. However, when the non-inferiority margin becomes large, the score test slightly exceeds the nominal level. In general, the highest power is obtained from the score test, although all three tests are similar and the observed differences in power are not of practical importance. Copyright (C) 2007 John Wiley & Sons, Ltd.
Resumo:
This paper presents a reappraisal of the blood clotting response (BCR) tests for anticoagulant rodenticides, and proposes a standardised methodology for identifying and quantifying physiological resistance in populations of rodent species. The standardisation is based on the International Normalised Ratio, which is standardised against a WHO international reference preparation of thromboplastin, and allows comparison of data obtained using different thromboplastin reagents. ne methodology is statistically sound, being based on the 50% response, and has been validated against the Norway rat (Rattus norvegicus) and the house mouse (Mus domesticus). Susceptibility baseline data are presented for warfarin, diphacinone, chlorophacinone and coumatetralyl against the Norway rat, and for bromadiolone, difenacoum, difethialone, flocoumafen and brodifacoum against the Norway rat and the house mouse. A 'test dose' of twice the ED50 can be used for initial identification of resistance, and will provide a similar level of information to previously published methods. Higher multiples of the ED50 can be used to assess the resistance factor, and to predict the likely impact on field control.
Resumo:
Heterogeneity in lifetime data may be modelled by multiplying an individual's hazard by an unobserved frailty. We test for the presence of frailty of this kind in univariate and bivariate data with Weibull distributed lifetimes, using statistics based on the ordered Cox-Snell residuals from the null model of no frailty. The form of the statistics is suggested by outlier testing in the gamma distribution. We find through simulation that the sum of the k largest or k smallest order statistics, for suitably chosen k , provides a powerful test when the frailty distribution is assumed to be gamma or positive stable, respectively. We provide recommended values of k for sample sizes up to 100 and simple formulae for estimated critical values for tests at the 5% level.
Resumo:
Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm. to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker. demographic, and historical features. The method presented here is the only robust, model-based method available so far, which allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.
Resumo:
Severe acute respiratory syndrome (SARS) coronavirus (SCoV) spike (S) protein is the major surface antigen of the virus and is responsible for receptor binding and the generation of neutralizing antibody. To investigate SCoV S protein, full-length and individual domains of S protein were expressed on the surface of insect cells and were characterized for cleavability and reactivity with serum samples obtained from patients during the convalescent phase of SARS. S protein could be cleaved by exogenous trypsin but not by coexpressed furin, suggesting that the protein is not normally processed during infection. Reactivity was evident by both flow cytometry and Western blot assays, but the pattern of reactivity varied according to assay and sequence of the antigen. The antibody response to SCoV S protein involves antibodies to both linear and conformational epitopes, with linear epitopes associated with the carboxyl domain and conformational epitopes associated with the amino terminal domain. Recombinant SCoV S protein appears to be a suitable antigen for the development of an efficient and sensitive diagnostic test for SARS, but our data suggest that assay format and choice of S antigen are important considerations.
Resumo:
Quantitative control of aroma generation during the Maillard reaction presents great scientific and industrial interest. Although there have been many studies conducted in simplified model systems, the results are difficult to apply to complex food systems, where the presence of other components can have a significant impact. In this work, an aqueous extract of defatted beef liver was chosen as a simplified food matrix for studying the kinetics of the Mallard reaction. Aliquots of the extract were heated under different time and temperature conditions and analyzed for sugars, amino acids, and methylbutanals, which are important Maillard-derived aroma compounds formed in cooked meat. Multiresponse kinetic modeling, based on a simplified mechanistic pathway, gave a good fit with the experimental data, but only when additional steps were introduced to take into account the interactions of glucose and glucose-derived intermediates with protein and other amino compounds. This emphasizes the significant role of the food matrix in controlling the Maillard reaction.
Resumo:
Random number generation (RNG) is a functionally complex process that is highly controlled and therefore dependent on Baddeley's central executive. This study addresses this issue by investigating whether key predictions from this framework are compatible with empirical data. In Experiment 1, the effect of increasing task demands by increasing the rate of the paced generation was comprehensively examined. As expected, faster rates affected performance negatively because central resources were increasingly depleted. Next, the effects of participants' exposure were manipulated in Experiment 2 by providing increasing amounts of practice on the task. There was no improvement over 10 practice trials, suggesting that the high level of strategic control required by the task was constant and not amenable to any automatization gain with repeated exposure. Together, the results demonstrate that RNG performance is a highly controlled and demanding process sensitive to additional demands on central resources (Experiment 1) and is unaffected by repeated performance or practice (Experiment 2). These features render the easily administered RNG task an ideal and robust index of executive function that is highly suitable for repeated clinical use.
Resumo:
In this paper, we address issues in segmentation Of remotely sensed LIDAR (LIght Detection And Ranging) data. The LIDAR data, which were captured by airborne laser scanner, contain 2.5 dimensional (2.5D) terrain surface height information, e.g. houses, vegetation, flat field, river, basin, etc. Our aim in this paper is to segment ground (flat field)from non-ground (houses and high vegetation) in hilly urban areas. By projecting the 2.5D data onto a surface, we obtain a texture map as a grey-level image. Based on the image, Gabor wavelet filters are applied to generate Gabor wavelet features. These features are then grouped into various windows. Among these windows, a combination of their first and second order of statistics is used as a measure to determine the surface properties. The test results have shown that ground areas can successfully be segmented from LIDAR data. Most buildings and high vegetation can be detected. In addition, Gabor wavelet transform can partially remove hill or slope effects in the original data by tuning Gabor parameters.
Resumo:
As Terabyte datasets become the norm, the focus has shifted away from our ability to produce and store ever larger amounts of data, onto its utilization. It is becoming increasingly difficult to gain meaningful insights into the data produced. Also many forms of the data we are currently producing cannot easily fit into traditional visualization methods. This paper presents a new and novel visualization technique based on the concept of a Data Forest. Our Data Forest has been designed to be used with vir tual reality (VR) as its presentation method. VR is a natural medium for investigating large datasets. Our approach can easily be adapted to be used in a variety of different ways, from a stand alone single user environment to large multi-user collaborative environments. A test application is presented using multi-dimensional data to demonstrate the concepts involved.
Resumo:
As we increase our ability to produce and store ever larger amounts of data, it is becoming increasingly difficult to understand what the data is trying to tell us. Not all the data we are currently producing can easily fit into traditional visualization methods. This paper presents a new and novel visualization technique based on the concept of a Data Forest. Our Data Forest has been developed to be utilised by virtual reality (VR) systems. VR is a natural information medium. This approach can easily be adapted to be used in collaborative environments. A test application has been developed to demonstrate the concepts involved and a collaborative version tested.