75 results for Data-driven Methods
Abstract:
Estimating trajectories and parameters of dynamical systems from observations is a problem frequently encountered in various branches of science; geophysicists, for example, refer to this problem as data assimilation. Unlike in estimation problems with exchangeable observations, in data assimilation the observations cannot easily be divided into separate sets for estimation and validation; this creates serious problems, since simply using the same observations for estimation and validation might result in overly optimistic performance assessments. To circumvent this problem, a result is presented that allows this optimism to be estimated, permitting a more realistic performance assessment in data assimilation. The presented approach becomes particularly simple for data assimilation methods employing a linear error feedback (such as synchronization schemes, nudging, incremental 3D-Var and 4D-Var, and various Kalman filter approaches). Numerical examples considering a high-gain observer confirm the theory.
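To make the linear error-feedback idea concrete, below is a minimal sketch of a nudging scheme (which behaves as a high-gain observer for large gain) on the Lorenz-63 system. The gain K, time step, noise level, and the choice of observing only the first component are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of data assimilation by linear error feedback ("nudging"),
# illustrated on Lorenz-63; all parameter values are illustrative assumptions.
import numpy as np

def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 model."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

def nudge(x, y_obs, K, dt=0.01):
    """One Euler step of the model plus linear feedback toward the
    observation of the first component (a high-gain observer for large K)."""
    innovation = y_obs - x[0]  # observation-minus-forecast
    return x + dt * (lorenz63(x) + K * innovation * np.array([1.0, 0.0, 0.0]))

# Twin experiment: generate a "truth", observe x[0], assimilate with feedback.
rng = np.random.default_rng(0)
truth = np.array([1.0, 1.0, 1.0])
est = np.array([5.0, -5.0, 20.0])   # deliberately wrong initial state
K, dt = 20.0, 0.01                  # assumed gain and time step
for _ in range(5000):
    truth = truth + dt * lorenz63(truth)
    y = truth[0] + 0.1 * rng.standard_normal()  # noisy observation
    est = nudge(est, y, K, dt)
print("final estimation error:", np.linalg.norm(est - truth))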
Abstract:
We present a data-driven mathematical model of a key initiating step in platelet activation, a central process in the prevention of bleeding following injury. In vascular disease, this process is activated inappropriately and causes thrombosis, heart attacks and strokes. The collagen receptor GPVI is the primary trigger for platelet activation at sites of injury. Understanding the complex molecular mechanisms initiated by this receptor is important for the development of more effective antithrombotic medicines. In this work we developed a series of nonlinear ordinary differential equation models that are direct representations of biological hypotheses surrounding the initial steps in GPVI-stimulated signal transduction. At each stage, model simulations were compared to our own quantitative, high-temporal-resolution experimental data, which guided further experimental design, data collection and model refinement. Much is known about the linear forward reactions within platelet signalling pathways, but the roles of putative reverse reactions are poorly understood. An initial model, which included a simple constitutively active phosphatase, was unable to explain the experimental data. Model revisions incorporating a more complex pathway of interactions (and specifically the phosphatase TULA-2) provided a good description of the experimental data, both for observations of phosphorylation in samples from one donor and for those from a wider population. Our model was used to investigate the levels of proteins involved in regulating the pathway and the effect of the low GPVI levels that have been associated with disease. Results indicate a clear separation between healthy and GPVI-deficient states with respect to the signalling cascade dynamics associated with Syk tyrosine phosphorylation and activation. Our approach reveals the central importance of this negative feedback pathway, in which the temporal regulation of a specific class of protein tyrosine phosphatases controls the rate, and therefore extent, of GPVI-stimulated platelet activation.
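As a hedged illustration of the negative-feedback structure described (and not the authors' actual model), the toy ODE sketch below couples phospho-Syk to a Syk-induced phosphatase standing in for TULA-2; the two-variable reduction and all rate constants are assumptions.

```python
# Toy negative-feedback sketch: active Syk induces a phosphatase, which
# dephosphorylates Syk. Rate constants are illustrative assumptions only.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y, k_act=1.0, k_fb=2.0, k_p=0.5, k_deg=0.3):
    syk_p, phos = y                      # phospho-Syk, active phosphatase
    d_syk = k_act * (1.0 - syk_p) - k_fb * phos * syk_p
    d_phos = k_p * syk_p - k_deg * phos  # phosphatase induced by active Syk
    return [d_syk, d_phos]

sol = solve_ivp(rhs, (0.0, 20.0), [0.0, 0.0], dense_output=True)
t = np.linspace(0.0, 20.0, 200)
syk_p, phos = sol.sol(t)
# The feedback produces a transient peak in phospho-Syk followed by decline,
# the qualitative behaviour attributed to temporal phosphatase regulation.
print("peak phospho-Syk: %.3f at t = %.2f" % (syk_p.max(), t[syk_p.argmax()]))
```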
Abstract:
Nonlinear data assimilation is high on the agenda in all fields of the geosciences: with ever-increasing model resolution, the inclusion of more physical (and biological) processes, and more complex observation operators, the data-assimilation problem becomes more and more nonlinear. The suitability of particle filters for solving the nonlinear data-assimilation problem in high-dimensional geophysical problems is discussed. Several existing and new schemes are presented, and it is shown that at least one of them, the Equivalent-Weights Particle Filter, does indeed beat the curse of dimensionality and provides a way forward for solving the problem of nonlinear data assimilation in high-dimensional systems.
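For orientation, a minimal bootstrap particle filter (sequential importance resampling) on a scalar nonlinear model is sketched below. The Equivalent-Weights Particle Filter modifies the proposal step to control weight variance, which this baseline does not attempt; the dynamics, noise levels, and particle count are assumptions.

```python
# Minimal bootstrap particle filter for a scalar nonlinear state-space model.
import numpy as np

rng = np.random.default_rng(1)
N, T = 500, 50                                   # particles, time steps

def f(x):                                        # assumed nonlinear dynamics
    return 0.5 * x + 25.0 * x / (1.0 + x**2)

# Simulate a truth trajectory and observations y_t = x_t + noise.
x_true, ys = 0.0, []
for _ in range(T):
    x_true = f(x_true) + rng.standard_normal()
    ys.append(x_true + rng.standard_normal())

particles = rng.standard_normal(N)
for y in ys:
    particles = f(particles) + rng.standard_normal(N)  # propagate
    logw = -0.5 * (y - particles) ** 2                 # Gaussian log-likelihood
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)                   # resample
    particles = particles[idx]
print("posterior mean estimate:", particles.mean())
```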
Abstract:
Accurate knowledge of the location and magnitude of ocean heat content (OHC) variability and change is essential for understanding the processes that govern decadal variations in surface temperature, quantifying changes in the planetary energy budget, and developing constraints on the transient climate response to external forcings. We present an overview of the temporal and spatial characteristics of OHC variability and change as represented by an ensemble of dynamical and statistical ocean reanalyses (ORAs). Spatial maps of the 0–300 m layer show large regions of the Pacific and Indian Oceans where the interannual variability of the ensemble mean exceeds the ensemble spread, indicating that OHC variations are well constrained by the available observations over the period 1993–2009. At deeper levels, the ORAs are less well constrained by observations, with the largest differences across the ensemble mostly associated with areas of high eddy kinetic energy, such as the Southern Ocean and boundary current regions. Spatial patterns of OHC change for the period 1997–2009 show good agreement in the upper 300 m and are characterized by a strong dipole pattern in the Pacific Ocean. There is less agreement in the patterns of change at deeper levels, potentially linked to differences in the representation of ocean dynamics, such as water mass formation processes. However, the Atlantic and Southern Oceans are regions in which many ORAs show widespread warming below 700 m over the period 1997–2009. Annual time series of global and hemispheric OHC change for 0–700 m show the largest spread for the data-sparse Southern Hemisphere, and a number of ORAs appear to be subject to a large initialization ‘shock’ over the first few years. In agreement with previous studies, a number of ORAs exhibit enhanced ocean heat uptake below 300 m and 700 m during the mid-1990s or early 2000s. The ORA ensemble mean (±1 standard deviation) of rolling 5-year trends in full-depth OHC shows a relatively steady heat uptake of approximately 0.9 ± 0.8 W m⁻² (expressed relative to Earth’s surface area) between 1995 and 2002, which reduces to about 0.2 ± 0.6 W m⁻² between 2004 and 2006, in qualitative agreement with recent analyses of Earth’s energy imbalance. There is a marked reduction in the ensemble spread of OHC trends below 300 m as the Argo profiling-float observations become available in the early 2000s. In general, we suggest that ORAs should be treated with caution when employed to understand past ocean warming trends, especially for the deeper ocean, where there are few observational constraints. The current work emphasizes the need to better observe the deep ocean, both to provide observational constraints for future ocean state estimation efforts and to develop improved models and data assimilation methods.
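Two of the diagnostics used above, comparing ensemble-mean interannual variability against ensemble spread and computing rolling 5-year trends, can be sketched as follows; the synthetic data and array shapes are assumptions for illustration only.

```python
# Sketch of two ensemble diagnostics on synthetic OHC-like series.
import numpy as np

rng = np.random.default_rng(2)
n_oras, n_years = 8, 17                      # e.g. an ensemble over 1993-2009
ohc = rng.standard_normal((n_oras, n_years)).cumsum(axis=1)  # fake OHC series

ens_mean = ohc.mean(axis=0)
spread = ohc.std(axis=0)                     # inter-ORA spread per year
signal = ens_mean.std()                      # interannual variability of mean
print("signal-to-spread ratio:", signal / spread.mean())

# Rolling 5-year linear trends of the ensemble mean (units per year).
window = 5
trends = [np.polyfit(np.arange(window), ens_mean[i:i + window], 1)[0]
          for i in range(n_years - window + 1)]
print("rolling trends:", np.round(trends, 2))
```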
Abstract:
TIGGE was a major component of the THORPEX (The Observing System Research and Predictability Experiment) research program, whose aim is to accelerate improvements in forecasting high-impact weather. By providing ensemble prediction data from leading operational forecast centers, TIGGE has enhanced collaboration between the research and operational meteorological communities and enabled research studies on a wide range of topics. The paper covers the objective evaluation of the TIGGE data. For a range of forecast parameters, it is shown to be beneficial to combine ensembles from several data providers in a Multi-model Grand Ensemble. Alternative methods to correct systematic errors, including the use of reforecast data, are also discussed. TIGGE data have been used for a range of research studies on predictability and dynamical processes. Tropical cyclones are the most destructive weather systems in the world, and are a focus of multi-model ensemble research. Their extra-tropical transition also has a major impact on the skill of mid-latitude forecasts. We also review how TIGGE has added to our understanding of the dynamics of extra-tropical cyclones and storm tracks. Although TIGGE is a research project, it has proved invaluable for the development of products for future operational forecasting. Examples include the forecasting of tropical cyclone tracks, heavy rainfall, strong winds, and flood prediction through coupling hydrological models to ensembles. Finally, the paper considers the legacy of TIGGE. We discuss the priorities and key issues in predictability and ensemble forecasting, including the new opportunities of convective-scale ensembles, links with ensemble data assimilation methods, and extension of the range of useful forecast skill.
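As a rough sketch of the multi-model combination idea (not the TIGGE processing chain itself), the snippet below removes an assumed per-model bias, as a reforecast-based correction might, and pools the members into a grand ensemble; member counts, biases, and spread are all illustrative.

```python
# Minimal multi-model "grand ensemble": debias each provider, then pool.
import numpy as np

rng = np.random.default_rng(5)
# Three providers with different member counts and systematic biases (assumed).
specs = [(1.0, 20), (-0.5, 50), (0.3, 10)]            # (bias, n_members)
ensembles = [rng.normal(loc=b, scale=1.0, size=n) for b, n in specs]
reforecast_bias = [b for b, _ in specs]               # assumed known from reforecasts

grand = np.concatenate([ens - b for ens, b in zip(ensembles, reforecast_bias)])
print("grand-ensemble mean %.2f, spread %.2f" % (grand.mean(), grand.std()))
```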
Abstract:
In a sequential clinical trial, accrual of data on patients often continues after the stopping criterion for the study has been met. This is termed “overrunning.” Overrunning occurs mainly when the primary response from each patient is measured after some extended observation period. The objective of this article is to compare two methods of allowing for overrunning. In particular, simulation studies are reported that assess the two procedures in terms of how well they maintain the intended type I error rate. The effect on power of incorporating the “overrunning data” under the two procedures is also evaluated.
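A hedged sketch of the kind of simulation study described follows: it estimates the type I error rate of a naive analysis that simply pools overrunning data with the data available at the stopping time. The two-look design with a Pocock-type constant boundary is an illustrative assumption, not one of the two procedures compared in the article.

```python
# Simulate the type I error of naively re-testing after adding overrun data.
import numpy as np

rng = np.random.default_rng(3)
n_per_look, n_overrun = 50, 20
crit = 2.178                       # assumed Pocock bound, 2 looks, alpha = 0.05
rejections, n_sims = 0, 20000
for _ in range(n_sims):
    x = rng.standard_normal(2 * n_per_look + n_overrun)  # null: mean zero
    z1 = x[:n_per_look].mean() * np.sqrt(n_per_look)
    if abs(z1) > crit:
        n_stop = n_per_look                               # stopped at look 1
    else:
        z2 = x[:2 * n_per_look].mean() * np.sqrt(2 * n_per_look)
        if abs(z2) <= crit:
            continue                                      # trial never rejected
        n_stop = 2 * n_per_look                           # stopped at look 2
    n_final = n_stop + n_overrun                          # pool overrun data
    z_final = x[:n_final].mean() * np.sqrt(n_final)
    rejections += abs(z_final) > crit                     # naive re-analysis
print("empirical type I error:", rejections / n_sims)
```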
Abstract:
Background: Meta-analyses based on individual patient data (IPD) are regarded as the gold standard for systematic reviews. However, the methods used for analysing and presenting results from IPD meta-analyses have received little discussion. Methods: We review 44 IPD meta-analyses published during the years 1999–2001. We summarize whether they obtained all the data they sought, what types of approaches were used in the analysis, including assumptions of common or random effects, and how they examined the effects of covariates. Results: Twenty-four of the 44 analyses focused on time-to-event outcomes, and most analyses (28) estimated treatment effects within each trial and then combined the results assuming a common treatment effect across trials. Three analyses failed to stratify by trial, analysing the data as if they came from a single mega-trial. Only nine analyses used random-effects methods. Covariate-treatment interactions were generally investigated by subgrouping patients. Seven of the meta-analyses included data from less than 80% of the randomized patients sought, but did not address the resulting potential biases. Conclusions: Although IPD meta-analyses have many advantages in assessing the effects of health care, there are several aspects that could be further developed to make fuller use of the potential of these time-consuming projects. In particular, IPD could be used to investigate more fully the influence of covariates on heterogeneity of treatment effects, both within and between trials. The impact of heterogeneity, and the use of random effects, are seldom discussed. There is thus considerable scope for enhancing the methods of analysis and presentation of IPD meta-analysis.
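The dominant approach reported, estimating a treatment effect within each trial and then combining under a common-effect assumption, can be sketched as a standard inverse-variance weighted average; the trial-level effects and standard errors below are illustrative values, not data from the review.

```python
# Fixed-effect (common-effect) inverse-variance meta-analysis sketch.
import numpy as np

# Per-trial log hazard ratios and standard errors (illustrative values).
effects = np.array([-0.25, -0.10, -0.40, 0.05])
ses = np.array([0.12, 0.20, 0.15, 0.25])

w = 1.0 / ses**2                        # inverse-variance weights
pooled = np.sum(w * effects) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))
print("pooled effect: %.3f (95%% CI %.3f to %.3f)"
      % (pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se))

# Cochran's Q, a standard check on the common-effect assumption.
q = np.sum(w * (effects - pooled) ** 2)
print("Cochran's Q: %.2f on %d df" % (q, len(effects) - 1))
```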
Abstract:
The proportional odds model provides a powerful tool for analysing ordered categorical data and setting sample size, although for many clinical trials its validity is questionable. The purpose of this paper is to present a new class of constrained odds models that includes the proportional odds model. The efficient score and Fisher's information are derived from the profile likelihood for the constrained odds model. These results are new even for the special case of proportional odds, where the resulting statistics define the Mann-Whitney test. A strategy is described that involves selecting one of these models in advance, requiring assumptions as strong as those underlying proportional odds but allowing a choice of such models. The accuracy of the new procedure and its power are evaluated.
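The special case noted above, that under proportional odds the efficient score statistic defines the Mann-Whitney test, can be illustrated by applying the rank-sum test directly to ordered categorical responses; the 2 x 4 table of counts below is an assumed example.

```python
# Mann-Whitney test on ordered categorical data (illustrative counts).
import numpy as np
from scipy.stats import mannwhitneyu

categories = np.array([0, 1, 2, 3])            # ordered response levels
control = np.repeat(categories, [20, 15, 10, 5])
treated = np.repeat(categories, [10, 12, 15, 13])

# Normal approximation is appropriate here: few levels means heavy ties.
stat, p = mannwhitneyu(treated, control, alternative="two-sided",
                       method="asymptotic")
print("U = %.1f, p = %.4f" % (stat, p))
```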
Abstract:
Background: Molecular tools may help to uncover closely related and still-diverging species from a wide variety of taxa, and provide insight into the mechanisms, pace and geography of marine speciation. There is some controversy over the phylogeography and speciation modes of species groups with an Eastern Atlantic-Western Indian Ocean distribution, with previous studies suggesting that older events (Miocene) and/or more recent (Pleistocene) oceanographic processes could have influenced the phylogeny of marine taxa. The spiny lobster genus Palinurus allows speciation hypotheses to be tested against one another, since it has a particular distribution with two groups of three species each in the Northeastern Atlantic (P. elephas, P. mauritanicus and P. charlestoni) and the Southeastern Atlantic and Southwestern Indian Oceans (P. gilchristi, P. delagoae and P. barbarae). In the present study, we obtain a more complete understanding of the phylogenetic relationships among these species through a combined dataset with both nuclear and mitochondrial markers, testing alternative hypotheses on both the mutation rate and the tree topology under recently developed approximate Bayesian computation (ABC) methods. Results: Our analyses support a North-to-South speciation pattern in Palinurus, with all the South African species forming a monophyletic clade nested within the Northern Hemisphere species. Coalescent-based ABC methods allowed us to reject the previously proposed hypothesis of a Middle Miocene speciation event related to the closure of the Tethyan Seaway. Instead, divergence times obtained for Palinurus species using the combined mtDNA-microsatellite dataset and standard mutation rates for mtDNA agree with known glaciation-related processes occurring during the last 2 million years. Conclusion: The Palinurus speciation pattern is a typical example of a series of rapid speciation events occurring within a group, with very short branches separating the different species. Our results support the hypothesis that recent climate-change-related oceanographic processes have influenced the phylogeny of marine taxa, with most Palinurus species originating during the last two million years. The present study highlights the value of new coalescent-based statistical methods such as ABC for testing different speciation hypotheses using molecular data.
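A minimal rejection-sampling version of ABC, the class of method used in the study, is sketched below on a deliberately simple toy problem (inferring a divergence time from a mean pairwise-difference statistic); the mutation rate, prior, summary statistic, and tolerance are all illustrative assumptions.

```python
# Rejection ABC sketch: sample from the prior, simulate a summary statistic,
# keep draws whose statistic lies within a tolerance of the observation.
import numpy as np

rng = np.random.default_rng(4)
mu = 1e-3                              # assumed per-generation mutation rate
obs_diff = 4.0                         # "observed" mean pairwise differences

# Prior on divergence time (generations), then one Poisson simulation per draw
# (mutations accumulate on both lineages, hence the factor of 2).
prior = rng.uniform(0.0, 5000.0, size=200_000)
sims = rng.poisson(2.0 * mu * prior)

accepted = prior[np.abs(sims - obs_diff) <= 1.0]
print("posterior mean divergence time: %.0f generations" % accepted.mean())
```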
Abstract:
This paper considers methods for testing for superiority or non-inferiority in active-control trials with binary data, when the relative treatment effect is expressed as an odds ratio. Three asymptotic tests for the log-odds ratio based on the unconditional binary likelihood are presented, namely the likelihood ratio, Wald and score tests. All three tests can be implemented straightforwardly in standard statistical software packages, as can the corresponding confidence intervals. Simulations indicate that the three alternatives are similar in terms of type I error rate, with values close to the nominal level. However, when the non-inferiority margin becomes large, the score test slightly exceeds the nominal level. In general, the highest power is obtained from the score test, although all three tests are similar and the observed differences in power are not of practical importance.
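Of the three tests, the Wald version is the most direct to sketch; the snippet below tests non-inferiority on the log-odds-ratio scale with assumed counts and margin (the score and likelihood-ratio variants derived in the paper are not reproduced here).

```python
# Wald test for a log-odds ratio against a non-inferiority margin.
import numpy as np
from scipy.stats import norm

# Events / totals in the active-control and experimental arms (assumed).
e_ctrl, n_ctrl = 60, 100
e_trt, n_trt = 55, 100
margin = np.log(0.5)                   # assumed non-inferiority margin (log-OR)

log_or = (np.log(e_trt) - np.log(n_trt - e_trt)
          - np.log(e_ctrl) + np.log(n_ctrl - e_ctrl))
se = np.sqrt(1.0 / e_trt + 1.0 / (n_trt - e_trt)
             + 1.0 / e_ctrl + 1.0 / (n_ctrl - e_ctrl))

z = (log_or - margin) / se             # Wald statistic for H0: log-OR = margin
print("log-OR = %.3f, z = %.2f, one-sided p = %.4f"
      % (log_or, z, 1.0 - norm.cdf(z)))
```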