75 resultados para Data-driven Methods

em CentAUR: Central Archive University of Reading - UK


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Seamless phase II/III clinical trials are conducted in two stages with treatment selection at the first stage. In the first stage, patients are randomized to a control or one of k > 1 experimental treatments. At the end of this stage, interim data are analysed, and a decision is made concerning which experimental treatment should continue to the second stage. If the primary endpoint is observable only after some period of follow-up, at the interim analysis data may be available on some early outcome on a larger number of patients than those for whom the primary endpoint is available. These early endpoint data can thus be used for treatment selection. For two previously proposed approaches, the power has been shown to be greater for one or other method depending on the true treatment effects and correlations. We propose a new approach that builds on the previously proposed approaches and uses data available at the interim analysis to estimate these parameters and then, on the basis of these estimates, chooses the treatment selection method with the highest probability of correctly selecting the most effective treatment. This method is shown to perform well compared with the two previously described methods for a wide range of true parameter values. In most cases, the performance of the new method is either similar to or, in some cases, better than either of the two previously proposed methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We investigate the error dynamics for cycled data assimilation systems, such that the inverse problem of state determination is solved at tk, k = 1, 2, 3, ..., with a first guess given by the state propagated via a dynamical system model from time tk − 1 to time tk. In particular, for nonlinear dynamical systems that are Lipschitz continuous with respect to their initial states, we provide deterministic estimates for the development of the error ||ek|| := ||x(a)k − x(t)k|| between the estimated state x(a) and the true state x(t) over time. Clearly, observation error of size δ > 0 leads to an estimation error in every assimilation step. These errors can accumulate, if they are not (a) controlled in the reconstruction and (b) damped by the dynamical system under consideration. A data assimilation method is called stable, if the error in the estimate is bounded in time by some constant C. The key task of this work is to provide estimates for the error ||ek||, depending on the size δ of the observation error, the reconstruction operator Rα, the observation operator H and the Lipschitz constants K(1) and K(2) on the lower and higher modes of controlling the damping behaviour of the dynamics. We show that systems can be stabilized by choosing α sufficiently small, but the bound C will then depend on the data error δ in the form c||Rα||δ with some constant c. Since ||Rα|| → ∞ for α → 0, the constant might be large. Numerical examples for this behaviour in the nonlinear case are provided using a (low-dimensional) Lorenz '63 system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Performance modelling is a useful tool in the lifeycle of high performance scientific software, such as weather and climate models, especially as a means of ensuring efficient use of available computing resources. In particular, sufficiently accurate performance prediction could reduce the effort and experimental computer time required when porting and optimising a climate model to a new machine. In this paper, traditional techniques are used to predict the computation time of a simple shallow water model which is illustrative of the computation (and communication) involved in climate models. These models are compared with real execution data gathered on AMD Opteron-based systems, including several phases of the U.K. academic community HPC resource, HECToR. Some success is had in relating source code to achieved performance for the K10 series of Opterons, but the method is found to be inadequate for the next-generation Interlagos processor. The experience leads to the investigation of a data-driven application benchmarking approach to performance modelling. Results for an early version of the approach are presented using the shallow model as an example.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We systematically compare the performance of ETKF-4DVAR, 4DVAR-BEN and 4DENVAR with respect to two traditional methods (4DVAR and ETKF) and an ensemble transform Kalman smoother (ETKS) on the Lorenz 1963 model. We specifically investigated this performance with increasing nonlinearity and using a quasi-static variational assimilation algorithm as a comparison. Using the analysis root mean square error (RMSE) as a metric, these methods have been compared considering (1) assimilation window length and observation interval size and (2) ensemble size to investigate the influence of hybrid background error covariance matrices and nonlinearity on the performance of the methods. For short assimilation windows with close to linear dynamics, it has been shown that all hybrid methods show an improvement in RMSE compared to the traditional methods. For long assimilation window lengths in which nonlinear dynamics are substantial, the variational framework can have diffculties fnding the global minimum of the cost function, so we explore a quasi-static variational assimilation (QSVA) framework. Of the hybrid methods, it is seen that under certain parameters, hybrid methods which do not use a climatological background error covariance do not need QSVA to perform accurately. Generally, results show that the ETKS and hybrid methods that do not use a climatological background error covariance matrix with QSVA outperform all other methods due to the full flow dependency of the background error covariance matrix which also allows for the most nonlinearity.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this lecture is to review recent development in data analysis, initialization and data assimilation. The development of 3-dimensional multivariate schemes has been very timely because of its suitability to handle the many different types of observations during FGGE. Great progress has taken place in the initialization of global models by the aid of non-linear normal mode technique. However, in spite of great progress, several fundamental problems are still unsatisfactorily solved. Of particular importance is the question of the initialization of the divergent wind fields in the Tropics and to find proper ways to initialize weather systems driven by non-adiabatic processes. The unsatisfactory ways in which such processes are being initialized are leading to excessively long spin-up times.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, various types of fault detection methods for fuel cells are compared. For example, those that use a model based approach or a data driven approach or a combination of the two. The potential advantages and drawbacks of each method are discussed and comparisons between methods are made. In particular, classification algorithms are investigated, which separate a data set into classes or clusters based on some prior knowledge or measure of similarity. In particular, the application of classification methods to vectors of reconstructed currents by magnetic tomography or to vectors of magnetic field measurements directly is explored. Bases are simulated using the finite integration technique (FIT) and regularization techniques are employed to overcome ill-posedness. Fisher's linear discriminant is used to illustrate these concepts. Numerical experiments show that the ill-posedness of the magnetic tomography problem is a part of the classification problem on magnetic field measurements as well. This is independent of the particular working mode of the cell but influenced by the type of faulty behavior that is studied. The numerical results demonstrate the ill-posedness by the exponential decay behavior of the singular values for three examples of fault classes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support a data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: The number and variety of available tools and its extensibility have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The contribution investigates the problem of estimating the size of a population, also known as the missing cases problem. Suppose a registration system is targeting to identify all cases having a certain characteristic such as a specific disease (cancer, heart disease, ...), disease related condition (HIV, heroin use, ...) or a specific behavior (driving a car without license). Every case in such a registration system has a certain notification history in that it might have been identified several times (at least once) which can be understood as a particular capture-recapture situation. Typically, cases are left out which have never been listed at any occasion, and it is this frequency one wants to estimate. In this paper modelling is concentrating on the counting distribution, e.g. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models like the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered providing rather flexible modelling tools. Estimation is done using maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in the year 2001 is completing the contribution.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A wireless sensor network (WSN) is a group of sensors linked by wireless medium to perform distributed sensing tasks. WSNs have attracted a wide interest from academia and industry alike due to their diversity of applications, including home automation, smart environment, and emergency services, in various buildings. The primary goal of a WSN is to collect data sensed by sensors. These data are characteristic of being heavily noisy, exhibiting temporal and spatial correlation. In order to extract useful information from such data, as this paper will demonstrate, people need to utilise various techniques to analyse the data. Data mining is a process in which a wide spectrum of data analysis methods is used. It is applied in the paper to analyse data collected from WSNs monitoring an indoor environment in a building. A case study is given to demonstrate how data mining can be used to optimise the use of the office space in a building.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Objective. This study investigated whether trait positive schizotypy or trait dissociation was associated with increased levels of data-driven processing and symptoms of post-traumatic distress following a road traffic accident. Methods. Forty-five survivors of road traffic accidents were recruited from a London Accident and Emergency service. Each completed measures of trait positive schizotypy, trait dissociation, data-driven processing, and post-traumatic stress. Results. Trait positive schizotypy was associated with increased levels of data-driven processing and post-traumatic symptoms during a road traffic accident, whereas trait dissociation was not. Conclusions. Previous results which report a significant relationship between trait dissociation and post-traumatic symptoms may be an artefact of the relationship between trait positive schizotypy and trait dissociation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Transient neural assemblies mediated by synchrony in particular frequency ranges are thought to underlie cognition. We propose a new approach to their detection, using empirical mode decomposition (EMD), a data-driven approach removing the need for arbitrary bandpass filter cut-offs. Phase locking is sought between modes. We explore the features of EMD, including making a quantitative assessment of its ability to preserve phase content of signals, and proceed to develop a statistical framework with which to assess synchrony episodes. Furthermore, we propose a new approach to ensure signal decomposition using EMD. We adapt the Hilbert spectrum to a time-frequency representation of phase locking and are able to locate synchrony successfully in time and frequency between synthetic signals reminiscent of EEG. We compare our approach, which we call EMD phase locking analysis (EMDPL) with existing methods and show it to offer improved time-frequency localisation of synchrony.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Although extensively studied within the lidar community, the multiple scattering phenomenon has always been considered a rare curiosity by radar meteorologists. Up to few years ago its appearance has only been associated with two- or three-body-scattering features (e.g. hail flares and mirror images) involving highly reflective surfaces. Recent atmospheric research aimed at better understanding of the water cycle and the role played by clouds and precipitation in affecting the Earth's climate has driven the deployment of high frequency radars in space. Examples are the TRMM 13.5 GHz, the CloudSat 94 GHz, the upcoming EarthCARE 94 GHz, and the GPM dual 13-35 GHz radars. These systems are able to detect the vertical distribution of hydrometeors and thus provide crucial feedbacks for radiation and climate studies. The shift towards higher frequencies increases the sensitivity to hydrometeors, improves the spatial resolution and reduces the size and weight of the radar systems. On the other hand, higher frequency radars are affected by stronger extinction, especially in the presence of large precipitating particles (e.g. raindrops or hail particles), which may eventually drive the signal below the minimum detection threshold. In such circumstances the interpretation of the radar equation via the single scattering approximation may be problematic. Errors will be large when the radiation emitted from the radar after interacting more than once with the medium still contributes substantially to the received power. This is the case if the transport mean-free-path becomes comparable with the instrument footprint (determined by the antenna beam-width and the platform altitude). This situation resembles to what has already been experienced in lidar observations, but with a predominance of wide- versus small-angle scattering events. At millimeter wavelengths, hydrometeors diffuse radiation rather isotropically compared to the visible or near infrared region where scattering is predominantly in the forward direction. A complete understanding of radiation transport modeling and data analysis methods under wide-angle multiple scattering conditions is mandatory for a correct interpretation of echoes observed by space-borne millimeter radars. This paper reviews the status of research in this field. Different numerical techniques currently implemented to account for higher order scattering are reviewed and their weaknesses and strengths highlighted. Examples of simulated radar backscattering profiles are provided with particular emphasis given to situations in which the multiple scattering contributions become comparable or overwhelm the single scattering signal. We show evidences of multiple scattering effects from air-borne and from CloudSat observations, i.e. unique signatures which cannot be explained by single scattering theory. Ideas how to identify and tackle the multiple scattering effects are discussed. Finally perspectives and suggestions for future work are outlined. This work represents a reference-guide for studies focused at modeling the radiation transport and at interpreting data from high frequency space-borne radar systems that probe highly opaque scattering media such as thick ice clouds or precipitating clouds.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Data assimilation is predominantly used for state estimation; combining observational data with model predictions to produce an updated model state that most accurately approximates the true system state whilst keeping the model parameters fixed. This updated model state is then used to initiate the next model forecast. Even with perfect initial data, inaccurate model parameters will lead to the growth of prediction errors. To generate reliable forecasts we need good estimates of both the current system state and the model parameters. This paper presents research into data assimilation methods for morphodynamic model state and parameter estimation. First, we focus on state estimation and describe implementation of a three dimensional variational(3D-Var) data assimilation scheme in a simple 2D morphodynamic model of Morecambe Bay, UK. The assimilation of observations of bathymetry derived from SAR satellite imagery and a ship-borne survey is shown to significantly improve the predictive capability of the model over a 2 year run. Here, the model parameters are set by manual calibration; this is laborious and is found to produce different parameter values depending on the type and coverage of the validation dataset. The second part of this paper considers the problem of model parameter estimation in more detail. We explain how, by employing the technique of state augmentation, it is possible to use data assimilation to estimate uncertain model parameters concurrently with the model state. This approach removes inefficiencies associated with manual calibration and enables more effective use of observational data. We outline the development of a novel hybrid sequential 3D-Var data assimilation algorithm for joint state-parameter estimation and demonstrate its efficacy using an idealised 1D sediment transport model. The results of this study are extremely positive and suggest that there is great potential for the use of data assimilation-based state-parameter estimation in coastal morphodynamic modelling.