85 results for Data replication processes
Abstract:
In the last decade, a vast number of land surface schemes have been designed for use in global climate models, atmospheric weather prediction, mesoscale numerical models, ecological models, and models of global change. Since land surface schemes are designed for different purposes, they have various levels of complexity in the treatment of bare soil processes, vegetation, and soil water movement. This paper is a contribution to the small group of papers dealing with the intercomparison of differently designed and oriented land surface schemes. For that purpose we have chosen three schemes, classified according to Shao et al. (1995): i) global climate models, BATS (Dickinson et al., 1986; Dickinson et al., 1992); ii) mesoscale and ecological models, LEAF (Lee, 1992); and iii) mesoscale models, LAPS (Mihailović, 1996; Mihailović and Kallos, 1997; Mihailović et al., 1999). These schemes were compared using surface fluxes and leaf temperature outputs obtained by time integrations of data sets derived from the micrometeorological measurements above a maize field at an experimental site in De Sinderhoeve (The Netherlands) for 18 August, 8 September, and 4 October 1988. Finally, the comparison of the schemes was supported by applying a simple statistical analysis to the surface flux outputs.
Abstract:
This dissertation deals with aspects of sequential data assimilation (in particular ensemble Kalman filtering) and numerical weather forecasting. In the first part, the recently formulated ensemble Kalman-Bucy filter (EnKBF) is revisited. It is shown that the previously used numerical integration scheme fails when the magnitude of the background error covariance grows beyond that of the observational error covariance in the forecast window. Therefore, we present a suitable integration scheme that handles the stiffening of the differential equations involved and does not incur additional computational expense. Moreover, a transform-based alternative to the EnKBF is developed: under this scheme, the operations are performed in the ensemble space instead of in the state space. Advantages of this formulation are explained. For the first time, the EnKBF is implemented in an atmospheric model. The second part of this work deals with ensemble clustering, a phenomenon that arises when performing data assimilation using deterministic ensemble square root filters (EnSRFs) in highly nonlinear forecast models. Namely, an M-member ensemble detaches into an outlier and a cluster of M-1 members. Previous works may suggest that this issue represents a failure of EnSRFs; this work dispels that notion. It is shown that ensemble clustering can also be reversed by nonlinear processes, in particular the alternation between nonlinear expansion and compression of the ensemble in different regions of the attractor. Some EnSRFs that use random rotations have been developed to overcome this issue; these formulations are analyzed and their advantages and disadvantages with respect to common EnSRFs are discussed. The third and last part contains the implementation of the Robert-Asselin-Williams (RAW) filter in an atmospheric model.
The RAW filter is an improvement to the widely popular Robert-Asselin filter that successfully suppresses spurious computational waves while avoiding any distortion in the mean value of the function. Using statistical significance tests at both the local and field level, it is shown that the climatology of the SPEEDY model is not modified by the changed time stepping scheme; hence, no retuning of the parameterizations is required. It is found that the accuracy of the medium-term forecasts is increased by using the RAW filter.
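As an illustration of the filter described in this abstract, the following is a minimal sketch of one RAW filter step combined with leapfrog time stepping for the scalar oscillation equation dx/dt = iωx. It follows the commonly cited formulation of the filter and is not taken from the dissertation itself; the function names, the test equation, and the parameter values (`nu=0.2`, `alpha=0.53`) are assumptions chosen for illustration. Setting `alpha=1.0` gives the classical Robert-Asselin filter, while `alpha=0.5` leaves the sum of the two modified time levels unchanged.

```python
import cmath


def raw_filter(x_prev_f, x_curr, x_next, nu=0.2, alpha=0.53):
    """Apply one RAW filter step to three consecutive time levels.

    x_prev_f : fully filtered value at step n-1
    x_curr   : once-adjusted value at step n
    x_next   : unfiltered leapfrog value at step n+1
    Returns (filtered value at n, adjusted value at n+1).
    nu and alpha are illustrative defaults, not values from the source.
    """
    # Filter displacement, proportional to the discrete second time derivative.
    d = 0.5 * nu * (x_prev_f - 2.0 * x_curr + x_next)
    # alpha splits the displacement between time levels n and n+1.
    return x_curr + alpha * d, x_next - (1.0 - alpha) * d


def leapfrog_oscillator(omega, dt, n_steps, nu=0.2, alpha=0.53):
    """Integrate dx/dt = i*omega*x with leapfrog + RAW; return the final |x|."""
    x_prev = 1.0 + 0.0j
    x_curr = cmath.exp(1j * omega * dt)  # exact first step to start leapfrog
    for _ in range(n_steps):
        x_next = x_prev + 2.0 * dt * (1j * omega * x_curr)
        x_prev, x_curr = raw_filter(x_prev, x_curr, x_next, nu, alpha)
    return abs(x_curr)


if __name__ == "__main__":
    # The exact solution keeps |x| = 1; compare how each filter changes it.
    print("Robert-Asselin (alpha=1.0):", leapfrog_oscillator(1.0, 0.1, 200, alpha=1.0))
    print("RAW (alpha=0.53):          ", leapfrog_oscillator(1.0, 0.1, 200, alpha=0.53))
```

Running the script prints the final amplitude under each filter, which gives a rough sense of how much each one damps the physical mode for this toy problem.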
Abstract:
Data assimilation algorithms are a crucial part of operational systems in numerical weather prediction, hydrology and climate science, but are also important for dynamical reconstruction in medical applications and quality control for manufacturing processes. Usually, a variety of diverse measurement data are employed to determine the state of the atmosphere or of a wider system including land and oceans. Modern data assimilation systems use more and more remote sensing data, in particular radiances measured by satellites, radar data and integrated water vapor measurements via GPS/GNSS signals. The inversion of some of these measurements is ill-posed in the classical sense, i.e. the inverse of the operator H which maps the state onto the data is unbounded. In this case, the use of such data can lead to significant instabilities of data assimilation algorithms. The goal of this work is to provide a rigorous mathematical analysis of the instability of well-known data assimilation methods. Here, we restrict our attention to particular linear systems, in which the instability can be explicitly analyzed. We investigate three-dimensional variational assimilation and four-dimensional variational assimilation. A theory for the instability is developed using the classical theory of ill-posed problems in a Banach space framework. Further, we demonstrate by numerical examples that instabilities can and will occur, including an example from dynamic magnetic tomography.
Abstract:
OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Abstract:
Advances in hardware and software in the past decade make it possible to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances in order to cope with the real-time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data from continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time, as soon as it is captured. This is the case, for example, when the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule-based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new rules and removing old ones. It differs from the more popular decision-tree-based classifiers in that it tends to leave data instances unclassified rather than force a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.
Abstract:
Hepatitis C virus (HCV) infection is associated with dysregulation of both lipid and glucose metabolism. As well as contributing to viral replication, these perturbations influence the pathogenesis associated with the virus, including steatosis, insulin resistance, and type 2 diabetes. AMP-activated protein kinase (AMPK) plays a key role in regulation of both lipid and glucose metabolism. We show here that, in cells either infected with HCV or harboring an HCV subgenomic replicon, phosphorylation of AMPK at threonine 172 and concomitant AMPK activity are dramatically reduced. We demonstrate that this effect is mediated by activation of the serine/threonine kinase, protein kinase B, which inhibits AMPK by phosphorylating serine 485. The physiological significance of this inhibition is demonstrated by the observation that pharmacological restoration of AMPK activity not only abrogates the lipid accumulation observed in virus-infected and subgenomic replicon-harboring cells but also efficiently inhibits viral replication. These data demonstrate that inhibition of AMPK is required for HCV replication and that the restoration of AMPK activity may present a target for much needed anti-HCV therapies.
Abstract:
Data from civil engineering projects can inform the operation of built infrastructure. This paper captures lessons for such data handover, from projects into operations, through interviews with leading clients and their supply chain. Clients are found to value receiving accurate and complete data. They recognise opportunities to use high-quality information in decision-making about capital and operational expenditure, as well as in ensuring compliance with regulatory requirements. Providing this value to clients is a motivation for information management in projects. However, data handover is difficult because key people leave before project completion, and different data formats and structures are used in project delivery and operations. Lessons learnt from leading practice include defining data requirements at the outset, getting operations teams involved early, shaping the evolution of interoperable systems and standards, developing handover processes to check data rather than documentation, and fostering skills to use and update project data in operations.
Abstract:
The purpose of this lecture is to review recent developments in data analysis, initialization and data assimilation. The development of 3-dimensional multivariate schemes has been very timely because of their suitability for handling the many different types of observations made during FGGE. Great progress has taken place in the initialization of global models with the aid of the non-linear normal mode technique. However, in spite of this progress, several fundamental problems are still unsatisfactorily solved. Of particular importance are the initialization of the divergent wind fields in the Tropics and the search for proper ways to initialize weather systems driven by non-adiabatic processes. The unsatisfactory ways in which such processes are being initialized are leading to excessively long spin-up times.
Abstract:
In this paper we report on a study conducted using the Middle Atmospheric Nitrogen TRend Assessment (MANTRA) balloon measurements of stratospheric constituents and temperature and the Canadian Middle Atmosphere Model (CMAM). Three different kinds of data are used to assess the inter-consistency of the combined dataset: single profiles of long-lived species from MANTRA 1998, sparse climatologies from the ozonesonde measurements during the four MANTRA campaigns and from HALOE satellite measurements, and the CMAM climatology. In doing so, we evaluate the ability of the model to reproduce the measured fields and thereby test our ability to describe mid-latitude summertime stratospheric processes. The MANTRA campaigns were conducted at Vanscoy, Saskatchewan, Canada (52° N, 107° W) in late August and early September of 1998, 2000, 2002 and 2004. During late summer at mid-latitudes, the stratosphere is close to photochemical control, providing an ideal scenario for the study reported here. From this analysis we find that: (1) reducing the value for the vertical diffusion coefficient in CMAM to a more physically reasonable value results in the model better reproducing the measured profiles of long-lived species; (2) the existence of compact correlations among the constituents, as expected from independent measurements in the literature and from models, confirms the self-consistency of the MANTRA measurements; and (3) the 1998 measurements show structures in the chemical species profiles that can be associated with transport, adding to the growing evidence that the summertime stratosphere can be much more disturbed than anticipated. The mechanisms responsible for such disturbances need to be understood in order to assess the representativeness of the measurements and to isolate long-term trends.
Abstract:
A novel version of the classical surface pressure tendency equation (PTE) is applied to ERA-Interim reanalysis data to quantitatively assess the contribution of diabatic processes to the deepening of extratropical cyclones relative to the effects of temperature advection and vertical motions. The five cyclone cases selected, Lothar and Martin in December 1999, Kyrill in January 2007, Klaus in January 2009, and Xynthia in February 2010, all showed explosive deepening and brought considerable damage to parts of Europe. For Xynthia, Klaus and Lothar, diabatic processes contribute more to the observed surface pressure fall than horizontal temperature advection during their respective explosive deepening phases, while Kyrill and Martin appear to be more baroclinically driven storms. The powerful new diagnostic tool presented here can easily be applied to large numbers of cyclones and will help to better understand the role of diabatic processes in future changes in extratropical storminess.
Abstract:
It is widely accepted that some of the most accurate Value-at-Risk (VaR) estimates are based on an appropriately specified GARCH process. But when the forecast horizon is greater than the frequency of the GARCH model, such predictions have typically required time-consuming simulations of the aggregated returns distributions. This paper shows that fast, quasi-analytic GARCH VaR calculations can be based on new formulae for the first four moments of aggregated GARCH returns. Our extensive empirical study compares the Cornish–Fisher expansion with the Johnson SU distribution for fitting distributions to analytic moments of normal and Student t, symmetric and asymmetric (GJR) GARCH processes to returns data on different financial assets, for the purpose of deriving accurate GARCH VaR forecasts over multiple horizons and significance levels.
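To make the moment-based approach in this abstract concrete, here is a minimal sketch of the standard fourth-order Cornish-Fisher quantile approximation turned into a VaR figure. It does not reproduce the paper's formulae for the moments of aggregated GARCH returns; the mean, standard deviation, skewness and excess kurtosis are simply taken as inputs, and the function names and the numbers in the usage example are hypothetical.

```python
from statistics import NormalDist


def cornish_fisher_quantile(p, skew, excess_kurt):
    """Approximate the p-quantile of a standardized distribution.

    Uses the standard fourth-order Cornish-Fisher expansion around the
    normal quantile z, with skewness and excess kurtosis as inputs.
    """
    z = NormalDist().inv_cdf(p)
    return (z
            + (z ** 2 - 1.0) * skew / 6.0
            + (z ** 3 - 3.0 * z) * excess_kurt / 24.0
            - (2.0 * z ** 3 - 5.0 * z) * skew ** 2 / 36.0)


def cornish_fisher_var(p, mean, std, skew, excess_kurt):
    """Value-at-Risk at significance level p, reported as a positive loss."""
    return -(mean + std * cornish_fisher_quantile(p, skew, excess_kurt))


if __name__ == "__main__":
    # Hypothetical moments of a 10-day aggregated return (illustration only).
    var_1pct = cornish_fisher_var(0.01, mean=0.0, std=0.04,
                                  skew=-0.5, excess_kurt=1.0)
    print(f"1% 10-day VaR: {var_1pct:.4f}")
```

With zero skewness and zero excess kurtosis the expansion reduces to the normal quantile, which is a convenient sanity check. The approximation is known to degrade for strongly non-normal moments, which is one reason to compare it with a fitted distribution such as the Johnson SU.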
Abstract:
Sub-seasonal variability, including equatorial waves, significantly influences the dehydration and transport processes in the tropical tropopause layer (TTL). This study investigates the wave activity in the TTL in 7 reanalysis data sets (RAs; NCEP1, NCEP2, ERA40, ERA-Interim, JRA25, MERRA, and CFSR) and 4 chemistry climate models (CCMs; CCSRNIES, CMAM, MRI, and WACCM) using the zonal wave number-frequency spectral analysis method with equatorially symmetric-antisymmetric decomposition. Analyses are made for temperature and horizontal winds at 100 hPa in the RAs and CCMs and for outgoing longwave radiation (OLR), which is a proxy for convective activity that generates tropopause-level disturbances, in satellite data and the CCMs. Particular focus is placed on equatorial Kelvin waves, mixed Rossby-gravity (MRG) waves, and the Madden-Julian Oscillation (MJO). The wave activity is defined as the variance, i.e., the power spectral density integrated over a particular zonal wave number-frequency region. It is found that the TTL wave activities show significant differences among the RAs, ranging from ∼0.7 (for NCEP1 and NCEP2) to ∼1.4 (for ERA-Interim, MERRA, and CFSR) relative to the average over the RAs. The TTL activities in the CCMs lie generally within the range of those in the RAs, with a few exceptions. However, the spectral features in OLR for all the CCMs are very different from those in the observations, and the OLR wave activities are too low for CCSRNIES, CMAM, and MRI. It is concluded that the broad range of wave activity found in the different RAs decreases our confidence in their validity and in particular their value for validation of CCM performance in the TTL, thereby limiting our quantitative understanding of the dehydration and transport processes in the TTL.
Abstract:
The use of Bayesian inference for time-frequency representations has, thus far, been limited to offline analysis of signals, using a smoothing-spline-based model of the time-frequency plane. In this paper we introduce a new framework that allows the routine use of Bayesian inference for online estimation of the time-varying spectral density of a locally stationary Gaussian process. The core of our approach is the use of a likelihood inspired by a local Whittle approximation. This choice, along with the use of a recursive algorithm for non-parametric estimation of the local spectral density, permits the use of a particle filter for estimating the time-varying spectral density online. We demonstrate the algorithm by tracking chirps and analysing musical data.
Abstract:
During the last termination (from ~18 000 years ago to ~9000 years ago), the climate warmed significantly and the ice sheets melted. Simultaneously, atmospheric CO2 increased from ~190 ppm to ~260 ppm. Although this CO2 rise plays an important role in the deglacial warming, the reasons for its evolution are difficult to explain. So far, only box models have been used to run transient simulations of this carbon cycle transition, and only by forcing the model with data-constrained scenarios of the evolution of temperature, sea level, sea ice, NADW formation, Southern Ocean vertical mixing and the biological carbon pump. More complex models (including GCMs) have investigated some of these mechanisms, but they have only been used to try to explain LGM versus present-day steady-state climates. In this study we use a coupled climate-carbon model of intermediate complexity to explore the role of three oceanic processes in transient simulations: the sinking of brines, stratification-dependent diffusion and iron fertilization. Carbonate compensation is accounted for in these simulations. We show that neither iron fertilization nor the sinking of brines alone can account for the evolution of CO2, and that only the combination of the sinking of brines and interactive diffusion can simultaneously simulate the increase in deep Southern Ocean δ13C. The scenario that agrees best with the data takes into account all mechanisms and favours a rapid cessation of the sinking of brines around 18 000 years ago, when the Antarctic ice sheet extent was at its maximum. In this scenario, we hypothesize that sea ice formation was then shifted to the open ocean, where the salty water is quickly mixed with fresher water, which prevents deep sinking of salty water and therefore breaks down the deep stratification and releases carbon from the abyss.
Based on this scenario, it is possible to simulate both the amplitude and timing of the long-term CO2 increase during the last termination in agreement with ice core data. The atmospheric δ13C appears to be highly sensitive to changes in the terrestrial biosphere, underlining the need to better constrain the vegetation evolution during the termination.