987 results for Time-series analyses


Relevance:

100.00%

Publisher:

Abstract:

We present a method to integrate environmental time series into stock assessment models and to test the significance of correlations between population processes and the environmental time series. Parameters that relate the environmental time series to population processes are included in the stock assessment model, and likelihood ratio tests are used to determine if the parameters improve the fit to the data significantly. Two approaches are considered to integrate the environmental relationship. In the environmental model, the population dynamics process (e.g. recruitment) is proportional to the environmental variable, whereas in the environmental model with process error it is proportional to the environmental variable, but the model allows an additional temporal variation (process error) constrained by a log-normal distribution. The methods are tested by using simulation analysis and compared to the traditional method of correlating model estimates with environmental variables outside the estimation procedure. In the traditional method, the estimates of recruitment were provided by a model that allowed the recruitment only to have a temporal variation constrained by a log-normal distribution. We illustrate the methods by applying them to test the statistical significance of the correlation between sea-surface temperature (SST) and recruitment to the snapper (Pagrus auratus) stock in the Hauraki Gulf–Bay of Plenty, New Zealand. Simulation analyses indicated that the integrated approach with additional process error is superior to the traditional method of correlating model estimates with environmental variables outside the estimation procedure. The results suggest that, for the snapper stock, recruitment is positively correlated with SST at the time of spawning.
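The likelihood ratio test at the heart of this procedure can be sketched as follows. The log-likelihood values are hypothetical placeholders for fits of a stock assessment model with and without the environmental (e.g. SST) parameter, and the closed-form p-value assumes a single added parameter:

```python
import math

def lr_test(loglik_null, loglik_env):
    """Likelihood-ratio test for one added environmental parameter:
    twice the gain in log-likelihood, compared against chi-square(1)."""
    stat = 2.0 * (loglik_env - loglik_null)
    # chi-square(1) survival function has the closed form erfc(sqrt(x/2))
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods: model without and with the SST term
stat, p = lr_test(loglik_null=-120.3, loglik_env=-117.1)
```

A p-value below the chosen significance level would indicate that the environmental parameter improves the fit significantly.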


We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data. In this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm, before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which can be downloaded from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts which can be used to reproduce the analyses carried out in this paper; these are available from https://sites.google.com/site/randomisedbhc/.
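BHC itself is Bayesian and its randomised variant is implemented in the R package above. Purely as a conceptual illustration of what clustering time series by profile similarity means, a naive classical agglomerative sketch (not the BHC algorithm, and with synthetic profiles) might look like:

```python
import numpy as np

def agglomerate(series, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    whose mean profiles are closest in Euclidean distance."""
    clusters = [[i] for i in range(len(series))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = series[clusters[a]].mean(axis=0)
                cb = series[clusters[b]].mean(axis=0)
                d = float(np.linalg.norm(ca - cb))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return clusters

# Four short synthetic "expression" profiles: two rising, two falling
series = np.array([
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.1, 2.1, 3.1],
    [3.0, 2.0, 1.0, 0.0],
    [2.9, 2.0, 1.1, 0.2],
])
clusters = agglomerate(series, n_clusters=2)
```

BHC replaces the ad hoc distance threshold with a Bayesian hypothesis test for each merge, which is what makes the number of clusters emerge from the data.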


It is estimated that the quantity of digital data being transferred, processed or stored at any one time currently stands at 4.4 zettabytes (4.4 × 2⁷⁰ bytes), and this figure is expected to grow by a factor of 10 to 44 zettabytes by 2020. Exploiting this data is, and will remain, a significant challenge. At present there is the capacity to store 33% of the digital data in existence at any one time; by 2020 this capacity is expected to fall to 15%. These statistics suggest that, in the era of Big Data, the identification of important, exploitable data will need to be done in a timely manner. Systems for the monitoring and analysis of data, e.g. stock markets, smart grids and sensor networks, can be made up of massive numbers of individual components. These components can be geographically distributed yet may interact with one another via continuous data streams, which in turn may affect the state of the sender or receiver. This introduces a dynamic causality, which further complicates the overall system by introducing a temporal constraint that is difficult to accommodate. Practical approaches to realising the system described above have led to a multiplicity of analysis techniques, each of which concentrates on specific characteristics of the system being analysed and treats these characteristics as the dominant component affecting the results being sought. This multiplicity of analysis techniques introduces another layer of heterogeneity, that is, heterogeneity of approach, partitioning the field to the extent that results from one domain are difficult to exploit in another. The question this raises is whether a generic solution can be identified for the monitoring and analysis of data that accommodates temporal constraints, bridges the gap between expert knowledge and raw data, and enables data to be interpreted and exploited effectively and transparently.
The approach proposed in this dissertation acquires, analyses and processes data in a manner that is free of the constraints of any particular analysis technique, while at the same time facilitating these techniques where appropriate. Constraints are applied by defining a workflow based on the production, interpretation and consumption of data. This supports the application of different analysis techniques to the same raw data without the danger of incorporating hidden bias. To illustrate and realise this approach, a software platform has been created that allows for the transparent analysis of data, combining analysis techniques with a maintainable record of provenance so that independent third-party analysis can be applied to verify any derived conclusions. To demonstrate these concepts, a complex real-world example involving the near real-time capture and analysis of neurophysiological data from a neonatal intensive care unit (NICU) was chosen. A system was engineered to gather raw data, analyse that data using different analysis techniques, uncover information, incorporate that information into the system and curate the evolution of the discovered knowledge. The application domain was chosen for three reasons: firstly, because it is complex and no comprehensive solution exists; secondly, because it requires tight interaction with domain experts, and thus the handling of subjective knowledge and inference; and thirdly, because, given the dearth of neurophysiologists, there is a real-world need to provide a solution for this domain.
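One core idea, pairing each analysis step with a verifiable provenance record, can be sketched minimally. The function and record format below are illustrative inventions, not the platform's actual API:

```python
import hashlib
import json
import time

def run_step(data, transform, step_name, record):
    """Apply one analysis step and log input/output fingerprints so an
    independent third party can verify how each result was derived."""
    result = transform(data)
    record.append({
        "step": step_name,
        "input_sha256": hashlib.sha256(json.dumps(data).encode()).hexdigest(),
        "output_sha256": hashlib.sha256(json.dumps(result).encode()).hexdigest(),
        "timestamp": time.time(),
    })
    return result

provenance = []
raw = [3, 1, 2, 2]
deduped = run_step(raw, lambda d: sorted(set(d)), "dedupe_and_sort", provenance)
```

Because every step records content hashes of its input and output, a third party can re-run any transform and confirm that the derived conclusions trace back to the raw data.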


Evidence for climate-correlated low frequency variability of various components of marine ecosystems has accumulated rapidly over the past 2 decades. There has also been a growing recognition that society needs to learn how the fluctuations of these various components are linked, and to predict the likely amplitude and steepness of future changes. Demographic characteristics of marine zooplankton make them especially suitable for examining variability of marine ecosystems at interannual to decadal time scales. Their life cycle duration is short enough that there is little carryover of population membership from year to year, but long enough that variability can be tracked with monthly-to-seasonal sampling. Because zooplankton are rarely fished, comparative analysis of changes in their abundance can greatly enhance our ability to evaluate the importance of and interaction between physical environment, food web, and fishery harvest as causal mechanisms driving ecosystem level changes. A number of valuable within-region analyses of zooplankton time series have been published in the past decade, covering a variety of modes of variability including changes in total biomass, changes in size structure and species composition, changes in spatial distribution, and changes in seasonal timing. But because most zooplankton time series are relatively short compared to the time scales of interest, the statistical power of local analyses is often low, and between-region and between-variable comparisons are also needed. In this paper, we review the results of recent within- and between-region analyses, and suggest some priorities for future work.


The objective of this paper is to introduce a different approach, called the ecological-longitudinal, to carrying out pooled analysis in time series ecological studies. Because it yields a larger number of data points and hence increases the statistical power of the analysis, this approach, unlike conventional ones, allows the accommodation of random-effects models, of lags, and of interactions between pollutants and between pollutants and meteorological variables, which are hard to implement in conventional approaches. Design—The approach is illustrated by providing quantitative estimates of the short-term effects of air pollution on mortality in three Spanish cities, Barcelona, Valencia and Vigo, for the period 1992–1994. Because the dependent variable was a count, a Poisson generalised linear model was first specified. Several modelling issues are worth mentioning. Firstly, because the relations between mortality and the explanatory variables were nonlinear, cubic splines were used for covariate control, leading to a generalised additive model (GAM). Secondly, the effects of the predictors on the response were allowed to occur with some lag. Thirdly, the residual autocorrelation, due to imperfect control, was accounted for by means of an autoregressive Poisson GAM. Finally, the longitudinal design demanded consideration of individual heterogeneity, requiring the use of mixed models. Main results—The estimates of the relative risks obtained from the individual analyses varied across cities, particularly those associated with sulphur dioxide. The highest relative risks corresponded to black smoke in Valencia. These estimates were higher than those obtained from the ecological-longitudinal analysis.
Relative risks estimated from this latter analysis were practically identical across cities, 1.00638 (95% confidence interval 1.0002, 1.0011) for a black smoke increase of 10 μg/m3 and 1.00415 (95% CI 1.0001, 1.0007) for an increase of 10 μg/m3 of sulphur dioxide. Because the statistical power is higher than in the individual analyses, more interactions were statistically significant, especially those among air pollutants and meteorological variables. Conclusions—Air pollutant levels were related to mortality in the three cities of the study, Barcelona, Valencia and Vigo. These results were consistent with similar studies in other cities and with other multicentric studies, and coherent with both the previous individual analyses for each city and the multicentric studies for all three cities.
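The step from a log-link (Poisson) regression coefficient to a relative risk for a 10 μg/m3 increment can be sketched as below. The coefficient and standard error are hypothetical placeholders, not the study's fitted values:

```python
import math

def relative_risk(beta, se, increment=10.0):
    """Relative risk (with 95% CI) for an `increment` rise in pollutant,
    given a log-link Poisson regression coefficient per unit and its SE."""
    rr = math.exp(beta * increment)
    lower = math.exp((beta - 1.96 * se) * increment)
    upper = math.exp((beta + 1.96 * se) * increment)
    return rr, lower, upper

# Hypothetical per-unit coefficient for black smoke (not the fitted value)
rr, lo, hi = relative_risk(beta=6.4e-4, se=2.0e-4)
```

On the log scale the coefficient is additive, so the risk for any increment is obtained by exponentiating the coefficient times that increment.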


African societies are dependent on rainfall for agricultural and other water-dependent activities, yet rainfall is extremely variable in both space and time, and recurring water shocks, such as drought, can have considerable social and economic impacts. To help improve our knowledge of the rainfall climate, we have constructed a 30-year (1983–2012), temporally consistent rainfall dataset for Africa known as TARCAT (TAMSAT African Rainfall Climatology And Time-series) using archived Meteosat thermal infra-red (TIR) imagery, calibrated against rain gauge records collated from numerous African agencies. TARCAT has been produced at 10-day (dekad) scale at a spatial resolution of 0.0375°. An intercomparison of TARCAT from 1983 to 2010 with six long-term precipitation datasets indicates that TARCAT replicates the spatial and seasonal rainfall patterns and interannual variability well, with correlation coefficients of 0.85 and 0.70 with the Climatic Research Unit (CRU) and Global Precipitation Climatology Centre (GPCC) gridded-gauge analyses, respectively, in the interannual variability of the Africa-wide mean monthly rainfall. The design of the algorithm for drought monitoring leads to TARCAT underestimating the Africa-wide mean annual rainfall by 0.37 mm day⁻¹ (21%) on average compared to other datasets. As the TARCAT rainfall estimates are historically calibrated across large climatically homogeneous regions, the data can provide users with robust estimates of climate-related risk, even in regions where gauge records are inconsistent in time.
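The kind of intercomparison reported above, correlation of Africa-wide annual means between products together with a mean bias, can be sketched with made-up numbers (the series below are hypothetical, not TARCAT values):

```python
import numpy as np

def compare_products(a, b):
    """Interannual Pearson correlation and mean bias between the annual
    mean rainfall series of two products (mm/day)."""
    r = float(np.corrcoef(a, b)[0, 1])
    bias = float(np.mean(a - b))
    return r, bias

# Hypothetical Africa-wide annual means: satellite product vs gauge analysis
satellite = np.array([1.4, 1.6, 1.3, 1.8, 1.5, 1.7])
gauge = satellite + 0.4  # constant offset: perfectly correlated, dry-biased
r, bias = compare_products(satellite, gauge)
```

The toy offset illustrates how a product can track interannual variability perfectly (r = 1) while still carrying a systematic dry bias, the pattern described for TARCAT.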


Brazil is the largest sugarcane producer in the world and is well positioned to supply national and international markets. To maintain the high production of sugarcane, it is fundamental to improve crop-season forecasting models through the use of alternative technologies, such as remote sensing. Thus, the main purpose of this article is to assess the results of two different statistical forecasting methods applied to an agroclimatic index (the water requirement satisfaction index; WRSI) and to the sugarcane spectral response (normalized difference vegetation index; NDVI) registered on National Oceanic and Atmospheric Administration Advanced Very High Resolution Radiometer (NOAA-AVHRR) satellite images. We also evaluated the cross-correlation between these two indexes. According to the results obtained, there are meaningful correlations between NDVI and WRSI with time lags. Additionally, the adjusted model for NDVI presented more accurate results than the forecasting models for WRSI. Finally, the analyses indicate that NDVI is more predictable due to its seasonality, whereas the WRSI values are more variable and thus more difficult to forecast.
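Lagged cross-correlation of the sort evaluated between WRSI and NDVI can be sketched as follows, with synthetic series standing in for the satellite-derived indexes:

```python
import numpy as np

def lagged_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag]: positive lags ask
    whether x leads y."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic indexes where y simply echoes x two time steps later
x = np.array([0.2, 0.5, 0.9, 0.4, 0.1, 0.6, 0.8, 0.3])
y = np.roll(x, 2)
best_lag = max(range(-3, 4), key=lambda k: lagged_correlation(x, y, k))
```

Scanning the correlation over a range of lags and picking the maximum identifies the delay at which one index best anticipates the other.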


This work proposes a system for the classification of industrial steel pieces by means of a magnetic nondestructive device. The proposed classification system comprises two main stages: an online stage and an off-line optimization stage. In the online stage, the system classifies inputs and saves misclassification information in order to perform posterior analyses. In the off-line optimization stage, the topology of a Probabilistic Neural Network is optimized by a Feature Selection algorithm combined with the Probabilistic Neural Network to increase the classification rate. The proposed Feature Selection algorithm searches the signal spectrogram by combining three basic elements: a Sequential Forward Selection algorithm, a Feature Cluster Grow algorithm with classification rate gradient analysis, and a Sequential Backward Selection. In addition, a trash-data recycling algorithm is proposed to obtain optimal feedback samples selected from the misclassified ones.
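A minimal sketch of the Sequential Forward Selection element follows, with a toy additive score standing in for the Probabilistic Neural Network's classification rate (the feature names and gains are invented for illustration):

```python
def sequential_forward_selection(features, score_fn, k):
    """Greedy SFS: start from the empty set and repeatedly add the
    feature that most improves the score until k features are chosen."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score_fn(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy stand-in for a classifier's rate: additive gains per spectrogram band
gains = {"band0": 0.30, "band1": 0.50, "band2": 0.10}
score = lambda subset: sum(gains[f] for f in subset)
chosen = sequential_forward_selection(gains, score, k=2)
```

In the real system the score function would be the classification rate of the PNN retrained on each candidate subset, which is what makes the greedy search expensive and motivates the complementary cluster-grow and backward-selection passes.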


Arrhythmia is a cardiovascular disease that accounts for a large number of deaths and poses potentially untreatable danger. It is a life-threatening condition originating from disorganized propagation of electrical signals in the heart, resulting in desynchronization among the different chambers of the heart. Fundamentally, synchronization means that the phase relationship of electrical activities between the chambers remains coherent, maintaining a constant phase difference over time. If desynchronization occurs due to arrhythmia, the coherent phase relationship breaks down, resulting in a chaotic rhythm that affects the regular pumping mechanism of the heart. This phenomenon was explored by using the phase space reconstruction technique, a standard analysis technique for time series data generated from nonlinear dynamical systems. In this project a novel index is presented for predicting the onset of ventricular arrhythmias. Analysis of continuously captured long-term ECG recordings was conducted up to the onset of arrhythmia by the phase space reconstruction method, obtaining 2-dimensional images that were analysed by the box counting method. The method was tested using ECG data sets of three different kinds, normal (NR), Ventricular Tachycardia (VT) and Ventricular Fibrillation (VF), extracted from the Physionet ECG database. Statistical measures such as the mean (μ), standard deviation (σ) and coefficient of variation (σ/μ) of the box count in phase space diagrams were derived for a sliding window of 10 beats of the ECG signal. From the results of these statistical analyses, a threshold was derived as an upper bound on the coefficient of variation (CV) of the box count of ECG phase portraits, capable of reliably predicting the impending arrhythmia well before its actual occurrence.
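The pipeline of 2-D phase space reconstruction followed by box counting can be sketched as below. The signals, grid size and delay are illustrative choices, not the project's actual parameters; a regular rhythm traces a thin loop in phase space while an irregular signal fills the plane:

```python
import numpy as np

def delay_embed(x, delay):
    """2-D phase space reconstruction: plot each sample against its
    time-delayed copy, (x[t], x[t + delay])."""
    return np.column_stack([x[:-delay], x[delay:]])

def box_count(points, n_boxes=8):
    """Number of occupied cells on an n_boxes x n_boxes grid over the portrait."""
    mins = points.min(axis=0)
    spans = np.maximum(points.max(axis=0) - mins, 1e-12)
    cells = np.floor((points - mins) / spans * (n_boxes - 1e-9)).astype(int)
    return len({tuple(c) for c in cells})

# Periodic "rhythm" vs an irregular signal, embedded and box-counted
t = np.linspace(0, 4 * np.pi, 400)
regular_boxes = box_count(delay_embed(np.sin(t), delay=25))
noisy_boxes = box_count(delay_embed(np.random.default_rng(0).standard_normal(400), delay=25))
```

Tracking how the box count (and its coefficient of variation over a sliding window of beats) changes is the basis of the proposed prediction index.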
As future work, it is planned to validate this prediction tool over a wider population of patients affected by different kinds of arrhythmia, such as atrial fibrillation and bundle branch block, and to set different thresholds for them, in order to confirm its clinical applicability.


The rotational nature of shifting cultivation poses several challenges to its detection by remote sensing. Consequently, there is a lack of spatial data on the dynamics of shifting cultivation landscapes on a regional, i.e. sub-national, or national level. We present an approach based on a time series of Landsat and MODIS data and landscape metrics to delineate the dynamics of shifting cultivation landscapes. Our results reveal that shifting cultivation is a land use system still widely and dynamically utilized in northern Laos. While there is an overall reduction in the areas dominated by shifting cultivation, some regions also show an expansion. A review of relevant reports and articles indicates that policies tend to lead to a reduction while market forces can result in both expansion and reduction. For a better understanding of the different factors affecting shifting cultivation landscapes in Laos, further research should focus on spatially explicit analyses.


While many time-series studies of ozone and daily mortality identified positive associations, others yielded null or inconclusive results. We performed a meta-analysis of 144 effect estimates from 39 time-series studies, and estimated pooled effects by lags, age groups, cause-specific mortality, and concentration metrics. We compared results to estimates from the National Morbidity, Mortality, and Air Pollution Study (NMMAPS), a time-series study of 95 large U.S. cities from 1987 to 2000. Both meta-analysis and NMMAPS results provided strong evidence of a short-term association between ozone and mortality, with larger effects for cardiovascular and respiratory mortality, the elderly, and current-day ozone exposure as compared to other single-day lags. In both analyses, results were not sensitive to adjustment for particulate matter and model specifications. In the meta-analysis we found that a 10 ppb increase in daily ozone is associated with a 0.83% (95% confidence interval: 0.53%, 1.12%) increase in total mortality, whereas the corresponding NMMAPS estimate is 0.25% (0.12%, 0.39%). Meta-analysis results were consistently larger than those from NMMAPS, indicating publication bias. Additional publication bias is evident in the choice of lags in time-series studies and in the larger heterogeneity of posterior city-specific estimates in the meta-analysis, as compared with NMMAPS.
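A fixed-effect (inverse-variance) pooling step of the kind used in such meta-analyses can be sketched with hypothetical city-level estimates; the actual study's pooling model may differ (e.g. random effects):

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Fixed-effect (inverse-variance) pooled estimate and 95% CI."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical city-level % increases in mortality per 10 ppb ozone
pooled, ci = pool_fixed_effect([0.6, 1.0, 0.9], [0.2, 0.4, 0.3])
```

Weighting each study by the inverse of its variance gives more precise studies more influence and yields a pooled confidence interval narrower than any individual study's.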


In the summers of 2001 and 2002, glacio-climatological research was performed at 4110-4120 m a.s.l. on the Belukha snow/firn plateau, Siberian Altai. Hundreds of samples from snow pits and a 21 m snow/firn core were collected to establish the annual/seasonal/monthly depth-accumulation scale, based on stable-isotope records, stratigraphic analyses and meteorological and synoptic data. The fluctuations of the water stable-isotope records show well-preserved seasonal variations. The δ18O and δD relationships in precipitation, snow pits and the snow/firn core have the same slope as the global meteoric water line. The origins of the precipitation nourishing the Belukha plateau were determined based on clustering analysis of δ18O and d-excess records and examination of synoptic atmospheric patterns. Calibration and validation of the developed clusters were performed at event and monthly timescales, with about 15% uncertainty. Two distinct moisture sources were identified: oceanic sources with d-excess < 12‰, and Aral-Caspian closed drainage basin sources with d-excess > 12‰. Two-thirds of the annual accumulation came from oceanic precipitation, of which more than half had isotopic ratios corresponding to moisture evaporated over the Atlantic Ocean. Precipitation from the Arctic/Pacific Ocean had the lowest deuterium excess, contributing one-tenth of the annual accumulation.
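The d-excess computation and the 12‰ source threshold can be expressed directly; the sample isotope ratios below are hypothetical:

```python
def d_excess(delta_d, delta_18o):
    """Deuterium excess d = dD - 8 * d18O, measuring deviation from the
    global meteoric water line dD = 8 * d18O + 10 (values in permil)."""
    return delta_d - 8.0 * delta_18o

def moisture_source(delta_d, delta_18o, threshold=12.0):
    """Classify a sample by the paper's d-excess threshold: below 12 permil
    oceanic, above it Aral-Caspian closed-basin moisture."""
    return "oceanic" if d_excess(delta_d, delta_18o) < threshold else "Aral-Caspian"

# Hypothetical isotope ratios for two snow-pit samples
source_a = moisture_source(delta_d=-110.0, delta_18o=-15.0)
source_b = moisture_source(delta_d=-100.0, delta_18o=-14.5)
```

Closed-basin moisture recycles evaporated water under low humidity, elevating d-excess above the ~10‰ typical of oceanic sources, which is what makes the threshold discriminative.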


The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one-value-per-variable paradigm and is widely employed in a host of clinical models and tools; it is often represented by a number in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data; they are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood of producing a representation of the time series data elements that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU", provides a detailed description, start to finish, of the methods required to prepare the data, build, and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are unfeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model.
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled: "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. 
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy as compared to the baseline multivariate model, but diminished classification accuracy as compared to when just the trend analysis features were added (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by excluding the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
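One way to realise the trend-analysis latent features described above is a least-squares slope per sliding window; the vital-sign readings and window length below are illustrative, not the study's actual parameters:

```python
import numpy as np

def window_slopes(series, window):
    """Least-squares slope over each sliding window, usable as a latent
    'trend' feature alongside the raw values."""
    t = np.arange(window)
    return [float(np.polyfit(t, np.asarray(series[i:i + window], dtype=float), 1)[0])
            for i in range(len(series) - window + 1)]

# Illustrative systolic blood pressure readings drifting downward
bp = [98, 97, 95, 92, 88, 83, 77, 70]
slopes = window_slopes(bp, window=4)
```

A steadily more negative slope captures deterioration as a single number per window, which is exactly the kind of objective, mathematical characterisation a point-in-time multivariate snapshot cannot provide.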


A monitoring programme for microzooplankton was started at the long-term sampling station "Kabeltonne" at Helgoland Roads (54°11.30' N; 7°54.00' E) in January 2007 in order to provide more detailed knowledge of microzooplankton occurrence, composition and seasonality patterns at this site and to complement the existing plankton data series. Ciliate and dinoflagellate cell concentrations and carbon biomass were recorded on a weekly basis. Heterotrophic dinoflagellates were considerably more important in terms of biomass than ciliates, especially during the summer months. However, in early spring, ciliates were the major group of microzooplankton grazers, as they responded more quickly to phytoplankton food availability. Mixotrophic dinoflagellates played a secondary role in terms of biomass when compared to heterotrophic species; nevertheless, they formed an intense late-summer bloom in 2007. The photosynthetic ciliate Myrionecta rubra bloomed at the end of the sampling period. Due to its high biomass compared to that of crustacean plankton, especially during the spring bloom, microzooplankton should be regarded as the more important phytoplankton grazer group at Helgoland Roads. Based on these results, analyses of the biotic and abiotic factors driving microzooplankton composition and abundance are necessary for a full understanding of this important component of the plankton.


The composition and vertical distribution of planktonic ciliates within the surface layer was monitored over four diel cycles in May 1995, during the JGOFS-France DYNAPROC cruise in the Ligurian Sea (NW Mediterranean). Ciliates were placed into size and trophic categories: micro- and nano-heterotrophic ciliates, mixotrophic ciliates, tintinnids and the autotrophic Mesodinium rubrum. Mixotrophic ciliates (micro and nano) represented on average 46% of oligotrich abundance and 39% of oligotrich biomass; nano-ciliates (heterotrophic and mixotrophic) were abundant, representing about 60% and 17% of oligotrich abundance and biomass, respectively. Tintinnid ciliates were a minor part of the heterotrophic ciliates. The estimated contribution of mixotrophs to chlorophyll a concentration was modest, never exceeding 9% in discrete samples. Vertical profiles showed that chlorophyll-containing ciliates (mixotrophs and autotrophs) were mainly concentrated at, and remained at, the chlorophyll a maximum depth. In contrast, among heterotrophic ciliates, a portion of the population appeared to migrate from 20-30 m depth during the day to the surface at night or in the early morning. Correlation analyses of ciliate groups and phytoplankton pigments showed a strong relationship between nano-ciliates and zeaxanthin, and between chlorophyll-containing ciliates and chlorophyll a, as well as other pigments that were maximal at the chlorophyll a maximum depth. Total surface-layer concentrations of ciliates showed minima during nighttime/early-morning hours.