12 resultados para Longitudinal Data Analysis and Time Series

em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract Background A popular model for gene regulatory networks is the Boolean network model. In this paper, we propose an algorithm to perform an analysis of gene regulatory interactions using the Boolean network model and time-series data. Actually, the Boolean network is restricted in the sense that only a subset of all possible Boolean functions are considered. We explore some mathematical properties of the restricted Boolean networks in order to avoid the full search approach. The problem is modeled as a Constraint Satisfaction Problem (CSP) and CSP techniques are used to solve it. Results We applied the proposed algorithm in two data sets. First, we used an artificial dataset obtained from a model for the budding yeast cell cycle. The second data set is derived from experiments performed using HeLa cells. The results show that some interactions can be fully or, at least, partially determined under the Boolean model considered. Conclusions The algorithm proposed can be used as a first step for detection of gene/protein interactions. It is able to infer gene relationships from time-series data of gene expression, and this inference process can be aided by a priori knowledge available.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Each plasma physics laboratory has a proprietary scheme to control and data acquisition system. Usually, it is different from one laboratory to another. It means that each laboratory has its own way to control the experiment and retrieving data from the database. Fusion research relies to a great extent on international collaboration and this private system makes it difficult to follow the work remotely. The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The choice of MDSplus (Model Driven System plus) is proved by the fact that it is widely utilized, and the scientists from different institutions may use the same system in different experiments in different tokamaks without the need to know how each system treats its acquisition system and data analysis. Another important point is the fact that the MDSplus has a library system that allows communication between different types of language (JAVA, Fortran, C, C++, Python) and programs such as MATLAB, IDL, OCTAVE. In the case of tokamak TCABR interfaces (object of this paper) between the system already in use and MDSplus were developed, instead of using the MDSplus at all stages, from the control, and data acquisition to the data analysis. This was done in the way to preserve a complex system already in operation and otherwise it would take a long time to migrate. This implementation also allows add new components using the MDSplus fully at all stages. (c) 2012 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Complexity in time series is an intriguing feature of living dynamical systems, with potential use for identification of system state. Although various methods have been proposed for measuring physiologic complexity, uncorrelated time series are often assigned high values of complexity, errouneously classifying them as a complex physiological signals. Here, we propose and discuss a method for complex system analysis based on generalized statistical formalism and surrogate time series. Sample entropy (SampEn) was rewritten inspired in Tsallis generalized entropy, as function of q parameter (qSampEn). qSDiff curves were calculated, which consist of differences between original and surrogate series qSampEn. We evaluated qSDiff for 125 real heart rate variability (HRV) dynamics, divided into groups of 70 healthy, 44 congestive heart failure (CHF), and 11 atrial fibrillation (AF) subjects, and for simulated series of stochastic and chaotic process. The evaluations showed that, for nonperiodic signals, qSDiff curves have a maximum point (qSDiff(max)) for q not equal 1. Values of q where the maximum point occurs and where qSDiff is zero were also evaluated. Only qSDiff(max) values were capable of distinguish HRV groups (p-values 5.10 x 10(-3); 1.11 x 10(-7), and 5.50 x 10(-7) for healthy vs. CHF, healthy vs. AF, and CHF vs. AF, respectively), consistently with the concept of physiologic complexity, and suggests a potential use for chaotic system analysis. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.4758815]

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The scope of this paper was to analyze the association between homicides and public security indicators in Sao Paulo between 1996 and 2008, after monitoring the unemployment rate and the proportion of youths in the population. A time-series ecological study for 1996 and 2008 was conducted with Sao Paulo as the unit of analysis. Dependent variable: number of deaths by homicide per year. Main independent variables: arrest-incarceration rate, access to firearms, police activity. Data analysis was conducted using Stata. IC 10.0 software. Simple and multivariate negative binomial regression models were created. Deaths by homicide and arrest-incarceration, as well as police activity were significantly associated in simple regression analysis. Access to firearms was not significantly associated to the reduction in the number of deaths by homicide (p>0,05). After adjustment, the associations with both the public security indicators were not significant. In Sao Paulo the role of public security indicators are less important as explanatory factors for a reduction in homicide rates, after adjustment for unemployment rate and a reduction in the proportion of youths. The results reinforce the importance of socioeconomic and demographic factors for a change in the public security scenario in Sao Paulo.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Brazil is the largest sugarcane producer in the world and has a privileged position to attend to national and international market places. To maintain the high production of sugarcane, it is fundamental to improve the forecasting models of crop seasons through the use of alternative technologies, such as remote sensing. Thus, the main purpose of this article is to assess the results of two different statistical forecasting methods applied to an agroclimatic index (the water requirement satisfaction index; WRSI) and the sugarcane spectral response (normalized difference vegetation index; NDVI) registered on National Oceanic and Atmospheric Administration Advanced Very High Resolution Radiometer (NOAA-AVHRR) satellite images. We also evaluated the cross-correlation between these two indexes. According to the results obtained, there are meaningful correlations between NDVI and WRSI with time lags. Additionally, the adjusted model for NDVI presented more accurate results than the forecasting models for WRSI. Finally, the analyses indicate that NDVI is more predictable due to its seasonality and the WRSI values are more variable making it difficult to forecast.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present approximate distributions for the ratio of the cumulative wavelet periodograms considering stationary and non-stationary time series generated from independent Gaussian processes. We also adapt an existing procedure to use this statistic and its approximate distribution in order to test if two regularly or irregularly spaced time series are realizations of the same generating process. Simulation studies show good size and power properties for the test statistic. An application with financial microdata illustrates the test usefulness. We conclude advocating the use of these approximate distributions instead of the ones obtained through randomizations, mainly in the case of irregular time series. (C) 2012 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work proposes a system for classification of industrial steel pieces by means of magnetic nondestructive device. The proposed classification system presents two main stages, online system stage and off-line system stage. In online stage, the system classifies inputs and saves misclassification information in order to perform posterior analyses. In the off-line optimization stage, the topology of a Probabilistic Neural Network is optimized by a Feature Selection algorithm combined with the Probabilistic Neural Network to increase the classification rate. The proposed Feature Selection algorithm searches for the signal spectrogram by combining three basic elements: a Sequential Forward Selection algorithm, a Feature Cluster Grow algorithm with classification rate gradient analysis and a Sequential Backward Selection. Also, a trash-data recycling algorithm is proposed to obtain the optimal feedback samples selected from the misclassified ones.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The autoregressive (AR) estimator, a non-parametric method, is used to analyze functional magnetic resonance imaging (fMRI) data. The same method has been used, with success, in several other time series data analysis. It uses exclusively the available experimental data points to estimate the most plausible power spectra compatible with the experimental data and there is no need to make any assumption about non-measured points. The time series, obtained from fMRI block paradigm data, is analyzed by the AR method to determine the brain active regions involved in the processing of a given stimulus. This method is considerably more reliable than the fast Fourier transform or the parametric methods. The time series corresponding to each image pixel is analyzed using the AR estimator and the corresponding poles are obtained. The pole distribution gives the shape of power spectra, and the pixels with poles at the stimulation frequency are considered as the active regions. The method was applied in simulated and real data, its superiority is shown by the receiver operating characteristic curves which were obtained using the simulated data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Current scientific applications have been producing large amounts of data. The processing, handling and analysis of such data require large-scale computing infrastructures such as clusters and grids. In this area, studies aim at improving the performance of data-intensive applications by optimizing data accesses. In order to achieve this goal, distributed storage systems have been considering techniques of data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take into account application behavior to perform data access optimization. This limitation motivated this paper which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. In order to accomplish such a goal, this approach organizes application behaviors as time series and, then, analyzes and classifies those series according to their properties. By knowing properties, the approach selects modeling techniques to represent series and perform predictions, which are, later on, used to optimize data access operations. This new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm this new approach reduces application execution time in about 50 percent, specially when handling large amounts of data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract Background For analyzing longitudinal familial data we adopted a log-linear form to incorporate heterogeneity in genetic variance components over the time, and additionally a serial correlation term in the genetic effects at different levels of ages. Due to the availability of multiple measures on the same individual, we permitted environmental correlations that may change across time. Results Systolic blood pressure from family members from the first and second cohort was used in the current analysis. Measures of subjects receiving hypertension treatment were set as censored values and they were corrected. An initial check of the variance and covariance functions proposed for analyzing longitudinal familial data, using empirical semi-variogram plots, indicated that the observed trait dispersion pattern follows the assumptions adopted. Conclusion The corrections for censored phenotypes based on ordinary linear models may be an appropriate simple model to correct the data, ensuring that the original variability in the data was retained. In addition, empirical semi-variogram plots are useful for diagnosis of the (co)variance model adopted.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this work we compared the estimates of the parameters of ARCH models using a complete Bayesian method and an empirical Bayesian method in which we adopted a non-informative prior distribution and informative prior distribution, respectively. We also considered a reparameterization of those models in order to map the space of the parameters into real space. This procedure permits choosing prior normal distributions for the transformed parameters. The posterior summaries were obtained using Monte Carlo Markov chain methods (MCMC). The methodology was evaluated by considering the Telebras series from the Brazilian financial market. The results show that the two methods are able to adjust ARCH models with different numbers of parameters. The empirical Bayesian method provided a more parsimonious model to the data and better adjustment than the complete Bayesian method.