965 resultados para Integer-Valued Time Series
Resumo:
This work is supported by Brazilian agencies Fapesp, CAPES and CNPq
Resumo:
Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.
Resumo:
In this work we compared the estimates of the parameters of ARCH models using a complete Bayesian method and an empirical Bayesian method in which we adopted a non-informative prior distribution and informative prior distribution, respectively. We also considered a reparameterization of those models in order to map the space of the parameters into real space. This procedure permits choosing prior normal distributions for the transformed parameters. The posterior summaries were obtained using Monte Carlo Markov chain methods (MCMC). The methodology was evaluated by considering the Telebras series from the Brazilian financial market. The results show that the two methods are able to adjust ARCH models with different numbers of parameters. The empirical Bayesian method provided a more parsimonious model to the data and better adjustment than the complete Bayesian method.
Resumo:
The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used.In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed, complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger diameter cluster than is truly warranted.We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of triangular inequality, thus making use of most existing indexing and data mining algorithms. We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
Resumo:
This work proposes a system for classification of industrial steel pieces by means of magnetic nondestructive device. The proposed classification system presents two main stages, online system stage and off-line system stage. In online stage, the system classifies inputs and saves misclassification information in order to perform posterior analyses. In the off-line optimization stage, the topology of a Probabilistic Neural Network is optimized by a Feature Selection algorithm combined with the Probabilistic Neural Network to increase the classification rate. The proposed Feature Selection algorithm searches for the signal spectrogram by combining three basic elements: a Sequential Forward Selection algorithm, a Feature Cluster Grow algorithm with classification rate gradient analysis and a Sequential Backward Selection. Also, a trash-data recycling algorithm is proposed to obtain the optimal feedback samples selected from the misclassified ones.
Resumo:
This work provides a forward step in the study and comprehension of the relationships between stochastic processes and a certain class of integral-partial differential equation, which can be used in order to model anomalous diffusion and transport in statistical physics. In the first part, we brought the reader through the fundamental notions of probability and stochastic processes, stochastic integration and stochastic differential equations as well. In particular, within the study of H-sssi processes, we focused on fractional Brownian motion (fBm) and its discrete-time increment process, the fractional Gaussian noise (fGn), which provide examples of non-Markovian Gaussian processes. The fGn, together with stationary FARIMA processes, is widely used in the modeling and estimation of long-memory, or long-range dependence (LRD). Time series manifesting long-range dependence, are often observed in nature especially in physics, meteorology, climatology, but also in hydrology, geophysics, economy and many others. We deepely studied LRD, giving many real data examples, providing statistical analysis and introducing parametric methods of estimation. Then, we introduced the theory of fractional integrals and derivatives, which indeed turns out to be very appropriate for studying and modeling systems with long-memory properties. After having introduced the basics concepts, we provided many examples and applications. For instance, we investigated the relaxation equation with distributed order time-fractional derivatives, which describes models characterized by a strong memory component and can be used to model relaxation in complex systems, which deviates from the classical exponential Debye pattern. Then, we focused in the study of generalizations of the standard diffusion equation, by passing through the preliminary study of the fractional forward drift equation. Such generalizations have been obtained by using fractional integrals and derivatives of distributed orders. In order to find a connection between the anomalous diffusion described by these equations and the long-range dependence, we introduced and studied the generalized grey Brownian motion (ggBm), which is actually a parametric class of H-sssi processes, which have indeed marginal probability density function evolving in time according to a partial integro-differential equation of fractional type. The ggBm is of course Non-Markovian. All around the work, we have remarked many times that, starting from a master equation of a probability density function f(x,t), it is always possible to define an equivalence class of stochastic processes with the same marginal density function f(x,t). All these processes provide suitable stochastic models for the starting equation. Studying the ggBm, we just focused on a subclass made up of processes with stationary increments. The ggBm has been defined canonically in the so called grey noise space. However, we have been able to provide a characterization notwithstanding the underline probability space. We also pointed out that that the generalized grey Brownian motion is a direct generalization of a Gaussian process and in particular it generalizes Brownain motion and fractional Brownain motion as well. Finally, we introduced and analyzed a more general class of diffusion type equations related to certain non-Markovian stochastic processes. We started from the forward drift equation, which have been made non-local in time by the introduction of a suitable chosen memory kernel K(t). The resulting non-Markovian equation has been interpreted in a natural way as the evolution equation of the marginal density function of a random time process l(t). We then consider the subordinated process Y(t)=X(l(t)) where X(t) is a Markovian diffusion. The corresponding time-evolution of the marginal density function of Y(t) is governed by a non-Markovian Fokker-Planck equation which involves the same memory kernel K(t). We developed several applications and derived the exact solutions. Moreover, we considered different stochastic models for the given equations, providing path simulations.
Resumo:
In the present work we perform an econometric analysis of the Tribal art market. To this aim, we use a unique and original database that includes information on Tribal art market auctions worldwide from 1998 to 2011. In Literature, art prices are modelled through the hedonic regression model, a classic fixed-effect model. The main drawback of the hedonic approach is the large number of parameters, since, in general, art data include many categorical variables. In this work, we propose a multilevel model for the analysis of Tribal art prices that takes into account the influence of time on artwork prices. In fact, it is natural to assume that time exerts an influence over the price dynamics in various ways. Nevertheless, since the set of objects change at every auction date, we do not have repeated measurements of the same items over time. Hence, the dataset does not constitute a proper panel; rather, it has a two-level structure in that items, level-1 units, are grouped in time points, level-2 units. The main theoretical contribution is the extension of classical multilevel models to cope with the case described above. In particular, we introduce a model with time dependent random effects at the second level. We propose a novel specification of the model, derive the maximum likelihood estimators and implement them through the E-M algorithm. We test the finite sample properties of the estimators and the validity of the own-written R-code by means of a simulation study. Finally, we show that the new model improves considerably the fit of the Tribal art data with respect to both the hedonic regression model and the classic multilevel model.
A Phase Space Box-counting based Method for Arrhythmia Prediction from Electrocardiogram Time Series
Resumo:
Arrhythmia is one kind of cardiovascular diseases that give rise to the number of deaths and potentially yields immedicable danger. Arrhythmia is a life threatening condition originating from disorganized propagation of electrical signals in heart resulting in desynchronization among different chambers of the heart. Fundamentally, the synchronization process means that the phase relationship of electrical activities between the chambers remains coherent, maintaining a constant phase difference over time. If desynchronization occurs due to arrhythmia, the coherent phase relationship breaks down resulting in chaotic rhythm affecting the regular pumping mechanism of heart. This phenomenon was explored by using the phase space reconstruction technique which is a standard analysis technique of time series data generated from nonlinear dynamical system. In this project a novel index is presented for predicting the onset of ventricular arrhythmias. Analysis of continuously captured long-term ECG data recordings was conducted up to the onset of arrhythmia by the phase space reconstruction method, obtaining 2-dimensional images, analysed by the box counting method. The method was tested using the ECG data set of three different kinds including normal (NR), Ventricular Tachycardia (VT), Ventricular Fibrillation (VF), extracted from the Physionet ECG database. Statistical measures like mean (μ), standard deviation (σ) and coefficient of variation (σ/μ) for the box-counting in phase space diagrams are derived for a sliding window of 10 beats of ECG signal. From the results of these statistical analyses, a threshold was derived as an upper bound of Coefficient of Variation (CV) for box-counting of ECG phase portraits which is capable of reliably predicting the impeding arrhythmia long before its actual occurrence. As future work of research, it was planned to validate this prediction tool over a wider population of patients affected by different kind of arrhythmia, like atrial fibrillation, bundle and brunch block, and set different thresholds for them, in order to confirm its clinical applicability.
Resumo:
The main objective of this thesis is to explore the short and long run causality patterns in the finance – growth nexus and finance-growth-trade nexus before and after the global financial crisis, in the case of Albania. To this end we use quarterly data on real GDP, 13 proxy measures for financial development and the trade openness indicator for the period 1998Q1 – 2013Q2 and 1998Q1-2008Q3. Causality patterns will be explored in a VAR-VECM framework. For this purpose we will proceed as follows: (i) testing for the integration order of the variables; (ii) cointegration analysis and (iii) performing Granger causality tests in a VAR-VECM framework. In the finance-growth nexus, empirical evidence suggests for a positive long run relationship between finance and economic growth, with causality running from financial development to economic growth. The global financial crisis seems to have not affected the causality direction in the finance and growth nexus, thus supporting the finance led growth hypothesis in the long run in the case of Albania. In the finance-growth-trade openness nexus, we found evidence for a positive long run relationship the variables, with causality direction depending on the proxy used for financial development. When the pre-crisis sample is considered, we find evidence for causality running from financial development and trade openness to economic growth. The global financial crisis seems to have affected somewhat the causality direction in the finance-growth-trade nexus, which has become sensible to the proxy used for financial development. On the short run, empirical evidence suggests for a clear unidirectional relationship between finance and growth, with causality mostly running from economic growth to financial development. When we consider the per-crisis sub sample results are mixed, depending on the proxy used for financial development. The same results are confirmed when trade openness is taken into account.
Resumo:
The thesis is concerned with local trigonometric regression methods. The aim was to develop a method for extraction of cyclical components in time series. The main results of the thesis are the following. First, a generalization of the filter proposed by Christiano and Fitzgerald is furnished for the smoothing of ARIMA(p,d,q) process. Second, a local trigonometric filter is built, with its statistical properties. Third, they are discussed the convergence properties of trigonometric estimators, and the problem of choosing the order of the model. A large scale simulation experiment has been designed in order to assess the performance of the proposed models and methods. The results show that local trigonometric regression may be a useful tool for periodic time series analysis.
Resumo:
La tesi tratta una panoramica generale sui Time Series database e relativi gestori. Successivamente l'attenzione è focalizzata sul DBMS InfluxDB. Infine viene mostrato un progetto che implementa InfluxDB
Resumo:
Currently, a variety of linear and nonlinear measures is in use to investigate spatiotemporal interrelation patterns of multivariate time series. Whereas the former are by definition insensitive to nonlinear effects, the latter detect both nonlinear and linear interrelation. In the present contribution we employ a uniform surrogate-based approach, which is capable of disentangling interrelations that significantly exceed random effects and interrelations that significantly exceed linear correlation. The bivariate version of the proposed framework is explored using a simple model allowing for separate tuning of coupling and nonlinearity of interrelation. To demonstrate applicability of the approach to multivariate real-world time series we investigate resting state functional magnetic resonance imaging (rsfMRI) data of two healthy subjects as well as intracranial electroencephalograms (iEEG) of two epilepsy patients with focal onset seizures. The main findings are that for our rsfMRI data interrelations can be described by linear cross-correlation. Rejection of the null hypothesis of linear iEEG interrelation occurs predominantly for epileptogenic tissue as well as during epileptic seizures.