966 results for Time Series Models
Abstract:
The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used.
In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed: complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex, objects belong in a sparser and larger-diameter cluster than is truly warranted.
We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of the triangular inequality, thus making use of most existing indexing and data mining algorithms. We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
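The abstract does not state the formula for the measure. As a minimal sketch, the code below assumes one formulation: the Euclidean distance multiplied by a correction factor, the ratio of the larger to the smaller complexity estimate, where the complexity of a series is estimated as the root of its summed squared first differences. The function names and the handling of constant series are illustrative assumptions, not taken from the work.

```python
import math
from typing import Sequence


def euclidean(q: Sequence[float], c: Sequence[float]) -> float:
    """Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def complexity_estimate(x: Sequence[float]) -> float:
    """Assumed complexity estimate: root of the summed squared first differences."""
    return math.sqrt(sum((x[i + 1] - x[i]) ** 2 for i in range(len(x) - 1)))


def cid(q: Sequence[float], c: Sequence[float]) -> float:
    """Euclidean distance scaled by the larger-to-smaller complexity ratio."""
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    lo, hi = min(ce_q, ce_c), max(ce_q, ce_c)
    if lo == 0.0:
        # Assumption (not from the source): two constant series get no
        # correction; a constant vs. a non-constant series is infinitely far.
        return euclidean(q, c) if hi == 0.0 else math.inf
    return euclidean(q, c) * (hi / lo)
```

Because the correction factor is at least 1, the plain Euclidean distance is always a lower bound on this quantity, which is consistent with the lower-bounding claim in the abstract.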
Abstract:
This work proposes a system for the classification of industrial steel pieces by means of a magnetic nondestructive device. The proposed classification system has two main stages: an online stage and an off-line stage. In the online stage, the system classifies inputs and saves misclassification information for later analysis. In the off-line optimization stage, the topology of a Probabilistic Neural Network is optimized by a Feature Selection algorithm combined with the Probabilistic Neural Network to increase the classification rate. The proposed Feature Selection algorithm searches the signal spectrogram by combining three basic elements: a Sequential Forward Selection algorithm, a Feature Cluster Grow algorithm with classification rate gradient analysis, and a Sequential Backward Selection algorithm. In addition, a trash-data recycling algorithm is proposed to obtain the optimal feedback samples selected from the misclassified ones.
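Of the three elements named above, only plain Sequential Forward Selection is sketched here; the Feature Cluster Grow and backward steps are not reproduced. The `score` callable stands in for the classification rate of the Probabilistic Neural Network on a candidate feature subset, and it, the stopping rule, and the toy weights are assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    max_features: int,
) -> List[int]:
    """Greedily add the feature that most improves `score`; stop when none does."""
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        best_feature = None
        best_trial = best_score
        for f in range(n_features):
            if f in selected:
                continue
            trial = score(selected + [f])
            if trial > best_trial:
                best_feature, best_trial = f, trial
        if best_feature is None:  # no remaining feature improves the score
            break
        selected.append(best_feature)
        best_score = best_trial
    return selected


# Toy stand-in for a classifier's accuracy (hypothetical): each feature has a
# usefulness weight, and every added feature costs a fixed penalty.
weights = [0.1, 0.9, 0.6, 0.2]


def toy_score(subset: Sequence[int]) -> float:
    return sum(weights[f] for f in subset) - 0.5 * len(subset)


chosen = sequential_forward_selection(len(weights), toy_score, max_features=3)
```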
A Phase Space Box-counting based Method for Arrhythmia Prediction from Electrocardiogram Time Series
Abstract:
Arrhythmia is a kind of cardiovascular disease that contributes to the number of deaths and can pose irremediable danger. It is a life-threatening condition originating from disorganized propagation of electrical signals in the heart, resulting in desynchronization among the different chambers of the heart. Fundamentally, synchronization means that the phase relationship of electrical activities between the chambers remains coherent, maintaining a constant phase difference over time. If desynchronization occurs due to arrhythmia, the coherent phase relationship breaks down, resulting in a chaotic rhythm that affects the regular pumping mechanism of the heart. This phenomenon was explored using the phase space reconstruction technique, a standard technique for analysing time series data generated by nonlinear dynamical systems. In this project a novel index is presented for predicting the onset of ventricular arrhythmias. Continuously captured long-term ECG recordings were analysed up to the onset of arrhythmia by the phase space reconstruction method, yielding 2-dimensional images that were then analysed by the box-counting method. The method was tested using ECG data of three different kinds, namely normal (NR), Ventricular Tachycardia (VT), and Ventricular Fibrillation (VF), extracted from the Physionet ECG database. Statistical measures such as the mean (μ), standard deviation (σ), and coefficient of variation (σ/μ) of the box count in the phase space diagrams are derived for a sliding window of 10 beats of the ECG signal. From the results of these statistical analyses, a threshold was derived as an upper bound on the Coefficient of Variation (CV) of the box count of ECG phase portraits, which is capable of reliably predicting the impending arrhythmia long before its actual occurrence.
As future work, it is planned to validate this prediction tool over a wider population of patients affected by different kinds of arrhythmia, such as atrial fibrillation and bundle branch block, and to set different thresholds for them, in order to confirm its clinical applicability.
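The pipeline described above can be sketched roughly as follows: a 2-dimensional delay embedding, a count of occupied grid cells, and the coefficient of variation over a sliding window of per-beat counts. The delay `tau`, the grid resolution, the bounding-box normalization, and the use of the population standard deviation are all assumptions; the abstract does not specify them.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def delay_embed(x: Sequence[float], tau: int) -> List[Point]:
    """2-D phase-space reconstruction: points (x[t], x[t + tau])."""
    return [(x[t], x[t + tau]) for t in range(len(x) - tau)]


def box_count(points: Sequence[Point], grid: int) -> int:
    """Number of cells of a grid-by-grid mesh over the bounding box that hold a point."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def cell(v: float, lo: float, hi: float) -> int:
        if hi == lo:  # degenerate axis: everything falls in one column/row
            return 0
        return min(int((v - lo) / (hi - lo) * grid), grid - 1)

    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)
    return len({(cell(px, x_lo, x_hi), cell(py, y_lo, y_hi)) for px, py in points})


def coefficient_of_variation(values: Sequence[float]) -> float:
    """sigma / mu, using the population standard deviation (an assumption)."""
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return sigma / mu


def sliding_cv(beat_counts: Sequence[int], window: int = 10) -> List[float]:
    """CV of per-beat box counts over each window of `window` consecutive beats."""
    return [
        coefficient_of_variation(beat_counts[i:i + window])
        for i in range(len(beat_counts) - window + 1)
    ]
```

In such a scheme, each value returned by `sliding_cv` would be compared against the derived CV threshold; the threshold value itself is not given in the abstract and is left out here.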
Abstract:
The main objective of this thesis is to explore the short- and long-run causality patterns in the finance-growth nexus and the finance-growth-trade nexus before and after the global financial crisis, in the case of Albania. To this end we use quarterly data on real GDP, 13 proxy measures for financial development, and the trade openness indicator for the periods 1998Q1-2013Q2 and 1998Q1-2008Q3. Causality patterns are explored in a VAR-VECM framework. For this purpose we proceed as follows: (i) testing for the integration order of the variables; (ii) cointegration analysis; and (iii) performing Granger causality tests in a VAR-VECM framework. In the finance-growth nexus, empirical evidence suggests a positive long-run relationship between finance and economic growth, with causality running from financial development to economic growth. The global financial crisis seems not to have affected the causality direction in the finance-growth nexus, thus supporting the finance-led growth hypothesis in the long run in the case of Albania. In the finance-growth-trade openness nexus, we find evidence for a positive long-run relationship among the variables, with the causality direction depending on the proxy used for financial development. When the pre-crisis sample is considered, we find evidence for causality running from financial development and trade openness to economic growth. The global financial crisis seems to have somewhat affected the causality direction in the finance-growth-trade nexus, which has become sensitive to the proxy used for financial development. In the short run, empirical evidence suggests a clear unidirectional relationship between finance and growth, with causality mostly running from economic growth to financial development. When we consider the pre-crisis subsample, results are mixed, depending on the proxy used for financial development. The same results are confirmed when trade openness is taken into account.
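For orientation, a vector error correction model of the kind such a framework relies on is usually written as below. The notation is the standard textbook one and is assumed here rather than taken from the thesis: $y_t$ stacks the variables (for example real GDP, one financial development proxy, and trade openness), $\beta$ holds the cointegrating vectors, $\alpha$ the adjustment coefficients, and $\Gamma_i$ the short-run dynamics.

```latex
\Delta y_t = \alpha \beta^{\top} y_{t-1}
           + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i}
           + \varepsilon_t
```

In this form, long-run causality is typically assessed through the significance of the relevant entries of $\alpha$, and short-run Granger causality through the joint significance of the corresponding entries of the $\Gamma_i$.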
Abstract:
Time series are ubiquitous. The acquisition and processing of continuously measured data is found in all areas of the natural sciences, medicine, and finance. The enormous growth in recorded data volumes, whether from automated monitoring systems or integrated sensors, calls for exceptionally fast algorithms in theory and practice. Consequently, this thesis deals with the efficient computation of subsequence alignments. Complex algorithms such as anomaly detection, motif queries, or the unsupervised extraction of prototypical building blocks in time series make heavy use of these alignments. This is the source of the need for fast implementations. The thesis is divided into three approaches that address this challenge. They comprise four alignment algorithms and their parallelization on CUDA-capable hardware, an algorithm for the segmentation of data streams, and a unified treatment of Lie-group-valued time series.
The first contribution is a complete CUDA port of the UCR Suite, the world-leading implementation of subsequence alignment. It includes a new computation scheme for determining local alignment scores using the z-normalized Euclidean distance, which can be used on any parallel hardware that supports fast Fourier transforms. Furthermore, we give a SIMT-compatible implementation of the UCR Suite's lower-bound cascade for the efficient computation of local alignment scores under Dynamic Time Warping. Both CUDA implementations enable computation that is one to two orders of magnitude faster than established methods.
Second, we investigate two linear-time approximations for the elastic alignment of subsequences. On the one hand, we treat a SIMT-compatible relaxation scheme for Greedy DTW and its efficient CUDA parallelization. On the other hand, we introduce a new local distance measure, the Gliding Elastic Match (GEM), which can be computed with the same asymptotic time complexity as Greedy DTW but offers a complete relaxation of the penalty matrix. Further improvements include invariance to trends on the measurement axis and to uniform scaling on the time axis. In addition, an extension of GEM to multi-shape segmentation is discussed and evaluated on motion data. Both CUDA parallelizations achieve runtime improvements of up to two orders of magnitude.
In the literature, the treatment of time series is usually restricted to real-valued measurement data. The third contribution comprises a unified method for treating Lie-group-valued time series. Building on this, distance measures on the rotation group SO(3) and on the Euclidean group SE(3) are treated. Furthermore, memory-efficient representations and group-compatible extensions of elastic measures are discussed.
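To make the notion of a local alignment score under the z-normalized Euclidean distance concrete, here is a naive reference sketch. It is deliberately not the FFT-based CUDA scheme of the thesis: it recomputes every window in O(n·m) time on the CPU, and the function names and the treatment of constant windows are assumptions.

```python
import math
from typing import List, Sequence


def z_normalize(x: Sequence[float]) -> List[float]:
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = sum(x) / len(x)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    if sigma == 0.0:
        # Assumption: a constant window is mapped to all zeros.
        return [0.0] * len(x)
    return [(v - mu) / sigma for v in x]


def distance_profile(query: Sequence[float], series: Sequence[float]) -> List[float]:
    """Z-normalized Euclidean distance of the query to every window of the series."""
    m = len(query)
    q = z_normalize(query)
    profile = []
    for start in range(len(series) - m + 1):
        w = z_normalize(series[start:start + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(q, w))))
    return profile


def best_match(query: Sequence[float], series: Sequence[float]) -> int:
    """Start index of the window with the smallest distance to the query."""
    profile = distance_profile(query, series)
    return min(range(len(profile)), key=profile.__getitem__)
```

Because each window is normalized separately, a query such as `[1, 2, 3]` matches the window `[12, 22, 32]` with distance close to zero, which is the offset and scale invariance that z-normalization is meant to provide.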
Abstract:
The thesis gives a general overview of time series databases and their management systems. Attention is then focused on the DBMS InfluxDB. Finally, a project that implements InfluxDB is presented.
Abstract:
Currently, a variety of linear and nonlinear measures is in use to investigate spatiotemporal interrelation patterns of multivariate time series. Whereas the former are by definition insensitive to nonlinear effects, the latter detect both nonlinear and linear interrelation. In the present contribution we employ a uniform surrogate-based approach, which is capable of disentangling interrelations that significantly exceed random effects and interrelations that significantly exceed linear correlation. The bivariate version of the proposed framework is explored using a simple model allowing for separate tuning of coupling and nonlinearity of interrelation. To demonstrate applicability of the approach to multivariate real-world time series we investigate resting state functional magnetic resonance imaging (rsfMRI) data of two healthy subjects as well as intracranial electroencephalograms (iEEG) of two epilepsy patients with focal onset seizures. The main findings are that for our rsfMRI data interrelations can be described by linear cross-correlation. Rejection of the null hypothesis of linear iEEG interrelation occurs predominantly for epileptogenic tissue as well as during epileptic seizures.
Abstract:
Model-based calibration has gained popularity in recent years as a method to optimize increasingly complex engine systems. However, virtually all model-based techniques are applied to steady-state calibration. Transient calibration is by and large an emerging technology. An important piece of any transient calibration process is the ability to constrain the optimizer to treat the problem as a dynamic one and not as a quasi-static process. The optimized air-handling parameters corresponding to any instant of time must be achievable in a transient sense; this in turn depends on the trajectory of the same parameters over previous time instances. In this work, dynamic constraint models are proposed to translate commanded air-handling parameters into actually achieved ones. These models enable the optimization to be realistic in a transient sense. The air-handling system is treated as a linear second-order system with PD control. Parameters for this second-order system were extracted from real transient data. The model is shown to be the best choice among a list of appropriate candidates such as neural networks and first-order models. The selected second-order model was used in conjunction with transient emission models to predict emissions over the FTP cycle. It is shown that emission predictions based on air-handling parameters predicted by the dynamic constraint model do not differ significantly from the corresponding emissions based on measured air-handling parameters.
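As a rough illustration of how a commanded trajectory could be translated into an achieved one, the sketch below integrates a generic linear second-order system. It assumes the closed-loop form y'' + 2ζωₙ y' + ωₙ² y = ωₙ² u, which is what a PD law with gains Kp = ωₙ² and Kd = 2ζωₙ acting on a pure inertia would give. The parameter values, the time step, and the semi-implicit Euler integration are illustrative assumptions, not the values identified in the work.

```python
from typing import List, Sequence


def second_order_response(
    commanded: Sequence[float],
    dt: float,
    omega_n: float,
    zeta: float,
) -> List[float]:
    """Achieved trajectory for a commanded one under the assumed second-order model."""
    y = commanded[0]  # assumption: the system starts at rest on the first command
    v = 0.0
    achieved = []
    for u in commanded:
        # Proportional term pulls toward the command, derivative term damps motion.
        a = omega_n ** 2 * (u - y) - 2.0 * zeta * omega_n * v
        v += a * dt
        y += v * dt
        achieved.append(y)
    return achieved


# Hypothetical step change in a commanded parameter, sampled every 10 ms.
step = [0.0] + [1.0] * 2000
achieved = second_order_response(step, dt=0.01, omega_n=5.0, zeta=0.7)
```

The achieved value lags the command immediately after the step and settles onto it later, which is the behaviour a quasi-static treatment would miss.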
Abstract:
The original cefepime product was withdrawn from the Swiss market in January 2007 and replaced by a generic 10 months later. The goals of the study were to assess the impact of this cefepime shortage on the use and costs of alternative broad-spectrum antibiotics, on antibiotic policy, and on resistance of Pseudomonas aeruginosa toward carbapenems, ceftazidime, and piperacillin-tazobactam. A generalized regression-based interrupted time series model assessed how much the shortage changed the monthly use and costs of cefepime and of selected alternative broad-spectrum antibiotics (ceftazidime, imipenem-cilastatin, meropenem, piperacillin-tazobactam) in 15 Swiss acute care hospitals from January 2005 to December 2008. Resistance of P. aeruginosa was compared before and after the cefepime shortage. There was a statistically significant increase in the consumption of piperacillin-tazobactam in hospitals with definitive interruption of cefepime supply and of meropenem in hospitals with transient interruption of cefepime supply. Consumption of each alternative antibiotic tended to increase during the cefepime shortage and to decrease when the cefepime generic was released. These shifts were associated with significantly higher overall costs. There was no significant change in hospitals with uninterrupted cefepime supply. The alternative antibiotics for which an increase in consumption showed the strongest association with a progression of resistance were the carbapenems. The use of alternative antibiotics after cefepime withdrawal was associated with a significant increase in piperacillin-tazobactam and meropenem use and in overall costs and with a decrease in susceptibility of P. aeruginosa in hospitals. This warrants caution with regard to shortages and withdrawals of antibiotics.
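A simplified version of an interrupted time series analysis is segmented regression with a single interruption: a baseline level and trend, plus a level change and a trend change after the interruption. The sketch below fits that model by ordinary least squares; it is not the generalized regression model of the study, which also had to account for the later release of the generic. The model form, function names, and the synthetic data in the usage lines are assumptions.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_segmented(y: Sequence[float], t0: int) -> List[float]:
    """OLS fit of y_t = b0 + b1*t + b2*D_t + b3*(t - t0)*D_t, with D_t = 1 for t >= t0.

    Returns [b0, b1, b2, b3]: baseline level, baseline trend,
    level change at the interruption, and trend change after it.
    """
    rows = []
    for t in range(len(y)):
        d = 1.0 if t >= t0 else 0.0
        rows.append([1.0, float(t), d, d * (t - t0)])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic, noise-free monthly series with an interruption at month 6.
monthly_use = [10 + 0.5 * t + ((4 + 1.5 * (t - 6)) if t >= 6 else 0.0) for t in range(12)]
b0, b1, b2, b3 = fit_segmented(monthly_use, t0=6)
```

On real consumption data the residuals are usually autocorrelated, so a plain OLS fit like this one would tend to understate the uncertainty of the estimated level and trend changes.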
Abstract:
Prediction of glycemic profile is an important task for both early recognition of hypoglycemia and enhancement of the control algorithms for optimization of insulin infusion rate. Adaptive models for glucose prediction and recognition of hypoglycemia based on statistical and artificial intelligence techniques are presented.