962 results for Biomedical time series classification


Relevance: 100.00%

Abstract:

Time series classification has been extensively explored in many fields of study. Most methods are based on the historical or current information extracted from data. However, if interest is in a specific future time period, methods that directly relate to forecasts of time series are much more appropriate. An approach to time series classification is proposed based on a polarization measure of forecast densities of time series. By fitting autoregressive models, forecast replicates of each time series are obtained via the bias-corrected bootstrap, and a stationarity correction is considered when necessary. Kernel estimators are then employed to approximate forecast densities, and discrepancies of forecast densities of pairs of time series are estimated by a polarization measure, which evaluates the extent to which two densities overlap. Following the distributional properties of the polarization measure, a discriminant rule and a clustering method are proposed to conduct the supervised and unsupervised classification, respectively. The proposed methodology is applied to both simulated and real data sets, and the results show desirable properties.
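The density-comparison step above can be illustrated with a minimal sketch. It assumes the forecast replicates for each series are already available as plain lists (the autoregressive fit and bias-corrected bootstrap are omitted), and it uses the overlap coefficient, the integral of min(f, g), as a stand-in because the abstract does not define the polarization measure. The function names and the normal-reference bandwidth rule are assumptions of this sketch, not the authors' choices.

```python
import math
import statistics


def gaussian_kde(sample):
    """Return a Gaussian kernel density estimate and its bandwidth.

    Bandwidth: normal-reference rule of thumb (an assumption of this sketch).
    """
    n = len(sample)
    h = 1.06 * statistics.stdev(sample) * n ** (-0.2)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)

    return density, h


def overlap(sample_a, sample_b, grid_size=400):
    """Approximate the integral of min(f, g) with the trapezoidal rule.

    Values near 1 mean the two estimated densities nearly coincide;
    values near 0 mean they barely overlap.
    """
    f, h_a = gaussian_kde(sample_a)
    g, h_b = gaussian_kde(sample_b)
    pad = 4.0 * max(h_a, h_b)  # extend the grid to cover the kernel tails
    lo = min(min(sample_a), min(sample_b)) - pad
    hi = max(max(sample_a), max(sample_b)) + pad
    step = (hi - lo) / grid_size
    ys = [min(f(lo + k * step), g(lo + k * step)) for k in range(grid_size + 1)]
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
```

Two series whose forecast replicates give an overlap close to 1 would be treated as similar, and a small overlap would indicate well-separated forecasts.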

Relevance: 100.00%

Abstract:

In this paper, we propose a novel family of kernels for multivariate time series classification problems. Each time series is approximated by a linear combination of piecewise polynomial functions in a Reproducing Kernel Hilbert Space, using a novel kernel interpolation technique. Using the associated kernel function, a large-margin classification formulation is proposed that can discriminate between two classes. The formulation leads to kernels between two multivariate time series that can be computed efficiently. The kernels have been successfully applied to writer-independent handwritten character recognition.

Relevance: 100.00%

Abstract:

In this paper, we consider the problem of time series classification. Using piecewise linear interpolation, we obtain various novel kernels that can be used with support vector machines to design classifiers capable of deciding the class of a given time series. The approach is general and applicable in many scenarios. We apply the method to the task of online Tamil handwritten character recognition, with promising results.
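As a rough illustration of how piecewise linear interpolation can lead to a kernel, the sketch below resamples each series onto a fixed number of evenly spaced points and takes the inner product of the resampled vectors. This is a simple stand-in, not the kernels derived in the paper; `interpolate`, `resample`, `interp_kernel` and the `n_points` parameter are hypothetical names. Sample times are assumed to be strictly increasing.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Piecewise linear interpolation at time t, clamped at both ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def resample(times, values, n_points):
    """Evaluate the interpolant on n_points evenly spaced times."""
    start, end = times[0], times[-1]
    step = (end - start) / (n_points - 1)
    return [interpolate(times, values, start + k * step) for k in range(n_points)]


def interp_kernel(series_a, series_b, n_points=50):
    """Inner product of two series after resampling each to n_points.

    Each series is a (times, values) pair. Because the result is an inner
    product of fixed-length feature vectors, it is a valid kernel.
    """
    a = resample(series_a[0], series_a[1], n_points)
    b = resample(series_b[0], series_b[1], n_points)
    return sum(x * y for x, y in zip(a, b)) / n_points
```

Resampling lets series of different lengths or sampling rates be compared, which is the property that makes such a kernel usable with a support vector machine.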

Relevance: 100.00%

Abstract:

Time series classification deals with the classification of data that is multivariate in nature, meaning that one or more of the attributes takes the form of a sequence. The notion of similarity or distance used for time series data is significant and affects the accuracy, time, and space complexity of the classification algorithm. Numerous similarity measures exist for time series data, but each has its own disadvantages. Instead of relying on a single similarity measure, our aim is to find a near-optimal solution to the classification problem by combining different similarity measures. In this work, we use genetic algorithms to combine the similarity measures so as to get the best performance. The weight given to each similarity measure evolves over a number of generations to reach the best combination. We test our approach on a number of benchmark time series datasets and present promising results.
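The idea of weighting several distance measures can be sketched as below. The three measures, the function names and the fixed weights are illustrative assumptions; the abstract does not list the measures used, and the genetic search itself is omitted. The `accuracy` function shows the kind of fitness value such a search could maximise.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


# Illustrative set of measures; the paper's actual set is not given here.
MEASURES = (euclidean, manhattan, chebyshev)


def combined_distance(a, b, weights):
    """Weighted sum of the individual distance measures."""
    return sum(w * m(a, b) for w, m in zip(weights, MEASURES))


def classify_1nn(query, train, weights):
    """Label of the training series nearest to query.

    train is a list of (series, label) pairs of equal-length series.
    """
    nearest = min(train, key=lambda item: combined_distance(query, item[0], weights))
    return nearest[1]


def accuracy(weights, train, test):
    """Fraction of test series classified correctly for a given weight vector."""
    hits = sum(classify_1nn(series, train, weights) == label for series, label in test)
    return hits / len(test)
```

A genetic algorithm would treat each weight vector as an individual and use `accuracy` on held-out data as its fitness.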

Relevance: 100.00%

Abstract:

This thesis addresses the problem of information hiding in low-dimensional digital data, focussing on issues of privacy and security in Electronic Patient Health Records (EPHRs). The thesis proposes a new security protocol for EPHRs based on data hiding techniques. It contends that embedding sensitive patient information inside the EPHR is the most appropriate solution currently available for resolving the security issues in EPHRs. Watermarking techniques are applied to one-dimensional time series data such as the electroencephalogram (EEG) to show that they add a level of confidence (in terms of privacy and security) in an individual’s diverse bio-profile (the digital fingerprint of an individual’s medical history), ensure belief that the data being analysed does indeed belong to the correct person, and ensure that it is not being accessed by unauthorised personnel. Embedding information inside single-channel biomedical time series data is more difficult than the standard application to images because of the reduced redundancy. A data hiding approach with a built-in capability to protect against illegal data snooping is developed. The capability of this secure method is enhanced by embedding not just a single message but multiple messages into an example one-dimensional EEG signal. Embedding multiple messages of similar characteristics, for example the identities of clinicians accessing the medical record, helps to create an access log, while embedding multiple messages of dissimilar characteristics into an EPHR enhances confidence in the use of the EPHR. The novel method demonstrated in this thesis, which embeds multiple messages of both similar and dissimilar characteristics into a single-channel EEG, shows how this embedding of data supports the secure implementation and use of the EPHR.

Relevance: 100.00%

Abstract:

The classification of time series data is an interesting problem in the field of data mining. Although several algorithms have been proposed for time series classification, we have developed an innovative algorithm that is computationally fast and, in several cases, more accurate than the 1NN classifier. Our method calculates the fuzzy membership of each test pattern in each class. We have experimented with 6 benchmark datasets and compared our method with the 1NN classifier.

Relevance: 100.00%

Abstract:

A new test of hypothesis for classifying stationary time series based on the bias-adjusted estimators of the fitted autoregressive model is proposed. It is shown theoretically that the proposed test has desirable properties. Simulation results show that when time series are short, the size and power estimates of the proposed test are reasonably good, and thus this test is reliable in discriminating between short-length time series. As the length of the time series increases, the performance of the proposed test improves, but the benefit of bias-adjustment reduces. The proposed hypothesis test is applied to two real data sets: the annual real GDP per capita of six European countries, and quarterly real GDP per capita of five European countries. The application results demonstrate that the proposed test displays reasonably good performance in classifying relatively short time series.

Relevance: 100.00%

Abstract:

This work applies a variety of multilinear function factorisation techniques to extract appropriate features or attributes from high-dimensional multivariate time series for classification. Recently, a great deal of work has centred on designing time series classifiers using increasingly complex feature extraction and machine learning schemes. This paper argues that complex learners and domain-specific feature extraction schemes of this type are not necessarily needed for time series classification, as excellent classification results can be obtained by simply applying a number of existing matrix factorisation or linear projection techniques, which are simple and computationally inexpensive. We highlight this using a geometric separability measure and classification accuracies obtained through experiments on four different high-dimensional multivariate time series datasets. © 2013 IEEE.

Relevance: 100.00%

Abstract:

The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used. In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed, complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger diameter cluster than is truly warranted. We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of triangular inequality, thus making use of most existing indexing and data mining algorithms. We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
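One way to build a complexity-invariant distance is to scale the Euclidean distance by the ratio of the two series' complexities. The abstract does not give a formula, so the sketch below rests on assumptions: the complexity estimate (the length of the line obtained by "stretching out" the series), the function names, and the handling of constant series are choices made for illustration.

```python
import math


def euclidean(q, c):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))


def complexity_estimate(series):
    """Root of the summed squared differences between consecutive points.

    Flat series score 0; series with many large jumps score high.
    """
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))


def cid(q, c):
    """Euclidean distance multiplied by a complexity correction factor >= 1."""
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if high == 0.0:
        factor = 1.0       # both series constant: no correction (sketch choice)
    elif low == 0.0:
        factor = math.inf  # constant vs. non-constant (sketch choice)
    else:
        factor = high / low
    return euclidean(q, c) * factor
```

The factor equals 1 when both series are equally complex and grows as their complexities diverge, which pushes apart pairs that differ in complexity relative to pairs that do not.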

Relevance: 100.00%

Abstract:

Introduction: The acute health effects of heatwaves in a subtropical climate and their impact on emergency departments (EDs) are not well known. The purpose of this study is to examine overt heat-related presentations to EDs associated with heatwaves in Brisbane. Methods: Data were obtained for the summer seasons (December to February) from 2000-2012. Heatwave events were defined as two or more successive days with daily maximum temperature ≥34°C (HWD1) or ≥37°C (HWD2). A Poisson generalised additive model was used to assess the effect of heatwaves on heat-related visits (International Classification of Diseases (ICD) 10 codes T67 and X30; ICD 9 codes 992 and E900.0). Results: Overall, 628 cases presented for heat-related illnesses. The presentations significantly increased on heatwave days based on HWD1 (relative risk (RR) = 4.9, 95% confidence interval (CI): 3.8, 6.3) and HWD2 (RR = 18.5, 95% CI: 12.0, 28.4). The RRs in different age groups ranged between 3-9.2 (HWD1) and 7.5-37.5 (HWD2). High acuity visits significantly increased based on HWD1 (RR = 4.7, 95% CI: 2.3, 9.6) and HWD2 (RR = 81.7, 95% CI: 21.5, 310.0). Average length of stay in ED significantly increased by >1 hour (HWD1) and >2 hours (HWD2). Conclusions: Heatwaves significantly increase ED visits and workload even in a subtropical climate. The degree of impact is directly related to the extent of temperature increases and varies by socio-demographic characteristics of the patients. Heatwave action plans should be tailored according to the population needs and level of vulnerability. EDs should have plans to increase their surge capacity during heatwaves.
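The heatwave definition in the Methods (two or more successive days at or above a temperature threshold) can be expressed as a short routine. This is a minimal sketch; the function name, the `min_run` parameter and the example temperatures are hypothetical, and the Poisson regression step is not shown.

```python
def heatwave_days(daily_max, threshold, min_run=2):
    """Flag each day that belongs to a run of at least min_run
    consecutive days with daily maximum temperature >= threshold."""
    flags = [False] * len(daily_max)
    run_start = None
    # A sentinel below any threshold closes a run that reaches the last day.
    for i, temp in enumerate(list(daily_max) + [float("-inf")]):
        if temp >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags


# Made-up daily maxima in degrees Celsius, for illustration only.
temps = [33, 34, 35, 30, 37, 38, 36, 34]
hwd1 = heatwave_days(temps, 34)  # threshold used for HWD1
hwd2 = heatwave_days(temps, 37)  # threshold used for HWD2
```

With these flags, counts of heat-related visits on heatwave and non-heatwave days can be separated before any model is fitted.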

Relevance: 100.00%

Abstract:

The applicability of ultra-short-term wind power prediction (USTWPP) models is reviewed. The proposed USTWPP method extracts features from historical wind power time series (WPTS) data and classifies every short WPTS into one of several subsets, each well defined by a stationary pattern. All WPTS that cannot be matched to any of the stationary patterns are sorted into the nonstationary-pattern subset. Each WPTS subset needs a USTWPP model specially optimized for it off-line. For on-line application, the pattern of the most recent short WPTS is recognized, and the corresponding prediction model is then called for USTWPP. The validity of the proposed method is verified by simulations.

Relevance: 100.00%

Abstract:

This work proposes a system for classifying industrial steel pieces by means of a magnetic nondestructive device. The proposed classification system has two main stages: an on-line stage and an off-line stage. In the on-line stage, the system classifies inputs and saves misclassification information for later analysis. In the off-line optimization stage, the topology of a Probabilistic Neural Network is optimized by a Feature Selection algorithm combined with the Probabilistic Neural Network to increase the classification rate. The proposed Feature Selection algorithm searches for the signal spectrogram by combining three basic elements: a Sequential Forward Selection algorithm, a Feature Cluster Grow algorithm with classification rate gradient analysis, and a Sequential Backward Selection algorithm. A trash-data recycling algorithm is also proposed to obtain the optimal feedback samples from the misclassified ones.

Relevance: 100.00%

Abstract:

A time series is a sequence of observations made over time. Examples in public health include daily ozone concentrations, weekly admissions to an emergency department or annual expenditures on health care in the United States. Time series models are used to describe the dependence of the response at each time on predictor variables including covariates and possibly previous values in the series. Time series methods are necessary to account for the correlation among repeated responses over time. This paper gives an overview of time series ideas and methods used in public health research.