976 results for time series data
Abstract:
Min/max autocorrelation factor analysis (MAFA) and dynamic factor analysis (DFA) are complementary techniques for analysing short (> 15-25 y), non-stationary, multivariate data sets. We illustrate the two techniques using catch rate (cpue) time-series (1982-2001) for 17 species caught during trawl surveys off Mauritania, with the NAO index, an upwelling index, sea surface temperature, and an index of fishing effort as explanatory variables. Both techniques gave coherent results, the most important common trend being a decrease in cpue during the latter half of the time-series, and the next important being an increase during the first half. A DFA model with SST and UPW as explanatory variables and two common trends gave good fits to most of the cpue time-series. (c) 2004 International Council for the Exploration of the Sea. Published by Elsevier Ltd. All rights reserved.
Abstract:
In this paper, we use time series analysis to evaluate predictive scenarios using search engine transactional logs. Our goal is to develop models for the analysis of searchers’ behaviors over time and investigate whether time series analysis is a valid method for predicting relationships between searcher actions. Time series analysis is a method often used to understand the underlying characteristics of temporal data in order to make forecasts. In this study, we used a Web search engine transactional log and time series analysis to investigate users’ actions. We conducted our analysis in two phases. In the initial phase, we employed a basic analysis and found that 10% of searchers clicked on sponsored links. However, from 22:00 to 24:00, searchers almost exclusively clicked on the organic links, with almost no clicks on sponsored links. In the second and more extensive phase, we used a one-step prediction time series analysis method along with a transfer function method. The period rarely affects navigational and transactional queries, while rates for transactional queries vary during different periods. Our results show that the average length of a searcher session is approximately 2.9 interactions and that this average is consistent across time periods. Most importantly, our findings show that searchers who submit the shortest queries (i.e., in number of terms) click on the highest-ranked results. We discuss implications, including predictive value, and future research.
Abstract:
Financial processes may possess long memory and their probability densities may display heavy tails. Many models have been developed to deal with this tail behaviour, which reflects the jumps in the sample paths. On the other hand, the presence of long memory, which contradicts the efficient market hypothesis, is still an issue for further debate. These difficulties present challenges with the problems of memory detection and modelling the co-presence of long memory and heavy tails. This PhD project aims to respond to these challenges. The first part aims to detect memory in a large number of financial time series on stock prices and exchange rates using their scaling properties. Since financial time series often exhibit stochastic trends, a common form of nonstationarity, strong trends in the data can lead to false detection of memory. We will take advantage of a technique known as multifractal detrended fluctuation analysis (MF-DFA) that can systematically eliminate trends of different orders. This method is based on the identification of scaling of the q-th-order moments and is a generalisation of the standard detrended fluctuation analysis (DFA), which uses only the second moment; that is, q = 2. We also consider the rescaled range R/S analysis and the periodogram method to detect memory in financial time series and compare their results with the MF-DFA. An interesting finding is that short memory is detected for stock prices of the American Stock Exchange (AMEX) and long memory is found present in the time series of two exchange rates, namely the French franc and the Deutsche mark. Electricity price series of the five states of Australia are also found to possess long memory. For these electricity price series, heavy tails are also pronounced in their probability densities. The second part of the thesis develops models to represent short-memory and long-memory financial processes as detected in Part I.
These models take the form of continuous-time AR(∞)-type equations whose kernel is the Laplace transform of a finite Borel measure. By imposing appropriate conditions on this measure, short memory or long memory in the dynamics of the solution will result. A specific form of the models, which has a good MA(∞)-type representation, is presented for the short memory case. Parameter estimation of this type of model is performed via least squares, and the models are applied to the stock prices in the AMEX, which have been established in Part I to possess short memory. By selecting the kernel in the continuous-time AR(∞)-type equations to have the form of the Riemann-Liouville fractional derivative, we obtain a fractional stochastic differential equation driven by Brownian motion. This type of equation is used to represent financial processes with long memory, whose dynamics is described by the fractional derivative in the equation. These models are estimated via quasi-likelihood, namely via a continuous-time version of the Gauss-Whittle method. The models are applied to the exchange rates and the electricity prices of Part I with the aim of confirming their possible long-range dependence established by MF-DFA. The third part of the thesis provides an application of the results established in Parts I and II to characterise and classify financial markets. We will pay attention to the New York Stock Exchange (NYSE), the American Stock Exchange (AMEX), the NASDAQ Stock Exchange (NASDAQ) and the Toronto Stock Exchange (TSX). The parameters from MF-DFA and those of the short-memory AR(∞)-type models will be employed in this classification. We propose the Fisher discriminant algorithm to find a classifier in the two- and three-dimensional spaces of data sets and then provide cross-validation to verify discriminant accuracies. This classification is useful for understanding and predicting the behaviour of different processes within the same market.
The fourth part of the thesis investigates the heavy-tailed behaviour of financial processes which may also possess long memory. We consider fractional stochastic differential equations driven by stable noise to model financial processes such as electricity prices. The long memory of electricity prices is represented by a fractional derivative, while the stable noise input models their non-Gaussianity via the tails of their probability density. A method using the empirical densities and MF-DFA will be provided to estimate all the parameters of the model and simulate sample paths of the equation. The method is then applied to analyse daily spot prices for five states of Australia. Comparisons with the results obtained from the R/S analysis, periodogram method and MF-DFA are provided. The results from fractional SDEs agree with those from MF-DFA, which are based on multifractal scaling, while those from the periodograms, which are based on the second order, seem to underestimate the long memory dynamics of the process. This highlights the need for and usefulness of fractal methods in modelling non-Gaussian financial processes with long memory.
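The abstract describes MF-DFA as scaling analysis of q-th-order moments of detrended fluctuations, with standard DFA as the q = 2 case. A minimal sketch of that fluctuation function follows. It rests on assumptions not stated in the abstract: linear (order-1) detrending, non-overlapping windows taken from the start of the series only, and q > 0. The function names are illustrative and are not the thesis's implementation.

```python
import math


def profile(series):
    """Cumulative sum of deviations from the mean (the DFA 'profile')."""
    mean = sum(series) / len(series)
    out, total = [], 0.0
    for x in series:
        total += x - mean
        out.append(total)
    return out


def linear_residual_variance(segment):
    """Mean squared residual after removing a least-squares line."""
    n = len(segment)
    t_mean = (n - 1) / 2.0
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return sum(
        (y - y_mean - slope * (t - t_mean)) ** 2 for t, y in enumerate(segment)
    ) / n


def fluctuation(series, scale, q=2.0):
    """q-th-order fluctuation function F_q(scale); q = 2 is standard DFA."""
    if q <= 0:
        raise ValueError("this sketch only handles q > 0")
    prof = profile(series)
    n_windows = len(prof) // scale
    if scale < 3 or n_windows < 1:
        raise ValueError("scale must be >= 3 and no longer than the series")
    variances = [
        linear_residual_variance(prof[i * scale:(i + 1) * scale])
        for i in range(n_windows)
    ]
    mean_moment = sum(v ** (q / 2.0) for v in variances) / n_windows
    return mean_moment ** (1.0 / q)


def scaling_exponent(series, scales, q=2.0):
    """Slope of log F_q(s) against log s, fitted by least squares."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(fluctuation(series, s, q)) for s in scales]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx
```

For an uncorrelated series the q = 2 exponent is expected to sit near 0.5, with larger values usually read as a sign of long memory; comparing exponents across several values of q is what makes the analysis multifractal.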
Abstract:
We evaluate the performance of several specification tests for Markov regime-switching time-series models. We consider the Lagrange multiplier (LM) and dynamic specification tests of Hamilton (1996) and Ljung–Box tests based on both the generalized residual and a standard-normal residual constructed using the Rosenblatt transformation. The size and power of the tests are studied using Monte Carlo experiments. We find that the LM tests have the best size and power properties. The Ljung–Box tests exhibit slight size distortions, though tests based on the Rosenblatt transformation perform better than the generalized residual-based tests. The tests exhibit impressive power to detect both autocorrelation and autoregressive conditional heteroscedasticity (ARCH). The tests are illustrated with a Markov-switching generalized ARCH (GARCH) model fitted to the US dollar–British pound exchange rate, with the finding that both autocorrelation and GARCH effects are needed to adequately fit the data.
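For readers unfamiliar with the Ljung–Box test mentioned above, the statistic is Q = n(n + 2) Σ ρ̂ₖ² / (n − k), summed over lags k = 1..h, where ρ̂ₖ is the lag-k sample autocorrelation of the residuals. The sketch below computes Q for a plain list of residuals; it does not construct the generalized or Rosenblatt-transformed residuals studied in the paper, and the function names are illustrative.

```python
def autocorrelation(series, lag):
    """Lag-k sample autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    ck = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return ck / c0


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(residuals)")
    total = sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )
    return n * (n + 2) * total


# A strictly alternating series has strong negative lag-1 autocorrelation.
q_stat = ljung_box_q([1, -1, 1, -1, 1, -1, 1, -1], max_lag=1)
print(q_stat)  # 8.75
```

Under the null hypothesis of no autocorrelation, Q is compared with a chi-squared distribution whose degrees of freedom are the number of lags, reduced by the number of fitted parameters when the residuals come from an estimated model.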
Abstract:
Background: It remains unclear whether it is possible to develop an epidemic forecasting model for transmission of dengue fever in Queensland, Australia. Objectives: To examine the potential impact of El Niño/Southern Oscillation on the transmission of dengue fever in Queensland, Australia and explore the possibility of developing a forecast model of dengue fever. Methods: Data on the Southern Oscillation Index (SOI), an indicator of El Niño/Southern Oscillation activity, were obtained from the Australian Bureau of Meteorology. Numbers of dengue fever cases notified and the numbers of postcode areas with dengue fever cases between January 1993 and December 2005 were obtained from Queensland Health, and relevant population data were obtained from the Australian Bureau of Statistics. A multivariate Seasonal Auto-regressive Integrated Moving Average model was developed and validated by dividing the data file into two datasets: the data from January 1993 to December 2003 were used to construct a model and those from January 2004 to December 2005 were used to validate it. Results: A decrease in the average SOI (i.e., warmer conditions) during the preceding 3–12 months was significantly associated with an increase in the monthly numbers of postcode areas with dengue fever cases (β = −0.038; p = 0.019). Predicted values from the Seasonal Auto-regressive Integrated Moving Average model were consistent with the observed values in the validation dataset (root-mean-square percentage error: 1.93%). Conclusions: Climate variability is directly and/or indirectly associated with dengue transmission, and the development of an SOI-based epidemic forecasting system is possible for dengue fever in Queensland, Australia.
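The validation design above (fit on January 1993 to December 2003, check on January 2004 to December 2005, score by root-mean-square percentage error) can be sketched as follows. The abstract does not give the exact RMSPE formula, so the common definition 100 · sqrt(mean(((observed − predicted) / observed)²)) is assumed here; the data layout and function names are hypothetical, and no model fitting is shown.

```python
import math


def split_by_month(records, last_training_month):
    """Split [((year, month), value), ...] into training and validation sets.

    Records dated on or before last_training_month go to training.
    """
    train = [r for r in records if r[0] <= last_training_month]
    valid = [r for r in records if r[0] > last_training_month]
    return train, valid


def rmspe(observed, predicted):
    """Root-mean-square percentage error, in percent (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = []
    for obs, pred in zip(observed, predicted):
        if obs == 0:
            raise ValueError("percentage error is undefined for observed == 0")
        squared.append(((obs - pred) / obs) ** 2)
    return 100.0 * math.sqrt(sum(squared) / len(squared))


# Placeholder monthly records covering the study period (values are dummies).
records = [((y, m), 0.0) for y in range(1993, 2006) for m in range(1, 13)]
train, valid = split_by_month(records, (2003, 12))
print(len(train), len(valid))  # 132 24
```

Holding out the final two years, rather than a random sample of months, keeps the validation set strictly in the future of the training set, which is what a forecasting system would face in use.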
Abstract:
Objectives: To investigate the effect of hot and cold temperatures on ambulance attendances. Design: An ecological time series study. Setting and participants: The study was conducted in Brisbane, Australia. We collected information on 783 935 daily ambulance attendances, along with data on associated meteorological variables and air pollutants, for the period 2000–2007. Outcome measures: The total number of ambulance attendances was examined, along with those related to cardiovascular, respiratory and other non-traumatic conditions. Generalised additive models were used to assess the relationship between daily mean temperature and the number of ambulance attendances. Results: There were statistically significant relationships between mean temperature and ambulance attendances for all categories. Acute heat effects were found, with a 1.17% (95% CI: 0.86%, 1.48%) increase in total attendances for a 1 °C increase above the threshold (0–1 days lag). Cold effects were delayed and longer lasting, with a 1.30% (0.87%, 1.73%) increase in total attendances for a 1 °C decrease below the threshold (2–15 days lag). Harvesting was observed following initial acute periods of heat effects, but not for cold effects. Conclusions: This study shows that both hot and cold temperatures led to increases in ambulance attendances for different medical conditions. Our findings support the notion that ambulance attendance records are a valid and timely source of data for use in the development of local weather/health early warning systems.
Abstract:
Background: Extreme temperatures are associated with cardiovascular disease (CVD) deaths. Previous studies have investigated the relative CVD mortality risk of temperature, but this risk is heavily influenced by deaths in frail elderly persons. To better estimate the burden of extreme temperatures, we estimated their effects on years of life lost due to CVD. Methods and Results: The data were daily observations on weather and CVD mortality for Brisbane, Australia between 1996 and 2004. We estimated the association between daily mean temperature and years of life lost due to CVD, after adjusting for trend, season, day of the week, and humidity. To examine the non-linear and delayed effects of temperature, a distributed lag non-linear model was used. The model’s residuals were examined to investigate whether there were any added effects due to cold spells and heat waves. The exposure-response curve between temperature and years of life lost was U-shaped, with the lowest years of life lost at 24 °C. The curve had a sharper rise at extremes of heat than of cold. The effect of cold peaked two days after exposure, whereas the greatest effect of heat occurred on the day of exposure. There were significant added effects of heat waves on years of life lost. Conclusions: Increased years of life lost due to CVD are associated with both cold and hot temperatures. Research on specific interventions is needed to reduce temperature-related years of life lost from CVD deaths.
Abstract:
Background: Malaria is a major public health burden in the tropics with the potential to significantly increase in response to climate change. Analyses of data from the recent past can elucidate how short-term variations in weather factors affect malaria transmission. This study explored the impact of climate variability on the transmission of malaria in the tropical rain forest area of Mengla County, south-west China. Methods: Ecological time-series analysis was performed on data collected between 1971 and 1999. Auto-regressive integrated moving average (ARIMA) models were used to evaluate the relationship between weather factors and malaria incidence. Results: At the time scale of months, the predictors for malaria incidence included: minimum temperature, maximum temperature, and fog day frequency. The effect of minimum temperature on malaria incidence was greater in the cool months than in the hot months. The fog day frequency in October had a positive effect on malaria incidence in May of the following year. At the time scale of years, the annual fog day frequency was the only weather predictor of the annual incidence of malaria. Conclusion: Fog day frequency was for the first time found to be a predictor of malaria incidence in a rain forest area. The one-year delayed effect of fog on malaria transmission may involve providing water input and maintaining aquatic breeding sites for mosquitoes in vulnerable times when there is little rainfall in the 6-month dry seasons. These findings should be considered in the prediction of future patterns of malaria for similar tropical rain forest areas worldwide.
Abstract:
A satellite-based observation system can continuously or repeatedly generate a user state vector time series that may contain useful information. One typical example is the collection of International GNSS Service (IGS) station daily and weekly combined solutions. Another example is the epoch-by-epoch kinematic position time series of a receiver derived by a GPS real-time kinematic (RTK) technique. Although some multivariate analysis techniques have been adopted to assess the noise characteristics of multivariate state time series, statistical testing has been limited to univariate time series. After reviewing frequently used hypothesis test statistics in univariate analysis of GNSS state time series, the paper presents a number of T-squared multivariate analysis statistics for use in the analysis of multivariate GNSS state time series. These T-squared test statistics take the correlation between coordinate components into account, which is neglected in univariate analysis. Numerical analysis was conducted with the multi-year time series of an IGS station to schematically demonstrate the results from the multivariate hypothesis testing in comparison with the univariate hypothesis testing results. The results demonstrate that, in general, the testing for multivariate mean shifts and outliers tends to reject fewer data samples than the testing for univariate mean shifts and outliers under the same confidence level. It is noted that neither univariate nor multivariate data analysis methods are intended to replace physical analysis. Instead, these should be treated as complementary statistical methods for a priori or a posteriori investigations. Physical analysis is subsequently necessary to refine and interpret the results.
Abstract:
Most studies examining the temperature–mortality association in a city used temperatures from one site or the average from a network of sites. This may cause measurement error as temperature varies across a city due to effects such as urban heat islands. We examined whether spatiotemporal models using spatially resolved temperatures produced different associations between temperature and mortality compared with time series models that used non-spatial temperatures. We obtained daily mortality data in 163 areas across Brisbane city, Australia from 2000 to 2004. We used ordinary kriging to interpolate spatial temperature variation across the city based on 19 monitoring sites. We used a spatiotemporal model to examine the impact of spatially resolved temperatures on mortality. Also, we used a time series model to examine non-spatial temperatures using a single site and the average temperature from three sites. We used squared Pearson scaled residuals to compare model fit. We found that kriged temperatures were consistent with observed temperatures. Spatiotemporal models using kriged temperature data yielded slightly better model fit than time series models using a single site or the average of three sites' data. Despite this better fit, spatiotemporal and time series models produced similar associations between temperature and mortality. In conclusion, time series models using non-spatial temperatures were equally good at estimating the city-wide association between temperature and mortality as spatiotemporal models.
Abstract:
Background: The association between temperature and mortality has been examined mainly in North America and Europe. However, less evidence is available in developing countries, especially in Thailand. In this study, we examined the relationship between temperature and mortality in Chiang Mai city, Thailand, during 1999–2008. Method: A time series model was used to examine the effects of temperature on cause-specific mortality (non-external, cardiopulmonary, cardiovascular, and respiratory) and age-specific non-external mortality (≤64, 65–74, 75–84, and ≥85 years), while controlling for relative humidity, air pollution, day of the week, season and long-term trend. We used a distributed lag non-linear model to examine the delayed effects of temperature on mortality up to 21 days. Results: We found non-linear effects of temperature on all mortality types and age groups. Both hot and cold temperatures resulted in an immediate increase in all mortality types and age groups. Generally, the hot effects on all mortality types and age groups were short-term, while the cold effects lasted longer. The relative risk of non-external mortality associated with cold temperature (19.35 °C, 1st percentile of temperature) relative to 24.7 °C (25th percentile of temperature) was 1.29 (95% confidence interval (CI): 1.16, 1.44) for lags 0–21. The relative risk of non-external mortality associated with high temperature (31.7 °C, 99th percentile of temperature) relative to 28 °C (75th percentile of temperature) was 1.11 (95% CI: 1.00, 1.24) for lags 0–21. Conclusion: This study indicates that exposure to both hot and cold temperatures was related to increased mortality. Both cold and hot effects occurred immediately, but cold effects lasted longer than hot effects. This study provides useful data for policy makers to better prepare local responses to manage the impact of hot and cold temperatures on population health.
Abstract:
Introduction: The acute health effects of heatwaves in a subtropical climate and their impact on emergency departments (ED) are not well known. The purpose of this study is to examine overt heat-related presentations to EDs associated with heatwaves in Brisbane. Methods: Data were obtained for the summer seasons (December to February) from 2000 to 2012. Heatwave events were defined as two or more successive days with daily maximum temperature ≥34 °C (HWD1) or ≥37 °C (HWD2). A Poisson generalised additive model was used to assess the effect of heatwaves on heat-related visits (International Classification of Diseases (ICD) 10 codes T67 and X30; ICD 9 codes 992 and E900.0). Results: Overall, 628 cases presented for heat-related illnesses. The presentations significantly increased on heatwave days based on HWD1 (relative risk (RR) = 4.9, 95% confidence interval (CI): 3.8, 6.3) and HWD2 (RR = 18.5, 95% CI: 12.0, 28.4). The RRs in different age groups ranged from 3 to 9.2 (HWD1) and from 7.5 to 37.5 (HWD2). High acuity visits significantly increased based on HWD1 (RR = 4.7, 95% CI: 2.3, 9.6) and HWD2 (RR = 81.7, 95% CI: 21.5, 310.0). Average length of stay in ED significantly increased by >1 hour (HWD1) and >2 hours (HWD2). Conclusions: Heatwaves significantly increase ED visits and workload even in a subtropical climate. The degree of impact is directly related to the extent of temperature increases and varies by socio-demographic characteristics of the patients. Heatwave action plans should be tailored according to the population needs and level of vulnerability. EDs should have plans to increase their surge capacity during heatwaves.
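The heatwave definition in the Methods (two or more successive days at or above a daily-maximum threshold of 34 °C for HWD1 or 37 °C for HWD2) translates directly into a run-length rule. The sketch below flags every day that belongs to such a run. The function name and the sample temperatures are made up for illustration, and the abstract does not say how the study handled missing days or season boundaries.

```python
def heatwave_days(daily_max, threshold, min_run=2):
    """Flag days that fall within a run of >= min_run hot days.

    daily_max: daily maximum temperatures in consecutive calendar order.
    A day is 'hot' when its maximum is at or above the threshold.
    """
    flags = [False] * len(daily_max)
    run_start = None
    # A trailing sentinel below any threshold closes a run ending on the last day.
    for i, temp in enumerate(list(daily_max) + [float("-inf")]):
        if temp >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags


# Invented temperatures, for illustration only.
temps = [33, 35, 36, 31, 34, 33, 37, 38, 39]
hwd1 = heatwave_days(temps, threshold=34)
hwd2 = heatwave_days(temps, threshold=37)
print(hwd1)  # [False, True, True, False, False, False, True, True, True]
print(hwd2)  # [False, False, False, False, False, False, True, True, True]
```

The isolated 34 °C day in the middle is not flagged under HWD1 because it does not form a run of two, which is the distinction between a single hot day and a heatwave day in this definition.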
Abstract:
A new test of hypothesis for classifying stationary time series based on the bias-adjusted estimators of the fitted autoregressive model is proposed. It is shown theoretically that the proposed test has desirable properties. Simulation results show that when time series are short, the size and power estimates of the proposed test are reasonably good, and thus this test is reliable in discriminating between short-length time series. As the length of the time series increases, the performance of the proposed test improves, but the benefit of bias-adjustment reduces. The proposed hypothesis test is applied to two real data sets: the annual real GDP per capita of six European countries, and quarterly real GDP per capita of five European countries. The application results demonstrate that the proposed test displays reasonably good performance in classifying relatively short time series.
Abstract:
Time series classification has been extensively explored in many fields of study. Most methods are based on the historical or current information extracted from data. However, if interest is in a specific future time period, methods that directly relate to forecasts of time series are much more appropriate. An approach to time series classification is proposed based on a polarization measure of forecast densities of time series. By fitting autoregressive models, forecast replicates of each time series are obtained via the bias-corrected bootstrap, and a stationarity correction is considered when necessary. Kernel estimators are then employed to approximate forecast densities, and discrepancies of forecast densities of pairs of time series are estimated by a polarization measure, which evaluates the extent to which two densities overlap. Following the distributional properties of the polarization measure, a discriminant rule and a clustering method are proposed to conduct the supervised and unsupervised classification, respectively. The proposed methodology is applied to both simulated and real data sets, and the results show desirable properties.
Abstract:
Objective: Examining the association between socioeconomic disadvantage and heat-related emergency department (ED) visits during heatwave periods in Brisbane, 2000–2008. Methods: Data from 10 public EDs were analysed using a generalised additive model for disease categories, age groups and gender. Results: Cumulative relative risks (RR) for non-external causes other than cardiovascular and respiratory diseases were 1.11 and 1.05 in most and least disadvantaged areas, respectively. The pattern persisted on lags 0–2. Elevated risks were observed for all age groups above 15 years in all areas. However, with RRs of 1.19–1.28, the 65–74 years age group in more disadvantaged areas stood out, compared with RR=1.08 in less disadvantaged areas. This pattern was observed on lag 0 but did not persist. The RRs for male presentations were 1.10 and 1.04 in most and less disadvantaged areas; for females, RR was 1.04 in less disadvantaged areas. This pattern persisted across lags 0–2. Conclusions: Heat-related ED visits increased during heatwaves. However, due to overlapping confidence intervals, variations across socioeconomic areas should be interpreted cautiously. Implications: ED data may be utilised for monitoring heat-related health impacts, particularly on the first day of heatwaves, to facilitate prompt interventions and targeted resource allocation.