7 results for financial time series prediction
in Collection of Biostatistics Research Archive
Abstract:
Visualization and exploratory analysis are an important part of any data analysis and are made more challenging when the data are voluminous and high-dimensional. One such example is environmental monitoring data, which are often collected over time and at multiple locations, resulting in a geographically indexed multivariate time series. Financial data, although not necessarily containing a geographic component, present another source of high-volume multivariate time series data. We present the mvtsplot function, which provides a method for visualizing multivariate time series data. We outline the basic design concepts and provide some examples of its usage by applying it to a database of ambient air pollution measurements in the United States and to a hypothetical portfolio of stocks.
Abstract:
Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence of time-dependent covariates; consequently we use a basis orthogonal to the space spanned by the covariates and use penalized quasi-likelihood (PQL) for estimation. We conclude that Enterococcus counts were greatly reduced near the Nut Island Treatment Plant (NITP) outfalls following the transfer of wastewaters from NITP to the Deer Island Treatment Plant (DITP) and that the transfer of wastewaters from Boston Harbor to the offshore diffusers in Massachusetts Bay reduced the Enterococcus counts near the DITP outfalls.
Abstract:
Multi-site time series studies of air pollution and mortality and morbidity have figured prominently in the literature as comprehensive approaches for estimating acute effects of air pollution on health. Hierarchical models are generally used to combine site-specific information and estimate pooled air pollution effects, taking into account both within-site statistical uncertainty and across-site heterogeneity. Within a site, characteristics of time series data of air pollution and health (small pollution effects, missing data, highly correlated predictors, nonlinear confounding, etc.) make modelling all sources of uncertainty challenging. One potential consequence is underestimation of the statistical variance of the site-specific effects to be combined. In this paper we investigate the impact of variance underestimation on the pooled relative rate estimate. We focus on two-stage normal-normal hierarchical models and on underestimation of the statistical variance at the first stage. By mathematical considerations and simulation studies, we found that variance underestimation does not affect the pooled estimate substantially. However, some sensitivity of the pooled estimate to variance underestimation is observed when the number of sites is small and underestimation is severe. These simulation results are applicable to any two-stage normal-normal hierarchical model for combining information of site-specific results, and they can be easily extended to more general hierarchical formulations. We also examined the impact of variance underestimation on the national average relative rate estimate from the National Morbidity, Mortality, and Air Pollution Study, and we found that variance underestimation of as much as 40% has little effect on the national average.
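To make the pooling step in a two-stage normal-normal model concrete, the following is a minimal sketch and not the authors' code. It assumes the between-site variance `tau2` is known rather than estimated, and the site estimates and variances are hypothetical numbers chosen only for illustration. Each site is weighted by the inverse of its total variance (within-site variance plus `tau2`), and the pooled estimate is recomputed after shrinking every within-site variance by 40%.

```python
from typing import Sequence


def pooled_estimate(estimates: Sequence[float],
                    variances: Sequence[float],
                    tau2: float) -> float:
    """Inverse-variance weighted mean; each weight is 1 / (within-site variance + tau2)."""
    weights = [1.0 / (v + tau2) for v in variances]
    return sum(w * b for w, b in zip(weights, estimates)) / sum(weights)


def pooled_variance(variances: Sequence[float], tau2: float) -> float:
    """Variance of the pooled estimate under the same weights."""
    return 1.0 / sum(1.0 / (v + tau2) for v in variances)


# Hypothetical site-specific estimates and first-stage variances (illustration only).
site_estimates = [0.2, 0.5, 0.1, 0.4]
site_variances = [0.04, 0.09, 0.01, 0.16]
tau2 = 0.02  # assumed known between-site variance

# First-stage variances underestimated by 40%.
underestimated = [0.6 * v for v in site_variances]

full = pooled_estimate(site_estimates, site_variances, tau2)
shrunk = pooled_estimate(site_estimates, underestimated, tau2)
print(f"pooled estimate, stated variances:         {full:.4f}")
print(f"pooled estimate, underestimated variances: {shrunk:.4f}")
```

Two properties of this weighting are worth noting. When all sites share the same within-site variance, the pooled estimate is the simple mean regardless of any common scaling. When `tau2` is zero, scaling all variances by a common factor leaves the weights proportional and the pooled estimate unchanged, although its reported variance shrinks.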
Abstract:
While many time-series studies of ozone and daily mortality identified positive associations, others yielded null or inconclusive results. We performed a meta-analysis of 144 effect estimates from 39 time-series studies, and estimated pooled effects by lags, age groups, cause-specific mortality, and concentration metrics. We compared results to estimates from the National Morbidity, Mortality, and Air Pollution Study (NMMAPS), a time-series study of 95 large U.S. cities from 1987 to 2000. Both meta-analysis and NMMAPS results provided strong evidence of a short-term association between ozone and mortality, with larger effects for cardiovascular and respiratory mortality, the elderly, and current day ozone exposure as compared to other single day lags. In both analyses, results were not sensitive to adjustment for particulate matter and model specifications. In the meta-analysis we found that a 10 ppb increase in daily ozone is associated with a 0.83% (95% confidence interval: 0.53%, 1.12%) increase in total mortality, whereas the corresponding NMMAPS estimate is 0.25% (0.12%, 0.39%). Meta-analysis results were consistently larger than those from NMMAPS, indicating publication bias. Additional publication bias is evident regarding the choice of lags in time-series studies, and the larger heterogeneity in posterior city-specific estimates in the meta-analysis, as compared with NMMAPS.
Abstract:
Granger causality (GC) is a statistical technique used to estimate temporal associations in multivariate time series. Many applications and extensions of GC have been proposed since its formulation by Granger in 1969. Here we control for potentially mediating or confounding associations between time series in the context of event-related electrocorticographic (ECoG) time series. A pruning approach to remove spurious connections and simultaneously reduce the required number of estimations to fit the effective connectivity graph is proposed. Additionally, we consider the potential of adjusted GC applied to independent components as a method to explore temporal relationships between underlying source signals. Both approaches overcome limitations encountered when estimating many parameters in multivariate time-series data, an increasingly common predicament in today's brain mapping studies.
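As a rough illustration of the basic idea behind Granger causality, the sketch below tests whether the past of one series improves prediction of another beyond that series' own past. It is a plain bivariate, lag-1 version with simulated data, not the adjusted or pruned procedure described in the abstract. The function name `granger_f_lag1`, the single lag, and the use of demeaning in place of an intercept are all assumptions made for brevity.

```python
import random
from typing import List, Sequence


def _demean(series: Sequence[float]) -> List[float]:
    mean = sum(series) / len(series)
    return [value - mean for value in series]


def granger_f_lag1(cause: Sequence[float], effect: Sequence[float]) -> float:
    """F statistic comparing two lag-1 least-squares fits for `effect`.

    Restricted model:   effect[t] = a * effect[t-1]
    Unrestricted model: effect[t] = a * effect[t-1] + b * cause[t-1]
    Both series are demeaned first as a simple stand-in for an intercept.
    """
    x = _demean(cause)
    y = _demean(effect)
    target, y_lag, x_lag = y[1:], y[:-1], x[:-1]
    n = len(target)

    syy = sum(v * v for v in y_lag)
    sxx = sum(v * v for v in x_lag)
    sxy = sum(u * v for u, v in zip(x_lag, y_lag))
    syt = sum(u * v for u, v in zip(y_lag, target))
    sxt = sum(u * v for u, v in zip(x_lag, target))

    # Restricted fit: one coefficient, closed form.
    a_r = syt / syy
    rss_r = sum((t - a_r * yl) ** 2 for t, yl in zip(target, y_lag))

    # Unrestricted fit: solve the 2x2 normal equations directly.
    det = syy * sxx - sxy * sxy
    a_u = (syt * sxx - sxy * sxt) / det
    b_u = (syy * sxt - sxy * syt) / det
    rss_u = sum((t - a_u * yl - b_u * xl) ** 2
                for t, yl, xl in zip(target, y_lag, x_lag))

    # One added parameter in the numerator; n - 2 residual degrees of freedom.
    return (rss_r - rss_u) / (rss_u / (n - 2))


# Simulated data: y depends on the previous value of x, but not the reverse.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0.0, 1.0) for t in range(1, 500)]

f_xy = granger_f_lag1(x, y)  # does x help predict y?
f_yx = granger_f_lag1(y, x)  # does y help predict x?
print(f"F (x -> y): {f_xy:.1f}")
print(f"F (y -> x): {f_yx:.1f}")
```

In a multivariate setting with many channels, each such comparison would also condition on the remaining series, which is where the number of parameters grows quickly and motivates the pruning and component-based approaches the abstract describes.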
Abstract:
A time series is a sequence of observations made over time. Examples in public health include daily ozone concentrations, weekly admissions to an emergency department or annual expenditures on health care in the United States. Time series models are used to describe the dependence of the response at each time on predictor variables including covariates and possibly previous values in the series. Time series methods are necessary to account for the correlation among repeated responses over time. This paper gives an overview of time series ideas and methods used in public health research.
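The correlation among repeated responses over time can be illustrated with the lag-1 sample autocorrelation, sketched below under simple assumptions. The function name `lag1_autocorrelation` and the two toy series are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def lag1_autocorrelation(series: Sequence[float]) -> float:
    """Sample autocorrelation between each observation and the one before it."""
    n = len(series)
    mean = sum(series) / n
    centered = [value - mean for value in series]
    numerator = sum(centered[t] * centered[t - 1] for t in range(1, n))
    denominator = sum(c * c for c in centered)
    return numerator / denominator


# A series that flips sign at every step is strongly negatively autocorrelated.
alternating = [1.0, -1.0] * 10
# A steadily rising series is strongly positively autocorrelated.
trend = [float(t) for t in range(20)]

r_alternating = lag1_autocorrelation(alternating)
r_trend = lag1_autocorrelation(trend)
print(f"alternating series: {r_alternating:.2f}")
print(f"trending series:    {r_trend:.2f}")
```

A value far from zero indicates that successive observations are not independent, which is why methods that assume independent responses tend to misstate uncertainty when applied to such data.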