933 results for Data Models
Abstract:
This thesis addresses modeling of financial time series, especially stock market returns and daily price ranges. Modeling data of this kind can be approached with so-called multiplicative error models (MEM). These models nest several well known time series models such as GARCH, ACD and CARR models. They are able to capture many well established features of financial time series including volatility clustering and leptokurtosis. In contrast to these phenomena, different kinds of asymmetries have received relatively little attention in the existing literature. In this thesis asymmetries arise from various sources. They are observed in both conditional and unconditional distributions, for variables with non-negative values and for variables that have values on the real line. In the multivariate context asymmetries can be observed in the marginal distributions as well as in the relationships of the variables modeled. New methods for all these cases are proposed. Chapter 2 considers GARCH models and modeling of returns of two stock market indices. The chapter introduces the so-called generalized hyperbolic (GH) GARCH model to account for asymmetries in both conditional and unconditional distribution. In particular, two special cases of the GARCH-GH model which describe the data most accurately are proposed. They are found to improve the fit of the model when compared to symmetric GARCH models. The advantages of accounting for asymmetries are also observed through Value-at-Risk applications. Both theoretical and empirical contributions are provided in Chapter 3 of the thesis. In this chapter the so-called mixture conditional autoregressive range (MCARR) model is introduced, examined and applied to daily price ranges of the Hang Seng Index. The conditions for the strict and weak stationarity of the model as well as an expression for the autocorrelation function are obtained by writing the MCARR model as a first order autoregressive process with random coefficients. 
The chapter also introduces the inverse gamma (IG) distribution to CARR models. The advantages of CARR-IG and MCARR-IG specifications over conventional CARR models are found in the empirical application both in- and out-of-sample. Chapter 4 discusses the simultaneous modeling of absolute returns and daily price ranges. In this part of the thesis a vector multiplicative error model (VMEM) with an asymmetric Gumbel copula is found to provide substantial benefits over the existing VMEM models based on elliptical copulas. The proposed specification is able to capture the highly asymmetric dependence of the modeled variables, thereby improving the performance of the model considerably. The economic significance of the results obtained is established when the information content of the volatility forecasts derived is examined.
Abstract:
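To make the multiplicative error structure concrete, the following is a minimal simulation sketch, not the thesis's own specification. It assumes the common first-order form x_t = mu_t * eps_t with mu_t = omega + alpha * x_{t-1} + beta * mu_{t-1}, and it assumes unit-mean exponential innovations purely for simplicity (the thesis works with other distributions, such as the inverse gamma). The function name and parameter values are hypothetical.

```python
import random


def simulate_mem(n, omega=0.1, alpha=0.2, beta=0.7, seed=0):
    """Simulate a first-order multiplicative error model (illustrative only).

    x_t  = mu_t * eps_t
    mu_t = omega + alpha * x_{t-1} + beta * mu_{t-1}
    eps_t is i.i.d., non-negative, with unit mean (assumed exponential here).
    """
    rng = random.Random(seed)
    mu = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    x_prev = mu
    xs, mus = [], []
    for _ in range(n):
        mu = omega + alpha * x_prev + beta * mu  # conditional mean recursion
        eps = rng.expovariate(1.0)               # unit-mean innovation
        x_prev = mu * eps                        # non-negative observation
        xs.append(x_prev)
        mus.append(mu)
    return xs, mus


xs, mus = simulate_mem(20000)
sample_mean = sum(xs) / len(xs)
```

With these assumed parameters the unconditional mean is omega / (1 - alpha - beta) = 1, so `sample_mean` should land near 1. Reading x_t as a daily price range gives a CARR-type model; reading it as a squared return gives a GARCH-type recursion.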
This work belongs to the field of computational high-energy physics (HEP). The key methods used in this thesis work to meet the challenges raised by the Large Hadron Collider (LHC) era experiments are object-orientation with software engineering, Monte Carlo simulation, the computer technology of clusters, and artificial neural networks (ANN). The first aspect discussed is the development of hadronic cascade models, used for the accurate simulation of medium-energy hadron-nucleus reactions, up to 10 GeV. These models are typically needed in hadronic calorimeter studies and in the estimation of radiation backgrounds. Various applications outside HEP include the medical field (such as hadron treatment simulations), space science (satellite shielding), and nuclear physics (spallation studies). Validation results are presented for several significant improvements released in the Geant4 simulation tool, and the significance of the new models for computing in the Large Hadron Collider era is estimated. In particular, we estimate the ability of the Bertini cascade to simulate the Compact Muon Solenoid (CMS) hadron calorimeter (HCAL). LHC test beam activity has a tightly coupled cycle of simulation-to-data analysis. Typically, a Geant4 computer experiment is used to understand test beam measurements. Thus another aspect of this thesis is a description of studies related to developing new CMS H2 test beam data analysis tools and performing data analysis on the basis of CMS Monte Carlo events. These events have been simulated in detail using Geant4 physics models, full CMS detector description, and event reconstruction. Using the ROOT data analysis framework we have developed an offline ANN-based approach to tag b-jets associated with heavy neutral Higgs particles, and we show that this kind of neural network methodology can be successfully used to separate the Higgs signal from the background in the CMS experiment.
Abstract:
The problem of time-variant reliability analysis of existing structures subjected to stationary random dynamic excitations is considered. The study assumes that samples of dynamic response of the structure, under the action of external excitations, have been measured at a set of sparse points on the structure. The utilization of these measurements in updating reliability models, postulated prior to making any measurements, is considered. This is achieved by using dynamic state estimation methods which combine results from Markov process theory and Bayes' theorem. The uncertainties present in measurements as well as in the postulated model for the structural behaviour are accounted for. The samples of external excitations are taken to emanate from known stochastic models and allowance is made for ability (or lack of it) to measure the applied excitations. The future reliability of the structure is modeled using expected structural response conditioned on all the measurements made. This expected response is shown to have a time-varying mean and a random component that can be treated as being weakly stationary. For linear systems, an approximate analytical solution for the problem of reliability model updating is obtained by combining theories of discrete Kalman filter and level crossing statistics. For the case of nonlinear systems, the problem is tackled by combining particle filtering strategies with data-based extreme value analysis. In all these studies, the governing stochastic differential equations are discretized using the strong forms of Ito-Taylor's discretization schemes. The possibility of using conditional simulation strategies, when applied external actions are measured, is also considered. The proposed procedures are exemplified by considering the reliability analysis of a few low-dimensional dynamical systems based on synthetically generated measurement data. 
The performance of the procedures developed is also assessed based on a limited amount of pertinent Monte Carlo simulations. (C) 2010 Elsevier Ltd. All rights reserved.
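As an illustration of the measurement-updating step described above for linear systems, here is a minimal scalar Kalman filter sketch. It is not the paper's formulation: the one-dimensional state model, the function name, and all numerical values are assumptions chosen for brevity, and the level crossing statistics that the paper combines with the filter are not shown.

```python
def kalman_filter_1d(measurements, a, q, r, x0, p0):
    """Scalar Kalman filter for x_k = a*x_{k-1} + w_k, y_k = x_k + v_k.

    w_k ~ N(0, q) is process noise and v_k ~ N(0, r) is measurement noise.
    Returns the filtered means and variances after each measurement.
    """
    x, p = x0, p0
    means, variances = [], []
    for y in measurements:
        # Prediction step: propagate the state estimate and its variance.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update step: blend prediction and measurement via the Kalman gain.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        means.append(x)
        variances.append(p)
    return means, variances


# Hypothetical example: ten noise-free readings of a constant state,
# starting from a very vague prior.
means, variances = kalman_filter_1d([1.0] * 10, a=1.0, q=0.0, r=1.0,
                                    x0=0.0, p0=1e6)
```

In this toy setting the posterior variance shrinks with every measurement, which mirrors the idea of a response estimate conditioned on all the measurements made so far.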
Abstract:
Predictions of two popular closed-form models for unsaturated hydraulic conductivity (K) are compared with in situ measurements made in a sandy loam field soil. Whereas the Van Genuchten model estimates were very close to field measured values, the Brooks-Corey model predictions were higher by about one order of magnitude in the wetter range. Estimation of parameters of the Van Genuchten soil moisture characteristic (SMC) equation, however, involves the use of non-linear regression techniques. The Brooks-Corey SMC equation has the advantage of being amenable to application of linear regression techniques for estimation of its parameters from retention data. A conversion technique, whereby known Brooks-Corey model parameters may be converted into Van Genuchten model parameters, is formulated. The proposed conversion algorithm may be used to obtain the parameters of the preferred Van Genuchten model from in situ retention data, without the use of non-linear regression techniques.
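The linear-regression advantage of the Brooks-Corey equation can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: it assumes the standard form Se = (h_b / h)^lambda for suction heads h above the air-entry value h_b, with effective saturation Se already computed from known residual and saturated water contents. The function name and the synthetic data are hypothetical, and the paper's Brooks-Corey to Van Genuchten conversion itself is not reproduced here.

```python
import math


def fit_brooks_corey(heads, saturations):
    """Estimate Brooks-Corey parameters by least squares on log-log data.

    Assumed model (for h > h_b): Se = (h_b / h) ** lam, so that
    ln(Se) = lam * ln(h_b) - lam * ln(h), a straight line in ln(h).
    """
    xs = [math.log(h) for h in heads]
    ys = [math.log(s) for s in saturations]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    lam = -slope                     # pore-size distribution index
    h_b = math.exp(intercept / lam)  # air-entry (bubbling) head
    return lam, h_b


# Synthetic, noise-free retention data generated from known parameters.
true_lam, true_hb = 0.5, 20.0
heads = [30.0, 50.0, 100.0, 200.0, 500.0]
sats = [(true_hb / h) ** true_lam for h in heads]
lam, h_b = fit_brooks_corey(heads, sats)
```

Because the model is linear in log-log space, no iterative non-linear solver is needed; with real field data the fit would be restricted to points drier than the air-entry value.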
Abstract:
A systematic assessment of the submodels of conditional moment closure (CMC) formalism for the autoignition problem is carried out using direct numerical simulation (DNS) data. An initially non-premixed, n-heptane/air system, subjected to a three-dimensional, homogeneous, isotropic, and decaying turbulence, is considered. Two kinetic schemes, (1) a one-step and (2) a reduced four-step reaction mechanism, are considered for chemistry. An alternative formulation is developed for closure of the mean chemical source term
Abstract:
This paper presents a novel algorithm for compression of single-lead electrocardiogram (ECG) signals. The method is based on pole-zero modelling of the Discrete Cosine Transformed (DCT) signal. An extension is proposed to the well-known Steiglitz-McBride algorithm, to model the higher frequency components of the input signal more accurately. This is achieved by weighting the error function minimized by the algorithm to estimate the model parameters. The data compression achieved by the parametric model is further enhanced by Differential Pulse Code Modulation (DPCM) of the model parameters. The method accomplishes a compression ratio in the range of 1:20 to 1:40, which far exceeds those achieved by most of the current methods.
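The DPCM stage mentioned above can be sketched in a few lines. This is a generic illustration, not the paper's coder: it assumes a first-order predictor (the previous reconstructed value) and a uniform quantizer, and the function names, step size, and parameter values are hypothetical.

```python
def dpcm_encode(values, step):
    """Quantize each value's difference from the previous reconstruction."""
    codes = []
    prediction = 0.0
    for v in values:
        code = round((v - prediction) / step)  # integer code for the residual
        codes.append(code)
        prediction += code * step  # mirror the decoder to avoid error build-up
    return codes


def dpcm_decode(codes, step):
    """Rebuild the values by accumulating the dequantized residuals."""
    values = []
    prediction = 0.0
    for code in codes:
        prediction += code * step
        values.append(prediction)
    return values


# Hypothetical, slowly varying model parameters.
params = [1.00, 1.05, 1.12, 1.10, 0.95]
step = 0.01
codes = dpcm_encode(params, step)
decoded = dpcm_decode(codes, step)
max_error = max(abs(a - b) for a, b in zip(params, decoded))
```

When successive parameters are close to each other, the residual codes after the first are small integers that need fewer bits than the raw values, and the reconstruction error stays within half a quantization step.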
Abstract:
Two-dimensional magnetic recording (TDMR) is an emerging technology that aims to achieve areal densities as high as 10 Tb/in^2 using sophisticated 2-D signal-processing algorithms. High areal densities are achieved by reducing the size of a bit to the order of the size of magnetic grains, resulting in severe 2-D intersymbol interference (ISI). Jitter noise due to irregular grain positions on the magnetic medium is more pronounced at these areal densities. Therefore, a viable read-channel architecture for TDMR requires 2-D signal-detection algorithms that can mitigate 2-D ISI and combat noise comprising jitter and electronic components. The partial response maximum likelihood (PRML) detection scheme allows controlled ISI as seen by the detector. With the controlled and reduced span of 2-D ISI, the PRML scheme overcomes practical difficulties such as Nyquist rate signaling required for full response 2-D equalization. As in the case of 1-D magnetic recording, jitter noise can be handled using a data-dependent noise-prediction (DDNP) filter bank within a 2-D signal-detection engine. The contributions of this paper are threefold: 1) we empirically study the jitter noise characteristics in TDMR as a function of grain density using a Voronoi-based granular media model; 2) we develop a 2-D DDNP algorithm to handle the media noise seen in TDMR; and 3) we also develop techniques to design 2-D separable and nonseparable targets for generalized partial response equalization for TDMR. This can be used along with a 2-D signal-detection algorithm. The DDNP algorithm is observed to give a 2.5 dB gain in SNR over uncoded data compared with the noise predictive maximum likelihood detection for the same choice of channel model parameters to achieve a channel bit density of 1.3 Tb/in^2 with media grain center-to-center distance of 10 nm. The DDNP algorithm is observed to give approximately 10% gain in areal density near 5 grains/bit. 
The proposed signal-processing framework can broadly scale to various TDMR realizations and areal density points.
Abstract:
Existing devices for communicating information to computers are bulky, slow to use, or unreliable. Dasher is a new interface incorporating language modelling and driven by continuous two-dimensional gestures, e.g. a mouse, touchscreen, or eye-tracker. Tests have shown that this device can be used to enter text at a rate of up to 34 words per minute, compared with typical ten-finger keyboard typing of 40-60 words per minute. Although the interface is slower than a conventional keyboard, it is small and simple, and could be used on personal data assistants and by motion-impaired computer users.
Abstract:
Revisions of US macroeconomic data are not white noise. They are persistent, correlated with real-time data, and highly variable (around 80% of the volatility observed in US real-time data). Their business cycle effects are examined in an estimated DSGE model extended with both real-time and final data. After implementing a Bayesian estimation approach, the roles of both habit formation and price indexation fall significantly in the extended model. The results show how revision shocks of both output and inflation are expansionary because they occur when real-time published data are too low and the Fed reacts by cutting interest rates. Consumption revisions, by contrast, are countercyclical as consumption habits mirror the observed reduction in real-time consumption. In turn, revisions of the three variables explain 9.3% of changes of output in its long-run variance decomposition.
Abstract:
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine, to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis.
It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new, extremely general, optimization algorithm is proposed - called Relaxation Expectation Maximization (REM) - that may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal, likelihood maxima. REM leads to a natural framework for model size selection; in combination with standard model selection techniques the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis.
The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model lead to new principled algorithms for smoothing and clustering of spike data.