2 resultados para statistical methods
em Coffee Science - Universidade Federal de Lavras
Resumo:
Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease where the heart muscle is partially thickened and blood flow is - potentially fatally - obstructed. It is one of the leading causes of sudden cardiac death in young people. Electrocardiography (ECG) and Echocardiography (Echo) are the standard tests for identifying HCM and other cardiac abnormalities. The American Heart Association has recommended using a pre-participation questionnaire for young athletes instead of ECG or Echo tests due to considerations of cost and time involved in interpreting the results of these tests by an expert cardiologist. Initially we set out to develop a classifier for automated prediction of young athletes’ heart conditions based on the answers to the questionnaire. Classification results and further in-depth analysis using computational and statistical methods indicated significant shortcomings of the questionnaire in predicting cardiac abnormalities. Automated methods for analyzing ECG signals can help reduce cost and save time in the pre-participation screening process by detecting HCM and other cardiac abnormalities. Therefore, the main goal of this dissertation work is to identify HCM through computational analysis of 12-lead ECG. ECG signals recorded on one or two leads have been analyzed in the past for classifying individual heartbeats into different types of arrhythmia as annotated primarily in the MIT-BIH database. In contrast, we classify complete sequences of 12-lead ECGs to assign patients into two groups: HCM vs. non-HCM. The challenges and issues we address include missing ECG waves in one or more leads and the dimensionality of a large feature-set. We address these by proposing imputation and feature-selection methods. We develop heartbeat-classifiers by employing Random Forests and Support Vector Machines, and propose a method to classify full 12-lead ECGs based on the proportion of heartbeats classified as HCM. The results from our experiments show that the classifiers developed using our methods perform well in identifying HCM. Thus the two contributions of this thesis are the utilization of computational and statistical methods for discovering shortcomings in a current screening procedure and the development of methods to identify HCM through computational analysis of 12-lead ECG signals.
Resumo:
This research develops an econometric framework to analyze time series processes with bounds. The framework is general enough that it can incorporate several different kinds of bounding information that constrain continuous-time stochastic processes between discretely-sampled observations. It applies to situations in which the process is known to remain within an interval between observations, by way of either a known constraint or through the observation of extreme realizations of the process. The main statistical technique employs the theory of maximum likelihood estimation. This approach leads to the development of the asymptotic distribution theory for the estimation of the parameters in bounded diffusion models. The results of this analysis present several implications for empirical research. The advantages are realized in the form of efficiency gains, bias reduction and in the flexibility of model specification. A bias arises in the presence of bounding information that is ignored, while it is mitigated within this framework. An efficiency gain arises, in the sense that the statistical methods make use of conditioning information, as revealed by the bounds. Further, the specification of an econometric model can be uncoupled from the restriction to the bounds, leaving the researcher free to model the process near the bound in a way that avoids bias from misspecification. One byproduct of the improvements in model specification is that the more precise model estimation exposes other sources of misspecification. Some processes reveal themselves to be unlikely candidates for a given diffusion model, once the observations are analyzed in combination with the bounding information. A closer inspection of the theoretical foundation behind diffusion models leads to a more general specification of the model. This approach is used to produce a set of algorithms to make the model computationally feasible and more widely applicable. Finally, the modeling framework is applied to a series of interest rates, which, for several years, have been constrained by the lower bound of zero. The estimates from a series of diffusion models suggest a substantial difference in estimation results between models that ignore bounds and the framework that takes bounding information into consideration.