2 results for "machine learning modelli lineari missing data biomarcatori"
in Coffee Science - Universidade Federal de Lavras
Abstract:
With the rapid advance of web service technologies, end-users can conduct various on-line tasks, such as shopping on-line. Usually, end-users compose a set of services to accomplish a task, and need to enter values into services to invoke the composite services. Quite often, users re-visit websites and use services to perform re-occurring tasks. The users are required to enter the same information into various web services to accomplish such re-occurring tasks. However, repetitively typing the same information into services is a tedious job for end-users and can negatively impact user experience. Recent studies have proposed several approaches to help users fill in values to services automatically. However, prior studies mainly suffer from the following drawbacks: (1) limited support for collecting and analyzing user inputs; (2) poor accuracy of filling values to services; (3) not designed for service composition. To overcome the aforementioned drawbacks, we need to maximize the reuse of previous user inputs across services and end-users. In this thesis, we introduce our approaches that prevent end-users from entering the same information into repetitive on-line tasks. More specifically, we improve the process of filling out services in the following four aspects: First, we investigate the characteristics of input parameters. We propose an ontology-based approach to automatically categorize parameters and fill values to the categorized input parameters. Second, we propose a comprehensive framework that incorporates user contexts and usage patterns into the process of filling values to services. Third, we propose an approach for maximizing the value propagation among services and end-users by linking together semantically related parameters and similar end-users. 
Last, we propose a ranking-based framework that ranks a list of previous user inputs for an input parameter to save a user from unnecessary data entries. Our framework learns and analyzes interactions of user inputs and input parameters to rank user inputs for input parameters under different contexts.
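The abstract does not say how the ranking is learned. As a rough illustration only, the sketch below ranks previously entered values for one input parameter by how often they were used in a given context. The class `InputHistory`, its methods, and the frequency-based scoring are all hypothetical stand-ins, not the thesis's framework.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


class InputHistory:
    """Hypothetical store of past user inputs, keyed by (context, parameter)."""

    def __init__(self) -> None:
        self._counts: Dict[Tuple[str, str], Counter] = defaultdict(Counter)

    def record(self, context: str, parameter: str, value: str) -> None:
        # Count one more use of `value` for this parameter in this context.
        self._counts[(context, parameter)][value] += 1

    def rank(self, context: str, parameter: str) -> List[str]:
        # Assumption: more frequently used values are better suggestions.
        counts = self._counts.get((context, parameter), Counter())
        return [value for value, _ in counts.most_common()]


history = InputHistory()
history.record("shopping", "city", "Kingston")
history.record("shopping", "city", "Toronto")
history.record("shopping", "city", "Kingston")
suggestions = history.rank("shopping", "city")
```

A learned ranker would replace the raw counts with a score that also reflects the usage patterns and user similarity mentioned in the abstract.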
Abstract:
Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease in which the heart muscle is partially thickened and blood flow is obstructed, potentially fatally. It is one of the leading causes of sudden cardiac death in young people. Electrocardiography (ECG) and Echocardiography (Echo) are the standard tests for identifying HCM and other cardiac abnormalities. The American Heart Association has recommended using a pre-participation questionnaire for young athletes instead of ECG or Echo tests due to considerations of cost and the time involved in interpreting the results of these tests by an expert cardiologist. Initially we set out to develop a classifier for automated prediction of young athletes’ heart conditions based on the answers to the questionnaire. Classification results and further in-depth analysis using computational and statistical methods indicated significant shortcomings of the questionnaire in predicting cardiac abnormalities. Automated methods for analyzing ECG signals can help reduce cost and save time in the pre-participation screening process by detecting HCM and other cardiac abnormalities. Therefore, the main goal of this dissertation work is to identify HCM through computational analysis of 12-lead ECG. ECG signals recorded on one or two leads have been analyzed in the past for classifying individual heartbeats into different types of arrhythmia as annotated primarily in the MIT-BIH database. In contrast, we classify complete sequences of 12-lead ECGs to assign each patient to one of two groups: HCM vs. non-HCM. The challenges and issues we address include missing ECG waves in one or more leads and the high dimensionality of a large feature set. We address these by proposing imputation and feature-selection methods. We develop heartbeat classifiers by employing Random Forests and Support Vector Machines, and propose a method to classify full 12-lead ECGs based on the proportion of heartbeats classified as HCM. 
The results from our experiments show that the classifiers developed using our methods perform well in identifying HCM. Thus, the two contributions of this thesis are the utilization of computational and statistical methods for discovering shortcomings in a current screening procedure and the development of methods to identify HCM through computational analysis of 12-lead ECG signals.
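The final step described above, labelling a full 12-lead recording from the proportion of its heartbeats classified as HCM, can be sketched as follows. This is a minimal illustration: the function name `classify_recording`, the 0/1 label encoding, and the 0.5 cut-off are assumptions, since the abstract does not state the threshold or how it was chosen.

```python
from typing import Sequence


def classify_recording(beat_labels: Sequence[int], threshold: float = 0.5) -> str:
    """Label a whole recording from per-heartbeat predictions.

    beat_labels: one entry per heartbeat, 1 = classified as HCM, 0 = non-HCM
                 (assumed encoding; e.g. output of a heartbeat classifier).
    threshold:   assumed cut-off on the fraction of HCM heartbeats.
    """
    if not beat_labels:
        raise ValueError("recording has no classified heartbeats")
    hcm_fraction = sum(beat_labels) / len(beat_labels)
    return "HCM" if hcm_fraction >= threshold else "non-HCM"


# Three of four heartbeats were classified as HCM, so the fraction is 0.75.
label = classify_recording([1, 1, 0, 1])
```

In practice the threshold would be tuned on held-out patients so that the recording-level decision balances sensitivity and specificity.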