968 results for Computational prediction
Abstract:
Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease where the heart muscle is partially thickened and blood flow is - potentially fatally - obstructed. It is one of the leading causes of sudden cardiac death in young people. Electrocardiography (ECG) and Echocardiography (Echo) are the standard tests for identifying HCM and other cardiac abnormalities. The American Heart Association has recommended using a pre-participation questionnaire for young athletes instead of ECG or Echo tests due to considerations of cost and time involved in interpreting the results of these tests by an expert cardiologist. Initially we set out to develop a classifier for automated prediction of young athletes’ heart conditions based on the answers to the questionnaire. Classification results and further in-depth analysis using computational and statistical methods indicated significant shortcomings of the questionnaire in predicting cardiac abnormalities. Automated methods for analyzing ECG signals can help reduce cost and save time in the pre-participation screening process by detecting HCM and other cardiac abnormalities. Therefore, the main goal of this dissertation work is to identify HCM through computational analysis of 12-lead ECG. ECG signals recorded on one or two leads have been analyzed in the past for classifying individual heartbeats into different types of arrhythmia as annotated primarily in the MIT-BIH database. In contrast, we classify complete sequences of 12-lead ECGs to assign patients into two groups: HCM vs. non-HCM. The challenges and issues we address include missing ECG waves in one or more leads and the dimensionality of a large feature-set. We address these by proposing imputation and feature-selection methods. We develop heartbeat-classifiers by employing Random Forests and Support Vector Machines, and propose a method to classify full 12-lead ECGs based on the proportion of heartbeats classified as HCM. 
The results from our experiments show that the classifiers developed using our methods perform well in identifying HCM. Thus the two contributions of this thesis are the utilization of computational and statistical methods for discovering shortcomings in a current screening procedure and the development of methods to identify HCM through computational analysis of 12-lead ECG signals.
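The recording-level decision described above (labelling a full 12-lead ECG from the proportion of heartbeats classified as HCM) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `classify_recording`, the 0/1 label encoding, and the 0.5 cut-off are all assumptions, since the abstract does not state the threshold used.

```python
from typing import Sequence


def classify_recording(beat_labels: Sequence[int], threshold: float = 0.5) -> str:
    """Label a whole ECG recording from its per-heartbeat predictions.

    beat_labels: one entry per heartbeat, 1 = beat classified as HCM,
                 0 = beat classified as non-HCM (hypothetical encoding).
    threshold:   minimum fraction of HCM beats needed to call the
                 recording HCM (assumed value, not taken from the source).
    """
    if len(beat_labels) == 0:
        raise ValueError("recording contains no classified heartbeats")
    hcm_fraction = sum(beat_labels) / len(beat_labels)
    return "HCM" if hcm_fraction >= threshold else "non-HCM"
```

In practice the per-beat labels would come from a heartbeat classifier such as a Random Forest or SVM, and the threshold would be tuned on held-out patients.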
Abstract:
Computer modeling has recently become an inseparable complement to experimental studies for optimizing automotive engines and developing future fuels. Traditionally, computer models rely on simplified global reaction steps to simulate combustion and pollutant formation inside the internal combustion engine. With the current interest in advanced combustion modes and injection strategies, this approach depends on arbitrary adjustment of model parameters, which can reduce the credibility of the predictions. The purpose of this study is to enhance the combustion model of KIVA, a computational fluid dynamics code, by coupling its fluid mechanics solution with detailed kinetic reactions solved by the chemistry solver CHEMKIN. To this end, an engine-friendly reaction mechanism for n-heptane was selected to simulate diesel oxidation. Each cell in the computational domain is treated as a perfectly stirred reactor that undergoes adiabatic constant-volume combustion. The model was applied to ideally prepared homogeneous-charge compression-ignition (HCCI) combustion and to direct-injection (DI) diesel combustion. Ignition and combustion results show that the code successfully simulates the premixed HCCI scenario when compared to traditional combustion models. Direct-injection cases, on the other hand, are not predicted reliably, mainly because the perfectly stirred reactor formulation lacks a turbulent-mixing model. In addition, the model is sensitive to intake conditions and experimental uncertainties, which calls for enhanced predictive tools. It is recommended that future improvements consider turbulent-mixing effects as well as optimization techniques to simulate the actual in-cylinder process accurately at reduced computational cost. Furthermore, the model requires existing fuel oxidation mechanisms to be extended with pollutant formation kinetics for emission control studies.
Abstract:
Phosphorylation is among the most crucial and well-studied post-translational modifications. It is involved in multiple cellular processes, which makes phosphorylation prediction vital for understanding protein functions. However, wet-lab techniques are labour- and time-intensive, so computational tools are needed for efficiency. Because new protein sequences are discovered at a greater rate than protein structures are determined, this project aims to provide a novel way to predict phosphorylation sites from protein sequences by adding flexibility and the Sezerman Grouping amino acid similarity measure to previous methods. The predictor, NOPAY, relies on Support Vector Machines (SVMs) for classification. The features include amino acid encoding, amino acid grouping, predicted secondary structure, predicted protein disorder, predicted protein flexibility, solvent accessibility, hydrophobicity and volume. As a result, phosphorylation prediction accuracy improved by 3% for Homo sapiens and by 6.1% for Mus musculus. On independent test sets, sensitivity at 99% specificity also increased by 6% for Homo sapiens and by 5% for Mus musculus. When enough data are available, future versions of the software may also be able to make predictions for other organisms.
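The "sensitivity at 99% specificity" metric quoted above can be computed from classifier scores as in the sketch below. It is an illustration under stated assumptions rather than the authors' evaluation code: the function name, the convention that higher scores mean "phosphorylated", and the rule that a site is called positive only when its score is strictly above the threshold are all hypothetical choices.

```python
import numpy as np


def sensitivity_at_specificity(scores, labels, specificity=0.99):
    """Sensitivity at the lowest score threshold that keeps at least the
    requested fraction of negatives at or below the threshold.

    scores: classifier decision values (higher = more likely positive).
    labels: 1 for true phosphorylation sites, 0 for non-sites.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    negatives = np.sort(scores[labels == 0])
    positives = scores[labels == 1]
    if negatives.size == 0 or positives.size == 0:
        raise ValueError("both classes must be present")
    # Number of negatives that must score at or below the threshold.
    # The small epsilon guards against floating-point round-up in ceil.
    k = int(np.ceil(specificity * negatives.size - 1e-9))
    threshold = negatives[k - 1] if k > 0 else -np.inf
    # A site is predicted positive only if it scores strictly above it.
    return float(np.mean(positives > threshold))
```

Fixing specificity this way makes two predictors comparable at the same false-positive budget, which is presumably why the abstract reports the metric alongside plain accuracy.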
Abstract:
In this work, the existing understanding of flame spread dynamics is enhanced through an extensive study of the heat transfer from flames spreading vertically upwards across 5 cm wide, 20 cm tall samples of extruded poly(methyl methacrylate) (PMMA). These experiments have provided highly spatially resolved measurements of flame-to-surface heat flux and material burning rate at the critical length scale of interest, with a level of accuracy and detail unmatched by previous empirical or computational studies. Using these measurements, a wall flame model was developed that describes a flame's heat feedback profile (both in the continuous flame region and in the thermal plume above) solely as a function of material burning rate. Additional experiments were conducted to measure flame heat flux and sample mass loss rate as flames spread vertically upwards over the surface of seven other commonly used polymers, two of which are glass-reinforced composite materials. Using these measurements, our wall flame model has been generalized such that it can predict heat feedback from flames supported by a wide range of materials. For the seven materials tested here, which present a varied range of burning behaviors including dripping, polymer melt flow, sample burnout, and heavy soot formation, model-predicted flame heat flux has been shown to match experimental measurements (taken across the full length of the flame) with an average accuracy of 3.9 kW m⁻² (approximately 10-15% of peak measured flame heat flux). This flame model has since been coupled with a powerful solid-phase pyrolysis solver, ThermaKin2D, which computes the transient rate of gaseous fuel production of constituents of a pyrolyzing solid in response to an external heat flux, based on fundamental physical and chemical properties. Together, this unified model captures the two fundamental controlling mechanisms of upward flame spread: gas-phase flame heat transfer and solid-phase material degradation.
This has enabled simulations of flame spread dynamics with a reasonable computational cost and accuracy beyond that of current models. This unified model of material degradation provides the framework to quantitatively study material burning behavior in response to a wide range of common fire scenarios.
Abstract:
The central motif of this work is prediction and optimization in the presence of multiple interacting intelligent agents. We use the phrase 'intelligent agents' to imply, in some sense, a 'bounded rationality', the exact meaning of which varies depending on the setting. Our agents may not be 'rational' in the classical game-theoretic sense, in that they do not always optimize a global objective. Rather, they rely on heuristics, as is natural for human agents or even software agents operating in the real world. Within this broad framework, we study the problem of influence maximization in social networks, where the behavior of agents is myopic but complications stem from the structure of the interaction networks. In this setting, we generalize two well-known models and give new algorithms and hardness results for our models. We then move on to models where the agents reason strategically but are faced with considerable uncertainty. For such games, we give a new solution concept and analyze a real-world game using our techniques. Finally, the richest model we consider is that of Network Cournot Competition, which deals with strategic resource allocation in hypergraphs, where agents reason strategically and their interaction is specified indirectly via the players' utility functions. For this model, we give the first equilibrium computability results. In all of the above problems, we assume that payoffs for the agents are known. However, for real-world games, obtaining the payoffs can be quite challenging. To this end, we also study the inverse problem of inferring payoffs given the game history. We propose and evaluate a data-analytic framework and show that it is fast and performant.
Abstract:
The aim of this thesis is to test the ability of several correlative models, namely Alpert's correlations of 1972 (re-examined in 2011), the investigation of Heskestad and Delichatsios in 1978, and the correlations produced by Cooper in 1982, to define both the dynamic and the thermal characteristics of a fire-induced ceiling-jet flow. The flow occurs when the fire plume impinges on the ceiling and spreads radially from the fire axis. Both temperature and velocity predictions are decisive for the positioning of sprinklers and fire alarms, for the positions and activation times of heat and smoke detectors, and for back-layering predictions. The correlative models are compared with the 3D numerical simulation software CFAST in terms of temperature and velocity near the ceiling. These results are also compared with a Computational Fluid Dynamics (CFD) analysis using ANSYS FLUENT.
Abstract:
In this thesis, a meshfree method with distance fields is applied to create a novel computational approach that allows realistic geometric models of the microstructure to be included and frees Finite Element Analysis (FEA) from its dependence on, and the limitations of, meshing fine microstructural features such as splats and porosity. Manufacturing processes for ceramics produce materials with a complex porous microstructure. The geometry, size and location of the pores substantially affect the macro-scale physical properties of the material. The complex structure and geometry of the pores severely limit the application of modern Finite Element Analysis methods because these require the construction of spatial grids (meshes) that conform to the geometric shape of the structure. As a result, there are virtually no effective tools available for predicting the overall mechanical and thermal properties of porous materials based on their microstructure. The approach of this thesis handles and controls the geometric and physical computational models separately and combines them seamlessly at solution run time. Using the proposed approach, we determine the effective thermal conductivity tensor of real porous ceramic materials featuring both isotropic and anisotropic thermal properties. This work involved the development and implementation of numerical algorithms, data structures, and software.
Abstract:
The length of stay of preterm infants in a neonatology service has become an issue of growing concern, considering, on the one hand, the health conditions of mothers and infants and, on the other hand, the scarce resources of healthcare facilities. Thus, a proactive strategy for problem solving has to be put in place, either to improve the quality of service provided or to reduce the inherent financial costs. Therefore, this work focuses on the development of a diagnosis decision support system in terms of a formal agenda built on a Logic Programming approach to knowledge representation and reasoning, complemented with a case-based problem-solving methodology to computing, that caters for the handling of incomplete, unknown, or even contradictory information. The proposed model has been quite accurate in predicting the length of stay (overall accuracy of 84.9%) while reducing the computational time by around 21.3%.
Abstract:
It is well known that the dimensions of the pelvic bones depend on gender and vary with the age of the individual. This work focuses on the development of an intelligent decision support system to predict an individual's age based on pelvic dimension criteria. On the one hand, some basic image processing techniques were applied to extract the relevant features from pelvic X-rays. On the other hand, the computational framework presented here was built on top of a Logic Programming approach to knowledge representation and reasoning that caters for the handling of incomplete, unknown, or even self-contradictory information, complemented with a case-based approach to computing.
Abstract:
Modern scientific discoveries are driven by an insatiable demand for computational resources. High-Performance Computing (HPC) systems aggregate computing power to deliver considerably higher performance than a typical desktop computer can provide, in order to solve large problems in science, engineering, or business. An HPC room in a datacenter is a complex controlled environment that hosts thousands of computing nodes consuming electrical power in the range of megawatts, which is completely transformed into heat. Although a datacenter contains sophisticated cooling systems, our studies provide quantitative evidence of thermal bottlenecks in real-life production workloads, showing significant spatial and temporal thermal and power heterogeneity. Therefore, minor thermal issues or anomalies can potentially start a chain of events that leads to an imbalance between the amount of heat generated by the computing nodes and the heat removed by the cooling system, giving rise to thermal hazards. Although thermal anomalies are rare events, detecting or predicting them in time is vital to avoid damage to IT and facility equipment and outages of the datacenter, with severe societal and business losses. For this reason, automated approaches to detect thermal anomalies in datacenters have considerable potential. This thesis analyzes and characterizes the power and thermal characteristics of a Tier0 datacenter (CINECA) during production and under abnormal thermal conditions. Then, a Deep Learning (DL)-powered thermal hazard prediction framework is proposed. The proposed models are validated against real thermal hazard events reported for the studied HPC cluster while in production. To the best of my knowledge, this thesis is the first empirical study of thermal anomaly detection and prediction techniques on a real large-scale HPC system.
For this thesis, I used a large-scale dataset comprising monitoring data from tens of thousands of sensors over around 24 months, sampled at intervals of around 20 seconds.
Abstract:
Spectral sensors are a wide class of devices that are extremely useful for detecting essential information about the environment and materials with a high degree of selectivity. Recently, they have reached levels of integration and implementation costs low enough to suit fast, small, and non-invasive monitoring systems. However, the useful information is hidden in the spectra and is difficult to decode, so mathematical algorithms are needed to infer the values of the variables of interest from the acquired data. Among the different families of predictive modeling, Principal Component Analysis and the techniques derived from it can provide very good performance with small computational and memory requirements. For these reasons, they allow the prediction to be implemented even in embedded and autonomous devices. In this thesis, I present four practical applications of these algorithms to the prediction of different variables: moisture of soil, moisture of concrete, freshness of anchovies/sardines, and concentration of gases. In all of these cases, the workflow is the same. Initially, an acquisition campaign is performed to acquire both spectra and the variables of interest from samples. These data are then used as input for creating the prediction models, which solve both classification and regression problems. From these models, an array of calibration coefficients is derived and used to implement the prediction in an embedded system. The presented results show that this workflow was successfully applied to very different scientific fields, yielding autonomous and non-invasive devices able to predict the value of chosen physical parameters from new spectral acquisitions.
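The workflow above (spectra in, an array of calibration coefficients out, prediction on the device as a dot product) can be illustrated with principal component regression, one technique derived from PCA. The abstract does not say which PCA-based method was used, so the choice of PCR here, as well as the names `fit_pcr` and `predict`, are assumptions made for this sketch.

```python
import numpy as np


def fit_pcr(spectra, target, n_components):
    """Principal component regression.

    spectra: array of shape (n_samples, n_wavelengths).
    target:  array of shape (n_samples,), the variable of interest.
    Returns (coef, intercept): one calibration coefficient per
    wavelength plus an offset, ready to be stored on a device.
    """
    X = np.asarray(spectra, dtype=float)
    y = np.asarray(target, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions (loadings) of the spectra.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T            # (n_wavelengths, k)
    pc_scores = Xc @ loadings                 # (n_samples, k)
    # Ordinary least squares in the reduced score space.
    gamma, *_ = np.linalg.lstsq(pc_scores, y - y_mean, rcond=None)
    coef = loadings @ gamma                   # back to wavelength space
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def predict(spectrum, coef, intercept):
    """On-device prediction: a single dot product plus an offset."""
    return float(np.asarray(spectrum, dtype=float) @ coef + intercept)
```

Collapsing the model into one coefficient per wavelength is what keeps the embedded side cheap: the device needs no PCA code, only a multiply-accumulate loop.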
Abstract:
Hematological cancers are a heterogeneous family of diseases that can be divided into leukemias, lymphomas, and myelomas, often called "liquid tumors". Since they cannot be surgically removed, chemotherapy represents the mainstay of their treatment. However, chemotherapy still faces several challenges, such as drug resistance and low response rates, and the need for new anticancer agents is compelling. The drug discovery process is long, costly, and prone to high failure rates. With the rapid expansion of biological and chemical "big data", computational techniques such as machine learning tools have been increasingly employed to speed up and economize the whole process. Machine learning algorithms can create complex models that aim to determine the biological activity of compounds against several targets based on their chemical properties. These models are known as multi-target Quantitative Structure-Activity Relationship (mt-QSAR) models and can be used to virtually screen small and large chemical libraries to identify new molecules with anticancer activity. The aim of my Ph.D. project was to employ machine learning techniques to build an mt-QSAR classification model for the prediction of cytotoxic drugs simultaneously active against 43 hematological cancer cell lines. For this purpose, I first constructed a large and diversified dataset of molecules extracted from the ChEMBL database. Then, I compared the performance of different ML classification algorithms and identified Random Forest as the one returning the best predictions. Finally, I used different approaches to maximize the performance of the model, which achieved an accuracy of 88% by correctly classifying 93% of inactive molecules and 72% of active molecules in a validation set. This model was further applied to the virtual screening of a small dataset of molecules tested in our laboratory, where it classified all molecules correctly (100% accuracy).
This result is confirmed by our previous in vitro experiments.
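The three validation figures quoted above are linked: overall accuracy is the class-weighted average of the per-class rates. The sketch below makes that relation explicit and back-solves for the class balance. It assumes that the 93% and 72% are per-class recall on the same validation set as the 88% accuracy and that the rounded percentages are exact; the function name is hypothetical, and the resulting fraction is an inference from those figures, not a number reported in the abstract.

```python
def implied_active_fraction(accuracy, inactive_recall, active_recall):
    """Solve accuracy = inactive_recall * (1 - p) + active_recall * p for p,
    the fraction of active molecules in the evaluated set."""
    if inactive_recall == active_recall:
        raise ValueError("class balance is not identifiable when recalls are equal")
    return (inactive_recall - accuracy) / (inactive_recall - active_recall)


# With the abstract's rounded figures this gives roughly 0.24, i.e. about
# one active molecule in four under the stated assumptions.
p_active = implied_active_fraction(0.88, 0.93, 0.72)
```

The imbalance matters when reading the headline accuracy: with inactive molecules in the majority, the 88% is driven mostly by the 93% inactive-class rate.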
Abstract:
Previous earthquakes have shown that shear wall damage can lead to catastrophic failure of reinforced concrete buildings. The lateral load capacity of shear walls needs to be estimated to minimize the associated losses during catastrophic events; hence it is necessary to develop and validate reliable and stable numerical methods able to converge to reasonable estimates with minimum computational effort. The 1-D beam-column line element with a fiber-type cross-section model is a practical option that yields results in agreement with experimental data. However, this model has shortcomings in predicting the local damage response, because it requires fine calibration of material properties to overcome regularization and size effects. To reduce the mesh dependency of the numerical model, a regularization method based on the concept of post-yield energy is applied in this work to both the concrete and the steel constitutive laws to predict the nonlinear cyclic response and failure mechanism of concrete shear walls. Different categories of wall specimens, known to respond differently under in-plane cyclic loading because of their varied geometric and detailing characteristics, are considered in this study, namely: 1) scaled wall specimens designed according to the European seismic design code and 2) unique full-scale wall specimens detailed according to the U.S. design code to develop ductile behavior under cyclic loading. To test the limits of applicability of the proposed method, two full-scale walls with a mixed shear-flexure response and different values of applied axial load are also considered. The results of this study show that the use of regularized constitutive models considerably enhances the response-prediction capabilities of the model with regard to global force-drift response and failure mode. The simulations presented in this thesis demonstrate that the proposed model is a valuable tool for researchers and engineers.
Abstract:
New DNA-based predictive tests for physical characteristics and inference of ancestry are highly informative tools that are being increasingly used in forensic genetic analysis. Two eye colour prediction models, the Bayesian classifier Snipper and a multinomial logistic regression (MLR) system for the Irisplex assay, have been described for the analysis of unadmixed European populations. Since multiple SNPs in combination contribute in varying degrees to eye colour predictability in Europeans, it is likely that these predictive tests will perform differently in admixed populations that have European co-ancestry than in unadmixed Europeans. In this study we examined 99 individuals from two admixed South American populations, comparing eye colour with ancestry in order to reveal a direct correlation of light eye colour phenotypes with European co-ancestry in admixed individuals. Additionally, six eye colour prediction models, using varying numbers of SNPs and based on Snipper and MLR, were applied to the study populations. Furthermore, patterns of eye colour prediction have been inferred for a set of publicly available admixed and globally distributed populations from the HGDP-CEPH panel and 1000 Genomes databases, with a special emphasis on admixed American populations similar to those of the study samples.
Abstract:
Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a resolving power of ca. 500,000 and a mass accuracy better than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M − H]⁻ ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g⁻¹. To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method appears to be the most appropriate for selecting important variables, reducing the number of variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g⁻¹. By reducing the size of the data, it was possible to relate the selected variables to their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
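The root mean square error of prediction (RMSEP) quoted above is the square root of the mean squared difference between reference and predicted values on samples held out from calibration. A minimal sketch follows; the function name and the example numbers in the usage note are illustrative only and are not data from the study.

```python
import math
from typing import Sequence


def rmsep(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error of prediction, in the units of the inputs
    (mg of KOH per gram for TAN)."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

For instance, `rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])` evaluates the square root of 4/3. Because RMSEP carries the units of the response, the reported 0.32 can be read directly against the 0.06 to 3.61 TAN range of the samples.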