39 results for Data Mining and Machine Learning
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
This paper investigates machine learning (ML) classification techniques for the problem of flash flood nowcasting. We have been attempting to build a Wireless Sensor Network (WSN) to collect measurements from a river located in an urban area. The ML classification methods were investigated with the aim of enabling flash flood nowcasting, which in turn allows the WSN to issue alerts to the local population. We evaluated several types of ML, taking into account the different nowcasting stages (i.e., the number of future time steps to forecast). We also evaluated different data representations to be used as input to the ML techniques. The results show that different data representations can lead to significantly better results at different stages of nowcasting.
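One way to picture the "data representation" and "nowcasting stage" choices described above is a sliding-window encoding of the sensor series. The sketch below is an illustration only and not the authors' pipeline: the function name `make_windows`, the flood `threshold`, and the sample water levels are all hypothetical.

```python
from typing import List, Tuple


def make_windows(
    levels: List[float], n_lags: int, horizon: int, threshold: float
) -> List[Tuple[List[float], int]]:
    """Turn a water-level series into (features, label) pairs.

    features: the last `n_lags` readings up to and including time t.
    label: 1 if the reading `horizon` steps after t exceeds `threshold`, else 0.
    """
    samples = []
    # t is the index of the most recent reading available to the classifier.
    for t in range(n_lags - 1, len(levels) - horizon):
        features = levels[t - n_lags + 1 : t + 1]
        label = 1 if levels[t + horizon] > threshold else 0
        samples.append((features, label))
    return samples


# Hypothetical river levels in metres, one reading per time step.
levels = [0.5, 0.6, 0.8, 1.4, 2.1, 1.9, 1.2]
samples = make_windows(levels, n_lags=3, horizon=2, threshold=1.5)
```

Changing `n_lags` alters the input representation, while changing `horizon` corresponds to a different nowcasting stage; any classifier could then be trained on the resulting pairs.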
Abstract:
Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
Abstract:
Finite element (FE) analysis is an important computational tool in biomechanics. However, its adoption into clinical practice has been hampered by its computational complexity and by the high level of technical competence it requires of clinicians. In this paper we propose a supervised learning approach to predict the outcome of the FE analysis. We demonstrate our approach on clinical CT and X-ray femur images for FE predictions (FEP), with features extracted, respectively, from a statistical shape model and from 2D-based morphometric and density information. In leave-one-out experiments and sensitivity analysis on a database of 89 clinical cases, our method predicts the distribution of stress values for a walking loading condition with an average correlation coefficient of 0.984 and 0.976 for CT and X-ray images, respectively. These findings suggest that supervised learning approaches have the potential to leverage the clinical integration of mechanical simulations for the treatment of musculoskeletal conditions.
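The leave-one-out protocol and the correlation score mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the single-feature least-squares model is a stand-in for whatever regressor the authors used, and the feature and stress values are made up.

```python
import math
from typing import Callable, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs: List[float], ys: List[float]) -> Callable[[float], float]:
    """Ordinary least squares with one feature; returns a predictor function."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum(
        (a - mx) ** 2 for a in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def leave_one_out(xs: List[float], ys: List[float]) -> List[float]:
    """Predict each case with a model trained on all the other cases."""
    preds = []
    for i in range(len(xs)):
        model = fit_line(xs[:i] + xs[i + 1 :], ys[:i] + ys[i + 1 :])
        preds.append(model(xs[i]))
    return preds


# Hypothetical data: one shape feature per case and the simulated stress value.
features = [1.0, 2.0, 3.0, 4.0, 5.0]
stress = [2.1, 3.9, 6.2, 7.8, 10.1]

preds = leave_one_out(features, stress)
r = pearson(stress, preds)
```

Because every prediction comes from a model that never saw the held-out case, `r` measures agreement on unseen data rather than goodness of fit.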
Abstract:
Index tracking has become one of the most common strategies in asset management. The index-tracking problem consists of constructing a portfolio that replicates the future performance of an index by including only a subset of the index constituents in the portfolio. Finding the most representative subset is challenging when the number of stocks in the index is large. We introduce a new three-stage approach that first identifies promising subsets by employing data-mining techniques, then determines the stock weights in the subsets using mixed-binary linear programming, and finally evaluates the subsets based on cross-validation. The best subset is returned as the tracking portfolio. Our approach outperforms state-of-the-art methods in terms of out-of-sample performance and running times.
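The final evaluation stage described above can be illustrated with a small sketch. It is not the authors' method: the candidate subsets and weights are fixed by hand rather than produced by data mining and mixed-binary linear programming, a single held-out period stands in for cross-validation, and root-mean-square tracking error is an assumed quality measure. All names and return values are hypothetical.

```python
import math
from typing import Dict, List


def tracking_error(
    weights: Dict[str, float],
    stock_returns: Dict[str, List[float]],
    index_returns: List[float],
) -> float:
    """Root-mean-square gap between portfolio and index returns per period."""
    squared_gap = 0.0
    for t, index_r in enumerate(index_returns):
        portfolio_r = sum(w * stock_returns[s][t] for s, w in weights.items())
        squared_gap += (portfolio_r - index_r) ** 2
    return math.sqrt(squared_gap / len(index_returns))


# Hypothetical held-out returns for three stocks and the index.
stock_returns = {
    "A": [0.01, -0.02, 0.03],
    "B": [0.02, 0.00, 0.01],
    "C": [-0.01, 0.01, 0.02],
}
index_returns = [0.01, -0.01, 0.02]

# Hypothetical candidate subsets with weights already fixed (stages 1 and 2).
candidates = {
    "A+B": {"A": 0.5, "B": 0.5},
    "B+C": {"B": 0.5, "C": 0.5},
}

errors = {
    name: tracking_error(w, stock_returns, index_returns)
    for name, w in candidates.items()
}
# The subset with the smallest held-out error becomes the tracking portfolio.
best = min(errors, key=errors.get)
```

Scoring on data not used to fix the weights is what makes the comparison an out-of-sample one.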
Abstract:
This paper presents a shallow dialogue analysis model, aimed at human-human dialogues in the context of staff or business meetings. Four components of the model are defined, and several machine learning techniques are used to extract features from dialogue transcripts: maximum entropy classifiers for dialogue acts, latent semantic analysis for topic segmentation, and decision tree classifiers for discourse markers. A rule-based approach is proposed for resolving cross-modal references to meeting documents. The methods are trained and evaluated using a common data set and annotation format. The integration of the components into an automated shallow dialogue parser opens the way to multimodal meeting processing and retrieval applications.
Abstract:
Radiation metabolomics employing mass spectral technologies represents a plausible means of high-throughput, minimally invasive radiation biodosimetry. A simplified metabolomics protocol is described that employs ubiquitous gas chromatography-mass spectrometry and open source software, including the random forests machine learning algorithm, to uncover latent biomarkers of 3 Gy gamma radiation in rats. Urine was collected from six male Wistar rats and six sham-irradiated controls for 7 days, 4 prior to irradiation and 3 after irradiation. Water and food consumption, urine volume, body weight, and sodium, potassium, calcium, chloride, phosphate and urea excretion showed major effects from exposure to gamma radiation. The metabolomics protocol uncovered several urinary metabolites that were significantly up-regulated (glyoxylate, threonate, thymine, uracil, p-cresol) and down-regulated (citrate, 2-oxoglutarate, adipate, pimelate, suberate, azelaate) as a result of radiation exposure. Thymine and uracil were shown to derive largely from thymidine and 2'-deoxyuridine, which are known radiation biomarkers in the mouse. The radiation metabolomic phenotype in rats appeared to derive from oxidative stress and effects on kidney function. Gas chromatography-mass spectrometry is a promising platform on which to develop the field of radiation metabolomics further and to assist in the design of instrumentation for use in detecting biological consequences of environmental radiation release.
Abstract:
Background: In Switzerland there are about 150,000 equestrians. Horse-related injuries, including head and spinal injuries, are frequently treated at our level I trauma centre. Objectives: To analyse injury patterns, protective factors, and risk factors related to horse riding, and to define groups of safer riders and those at greater risk. Methods: We present a retrospective and a case-control survey conducted at a tertiary trauma centre in Bern, Switzerland. Injured equestrians from July 2000 - June 2006 were retrospectively classified by injury pattern and neurological symptoms. Injured equestrians from July-December 2008 were prospectively collected using a questionnaire with 17 variables. The same questionnaire was applied in non-injured controls. Multiple logistic regression was performed, and combined risk factors were calculated using inference trees. Results: Retrospective survey: A total of 528 injuries occurred in 365 patients. The injury pattern was as follows: extremities (32%: upper 17%, lower 15%), head (24%), spine (14%), thorax (9%), face (9%), pelvis (7%) and abdomen (2%). Two injuries were fatal. One case resulted in quadriplegia, one in paraplegia. Case-control survey: 61 patients and 102 controls (patients: 72% female, 28% male; controls: 63% female, 37% male) were included. Falls were most frequent (65%), followed by horse kicks (19%) and horse bites (2%). Variables statistically significant for the controls were: older age (p = 0.015), male gender (p = 0.04) and holding a diploma in horse riding (p = 0.004). Inference trees revealed typical groups less and more likely to suffer injury. Conclusions: Experience with riding and having passed a diploma in horse riding seem to be protective factors. Educational levels and injury risk should be graded within an educational level-injury risk index.
Abstract:
Learned irrelevance (LIrr) refers to a form of selective learning that develops as a result of prior noncorrelated exposures of the predicted and predictor stimuli. In learning situations that depend on the associative link between the predicted and predictor stimuli, LIrr is expressed as a retardation of learning. It represents a form of modulation of learning by selective attention. Given the relevance of selective attention impairment to both positive and cognitive schizophrenia symptoms, the question remains whether LIrr impairment represents a state (relating to symptom manifestation) or trait (relating to schizophrenia endophenotypes) marker of human psychosis. We examined this by evaluating the expression of LIrr in an associative learning paradigm in (1) asymptomatic first-degree relatives of schizophrenia patients (SZ-relatives) and in (2) individuals exhibiting prodromal signs of psychosis ("ultrahigh risk" [UHR] patients), in each case relative to demographically matched healthy control subjects. There was no evidence for aberrant LIrr in SZ-relatives, but LIrr as well as associative learning were attenuated in UHR patients. It is concluded that LIrr deficiency in conjunction with a learning impairment might be a useful state marker predictive of psychotic state but has a relatively weak link to a potential schizophrenia endophenotype.