776 resultados para Machine learning methods
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
This thesis addresses the Batch Reinforcement Learning methods in Robotics. This sub-class of Reinforcement Learning has shown promising results and has been the focus of recent research. Three contributions are proposed that aim to extend the state-of-art methods allowing for a faster and more stable learning process, such as required for learning in Robotics. The Q-learning update-rule is widely applied, since it allows to learn without the presence of a model of the environment. However, this update-rule is transition-based and does not take advantage of the underlying episodic structure of collected batch of interactions. The Q-Batch update-rule is proposed in this thesis, to process experiencies along the trajectories collected in the interaction phase. This allows a faster propagation of obtained rewards and penalties, resulting in faster and more robust learning. Non-parametric function approximations are explored, such as Gaussian Processes. This type of approximators allows to encode prior knowledge about the latent function, in the form of kernels, providing a higher level of exibility and accuracy. The application of Gaussian Processes in Batch Reinforcement Learning presented a higher performance in learning tasks than other function approximations used in the literature. Lastly, in order to extract more information from the experiences collected by the agent, model-learning techniques are incorporated to learn the system dynamics. In this way, it is possible to augment the set of collected experiences with experiences generated through planning using the learned models. Experiments were carried out mainly in simulation, with some tests carried out in a physical robotic platform. The obtained results show that the proposed approaches are able to outperform the classical Fitted Q Iteration.
Resumo:
This thesis builds a framework for evaluating downside risk from multivariate data via a special class of risk measures (RM). The peculiarity of the analysis lies in getting rid of strong data distributional assumptions and in orientation towards the most critical data in risk management: those with asymmetries and heavy tails. At the same time, under typical assumptions, such as the ellipticity of the data probability distribution, the conformity with classical methods is shown. The constructed class of RM is a multivariate generalization of the coherent distortion RM, which possess valuable properties for a risk manager. The design of the framework is twofold. The first part contains new computational geometry methods for the high-dimensional data. The developed algorithms demonstrate computability of geometrical concepts used for constructing the RM. These concepts bring visuality and simplify interpretation of the RM. The second part develops models for applying the framework to actual problems. The spectrum of applications varies from robust portfolio selection up to broader spheres, such as stochastic conic optimization with risk constraints or supervised machine learning.
Resumo:
Acoustic Emission (AE) monitoring can be used to detect the presence of damage as well as determine its location in Structural Health Monitoring (SHM) applications. Information on the time difference of the signal generated by the damage event arriving at different sensors is essential in performing localization. This makes the time of arrival (ToA) an important piece of information to retrieve from the AE signal. Generally, this is determined using statistical methods such as the Akaike Information Criterion (AIC) which is particularly prone to errors in the presence of noise. And given that the structures of interest are surrounded with harsh environments, a way to accurately estimate the arrival time in such noisy scenarios is of particular interest. In this work, two new methods are presented to estimate the arrival times of AE signals which are based on Machine Learning. Inspired by great results in the field, two models are presented which are Deep Learning models - a subset of machine learning. They are based on Convolutional Neural Network (CNN) and Capsule Neural Network (CapsNet). The primary advantage of such models is that they do not require the user to pre-define selected features but only require raw data to be given and the models establish non-linear relationships between the inputs and outputs. The performance of the models is evaluated using AE signals generated by a custom ray-tracing algorithm by propagating them on an aluminium plate and compared to AIC. It was found that the relative error in estimation on the test set was < 5% for the models compared to around 45% of AIC. The testing process was further continued by preparing an experimental setup and acquiring real AE signals to test on. Similar performances were observed where the two models not only outperform AIC by more than a magnitude in their average errors but also they were shown to be a lot more robust as compared to AIC which fails in the presence of noise.
Resumo:
Collecting and analysing data is an important element in any field of human activity and research. Even in sports, collecting and analyzing statistical data is attracting a growing interest. Some exemplar use cases are: improvement of technical/tactical aspects for team coaches, definition of game strategies based on the opposite team play or evaluation of the performance of players. Other advantages are related to taking more precise and impartial judgment in referee decisions: a wrong decision can change the outcomes of important matches. Finally, it can be useful to provide better representations and graphic effects that make the game more engaging for the audience during the match. Nowadays it is possible to delegate this type of task to automatic software systems that can use cameras or even hardware sensors to collect images or data and process them. One of the most efficient methods to collect data is to process the video images of the sporting event through mixed techniques concerning machine learning applied to computer vision. As in other domains in which computer vision can be applied, the main tasks in sports are related to object detection, player tracking, and to the pose estimation of athletes. The goal of the present thesis is to apply different models of CNNs to analyze volleyball matches. Starting from video frames of a volleyball match, we reproduce a bird's eye view of the playing court where all the players are projected, reporting also for each player the type of action she/he is performing.
Resumo:
Although the debate of what data science is has a long history and has not reached a complete consensus yet, Data Science can be summarized as the process of learning from data. Guided by the above vision, this thesis presents two independent data science projects developed in the scope of multidisciplinary applied research. The first part analyzes fluorescence microscopy images typically produced in life science experiments, where the objective is to count how many marked neuronal cells are present in each image. Aiming to automate the task for supporting research in the area, we propose a neural network architecture tuned specifically for this use case, cell ResUnet (c-ResUnet), and discuss the impact of alternative training strategies in overcoming particular challenges of our data. The approach provides good results in terms of both detection and counting, showing performance comparable to the interpretation of human operators. As a meaningful addition, we release the pre-trained model and the Fluorescent Neuronal Cells dataset collecting pixel-level annotations of where neuronal cells are located. In this way, we hope to help future research in the area and foster innovative methodologies for tackling similar problems. The second part deals with the problem of distributed data management in the context of LHC experiments, with a focus on supporting ATLAS operations concerning data transfer failures. In particular, we analyze error messages produced by failed transfers and propose a Machine Learning pipeline that leverages the word2vec language model and K-means clustering. This provides groups of similar errors that are presented to human operators as suggestions of potential issues to investigate. The approach is demonstrated on one full day of data, showing promising ability in understanding the message content and providing meaningful groupings, in line with previously reported incidents by human operators.
Resumo:
Deep learning methods are extremely promising machine learning tools to analyze neuroimaging data. However, their potential use in clinical settings is limited because of the existing challenges of applying these methods to neuroimaging data. In this study, first a data leakage type caused by slice-level data split that is introduced during training and validation of a 2D CNN is surveyed and a quantitative assessment of the model’s performance overestimation is presented. Second, an interpretable, leakage-fee deep learning software written in a python language with a wide range of options has been developed to conduct both classification and regression analysis. The software was applied to the study of mild cognitive impairment (MCI) in patients with small vessel disease (SVD) using multi-parametric MRI data where the cognitive performance of 58 patients measured by five neuropsychological tests is predicted using a multi-input CNN model taking brain image and demographic data. Each of the cognitive test scores was predicted using different MRI-derived features. As MCI due to SVD has been hypothesized to be the effect of white matter damage, DTI-derived features MD and FA produced the best prediction outcome of the TMT-A score which is consistent with the existing literature. In a second study, an interpretable deep learning system aimed at 1) classifying Alzheimer disease and healthy subjects 2) examining the neural correlates of the disease that causes a cognitive decline in AD patients using CNN visualization tools and 3) highlighting the potential of interpretability techniques to capture a biased deep learning model is developed. Structural magnetic resonance imaging (MRI) data of 200 subjects was used by the proposed CNN model which was trained using a transfer learning-based approach producing a balanced accuracy of 71.6%. Brain regions in the frontal and parietal lobe showing the cerebral cortex atrophy were highlighted by the visualization tools.
Resumo:
Astrocytes are the most numerous glial cell type in the mammalian brain and permeate the entire CNS interacting with neurons, vasculature, and other glial cells. Astrocytes display intracellular calcium signals that encode information about local synaptic function, distributed network activity, and high-level cognitive functions. Several studies have investigated the calcium dynamics of astrocytes in sensory areas and have shown that these cells can encode sensory stimuli. Nevertheless, only recently the neuro-scientific community has focused its attention on the role and functions of astrocytes in associative areas such as the hippocampus. In our first study, we used the information theory formalism to show that astrocytes in the CA1 area of the hippocampus recorded with 2-photon fluorescence microscopy during spatial navigation encode spatial information that is complementary and synergistic to information encoded by nearby "place cell" neurons. In our second study, we investigated various computational aspects of applying the information theory formalism to astrocytic calcium data. For this reason, we generated realistic simulations of calcium signals in astrocytes to determine optimal hyperparameters and procedures of information measures and applied them to real astrocytic calcium imaging data. Calcium signals of astrocytes are characterized by complex spatiotemporal dynamics occurring in subcellular parcels of the astrocytic domain which makes studying these cells in 2-photon calcium imaging recordings difficult. However, current analytical tools which identify the astrocytic subcellular regions are time consuming and extensively rely on user-defined parameters. Here, we present Rapid Astrocytic calcium Spatio-Temporal Analysis (RASTA), a novel machine learning algorithm for spatiotemporal semantic segmentation of 2-photon calcium imaging recordings of astrocytes which operates without human intervention. We found that RASTA provided fast and accurate identification of astrocytic cell somata, processes, and cellular domains, extracting calcium signals from identified regions of interest across individual cells and populations of hundreds of astrocytes recorded in awake mice.
Resumo:
The discovery of new materials and their functions has always been a fundamental component of technological progress. Nowadays, the quest for new materials is stronger than ever: sustainability, medicine, robotics and electronics are all key assets which depend on the ability to create specifically tailored materials. However, designing materials with desired properties is a difficult task, and the complexity of the discipline makes it difficult to identify general criteria. While scientists developed a set of best practices (often based on experience and expertise), this is still a trial-and-error process. This becomes even more complex when dealing with advanced functional materials. Their properties depend on structural and morphological features, which in turn depend on fabrication procedures and environment, and subtle alterations leads to dramatically different results. Because of this, materials modeling and design is one of the most prolific research fields. Many techniques and instruments are continuously developed to enable new possibilities, both in the experimental and computational realms. Scientists strive to enforce cutting-edge technologies in order to make progress. However, the field is strongly affected by unorganized file management, proliferation of custom data formats and storage procedures, both in experimental and computational research. Results are difficult to find, interpret and re-use, and a huge amount of time is spent interpreting and re-organizing data. This also strongly limit the application of data-driven and machine learning techniques. This work introduces possible solutions to the problems described above. Specifically, it talks about developing features for specific classes of advanced materials and use them to train machine learning models and accelerate computational predictions for molecular compounds; developing method for organizing non homogeneous materials data; automate the process of using devices simulations to train machine learning models; dealing with scattered experimental data and use them to discover new patterns.
Resumo:
This research activity aims at providing a reliable estimation of particular state variables or parameters concerning the dynamics and performance optimization of a MotoGP-class motorcycle, integrating the classical model-based approach with new methodologies involving artificial intelligence. The first topic of the research focuses on the estimation of the thermal behavior of the MotoGP carbon braking system. Numerical tools are developed to assess the instantaneous surface temperature distribution in the motorcycle's front brake discs. Within this application other important brake parameters are identified using Kalman filters, such as the disc convection coefficient and the power distribution in the disc-pads contact region. Subsequently, a physical model of the brake is built to estimate the instantaneous braking torque. However, the results obtained with this approach are highly limited by the knowledge of the friction coefficient (μ) between the disc rotor and the pads. Since the value of μ is a highly nonlinear function of many variables (namely temperature, pressure and angular velocity of the disc), an analytical model for the friction coefficient estimation appears impractical to establish. To overcome this challenge, an innovative hybrid solution is implemented, combining the benefit of artificial intelligence (AI) with classical model-based approach. Indeed, the disc temperature estimated through the thermal model previously implemented is processed by a machine learning algorithm that outputs the actual value of the friction coefficient thus improving the braking torque computation performed by the physical model of the brake. Finally, the last topic of this research activity regards the development of an AI algorithm to estimate the current sideslip angle of the motorcycle's front tire. While a single-track motorcycle kinematic model and IMU accelerometer signals theoretically enable sideslip calculation, the presence of accelerometer noise leads to a significant drift over time. To address this issue, a long short-term memory (LSTM) network is implemented.
Resumo:
The purpose of this investigation was to evaluate three learning methods for teaching basic oral surgical skills Thirty predoctoral dental students without any surgical knowledge or previous surgical experience were divided Into three groups (n=10 each) according to instructional strategy Group 1, active learning Group 2, text reading only, and Group 3, text reading and video demonstration After instruction, the apprentices were allowed to practice incision dissection and suture maneuvers in a bench learning model During the students' performance, a structured practice evaluation test to account for correct or incorrect maneuvers was applied by trained observers Evaluation tests were repeated after thirty and sixty days Data from resulting scores between groups and periods were considered for statistical analysis (ANOVA and Tukey Kramer) with a significant level of a=0 05 Results showed that the active learning group presented the significantly best learning outcomes related to immediate assimilation of surgical procedures compared to other groups All groups results were similar after sixty days of the first practice Assessment tests were fundamental to evaluate teaching strategies and allowed theoretical and proficiency learning feedbacks Repetition and interactive practice promoted retention of knowledge on basic oral surgical skills
Resumo:
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such a knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drugs design, as well as for planning high-throughput new experiments. Methods have been developed for gene networks modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and its results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to average degree k variation, decreasing its network recovery rate with the increase of k. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensible to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.
Resumo:
Objective: To develop a model to predict the bleeding source and identify the cohort amongst patients with acute gastrointestinal bleeding (GIB) who require urgent intervention, including endoscopy. Patients with acute GIB, an unpredictable event, are most commonly evaluated and managed by non-gastroenterologists. Rapid and consistently reliable risk stratification of patients with acute GIB for urgent endoscopy may potentially improve outcomes amongst such patients by targeting scarce health-care resources to those who need it the most. Design and methods: Using ICD-9 codes for acute GIB, 189 patients with acute GIB and all. available data variables required to develop and test models were identified from a hospital medical records database. Data on 122 patients was utilized for development of the model and on 67 patients utilized to perform comparative analysis of the models. Clinical data such as presenting signs and symptoms, demographic data, presence of co-morbidities, laboratory data and corresponding endoscopic diagnosis and outcomes were collected. Clinical data and endoscopic diagnosis collected for each patient was utilized to retrospectively ascertain optimal management for each patient. Clinical presentations and corresponding treatment was utilized as training examples. Eight mathematical models including artificial neural network (ANN), support vector machine (SVM), k-nearest neighbor, linear discriminant analysis (LDA), shrunken centroid (SC), random forest (RF), logistic regression, and boosting were trained and tested. The performance of these models was compared using standard statistical analysis and ROC curves. Results: Overall the random forest model best predicted the source, need for resuscitation, and disposition with accuracies of approximately 80% or higher (accuracy for endoscopy was greater than 75%). The area under ROC curve for RF was greater than 0.85, indicating excellent performance by the random forest model Conclusion: While most mathematical models are effective as a decision support system for evaluation and management of patients with acute GIB, in our testing, the RF model consistently demonstrated the best performance. Amongst patients presenting with acute GIB, mathematical models may facilitate the identification of the source of GIB, need for intervention and allow optimization of care and healthcare resource allocation; these however require further validation. (c) 2007 Elsevier B.V. All rights reserved.
Resumo:
Learning organizations are a special form of organization where enhancing learning is a strategy to increase intellectual capital. Developing learning organizations has become an imperative for many managers, since an organization's learning methods and rate may be the only source of sustainable competitive advantage. However, learning organization theory tends to be prescriptive and rhetorical, with empirical research still relatively new. This paper contributes to the literature by reporting case-study research in progress based on four Australian organizations. In the organizations studied, use of the learning organization metaphor was coupled with an emergent metaphor: organization as `family". By employing structure mapping of metaphor within analytical induction, both established methods but not combined before, this paper shows how theory might be developed from metaphor.
Resumo:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.