963 resultados para multisensory statistical learning
Resumo:
The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well known nonparametric principles of probability density modeling using kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNN is that they can be easily used in decision support systems dealing with problems of automatic classification. Support vector machine is an implementation of the principles of statistical learning theory for the classification tasks. Recently they were successfully applied for different environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.
Resumo:
One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations can be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are known as partially observable Markov decision processes (POMDPs). While the environment's dynamics are assumed to obey certain rules, the agent does not know them and must learn. In this dissertation we focus on the agent's adaptation as captured by the reinforcement learning framework. This means learning a policy---a mapping of observations into actions---based on feedback from the environment. The learning can be viewed as browsing a set of policies while evaluating them by trial through interaction with the environment. The set of policies is constrained by the architecture of the agent's controller. POMDPs require a controller to have a memory. We investigate controllers with memory, including controllers with external memory, finite state controllers and distributed controllers for multi-agent systems. For these various controllers we work out the details of the algorithms which learn by ascending the gradient of expected cumulative reinforcement. Building on statistical learning theory and experiment design theory, a policy evaluation algorithm is developed for the case of experience re-use. We address the question of sufficient experience for uniform convergence of policy evaluation and obtain sample complexity bounds for various estimators. Finally, we demonstrate the performance of the proposed algorithms on several domains, the most complex of which is simulated adaptive packet routing in a telecommunication network.
Resumo:
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.
Resumo:
Abstract Background Educational computer games are examples of computer-assisted learning objects, representing an educational strategy of growing interest. Given the changes in the digital world over the last decades, students of the current generation expect technology to be used in advancing their learning requiring a need to change traditional passive learning methodologies to an active multisensory experimental learning methodology. The objective of this study was to compare a computer game-based learning method with a traditional learning method, regarding learning gains and knowledge retention, as means of teaching head and neck Anatomy and Physiology to Speech-Language and Hearing pathology undergraduate students. Methods Students were randomized to participate to one of the learning methods and the data analyst was blinded to which method of learning the students had received. Students’ prior knowledge (i.e. before undergoing the learning method), short-term knowledge retention and long-term knowledge retention (i.e. six months after undergoing the learning method) were assessed with a multiple choice questionnaire. Students’ performance was compared considering the three moments of assessment for both for the mean total score and for separated mean scores for Anatomy questions and for Physiology questions. Results Students that received the game-based method performed better in the pos-test assessment only when considering the Anatomy questions section. Students that received the traditional lecture performed better in both post-test and long-term post-test when considering the Anatomy and Physiology questions. Conclusions The game-based learning method is comparable to the traditional learning method in general and in short-term gains, while the traditional lecture still seems to be more effective to improve students’ short and long-term knowledge retention.
Resumo:
Statistical modelling and statistical learning theory are two powerful analytical frameworks for analyzing signals and developing efficient processing and classification algorithms. In this thesis, these frameworks are applied for modelling and processing biomedical signals in two different contexts: ultrasound medical imaging systems and primate neural activity analysis and modelling. In the context of ultrasound medical imaging, two main applications are explored: deconvolution of signals measured from a ultrasonic transducer and automatic image segmentation and classification of prostate ultrasound scans. In the former application a stochastic model of the radio frequency signal measured from a ultrasonic transducer is derived. This model is then employed for developing in a statistical framework a regularized deconvolution procedure, for enhancing signal resolution. In the latter application, different statistical models are used to characterize images of prostate tissues, extracting different features. These features are then uses to segment the images in region of interests by means of an automatic procedure based on a statistical model of the extracted features. Finally, machine learning techniques are used for automatic classification of the different region of interests. In the context of neural activity signals, an example of bio-inspired dynamical network was developed to help in studies of motor-related processes in the brain of primate monkeys. The presented model aims to mimic the abstract functionality of a cell population in 7a parietal region of primate monkeys, during the execution of learned behavioural tasks.
Resumo:
Implicit task sequence learning (TSL) can be considered as an extension of implicit sequence learning which is typically tested with the classical serial reaction time task (SRTT). By design, in the SRTT there is a correlation between the sequence of stimuli to which participants must attend and the sequence of motor movements/key presses with which participants must respond. The TSL paradigm allows to disentangle this correlation and to separately manipulate the presences/absence of a sequence of tasks, a sequence of responses, and even other streams of information such as stimulus locations or stimulus-response mappings. Here I review the state of TSL research which seems to point at the critical role of the presence of correlated streams of information in implicit sequence learning. On a more general level, I propose that beyond correlated streams of information, a simple statistical learning mechanism may also be involved in implicit sequence learning, and that the relative contribution of these two explanations differ according to task requirements. With this differentiation, conflicting results can be integrated into a coherent framework.
Resumo:
Machine and Statistical Learning techniques are used in almost all online advertisement systems. The problem of discovering which content is more demanded (e.g. receive more clicks) can be modeled as a multi-armed bandit problem. Contextual bandits (i.e., bandits with covariates, side information or associative reinforcement learning) associate, to each specific content, several features that define the “context” in which it appears (e.g. user, web page, time, region). This problem can be studied in the stochastic/statistical setting by means of the conditional probability paradigm using the Bayes’ theorem. However, for very large contextual information and/or real-time constraints, the exact calculation of the Bayes’ rule is computationally infeasible. In this article, we present a method that is able to handle large contextual information for learning in contextual-bandits problems. This method was tested in the Challenge on Yahoo! dataset at ICML2012’s Workshop “new Challenges for Exploration & Exploitation 3”, obtaining the second place. Its basic exploration policy is deterministic in the sense that for the same input data (as a time-series) the same results are obtained. We address the deterministic exploration vs. exploitation issue, explaining the way in which the proposed method deterministically finds an effective dynamic trade-off based solely in the input-data, in contrast to other methods that use a random number generator.
Resumo:
Background: We introduced a series of computer-supported workshops in our undergraduate statistics courses, in the hope that it would help students to gain a deeper understanding of statistical concepts. This raised questions about the appropriate design of the Virtual Learning Environment (VLE) in which such an approach had to be implemented. Therefore, we investigated two competing software design models for VLEs. In the first system, all learning features were a function of the classical VLE. The second system was designed from the perspective that learning features should be a function of the course's core content (statistical analyses), which required us to develop a specific-purpose Statistical Learning Environment (SLE) based on Reproducible Computing and newly developed Peer Review (PR) technology. Objectives: The main research question is whether the second VLE design improved learning efficiency as compared to the standard type of VLE design that is commonly used in education. As a secondary objective we provide empirical evidence about the usefulness of PR as a constructivist learning activity which supports non-rote learning. Finally, this paper illustrates that it is possible to introduce a constructivist learning approach in large student populations, based on adequately designed educational technology, without subsuming educational content to technological convenience. Methods: Both VLE systems were tested within a two-year quasi-experiment based on a Reliable Nonequivalent Group Design. This approach allowed us to draw valid conclusions about the treatment effect of the changed VLE design, even though the systems were implemented in successive years. The methodological aspects about the experiment's internal validity are explained extensively. Results: The effect of the design change is shown to have substantially increased the efficiency of constructivist, computer-assisted learning activities for all cohorts of the student population under investigation. The findings demonstrate that a content-based design outperforms the traditional VLE-based design. © 2011 Wessa et al.
Resumo:
Bayesian methods offer a flexible and convenient probabilistic learning framework to extract interpretable knowledge from complex and structured data. Such methods can characterize dependencies among multiple levels of hidden variables and share statistical strength across heterogeneous sources. In the first part of this dissertation, we develop two dependent variational inference methods for full posterior approximation in non-conjugate Bayesian models through hierarchical mixture- and copula-based variational proposals, respectively. The proposed methods move beyond the widely used factorized approximation to the posterior and provide generic applicability to a broad class of probabilistic models with minimal model-specific derivations. In the second part of this dissertation, we design probabilistic graphical models to accommodate multimodal data, describe dynamical behaviors and account for task heterogeneity. In particular, the sparse latent factor model is able to reveal common low-dimensional structures from high-dimensional data. We demonstrate the effectiveness of the proposed statistical learning methods on both synthetic and real-world data.
Resumo:
Second language (L2) learning outcomes may depend on the structure of the input and learners’ cognitive abilities. This study tested whether less predictable input might facilitate learning and generalization of L2 morphology while evaluating contributions of statistical learning ability, nonverbal intelligence, phonological short-term memory, and verbal working memory. Over three sessions, 54 adults were exposed to a Russian case-marking paradigm with a balanced or skewed item distribution in the input. Whereas statistical learning ability and nonverbal intelligence predicted learning of trained items, only nonverbal intelligence also predicted generalization of case-marking inflections to new vocabulary. Neither measure of temporary storage capacity predicted learning. Balanced, less predictable input was associated with higher accuracy in generalization but only in the initial test session. These results suggest that individual differences in pattern extraction play a more sustained role in L2 acquisition than instructional manipulations that vary the predictability of lexical items in the input.
Resumo:
Functional magnetic resonance imaging (fMRI) is currently one of the most widely used methods for studying human brain function in vivo. Although many different approaches to fMRI analysis are available, the most widely used methods employ so called ""mass-univariate"" modeling of responses in a voxel-by-voxel fashion to construct activation maps. However, it is well known that many brain processes involve networks of interacting regions and for this reason multivariate analyses might seem to be attractive alternatives to univariate approaches. The current paper focuses on one multivariate application of statistical learning theory: the statistical discrimination maps (SDM) based on support vector machine, and seeks to establish some possible interpretations when the results differ from univariate `approaches. In fact, when there are changes not only on the activation level of two conditions but also on functional connectivity, SDM seems more informative. We addressed this question using both simulations and applications to real data. We have shown that the combined use of univariate approaches and SDM yields significant new insights into brain activations not available using univariate methods alone. In the application to a visual working memory fMRI data, we demonstrated that the interaction among brain regions play a role in SDM`s power to detect discriminative voxels. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
The long term goal of this research is to develop a program able to produce an automatic segmentation and categorization of textual sequences into discourse types. In this preliminary contribution, we present the construction of an algorithm which takes a segmented text as input and attempts to produce a categorization of sequences, such as narrative, argumentative, descriptive and so on. Also, this work aims at investigating a possible convergence between the typological approach developed in particular in the field of text and discourse analysis in French by Adam (2008) and Bronckart (1997) and unsupervised statistical learning.
Resumo:
Recently, kernel-based Machine Learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing etc. The paper describes the use of kernel methods to approach the processing of large datasets from environmental monitoring networks. Several typical problems of the environmental sciences and their solutions provided by kernel-based methods are considered: classification of categorical data (soil type classification), mapping of environmental and pollution continuous information (pollution of soil by radionuclides), mapping with auxiliary information (climatic data from Aral Sea region). The promising developments, such as automatic emergency hot spot detection and monitoring network optimization are discussed as well.
Resumo:
To be diagnostically useful, structural MRI must reliably distinguish Alzheimer's disease (AD) from normal aging in individual scans. Recent advances in statistical learning theory have led to the application of support vector machines to MRI for detection of a variety of disease states. The aims of this study were to assess how successfully support vector machines assigned individual diagnoses and to determine whether data-sets combined from multiple scanners and different centres could be used to obtain effective classification of scans. We used linear support vector machines to classify the grey matter segment of T1-weighted MR scans from pathologically proven AD patients and cognitively normal elderly individuals obtained from two centres with different scanning equipment. Because the clinical diagnosis of mild AD is difficult we also tested the ability of support vector machines to differentiate control scans from patients without post-mortem confirmation. Finally we sought to use these methods to differentiate scans between patients suffering from AD from those with frontotemporal lobar degeneration. Up to 96% of pathologically verified AD patients were correctly classified using whole brain images. Data from different centres were successfully combined achieving comparable results from the separate analyses. Importantly, data from one centre could be used to train a support vector machine to accurately differentiate AD and normal ageing scans obtained from another centre with different subjects and different scanner equipment. Patients with mild, clinically probable AD and age/sex matched controls were correctly separated in 89% of cases which is compatible with published diagnosis rates in the best clinical centres. This method correctly assigned 89% of patients with post-mortem confirmed diagnosis of either AD or frontotemporal lobar degeneration to their respective group. Our study leads to three conclusions: Firstly, support vector machines successfully separate patients with AD from healthy aging subjects. Secondly, they perform well in the differential diagnosis of two different forms of dementia. Thirdly, the method is robust and can be generalized across different centres. This suggests an important role for computer based diagnostic image analysis for clinical practice.
Resumo:
Due to the advances in sensor networks and remote sensing technologies, the acquisition and storage rates of meteorological and climatological data increases every day and ask for novel and efficient processing algorithms. A fundamental problem of data analysis and modeling is the spatial prediction of meteorological variables in complex orography, which serves among others to extended climatological analyses, for the assimilation of data into numerical weather prediction models, for preparing inputs to hydrological models and for real time monitoring and short-term forecasting of weather.In this thesis, a new framework for spatial estimation is proposed by taking advantage of a class of algorithms emerging from the statistical learning theory. Nonparametric kernel-based methods for nonlinear data classification, regression and target detection, known as support vector machines (SVM), are adapted for mapping of meteorological variables in complex orography.With the advent of high resolution digital elevation models, the field of spatial prediction met new horizons. In fact, by exploiting image processing tools along with physical heuristics, an incredible number of terrain features which account for the topographic conditions at multiple spatial scales can be extracted. Such features are highly relevant for the mapping of meteorological variables because they control a considerable part of the spatial variability of meteorological fields in the complex Alpine orography. For instance, patterns of orographic rainfall, wind speed and cold air pools are known to be correlated with particular terrain forms, e.g. convex/concave surfaces and upwind sides of mountain slopes.Kernel-based methods are employed to learn the nonlinear statistical dependence which links the multidimensional space of geographical and topographic explanatory variables to the variable of interest, that is the wind speed as measured at the weather stations or the occurrence of orographic rainfall patterns as extracted from sequences of radar images. Compared to low dimensional models integrating only the geographical coordinates, the proposed framework opens a way to regionalize meteorological variables which are multidimensional in nature and rarely show spatial auto-correlation in the original space making the use of classical geostatistics tangled.The challenges which are explored during the thesis are manifolds. First, the complexity of models is optimized to impose appropriate smoothness properties and reduce the impact of noisy measurements. Secondly, a multiple kernel extension of SVM is considered to select the multiscale features which explain most of the spatial variability of wind speed. Then, SVM target detection methods are implemented to describe the orographic conditions which cause persistent and stationary rainfall patterns. Finally, the optimal splitting of the data is studied to estimate realistic performances and confidence intervals characterizing the uncertainty of predictions.The resulting maps of average wind speeds find applications within renewable resources assessment and opens a route to decrease the temporal scale of analysis to meet hydrological requirements. Furthermore, the maps depicting the susceptibility to orographic rainfall enhancement can be used to improve current radar-based quantitative precipitation estimation and forecasting systems and to generate stochastic ensembles of precipitation fields conditioned upon the orography.