895 results for nonparametric regression
Abstract:
Bayesian nonparametric models, such as the Gaussian process and the Dirichlet process, have been extensively applied to target kinematics modeling in applications including environmental monitoring, traffic planning, endangered species tracking, dynamic scene analysis, autonomous robot navigation, and human motion modeling. As these successful applications show, Bayesian nonparametric models adapt their complexity to the data as necessary and are resistant to both overfitting and underfitting. However, most existing work assumes that the sensor measurements used to learn the Bayesian nonparametric target kinematics models are obtained a priori, or that the target kinematics can be measured by the sensor at any given time throughout the task. Little work has been done on controlling a sensor with a bounded field of view to obtain the measurements of mobile targets that are most informative for reducing the uncertainty of the Bayesian nonparametric models. To present a systematic sensor planning approach to learning Bayesian nonparametric models, the Gaussian process target kinematics model is introduced first; it is capable of describing time-invariant spatial phenomena such as ocean currents, temperature distributions, and wind velocity fields. The Dirichlet process-Gaussian process target kinematics model is subsequently discussed for modeling mixtures of mobile targets, such as pedestrian motion patterns.
Novel information theoretic functions are developed for these Bayesian nonparametric target kinematics models to represent the expected utility of measurements as a function of sensor control inputs and random environmental variables. A Gaussian process expected Kullback-Leibler (KL) divergence is developed as the expectation, with respect to the future measurements, of the KL divergence between the current (prior) and posterior Gaussian process target kinematics models. This approach is then extended to develop a new information value function that can be used to estimate target kinematics described by a Dirichlet process-Gaussian process mixture model. A theorem is proposed showing that the novel information theoretic functions are bounded. Based on this theorem, efficient estimators of the new information theoretic functions are designed; they are proved to be unbiased, with the variance of the resulting approximation error decreasing linearly as the number of samples increases. The computational complexity of optimizing the novel information theoretic functions under sensor dynamics constraints is studied, and the problem is proved to be NP-hard. A cumulative lower bound is then proposed to reduce the computational complexity to polynomial time.
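To make the expected KL divergence concrete, the following is a minimal sketch, not the dissertation's estimator. It assumes a zero-mean Gaussian process with a squared-exponential kernel, Gaussian measurement noise, and the direction KL(posterior || prior); the kernel, the noise level, and all function names are illustrative assumptions. Future measurements are sampled from the prior predictive distribution and the KL divergences are averaged. Under these assumptions the expectation also has a closed form (the mutual information between the measurements and the queried function values), which the sketch computes for comparison.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (an assumed choice)."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)


def gp_terms(x_meas, x_query, noise_var):
    """Prior covariance, measurement covariance, gain matrix and posterior covariance."""
    K_qq = rbf_kernel(x_query, x_query) + 1e-9 * np.eye(len(x_query))
    K_qm = rbf_kernel(x_query, x_meas)
    K_mm = rbf_kernel(x_meas, x_meas) + noise_var * np.eye(len(x_meas))
    gain = K_qm @ np.linalg.inv(K_mm)
    S_post = K_qq - gain @ K_qm.T
    return K_qq, K_mm, gain, S_post


def expected_kl_mc(x_meas, x_query, noise_var=0.1, n_samples=20000, seed=0):
    """Monte Carlo estimate of E_z[ KL( p(f | z) || p(f) ) ] at the query points."""
    rng = np.random.default_rng(seed)
    K_qq, K_mm, gain, S_post = gp_terms(x_meas, x_query, noise_var)
    k = len(x_query)
    K_inv = np.linalg.inv(K_qq)
    # Sample hypothetical future measurements from the prior predictive N(0, K_mm).
    z = rng.multivariate_normal(np.zeros(len(x_meas)), K_mm, size=n_samples)
    post_means = z @ gain.T  # posterior mean for each sampled measurement vector
    quad = np.einsum("ij,jk,ik->i", post_means, K_inv, post_means)
    logdet_prior = np.linalg.slogdet(K_qq)[1]
    logdet_post = np.linalg.slogdet(S_post)[1]
    kl = 0.5 * (np.trace(K_inv @ S_post) + quad - k + logdet_prior - logdet_post)
    return float(kl.mean())


def expected_kl_exact(x_meas, x_query, noise_var=0.1):
    """Closed form under the same assumptions: 0.5 * (log det prior - log det posterior)."""
    K_qq, _, _, S_post = gp_terms(x_meas, x_query, noise_var)
    return 0.5 * (np.linalg.slogdet(K_qq)[1] - np.linalg.slogdet(S_post)[1])


x_meas = np.array([0.0, 1.0])   # hypothetical measurement locations
x_query = np.array([0.5, 2.0])  # hypothetical locations of interest
print(expected_kl_mc(x_meas, x_query), expected_kl_exact(x_meas, x_query))
```

In a planning loop, a value of this kind would be evaluated for each candidate sensor control input and the input with the largest value preferred; how the dissertation couples this to the sensor's field of view is not covered by the sketch.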
Three sensor planning algorithms are developed according to the assumptions on the target kinematics and the sensor dynamics. For problems where the control space of the sensor is discrete, a greedy algorithm is proposed. The efficiency of the greedy algorithm is demonstrated by a numerical experiment with data of ocean currents obtained by moored buoys. A sweep line algorithm is developed for applications where the sensor control space is continuous and unconstrained. Synthetic simulations as well as physical experiments with ground robots and a surveillance camera are conducted to evaluate the performance of the sweep line algorithm. Moreover, a lexicographic algorithm is designed based on the cumulative lower bound of the novel information theoretic functions, for the scenario where the sensor dynamics are constrained. Numerical experiments with real data collected from indoor pedestrians by a commercial pan-tilt camera are performed to examine the lexicographic algorithm. Results from both the numerical simulations and the physical experiments show that the three sensor planning algorithms proposed in this dissertation based on the novel information theoretic functions are superior at learning the target kinematics with little or no prior knowledge.
Abstract:
Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the "median" feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to the usual competitors.
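The four steps of the message algorithm described above can be sketched as follows. This is an illustrative, single-machine version under stated assumptions: the lasso (solved by a small coordinate-descent routine written here) stands in for "regularized regression", the per-subset loops run sequentially instead of in parallel, ties in the median are resolved by requiring a strict majority, and the names `lasso_cd` and `message_fit` are hypothetical.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent lasso for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    Assumes no column of X is identically zero.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]  # remove feature j's current contribution
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]  # add the updated contribution back
    return beta


def message_fit(X, y, n_subsets=3, lam=0.2):
    """Sample-space partitioning: select per subset, take the median, refit, average."""
    n, p = X.shape
    row_blocks = np.array_split(np.arange(n), n_subsets)
    # Step 1: feature selection on each row subset (parallelizable).
    inclusion = np.array([lasso_cd(X[r], y[r], lam) != 0 for r in row_blocks])
    # Step 2: 'median' inclusion index; strict majority is an assumed tie rule.
    selected = np.median(inclusion.astype(float), axis=0) > 0.5
    # Steps 3 and 4: least squares on the selected features per subset, then average.
    beta = np.zeros(p)
    if selected.any():
        fits = [np.linalg.lstsq(X[r][:, selected], y[r], rcond=None)[0]
                for r in row_blocks]
        beta[selected] = np.mean(fits, axis=0)
    return beta, selected


# Synthetic example: two active features out of ten.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 10))
true_beta = np.zeros(10)
true_beta[0], true_beta[3] = 2.0, -3.0
y = X @ true_beta + 0.5 * rng.normal(size=600)
beta_hat, selected = message_fit(X, y)
print(np.flatnonzero(selected), np.round(beta_hat, 2))
```

Only the binary inclusion vectors and the short coefficient vectors need to be exchanged between workers, which is where the low communication cost comes from.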
While sample space partitioning is useful in handling datasets with large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In this thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
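The abstract does not spell out the decorrelation step, so the sketch below rests on an assumption: that the data are premultiplied by the inverse square root of a ridge-stabilized Gram matrix, F = (XX^T/p + rI)^(-1/2), before the columns are split among workers. The function names and the ridge value are hypothetical. The sketch covers only decorrelation and partitioning; each worker would then run its own sparse regression on its block of columns.

```python
import numpy as np


def decorrelate(X, y, ridge=1e-3):
    """Premultiply X and y by F = (X X^T / p + ridge * I)^(-1/2) (assumed form)."""
    n, p = X.shape
    gram = X @ X.T / p + ridge * np.eye(n)
    eigvals, eigvecs = np.linalg.eigh(gram)          # gram is symmetric positive definite
    F = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T     # V diag(w^-1/2) V^T
    return F @ X, F @ y


def partition_features(p, n_workers):
    """Split column indices 0..p-1 into n_workers nearly equal blocks."""
    return np.array_split(np.arange(p), n_workers)


# Strongly correlated design: every column shares a common factor.
rng = np.random.default_rng(1)
n, p = 30, 300
X = rng.normal(size=(n, p)) + 2.0 * rng.normal(size=(n, 1))
y = X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=n)

X_tilde, y_tilde = decorrelate(X, y)
blocks = partition_features(p, n_workers=4)
worker_data = [(X_tilde[:, cols], y_tilde) for cols in blocks]

# After the transform, X_tilde X_tilde^T / p is close to the identity matrix.
print(np.abs(X_tilde @ X_tilde.T / p - np.eye(n)).max())
```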
For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework, DEME (DECO-message), which leverages both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each of a feasible size that can be stored and fitted on a computer in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
Abstract:
Purpose: The purpose of this study is to review studies published from 2007 to 2015 on tourism and hotel demand modeling and forecasting, with a view to identifying the emerging topics and methods studied and pointing out future research directions in the field.
Design/methodology/approach: Articles on tourism and hotel demand modeling and forecasting published in both Science Citation Index (SCI) and Social Sciences Citation Index (SSCI) journals were identified and analyzed.
Findings: This review found that studies focused on hotel demand are relatively fewer than those on tourism demand. It is also observed that more and more studies have moved away from aggregate tourism demand analysis, while disaggregate markets and niche products have attracted increasing attention. Some studies have gone beyond neoclassical economic theory to seek additional explanations of the dynamics of tourism and hotel demand, such as environmental factors, tourist online behavior and consumer confidence indicators, among others. More sophisticated techniques such as nonlinear smooth transition regression, mixed-frequency modeling and nonparametric singular spectrum analysis have also been introduced to this research area.
Research limitations/implications: The main limitation of this review is that the articles included cover only the English-language literature. Future reviews of this kind should also include articles published in other languages. The review provides a useful guide for researchers who are interested in future research on tourism and hotel demand modeling and forecasting.
Practical implications: This review provides important suggestions and recommendations for improving the efficiency of tourism and hospitality management practices.
Originality/value: The value of this review is that it identifies the current trends in tourism and hotel demand modeling and forecasting research and points out future research directions.
Characterising granuloma regression and liver recovery in a murine model of schistosomiasis japonica
Abstract:
In hepatic schistosomiasis, the egg-induced granulomatous response and the development of extensive fibrosis are the main pathologies. We used a Schistosoma japonicum-infected mouse model to characterise the multi-cellular pathways associated with recovery from hepatic fibrosis following clearance of the infection with the anti-schistosomal drug praziquantel. In the recovering animals, splenomegaly, granuloma density and liver fibrosis were all reduced. Inflammatory cell infiltration into the liver was evident, and the numbers of neutrophils, eosinophils and macrophages were significantly decreased. Transcriptomic analysis revealed the up-regulation of fatty acid metabolism genes and identified peroxisome proliferator-activated receptor alpha as the upstream regulator of liver recovery. The aryl hydrocarbon receptor signalling pathway, which regulates xenobiotic metabolism, was also differentially up-regulated. These findings provide a better understanding of the mechanisms associated with the regression of hepatic schistosomiasis.
Abstract:
This paper identifies opportunities for future research by addressing accounting issues faced by management accountants practicing in hospitality organizations. Specifically, the article focuses on the use of the uniform system of accounts by operating properties, the usefulness of allocating support costs to operated departments, extending our understanding of operating costs and performance measurement systems, and the certification of practicing accountants.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
One of the important problems in machine learning is determining the complexity of the model to be learned. Too much complexity leads to overfitting, which amounts to finding structures that do not actually exist in the data, whereas too little complexity leads to underfitting, meaning that the model is not expressive enough to capture all of the structures present in the data. For some probabilistic models, model complexity takes the form of one or more hidden variables whose role is to explain the generative process of the data. Various approaches exist for identifying the appropriate number of hidden variables in a model. This thesis focuses on Bayesian nonparametric methods for determining the number of hidden variables to use as well as their dimensionality. The popularization of Bayesian nonparametric statistics within the machine learning community is fairly recent. Their main appeal is that they offer highly flexible models whose complexity adjusts in proportion to the amount of available data. In recent years, research on Bayesian nonparametric learning methods has focused on three main aspects: the construction of new models, the development of inference algorithms, and applications. This thesis presents our contributions to these three research topics in the context of learning hidden-variable models. First, we introduce the Pitman-Yor process mixture of Gaussians, a model for learning infinite mixtures of Gaussians. We also present an inference algorithm for discovering the hidden components of the model, which we evaluate on two concrete robotics applications.
Our results show that the proposed approach outperforms classical learning approaches in both performance and flexibility. Second, we propose the extended cascading Indian buffet process, a model that serves as a prior probability distribution over the space of directed acyclic graphs. In the context of Bayesian networks, this prior makes it possible to identify both the presence of hidden variables and the network structure among them. A Markov chain Monte Carlo inference algorithm is used for evaluation on structure identification and density estimation problems. Finally, we propose the Indian chefs process, a model more general than the extended cascading Indian buffet process, for learning graphs and orders. The advantage of the new model is that it allows connections between observable variables and takes the order of the variables into account. We present a reversible-jump Markov chain Monte Carlo inference algorithm for jointly learning graphs and orders. The evaluation is carried out on density estimation and independence testing problems. This model is the first Bayesian nonparametric model capable of learning Bayesian networks with a completely arbitrary structure.
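As a small illustration of how a Pitman-Yor prior lets the number of mixture components grow with the data, the sketch below samples a partition using the standard Chinese-restaurant-style construction of the Pitman-Yor process: a new observation joins existing cluster j with probability (n_j - d) / (n + theta) and opens a new cluster with probability (theta + d * K) / (n + theta). It samples cluster assignments only; the Gaussian components and the thesis's inference algorithm are not reproduced, and the function name and default parameters are illustrative assumptions.

```python
import numpy as np


def pitman_yor_partition(n_points, discount=0.5, concentration=1.0, seed=0):
    """Sample cluster assignments from a Pitman-Yor process with parameters (d, theta)."""
    if not 0.0 <= discount < 1.0 or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")
    rng = np.random.default_rng(seed)
    counts = []        # counts[j] = number of points currently in cluster j
    assignments = []
    for _ in range(n_points):
        k = len(counts)
        # Unnormalised weights: existing clusters first, then one slot for a new cluster.
        weights = np.array([c - discount for c in counts]
                           + [concentration + discount * k])
        cluster = int(rng.choice(k + 1, p=weights / weights.sum()))
        if cluster == k:
            counts.append(1)       # open a new cluster
        else:
            counts[cluster] += 1   # join an existing cluster
        assignments.append(cluster)
    return assignments, counts


assignments, counts = pitman_yor_partition(200)
print(len(counts), sorted(counts, reverse=True)[:5])
```

Setting `discount=0.0` recovers the Dirichlet process; a positive discount yields heavier-tailed cluster sizes and more clusters for the same amount of data.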
Abstract:
We evaluate the integration of 3D preoperative computed tomography angiography of the coronary arteries with intraoperative 2D X-ray angiographies using a recently proposed registration-by-regression method. The method relates image features of 2D projection images to the transformation parameters of the 3D image. We compared different sets of features and studied the influence of preprocessing the training set. For the registration evaluation, a gold standard was developed from eight X-ray angiography sequences from six different patients. The alignment quality was measured using the 3D mean target registration error (mTRE). The registration-by-regression method achieved moderate accuracy (median mTRE of 15 mm) on real images. It therefore does not yet provide a complete solution to the 3D-2D registration problem, but it could be used as an initialisation method to eliminate the need for manual initialisation.
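For readers unfamiliar with the metric, the mTRE is commonly computed as the mean Euclidean distance between a set of 3D target points mapped by the estimated transformation and the same points mapped by the gold-standard transformation. The sketch below assumes rigid transformations given as 4x4 homogeneous matrices; the function names and the example data are hypothetical and do not come from the study.

```python
import numpy as np


def apply_transform(T, points):
    """Apply a 4x4 homogeneous transform to an (N, 3) array of points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ T.T)[:, :3]


def mean_tre(T_estimated, T_gold, targets):
    """Mean Euclidean distance (same unit as the points) between the two mappings."""
    diff = apply_transform(T_estimated, targets) - apply_transform(T_gold, targets)
    return float(np.linalg.norm(diff, axis=1).mean())


# Hypothetical example: the estimate is off by a pure translation of (3, 4, 0) mm.
rng = np.random.default_rng(0)
targets = rng.uniform(-50.0, 50.0, size=(100, 3))  # stand-in target points, in mm
T_gold = np.eye(4)
T_estimated = np.eye(4)
T_estimated[:3, 3] = [3.0, 4.0, 0.0]
print(mean_tre(T_estimated, T_gold, targets))  # 5.0 for this pure translation
```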
Abstract:
We present an IP-based nonparametric (revealed preference) testing procedure for rational consumption behavior in terms of general collective models, which include consumption externalities and public consumption. An empirical application to data drawn from the Russia Longitudinal Monitoring Survey (RLMS) demonstrates the practical usefulness of the procedure. Finally, we present extensions of the testing procedure to evaluate the goodness-of-fit of the collective model subject to testing, and to quantify and improve the power of the corresponding collective rationality tests.
Abstract:
Current practice for analysing functional neuroimaging data is to average the brain signals recorded at multiple sensors or channels on the scalp over time across hundreds of trials or replicates, in order to eliminate noise and enhance the underlying signal of interest. Studies that record brain signals non-invasively using functional neuroimaging techniques such as electroencephalography (EEG) and magnetoencephalography (MEG) generate complex, high-dimensional and noisy data for many subjects at a number of replicates. Single replicate (or single trial) analysis of neuroimaging data has gained attention because it allows the features of the signals to be studied at each replicate without averaging out important features in the data, as the current methods do. The research here is conducted to systematically develop flexible regression mixed models for single trial analysis of specific brain activities, using examples from EEG and MEG to illustrate the models. This thesis follows three specific themes: i) artefact correction to estimate the "brain" signal which is of interest, ii) characterisation of the signals to reduce their dimensions, and iii) model fitting for single trials after accounting for variations between subjects and within subjects (between replicates). The models are developed to establish evidence of two specific neurological phenomena: entrainment of brain signals to an α band of frequencies (8-12 Hz) in an EEG experiment, and dipolar brain activation in the same α frequency band in a MEG study.
Abstract:
Autoimmune hepatitis (AIH) is a disease of unknown aetiology, with drug-induced AIH being the most complex and least understood type. We present the case of a 57-year-old female patient with acute icteric hepatitis after interferon-beta-1b (IFNβ-1b) administration for multiple sclerosis (MS). Based on liver autoimmune serology, histology and appropriate exclusion of other liver diseases, a diagnosis of AIH-related cirrhosis was established. Following discontinuation of IFNβ-1b, a complete resolution of biochemical activity indices was observed, and the patient remained untreated by her own decision. However, 3 years later, after a course of intravenous methylprednisolone for MS, a new acute transaminase flare was recorded, which again subsided spontaneously after 3 weeks. Liver biopsy and elastography showed significant fibrosis regression (F2 fibrosis). To our knowledge, this is the first report showing spontaneous cirrhosis regression in an IFNβ-1b-induced AIH-like syndrome following drug withdrawal, suggesting that cirrhosis might be reversible if the offending fibrogenic stimulus is withdrawn.