995 results for component-wise gradient boosting
Abstract:
OBJECTIVES: The aim of the study was to assess whether prospective follow-up data within the Swiss HIV Cohort Study can be used to predict which patients stop smoking and, among smokers who stop, which start smoking again. METHODS: We built prediction models first using clinical reasoning ('clinical models') and then by selecting from numerous candidate predictors using advanced statistical methods ('statistical models'). Our clinical models were based on literature suggesting that motivation drives smoking cessation, while dependence drives relapse in those attempting to stop. Our statistical models were based on automatic variable selection using additive logistic regression with component-wise gradient boosting. RESULTS: Of 4833 smokers, 26% stopped smoking, at least temporarily, and among those who stopped, 48% started smoking again. The predictive performance of our clinical and statistical models was modest. A basic clinical model for cessation, with patients classified into three motivational groups, was nearly as discriminatory as a constrained statistical model with just the most important predictors (the ratio of nonsmoking visits to total visits, alcohol or drug dependence, psychiatric comorbidities, recent hospitalization and age). A basic clinical model for relapse, based on the maximum number of cigarettes per day prior to stopping, was not as discriminatory as a constrained statistical model with just the ratio of nonsmoking visits to total visits. CONCLUSIONS: Predicting smoking cessation and relapse is difficult, so simple models are nearly as discriminatory as complex ones. Patients with a history of attempting to stop and those known to have stopped recently are the best candidates for an intervention.
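To illustrate how component-wise gradient boosting performs variable selection in a logistic model, here is a minimal sketch. It is not the authors' implementation: it assumes simple linear base learners (one per predictor) rather than the additive smooth terms the abstract mentions, keeps the intercept fixed at the baseline log-odds, and uses hypothetical names (`componentwise_logit_boost`, `make_demo_data`) and synthetic data. At every iteration only the single predictor that best fits the current negative gradient is updated, so predictors that are never chosen keep a coefficient of exactly zero.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def componentwise_logit_boost(X, y, n_iter=100, nu=0.1):
    """Component-wise boosting for logistic regression (linear base learners).

    Assumes y contains both classes (0 and 1). Returns (intercept, coef, means),
    where coef applies to the mean-centred columns of X.
    """
    n, p = len(X), len(X[0])
    # Centre each column so the univariate base learners need no intercept.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    ss = [sum(row[j] ** 2 for row in Xc) for j in range(p)]
    ybar = sum(y) / n
    intercept = math.log(ybar / (1.0 - ybar))  # baseline log-odds, kept fixed
    coef = [0.0] * p
    F = [intercept] * n  # current additive predictor per observation
    for _ in range(n_iter):
        # Negative gradient of the binomial log-loss: y - p.
        r = [y[i] - sigmoid(F[i]) for i in range(n)]
        best_j, best_b, best_gain = None, 0.0, 0.0
        for j in range(p):
            if ss[j] == 0.0:
                continue  # constant column, cannot explain anything
            sxr = sum(Xc[i][j] * r[i] for i in range(n))
            gain = sxr * sxr / ss[j]  # reduction in residual sum of squares
            if gain > best_gain:
                best_j, best_b, best_gain = j, sxr / ss[j], gain
        if best_j is None:
            break
        # Update only the winning component, shrunk by the step length nu.
        coef[best_j] += nu * best_b
        for i in range(n):
            F[i] += nu * best_b * Xc[i][best_j]
    return intercept, coef, means


def make_demo_data(n=400, p=5, seed=0):
    """Synthetic data (illustrative only): the outcome depends on column 0."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.0 * row[0]) else 0 for row in X]
    return X, y
```

In practice the number of iterations acts as the main tuning parameter and would typically be chosen by cross-validation; stopping early keeps more coefficients at zero and therefore yields a sparser model.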
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Purpose To this day, the slit lamp remains the first tool used by an ophthalmologist to examine patient eyes. Imaging of the retina with it, however, poses a variety of problems, namely a shallow depth of focus, reflections from the optical system, a small field of view and non-uniform illumination. As a result, using slit lamp images for documentation and analysis remains extremely challenging for ophthalmologists because of large image artifacts. For this reason, we propose an automatic retinal slit lamp video mosaicking method, which enlarges the field of view and reduces the amount of noise and reflections, thus enhancing image quality. Methods Our method is composed of three parts: (i) viable content segmentation, (ii) global registration and (iii) image blending. Frame content is segmented using gradient boosting with custom pixel-wise features. Speeded-up robust features are used for finding pair-wise translations between frames, with robust random sample consensus estimation and graph-based simultaneous localization and mapping for global bundle adjustment. Foreground-aware blending based on feathering merges video frames into comprehensive mosaics. Results Foreground is segmented successfully with an area under the receiver operating characteristic curve of 0.9557. Mosaicking results from our method and from state-of-the-art methods were compared and rated by ophthalmologists, who showed a strong preference for the large field of view provided by our method. Conclusions The proposed method for global registration of retinal slit lamp images into comprehensive mosaics improves over state-of-the-art methods and is preferred qualitatively.
Abstract:
This thesis aims to investigate the extremal properties of certain risk models of interest in various applications from insurance, finance and statistics. This thesis develops along two principal lines, namely: In the first part, we focus on two univariate risk models, i.e., deflated risk and reinsurance risk models. Therein we investigate their tail expansions under certain tail conditions of the common risks. Our main results are illustrated by some typical examples and numerical simulations as well. Finally, the findings are formulated into some applications in insurance fields, for instance, the approximations of Value-at-Risk, conditional tail expectations etc. The second part of this thesis is devoted to the following three bivariate models: The first model is concerned with bivariate censoring of extreme events. For this model, we first propose a class of estimators for both tail dependence coefficient and tail probability. These estimators are flexible due to a tuning parameter and their asymptotic distributions are obtained under some second order bivariate slowly varying conditions of the model. Then, we give some examples and present a small Monte Carlo simulation study followed by an application on a real-data set from insurance. The objective of our second bivariate risk model is the investigation of the tail dependence coefficient of bivariate skew slash distributions. Such skew slash distributions are extensively useful in statistical applications and they are generated mainly by normal mean-variance mixture and scaled skew-normal mixture, which distinguish the tail dependence structure as shown by our principal results.
The third bivariate risk model is concerned with the approximation of the component-wise maxima of skew elliptical triangular arrays. The theoretical results are based on certain tail assumptions on the underlying random radius.
Abstract:
Meropenem, a carbapenem antibiotic displaying a broad spectrum of antibacterial activity, is administered in the Medical Intensive Care Unit to critically ill patients undergoing continuous veno-venous haemodiafiltration (CVVHDF). However, there are limited data available to substantiate rational dosing decisions in this condition. In an attempt to refine our knowledge and propose a rationally designed dosage regimen, we have developed an HPLC method to determine meropenem after solid-phase extraction (SPE) of plasma and dialysate fluids obtained from patients under CVVHDF. The assay comprises the simultaneous measurement of meropenem's open-ring metabolite UK-1a, whose fate has never been studied in CVVHDF patients. The clean-up procedure involved an SPE on a C18 cartridge. Matrix components were eliminated with phosphate buffer pH 7.4 followed by 15:85 MeOH-phosphate buffer pH 7.4. Meropenem and UK-1a were subsequently desorbed with MeOH. The eluates were evaporated under nitrogen at room temperature (RT) and reconstituted in phosphate buffer pH 7.4. Separation was performed at RT on a Nucleosil 100-5 µm C18 AB cartridge column (125 x 4 mm I.D.) equipped with a guard column (8 x 4 mm I.D.) with UV-DAD detection set at 208 nm. The mobile phase was delivered at 1 ml min(-1), using a step-wise gradient elution program: %MeOH/0.005 M tetrabutylammonium chloride pH 7.4; 10/90-50/50 in 27 min. Over the range of 5-100 µg ml(-1), the regression coefficients of the calibration curves (plasma and dialysate) were >0.998. The absolute extraction recoveries of meropenem and UK-1a in plasma and filtrate-dialysate were stable and ranged from 88-93 to 72-77% for meropenem, and from 95-104 to 75-82% for UK-1a. In plasma and filtrate-dialysate, respectively, the mean intra-assay precision was 4.1 and 2.6% for meropenem and 4.2 and 3.7% for UK-1a. The inter-assay variability was 2.8 and 3.6% for meropenem and 2.3 and 2.8% for UK-1a.
The accuracy was satisfactory for both meropenem and UK-1a, with deviation never exceeding 9.0% of the nominal concentrations. The stability of meropenem, studied in biological samples left at RT and at +4 °C, was satisfactory with < 5% degradation after 1.5 h in blood but reached 22% in filtrate-dialysate samples stored at RT for 8 h, precluding accurate measurements of meropenem excreted unchanged in the filtrate-dialysate left at RT during the CVVHDF procedure. The method reported here enables accurate measurements of meropenem in critically ill patients under CVVHDF, making dosage individualisation possible in such patients. The levels of the metabolite UK-1a encountered in this population of patients were higher than those observed in healthy volunteers but were similar to those observed in patients with renal impairment under hemodialysis.
Abstract:
An effective solution to model and apply planning domain knowledge for deliberation and action in probabilistic, agent-oriented control is presented. Specifically, the addition of a task structure planning component and supporting components to an agent-oriented architecture and agent implementation is described. For agent control in risky or uncertain environments, an approach and method of goal reduction to task plan sets and schedules of action is presented. Additionally, some issues related to component-wise, situation-dependent control of a task planning agent that schedules its tasks separately from planning them are motivated and discussed.
Abstract:
In the spinal cord of the anesthetized cat, spontaneous cord dorsum potentials (CDPs) appear synchronously along the lumbo-sacral segments. These CDPs have different shapes and magnitudes. Previous work has indicated that some CDPs appear to be specially associated with the activation of spinal pathways that lead to primary afferent depolarization and presynaptic inhibition. Visual detection and classification of these CDPs provides relevant information on the functional organization of the neural networks involved in the control of sensory information and allows the characterization of the changes produced by acute nerve and spinal lesions. We now present a novel feature extraction approach for signal classification, applied to CDP detection. The method is based on an intuitive procedure. We first remove by convolution the noise from the CDPs recorded in each given spinal segment. Then, we assign a coefficient for each main local maximum of the signal using its amplitude and distance to the most important maximum of the signal. These coefficients will be the input for the subsequent classification algorithm. In particular, we employ gradient boosting classification trees. This combination of approaches allows a faster and more accurate discrimination of CDPs than is obtained by other methods.
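The feature extraction steps described above (smoothing by convolution, then one coefficient per main local maximum based on its amplitude and its distance to the largest maximum) can be sketched as follows. The abstract does not give the exact formula, so the coefficient used here (amplitude relative to the main peak, divided by one plus the distance in samples) is an assumption for illustration only, as are the moving-average kernel and the function names `smooth`, `local_maxima` and `peak_features`.

```python
def smooth(signal, width=5):
    """Denoise by convolving with a moving-average kernel (truncated at the edges)."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def local_maxima(signal):
    """Indices of interior samples that rise above the left neighbour
    and are not exceeded by the right neighbour."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def peak_features(signal, width=5, n_peaks=3):
    """Fixed-length feature vector built from the n_peaks highest local maxima.

    Hypothetical coefficient: (amplitude / main amplitude) / (1 + distance to main peak).
    """
    s = smooth(signal, width)
    peaks = local_maxima(s)
    if not peaks:
        return [0.0] * n_peaks
    main = max(peaks, key=lambda i: s[i])
    if s[main] <= 0:
        return [0.0] * n_peaks  # no positive deflection to normalise against
    top = sorted(peaks, key=lambda i: s[i], reverse=True)[:n_peaks]
    feats = [(s[i] / s[main]) / (1 + abs(i - main)) for i in top]
    # Pad so every recording yields a vector of the same length.
    return feats + [0.0] * (n_peaks - len(feats))
```

Vectors produced this way, one per recorded potential, could then be passed to any gradient boosting tree classifier together with the class labels.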
Abstract:
In many e-commerce Web sites, product recommendation is essential to improve user experience and boost sales. Most existing product recommender systems rely on historical transaction records or Web-site-browsing history of consumers in order to accurately predict online users’ preferences for product recommendation. As such, they are constrained by the limited information available on specific e-commerce Web sites. With the prolific use of social media platforms, it has now become possible to extract product demographics from online product reviews and social networks built from microblogs. Moreover, users’ public profiles available on social media often reveal their demographic attributes such as age, gender, and education. In this paper, we propose to leverage the demographic information of both products and users extracted from social media for product recommendation. Specifically, we frame recommendation as a learning-to-rank problem which takes as input the features derived from both product and user demographics. An ensemble method based on gradient-boosting regression trees is extended to make it suitable for our recommendation task. We have conducted extensive experiments to obtain both quantitative and qualitative evaluation results. Moreover, we have also conducted a user study to gauge the performance of our proposed recommender system in a real-world deployment. All the results show that our system generates recommendations that match users’ preferences better than those of the competitive baselines.
Abstract:
In recent years, the boundaries between e-commerce and social networking have become increasingly blurred. Many e-commerce websites support the mechanism of social login, where users can sign in to the websites using their social network identities such as their Facebook or Twitter accounts. Users can also post their newly purchased products on microblogs with links to the e-commerce product web pages. In this paper, we propose a novel solution for cross-site cold-start product recommendation, which aims to recommend products from e-commerce websites to users at social networking sites in 'cold-start' situations, a problem which has rarely been explored before. A major challenge is how to leverage knowledge extracted from social networking sites for cross-site cold-start product recommendation. We propose to use the linked users across social networking sites and e-commerce websites (users who have social networking accounts and have made purchases on e-commerce websites) as a bridge to map users' social networking features to another feature representation for product recommendation. Specifically, we propose learning both users' and products' feature representations (called user embeddings and product embeddings, respectively) from data collected from e-commerce websites using recurrent neural networks and then applying a modified gradient boosting trees method to transform users' social networking features into user embeddings. We then develop a feature-based matrix factorization approach which can leverage the learnt user embeddings for cold-start product recommendation. Experimental results on a large dataset constructed from the largest Chinese microblogging service Sina Weibo and the largest Chinese B2C e-commerce website JingDong have shown the effectiveness of our proposed framework.
Abstract:
Thesis (Master's)--University of Washington, 2016-08
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Understanding how virus strains offer protection against closely related emerging strains is vital for creating effective vaccines. For many viruses, including Foot-and-Mouth Disease Virus (FMDV) and the Influenza virus, where multiple serotypes often co-circulate, in vitro testing of large numbers of vaccines can be infeasible. Therefore the development of an in silico predictor of cross-protection between strains is important to help optimise vaccine choice. Vaccines will offer cross-protection against closely related strains, but not against those that are antigenically distinct. To be able to predict cross-protection we must understand the antigenic variability within a virus serotype and between distinct lineages of a virus, and identify the antigenic residues and evolutionary changes that cause the variability. In this thesis we present a family of sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution (SABRE), as well as an extended version of the method, the extended SABRE (eSABRE) method, which better takes into account the data collection process. The SABRE methods are a family of sparse Bayesian hierarchical models that use spike and slab priors to identify sites in the viral protein which are important for the neutralisation of the virus. In this thesis we demonstrate how the SABRE methods can be used to identify antigenic residues within different serotypes and show how the SABRE method outperforms established methods, namely mixed-effects models based on forward variable selection or l1 regularisation, on both synthetic and viral datasets. In addition we also test a number of different versions of the SABRE method, and compare conjugate and semi-conjugate prior specifications and an alternative to the spike and slab prior: the binary mask model. We also propose novel proposal mechanisms for the Markov chain Monte Carlo (MCMC) simulations, which improve mixing and convergence over that of the established component-wise Gibbs sampler.
The SABRE method is then applied to datasets from FMDV and the Influenza virus in order to identify a number of known antigenic residues and to provide hypotheses of other potentially antigenic residues. We also demonstrate how the SABRE methods can be used to create accurate predictions of the important evolutionary changes of the FMDV serotypes. In this thesis we provide an extended version of the SABRE method, the eSABRE method, based on a latent variable model. The eSABRE method takes further into account the structure of the datasets for FMDV and the Influenza virus through the latent variable model and gives an improvement in the modelling of the error. We show how the eSABRE method outperforms the SABRE methods in simulation studies and propose a new information criterion for selecting the random effects factors that should be included in the eSABRE method: the block integrated Widely Applicable Information Criterion (biWAIC). We demonstrate how biWAIC performs equally to two other methods for selecting the random effects factors and combine it with the eSABRE method to apply it to two large Influenza datasets. Inference in these large datasets is computationally infeasible with the SABRE methods, but as a result of the improved structure of the likelihood, we are able to show how the eSABRE method offers a computational improvement, allowing it to be used on these datasets. The results of the eSABRE method show that we can use the method in a fully automatic manner to identify a large number of antigenic residues on a variety of the antigenic sites of two Influenza serotypes, as well as making predictions of a number of nearby sites that may also be antigenic and are worthy of further experimental investigation.
Abstract:
Fruit recognition methods based on different descriptors and classifiers were studied. A previously created and manually labelled database of 3,393 coffee and non-coffee images was used. Quantitative tests demonstrated the identification of berries with 93% precision and 77% recall using HoG descriptors combined with the median of the colour components in the La*b* format, together with the Gradient Boosting classifier. These results improve on the method previously proposed by Santos (2015) and demonstrate the potential for developing methods that can be applied in precision agriculture, monitoring and crop yield prediction.