903 results for Data-driven knowledge acquisition
Abstract:
A methodology for exploratory data analysis of the phenomenon of orographic precipitation enhancement is proposed. The precipitation observations obtained from three Swiss Doppler weather radars are analysed for the major precipitation event of August 2005 in the Alps. Image processing techniques are used to detect significant precipitation cells/pixels in radar images while filtering out spurious effects due to ground clutter. The contribution of topography to precipitation patterns is described by an extensive set of topographical descriptors computed from the digital elevation model at multiple spatial scales. Additionally, the motion vector field is derived from successive radar images and combined with the topographic features to highlight the slopes exposed to the main flows. Exploratory data analysis with a recent spectral clustering algorithm shows that orographic precipitation cells are generated under specific flow and topographic conditions. The repeatability of precipitation patterns at particular spatial locations is found to be linked to specific local terrain shapes, e.g. at the tops of hills and on the upwind side of mountains. This methodology and our empirical findings for the Alpine region provide a basis for building computational data-driven models of orographic enhancement and triggering of precipitation. Copyright (C) 2011 Royal Meteorological Society.
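The exploratory pipeline described above (detect cells, describe them with terrain and flow features, cluster) could be sketched roughly as follows; the feature set, synthetic data and clustering settings are illustrative assumptions, not those of the paper:

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Each row: one detected precipitation cell; columns: hypothetical multi-scale
# terrain descriptors (slope, convexity, ...) and flow-exposure features.
cells = rng.normal(size=(200, 4))

# Standardize features, then cluster cells with spectral clustering on a
# nearest-neighbour affinity graph.
X = StandardScaler().fit_transform(cells)
labels = SpectralClustering(n_clusters=3, affinity="nearest_neighbors",
                            n_neighbors=10, random_state=0).fit_predict(X)
print(labels.shape)  # one cluster label per detected cell
```

In the study, clusters found this way would then be inspected for recurring flow/topography conditions.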
Abstract:
Ubiquitous assessment of swimming velocity (the main performance metric) is essential for the coach to provide tailored feedback to the trainee. We present a probabilistic framework for the data-driven estimation of the swimming velocity at every cycle using a low-cost wearable inertial measurement unit (IMU). The statistical validation of the method on 15 swimmers shows that an average relative error of 0.1 ± 9.6% and a high correlation with the tethered reference system (r_{X,Y} = 0.91) are achievable. In addition, a simple tool for analyzing the influence of sacrum kinematics on performance is provided.
Abstract:
The gut microbiota has recently been proposed as a crucial environmental factor in the development of metabolic diseases such as obesity and type 2 diabetes, mainly due to its contribution to the modulation of several processes including host energy metabolism, gut epithelial permeability, gut peptide hormone secretion, and host inflammatory state. Since the symbiotic interaction between the gut microbiota and the host is essentially reflected in specific metabolic signatures, much expectation is placed on the application of metabolomic approaches to unveil the key mechanisms linking gut microbiota composition and activity with disease development. The present review aims to summarize the gut microbial-host co-metabolites identified so far by targeted and untargeted metabolomic studies in humans, in association with impaired glucose homeostasis and/or obesity. Altered co-metabolism of bile acids, branched fatty acids, choline, vitamins (i.e., niacin), purines, and phenolic compounds has so far been associated with the obese or diabetic phenotype, with respect to healthy controls. Furthermore, anti-diabetic treatments such as metformin and sulfonylureas have been observed to modulate the gut microbiota, or at least its metabolic profiles, thereby potentially affecting insulin resistance through indirect mechanisms that are still unknown. Despite the scarcity of the metabolomic studies currently available on microbial-host crosstalk, the data-driven results largely confirm findings independently obtained from in vitro and animal model studies, supporting the proposed mechanisms by which a dysfunctional gut microbiota contributes to the development of metabolic disorders.
Abstract:
Functional connectivity (FC), as measured by the correlation between fMRI BOLD time courses of distinct brain regions, has revealed meaningful organization of spontaneous fluctuations in the resting brain. However, an increasing amount of evidence points to non-stationarity of FC; i.e., FC dynamically changes over time, reflecting additional and rich information about brain organization but presenting new challenges for analysis and interpretation. Here, we propose a data-driven approach based on principal component analysis (PCA) to reveal hidden patterns of coherent FC dynamics across multiple subjects. We demonstrate the feasibility and relevance of this new approach by examining the differences in dynamic FC between 13 healthy control subjects and 15 minimally disabled relapsing-remitting multiple sclerosis patients. We estimated whole-brain dynamic FC of regionally averaged BOLD activity using sliding time windows. We then used PCA to identify FC patterns, termed "eigenconnectivities", that reflect meaningful patterns in FC fluctuations. We then assessed the contributions of these patterns to the dynamic FC at any given time point and identified a network of connections centered on the default-mode network with altered contributions in patients. Our results complement traditional stationary analyses and reveal novel insights into brain connectivity dynamics and their modulation in a neurodegenerative disease.
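The "eigenconnectivities" idea can be sketched in a few lines: compute sliding-window correlation matrices from regional BOLD time courses, vectorize their upper triangles, and apply PCA across windows. The dimensions, window length and synthetic data below are illustrative assumptions, not the study's settings:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_regions, n_timepoints, win, step = 10, 300, 60, 10
bold = rng.normal(size=(n_regions, n_timepoints))  # regionally averaged BOLD

iu = np.triu_indices(n_regions, k=1)  # indices of unique region pairs
windows = []
for start in range(0, n_timepoints - win + 1, step):
    fc = np.corrcoef(bold[:, start:start + win])  # windowed FC matrix
    windows.append(fc[iu])                        # vectorized upper triangle
dfc = np.array(windows)  # (n_windows, n_pairs) dynamic FC matrix

pca = PCA(n_components=5)
weights = pca.fit_transform(dfc)       # contribution of each pattern per window
eigenconnectivities = pca.components_  # (5, n_pairs) FC fluctuation patterns
print(eigenconnectivities.shape)
```

In the study, the per-window weights are what would be compared between patients and controls; here they are computed but not analysed further.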
Abstract:
The project's main objective was to address the political instrumentalization of immigration by new extreme-right political parties. This phenomenon, which has acquired great salience in much of Europe, is becoming increasingly relevant in the British and Catalan cases owing, among other things, to the electoral emergence of the parties Plataforma per Catalunya and the British National Party. The project therefore set out to produce a body of data and knowledge allowing a well-grounded analysis of a phenomenon that is "new" in the Catalan context and has so far received scant academic attention. Within this objective, the British team's extensive experience in the analysis of the new extreme right enabled the Catalan researchers to develop their work supported by, and in dialogue with, their British counterpart. The research results are genuinely novel and will represent an important contribution to the knowledge of this phenomenon. In the course of the research, a series of interviews with members and voters of PxC were conducted, along with a survey of voters and an aggregate analysis of the party's vote. It is the first time that data of this kind have been obtained, which is making, and will make, their exploitation and dissemination highly relevant both in academia and in public administration. Finally, the project has also served to consolidate the relationship between the British and Catalan research teams and to promote the integration of the Catalan team into European research networks on this subject.
Abstract:
Given the rate of projected environmental change for the 21st century, urgent adaptation and mitigation measures are required to slow down the ongoing erosion of biodiversity. Even though increasing evidence shows that recent human-induced environmental changes have already triggered species' range shifts, changes in phenology and species' extinctions, accurate projections of species' responses to future environmental changes are more difficult to ascertain. This is problematic, since there is a growing awareness of the need to adopt proactive conservation planning measures using forecasts of species' responses to future environmental changes. There is a substantial body of literature describing and assessing the impacts of various scenarios of climate and land-use change on species' distributions. Model predictions include a wide range of assumptions and limitations that are widely acknowledged but compromise their use for developing reliable adaptation and mitigation strategies for biodiversity. Indeed, amongst the most used models, few, if any, explicitly deal with migration processes, the dynamics of populations at the "trailing edge" of shifting distributions, species' interactions and the interaction between the effects of climate and land-use. In this review, we propose two main avenues to progress the understanding and prediction of the different processes occurring at the leading and trailing edges of a species' distribution in response to any global change phenomena. Deliberately focusing on plant species, we first explore the different ways to incorporate species' migration into the existing modelling approaches, given data and knowledge limitations and the dual effects of climate and land-use factors. Secondly, we explore the mechanisms and processes at work at the trailing edge of a shifting species' distribution and how to implement them in a modelling approach.
We finally conclude this review with clear guidelines on how such modelling improvements will benefit conservation strategies in a changing world. (c) 2007 Rubel Foundation, ETH Zurich. Published by Elsevier GmbH. All rights reserved.
Abstract:
OBJECTIVE: To develop a provisional definition for the evaluation of response to therapy in juvenile dermatomyositis (DM) based on the Paediatric Rheumatology International Trials Organisation juvenile DM core set of variables. METHODS: Thirty-seven experienced pediatric rheumatologists from 27 countries achieved consensus on 128 difficult patient profiles as clinically improved or not improved using a stepwise approach (patient rating, statistical analysis, definition selection). Using the physicians' consensus ratings as the "gold standard measure," chi-square, sensitivity, specificity, false-positive and false-negative rates, area under the receiver operating characteristic curve, and kappa agreement for candidate definitions of improvement were calculated. Definitions with kappa values >0.8 were multiplied by the face validity score to select the top definitions. RESULTS: The top definition of improvement was at least 20% improvement from baseline in 3 of 6 core set variables, with no more than 1 of the remaining variables worsening by more than 30%, which cannot be muscle strength. The second-highest scoring definition was at least 20% improvement from baseline in 3 of 6 core set variables, with no more than 2 of the remaining variables worsening by more than 25%, which cannot be muscle strength (definition P1 selected by the International Myositis Assessment and Clinical Studies group). The third is similar to the second, with the maximum amount of worsening set to 30%. This indicates convergent validity of the process. CONCLUSION: We propose a provisional data-driven definition of improvement that reflects well the consensus rating of experienced clinicians and incorporates clinically meaningful change in core set variables in a composite end point for the evaluation of global response to therapy in juvenile DM.
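The top-scoring definition reads naturally as a decision rule: at least 20% improvement from baseline in 3 of 6 core set variables, with at most 1 of the remaining variables worsening by more than 30%, and that one may not be muscle strength. A sketch of that rule, where the variable names and the sign convention (positive percent = improvement) are illustrative assumptions:

```python
def improved(pct_change, strength_var="muscle_strength"):
    """pct_change: dict mapping core set variable -> percent change from
    baseline, where positive values mean improvement."""
    better = [v for v, c in pct_change.items() if c >= 20]   # improved >= 20%
    worse = [v for v, c in pct_change.items() if c < -30]    # worsened > 30%
    return (len(better) >= 3
            and len(worse) <= 1
            and strength_var not in worse)

# Hypothetical patient: 3 variables improved >= 20%, none worsened > 30%.
patient = {"muscle_strength": 25, "physician_global": 30, "patient_global": 22,
           "function": 5, "disease_activity": -10, "health_quality": 0}
print(improved(patient))  # True
```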
Abstract:
This paper characterizes the innovation strategy of manufacturing firms and examines the relation between the innovation strategy and important industry-, firm- and innovation-specific characteristics using Belgian data from the Eurostat Community Innovation Survey. In addition to important size effects explaining innovation, we find that high perceived risks and costs and low appropriability of innovations do not discourage innovation, but rather determine how the innovation sourcing strategy is chosen. With respect to the determinants of the decision of the innovative firm to produce technology itself (Make) or to source technology externally (Buy), we find that small firms are more likely to restrict their innovation strategy to an exclusive make or buy strategy, while large firms are more likely to combine both internal and external knowledge acquisition in their innovation strategy. An interesting result that highlights the complementary nature of the Make and Buy decisions is that, controlling for firm size, companies for which internal information is an important source for innovation are more likely to combine internal and external sources of technology. We take this as evidence that in-house R&D generates the necessary absorptive capacity to profit from external knowledge acquisition. The effectiveness of different mechanisms to appropriate the benefits of innovations and the internal organizational resistance against change are also important determinants of the firm's technology sourcing strategy.
Abstract:
Contemporary thoracic and cardiovascular surgery uses extensive equipment and devices to enable its performance. As the specialties develop and new frontiers are crossed, the technology needs to advance in a parallel fashion. Strokes of genius or problem-solving brainstorming may generate great ideas, but the metamorphosis of an idea into a physical, functioning tool requires a lot more than just a thinking process. A modern surgical device is the end point of a sophisticated, complicated and potentially treacherous route, which incorporates new skills and knowledge acquisition. Processes including technology transfer, commercialisation, corporate and product development, intellectual property and regulatory routes all play pivotal roles in this voyage. Many good ideas may fall by the wayside for a multitude of reasons, as they may not be marketable or may be badly marketed. In this article, we attempt to illuminate the components required in the process of surgical innovation, which we believe must remain in the remit of the modern-day thoracic and cardiovascular surgeon.
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, and it is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. This paper considers a data-driven approach to modelling uncertainty in spatial predictions. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic features and to describe the stochastic variability and non-uniqueness of spatial properties. It is able to capture and preserve key spatial dependencies such as connectivity, which is often difficult to achieve with two-point geostatistical models. The semi-supervised SVR is designed to integrate various kinds of conditioning data and learn dependencies from them. A stochastic semi-supervised SVR model is integrated into a Bayesian framework to quantify uncertainty with multiple models fitted to dynamic observations. The developed approach is illustrated with a reservoir case study. The resulting probabilistic production forecasts are described by uncertainty envelopes.
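The semi-supervised SVR of the paper is a custom model; as a minimal sketch, here is only its supervised building block, standard SVR applied to a spatial prediction task (coordinates to reservoir property). The synthetic data, property name and hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(150, 2))  # sampled well/observation locations
# Hypothetical spatial property (e.g. porosity) with noise.
porosity = np.sin(coords[:, 0]) + 0.1 * rng.normal(size=150)

# RBF-kernel SVR as the spatial regressor conditioned on observed locations.
model = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(coords, porosity)

grid = np.array([[2.0, 5.0], [7.5, 1.0]])   # unsampled locations
pred = model.predict(grid)                  # spatial predictions
print(pred.shape)
```

In the paper's framework, many such models fitted to dynamic observations would be combined in a Bayesian scheme to obtain uncertainty envelopes.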
Abstract:
This study investigated the spatial, spectral, temporal and functional properties of functional brain connections involved in the concurrent execution of unrelated visual perception and working memory tasks. Electroencephalography data were analysed using a novel data-driven approach assessing source coherence at the whole-brain level. Three connections in the beta-band (18-24 Hz) and one in the gamma-band (30-40 Hz) were modulated by dual-task performance. Beta-coherence increased within two dorsofrontal-occipital connections in dual-task conditions compared to the single-task condition, with the highest coherence seen during low working memory load trials. In contrast, beta-coherence in a prefrontal-occipital functional connection and gamma-coherence in an inferior frontal-occipitoparietal connection were not affected by the addition of the second task and only showed elevated coherence under high working memory load. Analysis of coherence as a function of time suggested that the dorsofrontal-occipital beta-connections were relevant to working memory maintenance, while the prefrontal-occipital beta-connection and the inferior frontal-occipitoparietal gamma-connection were involved in top-down control of concurrent visual processing. The fact that the increase in coherence in the gamma-connection from low to high working memory load was correlated with faster reaction times on the perception task supports this interpretation. Together, these results demonstrate that dual-task demands trigger non-linear changes in functional interactions between frontal-executive and occipitoparietal-perceptual cortices.
Abstract:
In this paper, we develop a data-driven methodology to characterize the likelihood of orographic precipitation enhancement using sequences of weather radar images and a digital elevation model (DEM). Geographical locations with topographic characteristics favorable to the repeated and persistent occurrence of orographic precipitation, such as stationary cells, upslope rainfall enhancement, and repeated convective initiation, are detected by analyzing the spatial distribution of a set of precipitation cells extracted from radar imagery. Topographic features such as terrain convexity and gradients computed from the DEM at multiple spatial scales, as well as velocity fields estimated from sequences of weather radar images, are used as explanatory factors to describe the occurrence of localized precipitation enhancement. The latter is represented as a binary process by defining a threshold on the number of cell occurrences at particular locations. Both two-class and one-class support vector machine classifiers are tested to separate the presumed orographic cells from the nonorographic ones in the space of contributing topographic and flow features. Site-based validation is carried out to estimate realistic generalization skills of the obtained spatial prediction models. Due to the high class separability, the decision function of the classifiers can be interpreted as a likelihood or susceptibility of orographic precipitation enhancement. The developed approach can serve as a basis for refining radar-based quantitative precipitation estimates and short-term forecasts, or for generating stochastic precipitation ensembles conditioned on the local topography.
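A minimal sketch of the one-class variant described above: learn the support of "orographic" locations in a topographic/flow feature space and read the decision function as a susceptibility score. The features, synthetic data and hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Rows: locations with repeated cell occurrences (presumed orographic);
# columns: hypothetical multi-scale convexity, gradient and flow features.
orographic = rng.normal(loc=1.0, size=(120, 3))

scaler = StandardScaler().fit(orographic)
oc = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale")
oc.fit(scaler.transform(orographic))

# Decision-function values at candidate locations act as a susceptibility
# score: higher means more "orographic-like".
candidates = rng.normal(loc=1.0, size=(5, 3))
susceptibility = oc.decision_function(scaler.transform(candidates))
print(susceptibility.shape)
```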
Abstract:
Advanced kernel methods for remote sensing image classification. Devis Tuia, Institut de Géomatique et d'Analyse du Risque, September 2009. Abstract: The technical developments in recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to the users. However, even if these advances open more and more possibilities in the use of digital imagery, they also raise several problems of storage and treatment. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The accent is put on algorithmic efficiency and on the simplicity of the approaches proposed, to avoid overly complex models that would not be adopted by users. 
The major challenge of the Thesis is to remain close to concrete remote sensing problems without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities, and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed: first, an adaptive model learning the relevant image features is proposed to solve the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the single features. The scarcity and unreliability of labeled information were the common root of the second and third models proposed: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase the robustness and quality of the description of the data. Both solutions have been explored, resulting in two methodological contributions, based respectively on active learning and semi-supervised learning. Finally, the more theoretical issue of structured outputs is considered in the last model, which, by integrating output similarity into the model, opens new challenges and opportunities for remote sensing image processing.
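The active-learning idea behind the second model, iteratively asking the user to label the pixels the current classifier is least certain about, can be sketched as follows; the margin-based uncertainty criterion, base classifier and synthetic data are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                    # pixel feature vectors
y = (X[:, 0] + 0.3 * rng.normal(size=500) > 0).astype(int)  # oracle labels

# Seed the training set with 5 labeled pixels per class.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(500) if i not in labeled]

for _ in range(5):                               # five user-interaction rounds
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])
    margin = np.abs(proba[:, 1] - proba[:, 0])   # small margin = uncertain
    pick = pool[int(np.argmin(margin))]          # query the most uncertain pixel
    labeled.append(pick)                         # the "user" supplies its label
    pool.remove(pick)

print(len(labeled))  # 15 labeled pixels after 5 queries
```

Each round grows the labeled set with the single most informative pixel, which is the interaction loop the thesis describes between user and machine.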
Abstract:
Little is known about Internet use among adolescents with chronic conditions (CCs). Our results indicate that CC females, but not males, are more likely to be heavy Internet users than their peers. CC youths are also more likely to visit health-related web sites, but less frequently than other sites.
Abstract:
This paper presents multiple kernel learning (MKL) regression as an exploratory spatial data analysis and modelling tool. The MKL approach is introduced as an extension of support vector regression, where MKL uses dedicated kernels to divide a given task into sub-problems and to treat them separately in an effective way. It provides better interpretability for non-linear robust kernel regression at the cost of a more complex numerical optimization. In particular, we investigate the use of MKL as a tool that allows us to avoid using ad hoc topographic indices as covariates in statistical models in complex terrain. Instead, MKL learns these relationships from the data in a non-parametric fashion. A study on data simulated from real terrain features confirms the ability of MKL to enhance the interpretability of data-driven models and to aid feature selection without degrading predictive performance. Here we examine the stability of the MKL algorithm with respect to the number of training data samples and to the presence of noise. The results of a real case study are also presented, in which MKL is able to exploit a large set of terrain features computed at multiple spatial scales when predicting mean wind speed in an Alpine region.
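scikit-learn has no MKL solver, so as a hedged sketch of the idea only: build one RBF kernel per feature group (e.g. terrain features at different spatial scales), combine them with weights, and feed the composite kernel to a kernel regressor. A real MKL method learns the weights jointly with the regressor; here they are fixed, and the data, grouping and target are illustrative assumptions:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))      # terrain features in two groups of three
y = X[:, 0] ** 2 + X[:, 3] + 0.1 * rng.normal(size=100)  # e.g. mean wind speed

groups = [slice(0, 3), slice(3, 6)]  # one dedicated kernel per feature group
weights = [0.5, 0.5]                 # MKL would learn these weights

def composite_kernel(A, B):
    """Weighted sum of per-group RBF kernels (the MKL combination)."""
    return sum(w * rbf_kernel(A[:, g], B[:, g]) for w, g in zip(weights, groups))

model = SVR(kernel="precomputed", C=10.0).fit(composite_kernel(X, X), y)
pred = model.predict(composite_kernel(X[:5], X))  # kernel vs. training set
print(pred.shape)
```

The learned per-kernel weights are what gives MKL its interpretability: a group whose weight shrinks toward zero contributes little, which is how it aids feature selection.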