46 results for Data-driven knowledge acquisition
Abstract:
In this paper, we develop a data-driven methodology to characterize the likelihood of orographic precipitation enhancement using sequences of weather radar images and a digital elevation model (DEM). Geographical locations whose topographic characteristics favor repeatable and persistent orographic precipitation, such as stationary cells, upslope rainfall enhancement, and repeated convective initiation, are detected by analyzing the spatial distribution of a set of precipitation cells extracted from radar imagery. Topographic features such as terrain convexity and gradients computed from the DEM at multiple spatial scales, as well as velocity fields estimated from sequences of weather radar images, are used as explanatory factors to describe the occurrence of localized precipitation enhancement. The latter is represented as a binary process by defining a threshold on the number of cell occurrences at particular locations. Both two-class and one-class support vector machine classifiers are tested to separate the presumed orographic cells from the nonorographic ones in the space of contributing topographic and flow features. Site-based validation is carried out to estimate realistic generalization skills of the obtained spatial prediction models. Due to the high class separability, the decision function of the classifiers can be interpreted as a likelihood or susceptibility of orographic precipitation enhancement. The developed approach can serve as a basis for refining radar-based quantitative precipitation estimates and short-term forecasts or for generating stochastic precipitation ensembles conditioned on the local topography.
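The binary representation described in this abstract (a threshold on the number of cell occurrences per location) can be illustrated with a minimal sketch. The grid coordinates, the function name `label_locations`, and the threshold value are hypothetical and are not taken from the paper; locations where no cell was ever detected are simply absent from the result.

```python
from collections import Counter


def label_locations(cell_locations, min_occurrences):
    """Return {location: 1 or 0}, where 1 means the location saw at least
    `min_occurrences` precipitation cells (presumed enhancement)."""
    counts = Counter(cell_locations)
    return {loc: int(n >= min_occurrences) for loc, n in counts.items()}


# Hypothetical (row, col) grid positions of cells extracted from radar images.
cells = [(2, 3), (2, 3), (2, 3), (5, 1), (5, 1), (7, 7)]
labels = label_locations(cells, min_occurrences=3)
```

The resulting 0/1 labels are the kind of target a two-class classifier could be trained on, with the topographic and flow features as inputs.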
Abstract:
Summary Recent technological advances have brought unprecedented growth in both the quality and the quantity of digital image archives. Despite the enormous possibilities they offer, these advances raise new questions about how to process the masses of data acquired. This question underlies this Thesis: problems of processing digital information at very high spatial and/or spectral resolution are addressed using statistical learning approaches, namely kernel methods. This Thesis studies image classification problems, that is, the categorization of pixels into a small number of classes reflecting the spectral and contextual properties of the objects they represent. The emphasis is on the efficiency of the algorithms and on their simplicity, so as to increase their potential for implementation by users. In addition, the challenge of this Thesis is to stay close to the concrete problems of satellite image users without losing sight of the interest of the proposed methods for the machine learning community from which they originate. In this sense, the work is deliberately transdisciplinary, maintaining a strong link between the two sciences in all the developments proposed. Four models are proposed: the first addresses the problem of high dimensionality and data redundancy with a model that optimizes classification performance by adapting to the particularities of the image. This is made possible by a system for ranking the variables (the bands) that is optimized together with the base model: as a result, only the variables important for solving the problem are used by the classifier.
The lack of labeled information and the uncertainty about its relevance to the problem motivate the next two models, based respectively on active learning and semi-supervised methods: the first improves the quality of a training set through direct interaction between the user and the machine, while the second uses unlabeled pixels to improve the description of the available data and the robustness of the model. Finally, the last model proposed considers the more theoretical question of the structure among the outputs: integrating this source of information, never before considered in remote sensing, opens new research challenges. Advanced kernel methods for remote sensing image classification. Devis Tuia, Institut de Géomatique et d'Analyse du Risque, September 2009. Abstract: The technical developments in recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to the users. However, even if these advances open more and more possibilities in the use of digital imagery, they also raise several problems of storage and treatment. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The emphasis is on algorithmic efficiency and on the simplicity of the proposed approaches, to avoid overly complex models that users would not adopt.
The major challenge of the Thesis is to remain close to concrete remote sensing problems, without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities, and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed: first, an adaptive model learning the relevant image features has been proposed to solve the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the single features. The scarcity and unreliability of labeled information were the common root of the second and third models proposed: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase robustness and quality of the description of data. Both solutions have been explored, resulting in two methodological contributions, based respectively on active learning and semisupervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating output similarity into a model, opens new challenges and opportunities for remote sensing image processing.
Abstract:
Little is known about Internet use among adolescents with chronic conditions (CCs). Our results indicate that CC females, but not males, are more likely to be heavy Internet users than their peers. CC youths are also more likely to visit health-related web sites, but less frequently than other sites.
Abstract:
This paper presents multiple kernel learning (MKL) regression as an exploratory spatial data analysis and modelling tool. The MKL approach is introduced as an extension of support vector regression, where MKL uses dedicated kernels to divide a given task into sub-problems and to treat them separately in an effective way. It provides better interpretability to non-linear robust kernel regression at the cost of a more complex numerical optimization. In particular, we investigate the use of MKL as a tool that allows us to avoid using ad-hoc topographic indices as covariates in statistical models in complex terrains. Instead, MKL learns these relationships from the data in a non-parametric fashion. A study on data simulated from real terrain features confirms the ability of MKL to enhance the interpretability of data-driven models and to aid feature selection without degrading predictive performance. We also examine the stability of the MKL algorithm with respect to the number of training data samples and to the presence of noise. The results of a real case study are also presented, where MKL is able to exploit a large set of terrain features computed at multiple spatial scales when predicting mean wind speed in an Alpine region.
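The idea of dedicated kernels can be made concrete with a small sketch of a combined kernel: a weighted sum of base kernels, each acting on its own group of features. This is only an illustration under stated assumptions: the RBF base kernel, the feature grouping, the weights, and the names `rbf` and `combined_kernel` are hypothetical, and the weights are fixed by hand here, whereas an MKL solver would learn them from data.

```python
import math


def rbf(u, v, gamma):
    """Gaussian (RBF) kernel between two equally long feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def combined_kernel(x, y, groups, weights, gamma=1.0):
    """Weighted sum of RBF kernels, one per feature group.

    `groups` lists the feature indices each base kernel sees;
    `weights` are non-negative kernel weights (assumed given here)."""
    total = 0.0
    for idx, w in zip(groups, weights):
        total += w * rbf([x[i] for i in idx], [y[i] for i in idx], gamma)
    return total


# Hypothetical layout: features 0-1 are coordinates, feature 2 is a terrain index.
groups = [[0, 1], [2]]
weights = [1.0, 0.0]  # a zero weight switches the terrain kernel off
k_a = combined_kernel([0.0, 0.0, 5.0], [0.1, 0.0, 5.0], groups, weights)
k_b = combined_kernel([0.0, 0.0, 5.0], [0.1, 0.0, 9.0], groups, weights)
```

Because the second weight is zero, `k_a` and `k_b` are identical even though the terrain feature differs, which is the sense in which the kernel weights support interpretation and feature selection.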
Abstract:
Objectives The purpose of this study is to assess short and long term changes in knowledge, attitudes, and skills among medical residents following a short course on cultural competency and to explore their perspectives on the experience. Methods Eighteen medical residents went through a short training programme comprising two seminars lasting 30 and 60 minutes, respectively, over two days. Three months later, we conducted three focus groups with 17 residents to explore their thoughts, perspectives and feedback about the course. To measure changes over time, we carried out a quantitative sequential survey before the seminars, three days after, and three months later using the Multicultural Assessment Questionnaire (MAQ). Results Residents expressed a wide variety of perspectives on the main themes related to the content of the training (culture, trialogue, stereotypes, status, epidemiology, history and geopolitics) and to its organization (relevance, volume, timing, target audience, training tools, and working material). Using the MAQ, we observed a higher global performance score (n=16) at three days (median=38) compared to results before the training (median=33), revealing a median difference of 5.5 points (z=2.4, p=0.015). This difference was still present at three months (∆=4.5, z=2.4, p=0.018), mainly due to knowledge acquisition (∆=3) rather than attitudes (∆=0) or skills (∆=1). Conclusions Cross-cultural competence training not only brings awareness of multicultural issues but also helps participants understand their own cultures, perception of others and preconceived ideas. Physicians' education should, however, also focus on improving implementation of acquired knowledge in cross-cultural competence.
Abstract:
It is estimated that around 230 people die each year due to radon (222Rn) exposure in Switzerland. 222Rn occurs mainly in closed environments like buildings and originates primarily from the subjacent ground. Therefore, it depends strongly on geology and shows substantial regional variations. Correct identification of these regional variations would lead to substantial reduction of 222Rn exposure of the population based on appropriate construction of new and mitigation of already existing buildings. Prediction of indoor 222Rn concentrations (IRC) and identification of 222Rn prone areas is however difficult since IRC depend on a variety of different variables like building characteristics, meteorology, geology and anthropogenic factors. The present work aims at the development of predictive models and the understanding of IRC in Switzerland, taking into account a maximum of information in order to minimize the prediction uncertainty. The predictive maps will be used as a decision-support tool for 222Rn risk management. The construction of these models is based on different data-driven statistical methods, in combination with geographical information systems (GIS). In a first phase we performed univariate analysis of IRC for different variables, namely the detector type, building category, foundation, year of construction, the average outdoor temperature during measurement, altitude and lithology. All variables showed significant associations with IRC. Buildings constructed after 1900 showed significantly lower IRC compared to earlier constructions. We observed a further drop of IRC after 1970. In addition, we found an association of IRC with altitude. With regard to lithology, we observed the lowest IRC in sedimentary rocks (excluding carbonates) and sediments and the highest IRC in the Jura carbonates and igneous rock. The IRC data were systematically analyzed for potential bias due to spatially unbalanced sampling of measurements.
In order to facilitate the modeling and the interpretation of the influence of geology on IRC, we developed an algorithm based on k-medoids clustering which makes it possible to define coherent geological classes in terms of IRC. We performed a soil gas 222Rn concentration (SRC) measurement campaign in order to determine the predictive power of SRC with respect to IRC. We found that the use of SRC is limited for IRC prediction. The second part of the project was dedicated to predictive mapping of IRC using models which take into account the multidimensionality of the process of 222Rn entry into buildings. We used kernel regression and ensemble regression trees for this purpose. We could explain up to 33% of the variance of the log transformed IRC all over Switzerland. This is a good performance compared to former attempts of IRC modeling in Switzerland. As predictor variables we considered geographical coordinates, altitude, outdoor temperature, building type, foundation, year of construction and detector type. Ensemble regression trees like random forests make it possible to determine the role of each IRC predictor in a multidimensional setting. We found spatial information like geology, altitude and coordinates to have stronger influences on IRC than building related variables like foundation type, building type and year of construction. Based on kernel estimation we developed an approach to determine the local probability of IRC exceeding 300 Bq/m³. In addition, we developed a confidence index in order to provide an estimate of uncertainty of the map. All methods allow an easy creation of tailor-made maps for different building characteristics. Our work is an essential step towards a 222Rn risk assessment which accounts at the same time for different architectural situations as well as geological and geographical conditions. For the communication of 222Rn hazard to the population we recommend using the probability map based on kernel estimation.
The communication of 222Rn hazard could for example be implemented via a web interface where the users specify the characteristics and coordinates of their home in order to obtain the probability of exceeding a given IRC, with a corresponding index of confidence. Taking into account the health effects of 222Rn, our results have the potential to substantially improve the estimation of the effective dose from 222Rn delivered to the Swiss population.
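One common way to obtain a local exceedance probability from kernel estimation is a kernel-weighted average of an indicator variable (a Nadaraya-Watson estimator). The sketch below assumes that formulation, a Gaussian kernel on planar coordinates, and a hand-picked bandwidth; the abstract does not specify the estimator actually used, and the function name and sample data are hypothetical.

```python
import math


def exceedance_probability(query, points, values, threshold=300.0, bandwidth=1.0):
    """Kernel-weighted share of nearby measurements above `threshold`.

    `points` are (x, y) measurement coordinates, `values` the measured
    concentrations in Bq/m3; a Gaussian kernel weights each measurement
    by its distance to `query`."""
    num = 0.0
    den = 0.0
    for (px, py), v in zip(points, values):
        d2 = (px - query[0]) ** 2 + (py - query[1]) ** 2
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))
        num += w * (1.0 if v > threshold else 0.0)
        den += w
    # If every weight underflows to zero, no estimate is available.
    return num / den if den > 0.0 else float("nan")


# Hypothetical measurements: two high values near the origin, two low ones far away.
points = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.5, 10.0)]
values = [450.0, 380.0, 40.0, 90.0]
p_near = exceedance_probability((0.0, 0.0), points, values)
p_far = exceedance_probability((10.0, 10.0), points, values)
```

The denominator `den` is a local measure of data density and could serve as a crude ingredient of a confidence index, although the index developed in the thesis is not described in enough detail here to reproduce it.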
Abstract:
The capacity to interact socially and share information underlies the success of many animal species, humans included. Researchers of many fields have emphasized the evolutionary significance of how patterns of connections between individuals, or the social networks, and learning abilities affect the information obtained by animal societies. To date, studies have focused on the dynamics either of social networks, or of the spread of information. The present work aims to study them together. We make use of mathematical and computational models to study the dynamics of networks, where social learning and information sharing affect the structure of the population the individuals belong to. The number and strength of the relationships between individuals, in turn, impact the accessibility and the diffusion of the shared information. Moreover, we investigate how different strategies in the evaluation and choice of interacting partners impact the processes of knowledge acquisition and social structure rearrangement. First, we look at how different evaluations of social interactions affect the availability of the information and the network topology. We compare a first case, where individuals evaluate social exchanges by the amount of information that can be shared by the partner, with a second case, where they evaluate interactions by considering their partners' social status. We show that, even if both strategies take into account the knowledge endowments of the partners, they have very different effects on the system. In particular, we find that the first case generally enables individuals to accumulate higher amounts of information, thanks to the more efficient patterns of social connections they are able to build. Then, we study the effects that homophily, or the tendency to interact with similar partners, has on knowledge accumulation and social structure.
We compare the case where individuals who know the same information are more likely to learn socially from each other, to the opposite case, where individuals who know different information are instead more likely to learn socially from each other. We find that it is not trivial to determine which strategy is better than the other. Depending on the possibility of forgetting information, the way new social partners can be chosen, and the population size, we delineate the conditions for which each strategy allows accumulating more information, or in a faster way. For these conditions, we also discuss the topological characteristics of the resulting social structure, relating them to the information dynamics outcome. In conclusion, this work paves the way for modeling the joint dynamics of the spread of information among individuals and their social interactions. It also provides a formal framework to study jointly the effects of different strategies in the choice of partners on social structure, and how they favor the accumulation of knowledge in the population. - The capacity to interact socially and share information underlies the success of many animal species, including humans. Researchers in many fields have emphasized the evolutionary importance of how patterns of connections between individuals, or social networks, and learning abilities affect the information obtained by animal societies. To date, studies have focused on the dynamics either of social networks or of the spread of information. The present work aims to study them together. We use mathematical and computational models to study the dynamics of networks, where social learning and information sharing affect the structure of the population to which the individuals belong.
The number and strength of the relationships between individuals in turn affect the accessibility and diffusion of the shared information. Moreover, we study how different strategies for evaluating and choosing interaction partners affect the processes of knowledge acquisition and the rearrangement of the social structure. First, we examine how different evaluations of social interactions influence the availability of information and the topology of the network. We compare a first case, where individuals evaluate social exchanges by the amount of information that the partner can share, with a second case, where they evaluate interactions by taking into account their partners' social status. We show that, even though both strategies take into account the partners' amount of knowledge, they have very different effects on the system. In particular, we find that the first case generally allows individuals to accumulate larger amounts of information, thanks to the more efficient patterns of social connections they are able to build. Next, we study the effects that homophily, or the tendency to interact with similar partners, has on knowledge accumulation and social structure. We compare the case where individuals who know the same information are more likely to learn socially from each other with the case where individuals who know different information are instead more likely to learn socially from each other. We find that it is not trivial to determine which strategy is better than the other.
Depending on the possibility of forgetting information, the way new social partners can be chosen, and the population size, we determine the conditions under which each strategy allows more information to be accumulated, or accumulated faster. For these conditions, we also discuss the topological characteristics of the resulting social structure, relating them to the outcome of the information dynamics. In conclusion, this work paves the way for modeling the joint dynamics of the spread of information among individuals and their social interactions. It also provides a formal framework for jointly studying the effects of different partner-choice strategies on social structure and how they favor the accumulation of knowledge in the population.
Abstract:
The detection of multi-resistant bacterial pathogens, particularly those producing carbapenemases, in leukemic and stem cell transplant patients forces the use of old or non-conventional agents as the only remaining treatment options. These include colistin/polymyxin B, tigecycline, fosfomycin and various anti-gram-positive agents. Data on the use of these agents in leukemic patients are scanty, with only linezolid subjected to formal trials. The Expert Group of the 4th European Conference on Infections in Leukemia has developed guidelines for their use in these patient populations. Targeted therapy should be based on (i) in vitro susceptibility data, (ii) knowledge of the best treatment option against the particular species or phenotype of bacteria, (iii) pharmacokinetic/pharmacodynamic data, and (iv) careful assessment of the risk-benefit balance. For infections due to resistant Gram-negative bacteria, these agents should preferably be used in combination with other agents that remain active in vitro, because of suboptimal efficacy (e.g., tigecycline) and the risk of emergent resistance (e.g., fosfomycin). The paucity of new antibacterial drugs in the near future should lead us to limit the use of these drugs to situations where no alternative exists.
Abstract:
The hydrological and biogeochemical processes that operate in catchments influence the ecological quality of freshwater systems through delivery of fine sediment, nutrients and organic matter. Most models that seek to characterise the delivery of diffuse pollutants from land to water are reductionist. The multitude of processes that are parameterised in such models to ensure generic applicability make them complex and difficult to test on available data. Here, we outline an alternative, data-driven inverse approach. We apply SCIMAP, a parsimonious risk based model that has an explicit treatment of hydrological connectivity. We take a Bayesian approach to the inverse problem of determining the risk that must be assigned to different land uses in a catchment in order to explain the spatial patterns of measured in-stream nutrient concentrations. We apply the model to identify the key sources of nitrogen (N) and phosphorus (P) diffuse pollution risk in eleven UK catchments covering a range of landscapes. The model results show that: 1) some land use generates a consistently high or low risk of diffuse nutrient pollution; but 2) the risks associated with different land uses vary both between catchments and between nutrients; and 3) the dominant sources of P and N risk in the catchment are often a function of the spatial configuration of land uses. Taken on a case-by-case basis, this type of inverse approach may be used to help prioritise the focus of interventions to reduce diffuse pollution risk for freshwater ecosystems. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Interactions between stimuli's acoustic features and experience-based internal models of the environment enable listeners to compensate for the disruptions in auditory streams that are regularly encountered in noisy environments. However, whether auditory gaps are filled in predictively or restored a posteriori remains unclear. The current lack of positive statistical evidence that internal models can actually shape brain activity as would real sounds precludes accepting predictive accounts of the filling-in phenomenon. We investigated the neurophysiological effects of internal models by testing whether single-trial electrophysiological responses to omitted sounds in a rule-based sequence of tones with varying pitch could be decoded from the responses to real sounds and by analyzing the ERPs to the omissions with data-driven electrical neuroimaging methods. The decoding of the brain responses to different expected, but omitted, tones in both passive and active listening conditions was above chance based on the responses to the real sound in active listening conditions. Topographic ERP analyses and electrical source estimations revealed that, in the absence of any stimulation, experience-based internal models elicit an electrophysiological activity different from noise and that the temporal dynamics of this activity depend on attention. We further found that the expected change in pitch direction of omitted tones modulated the activity of left posterior temporal areas 140-200 ms after the onset of omissions. Collectively, our results indicate that, even in the absence of any stimulation, internal models modulate brain activity as do real sounds, indicating that auditory filling in can be accounted for by predictive activity.
Abstract:
The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB's Bioinformatics resource portal ExPASy features over 150 resources, including UniProtKB/Swiss-Prot, ENZYME, PROSITE, neXtProt, STRING, UniCarbKB, SugarBindDB, SwissRegulon, EPD, arrayMap, Bgee, SWISS-MODEL Repository, OMA, OrthoDB and other databases, which are briefly described in this article.
Abstract:
Imaging mass spectrometry (IMS) represents an innovative tool in the cancer research pipeline, which is increasingly being used in clinical and pharmaceutical applications. The unique properties of the technique, especially the amount of data generated, make the handling of data from multiple IMS acquisitions challenging. This work presents a histology-driven IMS approach aiming to identify discriminant lipid signatures from the simultaneous mining of IMS data sets from multiple samples. The feasibility of the developed workflow is evaluated on a set of three human colorectal cancer liver metastasis (CRCLM) tissue sections. Lipid IMS on tissue sections was performed using MALDI-TOF/TOF MS in both negative and positive ionization modes after 1,5-diaminonaphthalene matrix deposition by sublimation. The positive and negative acquisition results were combined during data mining to simplify the process and interrogate a larger lipidome in a single analysis. To reduce the complexity of the IMS data sets, a sub data set was generated by randomly selecting a fixed number of spectra from a histologically defined region of interest, resulting in a 10-fold data reduction. Principal component analysis confirmed that the molecular selectivity of the regions of interest is maintained after data reduction. Partial least-squares and heat map analyses demonstrated a selective signature of the CRCLM, revealing lipids that are significantly up- and down-regulated in the tumor region. This comprehensive approach is thus of interest for defining disease signatures directly from IMS data sets by the use of combinatory data mining, opening novel routes of investigation for addressing the demands of the clinical setting.
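The data-reduction step described in this abstract (randomly selecting a fixed number of spectra from a region of interest) can be sketched as follows. The function name `subsample_roi`, the seed, and the toy spectra are hypothetical; the actual software and sample sizes used in the study are not stated in the abstract.

```python
import random


def subsample_roi(spectra, n_keep, seed=0):
    """Randomly keep `n_keep` spectra (without replacement) from one region
    of interest; returns all spectra if the region has fewer than that."""
    if n_keep >= len(spectra):
        return list(spectra)
    rng = random.Random(seed)  # fixed seed so the reduction is reproducible
    return rng.sample(spectra, n_keep)


# Toy region of interest: 1000 two-channel "spectra", reduced 10-fold.
roi = [[float(i), 2.0 * i] for i in range(1000)]
reduced = subsample_roi(roi, n_keep=100)
```

Sampling the same fixed number from every region keeps the classes balanced, which matters when the reduced sets from several samples are pooled for multivariate analysis.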
Abstract:
OBJECT: To study a scan protocol for coronary magnetic resonance angiography based on multiple breath-holds featuring 1D motion compensation and to compare the resulting image quality to a navigator-gated free-breathing acquisition. Image reconstruction was performed using L1 regularized iterative SENSE. MATERIALS AND METHODS: The effects of respiratory motion on the Cartesian sampling scheme were minimized by performing data acquisition in multiple breath-holds. During the scan, repetitive readouts through the k-space center were used to detect and correct the respiratory displacement of the heart by exploiting the self-navigation principle in image reconstruction. In vivo experiments were performed in nine healthy volunteers and the resulting image quality was compared to a navigator-gated reference in terms of vessel length and sharpness. RESULTS: Acquisition in breath-hold is an effective method to reduce the scan time by more than 30 % compared to the navigator-gated reference. Although an equivalent mean image quality with respect to the reference was achieved with the proposed method, the 1D motion compensation did not work equally well in all cases. CONCLUSION: In general, the image quality scaled with the robustness of the motion compensation. Nevertheless, the featured setup provides a promising basis for future extension with more advanced motion compensation methods.