36 results for ree software environment for statistical computing and graphics R
Abstract:
Therapeutic drug monitoring (TDM) aims to optimize treatments by individualizing dosage regimens based on the measurement of blood concentrations. Dosage individualization to maintain concentrations within a target range requires pharmacokinetic and clinical capabilities. Bayesian calculations currently represent the gold standard TDM approach but require computation assistance. In recent decades computer programs have been developed to assist clinicians in this assignment. The aim of this survey was to assess and compare computer tools designed to support TDM clinical activities. The literature and the Internet were searched to identify software. All programs were tested on personal computers. Each program was scored against a standardized grid covering pharmacokinetic relevance, user friendliness, computing aspects, interfacing and storage. A weighting factor was applied to each criterion of the grid to account for its relative importance. To assess the robustness of the software, six representative clinical vignettes were processed through each of them. Altogether, 12 software tools were identified, tested and ranked, representing a comprehensive review of the available software. Numbers of drugs handled by the software vary widely (from two to 180), and eight programs offer users the possibility of adding new drug models based on population pharmacokinetic analyses. Bayesian computation to predict dosage adaptation from blood concentration (a posteriori adjustment) is performed by ten tools, while nine are also able to propose a priori dosage regimens, based only on individual patient covariates such as age, sex and bodyweight. Among those applying Bayesian calculation, MM-USC*PACK© uses the non-parametric approach. The top two programs emerging from this benchmark were MwPharm© and TCIWorks. Most other programs evaluated had good potential while being less sophisticated or less user friendly. 
Programs vary in complexity and might not fit all healthcare settings. Each software tool must therefore be regarded with respect to the individual needs of hospitals or clinicians. Programs should be easy and fast for routine activities, including for non-experienced users. Computer-assisted TDM is gaining growing interest and should further improve, especially in terms of information system interfacing, user friendliness, data storage capability and report generation.
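The weighted scoring grid described above can be illustrated with a minimal sketch. The criterion names follow the abstract; the weights, the 0-to-1 score scale, and the two program entries are hypothetical values chosen only for illustration and are not taken from the survey.

```python
from typing import Dict, List, Tuple

# Hypothetical weights: the survey applied a weighting factor per criterion,
# but the actual factors are not given in the abstract.
WEIGHTS: Dict[str, float] = {
    "pharmacokinetic relevance": 3.0,
    "user friendliness": 2.0,
    "computing aspects": 1.0,
    "interfacing": 1.0,
    "storage": 1.0,
}

# Hypothetical scores on a 0..1 scale for two made-up programs.
GRID: Dict[str, Dict[str, float]] = {
    "Program A": {
        "pharmacokinetic relevance": 1.0,
        "user friendliness": 0.8,
        "computing aspects": 0.4,
        "interfacing": 0.4,
        "storage": 0.4,
    },
    "Program B": {
        "pharmacokinetic relevance": 0.5,
        "user friendliness": 0.6,
        "computing aspects": 0.9,
        "interfacing": 0.9,
        "storage": 0.9,
    },
}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted mean of the criterion scores."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[c] * weights[c] for c in scores) / total_weight


def rank_programs(
    grid: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Rank programs from highest to lowest weighted score."""
    totals = {name: weighted_score(s, weights) for name, s in grid.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_programs(GRID, WEIGHTS):
        print(f"{name}: {score:.3f}")
```

With these made-up numbers the weighting changes the order: Program B has the higher unweighted mean, but Program A ranks first once pharmacokinetic relevance is weighted more heavily.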
Abstract:
SUMMARY: Species distribution models (SDMs) are now an essential tool in the research fields of ecology and conservation biology. By combining observations of species occurrence or abundance with information on the environmental characteristics of the observation sites, they can provide information on the ecology of species, predict their distributions across the landscape or extrapolate them to other spatial or time frames. The advent of SDMs, supported by geographic information systems (GIS), new developments in statistical models and constantly increasing computational capacities, has revolutionized the way ecologists can comprehend species distributions in their environment. SDMs make it possible to describe the realized niches of species across a multivariate environmental space and to predict their spatial distribution. Predictions, in the form of probabilistic maps showing the potential distribution of the species, are an irreplaceable means of informing every single unit of a territory about its biodiversity potential. SDMs and the corresponding spatial predictions can be used to plan conservation actions for particular species, to design field surveys, to assess the risks related to the spread of invasive species, to select reserve locations and design reserve networks, and ultimately, to forecast distributional changes according to scenarios of climate and/or land use change. By assessing the effect of several factors on model performance and on the accuracy of spatial predictions, this thesis aims at improving techniques and data available for distribution modelling and at providing the best possible information to conservation managers to support their decisions and action plans for the conservation of biodiversity in Switzerland and beyond.
Several monitoring programs have been put in place from the national to the global scale, and different sources of data now exist and are becoming available to researchers who want to model species distribution. However, because of the lack of means, data are often not gathered at an appropriate resolution, are sampled only over limited areas, are not spatially explicit or do not provide sound biological information. A typical example of this is data on 'habitat' (sensu biota). Even though this is essential information for effective conservation planning, it often has to be approximated from land use, the closest available information. Moreover, data are often not sampled according to an established sampling design, which can lead to biased samples and consequently to spurious modelling results. Understanding the sources of variability linked to the different phases of the modelling process and their importance is crucial in order to evaluate the final distribution maps that are to be used for conservation purposes. The research presented in this thesis was essentially conducted within the framework of the Landspot Project, a project supported by the Swiss National Science Foundation. The main goal of the project was to assess the possible contribution of pre-modelled 'habitat' units to model the distribution of animal species, in particular butterfly species, across Switzerland. While pursuing this goal, different aspects of data quality, sampling design and modelling process were addressed and improved, and implications for conservation discussed. The main 'habitat' units considered in this thesis are grassland and forest communities of natural and anthropogenic origin as defined in the typology of habitats for Switzerland. These communities are mainly defined at the phytosociological level of the alliance. For the time being, no comprehensive map of such communities is available at the national scale and at fine resolution.
As a first step, it was therefore necessary to create distribution models and maps for these communities across Switzerland and thus to collect the necessary data. In order to reach this first objective, several new developments were necessary, such as the definition of expert models, the classification of the Swiss territory in environmental domains, the design of an environmentally stratified sampling of the target vegetation units across Switzerland, the development of a database integrating a decision-support system assisting in the classification of the relevés, and the downscaling of the land use/cover data from 100 m to 25 m resolution. The main contributions of this thesis to the discipline of species distribution modelling (SDM) are assembled in four main scientific papers. In the first, published in Journal of Biogeography, different issues related to the modelling process itself are investigated. First, the effect of five different stepwise selection methods on model performance, stability and parsimony is assessed, using data from the forest inventory of the State of Vaud. The same paper also assesses: the effect of weighting absences to ensure a prevalence of 0.5 prior to model calibration; the effect of limiting absences beyond the environmental envelope defined by presences; four different methods for incorporating spatial autocorrelation; and finally, the effect of integrating predictor interactions. The results made it possible to enhance the GRASP tool (Generalized Regression Analysis and Spatial Predictions), which now incorporates new selection methods and the possibility of dealing with interactions among predictors as well as spatial autocorrelation. The contribution of different sources of remotely sensed information to species distribution models was also assessed.
The second paper (to be submitted) explores the combined effects of sample size and data post-stratification on the accuracy of models, using data on grassland distribution across Switzerland collected within the framework of the Landspot project and supplemented with other important vegetation databases. For the stratification of the data, different spatial frameworks were compared. In particular, environmental stratification by Swiss Environmental Domains was compared to geographical stratification either by biogeographic regions or political states (cantons). The third paper (to be submitted) assesses the contribution of pre-modelled vegetation communities to the modelling of fauna. It is a two-step approach that combines the disciplines of community ecology and spatial ecology and integrates their corresponding concepts of habitat. Vegetation communities are modelled first, and these 'habitat' units are then used to model animal species habitat. A case study is presented with grassland communities and butterfly species. Different ways of integrating vegetation information in the models of butterfly distribution were also evaluated. Finally, the fourth paper, recently published in Ecological Modelling, gives a glimpse of climate change. This paper proposes a conceptual framework for analysing range shifts, namely a catalogue of the possible patterns of change in the distribution of a species along elevational or other environmental gradients and an improved quantitative methodology to identify and objectively describe these patterns. The methodology was developed using data from the Swiss national common breeding bird survey, and the article presents results concerning the observed shifts in the elevational distribution of breeding birds in Switzerland. The overall objective of this thesis is to improve species distribution models as potential inputs for different conservation tools (e.g.
red lists, ecological networks, risk assessment of the spread of invasive species, vulnerability assessment in the context of climate change). While no conservation issues or tools are directly tested in this thesis, the importance of the proposed improvements made in species distribution modelling is discussed in the context of the selection of reserve networks.
RÉSUMÉ: Species distribution models (SDMs) are today an essential tool in the research fields of ecology and conservation biology. By combining observations of species presence or abundance with information on the environmental characteristics of the observation sites, these models can provide information on the ecology of species, predict their distribution across the landscape or extrapolate it in space and time. The deployment of SDMs, supported by geographic information systems (GIS), new developments in statistical models and the constant increase in computing capacity, has revolutionized the way ecologists can understand the distribution of species in their environment. SDMs have provided the tool that makes it possible to describe the realized niche of species in a multivariate environmental space and to predict their spatial distribution. Predictions, in the form of probabilistic maps showing the potential distribution of the species, are an irreplaceable means of informing each unit of the territory about its potential biodiversity.
SDMs and the corresponding spatial predictions can be used to plan conservation measures for particular species, to design sampling plans, to assess the risks related to the spread of invasive species, to choose the location of reserves and connect them in networks, and finally, to forecast changes in distribution according to scenarios of climate and/or land use change. By assessing the effect of several factors on model performance and on the accuracy of spatial predictions, this thesis aims to improve the techniques and data available for species distribution modelling and to provide the best possible information to managers to support their decisions and action plans for the conservation of biodiversity in Switzerland and beyond. Several monitoring programs have been put in place from the national to the global scale, and different sources of data are now available to researchers who want to model species distribution. However, because of the lack of means, data are often collected at an inappropriate resolution, are sampled over limited areas, are not spatially explicit or do not provide sufficient ecological information. A typical example is data on 'habitat' (sensu biota). Even though this is essential information for effective conservation measures, it is often approximated by land use, the closest available information. In addition, data are often not sampled according to an established sampling design, which biases the samples and consequently the modelling results.
Understanding the sources of variability linked to the different phases of the modelling process is crucial in order to evaluate the use of predicted distribution maps for conservation purposes. The research presented in this thesis was essentially conducted within the framework of the Landspot project, a project supported by the Swiss National Science Foundation. The main objective of this project was to assess the contribution of pre-modelled 'habitat' units to modelling the distribution of animal species, in particular butterflies, across Switzerland. While pursuing this objective, different aspects of data quality, sampling design and the modelling process are addressed and improved, and their implications for species conservation discussed. The main 'habitats' considered in this thesis are grassland and forest communities of natural and anthropogenic origin as defined in the typology of habitats of Switzerland. These communities are mainly defined at the phytosociological level of the alliance. For the time being, no map of the distribution of these communities is available at the national scale and at fine resolution. As a first step, it was therefore necessary to create distribution models of these communities across Switzerland and consequently to collect the necessary data. In order to reach this first objective, several new developments were necessary, such as the definition of expert models, the classification of the Swiss territory into environmental domains, the design of an environmentally stratified sampling of the target vegetation units throughout Switzerland, the creation of a database integrating a decision-support system for the classification of the relevés, and the downscaling of land cover data from 100 m to 25 m resolution.
The main contributions of this thesis to the discipline of species distribution modelling (SDM) are assembled in four scientific papers. In the first paper, published in the Journal of Biogeography, different issues related to the modelling process are studied using data from the forest inventory of the State of Vaud. First, the effects of five stepwise selection methods on model performance, stability and parsimony are assessed. The same paper also assesses: the effect of weighting absences to ensure a prevalence of 0.5 during model calibration; the effect of limiting absences beyond the envelope defined by presences; four different methods for incorporating spatial autocorrelation; and finally, the effect of integrating interactions between factors. The results presented in this paper made it possible to improve the GRASP tool, which now incorporates new selection methods and the possibility of dealing with interactions between explanatory variables as well as spatial autocorrelation. The contribution of different sources of remotely sensed data was also assessed. The second paper (to be submitted) explores the combined effects of sample size and post-stratification on the accuracy of models. The data used here are those on the distribution of grasslands in Switzerland collected within the framework of the Landspot project and supplemented with other sources. For the stratification of the data, different spatial frameworks were compared. In particular, environmental stratification by the Swiss environmental domains was compared to geographical stratification by biogeographic regions or by cantons. The third paper (to be submitted) assesses the contribution of pre-modelled plant communities to the modelling of fauna.
It is a two-step approach that combines the disciplines of community ecology and spatial ecology by integrating their respective concepts of 'habitat'. Plant communities are modelled first, and these 'habitat' units are then used to model animal species. A case study is presented with grassland communities and butterfly species. Different ways of integrating vegetation information in the models of butterfly distribution are evaluated. Finally, the last paper, published in Ecological Modelling, gives a glimpse of climate change. This paper proposes a conceptual framework for analysing changes in species distribution, which includes a catalogue of the different possible patterns of change along an elevational or other environmental gradient, and an improved quantitative method to identify and describe these shifts. This methodology was developed using data from the monitoring of common breeding birds, and the article presents results concerning the observed shifts in the elevational distribution of breeding birds in Switzerland. The overall objective of this thesis is to improve species distribution models as a possible source of information for different conservation tools (for example, red lists, ecological networks, risk assessment of the spread of invasive species, vulnerability assessment of species in the context of climate change). Although these conservation issues are not directly tested in this thesis, the importance of the proposed improvements for species distribution modelling is discussed at the end of this work in the context of the selection of reserve networks.
Abstract:
Metabolic problems lead to numerous failures during clinical trials, and much effort is now devoted to developing in silico models predicting metabolic stability and metabolites. Such models are well known for cytochromes P450 and some transferases, whereas less has been done to predict the activity of human hydrolases. The present study was undertaken to develop a computational approach able to predict the hydrolysis of novel esters by human carboxylesterase hCES2. The study first involved homology modeling of the hCES2 protein based on the model of hCES1, since the two proteins share a high degree of homology (approximately 73%). A set of 40 known substrates of hCES2 was taken from the literature; the ligands were docked in both their neutral and ionized forms using GriDock, a parallel tool based on the AutoDock4.0 engine which can perform efficient and easy virtual screening analyses of large molecular databases exploiting multi-core architectures. Useful statistical models (e.g., r² = 0.91 for substrates in their unprotonated state) were calculated by correlating experimental pKm values with the distance between the carbon atom of the substrate's ester group and the hydroxy function of Ser228. Additional parameters in the equations accounted for hydrophobic and electrostatic interactions between substrates and contributing residues. The negatively charged residues in the hCES2 cavity explained the preference of the enzyme for neutral substrates and, more generally, suggested that ligands which interact too strongly by ionic bonds (e.g., ACE inhibitors) cannot be good CES2 substrates because they are trapped in the cavity in unproductive modes and behave as inhibitors. The effects of protonation on substrate recognition and the contrasting behavior of substrates and products were finally investigated by MD simulations of some CES2 complexes.
Abstract:
Familial searching consists of searching for a full profile left at a crime scene in a National DNA Database (NDNAD). In this paper we are interested in the circumstance where no full match is returned, but a partial match is found between a database member's profile and the crime stain. Because close relatives share more of their DNA than unrelated persons, this partial match may indicate that the crime stain was left by a close relative of the person with whom the partial match was found. This approach has successfully solved important crimes in the UK and the USA. In a previous paper, a model that takes into account substructure and siblings was used to simulate an NDNAD. In this paper, we have used this model to test the usefulness of familial searching and offer guidelines for pre-assessment of cases based on the likelihood ratio. Siblings of "persons" present in the simulated Swiss NDNAD were created. These profiles (N=10,000) were used as traces and were then compared to the whole database (N=100,000). The statistical results obtained show that the technique has great potential, confirming the findings of previous studies. However, the effectiveness of the technique is only one part of the story. Familial searching has juridical and ethical aspects that should not be ignored. In Switzerland, for example, there are no specific guidelines on the legality or otherwise of familial searching. This article both presents statistical results and addresses criminological and civil liberties aspects, in order to take into account the risks and benefits of familial searching.
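To make the likelihood-ratio idea concrete, the following is a minimal sketch of a textbook sibling-versus-unrelated likelihood ratio for one locus, multiplied across independent loci. It is not the model used in the paper: it assumes Hardy-Weinberg proportions and ignores population substructure, which the paper's model does take into account. The function names and the allele frequencies in the usage example are hypothetical.

```python
from math import prod
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]

# Probabilities that full siblings share 0, 1 or 2 alleles identical by descent.
K0, K1, K2 = 0.25, 0.5, 0.25


def genotype_prob(g: Genotype, freqs: Dict[str, float]) -> float:
    """Population probability of genotype g under Hardy-Weinberg proportions."""
    a, b = g
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def prob_given_one_ibd(trace: Genotype, member: Genotype,
                       freqs: Dict[str, float]) -> float:
    """P(trace | member) when exactly one allele is shared by descent."""
    x, y = trace
    total = 0.0
    for shared in member:  # each allele of the member is passed on with prob 1/2
        if x == y:
            if shared == x:
                total += 0.5 * freqs[x]
        else:
            if shared == x:
                total += 0.5 * freqs[y]
            if shared == y:
                total += 0.5 * freqs[x]
    return total


def sibling_lr_locus(trace: Genotype, member: Genotype,
                     freqs: Dict[str, float]) -> float:
    """LR at one locus: trace donor is a full sibling of member vs unrelated."""
    p_trace = genotype_prob(trace, freqs)
    identical = 1.0 if sorted(trace) == sorted(member) else 0.0
    return (K0
            + K1 * prob_given_one_ibd(trace, member, freqs) / p_trace
            + K2 * identical / p_trace)


def sibling_lr(traces: List[Genotype], members: List[Genotype],
               freqs_per_locus: List[Dict[str, float]]) -> float:
    """Combined LR over independent loci (product of per-locus LRs)."""
    return prod(sibling_lr_locus(t, m, f)
                for t, m, f in zip(traces, members, freqs_per_locus))


if __name__ == "__main__":
    freqs = {"a": 0.1, "b": 0.2, "c": 0.7}  # hypothetical allele frequencies
    print(sibling_lr([("a", "a"), ("a", "b")],
                     [("a", "a"), ("c", "c")],
                     [freqs, freqs]))
```

A locus where the two profiles share no allele still contributes a factor of 0.25 rather than zero, which is why a partial match can be compatible with a sibling relationship.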
Abstract:
Abstract: This thesis proposes a set of adaptive broadcast solutions and an adaptive data replication solution to support the deployment of P2P applications. P2P applications are an emerging type of distributed application that runs on top of P2P networks. Typical P2P applications are video streaming, file sharing, etc. While interesting because they are fully distributed, P2P applications suffer from several deployment problems, due to the nature of the environment in which they run. Indeed, defining an application on top of a P2P network often means defining an application where peers contribute resources in exchange for their ability to use the P2P application. For example, in a P2P file sharing application, while the user is downloading some file, the P2P application is in parallel serving that file to other users. Such peers could have limited hardware resources, e.g., CPU, bandwidth and memory, or the end-user could decide a priori to limit the resources dedicated to the P2P application. In addition, a P2P network is typically immersed in an unreliable environment, where communication links and processes are subject to message losses and crashes, respectively. To support P2P applications, this thesis proposes a set of services that address some underlying constraints related to the nature of P2P networks. The proposed services include a set of adaptive broadcast solutions and an adaptive data replication solution that can be used as the basis of several P2P applications. Our data replication solution increases availability and reduces the communication overhead. The broadcast solutions aim at providing a communication substrate encapsulating one of the key communication paradigms used by P2P applications: broadcast. Our broadcast solutions typically aim at offering reliability and scalability to some upper layer, be it an end-to-end P2P application or another system-level layer, such as a data replication layer.
Our contributions are organized in a protocol stack made of three layers. In each layer, we propose a set of adaptive protocols that address specific constraints imposed by the environment. Each protocol is evaluated through a set of simulations. The adaptiveness of our solutions relies on the fact that they take into account the constraints of the underlying system in a proactive manner. To model these constraints, we define an environment approximation algorithm that provides an approximate view of the system or part of it. This approximate view includes the topology and the reliability of the components, expressed in probabilistic terms. To adapt to the underlying system constraints, the proposed broadcast solutions route messages through tree overlays that maximize the broadcast reliability. Here, the broadcast reliability is expressed as a function of the reliability of the selected paths and of the use of available resources. These resources are modeled in terms of message quotas reflecting the receiving and sending capacities at each node. To allow deployment in a large-scale system, we take into account the memory available at each process by limiting the view it has to maintain of the system. Using this partial view, we propose three scalable broadcast algorithms, which are based on a propagation overlay that tends toward the global tree overlay and adapts to some constraints of the underlying system. At a higher level, this thesis also proposes a data replication solution that is adaptive both in terms of replica placement and in terms of request routing. At the routing level, this solution takes the unreliability of the environment into account, in order to maximize reliable delivery of requests. At the replica placement level, the dynamically changing origin and frequency of read/write requests are analyzed, in order to define a set of replicas that minimizes communication cost.
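One simple way to picture a tree overlay built from probabilistic link reliabilities is sketched below. It is not the thesis's algorithm: it assumes independent link failures, ignores the message quotas and partial views described above, and selects for each node the path from the source with the highest product of link reliabilities, using a Dijkstra-style search. The function name and the example topology are hypothetical.

```python
import heapq
from typing import Dict, Optional, Tuple

Links = Dict[str, Dict[str, float]]  # node -> {neighbour: link reliability in (0, 1]}


def most_reliable_tree(
    links: Links, source: str
) -> Tuple[Dict[str, Optional[str]], Dict[str, float]]:
    """Return (parent, reliability) describing a tree rooted at source.

    reliability[n] is the highest product of link reliabilities over any
    path from source to n; parent[n] is the predecessor of n on that path.
    """
    best: Dict[str, float] = {source: 1.0}
    parent: Dict[str, Optional[str]] = {source: None}
    heap = [(-1.0, source)]  # max-heap emulated with negated reliabilities
    done = set()
    while heap:
        neg_rel, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        for neighbour, link_rel in links.get(node, {}).items():
            candidate = -neg_rel * link_rel
            if candidate > best.get(neighbour, 0.0):
                best[neighbour] = candidate
                parent[neighbour] = node
                heapq.heappush(heap, (-candidate, neighbour))
    return parent, best


if __name__ == "__main__":
    # Hypothetical topology: the direct link s -> b is less reliable
    # than the two-hop path s -> a -> b.
    example: Links = {"s": {"a": 0.9, "b": 0.5}, "a": {"b": 0.9}, "b": {}}
    parents, reliabilities = most_reliable_tree(example, "s")
    print(parents)
    print(reliabilities)
```

Because every link reliability is at most 1, extending a path can never increase its reliability, which is what makes the greedy search valid here.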
Abstract:
Abstract: The literature on the various links between organizations and their external environment is very extensive and fragmented. This thesis comprises three separate essays, each examining specific research questions related to these links. The first essay deals with the notion of industry life cycle and how the geographical concentration of an industry is linked to the particular life cycle stage in which the industry finds itself. The first aim of this essay is to verify whether the evolution of the Swiss hotel industry fits some of the stylized facts of the industry life cycle. The second aim is to verify whether there is evidence of geographical clustering of the hotel industry, and by extension of tourism. The third aim is to test the hypothesis that industry decline manifests itself mainly through company closures in decentralized locations. The importance for organizational survival and performance of adapting and reacting to environmental changes has long been established. This adaptation requires managers, under conditions of uncertainty, to identify relevant changes in their external environment and to interpret the possible effects of those changes on their organization. Furthermore, it requires finding and adopting organizational responses in reaction to the environmental changes. The second essay explores how managers perceive their environment by reporting the results of two workshops held with managers from the European hotel industry. In the third essay we examine in more detail the role of uncertainty in the interpretation by executives of environmental changes. We integrate existing theories of interpretation and uncertainty into one framework, which we then test using national survey data from the hotel industry. In all three essays we are able to provide some evidence to support our main hypotheses, but also make suggestions for further research into the topics examined.
Abstract:
Summary: Landscapes are continuously changing. Natural forces of change such as heavy rainfall and fires can exert lasting influences on their physical form. However, changes related to human activities have often shaped landscapes more distinctly. In Western Europe, modern agricultural practices and the expansion of built-up land in particular have left their marks on the landscape since the middle of the 20th century. In recent years people have realised that more and more changes that were formerly attributed to natural forces might indirectly be the result of their own actions. Perhaps the most striking landscape change indirectly driven by human activity that we can witness these days is the large-scale retreat of Alpine glaciers. Together with the landscapes, the habitats of animal and plant species have also undergone vast and sometimes rapid changes that have been held responsible for the ongoing loss of biodiversity. Yet little is known about the probable effects of the rate of landscape change on species persistence and disappearance. Therefore, the development and speed of land use/land cover change in the Swiss communes between the 1950s and 1990s were reconstructed using 10 parameters from agriculture and housing censuses, and were further correlated with changes in butterfly species occurrences. Cluster analyses were used to detect spatial patterns of change on broad spatial scales. Clusters of communes showing similar changes or transformation rates were identified for single decades and put into a temporally dynamic sequence. The resulting picture of the changes showed a prevalent replacement of non-intensive agriculture by intensive practices, a strong spread of urban communes around city centres, and transitions towards larger farm sizes in the mountainous areas. Increasing transformation rates toward more intensive agricultural management were found especially until the 1970s, whereas afterwards the trends were commonly negative.
However, transformation rates representing the development of residential buildings were positive at all times. The analyses concerning the butterfly species showed that grassland species reacted sensitively to the density of livestock in the communes. This might indicate the increased use of dry grasslands as cattle pastures, which show altered plant species compositions. Furthermore, these species also decreased in communes where farms with an agricultural area >5 ha have disappeared. The species of the wetland habitats were favoured in communes with smaller fractions of agricultural areas and lower densities of large farms (>10 ha) but did not show any correlation with transformation rates. It was concluded from these analyses that transformation rates might influence species disappearance to a certain extent but that the states of the environmental predictors might generally outweigh the importance of the corresponding rates. Information on the current distribution of species is essential for nature conservation. Planning authorities that define priority areas for species protection or examine and authorise construction projects need to know about the spatial distribution of species. Hence, models that simulate the potential spatial distribution of species have become important decision tools. The underlying statistical analyses, such as the widely used generalised linear models (GLM), often rely on binary species presence-absence data. However, often only species presence data have been collected, especially for vagrant, rare or cryptic species such as butterflies or reptiles. Modellers have thus introduced randomly selected absence data to design distribution models. Yet, selecting false absence data might bias the model results. Therefore, we investigated several strategies to select more reliable absence data to model the distribution of butterfly species based on historical distribution data.
The results showed that better models were obtained when historical data from longer time periods were considered. Furthermore, model performance increased further when long-term data of species with habitat requirements similar to those of the modelled species were used. This successful methodological approach was further applied to assess the consequences of future landscape changes on the occurrence of butterfly species inhabiting dry grasslands or wetlands. These habitat types have been subjected to strong deterioration in recent decades, which makes their protection a task for the future. Four spatially explicit scenarios that described (i) ongoing land use changes as observed between 1985 and 1997, (ii) liberalised agricultural markets, and (iii) slightly and (iv) strongly lowered agricultural production provided probable directions of landscape change. Current species-environment relationships were derived from a statistical model and used to predict future occurrence probabilities in six major biogeographical regions in Switzerland, comprising the Jura Mountains, the Plateau, the Northern and Southern Alps, as well as the Western and Eastern Central Alps. The main results were that dry grassland species profited from lowered agricultural production, whereas overgrowth of open areas in the liberalisation scenario might impair species occurrence. The wetland species mostly responded with decreases in their occurrence probabilities in the scenarios, due to a loss of their preferred habitat. Further analyses of factors currently influencing species occurrences confirmed anthropogenic causes such as urbanisation, abandonment of open land, and agricultural intensification. Hence, landscape planning should pay more attention to these forces in areas currently inhabited by these butterfly species to enable sustainable species persistence. 
In this thesis historical data were intensively used to reconstruct past developments and to make them useful for current investigations. Yet, the availability of historical data and the analyses on broader spatial scales have often limited the explanatory power of the conducted analyses. Meaningful descriptors of former habitat characteristics and abundant species distribution data are generally sparse, especially for fine-scale analyses. However, this situation can be ameliorated by broadening the extent of the study site and the grain size used, as was done in this thesis by considering the whole of Switzerland with its communes. Nevertheless, current monitoring projects and data recording techniques are promising data sources that might allow more detailed analyses of long-term species reactions to landscape changes in the near future. This work, however, also showed the value of historical species distribution data, for example their potential to locate still unknown species occurrences. The results might therefore contribute to further research activities that investigate current and future species distributions considering the immense richness of historical distribution data. Résumé Landscapes change continuously. Natural forces such as heavy rain or fires can have a lasting influence on the form of the landscape. However, changes attributed to human activities have often shaped landscapes more profoundly. Since the 1950s in particular, modern agricultural practices and the expansion of residential and infrastructure areas have characterised landscape development in Western Europe. In recent years, people have begun to realise that many "natural" changes might indirectly result from their own activities. The most visible landscape change we are witnessing today is probably the immense retreat of Alpine glaciers. 
Along with the landscapes, the habitats of animals and plants have also been exposed to vast and sometimes rapid changes, which are held partly responsible for the continuing decline in biodiversity. However, little is known about the probable effects of the speed of landscape change on the persistence and disappearance of species. The development and speed of change in land use and land cover in the Swiss communes between the 1950s and 1990s were therefore reconstructed using 10 variables from agricultural and housing censuses and were correlated with changes in the occurrence of butterflies. Cluster analyses were used to detect spatial patterns of change at the scale of Switzerland. Communes with comparable changes or rates of change were delimited for individual decades and placed in a temporal sequence, thereby conveying some of the dynamics of change. The results showed a widespread replacement of extensive agriculture by intensive practices, a strong expansion of suburbs around large cities, and transitions towards larger farm areas in the Alps. For farms, increasing rates of change were observed until the 1970s, whereas the trend was generally reversed in the following years. By contrast, the rate of construction of new houses showed positive trends throughout the 50 years. The analyses of butterfly responses showed that dry grassland species reacted sensitively to high livestock density. It is possible that in these communes many dry grasslands were fertilised and used as pastures, which have a different floristic composition. 
In addition, these species declined in communes characterised by a rapid loss of farms with a cultivable area greater than 5 ha. Wetland species were favoured in communes with little cultivable area and few large farms, but did not respond to rates of change. It was therefore concluded that the speed of change might explain species disappearances in certain cases, but that predictors expressing states might be more important descriptors. Information on the recent distribution of species is important for nature conservation measures. For authorities responsible for defining priority protection areas or authorising construction projects, this information is indispensable. Spatial species distribution models have therefore become important decision tools. Common statistical methods such as generalised linear models (GLM) require species presence and absence data. However, often only presence data are available, especially for migratory, rare or cryptic animals such as butterflies or reptiles. For this reason some modellers have chosen absences at random, at the risk of influencing the result by selecting false absences. We devised several strategies, based on historical distribution data of butterflies, to select more reliable absences. The results demonstrated that better models could be obtained when the data came from longer time periods. Moreover, model performance could be increased by considering long-term distribution data of species that occupy habitats similar to those of the target species. 
Given the success of this strategy, it was used to assess the potential effects of future landscape changes on the distribution of butterflies of dry grasslands and wetlands, two habitats that have suffered severe deterioration. Four spatially explicit scenarios, describing (i) the extrapolation of land use changes as observed between 1985 and 1997, (ii) the liberalisation of agricultural markets, and (iii) slightly and (iv) strongly reduced agricultural production, were used to generate probable directions of change. Current relationships between species distribution and the environment were determined using statistical models and were used to calculate occurrence probabilities under the scenarios in six major biogeographical regions of Switzerland, comprising the Jura, the Plateau, and the Northern, Southern, Eastern Central and Western Central Alps. The main results showed that dry grassland species could benefit from a reduction in agricultural production, but that they could also disappear because of the scrub encroachment on open land caused by the liberalisation of agricultural markets. The occurrence probability of wetland species decreased because of a general loss of favourable habitats. In addition, the analyses confirmed that human causes such as urbanisation, the abandonment of open land and agricultural intensification currently affect these species. These forces should therefore be better taken into account in landscape planning so that these butterflies can survive in their current habitats. In this thesis, historical data were used intensively to reconstruct past developments and to make them useful for contemporary research. 
However, the availability of historical data and the analyses at broad scales often limited the explanatory power of the analyses. Relevant descriptors of former habitats and sufficient species distribution data are generally scarce, especially for fine-scale analyses. This situation can be improved by increasing the extent of the study site and the grain size, as was done in this thesis by considering the whole of Switzerland with its communes. Nevertheless, recent monitoring projects and data collection techniques are promising sources that might allow more detailed analyses of the long-term responses of species to landscape change in the future. This work also showed the value of old distribution data, for example their potential to help locate still unknown species occurrences. The results may contribute to future research activities studying recent or future species distributions while considering the immense richness of historical distribution data.
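The abstract above describes choosing more reliable absence data from historical records of species with similar habitat requirements. A minimal sketch of one way such a rule could look is given below; the rule itself (a commune counts as an absence only if the target species was never recorded there while at least one similar species was) is an assumption made for illustration, not the thesis's documented procedure, and all commune and species names are invented.

```python
# Hypothetical historical records: commune id -> species ever observed there.
# All names and data are invented for illustration only.
records = {
    "A": {"target", "similar_1"},
    "B": {"similar_1", "similar_2"},
    "C": {"unrelated"},
    "D": set(),
    "E": {"similar_2"},
}


def reliable_absences(records, target, similar_species):
    """Return communes where the target species was never recorded although
    at least one species with similar habitat requirements was recorded.

    The presence of similar species is taken here as evidence that the
    commune was actually surveyed (an assumption of this sketch).
    """
    return sorted(
        commune
        for commune, seen in records.items()
        if target not in seen and seen & similar_species
    )


absences = reliable_absences(records, "target", {"similar_1", "similar_2"})
print(absences)
```

Communes "C" and "D" are left out because nothing indicates that the relevant habitats were surveyed there, which is the situation in which a randomly drawn absence is most likely to be false.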
Resumo:
The Trepca Pb-Zn-Ag skarn deposit (29 Mt of ore at 3.45% Pb, 2.30% Zn, and 80 g/t Ag) is located in the Kopaonik block of the western Vardar zone, Kosovo. The mineralization, hosted by recrystallized limestone of Upper Triassic age, was structurally and lithologically controlled. Ore deposition is spatially and temporally related to the postcollisional magmatism of Oligocene age (23-26 Ma). The deposit was formed during two distinct mineralization stages: an early prograde closed-system stage and a later retrograde open-system stage. The prograde mineralization consisting mainly of pyroxenes (Hd(54-100)Jo(0-45)Di(0-45)) resulted from the interaction of magmatic fluids associated with Oligocene (23-26 Ma) postcollisional magmatism. Because there is no direct contact between magmatic rocks and the mineralization, the deposit is classified as a distal Pb-Zn-Ag skarn. Abundant pyroxene reflects low oxygen fugacity (<10^-31 bar) and an anhydrous environment. Fluid inclusion data and the mineral assemblage limit the prograde stage to a temperature range between 390° and 475°C. Formation pressure is estimated at below 900 bars. The isotopic composition of aqueous fluid inclusions hosted by hedenbergite (δD = -108 to -130‰; δ18O = 7.5-8.0‰), the Mn-enriched mineralogy and the high REE content of the host carbonates at the contact with the skarn mineralization suggest that a magmatic fluid was modified during its infiltration through the country rocks. The retrograde mineral assemblage comprises ilvaite, magnetite, arsenopyrite, pyrrhotite, marcasite, pyrite, quartz, and various carbonates. Increases in oxygen and sulfur fugacities, as well as the hydrous character of the mineralization, require an open-system model. The opening of the system is related to a phreatomagmatic explosion and the formation of the breccia. 
The arsenopyrite geothermometer limits the retrograde stage to a temperature range between 350° and 380°C and a sulfur fugacity between 10^-8.8 and 10^-7.2 bars. The principal ore minerals, galena, sphalerite, pyrite, and minor chalcopyrite, were deposited from a moderately saline Ca-Na chloride fluid at around 350°C. According to the isotopic composition of fluid inclusions hosted by sphalerite (δD = -55 to -74‰; δ18O = -9.6 to -13.6‰), the fluid responsible for ore deposition was dominantly meteoric in origin. The δ34S values of the sulfides, spanning between -5.5 and +10‰, point to a magmatic origin of sulfur. Ore deposition appears to have been largely contemporaneous with the retrograde stage of the skarn development. The postore stage was accompanied by the precipitation of significant amounts of carbonates, including the travertine deposits at the deposit surface. The mineralogical composition of the travertine varies from calcite to siderite, and all carbonates contain significant amounts of Mn. Decreased formation temperature and depletion in the REE content point to the influence of pH-neutralized cold ground water and a dying magmatic system.
Resumo:
Introduction: Therapeutic drug monitoring (TDM) aims at optimizing treatment by individualizing dosage regimens based on the measurement of blood concentrations. Maintaining concentrations within a target range requires pharmacokinetic and clinical capabilities. Bayesian calculation represents a gold standard in the TDM approach but requires computing assistance. In the last decades computer programs have been developed to assist clinicians in this assignment. The aim of this benchmarking was to assess and compare computer tools designed to support TDM clinical activities. Method: A literature and Internet search was performed to identify software. All programs were tested on a common personal computer. Each program was scored against a standardized grid covering pharmacokinetic relevance, user-friendliness, computing aspects, interfacing, and storage. A weighting factor was applied to each criterion of the grid to reflect its relative importance. To assess the robustness of the software, six representative clinical vignettes were also processed through all of them. Results: 12 software tools were identified, tested and ranked. This represents a comprehensive review of the characteristics of the available software. The number of drugs handled varies widely, and 8 programs allow users to add their own drug models. 10 computer programs are able to compute Bayesian dosage adaptation based on a blood concentration (a posteriori adjustment), while 9 are also able to suggest an a priori dosage regimen (prior to any blood concentration measurement), based on individual patient covariates such as age, gender and weight. Among those applying Bayesian analysis, one uses the non-parametric approach. The top 2 programs emerging from this benchmark are MwPharm and TCIWorks. Other programs evaluated also have good potential but are less sophisticated (e.g. 
in terms of storage or report generation) or less user-friendly. Conclusion: Whereas 2 integrated programs are at the top of the ranked list, such complex tools may not fit all institutions, and each software tool must be regarded with respect to the individual needs of hospitals or clinicians. Interest in computing tools to support therapeutic monitoring is still growing. Although developers have put effort into them in recent years, there is still room for improvement, especially in terms of institutional information system interfacing, user-friendliness, data storage capacity and report generation.
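The benchmarking method above scores each program against a grid of criteria and applies a weighting factor to each criterion. The sketch below shows how such a weighted score can be computed and used for ranking. The criterion names come from the abstract, but the weights, the 0-10 scale, the program names and every score are hypothetical, since the abstract does not report the actual grid.

```python
# Hypothetical weights per criterion (the survey's real weights are not given here).
WEIGHTS = {
    "pharmacokinetic relevance": 3.0,
    "user-friendliness": 2.0,
    "computing aspects": 1.0,
    "interfacing": 1.0,
    "storage": 1.0,
}

# Hypothetical scores on a 0-10 scale for two invented programs.
SCORES = {
    "Program A": {"pharmacokinetic relevance": 8, "user-friendliness": 6,
                  "computing aspects": 7, "interfacing": 5, "storage": 4},
    "Program B": {"pharmacokinetic relevance": 6, "user-friendliness": 9,
                  "computing aspects": 5, "interfacing": 5, "storage": 5},
}


def weighted_score(scores, weights):
    """Weighted mean of the criterion scores."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


# Rank programs from highest to lowest weighted score.
ranking = sorted(SCORES, key=lambda p: weighted_score(SCORES[p], WEIGHTS), reverse=True)
for program in ranking:
    print(program, round(weighted_score(SCORES[program], WEIGHTS), 3))
```

Normalising by the sum of the weights keeps the result on the same scale as the individual scores, so rankings remain comparable if a criterion is added or reweighted.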
Resumo:
Objectives: Therapeutic drug monitoring (TDM) aims at optimizing treatment by individualizing dosage regimens based on the measurement of blood concentrations. Maintaining concentrations within a target range requires pharmacokinetic (PK) and clinical capabilities. Bayesian calculation represents a gold standard in the TDM approach but requires computing assistance. The aim of this benchmarking was to assess and compare computer tools designed to support TDM clinical activities. Methods: The literature and the Internet were searched to identify software. Each program was scored against a standardized grid covering pharmacokinetic relevance, user-friendliness, computing aspects, interfacing, and storage. A weighting factor was applied to each criterion of the grid to reflect its relative importance. To assess the robustness of the software, six representative clinical vignettes were also processed through all of them. Results: 12 software tools were identified, tested and ranked. This represents a comprehensive review of the characteristics of the available software. The number of drugs handled varies from 2 to more than 180, and integration of different population types is available for some programs. In addition, 8 programs offer the ability to add new drug models based on population PK data. 10 computer tools incorporate Bayesian computation to predict dosage regimens (individual parameters are calculated based on population PK models). All of them are able to compute Bayesian a posteriori dosage adaptation based on a blood concentration, while 9 are also able to suggest an a priori dosage regimen, based only on individual patient covariates. Among those applying Bayesian analysis, MM-USC*PACK uses a non-parametric approach. The top 2 programs emerging from this benchmark are MwPharm and TCIWorks. 
Other programs evaluated also have good potential but are less sophisticated or less user-friendly. Conclusions: Whereas 2 software packages are ranked at the top of the list, such complex tools may not fit all institutions, and each program must be regarded with respect to the individual needs of hospitals or clinicians. Programs should be easy and fast for routine activities, including for non-experienced users. Although interest in TDM tools is growing and effort has been put into them in recent years, there is still room for improvement, especially in terms of institutional information system interfacing, user-friendliness, data storage capability and automated report generation.
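Bayesian a posteriori adjustment, as mentioned above, combines a population PK model (the prior) with a measured blood concentration to estimate individual parameters. The sketch below illustrates the idea with a maximum a posteriori (MAP) estimate of clearance in a one-compartment intravenous bolus model. It is not how any of the reviewed programs is implemented; the model, the fixed volume of distribution, the log-normal prior and every number are assumptions chosen for illustration.

```python
import math

# All values are hypothetical and serve only to illustrate the calculation.
DOSE_MG = 1000.0             # intravenous bolus dose
V_L = 50.0                   # volume of distribution, assumed known
CL_POP = 5.0                 # population (prior) clearance in L/h
OMEGA = 0.3                  # between-patient SD of ln(CL)
SIGMA = 1.0                  # residual error SD in mg/L
T_OBS_H, C_OBS = 8.0, 6.0    # one measured concentration (time in h, mg/L)


def predict(cl, t):
    """One-compartment bolus model: C(t) = Dose / V * exp(-CL / V * t)."""
    return DOSE_MG / V_L * math.exp(-cl / V_L * t)


def objective(cl):
    """Penalised least squares used for MAP estimation: prior term + data term."""
    prior = ((math.log(cl) - math.log(CL_POP)) / OMEGA) ** 2
    data = ((C_OBS - predict(cl, T_OBS_H)) / SIGMA) ** 2
    return prior + data


# Simple grid search over clearances from 0.5 to 20 L/h in steps of 0.001.
grid = [0.5 + 0.001 * i for i in range(19501)]
cl_map = min(grid, key=objective)


def dose_for_trough(cl, target_trough, tau_h):
    """Bolus dose giving the target steady-state trough for dosing interval tau_h."""
    decay = math.exp(-cl / V_L * tau_h)
    return target_trough * V_L * (1.0 - decay) / decay


print("MAP clearance (L/h):", round(cl_map, 3))
print("Dose for a 2 mg/L trough every 12 h (mg):", round(dose_for_trough(cl_map, 2.0, 12.0), 1))
```

The estimate falls between the population clearance and the value implied by the single measurement alone, which is the shrinkage behaviour expected of a Bayesian estimate. An a priori regimen corresponds to calling `dose_for_trough` with `CL_POP` before any concentration is available.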
Resumo:
Centrifuge is a user-friendly system to simultaneously access Arabidopsis gene annotations and intra- and inter-organism sequence comparison data. The tool allows rapid retrieval of user-selected data for each annotated Arabidopsis gene providing, in any combination, data on the following features: predicted protein properties such as mass, pI, cellular location and transmembrane domains; SWISS-PROT annotations; Interpro domains; Gene Ontology records; verified transcription; BLAST matches to the proteomes of A.thaliana, Oryza sativa (rice), Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens. The tool lends itself particularly well to the rapid analysis of contigs or of tens or hundreds of genes identified by high-throughput gene expression experiments. In these cases, a summary table of principal predicted protein features for all genes is given followed by more detailed reports for each individual gene. Centrifuge can also be used for single gene analysis or in a word search mode. AVAILABILITY: http://centrifuge.unil.ch/ CONTACT: edward.farmer@unil.ch.
Resumo:
ABSTRACT: A firm's competitive advantage can arise from internal resources as well as from an interfirm network. This dissertation investigates the competitive advantage of a firm involved in an innovation network by integrating strategic management theory and social network theory. It develops theory and provides empirical evidence that illustrates how a networked firm enables the network value and appropriates this value in an optimal way according to its strategic purpose. The four inter-related essays in this dissertation provide a framework that sheds light on the extraction of value from an innovation network by managing and designing the network in a proactive manner. The first essay reviews research in social network theory and knowledge transfer management, and identifies the crucial factors of innovation network configuration for a firm's learning performance or innovation output. The findings suggest that network structure, network relationship, and network position all have an impact on a firm's performance. Although the previous literature indicates that there are disagreements about the impact of dense or sparse structure, as well as strong or weak ties, case evidence from Chinese software companies reveals that dense and strong connections with partners are positively associated with firms' performance. The second essay is a theoretical essay that illustrates the limitations of social network theory for explaining the source of network value and offers a new theoretical model that applies the resource-based view to network environments. It suggests that network configurations, such as network structure, network relationship and network position, can be considered important network resources. In addition, this essay introduces the concept of network capability, and suggests that four types of network capabilities play an important role in unlocking the potential value of network resources and determining the distribution of network rents between partners. 
This essay also highlights the contingent effects of network capability on a firm's innovation output, and explains how the different impacts of network capability depend on a firm's strategic choices. This new theoretical model has been pre-tested with a case study of the Chinese software industry, which enhances the internal validity of this theory. The third essay addresses the questions of what impact network capability has on firm innovation performance and what the antecedent factors of network capability are. This essay employs a structural equation modelling methodology that uses a sample of 211 Chinese high-tech firms. It develops a measurement of network capability and reveals that networked firms deal with cooperation between, and coordination with, partners on different levels according to their levels of network capability. The empirical results also suggest that IT maturity, the openness of culture, the management system involved, and experience with network activities are antecedents of network capabilities. Furthermore, the two-group analysis of the role of international partner(s) shows that when there is a culture and norm gap between foreign partners, a firm must mobilize more resources and effort to improve its performance with respect to its innovation network. The fourth essay addresses the way in which network capabilities influence firm innovation performance. Using hierarchical multiple regression with data from Chinese high-tech firms, the findings suggest that there is a significant partial mediating effect of knowledge transfer on the relationships between network capabilities and innovation performance. The findings also reveal that the impacts of network capabilities vary with the environment and the strategic decision the firm has made: exploration or exploitation. 
Network constructing capability provides a greater positive impact on and yields more contributions to innovation performance than does network operating capability in an exploration network. Network operating capability is more important than network constructing capability for innovative firms in an exploitation network. Therefore, these findings highlight that the firm can shape the innovation network proactively for better benefits, but when it does so, it should adjust its focus and change its efforts in accordance with its innovation purposes or strategic orientation.
Resumo:
Purpose: The accurate estimation of total energy expenditure (TEE) is essential to meet the nutritional requirements of patients treated by maintenance hemodialysis (MHD). Measuring TEE and resting energy expenditure (REE) by direct or indirect calorimetry or doubly labeled water is complicated, time-consuming and cumbersome in this population. Recently, a new system called SenseWear® armband (SWA) was developed to assess TEE, physical activity and REE. This device works by measuring body acceleration in two axes, heat production and step counts. REE measured by indirect calorimetry and by SWA are well correlated. The aim of this study was to determine TEE, physical activity and REE in patients on MHD using this new device. Methods and materials: Daily TEE, REE, step count, activity time, intensity of activity and lying time were determined for 7 consecutive days in unselected stable patients on MHD and in sex-, age- and weight-matched healthy controls (HC). Patients with malnutrition, cancer, use of immunosuppressive drugs or hypoalbuminemia <35 g/L, and those hospitalized in the last 3 months, were excluded. For MHD patients, separate analyses were conducted for dialysis and non-dialysis days. Relevant parameters known to affect REE, such as BMI, albumin, pre-albumin, hemoglobin, Kt/V, CRP, bicarbonate, PTH and TSH, were recorded. Results: Thirty patients on MHD and 30 HC were included. Among the MHD patients, there were 20 men and 10 women. Age was 60.13 ± 14.97 years (mean ± SD), BMI was 25.77 ± 4.73 kg/m² and body weight was 74.65 ± 16.16 kg. There were no significant differences between the two groups. TEE was lower in MHD patients compared to HC (28.79 ± 5.51 SD versus 32.91 ± 5.75 SD kcal/kg/day; p <0.01). Activity time was significantly lower in patients on MHD (101.3 ± 12.6 SD versus 50.7 ± 9.4 SD min; p = 0.0021). Energy expenditure during the time of activity was significantly lower in MHD patients. 
MHD patients walked 4543 ± 643 SD vs 8537 ± 744 SD steps per day (p <0.0001). Age was negatively correlated with TEE (r = -0.70) and intensity of activity (r = -0.61) in HC, but not in patients on MHD. TEE showed no difference between dialysis and non-dialysis days (29.92 ± 2.03 SD versus 28.44 ± 1.90 SD kcal/kg/day; p = NS), reflecting a lack of difference in activity (number of steps, time of physical activity) and REE. This finding was observed in MHD patients both older and younger than 60 years. However, age stratification appeared to have an influence on TEE, regardless of dialysis day (29.92 ± 2.07 SD kcal/kg/day for patients <60 years old versus 27.41 ± 1.04 SD kcal/kg/day for those ≥60 years old), although the difference failed to reach statistical significance. Conclusion: Using SWA, we have shown that stable patients on MHD have a lower TEE than matched HC. On average, a TEE of 28.79 kcal/kg/day, partially affected by age, was measured. This finding supports the clinical impression that it is difficult and probably unnecessary to provide an energy amount of 30-35 kcal/kg/day, as proposed by international guidelines for this population. In addition, we documented for the first time that MHD patients engage in less physical activity than HC. Surprisingly, there were no differences in TEE, REE and physical activity parameters between dialysis and non-dialysis days. This observation might be due to the physical effort patients on MHD make to reach the dialysis centre. Age per se did not influence physical activity in MHD patients, contrary to HC, reflecting the impact of co-morbidities on physical activity in this group of patients.
Resumo:
Résumé This thesis is devoted to the analysis, modelling and visualisation of spatially referenced environmental data using machine learning algorithms. Machine learning can be broadly considered a subfield of artificial intelligence that is particularly concerned with the development of techniques and algorithms allowing a machine to learn from data. In this thesis, machine learning algorithms are adapted for application to environmental data and spatial prediction. Why machine learning? Because most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient for modelling. They can solve classification, regression and probability density modelling problems in high-dimensional spaces composed of spatially referenced informative variables ("geo-features") in addition to geographical coordinates. Moreover, they are well suited to implementation as decision support tools for environmental questions ranging from pattern recognition to modelling and prediction, including automatic mapping. Their efficiency is comparable to that of geostatistical models in the space of geographical coordinates, but they are indispensable for high-dimensional data that include geo-features. The most important and popular machine learning algorithms are presented theoretically and implemented as software for the environmental sciences. 
The main algorithms described are the multilayer perceptron (MLP), the best-known algorithm in artificial intelligence; general regression neural networks (GRNN); probabilistic neural networks (PNN); self-organising maps (SOM); Gaussian mixture models (GMM); radial basis function networks (RBF); and mixture density networks (MDN). This range of algorithms covers varied tasks such as classification, regression and probability density estimation. Exploratory data analysis (EDA) is the first step of any data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are treated both with the traditional geostatistical approach of experimental variography and according to the principles of machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations that makes it possible to detect the presence of spatial patterns describable by a statistic. The machine learning approach to ESDA is presented through the application of the k-nearest neighbours method, which is very simple and has excellent interpretation and visualisation qualities. An important part of the thesis deals with topical subjects such as the automatic mapping of spatial data. The general regression neural network is proposed to solve this task efficiently. 
The performance of the GRNN is demonstrated on data from the Spatial Interpolation Comparison (SIC) 2004, for which the GRNN significantly outperforms all other methods, particularly in emergency situations. The thesis is composed of four chapters: theory, applications, software tools and guided examples. An important part of the work is a collection of software: Machine Learning Office. This software collection was developed over the last 15 years and has been used for teaching numerous courses, including international workshops in China, France, Italy, Ireland and Switzerland, as well as in fundamental and applied research projects. The case studies considered cover a broad spectrum of real low- and high-dimensional geo-environmental problems, such as air, soil and water pollution by radioactive products and heavy metals, the classification of soil types and hydrogeological units, uncertainty mapping for decision support and natural hazard assessment (landslides, avalanches). Complementary tools for exploratory data analysis and visualisation were also developed, with care taken to create a user-friendly, easy-to-use interface. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? 
In a few words, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to being implemented as predictive engines in decision support systems for environmental data mining, including pattern recognition, modeling and prediction as well as automatic data mapping. They are competitive with geostatistical models in low-dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks and mixture density networks. This set of models covers machine learning tasks such as classification, regression and density estimation. Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both the traditional geostatistical approach, such as experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations, which helps to reveal the presence of spatial patterns, at least those described by two-point statistics.
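As a rough illustration of the two-point statistics mentioned above, the following is a minimal sketch of an omnidirectional experimental semivariogram. It is not the implementation used in the thesis: the function name `experimental_variogram`, its parameters and the simple equal-width distance binning are assumptions made for this example. A directional (anisotropic) version would additionally restrict the pairs by the angle of their separation vector.

```python
import numpy as np

def experimental_variogram(coords, values, lag_width, n_lags):
    """Omnidirectional experimental semivariogram (illustrative sketch).

    Hypothetical helper, not taken from the thesis. Returns the bin
    centres, the semivariance per bin and the number of pairs per bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Enumerate every unique pair of points (i < j).
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    # Assign each pair to a distance bin [k * lag_width, (k + 1) * lag_width).
    bins = np.floor(dist / lag_width).astype(int)

    lags = (np.arange(n_lags) + 0.5) * lag_width
    gamma = np.full(n_lags, np.nan)   # NaN marks bins without any pair
    counts = np.zeros(n_lags, dtype=int)
    for k in range(n_lags):
        mask = bins == k
        counts[k] = mask.sum()
        if counts[k] > 0:
            # Semivariance: half the mean squared difference in the bin.
            gamma[k] = 0.5 * sq_diff[mask].mean()
    return lags, gamma, counts
```

Plotting `gamma` against `lags` gives the usual variogram cloud summary; bins with few pairs (see `counts`) should be read with caution.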
A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a current hot topic, namely the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model to solve this task. The performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed during the last 15 years and were used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.
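To make the GRNN idea concrete, here is a minimal sketch assuming the standard formulation of a general regression neural network as Gaussian kernel-weighted averaging of the training targets (Nadaraya-Watson regression). The function name `grnn_predict` and the single isotropic bandwidth `sigma` are assumptions for illustration; the thesis software may use a different parameterisation, for example anisotropic kernels or automatic bandwidth tuning.

```python
import numpy as np

def grnn_predict(train_x, train_y, query_x, sigma):
    """Predict at query_x with a Gaussian-kernel GRNN (illustrative sketch).

    Hypothetical helper, not the thesis implementation. `sigma` is a
    single isotropic kernel bandwidth chosen by the user.
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    query_x = np.asarray(query_x, dtype=float)

    # Squared Euclidean distances, shape (n_query, n_train).
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)

    # Shift by the row minimum for numerical stability; the shift
    # cancels in the ratio below, so predictions are unchanged.
    d2 = d2 - d2.min(axis=1, keepdims=True)

    # Gaussian kernel weights of every training point for every query.
    w = np.exp(-d2 / (2.0 * sigma ** 2))

    # Prediction: weighted average of the training targets.
    return (w @ train_y) / w.sum(axis=1)
```

The only free parameter here is `sigma`, which in practice is usually selected by cross-validation; a small value makes the prediction follow the nearest samples, a large value smooths towards the global mean.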
Abstract:
BACKGROUND: Blood pressure (BP) is known to aggregate in families. Yet, heritability estimates are population-specific and no Swiss data have been published so far. We estimated the heritability of ambulatory and office BP in a Swiss population-based sample. METHODS: The Swiss Kidney Project on Genes in Hypertension is a population-based family study focusing on BP genetics. Office and ambulatory BP were measured in 1009 individuals from 271 nuclear families. Heritability was estimated for SBP, DBP, and pulse pressure using a maximum likelihood method implemented in the Statistical Analysis in Genetic Epidemiology software. RESULTS: The 518 women and 491 men included in this analysis had a mean (±SD) age of 48.3 (±17.4) and 47.3 (±17.7) years, and a mean BMI of 23.8 (±4.2) and 25.9 (±4.1) kg/m², respectively. Narrow-sense heritability estimates (±standard error) for ambulatory SBP, DBP, and pulse pressure were 0.37 ± 0.07, 0.26 ± 0.07, and 0.29 ± 0.07 for 24-h BP; 0.39 ± 0.07, 0.28 ± 0.07, and 0.27 ± 0.07 for day BP; and 0.25 ± 0.07, 0.20 ± 0.07, and 0.30 ± 0.07 for night BP, respectively (all P < 0.001). Heritability estimates for office SBP, DBP, and pulse pressure were 0.21 ± 0.08, 0.25 ± 0.08, and 0.18 ± 0.07 (all P < 0.01). CONCLUSIONS: We found significant heritability estimates for both ambulatory and office BP in this Swiss population-based study. Our findings justify the ongoing search for the genetic determinants of BP.
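For readers less familiar with the term, narrow-sense heritability is conventionally defined as the share of total phenotypic variance attributable to additive genetic effects. The symbols below follow common textbook notation and are not taken from the abstract itself:

```latex
h^2 = \frac{\sigma_A^2}{\sigma_P^2}
```

Here \sigma_A^2 denotes the additive genetic variance and \sigma_P^2 the total phenotypic variance. Read this way, the 24-h ambulatory SBP estimate of 0.37 means that roughly 37% of the phenotypic variance in SBP is attributed to additive genetic effects in this sample.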