222 resultados para Classification Tree Pruning
Resumo:
BACKGROUND: The majority of Haemosporida species infect birds or reptiles, but many important genera, including Plasmodium, infect mammals. Dipteran vectors shared by avian, reptilian and mammalian Haemosporida, suggest multiple invasions of Mammalia during haemosporidian evolution; yet, phylogenetic analyses have detected only a single invasion event. Until now, several important mammal-infecting genera have been absent in these analyses. This study focuses on the evolutionary origin of Polychromophilus, a unique malaria genus that only infects bats (Microchiroptera) and is transmitted by bat flies (Nycteribiidae). METHODS: Two species of Polychromophilus were obtained from wild bats caught in Switzerland. These were molecularly characterized using four genes (asl, clpc, coI, cytb) from the three different genomes (nucleus, apicoplast, mitochondrion). These data were then combined with data of 60 taxa of Haemosporida available in GenBank. Bayesian inference, maximum likelihood and a range of rooting methods were used to test specific hypotheses concerning the phylogenetic relationships between Polychromophilus and the other haemosporidian genera. RESULTS: The Polychromophilus melanipherus and Polychromophilus murinus samples show genetically distinct patterns and group according to species. The Bayesian tree topology suggests that the monophyletic clade of Polychromophilus falls within the avian/saurian clade of Plasmodium and directed hypothesis testing confirms the Plasmodium origin. CONCLUSION: Polychromophilus' ancestor was most likely a bird- or reptile-infecting Plasmodium before it switched to bats. The invasion of mammals as hosts has, therefore, not been a unique event in the evolutionary history of Haemosporida, despite the suspected costs of adapting to a new host. This was, moreover, accompanied by a switch in dipteran host.
Resumo:
BACKGROUND: Western Palearctic tree frogs (Hyla arborea group) represent a strong potential for evolutionary and conservation genetic research, so far underexploited due to limited molecular resources. New microsatellite markers have recently been developed for Hyla arborea, with high cross-species utility across the entire circum-Mediterranean radiation. Here we conduct sibship analyses to map available markers for use in future population genetic applications. FINDINGS: We characterized eight linkage groups, including one sex-linked, all showing drastically reduced recombination in males compared to females, as previously documented in this species. Mapping of the new 15 markers to the ~200 My diverged Xenopus tropicalis genome suggests a generally conserved synteny with only one confirmed major chromosome rearrangement. CONCLUSIONS: The new microsatellites are representative of several chromosomes of H. arborea that are likely to be conserved across closely-related species. Our linkage map provides an important resource for genetic research in European Hylids, notably for studies of speciation, genome evolution and conservation.
Resumo:
Summary Ecotones are sensitive to change because they contain high numbers of species living at the margin of their environmental tolerance. This is equally true of tree-lines, which are determined by attitudinal or latitudinal temperature gradients. In the current context of climate change, they are expected to undergo modifications in position, tree biomass and possibly species composition. Attitudinal and latitudinal tree-lines differ mainly in the steepness of the underlying temperature gradient: distances are larger at latitudinal tree-lines, which could have an impact on the ability of tree species to migrate in response to climate change. Aside from temperature, tree-lines are also affected on a more local level by pressure from human activities. These are also changing as a consequence of modifications in our societies and may interact with the effects of climate change. Forest dynamics models are often used for climate change simulations because of their mechanistic processes. The spatially-explicit model TreeMig was used as a base to develop a model specifically tuned for the northern European and Alpine tree-line ecotones. For the latter, a module for land-use change processes was also added. The temperature response parameters for the species in the model were first calibrated by means of tree-ring data from various species and sites at both tree-lines. This improved the growth response function in the model, but also lead to the conclusion that regeneration is probably more important than growth for controlling tree-line position and species' distributions. The second step was to implement the module for abandonment of agricultural land in the Alps, based on an existing spatial statistical model. The sensitivity of its most important variables was tested and the model's performance compared to other modelling approaches. The probability that agricultural land would be abandoned was strongly influenced by the distance from the nearest forest and the slope, bath of which are proxies for cultivation costs. When applied to a case study area, the resulting model, named TreeMig-LAb, gave the most realistic results. These were consistent with observed consequences of land-abandonment such as the expansion of the existing forest and closing up of gaps. This new model was then applied in two case study areas, one in the Swiss Alps and one in Finnish Lapland, under a variety of climate change scenarios. These were based on forecasts of temperature change over the next century by the IPCC and the HadCM3 climate model (ΔT: +1.3, +3.5 and +5.6 °C) and included a post-change stabilisation period of 300 years. The results showed radical disruptions at both tree-lines. With the most conservative climate change scenario, species' distributions simply shifted, but it took several centuries reach a new equilibrium. With the more extreme scenarios, some species disappeared from our study areas (e.g. Pinus cembra in the Alps) or dwindled to very low numbers, as they ran out of land into which they could migrate. The most striking result was the lag in the response of most species, independently from the climate change scenario or tree-line type considered. Finally, a statistical model of the effect of reindeer (Rangifer tarandus) browsing on the growth of Pinus sylvestris was developed, as a first step towards implementing human impacts at the boreal tree-line. The expected effect was an indirect one, as reindeer deplete the ground lichen cover, thought to protect the trees against adverse climate conditions. The model showed a small but significant effect of browsing, but as the link with the underlying climate variables was unclear and the model was not spatial, it was not usable as such. Developing the TreeMig-LAb model allowed to: a) establish a method for deriving species' parameters for the growth equation from tree-rings, b) highlight the importance of regeneration in determining tree-line position and species' distributions and c) improve the integration of social sciences into landscape modelling. Applying the model at the Alpine and northern European tree-lines under different climate change scenarios showed that with most forecasted levels of temperature increase, tree-lines would suffer major disruptions, with shifts in distributions and potential extinction of some tree-line species. However, these responses showed strong lags, so these effects would not become apparent before decades and could take centuries to stabilise. Résumé Les écotones son sensibles au changement en raison du nombre élevé d'espèces qui y vivent à la limite de leur tolérance environnementale. Ceci s'applique également aux limites des arbres définies par les gradients de température altitudinaux et latitudinaux. Dans le contexte actuel de changement climatique, on s'attend à ce qu'elles subissent des modifications de leur position, de la biomasse des arbres et éventuellement des essences qui les composent. Les limites altitudinales et latitudinales diffèrent essentiellement au niveau de la pente des gradients de température qui les sous-tendent les distance sont plus grandes pour les limites latitudinales, ce qui pourrait avoir un impact sur la capacité des espèces à migrer en réponse au changement climatique. En sus de la température, la limite des arbres est aussi influencée à un niveau plus local par les pressions dues aux activités humaines. Celles-ci sont aussi en mutation suite aux changements dans nos sociétés et peuvent interagir avec les effets du changement climatique. Les modèles de dynamique forestière sont souvent utilisés pour simuler les effets du changement climatique, car ils sont basés sur la modélisation de processus. Le modèle spatialement explicite TreeMig a été utilisé comme base pour développer un modèle spécialement adapté pour la limite des arbres en Europe du Nord et dans les Alpes. Pour cette dernière, un module servant à simuler des changements d'utilisation du sol a également été ajouté. Tout d'abord, les paramètres de la courbe de réponse à la température pour les espèces inclues dans le modèle ont été calibrées au moyen de données dendrochronologiques pour diverses espèces et divers sites des deux écotones. Ceci a permis d'améliorer la courbe de croissance du modèle, mais a également permis de conclure que la régénération est probablement plus déterminante que la croissance en ce qui concerne la position de la limite des arbres et la distribution des espèces. La seconde étape consistait à implémenter le module d'abandon du terrain agricole dans les Alpes, basé sur un modèle statistique spatial existant. La sensibilité des variables les plus importantes du modèle a été testée et la performance de ce dernier comparée à d'autres approches de modélisation. La probabilité qu'un terrain soit abandonné était fortement influencée par la distance à la forêt la plus proche et par la pente, qui sont tous deux des substituts pour les coûts liés à la mise en culture. Lors de l'application en situation réelle, le nouveau modèle, baptisé TreeMig-LAb, a donné les résultats les plus réalistes. Ceux-ci étaient comparables aux conséquences déjà observées de l'abandon de terrains agricoles, telles que l'expansion des forêts existantes et la fermeture des clairières. Ce nouveau modèle a ensuite été mis en application dans deux zones d'étude, l'une dans les Alpes suisses et l'autre en Laponie finlandaise, avec divers scénarios de changement climatique. Ces derniers étaient basés sur les prévisions de changement de température pour le siècle prochain établies par l'IPCC et le modèle climatique HadCM3 (ΔT: +1.3, +3.5 et +5.6 °C) et comprenaient une période de stabilisation post-changement climatique de 300 ans. Les résultats ont montré des perturbations majeures dans les deux types de limites de arbres. Avec le scénario de changement climatique le moins extrême, les distributions respectives des espèces ont subi un simple glissement, mais il a fallu plusieurs siècles pour qu'elles atteignent un nouvel équilibre. Avec les autres scénarios, certaines espèces ont disparu de la zone d'étude (p. ex. Pinus cembra dans les Alpes) ou ont vu leur population diminuer parce qu'il n'y avait plus assez de terrains disponibles dans lesquels elles puissent migrer. Le résultat le plus frappant a été le temps de latence dans la réponse de la plupart des espèces, indépendamment du scénario de changement climatique utilisé ou du type de limite des arbres. Finalement, un modèle statistique de l'effet de l'abroutissement par les rennes (Rangifer tarandus) sur la croissance de Pinus sylvestris a été développé, comme première étape en vue de l'implémentation des impacts humains sur la limite boréale des arbres. L'effet attendu était indirect, puisque les rennes réduisent la couverture de lichen sur le sol, dont on attend un effet protecteur contre les rigueurs climatiques. Le modèle a mis en évidence un effet modeste mais significatif, mais étant donné que le lien avec les variables climatiques sous jacentes était peu clair et que le modèle n'était pas appliqué dans l'espace, il n'était pas utilisable tel quel. Le développement du modèle TreeMig-LAb a permis : a) d'établir une méthode pour déduire les paramètres spécifiques de l'équation de croissance ä partir de données dendrochronologiques, b) de mettre en évidence l'importance de la régénération dans la position de la limite des arbres et la distribution des espèces et c) d'améliorer l'intégration des sciences sociales dans les modèles de paysage. L'application du modèle aux limites alpines et nord-européennes des arbres sous différents scénarios de changement climatique a montré qu'avec la plupart des niveaux d'augmentation de température prévus, la limite des arbres subirait des perturbations majeures, avec des glissements d'aires de répartition et l'extinction potentielle de certaines espèces. Cependant, ces réponses ont montré des temps de latence importants, si bien que ces effets ne seraient pas visibles avant des décennies et pourraient mettre plusieurs siècles à se stabiliser.
Resumo:
The paper deals with the development and application of the generic methodology for automatic processing (mapping and classification) of environmental data. General Regression Neural Network (GRNN) is considered in detail and is proposed as an efficient tool to solve the problem of spatial data mapping (regression). The Probabilistic Neural Network (PNN) is considered as an automatic tool for spatial classifications. The automatic tuning of isotropic and anisotropic GRNN/PNN models using cross-validation procedure is presented. Results are compared with the k-Nearest-Neighbours (k-NN) interpolation algorithm using independent validation data set. Real case studies are based on decision-oriented mapping and classification of radioactively contaminated territories.
Resumo:
Colorectal cancer (CRC) is a major cause of cancer mortality. Whereas some patients respond well to therapy, others do not, and thus more precise, individualized treatment strategies are needed. To that end, we analyzed gene expression profiles from 1,290 CRC tumors using consensus-based unsupervised clustering. The resultant clusters were then associated with therapeutic response data to the epidermal growth factor receptor-targeted drug cetuximab in 80 patients. The results of these studies define six clinically relevant CRC subtypes. Each subtype shares similarities to distinct cell types within the normal colon crypt and shows differing degrees of 'stemness' and Wnt signaling. Subtype-specific gene signatures are proposed to identify these subtypes. Three subtypes have markedly better disease-free survival (DFS) after surgical resection, suggesting these patients might be spared from the adverse effects of chemotherapy when they have localized disease. One of these three subtypes, identified by filamin A expression, does not respond to cetuximab but may respond to cMET receptor tyrosine kinase inhibitors in the metastatic setting. Two other subtypes, with poor and intermediate DFS, associate with improved response to the chemotherapy regimen FOLFIRI in adjuvant or metastatic settings. Development of clinically deployable assays for these subtypes and of subtype-specific therapies may contribute to more effective management of this challenging disease.
Dissemination of the Swiss Model for Outcome Classification in Health Promotion and Prevention SMOC.
Resumo:
Résumé Suite aux recentes avancées technologiques, les archives d'images digitales ont connu une croissance qualitative et quantitative sans précédent. Malgré les énormes possibilités qu'elles offrent, ces avancées posent de nouvelles questions quant au traitement des masses de données saisies. Cette question est à la base de cette Thèse: les problèmes de traitement d'information digitale à très haute résolution spatiale et/ou spectrale y sont considérés en recourant à des approches d'apprentissage statistique, les méthodes à noyau. Cette Thèse étudie des problèmes de classification d'images, c'est à dire de catégorisation de pixels en un nombre réduit de classes refletant les propriétés spectrales et contextuelles des objets qu'elles représentent. L'accent est mis sur l'efficience des algorithmes, ainsi que sur leur simplicité, de manière à augmenter leur potentiel d'implementation pour les utilisateurs. De plus, le défi de cette Thèse est de rester proche des problèmes concrets des utilisateurs d'images satellite sans pour autant perdre de vue l'intéret des méthodes proposées pour le milieu du machine learning dont elles sont issues. En ce sens, ce travail joue la carte de la transdisciplinarité en maintenant un lien fort entre les deux sciences dans tous les développements proposés. Quatre modèles sont proposés: le premier répond au problème de la haute dimensionalité et de la redondance des données par un modèle optimisant les performances en classification en s'adaptant aux particularités de l'image. Ceci est rendu possible par un système de ranking des variables (les bandes) qui est optimisé en même temps que le modèle de base: ce faisant, seules les variables importantes pour résoudre le problème sont utilisées par le classifieur. Le manque d'information étiquétée et l'incertitude quant à sa pertinence pour le problème sont à la source des deux modèles suivants, basés respectivement sur l'apprentissage actif et les méthodes semi-supervisées: le premier permet d'améliorer la qualité d'un ensemble d'entraînement par interaction directe entre l'utilisateur et la machine, alors que le deuxième utilise les pixels non étiquetés pour améliorer la description des données disponibles et la robustesse du modèle. Enfin, le dernier modèle proposé considère la question plus théorique de la structure entre les outputs: l'intègration de cette source d'information, jusqu'à présent jamais considérée en télédétection, ouvre des nouveaux défis de recherche. Advanced kernel methods for remote sensing image classification Devis Tuia Institut de Géomatique et d'Analyse du Risque September 2009 Abstract The technical developments in recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to the users. However, even if these advances open more and more possibilities in the use of digital imagery, they also rise several problems of storage and treatment. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The accent is put on algorithmic efficiency and the simplicity of the approaches proposed, to avoid too complex models that would not be used by users. The major challenge of the Thesis is to remain close to concrete remote sensing problems, without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed: first, an adaptive model learning the relevant image features has been proposed to solve the problem of high dimensionality and collinearity of the image features. This model provides automatically an accurate classifier and a ranking of the relevance of the single features. The scarcity and unreliability of labeled. information were the common root of the second and third models proposed: when confronted to such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase robustness and quality of the description of data. Both solutions have been explored resulting into two methodological contributions, based respectively on active learning and semisupervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating outputs similarity into a model, opens new challenges and opportunities for remote sensing image processing.
Resumo:
BACKGROUND: To compare the prognostic relevance of Masaoka and Müller-Hermelink classifications. METHODS: We treated 71 patients with thymic tumors at our institution between 1980 and 1997. Complete follow-up was achieved in 69 patients (97%) with a mean follow up-time of 8.3 years (range, 9 months to 17 years). RESULTS: Masaoka stage I was found in 31 patients (44.9%), stage II in 17 (24.6%), stage III in 19 (27.6%), and stage IV in 2 (2.9%). The 10-year overall survival rate was 83.5% for stage I, 100% for stage IIa, 58% for stage IIb, 44% for stage III, and 0% for stage IV. The disease-free survival rates were 100%, 70%, 40%, 38%, and 0%, respectively. Histologic classification according to Müller-Hermelink found medullary tumors in 7 patients (10.1%), mixed in 18 (26.1%), organoid in 14 (20.3%), cortical in 11 (15.9%), well-differentiated thymic carcinoma in 14 (20.3%), and endocrine carcinoma in 5 (7.3%), with 10-year overall survival rates of 100%, 75%, 92%, 87.5%, 30%, and 0%, respectively, and 10-year disease-free survival rates of 100%, 100%, 77%, 75%, 37%, and 0%, respectively. Medullary, mixed, and well-differentiated organoid tumors were correlated with stage I and II, and well-differentiated thymic carcinoma and endocrine carcinoma with stage III and IV (p < 0.001). Multivariate analysis showed age, gender, myasthenia gravis, and postoperative adjuvant therapy not to be significant predictors of overall and disease-free survival after complete resection, whereas the Müller-Hermelink and Masaoka classifications were independent significant predictors for overall (p < 0.05) and disease-free survival (p < 0.004; p < 0.0001). CONCLUSIONS: The consideration of staging and histology in thymic tumors has the potential to improve recurrence prediction and patient selection for combined treatment modalities.
Resumo:
For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method that yet requires substantial computing power.
Resumo:
When dealing with multi-angular image sequences, problems of reflectance changes due either to illumination and acquisition geometry, or to interactions with the atmosphere, naturally arise. These phenomena interplay with the scene and lead to a modification of the measured radiance: for example, according to the angle of acquisition, tall objects may be seen from top or from the side and different light scatterings may affect the surfaces. This results in shifts in the acquired radiance, that make the problem of multi-angular classification harder and might lead to catastrophic results, since surfaces with the same reflectance return significantly different signals. In this paper, rather than performing atmospheric or bi-directional reflection distribution function (BRDF) correction, a non-linear manifold learning approach is used to align data structures. This method maximizes the similarity between the different acquisitions by deforming their manifold, thus enhancing the transferability of classification models among the images of the sequence.
Resumo:
For several years, the lack of consensus on definition, nomenclature, natural history, and biology of serrated polyps (SPs) of the colon has created considerable confusion among pathologists. According to the latest WHO classification, the family of SPs comprises hyperplastic polyps (HPs), sessile serrated adenomas/polyps (SSA/Ps), and traditional serrated adenomas (TSAs). The term SSA/P with dysplasia has replaced the category of mixed hyperplastic/adenomatous polyps (MPs). The present study aimed to evaluate the reproducibility of the diagnosis of SPs based on currently available diagnostic criteria and interactive consensus development. In an initial round, H&E slides of 70 cases of SPs were circulated among participating pathologists across Europe. This round was followed by a consensus discussion on diagnostic criteria. A second round was performed on the same 70 cases using the revised criteria and definitions according to the recent WHO classification. Data were evaluated for inter-observer agreement using Kappa statistics. In the initial round, for the total of 70 cases, a fair overall kappa value of 0.318 was reached, while in the second round overall kappa value improved to moderate (kappa = 0.557; p < 0.001). Overall kappa values for each diagnostic category also significantly improved in the final round, reaching 0.977 for HP, 0.912 for SSA/P, and 0.845 for TSA (p < 0.001). The diagnostic reproducibility of SPs improves when strictly defined, standardized diagnostic criteria adopted by consensus are applied.
Resumo:
Occasional XY recombination is a proposed explanation for the sex-chromosome homomorphy in European tree frogs. Numerous laboratory crosses, however, failed to detect any event of male recombination, and a detailed survey of NW-European Hyla arborea populations identified male-specific alleles at sex-linked loci, pointing to the absence of XY recombination in their recent history. Here, we address this paradox in a phylogeographic framework by genotyping sex-linked microsatellite markers in populations and sibships from the entire species range. Contrasting with postglacial populations of NW Europe, which display complete absence of XY recombination and strong sex-chromosome differentiation, refugial populations of the southern Balkans and Adriatic coast show limited XY recombination and large overlaps in allele frequencies. Geographically and historically intermediate populations of the Pannonian Basin show intermediate patterns of XY differentiation. Even in populations where X and Y occasionally recombine, the genetic diversity of Y haplotypes is reduced below the levels expected from the fourfold drop in copy numbers. This study is the first in which X and Y haplotypes could be phased over the distribution range in a species with homomorphic sex chromosomes; it shows that XY-recombination patterns may differ strikingly between conspecific populations, and that recombination arrest may evolve rapidly (<5000 generations).