970 results for hierarchical classification structures
Abstract:
We study the relationship between topological scales and dynamic time scales in complex networks. The analysis is based on the full dynamics towards synchronization of a system of coupled oscillators. In the synchronization process, modular structures corresponding to well-defined communities of nodes emerge in different time scales, ordered in a hierarchical way. The analysis also provides a useful connection between synchronization dynamics, complex networks topology, and spectral graph analysis.
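The link between community structure and synchronization time scales can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes identical Kuramoto-type phase oscillators with unit coupling, a hypothetical toy network of two 4-node cliques joined by a single edge, hand-picked initial phases, and simple Euler integration. It measures the mean phase correlation cos(θ_i − θ_j) for pairs inside a community and for pairs across communities.

```python
import math
from itertools import combinations

# Hypothetical toy network: two 4-node cliques joined by one bridge edge (3-4).
GROUP_A, GROUP_B = [0, 1, 2, 3], [4, 5, 6, 7]
EDGES = list(combinations(GROUP_A, 2)) + list(combinations(GROUP_B, 2)) + [(3, 4)]
ADJ = {i: [] for i in GROUP_A + GROUP_B}
for i, j in EDGES:
    ADJ[i].append(j)
    ADJ[j].append(i)

# Illustrative initial phases in radians (chosen by hand, not drawn at random).
THETA0 = [0.0, 0.3, 0.6, 0.9, 2.0, 2.3, 2.6, 2.9]

# Node pairs inside one community, and node pairs spanning both communities.
WITHIN = list(combinations(GROUP_A, 2)) + list(combinations(GROUP_B, 2))
BETWEEN = [(i, j) for i in GROUP_A for j in GROUP_B]


def simulate(adj, theta0, t_end, dt=0.01):
    """Euler-integrate d(theta_i)/dt = sum_j sin(theta_j - theta_i) over neighbours j."""
    theta = list(theta0)
    n = len(theta)
    for _ in range(int(round(t_end / dt))):
        rate = [sum(math.sin(theta[j] - theta[i]) for j in adj[i]) for i in range(n)]
        theta = [theta[i] + dt * rate[i] for i in range(n)]
    return theta


def mean_correlation(theta, pairs):
    """Average of cos(theta_i - theta_j) over the given node pairs (1.0 = fully in phase)."""
    return sum(math.cos(theta[i] - theta[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    for t_end in (1.0, 30.0):
        theta = simulate(ADJ, THETA0, t_end)
        print(t_end, mean_correlation(theta, WITHIN), mean_correlation(theta, BETWEEN))
```

In this toy setting the within-community correlation approaches 1 well before the cross-community correlation does, which is the kind of time-scale separation the abstract describes; larger networks with nested communities would show several such stages.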
Abstract:
This thesis is a compilation of projects studying the sediment processes that recharge debris flow channels. These works, conducted during my stay at the University of Lausanne, focus on the geological and morphological implications of torrent catchments in order to characterize debris supply, a fundamental element in predicting debris flows. Other aspects of sediment dynamics are considered, e.g. the coupling between headwaters and torrent, as well as the development of modeling software that simulates sediment transfer in torrent systems. The sediment activity at Manival, an active torrent system of the northern French Alps, was investigated using terrestrial laser scanning, supplemented with geostructural investigations and a survey of sediment transferred in the main torrent. A full year of sediment flux could be observed, which coincided with two debris flows and several bedload transport events. This study revealed that both debris flows were generated in the torrent and were preceded by a recharge of material from the headwaters. Debris production occurred mostly during winter and early spring and was caused by large slope failures. Sediment transfers were more puzzling, occurring almost exclusively in early spring, subordinated to runoff conditions, and in autumn during long rainfall. Intense rainstorms in summer did not affect debris storage, which seems to rely on the stability of debris deposits. The morpho-geological implication in debris supply was evaluated using DEM and field surveys. A slope angle-based classification of topography could characterize the mode of debris production and transfer. A slope stability analysis derived from the structures in the rock mass could assess susceptibility to failure. The modeled rockfall source areas included more than 97% of the recorded events, and the sediment budgets appeared to be correlated with the density of potential slope failures.
This work showed that the analysis of process-related terrain morphology and of susceptibility to slope failure documents the sediment dynamics and supports a quantitative assessment of the erosion zones leading to debris flow activity. The development of erosional landforms was evaluated by comparing their geometry with the orientations of potential rock slope failures and with the direction of the maximum joint frequency. Structures in the rock mass, in particular wedge failures and the dominant discontinuities, appear as a first-order control on the erosional mechanisms affecting bedrock-dominated catchments. They represent weaknesses that are exploited primarily by mass wasting processes and erosion, promoting not only the initiation of rock couloirs and gullies, but also their propagation. Incorporating the geological control into geomorphic processes contributes to a better understanding of the landscape evolution of active catchments. A sediment flux algorithm was implemented in a sediment cascade model that discretizes the torrent catchment into channel reaches and individual process-response systems. Each conceptual element includes, in a simple manner, geomorphological and sediment flux information derived from GIS and complemented with field mapping. This tool makes it possible to simulate sediment transfers in channels while considering evolving debris supply and conveyance, and helps reduce the uncertainty inherent in sediment budget prediction in torrent systems. This work very humbly aims to shed light on some aspects of sediment dynamics in torrent environments.
Abstract:
Expression data contribute significantly to the biological value of the sequenced human genome, providing extensive information about gene structure and the pattern of gene expression. ESTs, together with SAGE libraries and microarray experiment information, provide a broad and rich view of the transcriptome. However, it is difficult to perform large-scale expression mining of the data generated by these diverse experimental approaches. Not only are the data stored in disparate locations, but there is frequent ambiguity in the meaning of the terms used to describe the source of the material used in the experiment. Untangling semantic differences between the data provided by different resources is therefore largely reliant on the domain knowledge of a human expert. We present here eVOC, a system which associates labelled target cDNAs for microarray experiments, or cDNA libraries and their associated transcripts, with controlled terms in a set of hierarchical vocabularies. eVOC consists of four orthogonal controlled vocabularies suitable for describing the domains of human gene expression data: Anatomical System, Cell Type, Pathology and Developmental Stage. We have curated and annotated 7016 cDNA libraries represented in dbEST, as well as 104 SAGE libraries, with expression information, and provide this as an integrated, public resource that allows the linking of transcripts and libraries with expression terms. Both the vocabularies and the vocabulary-annotated libraries can be retrieved from http://www.sanbi.ac.za/evoc/. Several groups are involved in developing this resource with the aim of unifying transcript expression information.
Abstract:
A recent method used to optimize biased neural networks with low levels of activity is applied to a hierarchical model. As a consequence, the performance of the system is strongly enhanced. The steps to achieve optimization are analyzed in detail.
Abstract:
Advanced kernel methods for remote sensing image classification. Devis Tuia, Institut de Géomatique et d'Analyse du Risque, September 2009.
The technical developments of recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to users. However, even if these advances open more and more possibilities for the use of digital imagery, they also raise several problems of storage and processing. The latter is considered in this thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The accent is put on algorithmic efficiency and on the simplicity of the approaches proposed, to avoid overly complex models that would not be adopted by users.
The major challenge of the thesis is to remain close to concrete remote sensing problems without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities, and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed. The first is an adaptive model that learns the relevant image features, addressing the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the individual features (the spectral bands), which is optimized jointly with the base model. The scarcity and unreliability of labeled information were the common root of the second and third models: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase the robustness and quality of the description of the data. Both solutions have been explored, resulting in two methodological contributions based respectively on active learning and semi-supervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating output similarity into the model, a source of information not previously considered in remote sensing, opens new challenges and opportunities for remote sensing image processing.
Abstract:
River bifurcations are key nodes within braided river systems controlling the flow and sediment partitioning and therefore the dynamics of the river braiding process. Recent research has shown that certain geometrical configurations induce instabilities that lead to downstream mid-channel bar formation and the formation of bifurcations. However, we currently have a poor understanding of the flow division process within bifurcations and the flow dynamics in the downstream bifurcates, both of which are needed to understand bifurcation stability. This paper presents results of a numerical sensitivity experiment undertaken using computational fluid dynamics (CFD) with the purpose of understanding the flow dynamics of a series of idealized bifurcations. A geometric sensitivity analysis is undertaken for a range of channel slopes (0.005 to 0.03), bifurcation angles (22 degrees to 42 degrees) and a restricted set of inflow conditions based upon simulating flow through meander bends with different curvature on the flow field dynamics through the bifurcation. The results demonstrate that the overall slope of the bifurcation affects the velocity of flow through the bifurcation and when slope asymmetry is introduced, the flow structures in the bifurcation are modified. In terms of bifurcation evolution the most important observation appears to be that once slope asymmetry is greater than 0.2 the flow within the steep bifurcate shows potential instability and the potential for alternate channel bar formation. Bifurcation angle also defines the flow structures within the bifurcation with an increase in bifurcation angle increasing the flow velocity down both bifurcates. However, redistributive effects of secondary circulation caused by upstream curvature can very easily counter the effects of local bifurcation characteristics. Copyright (C) 2011 John Wiley & Sons, Ltd.
Abstract:
BACKGROUND: To compare the prognostic relevance of the Masaoka and Müller-Hermelink classifications. METHODS: We treated 71 patients with thymic tumors at our institution between 1980 and 1997. Complete follow-up was achieved in 69 patients (97%) with a mean follow-up time of 8.3 years (range, 9 months to 17 years). RESULTS: Masaoka stage I was found in 31 patients (44.9%), stage II in 17 (24.6%), stage III in 19 (27.6%), and stage IV in 2 (2.9%). The 10-year overall survival rate was 83.5% for stage I, 100% for stage IIa, 58% for stage IIb, 44% for stage III, and 0% for stage IV. The disease-free survival rates were 100%, 70%, 40%, 38%, and 0%, respectively. Histologic classification according to Müller-Hermelink found medullary tumors in 7 patients (10.1%), mixed in 18 (26.1%), organoid in 14 (20.3%), cortical in 11 (15.9%), well-differentiated thymic carcinoma in 14 (20.3%), and endocrine carcinoma in 5 (7.3%), with 10-year overall survival rates of 100%, 75%, 92%, 87.5%, 30%, and 0%, respectively, and 10-year disease-free survival rates of 100%, 100%, 77%, 75%, 37%, and 0%, respectively. Medullary, mixed, and well-differentiated organoid tumors were correlated with stages I and II, and well-differentiated thymic carcinoma and endocrine carcinoma with stages III and IV (p < 0.001). Multivariate analysis showed age, gender, myasthenia gravis, and postoperative adjuvant therapy not to be significant predictors of overall and disease-free survival after complete resection, whereas the Müller-Hermelink and Masaoka classifications were independent significant predictors of overall (p < 0.05) and disease-free survival (p < 0.004; p < 0.0001). CONCLUSIONS: The consideration of staging and histology in thymic tumors has the potential to improve recurrence prediction and patient selection for combined treatment modalities.
Abstract:
Preservation of mangroves, a very significant ecosystem from a social, economic, and environmental viewpoint, requires knowledge of soil composition, genesis, morphology, and classification. These aspects are of paramount importance for understanding the dynamics of sustainability and preservation of this natural resource. In this study, mangrove soils in the Subaé river basin were described and classified, and inorganic waste concentrations were evaluated. Seven pedons of mangrove soil were chosen, five under fluvial influence and two under marine influence, and analyzed for morphology. Samples of horizons and layers were collected for physical and chemical analyses, including heavy metals (Pb, Cd, Mn, Zn, and Fe). The moist soils were suboxic, with Eh values below 350 mV. The pH of the pedons under fluvial influence ranged from moderately acid to alkaline, while the pH in pedons under marine influence was around 7.0 throughout the profile. The concentration of cations in the sorption complex for all pedons, independent of fluvial or marine influence, followed the order Na+ > Mg2+ > Ca2+ > K+. Mangrove soils from the Subaé river basin under fluvial and marine influence had different morphological, physical, and chemical characteristics. The highest Pb and Cd concentrations were found in the pedons under fluvial influence, perhaps due to their closeness to the mining company Plumbum, while the concentrations in pedon P7 were the lowest, due to its greater distance from the factory. Because they contained at least one metal above the reference levels established by the National Oceanic and Atmospheric Administration (United States Environmental Protection Agency), the pedons were classified as potentially toxic.
The soils were classified as Gleissolos Tiomórficos Órticos (sálicos) sódico neofluvissólico according to the Brazilian Soil Classification System, indicating potential toxicity and very poor drainage, except for pedon P7, which was classified in the same subgroup as the others but differed in that its metal concentrations met acceptable standards.
Abstract:
As part of its 2006 systemic evaluation of DOC’s facilities, operations and programming, the Durrant/PBA consulting group found several shortcomings with the Department’s inmate custody classification system. Specifically, the consultants found that the system:
Abstract:
MOTIVATION: Analysis of millions of pyro-sequences is currently playing a crucial role in the advance of environmental microbiology. Taxonomy-independent, i.e. unsupervised, clustering of these sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought after qualities, but have thus far largely been overlooked. RESULTS: More than 1 million hyper-variable internal transcribed spacer 1 (ITS1) sequences of fungal origin have been analyzed. The ITS1 sequences were first properly extracted from 454 reads using generalized profiles. Then, otupipe, cd-hit-454, ESPRIT-Tree and DBC454, a new algorithm presented here, were used to analyze the sequences. A numerical assay was developed to measure the reproducibility and robustness of these algorithms. DBC454 was the most robust, closely followed by ESPRIT-Tree. DBC454 features density-based hierarchical clustering, which complements the other methods by providing insights into the structure of the data. AVAILABILITY: An executable is freely available for non-commercial users at ftp://ftp.vital-it.ch/tools/dbc454. It is designed to run under MPI on a cluster of 64-bit Linux machines running Red Hat 4.x, or on a multi-core OSX system. CONTACT: dbc454@vital-it.ch or nicolas.guex@isb-sib.ch.
Abstract:
We present a heuristic method for learning error correcting output codes matrices based on a hierarchical partition of the class space that maximizes a discriminative criterion. To achieve this goal, the optimal codeword separation is sacrificed in favor of a maximum class discrimination in the partitions. The creation of the hierarchical partition set is performed using a binary tree. As a result, a compact matrix with high discrimination power is obtained. Our method is validated using the UCI database and applied to a real problem, the classification of traffic sign images.
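A tree-based coding matrix of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: each class is summarized by a single hypothetical 1-D centroid, and the split at each tree node is chosen at the largest gap between sorted centroids, which stands in for the discriminative criterion of the paper (not reproduced here). Each internal node of the binary tree yields one ternary column: +1 for classes in the left subset, -1 for the right subset, and 0 for classes not involved in that split.

```python
def build_tree_ecoc(centroids):
    """Build a ternary ECOC matrix from a recursive binary partition of the classes.

    centroids: dict mapping class label -> float (illustrative 1-D class mean).
    Returns a dict mapping class label -> codeword (list of +1, -1 or 0).
    """
    classes = sorted(centroids, key=centroids.get)
    columns = []  # one dict per internal tree node: class -> +1 / -1

    def split(group):
        if len(group) < 2:
            return  # a single class is a leaf: no further dichotomy
        # Stand-in criterion: cut at the largest gap between neighbouring centroids.
        gaps = [centroids[group[k + 1]] - centroids[group[k]]
                for k in range(len(group) - 1)]
        cut = gaps.index(max(gaps)) + 1
        left, right = group[:cut], group[cut:]
        columns.append({c: (1 if c in left else -1) for c in group})
        split(left)
        split(right)

    split(classes)
    # Classes that do not take part in a given dichotomy are coded 0.
    return {c: [col.get(c, 0) for col in columns] for c in classes}


if __name__ == "__main__":
    matrix = build_tree_ecoc({"a": 0.0, "b": 1.0, "c": 5.0, "d": 5.5})
    for label, codeword in matrix.items():
        print(label, codeword)
```

For N classes the tree has N - 1 internal nodes, so the matrix has N - 1 columns, which is what makes the code compact compared with one-versus-one designs.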
Abstract:
This dissertation is concerned with the development of algorithmic methods for the unsupervised learning of natural language morphology, using a symbolically transcribed wordlist. It focuses on the case of languages approaching the introflectional type, such as Arabic or Hebrew. The morphology of such languages is traditionally described in terms of discontinuous units: consonantal roots and vocalic patterns. Inferring this kind of structure is a challenging task for current unsupervised learning systems, which generally operate with continuous units. In this study, the problem of learning root-and-pattern morphology is divided into a phonological and a morphological subproblem. The phonological component of the analysis seeks to partition the symbols of a corpus (phonemes, letters) into two subsets that correspond well with the phonetic definition of consonants and vowels; building on this result, the morphological component attempts to establish the list of roots and patterns in the corpus and to infer the rules that govern their combinations.
We assess the extent to which this can be done on the basis of two hypotheses: (i) the distinction between consonants and vowels can be learned by observing their tendency to alternate in speech; (ii) roots and patterns can be identified as sequences of the previously discovered consonants and vowels respectively. The proposed algorithm uses a purely distributional method for partitioning symbols. It then applies analogical principles to identify a preliminary set of reliable roots and patterns, and to gradually enlarge it. This extension process is guided by an evaluation procedure based on the minimum description length principle, in line with the approach to morphological learning embodied in LINGUISTICA (Goldsmith, 2001). The algorithm is implemented as a computer program named ARABICA; it is evaluated with regard to its ability to account for the system of plural formation in a corpus of Arabic nouns. This thesis shows that complex linguistic structures can be discovered without recourse to a rich set of a priori hypotheses about the phenomena under consideration. It illustrates the possible synergy between learning mechanisms operating at distinct levels of linguistic description, and attempts to determine where and why such cooperation fails. It concludes that the tension between the universality of the consonant-vowel distinction and the specificity of root-and-pattern structure is crucial for understanding the advantages and weaknesses of this approach.
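Hypothesis (i), that consonants and vowels can be separated by their tendency to alternate, can be illustrated with Sukhotin's classic algorithm. This is a swapped-in, well-known technique used here purely for illustration; the abstract does not say that ARABICA uses it. The function name and the toy wordlist are hypothetical. The algorithm counts how often each pair of distinct symbols is adjacent, then repeatedly promotes the symbol with the highest remaining adjacency sum to "vowel" and discounts its neighbours.

```python
from collections import defaultdict


def sukhotin_vowels(words):
    """Return the set of symbols classified as vowels by Sukhotin's algorithm."""
    counts = defaultdict(lambda: defaultdict(int))  # symmetric adjacency counts
    symbols = set()
    for word in words:
        symbols.update(word)
        for left, right in zip(word, word[1:]):
            if left != right:  # the diagonal of the matrix stays at zero
                counts[left][right] += 1
                counts[right][left] += 1

    # Every symbol starts as a consonant; its score is its row sum.
    sums = {s: sum(counts[s].values()) for s in symbols}
    vowels = set()
    while True:
        consonants = [s for s in symbols if s not in vowels]
        if not consonants:
            break
        best = max(consonants, key=lambda s: sums[s])
        if sums[best] <= 0:
            break  # no remaining symbol behaves like a vowel
        vowels.add(best)
        # Neighbours of a vowel are likely consonants: lower their scores.
        for s in consonants:
            if s != best:
                sums[s] -= 2 * counts[s][best]
    return vowels


if __name__ == "__main__":
    print(sorted(sukhotin_vowels(["banana", "kataba", "kitab"])))
```

On this toy wordlist the symbols `a` and `i` come out as vowels. Real corpora are noisier, and the dissertation's own partitioning method may differ substantially from this sketch.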
Abstract:
Map produced by Iowa Department of Transportation of System Classification.
Abstract:
This text is a "right of reply" by the authors of the article « Vers un naturalisme social. À la croisée des sciences sociales et des sciences cognitives », published by SociologieS in October 2011, to the debate that it sparked. After a brief clarification of the form of the debate itself, and of the specific disagreements between the various protagonists, the article responds to the entirely legitimate concerns and to the substantive questions raised by social naturalism.
Abstract:
In Switzerland there is a strong movement at the national policy level towards strengthening patient rights and patient involvement in health care decisions. Yet there is no national programme promoting shared decision making (SDM). The first decision support tools for the counselling process (prenatal diagnosis and screening) have been developed and implemented. Although Swiss doctors acknowledge that shared decision making is important, hierarchical structures and asymmetric physician-patient relationships still prevail. Recent years have seen some promising activities regarding the training of medical students and the development of patient support programmes. Swiss direct democracy and the habit of consensual decision making and citizen involvement in general may provide fertile ground for SDM development in the primary care setting.