980 results for Labeling hierarchical clustering


Relevance:

80.00%

Publisher:

Abstract:

Proceedings of the 3rd International Conference on Fractional Systems and Signals, Ghent, Belgium

Relevance:

80.00%

Publisher:

Abstract:

Situated within the fields of Computer-Assisted Reading and Text Analysis (LATAO), Electronic Document Management (GÉD), information visualization and, in part, anthropology, this exploratory research tests a descriptive text-mining methodology for thematically mapping a corpus of anthropological texts. More specifically, we evaluate agglomerative hierarchical clustering (CHA, classification hiérarchique ascendante) as a way to extract and analyse the themes found in abstracts of master's theses and doctoral dissertations awarded from 1985 to 2009 (1240 abstracts) by the anthropology departments of the Université de Montréal and Université Laval, as well as the history department of Université Laval (for the archaeological and ethnological abstracts). In the first part of the thesis, we present our theoretical framework: we explain what text mining is, its origins, its applications and its methodological steps, and we close with a review of the main publications. The second part is devoted to the methodological framework and covers the steps through which the project was carried out: data collection, linguistic filtering and automatic classification, to name only a few. Finally, in the last part, we present the results of our research, focusing in particular on two experiments. We also discuss thematic navigation and conceptual approaches to thematization, for example, in anthropology, the culture/biology dichotomy. We end with the limits of this project and avenues of interest for future research.

Relevance:

80.00%

Publisher:

Abstract:

In a hydraulic turbine, the rotation of the blades in the water creates a low-pressure zone, causing the water to pass from the liquid to the gaseous state. This phase-change phenomenon is called cavitation and is similar to boiling. When the vapour cavities thus formed implode near the walls, severe erosion of the materials results, significantly accelerating the degradation of the turbine. A system for detecting cavitation erosion from vibration measurements, usable on turbines in operation, was therefore installed on four turbine-generator units of a power plant; it makes it possible to estimate precisely the erosion rate in kg/10 000 h. The present project has two main objectives. First, to study the behaviour of cavitation on a target turbine-generator unit and to build a statistical model that predicts the cavitation variable as a function of the operating variables (such as wicket gate opening, flow rate, upstream and downstream levels, etc.). Second, to develop a methodology that allows the study to be reproduced at other sites. A retrospective study will be carried out, concentrating on the data available since the system was updated in 2010. Preliminary results have highlighted the heterogeneity of the cavitation behaviour as well as changes in the relationship between cavitation and various operating variables. We propose to develop a suitable probabilistic model, using in particular hierarchical clustering and multiple linear regression models.

Relevance:

80.00%

Publisher:

Abstract:

Besides the spinal deformity, scoliosis notably modifies the general appearance of the trunk, resulting in trunk rotation, imbalance, and asymmetries that constitute patients' major concern. Existing classifications of scoliosis, based on the type of spinal curve as depicted on radiographs, are currently used to guide treatment strategies. Unfortunately, even when a perfect correction of the spinal curve is achieved, some trunk deformities remain, leaving patients dissatisfied with the treatment received. The purpose of this study is to identify possible shape patterns of trunk surface deformity associated with scoliosis. First, the trunk surface is represented by a multivariate functional trunk shape descriptor based on 3-D clinical measurements computed on cross sections of the trunk. Then, the classical formulation of hierarchical clustering is adapted to the case of multivariate functional data and applied to a set of 236 trunk surface 3-D reconstructions. The highest internal validity is obtained when considering 11 clusters that explain up to 65% of the variance in our dataset. Our clustering result shows a concordance with the radiographic classification of spinal curves in 68% of the cases. As opposed to radiographic evaluation, the trunk descriptor is 3-D and its functional nature offers a compact and elegant description of not only the type, but also the severity and extent of the trunk surface deformity along the trunk length. In future work, new management strategies based on the resulting trunk shape patterns could be devised in order to improve the esthetic outcome after treatment, and thus patient satisfaction.

Relevance:

80.00%

Publisher:

Abstract:

The diversity of social bees was assessed at 15 sites across five locations of the Nilgiri Biosphere Reserve, Western Ghats, India, from January to December 2007. We also conducted floristic analyses of local vegetation in each site using one-hectare sample plots. All woody species with a dbh (diameter at breast height) : 30 cm were recorded within the plots. A total area of 9.72 ha was assessed for floristic composition. Similarity of floristic composition between sites was determined using the Jaccard distance measure, and a dendrogram was constructed from the hierarchical clustering of floristic dissimilarities between sites. A Bee Importance Index (BII) was developed to give a measure of the bee diversity at each site. This index was the sum of the species richness of bees in a site and their visitation frequencies to flowers, calculated as mean flower visits per hour within 2 focal patches within one-hectare plots. The visits of bee species to flowers were also recorded. The Jaccard distance measure indicated that the montane sites were quite dissimilar to the low-elevation sites in floristic diversity. The BII was 7-9 for the wet forest sites and ranged from 4-6 for drier forest sites. Seventy-three plant species were identified as social bee plants; of these, 45% were visited by one species of bee, 37% by two bee species and 18% by more than two bee species, indicating a certain degree of floral specialization among bees.
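As an illustration of the site-comparison step described in this abstract, the following minimal sketch computes Jaccard distances between species lists and merges sites by naive agglomerative clustering. It is not the authors' procedure: the average-linkage rule, the function names `jaccard_distance` and `average_linkage`, and the three species lists are assumptions invented for the example.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two species sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(sites):
    """Naive agglomerative clustering; returns the merge history.

    Average linkage is an assumption here: the abstract does not
    say which linkage rule was used.
    """
    clusters = {name: [name] for name in sites}
    merges = []
    while len(clusters) > 1:
        def dist(pair):
            x, y = pair
            ds = [jaccard_distance(sites[i], sites[j])
                  for i in clusters[x] for j in clusters[y]]
            return sum(ds) / len(ds)

        # Merge the two closest clusters and record the merge height.
        x, y = min(combinations(sorted(clusters), 2), key=dist)
        merges.append((x, y, round(dist((x, y)), 3)))
        clusters[x + "+" + y] = clusters.pop(x) + clusters.pop(y)
    return merges


# Hypothetical species lists, invented for illustration only.
sites = {
    "montane": {"sp1", "sp2", "sp3"},
    "wet": {"sp4", "sp5", "sp6"},
    "dry": {"sp4", "sp5", "sp7"},
}
merges = average_linkage(sites)
```

Each entry of `merges` holds the two clusters joined and the distance at which they were joined, which is the information a dendrogram draws.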

Relevance:

80.00%

Publisher:

Abstract:

K-Means is a popular clustering algorithm which adopts an iterative refinement procedure to determine data partitions and to compute their associated centres of mass, called centroids. The straightforward implementation of the algorithm is often referred to as 'brute force' since it computes a proximity measure from each data point to each centroid at every iteration of the K-Means process. Efficient implementations of the K-Means algorithm have been predominantly based on multi-dimensional binary search trees (KD-Trees). Combining an efficient data structure with geometrical constraints reduces the number of distance computations required at each iteration. In this work we present a general space partitioning approach for improving the efficiency and the scalability of the K-Means algorithm. We propose to adopt approximate hierarchical clustering methods to generate binary space partitioning trees, in contrast to KD-Trees. In the experimental analysis, we have tested the performance of the proposed Binary Space Partitioning K-Means (BSP-KM) when a divisive clustering algorithm is used. We have carried out extensive experimental tests to compare the proposed approach to the one based on KD-Trees (KD-KM) over a wide range of the parameter space. BSP-KM is more scalable than KD-KM, while keeping the deterministic nature of the 'brute force' algorithm. In particular, the proposed space partitioning approach has been shown to overcome the well-known limitation of KD-Trees in high-dimensional spaces and can also be adopted to improve the efficiency of other algorithms in which KD-Trees have been used.
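The 'brute force' baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard assign-then-update iteration, not the BSP-KM or KD-KM implementations from the paper; the function name `kmeans_brute_force`, the squared Euclidean distance, the handling of empty clusters and the sample points are all assumptions.

```python
def kmeans_brute_force(points, centroids, iterations=10):
    """Brute-force K-Means: every point is compared with every centroid."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2
                                  for p, c in zip(point, centroids[k])))
            for point in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D points and starting centroids, for illustration only.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centres = kmeans_brute_force(points, [(0.0, 0.0), (10.0, 10.0)])
```

The assignment step costs one distance computation per point per centroid at every iteration, which is the cost that tree-based variants such as those compared in the abstract aim to reduce. For fixed starting centroids the result is deterministic.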

Relevance:

80.00%

Publisher:

Abstract:

The polar winter stratospheric vortex is a coherent structure that undergoes different types of deformation that can be revealed by the geometric invariant moments. Three moments are used—the aspect ratio, the centroid latitude, and the area of the vortex based on stratospheric data from the 40-yr ECMWF Re-Analysis (ERA-40) project—to study sudden stratospheric warmings. Hierarchical clustering combined with data image visualization techniques is used as well. Using the gap statistic, three optimal clusters are obtained based on the three geometric moments considered here. The 850-K potential vorticity field, as well as the vertical profiles of polar temperature and zonal wind, provides evidence that the clusters represent, respectively, the undisturbed (U), displaced (D), and split (S) states of the polar vortex. This systematic method for identifying and characterizing the state of the polar vortex using objective methods is useful as a tool for analyzing observations and as a test for climate models to simulate the observations. The method correctly identifies all previously identified major warmings and also identifies significant minor warmings where the atmosphere is substantially disturbed but does not quite meet the criteria to qualify as a major stratospheric warming.

Relevance:

80.00%

Publisher:

Abstract:

The presence of 10 virulence genes was examined using polymerase chain reaction (PCR) in 365 European O157 and non-O157 Escherichia coli isolates associated with verotoxin production. Strain-specific PCR data were analysed using hierarchical clustering. The resulting dendrogram clearly separated O157 from non-O157 strains. The former clustered typical high-risk seropathotype (SPT) A strains from all regions, including Sweden and Spain, which were homogeneous by Cramer's V statistic, and strains with less typical O157 features, mostly from Hungary. The non-O157 strains divided into a high-risk SPT B group harbouring O26, O111 and O103 strains, a group pathogenic to pigs, and a group with few virulence genes other than for verotoxin. The data demonstrate that SPT designation and the selected PCR separated verotoxigenic E. coli of high and low risk to humans, although more virulence genes or pulsed-field gel electrophoresis will need to be included to separate high-risk strains further for epidemiological tracing.

Relevance:

80.00%

Publisher:

Abstract:

The urban heat island is a well-known phenomenon that impacts a wide variety of city operations. With greater availability of cheap meteorological sensors, it is possible to measure the spatial patterns of urban atmospheric characteristics with greater resolution. To develop robust and resilient networks, recognizing that sensors may malfunction, it is important to know when measurement points are providing additional information, and also the minimum number of sensors needed to provide spatial information for particular applications. Here we consider the example of temperature data, and the urban heat island, through analysis of a network of sensors in the Tokyo metropolitan area (Extended METROS). The effect of reducing observation points from an existing meteorological measurement network is considered, using random sampling and sampling with clustering. The results indicate that sampling with hierarchical clustering can yield similar temperature patterns with up to a 30% reduction in measurement sites in Tokyo. The methods presented have broader utility in evaluating the robustness and resilience of existing urban temperature networks and in showing how networks can be enhanced by new mobile and open data sources.

Relevance:

80.00%

Publisher:

Abstract:

Mesenchymal stem cells (MSC) are multipotent cells which can be obtained from several adult and fetal tissues, including human umbilical cord units. We have recently shown that umbilical cord tissue (UC) is richer in MSC than umbilical cord blood (UCB), but their origin and characteristics in blood as compared to the cord remain unknown. Here we compared, for the first time, the exonic protein-coding and intronic noncoding RNA (ncRNA) expression profiles of MSC from match-paired UC and UCB samples, harvested from the same donors and processed simultaneously under the same culture conditions. The patterns of intronic ncRNA expression in MSC from UC and UCB paired units were highly similar, indicative of their common donor origin. The respective exonic protein-coding transcript expression profiles, however, were significantly different. Hierarchical clustering based on protein-coding expression similarities grouped MSC according to their tissue location rather than original donor. Genes related to systems development, osteogenesis and the immune system were expressed at higher levels in UCB, whereas genes related to cell adhesion, morphogenesis, secretion, angiogenesis and neurogenesis were more highly expressed in UC cells. These molecular differences in tissue-specific MSC gene expression may reflect functional activities influenced by distinct niches and should be considered when developing clinical protocols involving MSC from different sources. In addition, these findings reinforce our previous suggestion on the importance of banking the whole umbilical cord unit for research or future therapeutic use.

Relevance:

80.00%

Publisher:

Abstract:

Development of polarized immune responses controls resistance and susceptibility to many microorganisms. However, studies of several infectious, allergic, and autoimmune diseases have shown that chronic type-1 and type-2 cytokine responses can also cause significant morbidity and mortality if left unchecked. We used mouse cDNA microarrays to molecularly phenotype the gene expression patterns that characterize two disparate but equally lethal forms of liver pathology that develop in Schistosoma mansoni infected mice polarized for type-1 and type-2 cytokine responses. Hierarchical clustering analysis identified at least three groups of genes associated with a polarized type-2 response and two linked with an extreme type-1 cytokine phenotype. Predictions about liver fibrosis, apoptosis, and granulocyte recruitment and activation generated by the microarray studies were confirmed later by traditional biological assays. The data show that cDNA microarrays are useful not only for determining coordinated gene expression profiles but are also highly effective for molecularly “fingerprinting” diseased tissues. Moreover, they illustrate the potential of genome-wide approaches for generating comprehensive views on the molecular and biochemical mechanisms regulating infectious disease pathogenesis.

Relevance:

80.00%

Publisher:

Abstract:

The Information Bottleneck method aims to extract a compact representation which preserves the maximum relevant information. The sub-optimality of the agglomerative Information Bottleneck (aIB) algorithm restricts the applications of the Information Bottleneck method. In this paper, the concept of density-based chains is adopted to evaluate the information loss among the neighbors of an element, rather than the information loss between pairs of elements. The DaIB algorithm is then presented to alleviate the sub-optimality problem in aIB while keeping the useful hierarchical clustering tree structure. Experimental results on benchmark data sets show that the DaIB algorithm preserves more relevant information and achieves higher precision than the aIB algorithm, and a paired t-test indicates that these improvements are statistically significant.

Relevance:

80.00%

Publisher:

Abstract:

Biologically, the human brain processes information in both unimodal and multimodal ways. In fact, information is progressively abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has exponentially produced various sources of data, which could be likened to the multimodality of the human brain. This is therefore an inspiration to develop a methodology for exploring multimodal data and further identifying multi-view patterns. Specifically, we propose a brain-inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.

Relevance:

80.00%

Publisher:

Abstract:

The human brain processes information in both unimodal and multimodal fashion, where information is progressively captured, accumulated, abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has produced various sources of electronic data and continues to do so exponentially. Finding patterns in such multi-source and multimodal data could be compared to the multimodal and multidimensional information processing in the human brain. Therefore, such brain functionality could be taken as an inspiration to develop a methodology for exploring multimodal and multi-source electronic data and further identifying multi-view patterns. In this paper, we first propose a brain-inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. Secondly, we present a cluster-driven approach for the implementation of the proposed brain-inspired model. In particular, the Growing Self Organising Maps (GSOM) based cross-clustering approach is discussed. Furthermore, the acquisition of multi-view patterns with the cluster-driven implementation is demonstrated with experimental results.

Relevance:

80.00%

Publisher:

Abstract:

Humans perceive entities such as objects, patterns, events, etc. as concepts, which are the basic units in human intelligence and communications. In addition, perceptions of these entities could be abstracted and generalised at multiple levels of granularity. In particular, such granulation allows the formation and usage of concepts in human intelligence. Such natural granularity in human intelligence could inspire and motivate the design and development of pattern identification approaches in Data Mining. In our opinion, a pattern could be perceived at multiple levels of granularity, and thus we advocate the co-existence of hierarchy and granularity. In addition, granular patterns exist across different sources of data (multimodality). In this paper, we present a cognitive model that incorporates the characteristics of Hierarchy, Granularity and Multimodality for multi-view pattern identification in the crime domain. The framework is implemented with Growing Self Organising Maps (GSOM), and some experimental results are presented and discussed.