853 results for height partition clustering
Abstract:
The pubertal height growth spurt is a distinctive feature of childhood growth reflecting both the central onset of puberty and local growth factors. Although little is known about the underlying genetics, growth variability during puberty correlates with adult risks for hormone-dependent cancer and adverse cardiometabolic health. The only gene so far associated with pubertal height growth, LIN28B, pleiotropically influences childhood growth, puberty and cancer progression, pointing to shared underlying mechanisms. To discover genetic loci influencing pubertal height and growth and to place them in context of overall growth and maturation, we performed genome-wide association meta-analyses in 18 737 European samples utilizing longitudinally collected height measurements. We found significant associations (P < 1.67 × 10⁻⁸) at 10 loci, including LIN28B. Five loci associated with pubertal timing, all impacting multiple aspects of growth. In particular, a novel variant correlated with expression of MAPK3, and associated both with increased prepubertal growth and earlier menarche. Another variant near ADCY3-POMC associated with increased body mass index, reduced pubertal growth and earlier puberty. Whereas epidemiological correlations suggest that early puberty marks a pathway from rapid prepubertal growth to reduced final height and adult obesity, our study shows that individual loci associating with pubertal growth have variable longitudinal growth patterns that may differ from epidemiological observations. Overall, this study uncovers part of the complex genetic architecture linking pubertal height growth, the timing of puberty and childhood obesity and provides new information to pinpoint processes linking these traits.
Abstract:
This paper presents the predicted flow dynamics from the application of a Reynolds-averaged Navier-Stokes model to a series of bifurcation geometries with morphologies measured during previous flume experiments. The topography of the bifurcations consists of either plane or bedform-dominated beds which may or may not possess discordance between the two bifurcation distributaries. Numerical predictions are compared with experimental results to assess the ability of the numerical model to reproduce the division of flow into the bifurcation distributaries. The hydrodynamic model predicts: (1) diverting fluxes in the upstream channel which direct water into the distributaries; (2) super-elevation of the free surface induced at the bifurcation edge by pressure differences; and (3) counter-rotating secondary circulation cells which develop upstream of the apex of the bifurcation and move into the downstream channels, with water converging at the surface and diverging at the bed. When bedforms are not present, weak transversal fluxes characterize the upstream channel for almost its entire length, associated with clearly distinguishable secondary circulation cells, although these may be under-estimated by the turbulence model used in the solution. In the bedform-dominated case, the same hydrodynamic conditions were not observed, with the bifurcation influence restricted and depth-scale secondary circulation cells not forming. The results also demonstrate the dominant effect bed discordance has upon flow division between the two distributaries. Finally, results indicate that in bedform-dominated rivers. Consequently, we suggest that sand-bed river bifurcations are more likely to have an influence that extends much further upstream and have a greater impact upon water distribution. This may contribute to observed morphological differences between sand-bedded and gravel-bedded braided river networks. Copyright (C) 2012 John Wiley & Sons, Ltd.
Abstract:
We develop a full theoretical approach to clustering in complex networks. A key concept is introduced, the edge multiplicity, that measures the number of triangles passing through an edge. This quantity extends the clustering coefficient in that it involves the properties of two (and not just one) vertices. The formalism is completed with the definition of a three-vertex correlation function, which is the fundamental quantity describing the properties of clustered networks. The formalism suggests different metrics that are able to thoroughly characterize transitive relations. A rigorous analysis of several real networks, which makes use of this formalism and the metrics, is also provided. It is also found that clustered networks can be classified into two main groups: the weak and the strong transitivity classes. In the first class, edge multiplicity is small, with triangles being disjoint. In the second class, edge multiplicity is high and so triangles share many edges. As we shall see in the following paper, the class a network belongs to has strong implications in its percolation properties.
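As an illustration of the edge multiplicity described in the abstract above, the following is a minimal sketch that counts, for each edge of a small undirected graph, the number of triangles passing through it (the number of neighbors shared by its two endpoints). The function name `edge_multiplicity` and the toy edge list are assumptions made for this example; they are not taken from the paper.

```python
def edge_multiplicity(edges):
    """Map each undirected edge (u, v), u < v, to the number of triangles through it."""
    # Build adjacency sets, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    multiplicity = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Each common neighbor of u and v closes one triangle on edge (u, v).
                multiplicity[(u, v)] = len(adj[u] & adj[v])
    return multiplicity


# Toy graph (hypothetical): two triangles, 0-1-2 and 1-2-3, sharing the edge (1, 2).
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3)]
toy_multiplicity = edge_multiplicity(toy_edges)
for edge, m in sorted(toy_multiplicity.items()):
    print(edge, m)
```

In this toy graph the shared edge has multiplicity 2 and every other edge has multiplicity 1. In the abstract's terms, a network where most edges look like the shared one would fall toward the strong transitivity class.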
Abstract:
The percolation properties of clustered networks are analyzed in detail. In the case of weak clustering, we present an analytical approach that allows us to find the critical threshold and the size of the giant component. Numerical simulations confirm the accuracy of our results. In more general terms, we show that weak clustering hinders the onset of the giant component whereas strong clustering favors its appearance. This is a direct consequence of the differences in the k-core structure of the networks, which are found to be totally different depending on the level of clustering. An empirical analysis of a real social network confirms our predictions.
Abstract:
We present a generator of random networks where both the degree-dependent clustering coefficient and the degree distribution are tunable. Following the same philosophy as in the configuration model, the degree distribution and the clustering coefficient for each class of nodes of degree k are fixed ad hoc and a priori. The algorithm generates corresponding topologies by applying first a closure of triangles and second the classical closure of remaining free stubs. The procedure unveils a universal relation between clustering and degree-degree correlations for all networks, where the level of assortativity establishes an upper limit to the level of clustering. Maximum assortativity ensures no restriction on the decay of the clustering coefficient whereas disassortativity sets a stronger constraint on its behavior. Correlation measures in real networks are seen to observe this structural bound.
Abstract:
PURPOSE: To objectively characterize different heart tissues from functional and viability images provided by composite-strain-encoding (C-SENC) MRI. MATERIALS AND METHODS: C-SENC is a new MRI technique for simultaneously acquiring cardiac functional and viability images. In this work, an unsupervised multi-stage fuzzy clustering method is proposed to identify different heart tissues in the C-SENC images. The method is based on sequential application of the fuzzy c-means (FCM) and iterative self-organizing data (ISODATA) clustering algorithms. The proposed method is tested on simulated heart images and on images from nine patients with and without myocardial infarction (MI). The resulting clustered images are compared with MRI delayed-enhancement (DE) viability images for determining MI. Also, Bland-Altman analysis is conducted between the two methods. RESULTS: Normal myocardium, infarcted myocardium, and blood are correctly identified using the proposed method. The clustered images correctly identified 90 +/- 4% of the pixels defined as infarct in the DE images. In addition, 89 +/- 5% of the pixels defined as infarct in the clustered images were also defined as infarct in DE images. The Bland-Altman results show no bias between the two methods in identifying MI. CONCLUSION: The proposed technique allows for objectively identifying divergent heart tissues, which would be potentially important for clinical decision-making in patients with MI.
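The fuzzy c-means (FCM) stage named in the abstract above can be sketched on one-dimensional pixel intensities as follows. This is a textbook FCM loop, not the authors' multi-stage method: the ISODATA stage is omitted, and the function name `fuzzy_c_means_1d`, the fuzzifier `m = 2`, the initial centers and the toy intensities are all assumptions made for illustration.

```python
def fuzzy_c_means_1d(values, centers, m=2.0, iters=50):
    """Plain fuzzy c-means on scalar data; returns (centers, memberships)."""
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # A point sitting exactly on a center belongs fully to that center.
                hit = dists.index(0.0)
                row = [1.0 if k == hit else 0.0 for k in range(len(centers))]
            else:
                row = [1.0 / sum((dk / dj) ** exponent for dj in dists) for dk in dists]
            memberships.append(row)
        # Center update: mean of the data weighted by u_ik^m.
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            centers[k] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships


# Hypothetical normalized intensities: a dark group and a bright group.
intensities = [0.10, 0.15, 0.20, 0.90, 0.95, 1.00]
fcm_centers, fcm_memberships = fuzzy_c_means_1d(intensities, centers=[0.0, 1.0])
print(fcm_centers)
```

Each pixel receives a graded membership in every cluster rather than a hard label. That grading is what allows a later stage to merge or split clusters before a final tissue label is assigned.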
Abstract:
Background: The trithorax group (trxG) and Polycomb group (PcG) proteins are responsible for the maintenance of stable transcriptional patterns of many developmental regulators. They bind to specific regions of DNA and direct the post-translational modifications of histones, playing a role in the dynamics of chromatin structure. Results: We have performed genome-wide expression studies of trx and ash2 mutants in Drosophila melanogaster. Using computational analysis of our microarray data, we have identified 25 clusters of genes potentially regulated by TRX. Most of these clusters consist of genes that encode structural proteins involved in cuticle formation. This organization appears to be a distinctive feature of the regulatory networks of TRX and other chromatin regulators, since we have observed the same arrangement in clusters after experiments performed with ASH2, as well as in experiments performed by others with NURF, dMyc, and ASH1. We have also found many of these clusters to be significantly conserved in D. simulans, D. yakuba, D. pseudoobscura and partially in Anopheles gambiae. Conclusion: The analysis of genes governed by chromatin regulators has led to the identification of clusters of functionally related genes conserved in other insect species, suggesting this chromosomal organization is biologically important. Moreover, our results indicate that TRX and other chromatin regulators may act globally on chromatin domains that contain transcriptionally co-regulated genes.
Abstract:
MOTIVATION: Analysis of millions of pyro-sequences is currently playing a crucial role in the advance of environmental microbiology. Taxonomy-independent, i.e. unsupervised, clustering of these sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought after qualities, but have thus far largely been overlooked. RESULTS: More than 1 million hyper-variable internal transcribed spacer 1 (ITS1) sequences of fungal origin have been analyzed. The ITS1 sequences were first properly extracted from 454 reads using generalized profiles. Then, otupipe, cd-hit-454, ESPRIT-Tree and DBC454, a new algorithm presented here, were used to analyze the sequences. A numerical assay was developed to measure the reproducibility and robustness of these algorithms. DBC454 was the most robust, closely followed by ESPRIT-Tree. DBC454 features density-based hierarchical clustering, which complements the other methods by providing insights into the structure of the data. AVAILABILITY: An executable is freely available for non-commercial users at ftp://ftp.vital-it.ch/tools/dbc454. It is designed to run under MPI on a cluster of 64-bit Linux machines running Red Hat 4.x, or on a multi-core OSX system. CONTACT: dbc454@vital-it.ch or nicolas.guex@isb-sib.ch.
Abstract:
Abstract: This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems.
Summary: This research concerns the development and application of so-called unsupervised learning methods. The target applications are the analysis of forensic data and the classification of hyperspectral remote sensing images. First, an unsupervised classification methodology based on the symbolic optimization of an inter-sample distance measure is proposed. This measure is obtained by optimizing a cost function related to the preservation of a point's neighborhood structure between the space of the original variables and the space of principal components. The method is applied to the analysis of forensic data and compared with a range of existing methods. Second, a method based on the joint optimization of variable selection and classification is implemented in a neural network and applied to various databases, including two hyperspectral images. The neural network is trained with a stochastic gradient algorithm, which makes the technique applicable to very high resolution images. The results show that such a technique can classify very large databases without difficulty and compares favorably with existing methods.
Abstract:
The rationale of this study was to investigate molecular flexibility and its influence on physicochemical properties with a view to uncovering additional information on the fuzzy concept of dynamic molecular structure. Indeed, it is now known that computed molecular interaction fields (MIFs) such as molecular electrostatic potentials (MEPs) and lipophilicity potentials (MLPs) are conformation-dependent, as are dipole moments. A database of 125 compounds was used whose conformational space was explored, while conformation-dependent parameters were computed for each non-redundant conformer found in the conformational space of the compounds. These parameters were the virtual log P (log P(MLP), calculated by a MLP approach), the apolar surface area (ASA), polar surface area (PSA), and solvent-accessible surface (SAS). For each compound, the range taken by each parameter (its property space) was divided by the number of rotors taken as an index of flexibility, yielding a parameter termed 'molecular sensitivity'. This parameter was poorly correlated with others (i.e., it contains novel information) and showed the compounds to fall into two broad classes. 'Sensitive' molecules are those whose computed property ranges are markedly sensitive to conformational effects, whereas 'insensitive' (in fact, less sensitive) molecules have property ranges which are comparatively less affected by conformational fluctuations. A pharmacokinetic application is presented.
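The 'molecular sensitivity' defined in the abstract above (the range of a conformation-dependent property divided by the number of rotors) can be written out as a short sketch. The function name `molecular_sensitivity`, the handling of rigid molecules and the sample values are assumptions made for illustration; they are not data from the study.

```python
def molecular_sensitivity(property_values, n_rotors):
    """Property range across conformers divided by the number of rotors."""
    if n_rotors <= 0:
        # Assumption: the index is left undefined for molecules with no rotors.
        raise ValueError("n_rotors must be positive")
    property_range = max(property_values) - min(property_values)
    return property_range / n_rotors


# Hypothetical virtual log P values for the conformers of one compound with 3 rotors.
virtual_log_p = [1.2, 1.8, 2.1]
sensitivity = molecular_sensitivity(virtual_log_p, n_rotors=3)
print(round(sensitivity, 3))
```

The same function applies unchanged to the other parameters listed in the abstract (ASA, PSA, SAS), since each is summarized by its range over the non-redundant conformers.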
Abstract:
Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits, but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). 
Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.
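The percentages quoted in the abstract above can be related by simple arithmetic. The sketch below assumes that "16% of phenotypic variation" and "20% of heritable variation" describe the same set of variants, in which case their ratio gives the heritability implied by the abstract. The variable names are chosen for this example.

```python
# Figures quoted in the abstract, as fractions of variance.
explained_phenotypic_now = 0.10        # currently explained phenotypic variation
explained_phenotypic_projected = 0.16  # projected, with unidentified common variants
explained_heritable_projected = 0.20   # the same projection, as a share of heritable variation

# Implied heritability: phenotypic share divided by heritable share.
implied_heritability = explained_phenotypic_projected / explained_heritable_projected

# Under the same heritability, the share of heritable variation explained today.
explained_heritable_now = explained_phenotypic_now / implied_heritability

print(round(implied_heritability, 2), round(explained_heritable_now, 3))
```

Under this reading the implied heritability of height is about 0.8, and the identified loci account for roughly one eighth of heritable variation.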
Abstract:
As adult height is a well-established retrospective measure of health and standard of living, it is important to understand the factors that determine it. Among them, the influence of socio-environmental factors has been subjected to empirical scrutiny. This paper explores the influence of generational (or environmental) effects and individual and gender-specific heterogeneity on adult height. Our data set is from contemporary Spain, a country governed by an authoritarian regime between 1939 and 1977. First, we use normal position and quantile regression analysis to identify the determinants of self-reported adult height and to measure the influence of individual heterogeneity. Second, we use a Blinder-Oaxaca decomposition approach to explain the 'gender height gap' and its distribution, so as to measure the influence on this gap of individual heterogeneity. Our findings suggest a significant increase in adult height in the generations that benefited from the country's economic liberalization in the 1950s, and especially those brought up after the transition to democracy in the 1970s. In contrast, distributional effects on height suggest that only in recent generations has 'height increased more among the tallest'. Although the mean gender height gap is 11 cm, generational effects and other controls such as individual capabilities explain on average roughly 5% of this difference, a figure that rises to 10% in the lowest 10% quantile.
Abstract:
OBJECTIVES: Growth retardation is a frequent complication of paediatric inflammatory bowel disease (IBD). Only a few studies report the final height of these patients, with controversial results. We compared adult height of patients with paediatric IBD with that of patients with adult-onset disease. METHODS: Height data of 675 women 19-44 years of age and 454 men 23-44 years of age obtained at inclusion in the Swiss IBD cohort study registry were grouped according to the age at diagnosis: (a) prepubertal (men≤13, women≤11 years), (b) pubertal (men 13-22, women 11-18 years) and (c) adult (men>22, women>18 years of age), and compared with each other and with healthy controls. RESULTS: Male patients with prepubertal onset of Crohn's disease (CD) had significantly lower final height (mean 172±6 cm, range 161-182) compared with men with pubertal (179±6 cm, 161-192) or adult (178±7 cm, 162-200) age at onset and the general population (178±7 cm, 142-204). Height z-scores standardized against heights of the normal population were significantly lower in all patients with a prepubertal diagnosis of CD (-0.8±0.9) compared with the other patient groups (-0.1±0.8, P<0.001). Prepubertal onset of CD emerged as a risk factor for reduced final height in patients with prepubertal CD. No difference for final height was found between patients with ulcerative or unclassified IBD diagnosed at prepubertal, pubertal or adult age. CONCLUSION: Prepubertal onset of CD is a risk for lower final height, independent of the initial disease location and the necessity for surgical interventions.
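A height z-score of the kind reported in the abstract above is the distance of a height from the reference mean, in reference standard deviations. The sketch below uses the general-population male figures quoted in the abstract (178 ± 7 cm) as the reference. This is an assumption, since the abstract does not say which reference data the study used, and the function name `height_z_score` is chosen for this example.

```python
def height_z_score(height_cm, reference_mean_cm, reference_sd_cm):
    """Standardize a height against a reference population."""
    return (height_cm - reference_mean_cm) / reference_sd_cm


# Mean final height of men with prepubertal-onset CD (172 cm), standardized
# against the general-population figures from the abstract (assumed reference).
z_prepubertal_men = height_z_score(172.0, reference_mean_cm=178.0, reference_sd_cm=7.0)
print(round(z_prepubertal_men, 2))
```

With these inputs the z-score is close to the -0.8 reported for patients with a prepubertal diagnosis, although the reported value pools men and women and so is not directly comparable.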