54 results for height partition clustering
at Université de Lausanne, Switzerland
Abstract:
The long-term goal of this research is to develop a program able to produce an automatic segmentation and categorization of textual sequences into discourse types. In this preliminary contribution, we present the construction of an algorithm which takes a segmented text as input and attempts to categorize its sequences as narrative, argumentative, descriptive and so on. This work also aims to investigate a possible convergence between unsupervised statistical learning and the typological approach developed in particular in the field of text and discourse analysis in French by Adam (2008) and Bronckart (1997).
Abstract:
OBJECTIVES: Growth retardation is a frequent complication of paediatric inflammatory bowel disease (IBD). Only a few studies report the final height of these patients, with controversial results. We compared adult height of patients with paediatric IBD with that of patients with adult-onset disease. METHODS: Height data of 675 women 19-44 years of age and 454 men 23-44 years of age obtained at inclusion in the Swiss IBD cohort study registry were grouped according to the age at diagnosis: (a) prepubertal (men≤13, women≤11 years), (b) pubertal (men 13-22, women 11-18 years) and (c) adult (men>22, women>18 years of age), and compared with each other and with healthy controls. RESULTS: Male patients with prepubertal onset of Crohn's disease (CD) had significantly lower final height (mean 172±6 cm, range 161-182) compared with men with pubertal (179±6 cm, 161-192) or adult (178±7 cm, 162-200) age at onset and the general population (178±7 cm, 142-204). Height z-scores standardized against heights of the normal population were significantly lower in all patients with a prepubertal diagnosis of CD (-0.8±0.9) compared with the other patient groups (-0.1±0.8, P<0.001). Prepubertal onset of CD emerged as a risk factor for reduced final height in patients with prepubertal CD. No difference for final height was found between patients with ulcerative or unclassified IBD diagnosed at prepubertal, pubertal or adult age. CONCLUSION: Prepubertal onset of CD is a risk for lower final height, independent of the initial disease location and the necessity for surgical interventions.
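The height z-scores reported above standardize a height against a reference distribution. A minimal sketch in Python of that arithmetic, assuming the reference mean and standard deviation are the general-population figures quoted for men (178 and 7 cm). The function name `height_z_score` is illustrative, and the study's actual sex-specific reference data are not given in the abstract.

```python
def height_z_score(height_cm, ref_mean_cm, ref_sd_cm):
    """Return (height - reference mean) / reference SD."""
    if ref_sd_cm <= 0:
        raise ValueError("reference SD must be positive")
    return (height_cm - ref_mean_cm) / ref_sd_cm


# Assumption: reference values are the male general-population figures
# quoted in the abstract (178 +/- 7 cm); 172 cm is the mean final height
# reported for men with prepubertal onset of Crohn's disease.
z = height_z_score(172.0, 178.0, 7.0)
print(round(z, 2))
```

A negative value means the height lies below the reference mean. The published z-scores pool both sexes, so this single-sex calculation only illustrates the method.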
Abstract:
Specific properties emerge from the structure of large networks, such as that of worldwide air traffic, including a highly hierarchical node structure and multi-level small world sub-groups that strongly influence future dynamics. We have developed clustering methods to understand the form of these structures, to identify structural properties, and to evaluate the effects of these properties. Graph clustering methods are often constructed from different components: a metric, a clustering index, and a modularity measure to assess the quality of a clustering method. To understand the impact of each of these components on the clustering method, we explore and compare different combinations. These different combinations are used to compare multilevel clustering methods to delineate the effects of geographical distance, hubs, network densities, and bridges on worldwide air passenger traffic. The ultimate goal of this methodological research is to demonstrate evidence of combined effects in the development of an air traffic network. In fact, the network can be divided into different levels of "cohesion", which can be qualified and measured by comparative studies (Newman, 2002; Guimera et al., 2005; Sales-Pardo et al., 2007).
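The modularity component mentioned above can be illustrated with a small sketch. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph, which may differ from the measure the authors actually use. The function name `modularity` and the toy graph are illustrative only.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    Q = sum over communities of [L_c / m - (d_c / (2 m))^2], where L_c is
    the number of intra-community edges and d_c the total degree.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # edges with both endpoints in community
    degree_sum = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(toy_edges, toy_partition))
```

Putting every node in a single community gives Q = 0 under this definition, so positive values indicate more intra-community edges than expected by chance.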
Abstract:
Adult height is a model polygenic trait, but there has been limited success in identifying the genes underlying its normal variation. To identify genetic variants influencing adult human height, we used genome-wide association data from 13,665 individuals and genotyped 39 variants in an additional 16,482 samples. We identified 20 variants associated with adult height (P < 5 × 10^-7, with 10 reaching P < 1 × 10^-10). Combined, the 20 SNPs explain approximately 3% of height variation, with an approximately 5 cm difference between the 6.2% of people with 17 or fewer 'tall' alleles and the 5.5% with 27 or more 'tall' alleles. The loci we identified implicate genes in Hedgehog signaling (IHH, HHIP, PTCH1), extracellular matrix (EFEMP1, ADAMTSL3, ACAN) and cancer (CDK6, HMGA2, DLEU7) pathways, and provide new insights into human growth and developmental processes. Finally, our results provide insights into the genetic architecture of a classic quantitative trait.
Abstract:
The distribution of socio-economic features in urban space is an important source of information for land and transportation planning. The metropolization phenomenon has changed the spatial distribution of types of professions and has given rise to different spatial patterns that urban planners must understand in order to plan a sustainable city. Such distributions can be discovered by statistical and learning algorithms through different methods. In this paper, an unsupervised classification method and a cluster detection method are discussed and applied to analyze the socio-economic structure of Switzerland. The unsupervised classification method, based on Ward's classification and self-organized maps, is used to classify the municipalities of the country and makes it possible to reduce high-dimensional input information in order to interpret the socio-economic landscape. The cluster detection method, the spatial scan statistics, is used in a more specific manner to detect hot spots of certain types of service activities. The method is applied to the distribution services in the agglomeration of Lausanne. Results show the emergence of new centralities and can be analyzed in both transportation and social terms.
Abstract:
Background: Few data are available on long-term secular trends in height and weight in children in countries in transition. We assessed the secular trends in height and weight among representative samples of children and adolescents from the Seychelles (African region). Methods: Weight and height data from all students of all schools in four selected school grades (kindergarten, 4th, 7th and 10th years) were collected by cross-sectional surveys for periods 1998-9 (3,676 boys, 3,715 girls) and 2005-6 (4,867 boys, 4,846 girls). Data from 1956-7 were extracted from a previously published report. Results: Height increased, in boys, by 1.6 cm/decade for the period 1956-7 to 1998-9, and 1.1 cm/decade for the period 1998-9 to 2005-6; in girls, the corresponding figures were 0.9 cm/decade and 1.8 cm/decade. At age 15.5 years, boys/girls were taller by 10/13 cm in 2005-6 than in 1956-7. Weight increased, in boys, by 1.4 kg/decade for the period 1956-7 to 1998-9, and by 2.2 kg/decade for the subsequent period; the corresponding figures in girls were 1.1 kg/decade and 2.5 kg/decade. Conclusion: Marked upward secular trends in body height and weight were documented in children and adolescents aged <16 years in the Seychelles, consistent with large changes in socio-economic and nutritional indicators in the considered 50-year interval. However, indirect evidence suggests that the secular height gain reflects accelerated growth during childhood over time with less than commensurate impact on adult height. Conversely, the considerably steeper secular increase in weight than in height is consistent with a pediatric obesity epidemic.
Abstract:
Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
Abstract:
A methodology of exploratory data analysis investigating the phenomenon of orographic precipitation enhancement is proposed. The precipitation observations obtained from three Swiss Doppler weather radars are analysed for the major precipitation event of August 2005 in the Alps. Image processing techniques are used to detect significant precipitation cells/pixels from radar images while filtering out spurious effects due to ground clutter. The contribution of topography to precipitation patterns is described by an extensive set of topographical descriptors computed from the digital elevation model at multiple spatial scales. Additionally, the motion vector field is derived from subsequent radar images and integrated into a set of topographic features to highlight the slopes exposed to main flows. Following the exploratory data analysis with a recent algorithm of spectral clustering, it is shown that orographic precipitation cells are generated under specific flow and topographic conditions. Repeatability of precipitation patterns in particular spatial locations is found to be linked to specific local terrain shapes, e.g. at the top of hills and on the upwind side of the mountains. This methodology and our empirical findings for the Alpine region provide a basis for building computational data-driven models of orographic enhancement and triggering of precipitation. Copyright (C) 2011 Royal Meteorological Society.
Abstract:
OBJECTIVE: Identification of children with elevated blood pressure (BP) is difficult because of the multiple sex, age, and height-specific thresholds to define elevated BP. We propose a simple set of absolute height-specific BP thresholds and evaluate their performance to identify children with elevated BP in two different populations. METHODS: Using the 95th sex, age, and relative-height BP US thresholds to define elevated BP in children (standard criteria), we derived a set of (non sex- and non age-specific) absolute height-specific BP thresholds for 11 height categories by 10 cm increments. Using data from large school-based surveys conducted in Switzerland (N = 5207; 2621 boys, 2586 girls; age range: 10.1-14.9 years) and in the Seychelles (N = 25 759; 13 048 boys, 12 711 girls; age range: 4.4-18.8 years), we evaluated the performance of these height-specific thresholds to identify children with elevated BP. We also derived sex-specific absolute height-specific BP thresholds and compared their performance. RESULTS: In the Swiss and the Seychelles surveys, the prevalence of elevated BP (standard criteria) was 11.4 and 9.1%, respectively. The height-specific thresholds to identify elevated BP had a sensitivity of 80 and 84%, a specificity of 99 and 99%, a positive predictive value of 92 and 91%, and a negative predictive value of 97 and 98%, respectively. Performance of sex-specific absolute height-specific BP thresholds was similar. CONCLUSION: A simple table of height-specific BP thresholds allowed identifying children with elevated BP with high sensitivity and excellent specificity.
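The relationship between the performance figures above can be checked with Bayes' rule: given sensitivity, specificity and prevalence, the predictive values follow directly. The sketch below is a worked check and not the study's own computation. The function name `predictive_values` is illustrative, and the inputs are the rounded percentages from the abstract, so the outputs can only approximate the published values.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    All arguments are proportions strictly between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded figures quoted in the abstract for the Swiss survey.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.99,
                             prevalence=0.114)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these rounded inputs the sketch gives a PPV of roughly 0.91 and an NPV of roughly 0.97, close to the reported 92% and 97%. The small gap in PPV is plausibly due to rounding of the published specificity.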
Abstract:
OBJECTIVE: This study assessed clustering of multiple risk behaviors (i.e., low leisure-time physical activity, low fruits/vegetables intake, and high alcohol consumption) with level of cigarette consumption. METHODS: Data from the 2002 Swiss Health Survey, a population-based cross-sectional telephone survey assessing health and self-reported risk behaviors, were used. 18,005 subjects (8052 men and 9953 women) aged 25 years or older participated. RESULTS: Smokers more frequently had low leisure-time physical activity, low fruits/vegetables intake, and high alcohol consumption than non- and ex-smokers. Frequency of each risk behavior increased steadily with cigarette consumption. Clustering of risk behaviors increased with cigarette consumption in both men and women. For men, the odds ratios of multiple (≥2) risk behaviors other than smoking, adjusted for age, nationality, and educational level, were 1.14 (95% confidence interval: 0.97, 1.33) for ex-smokers, 1.24 (0.93, 1.64) for light smokers (1-9 cigarettes/day), 1.72 (1.36, 2.17) for moderate smokers (10-19 cigarettes/day), and 3.07 (2.59, 3.64) for heavy smokers (≥20 cigarettes/day) versus non-smokers. Similar odds ratios were found for women for corresponding groups, i.e., 1.01 (0.86, 1.19), 1.26 (1.00, 1.58), 1.62 (1.33, 1.98), and 2.75 (2.30, 3.29). CONCLUSIONS: Counseling and intervention with smokers should take into account the strong clustering of risk behaviors with level of cigarette consumption.
Abstract:
ECG criteria for left ventricular hypertrophy (LVH) have been almost exclusively elaborated and calibrated in white populations. Because several interethnic differences in ECG characteristics have been found, the applicability of these criteria to African individuals remains to be demonstrated. We therefore investigated the performance of classic ECG criteria for LVH detection in an African population. Digitized 12-lead ECG tracings were obtained from 334 African individuals randomly selected from the general population of the Republic of Seychelles (Indian Ocean). Left ventricular mass was calculated with M-mode echocardiography and indexed to body height. LVH was defined by taking the 95th percentile of body height-indexed LVM values in a reference subgroup. In the entire study sample, 16 men and 15 women (prevalence 9.3%) were finally declared to have LVH, of whom 9 were of the reference subgroup. Sensitivity, specificity, accuracy, and positive and negative predictive values for LVH were calculated for 9 classic ECG criteria, and receiver operating characteristic curves were computed. We also generated a new composite time-voltage criterion with stepwise multiple linear regression: weighted time-voltage criterion = (0.2366 R(aVL) + 0.0551 R(V5) + 0.0785 S(V3) + 0.2993 T(V1)) × QRS duration. The Sokolow-Lyon criterion reached the highest sensitivity (61%) and the R(aVL) voltage criterion reached the highest specificity (97%) when evaluated at their traditional partition value. However, at a fixed specificity of 95%, the sensitivity of these 10 criteria ranged from 16% to 32%. Best accuracy was obtained with the R(aVL) voltage criterion and the new composite time-voltage criterion (89% for both). Positive and negative predictive values varied considerably depending on the concomitant presence of 3 clinical risk factors for LVH (hypertension, age ≥50 years, overweight). Median positive and negative predictive values of the 10 ECG criteria were 15% and 95%, respectively, for subjects with none or 1 of these risk factors compared with 63% and 76% for subjects with all of them. In conclusion, the performance of classic ECG criteria for LVH detection was largely disparate and appeared to be lower in this population of East African origin than in white subjects. A newly generated composite time-voltage criterion might provide improved performance. The predictive value of ECG criteria for LVH was considerably enhanced with the integration of information on concomitant clinical risk factors for LVH.
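The composite time-voltage criterion quoted in the abstract above is a weighted sum of four wave amplitudes multiplied by the QRS duration. The sketch below only transcribes that formula. The function name, argument names and input values are illustrative, and the measurement units and any partition value are assumptions, since the abstract states neither.

```python
# Regression weights as quoted in the abstract.
W_R_AVL = 0.2366
W_R_V5 = 0.0551
W_S_V3 = 0.0785
W_T_V1 = 0.2993


def weighted_time_voltage(r_avl, r_v5, s_v3, t_v1, qrs_duration):
    """Weighted voltage sum times QRS duration, as written in the abstract.

    Units are an assumption: the abstract does not say whether amplitudes
    are in mV or mm, or whether the duration is in ms or s.
    """
    voltage_sum = (W_R_AVL * r_avl + W_R_V5 * r_v5
                   + W_S_V3 * s_v3 + W_T_V1 * t_v1)
    return voltage_sum * qrs_duration


# Hypothetical measurements, chosen only to exercise the function.
print(weighted_time_voltage(r_avl=0.8, r_v5=1.9, s_v3=1.2, t_v1=0.1,
                            qrs_duration=95.0))
```

The value is only meaningful relative to a partition value calibrated in the same units, which the abstract does not provide.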
Abstract:
To cluster textual sequence types (discourse types/modes) in French texts, the K-means algorithm with high-dimensional embeddings and a fuzzy clustering algorithm were applied to clauses whose POS (part-of-speech) n-gram profiles had previously been extracted. Uni-, bi- and trigrams were used on four 19th-century French short stories by Maupassant. For high-dimensional embeddings, power transformations on the chi-squared distances between clauses were explored. Preliminary results show that high-dimensional embeddings improve the quality of clustering, in contrast to the use of bi- and trigrams, whose performance is disappointing, possibly because of feature space sparsity.
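The chi-squared distances and power transformation mentioned above can be sketched as follows. This assumes the usual correspondence-analysis definition of the squared chi-squared distance between row profiles and a simple element-wise power with exponent q > 0. The authors' exact weighting, exponent and embedding step are not given in the abstract. The function names and toy counts are hypothetical.

```python
def chi2_squared_distances(counts):
    """Squared chi-squared distances between the rows of a count table.

    counts: list of rows (clauses), each a list of n-gram counts.
    Every row is assumed to contain at least one non-zero count.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)
    # Row profiles: each row rescaled to sum to 1.
    profiles = [[x / sum(row) for x in row] for row in counts]
    # Column masses: share of the grand total held by each column.
    col_mass = [sum(row[k] for row in counts) / total for k in range(n_cols)]
    dist = [[0.0] * n_rows for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(i + 1, n_rows):
            d2 = sum((profiles[i][k] - profiles[j][k]) ** 2 / col_mass[k]
                     for k in range(n_cols) if col_mass[k] > 0)
            dist[i][j] = dist[j][i] = d2
    return dist


def power_transform(dist, q):
    """Raise every dissimilarity to the power q (assumed q > 0)."""
    if q <= 0:
        raise ValueError("q must be positive")
    return [[d ** q for d in row] for row in dist]


# Hypothetical POS-unigram counts for three clauses.
toy_counts = [[2, 1, 0], [4, 2, 0], [0, 1, 3]]
d2 = chi2_squared_distances(toy_counts)
print(power_transform(d2, 0.5))
```

Rows with proportional counts (the first two clauses here) have identical profiles and therefore zero distance. The transformed dissimilarities would then be embedded, for example by multidimensional scaling, before K-means is applied, and that step is not shown.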