869 results for height partition clustering
Abstract:
This thesis studies the clustering of galaxy clusters and the determination of the position of the baryon acoustic oscillation (BAO) peak in order to constrain cosmological parameters. To this end, a code was implemented to estimate the error using the jackknife and bootstrap methods. Thanks to the very small estimated error, the measured BAO peak, compared with cosmological models, was found to agree with the LambdaCDM model and makes it possible to constrain some parameters of the cosmological models.
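As an illustration of the two error-estimation methods named in the abstract above, here is a minimal sketch that uses only the Python standard library. It is not the thesis code: the sample values, the function names `jackknife_se` and `bootstrap_se`, and the choice of the mean as estimator are all assumptions made for the example. In clustering analyses the resampling is usually done over spatial subregions of the survey rather than over a flat list of numbers.

```python
import math
import random
import statistics


def jackknife_se(data, estimator):
    """Leave-one-out jackknife standard error of `estimator` on `data`."""
    n = len(data)
    # Recompute the estimator n times, each time dropping one observation.
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)
    return math.sqrt(variance)


def bootstrap_se(data, estimator, n_resamples=2000, seed=0):
    """Bootstrap standard error: spread of the estimator over resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Each resample draws n values from the data with replacement.
    replicates = [estimator(rng.choices(data, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(replicates)


# Hypothetical measurements, for illustration only.
sample = [2.1, 2.5, 1.9, 2.8, 2.4, 2.2, 2.6, 2.0]
jk = jackknife_se(sample, statistics.mean)
bs = bootstrap_se(sample, statistics.mean)
print(f"jackknife SE = {jk:.4f}, bootstrap SE = {bs:.4f}")
```

For the sample mean, the jackknife estimate coincides with the usual standard error s/sqrt(n), which gives a quick sanity check on the implementation.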
Abstract:
Bioinformatics, in the last few decades, has played a fundamental role in making sense of the huge amount of data produced. Once the complete sequence of a genome has been obtained, the major problem is to learn as much as possible about its coding regions. Protein sequence annotation is challenging and, due to the size of the problem, only computational approaches can provide a feasible solution. As recently pointed out by the Critical Assessment of Function Annotations (CAFA), the most accurate methods are those based on the transfer-by-homology approach, and the most incisive contribution is given by cross-genome comparisons. This thesis describes a non-hierarchical sequence clustering method for automatic large-scale protein annotation, called “The Bologna Annotation Resource Plus” (BAR+). The method is based on an all-against-all alignment of more than 13 million protein sequences, characterized by a very stringent metric. BAR+ can safely transfer functional features (Gene Ontology and Pfam terms) inside clusters by means of a statistical validation, even in the case of multi-domain proteins. Within BAR+ clusters it is also possible to transfer the three-dimensional structure (when a template is available). This is made possible by cluster-specific HMM profiles that can be used to calculate reliable template-to-target alignments even in the case of distantly related proteins (sequence identity < 30%). Other BAR+ based applications were developed during my doctorate, including the prediction of magnesium binding sites in human proteins, the classification of the ABC transporter superfamily, and the functional prediction (GO terms) of the CAFA targets. Remarkably, in the CAFA assessment, BAR+ placed among the ten most accurate methods. At present, BAR+ is freely available as a web server for the functional and structural annotation of protein sequences at http://bar.biocomp.unibo.it/bar2.0.
Abstract:
The purpose of clustering is therefore to identify meaningful structures in the data, and it is from this definition that this thesis work began, providing an innovative and unexplored approach to clustering: not searching for the relationship, but reasoning about what is not a relationship. Looking at a set of data, what does the absence of a relationship represent? It is a difficult question to pose, but one that intrinsically contains its own answer: the independence of every single data point from all the others. The search for independence among the data led us to a statistical approach, since independence is well described and demonstrated in statistics. For a point in a dataset to be considered “free of links/relationships”, it must have the same probability of being present in every spatial element of the entire dataset. Mathematically speaking, every point P in a space S has the same probability of falling in a region R; that is, the point can RANDOMLY lie within any region of the dataset. The thesis work starts from this assumption and is divided into several parts. The second chapter analyses the state of the art of clustering in relation to the growing problem of data volume: with the spread of the Internet, knowledge bases have grown exponentially both in number of attributes (dimensions) and in quantity of data (Big Data). The third chapter recalls the theoretical and statistical concepts used by the statistical algorithms implemented. The fourth chapter gives the details of the implementation of the algorithms, describing the various phases of investigation, the reasons for the architectural choices, and the considerations that led to the exclusion of one of the three implemented versions.
In the fifth chapter, algorithms 2 and 3 are compared with some algorithms from the literature to show the potential and the problems of the developed algorithm. These tests are qualitative, since the aim of the thesis is to show that a statistical approach can be a winning strategy, not to provide a new algorithm usable across the various clustering problems. The sixth chapter draws conclusions on the work done and lists possible future developments from which the newly begun research on statistical clustering could grow.
Abstract:
An impressive discrepancy between reported and measured parental height is often observed. The aims of this study were: (a) to assess whether there is a significant difference between the reported and measured parental height; (b) to focus on the reported and, thereafter, measured height of the partner; (c) to analyse its impact on the calculated target height range.
Abstract:
We have investigated the use of hierarchical clustering of flow cytometry data to classify samples of conventional central chondrosarcoma, a malignant cartilage-forming tumor of uncertain cellular origin, according to similarities with surface marker profiles of several known cell types. Human primary chondrosarcoma cells, articular chondrocytes, mesenchymal stem cells, fibroblasts, and a panel of tumor cell lines of chondrocytic or epithelial origin were clustered based on the expression profile of eleven surface markers. For clustering, eight hierarchical clustering algorithms, three distance metrics, as well as several approaches for data preprocessing, including multivariate outlier detection, logarithmic transformation, and z-score normalization, were systematically evaluated. By selecting clustering approaches shown to give reproducible results for cluster recovery of known cell types, primary conventional central chondrosarcoma cells could be grouped in two main clusters with distinctive marker expression signatures: one group clustering together with mesenchymal stem cells (CD49b-high/CD10-low/CD221-high) and a second group clustering close to fibroblasts (CD49b-low/CD10-high/CD221-low). Hierarchical clustering also revealed substantial differences between primary conventional central chondrosarcoma cells and established chondrosarcoma cell lines, with the latter not only segregating apart from primary tumor cells and normal tissue cells, but clustering together with cell lines of epithelial lineage. Our study provides a foundation for the use of hierarchical clustering applied to flow cytometry data as a powerful tool to classify samples according to marker expression patterns, which could lead to the discovery of new cancer subtypes.
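The combination of z-score normalization and hierarchical clustering mentioned in the abstract above can be sketched in a few lines of standard-library Python. This is only one of many possible configurations (average linkage with Euclidean distance) and is not claimed to be the one the authors selected. The marker intensities and the function names `zscore_columns` and `average_linkage` are hypothetical.

```python
import math
import statistics


def zscore_columns(rows):
    """Standardize each column (marker) to zero mean and unit variance.

    Assumes no column is constant, otherwise the division would fail.
    """
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def average_linkage(rows, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(rows))]

    def dist(a, b):
        # Average Euclidean distance over all cross-cluster pairs of samples.
        total = sum(math.dist(rows[i], rows[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [
            (dist(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        ]
        _, p, q = min(pairs)
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical expression values: 6 samples x 3 surface markers.
samples = [
    [9.0, 1.0, 8.5],
    [8.5, 1.5, 9.0],
    [9.5, 0.8, 8.0],
    [1.0, 9.0, 1.5],
    [1.5, 8.5, 1.0],
    [0.8, 9.5, 2.0],
]
groups = average_linkage(zscore_columns(samples), n_clusters=2)
print(groups)
```

With real data, a library routine such as SciPy's hierarchical clustering would normally replace the quadratic loop used here.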
Abstract:
In recent years, enamel matrix derivative (EMD) has garnered much interest in the dental field for its apparent bioactivity that stimulates regeneration of periodontal tissues including periodontal ligament, cementum and alveolar bone. Despite its widespread use, the underlying cellular mechanisms remain unclear and an understanding of its biological interactions could identify new strategies for tissue engineering. Previous in vitro research has demonstrated that EMD promotes premature osteoblast clustering at early time points. The aim of the present study was to evaluate the influence of cell clustering on vital osteoblast cell-cell communication and adhesion molecules, connexin 43 (cx43) and N-cadherin (N-cad), as assessed by immunofluorescence imaging, real-time PCR and Western blot analysis. In addition, differentiation markers of osteoblasts were quantified using alkaline phosphatase, osteocalcin and von Kossa staining. EMD significantly increased the expression of connexin 43 and N-cadherin at early time points ranging from 2 to 5 days. Protein expression was localized to cell membranes when compared to control groups. Alkaline phosphatase activity was also significantly increased on EMD-coated samples at 3, 5 and 7 days post seeding. Interestingly, higher activity was localized to cell cluster regions. There was a 3-fold increase in osteocalcin and bone sialoprotein mRNA levels for osteoblasts cultured on EMD-coated culture dishes. Moreover, EMD significantly increased extracellular mineral deposition in cell clusters as assessed through von Kossa staining at 5, 7, 10 and 14 days post seeding. We conclude that EMD up-regulates the expression of vital osteoblast cell-cell communication and adhesion molecules, which enhances the differentiation and mineralization activity of osteoblasts. These findings provide further support for the clinical evidence that EMD increases the speed and quality of new bone formation in vivo.
Abstract:
To every partially ordered set (poset), one can associate a generating function, known as the P-partition generating function. We find necessary conditions and sufficient conditions for two posets to have the same P-partition generating function. We define the notion of a jump sequence for a labeled poset and show that having equal jump sequences is a necessary condition for generating function equality. We also develop multiple ways of modifying posets that preserve generating function equality. Finally, we are able to give a complete classification of equalities among partially ordered sets with exactly two linear extensions.
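For readers unfamiliar with the object named in the abstract above, the standard definition can be written out as follows. The abstract does not state which convention the authors use, so the order-preserving convention below is an assumption, and some references use the reversed one. Here $\omega$ denotes the labeling of the poset $P$.

```latex
% A (P, \omega)-partition is a map f : P \to \{1, 2, 3, \dots\} such that
%   a <_P b                       implies  f(a) \le f(b),
%   a <_P b and \omega(a) > \omega(b)  implies  f(a) < f(b).
\[
  K_{(P,\omega)}(x_1, x_2, \dots)
  \;=\; \sum_{f \text{ a } (P,\omega)\text{-partition}} \; \prod_{a \in P} x_{f(a)} .
\]
% Example: the two-element chain a <_P b.
\[
  \omega(a) < \omega(b):\quad K = \sum_{i \le j} x_i x_j ,
  \qquad
  \omega(a) > \omega(b):\quad K = \sum_{i < j} x_i x_j .
\]
```

The two labelings of the same chain give different generating functions, which is why the equality question is posed for labeled posets.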
Abstract:
A number of mathematical models for predicting growth and final height outcome have been proposed to enable the clinician to 'individualize' growth-promoting treatment. However, despite optimizing these models, many patients with isolated growth hormone deficiency (IGHD) do not reach their target height. The aim of this study was to analyse the impact of polymorphic genotypes [CA repeat promoter polymorphism of insulin-like growth factor-I (IGF-I) and the -202 A/C promoter polymorphism of IGF-Binding Protein-3 (IGFBP-3)] on variable growth factors as well as final height in severe IGHD following GH treatment. DESIGN, PATIENTS AND CONTROLS: One hundred seventy eight (IGF-I) and 167 (IGFBP-3) subjects with severe growth retardation because of IGHD were studied. In addition, the various genotypes were also studied in a healthy control group of 211 subjects.
Does published orthodontic research account for clustering effects during statistical data analysis?
Abstract:
In orthodontics, multiple site observations within patients or multiple observations collected at consecutive time points are often encountered. Clustered designs require larger sample sizes compared to individual randomized trials and special statistical analyses that account for the fact that observations within clusters are correlated. It is the purpose of this study to assess to what degree clustering effects are considered during design and data analysis in the three major orthodontic journals. The contents of the most recent 24 issues of the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO), Angle Orthodontist (AO), and European Journal of Orthodontics (EJO) from December 2010 backwards were hand searched. Articles with clustering effects were identified, and whether the authors accounted for those effects was recorded. Additionally, information was collected on: involvement of a statistician, single or multicenter study, number of authors in the publication, geographical area, and statistical significance. From the 1584 articles, after exclusions, 1062 were assessed for clustering effects, of which 250 (23.5 per cent) were considered to have clustering effects in the design (kappa = 0.92, 95 per cent CI: 0.67-0.99 for inter-rater agreement). Of the studies with clustering effects, only 63 (25.20 per cent) had indicated accounting for clustering effects. There was evidence that the studies published in the AO have higher odds of accounting for clustering effects [AO versus AJODO: odds ratio (OR) = 2.17, 95 per cent confidence interval (CI): 1.06-4.43, P = 0.03; EJO versus AJODO: OR = 1.90, 95 per cent CI: 0.84-4.24, non-significant; and EJO versus AO: OR = 1.15, 95 per cent CI: 0.57-2.33, non-significant]. The results of this study indicate that only about a quarter of the studies with clustering effects account for this in statistical data analysis.
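The percentages quoted in the abstract above can be checked directly from the reported counts, and the form of the odds-ratio results can be illustrated with a small calculation. The sketch below computes an unadjusted odds ratio with a Wald 95 per cent confidence interval from a 2x2 table. The per-journal counts are not given in the abstract, so the table values are hypothetical, and the authors may well have used a different method (for example logistic regression). The function name `odds_ratio_ci` is also an assumption.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: group 1 with / without the outcome
    c, d: group 2 with / without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Proportions reported in the abstract, recomputed from the counts.
pct_clustered = round(100 * 250 / 1062, 1)  # articles with clustering effects
pct_accounted = round(100 * 63 / 250, 1)    # of those, accounted for clustering

# Hypothetical 2x2 table, for illustration only (not the study's data).
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 20, 80)
print(pct_clustered, pct_accounted)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 corresponds to a comparison reported as significant, while an interval that includes 1 corresponds to one reported as non-significant.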
Abstract:
Little is known about engagement in multiple health behaviours in childhood cancer survivors.
Abstract:
Identification of children with elevated blood pressure (BP) is difficult because of the multiple sex, age, and height-specific thresholds to define elevated BP. We propose a simple set of absolute height-specific BP thresholds and evaluate their performance to identify children with elevated BP in two different populations.