840 results for Evolutionary clustering
Abstract:
In many problems, the high dimensionality of the search space, the large number of variables and economic constraints limit the ability of classical techniques to reach the optimum of a function, whether known or unknown. In this thesis we investigate the possibility of combining approaches from advanced statistics with optimization algorithms so as to explore the combinatorial search space more effectively and to improve the performance of the approaches. To this end we propose two methods: (i) Model Based Ant Colony Design and (ii) Naïve Bayes Ant Colony Optimization. We test the performance of the two proposed solutions in a simulation study and apply the novel techniques to an application in the field of Enzyme Engineering and Design.
Abstract:
Two Amerindian populations, one from the Peruvian Amazon (Yanesha) and one from the rural lowlands of the Argentinean Gran Chaco (Wichi), were analyzed. They represent two case studies of South American genetic variability. The Yanesha represent a model of a population isolated for a long time in the Amazon rainforest, characterized by environmental and altitudinal stratifications. The Wichi represent a model of a population living in an area recently colonized by European populations (the Criollos are the population of admixed descendants); here the aim is to depict the native ancestral gene pool and the degree of admixture, in relation to the very high prevalence of Chagas disease. The genotyping methods used are standard ones, targeting Y chromosome markers (male lineage) and mitochondrial markers (maternal lineage). The phylogeographic diagnostic polymorphisms were determined by the classical techniques of PCR, restriction enzymes, sequencing and specific mini-sequencing. A new method for the detection of the protozoan Trypanosoma cruzi was developed by means of nested PCR. The main results show patterns of genetic stratification in Yanesha forest communities, attributable to different migrations at different times, as estimated by Bayesian analyses. In particular, the Yanesha were considered as a transitional population between the Amazon basin and the Andean Cordillera, and the potential migration routes and the separation of clusters of communities were evaluated in relation to different genetic bio-ancestry. As for the Wichi, the gene pool analyzed appears clearly differentiated from that of the admixed sympatric Criollos, owing to strict social practices (analyzed in depth with the support of cultural anthropological tools) that have preserved the native identity at a diachronic level.
No pattern of distribution of seropositivity appears in relation to the different phylogenetic lineages (adaptation in evolutionary terms), whether Amerindian or European; seropositivity instead relates to the environmental and living conditions of the two distinct subpopulations.
Abstract:
The literature offers different ways to perform cluster analysis of categorical data, and, time and economic constraints aside, the choice among them depends strongly on the aim of the researcher. The main approaches to clustering are usually divided into model-based and distance-based methods: the former assume that objects belonging to the same class are similar in the sense that their observed values come from the same probability distribution, whose parameters are unknown and need to be estimated; the latter evaluate distances among objects using a defined dissimilarity measure and, based on it, allocate units to the closest group. In clustering, one may be interested in classifying similar objects into groups, or in finding observations that come from the same true homogeneous distribution. But do both of these aims lead to the same clustering? And how good are clustering methods designed to fulfil one of these aims in terms of the other? To answer these questions, two approaches, namely a latent class model (a mixture of multinomial distributions) and partitioning around medoids, are evaluated and compared by means of the Adjusted Rand Index, Average Silhouette Width and Pearson-Gamma indexes in a fairly wide simulation study. Simulation outcomes are plotted in two-dimensional graphs via Multidimensional Scaling; the size of each point is proportional to the number of points that overlap, and different colours are used according to cluster membership.
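As an illustration of the first comparison index named in this abstract, the following is a minimal sketch of the Adjusted Rand Index computed from two label vectors. It is not taken from the thesis: the function name and the handling of degenerate inputs are assumptions made for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same objects."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        # Assumption for this sketch: with fewer than two objects there are
        # no pairs to compare, so the partitions are treated as identical.
        return 1.0

    # Contingency counts: objects falling in each (cluster_a, cluster_b) cell.
    cell_counts = Counter(zip(labels_a, labels_b))
    row_counts = Counter(labels_a)
    col_counts = Counter(labels_b)

    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions place everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

The index equals 1 when the two partitions agree up to a relabelling of the clusters and is close to 0 (possibly negative) when the agreement is no better than chance.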
Abstract:
One of the quickest plant movements known is performed by the ´explosive´ style in Marantaceae in the service of secondary pollen presentation – a striking apomorphy relative to the sister family Cannaceae that might be of high evolutionary consequence. Although it has been known since the beginning of the 19th century, the underlying mechanism of the movement has not hitherto been clarified. The present study reports on the biomechanics of the style-staminode complex and the hydraulic principles of the movement. For the first time it is shown by experiment that in Maranta noctiflora, longitudinal growth of the maturing style in the ´straitjacket´ of the hooded staminode brings about both the hold of the style prior to its release and its tensioning for the movement. The longer the style grows in relation to the enclosing hooded staminode, the more its capacity for curling up for pollen transfer increases. Here I distinguish between the ´basic tension´ that a growing style builds up anyway, even when the hooded staminode is removed beforehand, and the ´induced tension´, which comes about only under the pressure of a ´too short´ hooded staminode and which enables the movement. The results of these investigations are discussed in view of previous interpretations ranging from possible biomechanical to electrophysiological mechanisms. To understand, furthermore, by what means the style gives way to the strong bending movement without suffering outwardly visible damage, I examined its anatomical structure in several genera for its mechanical and hydraulic properties and for the determination of the entire curvature after release. The actual bending part contains tubulate cells whose walls are extraordinarily porous, and large longitudinal intercellular spaces. SEM indicates the starting points of cell-wall loosening in primary walls and lysis of middle lamellae - probably through an intense pectinase activity in the maturing style.
Fluorescence pictures of macerated and living style-tissue confirm cell-wall perforations that do apparently connect neighbouring cells, which leads to an extremely permeable parenchyma. The ´water-body´ can be shifted from central to dorsal cell layers to support the bending. The geometrical form of the curvature is determined by the vascular bundles. I conclude that the style in Marantaceae contains no ´antagonistic´ motile tissues as in Mimosa or Dionaea. Instead, through self-maceration it develops to a ´hydraulic tissue´ which carries out an irreversible movement through a sudden reshaping. To ascertain the evolutionary consequence of this apomorphic pollination mechanism the diversity and systematic value of hooded staminodes are examined. For this hooded staminodes of 24 genera are sorted according to a minimalistic selection of shape characters and eight morphological types are abstracted from the resulting groups. These types are mapped onto an already available maximally parsimonious tree comprising five major clades. An amazing correspondence is found between the morphological types and the clades; several sister-relationships are confirmed and in cases of uncertain position possible evolutionary pathways, such as convergence, dispersal or re-migration, are discussed, as well as the great evolutionary tendencies for the entire family in which – at least as regards the shape of hooded staminodes – there is obviously a tendency from complicated to strongly simplified forms. It suggests itself that such simplifying derivations may very likely have taken place as adaptations to pollinating animals about which at present too little is known. The value of morphological characters in relation to modern phylogenetic analysis is discussed and conditions for the selection of morphological characters valuable for a systematic grouping are proposed. 
Altogether, in view of the evolutionary success of Marantaceae compared with Cannaceae the movement mechanism of the style-staminode complex can safely be considered a key innovation within the order Zingiberales.
Abstract:
The goal of data mining is the automatic extraction of meaningful patterns from large quantities of data. One example of the patterns that can be sought is meaningful groupings of the data, in which case one speaks of clustering. Traditional clustering algorithms show serious limitations on high-dimensional datasets, that is, datasets composed of objects described by a large number of attributes. Such datasets therefore call for a different method of analysis: subspace clustering. Subspace clustering consists of traversing the lattice of all possible subspaces in search of meaningful groups (clusters). A search of this kind is a particularly expensive operation from a computational point of view. Several optimizations have been proposed to make subspace clustering algorithms more efficient. This thesis addresses the problem from a different angle: the use of parallelization to reduce the computational cost of a subspace clustering algorithm.
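To make the cost of the lattice search concrete, here is a minimal sketch that enumerates every non-empty subspace of a small attribute set and evaluates each one independently. It does not reproduce the algorithm of the thesis: the grid-density test `dense_cells`, its parameters and the use of a thread pool as the parallel backend are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def subspaces(n_attributes):
    """Yield every non-empty subset of attribute indices, level by level."""
    for size in range(1, n_attributes + 1):
        yield from combinations(range(n_attributes), size)


def dense_cells(points, subspace, bins=2, min_points=2):
    """Hypothetical cluster test: grid cells of the projection onto `subspace`
    that hold at least `min_points` points. Assumes coordinates in [0, 1)."""
    counts = {}
    for point in points:
        cell = tuple(int(point[a] * bins) for a in subspace)
        counts[cell] = counts.get(cell, 0) + 1
    return subspace, sorted(c for c, k in counts.items() if k >= min_points)


def search(points, n_attributes, workers=4):
    """Evaluate all subspaces; each evaluation is independent of the others,
    which is what makes the search a candidate for parallel execution."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda s: dense_cells(points, s),
                           subspaces(n_attributes))
        return {s: cells for s, cells in results if cells}
```

With d attributes the lattice contains 2^d - 1 non-empty subspaces, so the number of independent evaluations doubles with every added attribute. In CPython, threads only show the structure of the decomposition; CPU-bound work of this kind would normally be distributed over processes or machines.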
Abstract:
The Marantaceae (550 species) are a family of perennial herbs and lianas, distributed worldwide, from the understorey of tropical lowland rainforests. The morphological-ecological comparison of the basally branching Sarcophrynium clade with the Marantochloa clade, which occupies a derived position, is intended to illustrate evolutionary patterns in the family. The doctoral thesis thus presents, for the first time, an overview of the floral biology and phylogeny of about 30 of the 40 African Marantaceae species. The analyses are based on data from three field stays of several months in Gabon, each between September and January. Four flower types are described, each associated with a specific pollinator guild (small, medium-sized and large bees, or birds). Pollination experiments show that 18 species are self-compatible, but only two species are autogamous, i.e. do not require pollinators. Fruit set is generally low (10-30 %). The complex synorganization of the flower enables an explosive pollination mechanism in the Marantaceae. To understand its ecological functionality, the flowers of 66 species, covering all major clades of the Marantaceae, are examined from a morphological-functional point of view. There is close agreement among all species examined in the interplay (synorganization) of the most important structural elements (style, hooded staminode, callose staminode), which enables precise pollen transfer. Based on nrDNA (ITS, 5S) and cpDNA (trnL-F) data, the phylogeny of the two African clades is established for a nearly complete species spectrum. Morphological and ecological characters as well as geographical distribution patterns are then reconstructed on this phylogeny according to the parsimony principle, in order to assess their evolutionary significance for the Marantaceae. The results point to the involvement of a large number of different speciation factors.
Abstract:
This thesis studies the clustering of galaxy clusters and the determination of the position of the BAO peak in order to obtain constraints on cosmological parameters. To this end, a code was implemented to estimate the error using the jackknife and bootstrap methods. The measurement of the BAO peak, compared with cosmological models, was found, thanks to the very small estimated error, to be in agreement with the LambdaCDM model, and makes it possible to obtain constraints on some parameters of the cosmological models.
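The two resampling schemes mentioned in this abstract can be sketched generically as follows. This is not the code of the thesis: it estimates the standard error of an arbitrary estimator (the mean by default) on a plain list of numbers, whereas a clustering analysis would typically resample sub-volumes of the survey. All function and parameter names are assumptions.

```python
import random
from statistics import mean


def jackknife_error(samples, estimator=mean):
    """Standard error from leave-one-out (delete-1 jackknife) estimates."""
    n = len(samples)
    # Recompute the estimator n times, each time omitting one sample.
    leave_one_out = [estimator(samples[:i] + samples[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - centre) ** 2 for x in leave_one_out)
    return variance ** 0.5


def bootstrap_error(samples, estimator=mean, n_resamples=1000, seed=0):
    """Standard error from resampling the data with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(samples)
    estimates = [estimator(rng.choices(samples, k=n)) for _ in range(n_resamples)]
    centre = sum(estimates) / n_resamples
    variance = sum((e - centre) ** 2 for e in estimates) / (n_resamples - 1)
    return variance ** 0.5
```

For the mean, the jackknife error reduces to the familiar sample standard deviation divided by the square root of the sample size, which is a convenient sanity check before applying the same functions to a more complicated estimator.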
Abstract:
Survivin, a unique member of the family of inhibitors of apoptosis (IAP) proteins, orchestrates intracellular pathways during cell division and apoptosis. Its central regulatory function in vertebrate molecular pathways as mitotic regulator and inhibitor of apoptotic cell death has major implications for tumor cell proliferation and viability, and has inspired several approaches that target survivin for cancer therapy. Analyses in early-branching Metazoa so far propose an exclusive role of survivin as a chromosomal passenger protein, whereas only later during evolution the second, complementary antiapoptotic function might have arisen, concurrent with increased organismal complexity. To lift the veil on the ancestral function(s) of this key regulatory molecule, a survivin homologue of the phylogenetically oldest extant metazoan taxon (phylum Porifera) was identified and functionally characterized. SURVL of the demosponge Suberites domuncula shares significant similarities with its metazoan homologues, ranging from conserved exon/intron structures to the presence of localization signal and protein-interaction domains, characteristic of IAP proteins. Whereas sponge tissue displayed a very low steady-state level, SURVL expression was significantly up-regulated in rapidly proliferating primmorph cells. In addition, challenge of sponge tissue and primmorphs with cadmium and the lipopeptide Pam3Cys-Ser-(Lys)4 stimulated SURVL expression, concurrent with the expression of newly discovered poriferan caspases (CASL and CASL2). Complementary functional analyses in transfected HEK-293 revealed that heterologous expression of poriferan survivin in human cells not only promotes cell proliferation but also augments resistance to cadmium-induced cell death. 
Taken together, these results demonstrate both a deeply evolutionarily conserved and fundamental dual role of survivin, and an equally conserved central position of this key regulatory molecule in interconnected pathways of cell cycle and apoptosis. Additionally, SDCASL, SDCASL2, and SDTILRc (TIR-LRR containing protein) may represent new components of the innate defense sentinel in sponges. SDCASL and SDCASL2 are two new caspase-homolog proteins with a singular structure. In addition to their CASc domains, SDCASL and SDCASL2 feature a small NH2-terminal prodomain (as in effector caspases) and a remarkably long COOH-terminal domain containing one or several functional double-stranded RNA binding domains (dsrm). This new caspase prototype may characterize a caspase specialization coupling pathogen sensing and apoptosis, and could represent a very efficient defense mechanism. SDTILRc also encompasses a unique combination of domains: several leucine-rich repeats (LRR) and a Toll/IL-1 receptor (TIR) domain. This unusual domain association may correspond to a new family of intracellular sensing proteins, forming a subclass of pattern recognition receptors (PRR).
Abstract:
In the last few decades, bioinformatics has played a fundamental role in making sense of the huge amount of data produced. Once the complete sequence of a genome has been obtained, the major problem is to learn as much as possible about its coding regions. Protein sequence annotation is challenging and, given the size of the problem, only computational approaches can provide a feasible solution. As recently pointed out by the Critical Assessment of Function Annotations (CAFA), the most accurate methods are those based on the transfer-by-homology approach, and the most incisive contribution comes from cross-genome comparisons. This thesis describes a non-hierarchical sequence clustering method for automatic large-scale protein annotation, called “The Bologna Annotation Resource Plus” (BAR+). The method is based on an all-against-all alignment of more than 13 million protein sequences, characterized by a very stringent metric. BAR+ can safely transfer functional features (Gene Ontology and Pfam terms) within clusters by means of statistical validation, even in the case of multi-domain proteins. Within BAR+ clusters it is also possible to transfer the three-dimensional structure (when a template is available). This is made possible by cluster-specific HMM profiles that can be used to compute reliable template-to-target alignments even for distantly related proteins (sequence identity < 30%). Other BAR+ based applications were developed during my doctorate, including the prediction of magnesium binding sites in human proteins, the classification of the ABC transporter superfamily and the functional prediction (GO terms) of the CAFA targets. Remarkably, in the CAFA assessment, BAR+ placed among the ten most accurate methods. At present, BAR+ is freely available as a web server for functional and structural protein sequence annotation at http://bar.biocomp.unibo.it/bar2.0.
Abstract:
The present study deals with the population structure and connectivity of the Mediterranean endemic starry ray Raja asterias (Delaroche, 1809) in the Western and Eastern Mediterranean basins. A panel of eight microsatellite loci that cross-amplify in Rajidae (El Nagar, 2010) was used to assess population connectivity and structure. These aims were investigated by analyzing the genetic variation of 9 population samples, for a total of 185 individuals collected during past scientific surveys (MEDITS, GRUND), commercial trawling and also directly at fish markets. The purpose of this thesis is to estimate the genetic divergence between the Mediterranean populations and, in particular, to assess the presence of any barrier (geographic, hydrogeological or biological) to gene flow for this species. Different statistical approaches were applied to this end, evaluating both genetic diversity (nucleotide diversity, allelic richness, observed and expected heterozygosity and the Hardy-Weinberg equilibrium test) and population differentiation patterns (pairwise Fst estimates and population structure analysis). The results obtained from the analysis of the microsatellite dataset suggest a geographic and genetic separation of the starry ray populations of the Mediterranean basin into three or four distinct groups: the Western and Eastern Mediterranean basins, the Sicilian coast, which always clusters as an independent group, and Algeria, which may or may not be considered a further separate group. The data are discussed from both an evolutionary and a conservation point of view and in relation to previous results obtained from the analysis of a mitochondrial marker. A comparison with other Mediterranean demersal skate species was performed in order to better contextualise our results. Finally, our results could offer useful information for protecting vulnerable species such as R. asterias and for developing effective conservation plans in the Mediterranean.
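Two of the diversity measures listed in this abstract, observed and expected heterozygosity, can be illustrated for a single locus as follows. This is only a sketch: the genotypes are made up for the example and are not data from the study, the function names are assumptions, and the expected value uses the simple formula without a small-sample correction.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Expected heterozygosity under Hardy-Weinberg equilibrium:
    one minus the sum of squared allele frequencies."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    total = len(alleles)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Made-up diploid genotypes at one microsatellite locus
# (allele sizes in base pairs); illustrative values only.
locus = [(152, 152), (152, 156), (156, 156), (152, 156)]
h_obs = observed_heterozygosity(locus)
h_exp = expected_heterozygosity(locus)
```

A marked deficit of observed relative to expected heterozygosity is the kind of departure that a Hardy-Weinberg equilibrium test is designed to detect.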
Abstract:
As part of this doctoral thesis, the hybridization between the two Microcebus species M. murinus and M. griseorufus in the ecotone of southeastern Madagascar was investigated extensively and in depth in two focal analyses, using a partial and a complete dataset. For the genetic analyses, the maternally inherited mitochondrial hypervariable region I (HVR 1) and nine nuclear, biparentally inherited microsatellite markers were used. Morphometric data served as a further dataset. The first focal analysis used an existing partial dataset (Hapke 2005 & Gligor 2006) with data from a total of 162 individuals from nine populations of the spiny bush zone, the transitional forest zone and the littoral forest area. In the second focal analysis, an extensive investigation of the Microcebus griseorufus-M. murinus hybrid zone was carried out. For this detailed characterization of the hybrid zone, extensive and fine-scale sampling was conducted in an area defined as the core zone, which comprised the entire transitional forest zone and the adjacent spiny bush areas. The morphometric and genetic data of the newly sampled individuals from this core zone were combined with the data of the partial dataset and further data from littoral forest populations (Hapke 2005) into a complete dataset. Integrating the partial dataset into the complete dataset required comprehensive and time-consuming laboratory and analytical work, which was carried out as part of this doctoral thesis. The complete dataset comprised a total of 569 individuals of the genus Microcebus from 29 study sites. The analysis of morphometric data carried out with both datasets clearly showed that the majority of individuals from the transitional forest zone have an intermediate morphotype.
The Bayesian cluster analyses and assignment tests carried out with the data of the partial dataset, the significant linkage disequilibrium and heterozygote deficit observed predominantly in the populations of the transition zone, the observed distribution of mitochondrial haplotypes and the contrasting pattern between nuclear microsatellite genotypes and mitochondrial haplotypes in the transitional forest populations made it possible, for the first time, to establish on a sound scientific basis the occurrence of a hybrid zone between Microcebus species. The results of this focal analysis were published in the journal Molecular Ecology (Gligor et al. 2009). The hybrid zone identified in the first focal analysis was not only confirmed by the second focal analysis with the genetic and morphometric data of the complete dataset, but also extended to the entire transitional forest zone. Furthermore, strong indications of hybridization between the two Microcebus species were found at some spiny bush sites of the core zone. The large amount of data in the complete dataset, especially from the core zone of the study area, made it possible to carry out a well-founded characterization of the Microcebus griseorufus-M. murinus hybrid zone. The correspondence of the hybrid zone with the observed vegetation mosaic, together with the results of the PCA, the PCoA and the Bayesian cluster analysis, supports the “mosaic hybrid zone” model, whereas considering the mosaic-like distributed intermediate transitional forests individually revealed a high abundance of hybrids and thus rather supports the “bounded hybrid superiority model”. The geographic sampling scale chosen could therefore influence the observed structure of a hybrid zone. One of the most striking patterns in the hybrid zone is the strongly contrasting cyto-nuclear pattern.
The climate change that has been progressing in southern Madagascar for about 3000 years and the associated eastward expansion of the range of the species Microcebus griseorufus, the male-biased dispersal in M. griseorufus established in this work, and the influence of exogenous selection argue strongly for a massive asymmetric nuclear introgression of M. griseorufus alleles into M. murinus populations, combined with a potential displacement of the species M. murinus from the transitional forest zone. In the respective core areas of spiny bush and littoral forest, however, the distinctness of the two species is preserved.
Abstract:
The aim of clustering is therefore to identify meaningful structures in the data, and it is precisely from this definition that this thesis work began, providing an innovative and unexplored approach to clustering: not searching for the relationship, but reasoning about what is not one. Observing a set of data, what does non-relationship represent? It is a difficult question to ask, yet one that intrinsically contains its answer, namely the independence of each single datum from all the others. The search for independence among the data thus led us to the statistical approach to data, since independence is well described and proven in statistics. For a point in a dataset to be considered “free of links/relationships”, it must have the same probability of being present in every spatial element of the entire dataset. Mathematically speaking, every point P in a space S has the same probability of falling in a region R; this means that such a point can RANDOMLY lie within any region of the dataset. From this assumption the thesis work begins, divided into several parts. The second chapter analyses the state of the art of clustering, set against the growing problem of data volume: with the spread of the network, knowledge bases have grown exponentially both in terms of attributes (dimensions) and in terms of quantity of data (Big Data). The third chapter recalls the theoretical and statistical concepts used by the statistical algorithms implemented. The fourth chapter gives the details of the implementation of the algorithms, describing the various phases of investigation, the reasons behind the architectural choices and the considerations that led to the exclusion of one of the 3 versions implemented.
In the fifth chapter, algorithms 2 and 3 are compared with some algorithms from the literature in order to demonstrate the potential and the problems of the algorithm developed; these tests are qualitative, since the objective of the thesis is to show how a statistical approach can prove to be a winning weapon, not to provide a new algorithm usable across the various clustering problems. The sixth chapter draws conclusions on the work done and lists possible future developments from which the newly begun research on statistical clustering could grow.
Abstract:
Epigenetic variability is a new mechanism for the study of human microevolution, because it creates phenotypic diversity both within an individual and within a population. This mechanism constitutes an important reservoir for adaptation in response to new stimuli, and recent studies have demonstrated that selective pressures shape not only the genetic code but also DNA methylation profiles. The aim of this thesis is to study the role of DNA methylation changes in human adaptive processes, considering the Italian peninsula and macro-geographical areas. A whole-genome analysis of DNA methylation profiles across the Italian peninsula identified some genes whose methylation levels differ between individuals of different Italian districts (South, Centre and North of Italy). These genes are involved in nitrogen compound metabolism and in the response to pathogens. Considering individuals with different macro-geographical origins (individuals of Asian, European and African ancestry), more significant DMRs (differentially methylated regions) were identified; they are located in genes involved in glucuronidation, in immune response and in cell communication processes. A "profile" of each ancestry (African, Asian and European) was described. Moreover, an in-depth analysis of three candidate genes (KRTCAP3, MAD1L and BRSK2) was performed in a cohort of individuals from different countries (Morocco, Nigeria, China and Philippines) living in Bologna, in order to explore genetic and epigenetic diversity. This thesis has also paved the way for the application of DNA methylation to the study of historical remains and in particular to the age estimation of individuals starting from biological samples (such as teeth or blood).
Notably, a mathematical model based on methylation values of DNA extracted from the cementum and pulp of living individuals can estimate chronological age with high accuracy (the median absolute difference between the age estimated from DNA methylation and chronological age was 1.2 years).
Abstract:
Holding the major share of stellar mass in galaxies and being also old and passively evolving, early-type galaxies (ETGs) are the primary probes for investigating these various evolution scenarios, as well as a useful means of gaining insights into cosmological parameters. In this thesis work I focused specifically on ETGs and on their capability to constrain galaxy formation and evolution; in particular, the principal aims were to derive some of the ETGs' evolutionary parameters, such as age, metallicity and star formation history (SFH), and to study their age-redshift and mass-age relations. In order to infer galaxy physical parameters, I used the public code STARLIGHT: this program provides a best fit to the observed spectrum from a combination of many theoretical models defined in user-made libraries. The comparison between the output and input light-weighted ages shows good agreement starting from SNRs of ∼ 10, with a bias of ∼ 2.2% and a dispersion 3%. Furthermore, metallicities and SFHs are also well reproduced. In the second part of the thesis I performed an analysis on real data, starting from Sloan Digital Sky Survey (SDSS) spectra. I found that galaxies get older with cosmic time and with increasing mass (for a fixed redshift bin); absolute light-weighted ages, instead, turn out to be independent of the fitting parameters or the synthetic models used. Metallicities, instead, are very similar to each other and clearly consistent with the ones derived from the Lick indices. The predicted SFH indicates the presence of a double burst of star formation. Velocity dispersions and extinctions are also well constrained, following the expected behaviours. As a further step, I also fitted single SDSS spectra (with SNR ∼ 20), to verify that stacked spectra gave the same results without introducing any bias: this is an important check if one wants to apply the method at higher z, where stacked spectra are necessary to increase the SNR.
Our upcoming aim is to apply this approach also to galaxy spectra obtained from higher-redshift surveys, such as BOSS (z ∼ 0.5), zCOSMOS (z 1), K20 (z ∼ 1), GMASS (z ∼ 1.5) and, eventually, Euclid (z 2). Indeed, I am currently carrying out a preliminary study to establish the applicability of the method to lower-resolution, as well as higher-redshift (z 2), spectra, just like the Euclid ones.
Abstract:
We have investigated the use of hierarchical clustering of flow cytometry data to classify samples of conventional central chondrosarcoma, a malignant cartilage-forming tumor of uncertain cellular origin, according to similarities with surface marker profiles of several known cell types. Human primary chondrosarcoma cells, articular chondrocytes, mesenchymal stem cells, fibroblasts, and a panel of tumor cell lines of chondrocytic or epithelial origin were clustered based on the expression profile of eleven surface markers. For clustering, eight hierarchical clustering algorithms, three distance metrics, as well as several approaches for data preprocessing, including multivariate outlier detection, logarithmic transformation, and z-score normalization, were systematically evaluated. By selecting clustering approaches shown to give reproducible results for cluster recovery of known cell types, primary conventional central chondrosarcoma cells could be grouped into two main clusters with distinctive marker expression signatures: one group clustering together with mesenchymal stem cells (CD49b-high/CD10-low/CD221-high) and a second group clustering close to fibroblasts (CD49b-low/CD10-high/CD221-low). Hierarchical clustering also revealed substantial differences between primary conventional central chondrosarcoma cells and established chondrosarcoma cell lines, with the latter not only segregating apart from primary tumor cells and normal tissue cells, but clustering together with cell lines of epithelial lineage. Our study provides a foundation for the use of hierarchical clustering applied to flow cytometry data as a powerful tool to classify samples according to marker expression patterns, which could lead to the discovery of new cancer subtypes.
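One of the many preprocessing and clustering combinations of the kind evaluated in this study can be sketched as follows: a logarithmic transformation, z-score normalization per marker, Euclidean distance and average linkage. The choice of this particular combination, the function names and any input values are assumptions made for illustration; the abstract does not state which algorithms and metrics were finally selected.

```python
from math import log10, sqrt
from statistics import mean, pstdev


def preprocess(rows):
    """Log-transform intensities, then z-score each marker (column)."""
    logged = [[log10(v) for v in row] for row in rows]
    columns = list(zip(*logged))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against constant markers
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in logged]


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rows, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest mean pairwise distance until n_clusters remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean(euclidean(rows[a], rows[b])
                         for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

Each row is one sample and each column one surface marker, with strictly positive intensities so that the logarithm is defined. The function returns the sample indices grouped by cluster; swapping `euclidean` or the linkage rule for another choice is how the alternative combinations would be compared.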