2 results for Bachelor of arts degree
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The main aim of this Ph.D. dissertation is the study of clustering dependent data by means of copula functions, with particular emphasis on microarray data. Copula functions are a popular multivariate modeling tool in every field where multivariate dependence is of great interest, yet their use in clustering has not yet been investigated. The first part of this work reviews the literature on clustering methods, copula functions and microarray experiments. Attention focuses on the K-means (Hartigan, 1975; Hartigan and Wong, 1979), hierarchical (Everitt, 1974) and model-based (Fraley and Raftery, 1998, 1999, 2000, 2007) clustering techniques, because their performance is compared. Then the probabilistic interpretation of Sklar's theorem (Sklar, 1959), estimation methods for copulas such as Inference for Margins (Joe and Xu, 1996), and the Archimedean and Elliptical copula families are presented. Finally, applications of clustering methods and copulas to genetic and microarray experiments are highlighted. The second part contains the original contribution. A simulation study is performed to evaluate the performance of the K-means and hierarchical bottom-up clustering methods in identifying clusters according to the dependence structure of the data-generating process. Simulations are run under varying conditions (e.g., the kind of margins (distinct, overlapping and nested) and the value of the dependence parameter), and the results are evaluated by means of different performance measures. In light of the simulation results and of the limits of the two investigated clustering methods, a new clustering algorithm based on copula functions ('CoClust' in brief) is proposed. The basic idea, the iterative procedure of the CoClust, and a description of the R functions written for it, with their output, are given.
The CoClust algorithm is tested on simulated data (by varying the number of clusters, the copula models, the dependence parameter value and the degree of overlap of margins) and is compared with model-based clustering using different performance measures, such as the percentage of correctly identified numbers of clusters and the non-rejection percentage of H0 on . It is shown that the CoClust algorithm overcomes all the observed limits of the other investigated clustering techniques and is able to identify clusters according to the dependence structure of the data, independently of the degree of overlap of margins and the strength of the dependence. The CoClust uses a criterion based on the maximized log-likelihood function of the copula and can virtually account for any possible dependence relationship between observations. Several distinctive features of the CoClust are shown, e.g. its capability of identifying the true number of clusters and the fact that it does not require a starting classification. Finally, the CoClust algorithm is applied to the real microarray data of Hedenfalk et al. (2001), both to the gene expressions observed in three different cancer samples and to the columns (tumor samples) of the whole data matrix.
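To make the idea of a criterion based on the maximized copula log-likelihood more concrete, the following is a minimal Python sketch. It is not the CoClust algorithm and not the dissertation's R code. It assumes a bivariate Clayton copula (one member of the Archimedean family), margins that are already uniform on (0, 1], and a simple grid search in place of a numerical optimizer. All function names are hypothetical.

```python
import math
import random


def sample_clayton(n, theta, rng):
    """Draw n pairs (u, v) from a bivariate Clayton copula by conditional inversion."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
        w = 1.0 - rng.random()  # uniform on (0, 1], plays the role of C(v | u)
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs


def clayton_loglik(pairs, theta):
    """Log-likelihood of the Clayton copula density at dependence parameter theta > 0."""
    total = 0.0
    for u, v in pairs:
        s = u ** (-theta) + v ** (-theta) - 1.0
        total += (math.log1p(theta)
                  - (1.0 + theta) * (math.log(u) + math.log(v))
                  - (2.0 + 1.0 / theta) * math.log(s))
    return total


def fit_theta(pairs, grid):
    """Return the grid value of theta that maximizes the copula log-likelihood."""
    return max(grid, key=lambda t: clayton_loglik(pairs, t))


rng = random.Random(0)           # fixed seed for reproducibility
true_theta = 2.0                 # assumed "true" dependence parameter
data = sample_clayton(2000, true_theta, rng)
grid = [0.1 * k for k in range(1, 61)]  # candidate values 0.1, 0.2, ..., 6.0
theta_hat = fit_theta(data, grid)
max_loglik = clayton_loglik(data, theta_hat)
print(theta_hat, max_loglik)
```

The maximized value `max_loglik` is the kind of quantity a likelihood-based criterion could compare across candidate groupings. How CoClust actually builds and compares those groupings is described in the dissertation itself and is not reproduced here.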
Abstract:
An integrated array of analytical methods (including clay mineralogy, vitrinite reflectance, Raman spectroscopy on carbonaceous material, and apatite fission-track analysis) was employed to constrain the thermal and thermochronological evolution of selected portions of the Pontides of northern Turkey. (1) A multimethod investigation was applied for the first time to characterise the thermal history of the Karakaya Complex, a Permo-Triassic subduction-accretion complex cropping out throughout the Sakarya Zone. The results indicate two different thermal regimes: the Lower Karakaya Complex (Nilüfer Unit), mostly made of metabasite and marble, experienced peak temperatures of 300-500°C (greenschist facies); the Upper Karakaya Complex (Hodul and Orhanlar Units), mostly made of greywacke and arkose, yielded heterogeneous peak temperatures (125-376°C), possibly the result of different degrees of involvement of the units in the complex dynamic processes of the accretionary wedge. Contrary to common belief, the results of this study indicate that the entire Karakaya Complex experienced metamorphic conditions. Moreover, the good degree of correlation among the results of these methods demonstrates that Raman spectroscopy on carbonaceous material can be applied successfully to temperature ranges of 200-330°C, thus extending the application of this method from higher-grade metamorphic contexts to lower-grade metamorphic conditions. (2) Apatite fission-track (AFT) analysis was applied to the Sakarya and İstanbul Zones in order to constrain the exhumation history and the timing of amalgamation of these two exotic terranes. AFT ages from the İstanbul and Sakarya terranes record three distinct episodes of exhumation related to the complex tectonic evolution of the Pontides. (i) Paleocene - early Eocene ages (62.3-50.3 Ma) reflect the closure of the İzmir-Ankara ocean and the ensuing collision between the Sakarya terrane and the Anatolide-Tauride Block.
(ii) Late Eocene - earliest Oligocene ages (43.5-32.3 Ma) reflect renewed tectonic activity along the İzmir-Ankara. (iii) Late Oligocene - Early Miocene ages reflect the onset and development of the northern Aegean extension. The consistency of AFT ages, both north and south of the tectonic contact between the İstanbul and Sakarya terranes, suggests that these terranes were amalgamated in pre-Cenozoic times. (3) Fission-track analysis was also applied to rock samples from the Marmara region, in an attempt to constrain the inception and development of the North Anatolian Fault system in the region. The results agree with those from the central Pontides. The youngest AFT ages (Late Oligocene - Early Miocene) were recorded in the western portion of the Marmara Sea region and reflect the onset and development of northern Aegean extension. Fission-track data from the eastern Marmara Sea region indicate rapid Early Eocene exhumation induced by the development of the İzmir-Ankara orogenic wedge. Thermochronological data along the trace of the Ganos Fault, a segment of the North Anatolian Fault system, indicate the presence of a tectonic discontinuity active by Late Oligocene time, i.e. well before the arrival of the North Anatolian Fault system in the area. The integration of thermochronologic data with preexisting structural data points to the existence of a system of major E-W-trending structural discontinuities active at least from the Late Oligocene. In the Early Pliocene, inception of the present-day North Anatolian Fault system in the Marmara region occurred by reactivation of these older tectonic structures.