4 results for Unbalanced Starting
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology. It allows the expression of thousands of genes to be quantified simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes that are strongly correlated with the group of individuals. Even though methods to analyse the data are now well developed and close to reaching a standard organization (through the effort of proposed international projects such as the Microarray Gene Expression Data (MGED) Society [1]), it is not uncommon to come across a clinician's question for which no compelling statistical method is available. The contribution of this dissertation to deciphering disease is the development of new approaches aimed at handling open problems posed by clinicians in specific experimental designs. Chapter 1, starting from a necessary biological introduction, reviews microarray technologies and all the important steps of an experiment, from the production of the array to quality control and the preprocessing steps that are used in the data analysis in the rest of the dissertation.
Chapter 2 provides a critical review of standard analysis methods, stressing most of the problems that [...]. Chapter 3 introduces a method to address the issue of unbalanced design in microarray experiments. In microarray experiments, experimental design is a crucial starting point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists of a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each probe is given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to that of SAM and LIMMA [3] over two simulated data sets generated via beta and exponential distributions. The results of all three algorithms over low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding Chronic Lymphocytic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well over low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. Chapter 4 describes a method to address the evaluation of similarities in a three-class problem by means of the Relevance Vector Machine [4].
In fact, when looking at microarray data in a prognostic and diagnostic clinical framework, differences are not the only thing that can play a crucial role. In some cases similarities can give useful, and sometimes even more important, information. The goal, given three classes, could be to establish, with a certain level of confidence, whether the third one is similar to the first or to the second. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to the limitations of standard supervised classification. RVM offers many advantages compared, for example, with its well-known precursor, the Support Vector Machine (SVM) [3]. Among these advantages, the estimate of the posterior probability of class membership is a key feature for addressing the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a three-class tumor grade problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then to evaluate the third class, G2, as a test set to obtain the probability that G2 samples belong to class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to breast cancer samples of grade 1. In the literature this result had been conjectured, but no measure of significance had been given before.
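The MultiSAM resampling scheme described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: a Welch t-statistic with an arbitrary cut-off stands in for the SAM analysis, and the names `welch_t`, `multisam_scores`, `threshold` and `seed` are hypothetical.

```python
import random
import statistics


def welch_t(a, b):
    """Welch t-statistic for two lists of values.

    Stand-in for the per-probe test; this is NOT the SAM statistic.
    Each list needs at least two values.
    """
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    if se == 0.0:
        return 0.0
    return (statistics.mean(a) - statistics.mean(b)) / se


def multisam_scores(lpc, mpc, n_iter=1000, threshold=4.0, seed=0):
    """Count, per probe, how often it is called differentially expressed.

    lpc, mpc: lists of samples; each sample is a list of probe values.
    threshold: assumed cut-off on |t| (hypothetical, not from the source).
    Returns one score per probe, ranging from 0 to n_iter.
    """
    rng = random.Random(seed)
    n_probes = len(lpc[0])
    # Per-probe values of the less populated class never change.
    lpc_cols = [[sample[p] for sample in lpc] for p in range(n_probes)]
    scores = [0] * n_probes
    for _ in range(n_iter):
        # Draw a random subset of the MPC with the same size as the LPC.
        subset = rng.sample(mpc, len(lpc))
        for p in range(n_probes):
            mpc_col = [sample[p] for sample in subset]
            if abs(welch_t(lpc_cols[p], mpc_col)) > threshold:
                scores[p] += 1
    return scores


if __name__ == "__main__":
    # Toy data: probe 0 is strongly shifted in the LPC, probe 1 is not.
    lpc = [[10.0 + 0.1 * i, 0.1 * i] for i in range(5)]
    mpc = [[0.1 * (j % 5), 0.1 * (j % 5)] for j in range(30)]
    print(multisam_scores(lpc, mpc, n_iter=50))  # → [50, 0]
```

Probes would then be ranked by score and filtered with a cut-off, such as the score >300 out of 1,000 mentioned in the abstract.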
Abstract:
The study presented in this work investigates the effects of two common static balancing techniques on the dynamic performance of closed-chain linkages, taking into account the compliance of the mechanism components. The long-term goal of the research is to determine an optimal balancing strategy for parallel spatial manipulators. The present contribution is a starting point and focuses on the planar four-bar linkage, regarded as the simplest example of a closed-chain mechanism. The elastodynamic behaviour of an unbalanced four-bar linkage and of two balanced ones, obtained by mass balancing and elastic balancing respectively, is investigated by means of both numerical simulations and experimental tests. The purpose of this work is to obtain preliminary results, to be refined and broadened in future developments.
Abstract:
The present doctoral thesis is structured as a collection of three essays. The first essay, “SOC(HE)-Italy: a classification for graduate occupations”, presents the conceptual basis, the construction, the validation and the application to the Italian labour force of the occupational classification termed SOC(HE)-Italy. I developed this classification under the supervision of Kate Purcell during my period as a visiting research student at the Warwick Institute for Employment Research. This classification links the constituent tasks and duties of a particular job to the relevant knowledge and skills imparted via Higher Education (HE). It is based on SOC(HE)2010, an occupational classification first proposed by Kate Purcell in 2013, but is constructed differently. In the second essay, “Assessing the incidence and wage effects of overeducation among Italian graduates using a new measure for educational requirements”, I use this classification to build a valid and reliable measure of job requirements. The lack of an unbiased measure for this dimension is one of the major obstacles to achieving a generally accepted measurement of overeducation. Estimates of overeducation incidence and wage effects are run on AlmaLaurea data from the survey on graduates' career paths. I wrote this essay and obtained these estimates with the help and guidance of Giovanni Guidetti and Giulio Pedrini. The third and last essay, titled “Overeducation in the Italian labour market: clarifying the concepts and addressing the measurement error problem”, addresses a number of theoretical issues concerning the concepts of educational mismatch and overeducation. Using Istat data from the RCFL survey, I run estimates of the ORU model for the whole Italian labour force. To my knowledge, this is the first time that such a model has been estimated on this population. In addition, I adopt the new measure of overeducation based on the SOC(HE)-Italy classification.
Abstract:
Topoisomerase I (Top1) poisons are among the most clinically effective drugs used for colon, ovary and lung cancers. Unpublished data from our lab have recently revealed that the structurally unrelated Top1 poisons Camptothecin (CPT) and Indimitecan (LMP776) induce the formation of micronuclei (MNi) in human cancer cells. In addition, MNi trigger an innate immune gene response by stimulating the cGAS/STING pathway. As the mechanisms of MNi formation are not fully determined, our aim here is to establish how MNi form after Top1 poisoning. Using immunofluorescence assays and EdU labelling of nascent DNA, our results show that, after 24 hours of recovery, a short treatment with sub-cytotoxic doses of Top1 poisons induces the formation of MNi that do not contain newly synthesized (EdU+) DNA. We also saw that Top1 poisons slow the replication machinery, reducing EdU incorporation, and produce significant levels of the damage markers γH2AX and p53BP1 in S-phase cells but not in G1 and G2/M cells. The results also show that MNi formation is dependent on R-loops, as RNaseH1 overexpression markedly reduces Top1-induced MNi. Genome-wide mapping of R-loops by the DRIP-seq technique revealed that CPT both decreases and increases R-loop levels, depending on the region. In particular, increased R-loops are mainly found at active genes and always overlap with Top1cc sites. We also found that increased R-loops overlap with lamina-associated chromatin domains, while decreased R-loops correlate with replication origin sites. Overall, our data are consistent with the formation of MNi due to R-loop increase and under-replication at specific regions caused by Top1 poisons. These results will eventually help in developing new strategies for effective personalized interventions using Top1-targeted compounds as immuno-modulators in cancer patients.