17 resultados para Data modeling
Resumo:
The recent advent of Next-generation sequencing technologies has revolutionized the way of analyzing the genome. This innovation allows to get deeper information at a lower cost and in less time, and provides data that are discrete measurements. One of the most important applications with these data is the differential analysis, that is investigating if one gene exhibit a different expression level in correspondence of two (or more) biological conditions (such as disease states, treatments received and so on). As for the statistical analysis, the final aim will be statistical testing and for modeling these data the Negative Binomial distribution is considered the most adequate one especially because it allows for "over dispersion". However, the estimation of the dispersion parameter is a very delicate issue because few information are usually available for estimating it. Many strategies have been proposed, but they often result in procedures based on plug-in estimates, and in this thesis we show that this discrepancy between the estimation and the testing framework can lead to uncontrolled first-type errors. We propose a mixture model that allows each gene to share information with other genes that exhibit similar variability. Afterwards, three consistent statistical tests are developed for differential expression analysis. We show that the proposed method improves the sensitivity of detecting differentially expressed genes with respect to the common procedures, since it is the best one in reaching the nominal value for the first-type error, while keeping elevate power. The method is finally illustrated on prostate cancer RNA-seq data.
Resumo:
The kinematics is a fundamental tool to infer the dynamical structure of galaxies and to understand their formation and evolution. Spectroscopic observations of gas emission lines are often used to derive rotation curves and velocity dispersions. It is however difficult to disentangle these two quantities in low spatial-resolution data because of beam smearing. In this thesis, we present 3D-Barolo, a new software to derive the gas kinematics of disk galaxies from emission-line data-cubes. The code builds tilted-ring models in the 3D observational space and compares them with the actual data-cubes. 3D-Barolo works with data at a wide range of spatial resolutions without being affected by instrumental biases. We use 3D-Barolo to derive rotation curves and velocity dispersions of several galaxies in both the local and the high-redshift Universe. We run our code on HI observations of nearby galaxies and we compare our results with 2D traditional approaches. We show that a 3D approach to the derivation of the gas kinematics has to be preferred to a 2D approach whenever a galaxy is resolved with less than about 20 elements across the disk. We moreover analyze a sample of galaxies at z~1, observed in the H-alpha line with the KMOS/VLT spectrograph. Our 3D modeling reveals that the kinematics of these high-z systems is comparable to that of local disk galaxies, with steeply-rising rotation curves followed by a flat part and H-alpha velocity dispersions of 15-40 km/s over the whole disks. This evidence suggests that disk galaxies were already fully settled about 7-8 billion years ago. In summary, 3D-Barolo is a powerful and robust tool to separate physical and instrumental effects and to derive a reliable kinematics. The analysis of large samples of galaxies at different redshifts with 3D-Barolo will provide new insights on how galaxies assemble and evolve throughout cosmic time.