8 resultados para Large-scale Structure Of Universe
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
In the thesis we present the implementation of the quadratic maximum likelihood (QML) method, ideal to estimate the angular power spectrum of the cross-correlation between cosmic microwave background (CMB) and large scale structure (LSS) maps as well as their individual auto-spectra. Such a tool is an optimal method (unbiased and with minimum variance) in pixel space and goes beyond all the previous harmonic analysis present in the literature. We describe the implementation of the QML method in the {\it BolISW} code and demonstrate its accuracy on simulated maps throughout a Monte Carlo. We apply this optimal estimator to WMAP 7-year and NRAO VLA Sky Survey (NVSS) data and explore the robustness of the angular power spectrum estimates obtained by the QML method. Taking into account the shot noise and one of the systematics (declination correction) in NVSS, we can safely use most of the information contained in this survey. On the contrary we neglect the noise in temperature since WMAP is already cosmic variance dominated on the large scales. Because of a discrepancy in the galaxy auto spectrum between the estimates and the theoretical model, we use two different galaxy distributions: the first one with a constant bias $b$ and the second one with a redshift dependent bias $b(z)$. Finally, we make use of the angular power spectrum estimates obtained by the QML method to derive constraints on the dark energy critical density in a flat $\Lambda$CDM model by different likelihood prescriptions. When using just the cross-correlation between WMAP7 and NVSS maps with 1.8° resolution, we show that $\Omega_\Lambda$ is about the 70\% of the total energy density, disfavouring an Einstein-de Sitter Universe at more than 2 $\sigma$ CL (confidence level).
Resumo:
Flood disasters are a major cause of fatalities and economic losses, and several studies indicate that global flood risk is currently increasing. In order to reduce and mitigate the impact of river flood disasters, the current trend is to integrate existing structural defences with non structural measures. This calls for a wider application of advanced hydraulic models for flood hazard and risk mapping, engineering design, and flood forecasting systems. Within this framework, two different hydraulic models for large scale analysis of flood events have been developed. The two models, named CA2D and IFD-GGA, adopt an integrated approach based on the diffusive shallow water equations and a simplified finite volume scheme. The models are also designed for massive code parallelization, which has a key importance in reducing run times in large scale and high-detail applications. The two models were first applied to several numerical cases, to test the reliability and accuracy of different model versions. Then, the most effective versions were applied to different real flood events and flood scenarios. The IFD-GGA model showed serious problems that prevented further applications. On the contrary, the CA2D model proved to be fast and robust, and able to reproduce 1D and 2D flow processes in terms of water depth and velocity. In most applications the accuracy of model results was good and adequate to large scale analysis. Where complex flow processes occurred local errors were observed, due to the model approximations. However, they did not compromise the correct representation of overall flow processes. In conclusion, the CA model can be a valuable tool for the simulation of a wide range of flood event types, including lowland and flash flood events.
Resumo:
Redshift Space Distortions (RSD) are an apparent anisotropy in the distribution of galaxies due to their peculiar motion. These features are imprinted in the correlation function of galaxies, which describes how these structures distribute around each other. RSD can be represented by a distortions parameter $\beta$, which is strictly related to the growth of cosmic structures. For this reason, measurements of RSD can be exploited to give constraints on the cosmological parameters, such us for example the neutrino mass. Neutrinos are neutral subatomic particles that come with three flavours, the electron, the muon and the tau neutrino. Their mass differences can be measured in the oscillation experiments. Information on the absolute scale of neutrino mass can come from cosmology, since neutrinos leave a characteristic imprint on the large scale structure of the universe. The aim of this thesis is to provide constraints on the accuracy with which neutrino mass can be estimated when expoiting measurements of RSD. In particular we want to describe how the error on the neutrino mass estimate depends on three fundamental parameters of a galaxy redshift survey: the density of the catalogue, the bias of the sample considered and the volume observed. In doing this we make use of the BASICC Simulation from which we extract a series of dark matter halo catalogues, characterized by different value of bias, density and volume. This mock data are analysed via a Markov Chain Monte Carlo procedure, in order to estimate the neutrino mass fraction, using the software package CosmoMC, which has been conveniently modified. In this way we are able to extract a fitting formula describing our measurements, which can be used to forecast the precision reachable in future surveys like Euclid, using this kind of observations.
Resumo:
The continuous increase of genome sequencing projects produced a huge amount of data in the last 10 years: currently more than 600 prokaryotic and 80 eukaryotic genomes are fully sequenced and publically available. However the sole sequencing process of a genome is able to determine just raw nucleotide sequences. This is only the first step of the genome annotation process that will deal with the issue of assigning biological information to each sequence. The annotation process is done at each different level of the biological information processing mechanism, from DNA to protein, and cannot be accomplished only by in vitro analysis procedures resulting extremely expensive and time consuming when applied at a this large scale level. Thus, in silico methods need to be used to accomplish the task. The aim of this work was the implementation of predictive computational methods to allow a fast, reliable, and automated annotation of genomes and proteins starting from aminoacidic sequences. The first part of the work was focused on the implementation of a new machine learning based method for the prediction of the subcellular localization of soluble eukaryotic proteins. The method is called BaCelLo, and was developed in 2006. The main peculiarity of the method is to be independent from biases present in the training dataset, which causes the over‐prediction of the most represented examples in all the other available predictors developed so far. This important result was achieved by a modification, made by myself, to the standard Support Vector Machine (SVM) algorithm with the creation of the so called Balanced SVM. BaCelLo is able to predict the most important subcellular localizations in eukaryotic cells and three, kingdom‐specific, predictors were implemented. In two extensive comparisons, carried out in 2006 and 2008, BaCelLo reported to outperform all the currently available state‐of‐the‐art methods for this prediction task. BaCelLo was subsequently used to completely annotate 5 eukaryotic genomes, by integrating it in a pipeline of predictors developed at the Bologna Biocomputing group by Dr. Pier Luigi Martelli and Dr. Piero Fariselli. An online database, called eSLDB, was developed by integrating, for each aminoacidic sequence extracted from the genome, the predicted subcellular localization merged with experimental and similarity‐based annotations. In the second part of the work a new, machine learning based, method was implemented for the prediction of GPI‐anchored proteins. Basically the method is able to efficiently predict from the raw aminoacidic sequence both the presence of the GPI‐anchor (by means of an SVM), and the position in the sequence of the post‐translational modification event, the so called ω‐site (by means of an Hidden Markov Model (HMM)). The method is called GPIPE and reported to greatly enhance the prediction performances of GPI‐anchored proteins over all the previously developed methods. GPIPE was able to predict up to 88% of the experimentally annotated GPI‐anchored proteins by maintaining a rate of false positive prediction as low as 0.1%. GPIPE was used to completely annotate 81 eukaryotic genomes, and more than 15000 putative GPI‐anchored proteins were predicted, 561 of which are found in H. sapiens. In average 1% of a proteome is predicted as GPI‐anchored. A statistical analysis was performed onto the composition of the regions surrounding the ω‐site that allowed the definition of specific aminoacidic abundances in the different considered regions. Furthermore the hypothesis that compositional biases are present among the four major eukaryotic kingdoms, proposed in literature, was tested and rejected. All the developed predictors and databases are freely available at: BaCelLo http://gpcr.biocomp.unibo.it/bacello eSLDB http://gpcr.biocomp.unibo.it/esldb GPIPE http://gpcr.biocomp.unibo.it/gpipe
Resumo:
The Geoffroy’s bat Myotis emarginatus is mainly present in southern, south-eastern and central Europe (Červerný, 1999) and is often recorded from northern Spain (Quetglas, 2002; Flaquer et al., 2004). It has demonstrated the species’ preference for forest. Myotis capaccinii, confined to the Mediterranean (Guille´n, 1999), is classified as ‘vulnerable’ on a global scale (Hutson, Mickleburgh & Racey, 2001). In general, the species preferred calm waters bordered by well-developed riparian vegetation and large (> 5 m) inter-bank distances (Biscardi et al. 2007). In this study we present the first results about population genetic structure of these two species of genus Myotis. We used two methods of sampling: invasive and non-invasive techniques. A total of 323 invasive samples and a total of 107 non-invasive samples were collected and analyzed. For Myotis emarginatus we have individuated for the first time a set of 7 microsatellites, which can work on this species, started from a set developed on Myotis myotis (Castella et al. 2000). We developed also a method for analysis of non-invasive samples, that given a good percentage of positive analyzed samples. The results have highlighted for the species Myotis emarginatus the presence on the European territory of two big groups, discovered by using the microsatellites tracers. On this species, 33 haplotypes of Dloop have been identified, some of them are presented only in some colonies. We identified respectively 33 haplotypes of Dloop and 10 of cytB for Myotis emarginatus and 25 of dloop and 15 of cytB for Myotis capaccinii. Myotis emarginatus’ results, both microsatellites and mtDNA, show that there is a strong genetic flow between different colonies across Europe. The results achieved on Myotis capaccinii are very interesting, in this case either for the microsatellites or the mitochondrial DNA sequences, and it has been highlighted a big difference between different colonies.
Resumo:
Bioinformatics, in the last few decades, has played a fundamental role to give sense to the huge amount of data produced. Obtained the complete sequence of a genome, the major problem of knowing as much as possible of its coding regions, is crucial. Protein sequence annotation is challenging and, due to the size of the problem, only computational approaches can provide a feasible solution. As it has been recently pointed out by the Critical Assessment of Function Annotations (CAFA), most accurate methods are those based on the transfer-by-homology approach and the most incisive contribution is given by cross-genome comparisons. In the present thesis it is described a non-hierarchical sequence clustering method for protein automatic large-scale annotation, called “The Bologna Annotation Resource Plus” (BAR+). The method is based on an all-against-all alignment of more than 13 millions protein sequences characterized by a very stringent metric. BAR+ can safely transfer functional features (Gene Ontology and Pfam terms) inside clusters by means of a statistical validation, even in the case of multi-domain proteins. Within BAR+ clusters it is also possible to transfer the three dimensional structure (when a template is available). This is possible by the way of cluster-specific HMM profiles that can be used to calculate reliable template-to-target alignments even in the case of distantly related proteins (sequence identity < 30%). Other BAR+ based applications have been developed during my doctorate including the prediction of Magnesium binding sites in human proteins, the ABC transporters superfamily classification and the functional prediction (GO terms) of the CAFA targets. Remarkably, in the CAFA assessment, BAR+ placed among the ten most accurate methods. At present, as a web server for the functional and structural protein sequence annotation, BAR+ is freely available at http://bar.biocomp.unibo.it/bar2.0.
Resumo:
The wide diffusion of cheap, small, and portable sensors integrated in an unprecedented large variety of devices and the availability of almost ubiquitous Internet connectivity make it possible to collect an unprecedented amount of real time information about the environment we live in. These data streams, if properly and timely analyzed, can be exploited to build new intelligent and pervasive services that have the potential of improving people's quality of life in a variety of cross concerning domains such as entertainment, health-care, or energy management. The large heterogeneity of application domains, however, calls for a middleware-level infrastructure that can effectively support their different quality requirements. In this thesis we study the challenges related to the provisioning of differentiated quality-of-service (QoS) during the processing of data streams produced in pervasive environments. We analyze the trade-offs between guaranteed quality, cost, and scalability in streams distribution and processing by surveying existing state-of-the-art solutions and identifying and exploring their weaknesses. We propose an original model for QoS-centric distributed stream processing in data centers and we present Quasit, its prototype implementation offering a scalable and extensible platform that can be used by researchers to implement and validate novel QoS-enforcement mechanisms. To support our study, we also explore an original class of weaker quality guarantees that can reduce costs when application semantics do not require strict quality enforcement. We validate the effectiveness of this idea in a practical use-case scenario that investigates partial fault-tolerance policies in stream processing by performing a large experimental study on the prototype of our novel LAAR dynamic replication technique. Our modeling, prototyping, and experimental work demonstrates that, by providing data distribution and processing middleware with application-level knowledge of the different quality requirements associated to different pervasive data flows, it is possible to improve system scalability while reducing costs.
Resumo:
The Large Magellanic Cloud (LMC) is widely considered as the first step of the cosmological distance ladder, since it contains many different distance indicators. An accurate determination of the distance to the LMC allows one to calibrate these distance indicators that are then used to measure the distance to far objects. The main goal of this thesis is to study the distance and structure of the LMC, as traced by different distance indicators. For these purposes three types of distance indicators were chosen: Classical Cepheids,``hot'' eclipsing binaries and RR Lyrae stars. These objects belong to different stellar populations tracing, in turn, different sub-structures of the LMC. The RR Lyrae stars (age >10 Gyr) are distributed smoothly and likely trace the halo of the LMC. Classical Cepheids are young objects (age 50-200 Myr), mainly located in the bar and spiral arm of the galaxy, while ``hot'' eclipsing binaries mainly trace the star forming regions of the LMC. Furthermore, we have chosen these distance indicators for our study, since the calibration of their zero-points is based on fundamental geometric methods. The ESA cornerstone mission Gaia, launched on 19 December 2013, will measure trigonometric parallaxes for one billion stars with an accuracy of 20 micro-arcsec at V=15 mag, and 200 micro-arcsec at V=20 mag, thus will allow us to calibrate the zero-points of Classical Cepheids, eclipsing binaries and RR Lyrae stars with an unprecedented precision.