9 results for Data Generation

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance: 60.00%

Abstract:

Biological data are inherently interconnected: protein sequences are linked to their annotations, the annotations are structured into ontologies, and so on. While protein-protein interactions are already represented as graphs, in this work I present how a graph structure can be used to enrich the annotation of protein sequences through algorithms that analyze the graph topology. I also describe a novel solution to restrict the data generation needed to build such a graph, based on constraints on the data and dynamic programming; the proposed algorithm ideally improves the generation time by a factor of 5. The graph representation is then exploited to build a comprehensive database using the rising technology of graph databases. While graph databases are widely used for other kinds of data, from Twitter tweets to recommendation systems, their application to bioinformatics is new. A graph database is proposed, with a structure that can be easily expanded and queried.
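The idea of enriching protein annotations by walking the graph topology can be sketched with a toy adjacency structure. This is only a minimal illustration: all identifiers and edges below are invented, and the thesis's actual algorithm (with data constraints and dynamic programming) is far richer.

```python
from collections import defaultdict

# Toy heterogeneous graph: proteins -> annotations -> parent ontology terms.
# All identifiers here are hypothetical.
edges = [
    ("P1", "GO:0001"), ("P2", "GO:0001"), ("P2", "GO:0002"),
    ("GO:0001", "GO:0010"), ("GO:0002", "GO:0010"),  # is-a links in the ontology
]

adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)

def enriched_annotations(protein):
    """Direct annotations plus every ancestor reached by walking the ontology."""
    seen, stack = set(), list(adj[protein])
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(adj[term])
    return seen

print(sorted(enriched_annotations("P1")))  # direct term plus its ontology parent
```

A graph database would store the same adjacency natively and answer such reachability queries directly, which is what makes the representation convenient to expand and query.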

Relevance: 40.00%

Abstract:

The quality of temperature and humidity retrievals from the infrared SEVIRI sensors on the geostationary Meteosat Second Generation (MSG) satellites is assessed by means of a one-dimensional variational (1D-VAR) algorithm. The study is performed with the aim of improving the spatial and temporal resolution of available observations to feed analysis systems designed for high-resolution, regional-scale numerical weather prediction (NWP) models. The non-hydrostatic forecast model COSMO (COnsortium for Small-scale MOdelling) in the ARPA-SIM operational configuration is used to provide background fields. Only clear-sky observations over sea are processed. An optimised 1D-VAR set-up comprising the two water vapour channels and the three window channels is selected. It maximises the reduction of errors in the model backgrounds while ensuring ease of operational implementation through accurate bias-correction procedures and correct radiative transfer simulations. The 1D-VAR retrieval quality is first quantified in relative terms, employing statistics to estimate the reduction in the background model errors. Additionally, the absolute retrieval accuracy is assessed by comparing the analysis with independent radiosonde and satellite observations. The inclusion of satellite data brings a substantial reduction in the warm and dry biases present in the forecast model. Moreover, it is shown that the retrieval profiles generated by the 1D-VAR are well correlated with the radiosonde measurements. Subsequently, the 1D-VAR technique is applied to two three-dimensional case studies: a false-alarm case that occurred in Friuli-Venezia Giulia on 8 July 2004 and a heavy-precipitation case that occurred in the Emilia-Romagna region between 9 and 12 April 2005. The impact of satellite data for these two events is evaluated in terms of increments in the column-integrated water vapour and saturation water vapour, in the 2-metre temperature and specific humidity, and in the surface temperature.
To improve the 1D-VAR technique, a method to calculate flow-dependent model error covariance matrices is also assessed. The approach employs members from an ensemble forecast system generated by perturbing physical parameterisation schemes inside the model. The improved set-up applied to the case of 8 July 2004 shows a substantially neutral impact.
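The analysis step behind a variational retrieval can be illustrated with a linear one-step sketch. This is not the operational SEVIRI configuration: the matrices below are toy values, and in practice the observation operator H is a full radiative transfer model rather than a linear map.

```python
import numpy as np

def one_d_var(xb, B, H, R, y):
    """Linear 1D-Var analysis: xa = xb + B H^T (H B H^T + R)^-1 (y - H xb)."""
    innovation = y - H @ xb                             # observation minus background
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)        # gain matrix
    return xb + K @ innovation

xb = np.array([280.0, 270.0])   # hypothetical background temperatures (K)
B  = np.diag([4.0, 4.0])        # background error covariance
H  = np.array([[0.5, 0.5]])     # toy linear observation operator
R  = np.array([[1.0]])          # observation error covariance
y  = np.array([277.0])          # one observed "brightness temperature"

xa = one_d_var(xb, B, H, R, y)
print(xa)
```

The flow-dependent extension mentioned above would replace the static B with a covariance estimated from ensemble perturbations.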

Relevance: 30.00%

Abstract:

The inherently stochastic character of most physical quantities involved in engineering models has led to an ever-increasing interest in probabilistic analysis. Many approaches to stochastic analysis have been proposed; however, it is widely acknowledged that the only universal method available to solve accurately any kind of stochastic mechanics problem is Monte Carlo simulation. One of the key steps in the implementation of this technique is the accurate and efficient generation of samples of the random processes and fields involved in the problem at hand. In the present thesis, an original method for the simulation of homogeneous, multi-dimensional, multi-variate, non-Gaussian random fields is proposed. The algorithm has proved very accurate in matching both the target spectrum and the marginal probability. Its computational efficiency and robustness are also very good, even when dealing with strongly non-Gaussian distributions. What is more, the resulting samples possess all the relevant, well-defined and desired properties of "translation fields", including crossing rates and distributions of extremes. The topic of the second part of the thesis lies in the field of non-destructive parametric structural identification. Its objective is to evaluate the mechanical characteristics of the constituent bars in existing truss structures, using static loads and strain measurements. In the cases of missing data and of damage affecting only a small portion of a bar, Genetic Algorithms have proved to be an effective tool for solving the problem.

Relevance: 30.00%

Abstract:

Subduction zones are the most common settings for tsunamigenic earthquakes, where friction between the oceanic and continental plates causes strong seismicity. The topics and methodologies discussed in this thesis are focused on understanding the rupture process of the seismic sources of great earthquakes that generate tsunamis. Tsunamigenesis is controlled by several kinematic characteristics of the parent earthquake, such as the focal mechanism, the depth of the rupture and the slip distribution along the fault area, and by the mechanical properties of the source zone. Each of these factors plays a fundamental role in tsunami generation. Therefore, inferring the source parameters of tsunamigenic earthquakes is crucial to understand the generation of the consequent tsunami and thus to mitigate the risk along the coasts. The typical way to gather information on the source process is to invert the available geophysical data. Tsunami data, moreover, are useful to constrain the portion of the fault area that extends offshore, generally close to the trench, which other kinds of data are unable to constrain. In this thesis I discuss the rupture process of some recent tsunamigenic events, as inferred by means of an inverse method. I first present the 2003 Tokachi-Oki (Japan) earthquake (Mw 8.1). In this study the slip distribution on the fault has been inferred by inverting tsunami waveform, GPS and bottom-pressure data. The joint inversion of tsunami and geodetic data constrains the slip distribution on the fault much better than the separate inversions of the single datasets. We then studied the earthquake that occurred in 2007 in southern Sumatra (Mw 8.4). By inverting several tsunami waveforms, in both the near and the far field, we determined the slip distribution and the mean rupture velocity along the causative fault.
Since the largest patch of slip was concentrated on the deepest part of the fault, this is the likely reason for the small tsunami waves that followed the earthquake, pointing out how crucial a role the depth of the rupture plays in controlling tsunamigenesis. Finally, we present a new rupture model for the great 2004 Sumatra earthquake (Mw 9.2). We performed a joint inversion of tsunami waveform, GPS and satellite altimetry data to infer the slip distribution, the slip direction and the rupture velocity on the fault. Furthermore, in this work we present a novel method to estimate, in a self-consistent way, the average rigidity of the source zone. The estimation of the source-zone rigidity is important since it may play a significant role in tsunami generation; particularly for slow earthquakes, a low rigidity value is sometimes necessary to explain how an earthquake with a relatively low seismic moment may generate a significant tsunami. This latter point may be relevant for explaining the mechanics of tsunami earthquakes, one of the open issues in present-day seismology. The investigation of these tsunamigenic earthquakes has underlined the importance of using a joint inversion of different geophysical data to determine the rupture characteristics. The results shown here have important implications for the implementation of new tsunami warning systems (particularly in the near field), for the improvement of the current ones, and for the planning of inundation maps for tsunami-hazard assessment along coastal areas.
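A slip inversion of this kind is, at its core, a linear least-squares problem d = G m, where G collects Green's functions mapping slip on each fault patch to the observed waveform samples. The sketch below uses invented numbers and plain least squares; real inversions add smoothing, non-negativity constraints and carefully computed Green's functions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_patch = 40, 4

# Hypothetical Green's functions: each column is one fault patch's
# contribution to the 40 waveform samples.
G = rng.random((n_obs, n_patch))
m_true = np.array([0.0, 2.0, 5.0, 1.0])              # "true" slip (m) per patch
d = G @ m_true + 0.01 * rng.standard_normal(n_obs)   # noisy synthetic data

# Least-squares estimate of the slip distribution from the data.
m_est, *_ = np.linalg.lstsq(G, d, rcond=None)
print(np.round(m_est, 2))
```

A joint inversion simply stacks the rows of G and d coming from different datasets (tsunami, GPS, altimetry), which is why it constrains m better than any single dataset alone.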

Relevance: 30.00%

Abstract:

This thesis deals with the design of advanced OFDM systems. Both waveform and receiver design have been treated. The main scope of the thesis is to study, create and propose ideas and novel design solutions able to cope with the weaknesses and crucial aspects of modern OFDM systems. Starting from the transmitter side, the problem represented by low resilience to non-linear distortion has been assessed. A novel technique that considerably reduces the Peak-to-Average Power Ratio (PAPR), yielding a quasi-constant signal envelope in the time domain (PAPR close to 1 dB), has been proposed. The proposed technique, named Rotation Invariant Subcarrier Mapping (RISM), is a novel scheme for subcarrier data mapping in which the symbols belonging to the modulation alphabet are not anchored, but maintain some degrees of freedom. In other words, a bit tuple is not mapped onto a single point; rather, it is mapped onto a geometrical locus which is totally or partially rotation invariant. The final positions of the transmitted complex symbols are chosen by an iterative optimization process in order to minimize the PAPR of the resulting OFDM symbol. Numerical results confirm that RISM makes OFDM usable even in severely non-linear channels. Another well-known problem that has been tackled is the vulnerability to synchronization errors. Indeed, in an OFDM system an accurate recovery of carrier frequency and symbol timing is crucial for the proper demodulation of the received packets. In general, timing and frequency synchronization is performed in two separate phases, called PRE-FFT and POST-FFT synchronization. Regarding the PRE-FFT phase, a novel joint symbol-timing and carrier-frequency synchronization algorithm has been presented. The proposed algorithm is characterized by very low hardware complexity and, at the same time, guarantees very good performance in both AWGN and multipath channels.
Regarding the POST-FFT phase, a novel approach for both pilot structure and receiver design has been presented. In particular, a novel pilot pattern has been introduced in order to minimize the occurrence of overlaps between two pattern-shifted replicas. This makes it possible to replace conventional pilots with nulls in the frequency domain, introducing the so-called Silent Pilots. As a result, the optimal receiver turns out to be very robust against severe Rayleigh-fading multipath and is characterized by low complexity. The performance of this approach has been evaluated both analytically and numerically. Comparing the proposed approach with state-of-the-art alternatives, in both AWGN and multipath fading channels, considerable performance improvements have been obtained. The crucial problem of channel estimation has been thoroughly investigated, with particular emphasis on the decimation of the Channel Impulse Response (CIR) through the selection of the Most Significant Samples (MSSs). In this context our contribution is twofold: on the theoretical side, we derived lower bounds on the estimation mean-square error (MSE) performance for any MSS selection strategy; on the receiver-design side, we proposed novel MSS selection strategies which have been shown to approach these MSE lower bounds and to outperform the state-of-the-art alternatives. Finally, the possibility of using Single Carrier Frequency Division Multiple Access (SC-FDMA) in the broadband satellite return channel has been assessed. Notably, SC-FDMA is able to improve the physical-layer spectral efficiency with respect to the single-carrier systems used so far in the Return Channel Satellite (RCS) standards. However, it requires strict synchronization and is also sensitive to the phase noise of local radio-frequency oscillators. For this reason, an effective pilot-tone arrangement within the SC-FDMA frame and a novel Joint Multi-User (JMU) estimation method for SC-FDMA have been proposed.
As shown by numerical results, the proposed scheme manages to satisfy strict synchronization requirements and to guarantee proper demodulation of the received signal.
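The PAPR that RISM attacks is easy to measure on a plain OFDM symbol. The sketch below computes it for an ordinary QPSK mapping (the baseline, not the RISM scheme itself); the subcarrier count and oversampling factor are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sub, oversample = 64, 4

# Random QPSK symbols on the subcarriers (unit energy).
bits = rng.integers(0, 4, n_sub)
qpsk = np.exp(1j * (np.pi / 4 + np.pi / 2 * bits))

# Zero-pad in frequency to oversample the time-domain envelope,
# so peaks between Nyquist-rate samples are not missed.
spectrum = np.zeros(n_sub * oversample, dtype=complex)
spectrum[: n_sub // 2] = qpsk[: n_sub // 2]
spectrum[-(n_sub // 2):] = qpsk[n_sub // 2:]
x = np.fft.ifft(spectrum)

# PAPR: peak instantaneous power over mean power, in dB.
papr_db = 10 * np.log10(np.max(np.abs(x) ** 2) / np.mean(np.abs(x) ** 2))
print(f"PAPR = {papr_db:.2f} dB")
```

Plain OFDM symbols like this one typically land far above the roughly 1 dB envelope the thesis reports for RISM, which is the motivation for the remapping.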

Relevance: 30.00%

Abstract:

Background. Human small cell lung cancer (SCLC), accounting for approximately 15-20% of all lung cancers, is an aggressive tumor with a high propensity for early regional and distant metastases. Although the initial tumor response rate to chemotherapy is very high, SCLC relapses after approximately 4 months in ED and 12 months in LD. Basal cell carcinoma (BCC) is the most prevalent cancer in the western world, and its incidence is increasing worldwide. This type of cancer rarely metastasizes and its death rate is extraordinarily low. Surgery is curative for most patients, but for those who develop locally advanced or metastatic BCC there is currently no effective treatment. Both types of cancer have been deeply investigated, and genetic alterations, MYCN amplification (MA) among the most interesting, have been found. These could become targets of new pharmacological therapies. Procedures. We created and characterized novel BLI xenograft orthotopic mouse models of SCLC to evaluate tumor onset and progression and the efficacy of new pharmacological strategies. We compared an in vitro model with a transgenic mouse model of BCC to investigate and delineate the canonical HH signalling pathway and its connections with other molecular pathways. Results and conclusions. The orthotopic models showed latency and progression patterns similar to the human disease. Chemotherapy treatments improved survival rates and validated the in vivo model. The presence of MA and overexpression was confirmed in each model, and we tested the efficacy of a new MYCN inhibitor in vitro. Preliminary data from the BCC models highlighted the role of the Hedgehog pathway and underlined the importance of both in vitro and in vivo strategies to achieve a better understanding of the pathology and to evaluate the applicability of new therapeutic compounds.

Relevance: 30.00%

Abstract:

The discovery of the Cosmic Microwave Background (CMB) radiation in 1965 is one of the fundamental milestones supporting the Big Bang theory. The CMB is one of the most important sources of information in cosmology. The excellent accuracy of the recent CMB data from the WMAP and Planck satellites has confirmed the validity of the standard cosmological model and set a new challenge for the data analysis processes and their interpretation. In this thesis we deal with several aspects and useful tools of the data analysis, focusing on their optimization in order to fully exploit the Planck data and contribute to the final published results. The issues investigated are: the change of coordinates of CMB maps using the HEALPix package; the problem of the aliasing effect in the generation of low-resolution maps; and the comparison of the Angular Power Spectrum (APS) extraction performance of the optimal QML method, implemented in the code called BolPol, with that of the pseudo-Cl method, implemented in Cromaster. The QML method has then been applied to the Planck data at large angular scales to extract the CMB APS. The same method has also been applied to analyze the TT parity and Low Variance anomalies in the Planck maps, showing a consistent deviation from the standard cosmological model; the possible origins of these results are discussed. The Cromaster code, instead, has been applied to the 408 MHz and 1.42 GHz surveys, focusing on the analysis of the APS of selected regions of the synchrotron emission. The new generation of CMB experiments will be dedicated to polarization measurements, which require high-accuracy devices for separating the polarizations. Here a new technology, called Photonic Crystals, is exploited to develop a new polarization-splitter device, and its performance is compared to that of the devices used nowadays.
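The quantity both QML and pseudo-Cl estimate can be illustrated with the ideal full-sky estimator: given harmonic coefficients a_lm of a real field, C_l_hat = 1/(2l+1) * sum_m |a_lm|^2. The simulation below draws a_lm from a fiducial spectrum and checks unbiasedness; it ignores noise, beams and sky cuts, which are exactly what the QML and pseudo-Cl machinery handles.

```python
import numpy as np

rng = np.random.default_rng(3)

def estimate_cl(ell, c_true):
    """Full-sky APS estimate at one multipole from simulated a_lm."""
    a_l0 = rng.normal(0.0, np.sqrt(c_true))               # m = 0 is real
    re = rng.normal(0.0, np.sqrt(c_true / 2), ell)        # m = 1..l, real parts
    im = rng.normal(0.0, np.sqrt(c_true / 2), ell)        # m = 1..l, imag parts
    alm2 = np.concatenate(([a_l0**2], re**2 + im**2))
    # For a real field |a_l,-m| = |a_lm|, so m<0 terms double the m>0 sum.
    return (alm2[0] + 2.0 * alm2[1:].sum()) / (2 * ell + 1)

# Average over many realisations: the estimator should recover c_true = 1.
mean_cl = np.mean([estimate_cl(50, 1.0) for _ in range(500)])
print(mean_cl)
```

Each single realisation scatters around the truth with cosmic variance 2/(2l+1); only the ensemble mean converges, which is why low-l APS comparisons like QML versus pseudo-Cl are statistically delicate.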

Relevance: 30.00%

Abstract:

The recent advent of next-generation sequencing technologies has revolutionized the way the genome is analyzed. This innovation makes it possible to obtain deeper information at a lower cost and in less time, and provides data that are discrete measurements. One of the most important applications of these data is differential analysis, that is, investigating whether a gene exhibits a different expression level under two (or more) biological conditions (such as disease states, treatments received and so on). As for the statistical analysis, the final aim is statistical testing, and for modeling these data the Negative Binomial distribution is considered the most adequate one, especially because it allows for overdispersion. However, the estimation of the dispersion parameter is a very delicate issue because little information is usually available for estimating it. Many strategies have been proposed, but they often result in procedures based on plug-in estimates, and in this thesis we show that this discrepancy between the estimation and the testing framework can lead to uncontrolled type-I errors. We propose a mixture model that allows each gene to share information with other genes that exhibit similar variability. Afterwards, three consistent statistical tests are developed for differential expression analysis. We show that the proposed method improves the sensitivity of detecting differentially expressed genes with respect to common procedures, since it is the best at reaching the nominal value for the type-I error while keeping high power. The method is finally illustrated on prostate cancer RNA-seq data.
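The overdispersion that motivates the Negative Binomial model is easy to see by simulation: with mean mu and dispersion alpha, the NB variance is mu + alpha*mu^2, strictly above the Poisson value mu. The parameter values below are illustrative only, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, alpha = 100.0, 0.1   # hypothetical mean count and dispersion for one gene

# NumPy parameterises the NB by n successes and success probability p;
# for mean mu and dispersion alpha: n = 1/alpha, p = n / (n + mu).
n = 1.0 / alpha
p = n / (n + mu)
counts = rng.negative_binomial(n, p, size=200_000)

print(counts.mean())   # close to mu = 100
print(counts.var())    # close to mu + alpha*mu**2 = 1100, far above the mean
```

A Poisson model would force the variance to equal the mean, so plugging in a poorly estimated alpha directly changes the null distribution of any test statistic, which is the discrepancy the thesis addresses.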

Relevance: 30.00%

Abstract:

Pediatric acute myeloid leukemia (AML) is a molecularly heterogeneous disease that arises from genetic alterations in pathways regulating self-renewal and myeloid differentiation. While the majority of patients carry recurrent chromosomal translocations, almost 20% of childhood AML cases do not show any recognizable cytogenetic alteration and are defined as cytogenetically normal (CN)-AML. CN-AML patients have always shown great variability in response to therapy and overall outcome, underlining the presence of unknown genetic changes, not detectable by conventional analyses but relevant for the pathogenesis and outcome of AML. The development of novel genome-wide techniques, such as next-generation sequencing, has tremendously improved our ability to interrogate the cancer genome. Based on this background, the aim of this research study was to investigate the mutational landscape of pediatric CN-AML patients negative for all the currently known somatic mutations reported in AML, through whole-transcriptome sequencing (RNA-seq). RNA-seq performed on diagnostic leukemic blasts from 19 pediatric CN-AML cases revealed a considerable incidence of cryptic chromosomal rearrangements, with the identification of 21 putative fusion genes. Several of the fusion genes identified in this study are recurrent and might have prognostic and/or therapeutic relevance. A paradigm is the CBFA2T3-GLIS2 fusion, which has been demonstrated to be a common alteration in pediatric CN-AML, predicting poor outcome. Important findings have also been obtained in the identification of novel therapeutic targets. On one side, the identification of the NUP98-JARID1A fusion suggests the use of disulfiram; on the other, we describe alterations activating tyrosine kinases, providing functional data supporting the use of tyrosine kinase inhibitors to specifically inhibit leukemia cells.
This study provides new insights into the genetic alterations underlying pediatric AML, defines novel prognostic markers and putative therapeutic targets, and prospectively ensures a correct risk stratification and risk-adapted therapy also for the "all-neg" AML subgroup.