14 results for Complex data
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology. It allows the expression of thousands of genes to be quantified simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes that are strongly correlated with the group of individuals. Even though methods to analyse the data are now well developed and close to reaching a standard organization (through the efforts of international projects such as the Microarray Gene Expression Data (MGED) Society [1]), it is not infrequent to come across a clinician's question for which no compelling statistical method exists. The contribution of this dissertation to deciphering disease is the development of new approaches aimed at handling open problems posed by clinicians in specific experimental designs. In Chapter 1, starting from a necessary biological introduction, we review microarray technologies and all the important steps involved in an experiment, from the production of the array to quality control, ending with the preprocessing steps that will be used in the data analysis in the rest of the dissertation.
In Chapter 2, a critical review of standard analysis methods is provided, stressing most of the problems that [...]. In Chapter 3, a method is introduced to address the issue of unbalanced design in microarray experiments. In microarray experiments, experimental design is a crucial starting point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists of a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each single probe is given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to the performance of SAM and LIMMA [3] over two data sets simulated via beta and exponential distributions. The results of all three algorithms over low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding Chronic Lymphocytic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well over low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4, a method to address similarity evaluation in a three-class problem by means of the Relevance Vector Machine [4] is described.
In fact, looking at microarray data in a prognostic and diagnostic clinical framework, differences are not the only thing that can play a crucial role. In some cases similarities can give useful, and sometimes even more important, information. The goal, given three classes, could be to establish, with a certain level of confidence, whether the third one is similar to the first or to the second one. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to the limitations of standard supervised classification. In fact, RVM offers many advantages compared, for example, with its well-known precursor, the Support Vector Machine (SVM) [3]. Among these advantages, the estimate of the posterior probability of class membership represents a key feature for addressing the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a three-class tumor grade problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then to evaluate the third class, G2, as a test set to obtain the probability for samples of G2 to be members of class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to breast cancer samples of grade 1. This result had been conjectured in the literature, but no measure of significance had been given before.
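The resampling scheme behind MultiSAM, as described in this abstract, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain Welch t-statistic with a hypothetical cut-off `threshold` stands in for the actual SAM significance call, and the function names `welch_t` and `recurrence_scores` are invented here, not taken from the thesis.

```python
import random
import statistics


def welch_t(a, b):
    """Welch t-statistic between two lists of expression values.

    Stand-in for SAM's statistic (an assumption of this sketch).
    Assumes both lists have at least two values and are not all constant.
    """
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def recurrence_scores(lpc, mpc, n_iter=1000, threshold=4.0, seed=0):
    """Count, per probe, how often it is called differentially expressed.

    lpc, mpc: lists of samples (each a list of probe values) for the less
    and the more populated class. threshold is a hypothetical cut-off on |t|.
    """
    rng = random.Random(seed)
    n_probes = len(lpc[0])
    scores = [0] * n_probes
    for _ in range(n_iter):
        # Draw a random subset of the larger class, same size as the smaller one.
        subset = rng.sample(mpc, len(lpc))
        for j in range(n_probes):
            t = welch_t([s[j] for s in lpc], [s[j] for s in subset])
            if abs(t) > threshold:
                scores[j] += 1
    return scores
```

With `n_iter=1000`, each probe receives a score between 0 and 1,000, which can then be thresholded (the abstract mentions score >300) to select candidate probes.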
Abstract:
Because of their particular position and complex geological history, the Northern Apennines have been considered a natural laboratory in which to apply several kinds of investigation. However, it is complicated to combine all the knowledge about the Northern Apennines into a single picture that explains the structural and geological processes that produced them. The main goal of this thesis is to put together all the information on the deformation of this region, in the crust and at depth, and to describe a geodynamic model that accounts for it. To do so, we analyzed the pattern of deformation in the crust and in the mantle. In both cases the deformation was studied using information recovered from earthquakes, although with different techniques. In particular, the shallower deformation was studied using seismic moment tensor information. For our purpose we used the method described in Arvidsson and Ekstrom (1998), which, by including surface waves in the inversion [and not only body waves, as in the Centroid Moment Tensor method (Dziewonski et al., 1981)], makes it possible to determine seismic source parameters for earthquakes with magnitudes as small as 4.0. We applied this tool in the Northern Apennines, and through this activity we built the Italian CMT dataset (Pondrelli et al., 2006) and the pattern of seismic deformation, using the Kostrov (1974) method on a regular grid of 0.25-degree cells. We obtained a map of the lateral variations of the pattern of seismic deformation for different depth layers, taking into account the fact that shallow earthquakes (within 15 km depth) occur everywhere in the region, while most events with a deeper hypocenter (15-40 km) occur only in the outer part of the belt, on the Adriatic side. For the analysis of the deep deformation, i.e. that which occurred in the mantle, we used the anisotropy information characterizing the structure below the Northern Apennines.
Anisotropy is an Earth property that, in the crust, is due to the presence of aligned fluid-filled cracks or of alternating isotropic layers with different elastic properties, while in the mantle the most important cause of seismic anisotropy is the lattice preferred orientation (LPO) of mantle minerals such as olivine. The latter is a highly anisotropic mineral and tends to align its fast crystallographic axis (a-axis) parallel to the asthenospheric flow in response to the finite strain induced by geodynamic processes. The seismic anisotropy pattern of a region is measured using the shear wave splitting phenomenon (the seismological analogue of optical birefringence). To do so, we apply the Sileny and Plomerova (1996) approach to teleseismic earthquakes recorded at stations located in the study region. The results are analyzed on the basis of their lateral and vertical variations to better define the Earth structure beneath the Northern Apennines. We find different anisotropic domains, a Tuscany one and an Adria one, with a pattern of seismic anisotropy that varies laterally in a way similar to the seismic deformation. Moreover, beneath the Adriatic region the distribution of the splitting parameters is so complex that it requires a dedicated analysis. We therefore applied to our data the code of Menke and Levin (2003), which makes it possible to search for different models of structures with multilayer anisotropy. We found that the structure beneath the Po Plain is probably even more complicated than expected. On the basis of the results obtained in this thesis, together with those from previous works, we suggest that slab roll-back, which created the Apennines and opened the Tyrrhenian Sea, evolved at the northern boundary of the Northern Apennines in a different way from its southern part. In particular, the trench retreat developed primarily south of our study region, with an eastward roll-back.
In the northern portion of the orogen, after a first stage during which the retreat was perpendicular to the trench, it became oblique with respect to the structure.
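For reference, the Kostrov (1974) summation used in this abstract to map seismic deformation is commonly written as below. This is the standard textbook form, not an expression quoted from the thesis. The symbols are notation introduced for this illustration: $\mu$ is the shear modulus, $V$ the volume of one grid cell (presumably a 0.25-degree cell times the thickness of a depth layer), and $M_{ij}^{(k)}$ the moment tensor of the $k$-th of $N$ earthquakes in that cell.

```latex
\bar{\varepsilon}_{ij} \;=\; \frac{1}{2\,\mu\,V} \sum_{k=1}^{N} M_{ij}^{(k)}
```

Dividing additionally by the time span covered by the catalogue would give an average strain rate rather than a strain.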
Abstract:
An integrated array of analytical methods, including clay mineralogy, vitrinite reflectance, Raman spectroscopy on carbonaceous material, and apatite fission-track analysis, was employed to constrain the thermal and thermochronological evolution of selected portions of the Pontides of northern Turkey. (1) A multimethod investigation was applied for the first time to characterise the thermal history of the Karakaya Complex, a Permo-Triassic subduction-accretion complex cropping out throughout the Sakarya Zone. The results indicate two different thermal regimes: the Lower Karakaya Complex (Nilüfer Unit), mostly made of metabasite and marble, suffered peak temperatures of 300-500°C (greenschist facies); the Upper Karakaya Complex (Hodul and Orhanlar Units), mostly made of greywacke and arkose, yielded heterogeneous peak temperatures (125-376°C), possibly the result of different degrees of involvement of the units in the complex dynamic processes of the accretionary wedge. Contrary to common belief, the results of this study indicate that the entire Karakaya Complex suffered metamorphic conditions. Moreover, the good degree of correlation among the results of these methods demonstrates that Raman spectroscopy on carbonaceous material can be applied successfully to temperature ranges of 200-330°C, thus extending the application of this method from higher-grade metamorphic contexts to lower-grade metamorphic conditions. (2) Apatite fission-track (AFT) analysis was applied to the Sakarya and İstanbul Zones in order to constrain the exhumation history and the timing of amalgamation of these two exotic terranes. AFT ages from the İstanbul and Sakarya terranes recorded three distinct episodes of exhumation related to the complex tectonic evolution of the Pontides. (i) Paleocene - early Eocene ages (62.3-50.3 Ma) reflect the closure of the İzmir-Ankara ocean and the ensuing collision between the Sakarya terrane and the Anatolide-Tauride Block.
(ii) Late Eocene - earliest Oligocene ages (43.5-32.3 Ma) reflect renewed tectonic activity along the İzmir-Ankara. (iii) Late Oligocene - Early Miocene ages reflect the onset and development of the northern Aegean extension. The consistency of AFT ages, both north and south of the tectonic contact between the İstanbul and Sakarya terranes, suggests that these terranes were amalgamated in pre-Cenozoic times. (3) Fission-track analysis was also applied to rock samples from the Marmara region, in an attempt to constrain the inception and development of the North Anatolian Fault system in the region. The results agree with those from the central Pontides. The youngest AFT ages (Late Oligocene - Early Miocene) were recorded in the western portion of the Marmara Sea region and reflect the onset and development of northern Aegean extension. Fission-track data from the eastern Marmara Sea region indicate rapid Early Eocene exhumation induced by the development of the İzmir-Ankara orogenic wedge. Thermochronological data along the trace of the Ganos Fault, a segment of the North Anatolian Fault system, indicate the presence of a tectonic discontinuity active by Late Oligocene time, i.e. well before the arrival of the North Anatolian Fault system in the area. The integration of thermochronologic data with preexisting structural data points to the existence of a system of major E-W-trending structural discontinuities active at least from the Late Oligocene. In the Early Pliocene, inception of the present-day North Anatolian Fault system in the Marmara region occurred by reactivation of these older tectonic structures.
Abstract:
In this thesis the Marsili back-arc basin and the Palinuro Volcanic Complex (Southern Tyrrhenian Sea) have been investigated using magnetic, bathymetric and gravimetric data. A new velocity model for the opening of the Marsili basin has been proposed, highlighting the transition from the horizontal spreading of the back-arc to the vertical accretion of the Marsili seamount. By introducing gravity data, Marsili's internal structure has been modeled, and a large portion of the volcano with low density and vanishing magnetization has been detected. Forward modeling of the Palinuro Volcanic Complex showed that Palinuro represents the shallowest evidence of a deep tectonic discontinuity and the possible transition domain between the oceanic crust of the Marsili Basin and the continental crust related to the Apenninic chain.
Abstract:
The southern Apennines of Italy have experienced several destructive earthquakes in both historic and recent times. The present-day seismicity, characterized by small-to-moderate magnitude earthquakes, was used as a probe to obtain a deeper knowledge of the fault structures where the largest earthquakes occurred in the past. With the aim of inferring a three-dimensional seismic image, both the problem of data quality and the selection of a reliable and robust tomographic inversion strategy were addressed. Data quality was ensured by developing optimized procedures for the measurement of P- and S-wave arrival times, through the use of polarization filtering and the application of a refined re-picking technique based on waveform cross-correlation. An iterative, linearized, damped tomographic inversion technique, combined with a multiscale inversion strategy, was adopted. The retrieved P-wave velocity model indicates the presence of a strong velocity variation along a direction orthogonal to the Apenninic chain. This variation defines two domains, characterized by relatively low and relatively high velocity values. From the comparison of the inferred P-wave velocity model with a portion of a structural section available in the literature, the high-velocity body was correlated with the Apulia carbonate platforms, whereas the low-velocity bodies were associated with the basinal deposits. The deduced Vp/Vs ratio is lower than 1.8 in the shallower part of the model, while for depths between 5 km and 12 km it increases up to 2.1 in the area of higher seismicity. This confirms that areas characterized by higher values are more prone to generate earthquakes, in response to the presence of fluids and higher pore pressures.
Abstract:
The present PhD thesis focused on the development and application of a chemical methodology (Py-GC-MS) and of a data-processing method based on multivariate data analysis (chemometrics). The chromatographic and mass spectrometric data obtained with this technique are particularly suitable for interpretation by chemometric methods such as PCA (Principal Component Analysis) for data exploration and SIMCA (Soft Independent Models of Class Analogy) for classification. As a first approach, some issues related to the field of cultural heritage were discussed, with particular attention to the differentiation of binders used in painting. A marker of egg tempera, esterified phosphoric acid, a pyrolysis product of lecithin, was determined using HMDS (hexamethyldisilazane) rather than TMAH (tetramethylammonium hydroxide) as the derivatizing reagent. The validity of analytical pyrolysis as a tool to characterize and classify different types of bacteria was verified. The FAME chromatographic profiles represent an important tool for bacterial identification. Because of the complexity of the chromatograms, it was possible to characterize the bacteria only according to their genus, while differentiation at the species level was achieved by means of chemometric analysis. To perform this study, the normalized peak areas of the fatty acids were taken into account. Chemometric methods were applied to the experimental datasets. The results demonstrate the effectiveness of analytical pyrolysis and chemometric analysis for the rapid characterization of bacterial species. The approach was also applied to samples of bacterial (Pseudomonas mendocina), fungal (Pleurotus ostreatus) and mixed biofilms. A comparison of the chromatographic profiles established the possibility to:
• Differentiate the bacterial and fungal biofilms according to the FAME profile.
• Characterize the fungal biofilm by means of the typical pattern of pyrolytic fragments derived from saccharides present in the cell wall.
• Identify the markers of bacterial and fungal biofilm in the same mixed-biofilm sample.
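The exploratory step mentioned in this abstract, PCA on normalized peak areas, can be illustrated with a small sketch. It assumes each chromatogram is reduced to a list of fatty-acid peak areas. It uses only the Python standard library and a power iteration to stay self-contained, whereas real chemometric work would normally rely on a numerical library. The function names and the normalization to unit sum are assumptions of this example, not details from the thesis.

```python
import math


def normalize_rows(areas):
    """Scale each chromatogram so that its peak areas sum to 1."""
    normalized = []
    for row in areas:
        total = sum(row)
        normalized.append([a / total for a in row])
    return normalized


def first_principal_component(data, n_iter=200):
    """Return (loading vector, sample scores) for PC1 via power iteration."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in data]
    # Sample covariance matrix of the centred peak areas.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Non-constant start vector: after unit-sum normalization the constant
    # vector lies in the null space of the covariance matrix.
    v = [float(j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centred]
    return v, scores
```

Samples with similar fatty-acid profiles end up with similar PC1 scores, which is the kind of grouping that PCA score plots are used to reveal.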
Abstract:
The relevance of human joint models has been shown in the literature. In particular, the great importance of models for simulating joint passive motion (i.e. motion under virtually unloaded conditions) has been outlined. They clarify the role played by the principal anatomical structures of the articulation, improving the understanding of surgical treatments, and in particular the design of total ankle replacements and ligament reconstruction. Equivalent rigid link mechanisms have proved to be an efficient tool for an accurate simulation of the joint passive motion. This thesis focuses on the ankle complex (i.e. the anatomical structure composed of the tibiotalar and the subtalar joints), which has a considerable role in human locomotion. The lack of interpretative models of this articulation and the poor results of total ankle replacement arthroplasty have strongly suggested devising new mathematical models capable of reproducing the restraining function of each structure of the joint and of replicating the relative motion of the bones which constitute the joint itself. In this context, novel equivalent mechanisms are proposed for modelling the ankle passive motion. Their geometry is based on the joint’s anatomical structures. In particular, the role of the main ligaments of the articulation is investigated under passive conditions by means of nine 5-5 fully parallel mechanisms. Based on this investigation, a one-DOF spatial mechanism is developed for modelling the passive motion of the lower leg. The model considers many passive structures constituting the articulation, overcoming the limitations of previous models, which took into account few anatomical elements of the ankle complex. All the models have been identified from experimental data by means of an optimization procedure.
Then, the simulated motions have been compared to the experimental ones, in order to show the efficiency of the approach and thus to deduce the role of each anatomical structure in the ankle kinematic behavior.
Abstract:
During my PhD, starting from the original formulations proposed by Bertrand et al. (2000) and Emolo & Zollo (2005), I developed inversion methods and applied them to different earthquakes. In particular, large efforts have been devoted to the study of the model resolution and to the estimation of the model parameter errors. To study the source kinematic characteristics of the Christchurch earthquake we performed a joint inversion of strong-motion, GPS and InSAR data using a non-linear inversion method. Considering the complexity highlighted by the surface deformation data, we adopted a fault model consisting of two partially overlapping segments, with dimensions of 15x11 and 7x7 km², having different faulting styles. This two-fault model allows a better reconstruction of the complex shape of the surface deformation data. The total seismic moment resulting from the joint inversion is 3.0x10^25 dyne·cm (Mw = 6.2), with an average rupture velocity of 2.0 km/s. Errors associated with the kinematic model have been estimated at around 20-30%. The 2009 L'Aquila sequence was characterized by an intense aftershock sequence that lasted several months. In this study we applied an inversion method that takes the apparent Source Time Functions (aSTFs) as data to a Mw 4.0 aftershock of the L'Aquila sequence. The aSTFs were estimated using the deconvolution method proposed by Vallée et al. (2004). The inversion results show a heterogeneous slip distribution, characterized by two main slip patches located NW of the hypocenter, and a variable rupture velocity distribution (mean value of 2.5 km/s), showing an acceleration of the rupture front between the two high-slip zones. Errors of about 20% characterize the final estimated parameters.
Abstract:
The aim of the thesis is to propose a Bayesian estimation through Markov chain Monte Carlo of multidimensional item response theory models for graded responses with complex structures and correlated traits. In particular, this work focuses on the multiunidimensional and the additive underlying latent structures, considering that the first one is widely used and represents a classical approach in multidimensional item response analysis, while the second one is able to reflect the complexity of real interactions between items and respondents. A simulation study is conducted to evaluate the parameter recovery for the proposed models under different conditions (sample size, test and subtest length, number of response categories, and correlation structure). The results show that the parameter recovery is particularly sensitive to the sample size, due to the model complexity and the high number of parameters to be estimated. For a sufficiently large sample size the parameters of the multiunidimensional and additive graded response models are well reproduced. The results are also affected by the trade-off between the number of items constituting the test and the number of item categories. An application of the proposed models on response data collected to investigate Romagna and San Marino residents' perceptions and attitudes towards the tourism industry is also presented.
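As a reading aid, a graded response model of the kind discussed in this abstract is often written in the cumulative form below. This is a common formulation from the literature and not necessarily the exact parameterization of the thesis. The link function $F$ (for example a normal or logistic distribution function), the discrimination vector $\boldsymbol{\alpha}_j$, the thresholds $\kappa_{jk}$ and the trait vector $\boldsymbol{\theta}_i$ are notation assumed for this sketch.

```latex
\begin{align}
P(Y_{ij} \ge k \mid \boldsymbol{\theta}_i) &= F\!\left(\boldsymbol{\alpha}_j^{\top}\boldsymbol{\theta}_i - \kappa_{jk}\right), \\
P(Y_{ij} = k \mid \boldsymbol{\theta}_i) &= P(Y_{ij} \ge k \mid \boldsymbol{\theta}_i) - P(Y_{ij} \ge k+1 \mid \boldsymbol{\theta}_i),
\end{align}
```

with the conventions $P(Y_{ij} \ge 1) = 1$ and $P(Y_{ij} \ge K+1) = 0$ for an item with $K$ categories. Under this notation, a multiunidimensional structure would correspond to each $\boldsymbol{\alpha}_j$ having a single nonzero entry (the trait measured by the item's subtest), while an additive structure would add a second nonzero entry for a general trait shared by all items. Both readings are assumptions based on common usage of the terms.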
Abstract:
This thesis reports on the evolution of the methods for analysing techno-social systems, through an account of the various research experiences in which the author was directly involved. The first case presented is a study based on data mining of a dataset of word associations named Human Brain Cloud: its validation is addressed and, also through non-trivial modeling, a better understanding of language properties is presented. Then, a real complex-system experiment is introduced: the WideNoise experiment, in the context of the EveryAware European project. The project and the course of the experiment are illustrated and the data analysis is presented. Then the Experimental Tribe platform for social computation is introduced. It has been conceived to help researchers in the implementation of web experiments, and also aims to catalyze the cumulative growth of experimental methodologies and the standardization of the tools cited above. In the last part, three other research experiences which have already taken place on the Experimental Tribe platform are discussed in detail, from the design of the experiment to the analysis of the results and, eventually, to the modeling of the systems involved. The experiments are: CityRace, about the measurement of human traffic-facing strategies; laPENSOcosì, aiming to unveil the structure of political opinion; and AirProbe, implemented again in the framework of the EveryAware project, which consisted in monitoring the shift in opinion about air quality of a community informed about local air pollution. At the end, the evolution of the methods for investigating techno-social systems emerges, together with the opportunities and the threats offered by this new scientific path.
Abstract:
This thesis concerns the study of complex conformational surfaces and tautomeric equilibria of molecules and molecular complexes by quantum chemical methods and rotational spectroscopy techniques. In particular, the focus of this research is on the effects of substitution and noncovalent interactions in determining the energies and geometries of different conformers, tautomers or molecular complexes. Free-Jet Absorption Millimeter Wave spectroscopy and Pulsed-Jet Fourier Transform Microwave spectroscopy have been applied to perform these studies, and the results showcase the suitability of these techniques for the study of conformational surfaces and intermolecular interactions. The series of investigations of selected medium-size molecules and complexes has shown how different instrumental setups can be used to obtain a variety of results on molecular properties. The systems studied include molecules of biological interest such as anethole and molecules of astrophysical interest such as N-methylaminoethanol. Moreover, halogenation effects have been investigated in halogen-substituted tautomeric systems (5-chlorohydroxypyridine and 6-chlorohydroxypyridine), where it has been shown that the position of the inserted halogen atom affects the prototropic equilibrium. As for fluorination effects, interesting results have been achieved by investigating some small complexes in which a molecule of water is used as a probe to reveal the changes in the electrostatic potential of different fluorinated compounds: 2-fluoropyridine, 3-fluoropyridine and pentafluoropyridine. In the complexes of water with 2-fluoropyridine and 3-fluoropyridine, the geometry of the complex with one water molecule is analogous to that of pyridine, with the water molecule linked to the pyridine nitrogen; the case of pentafluoropyridine instead reveals the effect of perfluorination, with the water oxygen pointing towards the positive center of the pyridine ring.
Additional molecular adducts with a molecule of water have been analyzed (benzylamine-water and acrylic acid-water) in order to reveal the stabilizing driving forces that characterize these complexes.
Abstract:
This thesis reports an integrated analytical and physicochemical approach for the study of natural substances and new drugs, based on mass spectrometry techniques combined with liquid chromatography. In particular, Chapter 1 concerns the study of berberine (BBR), a natural substance with pharmacological activity for the treatment of hepatobiliary and intestinal diseases. The first part focused on the relationships between the physicochemical properties, pharmacokinetics and metabolism of berberine and its metabolites. For this purpose a sensitive HPLC-ES-MS/MS method has been developed, validated and used to determine these compounds during the studies of their physicochemical properties, and to measure the plasma levels of berberine and its metabolites, including berberrubine (M1), demethyleneberberine (M3), and jatrorrhizine (M4), in humans. The data show that M1 could have an efficient intestinal absorption by passive diffusion, due to a keto-enol tautomerism confirmed by NMR studies and by its higher plasma concentration. In the second part of Chapter 1, the in vivo biodistribution of M1 and BBR in the rat has been compared. In Chapter 2 a new HPLC-ES-MS/MS method for the simultaneous determination and quantification of glucosinolates, such as glucoraphanin, glucoerucin and sinigrin, and isothiocyanates, such as sulforaphane and erucin, has been developed and validated. This method has been used for the analysis of functional foods enriched with vegetable extracts. Chapter 3 focused on a physicochemical study of the interaction between the bile acid sequestrants used in the treatment of hypercholesterolemia, including colesevelam and cholestyramine, and obeticholic acid (OCA), a potent agonist of the farnesoid X nuclear receptor (FXR). In particular, a new experimental model for the determination of the equilibrium binding isotherm was developed.
Chapter 4 focused on methodological aspects of a new hard ionization technique coupled with liquid chromatography (Direct-EI-UHPLC-MS), not yet commercially available and potentially useful for qualitative analysis and for molecules that are “transparent” to soft ionization techniques. This method was applied to the analysis of several steroid derivatives.
Abstract:
Self-organising pervasive ecosystems of devices are set to become a major vehicle for delivering infrastructure and end-user services. The inherent complexity of such systems poses new challenges to those who want to master it by applying the principles of engineering. The recent growth in the number and distribution of devices with decent computational and communication abilities, which suddenly accelerated with the massive diffusion of smartphones and tablets, is delivering a world with a much higher density of devices in space. Also, communication technologies seem to be focussing on short-range device-to-device (P2P) interactions, with technologies such as Bluetooth and Near-Field Communication gaining greater adoption. Locality and situatedness become key to providing the best possible experience to users, and the classic model of a centralised, enormously powerful server gathering and processing data becomes less and less efficient as device density grows. Accomplishing complex global tasks without a centralised controller responsible for aggregating data, however, is a challenging task. In particular, there is a local-to-global issue that makes the application of engineering principles challenging, to say the least: designing device-local programs that, through interaction, guarantee a certain global service level. In this thesis, we first analyse the state of the art in coordination systems, then motivate the work by describing the main issues of pre-existing tools and practices and identifying the improvements that would benefit the design of such complex software ecosystems. The contribution can be divided into three main branches. First, we introduce a novel simulation toolchain for pervasive ecosystems, designed to allow good expressiveness while retaining high performance. Second, we leverage existing coordination models and patterns in order to create new spatial structures.
Third, we introduce a novel language, based on the existing "Field Calculus" and integrated with the aforementioned toolchain, designed to be usable for practical aggregate programming.
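The local-to-global issue described in this abstract can be made concrete with a small sketch: a distance-to-source "gradient", computed by devices that only read their neighbours' latest values. This is plain Python written for illustration. It is neither Field Calculus nor the language or toolchain introduced in the thesis, and the function names and the synchronous-round execution model are assumptions of the example.

```python
import math


def gradient_round(values, neighbours, sources):
    """One synchronous round: every device recomputes its distance estimate
    using only its neighbours' previous estimates (device-local rule)."""
    new = {}
    for dev in values:
        if dev in sources:
            new[dev] = 0.0
        else:
            new[dev] = min(
                (values[n] + dist for n, dist in neighbours[dev].items()),
                default=math.inf,
            )
    return new


def run_until_stable(neighbours, sources, max_rounds=100):
    """Repeat rounds until no estimate changes (or max_rounds is reached)."""
    values = {dev: math.inf for dev in neighbours}
    for _ in range(max_rounds):
        new = gradient_round(values, neighbours, sources)
        if new == values:
            break
        values = new
    return values
```

For a chain of devices `a - b - c - d` with link lengths 1.0, 2.0 and 1.5 and `a` as the only source, the estimates settle at 0.0, 1.0, 3.0 and 4.5: a global distance field emerges although no device ever sees the whole network.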