10 results for Data Driven Modeling

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

100.00%

Publisher:

Abstract:

This work focuses on the study of saltwater intrusion in coastal aquifers, and in particular on the development of conceptual schemes to evaluate the associated risk. Saltwater intrusion depends on various natural and anthropogenic factors, both strongly random in behaviour, which should be considered for optimal management of the territory and its water resources. Given the uncertainty of the problem parameters, the risk associated with salinization needs to be cast in a probabilistic framework. On the basis of a widely adopted sharp interface formulation, key hydrogeological problem parameters are modeled as random variables, and global sensitivity analysis is used to determine their influence on the position of the saltwater interface. The analyses presented in this work rely on an efficient model reduction technique, based on Polynomial Chaos Expansion, which provides an accurate description of the model without a great computational burden. When the assumptions of classical analytical models are not met, as often happens in applications to real case studies, including the area analyzed in the present work, one can adopt data-driven techniques based on the analysis of the data characterizing the system under study. A model can then be defined on the basis of connections between the system state variables, with only a limited number of assumptions about the "physical" behaviour of the system.

Relevance:

90.00%

Publisher:

Abstract:

Environmental computer models are deterministic models devoted to predicting environmental phenomena such as air pollution or meteorological events. Numerical model output is given in terms of averages over grid cells, usually at high spatial and temporal resolution. However, these outputs are often biased, with unknown calibration, and come without any information about the associated uncertainty. Conversely, data collected at monitoring stations are more accurate, since they essentially provide the true levels. Due to the leading role played by numerical models, it is now important to compare model output with observations. Statistical methods developed to combine numerical model output and station data are usually referred to as data fusion. In this work, we first combine ozone monitoring data with ozone predictions from the Eta-CMAQ air quality model in order to forecast the real-time current 8-hour average ozone level, defined as the average of the previous four hours, the current hour, and the predictions for the next three hours. We propose a Bayesian downscaler model based on first differences, with a flexible coefficient structure and an efficient computational strategy to fit model parameters. Model validation for the eastern United States shows consequential improvement of our fully inferential approach compared with the current real-time forecasting system. Furthermore, we consider the introduction of temperature data from a weather forecast model into the downscaler, showing improved real-time ozone predictions. Finally, we introduce a hierarchical model to obtain spatially varying uncertainty associated with numerical model output. We show how we can learn about such uncertainty through suitable stochastic data fusion modeling using some external validation data. We illustrate our Bayesian model by providing the uncertainty map associated with a temperature output over the northeastern United States.
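The definition of the current 8-hour average given in the abstract can be written out as a short sketch. This is only an illustration of that arithmetic, not the thesis's downscaler model; the function name `eight_hour_average` and the sample values are assumptions made here for the example.

```python
def eight_hour_average(observed, forecast):
    """Current 8-hour average ozone level.

    observed: five hourly values, oldest first (previous four hours,
              then the current hour).
    forecast: three hourly predictions for the next three hours.
    Both argument names are hypothetical, chosen for this sketch.
    """
    if len(observed) != 5 or len(forecast) != 3:
        raise ValueError("expected 5 observed values and 3 forecast values")
    # Average over the eight hourly values that make up the window.
    return (sum(observed) + sum(forecast)) / 8.0


# Made-up hourly ozone values, purely illustrative.
observed_hours = [40, 42, 45, 50, 55]
forecast_hours = [58, 60, 58]
average = eight_hour_average(observed_hours, forecast_hours)  # 51.0
```

The window mixes measured and predicted hours, which is why the quality of the three forecast values directly affects the real-time average.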

Relevance:

80.00%

Publisher:

Abstract:

The construction and use of multimedia corpora has been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of the field of AVT, similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation, developed at the University of Bologna’s Department of Interdisciplinary Studies in Translation, Languages and Culture. As a matter of fact, in order to quantify strategies of linguistic transfer of an AV product, we need to take into consideration not only the linguistic aspect of such a product but all the meaning-making resources deployed in the filmic text. Provided that one major benefit of Forlixt 1 is the combination of audiovisual and textual data, this corpus allows the user to access primary data for scientific investigation, and thus no longer rely on pre-processed material such as traditional annotated transcriptions. Based on this rationale, the first chapter of the thesis sets out to illustrate the state of the art of research in the disciplinary fields involved. The primary objective was to underline the main repercussions on multimedia texts resulting from the interaction of a double support, audio and video, and, accordingly, on procedures, means, and methods adopted in their translation. By drawing on previous research in semiotics and film studies, the relevant codes at work in visual and acoustic channels were outlined. Subsequently, we concentrated on the analysis of the verbal component and on the peculiar characteristics of filmic orality as opposed to spontaneous dialogic production. In the second part, an overview of the main AVT modalities was presented (dubbing, voice-over, interlinguistic and intra-linguistic subtitling, audio-description, etc.) 
in order to define the different technologies, processes and professional qualifications that this umbrella term presently includes. The second chapter focuses diachronically on the contribution of various theories (i.e. Descriptive Translation Studies, Polysystem Theory) to the application of Corpus Linguistics methods and tools to the field of Translation Studies. In particular, we discussed how the use of corpora can help reduce the gap between qualitative and quantitative approaches. Subsequently, we reviewed the tools traditionally employed by Corpus Linguistics for the construction of traditional “written language” corpora, to assess whether and how they can be adapted to meet the needs of multimedia corpora. In particular, we reviewed existing speech and spoken corpora, as well as multimedia corpora specifically designed to investigate translation. The third chapter reviews the main steps in the development of Forlixt 1, from a technical (IT design principles, data query functions) and methodological point of view, laying down extensive scientific foundations for the annotation methods adopted, which presently encompass categories of a pragmatic, sociolinguistic, linguacultural and semiotic nature. Finally, we described the main query tools (free search, guided search, advanced search and combined search) and the main intended uses of the database from a pedagogical perspective. The fourth chapter lists the specific compilation criteria adopted, as well as statistics for the two sub-corpora, presenting data broken down by language pair (French-Italian and German-Italian) and genre (film comedies, television soap operas and crime series). Next, we concentrated on the discussion of the results obtained from the analysis of summary tables reporting the frequency of categories applied to the French-Italian sub-corpus. 
The detailed observation of the distribution of categories identified in the original and dubbed corpus allowed us to empirically confirm some of the theories put forward in the literature and notably concerning the nature of the filmic text, the dubbing process and Italian dubbed language’s features. This was possible by looking into some of the most problematic aspects, like the rendering of socio-linguistic variation. The corpus equally allowed us to consider so far neglected aspects, such as pragmatic, prosodic, kinetic, facial, and semiotic elements, and their combination. At the end of this first exploration, some specific observations concerning possible macrotranslation trends were made for each type of sub-genre considered (cinematic and TV genre). On the grounds of this first quantitative investigation, the fifth chapter intended to further examine data, by applying ad hoc models of analysis. Given the virtually infinite number of combinations of categories adopted, and of the latter with searchable textual units, three possible qualitative and quantitative methods were designed, each of which was to concentrate on a particular translation dimension of the filmic text. The first one was the cultural dimension, which specifically focused on the rendering of selected cultural references and on the investigation of recurrent translation choices and strategies justified on the basis of the occurrence of specific clusters of categories. The second analysis was conducted on the linguistic dimension by exploring the occurrence of phrasal verbs in the Italian dubbed corpus and by ascertaining the influence on the adoption of related translation strategies of possible semiotic traits, such as gestures and facial expressions. Finally, the main aim of the third study was to verify whether, under which circumstances, and through which modality, graphic and iconic elements were translated into Italian from an original corpus of both German and French films. 
After reviewing the main translation techniques at work, we also provided an exhaustive account of possible causes for the non-translation of these elements. By way of conclusion, the discussion of the results obtained from the distribution of annotation categories on the French-Italian corpus, as well as the application of specific models of analysis, allowed us to underline possible advantages and drawbacks related to the adoption of a corpus-based approach to AVT studies. Even though possible updates and improvements were proposed in order to help solve some of the problems identified, it is argued that the added value of Forlixt 1 lies ultimately in having created a valuable instrument that makes it possible to carry out empirically sound contrastive studies that may be usefully replicated on different language pairs and several types of multimedia texts. Furthermore, multimedia corpora can also play a crucial role in L2 and translation teaching, two disciplines in which their use still lacks systematic investigation.

Relevance:

80.00%

Publisher:

Abstract:

In this thesis, three measurements of the top-antitop differential cross section at a centre-of-mass energy of 7 TeV are presented, as a function of the transverse momentum, the mass and the rapidity of the top-antitop system. The analysis was carried out on a data sample of about 5/fb recorded with the ATLAS detector. The events were selected with a cut-based approach in the "one lepton plus jets" channel, where the lepton can be either an electron or a muon. The most relevant backgrounds (multi-jet QCD and W+jets) were extracted using data-driven methods; the others (Z+jets, diboson and single top) were simulated with Monte Carlo techniques. The final, background-subtracted distributions were corrected for detector and selection effects using unfolding methods. Finally, the results were compared with the theoretical predictions. The measurements are dominated by the systematic uncertainties and show no relevant deviation from the Standard Model predictions.

Relevance:

80.00%

Publisher:

Abstract:

Falls are common and burdensome accidents among the elderly. About one third of the population aged 65 years or more experience at least one fall each year. Fall risk assessment is believed to be beneficial for fall prevention. This thesis is about prognostic tools for falls in community-dwelling older adults. We provide an overview of the state of the art. We then take different approaches: we propose a theoretical probabilistic model to investigate some properties of prognostic tools for falls; we present a tool whose parameters were derived from data in the literature; and we train and test a data-driven prognostic tool. Finally, we present some preliminary results on the prediction of falls through features extracted from wearable inertial sensors. Heterogeneity in validation results is expected from theoretical considerations and is observed in empirical data. Differences in study design hinder comparability and collaborative research. Given the multifactorial etiology of falls, assessment of multiple risk factors is needed in order to achieve good predictive accuracy.

Relevance:

80.00%

Publisher:

Abstract:

A critical point in the analysis of ground displacement time series is the development of data-driven methods that allow the different sources that generate the observed displacements to be discerned and characterised. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the explained variance of the dataset. However, PCA does not perform well in solving the so-called Blind Source Separation (BSS) problem, i.e. in recovering and separating the original sources that generated the observed data. This is mainly due to the assumptions on which PCA relies: it looks for a new Euclidean space where the projected data are uncorrelated. Independent Component Analysis (ICA) is a popular technique adopted to approach this problem. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, I use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions. This technique allows for more flexibility in the description of the pdf of the sources, giving a more reliable estimate of them. Here I present the application of the vbICA technique to GPS position time series. First, I use vbICA on synthetic data that simulate a seismic cycle (interseismic + coseismic + postseismic + seasonal + noise) and a volcanic source, and I study the ability of the algorithm to recover the original (known) sources of deformation. Secondly, I apply vbICA to different tectonically active scenarios, such as the 2009 L'Aquila (central Italy) earthquake, the 2012 Emilia (northern Italy) seismic sequence, and the 2006 Guerrero (Mexico) Slow Slip Event (SSE).
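The statement that PCA looks for a space where the projected data are uncorrelated can be illustrated with a two-channel toy case. The sketch below is not the vbICA method of the thesis; it is a plain two-dimensional PCA in standard-library Python, and the synthetic "east" and "north" series, the mixing coefficients and all function names are assumptions made for this example.

```python
import math


def covariance(u, v):
    """Sample covariance of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def pca_2d(x, y):
    """Rotate two centred series onto their principal axes.

    For a 2x2 covariance matrix the principal axes are found by a
    rotation of angle theta with tan(2*theta) = 2*cxy / (cxx - cyy).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xc = [v - mx for v in x]
    yc = [v - my for v in y]
    cxx, cyy, cxy = covariance(xc, xc), covariance(yc, yc), covariance(xc, yc)
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    pc1 = [c * a + s * b for a, b in zip(xc, yc)]   # largest-variance axis
    pc2 = [-s * a + c * b for a, b in zip(xc, yc)]  # orthogonal axis
    return pc1, pc2


# Two hypothetical sources: a linear trend and a 12-sample periodic signal.
trend = [0.1 * t for t in range(50)]
seasonal = [math.sin(2.0 * math.pi * t / 12.0) for t in range(50)]

# Two observed channels, each a different linear mix of the sources.
east = [a + 0.5 * b for a, b in zip(trend, seasonal)]
north = [0.8 * a - 0.3 * b for a, b in zip(trend, seasonal)]

pc1, pc2 = pca_2d(east, north)
```

The two components come out uncorrelated by construction, but each can still be a blend of the trend and the periodic signal, which is the limitation with respect to source separation that the abstract points out.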

Relevance:

40.00%

Publisher:

Abstract:

Hydrothermal fluids are a fundamental resource for understanding and monitoring volcanic and non-volcanic systems. This thesis focuses on the study of hydrothermal systems through numerical modeling with the geothermal simulator TOUGH2. Several simulations are presented, and geophysical and geochemical observables arising from fluid circulation are analyzed in detail throughout the thesis. In a volcanic setting, fluids feeding fumaroles and hot springs may play a key role in hazard evaluation. The evolution of fluid circulation is caused by a strong interaction between magmatic and hydrothermal systems. A simultaneous analysis of different geophysical and geochemical observables is a sound approach for interpreting monitored data and inferring a consistent conceptual model. The observables analyzed are ground displacement, gravity changes, electrical conductivity, the amount, composition and temperature of the gases emitted at the surface, and the extent of the degassing area. Results highlight the different temporal responses of the considered observables, as well as their different radial patterns of variation. However, the magnitude, temporal response and radial pattern of these signals depend not only on the evolution of fluid circulation but also, to a large extent, on the assumed rock properties. Numerical simulations highlight differences that arise from the assumption of different permeabilities, for both homogeneous and heterogeneous systems. Rock properties affect hydrothermal fluid circulation, controlling both the range of variation and the temporal evolution of the observable signals. Low-temperature fumaroles and low discharge rates may be affected by atmospheric conditions. Detailed parametric simulations were performed, aimed at understanding the effects of system properties, such as permeability and gas reservoir overpressure, on diffuse degassing when air temperature and barometric pressure changes are applied to the ground surface. 
Hydrothermal circulation, however, is not only a characteristic of volcanic systems. Hot fluids may be involved in several applied problems, such as geothermal engineering, nuclear waste propagation in porous media, and Geological Carbon Sequestration (GCS). The current concept for large-scale GCS is the direct injection of supercritical carbon dioxide into deep geological formations, which typically contain brine. Upward displacement of such brine from deep reservoirs, driven by pressure increases resulting from carbon dioxide injection, may occur through abandoned wells, permeable faults or permeable channels. Brine intrusion into aquifers may degrade groundwater resources. Numerical results show that the pressure rise drives dense water up the conduits, and does not necessarily result in continuous flow. Rather, overpressure leads to a new hydrostatic equilibrium if the fluids are initially density stratified. If warm and salty fluid does not cool while passing through the conduit, an oscillatory solution is then possible. Parameter studies delineate steady-state (static) and oscillatory solutions.

Relevance:

40.00%

Publisher:

Abstract:

We use data from about 700 GPS stations in the Euro-Mediterranean region to investigate the present-day behavior of the Calabrian subduction zone within Mediterranean-scale plate kinematics and to perform local-scale studies of strain accumulation on active structures. We focus on the Messina Straits and Crati Valley faults, where GPS data show extensional velocity gradients of ∼3 mm/yr and ∼2 mm/yr, respectively. We use a dislocation model and a non-linear constrained optimization algorithm to invert for fault geometric parameters and slip rates, and evaluate the associated uncertainties adopting a bootstrap approach. Our analysis suggests the presence of two partially locked normal faults. To investigate the impact of elastic strain contributions from other nearby active faults on the observed velocity gradient, we use a block modeling approach. Our models show that the inferred slip rates on the two analyzed structures are strongly affected by the assumed locking width of the Calabrian subduction thrust. In order to frame the observed local deformation features within present-day central Mediterranean kinematics, we carry out a statistical analysis testing the independent motion (with respect to the African and Eurasian plates) of the Adriatic, Calabrian and Sicilian blocks. Our preferred model confirms a microplate-like behaviour for all the investigated blocks. Within these kinematic boundary conditions we further investigate the Calabrian slab interface geometry using a combined approach of block modeling and the χ²_ν statistic. Almost no information is obtained using only the horizontal GPS velocities, which prove to be an insufficient dataset for a multi-parametric inversion approach. To better constrain the slab geometry, we estimate the predicted vertical velocities by running suites of forward models of elastic dislocations with varying fault locking depth. Comparison with the observed field suggests a maximum resolved locking depth of 25 km.
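The χ²_ν statistic used to compare models in this abstract is read here as the reduced chi-square, which is an assumption, since the abstract does not define it. Under that reading, a minimal sketch of the computation is given below; the function name, argument names and sample velocities are hypothetical and do not come from the thesis.

```python
def reduced_chi_square(observed, predicted, sigma, n_params):
    """Reduced chi-square: sum of squared normalised residuals / dof.

    observed, predicted, sigma: equally long sequences of observed values,
    model predictions and one-sigma data uncertainties.
    n_params: number of free model parameters; dof = N - n_params.
    """
    if not (len(observed) == len(predicted) == len(sigma)):
        raise ValueError("input sequences must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more data points than free parameters")
    chi2 = sum(((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigma))
    return chi2 / dof


# Made-up velocities in mm/yr with uniform 0.5 mm/yr uncertainties.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.0, 2.5, 4.0]
sig = [0.5, 0.5, 0.5, 0.5]
value = reduced_chi_square(obs, pred, sig, n_params=2)  # 1.0
```

Values near 1 indicate misfits comparable to the stated data uncertainties; values much larger than 1 point to a poor fit or underestimated uncertainties.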

Relevance:

40.00%

Publisher:

Abstract:

Several countries have acquired, over the past decades, large amounts of area-covering Airborne Electromagnetic (AEM) data. The contribution of airborne geophysics to both groundwater resource mapping and management has increased dramatically, proving how appropriate those systems are for large-scale and efficient groundwater surveying. We start with the processing and inversion of two AEM datasets from two different systems collected over the Spiritwood Valley Aquifer area, Manitoba, Canada: the AeroTEM III dataset (commissioned by the Geological Survey of Canada in 2010) and the “Full waveform VTEM” dataset, collected and tested over the same survey area during fall 2011. We demonstrate that in the presence of multiple datasets, whether AEM or ground data, careful processing, inversion, post-processing, data integration and data calibration are the proper approach for obtaining reliable and consistent resistivity models. Our approach can be of interest to many end users, ranging from geological surveys and universities to private companies, which often own large geophysical databases to be interpreted for geological and/or hydrogeological purposes. In this study we investigate in depth the role of integrating several complementary types of geophysical data collected over the same survey area. We show that data integration can improve inversions, reduce ambiguity and deliver high-resolution results. We further attempt to use the final, most reliable output resistivity models as a solid basis for building a knowledge-driven 3D geological voxel-based model. A voxel approach allows a quantitative understanding of the hydrogeological setting of the area, and it can be further used to estimate aquifer volumes (i.e. the potential amount of groundwater resources) as well as for hydrogeological flow model prediction. 
In addition, we investigated the impact of an AEM dataset on hydrogeological mapping and 3D hydrogeological modeling, comparing it to having only a ground-based TEM dataset and/or only borehole data.