9 resultados para Estimation Methods
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
The main aim of this Ph.D. dissertation is the study of clustering dependent data by means of copula functions with particular emphasis on microarray data. Copula functions are a popular multivariate modeling tool in each field where the multivariate dependence is of great interest and their use in clustering has not been still investigated. The first part of this work contains the review of the literature of clustering methods, copula functions and microarray experiments. The attention focuses on the K–means (Hartigan, 1975; Hartigan and Wong, 1979), the hierarchical (Everitt, 1974) and the model–based (Fraley and Raftery, 1998, 1999, 2000, 2007) clustering techniques because their performance is compared. Then, the probabilistic interpretation of the Sklar’s theorem (Sklar’s, 1959), the estimation methods for copulas like the Inference for Margins (Joe and Xu, 1996) and the Archimedean and Elliptical copula families are presented. In the end, applications of clustering methods and copulas to the genetic and microarray experiments are highlighted. The second part contains the original contribution proposed. A simulation study is performed in order to evaluate the performance of the K–means and the hierarchical bottom–up clustering methods in identifying clusters according to the dependence structure of the data generating process. Different simulations are performed by varying different conditions (e.g., the kind of margins (distinct, overlapping and nested) and the value of the dependence parameter ) and the results are evaluated by means of different measures of performance. In light of the simulation results and of the limits of the two investigated clustering methods, a new clustering algorithm based on copula functions (‘CoClust’ in brief) is proposed. The basic idea, the iterative procedure of the CoClust and the description of the written R functions with their output are given. The CoClust algorithm is tested on simulated data (by varying the number of clusters, the copula models, the dependence parameter value and the degree of overlap of margins) and is compared with the performance of model–based clustering by using different measures of performance, like the percentage of well–identified number of clusters and the not rejection percentage of H0 on . It is shown that the CoClust algorithm allows to overcome all observed limits of the other investigated clustering techniques and is able to identify clusters according to the dependence structure of the data independently of the degree of overlap of margins and the strength of the dependence. The CoClust uses a criterion based on the maximized log–likelihood function of the copula and can virtually account for any possible dependence relationship between observations. Many peculiar characteristics are shown for the CoClust, e.g. its capability of identifying the true number of clusters and the fact that it does not require a starting classification. Finally, the CoClust algorithm is applied to the real microarray data of Hedenfalk et al. (2001) both to the gene expressions observed in three different cancer samples and to the columns (tumor samples) of the whole data matrix.
Resumo:
The objective of this work of thesis is the refined estimations of source parameters. To such a purpose we used two different approaches, one in the frequency domain and the other in the time domain. In frequency domain, we analyzed the P- and S-wave displacement spectra to estimate spectral parameters, that is corner frequencies and low frequency spectral amplitudes. We used a parametric modeling approach which is combined with a multi-step, non-linear inversion strategy and includes the correction for attenuation and site effects. The iterative multi-step procedure was applied to about 700 microearthquakes in the moment range 1011-1014 N•m and recorded at the dense, wide-dynamic range, seismic networks operating in Southern Apennines (Italy). The analysis of the source parameters is often complicated when we are not able to model the propagation accurately. In this case the empirical Green function approach is a very useful tool to study the seismic source properties. In fact the Empirical Green Functions (EGFs) consent to represent the contribution of propagation and site effects to signal without using approximate velocity models. An EGF is a recorded three-component set of time-histories of a small earthquake whose source mechanism and propagation path are similar to those of the master event. Thus, in time domain, the deconvolution method of Vallée (2004) was applied to calculate the source time functions (RSTFs) and to accurately estimate source size and rupture velocity. This technique was applied to 1) large event, that is Mw=6.3 2009 L’Aquila mainshock (Central Italy), 2) moderate events, that is cluster of earthquakes of 2009 L’Aquila sequence with moment magnitude ranging between 3 and 5.6, 3) small event, i.e. Mw=2.9 Laviano mainshock (Southern Italy).
Resumo:
The research activity characterizing the present thesis was mainly centered on the design, development and validation of methodologies for the estimation of stationary and time-varying connectivity between different regions of the human brain during specific complex cognitive tasks. Such activity involved two main aspects: i) the development of a stable, consistent and reproducible procedure for functional connectivity estimation with a high impact on neuroscience field and ii) its application to real data from healthy volunteers eliciting specific cognitive processes (attention and memory). In particular the methodological issues addressed in the present thesis consisted in finding out an approach to be applied in neuroscience field able to: i) include all the cerebral sources in connectivity estimation process; ii) to accurately describe the temporal evolution of connectivity networks; iii) to assess the significance of connectivity patterns; iv) to consistently describe relevant properties of brain networks. The advancement provided in this thesis allowed finding out quantifiable descriptors of cognitive processes during a high resolution EEG experiment involving subjects performing complex cognitive tasks.
Resumo:
The human movement analysis (HMA) aims to measure the abilities of a subject to stand or to walk. In the field of HMA, tests are daily performed in research laboratories, hospitals and clinics, aiming to diagnose a disease, distinguish between disease entities, monitor the progress of a treatment and predict the outcome of an intervention [Brand and Crowninshield, 1981; Brand, 1987; Baker, 2006]. To achieve these purposes, clinicians and researchers use measurement devices, like force platforms, stereophotogrammetric systems, accelerometers, baropodometric insoles, etc. This thesis focus on the force platform (FP) and in particular on the quality assessment of the FP data. The principal objective of our work was the design and the experimental validation of a portable system for the in situ calibration of FPs. The thesis is structured as follows: Chapter 1. Description of the physical principles used for the functioning of a FP: how these principles are used to create force transducers, such as strain gauges and piezoelectrics transducers. Then, description of the two category of FPs, three- and six-component, the signals acquisition (hardware structure), and the signals calibration. Finally, a brief description of the use of FPs in HMA, for balance or gait analysis. Chapter 2. Description of the inverse dynamics, the most common method used in the field of HMA. This method uses the signals measured by a FP to estimate kinetic quantities, such as joint forces and moments. The measures of these variables can not be taken directly, unless very invasive techniques; consequently these variables can only be estimated using indirect techniques, as the inverse dynamics. Finally, a brief description of the sources of error, present in the gait analysis. Chapter 3. State of the art in the FP calibration. The selected literature is divided in sections, each section describes: systems for the periodic control of the FP accuracy; systems for the error reduction in the FP signals; systems and procedures for the construction of a FP. In particular is detailed described a calibration system designed by our group, based on the theoretical method proposed by ?. This system was the “starting point” for the new system presented in this thesis. Chapter 4. Description of the new system, divided in its parts: 1) the algorithm; 2) the device; and 3) the calibration procedure, for the correct performing of the calibration process. The algorithm characteristics were optimized by a simulation approach, the results are here presented. In addiction, the different versions of the device are described. Chapter 5. Experimental validation of the new system, achieved by testing it on 4 commercial FPs. The effectiveness of the calibration was verified by measuring, before and after calibration, the accuracy of the FPs in measuring the center of pressure of an applied force. The new system can estimate local and global calibration matrices; by local and global calibration matrices, the non–linearity of the FPs was quantified and locally compensated. Further, a non–linear calibration is proposed. This calibration compensates the non– linear effect in the FP functioning, due to the bending of its upper plate. The experimental results are presented. Chapter 6. Influence of the FP calibration on the estimation of kinetic quantities, with the inverse dynamics approach. Chapter 7. The conclusions of this thesis are presented: need of a calibration of FPs and consequential enhancement in the kinetic data quality. Appendix: Calibration of the LC used in the presented system. Different calibration set–up of a 3D force transducer are presented, and is proposed the optimal set–up, with particular attention to the compensation of non–linearities. The optimal set–up is verified by experimental results.
Resumo:
The motivating problem concerns the estimation of the growth curve of solitary corals that follow the nonlinear Von Bertalanffy Growth Function (VBGF). The most common parameterization of the VBGF for corals is based on two parameters: the ultimate length L∞ and the growth rate k. One aim was to find a more reliable method for estimating these parameters, which can capture the influence of environmental covariates. The main issue with current methods is that they force the linearization of VBGF and neglect intra-individual variability. The idea was to use the hierarchical nonlinear model which has the appealing features of taking into account the influence of collection sites, possible intra-site measurement correlation and variance heterogeneity, and that can handle the influence of environmental factors and all the reliable information that might influence coral growth. This method was used on two databases of different solitary corals i.e. Balanophyllia europaea and Leptopsammia pruvoti, collected in six different sites in different environmental conditions, which introduced a decisive improvement in the results. Nevertheless, the theory of the energy balance in growth ascertains the linear correlation of the two parameters and the independence of the ultimate length L∞ from the influence of environmental covariates, so a further aim of the thesis was to propose a new parameterization based on the ultimate length and parameter c which explicitly describes the part of growth ascribable to site-specific conditions such as environmental factors. We explored the possibility of estimating these parameters characterizing the VBGF new parameterization via the nonlinear hierarchical model. Again there was a general improvement with respect to traditional methods. The results of the two parameterizations were similar, although a very slight improvement was observed in the new one. This is, nevertheless, more suitable from a theoretical point of view when considering environmental covariates.
Resumo:
In the last couple of decades we assisted to a reappraisal of spatial design-based techniques. Usually the spatial information regarding the spatial location of the individuals of a population has been used to develop efficient sampling designs. This thesis aims at offering a new technique for both inference on individual values and global population values able to employ the spatial information available before sampling at estimation level by rewriting a deterministic interpolator under a design-based framework. The achieved point estimator of the individual values is treated both in the case of finite spatial populations and continuous spatial domains, while the theory on the estimator of the population global value covers the finite population case only. A fairly broad simulation study compares the results of the point estimator with the simple random sampling without replacement estimator in predictive form and the kriging, which is the benchmark technique for inference on spatial data. The Monte Carlo experiment is carried out on populations generated according to different superpopulation methods in order to manage different aspects of the spatial structure. The simulation outcomes point out that the proposed point estimator has almost the same behaviour as the kriging predictor regardless of the parameters adopted for generating the populations, especially for low sampling fractions. Moreover, the use of the spatial information improves substantially design-based spatial inference on individual values.
Resumo:
In recent years, thanks to the technological advances, electromagnetic methods for non-invasive shallow subsurface characterization have been increasingly used in many areas of environmental and geoscience applications. Among all the geophysical electromagnetic methods, the Ground Penetrating Radar (GPR) has received unprecedented attention over the last few decades due to its capability to obtain, spatially and temporally, high-resolution electromagnetic parameter information thanks to its versatility, its handling, its non-invasive nature, its high resolving power, and its fast implementation. The main focus of this thesis is to perform a dielectric site characterization in an efficient and accurate way studying in-depth a physical phenomenon behind a recent developed GPR approach, the so-called early-time technique, which infers the electrical properties of the soil in the proximity of the antennas. In particular, the early-time approach is based on the amplitude analysis of the early-time portion of the GPR waveform using a fixed-offset ground-coupled antenna configuration where the separation between the transmitting and receiving antenna is on the order of the dominant pulse-wavelength. Amplitude information can be extracted from the early-time signal through complex trace analysis, computing the instantaneous-amplitude attributes over a selected time-duration of the early-time signal. Basically, if the acquired GPR signals are considered to represent the real part of a complex trace, and the imaginary part is the quadrature component obtained by applying a Hilbert transform to the GPR trace, the amplitude envelope is the absolute value of the resulting complex trace (also known as the instantaneous-amplitude). Analysing laboratory information, numerical simulations and natural field conditions, and summarising the overall results embodied in this thesis, it is possible to suggest the early-time GPR technique as an effective method to estimate physical properties of the soil in a fast and non-invasive way.
Resumo:
The Schroeder's backward integration method is the most used method to extract the decay curve of an acoustic impulse response and to calculate the reverberation time from this curve. In the literature the limits and the possible improvements of this method are widely discussed. In this work a new method is proposed for the evaluation of the energy decay curve. The new method has been implemented in a Matlab toolbox. Its performance has been tested versus the most accredited literature method. The values of EDT and reverberation time extracted from the energy decay curves calculated with both methods have been compared in terms of the values themselves and in terms of their statistical representativeness. The main case study consists of nine Italian historical theatres in which acoustical measurements were performed. The comparison of the two extraction methods has also been applied to a critical case, i.e. the structural impulse responses of some building elements. The comparison underlines that both methods return a comparable value of the T30. Decreasing the range of evaluation, they reveal increasing differences; in particular, the main differences are in the first part of the decay, where the EDT is evaluated. This is a consequence of the fact that the new method returns a “locally" defined energy decay curve, whereas the Schroeder's method accumulates energy from the tail to the beginning of the impulse response. Another characteristic of the new method for the energy decay extraction curve is its independence on the background noise estimation. Finally, a statistical analysis is performed on the T30 and EDT values calculated from the impulse responses measurements in the Italian historical theatres. The aim of this evaluation is to know whether a subset of measurements could be considered representative for a complete characterization of these opera houses.
Resumo:
Despite the scientific achievement of the last decades in the astrophysical and cosmological fields, the majority of the Universe energy content is still unknown. A potential solution to the “missing mass problem” is the existence of dark matter in the form of WIMPs. Due to the very small cross section for WIMP-nuleon interactions, the number of expected events is very limited (about 1 ev/tonne/year), thus requiring detectors with large target mass and low background level. The aim of the XENON1T experiment, the first tonne-scale LXe based detector, is to be sensitive to WIMP-nucleon cross section as low as 10^-47 cm^2. To investigate the possibility of such a detector to reach its goal, Monte Carlo simulations are mandatory to estimate the background. To this aim, the GEANT4 toolkit has been used to implement the detector geometry and to simulate the decays from the various background sources: electromagnetic and nuclear. From the analysis of the simulations, the level of background has been found totally acceptable for the experiment purposes: about 1 background event in a 2 tonne-years exposure. Indeed, using the Maximum Gap method, the XENON1T sensitivity has been evaluated and the minimum for the WIMP-nucleon cross sections has been found at 1.87 x 10^-47 cm^2, at 90% CL, for a WIMP mass of 45 GeV/c^2. The results have been independently cross checked by using the Likelihood Ratio method that confirmed such results with an agreement within less than a factor two. Such a result is completely acceptable considering the intrinsic differences between the two statistical methods. Thus, in the PhD thesis it has been proven that the XENON1T detector will be able to reach the designed sensitivity, thus lowering the limits on the WIMP-nucleon cross section by about 2 orders of magnitude with respect to the current experiments.