8 results for selection model

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

70.00%

Publisher:

Abstract:

This thesis presents a creative and practical approach to dealing with the problem of selection bias. Selection bias may be the most vexing problem in program evaluation or in any line of research that attempts to assert causality. Some of the greatest minds in economics and statistics have scrutinized the problem of selection bias, and the resulting approaches, Rubin’s Potential Outcome Approach (Rosenbaum and Rubin, 1983; Rubin, 1991, 2001, 2004) and Heckman’s Selection Model (Heckman, 1979), are widely accepted and used as the best fixes. These solutions to the bias that arises in particular from self-selection are imperfect, and many researchers, when feasible, reserve their strongest causal inference for data from experimental rather than observational studies. The innovative aspect of this thesis is to propose a data transformation that allows the presence of selection bias to be measured and tested in an automatic and multivariate way. The approach involves the construction of a multi-dimensional conditional space of the X matrix in which the bias associated with the treatment assignment has been eliminated. Specifically, we propose the use of a partial dependence analysis of the X-space as a tool for investigating the dependence relationship between a set of observable pre-treatment categorical covariates X and a treatment indicator variable T, in order to obtain a measure of bias according to their dependence structure. The measure of selection bias is then expressed in terms of the inertia due to the dependence between X and T that has been eliminated. Given the measure of selection bias, we propose a multivariate test of imbalance to check whether the detected bias is significant, using the asymptotic distribution of the inertia due to T (Estadella et al., 2005) and preserving the multivariate nature of the data.
Further, we propose the use of a clustering procedure as a tool to find groups of comparable units on which to estimate local causal effects, and the use of the multivariate test of imbalance as a stopping rule in choosing the best cluster solution set. The method is nonparametric: it does not call for modeling the data on the basis of some underlying theory or assumption about the selection process, but instead uses the existing variability within the data and lets the data speak. The idea of proposing this multivariate approach to measuring selection bias and testing balance comes from the consideration that, in applied research, all aspects of multivariate balance not represented in the univariate variable-by-variable summaries are ignored. The first part contains an introduction to evaluation methods as part of public and private decision processes and a review of the literature on evaluation methods. Attention is focused on Rubin’s Potential Outcome Approach, matching methods, and briefly on Heckman’s Selection Model. The second part focuses on some limitations of conventional methods, with particular attention to the problem of how to test balance correctly. The third part contains the original contribution, a simulation study that checks the performance of the method for a given dependence setting, and an application to a real data set. Finally, we discuss, conclude and explain our future perspectives.
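
As a rough illustration of an inertia-based imbalance measure, the sketch below computes the total inertia (Pearson chi-square statistic divided by the sample size) of a table crossing categorical covariate profiles with the treatment indicator T. This is the standard correspondence-analysis quantity and only an assumption about the general idea; it is not the partial dependence analysis or the test proposed in the thesis. The function name `total_inertia` and the toy data are hypothetical.

```python
from collections import Counter


def total_inertia(profiles, treatment):
    """Total inertia (Pearson chi-square / n) of the profile-by-treatment table.

    profiles  : one categorical covariate profile per unit (hypothetical coding)
    treatment : one treatment indicator per unit
    A value of 0 means the profiles are perfectly balanced across treatment arms.
    """
    n = len(profiles)
    cell_counts = Counter(zip(profiles, treatment))
    row_counts = Counter(profiles)
    col_counts = Counter(treatment)

    chi2 = 0.0
    for row, n_row in row_counts.items():
        for col, n_col in col_counts.items():
            expected = n_row * n_col / n          # count expected under independence
            observed = cell_counts.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / n


# Toy data (hypothetical): balanced assignment versus self-selected assignment.
balanced_x = ["a", "a", "b", "b"] * 25
balanced_t = [0, 1, 0, 1] * 25

skewed_x = ["a"] * 50 + ["b"] * 50
skewed_t = [1] * 40 + [0] * 10 + [1] * 10 + [0] * 40

inertia_balanced = total_inertia(balanced_x, balanced_t)  # exactly 0.0
inertia_skewed = total_inertia(skewed_x, skewed_t)        # approximately 0.36
```

Multiplying the inertia by n recovers the chi-square statistic, which is why an asymptotic distribution can be used to judge whether the measured imbalance is significant.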

Relevance:

40.00%

Publisher:

Abstract:

The thesis deals with the problem of Model Selection (MS) motivated by information and prediction theory, focusing on parametric time series (TS) models. The main contribution of the thesis is the extension to the multivariate case of the Misspecification-Resistant Information Criterion (MRIC), a criterion introduced recently that solves Akaike’s original research problem posed 50 years ago, which led to the definition of the AIC. The importance of MS is witnessed by the huge amount of literature devoted to it and published in scientific journals of many different disciplines. Despite such a widespread treatment, the contributions that adopt a mathematically rigorous approach are not so numerous, and one of the aims of this project is to review and assess them. Chapter 2 discusses methodological aspects of MS from information theory. Information criteria (IC) for the i.i.d. setting are surveyed along with their asymptotic properties and the cases of small samples, misspecification, and further estimators. Chapter 3 surveys criteria for TS. IC and prediction criteria are considered for: univariate models (AR, ARMA) in the time and frequency domain; parametric multivariate models (VARMA, VAR); nonparametric nonlinear models (NAR); and high-dimensional models. The MRIC answers Akaike’s original question on efficient criteria, for possibly-misspecified (PM) univariate TS models in multi-step prediction with high-dimensional data and nonlinear models. Chapter 4 extends the MRIC to PM multivariate TS models for multi-step prediction, introducing the Vectorial MRIC (VMRIC). We show that the VMRIC is asymptotically efficient by proving the decomposition of the MSPE matrix and the consistency of its Method-of-Moments Estimator (MoME), for Least Squares multi-step prediction with a univariate regressor. Chapter 5 extends the VMRIC to the general multiple regressor case, by showing that the MSPE matrix decomposition holds, obtaining consistency for its MoME, and proving its efficiency.
The chapter concludes with a digression on the conditions for PM VARX models.
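
To make the idea of an information criterion concrete, the sketch below selects the order of a univariate AR model with the classical AIC, which the abstract names as the starting point. It does not implement the MRIC or VMRIC. It assumes Yule-Walker fitting through the Levinson-Durbin recursion and the form AIC(p) = n log(sigma_p^2) + 2p, defined up to an additive constant; all function names and the simulated series are hypothetical.

```python
import math
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances gamma[0..max_lag]."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def ar_aic_by_order(x, max_order):
    """Return {p: AIC(p)} for AR(p), p = 0..max_order, via Levinson-Durbin."""
    n = len(x)
    gamma = autocovariances(x, max_order)
    sigma2 = gamma[0]          # innovation variance of the order-0 model
    phi = []                   # AR coefficients of the current order
    aic = {0: n * math.log(sigma2)}
    for p in range(1, max_order + 1):
        # Reflection coefficient (partial autocorrelation at lag p).
        acc = gamma[p] - sum(phi[j] * gamma[p - 1 - j] for j in range(p - 1))
        k = acc / sigma2
        # Update the coefficients from order p-1 to order p.
        phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
        sigma2 *= 1.0 - k * k
        aic[p] = n * math.log(sigma2) + 2 * p
    return aic


def simulate_ar2(n, a1, a2, seed=0):
    """Simulate a Gaussian AR(2) series, discarding a 100-step burn-in."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n + 100):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[-n:]


series = simulate_ar2(2000, 0.6, -0.3)
aic = ar_aic_by_order(series, max_order=6)
best_order = min(aic, key=aic.get)   # order with the smallest AIC
```

The penalty term 2p is what distinguishes one criterion from another; criteria designed for misspecified models replace it with a data-dependent term.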

Relevance:

30.00%

Publisher:

Abstract:

In the present study we use multivariate analysis techniques to discriminate signal from background in the fully hadronic decay channel of ttbar events. We give a brief introduction to the role of the top quark in the Standard Model and a general description of the CMS experiment at the LHC. We used the CMS experiment computing and software infrastructure to generate and prepare the data samples used in this analysis. We tested the performance of three different classifiers applied to our data samples and used the selection obtained with the Multi Layer Perceptron classifier to estimate the statistical and systematic uncertainty on the cross section measurement.

Relevance:

30.00%

Publisher:

Abstract:

This study analyses rural landscape changes. In particular, it focuses on understanding the driving forces acting on the rural built environment, using a statistical spatial model implemented through GIS techniques. It is well known that the study of landscape changes is essential for conscious decision making in land planning. A literature review reveals a general lack of studies dealing with the modelling of the rural built environment; hence a theoretical modelling approach for this purpose is needed. Advances in technology and modernization in building construction and agriculture have gradually changed the rural built environment. In addition, the phenomenon of urbanization led to the construction of new volumes beside abandoned or derelict rural buildings. Consequently, two types of transformation dynamics mainly affecting the rural built environment can be observed: the conversion of rural buildings and the increase in the number of buildings. The specific aim of this study is to propose a methodology for developing a spatial model that allows the identification of the driving forces that acted on building allocation. In fact, one of the most concerning dynamics nowadays is the irrational expansion of building sprawl across the landscape. The proposed methodology is composed of conceptual steps that cover different aspects of the development of a spatial model: the selection of a response variable that best describes the phenomenon under study, the identification of possible driving forces, the sampling methodology for data collection, the choice of the most suitable algorithm in relation to the statistical theory and method used, and the calibration and evaluation of the model.
A different combination of factors in various parts of the territory generated more or less favourable conditions for building allocation, and the existence of buildings represents the evidence of such an optimum. Conversely, the absence of buildings expresses a combination of agents that is not suitable for building allocation. Presence or absence of buildings can be adopted as indicators of such driving conditions, since they express the action of driving forces in the land suitability sorting process. The existence of correlation between site selection and hypothetical driving forces, evaluated by means of modelling techniques, provides evidence of which driving forces are involved in the allocation dynamics and insight into their level of influence on the process. GIS software, by means of spatial analysis tools, makes it possible to associate the concept of presence and absence with point features, generating a point process. Presence or absence of buildings at given site locations expresses the interaction of these driving factors. Presence points represent the locations of real existing buildings; conversely, absence points represent locations where buildings do not exist, and so they are generated by a stochastic mechanism. Possible driving forces are selected, and the existence of a causal relationship with building allocation is assessed through a spatial model. The adoption of empirical statistical models provides a mechanism for analysing the explanatory variables and for identifying the key driving variables behind the site selection process for new building allocation. The model developed by following the methodology is applied to a case study to test the validity of the methodology. In particular, the study area is the New District of Imola, characterized by a prevailing agricultural production vocation and where transformation dynamics occurred intensively.
The development of the model involved the identification of predictive variables (related to the geomorphologic, socio-economic, structural and infrastructural systems of the landscape) capable of representing the driving forces responsible for landscape changes. The model is calibrated on spatial data covering the periurban and rural parts of the study area over the 1975-2005 time period, by means of a generalized linear model. The output of the model fit is a continuous grid surface whose cells take values between 0 and 1, giving the probability of building occurrence across the rural and periurban parts of the study area. Hence the response variable assesses the changes in the rural built environment that occurred in this time interval and is related to the selected explanatory variables by means of a generalized linear model using logistic regression. By comparing the probability map obtained from the model with the actual rural building distribution in 2005, the interpretation capability of the model can be evaluated. The proposed model can also be applied to the interpretation of trends which occurred in other study areas, and to different time intervals, depending on the availability of data. The use of suitable data in terms of time, information, and spatial resolution and the costs related to data acquisition, pre-processing, and survey are among the most critical aspects of model implementation. Future in-depth studies can focus on using the proposed model to predict short/medium-range future scenarios for the rural built environment distribution in the study area. In order to predict future scenarios it is necessary to assume that the driving forces do not change and that their levels of influence within the model are not far from those assessed for the time interval used for the calibration.
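
The presence/absence modelling described above can be sketched as a logistic regression with a single explanatory variable. The example is a minimal illustration under stated assumptions: the predictor (a hypothetical distance to the nearest road, in km), the toy data, and the plain gradient-ascent fit are all invented for illustration and are not the variables or the fitting procedure used in the study.

```python
import math


def sigmoid(z):
    """Logistic link: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(presence) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(b0 + b1 * x)   # observed minus predicted
            grad0 += residual
            grad1 += residual * x
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1


# Hypothetical sample points: 1 = building present, 0 = generated absence point.
distance_km = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
presence = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

intercept, slope = fit_logistic(distance_km, presence)

# Predicted probability of building occurrence for two hypothetical grid cells.
p_near = sigmoid(intercept + slope * 0.2)
p_far = sigmoid(intercept + slope * 2.0)
```

Evaluating the fitted function on every cell of a grid would give the kind of 0-to-1 probability surface the abstract describes; the sign and size of each coefficient indicate the direction and strength of the corresponding driving force.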

Relevance:

30.00%

Publisher:

Abstract:

One of the main targets of the CMS experiment is to search for the Standard Model Higgs boson. The 4-lepton channel (from the Higgs decay h->ZZ->4l, l = e,mu) is one of the most promising. The analysis is based on the identification of two opposite-sign, same-flavor lepton pairs: leptons are required to be isolated and to come from the same primary vertex. The Higgs would be statistically revealed by the presence of a resonance peak in the 4-lepton invariant mass distribution. The 4-lepton analysis at CMS is presented, covering its most important aspects: lepton identification, isolation variables, impact parameter, kinematics, event selection, background control and statistical analysis of results. The search yields evidence of a signal with a statistical significance of more than four standard deviations. The excess of data with respect to the background-only predictions indicates the presence of a new boson, with a mass of about 126 GeV/c^2, decaying to two Z bosons, whose characteristics are compatible with those of the SM Higgs.
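
The 4-lepton invariant mass mentioned above follows from the summed four-momenta, m^2 = E^2 - |p|^2 in natural units (c = 1). The sketch below is a minimal illustration of that formula only; the function name and the toy four-vectors are hypothetical and are not taken from CMS data or software.

```python
import math


def invariant_mass(four_vectors):
    """Invariant mass of a set of particles given as (E, px, py, pz) in GeV, with c = 1."""
    energy = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    mass_squared = energy ** 2 - px ** 2 - py ** 2 - pz ** 2
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(mass_squared, 0.0))


# Toy event (hypothetical): four massless leptons in two back-to-back pairs.
leptons = [
    (31.5, 31.5, 0.0, 0.0),
    (31.5, -31.5, 0.0, 0.0),
    (31.5, 0.0, 31.5, 0.0),
    (31.5, 0.0, -31.5, 0.0),
]

m4l = invariant_mass(leptons)  # 126.0 GeV for this toy configuration
```

Filling a histogram with this quantity over all selected events gives the distribution in which a resonance peak would appear.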

Relevance:

30.00%

Publisher:

Abstract:

The development of a multibody model of a motorbike engine cranktrain is presented in this work, with an emphasis on flexible component model reduction. A modelling methodology based upon the adoption of non-ideal joints at interface locations, and the inclusion of component flexibility, is developed: both are necessary if one wants to capture the dynamic effects that arise in lightweight, high-speed applications. With regard to the first topic, both a ball bearing model and a journal bearing model are implemented, in order to properly capture the dynamic effects of the main connections in the system: angular contact ball bearings are modelled according to a five-DOF nonlinear scheme in order to capture the behaviour of the crankshaft main bearings, while an impedance-based hydrodynamic bearing model is implemented to provide an enhanced prediction of operation at the conrod big end locations. Concerning the second matter, flexible models of the crankshaft and the connecting rod are produced. The well-established Craig-Bampton reduction technique is adopted as a general framework to obtain reduced model representations which are suitable for the subsequent multibody analyses. A particular component mode selection procedure is implemented, based on the concept of Effective Interface Mass, allowing an assessment of the accuracy of the reduced models prior to the nonlinear simulation phase. In addition, a procedure to alleviate the effects of modal truncation, based on the Modal Truncation Augmentation approach, is developed. In order to assess the performance of the proposed modal reduction schemes, numerical tests are performed on the crankshaft and conrod models in both the frequency and modal domains. A multibody model of the cranktrain is eventually assembled and simulated using commercial software. Numerical results are presented, demonstrating the effectiveness of the implemented flexible model reduction techniques.
The advantages over the conventional frequency-based truncation approach are discussed.

Relevance:

30.00%

Publisher:

Abstract:

A servo-controlled automatic machine can perform tasks that involve synchronized actuation of a significant number of servo-axes, namely one degree-of-freedom (DoF) electromechanical actuators. Each servo-axis comprises a servo-motor, a mechanical transmission and an end-effector, and is responsible for generating the desired motion profile and providing the power required to achieve the overall task. The design of such a machine must involve a detailed study from a mechatronic viewpoint, due to its electric and mechanical nature. The first objective of this thesis is the development of an overarching electromechanical model for a servo-axis. Every loss source is taken into account, be it mechanical or electrical. The mechanical transmission is modeled by means of a sequence of lumped-parameter blocks. The electric model of the motor and the inverter takes into account winding losses, iron losses and controller switching losses. No experimental characterizations are needed to implement the electric model, since the parameters are inferred from the data available in commercial catalogs. With the global model available, a second objective of this work is to perform the optimization analysis, in particular the selection of the motor-reducer unit. The optimal transmission ratios that minimize several objective functions are found. An optimization process is carried out and repeated for each candidate motor. Then, we present a novel method where the discrete set of available motors is extended to a continuous domain by fitting manufacturer data. The problem becomes a two-dimensional nonlinear optimization subject to nonlinear constraints, and the solution gives the optimal choice for the motor-reducer system. The presented electromechanical model, along with the implementation of optimization algorithms, forms a complete and powerful simulation tool for servo-controlled automatic machines.
The tool allows for determining a wide range of electric and mechanical parameters and the behavior of the system in different operating conditions.
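
A textbook special case gives a feel for the transmission-ratio optimization repeated over candidate motors. For a purely inertial load driven through a lossless reducer, the motor torque (J_m * tau + J_L / tau) * alpha_L is minimized at tau = sqrt(J_L / J_m). The sketch below assumes exactly that simplified case; it ignores the loss models, constraints and objective functions of the thesis, and the motor names and numerical values are hypothetical.

```python
import math


def motor_torque(ratio, j_motor, j_load, load_accel):
    """Motor torque [N m] to accelerate a purely inertial load through a lossless reducer.

    ratio      : motor speed / load speed
    j_motor    : rotor inertia [kg m^2]
    j_load     : load inertia [kg m^2]
    load_accel : load angular acceleration [rad/s^2]
    """
    return (j_motor * ratio + j_load / ratio) * load_accel


def optimal_ratio(j_motor, j_load):
    """Ratio minimizing motor_torque: set d/d(ratio) of the torque to zero."""
    return math.sqrt(j_load / j_motor)


# Hypothetical catalog: rotor inertia of each candidate motor [kg m^2].
candidate_motors = {"motor_A": 1.0e-4, "motor_B": 4.0e-4}
J_LOAD = 0.04        # hypothetical load inertia [kg m^2]
LOAD_ACCEL = 50.0    # hypothetical peak load acceleration [rad/s^2]

# Repeat the single-variable optimization for each candidate motor.
results = {}
for name, j_m in candidate_motors.items():
    tau = optimal_ratio(j_m, J_LOAD)
    results[name] = (tau, motor_torque(tau, j_m, J_LOAD, LOAD_ACCEL))
```

In this simplified setting each motor has its own optimal ratio, so the choice of motor and the choice of reducer cannot be made independently, which is consistent with treating them as one joint optimization problem.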

Relevance:

30.00%

Publisher:

Abstract:

The design optimization of industrial products has always been an essential activity to improve product quality while reducing time-to-market and production costs. Although cost management is very complex and comprises all phases of the product life cycle, the control of geometrical and dimensional variations, known as Dimensional Management (DM), allows compliance with product and process requirements. Hence, tolerance-cost optimization becomes the main practice to provide an effective application of Design for Tolerancing (DfT) and Design to Cost (DtC) approaches by enabling a connection between product tolerances and associated manufacturing costs. However, despite the growing interest in this topic, a profitable application of these techniques in industry is hampered by their complexity: the definition of a systematic framework is the key element to improving design optimization, enhancing the concurrent use of Computer-Aided tools and Model-Based Definition (MBD) practices. The present doctoral research aims to define and develop an integrated methodology for product/process design optimization, to better exploit the new capabilities of advanced simulations and tools. By implementing predictive models and multi-disciplinary optimization, a Computer-Aided Integrated framework for tolerance-cost optimization has been proposed to allow the integration of DfT and DtC approaches and their direct application to the design of automotive components. Several case studies have been considered, with the final application of the integrated framework to a high-performance V12 engine assembly, to achieve both functional targets and cost reduction. From a scientific point of view, the proposed methodology provides an improvement for the tolerance-cost optimization of industrial components. The integration of theoretical approaches and Computer-Aided tools allows the influence of tolerances on both product performance and manufacturing costs to be analysed.
The case studies proved the suitability of the methodology for its application in the industrial field, providing the identification of further areas for improvement and refinement.