11 results for Non-parametric
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
This thesis presents a creative and practical approach to the problem of selection bias. Selection bias may be the most vexing problem in program evaluation, or in any line of research that attempts to assert causality. Some of the greatest minds in economics and statistics have scrutinized the problem, and the resulting approaches, Rubin’s Potential Outcome Approach (Rosenbaum and Rubin, 1983; Rubin, 1991, 2001, 2004) and Heckman’s Selection Model (Heckman, 1979), are widely accepted and used as the best fixes. These solutions to the bias that arises in particular from self-selection are imperfect, and many researchers, when feasible, reserve their strongest causal inference for data from experimental rather than observational studies. The innovative aspect of this thesis is a data transformation that allows the presence of selection bias to be measured and tested in an automatic and multivariate way. The approach involves constructing a multi-dimensional conditional space of the X matrix in which the bias associated with treatment assignment has been eliminated. Specifically, we propose a partial dependence analysis of the X-space as a tool for investigating the dependence relationship between a set of observable pre-treatment categorical covariates X and a treatment indicator variable T, in order to obtain a measure of bias according to their dependence structure. The measure of selection bias is then expressed in terms of the inertia due to the dependence between X and T that has been eliminated. Given this measure, we propose a multivariate test of imbalance to check whether the detected bias is significant, using the asymptotic distribution of the inertia due to T (Estadella et al., 2005) and preserving the multivariate nature of the data.
Further, we propose a clustering procedure as a tool for finding groups of comparable units on which to estimate local causal effects, with the multivariate test of imbalance serving as a stopping rule for choosing the best cluster solution. The method is non-parametric: it does not call for modeling the data on the basis of some underlying theory or assumption about the selection process, but instead uses the existing variability within the data and lets the data speak. The idea of this multivariate approach to measuring selection bias and testing balance comes from the observation that, in applied research, all aspects of multivariate balance not represented in univariate, variable-by-variable summaries are ignored. The first part contains an introduction to evaluation methods as part of public and private decision processes and a review of the literature on evaluation methods. Attention is focused on Rubin’s Potential Outcome Approach, matching methods and, briefly, Heckman’s Selection Model. The second part focuses on some limitations of conventional methods, with particular attention to the problem of how to test balance correctly. The third part contains the original contribution, a simulation study that checks the performance of the method for a given dependence setting, and an application to a real data set. Finally, we discuss the results, conclude, and outline future perspectives.
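As a rough illustration of the kind of quantity involved, the sketch below computes the total inertia of a contingency table that crosses covariate patterns of X with the treatment indicator T, using the standard correspondence-analysis identity that total inertia equals the Pearson chi-square statistic divided by the sample size. The counts and the function name `total_inertia` are hypothetical, and this is not the partial dependence analysis or the imbalance test of the thesis, only a basic ingredient of that family of measures.

```python
def total_inertia(table):
    """Total inertia (Pearson chi-square divided by n) of a two-way table.

    `table` is a list of rows of non-negative counts; here rows stand for
    covariate patterns of X and columns for the treatment groups T = 0, 1.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return chi2 / n


# Hypothetical counts: three covariate patterns by (control, treated).
balanced = [[20, 20], [30, 30], [10, 10]]
imbalanced = [[35, 5], [30, 30], [5, 35]]

print(total_inertia(balanced))    # zero: X and T are independent here
print(total_inertia(imbalanced))  # positive: treatment depends on X
```

An inertia of zero means the covariate patterns are distributed identically across treatment groups; larger values indicate stronger dependence between X and T.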
Abstract:
Osteoarthritis (OA), or degenerative joint disease (DJD), is a pathology that affects the synovial joints and is characterised by a focal loss of articular cartilage and a subsequent bony reaction of the subchondral and marginal bone. Its etiology is best explained by a multifactorial model including age, sex, genetic and systemic factors, other predisposing diseases and functional stress. This study presents the results of the investigation of a modern identified skeletal collection. In particular, we focus on the relationship between the presence of OA at various joints. The joint modifications were analysed using a new methodology that allows the scoring of different degrees of expression of the features considered.
Materials and Methods: The sample examined comes from the Sassari identified skeletal collection (part of the “Frassetto collections”). The individuals were born between 1828 and 1916 and died between 1918 and 1932. Sex and age are known for all the individuals. Occupation is known for 173 males and 125 females. The occupational data indicate a preindustrial and rural society. OA was diagnosed when eburnation (EB) or loss of morphology (LM) was present, or when at least two of the following were present: marginal lipping (ML), exostosis (EX) or erosion (ER). For each affected articular surface a “mean score” was calculated, reflecting the “severity” of the alterations. A further “score” was calculated for each joint. In the analysis, sexes and age classes were always kept separate. Non-parametric tests were used for the statistical analyses.
Results: The results show an increase of OA with age in all the joints analysed, in particular around 50 and 60 years. The shoulder, the hip and the knee are the joints most affected by ageing, while the ankle is the least affected; the correlation values confirm this result. The lesion that shows the strongest correlation with age is ML. In our sample males are more frequently and more severely affected by OA than females, particularly in the upper limbs, while the hip and knee are similarly affected in the two sexes. Lateralization shows some positive results, in particular in the right shoulder of males and in various articular surfaces, especially of the upper limb, in both males and females; articular surfaces and joints are almost always lateralized to the right. Occupational analyses did not show remarkable results, probably because of the homogeneity of the sample: although the males performed different activities, almost all were employed in physically stressful work. No higher prevalence of knee and hip OA was found in farm-workers than in the other males.
Discussion and Conclusion: In this work we propose a methodology for scoring the different features needed to diagnose OA, which allows the severity of joint degeneration to be investigated. This method is easier than the one proposed by Buikstra and Ubelaker (1994), but at the same time allows a fairly detailed recording of the features. The epidemiological results can be interpreted quite simply and are in accordance with other studies; the occupational results are more difficult to interpret, because many questions concerning the activities performed by the individuals of the collection during their lifetime cannot be answered. Because of this, caution is suggested in the interpretation of bioarchaeological specimens. With this work we hope to contribute to the discussion on the puzzling problem of the etiology of OA. The possibility of studying identified skeletons will add important data to the description of the osseous features of OA, enriching the medical documentation, which is based on different criteria. Although we are aware that clinical diagnosis differs from palaeopathological diagnosis, we think our work will be useful in clarifying some epidemiological as well as pathological aspects of OA.
Abstract:
The primary aim of this dissertation is to identify subgroups of patients with chronic kidney disease (CKD) who have a differential risk of disease progression; the secondary aim is to compare two equations for estimating the glomerular filtration rate (GFR). To this end, the PIRP (Prevention of Progressive Kidney Disease) registry was linked with the dialysis and mortality registries. The outcome of interest is the mean annual variation of GFR, estimated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation. A decision tree model based on the non-parametric procedure CHAID (Chi-squared Automatic Interaction Detector) was used to subtype CKD patients. The independent variables of the model include gender, age, diabetes, hypertension, cardiac diseases, body mass index, baseline serum creatinine, haemoglobin, proteinuria, LDL cholesterol, triglycerides, serum phosphates, glycemia, parathyroid hormone and uricemia. The decision tree model classified patients into 10 terminal nodes using 6 variables (gender, age, proteinuria, diabetes, serum phosphates and ischemic cardiac disease) that predict a differential progression of kidney disease. Specifically, age ≤53 years, male gender, proteinuria, diabetes and serum phosphates >3.70 mg/dl predict a faster decrease of GFR, while ischemic cardiac disease predicts a slower decrease. The comparison between GFR estimates obtained using the MDRD4 and CKD-EPI equations shows a high percentage agreement (>90%), with modest discrepancies at high and low values of age and serum creatinine. The study results underscore the need for a tight follow-up schedule in patients aged ≤53, and in patients aged 54 to 67 with diabetes, to try to slow down the progression of the disease.
The results also emphasize the effective management of patients aged >67, in whom the estimated decrease in glomerular filtration rate corresponds to the physiological decrease observed in the absence of kidney disease, except for the subgroup of patients with proteinuria, in whom the GFR decline is more pronounced.
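To make the comparison of the two estimating equations concrete, the sketch below implements the 2009 CKD-EPI creatinine equation and the four-variable MDRD equation in its IDMS-traceable form (constant 175). The abstract does not say which variants were used, so these choices are assumptions, as are the function names and the example patient; the race coefficients of both published equations are left out for brevity.

```python
def ckd_epi_2009(scr_mg_dl, age_years, female):
    """eGFR (mL/min/1.73 m^2) from the 2009 CKD-EPI creatinine equation.

    Assumption: the race coefficient of the published equation is omitted.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    return egfr * 1.018 if female else egfr


def mdrd4(scr_mg_dl, age_years, female):
    """eGFR from the four-variable MDRD equation (IDMS-traceable form).

    Assumption: the race coefficient of the published equation is omitted.
    """
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    return egfr * 0.742 if female else egfr


# Hypothetical patient: 60-year-old male, serum creatinine 1.0 mg/dl.
print(ckd_epi_2009(1.0, 60, female=False))
print(mdrd4(1.0, 60, female=False))
```

Running both functions over a cohort and cross-tabulating the resulting GFR categories is one simple way to obtain a percentage agreement of the kind reported above.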
Abstract:
The fall of the Berlin Wall opened the way for a reform path, the transition process, which led ten former Socialist countries in Central and South Eastern Europe to knock at the EU's door. At the time of EU accession, however, several economic and structural weaknesses remained. A tendency towards convergence between the new Member States (NMS) and the EU average income level emerged, together with a spread of inequality at the sub-regional level, mainly driven by the backwardness of the agricultural and rural areas. Considerable progress has been made in evaluating policies for rural areas, but a shared definition of rurality is still missing. Numerous indicators have been calculated to assess the effectiveness of the Common Agricultural Policy (CAP) and Rural Development Policy. Previous analyses of the Central and Eastern European countries found that the characteristics of the most backward areas were insufficiently addressed by the policies enacted; the low availability and accountability of data at the sub-regional level, and the deficiencies in institutional planning and implementation, were an obstacle to targeting policies and payments. This work aims to provide a basis for understanding the connections between the peculiarities of the transition process, the current development performance of the NMS and the role of the EU, with particular attention to agricultural and rural areas. Applying a mixed methodological approach (multivariate statistics, non-parametric methods, spatial econometrics), this study contributes to the identification of rural areas and to the analysis of the changes that occurred in Hungary during EU membership, assessing the effect of the introduction of the CAP and its contribution to the convergence of the Hungarian agricultural and rural areas. The author believes that more targeted, and therefore more efficient, policies for agricultural and rural areas require a deeper knowledge of their structural and dynamic characteristics.
Abstract:
The concept of competitiveness, long considered to be strictly connected to economic and financial performance, has evolved, above all in recent years, toward new and wider interpretations that reveal its multidimensional nature. The shift to a multidimensional view of the phenomenon has prompted an intense debate involving theoretical reflections on its characteristic features, as well as methodological considerations on its assessment and measurement. The present research has a twofold objective: to study in depth the tangible and intangible aspects that characterize multidimensional competitive phenomena from a micro-level point of view, and to measure competitiveness through a model-based approach. Specifically, we propose a non-parametric approach to Structural Equation Model techniques for the computation of multidimensional composite measures. Structural Equation Model tools are used to develop the empirical application to the Italian case: a model-based, micro-level competitiveness indicator is constructed to measure the phenomenon on a large sample of Italian small and medium enterprises.
Abstract:
In this thesis we developed solutions to common issues affecting widefield microscopes, addressing the problem of intensity inhomogeneity within an image and dealing with two strong limitations: the impossibility of acquiring either highly detailed images representative of whole samples or deep 3D objects. First, we address the non-uniform distribution of the light signal inside a single image, known as vignetting. In particular, we propose, for both light and fluorescence microscopy, non-parametric multi-image methods in which the vignetting function is estimated directly from the sample without requiring any prior information. After obtaining flat-field corrected images, we studied how to overcome the limited field of view of the camera, so as to be able to acquire large areas at high magnification. To this end, we developed mosaicing techniques capable of working on-line. Starting from a set of manually acquired overlapping images, we validated a fast registration approach that accurately stitches the images together. Finally, we worked to virtually extend the field of view of the camera in the third dimension, with the purpose of reconstructing a single image completely in focus from objects that have a relevant depth or lie in different focal planes. After studying the existing approaches for extending the depth of focus of the microscope, we proposed a general method that does not require any prior information. Different standard metrics are commonly used in the literature to compare the outcome of existing methods. However, no metric is available to compare different methods in real cases. First, we validated a metric able to rank the methods as the Universal Quality Index does, but without needing any reference ground truth. Second, we showed that the approach we developed performs better in both synthetic and real cases.
Abstract:
The aim of this study was to examine whether a real high-speed, short-term competition influences clinicopathological data, focusing on muscle enzymes, iron profile and Acute Phase Proteins. Thirty Thoroughbred racing horses (15 geldings and 15 females) aged 4 to 12 years (mean 7 years) were used for the study. All the animals performed a high-speed, short-term competition over a total distance of 154 m in about 12 seconds, repeated 8 times within approximately one hour (Niballo Horse Race). Blood samples were obtained 24 hours before and within 30 minutes after the end of the races. A complete blood count (CBC) and biochemical and haemostatic profiles were performed on all samples. The post-race concentrations of each parameter were corrected using an estimate of the plasma volume contraction based on the individual Alb concentration. Data were analysed with descriptive statistics and the percentage variation from the baseline values was recorded. Pre- and post-race results were compared with non-parametric statistics (Mann-Whitney U test). A difference was considered significant at p<0.05. A significant plasma volume contraction after the race was detected (Hct, Alb; p<0.01). Other relevant findings were increased concentrations of muscle enzymes (CK, LDH; p<0.01) and Crt (p<0.01), a significant increase in uric acid (p<0.01), a significant decrease in haptoglobin (p<0.01) associated with an increase in ferritin concentrations (p<0.01), and a significant decrease in fibrinogen (p<0.05) accompanied by a non-significant increase in D-Dimer concentrations (p=0.08). This competition produced relevant clinicopathological abnormalities in galloping horses. The study confirms significant muscle damage, oxidative stress, intravascular haemolysis and subclinical haemostatic alterations. Further studies are needed to better understand the pathogenesis, the medical relevance and the impact on performance of these alterations in equine sports medicine.
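The comparison named above can be illustrated with a minimal pure-Python computation of the Mann-Whitney U statistics. This is a sketch only: the function name and the enzyme values are hypothetical, ties are handled with midranks, and no p-value is computed (in practice a statistics package would supply it).

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), the Mann-Whitney U statistics of two samples.

    Ranks are assigned over the pooled data, with tied values sharing the
    average of the ranks they occupy (midranks).
    """
    pooled = sorted(list(sample_a) + list(sample_b))
    # Map each distinct value to its midrank in the pooled ordering.
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(midrank[v] for v in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical enzyme activities before and after the race.
pre_race = [210, 180, 250, 195, 230]
post_race = [320, 290, 410, 275, 360]
print(mann_whitney_u(pre_race, post_race))
```

When every post-race value exceeds every pre-race value, as in this made-up example, the smaller of the two statistics is zero, which is the most extreme separation the test can register.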
Abstract:
The objective of this study is to measure the impact of the national subsidy scheme on the olive and fruit sector in two regions of Albania, Shkodra and Fier. Methodologically, we use a non-parametric approach based on propensity score matching. This method overcomes the problem of missing data by creating a counterfactual scenario. In the first step, the conditional probability of participating in the program was computed. Afterwards, different matching estimators were applied to establish whether the subsidies have affected the sector's performance. One of the strengths of this study lies in the data: cross-sectional primary data were gathered through about 250 interviews. We found no empirical evidence of significant effects of the government aid program on production. The differences in production found between beneficiaries and non-beneficiaries disappear after adjusting for the conditional probability of participating in the program. This suggests that subsidized farmers would have performed better than non-subsidized households even in the absence of production grants, revealing self-selection into the program. On the other hand, the scheme has positively affected farm structure by increasing the area under cultivation, but yields have not increased for beneficiaries compared to non-beneficiaries. These combined results shed light on the reason for the missing impact. It is reasonable to believe that the new plantations, in particular in the case of olives, have not yet reached full production. We therefore have reason to expect positive impacts in the future. As for qualitative results, the extension of the area under cultivation is strongly constrained by the small farm size. This, together with a thin land market, makes expansion beyond farm boundaries extremely difficult.
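The matching step can be sketched as follows, assuming the propensity scores have already been estimated (for instance with a logit or probit model; the abstract does not specify which). The sketch uses one-to-one nearest-neighbour matching with replacement, which is only one of the matching estimators that could have been applied, and every name and number in it is hypothetical.

```python
def att_nearest_neighbour(treated, controls):
    """Average treatment effect on the treated via 1-NN matching.

    `treated` and `controls` are lists of (propensity_score, outcome)
    pairs. Each treated unit is matched, with replacement, to the control
    whose propensity score is closest.
    """
    differences = []
    for score_t, outcome_t in treated:
        _, outcome_c = min(controls, key=lambda c: abs(c[0] - score_t))
        differences.append(outcome_t - outcome_c)
    return sum(differences) / len(differences)


# Hypothetical (propensity score, production) pairs.
beneficiaries = [(0.80, 12.0), (0.60, 10.0), (0.70, 11.0)]
non_beneficiaries = [(0.82, 12.0), (0.58, 9.0), (0.30, 6.0), (0.71, 11.5)]

raw_gap = (sum(y for _, y in beneficiaries) / len(beneficiaries)
           - sum(y for _, y in non_beneficiaries) / len(non_beneficiaries))
print(raw_gap)
print(att_nearest_neighbour(beneficiaries, non_beneficiaries))
```

In this made-up data the raw difference in mean production is much larger than the matched estimate, which mirrors the pattern described above: a gap that shrinks once units are compared at similar participation probabilities points to self-selection rather than a program effect.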
Abstract:
This work provides a step forward in the study and understanding of the relationships between stochastic processes and a certain class of partial integro-differential equations, which can be used to model anomalous diffusion and transport in statistical physics. In the first part, we led the reader through the fundamental notions of probability and stochastic processes, as well as stochastic integration and stochastic differential equations. In particular, within the study of H-sssi processes, we focused on fractional Brownian motion (fBm) and its discrete-time increment process, fractional Gaussian noise (fGn), which provide examples of non-Markovian Gaussian processes. The fGn, together with stationary FARIMA processes, is widely used in the modeling and estimation of long memory, or long-range dependence (LRD). Time series exhibiting long-range dependence are often observed in nature, especially in physics, meteorology and climatology, but also in hydrology, geophysics, economics and many other fields. We studied LRD in depth, giving many real-data examples, providing statistical analysis and introducing parametric methods of estimation. Then, we introduced the theory of fractional integrals and derivatives, which turns out to be very appropriate for studying and modeling systems with long-memory properties. After introducing the basic concepts, we provided many examples and applications. For instance, we investigated the relaxation equation with distributed-order time-fractional derivatives, which describes models characterized by a strong memory component and can be used to model relaxation in complex systems that deviates from the classical exponential Debye pattern. Then, we focused on the study of generalizations of the standard diffusion equation, passing through the preliminary study of the fractional forward drift equation. Such generalizations were obtained by using fractional integrals and derivatives of distributed orders.
In order to find a connection between the anomalous diffusion described by these equations and long-range dependence, we introduced and studied the generalized grey Brownian motion (ggBm), a parametric class of H-sssi processes whose marginal probability density function evolves in time according to a partial integro-differential equation of fractional type. The ggBm is non-Markovian. Throughout the work, we remarked many times that, starting from a master equation for a probability density function f(x,t), it is always possible to define an equivalence class of stochastic processes with the same marginal density function f(x,t). All these processes provide suitable stochastic models for the starting equation. In studying the ggBm, we focused on a subclass made up of processes with stationary increments. The ggBm is defined canonically in the so-called grey noise space. However, we were able to provide a characterization that does not depend on the underlying probability space. We also pointed out that the generalized grey Brownian motion is a direct generalization of a Gaussian process and, in particular, that it generalizes Brownian motion and fractional Brownian motion. Finally, we introduced and analyzed a more general class of diffusion-type equations related to certain non-Markovian stochastic processes. We started from the forward drift equation, which was made non-local in time by the introduction of a suitably chosen memory kernel K(t). The resulting non-Markovian equation was interpreted in a natural way as the evolution equation of the marginal density function of a random time process l(t). We then considered the subordinated process Y(t)=X(l(t)), where X(t) is a Markovian diffusion. The corresponding time evolution of the marginal density function of Y(t) is governed by a non-Markovian Fokker-Planck equation which involves the same memory kernel K(t).
We developed several applications and derived the exact solutions. Moreover, we considered different stochastic models for the given equations, providing path simulations.
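As a small aid to the discussion of fGn and long-range dependence, the sketch below evaluates the standard autocovariance of fractional Gaussian noise, gamma(k) = (sigma^2 / 2) (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), for a given Hurst exponent H. The function name and parameter names are illustrative and are not taken from the thesis.

```python
def fgn_autocovariance(lag, hurst, variance=1.0):
    """Autocovariance of fractional Gaussian noise at an integer lag."""
    if not 0.0 < hurst < 1.0:
        raise ValueError("the Hurst exponent must lie in (0, 1)")
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * variance * (
        (k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h
    )


# Covariances at lags 0..4 for a long-memory case and for white noise.
print([fgn_autocovariance(k, hurst=0.8) for k in range(5)])
print([fgn_autocovariance(k, hurst=0.5) for k in range(5)])
```

For H = 1/2 the covariances vanish at every non-zero lag, recovering the increments of ordinary Brownian motion, whereas for H > 1/2 they are positive and decay slowly enough to be non-summable, which is the long-range dependence regime.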
Abstract:
The thesis studies the economic and financial conditions of Italian households, using microeconomic data from the Survey on Household Income and Wealth (SHIW) over the period 1998-2006. It develops along two lines of enquiry. First, it studies the determinants of households' holdings of assets and liabilities and estimates their degree of correlation. After a review of the literature, it estimates two non-linear multivariate models of the interactions between assets and liabilities using repeated cross-sections. Second, it analyses households' financial difficulties. It defines a quantitative measure of financial distress and tests, by means of non-linear dynamic probit models, whether the probability of experiencing financial difficulties is persistent over time. Chapter 1 provides a critical review of the theoretical and empirical literature on the estimation of asset and liability holdings, on their interactions and on households' net wealth. The review stresses that a large part of the literature explains households' debt holdings as a function of, among other things, net wealth, an assumption that runs into possible endogeneity problems. Chapter 2 defines two non-linear multivariate models to study the interactions between assets and liabilities held by Italian households. Estimation is carried out on pooled cross-sections of the SHIW. The first model is a bivariate tobit that estimates the factors affecting assets and liabilities and their degree of correlation, with results consistent with theoretical expectations. To tackle non-normality and heteroskedasticity in the error term, which make the tobit estimators inconsistent, semi-parametric estimates are provided that confirm the results of the tobit model. The second model is a quadrivariate probit on three different assets (safe, risky and real) and total liabilities; the results show the patterns of interdependence suggested by theoretical considerations.
Chapter 3 reviews the methodologies for estimating non-linear dynamic panel data models, drawing attention to the problems that must be dealt with to obtain consistent estimators. Specific attention is given to the initial conditions problem raised by the inclusion of the lagged dependent variable in the set of explanatory variables. The advantage of dynamic panel data models is that they make it possible to account simultaneously for true state dependence, via the lagged variable, and for unobserved heterogeneity, via the specification of individual effects. Chapter 4 applies the models reviewed in Chapter 3 to analyse the financial difficulties of Italian households, using the information on net wealth provided in the panel component of the SHIW. The aim is to test whether households persistently experience financial difficulties over time. A thorough discussion is provided of the alternative approaches proposed in the literature (subjective/qualitative indicators versus quantitative indexes) for identifying households in financial distress. Households in financial difficulties are identified as those holding net wealth below the first quartile of the net wealth distribution. Estimation is conducted via four different methods: the pooled probit model, the random effects probit model with exogenous initial conditions, the Heckman model and the recently developed Wooldridge model. Results obtained from all estimators accept the null hypothesis of true state dependence and show that, in line with the literature, the less sophisticated models, namely the pooled and exogenous models, over-estimate such persistence.
Abstract:
Hydrothermal fluids are a fundamental resource for understanding and monitoring volcanic and non-volcanic systems. This thesis focuses on the study of hydrothermal systems through numerical modeling with the geothermal simulator TOUGH2. Several simulations are presented, and the geophysical and geochemical observables arising from fluid circulation are analyzed in detail throughout the thesis. In a volcanic setting, the fluids feeding fumaroles and hot springs may play a key role in hazard evaluation. The evolution of fluid circulation is driven by a strong interaction between the magmatic and hydrothermal systems. A simultaneous analysis of different geophysical and geochemical observables is a sound approach for interpreting monitoring data and inferring a consistent conceptual model. The observables analyzed are ground displacement, gravity changes, electrical conductivity, the amount, composition and temperature of the gases emitted at the surface, and the extent of the degassing area. The results highlight the different temporal responses of the observables considered, as well as their different radial patterns of variation. However, the magnitude, temporal response and radial pattern of these signals depend not only on the evolution of fluid circulation but also, to a large extent, on the rock properties considered. Numerical simulations highlight the differences that arise from assuming different permeabilities, for both homogeneous and heterogeneous systems. Rock properties affect hydrothermal fluid circulation, controlling both the range of variation and the temporal evolution of the observable signals. Low-temperature fumaroles and low discharge rates may be affected by atmospheric conditions. Detailed parametric simulations were performed, aimed at understanding the effects of system properties, such as permeability and gas reservoir overpressure, on diffuse degassing when changes in air temperature and barometric pressure are applied at the ground surface.
Hydrothermal circulation, however, is not only a characteristic of volcanic systems. Hot fluids may be involved in several problems of societal interest, such as geothermal engineering, nuclear waste propagation in porous media, and Geological Carbon Sequestration (GCS). The current concept for large-scale GCS is the direct injection of supercritical carbon dioxide into deep geological formations, which typically contain brine. Upward displacement of such brine from deep reservoirs, driven by the pressure increases resulting from carbon dioxide injection, may occur through abandoned wells, permeable faults or permeable channels. Brine intrusion into aquifers may degrade groundwater resources. Numerical results show that the pressure rise drives dense water up the conduits and does not necessarily result in continuous flow. Rather, overpressure leads to a new hydrostatic equilibrium if the fluids are initially density stratified. If the warm, salty fluid does not cool as it passes through the conduit, an oscillatory solution is possible. Parameter studies delineate steady-state (static) and oscillatory solutions.