819 results for ESTIMATOR
Abstract:
The multiple linear regression model plays a key role in statistical inference and has extensive applications in business, environmental, physical and social sciences. Multicollinearity has long been a considerable problem in multiple regression analysis. When the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. Several statistical methods can address this problem; those discussed in this thesis are the ridge regression, Liu, two-parameter biased and LASSO estimators. Firstly, an analytical comparison on the basis of risk was made among the ridge, Liu and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge and Liu estimators over a significant portion of the parameter space in high dimensions. Secondly, a simulation study was conducted to compare the performance of the ridge, Liu and two-parameter biased estimators by the mean squared error criterion. I found that the two-parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
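Under the orthonormal design that the analytical comparison assumes, all three shrinkage rules act coordinate-wise on the OLS estimate, so their risks can be compared directly by Monte Carlo. The sketch below is illustrative only: the tuning constants k, d and the LASSO penalty are arbitrary choices, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def risk(est, beta):
    """Monte Carlo risk: average of ||estimate - beta||^2 over replications."""
    return float(np.mean(np.sum((est - beta) ** 2, axis=1)))

# Orthonormal design: the OLS estimate is z = beta + N(0, I) noise.
p, reps = 20, 5000
beta = np.zeros(p)
beta[:3] = 2.0                                   # sparse truth favours LASSO
z = beta + rng.standard_normal((reps, p))        # simulated OLS estimates

k, d, lam = 0.5, 0.7, 1.0                        # illustrative tuning constants
ridge = z / (1 + k)                              # ridge: uniform shrinkage
liu = (1 + d) / 2 * z                            # Liu estimator under orthonormality
lasso = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)   # soft thresholding
```

Soft thresholding sets small coordinates exactly to zero, which is why LASSO gains most when the true coefficient vector is sparse and the dimension is large, consistent with the dominance result stated above.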
Abstract:
The objective of this research was to investigate the monthly climatological, seasonal, annual and interdecadal behavior of reference evapotranspiration (ETo) in the state of Acre, in order to better understand its spatial and temporal variability and identify possible trends in the region. The study was conducted with data from the municipalities of Rio Branco (the state capital), Tarauacá and Cruzeiro do Sul over a 30-year period (1985-2014), using monthly data from surface weather stations of the National Institute of Meteorology. First, the meteorological data were checked for consistency, and gaps in the time series were filled by means of multivariate techniques. Subsequently, statistical tests for trend (Mann-Kendall) and homogeneity were performed; the magnitude of the trend was estimated with Sen's estimator, and computational algorithms containing parametric and non-parametric two-sample tests were applied to identify the year from which the trend became significant. Finally, analysis of variance (ANOVA) was adopted to verify whether there were significant differences in mean annual evapotranspiration between the locations. The indirect Penman-Monteith method, as parameterized by FAO, was used to calculate ETo. Descriptive statistics showed that the mean annual ETo was 3.80, 2.92 and 2.86 mm day-1 for Rio Branco, Tarauacá and Cruzeiro do Sul, respectively. ETo displays a quite remarkable seasonal pattern, with a minimum in June and a maximum in October; Rio Branco is the location with the strongest seasonal signal (largest amplitude), while Cruzeiro do Sul presents the highest variability among the locations studied. ANOVA showed that the annual means differ statistically between locations at the 1% significance level, except between Cruzeiro do Sul and Tarauacá, whose annual means show no statistically significant difference.
For the three locations, the 2000s was the decade with the highest ETo values, associated with warmer waters of the North Atlantic basin, while the 1980s had the lowest values, associated with cooler waters of that basin. The Mann-Kendall test and Sen's estimator showed an increasing trend in seasonal reference evapotranspiration (fall, winter and spring) on the order of 0.11 mm per decade, which became statistically significant from 1990, 1996 and 2001 for Cruzeiro do Sul, Tarauacá and Rio Branco, respectively. Trend analysis of the meteorological parameters revealed positive trends, significant at the 5% level, for mean temperature, minimum temperature and solar radiation.
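The two trend tools named above can be sketched in a few lines; the synthetic annual ETo series and its trend magnitude below are invented for illustration, not taken from the study.

```python
import numpy as np
from math import erf, sqrt

def mann_kendall(y):
    """Mann-Kendall trend test: returns S and a two-sided normal-approximation
    p-value (no correction for ties)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    s = sum(np.sign(y[j] - y[i]) for i in range(n) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / sqrt(var_s) if s != 0 else 0.0
    return s, 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))

def sen_slope(y):
    """Sen's estimator of trend magnitude: the median of all pairwise slopes."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    return float(np.median([(y[j] - y[i]) / (j - i)
                            for i in range(n) for j in range(i + 1, n)]))

# Synthetic 30-year annual series with an invented upward trend of 0.011/yr.
rng = np.random.default_rng(1)
eto = 3.0 + 0.011 * np.arange(30) + rng.normal(0.0, 0.02, 30)
s, p_value = mann_kendall(eto)
slope = sen_slope(eto)
```

Because Sen's estimator is a median of pairwise slopes, it is insensitive to a few anomalous years, which is why it pairs naturally with the rank-based Mann-Kendall test.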
Abstract:
Bayesian adaptive methods have been extensively used in psychophysics to estimate the point at which performance on a task attains arbitrary percentage levels, although the statistical properties of these estimators have never been assessed. We used simulation techniques to determine the small-sample properties of Bayesian estimators of arbitrary performance points, specifically addressing the issues of bias and precision as a function of the target percentage level. The study covered three major types of psychophysical task (yes-no detection, 2AFC discrimination and 2AFC detection) and explored the entire range of target performance levels allowed for by each task. Other factors included in the study were the form and parameters of the actual psychometric function Psi, the form and parameters of the model function M assumed in the Bayesian method, and the location of Psi within the parameter space. Our results indicate that Bayesian adaptive methods render unbiased estimators of any arbitrary point on Psi only when M = Psi, and otherwise they yield bias whose magnitude can be considerable as the target level moves away from the midpoint of the range of Psi. The standard error of the estimator also increases as the target level approaches extreme values, whether or not M = Psi. Contrary to widespread belief, neither the performance level at which bias is null nor that at which standard error is minimal can be predicted by the sweat factor. A closed-form expression nevertheless gives a reasonable fit to data describing the dependence of standard error on number of trials and target level, which allows determination of the number of trials that must be administered to obtain estimates with prescribed precision.
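The core of such a Bayesian adaptive method can be sketched as a grid posterior over the threshold of a yes-no task, with each stimulus placed at the current posterior mean. Here the model function M equals the true Psi (a logistic function), the case in which estimation is reported to be unbiased; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def psi(x, threshold, slope=1.0):
    """True psychometric function Psi for a yes-no task (logistic)."""
    return 1.0 / (1.0 + np.exp(-slope * (x - threshold)))

# Grid posterior over the threshold; the model M equals Psi in this sketch.
grid = np.linspace(-5.0, 5.0, 201)
log_post = np.zeros_like(grid)
true_threshold = 1.2

for _ in range(200):                        # 200 adaptive trials
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    x = float(np.sum(grid * post))          # adaptive placement: posterior mean
    yes = rng.random() < psi(x, true_threshold)   # simulated observer response
    p = psi(x, grid)                        # likelihood of a 'yes' on the grid
    log_post += np.log(p) if yes else np.log(1.0 - p)

post = np.exp(log_post - log_post.max())
post /= post.sum()
estimate = float(np.sum(grid * post))       # Bayesian point estimate
```

Replacing `psi` inside the update with a mismatched model function M is the manipulation studied above, and is what introduces bias away from the midpoint of the range.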
Abstract:
The paper considers various extended asymmetric multivariate conditional volatility models, and derives appropriate regularity conditions and associated asymptotic theory. This enables checking of internal consistency and allows valid statistical inferences to be drawn based on empirical estimation. For this purpose, we use an underlying vector random coefficient autoregressive process, for which we show the equivalent representation for the asymmetric multivariate conditional volatility model, to derive asymptotic theory for the quasi-maximum likelihood estimator. As an extension, we develop a new multivariate asymmetric long memory volatility model, and discuss the associated asymptotic properties.
Abstract:
The paper develops a novel realized matrix-exponential stochastic volatility model of multivariate returns and realized covariances that incorporates asymmetry and long memory (hereafter the RMESV-ALM model). The matrix exponential transformation guarantees the positive definiteness of the dynamic covariance matrix. The contribution of the paper ties in with Robert Basmann’s seminal work in terms of the estimation of highly non-linear model specifications (“Causality tests and observationally equivalent representations of econometric models”, Journal of Econometrics, 1988, 39(1-2), 69–104), especially for developing tests for leverage and spillover effects in the covariance dynamics. Efficient importance sampling is used to maximize the likelihood function of RMESV-ALM, and the finite sample properties of the quasi-maximum likelihood estimator of the parameters are analysed. Using high frequency data for three US financial assets, the new model is estimated and evaluated. The forecasting performance of the new model is compared with a novel dynamic realized matrix-exponential conditional covariance model. The volatility and co-volatility spillovers are examined via the news impact curves and the impulse response functions from returns to volatility and co-volatility.
Abstract:
The Auger Engineering Radio Array (AERA) is part of the Pierre Auger Observatory and is used to detect the radio emission of cosmic-ray air showers. These observations are compared to the data of the surface detector stations of the Observatory, which provide well-calibrated information on the cosmic-ray energies and arrival directions. The response of the radio stations in the 30-80 MHz regime has been thoroughly calibrated to enable the reconstruction of the incoming electric field. For the latter, the energy deposit per area is determined from the radio pulses at each observer position and is interpolated using a two-dimensional function that takes into account signal asymmetries due to interference between the geomagnetic and charge-excess emission components. The spatial integral over the signal distribution gives a direct measurement of the energy transferred from the primary cosmic ray into radio emission in the AERA frequency range. We measure 15.8 MeV of radiation energy for a 1 EeV air shower arriving perpendicularly to the geomagnetic field. This radiation energy, corrected for geometrical effects, is used as a cosmic-ray energy estimator. Performing an absolute energy calibration against the surface-detector information, we observe that this radio-energy estimator scales quadratically with the cosmic-ray energy as expected for coherent emission. We find an energy resolution of the radio reconstruction of 22% for the data set and 17% for a high-quality subset containing only events with at least five radio stations with signal.
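The quadratic scaling of radiation energy with cosmic-ray energy can be sketched as a log-log calibration fit; the data here are synthetic, generated around the reported 15.8 MeV per (EeV)^2 normalization with an invented 10% scatter, and the fit is not AERA's actual calibration procedure.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic calibration set: surface-detector energies (EeV) and radiation
# energies (MeV); coherent emission implies a quadratic scaling.
e_cr = rng.uniform(0.3, 5.0, 200)
e_rad = 15.8 * e_cr ** 2 * rng.lognormal(0.0, 0.1, 200)   # invented 10% scatter

# Power-law fit: log(E_rad) = log(A) + B * log(E_CR); coherence predicts B = 2.
B, log_A = np.polyfit(np.log(e_cr), np.log(e_rad), 1)
A = np.exp(log_A)

def energy_estimator(e_rad_mev):
    """Invert the calibration: radio-based cosmic-ray energy estimate in EeV."""
    return (e_rad_mev / A) ** (1.0 / B)
```

Because the radiation energy grows quadratically, the inverted estimator behaves like a square root of the measured radiation energy, which is why fractional energy resolution in the radio measurement translates into roughly half that resolution in the cosmic-ray energy.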
Abstract:
The position of a stationary target can be determined using triangulation in combination with time-of-arrival measurements at several sensors. In urban environments, non-line-of-sight (NLOS) propagation leads to biased time estimation and thus to inaccurate position estimates. Here, a semi-parametric approach is proposed to mitigate the effects of NLOS propagation. The degree of contamination by NLOS components in the observations, which results in asymmetric noise statistics, is determined and incorporated into the estimator. The proposed method is adequate for environments where the NLOS error plays a dominant role and outperforms previous approaches that assume symmetric noise statistics.
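The paper's semi-parametric estimator is not reproduced here, but the basic idea, that NLOS contamination biases ranges upward only, so positive residuals deserve different treatment than negative ones, can be illustrated with a simple asymmetrically weighted least-squares grid search on synthetic data. The sensor layout, noise levels and asymmetry weight are all invented.

```python
import numpy as np

rng = np.random.default_rng(8)

# Six sensors and ranges (TOA times propagation speed) to a stationary target;
# two of the measurements carry a large positive NLOS bias.
sensors = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, -20], [-20, 50]], float)
target = np.array([37.0, 62.0])
ranges = np.linalg.norm(sensors - target, axis=1) + rng.normal(0.0, 1.0, 6)
ranges[[1, 4]] += rng.uniform(10.0, 30.0, 2)     # NLOS contamination

def asym_loss(p, r, w_pos=0.05):
    """Down-weight positive residuals: NLOS can only inflate ranges upward."""
    res = r - np.linalg.norm(sensors - p, axis=1)
    return float(np.sum(np.where(res > 0, w_pos, 1.0) * res ** 2))

def ls_loss(p, r):
    """Ordinary least squares: implicitly assumes symmetric noise."""
    res = r - np.linalg.norm(sensors - p, axis=1)
    return float(np.sum(res ** 2))

# Illustrative grid search over candidate positions (a real solver would iterate).
xs = np.linspace(-30.0, 130.0, 161)
grid = np.array([[x, y] for x in xs for y in xs])
best_asym = grid[np.argmin([asym_loss(p, ranges) for p in grid])]
best_ls = grid[np.argmin([ls_loss(p, ranges) for p in grid])]
err_asym = float(np.linalg.norm(best_asym - target))
err_ls = float(np.linalg.norm(best_ls - target))
```

The symmetric least-squares fit is pulled several meters toward compensating the inflated ranges, while the asymmetric loss largely ignores them, mirroring the claim that symmetric-noise approaches underperform when NLOS errors dominate.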
Abstract:
Social attitudes, attitudes toward financial risk and attitudes toward deferred gratification are thought to influence many important economic decisions over the life-course. In economic theory, these attitudes are key components in diverse models of behavior, including collective action, saving and investment decisions and occupational choice. The relevance of these attitudes has been confirmed empirically. Yet, the factors that influence them are not well understood. This research evaluates how these attitudes are affected by large disruptive events, namely, a natural disaster and a civil conflict, and also by an individual-specific life event, namely, having children.
By implementing rigorous empirical strategies drawing on rich longitudinal datasets, this research project advances our understanding of how life experiences shape these attitudes. Moreover, compelling evidence is provided that the observed changes in attitudes are likely to reflect changes in preferences given that they are not driven just by changes in financial circumstances. Therefore the findings of this research project also contribute to the discussion of whether preferences are really fixed, a usual assumption in economics.
In the first chapter, I study how altruistic and trusting attitudes are affected by exposure to the 2004 Indian Ocean tsunami as long as ten years after the disaster occurred. Establishing a causal relationship between natural disasters and attitudes presents several challenges as endogenous exposure and sample selection can confound the analysis. I take on these challenges by exploiting plausibly exogenous variation in exposure to the tsunami and by relying on a longitudinal dataset representative of the pre-tsunami population in two districts of Aceh, Indonesia. The sample is drawn from the Study of the Tsunami Aftermath and Recovery (STAR), a survey with data collected both before and after the disaster and especially designed to identify the impact of the tsunami. The altruistic and trusting attitudes of the respondents are measured by their behavior in the dictator and trust games. I find that witnessing closely the damage caused by the tsunami but without suffering severe economic damage oneself increases altruistic and trusting behavior, particularly towards individuals from tsunami affected communities. Having suffered severe economic damage has no impact on altruistic behavior but may have increased trusting behavior. These effects do not seem to be caused by the consequences of the tsunami on people’s financial situation. Instead they are consistent with how experiences of loss and solidarity may have shaped social attitudes by affecting empathy and perceptions of who is deserving of aid and trust.
In the second chapter, co-authored with Ryan Brown, Duncan Thomas and Andrea Velasquez, we investigate how attitudes toward financial risk are affected by elevated levels of insecurity and uncertainty brought on by the Mexican Drug War. To conduct our analysis, we pair the Mexican Family Life Survey (MxFLS), a rich longitudinal dataset ideally suited for our purposes, with a dataset on homicide rates at the month and municipality level. The homicide rates capture well the overall crime environment created by the drug war. The MxFLS elicits risk attitudes by asking respondents to choose between hypothetical gambles with different payoffs. Our strategy to identify a causal effect has two key components. First, we implement an individual fixed effects strategy which allows us to control for all time-invariant heterogeneity. The remaining time-variant heterogeneity is unlikely to be correlated with changes in the local crime environment given the well-documented political origins of the Mexican Drug War. We also show supporting evidence in this regard. The second component of our identification strategy is to use an intent-to-treat approach to shield our estimates from endogenous migration. Our findings indicate that exposure to greater local-area violent crime results in increased risk aversion. This effect is not driven by changes in financial circumstances, but may be explained instead by heightened fear of victimization. Nonetheless, we find that having greater economic resources mitigates the impact. This may be due to individuals with greater economic resources being able to avoid crime by affording better transportation or security at work.
The third chapter, co-authored with Duncan Thomas, evaluates whether attitudes toward deferred gratification change after having children. For this study we also exploit the MxFLS, which elicits attitudes toward deferred gratification (commonly known as time discounting) by asking individuals to choose between hypothetical payments at different points in time. We implement a difference-in-difference estimator to control for all time-invariant heterogeneity and show that our results are robust to the inclusion of time-varying characteristics likely correlated with childbirth. We find that becoming a mother increases time discounting, especially in the first two years after childbirth and in particular for women without a spouse at home. Having additional children does not have an effect, and the effect for men seems to go in the opposite direction. These heterogeneous effects suggest that child rearing may affect time discounting through the stress it generates or through spending needs that are not fully anticipated.
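The difference-in-differences logic used in this chapter can be sketched on a simulated two-wave panel (all numbers invented): differencing within individuals removes the fixed effects, and differencing across groups removes the common time trend.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated two-wave panel: y is, say, a time discounting score; 'treated'
# marks individuals who had a child between the two waves.
n = 2000
treated = rng.random(n) < 0.5
fixed = rng.normal(0.0, 1.0, n)                 # individual fixed effects
effect = 0.3                                    # true treatment effect
y0 = fixed + rng.normal(0.0, 0.5, n)                           # wave 1
y1 = fixed + 0.2 + effect * treated + rng.normal(0.0, 0.5, n)  # wave 2; 0.2 = common trend

# Difference-in-differences: within-person change, differenced across groups,
# removes both the fixed effects and the common trend.
did = (y1[treated] - y0[treated]).mean() - (y1[~treated] - y0[~treated]).mean()
```

In a regression framework the same estimand is the coefficient on the interaction of the treatment indicator with the second-wave dummy, which is where the time-varying controls mentioned above would enter.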
Abstract:
The extremal quantile index is a concept in which the quantile index drifts to zero (or one) as the sample size increases. The three chapters of my dissertation consist of three applications of this concept to distinct econometric problems. In Chapter 2, I use the concept of the extremal quantile index to derive new asymptotic properties and an inference method for quantile treatment effect estimators when the quantile index of interest is close to zero. In Chapter 3, I rely on the concept of the extremal quantile index to achieve identification at infinity of sample selection models and propose a new inference method. Last, in Chapter 4, I use the concept of the extremal quantile index to define an asymptotic trimming scheme which can be used to control the convergence rate of the estimator of the intercept of binary response models.
Abstract:
My dissertation has three chapters which develop and apply microeconometric techniques to empirically relevant problems. All the chapters examine robustness issues (e.g., measurement error and model misspecification) in econometric analysis. The first chapter studies the identifying power of an instrumental variable in the nonparametric heterogeneous treatment effect framework when a binary treatment variable is mismeasured and endogenous. I characterize the sharp identified set for the local average treatment effect under the following two assumptions: (1) the exclusion restriction of an instrument and (2) deterministic monotonicity of the true treatment variable in the instrument. The identification strategy allows for general measurement error. Notably, (i) the measurement error is nonclassical, (ii) it can be endogenous, and (iii) no assumptions are imposed on the marginal distribution of the measurement error, so that I do not need to assume the accuracy of the measurement. Based on the partial identification result, I provide a consistent confidence interval for the local average treatment effect with uniformly valid size control. I also show that the identification strategy can incorporate repeated measurements to narrow the identified set, even if the repeated measurements themselves are endogenous. Using the National Longitudinal Study of the High School Class of 1972, I demonstrate that my new methodology can produce nontrivial bounds for the return to college attendance when attendance is mismeasured and endogenous.
The second chapter, which is part of a coauthored project with Federico Bugni, considers the problem of inference in dynamic discrete choice problems when the structural model is locally misspecified. We consider two popular classes of estimators for dynamic discrete choice models: K-step maximum likelihood estimators (K-ML) and K-step minimum distance estimators (K-MD), where K denotes the number of policy iterations employed in the estimation problem. These estimator classes include popular estimators such as Rust (1987)’s nested fixed point estimator, Hotz and Miller (1993)’s conditional choice probability estimator, Aguirregabiria and Mira (2002)’s nested algorithm estimator, and Pesendorfer and Schmidt-Dengler (2008)’s least squares estimator. We derive and compare the asymptotic distributions of K-ML and K-MD estimators when the model is arbitrarily locally misspecified and we obtain three main results. In the absence of misspecification, Aguirregabiria and Mira (2002) show that all K-ML estimators are asymptotically equivalent regardless of the choice of K. Our first result shows that this finding extends to a locally misspecified model, regardless of the degree of local misspecification. As a second result, we show that an analogous result holds for all K-MD estimators, i.e., all K-MD estimators are asymptotically equivalent regardless of the choice of K. Our third and final result is to compare K-MD and K-ML estimators in terms of asymptotic mean squared error. Under local misspecification, the optimally weighted K-MD estimator depends on the unknown asymptotic bias and is no longer feasible. In turn, feasible K-MD estimators could have an asymptotic mean squared error that is higher or lower than that of the K-ML estimators. To demonstrate the relevance of our asymptotic analysis, we illustrate our findings in a simulation exercise based on a misspecified version of Rust's (1987) bus engine problem.
The last chapter investigates the causal effect of the Omnibus Budget Reconciliation Act of 1993, which caused the biggest change to the EITC in its history, on unemployment and labor force participation among single mothers. Unemployment and labor force participation are difficult to define for a few reasons, for example, because of marginally attached workers. Instead of searching for a unique definition of each of these two concepts, this chapter bounds unemployment and labor force participation by observable variables and, as a result, considers various competing definitions of these two concepts simultaneously. This bounding strategy leads to partial identification of the treatment effect. The inference results depend on the construction of the bounds, but they imply a positive effect on labor force participation and a negligible effect on unemployment. The results imply that the difference-in-difference result based on the BLS definition of unemployment can be misleading due to misclassification of unemployment.
Abstract:
Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator ({\em message}) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or Bayesian variable selection methods, calculates the `median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves very minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to usual competitors.
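The four steps of the {\em message} algorithm can be sketched as follows. The coordinate-descent LASSO stands in for whatever regularized selector is used on each subset, and the penalty default is an illustrative choice, not the thesis's.

```python
import numpy as np

def lasso_cd(X, y, lam, n_sweeps=200):
    """Minimal coordinate-descent LASSO, a stand-in for any regularized
    selector (the thesis also allows Bayesian variable selection)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]      # partial residual
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def message(X, y, n_subsets=4, lam=None):
    """MEdian Selection Subset AGgregation: (1) select features in parallel on
    each subset, (2) keep features whose inclusion index passes the median,
    (3) re-estimate per subset, (4) average the estimates."""
    n, p = X.shape
    if lam is None:
        lam = np.sqrt(2.0 * (n / n_subsets) * np.log(p))   # illustrative default
    subsets = np.array_split(np.random.default_rng(0).permutation(n), n_subsets)
    included = np.array([lasso_cd(X[i], y[i], lam) != 0 for i in subsets])
    keep = included.mean(axis=0) >= 0.5                    # median inclusion index
    betas = [np.linalg.lstsq(X[i][:, keep], y[i], rcond=None)[0] for i in subsets]
    out = np.zeros(p)
    out[keep] = np.mean(betas, axis=0)
    return out

# Toy data: sparse truth, sample size large relative to the dimension.
rng = np.random.default_rng(5)
n, p = 400, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[0], beta_true[5] = 3.0, -2.0
y = X @ beta_true + rng.standard_normal(n)
beta_hat = message(X, y)
```

Only the selected-feature indices and the short coefficient vectors need to travel between workers, which is the source of the minimal-communication property claimed above.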
While sample space partitioning is useful in handling datasets with large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named {\em DECO} for distributed variable selection and parameter estimation. In {\em DECO}, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does NOT depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework {\em DEME} (DECO-message) by leveraging both the {\em DECO} and the {\em message} algorithms. The new framework first partitions the dataset in the sample space into row cubes using {\em message} and then partitions the feature space of the cubes using {\em DECO}. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted in a computer in parallel. The results are then synthesized via the {\em DECO} and {\em message} algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
Abstract:
Free energy calculations are a computational method for determining thermodynamic quantities, such as free energies of binding, via simulation.
Currently, due to computational and algorithmic limitations, free energy calculations are limited in scope.
In this work, we propose two methods for improving the efficiency of free energy calculations.
First, we expand the state space of alchemical intermediates, and show that this expansion enables us to calculate free energies along lower variance paths.
We use Q-learning, a reinforcement learning technique, to discover and optimize paths at low computational cost.
Second, we reduce the cost of sampling along a given path by using sequential Monte Carlo samplers.
We develop a new free energy estimator, pCrooks (pairwise Crooks), a variant on the Crooks fluctuation theorem (CFT), which enables decomposition of the variance of the free energy estimate for discrete paths, while retaining beneficial characteristics of CFT.
Combining these two advancements, we show that for some test models, optimal expanded-space paths have a nearly 80% reduction in variance relative to the standard path.
Additionally, our free energy estimator converges at a more consistent rate and on average 1.8 times faster when we enable path searching, even when the cost of path discovery and refinement is considered.
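pCrooks itself is the contribution of this work and is not reproduced here, but the CFT-based estimation it refines can be illustrated with the standard Bennett acceptance ratio (BAR) estimator, the maximum-likelihood estimator implied by the Crooks fluctuation theorem, applied to toy Gaussian work distributions that satisfy the CFT exactly (beta = 1; all parameters invented).

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy Gaussian work distributions satisfying the Crooks fluctuation theorem
# (beta = 1): forward work ~ N(dF + s^2/2, s^2), reverse work ~ N(-dF + s^2/2, s^2).
true_dF, s, n = 2.5, 1.5, 20000
w_f = rng.normal(true_dF + s ** 2 / 2, s, n)    # forward work samples
w_r = rng.normal(-true_dF + s ** 2 / 2, s, n)   # reverse work samples

def bar(w_f, w_r, lo=-20.0, hi=20.0, tol=1e-8):
    """Bennett acceptance ratio (equal sample sizes): the maximum-likelihood
    free energy estimate implied by the CFT, found by bisection."""
    def g(dF):
        return (np.sum(1.0 / (1.0 + np.exp(w_f - dF)))
                - np.sum(1.0 / (1.0 + np.exp(w_r + dF))))
    while hi - lo > tol:            # g is monotone increasing in dF
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

dF_hat = bar(w_f, w_r)
```

BAR pools forward and reverse work into a single estimate per path; a pairwise variant along the lines described above would instead apply the same machinery to each adjacent pair of intermediates, making the per-segment variance visible.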
Abstract:
The purpose of this dissertation is to examine three distributional issues in macroeconomics. First, I explore the effects of fiscal federalism on economic growth across regions in China. Using the comprehensive official data set of China for 31 regions from 1952 until 1999, I investigate a number of indicators used in the literature to measure federalism and find robust support for only one such measure: the ratio of local total revenue to local tax revenue. Using a difference-in-difference approach and exploiting the two-year gap in the implementation of a tax reform across different regions of China, I also identify a positive relationship between fiscal federalism and regional economic growth. The second paper hypothesizes that an inequitable distribution of income negatively affects the rule of law in resource-rich economies and provides robust evidence in support of this hypothesis. By investigating a data set that contains 193 countries and using econometric methodologies such as the fixed effects estimator and the generalized method of moments estimator, I find that resource-abundance improves the quality of institutions, as long as income and wealth disparity remains below a certain threshold. When inequality moves beyond this threshold, the positive effects of the resource-abundance level on institutions diminish quickly and eventually turn negative. This paper, thus, provides robust evidence about the endogeneity of institutions and the role income and wealth inequality plays in the determination of long-run growth rates. The third paper sets up a dynamic general equilibrium model with heterogeneous agents to investigate the causal channels which run from a concern for international status to long-run economic growth. The simulation results show that the initial distribution of income and wealth plays an important role in whether agents gain or lose from globalization.
Abstract:
Quantile regression (QR) was first introduced by Roger Koenker and Gilbert Bassett in 1978. It is robust to outliers, which can strongly affect the least squares estimator in linear regression. Instead of modeling the mean of the response, QR provides an alternative way to model the relationship between quantiles of the response and covariates. Therefore, QR can be widely used to solve problems in econometrics, environmental sciences and health sciences. Sample size is an important factor in the planning stage of experimental designs and observational studies. In ordinary linear regression, sample size may be determined based on either precision analysis or power analysis with closed-form formulas. There are also methods that calculate sample size for QR based on precision analysis, such as C. Jennen-Steinmetz and S. Wellek (2005). A method to estimate sample size for QR based on power analysis was proposed by Shao and Wang (2009). In this paper, a new method is proposed to calculate sample size based on power analysis under a hypothesis test of covariate effects. Even though an error distribution assumption is not necessary for QR analysis itself, researchers have to make assumptions about the error distribution and covariate structure in the planning stage of a study to obtain a reasonable estimate of sample size. In this project, both parametric and nonparametric methods are provided to estimate the error distribution. Since the proposed method is implemented in R, the user is able to choose either a parametric distribution or nonparametric kernel density estimation for the error distribution. The user also needs to specify the covariate structure and effect size to carry out the sample size and power calculation. The performance of the proposed method is further evaluated using numerical simulation. The results suggest that the sample sizes obtained from our method provide empirical powers that are close to the nominal power level, for example, 80%.
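For the special case of a single binary covariate, the power-analysis idea can be sketched directly: assume an error distribution at the planning stage, size the study with the asymptotic variance of a sample quantile, and verify the resulting power by simulation. This is an illustrative simplification, not the paper's R implementation; the effect size and distributional choices below are invented.

```python
import numpy as np
from math import erf, sqrt, ceil

def n_per_group(effect, tau, f_q, z_alpha=1.959964, z_power=0.841621):
    """Per-group sample size for testing a binary covariate effect on the
    tau-th quantile, using the asymptotic variance tau(1-tau)/(n f_q^2) of a
    sample quantile; f_q is the assumed error density at that quantile
    (a planning-stage assumption, as noted in the abstract)."""
    var1 = 2.0 * tau * (1.0 - tau) / f_q ** 2    # variance of the contrast at n = 1
    return ceil(var1 * (z_alpha + z_power) ** 2 / effect ** 2)

# Planning example: median regression, standard normal errors (f(0) ~ 0.3989),
# effect size 0.5, alpha = 0.05, target power 0.80.
n = n_per_group(effect=0.5, tau=0.5, f_q=0.3989)

# Monte Carlo check of the empirical power at that sample size.
rng = np.random.default_rng(7)
sims, rejections = 2000, 0
for _ in range(sims):
    y0 = rng.standard_normal(n)                  # control group errors
    y1 = 0.5 + rng.standard_normal(n)            # treated group, shifted median
    delta = np.quantile(y1, 0.5) - np.quantile(y0, 0.5)
    se = sqrt(2.0 * 0.25 / (0.3989 ** 2 * n))
    rejections += abs(delta / se) > 1.959964
emp_power = rejections / sims
```

With a binary covariate, the QR slope at quantile tau is just the difference of the two group quantiles, which is what makes this closed-form sizing possible; general covariate structures require the simulation-based machinery the paper describes.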