2 resultados para Random sample
em Duke University
Resumo:
Surveys can collect important data that inform policy decisions and drive social science research. Large government surveys collect information from the U.S. population on a wide range of topics, including demographics, education, employment, and lifestyle. Analysis of survey data presents unique challenges. In particular, one needs to account for missing data, for complex sampling designs, and for measurement error. Conceptually, a survey organization could spend lots of resources getting high-quality responses from a simple random sample, resulting in survey data that are easy to analyze. However, this scenario often is not realistic. To address these practical issues, survey organizations can leverage the information available from other sources of data. For example, in longitudinal studies that suffer from attrition, they can use the information from refreshment samples to correct for potential attrition bias. They can use information from known marginal distributions or survey design to improve inferences. They can use information from gold standard sources to correct for measurement error.
This thesis presents novel approaches to combining information from multiple sources that address the three problems described above.
The first method addresses nonignorable unit nonresponse and attrition in a panel survey with a refreshment sample. Panel surveys typically suffer from attrition, which can lead to biased inference when basing analysis only on cases that complete all waves of the panel. Unfortunately, the panel data alone cannot inform the extent of the bias due to attrition, so analysts must make strong and untestable assumptions about the missing data mechanism. Many panel studies also include refreshment samples, which are data collected from a random sample of new
individuals during some later wave of the panel. Refreshment samples offer information that can be utilized to correct for biases induced by nonignorable attrition while reducing reliance on strong assumptions about the attrition process. To date, these bias correction methods have not dealt with two key practical issues in panel studies: unit nonresponse in the initial wave of the panel and in the
refreshment sample itself. As we illustrate, nonignorable unit nonresponse
can significantly compromise the analyst's ability to use the refreshment samples for attrition bias correction. Thus, it is crucial for analysts to assess how sensitive their inferences---corrected for panel attrition---are to different assumptions about the nature of the unit nonresponse. We present an approach that facilitates such sensitivity analyses, both for suspected nonignorable unit nonresponse
in the initial wave and in the refreshment sample. We illustrate the approach using simulation studies and an analysis of data from the 2007-2008 Associated Press/Yahoo News election panel study.
The second method incorporates informative prior beliefs about
marginal probabilities into Bayesian latent class models for categorical data.
The basic idea is to append synthetic observations to the original data such that
(i) the empirical distributions of the desired margins match those of the prior beliefs, and (ii) the values of the remaining variables are left missing. The degree of prior uncertainty is controlled by the number of augmented records. Posterior inferences can be obtained via typical MCMC algorithms for latent class models, tailored to deal efficiently with the missing values in the concatenated data.
We illustrate the approach using a variety of simulations based on data from the American Community Survey, including an example of how augmented records can be used to fit latent class models to data from stratified samples.
The third method leverages the information from a gold standard survey to model reporting error. Survey data are subject to reporting error when respondents misunderstand the question or accidentally select the wrong response. Sometimes survey respondents knowingly select the wrong response, for example, by reporting a higher level of education than they actually have attained. We present an approach that allows an analyst to model reporting error by incorporating information from a gold standard survey. The analyst can specify various reporting error models and assess how sensitive their conclusions are to different assumptions about the reporting error process. We illustrate the approach using simulations based on data from the 1993 National Survey of College Graduates. We use the method to impute error-corrected educational attainments in the 2010 American Community Survey using the 2010 National Survey of College Graduates as the gold standard survey.
Resumo:
This dissertation explores the complex interactions between organizational structure and the environment. In Chapter 1, I investigate the effect of financial development on the formation of European corporate groups. Since cross-country regressions are hard to interpret in a causal sense, we exploit exogenous industry measures to investigate a specific channel through which financial development may affect group affiliation: internal capital markets. Using a comprehensive firm-level dataset on European corporate groups in 15 countries, we find that countries
with less developed financial markets have a higher percentage of group affiliates in more capital intensive industries. This relationship is more pronounced for young and small firms and for affiliates of large and diversified groups. Our findings are consistent with the view that internal capital markets may, under some conditions, be more efficient than prevailing external markets, and that this may drive group affiliation even in developed economies. In Chapter 2, I bridge current streams of innovation research to explore the interplay between R&D, external knowledge, and organizational structure–three elements of a firm’s innovation strategy which we argue should logically be studied together. Using within-firm patent assignment patterns,
we develop a novel measure of structure for a large sample of American firms. We find that centralized firms invest more in research and patent more per R&D dollar than decentralized firms. Both types access technology via mergers and acquisitions, but their acquisitions differ in terms of frequency, size, and i\ntegration. Consistent with our framework, their sources of value creation differ: while centralized firms derive more value from internal R&D, decentralized firms rely more on external knowledge. We discuss how these findings should stimulate more integrative work on theories of innovation. In Chapter 3, I use novel data on 1,265 newly-public firms to show that innovative firms exposed to environments with lower M&A activity just after their initial public offering (IPO) adapt by engaging in fewer technological acquisitions and
more internal research. However, this adaptive response becomes inertial shortly after IPO and persists well into maturity. This study advances our understanding of how the environment shapes heterogeneity and capabilities through its impact on firm structure. I discuss how my results can help bridge inertial versus adaptive perspectives in the study of organizations, by
documenting an instance when the two interact.