23 results for Causal inference
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
We present experimental and theoretical analyses of data requirements for haplotype inference algorithms. Our experiments include a broad range of problem sizes under two standard models of tree distribution and were designed to yield statistically robust results despite the size of the sample space. Our results validate Gusfield's conjecture that a population size of n log n is required to give (with high probability) sufficient information to deduce the n haplotypes and their complete evolutionary history. These experimental results motivated a complementary theoretical analysis, which yields bounds on the required population size. We also analyze the population size required to deduce some fixed fraction of the evolutionary history of a set of n haplotypes and establish linear bounds on the required sample size. These linear bounds are also established theoretically.
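The n log n requirement echoes the classic coupon-collector heuristic: to observe all n distinct types when sampling uniformly, roughly n ln n draws are needed on average. The minimal simulation below illustrates only that heuristic under a simplifying uniform-sampling assumption, not the paper's tree-based population models:

```python
import math
import random

def draws_to_cover(n, rng):
    """Count uniform draws from n haplotype types until every type is seen."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

rng = random.Random(0)
n = 50
avg = sum(draws_to_cover(n, rng) for _ in range(200)) / 200
# Coupon-collector heuristic: expected draws ~ n * H_n, on the order of n log n
print(avg, n * math.log(n))
```

The average sits near n times the n-th harmonic number, consistent with the n log n scaling of the conjecture.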
Abstract:
One of the most persistent and lasting debates in economic research is whether the answers to subjective questions can be used to explain individuals' economic behavior. Using panel data for twelve EU countries, in the present study we analyze the causal relationship between self-reported housing satisfaction and residential mobility. Our results indicate that: i) households unsatisfied with their current housing situation are more likely to move; ii) housing satisfaction rises after a move; and iii) housing satisfaction increases with the transition from being a renter to becoming a homeowner. Some interesting cross-country differences are observed. Our findings provide evidence in favor of the use of subjective indicators of satisfaction with certain life domains in the analysis of individuals' economic conduct.
Abstract:
Empirical studies assume that the macro Mincer return on schooling is constant across countries. Using a large sample of countries, this paper shows that countries with a better quality of education have on average relatively higher macro Mincer coefficients. As rich countries have on average better educational quality, differences in human capital between countries are larger than has been typically assumed in the development accounting literature. Consequently, factor accumulation explains a considerably larger share of income differences across countries than is usually found.
Abstract:
Consider a model with parameter phi, and an auxiliary model with parameter theta. Let phi be randomly sampled from a given density over the known parameter space. Monte Carlo methods can be used to draw simulated data and compute the corresponding estimate of theta, say theta_tilde. A large set of tuples (phi, theta_tilde) can be generated in this manner. Nonparametric methods may be used to fit the function E(phi|theta_tilde=a), using these tuples. It is proposed to estimate phi using the fitted E(phi|theta_tilde=theta_hat), where theta_hat is the auxiliary estimate computed from the real sample data. This estimator is consistent and asymptotically normally distributed, under certain assumptions. Monte Carlo results for dynamic panel data and vector autoregressions show that this estimator can have very attractive small sample properties. Confidence intervals can be constructed using the quantiles of the phi values for which theta_tilde is close to theta_hat. Such confidence intervals are found to have very accurate coverage.
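The procedure described above can be sketched in a few lines. The toy setup below (exponential data with the sample mean as auxiliary statistic, a uniform sampling density for phi, and a Nadaraya-Watson kernel fit with a hand-picked bandwidth) is an illustrative assumption, not the paper's models:

```python
import math
import random

def simulate_stat(phi, n, rng):
    """Auxiliary statistic: sample mean of n Exponential(phi) draws."""
    return sum(rng.expovariate(phi) for _ in range(n)) / n

def indirect_estimate(theta_hat, n, rng, draws=2000, bandwidth=0.05):
    """Nadaraya-Watson fit of E(phi | theta_tilde), evaluated at theta_hat.

    phi is sampled uniformly over an assumed parameter space (0.5, 3.0);
    the prior range and the kernel bandwidth are illustrative choices.
    """
    num = den = 0.0
    for _ in range(draws):
        phi = rng.uniform(0.5, 3.0)
        theta_tilde = simulate_stat(phi, n, rng)
        w = math.exp(-0.5 * ((theta_tilde - theta_hat) / bandwidth) ** 2)
        num += w * phi
        den += w
    return num / den

rng = random.Random(42)
true_phi = 2.0
data = [rng.expovariate(true_phi) for _ in range(200)]
theta_hat = sum(data) / len(data)   # auxiliary estimate from the "real" sample
est = indirect_estimate(theta_hat, 200, rng)
print(est)
```

The weighted average concentrates on the phi values whose simulated theta_tilde lands near theta_hat, recovering a value close to the true parameter.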
Abstract:
The main benefit of having a representation of causal power (Cheng, 1997) is that it provides a context-independent description of the influence of a given cause on the effect. An appropriate way to test the existence of these mental models is therefore to create situations in which people observe or predict the effectiveness of target causes across multiple contexts. The trans-situational nature of causal power carries a series of testable consequences, which we examined across three experimental series. In the first series, we investigated the transfer of causal strength, learned in a specific context, to a context in which the probability or base rate of the effect is different. Participants had to predict the probability of the effect given the introduction of the cause in the new context. In the second series, we studied the strategies people use when discovering causal relations. According to the causal power model, if we aim to discover the power of a cause, the most appropriate approach is to introduce it into the most informative and least ambiguous context possible. Across the experiments in this series we combined both probabilistic and deterministic contexts and causes. In the third series, we attempted to extend the findings of Liljeholm & Cheng (2007), who found that generalization across contexts occurs as the power model predicts. It seems likely that the two-phase procedure used by those authors promotes a tendency to ignore some trials, artificially generating results consistent with those expected under the power model. Moreover, when we controlled P(E|C) independently of power, the pattern of results reversed, contradicting the predictions of Cheng's model.
In conclusion, there is some evidence supporting the existence of causal models, but adequate ways of putting these models to the test still need to be found.
Abstract:
Given a sample from a fully specified parametric model, let Zn be a given finite-dimensional statistic - for example, an initial estimator or a set of sample moments. We propose to (re-)estimate the parameters of the model by maximizing the likelihood of Zn. We call this the maximum indirect likelihood (MIL) estimator. We also propose a computationally tractable Bayesian version of the estimator which we refer to as a Bayesian Indirect Likelihood (BIL) estimator. In most cases, the density of the statistic will be of unknown form, and we develop simulated versions of the MIL and BIL estimators. We show that the indirect likelihood estimators are consistent and asymptotically normally distributed, with the same asymptotic variance as that of the corresponding efficient two-step GMM estimator based on the same statistic. However, our likelihood-based estimators, by taking into account the full finite-sample distribution of the statistic, are higher order efficient relative to GMM-type estimators. Furthermore, in many cases they enjoy a bias reduction property similar to that of the indirect inference estimator. Monte Carlo results for a number of applications including dynamic and nonlinear panel data models, a structural auction model and two DSGE models show that the proposed estimators indeed have attractive finite sample properties.
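As a rough sketch of a simulated indirect likelihood, one can estimate the density of the statistic at its observed value by simulation plus a kernel density estimate, then maximize over a parameter grid. The normal data model, the sample-mean statistic, the grid, and the bandwidth below are all illustrative assumptions, not the MIL/BIL construction itself:

```python
import math
import random

def kde_log_lik(z_obs, phi, n, rng, sims=300, bw=0.03):
    """Simulated log-likelihood of the statistic Zn (here the sample mean)
    under an assumed N(phi, 1) data model, via a Gaussian kernel density
    estimate of the statistic's distribution."""
    dens = 0.0
    for _ in range(sims):
        z = sum(rng.gauss(phi, 1.0) for _ in range(n)) / n
        dens += math.exp(-0.5 * ((z - z_obs) / bw) ** 2)
    dens /= sims * bw * math.sqrt(2 * math.pi)
    return math.log(dens + 1e-300)

rng = random.Random(7)
n = 100
data = [rng.gauss(1.0, 1.0) for _ in range(n)]   # "observed" data, true mean 1.0
z_obs = sum(data) / n
grid = [i / 50 for i in range(25, 76, 2)]        # candidate means 0.5 .. 1.5
mil = max(grid, key=lambda phi: kde_log_lik(z_obs, phi, n, rng))
print(mil)
```

Maximizing the simulated likelihood of the statistic, rather than matching moments, is what lets the approach use the statistic's full finite-sample distribution.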
Abstract:
Low concentrations of elements in geochemical analyses have the peculiarity of being compositional data and, for a given level of significance, are likely to be beyond the capabilities of laboratories to distinguish between minute concentrations and complete absence, thus preventing laboratories from reporting extremely low concentrations of the analyte. Instead, what is reported is the detection limit, which is the minimum concentration that conclusively differentiates between presence and absence of the element. A spatially distributed exhaustive sample is employed in this study to generate unbiased sub-samples, which are further censored to observe the effect that different detection limits and sample sizes have on the inference of population distributions starting from geochemical analyses having specimens below detection limit (nondetects). The isometric logratio transformation is used to convert the compositional data in the simplex to samples in real space, thus allowing the practitioner to properly borrow from the large source of statistical techniques valid only in real space. The bootstrap method is used to numerically investigate the reliability of inferring several distributional parameters employing different forms of imputation for the censored data. The case study illustrates that, in general, the best results are obtained when imputations are made using the distribution best fitting the readings above the detection limit, and exposes the problems of other more widely used practices. When the sample is spatially correlated, it is necessary to combine the bootstrap with stochastic simulation.
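For readers unfamiliar with the transformation, a minimal sketch of a pivot-coordinate isometric logratio (ilr) transform plus a simple substitution rule for a nondetect is shown below. The 3-part composition, the detection limit, and the 0.65-times-detection-limit substitution factor are illustrative assumptions, not the study's fitted-distribution imputation:

```python
import math

def ilr(parts):
    """Isometric logratio transform of a D-part composition,
    using pivot coordinates (one common choice of orthonormal basis)."""
    d = len(parts)
    coords = []
    for i in range(1, d):
        # geometric mean of the first i parts
        gm = math.exp(sum(math.log(p) for p in parts[:i]) / i)
        coords.append(math.sqrt(i / (i + 1)) * math.log(gm / parts[i]))
    return coords

# Hypothetical 3-part composition (fractions summing to 1); the first
# analyte is a nondetect, imputed at 0.65 * detection limit, one of
# several simple substitution rules discussed in the literature.
dl = 0.001
raw = [0.0, 0.2494, 0.7506]
imputed = [dl * 0.65 if x == 0.0 else x for x in raw]
total = sum(imputed)
closed = [x / total for x in imputed]   # re-close so the parts sum to 1
print(ilr(closed))
```

After the transform the D-part composition lives in (D-1)-dimensional real space, where ordinary statistical techniques, including the bootstrap, apply without the simplex constraint.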
Abstract:
First: a continuous-time version of Kyle's model (Kyle 1985) of asset pricing with asymmetric information, known as Back's model (Back 1992), is studied. A larger class of price processes and of noise traders' processes is considered. The price process, as in Kyle's model, is allowed to depend on the path of the market order, and the noise traders' process is an inhomogeneous Lévy process. Solutions are found via the Hamilton-Jacobi-Bellman equations. When the insider is risk-neutral, the price pressure is constant and there is no equilibrium in the presence of jumps; when the insider is risk-averse, there is no equilibrium in the presence of either jumps or drifts. The case where the release time of the information is unknown is also analysed, and a general relation is established between the problem of finding an equilibrium and the enlargement of filtrations. A random announcement time is also considered. In this case the market is not fully efficient, and an equilibrium exists if the sensitivity of prices with respect to the global demand decreases in time in accordance with the distribution of the random time. Second: power variations. The asymptotic behavior of the power variation of processes of the form \int_0^t u(s-) dS_s is considered, where S is an alpha-stable process with index of stability 0 < alpha < 2 and the integral is an Itô integral. Stable convergence of the corresponding fluctuations is established. These results provide statistical tools to infer the process u from discrete observations. Third: a bond market is studied where short rates r(t) evolve as an integral of g(t-s)sigma(s) with respect to a Wiener measure W(ds), where g and sigma are deterministic. Processes of this type are particular cases of ambit processes and are in general not semimartingales.
Abstract:
Pseudomonas fluorescens EPS62e was selected during a screening procedure for its high efficacy in controlling infections by Erwinia amylovora, the causal agent of fire blight disease, on different plant materials. In field trials carried out in pear trees during bloom, EPS62e colonized flowers up to the carrying capacity, providing a moderate efficacy of fire blight control. The putative mechanisms of EPS62e antagonism against E. amylovora were studied. EPS62e did not produce antimicrobial compounds described in P. fluorescens species and only developed antagonism in King's B medium, where it produced siderophores. Interaction experiments in culture plate wells including a membrane filter, which physically separated the cultures, confirmed that inhibition of E. amylovora requires cell-to-cell contact. The spectrum of nutrient assimilation indicated that EPS62e used significantly more or different carbon sources than the pathogen. The maximum growth rate and affinity for nutrients in immature fruit extract were higher in EPS62e than in E. amylovora, but the cell yield was similar. The fitness of EPS62e and E. amylovora was studied upon inoculation in immature pear fruit wounds and hypanthia of intact flowers under controlled-environment conditions. When inoculated separately, EPS62e grew faster in flowers, whereas E. amylovora grew faster in fruit wounds because of its rapid spread to adjacent tissues. However, in preventive inoculations of EPS62e, subsequent growth of EPS101 was significantly inhibited. It is concluded that cell-to-cell interference as well as differences in growth potential and the spectrum and efficiency of nutrient use are mechanisms of antagonism of EPS62e against E. amylovora.
Abstract:
Background: Two genes are called synthetic lethal (SL) if mutation of either alone is not lethal, but mutation of both leads to death or a significant decrease in the organism's fitness. The detection of SL gene pairs constitutes a promising alternative for anti-cancer therapy. As cancer cells exhibit a large number of mutations, the identification of these mutated genes' SL partners may provide specific anti-cancer drug candidates, with minor perturbations to healthy cells. Since existing SL data are mainly restricted to yeast screenings, the road towards human SL candidates is limited to inference methods. Results: In the present work, we use phylogenetic analysis and database manipulation (BioGRID for interactions, Ensembl and NCBI for homology, Gene Ontology for GO attributes) in order to reconstruct the phylogenetically-inferred SL gene network for human. In addition, available data on cancer mutated genes (COSMIC and Cancer Gene Census databases) as well as on existing approved drugs (DrugBank database) supports our selection of cancer-therapy candidates. Conclusions: Our work provides a complementary alternative to the current methods for drug discovery and gene-target identification in anti-cancer research. Novel SL screening analysis and the use of highly curated databases would contribute to improving the results of this methodology.
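The core inference step, transferring experimentally confirmed SL pairs to human through homology, can be sketched as a simple mapping. The gene names and the tiny tables below are illustrative placeholders, not data from BioGRID or Ensembl:

```python
# Hypothetical yeast SL pairs and a hypothetical yeast-to-human ortholog
# table; real pipelines would populate these from BioGRID and Ensembl/NCBI.
yeast_sl_pairs = {("YDL102W", "YNL262W"), ("YBR088C", "YOR217W")}
orthologs = {
    "YDL102W": "POLD1",
    "YNL262W": "POLE",
    "YBR088C": "POLD3",
    # YOR217W deliberately absent: no human ortholog found
}

def inferred_human_sl(pairs, homology):
    """Keep a yeast SL pair only when both members map to a human ortholog."""
    out = set()
    for a, b in pairs:
        if a in homology and b in homology:
            out.add(tuple(sorted((homology[a], homology[b]))))
    return out

print(inferred_human_sl(yeast_sl_pairs, orthologs))
```

Pairs with a member lacking a human ortholog drop out, which is why the inferred human network is only a conservative subset of the yeast screening data.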
Abstract:
The paper presents a competence-based instructional design system and a way to provide a personalization of navigation in the course content. The navigation aid tool builds on the competence graph and the student model, which includes the elements of uncertainty in the assessment of students. An individualized navigation graph is constructed for each student, suggesting the competences the student is more prepared to study. We use fuzzy set theory for dealing with uncertainty. The marks of the assessment tests are transformed into linguistic terms and used for assigning values to linguistic variables. For each competence, the level of difficulty and the level of knowing its prerequisites are calculated based on the assessment marks. Using these linguistic variables and approximate reasoning (fuzzy IF-THEN rules), a crisp category is assigned to each competence regarding its level of recommendation.
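A minimal sketch of the fuzzy IF-THEN step, mapping crisp inputs to linguistic terms, firing rules with min as AND, and returning the strongest crisp category, is given below. The rule base, membership functions, and 0-1 scales are illustrative assumptions, not the system's actual design:

```python
def tri(x, a, b, c):
    """Triangular membership function supported on (a, c), peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def recommendation(difficulty, prereq_knowledge):
    """Mamdani-style rule evaluation with min as AND; inputs are assumed
    already normalized to [0, 1] from the assessment marks."""
    rules = {
        "recommended":     min(tri(difficulty, -0.5, 0.0, 0.6),
                               tri(prereq_knowledge, 0.4, 1.0, 1.5)),
        "neutral":         min(tri(difficulty, 0.2, 0.5, 0.8),
                               tri(prereq_knowledge, 0.2, 0.5, 0.8)),
        "not_recommended": min(tri(difficulty, 0.4, 1.0, 1.5),
                               tri(prereq_knowledge, -0.5, 0.0, 0.6)),
    }
    # crisp category = the rule with the highest firing strength
    return max(rules, key=rules.get)

print(recommendation(0.2, 0.9))  # easy competence, prerequisites well known
```

The same pattern extends naturally to more linguistic terms per variable and to per-competence rule bases built from the competence graph.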
Abstract:
Small sample properties are of fundamental interest when only limited data is available. Exact inference is limited by constraints imposed by specific nonrandomized tests and, of course, also by lack of more data. These effects can be separated, as we propose to evaluate a test by comparing its type II error to the minimal type II error among all tests for the given sample. Game theory is used to establish this minimal type II error; the associated randomized test is characterized as part of a Nash equilibrium of a fictitious game against nature. We use this method to investigate sequential tests for the difference between two means when outcomes are constrained to belong to a given bounded set. Tests of inequality and of noninferiority are included. We find that inference in terms of type II error based on a balanced sample cannot be improved by sequential sampling, or even by observing counterfactual evidence, provided there is a reasonable gap between the hypotheses.
Abstract:
Several estimators of the expectation, median and mode of the lognormal distribution are derived. They aim to be approximately unbiased, efficient, or have a minimax property in the class of estimators we introduce. The small-sample properties of these estimators are assessed by simulations and, when possible, analytically. Some of these estimators of the expectation are far more efficient than the maximum likelihood or the minimum-variance unbiased estimator, even for substantial sample sizes.
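For the expectation, the baseline contrast is between the naive sample mean and the ML plug-in estimator exp(mu_hat + s^2/2) built from the log-scale moments. The simulation sketch below shows only these two textbook estimators, not the new ones introduced in the paper:

```python
import math
import random

def lognormal_mean_estimates(sample):
    """Two common estimators of E[X] for lognormal X: the naive sample
    mean, and the ML plug-in exp(mu_hat + s2/2) on the log scale."""
    logs = [math.log(x) for x in sample]
    n = len(logs)
    mu = sum(logs) / n
    s2 = sum((v - mu) ** 2 for v in logs) / n   # ML (biased) variance
    return sum(sample) / n, math.exp(mu + s2 / 2)

rng = random.Random(3)
mu_true, sigma_true = 0.0, 1.0
true_mean = math.exp(mu_true + sigma_true ** 2 / 2)   # exact lognormal mean
sample = [math.exp(rng.gauss(mu_true, sigma_true)) for _ in range(500)]
naive, plug_in = lognormal_mean_estimates(sample)
print(true_mean, naive, plug_in)
```

The plug-in estimator exploits the log-normal shape and is typically less variable than the heavy-tailed sample mean, which is the efficiency gap the paper's estimators improve upon further.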
Abstract:
This paper discusses inference in self-exciting threshold autoregressive (SETAR) models. Of main interest is inference for the threshold parameter. It is well known that the asymptotics of the corresponding estimator depend upon whether the SETAR model is continuous or not. In the continuous case, the limiting distribution is normal and standard inference is possible. In the discontinuous case, the limiting distribution is non-normal and cannot be estimated consistently. We show that valid inference can be drawn by the use of the subsampling method. Moreover, the method can even be extended to situations where the (dis)continuity of the model is unknown. In this case, the inference for the regression parameters of the model also becomes difficult, and subsampling can be used advantageously there as well. In addition, we consider a hypothesis test for the continuity of the SETAR model. A simulation study examines small sample performance.
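The subsampling idea, approximating the sampling distribution of an estimator by recomputing it on many smaller subsamples and rescaling, can be sketched for an i.i.d. mean with a sqrt(n) convergence rate. This is a generic illustration of the method, not the SETAR threshold case, where the rate itself differs in the discontinuous regime:

```python
import random

def subsampling_ci(data, estimator, b, rate, alpha=0.1, rng=None):
    """Subsampling confidence interval: approximate the law of
    rate(n)*(est_n - theta) by rate(b)*(est_b - est_n) over subsamples
    of size b drawn without replacement (i.i.d. sketch, no blocks)."""
    rng = rng or random.Random(0)
    n = len(data)
    est_n = estimator(data)
    devs = []
    for _ in range(500):
        sub = rng.sample(data, b)
        devs.append(rate(b) * (estimator(sub) - est_n))
    devs.sort()
    lo = devs[int((1 - alpha / 2) * len(devs)) - 1]   # upper quantile
    hi = devs[int((alpha / 2) * len(devs))]           # lower quantile
    return est_n - lo / rate(n), est_n - hi / rate(n)

rng = random.Random(1)
data = [rng.gauss(5.0, 2.0) for _ in range(400)]
mean = lambda xs: sum(xs) / len(xs)
ci = subsampling_ci(data, mean, b=40, rate=lambda m: m ** 0.5)
print(ci)
```

The appeal for the SETAR threshold is that subsampling only needs the rescaled subsample distribution to converge, not a consistently estimable limiting law.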
Abstract:
Two finite extensive-form games are empirically equivalent when the empirical distribution on action profiles generated by every behavior strategy in one can also be generated by an appropriately chosen behavior strategy in the other. This paper provides a characterization of empirical equivalence. The central idea is to relate a game's information structure to the conditional independencies in the empirical distributions it generates. We present a new analytical device, the influence opportunity diagram of a game, describe how such a diagram is constructed for a given extensive-form game, and demonstrate that it provides a complete summary of the information needed to test empirical equivalence between two games.