846 results for BIASED-ESTIMATION
Abstract:
This paper addresses the investment decisions considering the presence of financial constraints of 373 large Brazilian firms from 1997 to 2004, using panel data. A Bayesian econometric model was used considering ridge regression for multicollinearity problems among the variables in the model. Prior distributions are assumed for the parameters, classifying the model into random or fixed effects. We used a Bayesian approach to estimate the parameters, considering normal and Student t distributions for the error and assumed that the initial values for the lagged dependent variable are not fixed, but generated by a random process. The recursive predictive density criterion was used for model comparisons. Twenty models were tested and the results indicated that multicollinearity does influence the value of the estimated parameters. Controlling for capital intensity, financial constraints are found to be more important for capital-intensive firms, probably due to their lower profitability indexes, higher fixed costs and higher degree of property diversification.
Abstract:
Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
Abstract:
The objectives of this work were to compare estimates of genetic parameters obtained from two models, one containing only additive and dominance effects and another that also included joint-additive (complementarity) and epistatic effects, and to test alternative objective criteria for determining the lambda coefficient in ridge regression. The results showed that the choice of a criterion for determining the lambda coefficient in ridge regression depends not only on the data set and the model used but, above all, on prior knowledge of the phenomenon under study and on the practical meaning and interpretation of the estimated parameters. With more complete models for evaluating genetic effects in beef cattle, it is possible to identify the contribution of the joint-additive and epistatic effects, which are embedded in the heterosis effect estimated by simpler models. Ridge regression is a tool that makes it possible to obtain these estimates even in the presence of strong multicollinearity.
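To illustrate why ridge regression helps under strong multicollinearity, here is a minimal sketch; it is not the authors' analysis. It assumes two nearly collinear simulated predictors with true coefficients of 1, and the helper names `ridge_2d` and `simulate` are hypothetical. The penalty `lam` plays the role of the lambda coefficient.

```python
import random


def ridge_2d(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two predictors, no intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    # Closed-form inverse of the 2x2 system.
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


def simulate(lam, n=50, reps=500, seed=0):
    """Mean and variance of the first coefficient over repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # Assumed setup: x2 is x1 plus small noise, so the two are nearly collinear.
        x2 = [a + rng.gauss(0, 0.05) for a in x1]
        # Assumed true model: y = 1*x1 + 1*x2 + noise.
        y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        estimates.append(ridge_2d(x1, x2, y, lam)[0])
    mean = sum(estimates) / reps
    var = sum((e - mean) ** 2 for e in estimates) / reps
    return mean, var
```

Under these assumptions, comparing `simulate(0.0)` (ordinary least squares) with `simulate(1.0)` shows the trade-off: the penalized estimate is slightly biased but has a much smaller sampling variance.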
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Comparing the patterns of population differentiation among genetic markers with different modes of inheritance can provide insights into patterns of sex-biased dispersal and gene flow. The Blue-and-yellow Macaw (Ara ararauna) is a Neotropical parrot with a broad geographic distribution in South America. However, little is known about the natural history and current status of remaining wild populations, including levels of genetic variability. The progressive decline and possible fragmentation of populations may endanger this species in the near future. We analyzed mitochondrial DNA (mtDNA) control-region sequences and six microsatellite loci of Blue-and-yellow Macaws sampled throughout their geographic range in Brazil to describe population genetic structure, to make inferences about historical demography and dispersal behavior, and to provide insight for conservation efforts. Analyses of population genetic structure based on mtDNA showed evidence of two major populations in western and eastern Brazil that share a few low-frequency haplotypes. This phylogeographic pattern seems to have originated by the historical isolation of Blue-and-yellow Macaw populations ~374,000 years ago and has been maintained by restricted gene flow and female philopatry. By contrast, variation in biparentally inherited microsatellites was not structured geographically. Male-biased dispersal and female philopatry best explain the different patterns observed in these two markers. Because females disperse less than males, the two regional populations with well-differentiated mtDNA haplogroups should be considered two different management units for conservation purposes. Received 4 November 2007, accepted 10 December 2008.
Abstract:
To date, state-of-the-art seismic material parameter estimates from multi-component sea-bed seismic data are based on the assumption that the sea-bed consists of a fully elastic half-space. In reality, however, the shallow sea-bed generally consists of soft, unconsolidated sediments that are characterized by strong to very strong seismic attenuation. To explore the potential implications, we apply a state-of-the-art elastic decomposition algorithm to synthetic data for a range of canonical sea-bed models consisting of a viscoelastic half-space of varying attenuation. We find that in the presence of strong seismic attenuation, as quantified by Q-values of 10 or less, significant errors arise in the conventional elastic estimation of seismic properties. Tests on synthetic data indicate that these errors can be largely avoided by accounting for the inherent attenuation of the seafloor when estimating the seismic parameters. This can be achieved by replacing the real-valued expressions for the elastic moduli in the governing equations of the parameter estimation with their complex-valued viscoelastic equivalents. The practical application of our parameter estimation procedure yields realistic estimates of the elastic seismic material properties of the shallow sea-bed, while the corresponding Q-estimates seem to be biased towards values that are too low, particularly for S-waves. Given that the estimation of inelastic material parameters is notoriously difficult, particularly in the immediate vicinity of the sea-bed, this is expected to be of interest and importance for civil and ocean engineering purposes.
Abstract:
This paper demonstrates that, contrary to conventional wisdom, measurement error biases in panel data estimation of convergence using OLS with fixed effects are huge, not trivial. It does so by way of "skipping estimation": taking data from every m years of the sample (where m is an integer greater than or equal to 2), as opposed to every single year. It is shown that the estimated speed of convergence from OLS with fixed effects is biased upwards by as much as 7 to 15%.
Abstract:
MOTIVATION: Comparative analyses of gene expression data from different species have become an important component of the study of molecular evolution. Thus methods are needed to estimate evolutionary distances between expression profiles, as well as a neutral reference to estimate selective pressure. Divergence between expression profiles of homologous genes is often calculated with Pearson's or Euclidean distance. Neutral divergence is usually inferred from randomized data. Despite being widely used, neither of these two steps has been well studied. Here, we analyze these methods formally and on real data, highlight their limitations and propose improvements. RESULTS: It has been demonstrated that Pearson's distance, in contrast to Euclidean distance, leads to underestimation of the expression similarity between homologous genes with a conserved uniform pattern of expression. Here, we first extend this study to genes with conserved, but specific pattern of expression. Surprisingly, we find that both Pearson's and Euclidean distances used as a measure of expression similarity between genes depend on the expression specificity of those genes. We also show that the Euclidean distance depends strongly on data normalization. Next, we show that the randomization procedure that is widely used to estimate the rate of neutral evolution is biased when broadly expressed genes are abundant in the data. To overcome this problem, we propose a novel randomization procedure that is unbiased with respect to expression profiles present in the datasets. Applying our method to the mouse and human gene expression data suggests significant gene expression conservation between these species. CONTACT: marc.robinson-rechavi@unil.ch; sven.bergmann@unil.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Abstract:
The systematic sampling (SYS) design (Madow and Madow, 1944) is widely used by statistical offices due to its simplicity and efficiency (e.g., Iachan, 1982). But it suffers from a serious defect, namely, that it is impossible to unbiasedly estimate the sampling variance (Iachan, 1982) and usual variance estimators (Yates and Grundy, 1953) are inadequate and can overestimate the variance significantly (Särndal et al., 1992). We propose a novel variance estimator which is less biased and that can be implemented with any given population order. We will justify this estimator theoretically and with a Monte Carlo simulation study.
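The overestimation problem can be seen in a small worked example. The sketch below is not the estimator proposed in the abstract; it assumes a hypothetical population with a linear trend (values 1 to 100), a sampling interval of 10, and uses the simple-random-sampling variance formula as a stand-in for a "usual" variance estimator. Because a systematic design with interval k has only k possible samples, the true design variance of the sample mean can be computed exactly.

```python
from statistics import mean, pvariance, variance


def systematic_samples(population, k):
    """Return all k possible systematic samples (one per starting unit)."""
    return [population[start::k] for start in range(k)]


def true_and_srs_variance(population, k):
    """Exact design variance of the mean vs. the average SRS-formula estimate."""
    samples = systematic_samples(population, k)
    big_n = len(population)
    # Each start is equally likely, so the design variance of the sample mean
    # is the population variance of the k possible sample means.
    true_var = pvariance([mean(s) for s in samples])
    # SRS formula applied to each sample: (1 - n/N) * s^2 / n.
    srs_estimates = [
        (1 - len(s) / big_n) * variance(s) / len(s) for s in samples
    ]
    return true_var, mean(srs_estimates)


if __name__ == "__main__":
    population = list(range(1, 101))  # assumed: ordered population with a trend
    print(true_and_srs_variance(population, k=10))
```

For this assumed population the true variance of the sample mean is about 8.25, while the SRS formula gives about 82.5 for every possible sample, a tenfold overestimate driven by the population order.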
Abstract:
This paper considers the problem of estimation when one of a number of populations, assumed normal with known common variance, is selected on the basis of it having the largest observed mean. Conditional on selection of the population, the observed mean is a biased estimate of the true mean. This problem arises in the analysis of clinical trials in which selection is made between a number of experimental treatments that are compared with each other either with or without an additional control treatment. Attempts to obtain approximately unbiased estimates in this setting have been proposed by Shen [2001. An improved method of evaluating drug effect in a multiple dose clinical trial. Statist. Medicine 20, 1913–1929] and Stallard and Todd [2005. Point estimates and confidence regions for sequential trials involving selection. J. Statist. Plann. Inference 135, 402–419]. This paper explores the problem in the simple setting in which two experimental treatments are compared in a single analysis. It is shown that in this case the estimate of Stallard and Todd is the maximum-likelihood estimate (m.l.e.), and this is compared with the estimate proposed by Shen. In particular, it is shown that the m.l.e. has infinite expectation whatever the true value of the mean being estimated. We show that there is no conditionally unbiased estimator, and propose a new family of approximately conditionally unbiased estimators, comparing these with the estimators suggested by Shen.
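The selection bias described here is easy to reproduce numerically. The following is a minimal Monte Carlo sketch, not any of the estimators discussed in the abstract. It assumes two arms whose observed means are normal with a known common standard error `se`; the function name `selection_bias` is hypothetical. It averages the error of the observed mean of whichever arm is selected.

```python
import random


def selection_bias(mu1, mu2, se=1.0, reps=100_000, seed=1):
    """Average (observed mean - true mean) of the arm with the larger observed mean."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        m1 = rng.gauss(mu1, se)
        m2 = rng.gauss(mu2, se)
        # Select the arm with the larger observed mean and record its error.
        if m1 >= m2:
            total += m1 - mu1
        else:
            total += m2 - mu2
    return total / reps
```

When the two true means are equal, the expected maximum of two independent normal means exceeds the true mean by `se / sqrt(pi)`, about 0.56 standard errors, so `selection_bias(0.0, 0.0)` should be close to that value. When the true means are far apart, as in `selection_bias(0.0, 5.0)`, selection is almost deterministic and the bias nearly vanishes.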
Abstract:
Approximate Bayesian computation (ABC) is a highly flexible technique that allows the estimation of parameters under demographic models that are too complex to be handled by full-likelihood methods. We assess the utility of this method to estimate the parameters of range expansion in a two-dimensional stepping-stone model, using samples from either a single deme or multiple demes. A minor modification to the ABC procedure is introduced, which leads to an improvement in the accuracy of estimation. The method is then used to estimate the expansion time and migration rates for five natural common vole populations in Switzerland typed for a sex-linked marker and a nuclear marker. Estimates based on both markers suggest that expansion occurred < 10,000 years ago, after the most recent glaciation, and that migration rates are strongly male biased.
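For readers unfamiliar with ABC, the basic rejection variant can be sketched on a toy problem. This is an illustration under stated assumptions only: it does not implement the stepping-stone model or the modification introduced in the abstract. It assumes normally distributed data with unit standard deviation, a uniform prior on the mean, the sample mean as summary statistic, and a hypothetical function name `abc_rejection`.

```python
import random


def abc_rejection(observed, prior_low, prior_high, eps, n_draws=20_000, seed=2):
    """Basic rejection ABC for the mean of a normal with sd fixed at 1 (toy model)."""
    rng = random.Random(seed)
    n = len(observed)
    obs_summary = sum(observed) / n  # summary statistic: sample mean
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)  # draw from the prior
        # Simulate a data set of the same size under the candidate parameter.
        sim_summary = sum(rng.gauss(theta, 1.0) for _ in range(n)) / n
        # Keep the draw only if the simulated summary is close to the observed one.
        if abs(sim_summary - obs_summary) < eps:
            accepted.append(theta)
    return accepted
```

The accepted draws approximate the posterior distribution of the parameter. A smaller tolerance `eps` gives a closer approximation at the cost of fewer accepted draws, which is the trade-off that refinements of the basic procedure typically target.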
Abstract:
We estimate the conditions for detectability of two planets in a 2/1 mean-motion resonance from radial velocity data, as a function of their masses, number of observations and the signal-to-noise ratio. Even for a data set of the order of 100 observations and standard deviations of the order of a few meters per second, we find that Jovian-size resonant planets are difficult to detect if the masses of the planets differ by a factor larger than ~4. This is consistent with the present population of real exosystems in the 2/1 commensurability, most of which have resonant pairs with similar minimum masses, and could indicate that many other resonant systems exist, but are currently beyond the detectability limit. Furthermore, we analyze the error distribution in masses and orbital elements of orbital fits from synthetic data sets for resonant planets in the 2/1 commensurability. For various mass ratios and numbers of data points we find that the eccentricity of the outer planet is systematically overestimated, although the inner planet's eccentricity suffers a much smaller effect. If the initial conditions correspond to small-amplitude oscillations around stable apsidal corotation resonances, the amplitudes estimated from the orbital fits are biased toward larger amplitudes, in accordance with results found in real resonant extrasolar systems.
Abstract:
Sensitivity and specificity are measures that allow us to evaluate the performance of a diagnostic test. In practice, it is common for a proportion of the selected individuals not to have their true disease status verified, since verification may require an invasive procedure such as biopsy. This happens, for example, in the diagnosis of prostate cancer, and more generally whenever verification is risky, impracticable, unethical, or costly. In such cases, it is common to evaluate diagnostic tests using only the information from verified individuals. This procedure can lead to biased results, known as workup bias. In this paper, we introduce a Bayesian approach to estimate the sensitivity and the specificity for two diagnostic tests considering verified and unverified individuals, a result that generalizes the usual situation based on only one diagnostic test.
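The direction of this bias can be worked out with a short calculation. The sketch below is not the Bayesian two-test method of the abstract; it considers a single test and assumes that the probability of verification depends only on the test result (`v_pos` for test-positives, `v_neg` for test-negatives). The function name `naive_accuracy` and the numbers in the example are hypothetical.

```python
def naive_accuracy(sens, spec, v_pos, v_neg):
    """Sensitivity and specificity as they appear among verified individuals only.

    Assumes verification depends only on the test result:
    v_pos = P(verified | test positive), v_neg = P(verified | test negative).
    """
    # Among verified diseased individuals, the share who tested positive.
    naive_sens = sens * v_pos / (sens * v_pos + (1 - sens) * v_neg)
    # Among verified non-diseased individuals, the share who tested negative.
    naive_spec = spec * v_neg / (spec * v_neg + (1 - spec) * v_pos)
    return naive_sens, naive_spec


if __name__ == "__main__":
    # Assumed values: true sensitivity 0.8, true specificity 0.9,
    # 90% of test-positives and 20% of test-negatives are verified.
    print(naive_accuracy(0.8, 0.9, 0.9, 0.2))
```

With these assumed values the verified-only analysis gives a sensitivity of about 0.95 instead of 0.8 and a specificity of about 0.67 instead of 0.9. When `v_pos` equals `v_neg`, the naive values coincide with the true ones.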
Abstract:
We consider method-of-moments fixed effects (FE) estimation of technical inefficiency. When N, the number of cross-sectional observations, is large, it is possible to obtain consistent central moments of the population distribution of the inefficiencies. It is well known that the traditional FE estimator may be seriously upward biased when N is large and T, the number of time observations, is small. Based on the second central moment and a single-parameter distributional assumption on the inefficiencies, we obtain unbiased technical inefficiencies in large-N settings. The proposed methodology bridges traditional FE and maximum likelihood estimation: bias is reduced without the random effects assumption.