995 results for Statistical bias
Abstract:
Longitudinal surveys are increasingly used to collect event history data on person-specific processes such as transitions between labour market states. Survey-based event history data pose a number of challenges for statistical analysis, including survey errors due to sampling, non-response, attrition and measurement. This study deals with non-response, attrition and measurement errors in event history data and the bias they cause in event history analysis. The study also discusses some choices faced by a researcher using longitudinal survey data for event history analysis and demonstrates their effects. These choices include whether a design-based or a model-based approach is taken, which subset of the data to use and, if a design-based approach is taken, which weights to use. The study takes advantage of the possibility of using combined longitudinal survey-register data. The Finnish subset of the European Community Household Panel (FI ECHP) survey for waves 1–5 was linked at the person level with longitudinal register data. Unemployment spells were used as the study variables of interest. Lastly, a simulation study was conducted in order to assess the statistical properties of the Inverse Probability of Censoring Weighting (IPCW) method in a survey data context. The study shows how combined longitudinal survey-register data can be used to analyse and compare the non-response and attrition processes, test the missingness mechanism type and estimate the size of the bias due to non-response and attrition. In our empirical analysis, initial non-response turned out to be a more important source of bias than attrition. Reported unemployment spells were subject to seam effects, omissions and, to a lesser extent, overreporting. The use of proxy interviews tended to cause spell omissions. An often-ignored phenomenon, classification error in reported spell outcomes, was also found in the data.
Neither the Missing At Random (MAR) assumption about non-response and attrition mechanisms, nor the classical assumptions about measurement errors, turned out to be valid. Both measurement errors in spell durations and spell outcomes were found to cause bias in estimates from event history models. Low measurement accuracy affected the estimates of baseline hazard most. The design-based estimates based on data from respondents to all waves of interest and weighted by the last wave weights displayed the largest bias. Using all the available data, including the spells by attriters until the time of attrition, helped to reduce attrition bias. Lastly, the simulation study showed that the IPCW correction to design weights reduces bias due to dependent censoring in design-based Kaplan-Meier and Cox proportional hazard model estimators. The study discusses implications of the results for survey organisations collecting event history data, researchers using surveys for event history analysis, and researchers who develop methods to correct for non-sampling biases in event history data.
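The IPCW correction evaluated in the simulation study amounts to running a weighted Kaplan-Meier (or Cox) estimator in which each subject carries an inverse-probability-of-censoring weight. The sketch below is a minimal illustration of that idea, not the study's implementation; the function name, toy data and weight values are hypothetical.

```python
import numpy as np

def ipcw_kaplan_meier(times, events, weights):
    """Weighted Kaplan-Meier estimator. Each subject enters risk sets and
    event counts with an inverse-probability-of-censoring weight, which
    counteracts the bias from dependent (informative) censoring."""
    surv, s = [], 1.0
    for u in np.unique(times):
        at_risk = weights[times >= u].sum()                    # weighted risk set at u
        failed = weights[(times == u) & (events == 1)].sum()   # weighted events at u
        if at_risk > 0:
            s *= 1.0 - failed / at_risk
        surv.append((u, s))
    return surv

# toy data: weights > 1 stand in for subjects prone to dropping out
times = np.array([2.0, 3.0, 3.0, 5.0, 7.0, 8.0])
events = np.array([1, 0, 1, 1, 0, 1])
weights = np.array([1.0, 1.2, 1.0, 1.5, 1.1, 1.0])
for t, s in ipcw_kaplan_meier(times, events, weights):
    print(t, round(s, 3))
```

With all weights equal to 1 this reduces to the ordinary Kaplan-Meier estimator; the correction enters entirely through how the weights are estimated from the censoring model.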
Abstract:
We report the results of Monte Carlo simulations aimed at clarifying the microscopic origin of exchange bias in the magnetization hysteresis loops of a model of individual core/shell nanoparticles. Increasing the exchange coupling across the core/shell interface leads to an enhancement of exchange bias and to an increasing asymmetry between the two branches of the loops, which is due to different reversal mechanisms. A detailed study of the magnetic order of the interfacial spins shows compelling evidence that the existence of a net magnetization due to uncompensated spins at the shell interface is responsible for both phenomena, and makes it possible to quantify the loop shifts directly in terms of microscopic parameters, with striking agreement with the macroscopically observed values.
Abstract:
Since 1991, Colombia has had a market-determined Peso–US Dollar Nominal Exchange Rate (NER), after more than 20 years of controlled and multiple exchange rates. The behavior (revaluation/devaluation) of the NER is constantly reported in news, editorials and op-eds of the nation's major newspapers, with particular attention to revaluation. The uneven reporting of revaluation episodes can be explained by the existence of an interest group particularly affected by revaluation, looking to increase awareness and sympathy in order to obtain help from public institutions. Using the number of news items and op-eds from a major Colombian newspaper, it is shown that revaluation episodes are over-reported in contrast to devaluation ones. Second, using text analysis of the content of the news, it is also shown that the words devaluation and revaluation are far apart in the distribution of words within the news, and that revaluation is highly correlated with words related to public institutions, exporters and the need for assistance. Finally, it is shown that the probability of the central bank buying US dollars to lessen revaluation effects increases with the number of news items, even though the central bank allegedly intervenes in the exchange rate market only to tame volatility or accumulate international reserves.
Abstract:
Bayesian inference has been used to determine rigorous estimates of hydroxyl radical concentrations ([OH]) and air mass dilution rates (K), averaged following air masses between linked observations of nonmethane hydrocarbons (NMHCs) spanning the North Atlantic during the Intercontinental Transport and Chemical Transformation (ITCT)-Lagrangian-2K4 experiment. The Bayesian technique obtains a refined (posterior) distribution of a parameter given data related to the parameter through a model and prior beliefs about the parameter distribution. Here, the model describes hydrocarbon loss through reaction with OH and mixing with a background concentration at rate K. The Lagrangian experiment provides direct observations of hydrocarbons at two time points, removing assumptions regarding composition or sources upstream of a single observation. The estimates are sharpened by using many hydrocarbons with different reactivities and accounting for their variability and measurement uncertainty. A novel technique is used to construct prior background distributions of many species, described by variation of a single parameter. This exploits the high correlation between species, related by the first principal component of many NMHC samples. The Bayesian method obtains posterior estimates of [OH], K and the background parameter following each air mass. Median [OH] values are typically between 0.5 and 2.0 × 10⁶ molecules cm⁻³, but are elevated to between 2.5 and 3.5 × 10⁶ molecules cm⁻³ in low-level pollution. A comparison of [OH] estimates from absolute NMHC concentrations and from NMHC ratios assuming zero background (the "photochemical clock" method) shows similar distributions but reveals a systematic high bias in the estimates from ratios. Estimates of K are ∼0.1 day⁻¹ but show more sensitivity to the assumed prior distribution.
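The "photochemical clock" method used for comparison above can be illustrated compactly: assuming zero background and pure OH loss, the ratio of a fast- to a slow-reacting hydrocarbon decays at a rate set by the difference of the two rate constants. The sketch below is a hypothetical illustration; the rate constants and observed ratios are made-up values of plausible magnitude, not data from the experiment.

```python
import math

# Illustrative OH rate constants (cm^3 molecule^-1 s^-1) for two NMHCs
# with different reactivities; the values are assumptions, not measurements.
k_fast = 2.5e-12   # a faster-reacting, butane-like species
k_slow = 2.5e-13   # a slower-reacting, ethane-like species

def oh_from_ratio(r0, rt, dt_s, k1=k_fast, k2=k_slow):
    """Photochemical clock: with zero background and loss by OH only, the
    fast/slow concentration ratio decays as R(t) = R(0)*exp(-(k1-k2)*[OH]*t),
    so [OH] = ln(R0/Rt) / ((k1 - k2) * t)."""
    return math.log(r0 / rt) / ((k1 - k2) * dt_s)

# two linked observations of the same air mass, one day apart
oh = oh_from_ratio(r0=4.0, rt=3.2, dt_s=86400.0)
print(f"[OH] ≈ {oh:.2e} molecules cm^-3")
```

A nonzero background inflates the apparent ratio decay, which is one route to the systematic high bias in ratio-based estimates that the abstract reports.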
Abstract:
It is generally accepted that genetics may be an important factor in explaining the variation between patients' responses to certain drugs. However, identification and confirmation of the responsible genetic variants is proving to be a challenge in many cases. A number of difficulties that may be encountered in pursuit of these variants, such as non-replication of a true effect, population structure and selection bias, can be mitigated or at least reduced by appropriate statistical methodology. Another major statistical challenge facing pharmacogenetics studies is detecting possibly small polygenic effects in large volumes of genetic data while controlling the number of false positive signals. Here we review the statistical design and analysis options available for investigations of genetic resistance to anti-epileptic drugs.
Abstract:
Concerns about potentially misleading reporting of pharmaceutical industry research have surfaced many times. The potential for duality (and thereby conflict) of interest is only too clear when one considers the sums of money required for the discovery, development and commercialization of new medicines. As the ability of major, mid-size and small pharmaceutical companies to innovate has waned, as evidenced by the seemingly relentless year-on-year decline in the number of new medicines approved by the Food and Drug Administration and the European Medicines Agency, not only has the cost per newly approved medicine risen: so too has public and media concern about the extent to which the pharmaceutical industry is open and honest about the efficacy, safety and quality of the drugs we manufacture and sell. In 2005, an editorial in the Journal of the American Medical Association made clear that, so great was their concern about misleading reporting of industry-sponsored studies, henceforth no article would be published that was not also guaranteed by independent statistical analysis. We examine the precursors to this editorial, as well as its immediate and lasting effects for statisticians, for the manner in which statistical analysis is carried out, and for the industry more generally.
Abstract:
In this paper we discuss bias-corrected estimators for the regression and dispersion parameters in an extended class of dispersion models (Jorgensen, 1997b). This class extends the regular dispersion models by letting the dispersion parameter vary across observations, and contains the dispersion models as a particular case. General formulae for the O(n⁻¹) bias are obtained explicitly for dispersion models with dispersion covariates, generalizing previous results of Botter and Cordeiro (1998), Cordeiro and McCullagh (1991), Cordeiro and Vasconcellos (1999), and Paula (1992). The practical use of the formulae is that closed-form expressions can be derived for the O(n⁻¹) biases of the maximum likelihood estimators of the regression and dispersion parameters whenever the information matrix has a closed form. Various expressions for the O(n⁻¹) biases are given for special models. The formulae are advantageous for numerical purposes because they require only a supplementary weighted linear regression. We also compare these bias-corrected estimators with two alternative estimators, based on bootstrap methods, which are also bias-free to order O(n⁻¹). The estimators are compared by simulation.
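The bootstrap alternative compared in the paper removes the O(n⁻¹) bias term without an analytic formula. A minimal sketch of the standard bootstrap bias correction follows; it is not the authors' exact procedure, and the variance example is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_bias_corrected(estimator, data, n_boot=2000):
    """Bootstrap bias correction: estimate the bias by the average shift of
    the estimator over resamples, then subtract it:
    theta_bc = 2 * theta_hat - mean(theta_boot).
    This removes the O(n^-1) bias term without any analytic derivation."""
    theta_hat = estimator(data)
    boot = [estimator(rng.choice(data, size=len(data), replace=True))
            for _ in range(n_boot)]
    return 2.0 * theta_hat - np.mean(boot)

# example: the variance MLE, sum((x - mean)^2) / n, is biased by -sigma^2 / n
data = rng.normal(loc=0.0, scale=2.0, size=30)
mle = np.var(data)                                  # biased MLE (ddof=0)
corrected = bootstrap_bias_corrected(np.var, data)  # bias-corrected version
print(mle, corrected)
```

The analytic formulae in the paper achieve the same order of correction with a single supplementary weighted linear regression instead of thousands of resamples.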
Abstract:
This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association studies (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights into modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to correctly adjust for the bias in genetic variance component estimation and to gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data across an entire genome. These whole-genome models were shown to perform well in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance rather than the mean, which validated the idea of variance-controlling genes. The work in the thesis is accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).
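Whole-genome random effects models of the kind implemented in bigRR reduce, in their simplest form, to a ridge regression in which every marker effect is shrunk by a common penalty. The sketch below shows that core computation only; it is not the bigRR implementation, and the toy dimensions and penalty value are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def ridge(X, y, lam):
    """Ridge / SNP-BLUP-style estimator: treating all marker effects as
    random with a common shrinkage lam gives the closed form
    beta = (X'X + lam * I)^-1 X'y, usable even when p >> n."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# toy "whole-genome" setting: more markers (p) than individuals (n)
n, p = 50, 200
X = rng.standard_normal((n, p))       # stand-in for centered genotype codes
beta_true = np.zeros(p)
beta_true[:5] = 1.0                   # a few causal markers
y = X @ beta_true + rng.standard_normal(n)

beta_hat = ridge(X, y, lam=10.0)
print(np.round(beta_hat[:5], 2))
```

Generalized ridge methods such as bigRR go further by giving each marker its own shrinkage, but the closed-form normal equations above remain the computational core.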
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
The Box-Cox transformation is mostly used to make the probability distribution of time series data approximately normal, which helps statistical and neural models produce more accurate forecasts. However, it introduces a bias when the transformation is reversed on the predicted data. The statistical methods for bias-free reversion necessarily assume Gaussianity of the transformed data distribution, which is rarely the case in real-world time series. The aim of this study was therefore to provide an effective method of removing the bias introduced when the Box-Cox transformation is reversed. The developed method is based on a focused time-lagged feedforward neural network, which does not require any assumption about the transformed data distribution. To evaluate the performance of the proposed method, numerical simulations were conducted, and the Mean Absolute Percentage Error, the Theil Inequality Index and the signal-to-noise ratio of 20-step-ahead forecasts of 40 time series were compared. The results indicate that the proposed reversion method is valid and justifies further studies.
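The reversion bias the study targets is a consequence of Jensen's inequality: inverting a nonlinear transformation at the point forecast does not recover the mean on the original scale. The sketch below illustrates this numerically for the log transform (the Box-Cox case λ = 0), together with the classical Gaussian-based correction that the proposed neural method is meant to replace; the simulated data are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Log transform = Box-Cox with lambda = 0. If the forecast is made on the
# transformed scale, naively exponentiating the point forecast understates
# the mean, because E[exp(Z)] > exp(E[Z]) by Jensen's inequality.
y = rng.lognormal(mean=2.0, sigma=0.8, size=100_000)  # original-scale data
z = np.log(y)                                         # transformed scale

naive = np.exp(z.mean())                   # naive reversion of the mean forecast
# classical correction, valid only if z is Gaussian: E[y] = exp(mu + sigma^2/2)
corrected = np.exp(z.mean() + z.var() / 2.0)

print(naive, corrected, y.mean())
```

Here the Gaussianity assumption holds by construction, so the classical correction matches the sample mean closely; the abstract's point is that real series rarely satisfy it, motivating an assumption-free correction.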
Abstract:
OBJECTIVE: The purpose of this study was to investigate the presence of publication bias (the preferential acceptance of articles reporting statistically significant results). METHODS: The journals with the highest impact factor (2008 data) in each dental specialty were included in the study. The content of the 6 most recent issues of each journal was hand searched and research articles were classified into 4 type categories: cross-sectional, case-control, cohort, and interventional (nonrandomized clinical trials and randomized controlled trials). In total, 396 articles were included in the analysis. Descriptive statistics and univariate and multivariate logistic regression were used to examine the association between article-reported statistical significance (dependent variable) and journal impact factor and article study type and subject area (independent variables). RESULTS: A statistically significantly high acceptance rate of positive results was found, ranging from 75% to 90%, whereas impact factor was not related to publication bias among leading dental journals. Compared with other research designs, clinical intervention studies (randomized or nonrandomized) presented the highest percentage of nonsignificant findings (20%); RCTs represented 6% of the examined investigations. CONCLUSIONS: Compared with the Journal of Clinical Periodontology, all other subspecialty journals, except the Journal of Oral and Maxillofacial Surgery, showed significantly decreased odds of publishing an RCT, ranging from 60% to 93% (P < .05).
Abstract:
Much controversy exists over whether the course of schizophrenia, as defined by the lengths of repeated community tenures, is progressively ameliorating or deteriorating. This article employs a new statistical method proposed by Wang and Chen (2000) to analyze the Denmark registry data in Eaton et al. (1992). The new statistical method correctly handles the bias caused by induced informative censoring, which arises from the interaction of the heterogeneity of schizophrenia patients with long-term follow-up. The analysis shows a pattern of progressive deterioration in community tenures for the full registry cohort, rather than the progressive amelioration reported for a selected sub-cohort in Eaton et al. (1992). When adjusted for the long-term chronicity of calendar time, no significant progressive pattern was found for the full cohort.