121 results for Bootstrapping
Abstract:
The mathematical representation of Brunswik's lens model has been used extensively to study human judgment and provides a unique opportunity to conduct a meta-analysis of studies that covers roughly five decades. Specifically, we analyze statistics of the lens model equation (Tucker, 1964) associated with 259 different task environments obtained from 78 papers. In short, we find on average fairly high levels of judgmental achievement and note that people can achieve similar levels of cognitive performance in both noisy and predictable environments. Although overall performance varies little between laboratory and field studies, the two differ in terms of components of performance and types of environments (numbers of cues and redundancy). An analysis of learning studies reveals that the most effective form of feedback is information about the task. We also analyze empirically when bootstrapping is more likely to occur. We conclude by indicating shortcomings of the kinds of studies conducted to date, limitations in the lens model methodology, and possibilities for future research.
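The lens model equation mentioned above (Tucker, 1964) decomposes judgmental achievement, the correlation r_a between a judge's predictions and the criterion, into environmental predictability (R_e), judgmental consistency (R_s), linear knowledge (G), and unmodeled knowledge (C). A minimal numerical sketch in Python, using entirely simulated cues, criterion, and judgments (hypothetical data for illustration only), shows that the decomposition holds exactly for linear models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated task: n cases, k cues (hypothetical data for illustration only)
n, k = 500, 4
X = rng.normal(size=(n, k))                                               # cue values
y_e = X @ np.array([1.0, 0.5, 0.3, 0.1]) + rng.normal(scale=1.0, size=n)  # criterion
y_s = X @ np.array([0.8, 0.6, 0.1, 0.0]) + rng.normal(scale=0.8, size=n)  # judgments

def fit(y, X):
    """OLS fit of y on the cues (with intercept); returns fitted values and residuals."""
    D = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    yhat = D @ beta
    return yhat, y - yhat

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

yhat_e, res_e = fit(y_e, X)   # model of the environment
yhat_s, res_s = fit(y_s, X)   # model of the judge

r_a = corr(y_s, y_e)          # achievement
R_e = corr(y_e, yhat_e)       # environmental predictability
R_s = corr(y_s, yhat_s)       # judgmental consistency
G   = corr(yhat_s, yhat_e)    # linear knowledge
C   = corr(res_s, res_e)      # unmodeled (residual) knowledge

# Lens model equation: the two sides agree exactly for linear models,
# because residuals are orthogonal to the cue space on both sides
lhs = r_a
rhs = G * R_s * R_e + C * np.sqrt(1 - R_s**2) * np.sqrt(1 - R_e**2)
print(round(lhs, 6), round(rhs, 6))
```

The identity is exact here because both fitted models share the same cue space, so the fitted-value and residual covariances split cleanly.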
Abstract:
This Article breaks new ground toward contractual and institutional innovation in models of homeownership, equity building, and mortgage enforcement. Inspired by recent developments in the affordable housing sector and other types of public financing schemes, we suggest extending institutional and financial strategies such as time- and place-based division of property rights, conditional subsidies, and credit mediation to alleviate the systemic risks of mortgage foreclosure. Two new solutions offer a broad theoretical basis for such developments in the economic and legal institution of homeownership: a for-profit shared equity scheme led by local governments, alongside a private-market shared equity model of "bootstrapping home buying with purchase options".
Abstract:
Although correspondence analysis (CA) is now widely available in statistical software packages and applied in a variety of contexts, notably the social and environmental sciences, there are still some misconceptions about this method as well as unresolved issues that remain controversial to this day. In this paper we hope to settle these matters, namely (i) the way CA measures variance in a two-way table and how to compare variances between tables of different sizes, (ii) the influence, or rather lack of influence, of outliers in the usual CA maps, (iii) the scaling issue and the biplot interpretation of maps, (iv) whether or not to rotate a solution, and (v) statistical significance of results.
Abstract:
ABSTRACT: BACKGROUND: Chest pain raises concern for the possibility of coronary heart disease. Scoring methods have been developed to identify coronary heart disease in emergency settings, but not in primary care. METHODS: Data were collected from a multicenter Swiss clinical cohort study including 672 consecutive patients with chest pain who had visited one of 59 family practitioners' offices. Using the delayed diagnosis, we derived a prediction rule to rule out coronary heart disease by means of a logistic regression model. Known cardiovascular risk factors, pain characteristics, and physical signs associated with coronary heart disease were explored to develop a clinical score. Patients diagnosed with angina or acute myocardial infarction within the year following their initial visit comprised the coronary heart disease group. RESULTS: The coronary heart disease score was derived from eight variables: age, gender, duration of chest pain from 1 to 60 minutes, substernal chest pain location, pain increasing with exertion, absence of a tenderness point at palpation, cardiovascular risk factors, and personal history of cardiovascular disease. The area under the receiver operating characteristic curve was 0.95 (95% confidence interval, 0.92-0.97). From this score, 413 patients were classified as low risk, using the 5th percentile of the coronary heart disease patients as the cut-off. Internal validity was confirmed by bootstrapping. External validation using data from a German cohort (Marburg, n = 774) revealed an area under the receiver operating characteristic curve of 0.75 (95% confidence interval, 0.72-0.81), with a sensitivity of 85.6% and a specificity of 47.2%. CONCLUSIONS: This score, based only on history and physical examination, is a complementary tool for ruling out coronary heart disease in primary care patients complaining of chest pain.
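Internal validation by bootstrapping, as used in the study above, typically means resampling the cohort with replacement and recomputing the model's discrimination each time. A minimal Python sketch of a percentile bootstrap interval for the area under the ROC curve, on synthetic scores and outcomes (not the study's data):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic risk scores: cases score higher on average (illustration only)
y = np.concatenate([np.zeros(300), np.ones(100)])          # 0 = no disease, 1 = disease
s = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(1.5, 1.0, 100)])

def auc(y, s):
    """AUC via the Mann-Whitney statistic: probability a case outscores a control."""
    pos, neg = s[y == 1], s[y == 0]
    wins = (pos[:, None] > neg).sum() + 0.5 * (pos[:, None] == neg).sum()
    return wins / (len(pos) * len(neg))

point = auc(y, s)

# Percentile bootstrap: resample patients with replacement B times
B = 1000
idx = np.arange(len(y))
boot = []
for _ in range(B):
    i = rng.choice(idx, size=len(idx), replace=True)
    boot.append(auc(y[i], s[i]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUC {point:.2f}, 95% bootstrap CI [{lo:.2f}, {hi:.2f}]")
```

Resampling whole patients (label and score together) preserves the case/control mix on average, which is what "internal validation by bootstrapping" requires.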
Abstract:
Accurate determination of subpopulation sizes in bimodal populations remains problematic, yet it represents a powerful way to compare cellular heterogeneity under different environmental conditions. So far, most studies have relied on qualitative descriptions of population distribution patterns, on population-independent descriptors, or on arbitrary placement of thresholds distinguishing biological ON from OFF states. We found that all these methods fall short of accurately describing small subpopulation sizes in bimodal populations. Here we propose a simple, statistics-based method for the analysis of small subpopulation sizes for use in the free software environment R and test this method on real as well as simulated data. Four so-called population-splitting methods were designed with different algorithms that can estimate subpopulation sizes from bimodal populations. All four methods proved more precise than previously used methods when analyzing subpopulation sizes of transfer-competent cells arising in populations of the bacterium Pseudomonas knackmussii B13. The methods' resolving powers were further explored by bootstrapping and simulations. Two of the methods were not severely limited by the proportions of subpopulations they could estimate correctly, whereas the other two allowed accurate subpopulation quantification only when the subpopulation amounted to less than 25% of the total population. In contrast, only one method remained sufficiently accurate with subpopulations smaller than 1% of the total population. This study proposes a number of rational approximations for quantifying small subpopulations and offers an easy-to-use protocol for their implementation in the open-source statistical software environment R.
Abstract:
Background: Detection rates for adenoma and early colorectal cancer (CRC) are unsatisfactory due to low compliance with invasive screening procedures such as colonoscopy. There is a large unmet screening need calling for an accurate, non-invasive and cost-effective test to screen for early neoplastic and pre-neoplastic lesions. Our goal is to identify effective biomarker combinations to develop a screening test aimed at detecting precancerous lesions and early CRC stages, based on a multigene assay performed on peripheral blood mononuclear cells (PBMC). Methods: A pilot study was conducted on 92 subjects. Colonoscopy revealed 21 CRC, 30 adenomas larger than 1 cm, and 41 healthy controls. A panel of 103 biomarkers was selected by two approaches: a candidate-gene approach based on a literature review, and whole-transcriptome analysis of a subset of this cohort by Illumina TAG profiling. Blood samples were taken from each patient and PBMC purified. Total RNA was extracted and the 103 biomarkers were tested by multiplex RT-qPCR on the cohort. Different univariate and multivariate statistical methods were applied to the PCR data, and 60 biomarkers with a significant p-value (< 0.01) for most of the methods were selected. Results: The 60 biomarkers are involved in several different biological functions, such as cell adhesion, cell motility, cell signaling, cell proliferation, development and cancer. Two distinct molecular signatures derived from the biomarker combinations were established, based on penalized logistic regression, to separate patients without lesions from those with CRC or adenoma. These signatures were validated by bootstrapping, leading to a separation of patients without lesions from those with CRC (Se 67%, Sp 93%, AUC 0.87) and from those with adenomas larger than 1 cm (Se 63%, Sp 83%, AUC 0.77). In addition, the organ and disease specificity of these signatures was confirmed using patients with other cancer types and inflammatory bowel diseases. Conclusions: The two defined biomarker combinations effectively detect the presence of CRC and adenomas larger than 1 cm with high sensitivity and specificity. A prospective, multicentric, pivotal study is underway to validate these results in a larger cohort.
ASTRAL-R score predicts non-recanalisation after intravenous thrombolysis in acute ischaemic stroke.
Abstract:
Intravenous thrombolysis (IVT) as a treatment for acute ischaemic stroke may be insufficient to achieve recanalisation in certain patients. Predicting the probability of non-recanalisation after IVT may have the potential to influence patient selection towards more aggressive management strategies. We aimed to derive and internally validate a predictive score for post-thrombolytic non-recanalisation, using clinical and radiological variables. From the thrombolysis registries of four Swiss academic stroke centres (Lausanne, Bern, Basel and Geneva), we selected patients with a large arterial occlusion on acute imaging and with repeated arterial assessment at 24 hours. Based on a logistic regression analysis, an integer-based score for each covariate of the fitted multivariate model was generated. The performance of the integer-based predictive model was assessed by bootstrapping the available data and by cross-validation (delete-d method). In 599 thrombolysed strokes, five variables were identified as independent predictors of absence of recanalisation: acute glucose > 7 mmol/l (A), significant extracranial vessel STenosis (ST), decreased Range of visual fields (R), large Arterial occlusion (A), and decreased Level of consciousness (L). All variables were weighted 1, except for (L), which received 2 points based on β-coefficients on the logistic scale. ASTRAL-R scores of 0, 3 and 6 corresponded to non-recanalisation probabilities of 18%, 44% and 74%, respectively. Predictive ability showed an AUC of 0.66 (95% CI, 0.61-0.70) with the bootstrap and 0.66 (0.63-0.68) with delete-d cross-validation. In conclusion, the 5-item ASTRAL-R score moderately predicts non-recanalisation at 24 hours in thrombolysed ischaemic strokes. If its performance can be confirmed by external validation and its clinical usefulness proven, the score may influence patient selection for more aggressive revascularisation strategies in routine clinical practice.
Abstract:
Ten common doubts of chemistry students and professionals about statistical applications are discussed. The use of the N-1 denominator instead of N in the standard deviation is described. The statistical meaning of the denominators of the root mean square error of calibration (RMSEC) and the root mean square error of validation (RMSEV) is given for researchers using multivariate calibration methods. The reason why scientists and engineers use the average instead of the median is explained. Several problematic aspects of regression and correlation are treated. The popular use of triplicate experiments in teaching and research laboratories is shown to have its origin in statistical confidence intervals. Nonparametric statistics and bootstrapping methods round out the discussion.
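The N-1 point above can be made concrete: dividing the sum of squared deviations by N-1 rather than N makes the sample variance an unbiased estimator of the population variance, because one degree of freedom is spent estimating the mean. A short Python illustration on simulated data (the numbers are arbitrary, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(7)
true_var = 4.0   # population variance of the simulated process

# Average the two variance estimators over many small samples
n, trials = 5, 20000
biased, unbiased = [], []
for _ in range(trials):
    x = rng.normal(0.0, np.sqrt(true_var), size=n)
    ss = np.sum((x - x.mean()) ** 2)   # squared deviations from the SAMPLE mean
    biased.append(ss / n)              # N denominator: systematically too small
    unbiased.append(ss / (n - 1))      # N-1 denominator: unbiased on average

print(round(float(np.mean(biased)), 2), round(float(np.mean(unbiased)), 2))
```

With N = 5 the N-denominator estimator averages close to (N-1)/N = 4/5 of the true variance, while the N-1 version averages close to the true value; this is exactly why `np.std(x, ddof=1)` exists alongside the default `ddof=0`.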
Abstract:
The economic valuation of environmental assets is an important criterion supporting decision-making in policies that manage natural resources. In this context, we sought to value the Cachoeira das Andorinhas State Environmental Protection Area (APAE/CA), an important conservation unit in the municipality of Ouro Preto, MG, covering 18,700 ha. Despite its valuable natural heritage, this environmental asset contains degraded areas and still lacks the necessary support from management agencies for its effective stewardship. The study area was defined as the Rio das Velhas sub-basin, in which the APAE/CA is located, encompassing districts in Ouro Preto and Itabirito. Using contingent valuation, through the approach of Hanemann (1984) and the bootstrapping method, we obtained a median monthly willingness to pay (WTP) of R$15.43 per inhabitant of the districts involved for the improvement and preservation of the APAE/CA. The economic value calculated for the environmental asset was R$10,398,030.12, representing the annual benefits provided by the APAE/CA as perceived by the respondents. Alongside the valuation, environmental-perception variables were used to examine their impact on the estimated WTP. The results indicate that environmental perception and valuation are positively correlated.
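The bootstrapped median WTP described above can be sketched in Python: resample the survey responses with replacement and take percentiles of the resampled medians. The responses below are simulated placeholders (skewed, with many zero bids, as is typical of contingent-valuation surveys), not the study's data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical monthly WTP responses in R$ (NOT the actual survey data):
# a protest/zero-bid group plus a right-skewed positive-bid group
wtp = np.concatenate([np.zeros(80),
                      rng.lognormal(mean=2.7, sigma=0.6, size=220)])

point = np.median(wtp)

# Percentile bootstrap of the median
B = 2000
boot = np.array([np.median(rng.choice(wtp, size=wtp.size, replace=True))
                 for _ in range(B)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"median WTP R${point:.2f}, 95% CI [R${lo:.2f}, R${hi:.2f}]")
```

The bootstrap is well suited to the median here because the WTP distribution is far from normal, so a normal-theory standard error for the median would be unreliable.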
Abstract:
The purpose of this master's thesis was to perform simulations involving the use of random numbers in hypothesis testing, especially on two sample populations compared with respect to their means, variances or Sharpe ratios. Specifically, we simulated some well-known distributions in Matlab and checked the accuracy of the hypothesis tests. Furthermore, we went deeper and checked what happens when the bootstrapping method, as described by Efron, is applied to the simulated data. In addition, the robust Sharpe ratio hypothesis test presented in the paper of Ledoit and Wolf was applied to measure the statistical significance of the performance difference between two investment funds, by testing whether there is a statistically significant difference between their Sharpe ratios. We collected extensive literature on the topic and generated in Matlab as many simulated random numbers as possible to serve our purpose. As a result, we came to a good understanding that tests are not always accurate: for instance, when testing whether two normally distributed random vectors come from the same normal distribution, the Jarque-Bera test for normality indicated that for the normal random vectors r1 and r2 only 94.7% and 95.7%, respectively, were identified as coming from a normal distribution, while 5.3% and 4.3% of the tests failed to confirm what was already known to be true. However, when we introduced Efron's bootstrapping methods for estimating the p-values on which the hypothesis decision is based, the test was accurate in 100% of cases. These results suggest that bootstrapping methods should always be considered when testing hypotheses or estimating statistics, because in most cases the outcomes are accurate and computational errors are minimized. Also, the robust Sharpe ratio test, which is known to use one of the bootstrapping methods (the studentised one), was applied first to various simulated data covering distributions of many kinds and shapes, and then to real data on hedge and mutual funds. The test performed quite well, agreeing with the existence of a statistically significant difference between their Sharpe ratios, as described in the paper of Ledoit and Wolf.
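The Efron-style bootstrap testing idea described above can be sketched for the simplest case, a two-sample test of equal means, in Python (the simulated vectors stand in for the thesis's r1/r2; the Sharpe-ratio test of Ledoit and Wolf uses a studentised block bootstrap and is considerably more involved):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two simulated samples whose means clearly differ (illustration only)
x = rng.normal(0.0, 1.0, 60)
y = rng.normal(1.0, 1.0, 60)

t_obs = x.mean() - y.mean()

# Enforce the null hypothesis by recentering each sample at the pooled mean,
# then resample each recentered sample with replacement (Efron's scheme)
pooled = np.concatenate([x, y]).mean()
xc = x - x.mean() + pooled
yc = y - y.mean() + pooled

B = 5000
t_boot = np.empty(B)
for b in range(B):
    xb = rng.choice(xc, size=xc.size, replace=True)
    yb = rng.choice(yc, size=yc.size, replace=True)
    t_boot[b] = xb.mean() - yb.mean()

# Two-sided bootstrap p-value: how often the null resamples are as extreme
# as the observed difference
p = np.mean(np.abs(t_boot) >= abs(t_obs))
print(f"observed difference {t_obs:.2f}, bootstrap p = {p:.4f}")
```

The key step is recentering: the resampling must be done under the null hypothesis, so the bootstrap distribution of the statistic describes what differences would look like if the two means were truly equal.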
Abstract:
This thesis studies the possibility of using information on insiders' transactions to forecast future stock returns after the implementation of the Sarbanes-Oxley Act in July 2003. Insider transactions between July 2003 and August 2009 are analysed with regression tests to identify the relationships between insiders' transactions and future stock returns. This analysis is complemented with rudimentary bootstrapping procedures to verify the robustness of the findings. The underlying assumption of the thesis is that insiders constantly receive pieces of information that indicate the future performance of the company. They may not be allowed to trade on large and tangible pieces of information, but they can trade on the accumulation of smaller, intangible pieces of information. Based on the analysis in the thesis, insiders' profits were found not to differ from the returns of a broad stock index. However, their individual transactions were found to be linked to future stock returns. The initial model was found to be unstable, but some of its predictive power could be sacrificed to achieve greater stability. Even after sacrificing some predictive power, the relationship was significant enough to allow external investors to achieve abnormal profits after transaction costs and taxes. The thesis does not go into great detail about the timing of transactions. The delay in publishing insiders' transactions is not taken into account in the calculations, and closed windows are not studied in detail. The potential effects of these phenomena are examined, and they do not cause great changes in the findings. Additionally, the remuneration policy of an insider or a company is not taken into account, even though it most likely affects the trading patterns of insiders. Even with these limitations, the findings offer promising opportunities for investors to improve their investment processes by incorporating additional information from insiders' transactions into their decisions.
The findings also raise questions about how insider trading should be regulated. Insiders achieve greater returns than other investors based on superior information. On the other hand, more efficient information transfer could warrant more lenient regulation. The fact that insiders' returns are dominated by the large investment stake they maintain at all times in their own companies also speaks for more leniency. As the Sarbanes-Oxley Act considerably modified the insider trading landscape, this analysis provides information that has not been available before. The thesis also constitutes a thorough analysis of the insider trading phenomenon, which has previously been somewhat fragmented across several studies.
Abstract:
The pharmacokinetics of scorpion venom and its toxins has been investigated in experimental models using adult animals, although severe scorpion accidents are associated more frequently with children. We compared the effect of age on the pharmacokinetics of tityustoxin, one of the most active principles of Tityus serrulatus venom, in young male/female rats (21-22 days old, N = 5-8) and in adult male rats (150-160 days old, N = 5-8). Tityustoxin (6 µg) labeled with technetium-99m was administered subcutaneously to young and adult rats. The plasma concentration vs time data were subjected to non-compartmental pharmacokinetic analysis to obtain estimates of various pharmacokinetic parameters such as total body clearance (CL/F), distribution volume (Vd/F), area under the curve (AUC), and mean residence time. The data were analyzed with and without considering body weight. The data without correction for body weight showed a higher Cmax (62.30 ± 7.07 vs 12.71 ± 2.11 ng/ml, P < 0.05) and AUC (296.49 ± 21.09 vs 55.96 ± 5.41 ng h-1 ml-1, P < 0.05) and a lower Tmax (0.64 ± 0.19 vs 2.44 ± 0.49 h, P < 0.05) in young rats. Furthermore, Vd/F (0.15 vs 0.42 l/kg) and CL/F (0.02 ± 0.001 vs 0.11 ± 0.01 l h-1 kg-1, P < 0.05) were lower in young rats. However, when the data were reanalyzed taking body weight into consideration, the Cmax (40.43 ± 3.25 vs 78.21 ± 11.23 ng kg-1 ml-1, P < 0.05) and AUC (182.27 ± 11.74 vs 344.62 ± 32.11 ng h-1 ml-1, P < 0.05) were lower in young rats, while the clearance (0.03 ± 0.002 vs 0.02 ± 0.002 l h-1 kg-1, P < 0.05) and Vd/F (0.210 vs 0.067 l/kg) were higher. The raw data (not adjusted for body weight) strongly suggest that age plays a pivotal role in the disposition of tityustoxin. Furthermore, our results also indicate that the differences in the severity of symptoms observed in children and adults after scorpion envenomation can be explained in part by differences in the pharmacokinetics of the toxin.
Abstract:
The aim of this thesis is to extend bootstrap theory to panel data models. Panel data are obtained by observing several statistical units over several time periods. Their twofold individual and temporal dimension makes it possible to control for unobservable heterogeneity across individuals and across time periods, and thus to carry out richer studies than with time series or cross-sectional data. The advantage of the bootstrap is that it can yield more accurate inference than classical asymptotic theory, or make inference possible when nuisance parameters would otherwise preclude it. The method consists of drawing random samples that resemble the analysis sample as closely as possible; the statistical object of interest is estimated on each of these random samples, and the set of estimated values is used for inference. The literature contains some applications of the bootstrap to panel data, but without rigorous theoretical justification or under strong assumptions. This thesis proposes a bootstrap method better suited to panel data; its three chapters analyze the method's validity and application. The first chapter posits a simple model with a single parameter and addresses the theoretical properties of the estimator of the mean. We show that the double resampling we propose, which accounts for both the individual and the temporal dimension, is valid in these models. Resampling only in the individual dimension is not valid in the presence of temporal heterogeneity; resampling only in the temporal dimension is not valid in the presence of individual heterogeneity. The second chapter extends the first to the linear panel regression model. Three types of regressors are considered: individual characteristics, temporal characteristics, and regressors that vary over both time and individuals. Using a two-way error components model, the ordinary least squares estimator, and the residual bootstrap, we show that resampling in the individual dimension alone is valid for inference on the coefficients associated with regressors that vary only across individuals. Resampling in the temporal dimension is valid only for the subvector of parameters associated with regressors that vary only over time. Double resampling, in turn, is valid for inference on the full parameter vector. The third chapter re-examines the difference-in-differences exercise of Bertrand, Duflo and Mullainathan (2004). This estimator is commonly used in the literature to evaluate the impact of public policies. The empirical exercise uses panel data from the Current Population Survey on women's wages in the 50 states of the United States from 1979 to 1999. Placebo state-level policy interventions are generated, and the tests are expected to conclude that these placebo policies have no effect on women's wages. Bertrand, Duflo and Mullainathan (2004) show that failing to account for heterogeneity and temporal dependence leads to substantial size distortions when evaluating the impact of public policies with panel data. One of the recommended solutions is to use the bootstrap. The double resampling method developed in this thesis corrects the test-size problem and thus allows the impact of public policies to be evaluated correctly.
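The double (individual-and-time) resampling idea can be sketched in Python for the simplest target, the overall mean of a simulated panel with both individual and temporal heterogeneity (a toy stand-in for the thesis's setting; the actual method resamples within a fitted regression model):

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated panel: N individuals observed over T periods, with individual
# effects, time effects, and idiosyncratic noise (hypothetical data)
N, T = 40, 12
alpha = rng.normal(0.0, 1.0, N)      # individual heterogeneity
gamma = rng.normal(0.0, 1.0, T)      # temporal heterogeneity
panel = alpha[:, None] + gamma[None, :] + rng.normal(0.0, 0.5, (N, T))

theta_hat = panel.mean()

# Double resampling: draw individuals AND time periods with replacement,
# so both sources of heterogeneity enter the bootstrap distribution.
# Resampling along only one dimension would ignore the other's variability.
B = 2000
boot = np.empty(B)
for b in range(B):
    i = rng.choice(N, size=N, replace=True)   # resample individuals
    t = rng.choice(T, size=T, replace=True)   # resample periods
    boot[b] = panel[np.ix_(i, t)].mean()

se = boot.std(ddof=1)
print(f"estimate {theta_hat:.3f}, double-bootstrap s.e. {se:.3f}")
```

Dropping either `rng.choice` line reproduces one-dimensional resampling, whose standard error understates the true uncertainty whenever the omitted dimension carries heterogeneity, which is precisely the invalidity the first chapter establishes.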
Abstract:
The main objective of this work is to study in depth certain advanced biostatistical techniques used in evaluative research in adult cardiac surgery. The studies were designed to integrate the concepts of survival analysis, regression analysis with propensity scores, and cost analysis. The first manuscript evaluates survival after surgical repair of acute ascending aortic dissection. The statistical analyses used include: survival analyses with parametric hazard-phase regression and other parametric (exponential, Weibull), semi-parametric (Cox) and non-parametric (Kaplan-Meier) methods; survival compared with a cohort matched for age, sex and race using government life tables; and regression models with bootstrapping and a multinomial logit model. The study showed that survival improved over 25 years, in connection with changes in surgical techniques and diagnostic imaging. The second manuscript focuses on the outcomes of isolated coronary artery bypass grafting in patients with a history of percutaneous coronary intervention. The statistical analyses used include: propensity-score regression models; a complex matching algorithm (1:3); and statistical analyses appropriate for matched groups (standardized differences, generalized estimating equations, stratified Cox models). The study showed that percutaneous coronary intervention performed 14 days or more before coronary bypass surgery is not associated with negative short- or long-term outcomes. The third manuscript evaluates the financial consequences and demographic changes occurring at a university hospital centre following the establishment of a satellite cardiac surgery program. The statistical analyses used include: multivariate two-way ANOVA regression models (logistic, linear or ordinal); propensity scores; and cost analyses with parametric log-normal models. "Survival" analysis models were also explored, using "cost" instead of "time" as the dependent variable, and led to similar conclusions. The study showed that, after the satellite program was established, fewer low-complexity patients were referred from the satellite program's region to the university hospital centre, with an increase in nursing workload and costs.