678 results for Bootstrap (Statistics)
Abstract:
Purpose: Age-related changes in motion sensitivity have been found to relate to reductions in various indices of driving performance and safety. The aim of this study was to investigate the basis of this relationship by determining which aspects of motion perception are most relevant to driving. Methods: Participants included 61 regular drivers (age range 22–87 years). Visual performance was measured binocularly. Measures included visual acuity, contrast sensitivity and motion sensitivity assessed using four different approaches: (1) threshold minimum drift rate for a drifting Gabor patch, (2) Dmin from a random dot display, (3) threshold coherence from a random dot display, and (4) threshold drift rate for a second-order (contrast-modulated) sinusoidal grating. Participants then completed the Hazard Perception Test (HPT), in which they were required to identify moving hazards in videos of real driving scenes, and a Direction of Heading (DOH) task, in which they identified deviations from normal lane keeping in brief videos of driving filmed from the interior of a vehicle. Results: In bivariate correlation analyses, all motion sensitivity measures declined significantly with age. Motion coherence thresholds and the minimum drift rate threshold for the first-order stimulus (Gabor patch) both significantly predicted HPT performance even after controlling for age, visual acuity and contrast sensitivity. Bootstrap mediation analysis showed that individual differences in DOH accuracy partly explained these relationships: individuals with poorer motion sensitivity on the coherence and Gabor tests showed decreased ability to perceive deviations in motion in the driving videos, which related in turn to their ability to detect the moving hazards. Conclusions: The ability to detect subtle movements in the driving environment (as determined by the DOH task) may be an important contributor to effective hazard perception, and is associated with age and with an individual's performance on tests of motion sensitivity. The locus of the processing deficits appears to lie in first-order, rather than second-order, motion pathways.
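The bootstrap mediation analysis is the key statistical step in this abstract. Below is a minimal sketch of percentile-bootstrap mediation, assuming a simple X -> M -> Y chain (X = motion sensitivity, M = DOH accuracy, Y = HPT score); the data and variable names are illustrative, not the authors':

```python
import numpy as np

def indirect_effect(x, m, y):
    """Indirect effect a*b: a = slope of M on X, b = coefficient of M in Y ~ X + M."""
    a = np.polyfit(x, m, 1)[0]
    design = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design, y, rcond=None)[0][2]
    return a * b

rng = np.random.default_rng(0)
n = 61                                          # sample size as in the study
x = rng.normal(size=n)                          # motion sensitivity (synthetic)
m = 0.5 * x + rng.normal(size=n)                # DOH accuracy (synthetic)
y = 0.4 * m + 0.1 * x + rng.normal(size=n)      # HPT performance (synthetic)

boot = []
for _ in range(5000):
    idx = rng.integers(0, n, size=n)            # resample participants with replacement
    boot.append(indirect_effect(x[idx], m[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect 95% CI: [{lo:.3f}, {hi:.3f}]")  # mediation if 0 is excluded
```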
Abstract:
We consider the development of statistical models for prediction of constituent concentrations of riverine pollutants, which is a key step in load estimation from frequent flow rate data and less frequently collected concentration data. We consider how to capture the impacts of past flow patterns via the average discounted flow (ADF), which discounts the past flux based on the time elapsed: more recent fluxes are given more weight. However, the effectiveness of ADF depends critically on the choice of the discount factor, which reflects the unknown environmental cumulating process of the concentration compounds. We propose to choose the discount factor by maximizing the adjusted R-squared values or the Nash-Sutcliffe model efficiency coefficient; the R-squared values are adjusted to take account of the number of parameters in the model fit. The resulting optimal discount factor can be interpreted as a measure of the constituent exhaustion rate during flood events. To evaluate the performance of the proposed regression estimators, we examine two different sampling scenarios by resampling fortnightly and opportunistically from two real daily datasets, which come from two United States Geological Survey (USGS) gaging stations located in the Des Plaines River and Illinois River basins. The generalized rating-curve approach produces estimates of the total sediment loads with biases of -30% to 83%, whereas the new approaches produce much lower biases, ranging from -24% to 35%. This substantial improvement in the estimates of the total load is due to the fact that the predictability of concentration is greatly improved by the additional predictors.
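The abstract does not give the exact ADF formula, so the following sketch assumes one plausible form: an exponentially discounted, normalized sum of past flow, with the discount factor chosen by scanning a grid and maximizing the adjusted R-squared of the concentration regression (all data synthetic):

```python
import numpy as np

def adf(flow, delta):
    """Average discounted flow: exponentially weighted sum of past flow.
    (One plausible form; the abstract does not specify the formula.)"""
    out = np.empty_like(flow, dtype=float)
    acc = 0.0
    for t, q in enumerate(flow):
        acc = delta * acc + q
        out[t] = acc * (1 - delta)          # normalise so the weights sum to ~1
    return out

def adj_r2(y, X):
    """Adjusted R-squared of an OLS fit of y on X (intercept added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    n, p = X1.shape
    return 1 - (resid @ resid / (n - p)) / np.var(y, ddof=1)

rng = np.random.default_rng(1)
flow = np.exp(rng.normal(size=500))                       # synthetic daily flow
conc = 2 - 0.3 * adf(flow, 0.9) + rng.normal(0, 0.1, 500) # synthetic concentration

grid = np.linspace(0.5, 0.99, 50)
best = max(grid, key=lambda d: adj_r2(conc, np.column_stack([flow, adf(flow, d)])))
print("optimal discount factor:", round(best, 3))
```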
Abstract:
Rank-based inference is widely used because of its robustness. This article provides optimal rank-based estimating functions for the analysis of clustered data with random cluster effects. Extensive simulation studies carried out to evaluate the performance of the proposed method demonstrate that it is robust to outliers and is highly efficient in the presence of strong cluster correlations. The performance of the proposed method is satisfactory even when the correlation structure is misspecified or when heteroscedasticity is present. Finally, a real dataset is analyzed for illustration.
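The paper's optimal estimating functions for clustered data are not spelled out in the abstract; as a generic illustration of rank-based regression, here is a sketch that minimizes Jaeckel's rank dispersion with Wilcoxon scores, a standard robust alternative to least squares (not the authors' clustered-data version):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import rankdata

def jaeckel_dispersion(beta, X, y):
    """Jaeckel (1972) dispersion with Wilcoxon scores a(i) = i/(n+1) - 1/2."""
    e = y - X @ beta
    a = rankdata(e) / (len(e) + 1) - 0.5
    return np.sum(a * e)

rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=(n, 2))
y = X @ np.array([1.0, -0.5]) + rng.standard_t(df=3, size=n)  # heavy-tailed errors

fit = minimize(jaeckel_dispersion, x0=np.zeros(2), args=(X, y), method="Nelder-Mead")
print("rank-based estimate:", fit.x.round(3))
```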
Abstract:
We consider estimating the total load from frequent flow data but less frequent concentration data. There are numerous load estimation methods available, some of which are captured in various online tools. However, most estimators are subject to large statistical biases, and their associated uncertainties are often not reported. This makes interpretation difficult and the estimation of trends or determination of optimal sampling regimes impossible to assess. In this paper, we first propose two indices for measuring the extent of sampling bias, and then provide steps for obtaining reliable load estimates that minimize the biases and make use of informative predictive variables. The key step in this approach is the development of an appropriate predictive model for concentration. This is achieved using a generalized rating-curve approach with additional predictors that capture unique features in the flow data, such as the concept of the first flush, the location of the event on the hydrograph (e.g. rise or fall) and the discounted flow. The latter may be thought of as a measure of constituent exhaustion occurring during flood events. Incorporating this additional information can significantly improve the predictability of concentration, and ultimately the precision with which the pollutant load is estimated. We also provide a measure of the standard error of the load estimate which incorporates model, spatial and/or temporal errors. This method also has the capacity to incorporate measurement error incurred through the sampling of flow. We illustrate this approach for two rivers delivering to the Great Barrier Reef, Queensland, Australia. One is a dataset from the Burdekin River, consisting of total suspended sediment (TSS), nitrogen oxide (NOx) and gauged flow for 1997. The other dataset is from the Tully River, for the period of July 2000 to June 2008. For NOx in the Burdekin, the new estimates are very similar to the ratio estimates even when there is no relationship between the concentration and the flow. However, for the Tully dataset, incorporating the additional predictive variables, namely the discounted flow and the flow phase (rising or receding), substantially improved the model fit, and thus the certainty with which the load is estimated.
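A minimal sketch of the generalized rating-curve idea: regress log concentration on log flow plus a rising/falling-limb indicator and a discounted-flow ("exhaustion") term. The variable construction below is illustrative; the paper's exact predictors and first-flush term differ:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 365
flow = np.exp(rng.normal(size=n))            # synthetic daily flow
rising = np.r_[True, np.diff(flow) > 0]      # hydrograph limb: rising vs falling

disc = np.zeros(n)                           # discounted past flow (exhaustion proxy)
for t in range(1, n):
    disc[t] = 0.95 * disc[t - 1] + flow[t - 1]

# synthetic concentration with a first-flush-like boost on the rising limb
log_c = 1 + 0.6 * np.log(flow) + 0.3 * rising - 0.01 * disc + rng.normal(0, 0.2, n)

# generalized rating curve: log C ~ log Q + limb indicator + discounted flow
X = np.column_stack([np.ones(n), np.log(flow), rising.astype(float), disc])
beta, *_ = np.linalg.lstsq(X, log_c, rcond=None)
print("fitted coefficients:", beta.round(3))
```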
Abstract:
There are numerous load estimation methods available, some of which are captured in various online tools. However, most estimators are subject to large statistical biases, and their associated uncertainties are often not reported. This makes interpretation difficult and the estimation of trends or determination of optimal sampling regimes impossible to assess. In this paper, we first propose two indices for measuring the extent of sampling bias, and then provide steps for obtaining reliable load estimates by minimizing the biases and making use of possible predictive variables. The load estimation procedure can be summarized by the following four steps:
(i) output the flow rates at regular time intervals (e.g. 10 minutes) using a time series model that captures all the peak flows;
(ii) output the predicted flow rates as in (i) at the concentration sampling times, if the corresponding flow rates are not collected;
(iii) establish a predictive model for the concentration data, which incorporates all possible predictor variables, and output the predicted concentrations at the regular time intervals as in (i); and
(iv) obtain the sum of all the products of the predicted flow and the predicted concentration over the regular time intervals to represent an estimate of the load.
The key step in this approach is the development of an appropriate predictive model for concentration. This is achieved using a generalized regression (rating-curve) approach with additional predictors that capture unique features in the flow data, namely the concept of the first flush, the location of the event on the hydrograph (e.g. rise or fall) and the cumulative discounted flow. The latter may be thought of as a measure of constituent exhaustion occurring during flood events. The model also has the capacity to accommodate autocorrelation in model errors resulting from intensive sampling during floods. Incorporating this additional information can significantly improve the predictability of concentration, and ultimately the precision with which the pollutant load is estimated. We also provide a measure of the standard error of the load estimate which incorporates model, spatial and/or temporal errors. This method also has the capacity to incorporate measurement error incurred through the sampling of flow. We illustrate this approach using the concentrations of total suspended sediment (TSS) and nitrogen oxide (NOx) and gauged flow data from the Burdekin River, a catchment delivering to the Great Barrier Reef. The sampling biases for NOx concentrations range from 2 to 10 times, indicating severe bias. As expected, the traditional average and extrapolation methods produce much higher estimates than those obtained when sampling bias is taken into account.
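Step (iv) is a discrete flux integral. A minimal sketch of that final computation, assuming predicted flow (m^3/s) and concentration (mg/L) series on the common 10-minute grid from steps (i) and (iii); the series here are placeholders for model output:

```python
import numpy as np

def total_load(pred_flow, pred_conc, dt_seconds):
    """Step (iv): load = sum over intervals of flow * concentration * dt.
    pred_flow in m^3/s, pred_conc in mg/L (= g/m^3) -> load in grams."""
    return np.sum(pred_flow * pred_conc * dt_seconds)

dt = 600.0                                           # 10-minute grid, step (i)
pred_flow = np.array([5.0, 7.2, 20.1, 15.3, 9.8])    # from the flow time series model
pred_conc = np.array([12.0, 15.0, 40.0, 30.0, 18.0]) # from the concentration model
print(f"estimated load: {total_load(pred_flow, pred_conc, dt):.0f} g")
```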
Abstract:
We consider rank-based regression models for repeated measures. To account for possible within-subject correlations, we decompose the total ranks into between- and within-subject ranks and obtain two different estimators based on the between- and within-subject ranks. A simple perturbation method is then introduced to generate bootstrap replicates of the estimating functions and the parameter estimates. This provides a convenient way of combining the two types of estimating functions for more efficient estimation.
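The perturbation idea generalizes beyond rank methods: multiply each subject's contribution to the estimating function by an i.i.d. positive weight with unit mean and variance, and re-solve to obtain a bootstrap replicate. A minimal sketch for a simple linear estimating equation (illustrative, not the paper's rank-based functions):

```python
import numpy as np

def solve_weighted(X, y, w):
    """Solve the weighted estimating equation sum_i w_i x_i (y_i - x_i'b) = 0."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

rng = np.random.default_rng(4)
n = 150
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([2.0, 1.0]) + rng.normal(size=n)

# exponential(1) weights have mean 1 and variance 1, as the perturbation requires
boots = np.array([solve_weighted(X, y, rng.exponential(1.0, n)) for _ in range(2000)])
print("perturbation-bootstrap SEs:", boots.std(axis=0).round(3))
```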
Abstract:
Recovering the motion of a non-rigid body from a set of monocular images permits the analysis of dynamic scenes in uncontrolled environments. However, the extension of factorisation algorithms for rigid structure from motion to the low-rank non-rigid case has proved challenging. This stems from the comparatively hard problem of finding a linear “corrective transform” which recovers the projection and structure matrices from an ambiguous factorisation. We show that this greater difficulty is due to the need to find multiple solutions to a non-trivial problem, and we cast a number of previous approaches as alleviating this issue by either (a) introducing constraints on the basis, making the problems non-identical, or (b) incorporating heuristics to encourage a diverse set of solutions, making the problems inter-dependent. While it has previously been recognised that finding a single solution to this problem is sufficient to estimate cameras, we show that it is possible to bootstrap this partial solution to find the complete transform in closed form. However, we acknowledge that our method minimises an algebraic error and is thus inherently sensitive to deviation from the low-rank model. We compare our closed-form solution for non-rigid structure with known cameras to the closed-form solution of Dai et al. [1], which we find produces only coplanar reconstructions. We therefore recommend that 3D reconstruction error always be measured relative to a trivial reconstruction, such as a planar one.
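The ambiguity mentioned above is easy to demonstrate: a truncated SVD of the measurement matrix W yields one valid rank-3K factorisation, and any invertible G yields another, which is why a corrective transform must be recovered. A small synthetic sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
F, P, K = 20, 50, 2                          # frames, points, basis shapes
M_true = rng.normal(size=(2 * F, 3 * K))     # motion (projection) matrix
S_true = rng.normal(size=(3 * K, P))         # basis-shape matrix
W = M_true @ S_true                          # measurement matrix, rank 3K

# A truncated SVD yields *a* valid rank-3K factorisation of W ...
U, s, Vt = np.linalg.svd(W, full_matrices=False)
M_hat = U[:, :3 * K] * np.sqrt(s[:3 * K])
S_hat = np.sqrt(s[:3 * K])[:, None] * Vt[:3 * K]
assert np.allclose(M_hat @ S_hat, W)

# ... but so does (M_hat @ G, inv(G) @ S_hat) for ANY invertible G: this is
# the ambiguity the corrective transform must resolve.
G = rng.normal(size=(3 * K, 3 * K)) + 3 * np.eye(3 * K)   # well-conditioned transform
assert np.allclose((M_hat @ G) @ (np.linalg.inv(G) @ S_hat), W)
```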
Abstract:
Botryosphaeria rhodina (anamorph Lasiodiplodia theobromae) is a common endophyte and opportunistic pathogen on more than 500 tree species in the tropics and subtropics. During routine disease surveys of plantations in Australia and Venezuela, several isolates differing from L. theobromae were identified and subsequently characterized based upon morphology and ITS and EF1-α nucleotide sequences. These isolates grouped into three strongly supported clades related to, but different from, the known taxa B. rhodina and L. gonubiensis. They are described here as three new species: L. venezuelensis sp. nov., L. crassispora sp. nov. and L. rubropurpurea sp. nov. The three could be distinguished easily from each other and from the two described species of Lasiodiplodia, thus confirming the phylogenetic separations. Furthermore, all five Lasiodiplodia spp. now recognized separated from Diplodia spp. and Dothiorella spp. with 100% bootstrap support.
Abstract:
Avian haemophili demonstrating in vitro satellitic growth, also referred to as the V-factor or NAD requirement, have mainly been classified with Avibacterium paragallinarum (Haemophilus paragallinarum), Avibacterium avium (Pasteurella avium), Avibacterium volantium (Pasteurella volantium) and Avibacterium sp. A (Pasteurella species A). The aim of the present study was to assess the taxonomic position of 18 V-factor-requiring isolates of unclassified Haemophilus-like organisms isolated from galliform, anseriform, columbiform and gruiform birds as well as kestrels and psittacine birds including budgerigars, by conventional phenotypic tests and 16S rRNA gene sequencing. All isolates shared phenotypic characteristics which allowed classification with Pasteurellaceae. Haemolysis of bovine red blood cells was negative. Haemin (X-factor) was not required for growth. Maximum-likelihood phylogenetic analysis including bootstrap analysis showed that six isolates were related to the avian 16S rRNA group and were classified as Avibacterium according to 16S rRNA sequence analysis. Surprisingly, the other 12 isolates were unrelated to Avibacterium. Two isolates were unrelated to any of the known 16S rRNA groups of Pasteurellaceae. Two isolates were related to Volucribacter of the avian 16S rRNA group. Seven isolates belonged to the Testudinis 16S rRNA group; of these, two isolates were closely related to taxa 14 and 32 of Bisgaard, four were found to form a genus-like group distantly related to taxon 40, and one isolate remained distantly related to other members of the Testudinis group. One isolate was closely related to taxon 26 (a member of Actinobacillus sensu stricto). The study documented major genetic diversity among V-factor-requiring avian isolates beyond the traditional interpretation that they belong only to Avibacterium, underlining the limited value of satellitic growth for identification of avian members of Pasteurellaceae. Our study also emphasized that these organisms will never be isolated without the use of special media satisfying the V-factor requirement.
Abstract:
Objective: To examine if streamlining a medical research funding application process saved time for applicants. Design: Cross-sectional surveys before and after the streamlining. Setting: The National Health and Medical Research Council (NHMRC) of Australia. Participants: Researchers who submitted one or more NHMRC Project Grant applications in 2012 or 2014. Main outcome measures: Average researcher time spent preparing an application and the total time for all applications in working days. Results: The average time per application increased from 34 working days before streamlining (95% CI 33 to 35) to 38 working days after streamlining (95% CI 37 to 39; mean difference 4 days, bootstrap p value <0.001). The estimated total time spent by all researchers on applications after streamlining was 614 working years, a 67-year increase from before streamlining. Conclusions: Streamlined applications were shorter but took longer to prepare on average. Researchers may be allocating a fixed amount of time to preparing funding applications based on their expected return, or may be increasing their time in response to increased competition. Many potentially productive years of researcher time are still being lost to preparing failed applications.
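The headline result is a bootstrap test on a difference in means. A minimal sketch of how such a p value can be obtained under the pooled null of no difference (numbers synthetic; the actual survey data are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(6)
before = rng.normal(34, 8, size=300)   # working days per application, 2012 (synthetic)
after = rng.normal(38, 8, size=300)    # 2014 round (synthetic)

obs = after.mean() - before.mean()
pooled = np.concatenate([before, after])   # null hypothesis: rounds are exchangeable

diffs = np.empty(10000)
for i in range(10000):
    b = rng.choice(pooled, size=before.size, replace=True)   # bootstrap resamples
    a = rng.choice(pooled, size=after.size, replace=True)
    diffs[i] = a.mean() - b.mean()

p = (np.abs(diffs) >= abs(obs)).mean()      # two-sided bootstrap p value
print(f"mean difference {obs:.1f} days, bootstrap p = {p:.4f}")
```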
Abstract:
The DNA polymorphism among 22 isolates of Sclerospora graminicola, the causal agent of downy mildew disease of pearl millet, was assessed using 20 inter-simple sequence repeat (ISSR) primers. The objective of the study was to examine the effectiveness of ISSR markers for unravelling the extent and pattern of genetic diversity in 22 S. graminicola isolates collected from different host cultivars in different states of India. The 19 functional ISSR primers generated 410 polymorphic bands, revealed 89% polymorphism and were able to distinguish all 22 isolates. The polymorphic bands were used to construct an unweighted pair group method with arithmetic mean (UPGMA) dendrogram based on Jaccard's coefficient of similarity, and principal coordinate analysis resulted in the formation of four major clusters of the 22 isolates. The standardized Nei genetic distance among the 22 isolates ranged from 0.0050 to 0.0206. The UPGMA clustering using the standardized genetic distance matrix resulted in the identification of four clusters, with bootstrap values ranging from 15 to 100. The 3D multidimensional scaling data supported the UPGMA results, yielding four clusters accounting for 70% of the variation. However, a comparison of the two methods shows that the sub-clustering in the dendrogram and in the multidimensional scaling plot differs slightly. All the S. graminicola isolates had distinct ISSR genotypes. The ISSR fingerprints revealed a significant level of genetic diversity among the isolates, indicating that ISSR markers could be a powerful tool for fingerprinting and diversity analysis in fungal pathogens.
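The clustering pipeline described here (binary ISSR band matrix, Jaccard similarity, UPGMA dendrogram) maps directly onto standard tooling. A minimal sketch with a synthetic band matrix of the same dimensions:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(7)
bands = rng.integers(0, 2, size=(22, 410))   # 22 isolates x 410 ISSR bands (synthetic)

d = pdist(bands, metric="jaccard")           # Jaccard distance = 1 - similarity
tree = linkage(d, method="average")          # UPGMA = average linkage
clusters = fcluster(tree, t=4, criterion="maxclust")  # cut into four clusters
print("cluster assignments:", clusters)
```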
Abstract:
It is known that DNA-binding proteins can slide along the DNA helix while searching for specific binding sites, but their path of motion remains obscure. Do these proteins undergo simple one-dimensional (1D) translational diffusion, or do they rotate to maintain a specific orientation with respect to the DNA helix? We measured 1D diffusion constants as a function of protein size while maintaining the DNA-protein interface. Using bootstrap analysis of single-molecule diffusion data, we compared the results to theoretical predictions for pure translational motion and rotation-coupled sliding along the DNA. The data indicate that DNA-binding proteins undergo rotation-coupled sliding along the DNA helix and can be described by a model of diffusion along the DNA helix on a rugged free-energy landscape. A similar analysis including the 1D diffusion constants of eight proteins of varying size shows that rotation-coupled sliding is a general phenomenon. The average free-energy barrier for sliding along the DNA was 1.1 ± 0.2 kBT. Such small barriers facilitate rapid search for binding sites.
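A minimal sketch of the bootstrap step: resample the per-molecule diffusion constants with replacement and form a percentile interval for the mean, which can then be compared with the translation-only and rotation-coupled theoretical predictions (values synthetic):

```python
import numpy as np

rng = np.random.default_rng(8)
d1 = rng.lognormal(mean=-1.0, sigma=0.4, size=80)   # per-molecule D1 values (synthetic)

boot_means = np.array([rng.choice(d1, size=d1.size, replace=True).mean()
                       for _ in range(5000)])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean D1 = {d1.mean():.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
# Compare this interval to the theoretical D(R) curves for pure translation
# versus rotation-coupled sliding to decide which model the data support.
```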
Abstract:
This thesis studies binary time series models and their applications in empirical macroeconomics and finance. In addition to previously suggested models, new dynamic extensions are proposed to the static probit model commonly used in the previous literature. In particular, we are interested in probit models with an autoregressive model structure. In Chapter 2, the main objective is to compare the predictive performance of the static and dynamic probit models in forecasting the U.S. and German business cycle recession periods. Financial variables, such as interest rates and stock market returns, are used as predictive variables. The empirical results suggest that the recession periods are predictable and that dynamic probit models, especially models with the autoregressive structure, outperform the static model. Chapter 3 proposes a Lagrange Multiplier (LM) test for the usefulness of the autoregressive structure of the probit model. The finite sample properties of the LM test are considered in simulation experiments. Results indicate that the two alternative LM test statistics have reasonable size and power in large samples. In small samples, a parametric bootstrap method is suggested to obtain approximately correct size. In Chapter 4, the power of dynamic probit models in predicting the direction of stock market returns is examined. The novel idea is to use the recession forecast (see Chapter 2) as a predictor of the stock return sign. The evidence suggests that the signs of the U.S. excess stock returns over the risk-free return are predictable both in and out of sample. The new "error correction" probit model yields the best forecasts, and it also outperforms other predictive models, such as ARMAX models, in terms of statistical and economic goodness-of-fit measures. Chapter 5 generalizes the analysis of the univariate models considered in Chapters 2–4 to the case of a bivariate model. A new bivariate autoregressive probit model is applied to predict the current state of the U.S. business cycle and growth rate cycle periods. Evidence of predictability of both cycle indicators is obtained, and the bivariate model is found to outperform the univariate models in terms of predictive power.
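The parametric bootstrap for a test's small-sample size follows a generic recipe: fit the null model, simulate new datasets from the fit, and recompute the statistic on each. A minimal sketch with a static probit null; the log-likelihood stands in here as a placeholder for the thesis' LM statistic, which would require the autoregressive extension:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 200
x = rng.normal(size=n)
X = sm.add_constant(x)
y = (X @ [0.2, 0.8] + rng.normal(size=n) > 0).astype(float)  # static probit DGP

null_fit = sm.Probit(y, X).fit(disp=0)
stat_obs = null_fit.llf              # placeholder statistic; substitute the LM stat

boot_stats = []
for _ in range(499):
    p = null_fit.predict(X)                          # fitted null probabilities
    y_b = (rng.uniform(size=n) < p).astype(float)    # simulate under the null
    boot_stats.append(sm.Probit(y_b, X).fit(disp=0).llf)

p_value = (np.array(boot_stats) >= stat_obs).mean()
print("parametric-bootstrap p value:", round(p_value, 3))
```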
Abstract:
Topics in Spatial Econometrics — With Applications to House Prices. Spatial effects in data occur when the geographical closeness of observations influences the relation between the observations. When two points on a map are close to each other, the observed values of a variable at those points tend to be similar. The further away the two points are from each other, the less similar the observed values tend to be. Recent technical developments, such as geographical information systems (GIS) and global positioning systems (GPS), have brought about a renewed interest in spatial matters. For instance, it is possible to observe the exact location of an observation and combine it with other characteristics. Spatial econometrics integrates spatial aspects into econometric models and analysis. The thesis concentrates mainly on methodological issues, but the findings are illustrated by empirical studies on house price data. The thesis consists of an introductory chapter and four essays. The introductory chapter presents an overview of topics and problems in spatial econometrics. It discusses spatial effects, spatial weights matrices, especially k-nearest neighbours weights matrices, and various spatial econometric models, as well as estimation methods and inference. Further, the problem of omitted variables, a few computational and empirical aspects, the bootstrap procedure and the spatial J-test are presented. In addition, a discussion on hedonic house price models is included. In the first essay a comparison is made between spatial econometrics and time series analysis. By restricting attention to unilateral spatial autoregressive processes, it is shown that a unilateral spatial autoregression, which enjoys properties similar to those of an autoregression with time series, can be defined. Through an empirical study on house price data, the second essay shows that it is possible to form coordinate-based, spatially autoregressive variables, which are at least to some extent able to replace the spatial structure in a spatial econometric model. In the third essay a strategy for specifying a k-nearest neighbours weights matrix by applying the spatial J-test is suggested, studied and demonstrated. In the fourth and final essay the properties of the asymptotic spatial J-test are further examined. A simulation study shows that the spatial J-test can be used for distinguishing between general spatial models with different k-nearest neighbours weights matrices. A bootstrap spatial J-test is suggested to correct the size of the asymptotic test in small samples.
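A k-nearest neighbours weights matrix of the kind discussed is straightforward to construct from point coordinates; a minimal row-standardized sketch:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_weights(coords, k):
    """Row-standardized k-nearest-neighbours spatial weights matrix W."""
    n = len(coords)
    tree = cKDTree(coords)
    # query k+1 neighbours because each point's nearest neighbour is itself
    _, idx = tree.query(coords, k=k + 1)
    W = np.zeros((n, n))
    for i, neighbours in enumerate(idx):
        for j in neighbours[1:]:
            W[i, j] = 1.0 / k                 # equal weight to each neighbour
    return W

coords = np.random.default_rng(10).uniform(size=(100, 2))  # house locations (synthetic)
W = knn_weights(coords, k=5)
assert np.allclose(W.sum(axis=1), 1.0)        # each row sums to one
```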
Abstract:
In this thesis we consider inference for cointegration in vector autoregressive (VAR) models. The thesis consists of an introduction and four papers. The first paper proposes a new test for cointegration in VAR models that is directly based on the eigenvalues of the least squares (LS) estimate of the autoregressive matrix. In the second paper we compare a small sample correction for the likelihood ratio (LR) test of cointegrating rank with the bootstrap. The simulation experiments show that the bootstrap works very well in practice and dominates the correction factor. The tests are applied to international stock price data, and the finite sample performance of the tests is investigated by simulating the data. The third paper studies the demand for money in Sweden 1970–2000 using the I(2) model. In the fourth paper we re-examine the evidence of cointegration between international stock prices. The paper shows that some of the previous empirical results can be explained by the small-sample bias and size distortion of Johansen's LR tests for cointegration. In all papers we work with two data sets. The first is a Swedish money demand data set with observations on the money stock, the consumer price index, gross domestic product (GDP), the short-term interest rate and the long-term interest rate. The data are quarterly and the sample period is 1970(1)–2000(1). The second data set consists of month-end stock market index observations for Finland, France, Germany, Sweden, the United Kingdom and the United States from 1980(1) to 1997(2). Both data sets are typical of the sample sizes encountered in economic data, and the applications illustrate the usefulness of the models and tests discussed in the thesis.
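A minimal sketch of a residual-based bootstrap trace test for no cointegration (rank 0): estimate the null model in differences, rebuild levels from resampled residuals, and recompute the Johansen trace statistic each round. This is a deliberately simplified version (white-noise differences under the null), not the thesis' full procedure:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(11)
T, k = 200, 2
y = np.cumsum(rng.normal(size=(T, k)), axis=0)      # two independent random walks

stat_obs = coint_johansen(y, det_order=0, k_ar_diff=1).lr1[0]  # trace stat, rank 0

# Null model (rank 0): y is a pure random walk; residuals are the differences.
resid = np.diff(y, axis=0)
resid -= resid.mean(axis=0)

boot_stats = []
for _ in range(199):
    e = resid[rng.integers(0, len(resid), size=len(resid))]   # resample residuals
    y_b = np.vstack([y[:1], y[0] + np.cumsum(e, axis=0)])     # rebuild levels
    boot_stats.append(coint_johansen(y_b, det_order=0, k_ar_diff=1).lr1[0])

p_value = (np.array(boot_stats) >= stat_obs).mean()
print("bootstrap p value for rank 0:", round(p_value, 3))
```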