19 results for bootstrapping


Relevance:

20.00%

Publisher:

Abstract:

The report card for the introductory programming unit at our university has historically been unremarkable in terms of attendance rates, student success rates and student retention in both the unit and the degree course. After a recent course restructure involving a fresh approach to introducing programming, we reported high retention in the unit, with consistently high attendance and a very low failure rate. Following those encouraging results, we collected student attendance data for several semesters and compared attendance rates to student results. We have found that interesting workshop material that directly relates to course-relevant assessment items, and therefore drives the learning, delivered in an engaging collaborative learning environment, has improved attendance to an extraordinary extent, with student failure rates plummeting to the lowest in recorded history at our university.

Relevance:

20.00%

Publisher:

Abstract:

This paper is concerned with the unsupervised learning of object representations by fusing visual and motor information. The problem is posed for a mobile robot that develops its representations as it incrementally gathers data. The scenario is problematic as the robot only has limited information at each time step with which it must generate and update its representations. Object representations are refined as multiple instances of sensory data are presented; however, it is uncertain whether two data instances are synonymous with the same object. This process can easily diverge from stability. The premise of the presented work is that a robot's motor information instigates successful generation of visual representations. An understanding of self-motion enables a prediction to be made before performing an action, resulting in a stronger belief of data association. The system is implemented as a data-driven partially observable semi-Markov decision process. Object representations are formed as the process's hidden states and are coordinated with motor commands through state transitions. Experiments show the prediction process is essential in enabling the unsupervised learning method to converge to a solution - improving precision and recall over using sensory data alone.

Relevance:

10.00%

Publisher:

Abstract:

This paper firstly presents an extended ambiguity resolution model that deals with an ill-posed problem and constraints among the estimated parameters. In the extended model, the regularization criterion is used instead of the traditional least squares in order to estimate the float ambiguities better. The existing models can be derived from the general model. Secondly, the paper examines the existing ambiguity searching methods from four aspects: exclusion of nuisance integer candidates based on the available integer constraints; integer rounding; integer bootstrapping and integer least squares estimations. Finally, this paper systematically addresses the similarities and differences between the generalized TCAR and decorrelation methods from both theoretical and practical aspects.

Relevance:

10.00%

Publisher:

Abstract:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy.
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Relevance:

10.00%

Publisher:

Abstract:

The success rate of carrier phase ambiguity resolution (AR) is the probability that the ambiguities are successfully fixed to their correct integer values. In existing works, an exact success rate formula for the integer bootstrapping estimator has been used as a sharp lower bound for the integer least squares (ILS) success rate. Rigorous computation of the success rate for the more general ILS solutions has been considered difficult, because of the complexity of the ILS ambiguity pull-in region and the computational load of integrating the multivariate probability density function. The contributions of this work are twofold. First, the pull-in region, mathematically expressed as the vertices of a polyhedron, is represented by a multi-dimensional grid, at which the cumulative probability can be integrated with the multivariate normal cumulative distribution function (mvncdf) available in Matlab. The bivariate case is studied, where the pull-in region is usually defined as a hexagon and the probability is easily obtained using mvncdf at all the grid points within the convex polygon. Second, the paper compares the computed integer rounding and integer bootstrapping success rates and the lower and upper bounds of the ILS success rates to the actual ILS AR success rates obtained from a 24 h GPS data set for a 21 km baseline. The results demonstrate that the upper bound of the ILS AR probability given in the existing literature agrees well with the actual ILS success rate, although the success rate computed with the integer bootstrapping method is also quite a sharp approximation to the actual ILS success rate. The results also show that variations or uncertainty of the unit-weight variance estimates from epoch to epoch significantly affect the success rates computed by the different methods, and thus deserve more attention in order to obtain useful success probability predictions.
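
As an illustration of the exact integer bootstrapping formula mentioned above, the success rate is commonly written as the product over ambiguities of 2Φ(1/(2σ_i)) − 1, where σ_i is the standard deviation of ambiguity i conditioned on the previously fixed ones. The sketch below is not taken from the paper: the function names, the conditioning order (first ambiguity to last) and the example covariance matrices are assumptions, and no decorrelation step is applied.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def conditional_std_devs(Q):
    """Conditional standard deviations of sequentially conditioned ambiguities.

    The diagonal of the lower Cholesky factor of the covariance matrix Q
    gives the standard deviation of each ambiguity conditioned on all
    earlier ones (assumed ordering: first to last).
    """
    n = len(Q)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(C[i][k] * C[j][k] for k in range(j))
            if i == j:
                d = Q[i][i] - s
                if d <= 0.0:
                    raise ValueError("Q must be positive definite")
                C[i][j] = math.sqrt(d)
            else:
                C[i][j] = (Q[i][j] - s) / C[j][j]
    return [C[i][i] for i in range(n)]


def bootstrap_success_rate(Q):
    """Integer bootstrapping success rate for float-ambiguity covariance Q."""
    p = 1.0
    for sigma in conditional_std_devs(Q):
        p *= 2.0 * normal_cdf(1.0 / (2.0 * sigma)) - 1.0
    return p


if __name__ == "__main__":
    # Hypothetical 2x2 ambiguity covariance matrix (units: cycles squared).
    Q_example = [[0.04, 0.02], [0.02, 0.05]]
    print(bootstrap_success_rate(Q_example))
```

In practice the value depends on the ordering and on any decorrelating transformation applied to the ambiguities first, so this sketch only shows the arithmetic of the formula.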

Relevance:

10.00%

Publisher:

Abstract:

Global Navigation Satellite Systems (GNSS)-based observation systems can provide high precision positioning and navigation solutions in real time, in the order of subcentimetre if we make use of carrier phase measurements in the differential mode and deal with all the bias and noise terms well. However, these carrier phase measurements are ambiguous due to unknown, integer numbers of cycles. One key challenge in the differential carrier phase mode is to fix the integer ambiguities correctly. On the other hand, in safety-of-life or liability-critical applications, such as vehicle safety positioning and aviation, not only is high accuracy required, but reliability is also important. This PhD research studies how to achieve high reliability for ambiguity resolution (AR) in a multi-GNSS environment. GNSS ambiguity estimation and validation problems are the focus of the research effort. In particular, we study the case of multiple constellations that include initial to full operations of the foreseeable Galileo, GLONASS, Compass and QZSS navigation systems from the next few years to the end of the decade. Since real observation data is only available from GPS and GLONASS systems, the simulation method named Virtual Galileo Constellation (VGC) is applied to generate observational data from another constellation in the data analysis. In addition, both full ambiguity resolution (FAR) and partial ambiguity resolution (PAR) algorithms are used in processing single and dual constellation data. Firstly, a brief overview of related work on AR methods and reliability theory is given. Next, a modified inverse integer Cholesky decorrelation method and its performance on AR are presented. Subsequently, a new measure of decorrelation performance called orthogonality defect is introduced and compared with other measures.
Furthermore, a new AR scheme considering the ambiguity validation requirement in the control of the search space size is proposed to improve the search efficiency. With respect to the reliability of AR, we also discuss the computation of the ambiguity success rate (ASR) and confirm that the success rate computed with the integer bootstrapping method is quite a sharp approximation to the actual integer least-squares (ILS) method success rate. The advantages of multi-GNSS constellations are examined in terms of the PAR technique involving the predefined ASR. Finally, a novel satellite selection algorithm for reliable ambiguity resolution called SARA is developed. In summary, the study demonstrates that when the ASR is close to one, the reliability of AR can be guaranteed and the ambiguity validation is effective. The work then focuses on new strategies to improve the ASR, including a partial ambiguity resolution procedure with a predefined success rate and a novel satellite selection strategy with a high success rate. The proposed strategies bring significant benefits of multi-GNSS signals to real-time high precision and high reliability positioning services.

Relevance:

10.00%

Publisher:

Abstract:

Public transport travel time variability (PTTV) is essential for understanding deteriorations in the reliability of travel time and for optimizing transit schedules and route choices. This paper establishes two key definitions of PTTV: the first includes all buses, and the second includes only a single service from a bus route. The paper then analyses the day-to-day distribution of public transport travel time by using Transit Signal Priority data. A comprehensive approach using both the parametric bootstrapping Kolmogorov-Smirnov test and the Bayesian Information Criterion technique is developed, and it recommends the Lognormal distribution as the best descriptor of bus travel time on urban corridors. The probability density function of the Lognormal distribution is finally used for calculating probability indicators of PTTV. The findings of this study are useful for both traffic managers and statisticians for planning and researching the transit systems.
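
A parametric bootstrap Kolmogorov-Smirnov test of a lognormal fit can be sketched roughly as follows. This is a generic illustration, not the paper's procedure: the function names, the number of bootstrap replicates and the synthetic travel times are all assumptions. The idea is to fit the distribution, compute the KS distance, and then compare it with KS distances from samples simulated from the fitted model and refitted each time.

```python
import math
import random
import statistics


def fit_lognormal(xs):
    """Maximum-likelihood lognormal fit: mean and std of the log-values."""
    logs = [math.log(x) for x in xs]
    return statistics.fmean(logs), statistics.pstdev(logs)


def ks_statistic(xs, mu, sigma):
    """KS distance between the empirical CDF and a lognormal(mu, sigma) CDF."""
    n = len(xs)
    d = 0.0
    for i, x in enumerate(sorted(xs)):
        z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
        f = 0.5 * (1.0 + math.erf(z))
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def parametric_bootstrap_ks(xs, n_boot=500, seed=0):
    """Return (observed KS distance, bootstrap p-value) for a lognormal fit."""
    rng = random.Random(seed)
    mu, sigma = fit_lognormal(xs)
    d_obs = ks_statistic(xs, mu, sigma)
    exceed = 0
    for _ in range(n_boot):
        # Simulate from the fitted model, then refit, so the null
        # distribution accounts for parameters being estimated.
        sim = [rng.lognormvariate(mu, sigma) for _ in xs]
        m, s = fit_lognormal(sim)
        if ks_statistic(sim, m, s) >= d_obs:
            exceed += 1
    return d_obs, (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    # Synthetic "travel times" in seconds (hypothetical, for illustration).
    data_rng = random.Random(1)
    times = [data_rng.lognormvariate(6.0, 0.3) for _ in range(200)]
    print(parametric_bootstrap_ks(times))
```

A small p-value suggests the lognormal family fits poorly; a large one means the test does not reject it, which is weaker than confirming it.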

Relevance:

10.00%

Publisher:

Abstract:

Reliability of carrier phase ambiguity resolution (AR) of an integer least-squares (ILS) problem depends on the ambiguity success rate (ASR), which in practice can be well approximated by the success probability of integer bootstrapping solutions. With the current GPS constellation, a sufficiently high ASR for the geometry-based model is achievable only a certain percentage of the time. As a result, high reliability of AR cannot be assured by a single constellation. With a dual constellation system (DCS), for example GPS and Beidou, which provides more satellites in view, users can expect significant performance benefits such as AR reliability and high precision positioning solutions. Simply using all the satellites in view for AR and positioning is a straightforward solution, but does not necessarily lead to the high reliability hoped for. The paper presents an alternative approach that selects a subset of the visible satellites to achieve a higher reliability performance of the AR solutions in a multi-GNSS environment, instead of using all the satellites. Traditionally, satellite selection algorithms are mostly based on the position dilution of precision (PDOP) in order to meet accuracy requirements. In this contribution, some reliability criteria are introduced for GNSS satellite selection, and a novel satellite selection algorithm for reliable ambiguity resolution (SARA) is developed. The SARA algorithm allows receivers to select a subset of satellites for achieving a high ASR, such as above 0.99.
Numerical results from simulated dual-constellation cases show that with the SARA procedure, the percentages of ASR values in excess of 0.99 and the percentages of ratio-test values passing the threshold of 3 are both higher than those obtained by directly using all satellites in view. In the dual-constellation case in particular, the percentages of ASRs (>0.99) and ratio-test values (>3) could be as high as 98.0 and 98.5 % respectively, compared to 18.1 and 25.0 % without the satellite selection process. It is also worth noting that the implementation of SARA is simple and its computation time is low, so it can be applied in most real-time data processing applications.

Relevance:

10.00%

Publisher:

Abstract:

Used frequently in food contact materials, bisphenol A (BPA) has been studied extensively in recent years, and ubiquitous exposure in the general population has been demonstrated worldwide. Characterising within- and between-individual variability of BPA concentrations is important for characterising exposure in biomonitoring studies, and this has been investigated previously in adults, but not in children. The aim of this study was to characterise the short-term variability of BPA in spot urine samples in young children. Children aged ≥2-<4 years (n = 25) were recruited from an existing cohort in Queensland, Australia, and donated four spot urine samples each over a two-day period. Samples were analysed for total BPA using isotope dilution online solid phase extraction-liquid chromatography-tandem mass spectrometry, and concentrations ranged from 0.53–74.5 ng/ml, with geometric mean and standard deviation of 2.70 ng/ml and 2.94 ng/ml, respectively. Sex and time of sample collection were not significant predictors of BPA concentration. The between-individual variability was approximately equal to the within-individual variability (ICC = 0.51), and this ICC is somewhat higher than previously reported literature values. This may be the result of physiological or behavioural differences between children and adults or of the relatively short exposure window assessed. Using a bootstrapping methodology, a single sample resulted in correct tertile classification approximately 70% of the time. This study suggests that single spot samples obtained from young children provide a reliable characterization of absolute and relative exposure over the short time window studied, but this may not hold true over longer timeframes.

Relevance:

10.00%

Publisher:

Abstract:

Public Transport Travel Time Variability (PTTV) is essential for understanding deteriorations in the reliability of travel time and for optimizing transit schedules and route choices. This paper establishes two key definitions of PTTV: the first includes all buses, and the second includes only a single service from a bus route. The paper then analyzes the day-to-day distribution of public transport travel time by using Transit Signal Priority data. A comprehensive approach using both the parametric bootstrapping Kolmogorov-Smirnov test and the Bayesian Information Criterion technique is developed, and it recommends the Lognormal distribution as the best descriptor of bus travel time on urban corridors. The probability density function of the Lognormal distribution is finally used for calculating probability indicators of PTTV. The findings of this study are useful for both traffic managers and statisticians for planning and analyzing the transit systems.

Relevance:

10.00%

Publisher:

Abstract:

As part of a wider study to develop an ecosystem-health monitoring program for wadeable streams of south-eastern Queensland, Australia, comparisons were made regarding the accuracy, precision and relative efficiency of single-pass backpack electrofishing and multiple-pass electrofishing plus supplementary seine netting to quantify fish assemblage attributes at two spatial scales (within discrete mesohabitat units and within stream reaches consisting of multiple mesohabitat units). The results demonstrate that multiple-pass electrofishing plus seine netting provide more accurate and precise estimates of fish species richness, assemblage composition and species relative abundances in comparison to single-pass electrofishing alone, and that intensive sampling of three mesohabitat units (equivalent to a riffle-run-pool sequence) is a more efficient sampling strategy to estimate reach-scale assemblage attributes than less intensive sampling over larger spatial scales. This intensive sampling protocol was sufficiently sensitive that relatively small differences in assemblage attributes (<20%) could be detected with a high statistical power (1-β > 0.95) and that relatively few stream reaches (<4) need be sampled to accurately estimate assemblage attributes close to the true population means. The merits and potential drawbacks of the intensive sampling strategy are discussed, and it is deemed to be suitable for a range of monitoring and bioassessment objectives.

Relevance:

10.00%

Publisher:

Abstract:

This paper evaluates the performance of prediction intervals generated from alternative time series models, in the context of tourism forecasting. The forecasting methods considered include the autoregressive (AR) model, the AR model using the bias-corrected bootstrap, seasonal ARIMA models, innovations state space models for exponential smoothing, and Harvey’s structural time series models. We use thirteen monthly time series for the number of tourist arrivals to Hong Kong and Australia. The mean coverage rates and widths of the alternative prediction intervals are evaluated in an empirical setting. It is found that all models produce satisfactory prediction intervals, except for the autoregressive model. In particular, those based on the bias-corrected bootstrap perform best in general, providing tight intervals with accurate coverage rates, especially when the forecast horizon is long.

Relevance:

10.00%

Publisher:

Abstract:

The ambiguity acceptance test is an important quality control procedure in high precision GNSS data processing. Although ambiguity acceptance test methods have been extensively investigated, the method for determining their threshold is still not well understood. Currently, the threshold is determined with the empirical approach or the fixed failure rate (FF-) approach. The empirical approach is simple but lacks a theoretical basis, while the FF-approach is theoretically rigorous but computationally demanding. Hence, the key to the threshold determination problem is how to determine the threshold efficiently and in a reasonable way. In this study, a new threshold determination method named the threshold function method is proposed to reduce the complexity of the FF-approach. The threshold function method simplifies the FF-approach by a modeling procedure and an approximation procedure. The modeling procedure uses a rational function model to describe the relationship between the FF-difference test threshold and the integer least-squares (ILS) success rate. The approximation procedure replaces the ILS success rate with the easy-to-calculate integer bootstrapping (IB) success rate. The corresponding modeling error and approximation error are analysed with simulation data to avoid nuisance biases and the impact of an unrealistic stochastic model. The results indicate that the proposed method can greatly simplify the FF-approach without introducing significant modeling error. The threshold function method makes the fixed failure rate threshold determination method feasible for real-time applications.

Relevance:

10.00%

Publisher:

Abstract:

Purpose: The purpose of this work was to evaluate the patient-borne financial cost of common, adverse breast cancer treatment-associated effects, comparing cost across women with or without these side-effects. Methods: 287 Australian women diagnosed with early-stage breast cancer were prospectively followed starting at six months post-surgery for 12 months, with three-monthly assessment of detailed treatment-related side effects and their direct and indirect patient costs attributable to breast cancer. Bootstrapping statistics were used to analyze cost data and adjusted logistic regression was used to evaluate the association between costs and adverse events from breast cancer. Costs were inflated and converted from 2002 Australian to 2014 US dollars. Results: More than 90% of women experienced at least one adverse effect (i.e. post-surgical issue, reaction to radiotherapy, upper-body symptoms or reduced function, lymphedema, fatigue or weight gain). On average, women paid $5,636 (95%CI: $4,694, $6,577) in total costs. Women with any one of the following symptoms (fatigue, reduced upper-body function, upper-body symptoms) or women who report ≥4 adverse treatment-related effects, have 1.5 to nearly 4 times the odds of having higher healthcare costs than women who do not report these complaints (p<0.05). Conclusions: Women face substantial economic burden due to a range of treatment-related health problems, which may persist beyond the treatment period. Improving breast cancer care by incorporating prospective surveillance of treatment-related side effects, and strategies for prevention and treatment of concerns (e.g., exercise) has real potential for reducing patient-borne costs.
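
The abstract does not say which bootstrap variant was applied to the cost data. As one common possibility, a percentile bootstrap confidence interval for a mean can be sketched as below; the function name, the resample count and the cost values are hypothetical and are not taken from the study.

```python
import random
import statistics


def percentile_bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2.0) * n_resamples)]
    upper = means[int((1.0 - alpha / 2.0) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical right-skewed patient costs in dollars.
    costs = [120.0, 340.0, 560.0, 800.0, 950.0, 1500.0, 2300.0,
             4100.0, 7600.0, 15200.0, 480.0, 610.0, 2900.0, 3300.0]
    print(statistics.fmean(costs), percentile_bootstrap_ci(costs))
```

Resampling is often preferred for cost data because costs tend to be right-skewed, which makes normal-theory intervals for the mean less trustworthy in small samples.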

Relevance:

10.00%

Publisher:

Abstract:

- Objective: To investigate if parental disapproval of alcohol use accounts for differences in adolescent alcohol use across regional and urban communities.
- Design: Secondary data analysis of grade-level stratified data from a random sample of schools.
- Setting: High schools in Victoria, Australia.
- Participants: A random sample of 10273 adolescents from Grade 7 (mean age=12.51 years), 9 (14.46 years) and 11 (16.42 years).
- Main outcome measures: The key independent variables were parental disapproval of adolescent alcohol use and regionality (regional/urban), and the dependent variable was past 30 days alcohol use.
- Results: After adjusting for potential confounders, adolescents in regional areas were more likely to use alcohol in the past 30 days (OR=1.83, 1.44 and 1.37 for Grades 7, 9 and 11, respectively, P<0.05), and their parents have a lower level of disapproval of their alcohol use (b=-0.12, -0.15 and -0.19 for Grades 7, 9 and 11, respectively, P<0.001). Bootstrapping analyses suggested that 8.37%, 23.30% and 39.22% of the effect of regionality on adolescent alcohol use was mediated by parental disapproval of alcohol use for Grades 7, 9 and 11 participants respectively (P<0.05).
- Conclusions: Adolescents in urban areas had a lower risk of alcohol use compared with their regional counterparts, and differences in parental disapproval of alcohol use contributed to this difference.