11 resultados para Resampling
em Queensland University of Technology - ePrints Archive
Resumo:
Modularity has been suggested to be connected to evolvability because a higher degree of independence among parts allows them to evolve as separate units. Recently, the Escoufier RV coefficient has been proposed as a measure of the degree of integration between modules in multivariate morphometric datasets. However, it has been shown, using randomly simulated datasets, that the value of the RV coefficient depends on sample size. Also, so far there is no statistical test for the difference in the RV coefficient between a priori defined groups of observations. Here, we (1), using a rarefaction analysis, show that the value of the RV coefficient depends on sample size also in real geometric morphometric datasets; (2) propose a permutation procedure to test for the difference in the RV coefficient between a priori defined groups of observations; (3) show, through simulations, that such a permutation procedure has an appropriate Type I error; (4) suggest that a rarefaction procedure could be used to obtain sample-size-corrected values of the RV coefficient; and (5) propose a nearest-neighbor procedure that could be used when studying the variation of modularity in geographic space. The approaches outlined here, readily extendable to non-morphometric datasets, allow study of the variation in the degree of integration between a priori defined modules. A Java application – that will allow performance of the proposed test using a software with graphical user interface – has also been developed and is available at the Morphometrics at Stony Brook Web page (http://life.bio.sunysb.edu/morph/).
Resumo:
Biased estimation has the advantage of reducing the mean squared error (MSE) of an estimator. The question of interest is how biased estimation affects model selection. In this paper, we introduce biased estimation to a range of model selection criteria. Specifically, we analyze the performance of the minimum description length (MDL) criterion based on biased and unbiased estimation and compare it against modern model selection criteria such as Kay's conditional model order estimator (CME), the bootstrap and the more recently proposed hook-and-loop resampling based model selection. The advantages and limitations of the considered techniques are discussed. The results indicate that, in some cases, biased estimators can slightly improve the selection of the correct model. We also give an example for which the CME with an unbiased estimator fails, but could regain its power when a biased estimator is used.
Resumo:
The potential to sequester atmospheric carbon in agricultural and forest soils to offset greenhouse gas emissions has generated interest in measuring changes in soil carbon resulting from changes in land management. However, inherent spatial variability of soil carbon limits the precision of measurement of changes in soil carbon and hence, the ability to detect changes. We analyzed variability of soil carbon by intensively sampling sites under different land management as a step toward developing efficient soil sampling designs. Sites were tilled crop-land and a mixed deciduous forest in Tennessee, and old-growth and second-growth coniferous forest in western Washington, USA. Six soil cores within each of three microplots were taken as an initial sample and an additional six cores were taken to simulate resampling. Soil C variability was greater in Washington than in Tennessee, and greater in less disturbed than in more disturbed sites. Using this protocol, our data suggest that differences on the order of 2.0 Mg C ha(-1) could be detected by collection and analysis of cores from at least five (tilled) or two (forest) microplots in Tennessee. More spatial variability in the forested sites in Washington increased the minimum detectable difference, but these systems, consisting of low C content sandy soil with irregularly distributed pockets of organic C in buried logs, are likely to rank among the most spatially heterogeneous of systems. Our results clearly indicate that consistent intramicroplot differences at all sites will enable detection of much more modest changes if the same microplots are resampled.
Resumo:
Corneal-height data are typically measured with videokeratoscopes and modeled using a set of orthogonal Zernike polynomials. We address the estimation of the number of Zernike polynomials, which is formalized as a model-order selection problem in linear regression. Classical information-theoretic criteria tend to overestimate the corneal surface due to the weakness of their penalty functions, while bootstrap-based techniques tend to underestimate the surface or require extensive processing. In this paper, we propose to use the efficient detection criterion (EDC), which has the same general form of information-theoretic-based criteria, as an alternative to estimating the optimal number of Zernike polynomials. We first show, via simulations, that the EDC outperforms a large number of information-theoretic criteria and resampling-based techniques. We then illustrate that using the EDC for real corneas results in models that are in closer agreement with clinical expectations and provides means for distinguishing normal corneal surfaces from astigmatic and keratoconic surfaces.
Resumo:
This paper describes a novel probabilistic approach to incorporating odometric information into appearance-based SLAM systems, without performing metric map construction or calculating relative feature geometry. The proposed system, dubbed Continuous Appearance-based Trajectory SLAM (CAT-SLAM), represents location as a probability distribution along a trajectory, and represents appearance continuously over the trajectory rather than at discrete locations. The distribution is evaluated using a Rao-Blackwellised particle filter, which weights particles based on local appearance and odometric similarity and explicitly models both the likelihood of revisiting previous locations and visiting new locations. A modified resampling scheme counters particle deprivation and allows loop closure updates to be performed in constant time regardless of map size. We compare the performance of CAT-SLAM to FAB-MAP (an appearance-only SLAM algorithm) in an outdoor environment, demonstrating a threefold increase in the number of correct loop closures detected by CAT-SLAM.
Resumo:
This paper describes a new system, dubbed Continuous Appearance-based Trajectory Simultaneous Localisation and Mapping (CAT-SLAM), which augments sequential appearance-based place recognition with local metric pose filtering to improve the frequency and reliability of appearance-based loop closure. As in other approaches to appearance-based mapping, loop closure is performed without calculating global feature geometry or performing 3D map construction. Loop-closure filtering uses a probabilistic distribution of possible loop closures along the robot’s previous trajectory, which is represented by a linked list of previously visited locations linked by odometric information. Sequential appearance-based place recognition and local metric pose filtering are evaluated simultaneously using a Rao–Blackwellised particle filter, which weights particles based on appearance matching over sequential frames and the similarity of robot motion along the trajectory. The particle filter explicitly models both the likelihood of revisiting previous locations and exploring new locations. A modified resampling scheme counters particle deprivation and allows loop-closure updates to be performed in constant time for a given environment. We compare the performance of CAT-SLAM with FAB-MAP (a state-of-the-art appearance-only SLAM algorithm) using multiple real-world datasets, demonstrating an increase in the number of correct loop closures detected by CAT-SLAM.
Resumo:
Bactrocera dorsalis sensu stricto, B. papayae, B. philippinensis and B. carambolae are serious pest fruit fly species of the B. dorsalis complex that predominantly occur in south-east Asia and the Pacific. Identifying molecular diagnostics has proven problematic for these four taxa, a situation that cofounds biosecurity and quarantine efforts and which may be the result of at least some of these taxa representing the same biological species. We therefore conducted a phylogenetic study of these four species (and closely related outgroup taxa) based on the individuals collected from a wide geographic range; sequencing six loci (cox1, nad4-3′, CAD, period, ITS1, ITS2) for approximately 20 individuals from each of 16 sample sites. Data were analysed within maximum likelihood and Bayesian phylogenetic frameworks for individual loci and concatenated data sets for which we applied multiple monophyly and species delimitation tests. Species monophyly was measured by clade support, posterior probability or bootstrap resampling for Bayesian and likelihood analyses respectively, Rosenberg's reciprocal monophyly measure, P(AB), Rodrigo's (P(RD)) and the genealogical sorting index, gsi. We specifically tested whether there was phylogenetic support for the four 'ingroup' pest species using a data set of multiple individuals sampled from a number of populations. Based on our combined data set, Bactrocera carambolae emerges as a distinct monophyletic clade, whereas B. dorsalis s.s., B. papayae and B. philippinensis are unresolved. These data add to the growing body of evidence that B. dorsalis s.s., B. papayae and B. philippinensis are the same biological species, which poses consequences for quarantine, trade and pest management.
Resumo:
We developed an analysis pipeline enabling population studies of HARDI data, and applied it to map genetic influences on fiber architecture in 90 twin subjects. We applied tensor-driven 3D fluid registration to HARDI, resampling the spherical fiber orientation distribution functions (ODFs) in appropriate Riemannian manifolds, after ODF regularization and sharpening. Fitting structural equation models (SEM) from quantitative genetics, we evaluated genetic influences on the Jensen-Shannon divergence (JSD), a novel measure of fiber spatial coherence, and on the generalized fiber anisotropy (GFA) a measure of fiber integrity. With random-effects regression, we mapped regions where diffusion profiles were highly correlated with subjects' intelligence quotient (IQ). Fiber complexity was predominantly under genetic control, and higher in more highly anisotropic regions; the proportion of genetic versus environmental control varied spatially. Our methods show promise for discovering genes affecting fiber connectivity in the brain.
Resumo:
We consider the development of statistical models for prediction of constituent concentration of riverine pollutants, which is a key step in load estimation from frequent flow rate data and less frequently collected concentration data. We consider how to capture the impacts of past flow patterns via the average discounted flow (ADF) which discounts the past flux based on the time lapsed - more recent fluxes are given more weight. However, the effectiveness of ADF depends critically on the choice of the discount factor which reflects the unknown environmental cumulating process of the concentration compounds. We propose to choose the discount factor by maximizing the adjusted R-2 values or the Nash-Sutcliffe model efficiency coefficient. The R2 values are also adjusted to take account of the number of parameters in the model fit. The resulting optimal discount factor can be interpreted as a measure of constituent exhaustion rate during flood events. To evaluate the performance of the proposed regression estimators, we examine two different sampling scenarios by resampling fortnightly and opportunistically from two real daily datasets, which come from two United States Geological Survey (USGS) gaging stations located in Des Plaines River and Illinois River basin. The generalized rating-curve approach produces biased estimates of the total sediment loads by -30% to 83%, whereas the new approaches produce relatively much lower biases, ranging from -24% to 35%. This substantial improvement in the estimates of the total load is due to the fact that predictability of concentration is greatly improved by the additional predictors.
Resumo:
In analysis of longitudinal data, the variance matrix of the parameter estimates is usually estimated by the 'sandwich' method, in which the variance for each subject is estimated by its residual products. We propose smooth bootstrap methods by perturbing the estimating functions to obtain 'bootstrapped' realizations of the parameter estimates for statistical inference. Our extensive simulation studies indicate that the variance estimators by our proposed methods can not only correct the bias of the sandwich estimator but also improve the confidence interval coverage. We applied the proposed method to a data set from a clinical trial of antibiotics for leprosy.
Resumo:
Movement of tephritid flies underpins their survival, reproduction, and ability to establish in new areas and is thus of importance when designing effective management strategies. Much of the knowledge currently available on tephritid movement throughout landscapes comes from the use of direct or indirect methods that rely on the trapping of individuals. Here, we review published experimental designs and methods from mark-release-recapture (MRR) studies, as well as other methods, that have been used to estimate movement of the four major tephritid pest genera (Bactrocera, Ceratitis, Anastrepha, and Rhagoletis). In doing so, we aim to illustrate the theoretical and practical considerations needed to study tephritid movement. MRR studies make use of traps to directly estimate the distance that tephritid species can move within a generation and to evaluate the ecological and physiological factors that influence dispersal patterns. MRR studies, however, require careful planning to ensure that the results obtained are not biased by the methods employed, including marking methods, trap properties, trap spacing, and spatial extent of the trapping array. Despite these obstacles, MRR remains a powerful tool for determining tephritid movement, with data particularly required for understudied species that affect developing countries. To ensure that future MRR studies are successful, we suggest that site selection be carefully considered and sufficient resources be allocated to achieve optimal spacing and placement of traps in line with the stated aims of each study. An alternative to MRR is to make use of indirect methods for determining movement, or more correctly, gene flow, which have become widely available with the development of molecular tools. Key to these methods is the trapping and sequencing of a suitable number of individuals to represent the genetic diversity of the sampled population and investigate population structuring using nuclear genomic markers or non-recombinant mitochondrial DNA markers. Microsatellites are currently the preferred marker for detecting recent population displacement and provide genetic information that may be used in assignment tests for the direct determination of contemporary movement. Neither MRR nor molecular methods, however, are able to monitor fine-scale movements of individual flies. Recent developments in the miniaturization of electronics offer the tantalising possibility to track individual movements of insects using harmonic radar. Computer vision and radio frequency identification tags may also permit the tracking of fine-scale movements by tephritid flies by automated resampling, although these methods come with the same problems as traditional traps used in MRR studies. Although all methods described in this chapter have limitations, a better understanding of tephritid movement far outweighs the drawbacks of the individual methods because of the need for this information to manage tephritid populations.