929 results for PREDICTIVE PERFORMANCE


Relevance:

60.00%

Publisher:

Abstract:

Modeling the distributions of species, especially of invasive species in non-native ranges, involves multiple challenges. Here, we developed some novel approaches to species distribution modeling aimed at reducing the influences of such challenges and improving the realism of projections. We estimated species-environment relationships with four modeling methods run with multiple scenarios of (1) sources of occurrences and geographically isolated background ranges for absences, (2) approaches to drawing background (absence) points, and (3) alternate sets of predictor variables. We further tested various quantitative metrics of model evaluation against biological insight. Model projections were very sensitive to the choice of training dataset. Model accuracy was much improved by using a global dataset for model training, rather than restricting data input to the species’ native range. AUC score was a poor metric for model evaluation and, if used alone, was not a useful criterion for assessing model performance. Projections away from the sampled space (i.e. into areas of potential future invasion) were very different depending on the modeling methods used, raising questions about the reliability of ensemble projections. Generalized linear models gave very unrealistic projections far away from the training region. Models that efficiently fit the dominant pattern, but exclude highly local patterns in the dataset and capture interactions as they appear in data (e.g. boosted regression trees), improved generalization of the models. Biological knowledge of the species and its distribution was important in refining choices about the best set of projections. A post-hoc test conducted on a new Parthenium dataset from Nepal validated the excellent predictive performance of our “best” model. We showed that vast stretches of currently uninvaded geographic areas on multiple continents harbor highly suitable habitats for Parthenium hysterophorus L. (Asteraceae; parthenium). 
However, discrepancies between model predictions and parthenium invasion in Australia indicate successful management for this globally significant weed. This article is protected by copyright. All rights reserved.
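The background-point step compared in this study can be sketched as follows. The function `draw_background`, its exclusion-buffer rule, and all numbers below are hypothetical stand-ins for the scenarios the authors compared, not their actual implementation:

```python
import random

def draw_background(presences, bbox, n, min_dist=0.0, seed=0):
    """Draw pseudo-absence (background) points uniformly within bbox,
    excluding a buffer of radius min_dist around known presences.
    A hypothetical sketch of one background-drawing scheme."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bbox
    pts = []
    while len(pts) < n:
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # reject candidates that fall inside any presence buffer
        if all((x - px) ** 2 + (y - py) ** 2 >= min_dist ** 2
               for (px, py) in presences):
            pts.append((x, y))
    return pts

bg = draw_background([(0.5, 0.5)], (0, 0, 1, 1), 100, min_dist=0.1)
print(len(bg))  # 100 background points, none within 0.1 of the presence
```

Varying `bbox` (native range vs. global) and `min_dist` is one way the choice of background range changes the absences a model sees.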

Relevance:

60.00%

Publisher:

Abstract:

This thesis studies binary time series models and their applications in empirical macroeconomics and finance. In addition to previously suggested models, new dynamic extensions are proposed to the static probit model commonly used in the previous literature. In particular, we are interested in probit models with an autoregressive model structure. In Chapter 2, the main objective is to compare the predictive performance of the static and dynamic probit models in forecasting the U.S. and German business cycle recession periods. Financial variables, such as interest rates and stock market returns, are used as predictive variables. The empirical results suggest that the recession periods are predictable and dynamic probit models, especially models with the autoregressive structure, outperform the static model. Chapter 3 proposes a Lagrange Multiplier (LM) test for the usefulness of the autoregressive structure of the probit model. The finite sample properties of the LM test are considered with simulation experiments. Results indicate that the two alternative LM test statistics have reasonable size and power in large samples. In small samples, a parametric bootstrap method is suggested to obtain approximately correct size. In Chapter 4, the predictive power of dynamic probit models in predicting the direction of stock market returns is examined. The novel idea is to use the recession forecast (see Chapter 2) as a predictor of the stock return sign. The evidence suggests that the signs of the U.S. excess stock returns over the risk-free return are predictable both in and out of sample. The new "error correction" probit model yields the best forecasts and it also outperforms other predictive models, such as ARMAX models, in terms of statistical and economic goodness-of-fit measures. Chapter 5 generalizes the analysis of the univariate models considered in Chapters 2–4 to the case of a bivariate model. 
A new bivariate autoregressive probit model is applied to predict the current state of the U.S. business cycle and growth rate cycle periods. Evidence of predictability of both cycle indicators is obtained and the bivariate model is found to outperform the univariate models in terms of predictive power.
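The autoregressive probit recursion at the heart of these dynamic models can be sketched in a few lines. The coefficient values and predictor series below are purely illustrative, not the thesis's estimates:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def autoregressive_probit(x, omega, alpha, beta, pi0=0.0):
    """One common form of a dynamic (autoregressive) probit model:
        pi_t = omega + alpha * pi_{t-1} + beta * x_{t-1},
        P(y_t = 1) = Phi(pi_t).
    The lagged linear index pi_{t-1} carries persistence that the
    static probit model lacks."""
    probs, pi_prev = [], pi0
    for x_lag in x:
        pi_t = omega + alpha * pi_prev + beta * x_lag
        probs.append(norm_cdf(pi_t))
        pi_prev = pi_t
    return probs

# a yield-spread-like predictor series (made-up numbers)
p = autoregressive_probit([1.2, 0.4, -0.3, -1.1], omega=-0.5, alpha=0.8, beta=-0.9)
print([round(v, 3) for v in p])
```

With `alpha = 0`, the recursion collapses to the static probit; the comparison in Chapter 2 is essentially between these two cases.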

Relevance:

60.00%

Publisher:

Abstract:

Reynolds averaged Navier-Stokes model performances in the stagnation and wake regions are explored for turbulent flows with relatively large Lagrangian length scales (generally larger than the scale of geometrical features) approaching small cylinders (both square and circular). The effective cylinder (or wire) diameter based Reynolds number is ReW ≤ 2.5 × 10³. The following turbulence models are considered: a mixing-length; standard Spalart and Allmaras (SA) and streamline curvature (and rotation) corrected SA (SARC); Secundov's νt-92; Secundov et al.'s two equation νt-L; Wolfshtein's k-l model; the Explicit Algebraic Stress Model (EASM) of Abid et al.; the cubic model of Craft et al.; various linear k-ε models including those with wall distance based damping functions; Menter SST, k-ω and Spalding's LVEL model. The use of differential equation distance functions (Poisson and Hamilton-Jacobi equation based) for palliative turbulence modeling purposes is explored. The performance of SA with these distance functions is also considered in the sharp convex geometry region of an airfoil trailing edge. For the cylinder, with ReW ≈ 2.5 × 10³ the mixing length and k-l models give strong turbulence production in the wake region. However, in agreement with eddy viscosity estimates, the LVEL and Secundov νt-92 models show relatively little cylinder influence on turbulence. On the other hand, two equation models (as does the one equation SA) suggest the cylinder gives a strong turbulence deficit in the wake region. Also, for SA, an order of magnitude cylinder diameter decrease from ReW = 2500 to 250 surprisingly strengthens the cylinder's disruptive influence. Importantly, results for ReW ≪ 250 are virtually identical to those for ReW = 250, i.e. no matter how small the cylinder/wire, its influence does not vanish as it should. 
Similar tests for the Launder-Sharma k-ε, Menter SST and k-ω show, in accordance with physical reality, the cylinder's influence diminishing albeit slowly with size. Results suggest distance functions palliate the SA model's erroneous trait and improve its predictive performance in wire wake regions. Also, results suggest that, along the stagnation line, such functions improve the SA, mixing length, k-l and LVEL results. For the airfoil, with SA, the larger Poisson distance function increases the wake region turbulence levels by just under 5%. © 2007 Elsevier Inc. All rights reserved.
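The Poisson-equation distance function used here as a palliative can be illustrated in a 1-D channel, where the Poisson solution is analytic and the standard recovery formula d = -|∇φ| + sqrt(|∇φ|² + 2φ) happens to return the wall distance exactly (the 1-D geometry and numbers are illustrative):

```python
from math import sqrt

def poisson_wall_distance(y, H):
    """Poisson-based wall distance for a 1-D channel of height H.
    Solve d2(phi)/dy2 = -1 with phi = 0 on both walls; analytically
    phi(y) = y*(H - y)/2 and d(phi)/dy = (H - 2y)/2. Distance is then
    recovered as d = -|grad phi| + sqrt(|grad phi|**2 + 2*phi)."""
    phi = y * (H - y) / 2.0
    grad = (H - 2.0 * y) / 2.0
    return -abs(grad) + sqrt(grad * grad + 2.0 * phi)

H = 2.0
for y in (0.1, 0.5, 1.0):
    print(y, poisson_wall_distance(y, H))  # near-wall values match y itself
```

In multi-dimensional geometries the same formula gives a smoothed approximation to the nearest-wall distance, which is what makes it attractive as a drop-in replacement in wall-distance-dependent models such as SA.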

Relevance:

60.00%

Publisher:

Abstract:

This thesis presents a concept for ultra-lightweight deformable mirrors based on a thin substrate of optical surface quality coated with continuous active piezopolymer layers that provide modes of actuation and shape correction. This concept eliminates any kind of stiff backing structure for the mirror surface and exploits micro-fabrication technologies to provide a tight integration of the active materials into the mirror structure, to avoid actuator print-through effects. Proof-of-concept, 10-cm-diameter mirrors with a low areal density of about 0.5 kg/m² have been designed, built and tested to measure their shape-correction performance and verify the models used for design. The low cost manufacturing scheme uses replication techniques, and strives for minimizing residual stresses that deviate the optical figure from the master mandrel. It does not require precision tolerancing, is lightweight, and is therefore potentially scalable to larger diameters for use in large, modular space telescopes. Other potential applications for such a laminate could include ground-based mirrors for solar energy collection, adaptive optics for atmospheric turbulence, laser communications, and other shape control applications.

The immediate application for these mirrors is for the Autonomous Assembly and Reconfiguration of a Space Telescope (AAReST) mission, which is a university mission under development by Caltech, the University of Surrey, and JPL. The design concept, fabrication methodology, material behaviors and measurements, mirror modeling, mounting and control electronics design, shape control experiments, predictive performance analysis, and remaining challenges are presented herein. The experiments have validated numerical models of the mirror, and the mirror models have been used within a model of the telescope in order to predict the optical performance. A demonstration of this mirror concept, along with other new telescope technologies, is planned to take place during the AAReST mission.

Relevance:

60.00%

Publisher:

Abstract:

Reef fish distributions are patchy in time and space, with some coral reef habitats supporting higher densities (i.e., aggregations) of fish than others. Identifying and quantifying fish aggregations (particularly during spawning events) are often top priorities for coastal managers. However, the rapid mapping of these aggregations using conventional survey methods (e.g., non-technical SCUBA diving and remotely operated cameras) is limited by depth, visibility and time. Acoustic sensors (i.e., splitbeam and multibeam echosounders) are not constrained by these same limitations, and were used to concurrently map and quantify the location, density and size of reef fish along with seafloor structure in two separate locations in the U.S. Virgin Islands. Reef fish aggregations were documented along the shelf edge, an ecologically important ecotone in the region. Fish were grouped into three classes according to body size, and relationships with the benthic seascape were modeled in one area using Boosted Regression Trees. These models were validated in a second area to test their predictive performance in locations where fish have not been mapped. Models predicting the density of large fish (≥29 cm) performed well (i.e., AUC = 0.77). Water depth and standard deviation of depth were the most influential predictors at two spatial scales (100 and 300 m). Models of small (≤11 cm) and medium (12–28 cm) fish performed poorly (i.e., AUC = 0.49 to 0.68) due to the high prevalence (45–79%) of smaller fish in both locations, and the unequal prevalence of smaller fish in the training and validation areas. Integrating acoustic sensors with spatial modeling offers a new and reliable approach to rapidly identify fish aggregations and to predict the density of large fish in un-surveyed locations. This integrative approach will help coastal managers to prioritize sites, and focus their limited resources on areas that may be of higher conservation value.
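The AUC values quoted above have a direct probabilistic reading: the chance that a randomly chosen positive (e.g. a cell with large fish) receives a higher predicted score than a randomly chosen negative. A minimal sketch with made-up scores:

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the rank-sum (Mann-Whitney)
    statistic: the fraction of positive/negative pairs in which the
    positive outscores the negative, counting ties as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.1]))  # 8 of 9 pairs correctly ranked
```

Under this reading, the large-fish model's AUC of 0.77 means a 77% chance of ranking a presence above an absence, while 0.49 for small fish is essentially a coin flip.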

Relevance:

60.00%

Publisher:

Abstract:

The use of L1 regularisation for sparse learning has generated immense research interest, with successful application in such diverse areas as signal acquisition, image coding, genomics and collaborative filtering. While existing work highlights the many advantages of L1 methods, in this paper we find that L1 regularisation often dramatically underperforms in terms of predictive performance when compared with other methods for inferring sparsity. We focus on unsupervised latent variable models, and develop L1 minimising factor models, Bayesian variants of "L1", and Bayesian models with a stronger L0-like sparsity induced through spike-and-slab distributions. These spike-and-slab Bayesian factor models encourage sparsity while accounting for uncertainty in a principled manner and avoiding unnecessary shrinkage of non-zero values. We demonstrate on a number of data sets that in practice spike-and-slab Bayesian methods outperform L1 minimisation, even on a computational budget. We thus highlight the need to re-assess the wide use of L1 methods in sparsity-reliant applications, particularly when we care about generalising to previously unseen data, and provide an alternative that, over many varying conditions, provides improved generalisation performance.
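The contrast between L1 shrinkage and spike-and-slab selection can be caricatured with simple thresholding operators. This deliberately simplifies the paper's Bayesian inference to point estimates; `spike_and_slab_mean` below is an illustrative stand-in, not the actual posterior computation:

```python
def soft_threshold(w, lam):
    """L1 (lasso) proximal operator: every coefficient is pulled toward
    zero by lam, so even the large, 'true' coefficients are biased."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

def spike_and_slab_mean(w, lam):
    """Crude caricature of a spike-and-slab estimate: small coefficients
    collapse to the spike (exact zero), large ones keep their value
    unshrunk. Hypothetical rule for illustration only."""
    return w if abs(w) > lam else 0.0

coeffs = [3.0, 0.05, -2.0, 0.01]
print([soft_threshold(w, 0.1) for w in coeffs])       # large values shrunk by 0.1
print([spike_and_slab_mean(w, 0.1) for w in coeffs])  # large values kept intact
```

The "unnecessary shrinkage of non-zero values" criticized in the abstract is visible in the first output: the signal coefficients 3.0 and -2.0 are biased toward zero, which the spike-and-slab rule avoids.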

Relevance:

60.00%

Publisher:

Abstract:

The unscented Kalman filter (UKF) is a widely used method in control and time series applications. The UKF suffers from arbitrary parameters necessary for a step known as sigma point placement, causing it to perform poorly in nonlinear problems. We show how to treat sigma point placement in a UKF as a learning problem in a model based view. We demonstrate that learning to place the sigma points correctly from data can make sigma point collapse much less likely. Learning can result in a significant increase in predictive performance over default settings of the parameters in the UKF and other filters designed to avoid the problems of the UKF, such as the GP-ADF. At the same time, we maintain a lower computational complexity than the other methods. We call our method UKF-L. ©2010 IEEE.
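The sigma-point placement in question can be sketched for a 1-D Gaussian with the conventional default parameters, the very quantities the paper proposes to learn from data rather than fix by hand:

```python
from math import sqrt

def sigma_points_1d(m, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Unscented-transform sigma points for a 1-D Gaussian N(m, P),
    using the conventional (and, as the paper argues, arbitrary)
    parameters alpha, beta, kappa. Returns the points together with
    mean weights wm and covariance weights wc."""
    n = 1
    lam = alpha ** 2 * (n + kappa) - n
    s = sqrt((n + lam) * P)
    points = [m, m + s, m - s]
    wm = [lam / (n + lam), 1.0 / (2 * (n + lam)), 1.0 / (2 * (n + lam))]
    wc = [wm[0] + (1 - alpha ** 2 + beta)] + wm[1:]
    return points, wm, wc

pts, wm, wc = sigma_points_1d(0.0, 4.0)
mean = sum(w * p for w, p in zip(wm, pts))
var = sum(w * (p - mean) ** 2 for w, p in zip(wc, pts))
print(mean, var)  # the transform recovers mean 0 and variance 4 (to float precision)
```

With a tiny `alpha` the symmetric points sit very close to the mean; propagating such tightly bunched points through a strongly nonlinear function is one route to the sigma point collapse that UKF-L is designed to avoid.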

Relevance:

60.00%

Publisher:

Abstract:

The unscented Kalman filter (UKF) is a widely used method in control and time series applications. The UKF suffers from arbitrary parameters necessary for sigma point placement, potentially causing it to perform poorly in nonlinear problems. We show how to treat sigma point placement in a UKF as a learning problem in a model based view. We demonstrate that learning to place the sigma points correctly from data can make sigma point collapse much less likely. Learning can result in a significant increase in predictive performance over default settings of the parameters in the UKF and other filters designed to avoid the problems of the UKF, such as the GP-ADF. At the same time, we maintain a lower computational complexity than the other methods. We call our method UKF-L. © 2011 Elsevier B.V.

Relevance:

60.00%

Publisher:

Abstract:

Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently available models can be classified by whether clusters are disjoint or are allowed to overlap. These models can explain a "flat" clustering structure. Hierarchical Bayesian models provide a natural approach to capture more complex dependencies. We propose a model in which objects are characterised by a latent feature vector. Each feature is itself partitioned into disjoint groups (subclusters), corresponding to a second layer of hierarchy. In experimental comparisons, the model achieves significantly improved predictive performance on social and biological link prediction tasks. The results indicate that models with a single layer hierarchy over-simplify real networks.
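The simplest model described above, in which the link probability depends only on cluster assignments, reduces to a table lookup; a toy sketch with made-up probabilities:

```python
def link_probability(z_i, z_j, B):
    """Stochastic-block-model style link probability: nodes are assigned
    to disjoint clusters, and P(link i-j) depends only on the cluster
    pair. B is a symmetric matrix of between-cluster link probabilities
    (illustrative numbers)."""
    return B[z_i][z_j]

B = [[0.9, 0.1],
     [0.1, 0.8]]          # dense within clusters, sparse between
assignments = {"a": 0, "b": 0, "c": 1}
print(link_probability(assignments["a"], assignments["b"], B))  # same cluster
print(link_probability(assignments["a"], assignments["c"], B))  # different clusters
```

The hierarchical model in the paper enriches this flat lookup: each node carries a latent feature vector whose features are themselves subclustered, so two nodes in the same top-level group can still have different link propensities.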

Relevance:

60.00%

Publisher:

Abstract:

Choosing appropriate architectures and regularization strategies of deep networks is crucial to good predictive performance. To shed light on this problem, we analyze the analogous problem of constructing useful priors on compositions of functions. Specifically, we study the deep Gaussian process, a type of infinitely-wide, deep neural network. We show that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases, retaining only a single degree of freedom in the limit. We propose an alternate network architecture which does not suffer from this pathology. We also examine deep covariance functions, obtained by composing infinitely many feature transforms. Lastly, we characterize the class of models obtained by performing dropout on Gaussian processes.

Relevance:

60.00%

Publisher:

Abstract:

Much sensory-motor behavior develops through imitation, as during the learning of handwriting by children. Such complex sequential acts are broken down into distinct motor control synergies, or muscle groups, whose activities overlap in time to generate continuous, curved movements that obey an inverse relation between curvature and speed. The Adaptive Vector Integration to Endpoint (AVITEWRITE) model of Grossberg and Paine (2000) proposed how such complex movements may be learned through attentive imitation. The model suggests how frontal, parietal, and motor cortical mechanisms, such as difference vector encoding, under volitional control from the basal ganglia, interact with adaptively-timed, predictive cerebellar learning during movement imitation and predictive performance. Key psychophysical and neural data about learning to make curved movements were simulated, including a decrease in writing time as learning progresses; generation of unimodal, bell-shaped velocity profiles for each movement synergy; size scaling with isochrony, and speed scaling with preservation of the letter shape and the shapes of the velocity profiles; an inverse relation between curvature and tangential velocity; and a Two-Thirds Power Law relation between angular velocity and curvature. However, the model learned from letter trajectories of only one subject, and only qualitative kinematic comparisons were made with previously published human data. The present work describes a quantitative test of AVITEWRITE through direct comparison of a corpus of human handwriting data with the model's performance when it learns by tracing human trajectories. The results show that model performance was variable across subjects, with an average correlation between the model and human data of 89 ± 10%. 
The present data from simulations using the AVITEWRITE model highlight some of its strengths while focusing attention on areas, such as novel shape learning in children, where all models of handwriting and learning of other complex sensory-motor skills would benefit from further research.
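The Two-Thirds Power Law mentioned above can be checked numerically: if tangential speed follows v = K·C^(-1/3) (equivalently, angular velocity A = v·C = K·C^(2/3)), then speed against curvature is a straight line of slope -1/3 in log-log coordinates. The curvature values and gain K below are arbitrary illustrations:

```python
from math import log

def two_thirds_speed(curvature, K=1.0):
    """Two-Thirds Power Law of handwriting kinematics: tangential speed
    v = K * C**(-1/3), so angular velocity A = v * C = K * C**(2/3).
    K is an illustrative gain constant."""
    return [K * c ** (-1.0 / 3.0) for c in curvature]

curv = [0.5, 1.0, 2.0, 4.0]
speed = two_thirds_speed(curv)
# the law predicts a log-log slope of -1/3 between speed and curvature
slopes = [(log(speed[i + 1]) - log(speed[i])) / (log(curv[i + 1]) - log(curv[i]))
          for i in range(3)]
print([round(s, 3) for s in slopes])
```

Fitting this slope to recorded pen trajectories is one of the kinematic checks used when comparing model output against human handwriting data.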

Relevance:

60.00%

Publisher:

Abstract:

OBJECTIVES: To compare the predictive performance and potential clinical usefulness of risk calculators of the European Randomized Study of Screening for Prostate Cancer (ERSPC RC) with and without information on prostate volume. METHODS: We studied 6 cohorts (5 European and 1 US) with a total of 15,300 men, all biopsied and with pre-biopsy TRUS measurements of prostate volume. Volume was categorized into 3 categories (25, 40, and 60 cc), to reflect use of digital rectal examination (DRE) for volume assessment. Risks of prostate cancer were calculated according to an ERSPC DRE-based RC (including PSA, DRE, prior biopsy, and prostate volume) and a PSA + DRE model (including PSA, DRE, and prior biopsy). Missing data on prostate volume were completed by single imputation. Risk predictions were evaluated with respect to calibration (graphically), discrimination (area under the ROC curve, AUC), and clinical usefulness (net benefit, graphically assessed in decision curves). RESULTS: The AUCs of the ERSPC DRE-based RC ranged from 0.61 to 0.77 and were substantially larger than the AUCs of a model based on only PSA + DRE (ranging from 0.56 to 0.72) in each of the 6 cohorts. The ERSPC DRE-based RC provided net benefit over performing a prostate biopsy on the basis of PSA and DRE outcome in five of the six cohorts. CONCLUSIONS: Identifying men at increased risk for having a biopsy detectable prostate cancer should consider multiple factors, including an estimate of prostate volume.
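The general shape of such a DRE-based risk calculator is a logistic model on the listed predictors. The sketch below uses entirely made-up coefficients for illustration; they are NOT the published ERSPC RC values:

```python
from math import exp, log

def prostate_cancer_risk(psa, dre_abnormal, prior_biopsy, volume_cc):
    """Illustrative shape of a DRE-based risk calculator: logistic
    regression on log(PSA), DRE outcome, prior-biopsy status, and a
    3-category prostate volume (25/40/60 cc). ALL coefficients are
    hypothetical placeholders, not fitted estimates."""
    vol_effect = {25: 0.0, 40: -0.6, 60: -1.2}[volume_cc]
    lp = (-1.5 + 1.1 * log(psa) + 0.9 * dre_abnormal
          - 0.7 * prior_biopsy + vol_effect)
    return 1.0 / (1.0 + exp(-lp))

# the qualitative point: at the same PSA, a larger prostate implies lower risk,
# which is why adding volume improves discrimination over PSA + DRE alone
print(round(prostate_cancer_risk(6.0, 1, 0, 25), 3))
print(round(prostate_cancer_risk(6.0, 1, 0, 60), 3))
```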

Relevance:

60.00%

Publisher:

Abstract:

DNaseI footprinting is an established assay for identifying transcription factor (TF)-DNA interactions with single base pair resolution. High-throughput DNase-seq assays have recently been used to detect in vivo DNase footprints across the genome. Multiple computational approaches have been developed to identify DNase-seq footprints as predictors of TF binding. However, recent studies have pointed to a substantial cleavage bias of DNase and its negative impact on predictive performance of footprinting. To assess the potential for using DNase-seq to identify individual binding sites, we performed DNase-seq on deproteinized genomic DNA and determined sequence cleavage bias. This allowed us to build bias corrected and TF-specific footprint models. The predictive performance of these models demonstrated that predicted footprints corresponded to high-confidence TF-DNA interactions. DNase-seq footprints were absent under a fraction of ChIP-seq peaks, which we show to be indicative of weaker binding, indirect TF-DNA interactions or possible ChIP artifacts. The modeling approach was also able to detect variation in the consensus motifs that TFs bind to. Finally, cell type specific footprints were detected within DNase hypersensitive sites that are present in multiple cell types, further supporting that footprints can identify changes in TF binding that are not detectable using other strategies.
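The cleavage-bias measurement can be sketched as an observed/expected k-mer ratio between cut sites and deproteinized background DNA. The toy 3-mers and counts below are fabricated for brevity (the study worked with genome-scale DNase-seq data):

```python
from collections import Counter

def cleavage_bias(cut_site_kmers, background_kmers):
    """Sequence cleavage bias as an observed/expected ratio: how often
    each k-mer appears at DNase cut sites versus in (deproteinized)
    background DNA. Ratios far from 1 flag intrinsic DNase sequence
    preference that a bias-corrected footprint model must divide out."""
    obs = Counter(cut_site_kmers)
    bkg = Counter(background_kmers)
    n_obs, n_bkg = len(cut_site_kmers), len(background_kmers)
    return {k: (obs[k] / n_obs) / (bkg[k] / n_bkg)
            for k in obs if k in bkg}

cuts = ["TAC", "TAC", "TAC", "GGA"]
background = ["TAC", "GGA", "GGA", "GGA"]
print(cleavage_bias(cuts, background))  # TAC over-cut, GGA under-cut
```

Dividing observed cut counts by such per-k-mer expectations is the essence of the bias correction that made the footprint models correspond to high-confidence TF-DNA interactions.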

Relevance:

60.00%

Publisher:

Abstract:

In a Bayesian learning setting, the posterior distribution of a predictive model arises from a trade-off between its prior distribution and the conditional likelihood of observed data. Such distribution functions usually rely on additional hyperparameters which need to be tuned in order to achieve optimum predictive performance; this operation can be efficiently performed in an Empirical Bayes fashion by maximizing the posterior marginal likelihood of the observed data. Since the score function of this optimization problem is in general characterized by the presence of local optima, it is necessary to resort to global optimization strategies, which require a large number of function evaluations. Given that the evaluation is usually computationally intensive and badly scaled with respect to the dataset size, the maximum number of observations that can be treated simultaneously is quite limited. In this paper, we consider the case of hyperparameter tuning in Gaussian process regression. A straightforward implementation of the posterior log-likelihood for this model requires O(N^3) operations for every iteration of the optimization procedure, where N is the number of examples in the input dataset. We derive a novel set of identities that allow, after an initial overhead of O(N^3), the evaluation of the score function, as well as the Jacobian and Hessian matrices, in O(N) operations. We prove how the proposed identities, which follow from the eigendecomposition of the kernel matrix, yield a reduction of several orders of magnitude in the computation time for the hyperparameter optimization problem. Notably, the proposed solution provides computational advantages even with respect to state of the art approximations that rely on sparse kernel matrices.
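The flavor of such eigendecomposition identities can be shown in the simplest case, tuning only the noise variance: one O(N³) factorization up front, then O(N) per likelihood evaluation. This sketch covers only the noise hyperparameter, not the full score/Jacobian/Hessian machinery of the paper:

```python
import numpy as np

def gp_loglik_fast_factory(K, y):
    """Precompute the eigendecomposition K = Q diag(lam) Q^T once
    (O(N^3)); afterwards the GP log marginal likelihood for K + s2*I
    costs O(N) per evaluation, since
        log|K + s2 I|          = sum(log(lam + s2)),
        y^T (K + s2 I)^{-1} y  = sum(a**2 / (lam + s2)),  a = Q^T y."""
    lam, Q = np.linalg.eigh(K)
    a = Q.T @ y
    n = len(y)

    def loglik(sigma2):
        return -0.5 * (np.sum(a ** 2 / (lam + sigma2))
                       + np.sum(np.log(lam + sigma2))
                       + n * np.log(2 * np.pi))
    return loglik

# check against the direct O(N^3)-per-call evaluation
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
K = np.exp(-0.5 * (X - X.T) ** 2)   # squared-exponential kernel matrix
y = rng.normal(size=50)
fast = gp_loglik_fast_factory(K, y)
s2 = 0.1
C = K + s2 * np.eye(50)
direct = -0.5 * (y @ np.linalg.solve(C, y)
                 + np.linalg.slogdet(C)[1]
                 + 50 * np.log(2 * np.pi))
print(np.isclose(fast(s2), direct))
```

An optimizer can now call `fast` thousands of times at the cost of vector operations, which is the kind of speedup the paper generalizes to all hyperparameters and their derivatives.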

Relevance:

60.00%

Publisher:

Abstract:

A basic intuition is that arbitrage is easier when markets are most liquid. Surprisingly, we find that momentum profits are markedly larger in liquid market states. This finding is not explained by variation in liquidity risk, time-varying exposure to risk factors, or changes in macroeconomic conditions, cross-sectional return dispersion, and investor sentiment. The predictive performance of aggregate market illiquidity for momentum profits uniformly exceeds that of market return and market volatility states. While momentum strategies were unconditionally unprofitable in the US, Japan, and Eurozone countries in the last decade, momentum profits are substantial following liquid market states.