4 results for non-parametric background modeling

in eResearch Archive - Queensland Department of Agriculture


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Many statistical forecast systems are available to interested users. In order to be useful for decision-making, these systems must be based on evidence of underlying mechanisms. Once causal connections between the mechanisms and their statistical manifestation have been firmly established, the forecasts must also provide some quantitative evidence of ‘quality’. However, the quality of statistical climate forecast systems (forecast quality) is an ill-defined and frequently misunderstood property. Often, providers and users of such forecast systems are unclear about what ‘quality’ entails and how to measure it, leading to confusion and misinformation. Here we present a generic framework to quantify aspects of forecast quality using an inferential approach to calculate nominal significance levels (p-values) that can be obtained either by directly applying non-parametric statistical tests such as Kruskal-Wallis (KW) or Kolmogorov-Smirnov (KS) or by using Monte-Carlo methods (in the case of forecast skill scores). Once converted to p-values, these forecast quality measures provide a means to objectively evaluate and compare temporal and spatial patterns of forecast quality across datasets and forecast systems. Our analysis demonstrates the importance of providing p-values rather than adopting some arbitrarily chosen significance levels such as p < 0.05 or p < 0.01, which is still common practice. This is illustrated by applying non-parametric tests (such as KW and KS) and skill scoring methods (LEPS and RPSS) to the 5-phase Southern Oscillation Index classification system using historical rainfall data from Australia, the Republic of South Africa and India. The selection of quality measures is solely based on their common use and does not constitute endorsement. We found that non-parametric statistical tests can be adequate proxies for skill measures such as LEPS or RPSS. The framework can be implemented anywhere, regardless of dataset, forecast system or quality measure. Eventually such inferential evidence should be complemented by descriptive statistical methods in order to fully assist in operational risk management.
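The idea of turning a quality measure into a p-value by Monte-Carlo methods can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses the Kruskal-Wallis H statistic (without tie correction) as the quality measure, a simple label-permutation scheme as the Monte-Carlo method, and all function names and data values are hypothetical.

```python
import random


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def monte_carlo_p_value(groups, n_perm=2000, seed=0):
    """Estimate a p-value by reshuffling observations across the groups."""
    rng = random.Random(seed)
    observed = kruskal_wallis_h(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Small tolerance guards against floating-point noise in equal values.
        if kruskal_wallis_h(shuffled) >= observed - 1e-12:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical rainfall totals grouped by forecast class (made-up numbers).
rain_by_class = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat = kruskal_wallis_h(rain_by_class)
p_val = monte_carlo_p_value(rain_by_class)
```

Reporting `p_val` itself, rather than only whether it falls below a fixed cut-off, is the practice the abstract argues for. The same permutation loop could in principle wrap a skill score such as LEPS or RPSS in place of `kruskal_wallis_h`, though those scores are not implemented here.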

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Non-parametric difference tests such as triangle and duo-trio tests are traditionally used to establish differences or similarities between products. However, they only supply the researcher with partial answers, and further testing is often required to establish the nature, size and direction of differences. This paper looks at the advantages of the difference from control (DFC) test (also known as the degree of difference test) and discusses appropriate applications of the test. The scope and principle of the test, panel composition and analysis of results are presented with the aid of suitable examples. Two of the major uses of the DFC test are in quality control and shelf-life testing. The role DFC takes in these areas and the use of other tests to complement the testing are discussed. Controls or standards are important in both these areas, and the use of standard products, mental and written standards and blind controls is highlighted. The DFC test has applications in products where the duo-trio and triangle tests cannot be used because of the normal heterogeneity of the product. While the DFC test is a simple difference test, it can be structured to give the researcher more valuable data and scope to make informed decisions about their product.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Modeling the distributions of species, especially of invasive species in non-native ranges, involves multiple challenges. Here, we developed some novel approaches to species distribution modeling aimed at reducing the influences of such challenges and improving the realism of projections. We estimated species-environment relationships with four modeling methods run with multiple scenarios of (1) sources of occurrences and geographically isolated background ranges for absences, (2) approaches to drawing background (absence) points, and (3) alternate sets of predictor variables. We further tested various quantitative metrics of model evaluation against biological insight. Model projections were very sensitive to the choice of training dataset. Model accuracy was much improved by using a global dataset for model training, rather than restricting data input to the species’ native range. AUC score was a poor metric for model evaluation and, if used alone, was not a useful criterion for assessing model performance. Projections away from the sampled space (i.e. into areas of potential future invasion) were very different depending on the modeling methods used, raising questions about the reliability of ensemble projections. Generalized linear models gave very unrealistic projections far away from the training region. Models that efficiently fit the dominant pattern, but exclude highly local patterns in the dataset and capture interactions as they appear in data (e.g. boosted regression trees), improved generalization of the models. Biological knowledge of the species and its distribution was important in refining choices about the best set of projections. A post-hoc test conducted on a new Parthenium dataset from Nepal validated excellent predictive performance of our “best” model. We showed that vast stretches of currently uninvaded geographic areas on multiple continents harbor highly suitable habitats for Parthenium hysterophorus L. (Asteraceae; parthenium). However, discrepancies between model predictions and parthenium invasion in Australia indicate successful management for this globally significant weed.
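As background to the remark about AUC, the sketch below shows how an AUC value is typically computed from model scores at presence points and at background points, using the rank-based (Mann-Whitney) definition. It is an illustrative assumption rather than the authors' evaluation code, and the function name and all scores are hypothetical. It highlights one commonly noted weakness, which may or may not be the one the authors had in mind: the same presence scores can yield a very different AUC depending on where the background points are drawn.

```python
def auc_presence_vs_background(presence_scores, background_scores):
    """Fraction of presence/background pairs in which the presence point
    scores higher; ties count as half a win (Mann-Whitney definition)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores predicted at known occurrence points.
presence = [0.8, 0.7, 0.6]

# Hypothetical background drawn close to the occurrences (similar habitat).
background_near = [0.75, 0.65, 0.5]

# Hypothetical background drawn from distant, clearly unsuitable areas.
background_far = [0.1, 0.05, 0.2]

auc_near = auc_presence_vs_background(presence, background_near)
auc_far = auc_presence_vs_background(presence, background_far)
```

With these made-up numbers, `auc_near` is 6/9 while `auc_far` is 1.0, even though the model's scores at the presence points are identical in both cases.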
