950 resultados para Probabilistic metrics
Resumo:
Upscaling ecological information to larger scales in space and downscaling remote sensing observations or model simulations to finer scales remain grand challenges in Earth system science. Downscaling often involves inferring subgrid information from coarse-scale data, and such ill-posed problems are classically addressed using regularization. Here, we apply two-dimensional Tikhonov Regularization (2DTR) to simulate subgrid surface patterns for ecological applications. Specifically, we test the ability of 2DTR to simulate the spatial statistics of high-resolution (4 m) remote sensing observations of the normalized difference vegetation index (NDVI) in a tundra landscape. We find that the 2DTR approach as applied here can capture the major mode of spatial variability of the high-resolution information, but not multiple modes of spatial variability, and that the Lagrange multiplier (γ) used to impose the condition of smoothness across space is related to the range of the experimental semivariogram. We used observed and 2DTR-simulated maps of NDVI to estimate landscape-level leaf area index (LAI) and gross primary productivity (GPP). NDVI maps simulated using a γ value that approximates the range of observed NDVI result in a landscape-level GPP estimate that differs by ca 2% from those created using observed NDVI. Following findings that GPP per unit LAI is lower near vegetation patch edges, we simulated vegetation patch edges using multiple approaches and found that simulated GPP declined by up to 12% as a result. 2DTR can generate random landscapes rapidly and can be applied to disaggregate ecological information and compare of spatial observations against simulated landscapes.
Resumo:
The evaluation of forecast performance plays a central role both in the interpretation and use of forecast systems and in their development. Different evaluation measures (scores) are available, often quantifying different characteristics of forecast performance. The properties of several proper scores for probabilistic forecast evaluation are contrasted and then used to interpret decadal probability hindcasts of global mean temperature. The Continuous Ranked Probability Score (CRPS), Proper Linear (PL) score, and IJ Good’s logarithmic score (also referred to as Ignorance) are compared; although information from all three may be useful, the logarithmic score has an immediate interpretation and is not insensitive to forecast busts. Neither CRPS nor PL is local; this is shown to produce counter intuitive evaluations by CRPS. Benchmark forecasts from empirical models like Dynamic Climatology place the scores in context. Comparing scores for forecast systems based on physical models (in this case HadCM3, from the CMIP5 decadal archive) against such benchmarks is more informative than internal comparison systems based on similar physical simulation models with each other. It is shown that a forecast system based on HadCM3 out performs Dynamic Climatology in decadal global mean temperature hindcasts; Dynamic Climatology previously outperformed a forecast system based upon HadGEM2 and reasons for these results are suggested. Forecasts of aggregate data (5-year means of global mean temperature) are, of course, narrower than forecasts of annual averages due to the suppression of variance; while the average “distance” between the forecasts and a target may be expected to decrease, little if any discernible improvement in probabilistic skill is achieved.
Resumo:
Using lessons from idealised predictability experiments, we discuss some issues and perspectives on the design of operational seasonal to inter-annual Arctic sea-ice prediction systems. We first review the opportunities to use a hierarchy of different types of experiment to learn about the predictability of Arctic climate. We also examine key issues for ensemble system design, such as: measuring skill, the role of ensemble size and generation of ensemble members. When assessing the potential skill of a set of prediction experiments, using more than one metric is essential as different choices can significantly alter conclusions about the presence or lack of skill. We find that increasing both the number of hindcasts and ensemble size is important for reliably assessing the correlation and expected error in forecasts. For other metrics, such as dispersion, increasing ensemble size is most important. Probabilistic measures of skill can also provide useful information about the reliability of forecasts. In addition, various methods for generating the different ensemble members are tested. The range of techniques can produce surprisingly different ensemble spread characteristics. The lessons learnt should help inform the design of future operational prediction systems.
Resumo:
Recent advances in understanding have made it possible to relate global precipitation changes directly to emissions of particular gases and aerosols that influence climate. Using these advances, new indices are developed here called the Global Precipitation-change Potential for pulse (GPP_P) and sustained (GPP_S) emissions, which measure the precipitation change per unit mass of emissions. The GPP can be used as a metric to compare the effects of different emissions. This is akin to the global warming potential (GWP) and the global temperature-change potential (GTP) which are used to place emissions on a common scale. Hence the GPP provides an additional perspective of the relative or absolute effects of emissions. It is however recognised that precipitation changes are predicted to be highly variable in size and sign between different regions and this limits the usefulness of a purely global metric. The GPP_P and GPP_S formulation consists of two terms, one dependent on the surface temperature change and the other dependent on the atmospheric component of the radiative forcing. For some forcing agents, and notably for CO2, these two terms oppose each other – as the forcing and temperature perturbations have different timescales, even the sign of the absolute GPP_P and GPP_S varies with time, and the opposing terms can make values sensitive to uncertainties in input parameters. This makes the choice of CO2 as a reference gas problematic, especially for the GPP_S at time horizons less than about 60 years. In addition, few studies have presented results for the surface/atmosphere partitioning of different forcings, leading to more uncertainty in quantifying the GPP than the GWP or GTP. Values of the GPP_P and GPP_S for five long- and short-lived forcing agents (CO2, CH4, N2O, sulphate and black carbon – BC) are presented, using illustrative values of required parameters. The resulting precipitation changes are given as the change at a specific time horizon (and hence they are end-point metrics) but it is noted that the GPPS can also be interpreted as the time-integrated effect of a pulse emission. Using CO2 as a references gas, the GPP_P and GPP_S for the non-CO2 species are larger than the corresponding GTP values. For BC emissions, the atmospheric forcing is sufficiently strong that the GPP_S is opposite in sign to the GTP_S. The sensitivity of these values to a number of input parameters is explored. The GPP can also be used to evaluate the contribution of different emissions to precipitation change during or after a period of emissions. As an illustration, the precipitation changes resulting from emissions in 2008 (using the GPP_P) and emissions sustained at 2008 levels (using the GPP_S) are presented. These indicate that for periods of 20 years (after the 2008 emissions) and 50 years (for sustained emissions at 2008 levels) methane is the dominant driver of positive precipitation changes due to those emissions. For sustained emissions, the sum of the effect of the five species included here does not become positive until after 50 years, by which time the global surface temperature increase exceeds 1 K.
Resumo:
An ability to quantify the reliability of probabilistic flood inundation predictions is a requirement not only for guiding model development but also for their successful application. Probabilistic flood inundation predictions are usually produced by choosing a method of weighting the model parameter space, but previous study suggests that this choice leads to clear differences in inundation probabilities. This study aims to address the evaluation of the reliability of these probabilistic predictions. However, a lack of an adequate number of observations of flood inundation for a catchment limits the application of conventional methods of evaluating predictive reliability. Consequently, attempts have been made to assess the reliability of probabilistic predictions using multiple observations from a single flood event. Here, a LISFLOOD-FP hydraulic model of an extreme (>1 in 1000 years) flood event in Cockermouth, UK, is constructed and calibrated using multiple performance measures from both peak flood wrack mark data and aerial photography captured post-peak. These measures are used in weighting the parameter space to produce multiple probabilistic predictions for the event. Two methods of assessing the reliability of these probabilistic predictions using limited observations are utilized; an existing method assessing the binary pattern of flooding, and a method developed in this paper to assess predictions of water surface elevation. This study finds that the water surface elevation method has both a better diagnostic and discriminatory ability, but this result is likely to be sensitive to the unknown uncertainties in the upstream boundary condition
Resumo:
Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform equally or better than non-linear models with or without the frequently used wind-to-power transformation.
Resumo:
The weak-constraint inverse for nonlinear dynamical models is discussed and derived in terms of a probabilistic formulation. The well-known result that for Gaussian error statistics the minimum of the weak-constraint inverse is equal to the maximum-likelihood estimate is rederived. Then several methods based on ensemble statistics that can be used to find the smoother (as opposed to the filter) solution are introduced and compared to traditional methods. A strong point of the new methods is that they avoid the integration of adjoint equations, which is a complex task for real oceanographic or atmospheric applications. they also avoid iterative searches in a Hilbert space, and error estimates can be obtained without much additional computational effort. the feasibility of the new methods is illustrated in a two-layer quasigeostrophic model.
Resumo:
Probabilistic hydro-meteorological forecasts have over the last decades been used more frequently to communicate forecastuncertainty. This uncertainty is twofold, as it constitutes both an added value and a challenge for the forecaster and the user of the forecasts. Many authors have demonstrated the added (economic) value of probabilistic over deterministic forecasts across the water sector (e.g. flood protection, hydroelectric power management and navigation). However, the richness of the information is also a source of challenges for operational uses, due partially to the difficulty to transform the probability of occurrence of an event into a binary decision. This paper presents the results of a risk-based decision-making game on the topic of flood protection mitigation, called “How much are you prepared to pay for a forecast?”. The game was played at several workshops in 2015, which were attended by operational forecasters and academics working in the field of hydrometeorology. The aim of this game was to better understand the role of probabilistic forecasts in decision-making processes and their perceived value by decision-makers. Based on the participants’ willingness-to-pay for a forecast, the results of the game show that the value (or the usefulness) of a forecast depends on several factors, including the way users perceive the quality of their forecasts and link it to the perception of their own performances as decision-makers.
Resumo:
We investigate the critical behaviour of a probabilistic mixture of cellular automata (CA) rules 182 and 200 (in Wolfram`s enumeration scheme) by mean-field analysis and Monte Carlo simulations. We found that as we switch off one CA and switch on the other by the variation of the single parameter of the model, the probabilistic CA (PCA) goes through an extinction-survival-type phase transition, and the numerical data indicate that it belongs to the directed percolation universality class of critical behaviour. The PCA displays a characteristic stationary density profile and a slow, diffusive dynamics close to the pure CA 200 point that we discuss briefly. Remarks on an interesting related stochastic lattice gas are addressed in the conclusions.
Resumo:
Establishing metrics to assess machine translation (MT) systems automatically is now crucial owing to the widespread use of MT over the web. In this study we show that such evaluation can be done by modeling text as complex networks. Specifically, we extend our previous work by employing additional metrics of complex networks, whose results were used as input for machine learning methods and allowed MT texts of distinct qualities to be distinguished. Also shown is that the node-to-node mapping between source and target texts (English-Portuguese and Spanish-Portuguese pairs) can be improved by adding further hierarchical levels for the metrics out-degree, in-degree, hierarchical common degree, cluster coefficient, inter-ring degree, intra-ring degree and convergence ratio. The results presented here amount to a proof-of-principle that the possible capturing of a wider context with the hierarchical levels may be combined with machine learning methods to yield an approach for assessing the quality of MT systems. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
This presentation was offered as part of the CUNY Library Assessment Conference, Reinventing Libraries: Reinventing Assessment, held at the City University of New York in June 2014.