502 results for BENCHMARKS
Abstract:
We present a description of the theoretical framework and "best practice" for using the paleo-climate model component of the Coupled Model Intercomparison Project (Phase 5) (CMIP5) to constrain future projections of climate using the same models. The constraints arise from measures of skill in hindcasting paleo-climate changes from the present over three periods: the Last Glacial Maximum (LGM) (21 thousand years before present, ka), the mid-Holocene (MH) (6 ka) and the Last Millennium (LM) (850–1850 CE). The skill measures may be used to validate robust patterns of climate change across scenarios or to distinguish between models that have differing outcomes in future scenarios. We find that the multi-model ensemble of paleo-simulations is adequate for addressing at least some of these issues. For example, selected benchmarks for the LGM and MH are correlated to the rank of future projections of precipitation/temperature or sea ice extent to indicate that models that produce the best agreement with paleoclimate information give demonstrably different future results than the rest of the models. We also find that some comparisons, for instance those associated with model variability, are strongly dependent on uncertain forcing time series, or show time-dependent behaviour, making direct inferences for the future problematic. Overall, we demonstrate that there is a strong potential for the paleo-climate simulations to help inform the future projections and urge all the modeling groups to complete this subset of the CMIP5 runs.
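To make the kind of test described above concrete, the following minimal sketch rank-correlates a palaeo hindcast skill score with a projected quantity across an ensemble of models; all names and numbers are invented placeholders, not CMIP5 results.

    # Sketch: rank-correlate a palaeo hindcast skill score with a future
    # projection across an ensemble. All values are invented placeholders.
    from scipy.stats import spearmanr

    lgm_skill = [0.71, 0.55, 0.62, 0.48, 0.80]   # e.g. pattern correlation vs. proxy data
    d_sea_ice = [-2.1, -3.4, -2.8, -3.9, -1.7]   # projected sea-ice change (arbitrary units)

    rho, p = spearmanr(lgm_skill, d_sea_ice)
    print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")
    # A strong |rho| would suggest the palaeo benchmark discriminates
    # between models with different future behaviour.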
Abstract:
The UK Government's Department of Energy and Climate Change has been investigating the feasibility of developing a national energy efficiency data framework covering both domestic and non-domestic buildings. Working closely with the Energy Saving Trust and energy suppliers, the Department aims to develop a data framework to monitor changes in energy efficiency, develop and evaluate programmes and improve the information available to consumers. Key applications of the framework are to understand trends in built-stock energy use, identify drivers and evaluate the success of different policies. For energy suppliers, it could identify which energy uses are growing, in which sectors and why. This would help with market segmentation and the design of products. For building professionals, it could supplement energy audits and modelling of end-use consumption with real data and support the generation of accurate and comprehensive benchmarks. This paper critically examines the results of the first phase of work to construct a national energy efficiency data framework for the domestic sector, focusing on two specific issues: (a) drivers of domestic energy consumption in terms of the physical nature of the dwellings and the socio-economic characteristics of occupants, and (b) the impact of energy efficiency measures on energy consumption.
Abstract:
Four CO2 concentration inversions and the Global Fire Emissions Database (GFED) versions 2.1 and 3 are used to provide benchmarks for climate-driven modeling of the global land-atmosphere CO2 flux and the contribution of wildfire to this flux. The Land surface Processes and eXchanges (LPX) model is introduced. LPX is based on the Lund-Potsdam-Jena Spread and Intensity of FIRE (LPJ-SPITFIRE) model with amended fire probability calculations. LPX omits human ignition sources yet simulates many aspects of global fire adequately. It captures the major features of the observed geographic patterns in burnt area, its seasonal timing, and the unimodal relationship of burnt area to precipitation. It simulates features of the geographic variation in the sign of the interannual correlations of burnt area with antecedent dryness and precipitation. It simulates well the interannual variability of the global total land-atmosphere CO2 flux. There are differences among the global burnt area time series from GFED2.1, GFED3 and LPX, but some features are common to all. GFED3 fire CO2 fluxes account for only about one-third of the variation in total CO2 flux during 1997–2005. This relationship appears to be dominated by the strong climatic dependence of deforestation fires. The relationship of LPX-modeled fire CO2 fluxes to total CO2 fluxes is weak. Observed and modeled total CO2 fluxes track the El Niño–Southern Oscillation (ENSO) closely; GFED3 burnt area and global fire CO2 flux track ENSO much less closely. The GFED3 fire CO2 flux–ENSO connection is most prominent for the El Niño of 1997–1998, which produced exceptional burning conditions in several regions, especially equatorial Asia. The sign of the observed relationship between ENSO and fire varies regionally, and LPX captures the broad features of this variation. These complexities underscore the need for process-based modeling to assess the consequences of global change for fire and its implications for the carbon cycle.
Abstract:
The complexity of current and emerging high performance architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven performance modelling approach is outlined that is appropriate for modern multicore architectures. The approach is demonstrated by constructing a model of a simple shallow water code on a Cray XE6 system, from application-specific benchmarks that illustrate precisely how architectural characteristics impact performance. The model is found to recreate observed scaling behaviour up to 16K cores, and is used to predict optimal rank-core affinity strategies, exemplifying the type of problem such a model can be used for.
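The two ingredients named above (per-core computation and communication, each measured by application-specific benchmarks) suggest a runtime model of roughly the following shape. This is a hedged sketch with made-up coefficients, not the paper's fitted model.

    # Sketch of a benchmark-driven runtime model for a 2-D grid code:
    # total time = array-update time on the local subdomain + halo-exchange
    # time for its perimeter. Coefficients would come from machine-specific
    # micro-benchmarks; the values below are placeholders.
    def predict_time(nx, ny, px, py,
                     t_cell=2.0e-8,      # seconds per grid-cell update (benchmark-fitted)
                     t_lat=1.5e-6,       # message latency (s)
                     t_byte=2.5e-10,     # per-byte transfer cost (s)
                     bytes_per_cell=8):
        """Predict per-timestep runtime for an nx*ny grid on a px*py rank grid."""
        local_cells = (nx // px) * (ny // py)
        compute = local_cells * t_cell
        # Four halo exchanges: two of width nx//px, two of height ny//py.
        halo_cells = 2 * (nx // px) + 2 * (ny // py)
        comms = 4 * t_lat + halo_cells * bytes_per_cell * t_byte
        return compute + comms

    # Compare two decompositions of the same 4096x4096 problem on 256 ranks.
    for px, py in [(16, 16), (64, 4)]:
        print(px, py, f"{predict_time(4096, 4096, px, py):.3e} s")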
Abstract:
We pursue the first large-scale investigation of a strongly growing mutual fund type: Islamic funds. Based on an unexplored, survivorship bias-adjusted data set, we analyse the financial performance and investment style of 265 Islamic equity funds from 20 countries. As Islamic funds often have diverse investment regions, we develop a (conditional) three-level Carhart model to simultaneously control for exposure to different national, regional and global equity markets and investment styles. Consistent with recent evidence for conventional funds, we find that Islamic funds display superior learning in more developed Islamic financial markets. While Islamic funds from these markets are competitive with international equity benchmarks, funds from Western nations with fewer Islamic assets in particular tend to underperform significantly. Islamic funds' investment style is somewhat tilted towards growth stocks. Funds from predominantly Muslim economies also show a clear small-cap preference. These results are consistent over time and robust to time-varying market exposures and capital market restrictions.
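For readers unfamiliar with the factor-model machinery: a single-level Carhart four-factor regression, which the paper's three-level model nests across national, regional and global factor sets, looks like this in outline. The factor data below are random placeholders, not real returns.

    # Sketch: Carhart four-factor regression for one fund. The paper's
    # three-level model stacks national, regional and global versions of
    # these factors in one regression; only one level is shown here, with
    # synthetic data standing in for real factor returns.
    import numpy as np

    rng = np.random.default_rng(0)
    T = 120                                  # months
    factors = rng.normal(0, 0.04, (T, 4))    # MKT-RF, SMB, HML, MOM (placeholders)
    excess_ret = (0.002 + factors @ np.array([0.9, 0.2, -0.1, 0.05])
                  + rng.normal(0, 0.01, T))

    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
    alpha, betas = coef[0], coef[1:]
    print(f"alpha = {alpha:.4f}, betas = {np.round(betas, 2)}")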
Abstract:
We present a benchmark system for global vegetation models. This system provides a quantitative evaluation of multiple simulated vegetation properties, including primary production; seasonal net ecosystem production; vegetation cover; composition and height; fire regime; and runoff. The benchmarks are derived from remotely sensed gridded datasets and site-based observations. The datasets allow comparisons of annual average conditions and seasonal and inter-annual variability, and they allow the impact of spatial and temporal biases in means and variability to be assessed separately. Specifically designed metrics quantify model performance for each process, and are compared to scores based on the temporal or spatial mean value of the observations and a "random" model produced by bootstrap resampling of the observations. The benchmark system is applied to three models: a simple light-use efficiency and water-balance model (the Simple Diagnostic Biosphere Model: SDBM), the Lund-Potsdam-Jena (LPJ) and Land Processes and eXchanges (LPX) dynamic global vegetation models (DGVMs). In general, the SDBM performs better than either of the DGVMs. It reproduces independent measurements of net primary production (NPP) but underestimates the amplitude of the observed CO2 seasonal cycle. The two DGVMs show little difference for most benchmarks (including the inter-annual variability in the growth rate and seasonal cycle of atmospheric CO2), but LPX represents burnt fraction demonstrably more accurately. Benchmarking also identified several weaknesses common to both DGVMs. The benchmarking system provides a quantitative approach for evaluating how adequately processes are represented in a model, identifying errors and biases, tracking improvements in performance through model development, and discriminating among models. Adoption of such a system would do much to improve confidence in terrestrial model predictions of climate change impacts and feedbacks.
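A minimal sketch of one plausible metric in this family: a normalised mean error (NME) score for a model, compared against the score of the observation-mean "model" and of a bootstrap-resampled "random" model, as described above. The published system's exact metrics and weighting may differ; the data here are synthetic.

    # Sketch: a normalised mean error (NME) benchmark score, with the two
    # null comparisons described in the abstract: the observation mean as a
    # "model", and bootstrap resampling of the observations.
    import numpy as np

    def nme(sim, obs):
        """Sum of absolute errors, normalised by mean absolute deviation of obs."""
        return np.abs(sim - obs).sum() / np.abs(obs - obs.mean()).sum()

    rng = np.random.default_rng(1)
    obs = rng.gamma(2.0, 1.0, 500)           # synthetic observations
    sim = obs + rng.normal(0, 0.5, 500)      # synthetic model output

    mean_score = nme(np.full_like(obs, obs.mean()), obs)   # = 1 by construction
    rand_scores = [nme(rng.choice(obs, obs.size), obs) for _ in range(200)]
    print(f"model NME = {nme(sim, obs):.2f}")
    print(f"mean-model NME = {mean_score:.2f}, random-model NME ~ {np.mean(rand_scores):.2f}")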
Abstract:
Palaeodata in synthesis form are needed as benchmarks for the Palaeoclimate Modelling Intercomparison Project (PMIP). Advances since the last synthesis of terrestrial palaeodata from the last glacial maximum (LGM) call for a new evaluation, especially of data from the tropics. Here pollen, plant-macrofossil, lake-level, noble gas (from groundwater) and δ18O (from speleothems) data are compiled for 18±2 ka (14C), 32 °N–33 °S. The reliability of the data was evaluated using explicit criteria and some types of data were re-analysed using consistent methods in order to derive a set of mutually consistent palaeoclimate estimates of mean temperature of the coldest month (MTCO), mean annual temperature (MAT), plant available moisture (PAM) and runoff (P-E). Cold-month temperature (MTCO) anomalies from plant data range from −1 to −2 K near sea level in Indonesia and the S Pacific, through −6 to −8 K at many high-elevation sites to −8 to −15 K in S China and the SE USA. MAT anomalies from groundwater or speleothems seem more uniform (−4 to −6 K), but the data are as yet sparse; a clear divergence between MAT and cold-month estimates from the same region is seen only in the SE USA, where cold-air advection is expected to have enhanced cooling in winter. Regression of all cold-month anomalies against site elevation yielded an estimated average cooling of −2.5 to −3 K at modern sea level, increasing to ≈−6 K by 3000 m. However, Neotropical sites showed larger-than-average sea-level cooling (−5 to −6 K) and a non-significant elevation effect, whereas W and S Pacific sites showed much less sea-level cooling (−1 K) and a stronger elevation effect. These findings support the inference that tropical sea-surface temperatures (SSTs) were lower than the CLIMAP estimates, but they limit the plausible average tropical sea-surface cooling, and they support the existence of CLIMAP-like geographic patterns in SST anomalies. Trends of PAM and lake levels indicate wet LGM conditions in the W USA, and at the highest elevations, with generally dry conditions elsewhere. These results suggest a colder-than-present ocean surface producing a weaker hydrological cycle, more arid continents, and arguably steeper-than-present terrestrial lapse rates. Such linkages are supported by recent observations on freezing-level height and tropical SSTs; moreover, simulations of "greenhouse" and LGM climates point to several possible feedback processes by which low-level temperature anomalies might be amplified aloft.
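The elevation regression described above has a very simple form; the data points below are invented for illustration and are not the compiled LGM dataset.

    # Sketch: regress cold-month temperature anomalies on site elevation to
    # estimate sea-level cooling (intercept) and its change with height
    # (slope). Data points are placeholders, not the compiled LGM data.
    import numpy as np

    elev_m = np.array([0, 200, 800, 1500, 2400, 3000])
    dT_K = np.array([-2.6, -3.0, -3.8, -4.7, -5.5, -6.1])   # placeholder anomalies

    slope, intercept = np.polyfit(elev_m, dT_K, 1)
    print(f"sea-level cooling ~ {intercept:.1f} K; "
          f"extra cooling ~ {slope * 1000:.1f} K per km")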
Implication of methodological uncertainties for mid-Holocene sea surface temperature reconstructions
Abstract:
We present and examine a multi-sensor global compilation of mid-Holocene (MH) sea surface temperatures (SST), based on Mg/Ca and alkenone palaeothermometry and reconstructions obtained using planktonic foraminifera and organic-walled dinoflagellate cyst census counts. We assess the uncertainties originating from the use of different methodologies and evaluate the potential of MH SST reconstructions as a benchmark for climate-model simulations. The comparison between different analytical approaches (time frame, baseline climate) shows that the choice of time window for the MH has a negligible effect on the reconstructed SST pattern, but the choice of baseline climate affects both the magnitude and the spatial pattern of the reconstructed SSTs. Comparison of the SST reconstructions made using different sensors shows significant discrepancies at a regional scale, with uncertainties often exceeding the reconstructed SST anomaly. Apparent patterns in SST may largely be a reflection of the use of different sensors in different regions. Overall, the uncertainties associated with the SST reconstructions are generally larger than the MH anomalies. Thus, the SST data currently available cannot serve as a target for benchmarking model simulations. Further evaluation of potential subsurface and/or seasonal artifacts that may obscure the MH SST reconstructions is urgently needed to provide reliable benchmarks for model evaluations.
A benchmark-driven modelling approach for evaluating deployment choices on a multi-core architecture
Abstract:
The complexity of current and emerging architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven model is developed for a simple shallow water code on a Cray XE6 system, to explore how deployment choices such as domain decomposition and core affinity affect performance. The resource sharing present in modern multi-core architectures adds various levels of heterogeneity to the system. Shared resources often include cache, memory, network controllers and in some cases floating-point units (as in the AMD Bulldozer), which means that access time depends on the mapping of application tasks and on each core's location within the system. Heterogeneity increases further with the use of hardware accelerators such as GPUs and the Intel Xeon Phi, where many specialist cores are attached to general-purpose cores. This trend towards shared resources and non-uniform cores is expected to continue into the exascale era. The complexity of these systems means that various runtime scenarios are possible, and it has been found that under-populating nodes, altering the domain decomposition and non-standard task-to-core mappings can dramatically alter performance. Discovering this, however, is often a process of trial and error. To better inform this process, a performance model was developed for a simple regular grid-based kernel code, shallow. The code comprises two distinct types of work: loop-based array updates and nearest-neighbour halo exchanges. Separate performance models were developed for each part, both based on a similar methodology. Application-specific benchmarks were run to measure performance for different problem sizes under different execution scenarios. These results were then fed into a performance model that derives resource usage for a given deployment scenario, with interpolation between results as necessary.
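The final step described, interpolating between application-specific benchmark measurements to predict the cost of an untested configuration, might look like the following sketch; the benchmark timings are placeholders.

    # Sketch: interpolate between application-specific benchmark results to
    # predict the array-update cost of an untested local problem size.
    # Benchmark timings here are placeholders.
    import numpy as np

    bench_cells = np.array([1e4, 1e5, 1e6, 1e7])          # local cells benchmarked
    bench_secs = np.array([3e-4, 2.8e-3, 3.1e-2, 0.36])   # measured update time

    def predict_update(local_cells):
        """Linear interpolation (in log-log space) of measured compute time."""
        return float(np.exp(np.interp(np.log(local_cells),
                                      np.log(bench_cells), np.log(bench_secs))))

    print(f"predicted time for 4e6 cells: {predict_update(4e6):.3f} s")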
Abstract:
The Land surface Processes and eXchanges (LPX) model is a fire-enabled dynamic global vegetation model that performs well globally but has problems representing fire regimes and vegetative mix in savannas. Here we focus on improving the fire module. To improve the representation of ignitions, we introduced a treatment of lightning that allows the fraction of ground strikes to vary spatially and seasonally, realistically partitions strike distribution between wet and dry days, and varies the number of dry days with strikes. Fuel availability and moisture content were improved by implementing decomposition rates specific to individual plant functional types (PFTs) and litter classes, and litter drying rates driven by atmospheric water content. To improve water extraction by grasses, we use realistic plant-specific treatments of deep roots. To improve fire responses, we introduced adaptive bark thickness and post-fire resprouting for tropical and temperate broadleaf trees. All improvements are based on extensive analyses of relevant observational data sets. We test model performance for Australia, first evaluating parameterisations separately and then measuring overall behaviour against standard benchmarks. Changes to the lightning parameterisation produce a more realistic simulation of fires in southeastern and central Australia. Implementation of PFT-specific decomposition rates enhances performance in central Australia. Changes in fuel drying improve fire in northern Australia, while changes in rooting depth produce a more realistic simulation of fuel availability and structure in central and northern Australia. The introduction of adaptive bark thickness and resprouting produces more realistic fire regimes in Australian savannas. We also show that the model simulates biomass recovery rates consistent with observations from several different regions of the world characterised by resprouting vegetation. The new model (LPX-Mv1) produces an improved simulation of observed vegetation composition and mean annual burnt area, by 33% and 18% respectively, compared to LPX.
Abstract:
This paper uses a novel numerical optimization technique - robust optimization - that is well suited to solving the asset-liability management (ALM) problem for pension schemes. It requires the estimation of fewer stochastic parameters, reduces estimation risk and adopts a prudent approach to asset allocation. This study is the first to apply it to a real-world pension scheme, and the first ALM model of a pension scheme to maximise the Sharpe ratio. We disaggregate pension liabilities into three components - active members, deferred members and pensioners, and transform the optimal asset allocation into the scheme’s projected contribution rate. The robust optimization model is extended to include liabilities and used to derive optimal investment policies for the Universities Superannuation Scheme (USS), benchmarked against the Sharpe and Tint, Bayes-Stein, and Black-Litterman models as well as the actual USS investment decisions. Over a 144 month out-of-sample period robust optimization is superior to the four benchmarks across 20 performance criteria, and has a remarkably stable asset allocation – essentially fix-mix. These conclusions are supported by six robustness checks.
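For context, the classical (non-robust) maximum-Sharpe tangency portfolio that the paper's robust formulation generalises can be computed as below. The expected returns and covariances are placeholders, and a robust model would additionally bound the estimation error in these inputs.

    # Sketch: the classical maximum-Sharpe (tangency) portfolio. Robust
    # optimization, as in the paper, would additionally guard against
    # uncertainty in mu and cov; all numbers here are placeholders.
    import numpy as np

    mu = np.array([0.06, 0.04, 0.05])       # expected excess returns (placeholder)
    cov = np.array([[0.040, 0.006, 0.004],
                    [0.006, 0.010, 0.002],
                    [0.004, 0.002, 0.023]])

    w = np.linalg.solve(cov, mu)
    w /= w.sum()                             # normalise to fully-invested weights
    sharpe = (w @ mu) / np.sqrt(w @ cov @ w)
    print(f"weights = {np.round(w, 3)}, Sharpe = {sharpe:.2f}")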
Abstract:
The evaluation of forecast performance plays a central role both in the interpretation and use of forecast systems and in their development. Different evaluation measures (scores) are available, often quantifying different characteristics of forecast performance. The properties of several proper scores for probabilistic forecast evaluation are contrasted and then used to interpret decadal probability hindcasts of global mean temperature. The Continuous Ranked Probability Score (CRPS), Proper Linear (PL) score, and I. J. Good's logarithmic score (also referred to as Ignorance) are compared; although information from all three may be useful, the logarithmic score has an immediate interpretation and is sensitive to forecast busts. Neither the CRPS nor the PL score is local; this is shown to produce counterintuitive evaluations by the CRPS. Benchmark forecasts from empirical models like Dynamic Climatology place the scores in context. Comparing scores for forecast systems based on physical models (in this case HadCM3, from the CMIP5 decadal archive) against such benchmarks is more informative than comparing forecast systems based on similar physical simulation models only with each other. It is shown that a forecast system based on HadCM3 outperforms Dynamic Climatology in decadal global mean temperature hindcasts; Dynamic Climatology previously outperformed a forecast system based upon HadGEM2, and reasons for these results are suggested. Forecasts of aggregate data (5-year means of global mean temperature) are, of course, narrower than forecasts of annual averages due to the suppression of variance; while the average "distance" between the forecasts and a target may be expected to decrease, little if any discernible improvement in probabilistic skill is achieved.
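Minimal estimators for two of the scores discussed, assuming a Gaussian fit to the ensemble for the Ignorance score and the standard ensemble estimator for the CRPS; the ensemble values and outcome below are invented.

    # Sketch: two of the proper scores discussed. Ignorance = -log2 of the
    # forecast density at the outcome (here a Gaussian fit to the ensemble);
    # CRPS via the standard ensemble estimator E|X-y| - 0.5*E|X-X'|.
    import numpy as np

    ens = np.array([0.21, 0.35, 0.18, 0.40, 0.27, 0.31])  # hindcast ensemble (K)
    y = 0.33                                              # observed outcome (K)

    mu, sd = ens.mean(), ens.std(ddof=1)
    ignorance = -np.log2(np.exp(-0.5 * ((y - mu) / sd) ** 2)
                         / (sd * np.sqrt(2 * np.pi)))

    crps = np.abs(ens - y).mean() - 0.5 * np.abs(ens[:, None] - ens[None, :]).mean()
    print(f"Ignorance = {ignorance:.2f} bits, CRPS = {crps:.3f}")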
Abstract:
We present a selection of methodologies for using the palaeo-climate model component of the Coupled Model Intercomparison Project (Phase 5) (CMIP5) to attempt to constrain future climate projections using the same models. The constraints arise from measures of skill in hindcasting palaeo-climate changes from the present over three periods: the Last Glacial Maximum (LGM) (21 000 yr before present, ka), the mid-Holocene (MH) (6 ka) and the Last Millennium (LM) (850–1850 CE). The skill measures may be used to validate robust patterns of climate change across scenarios or to distinguish between models that have differing outcomes in future scenarios. We find that the multi-model ensemble of palaeo-simulations is adequate for addressing at least some of these issues. For example, selected benchmarks for the LGM and MH are correlated to the rank of future projections of precipitation/temperature or sea ice extent to indicate that models that produce the best agreement with palaeo-climate information give demonstrably different future results than the rest of the models. We also explore cases where comparisons are strongly dependent on uncertain forcing time series or show important non-stationarity, making direct inferences for the future problematic. Overall, we demonstrate that there is a strong potential for the palaeo-climate simulations to help inform the future projections and urge all the modelling groups to complete this subset of the CMIP5 runs.
Abstract:
This thesis examines three different, but related, problems in the broad area of portfolio management for long-term institutional investors, focusing mainly on the case of pension funds. The first idea (Chapter 3) is the application of a novel numerical technique – robust optimization – to a real-world pension scheme (the Universities Superannuation Scheme, USS) for the first time. The corresponding empirical results are supported by many robustness checks and several benchmarks, such as the Bayes-Stein and Black-Litterman models (also applied for the first time in a pension ALM framework), the Sharpe and Tint model, and the actual USS asset allocations. The second idea, presented in Chapter 4, is the investigation of whether the selection of the portfolio construction strategy matters in the SRI industry, an issue of great importance for long-term investors. This study applies a variety of optimal and naïve portfolio diversification techniques to the same SRI-screened universe, and gives some answers to the question of which portfolio strategies tend to create superior SRI portfolios. Finally, the third idea (Chapter 5) compares the performance of a real-world pension scheme (USS) before and after the recent major changes in the pension rules under different dynamic asset allocation strategies and the fixed-mix portfolio approach, and quantifies the redistributive effects between various stakeholders. Although this study deals with a specific pension scheme, the methodology can be applied by other major pension schemes in countries such as the UK and USA that have changed their rules.
Abstract:
Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used or used in a specialised manner, due to the limitations of the feature-based modelling techniques employed. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems can be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 English verbs from the SENSEVAL-3 benchmark and 85 English verbs and nouns from SemEval-2007; the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way to make substantial progress in the field of WSD.
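The final modelling stage described (a support vector machine over ILP-constructed features) reduces to something like the following; the binary features stand in for ILP clause coverage and the data are synthetic.

    # Sketch: the last stage of the pipeline described above. Each column is
    # a boolean feature ("does ILP clause k cover this example?"); an SVM is
    # trained on them to predict a word sense. Features and labels are
    # synthetic stand-ins for real ILP output.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(42)
    X = rng.integers(0, 2, size=(200, 50))        # 200 examples, 50 clause features
    y = (X[:, 0] | X[:, 3]).astype(int)           # toy "sense" label

    clf = SVC(kernel="linear").fit(X[:150], y[:150])
    print(f"held-out accuracy = {clf.score(X[150:], y[150:]):.2f}")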