99 results for BENCHMARK
Abstract:
We present an intercomparison and verification analysis of 20 GCMs (Global Circulation Models) included in the 4th IPCC assessment report regarding their representation of the hydrological cycle over the Danube river basin for 1961–2000 and for the 2161–2200 SRESA1B scenario runs. The basin-scale properties of the hydrological cycle are computed by spatially integrating the precipitation, evaporation, and runoff fields using the Voronoi-Thiessen tessellation formalism. The span of the model-simulated mean annual water balances is of the same order of magnitude as the observed Danube discharge at the Delta; the true value is within the range simulated by the models. Some land components seem to have deficiencies, since there are cases of violation of water conservation when annual means are considered. The overall performance and the degree of agreement of the GCMs are comparable to those of the RCMs (Regional Climate Models) analyzed in a previous work, in spite of the much higher resolution and common nesting of the RCMs. The reanalyses are shown to feature several inconsistencies and cannot be used as a verification benchmark for the hydrological cycle in the Danubian region. In the scenario runs, the water balance decreases for basically all models, whereas its interannual variability increases. Changes in the strength of the hydrological cycle are not consistent among models: it is confirmed that capturing the impact of climate change on the hydrological cycle is not an easy task over land areas. Moreover, in several cases we find that qualitatively different behaviors emerge among the models: the ensemble mean does not represent any sort of average model, and it often falls between the models’ clusters.
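The abstract does not spell out the integration procedure; the following is a minimal sketch, with hypothetical per-station values and Voronoi (Thiessen) cell areas assumed to be precomputed, of how a basin-averaged water balance P − E − R could be formed from area weights.

```python
import numpy as np

# Hypothetical station values inside the basin (illustrative only)
precip = np.array([2.1, 1.8, 2.5, 1.6])    # mm/day at each station
evap   = np.array([1.2, 1.0, 1.4, 0.9])
runoff = np.array([0.7, 0.6, 0.9, 0.5])

# Thiessen weights: each station is weighted by the area of its Voronoi cell
# clipped to the basin boundary; the clipped areas (km^2) are assumed known here.
cell_area = np.array([2.0e4, 3.5e4, 1.5e4, 2.5e4])
weights = cell_area / cell_area.sum()

# Basin-integrated water balance P - E - R as an area-weighted mean (mm/day)
balance = np.sum(weights * (precip - evap - runoff))
print(f"basin-mean water balance: {balance:.3f} mm/day")
```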
Abstract:
The evaluation of investment fund performance has been one of the main developments of modern portfolio theory. Most studies employ the technique developed by Jensen (1968), which compares a particular fund's returns to a benchmark portfolio of equal risk. However, the standard measures of fund manager performance are known to suffer from a number of problems in practice. In particular, previous studies implicitly assume that the risk level of the portfolio is stationary through the evaluation period; that is, unconditional measures of performance do not account for the fact that risk and expected returns may vary with the state of the economy. Therefore, many of the problems encountered in previous performance studies reflect the inability of traditional measures to handle the dynamic behaviour of returns. As a consequence, Ferson and Schadt (1996) suggest an approach to performance evaluation called conditional performance evaluation, which is designed to address this problem. This paper utilises such a conditional measure of performance on a sample of 27 UK property funds over the period 1987-1998. The results suggest that once the time-varying nature of the funds' beta is corrected for, by the addition of the market indicators, average fund performance shows an improvement over that of the traditional methods of analysis.
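The abstract names the Ferson and Schadt (1996) model without giving its form; below is a minimal sketch under the usual formulation, with hypothetical monthly excess returns and two hypothetical lagged, demeaned public-information instruments, where beta is allowed to vary linearly with the instruments and the intercept is read as the conditional alpha.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 120                                             # months (hypothetical sample)
mkt = rng.normal(0.005, 0.04, T)                    # market excess return
z = rng.normal(0.0, 1.0, (T, 2))                    # lagged, demeaned instruments
fund = 0.001 + 0.8 * mkt + rng.normal(0, 0.02, T)   # fund excess return (hypothetical)

# Conditional model:  r_p,t = alpha + b0*r_m,t + sum_j b_j*(z_j,t-1 * r_m,t) + e_t
X = np.column_stack([np.ones(T), mkt, z * mkt[:, None]])
coef, *_ = np.linalg.lstsq(X, fund, rcond=None)
alpha, b0, b_cond = coef[0], coef[1], coef[2:]
print(f"conditional alpha per month: {alpha:.4f}, average beta: {b0:.2f}")
```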
Abstract:
Previous studies of the place of Property in the multi-asset portfolio have generally relied on historical data, and have been concerned with the supposed risk reduction effects that Property would have on such portfolios. In this paper a different approach has been taken. Not only are expectations data used, but we have also concentrated upon the required return that Property would have to offer to achieve a holding of 15% in typical UK pension fund portfolios. Using two benchmark portfolios for pension funds, we have shown that Property's required return is less than that expected, and therefore it could justify a 15% holding.
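The abstract does not reproduce the calculation; one standard way to back out the return an asset must offer to justify a given portfolio weight is mean-variance reverse optimization, sketched below with a hypothetical covariance matrix, hypothetical benchmark weights (property at 15%) and a hypothetical risk-aversion coefficient.

```python
import numpy as np

# Hypothetical annual covariance matrix for equities, gilts and property (illustrative)
assets = ["equities", "gilts", "property"]
cov = np.array([
    [0.0400, 0.0100, 0.0080],
    [0.0100, 0.0090, 0.0030],
    [0.0080, 0.0030, 0.0150],
])
w = np.array([0.60, 0.25, 0.15])   # target benchmark weights, property held at 15%
risk_aversion = 3.0                # hypothetical investor risk-aversion coefficient

# Reverse optimization: excess returns implied by holding w as the optimal portfolio
implied_excess = risk_aversion * cov @ w
for name, mu in zip(assets, implied_excess):
    print(f"required excess return for {name}: {mu:.2%}")
```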
Abstract:
The position of Real Estate within a multi-asset portfolio has received considerable attention recently. Previous research has concentrated on the percentage holding property would achieve given its risk/return characteristics. Such studies have invariably used Modern Portfolio Theory, and these approaches have been criticised both for the quality of the real estate data and for problems with the methodology itself. The first problem is now well understood, and the second can be addressed by the use of realistic constraints on asset holdings. This paper takes a different approach. We determine the level of return that Real Estate needs to achieve to justify an allocation within the multi-asset portfolio. In order to test the importance of the quality of the data, we use historic appraisal-based and desmoothed returns to examine the sensitivity of the results. Consideration is also given to the holding period and to the imposition of realistic constraints on asset holdings, in order to model portfolios held by pension fund investors. We conclude, using several benchmark levels of portfolio risk and return, that using appraisal-based data the required level of return for Real Estate was less than that achieved over the period 1972-1993. The use of desmoothed series can reverse this result at the highest levels of desmoothing, although within a restricted holding period Real Estate offered returns in excess of those required to enter the portfolio and might have a role to play in the multi-asset portfolio.
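The abstract does not state which desmoothing procedure was applied; a common choice for appraisal-based indices is a first-order unsmoothing filter of the Geltner type, sketched below with a hypothetical smoothing parameter swept over several "levels of desmoothing".

```python
import numpy as np

def desmooth(appraisal_returns, alpha):
    """First-order unsmoothing: r*_t = (r_t - alpha * r_{t-1}) / (1 - alpha).

    alpha is the assumed smoothing parameter (often taken from the first-order
    autocorrelation of the appraisal-based series); a higher alpha applies a
    heavier correction and yields a more volatile desmoothed series.
    """
    r = np.asarray(appraisal_returns, dtype=float)
    return (r[1:] - alpha * r[:-1]) / (1.0 - alpha)

# Hypothetical annual appraisal-based property returns (illustrative only)
smoothed = [0.08, 0.10, 0.04, -0.02, 0.06, 0.09]
for a in (0.3, 0.5, 0.7):                       # increasing levels of desmoothing
    r_star = desmooth(smoothed, a)
    print(f"alpha={a}: std of desmoothed series = {np.std(r_star, ddof=1):.3f}")
```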
Abstract:
Motivation: The ability of a simple method (MODCHECK) to determine the sequence–structure compatibility of a set of structural models generated by fold recognition is tested in a thorough benchmark analysis. Four Model Quality Assessment Programs (MQAPs) were tested on 188 targets from the latest LiveBench-9 automated structure evaluation experiment. We systematically test and evaluate whether the MQAP methods can successfully detect native-likemodels. Results: We show that compared with the other three methods tested MODCHECK is the most reliable method for consistently performing the best top model selection and for ranking the models. In addition, we show that the choice of model similarity score used to assess a model's similarity to the experimental structure can influence the overall performance of these tools. Although these MQAP methods fail to improve the model selection performance for methods that already incorporate protein three dimension (3D) structural information, an improvement is observed for methods that are purely sequence-based, including the best profile–profile methods. This suggests that even the best sequence-based fold recognition methods can still be improved by taking into account the 3D structural information.
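The abstract describes top-model selection without giving the evaluation procedure; the sketch below is a hypothetical illustration of the kind of benchmark it implies: for each target, the model ranked first by an MQAP score is selected and credited with its similarity to the experimental structure (both scores are placeholders, not real MQAP output).

```python
from statistics import mean

# Hypothetical benchmark data: per target, a list of (mqap_score, similarity_to_native)
# pairs, one per candidate model.
targets = {
    "T001": [(0.62, 0.55), (0.71, 0.80), (0.40, 0.30)],
    "T002": [(0.50, 0.45), (0.48, 0.60), (0.55, 0.50)],
}

def top_model_selection_score(targets):
    """Average native-similarity of the model the MQAP ranks first for each target."""
    picked = []
    for models in targets.values():
        best = max(models, key=lambda m: m[0])   # model with the highest MQAP score
        picked.append(best[1])                   # credit its similarity to native
    return mean(picked)

print(f"mean similarity of selected models: {top_model_selection_score(targets):.2f}")
```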
Abstract:
There are many published methods available for creating keyphrases for documents. Previous work in the field has shown that in a significant proportion of cases author-selected keyphrases are not appropriate for the document they accompany: often the keyphrases are not updated when the focus of a paper changes, or they are more classificatory than explanatory. This motivates the use of automated methods to improve the use of keyphrases. The published methods are all evaluated using different corpora, typically one relevant to their field of study. This not only makes it difficult to incorporate the useful elements of the algorithms in future work, but also makes comparing the results of each method inefficient and ineffective. This paper describes the work undertaken to compare five methods across a common baseline of six corpora. The methods chosen were Term Frequency, Inverse Document Frequency, the C-Value, the NC-Value, and a synonym-based approach. These methods were compared to evaluate performance and quality of results, and to provide a future benchmark. It is shown that, with the comparison metric used for this study, Term Frequency and Inverse Document Frequency were the best algorithms, with the synonym-based approach following them. Further work in the area is required to determine an appropriate (or more appropriate) comparison metric.
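Of the five methods, Term Frequency and Inverse Document Frequency are the simplest to illustrate; the sketch below scores candidate phrases for one document against a small hypothetical corpus (the corpora, candidate extraction and tokenisation used in the paper are not reproduced here).

```python
import math
from collections import Counter

# Hypothetical mini-corpus: each document is a list of candidate phrases
corpus = [
    ["keyphrase extraction", "term frequency", "corpus", "evaluation"],
    ["term frequency", "inverse document frequency", "benchmark"],
    ["synonym", "keyphrase extraction", "evaluation", "evaluation"],
]
doc = corpus[2]

tf = Counter(doc)                                  # term frequency within the document
n_docs = len(corpus)

def idf(phrase):
    df = sum(phrase in d for d in corpus)          # documents containing the phrase
    return math.log(n_docs / df) if df else 0.0

# Rank candidates by TF-IDF, one simple way of combining the two baseline scores
ranked = sorted(set(doc), key=lambda p: tf[p] * idf(p), reverse=True)
for p in ranked:
    print(f"{p:28s} tf={tf[p]} idf={idf(p):.2f} tf-idf={tf[p] * idf(p):.2f}")
```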
Abstract:
The Aqua-Planet Experiment (APE) was first proposed by Neale and Hoskins (2000a) as a benchmark for atmospheric general circulation models (AGCMs) on an idealised water-covered Earth. The experiment and its aims are summarised, and its context within a modelling hierarchy used to evaluate complex models and to provide a link between realistic simulation and conceptual models of atmospheric phenomena is discussed. The simplified aqua-planet configuration bridges a gap in the existing hierarchy. It is designed to expose differences between models and to focus attention on particular phenomena and their response to changes in the underlying distribution of sea surface temperature.
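For orientation, the underlying SST distribution in the CONTROL configuration is zonally symmetric; the sketch below reproduces what I understand to be its form after Neale and Hoskins (2000a), so the exact functional shape should be treated as an assumption rather than taken from this summary.

```python
import numpy as np

def sst_control(lat_deg):
    """Zonally symmetric CONTROL SST (deg C) as a function of latitude.

    Assumed form (after Neale and Hoskins, 2000a):
        T = 27 * (1 - sin^2(3*lat/2))  for |lat| < 60 degrees, and 0 elsewhere.
    """
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    t = 27.0 * (1.0 - np.sin(1.5 * lat) ** 2)
    return np.where(np.abs(lat) < np.pi / 3.0, t, 0.0)

for lat in (0, 15, 30, 45, 60, 75):
    print(f"lat {lat:3d} deg: SST = {sst_control(lat):5.1f} C")
```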
Abstract:
Asset allocation is concerned with the development of multi-asset portfolio strategies that are likely to meet an investor’s objectives based on the interaction of expected returns, risk, correlation and implementation from a range of distinct asset classes or beta sources. Challenges associated with the discipline are often particularly significant in private markets. Specifically, composition differences between the ‘index’ or ‘benchmark’ universe and the investible universe mean that there can often be substantial and meaningful deviations between the investment characteristics implied in asset allocation decisions and those delivered by investment teams. For example, while allocation decisions are often based on relatively low-risk diversified real estate ‘equity’ exposure, implementation decisions frequently include exposure to higher-risk forms of the asset class as well as investments in debt-based instruments. These differences can have a meaningful impact on the contribution of the asset class to the overall portfolio and, therefore, lead to a potential misalignment between asset allocation decisions and implementation. Despite this, the key conclusion from this paper is not that real estate investors should become slaves to a narrowly defined mandate based on IPD/NCREIF or other forms of benchmark replication. The discussion suggests that such an approach would likely lead to the underutilization of real estate in multi-asset portfolio strategies. Instead, it is that, to achieve asset allocation alignment, real estate exposure should be divided into multiple pools representing distinct forms of the asset class. In addition, the paper suggests that associated investment guidelines and processes should be collaborative and reflect the portfolio-wide asset allocation objectives of each pool. Further, where appropriate they should specifically target potential for ‘additional’ beta or, more marginally, ‘alpha’.
Abstract:
We diagnose forcing and climate feedbacks in benchmark sensitivity experiments with the new Met Office Hadley Centre Earth system climate model HadGEM2-ES. To identify the impact of newly included biogeophysical and chemical processes, results are compared to a parallel set of experiments performed with these processes switched off, and with different couplings with the biogeochemistry. In abrupt carbon dioxide quadrupling experiments we find that the inclusion of these processes does not alter the global climate sensitivity of the model. However, when the change in carbon dioxide is uncoupled from the vegetation, or when the model is forced with a non-carbon dioxide forcing (an increase in solar constant), new feedbacks emerge that make the climate system less sensitive to external perturbations. We identify a strong negative dust-vegetation feedback on climate change that is small in standard carbon dioxide sensitivity experiments due to the physiological/fertilization effects of carbon dioxide on plants in this model.
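The abstract does not describe the diagnosis method itself; for abrupt CO2 experiments, forcing and feedback are commonly estimated with a Gregory-style regression of the net top-of-atmosphere flux imbalance against surface temperature change, as in the hypothetical sketch below (the data are illustrative, not taken from HadGEM2-ES).

```python
import numpy as np

# Hypothetical annual global means from an abrupt-4xCO2 run minus its control
delta_T = np.array([1.2, 2.0, 2.6, 3.1, 3.5, 3.8, 4.1, 4.3, 4.5, 4.6])   # K
delta_N = np.array([6.0, 5.1, 4.4, 3.8, 3.3, 2.9, 2.6, 2.3, 2.1, 1.9])   # W m-2

# Linear fit N = F + lambda * T:
#   intercept -> effective radiative forcing F (W m-2)
#   slope     -> net feedback parameter lambda (W m-2 K-1, negative for a stable climate)
lam, forcing = np.polyfit(delta_T, delta_N, 1)
equilibrium_warming = -forcing / lam        # temperature change where N returns to zero
print(f"forcing ~ {forcing:.1f} W m-2, feedback ~ {lam:.2f} W m-2 K-1, "
      f"equilibrium warming ~ {equilibrium_warming:.1f} K")
```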
Abstract:
Global flood hazard maps can be used in the assessment of flood risk in a number of different applications, including (re)insurance and large-scale flood preparedness. Such global hazard maps can be generated using large-scale physically based models of rainfall-runoff and river routing, when used in conjunction with a number of post-processing methods. In this study, the European Centre for Medium-Range Weather Forecasts (ECMWF) land surface model is coupled to ERA-Interim reanalysis meteorological forcing data, and the resultant runoff is passed to a river routing algorithm which simulates floodplains and flood flow across the global land area. The global hazard map is based on a 30 yr (1979–2010) simulation period. A Gumbel distribution is fitted to the annual maxima flows to derive a number of flood return periods. The return periods are calculated initially for a 25×25 km grid, which is then reprojected onto a 1×1 km grid to derive maps of higher resolution and to estimate the flooded fractional area for the individual 25×25 km cells. Several global and regional maps of flood return periods ranging from 2 to 500 yr are presented. The results compare reasonably well with a benchmark data set of global flood hazard. The developed methodology can be applied to other datasets on a global or regional scale.
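The post-processing step the abstract describes (fitting a Gumbel distribution to annual maxima and reading off return levels) can be sketched as follows, with hypothetical annual maximum discharges for a single grid cell over the 1979–2010 period.

```python
import numpy as np
from scipy.stats import gumbel_r

# Hypothetical annual maximum discharge (m3/s) for one 25x25 km cell, 1979-2010
annual_maxima = np.array([850, 920, 780, 1100, 990, 1300, 870, 950, 1020,
                          1180, 890, 1250, 940, 1010, 1080, 970, 1150, 900,
                          830, 1060, 980, 1210, 910, 1040, 990, 1120, 860,
                          1000, 1090, 930, 970, 1160], dtype=float)

loc, scale = gumbel_r.fit(annual_maxima)          # maximum-likelihood Gumbel fit

# Return level for return period T years: the flow exceeded with probability 1/T per year
for T in (2, 10, 50, 100, 500):
    level = gumbel_r.ppf(1.0 - 1.0 / T, loc=loc, scale=scale)
    print(f"{T:4d}-yr return level: {level:7.0f} m3/s")
```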
Abstract:
This Atlas presents statistical analyses of the simulations submitted to the Aqua-Planet Experiment (APE) data archive. The simulations are from global Atmospheric General Circulation Models (AGCMs) applied to a water-covered earth. The AGCMs include ones actively used or being developed for numerical weather prediction or climate research. Some are mature application models, and others are more novel and thus less well tested in Earth-like applications. The experiment applies AGCMs with their complete parameterization package to an idealization of the planet Earth which has a greatly simplified lower boundary that consists of an ocean only. It has no land and its associated orography, and no sea ice. The ocean is represented by Sea Surface Temperatures (SSTs) which are specified everywhere with simple, idealized distributions. Thus, in the hierarchy of tests available for AGCMs, APE falls between tests with simplified forcings, such as those proposed by Held and Suarez (1994) and Boer and Denis (1997), and the Earth-like simulations of the Atmospheric Modeling Intercomparison Project (AMIP, Gates et al., 1999). Blackburn and Hoskins (2013) summarize the APE and its aims. They discuss where the APE fits within a modeling hierarchy which has evolved to evaluate complete models and which provides a link between realistic simulation and conceptual models of atmospheric phenomena. The APE bridges a gap in the existing hierarchy. The goals of APE are to provide a benchmark of current model behaviors and to stimulate research to understand the causes of inter-model differences. APE is sponsored by the World Meteorological Organization (WMO) joint Commission on Atmospheric Science (CAS), World Climate Research Program (WCRP) Working Group on Numerical Experimentation (WGNE). Chapter 2 of this Atlas provides an overview of the specification of the eight APE experiments and of the data collected. Chapter 3 lists the participating models and includes brief descriptions of each. Chapters 4 through 7 present a wide variety of statistics from the 14 participating models for the eight different experiments. Additional intercomparison figures created by Dr. Yukiko Yamada in the AGU group are available at http://www.gfd-dennou.org/library/ape/comparison/. This Atlas is intended to present and compare the statistics of the APE simulations but does not contain a discussion of interpretive analyses. Such analyses are left for journal papers such as those included in the Special Issue of the Journal of the Meteorological Society of Japan (2013, Vol. 91A) devoted to the APE. Two papers in that collection provide an overview of the simulations: one (Blackburn et al., 2013) concentrates on the CONTROL simulation and the other (Williamson et al., 2013) on the response to changes in the meridional SST profile. Additional papers provide more detailed analysis of the basic simulations, while others describe various sensitivities and applications. The APE experiment database holds a wealth of data that is now publicly available from the APE web site: http://climate.ncas.ac.uk/ape/. We hope that this Atlas will stimulate future analyses and investigations to understand the large variation seen in the model behaviors.
Abstract:
This article summarizes the main research findings from the first of a series of annual surveys conducted for the British Council of Shopping Centres. The study examines the changing pattern of retailing in the United Kingdom and provides an overview of key research from previous studies in both the U.K. and the United States. The main findings are then presented, including an examination of the impact of e-commerce on sales and rental values and on the future space and ownership/leasing requirements of U.K. retailers for 2000-2005. The impact on a shopping center in a case study town in the U.K. is also considered. The difficulties of isolating the impact of e-commerce from other forces for change in retailing are highlighted. In contrast to other viewpoints, the results show that e-commerce will not mean the death of conventional store-based U.K. retailing, although further benchmark research is needed.
Abstract:
We present a benchmark system for global vegetation models. This system provides a quantitative evaluation of multiple simulated vegetation properties, including primary production; seasonal net ecosystem production; vegetation cover, composition and height; fire regime; and runoff. The benchmarks are derived from remotely sensed gridded datasets and site-based observations. The datasets allow comparisons of annual average conditions and seasonal and inter-annual variability, and they allow the impact of spatial and temporal biases in means and variability to be assessed separately. Specifically designed metrics quantify model performance for each process, and are compared to scores based on the temporal or spatial mean value of the observations and a “random” model produced by bootstrap resampling of the observations. The benchmark system is applied to three models: a simple light-use efficiency and water-balance model (the Simple Diagnostic Biosphere Model: SDBM), and the Lund-Potsdam-Jena (LPJ) and Land Processes and eXchanges (LPX) dynamic global vegetation models (DGVMs). SDBM reproduces observed CO2 seasonal cycles, but its simulation of independent measurements of net primary production (NPP) is too high. The two DGVMs show little difference for most benchmarks (including the interannual variability in the growth rate and seasonal cycle of atmospheric CO2), but LPX represents burnt fraction demonstrably more accurately. Benchmarking also identified several weaknesses common to both DGVMs. The benchmarking system provides a quantitative approach for evaluating how adequately processes are represented in a model, identifying errors and biases, tracking improvements in performance through model development, and discriminating among models. Adoption of such a system would do much to improve confidence in terrestrial model predictions of climate change impacts and feedbacks.
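The abstract mentions comparing each model's score against that of a “random” model built by bootstrap resampling of the observations; the paper's specific metrics are not reproduced here, so the sketch below uses a generic normalised mean error as a placeholder to illustrate the idea.

```python
import numpy as np

rng = np.random.default_rng(42)

def nme(sim, obs):
    """Normalised mean error: mean |sim - obs| divided by mean |obs - mean(obs)|."""
    return np.mean(np.abs(sim - obs)) / np.mean(np.abs(obs - obs.mean()))

# Hypothetical gridded observations and one model's simulation of the same field
obs = rng.gamma(2.0, 1.5, size=500)          # e.g. annual NPP at 500 cells
sim = obs * rng.normal(1.1, 0.3, size=500)   # imperfect model (illustrative only)

model_score = nme(sim, obs)

# "Random" benchmark: score of bootstrap-resampled observations against the observations
random_scores = [nme(rng.choice(obs, size=obs.size, replace=True), obs)
                 for _ in range(1000)]

print(f"model NME:        {model_score:.2f}")
print(f"mean-model NME:   {nme(np.full_like(obs, obs.mean()), obs):.2f} (1 by construction)")
print(f"random-model NME: {np.mean(random_scores):.2f}")
```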
Abstract:
Keyphrases are added to documents to help identify the areas of interest they contain. However, in a significant proportion of papers author-selected keyphrases are not appropriate for the document they accompany: for instance, they can be classificatory rather than explanatory, or they are not updated when the focus of the paper changes. As such, automated methods for improving the use of keyphrases are needed, and various methods have been published. However, each method was evaluated using a different corpus, typically one relevant to the field of study of the method's authors. This not only makes it difficult to incorporate the useful elements of algorithms in future work, but also makes comparing the results of each method inefficient and ineffective. This paper describes the work undertaken to compare five methods across a common baseline of corpora. The methods chosen were Term Frequency, Inverse Document Frequency, the C-Value, the NC-Value, and a Synonym-based approach. These methods were analysed to evaluate performance and quality of results, and to provide a future benchmark. It is shown that Term Frequency and Inverse Document Frequency were the best algorithms, with the Synonym-based approach following them. Following these findings, a study was undertaken into the value of using human evaluators to judge the outputs. The Synonym method was compared to the original author keyphrases of the Reuters' News Corpus. The findings show that authors of Reuters' news articles provide good keyphrases but that, more often than not, they do not provide any keyphrases.
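The C-Value, one of the five methods compared, weights a candidate phrase by its length and frequency while discounting occurrences nested inside longer candidates; the sketch below follows the commonly cited formulation, which is an assumption, since the paper's exact implementation is not reproduced in this abstract.

```python
import math
from collections import Counter

# Hypothetical candidate multi-word phrases with corpus frequencies
freq = Counter({
    "floating point": 10,
    "floating point arithmetic": 4,
    "floating point unit": 3,
    "point unit": 3,
})

def c_value(phrase, freq):
    """C-Value of a multi-word candidate, discounting nested occurrences."""
    words = phrase.split()
    # Longer candidates that contain this phrase (nesting)
    nested_in = [p for p in freq if phrase in p and p != phrase]
    if nested_in:
        discount = sum(freq[p] for p in nested_in) / len(nested_in)
        return math.log2(len(words)) * (freq[phrase] - discount)
    return math.log2(len(words)) * freq[phrase]

for phrase in freq:
    print(f"{phrase:28s} C-Value = {c_value(phrase, freq):.2f}")
```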
Abstract:
Background: Since their inception, Twitter and related microblogging systems have provided a rich source of information for researchers and have attracted interest in their affordances and use. Since 2009 PubMed has included 123 journal articles on medicine and Twitter, but no overview exists as to how the field uses Twitter in research. // Objective: This paper aims to identify published work relating to Twitter indexed by PubMed, and then to classify it. This classification will provide a framework in which future researchers will be able to position their work, and an understanding of the current reach of research using Twitter in medical disciplines. Limiting the study to papers indexed by PubMed ensures the work provides a reproducible benchmark. // Methods: Papers indexed by PubMed on Twitter and related topics were identified and reviewed. The papers were then qualitatively classified based on the paper's title and abstract to determine their focus. The work that was Twitter-focused was studied in detail to determine what data, if any, it was based on, and from this a categorization of the data set sizes used in the studies was developed. Using open-coded content analysis, additional important categories were also identified, relating to the primary methodology, domain and aspect. // Results: As of 2012, PubMed comprises more than 21 million citations from the biomedical literature, and from these a corpus of 134 potentially Twitter-related papers was identified, eleven of which were subsequently found not to be relevant. There were no papers prior to 2009 relating to microblogging, a term first used in 2006. Of the remaining 123 papers which mentioned Twitter, thirty were focused on Twitter (the others referring to it tangentially). The early Twitter-focused papers introduced the topic and highlighted the potential, without carrying out any form of data analysis. The majority of published papers used analytic techniques to sort through thousands, if not millions, of individual tweets, often depending on automated tools to do so. Our analysis demonstrates that researchers are starting to use knowledge discovery methods and data mining techniques to understand vast quantities of tweets: the study of Twitter is becoming quantitative research. // Conclusions: This work is, to the best of our knowledge, the first overview study of medicine-related research based on Twitter and related microblogging. We have used five dimensions to categorise published medical research relating to Twitter. This classification provides a framework within which researchers studying the development and use of Twitter in medical research, and those undertaking comparative studies of research relating to Twitter in medicine and beyond, can position and ground their work.
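The abstract describes identifying Twitter-related papers indexed by PubMed; one reproducible way to retrieve such a corpus is NCBI's E-utilities, sketched below via Biopython's Entrez module. The query terms, field tags and date limits here are assumptions for illustration, not the search actually used in the paper.

```python
from Bio import Entrez

Entrez.email = "you@example.org"   # NCBI asks for a contact address for E-utilities

# Hypothetical query: the paper's exact search terms and limits are not reproduced here
query = ('(twitter[Title/Abstract] OR microblogging[Title/Abstract]) '
         'AND ("2006"[PDAT] : "2012"[PDAT])')

handle = Entrez.esearch(db="pubmed", term=query, retmax=500)
record = Entrez.read(handle)
handle.close()
print(f"{record['Count']} matching PubMed records")

# Fetch a few titles/abstracts for manual, qualitative classification
pmids = record["IdList"][:5]
handle = Entrez.efetch(db="pubmed", id=",".join(pmids), rettype="abstract", retmode="text")
print(handle.read())
handle.close()
```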