34 results for Performance Measures
Abstract:
Recently there has been an increasing interest in the development of new methods using Pareto optimality to deal with multi-objective criteria (for example, accuracy and architectural complexity). Once one has learned a model based on their devised method, the problem is then how to compare it with the state of the art. In machine learning, algorithms are typically evaluated by comparing their performance on different data sets by means of statistical tests. Unfortunately, the standard tests used for this purpose are not able to jointly consider multiple performance measures. The aim of this paper is to resolve this issue by developing statistical procedures that are able to account for multiple competing measures at the same time. In particular, we develop two tests: a frequentist procedure based on the generalized likelihood-ratio test and a Bayesian procedure based on a multinomial-Dirichlet conjugate model. We further extend them by discovering conditional independences among measures to reduce the number of parameters of such models, as the number of studied cases is usually very small in such comparisons. Real data from a comparison among general purpose classifiers is used to show a practical application of our tests.
Abstract:
There has been an increasing interest in the development of new methods using Pareto optimality to deal with multi-objective criteria (for example, accuracy and time complexity). Once one has developed an approach to a problem of interest, the problem is then how to compare it with the state of the art. In machine learning, algorithms are typically evaluated by comparing their performance on different data sets by means of statistical tests. Standard tests used for this purpose can jointly consider neither multiple performance measures nor multiple competitors at once. The aim of this paper is to resolve these issues by developing statistical procedures that are able to account for multiple competing measures at the same time and to compare multiple algorithms all together. In particular, we develop two tests: a frequentist procedure based on the generalized likelihood-ratio test and a Bayesian procedure based on a multinomial-Dirichlet conjugate model. We further extend them by discovering conditional independences among measures to reduce the number of parameters of such models, as the number of studied cases is usually very small in such comparisons. Data from a comparison among general purpose classifiers is used to show a practical application of our tests.
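As a rough illustration of the multinomial-Dirichlet idea mentioned in the abstract above (not the authors' actual procedure), suppose two algorithms A and B are compared on two measures across several data sets, and each data set falls into one of four outcome categories: A better on both measures, A better on the first only, A better on the second only, B better on both. The category layout, the uniform prior, and all function names below are assumptions made for this sketch. With a Dirichlet prior on the category probabilities, the posterior is again Dirichlet, and the probability that a given category is the most likely one can be estimated by sampling.

```python
import random


def dirichlet_posterior(counts, prior=1.0):
    """Conjugate update: Dirichlet(prior) prior + multinomial counts."""
    return [c + prior for c in counts]


def posterior_mean(alpha):
    """Mean of a Dirichlet distribution with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]


def sample_dirichlet(alpha, rng):
    """Draw one probability vector via normalised Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def prob_category_is_most_likely(alpha, index, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(theta[index] is the largest component)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        theta = sample_dirichlet(alpha, rng)
        if theta[index] == max(theta):
            hits += 1
    return hits / n_samples


# Hypothetical counts over 10 data sets:
# [A better on both, A better on measure 1 only,
#  A better on measure 2 only, B better on both]
counts = [8, 1, 1, 0]
alpha = dirichlet_posterior(counts)
print(posterior_mean(alpha))
print(prob_category_is_most_likely(alpha, index=0))
```

The counts here are invented for illustration; with real comparisons they would come from tallying, per data set, which algorithm wins on each measure.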
Abstract:
Purpose – The aim of this paper is to explore the issues involved in developing and applying performance management approaches within a large UK public sector department using a multiple stakeholder perspective and an accompanying theoretical framework. Design/methodology/approach – An initial short questionnaire was used to determine perceptions about the implementation and effectiveness of the new performance management system across the organisation. In total, 700 questionnaires were distributed. Running concurrently with an ethnographic approach, and informed by the questionnaire responses, was a series of semi-structured interviews and focus groups. Findings – Staff at all levels had an understanding of the new system and perceived it as being beneficial. However, there were concerns that the approach was not continuously managed throughout the year and was in danger of becoming an annual event, rather than an ongoing process. Furthermore, the change process seemed to have advanced without corresponding changes to appraisal and reward and recognition systems. Thus, the business objectives were not aligned with motivating factors within the organisation. Research limitations/implications – Additional research to test the validity and usefulness of the theoretical model, as discussed in this paper, would be beneficial. Practical implications – The strategic integration of the stakeholder performance measures and scorecards was found to be essential to producing an overall stakeholder-driven strategy within the case study organisation. Originality/value – This paper discusses in detail the approach adopted and the progress made by one large UK public sector organisation, as it attempts to develop better relationships with all of its stakeholders and hence improve its performance. This paper provides a concerted attempt to link theory with practice.
Abstract:
Purpose – The purpose of this paper is to summarize the accumulated body of knowledge on the performance of new product projects and provide directions for further research. Design/methodology/approach – Using a refined classification of antecedents of new product project performance, the research results in the literature are meta-analyzed in order to identify the strength and stability of predictor-performance relationships. Findings – The results reveal that 22 variables have a significant relationship with new product project performance, of which only 12 variables have a sizable relationship. In order of importance these factors are the degree of organizational interaction, R&D and marketing interface, general product development proficiency, product advantage, financial/business analysis, technical proficiency, management skill, marketing proficiency, market orientation, technology synergy, project manager competency and launch activities. Of the 34 variables, 16 predictors show potential for moderator effects. Research limitations/implications – The validity of the results is constrained by publication bias and heterogeneity of performance measures, and directions for the presentation of data in future empirical publications are provided. Practical implications – This study helps new product project managers in understanding and managing the performance of new product development projects. Originality/value – This paper provides unique insights into the importance of predictors of new product performance at the project level. Furthermore, it identifies which predictor-performance relations are contingent on other factors.
Abstract:
Complex collaboration in rapidly changing business environments creates challenges for management capability in Utility Horizontal Supply Chains (UHSCs) involving the deploying and evolving of performance measures. The aim of the study is twofold. First, there is a need to explore how management capability can be developed and used to deploy and evolve Performance Measurement (PM), both across a UHSC and within its constituent organisations, drawing upon a theoretical nexus of Dynamic Capability (DC) theory and complementary Goal Theory. Second, the study aims to contribute to knowledge by empirically building theory using these constructs to show the management motivations and behaviours within PM-based DCs. The methodology uses an interpretive, theory-building, multiple-case approach (n=3) as part of a UHSC. The data collection methods include interviews (n=54), focus groups (n=10), document analysis and participant observation (reflective learning logs) over a five-year period, giving longitudinal data. The empirical findings lead to the development of a conceptual framework showing that management capabilities in driving PM deployment and evolution can be represented as multilevel renewal and incremental Dynamic Capabilities, which can be further understood in terms of motivation and behaviour by Goal-Theoretic constructs. In addition, three interrelated cross-cutting themes of management capabilities in consensus building, goal setting and resource change were identified. These management capabilities require carefully planned development and nurturing within the UHSC.
Abstract:
Sexual selection theory suggests that females might prefer males on the basis of testosterone (T)-dependent secondary sexual traits such as song. Correlational studies have linked high plasma T-levels to high diurnal song output. This has been confirmed in experiments where T-levels were kept high at times when natural T-levels have decreased. However, surprisingly little is known about the relation between T-levels during the early breeding season and song. In many passerine birds males sing at a high rate at dawn early in the breeding season, referred to as the dawn chorus. In blue tits (Parus caeruleus), the dawn chorus coincides with the fertile period of the female, whereas diurnal song occurs throughout the breeding season. Previous studies on blue tits showed that characteristics of the dawn chorus correlate with male reproductive success. We experimentally elevated plasma T-levels in male blue tits during the pre-fertile and fertile period. Our aim was to test whether increased plasma T-levels affect dawn song characteristics and increase the amount of diurnal song. Although T-implants successfully raised circulating T-levels, we did not find any difference between T- and control males in temporal performance measures of dawn song or in diurnal song output. Our results suggest that either there is no direct causal link between song output or quality and individual T-levels, or experimental manipulations of T-levels using implants do not permit detection of such effects during the early breeding season. Although we cannot exclude that individual T-levels are causally linked to other (e.g. structural) song parameters, our results cast doubt on T-dependence as the mechanism that enforces honesty on song as a sexually selected trait.
Abstract:
Logistic regression and Gaussian mixture model (GMM) classifiers have been trained to estimate the probability of acute myocardial infarction (AMI) in patients based upon the concentrations of a panel of cardiac markers. The panel consists of two new markers, fatty acid binding protein (FABP) and glycogen phosphorylase BB (GPBB), in addition to the traditional cardiac troponin I (cTnI), creatine kinase MB (CKMB) and myoglobin. The effect of using principal component analysis (PCA) and Fisher discriminant analysis (FDA) to preprocess the marker concentrations was also investigated. The need for classifiers to give an accurate estimate of the probability of AMI is argued and three categories of performance measure are described, namely discriminatory ability, sharpness, and reliability. Numerical performance measures for each category are given and applied. The optimum classifier, based solely upon the samples taken on admission, was the logistic regression classifier using FDA preprocessing. This gave an accuracy of 0.85 (95% confidence interval: 0.78-0.91) and a normalised Brier score of 0.89. When samples at both admission and a further time, 1-6 h later, were included, the performance increased significantly, showing that logistic regression classifiers can indeed use the information from the five cardiac markers to accurately and reliably estimate the probability of AMI. © Springer-Verlag London Limited 2008.
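For readers unfamiliar with the measures named in the abstract above, the following minimal sketch computes classification accuracy and the plain (unnormalised) Brier score for a set of predicted probabilities. The abstract does not say how its Brier score was normalised, so no normalisation is attempted here; the function names, the 0.5 threshold, and the example data are assumptions for illustration only.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1).

    0.0 is a perfect score; lower is better.
    """
    if len(probabilities) != len(outcomes):
        raise ValueError("inputs must have the same length")
    squared = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)


def accuracy(probabilities, outcomes, threshold=0.5):
    """Fraction of cases where thresholding the probability matches the outcome."""
    correct = sum(
        1 for p, y in zip(probabilities, outcomes) if (p >= threshold) == bool(y)
    )
    return correct / len(outcomes)


# Hypothetical predicted probabilities of AMI and observed outcomes.
predicted = [0.9, 0.2, 0.7, 0.4]
observed = [1, 0, 1, 1]
print(brier_score(predicted, observed))
print(accuracy(predicted, observed))
```

Accuracy only looks at which side of the threshold a prediction falls on, whereas the Brier score also penalises probabilities that are correct in direction but poorly calibrated.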
Abstract:
In the past few decades, Coxian phase-type distributions have become increasingly popular as a means of representing survival times. In healthcare, they are considered suitable for modelling the length of stay of patients in hospital and more recently for modelling the patient waiting times in Accident and Emergency Departments. The Coxian phase-type distribution has not only been shown to provide a good representation of real survival data, but its interpretation seems reasonably intuitive to the medical experts. The drawback, however, is fitting the distribution to the data. There have been many attempts at accurately estimating the Coxian phase-type parameters. This paper examines the most promising of the approaches reported in the literature to determine the most accurate. Three performance measures are introduced to assess the fitting process of the algorithms along with the likelihood values and AIC to examine the goodness of fit and complexity of the model. Previous research suggests that the fitting process is strongly influenced by the initial parameter estimates and the data itself being quite variable. To overcome this, one experiment in this research paper will use the same initial parameter values for each estimation and perform the fits on the data simulated from a Coxian phase-type distribution with known parameters.
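The simulation step mentioned at the end of the abstract above can be sketched as follows. A Coxian phase-type distribution is the time to absorption of a chain that starts in phase 1 and, from phase i, either moves on to phase i+1 (rate lambda_i) or is absorbed (rate mu_i); the last phase can only be absorbed. The parameter values, function names, and sample size below are assumptions for illustration and are not taken from the paper.

```python
import random


def simulate_coxian(lambdas, mus, rng):
    """Draw one time-to-absorption from a Coxian phase-type distribution.

    lambdas[i]: rate of moving from phase i to phase i+1 (must be 0 for the last phase)
    mus[i]:     rate of absorption from phase i
    """
    t = 0.0
    for lam, mu in zip(lambdas, mus):
        total = lam + mu
        t += rng.expovariate(total)      # sojourn time in the current phase
        if rng.random() < mu / total:    # absorbed rather than moving on
            break
    return t


def coxian_mean(lambdas, mus):
    """Analytic mean: sum over phases of P(reach phase) * mean sojourn time."""
    mean, reach = 0.0, 1.0
    for lam, mu in zip(lambdas, mus):
        total = lam + mu
        mean += reach / total
        reach *= lam / total
    return mean


# Hypothetical two-phase example with known parameters.
lambdas = [1.0, 0.0]
mus = [1.0, 0.5]
rng = random.Random(42)
sample = [simulate_coxian(lambdas, mus, rng) for _ in range(20000)]
print(sum(sample) / len(sample), coxian_mean(lambdas, mus))
```

Comparing the sample mean with the analytic mean is a quick sanity check on simulated data before it is handed to any fitting algorithm.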
Abstract:
The increasing importance placed upon regional development and the knowledge-based economy as economic growth stimuli has led to a changing role for Universities and their interaction with the business community through (though not limited to) the transfer of technology from academia to industry. With the emergence of Local Enterprise Partnerships (LEPs) replacing the Regional Development Agencies (RDAs), there is a need for policy and practice going forward to be clearly informed by a critique of the TTO (Technology Transfer Office)–RDA stakeholder relationship in a lessons learned approach so that LEPs can benefit from a faster learning curve. Thus, the aim of this paper is to examine the stakeholder relationship between three regional universities, in the context of their TTOs, and the RDA with a view to determining lessons learned for the emerging LEP approach. Although the issues raised are contextual, the abstracted stakeholder conceptualisation of the TTO–RDA relationship should enable wider generalisation of the issues raised beyond the UK. Stakeholder theory relationship and stage development models are used to guide a repeat interview study of the TTO and RDA stakeholder groupings. The findings, interpreted using combined category and stage based stakeholder models, show how the longitudinal development of the TTO–RDA stakeholder relationship for each case has progressed through different stakeholder pathways, and stages where specific targeting of funding was dependent on the stakeholder stage. Greater targeted policy and funding, based on the stakeholder relationship approach, led to the development of joint mechanisms and a closer alignment of performance measures between the TTO and the RDA. However, over-reliance on the unitary nature of the TTO–RDA relationship may lead to a lack of cultivation and dependency for funding from other stakeholders.
Abstract:
Coxian phase-type distributions are becoming a popular means of representing survival times within a health care environment. They are favoured as they show a distribution as a system of phases and can allow for an easy visual representation of the rate of flow of patients through a system. Difficulties arise, however, in determining the parameter estimates of the Coxian phase-type distribution. This paper examines ways of making the fitting of the Coxian phase-type distribution less cumbersome by outlining different software packages and algorithms available to perform the fit and assessing their capabilities through a number of performance measures. The performance measures rate each of the methods and help in identifying the more efficient ones. Conclusions drawn from these performance measures suggest SAS to be the most robust package. It has a high rate of convergence in each of the four example model fits considered, short computational times, detailed output, convergence criteria options, along with a succinct ability to switch between different algorithms.
Abstract:
Improvement in the quality of end-of-life (EOL) care is a priority health care issue since serious deficiencies in quality of care have been reported across care settings. Increasing pressure is now focused on Canadian health care organizations to be accountable for the quality of palliative and EOL care delivered. Numerous domains of quality EOL care upon which to create accountability frameworks are now published, with some derived from the patient/family perspective. There is a need to reach common ground on the domains of quality EOL care valued by patients and families in order to develop consistent performance measures and set priorities for health care improvement. This paper describes a meta-synthesis study to develop a common conceptual framework of quality EOL care integrating attributes of quality valued by patients and their families. © 2005 Centre for Bioethics, IRCM.
Abstract:
Model selection between competing models is a key consideration in the discovery of prognostic multigene signatures. The use of appropriate statistical performance measures as well as verification of biological significance of the signatures is imperative to maximise the chance of external validation of the generated signatures. Current approaches in time-to-event studies often use only a single measure of performance in model selection, such as logrank test p-values, or dichotomise the follow-up times at some phase of the study to facilitate signature discovery. In this study we improve the prognostic signature discovery process through the application of the multivariate partial Cox model combined with the concordance index, hazard ratio of predictions, independence from available clinical covariates and biological enrichment as measures of signature performance. The proposed framework was applied to discover prognostic multigene signatures from early breast cancer data. The partial Cox model combined with the multiple performance measures was used both in guiding the selection of the optimal panel of prognostic genes and in predicting risk within cross validation, without dichotomising the follow-up times at any stage. The signatures were successfully externally cross validated in independent breast cancer datasets, yielding a hazard ratio of 2.55 [1.44, 4.51] for the top ranking signature.
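One of the measures named in the abstract above, the concordance index, can be illustrated with a simple pairwise computation in the style of Harrell's C for right-censored data. The abstract does not state which variant the authors used, so this is only a sketch under that assumption; the function name, the handling of tied times (such pairs are skipped), and the example data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Pairwise concordance between predicted risk and observed survival.

    times:  follow-up time for each subject
    events: 1 if the event was observed, 0 if the subject was censored
    risks:  predicted risk score (higher means earlier event expected)

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time. Tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data: the third subject is censored.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
print(concordance_index(times, events, risks=[4, 3, 2, 1]))
print(concordance_index(times, events, risks=[1, 2, 3, 4]))
```

Because the index works directly on ordered follow-up times, it does not require dichotomising them, which is the property the abstract emphasises.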
Abstract:
When applying biometric algorithms to forensic verification, false acceptance and false rejection can mean a failure to identify a criminal, or worse, lead to the prosecution of individuals for crimes they did not commit. It is therefore critical that biometric evaluations be performed as accurately as possible to determine their legitimacy as a forensic tool. This paper argues that, for forensic verification scenarios, traditional performance measures are insufficiently accurate. This inaccuracy occurs because existing verification evaluations implicitly assume that an imposter claiming a false identity would claim a random identity rather than consciously selecting a target to impersonate. In addition to describing this new vulnerability, the paper describes a novel Targeted FAR metric that combines the traditional False Acceptance Rate (FAR) measure with a term that indicates how performance degrades with the number of potential targets. The paper includes an evaluation of the effects of targeted impersonation on an existing academic face verification system. This evaluation reveals that even with a relatively small number of targets false acceptance rates can increase significantly, making the analysed biometric systems unreliable.
Outperformance in exchange-traded fund pricing deviations: Generalized control of data snooping bias
Abstract:
An investigation into exchange-traded fund (ETF) outperformance during the period 2008-2012 is undertaken utilizing a data set of 288 U.S. traded securities. ETFs are tested for net asset value (NAV) premium, underlying index and market benchmark outperformance, with Sharpe, Treynor, and Sortino ratios employed as risk-adjusted performance measures. A key contribution is the application of an innovative generalized stepdown procedure in controlling for data snooping bias. We find that a large proportion of optimized replication and debt asset class ETFs display risk-adjusted premiums with energy and precious metals focused funds outperforming the S&P 500 market benchmark.
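The three risk-adjusted measures named in the abstract above can be sketched from their textbook definitions: Sharpe divides mean excess return by its standard deviation, Treynor divides it by the fund's beta against a market series, and Sortino divides it by downside deviation only. The abstract does not give the exact conventions used (sampling frequency, annualisation, target return), so the choices, function names, and return series below are assumptions for illustration.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    m = mean(excess)
    variance = sum((e - m) ** 2 for e in excess) / (len(excess) - 1)
    return m / math.sqrt(variance)


def sortino_ratio(returns, target=0.0):
    """Mean return above target divided by downside deviation.

    Downside deviation uses only shortfalls below the target,
    averaged over all periods.
    """
    excess = [r - target for r in returns]
    downside = math.sqrt(mean([min(0.0, e) ** 2 for e in excess]))
    return mean(excess) / downside


def treynor_ratio(returns, market, risk_free=0.0):
    """Mean excess return divided by beta (covariance with market / market variance)."""
    mr, mm = mean(returns), mean(market)
    covariance = sum((r - mr) * (m - mm) for r, m in zip(returns, market))
    market_variance = sum((m - mm) ** 2 for m in market)
    beta = covariance / market_variance
    return (mr - risk_free) / beta


# Hypothetical periodic returns for a fund and its market benchmark.
fund = [0.02, -0.01, 0.03, 0.00]
benchmark = [0.01, -0.02, 0.02, 0.01]
print(sharpe_ratio(fund), sortino_ratio(fund), treynor_ratio(fund, benchmark))
```

The Sortino ratio is undefined when no return falls below the target, since the downside deviation is then zero; a production implementation would need to handle that case explicitly.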