15 results for Parametric Models
in Aston University Research Archive
Abstract:
Practitioners assess the performance of entities in increasingly large and complicated datasets. If non-parametric models, such as Data Envelopment Analysis, were ever considered simple push-button technologies, this is impossible when many variables are available or when data have to be compiled from several sources. This paper introduces the 'COOPER-framework', a comprehensive model for carrying out non-parametric projects. The framework consists of six interrelated phases: Concepts and objectives, On structuring data, Operational models, Performance comparison model, Evaluation, and Result and deployment. Each phase describes the steps a researcher should work through for a well-defined and repeatable analysis. The COOPER-framework provides the novice analyst with guidance, structure and advice for a sound non-parametric analysis, while the more experienced analyst benefits from a checklist that ensures important issues are not overlooked. In addition, the use of a standardized framework makes non-parametric assessments more reliable, more repeatable, more manageable, faster and less costly. © 2010 Elsevier B.V. All rights reserved.
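For orientation (not part of the paper itself), the kind of performance comparison model that the 'Operational models' and 'Performance comparison model' phases typically yield is the standard input-oriented CCR ratio formulation of DEA:

\[
\max_{u,v}\ \theta_o = \frac{\sum_r u_r\,y_{ro}}{\sum_i v_i\,x_{io}}
\quad\text{subject to}\quad
\frac{\sum_r u_r\,y_{rj}}{\sum_i v_i\,x_{ij}} \le 1 \ \ \forall j,
\qquad u_r, v_i \ge 0,
\]

where x_{ij} and y_{rj} denote the inputs and outputs of decision making unit j and o indexes the unit under evaluation; the later phases of the framework then concern evaluating and deploying the resulting efficiency scores.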
Abstract:
In recent years there has been an increased interest in applying non-parametric methods to real-world problems. Significant research has been devoted to Gaussian processes (GPs) due to their increased flexibility when compared with parametric models. These methods use Bayesian learning, which generally leads to analytically intractable posteriors. This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior. In the first step we adapt Bayesian online learning to GPs: the final approximation to the posterior is the result of propagating the first and second moments of intermediate posteriors obtained by combining a new example with the previous approximation. The propagation of functional forms is solved by showing the existence of a parametrisation of the posterior moments that uses combinations of the kernel function at the training points, transforming the Bayesian online learning of functions into a parametric formulation. The drawback is the prohibitive quadratic scaling of the number of parameters with the size of the data, making the method inapplicable to large datasets. The second step solves the problem of the exploding parameter size and makes GPs applicable to arbitrarily large datasets. The approximation is based on a measure of distance between two GPs, the KL-divergence between GPs. This second approximation constrains the GP so that only a small subset of the whole training dataset is used to represent it. This subset is called the Basis Vector (BV) set, and the resulting GP is a sparse approximation to the true posterior. As this sparsity is based on KL-minimisation, it is probabilistic and independent of the way the posterior approximation from the first step is obtained. We combine the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations for each input, thus approximating a batch solution. The resulting sparse learning algorithm is a generic one: for different problems we only change the likelihood. The algorithm is applied to a variety of problems, and we examine its performance on classical regression and classification tasks as well as on data assimilation and a simple density estimation problem.
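As an illustration of the parametrisation described above, the following is a minimal sketch (assuming a Gaussian likelihood and omitting the Basis Vector pruning step) of online GP regression in which the posterior mean and covariance are maintained as combinations of the kernel evaluated at the processed inputs:

import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel between two inputs."""
    return np.exp(-0.5 * np.sum((a - b) ** 2) / ell ** 2)

def online_gp_regression(X, y, noise_var=0.1, kernel=rbf):
    """Process (x, y) pairs one at a time, maintaining the parametrisation
    mean(x)    = sum_i alpha_i k(x, x_i)
    cov(x, x') = k(x, x') + sum_ij k(x, x_i) C_ij k(x_j, x').
    The sparse BV pruning of the thesis is omitted in this sketch."""
    alpha = np.zeros(0)          # coefficients of the posterior mean
    C = np.zeros((0, 0))         # coefficients of the posterior covariance correction
    seen = []                    # inputs processed so far
    for x_new, y_new in zip(X, y):
        k_vec = np.array([kernel(x_new, xb) for xb in seen])
        k_ss = kernel(x_new, x_new)
        mean = k_vec @ alpha if len(seen) else 0.0
        var = k_ss + (k_vec @ C @ k_vec if len(seen) else 0.0)
        # Gaussian-likelihood update coefficients (first and second moments).
        q = (y_new - mean) / (var + noise_var)
        r = -1.0 / (var + noise_var)
        s = np.append(C @ k_vec if len(seen) else np.zeros(0), 1.0)
        alpha = np.append(alpha, 0.0) + q * s
        C_ext = np.zeros((len(seen) + 1, len(seen) + 1))
        C_ext[:len(seen), :len(seen)] = C
        C = C_ext + r * np.outer(s, s)
        seen.append(x_new)
    return seen, alpha, C

The quadratic growth of C with the number of processed examples is exactly the scaling problem that the KL-based sparse BV approximation is introduced to remove.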
Abstract:
Traditionally, geostatistical algorithms are contained within specialist GIS and spatial statistics software. Such packages are often expensive, with relatively complex user interfaces and steep learning curves, and cannot be easily integrated into more complex process chains. In contrast, Service Oriented Architectures (SOAs) promote interoperability and loose coupling within distributed systems, typically using XML (eXtensible Markup Language) and Web services. Web services provide a mechanism for a user to discover and consume a particular process, often as part of a larger process chain, with minimal knowledge of how it works. Wrapping current geostatistical algorithms with a Web service layer would thus increase their accessibility, but raises several complex issues. This paper discusses a solution to providing interoperable, automatic geostatistical processing through the use of Web services, developed in the INTAMAP project (INTeroperability and Automated MAPping). The project builds upon Open Geospatial Consortium standards for describing observations, typically used within sensor webs, and employs Geography Markup Language (GML) to describe the spatial aspect of the problem domain. Thus the interpolation service is extremely flexible, being able to support a range of observation types, and can cope with issues such as change of support and differing error characteristics of sensors (by utilising descriptions of the observation process provided by SensorML). XML is accepted as the de facto standard for describing Web services, due to its expressive capabilities which allow automatic discovery and consumption by ‘naive’ users. Any XML schema employed must therefore be capable of describing every aspect of a service and its processes. However, no schema currently exists that can define the complex uncertainties and modelling choices that are often present within geostatistical analysis. We show a solution to this problem, developing a family of XML schemata to enable the description of a full range of uncertainty types. These types will range from simple statistics, such as the kriging mean and variances, through to a range of probability distributions and non-parametric models, such as realisations from a conditional simulation. By employing these schemata within a Web Processing Service (WPS) we show a prototype moving towards a truly interoperable geostatistical software architecture.
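For context, a minimal sketch (not part of the INTAMAP code base) of the ordinary-kriging mean and variance, the simplest of the uncertainty types the schemata must describe, assuming an exponential covariance model:

import numpy as np

def exp_cov(h, sill=1.0, rng=10.0):
    """Exponential covariance model C(h) = sill * exp(-h / range)."""
    return sill * np.exp(-h / rng)

def ordinary_kriging(obs_xy, obs_z, target_xy, cov=exp_cov):
    """Return the ordinary-kriging prediction and variance at one target location."""
    n = len(obs_z)
    d = np.linalg.norm(obs_xy[:, None, :] - obs_xy[None, :, :], axis=-1)
    K = cov(d)
    k0 = cov(np.linalg.norm(obs_xy - target_xy, axis=-1))
    # Augment the system with the unbiasedness (weights sum to one) constraint.
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = K
    A[n, :n] = 1.0
    A[:n, n] = 1.0
    b = np.append(k0, 1.0)
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    mean = w @ obs_z
    variance = cov(0.0) - w @ k0 - mu
    return mean, variance

# Hypothetical usage with three observations and one prediction location:
obs_xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
obs_z = np.array([1.2, 0.7, 1.9])
print(ordinary_kriging(obs_xy, obs_z, np.array([5.0, 5.0])))

Distributional and simulation-based outputs, such as realisations from a conditional simulation, require the richer uncertainty schemata discussed above.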
Abstract:
Background: The controversy surrounding the non-uniqueness of predictive gene lists (PGL) of small selected subsets of genes from very large potential candidates, as available in DNA microarray experiments, is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high-dimensional spaces. In this work we outline a different approach based around an unsupervised, patient-specific, nonlinear topographic projection of predictive gene lists. Methods: We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques have been used to construct two-dimensional projective visualisation plots of 70-dimensional PGLs per patient. Classifiers are also constructed to identify the prognosis indicator of each patient from the resulting projections, and to investigate whether, a posteriori, the two prognosis groups are separable on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, while basing the projections on those derived from the original dataset. Results: The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevents unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results. Conclusion: The random-correlation effect, induced by selecting small subsets from very high-dimensional interrelated gene expression profiles, leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempt at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
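A minimal sketch of the projection-and-classification step, using scikit-learn's t-SNE and LLE implementations on hypothetical data of the stated dimensions (Neuroscale has no scikit-learn implementation and is omitted here):

import numpy as np
from sklearn.manifold import TSNE, LocallyLinearEmbedding
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(97, 70))      # hypothetical: one 70-gene PGL vector per patient
y = rng.integers(0, 2, size=97)    # hypothetical: good/poor prognosis labels

projections = {
    "SNE (t-SNE)": TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(X),
    "LLE": LocallyLinearEmbedding(n_components=2, n_neighbors=12).fit_transform(X),
}

for name, Z in projections.items():
    # Ask whether the two prognosis groups are separable in the 2-D projection.
    score = cross_val_score(KNeighborsClassifier(n_neighbors=5), Z, y, cv=5).mean()
    print(f"{name}: cross-validated accuracy in projection = {score:.2f}")

On random data such as this the accuracy hovers around chance level, which is the kind of non-separability the study reports for real patient-specific PGLs.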
Abstract:
This paper proposes a semiparametric smooth-coefficient stochastic production frontier model in which all the coefficients are expressed as unknown functions of environmental factors. The inefficiency term is multiplicatively decomposed into a scaling function of the environmental factors and a standard truncated normal random variable. A testing procedure is suggested for the relevance of the environmental factors. A Monte Carlo study shows plausible finite-sample behavior of the proposed estimation and inference procedure. An empirical example is given, in which both the semiparametric and standard parametric models are estimated and the results are compared.
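In generic notation (a sketch of the class of model described, not necessarily the paper's exact specification), a semiparametric smooth-coefficient stochastic frontier with environmental variables z can be written as

\[
y_i = \beta_0(z_i) + x_i^{\top}\beta(z_i) + v_i - u_i,\qquad
v_i \sim N(0,\sigma_v^2),\qquad
u_i = g(z_i)\,u_i^{*},\quad u_i^{*}\sim N^{+}(\mu,\sigma_u^2),
\]

where the coefficient functions \(\beta_0(\cdot),\beta(\cdot)\) and the inefficiency scaling function \(g(\cdot)\) are left unspecified and estimated nonparametrically; the standard parametric frontier is recovered when these functions are constant.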
Abstract:
We examine the empirical evidence for an environmental Kuznets curve using a semiparametric smooth coefficient regression model that allows us to incorporate flexibility in the parameter estimates, while maintaining the basic econometric structure that is typically used to estimate the pollution-income relationship. This allows us to assess the sensitivity to parameter heterogeneity of typical parametric models used to estimate the relationship between pollution and income, as well as identify why the results from such models are seldom found to be robust. Our results confirm that the resulting relationship between pollution and income is fragile; we show that the estimated pollution-income relationship depends substantially on the heterogeneity of the slope coefficients and the parameter values at which the relationship is evaluated. Different sets of parameters obtained from the semiparametric model give rise to many different shapes for the pollution-income relationship that are commonly found in the literature.
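Schematically, and assuming the usual quadratic-in-income EKC baseline (an illustrative form, not the paper's exact equation), the smooth-coefficient specification is

\[
E_{it} = \beta_0(z_{it}) + \beta_1(z_{it})\,y_{it} + \beta_2(z_{it})\,y_{it}^{2} + \varepsilon_{it},
\]

where E is a pollution measure, y is income, and the coefficient functions \(\beta_j(\cdot)\) vary smoothly with the covariates z; the familiar fixed-coefficient parametric EKC is the special case in which the \(\beta_j\) are constants, which is why the estimated pollution-income shape shifts with the parameter values at which it is evaluated.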
Abstract:
Most parametric software cost estimation models used today evolved in the late 1970s and early 1980s. At that time, the dominant software development techniques in use were the early 'structured methods'. Since then, several new systems development paradigms and methods have emerged, one being Jackson Systems Development (JSD). As current cost estimating methods do not take account of these developments, they are not universally applicable and cannot provide adequate estimates of effort and hence cost. In order to address these shortcomings, two new estimation methods have been developed for JSD projects. One of these methods, JSD-FPA, is a top-down estimating method based on the existing MKII function point method. The other method, JSD-COCOMO, is a sizing technique which sizes a project, in terms of lines of code, from the process structure diagrams and thus provides an input to the traditional COCOMO method. The JSD-FPA method allows JSD projects in both the real-time and scientific application areas to be costed, as well as the commercial information systems applications to which FPA is usually applied. The method is based upon a three-dimensional view of a system specification, as opposed to the largely data-oriented view traditionally used by FPA. The method uses counts of various attributes of a JSD specification to develop a metric which provides an indication of the size of the system to be developed. This size metric is then transformed into an estimate of effort by calculating past project productivity and utilising this figure to predict the effort and hence cost of a future project. The effort estimates produced were validated by comparing them against the effort figures for six actual projects. The JSD-COCOMO method uses counts of the levels in a process structure chart as the input to an empirically derived model which transforms them into an estimate of delivered source code instructions.
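As an illustration of the final estimation step only, the sketch below applies the published Basic COCOMO effort equation to a hypothetical size figure of the kind JSD-COCOMO would derive from the process structure diagrams (the JSD sizing rules themselves are not reproduced here):

def basic_cocomo_effort(kdsi, mode="organic"):
    """Basic COCOMO: effort (person-months) = a * KDSI ** b,
    using Boehm's published Basic COCOMO constants."""
    constants = {"organic": (2.4, 1.05),
                 "semidetached": (3.0, 1.12),
                 "embedded": (3.6, 1.20)}
    a, b = constants[mode]
    return a * kdsi ** b

# Hypothetical example: a sizing step (not shown) has estimated 12,500 delivered
# source instructions from the JSD process structure diagrams.
estimated_dsi = 12_500
effort_pm = basic_cocomo_effort(estimated_dsi / 1000.0)
print(f"Estimated effort: {effort_pm:.1f} person-months")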
Abstract:
The spatial distribution of self-employment in India: evidence from semiparametric geoadditive models, Regional Studies. The entrepreneurship literature has rarely considered spatial location as a micro-determinant of occupational choice. It has also ignored self-employment in developing countries. Using Bayesian semiparametric geoadditive techniques, this paper models spatial location as a micro-determinant of self-employment choice in India. The empirical results suggest the presence of spatial occupational neighbourhoods and a clear north–south divide in self-employment when the entire sample is considered; however, spatial variation in the non-agriculture sector disappears to a large extent when individual factors that influence self-employment choice are explicitly controlled. The results further suggest non-linear effects of age, education and wealth on self-employment.
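In generic form (the paper's exact covariates and priors are not reproduced), a Bayesian geoadditive binary-choice model of this kind uses a structured additive predictor such as

\[
P(\text{self-employed}_i = 1) = h(\eta_i),\qquad
\eta_i = f_1(\text{age}_i) + f_2(\text{education}_i) + f_3(\text{wealth}_i) + f_{\text{spat}}(s_i) + x_i^{\top}\gamma,
\]

where h is the response function of a probit or logit link, the \(f_j\) are smooth (e.g. penalised-spline) effects, \(f_{\text{spat}}(s_i)\) is a spatially structured effect for the region \(s_i\) in which individual i lives, and \(x_i^{\top}\gamma\) collects the remaining linear effects.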
Abstract:
The increasing intensity of global competition has led organizations to utilize various types of performance measurement tools for improving the quality of their products and services. Data envelopment analysis (DEA) is a methodology for evaluating and measuring the relative efficiencies of a set of decision making units (DMUs) that use multiple inputs to produce multiple outputs. In conventional DEA with input and/or output ratios, all data take the form of crisp numbers. However, the observed values of data in real-world problems are sometimes expressed as interval ratios. In this paper, we propose two new models: general and multiplicative non-parametric ratio models for DEA problems with interval data. The contributions of this paper are fourfold: (1) we consider input and output data expressed as interval ratios in DEA; (2) we address the gap in the DEA literature for problems not suitable or difficult to model with crisp values; (3) we propose two new DEA models for evaluating the relative efficiencies of DMUs with interval ratios; and (4) we present a case study involving 20 banks with three interval ratios, where the traditional indicators are mostly financial ratios, to demonstrate the applicability and efficacy of the proposed models. © 2011 Elsevier Inc.
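The sketch below is a minimal illustration, not the paper's general or multiplicative ratio models: it computes optimistic and pessimistic efficiency bounds from interval data by evaluating the assessed DMU at its most and least favourable endpoints against the opposite endpoints for the other DMUs, using a standard input-oriented CCR envelopment LP:

import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """Input-oriented CCR efficiency of DMU o. X: (n_dmu, n_in), Y: (n_dmu, n_out)."""
    n, m = X.shape
    s = Y.shape[1]
    # Decision variables: [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0
    A_ub, b_ub = [], []
    for i in range(m):                       # sum_j lambda_j x_ij <= theta * x_io
        A_ub.append(np.concatenate(([-X[o, i]], X[:, i])))
        b_ub.append(0.0)
    for r in range(s):                       # sum_j lambda_j y_rj >= y_ro
        A_ub.append(np.concatenate(([0.0], -Y[:, r])))
        b_ub.append(-Y[o, r])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.x[0]

def interval_efficiency_bounds(XL, XU, YL, YU, o):
    """Pessimistic / optimistic efficiency of DMU o under interval data
    (an endpoint construction in the spirit of interval DEA; illustrative only)."""
    # Optimistic bound: DMU o at its best data, all others at their worst data.
    Xb, Yb = XU.copy(), YL.copy()
    Xb[o], Yb[o] = XL[o], YU[o]
    # Pessimistic bound: DMU o at its worst data, all others at their best data.
    Xw, Yw = XL.copy(), YU.copy()
    Xw[o], Yw[o] = XU[o], YL[o]
    return ccr_efficiency(Xw, Yw, o), ccr_efficiency(Xb, Yb, o)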
Abstract:
Most object-based approaches to Geographical Information Systems (GIS) have concentrated on the representation of geometric properties of objects in terms of fixed geometry. In our road traffic marking application domain we have a requirement not only to represent the static locations of the road markings but also to enforce the associated regulations, which are typically geometric in nature. For example, a give-way line at a pedestrian crossing in the UK must be within 1100-3000 mm of the edge of the crossing pattern. In previous studies of the application of spatial rules (often called 'business logic') in GIS, emphasis has been placed on the representation of topological constraints and data integrity checks. There is very little GIS literature that describes models for geometric rules, although there are some examples in the Computer Aided Design (CAD) literature. This paper introduces some of the ideas from so-called variational CAD models to the GIS application domain, and extends these using a Geography Markup Language (GML) based representation. In our application we have an additional requirement: the geometric rules change often and vary from country to country, so they should be represented in a flexible manner. In this paper we describe an elegant solution to the representation of geometric rules, such as requiring lines to be offset from other objects. The method uses a feature-property model embraced in GML 3.1 and extends the possible relationships in feature collections to permit the application of parameterized geometric constraints to sub-features. We present the parametric rule model we have developed and discuss the advantage of using simple parametric expressions in the rule base. We discuss the possibilities and limitations of our approach and relate our data model to GML 3.1. © 2006 Springer-Verlag Berlin Heidelberg.
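A minimal sketch (a hypothetical rule encoding, not the GML 3.1 representation developed in the paper, and simplifying "within 1100-3000 mm" to the minimum separation between the features) of checking a parameterized offset constraint such as the give-way-line example:

from shapely.geometry import LineString

# Hypothetical geometries, in millimetres.
give_way_line = LineString([(0, 1500), (5000, 1500)])
crossing_edge = LineString([(0, 0), (5000, 0)])

# Parameterized geometric rule: the give-way line must lie within
# [min_offset, max_offset] of the edge of the crossing pattern.
rule = {"min_offset": 1100.0, "max_offset": 3000.0}

offset = give_way_line.distance(crossing_edge)   # minimum separation between the features
compliant = rule["min_offset"] <= offset <= rule["max_offset"]
print(f"offset = {offset:.0f} mm, compliant = {compliant}")

Keeping the offsets as named parameters in the rule, rather than hard-coding them in the geometry checks, is what allows country-specific regulations to be swapped in without changing the evaluation logic.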
Abstract:
The use of Diagnosis Related Groups (DRG) as a mechanism for hospital financing is a currently debated topic in Portugal. The DRG system was scheduled to be initiated by the Health Ministry of Portugal on January 1, 1990 as an instrument for the allocation of public hospital budgets funded by the National Health Service (NHS), and as a method of payment for other third-party payers (e.g., Public Employees (ADSE), private insurers, etc.). Based on experience from other countries such as the United States, it was expected that implementation of this system would result in more efficient hospital resource utilisation and a more equitable distribution of hospital budgets. However, in order to minimise the potentially adverse financial impact on hospitals, the Portuguese Health Ministry decided to gradually phase in the use of the DRG system for budget allocation by using blended hospital-specific and national DRG casemix rates. Since implementation in 1990, the percentage of each hospital's budget based on hospital-specific costs was to decrease, while the percentage based on DRG casemix was to increase. This was scheduled to continue until 1995, when the plan called for allocating yearly budgets on a 50% national and 50% hospital-specific cost basis. While all other non-NHS third-party payers are currently paying based on DRGs, the adoption of DRG casemix as a National Health Service budget-setting tool has been slower than anticipated. There is now some argument in both the political and academic communities as to the appropriateness of DRGs as a budget-setting criterion as well as to their impact on hospital efficiency in Portugal. This paper uses a two-stage procedure to assess the impact of actual DRG payment on the productivity (through its components, i.e., technological change and technical efficiency change) of diagnostic technology in Portuguese hospitals during the years 1992–1994, using both parametric and non-parametric frontier models. We find evidence that the DRG payment system does appear to have had a positive impact on the productivity and technical efficiency of some commonly employed diagnostic technologies in Portugal during this time span.
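For reference, the productivity decomposition referred to above is conventionally the Malmquist index split into technical efficiency change and technological change (a standard form, not necessarily the paper's exact notation):

\[
M_o(x^{t+1},y^{t+1},x^{t},y^{t}) =
\underbrace{\frac{D_o^{t+1}(x^{t+1},y^{t+1})}{D_o^{t}(x^{t},y^{t})}}_{\text{technical efficiency change}}
\times
\underbrace{\left[\frac{D_o^{t}(x^{t+1},y^{t+1})}{D_o^{t+1}(x^{t+1},y^{t+1})}\cdot
\frac{D_o^{t}(x^{t},y^{t})}{D_o^{t+1}(x^{t},y^{t})}\right]^{1/2}}_{\text{technological change}},
\]

where \(D_o^{t}\) denotes the distance function of hospital o relative to the period-t frontier, which can be estimated with either parametric or non-parametric frontier models.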
Abstract:
Often observations are nested within other units. This is particularly the case in the educational sector, where school performance in terms of value added is the result of the school's contribution as well as pupil academic ability and other features relating to the pupil. Traditionally, the literature uses parametric Multi-Level Models (i.e., models that assume a priori a particular functional form for the production process) to estimate the performance of nested entities. This paper discusses the use of the non-parametric Free Disposal Hull model (i.e., one that makes no a priori assumptions about the production process) as an alternative approach. While taking into account contextual characteristics as well as atypical observations, we show how to decompose non-parametrically the overall inefficiency of a pupil into a unit-specific and a higher-level (i.e., school) component. Using a sample of entry and exit attainments of 3017 girls in British ordinary single-sex schools, we test the robustness of the non-parametric and parametric estimates. We find that the two methods agree on the relative measures of the scope for potential attainment improvement. Further, the two methods agree on the variation in pupil attainment and the proportion attributable to the pupil and school levels.
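A minimal sketch of the non-parametric Free Disposal Hull score underlying the decomposition, shown here in an input orientation on hypothetical data (the contextual adjustments and the pupil/school decomposition of the paper are not reproduced):

import numpy as np

def fdh_input_efficiency(X, Y, o):
    """Input-oriented FDH efficiency of unit o by simple enumeration:
    among observed units that weakly dominate o in outputs, find the one
    that allows the largest equiproportionate input contraction.
    X: (n_units, n_inputs), Y: (n_units, n_outputs)."""
    dominators = np.all(Y >= Y[o], axis=1)          # units producing at least o's outputs
    ratios = np.max(X[dominators] / X[o], axis=1)   # contraction needed to reach each dominator
    return float(np.min(ratios))                    # efficiency score in (0, 1]

# Hypothetical example: 5 units, 2 inputs, 1 output (e.g. an attainment measure).
X = np.array([[4.0, 2.0], [2.0, 1.0], [3.0, 3.0], [5.0, 4.0], [2.5, 1.5]])
Y = np.array([[10.0], [12.0], [9.0], [11.0], [12.0]])
for o in range(len(X)):
    print(f"unit {o}: FDH input efficiency = {fdh_input_efficiency(X, Y, o):.2f}")

Because the reference set is restricted to observed units rather than convex combinations of them, the score is robust in the way the abstract describes, and each inefficient unit has an observed dominating peer against which its inefficiency can be attributed.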
Abstract:
The research is concerned with the application of the computer simulation technique to study the performance of reinforced concrete columns in a fire environment. The effect of three different concrete constitutive models, incorporated in the computer simulation, on the structural response of reinforced concrete columns exposed to fire is investigated. The material models differed mainly in the formulation of the mechanical properties of concrete. The results from the simulation have clearly illustrated that a more realistic response of a reinforced concrete column exposed to fire is given by a constitutive model with transient creep or an appropriate strain effect. The relative effect of the three concrete material models is assessed by means of a parametric study, carried out using the results from a series of analyses of columns heated on three sides, which produces substantial thermal gradients. Three different loading conditions were used on the column: axial loading, and eccentric loading inducing moments in the same sense and in the opposite sense to those induced by the thermal gradient. An axially loaded column heated on four sides was also considered. The computer modelling technique adopted separated the thermal and structural responses into two distinct computer programs. A finite element heat transfer analysis was used to determine the thermal response of the reinforced concrete columns when exposed to the ISO 834 furnace environment. The temperature distribution histories obtained were then used in conjunction with a structural response program. The effect of the occurrence of spalling on the structural behaviour of reinforced concrete columns is also investigated. There is general recognition of the potential problems of spalling, but no real investigation into what effect spalling has on the fire resistance of reinforced concrete members. In an attempt to address this situation, a method has been developed to model concrete columns exposed to fire which incorporates the effect of spalling. A total of 224 computer simulations were undertaken by varying the amount of concrete lost during a specified period of exposure to fire. An array of six percentages of spalling was chosen for one set of simulations, while a two-stage progressive spalling regime was used for a second set. The quantification of the reduction in fire resistance of the columns against the amount of spalling, the heating and loading patterns, and the time at which the concrete spalls indicates that the amount of spalling is the most significant variable in the reduction of fire resistance.
Abstract:
The predictive accuracy of competing crude-oil price forecast densities is investigated for the 1994–2006 period. Moving beyond standard ARCH-type models that rely exclusively on past returns, we examine the benefits of utilizing the forward-looking information that is embedded in the prices of derivative contracts. Risk-neutral densities, obtained from panels of crude-oil option prices, are adjusted to reflect real-world risks using either a parametric or a non-parametric calibration approach. The relative performance of the models is evaluated for the entire support of the density, as well as for regions and intervals that are of special interest for the economic agent. We find that non-parametric adjustments of risk-neutral density forecasts perform significantly better than their parametric counterparts. Goodness-of-fit tests and out-of-sample likelihood comparisons favor forecast densities obtained by option prices and non-parametric calibration methods over those constructed using historical returns and simulated ARCH processes. © 2010 Wiley Periodicals, Inc. Jrl Fut Mark 31:727–754, 2011
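In outline, and using standard results rather than the paper's specific estimators, the risk-neutral density is extracted from call prices via the Breeden-Litzenberger relation and then adjusted to a real-world forecast density by a calibration function estimated from probability integral transforms of realised prices:

\[
q_t(K) = e^{r\tau}\,\frac{\partial^2 C_t(K,\tau)}{\partial K^2},
\qquad
f_t(x) = q_t(x)\,c\bigl(Q_t(x)\bigr),
\]

where \(C_t(K,\tau)\) is the call price as a function of strike K at horizon \(\tau\), \(Q_t\) is the risk-neutral CDF, and c is a calibration density on [0,1] that can be specified parametrically (e.g. a beta density) or estimated non-parametrically, corresponding to the two calibration approaches compared above.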
Abstract:
We study the dynamics of a growing crystalline facet where the growth mechanism is controlled by the geometry of the local curvature. A continuum model in (2+1) dimensions, developed in analogy with the Kardar-Parisi-Zhang (KPZ) model, is considered for the purpose. Following standard coarse-graining procedures, it is shown that in the large-time, long-distance limit the continuum model predicts a curvature-independent KPZ phase, thereby suppressing all explicit effects of curvature and local pinning in the system in the "perturbative" limit. A direct numerical integration of this growth equation, in 1+1 dimensions, supports this observation below a critical parametric range, above which generic instabilities, in the form of isolated pillared structures, lead to deviations from standard scaling behaviour. Possibilities of controlling this instability by introducing statistically "irrelevant" (in the renormalisation-group sense) higher-order nonlinearities are also discussed.
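For reference, the standard (2+1)-dimensional KPZ equation against which the coarse-grained model is compared:

\[
\frac{\partial h}{\partial t} = \nu \nabla^{2} h + \frac{\lambda}{2}\,(\nabla h)^{2} + \eta(\mathbf{x},t),
\qquad
\langle \eta(\mathbf{x},t)\,\eta(\mathbf{x}',t')\rangle = 2D\,\delta^{2}(\mathbf{x}-\mathbf{x}')\,\delta(t-t'),
\]

for the height field \(h(\mathbf{x},t)\); the analysis above indicates that the curvature-dependent terms of the growth model are irrelevant in the renormalisation-group sense, so the large-scale behaviour falls in this KPZ class below the instability threshold.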