923 results for model selection in binary regression
Abstract:
We derive a mean field algorithm for binary classification with Gaussian processes which is based on the TAP approach originally proposed in Statistical Physics of disordered systems. The theory also yields an approximate leave-one-out estimator for the generalization error which is computed with no extra computational cost. We show that from the TAP approach, it is possible to derive both a simpler 'naive' mean field theory and support vector machines (SVM) as limiting cases. For both mean field algorithms and support vector machines, simulation results for three small benchmark data sets are presented. They show (1) that one may get state-of-the-art performance by using the leave-one-out estimator for model selection and (2) that the built-in leave-one-out estimators are extremely precise when compared to the exact leave-one-out estimate. The latter result is taken as strong support for the internal consistency of the mean field approach.
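The idea of selecting a model by its leave-one-out error can be sketched in a few lines. This is only an illustration under stated assumptions: it computes the exact (brute-force) leave-one-out error rather than the approximate TAP estimator of the abstract, and it swaps in a toy k-nearest-neighbour classifier on made-up one-dimensional data in place of a Gaussian process. The names `loo_error` and `knn_predictor` are hypothetical.

```python
def loo_error(xs, ys, fit_predict):
    """Exact leave-one-out error: hold out each point, train on the rest, test on it."""
    mistakes = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if fit_predict(train_x, train_y, xs[i]) != ys[i]:
            mistakes += 1
    return mistakes / len(xs)


def knn_predictor(k):
    """Toy stand-in classifier (an assumption, not the method of the abstract)."""
    def fit_predict(train_x, train_y, x):
        # Indices of the k training points closest to x.
        nearest = sorted(range(len(train_x)), key=lambda j: abs(train_x[j] - x))[:k]
        vote = sum(train_y[j] for j in nearest)  # labels are +1 / -1, k is odd
        return 1 if vote > 0 else -1
    return fit_predict


# Made-up data: two clusters with one mislabelled point at x = 0.5.
xs = [0.0, 0.2, 0.5, 0.7, 1.1, 2.2, 2.4, 2.7, 3.1]
ys = [-1, -1, 1, -1, -1, 1, 1, 1, 1]

# Model selection: keep the candidate k with the smallest leave-one-out error.
errors = {k: loo_error(xs, ys, knn_predictor(k)) for k in (1, 3)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

The brute-force loop retrains once per data point, which is exactly the cost that a built-in estimator such as the one described in the abstract avoids.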
Abstract:
Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. © 2006 IEEE.
Abstract:
Predicated on the assumption that employee careerist orientation resulting from organizational actions to cut costs constitutes a potential threat to their long-term profitability and success, this study proposed and tested a social exchange model of careerist orientation in the People's Republic of China. Specifically, it was hypothesized that organizational justice and career growth opportunities will be related to careerist orientation, but the relationship will be mediated by trust in employer. Structural equation modeling results provided support for the model. Trust in organization fully mediated the relationship between careerist orientation and its antecedents.
Abstract:
This empirical study examines the extent of non-linearity in a multivariate model of monthly financial series. To capture the conditional heteroscedasticity in the series, both the GARCH(1,1) and GARCH(1,1)-in-mean models are employed. The conditional errors are assumed to follow the normal and Student-t distributions. The non-linearity in the residuals of a standard OLS regression is also assessed. It is found that the OLS residuals as well as the conditional errors of the GARCH models exhibit strong non-linearity. Under the Student density, the extent of non-linearity in the GARCH conditional errors was generally similar to that of the standard OLS residuals. The GARCH-in-mean regression generated the worst out-of-sample forecasts.
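For readers unfamiliar with GARCH(1,1), the conditional variance follows the standard recursion sigma2[t] = omega + alpha * eps[t-1]^2 + beta * sigma2[t-1]. The sketch below is a minimal illustration of that recursion with a Gaussian log-likelihood; the function names, the parameter values and the choice to start from the unconditional variance are assumptions for illustration, not details taken from the study.

```python
import math


def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) process for a residual series.

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]
    The recursion is started (by assumption) at the unconditional variance
    omega / (1 - alpha - beta), which requires alpha + beta < 1.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for eps in residuals[:-1]:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


def gaussian_loglik(residuals, sigma2):
    """Log-likelihood of the residuals under normal conditional errors."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * s) + e * e / s)
        for e, s in zip(residuals, sigma2)
    )


# Illustrative values only: a shock of size 2 raises the next period's variance.
residuals = [0.0, 2.0, 0.0]
sigma2 = garch11_variance(residuals, omega=0.1, alpha=0.1, beta=0.8)
loglik = gaussian_loglik(residuals, sigma2)
print(sigma2, loglik)
```

In practice omega, alpha and beta would be chosen to maximise the log-likelihood; a Student-t density would replace `gaussian_loglik` for the second error assumption mentioned above.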
Abstract:
We report the case of a neologistic jargonaphasic and ask whether her target-related and abstruse neologisms are the result of a single deficit, which affects some items more severely than others, or two deficits: one to lexical access and the other to phonological encoding. We analyse both correct/incorrect performance and errors and apply both traditional and formal methods (maximum-likelihood estimation and model selection). All evidence points to a single deficit at the level of phonological encoding. Further characteristics are used to constrain the locus still further. V.S. does not show the type of length effect expected of a memory component, nor the pattern of errors associated with an articulatory deficit. We conclude that her neologistic errors can result from a single deficit at a level of phonological encoding that immediately follows lexical access where segments are represented in terms of their features. We do not conclude, however, that this is the only possible locus that will produce phonological errors in aphasia, or, indeed, jargonaphasia.
Abstract:
When constructing and using environmental models, it is typical that many of the inputs to the models will not be known perfectly. In some cases, it will be possible to make observations, or occasionally to use physics-based uncertainty propagation, to ascertain the uncertainty on these inputs. However, such observations are often unavailable or even impossible to make, and another approach to characterising the uncertainty on the inputs must be sought. Even when observations are available, if the analysis is being carried out within a Bayesian framework then prior distributions will have to be specified. One option for gathering or at least estimating this information is to employ expert elicitation. Expert elicitation is well studied within statistics and psychology and involves the assessment of the beliefs of a group of experts about an uncertain quantity (for example, an input or parameter within a model), typically in terms of obtaining a probability distribution. One of the challenges in expert elicitation is to minimise the biases that might enter into the judgements made by the individual experts, and then to come to a consensus decision within the group of experts. Effort is made in the elicitation exercise to prevent biases clouding the judgements through well-devised questioning schemes. It is also important that, when reaching a consensus, the experts are exposed to the knowledge of the others in the group. Within the FP7 UncertWeb project (http://www.uncertweb.org/), there is a requirement to build a Web-based tool for expert elicitation. In this paper, we discuss some of the issues of building a Web-based elicitation system - both the technological aspects and the statistical and scientific issues. In particular, we demonstrate two tools: a Web-based system for the elicitation of continuous random variables and a system designed to elicit uncertainty about categorical random variables in the setting of landcover classification uncertainty.
The first of these examples is a generic tool developed to elicit uncertainty about univariate continuous random variables. It is designed to be used within an application context and extends the existing SHELF method, adding a web interface and access to metadata. The tool is developed so that it can be readily integrated with environmental models exposed as web services. The second example was developed for the TREES-3 initiative which monitors tropical landcover change through ground-truthing at confluence points. It allows experts to validate the accuracy of automated landcover classifications using site-specific imagery and local knowledge. Experts may provide uncertainty information at various levels: from a general rating of their confidence in a site validation to a numerical ranking of the possible landcover types within a segment. A key challenge in the web-based setting is the design of the user interface and the method of interacting between the problem owner and the problem experts. We show the workflow of the elicitation tool, and show how we can represent the final elicited distributions and confusion matrices using UncertML, ready for integration into uncertainty enabled workflows. We also show how the metadata associated with the elicitation exercise is captured and can be referenced from the elicited result, providing crucial lineage information and thus traceability in the decision making process.
Abstract:
This thesis presents a new approach to designing large organizational databases. The approach emphasizes the need for a holistic approach to the design process. The development of the proposed approach was based on a comprehensive examination of the issues of relevance to the design and utilization of databases. Such issues include conceptual modelling, organization theory, and semantic theory. The conceptual modelling approach presented in this thesis is developed over three design stages, or model perspectives. In the semantic perspective, concept definitions were developed based on established semantic principles. Such definitions rely on meaning - provided by intension and extension - to determine intrinsic conceptual definitions. A tool, called meaning-based classification (MBC), is devised to classify concepts based on meaning. Concept classes are then integrated using concept definitions and a set of semantic relations which rely on concept content and form. In the application perspective, relationships are semantically defined according to the application environment. Relationship definitions include explicit relationship properties and constraints. The organization perspective introduces a new set of relations specifically developed to maintain conformity of conceptual abstractions with the nature of information abstractions implied by user requirements throughout the organization. Such relations are based on the stratification of work hierarchies, defined elsewhere in the thesis. Finally, an example of an application of the proposed approach is presented to illustrate the applicability and practicality of the modelling approach.
Abstract:
Emotional lability and mood dysregulation characterize bipolar disorder (BD), yet no study has examined effective connectivity between parahippocampal gyrus and prefrontal cortical regions in ventromedial and dorsal/lateral neural systems subserving mood regulation in BD. Participants comprised 46 individuals (age range: 18-56 years): 21 with a DSM-IV diagnosis of BD, type I currently remitted; and 25 age- and gender-matched healthy controls (HC). Participants performed an event-related functional magnetic resonance imaging paradigm, viewing mild and intense happy and neutral faces. We employed dynamic causal modeling (DCM) to identify significant alterations in effective connectivity between BD and HC. Bayes model selection was used to determine the best model. The right parahippocampal gyrus (PHG) and right subgenual cingulate gyrus (sgCG) were included as representative regions of the ventromedial neural system. The right dorsolateral prefrontal cortex (DLPFC) region was included as representative of the dorsal/lateral neural system. Right PHG-sgCG effective connectivity was significantly greater in BD than HC, reflecting more rapid, forward PHG-sgCG signaling in BD than HC. There was no between-group difference in sgCG-DLPFC effective connectivity. In BD, abnormally increased right PHG-sgCG effective connectivity and reduced right PHG activity to emotional stimuli suggest a dysfunctional ventromedial neural system implicated in early stimulus appraisal, encoding and automatic regulation of emotion that may represent a pathophysiological functional neural mechanism for mood dysregulation in BD.
Abstract:
Lack of discrimination power and poor weight dispersion remain major issues in Data Envelopment Analysis (DEA). Since the initial multiple criteria DEA (MCDEA) model developed in the late 1990s, only goal programming approaches, namely GPDEA-CCR and GPDEA-BCC, have been introduced to solve these problems in a multi-objective framework. We found the GPDEA models to be invalid and demonstrate that our proposed bi-objective multiple criteria DEA (BiO-MCDEA) outperforms the GPDEA models in discrimination power and weight dispersion, while requiring less computational code. An application to energy dependency among 25 European Union member countries is further used to illustrate the efficacy of our approach. © 2013 Elsevier B.V. All rights reserved.
Abstract:
This study presents a computational fluid dynamic (CFD) study of Dimethyl Ether steam reforming (DME-SR) in a large scale Circulating Fluidized Bed (CFB) reactor. The CFD model is based on Eulerian-Eulerian dispersed flow and solved using commercial software (ANSYS FLUENT). The DME-SR reaction scheme and kinetics in the presence of a bifunctional catalyst of CuO/ZnO/Al2O3+ZSM-5 were incorporated in the model using an in-house developed user-defined function. The model was validated by comparing the predictions with experimental data from the literature. The results revealed for the first time detailed CFB reactor hydrodynamics, gas residence time, temperature distribution and product gas composition at a selected operating condition of 300 °C and a steam to DME mass ratio of 3 (molar ratio of 7.62). The spatial variation in the gas species concentrations suggests the existence of three distinct reaction zones but limited temperature variations. The DME conversion and hydrogen yield were found to be 87% and 59% respectively, resulting in a product gas consisting of 72 mol% hydrogen. In part II of this study, the model presented here will be used to optimize the reactor design and study the effect of operating conditions on the reactor performance and products.
Abstract:
Portfolio analysis has existed, perhaps, for as long as people have thought about making rational decisions about the use of limited resources. However, its origin can be dated quite precisely to the publication of Harry Markowitz's pioneering work (Markowitz H. Portfolio Selection) in 1952. The model proposed in that work, simple in essence, captured the basic features of the financial market from the investor's point of view and gave the investor a tool for making rational investment decisions. The central problem in Markowitz's theory is the choice of a portfolio, that is, a set of operations. In evaluating both individual operations and their portfolios, two major factors are considered: return and risk. Risk is given a quantitative estimate. Accounting for the correlations between the returns of operations is an essential element of the theory. It makes effective diversification possible, leading to a substantial decrease in the risk of a portfolio compared with the risk of the operations included in it. Finally, the quantitative description of the basic investment characteristics allows the problem of choosing an optimal portfolio to be stated and solved as a quadratic optimization problem.
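The diversification effect described above can be illustrated with the simplest possible case, a portfolio of two assets. The sketch below uses the standard two-asset variance formula and its closed-form minimum; the function names and the numerical values (20% volatility, zero correlation) are assumptions chosen for illustration and do not come from the abstract.

```python
import math


def portfolio_variance(w1, s1, s2, rho):
    """Variance of a two-asset portfolio with weights w1 and 1 - w1.

    s1, s2 are the standard deviations (risks) of the two assets and
    rho is the correlation between their returns.
    """
    w2 = 1.0 - w1
    return w1 * w1 * s1 * s1 + w2 * w2 * s2 * s2 + 2.0 * w1 * w2 * rho * s1 * s2


def min_variance_weight(s1, s2, rho):
    """Weight of asset 1 that minimises the quadratic variance above.

    Undefined (division by zero) when s1 == s2 and rho == 1, where every
    weight gives the same risk.
    """
    cov = rho * s1 * s2
    return (s2 * s2 - cov) / (s1 * s1 + s2 * s2 - 2.0 * cov)


# Illustrative values: two uncorrelated assets, each with 20% risk.
s1, s2, rho = 0.2, 0.2, 0.0
w1 = min_variance_weight(s1, s2, rho)
risk = math.sqrt(portfolio_variance(w1, s1, s2, rho))
print(w1, risk)  # the portfolio risk is below the 0.2 risk of either asset
```

With more than two assets and a target return, the same objective becomes a constrained quadratic program, which is the form of the problem referred to at the end of the abstract.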
Abstract:
Selection in privatization is a decision-making process of choosing state-owned enterprises (SOEs), prioritizing and sequencing privatizing events, and determining the extent of private ownership in partial privatization. We investigate this process in an important but rarely studied case of China. Based on the SOE population over 1998-2008, we track 49,456 wholly state-owned enterprises and identify 9,359 privatization cases over time. Our econometric analysis concludes: (i) The privatization selection is a complex decision-making process in which local governments balance between various economic, financial and political objectives. (ii) In the recent Chinese privatization, firm performance relates to the selection, staging and sequencing in privatization in an inverted-U fashion. The worst and the best performing SOEs are more likely to remain state-owned, maintain higher state holding when privatized, and are less likely to be privatized later in time. These patterns suggest a slowdown in privatization reform and underlying changes in privatization policy.
Abstract:
2010 Mathematics Subject Classification: 94A17, 62B10, 62F03.
Abstract:
Due to dynamic variability, identifying the specific conditions under which non-functional requirements (NFRs) are satisfied may be only possible at runtime. Therefore, it is necessary to consider the dynamic treatment of relevant information during the requirements specifications. The associated data can be gathered by monitoring the execution of the application and its underlying environment to support reasoning about how the current application configuration is fulfilling the established requirements. This paper presents a dynamic decision-making infrastructure to support both the representation and monitoring of NFRs, and to reason about the degree of satisfaction of NFRs during runtime. The infrastructure is composed of: (i) an extended feature model aligned with a domain-specific language for representing NFRs to be monitored at runtime; (ii) a monitoring infrastructure to continuously assess NFRs at runtime; and (iii) a flexible decision-making process to select the best available configuration based on the satisfaction degree of the NFRs. The evaluation of the approach has shown that it is able to choose application configurations that fit user NFRs well based on runtime information. The evaluation also revealed that the proposed infrastructure provided consistent indicators regarding the best application configurations that fit user NFRs. Finally, a benefit of our approach is that it allows us to quantify the level of satisfaction with respect to NFRs specification.
Abstract:
There has been an increasing interest in the use of agent-based simulation and some discussion of the relative merits of this approach as compared to discrete-event simulation. There are differing views on whether an agent-based simulation offers capabilities that discrete-event cannot provide or whether all agent-based applications can at least in theory be undertaken using a discrete-event approach. This paper presents a simple agent-based NetLogo model and corresponding discrete-event versions implemented in the widely used ARENA software. The two versions of the discrete-event model presented use a traditional process flow approach normally adopted in discrete-event simulation software and also an agent-based approach to the model build. In addition a real-time spatial visual display facility is provided using a spreadsheet platform controlled by VBA code embedded within the ARENA model. Initial findings from this investigation are that discrete-event simulation can indeed be used to implement agent-based models and with suitable integration elements such as VBA provide the spatial displays associated with agent-based software.