41 resultados para Models performance
Resumo:
In this article we compare regression models obtained to predict PhD students’ academic performance in the universities of Girona (Spain) and Slovenia. Explanatory variables are characteristics of PhD student’s research group understood as an egocentered social network, background and attitudinal characteristics of the PhD students and some characteristics of the supervisors. Academic performance was measured by the weighted number of publications. Two web questionnaires were designed, one for PhD students and one for their supervisors and other research group members. Most of the variables were easily comparable across universities due to the careful translation procedure and pre-tests. When direct comparison was notpossible we created comparable indicators. We used a regression model in which the country was introduced as a dummy coded variable including all possible interaction effects. The optimal transformations of the main and interaction variables are discussed. Some differences between Slovenian and Girona universities emerge. Some variables like supervisor’s performance and motivation for autonomy prior to starting the PhD have the same positive effect on the PhD student’s performance in both countries. On the other hand, variables like too close supervision by the supervisor and having children have a negative influence in both countries. However, we find differences between countries when we observe the motivation for research prior to starting the PhD which increases performance in Slovenia but not in Girona. As regards network variables, frequency of supervisor advice increases performance in Slovenia and decreases it in Girona. The negative effect in Girona could be explained by the fact that additional contacts of the PhD student with his/her supervisor might indicate a higher workload in addition to or instead of a better advice about the dissertation. The number of external student’s advice relationships and social support mean contact intensity are not significant in Girona, but they have a negative effect in Slovenia. We might explain the negative effect of external advice relationships in Slovenia by saying that a lot of external advice may actually result from a lack of the more relevant internal advice
Resumo:
Earth System Models (ESM) have been successfuly developed over past few years, and are currently beeing used for simulating present day-climate, seasonal to interanual predictions of climate change. The supercomputer performance plays an important role in climate modeling since one of the challenging issues for climate modellers is to efficiently and accurately couple earth System components on present day computers architectures. At the Barcelona Supercomputing Center (BSC), we work with the EC- Earth System Model. The EC- Earth is an ESM, which currently consists of an atmosphere (IFS) and an ocean (NEMO) model that communicate with each other through the OASIS coupler. Additional modules (e.g. for chemistry and vegetation ) are under development. The EC-Earth ESM has been ported successfully over diferent high performance computin platforms (e.g, IBM P6 AIX, CRAY XT-5, Intelbased Linux Clusters, SGI Altix) at diferent sites in Europ (e.g., KNMI, ICHEC, ECMWF). The objective of the first phase of the project was to identify and document the issues related with the portability and performance of EC-Earth on the MareNostrum supercomputer, a System based on IBM PowerPC 970MP processors and run under a Linux Suse Distribution. EC-Earth was successfully ported to MareNostrum, and a compilation incompatibilty was solved by a two step compilation approach using XLF version 10.1 and 12.1 compilers. In addition, the EC-Earth performance was analyzed with respect to escalability and trace analysis with the Paravear software. This analysis showed that EC-Earth with a larger number of IFS CPUs (<128) is not feasible at the moment since some issues exists with the IFS-NEMO balance and MPI Communications.
Resumo:
The purpose of this paper is to examine (1) some of the models commonly used to represent fading,and (2) the information-theoretic metrics most commonly used to evaluate performance over those models. We raise the question of whether these models and metrics remain adequate in light of the advances that wireless systems haveundergone over the last two decades. Weaknesses are pointedout, and ideas on possible fixes are put forth.
Resumo:
Expected utility theory (EUT) has been challenged as a descriptive theoryin many contexts. The medical decision analysis context is not an exception.Several researchers have suggested that rank dependent utility theory (RDUT)may accurately describe how people evaluate alternative medical treatments.Recent research in this domain has addressed a relevant feature of RDU models-probability weighting-but to date no direct test of this theoryhas been made. This paper provides a test of the main axiomatic differencebetween EUT and RDUT when health profiles are used as outcomes of riskytreatments. Overall, EU best described the data. However, evidence on theediting and cancellation operation hypothesized in Prospect Theory andCumulative Prospect Theory was apparent in our study. we found that RDUoutperformed EU in the presentation of the risky treatment pairs in whichthe common outcome was not obvious. The influence of framing effects onthe performance of RDU and their importance as a topic for future researchis discussed.
Resumo:
Several studies have reported high performance of simple decision heuristics multi-attribute decision making. In this paper, we focus on situations where attributes are binary and analyze the performance of Deterministic-Elimination-By-Aspects (DEBA) and similar decision heuristics. We consider non-increasing weights and two probabilistic models for the attribute values: one where attribute values are independent Bernoulli randomvariables; the other one where they are binary random variables with inter-attribute positive correlations. Using these models, we show that good performance of DEBA is explained by the presence of cumulative as opposed to simple dominance. We therefore introduce the concepts of cumulative dominance compliance and fully cumulative dominance compliance and show that DEBA satisfies those properties. We derive a lower bound with which cumulative dominance compliant heuristics will choose a best alternative and show that, even with many attributes, this is not small. We also derive an upper bound for the expected loss of fully cumulative compliance heuristics and show that this is moderateeven when the number of attributes is large. Both bounds are independent of the values ofthe weights.
Resumo:
Most methods for small-area estimation are based on composite estimators derived from design- or model-based methods. A composite estimator is a linear combination of a direct and an indirect estimator with weights that usually depend on unknown parameters which need to be estimated. Although model-based small-area estimators are usually based on random-effects models, the assumption of fixed effects is at face value more appropriate.Model-based estimators are justified by the assumption of random (interchangeable) area effects; in practice, however, areas are not interchangeable. In the present paper we empirically assess the quality of several small-area estimators in the setting in which the area effects are treated as fixed. We consider two settings: one that draws samples from a theoretical population, and another that draws samples from an empirical population of a labor force register maintained by the National Institute of Social Security (NISS) of Catalonia. We distinguish two types of composite estimators: a) those that use weights that involve area specific estimates of bias and variance; and, b) those that use weights that involve a common variance and a common squared bias estimate for all the areas. We assess their precision and discuss alternatives to optimizing composite estimation in applications.
Resumo:
This paper theoretically and empirically documents a puzzle that arises when an RBC economy with a job matching function is used to model unemployment. The standard model can generate sufficiently large cyclical fluctuations in unemployment, or a sufficiently small response of unemployment to labor market policies, but it cannot do both. Variable search and separation, finite UI benefit duration, efficiency wages, and capital all fail to resolve this puzzle. However, either sticky wages or match-specific productivity shocks can improve the model's performance by making the firm's flow of surplus more procyclical, which makes hiring more procyclical too.
Resumo:
Revenue management (RM) is a complicated business process that can best be described ascontrol of sales (using prices, restrictions, or capacity), usually using software as a tool to aiddecisions. RM software can play a mere informative role, supplying analysts with formatted andsummarized data who use it to make control decisions (setting a price or allocating capacity fora price point), or, play a deeper role, automating the decisions process completely, at the otherextreme. The RM models and algorithms in the academic literature by and large concentrateon the latter, completely automated, level of functionality.A firm considering using a new RM model or RM system needs to evaluate its performance.Academic papers justify the performance of their models using simulations, where customerbooking requests are simulated according to some process and model, and the revenue perfor-mance of the algorithm compared to an alternate set of algorithms. Such simulations, whilean accepted part of the academic literature, and indeed providing research insight, often lackcredibility with management. Even methodologically, they are usually awed, as the simula-tions only test \within-model" performance, and say nothing as to the appropriateness of themodel in the first place. Even simulations that test against alternate models or competition arelimited by their inherent necessity on fixing some model as the universe for their testing. Theseproblems are exacerbated with RM models that attempt to model customer purchase behav-ior or competition, as the right models for competitive actions or customer purchases remainsomewhat of a mystery, or at least with no consensus on their validity.How then to validate a model? Putting it another way, we want to show that a particularmodel or algorithm is the cause of a certain improvement to the RM process compared to theexisting process. We take care to emphasize that we want to prove the said model as the causeof performance, and to compare against a (incumbent) process rather than against an alternatemodel.In this paper we describe a \live" testing experiment that we conducted at Iberia Airlineson a set of flights. A set of competing algorithms control a set of flights during adjacentweeks, and their behavior and results are observed over a relatively long period of time (9months). In parallel, a group of control flights were managed using the traditional mix of manualand algorithmic control (incumbent system). Such \sandbox" testing, while common at manylarge internet search and e-commerce companies is relatively rare in the revenue managementarea. Sandbox testing has an undisputable model of customer behavior but the experimentaldesign and analysis of results is less clear. In this paper we describe the philosophy behind theexperiment, the organizational challenges, the design and setup of the experiment, and outlinethe analysis of the results. This paper is a complement to a (more technical) related paper thatdescribes the econometrics and statistical analysis of the results.
Resumo:
This paper discusses inference in self exciting threshold autoregressive (SETAR)models. Of main interest is inference for the threshold parameter. It iswell-known that the asymptotics of the corresponding estimator depend uponwhether the SETAR model is continuous or not. In the continuous case, thelimiting distribution is normal and standard inference is possible. Inthe discontinuous case, the limiting distribution is non-normal and cannotbe estimated consistently. We show valid inference can be drawn by theuse of the subsampling method. Moreover, the method can even be extendedto situations where the (dis)continuity of the model is unknown. In thiscase, also the inference for the regression parameters of the modelbecomes difficult and subsampling can be used advantageously there aswell. In addition, we consider an hypothesis test for the continuity ofthe SETAR model. A simulation study examines small sample performance.
Resumo:
Research on judgment and decision making presents a confusing picture of human abilities. For example, much research has emphasized the dysfunctional aspects of judgmental heuristics, and yet, other findings suggest that these can be highly effective. A further line of research has modeled judgment as resulting from as if linear models. This paper illuminates the distinctions in these approaches by providing a common analytical framework based on the central theoretical premise that understanding human performance requires specifying how characteristics of the decision rules people use interact with the demands of the tasks they face. Our work synthesizes the analytical tools of lens model research with novel methodology developed to specify the effectiveness of heuristics in different environments and allows direct comparisons between the different approaches. We illustrate with both theoretical analyses and simulations. We further link our results to the empirical literature by a meta-analysis of lens model studies and estimate both human andheuristic performance in the same tasks. Our results highlight the trade-off betweenlinear models and heuristics. Whereas the former are cognitively demanding, the latterare simple to use. However, they require knowledge and thus maps of when andwhich heuristic to employ.
Resumo:
The objective of this paper is to compare the performance of twopredictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for LR and NN models. Both models were developed with cross validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared with error rate and the area under receiver operating characteristic curves (Az). The neural network obtained statistically higher Az than LR with cross validation. The remaining resampling validation methods did not reveal statistically significant differences between LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
Resumo:
In this paper we analyse, using Monte Carlo simulation, the possible consequences of incorrect assumptions on the true structure of the random effects covariance matrix and the true correlation pattern of residuals, over the performance of an estimation method for nonlinear mixed models. The procedure under study is the well known linearization method due to Lindstrom and Bates (1990), implemented in the nlme library of S-Plus and R. Its performance is studied in terms of bias, mean square error (MSE), and true coverage of the associated asymptotic confidence intervals. Ignoring other criteria like the convenience of avoiding over parameterised models, it seems worst to erroneously assume some structure than do not assume any structure when this would be adequate.
Resumo:
Evaluating other individuals with respect to personality characteristics plays a crucial role in human relations and it is the focus of attention for research in diverse fields such as psychology and interactive computer systems. In psychology, face perception has been recognized as a key component of this evaluation system. Multiple studies suggest that observers use face information to infer personality characteristics. Interactive computer systems are trying to take advantage of these findings and apply them to increase the natural aspect of interaction and to improve the performance of interactive computer systems. Here, we experimentally test whether the automatic prediction of facial trait judgments (e.g. dominance) can be made by using the full appearance information of the face and whether a reduced representation of its structure is sufficient. We evaluate two separate approaches: a holistic representation model using the facial appearance information and a structural model constructed from the relations among facial salient points. State of the art machine learning methods are applied to a) derive a facial trait judgment model from training data and b) predict a facial trait value for any face. Furthermore, we address the issue of whether there are specific structural relations among facial points that predict perception of facial traits. Experimental results over a set of labeled data (9 different trait evaluations) and classification rules (4 rules) suggest that a) prediction of perception of facial traits is learnable by both holistic and structural approaches; b) the most reliable prediction of facial trait judgments is obtained by certain type of holistic descriptions of the face appearance; and c) for some traits such as attractiveness and extroversion, there are relationships between specific structural features and social perceptions.
Resumo:
The performance of different correlation functionals has been tested for alkali metals, Li to Cs, interacting with cluster models simulating different active sites of the Si(111) surface. In all cases, the ab initio Hartree-Fock density has been obtained and used as a starting point. The electronic correlation energy is then introduced as an a posteriori correction to the Hartree-Fock energy using different correlation functionals. By making use of the ionic nature of the interaction and of different dissociation limits we have been able to prove that all functionals tested introduce the right correlation energy, although to a different extent. Hence, correlation functionals appear as an effective and easy way to introduce electronic correlation in the ab initio Hartree-Fock description of the chemisorption bond in complex systems where conventional configuration interaction techniques cannot be used. However, the calculated energies may differ by some tens of eV. Therefore, these methods can be employed to get a qualitative idea of how important correlation effects are, but they have some limitations if accurate binding energies are to be obtained.
Resumo:
Application of semi-distributed hydrological models to large, heterogeneous watersheds deals with several problems. On one hand, the spatial and temporal variability in catchment features should be adequately represented in the model parameterization, while maintaining the model complexity in an acceptable level to take advantage of state-of-the-art calibration techniques. On the other hand, model complexity enhances uncertainty in adjusted model parameter values, therefore increasing uncertainty in the water routing across the watershed. This is critical for water quality applications, where not only streamflow, but also a reliable estimation of the surface versus subsurface contributions to the runoff is needed. In this study, we show how a regularized inversion procedure combined with a multiobjective function calibration strategy successfully solves the parameterization of a complex application of a water quality-oriented hydrological model. The final value of several optimized parameters showed significant and consistentdifferences across geological and landscape features. Although the number of optimized parameters was significantly increased by the spatial and temporal discretization of adjustable parameters, the uncertainty in water routing results remained at reasonable values. In addition, a stepwise numerical analysis showed that the effects on calibration performance due to inclusion of different data types in the objective function could be inextricably linked. Thus caution should be taken when adding or removing data from an aggregated objective function.