903 results for scoring
Abstract:
A disadvantage of multiple-choice tests is that students have incentives to guess. To discourage guessing, it is common to use scoring rules that either penalize wrong answers or reward omissions. These scoring rules are considered equivalent in psychometrics, although experimental evidence has not always been consistent with this claim. We model students' decisions and show, first, that equivalence holds only under risk neutrality and, second, that the two rules can be modified so that they become equivalent even under risk aversion. This paper presents the results of a field experiment in which we analyze the decisions of subjects taking multiple-choice exams. The evidence suggests that differences between scoring rules are due to risk aversion as theory predicts. We also find that the number of omitted items depends on the scoring rule, knowledge, gender and other covariates.
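As a minimal sketch of the equivalence claim under risk neutrality, the snippet below compares the expected score of answering with the score of omitting under two textbook rules. The point values (a penalty of 1/(k-1) per wrong answer, a reward of 1/k per omission on a k-option item) and all function names are assumptions for illustration; the abstract does not say which parameterization the paper uses.

```python
from fractions import Fraction


def penalty_rule(k):
    """Points for (correct, wrong, omitted) when wrong answers are penalized.

    Assumed textbook values, not taken from the paper.
    """
    return Fraction(1), Fraction(-1, k - 1), Fraction(0)


def omission_reward_rule(k):
    """Points for (correct, wrong, omitted) when omissions are rewarded.

    Assumed textbook values, not taken from the paper.
    """
    return Fraction(1), Fraction(0), Fraction(1, k)


def expected_answer_score(rule, k, p_correct=None):
    """Expected points from answering with success probability p_correct.

    Defaults to a blind guess among k options (p_correct = 1/k).
    """
    correct, wrong, _ = rule(k)
    p = Fraction(1, k) if p_correct is None else Fraction(p_correct)
    return p * correct + (1 - p) * wrong


def guess_gain(rule, k, p_correct=None):
    """Expected gain of answering over omitting; zero means indifference."""
    _, _, omitted = rule(k)
    return expected_answer_score(rule, k, p_correct) - omitted


if __name__ == "__main__":
    for k in (2, 4, 5):
        print(k, guess_gain(penalty_rule, k), guess_gain(omission_reward_rule, k))
```

With these values a blind guess has zero expected gain under both rules, so a risk-neutral student is indifferent in either case. A risk-averse student also weighs the spread of outcomes, which the two rules do not share, and this is where the abstract locates the difference.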
Abstract:
The main contribution of this work is to analyze and describe the state-of-the-art performance of answer scoring systems from the SemEval-2013 task, and to continue the development of an answer scoring system (EHU-ALM) developed at the University of the Basque Country. Overall, this master's thesis focuses on finding configurations that improve results on the SemEval dataset, using attribute engineering techniques to find optimal feature subsets and trying different hierarchical configurations to compare their performance against the traditional one-versus-all approach. Throughout the work we propose two alternative strategies: on the one hand, improving the EHU-ALM system without changing its architecture and, on the other hand, improving the system by adapting it to a hierarchical configuration. To build these new models we describe and use distinct attribute engineering, data preprocessing, and machine learning techniques.
Abstract:
Granting credit to companies operating in the market consists of delivering an asset at a given moment against the promise of payment for that good or right at a future date. This is an uncertain event, since the obligation may not be honored by the promising buyer, which gives rise to credit risk. It falls to the party granting the asset that originates the credit risk to verify the client's ability to meet the future commitment assumed, by analyzing the variables that indicate the success of the credit operation. Companies in the start-up phase are characterized not only by the lack of a history for these variables, but also by a considerable increase in going-concern risk. This is borne out by surveys of companies with up to five years of operation. The impossibility of measuring creditworthiness in this scenario results in severe credit restrictions for new companies, particularly for long-term credit, which is indispensable in this investment phase. However, this restriction does not apply to franchise companies, whose entrepreneurs have the privilege of starting their business with investment credit lines already available in the market for this purpose. This study aims to identify which characteristics of franchised companies allow financial institutions to grant credit safely in the start-up phase, and whether those characteristics can discriminate the variables that determine the success of a franchisee applying for bank credit. Applying factor analysis to a database of franchise companies successfully identified a group of seven main variables, which served as the basis for a logistic regression model and a discriminant analysis.
The logistic regression model proved good at improving the probability of correctly classifying solvent companies, whereas discriminant analysis showed no improvement in these results.
Abstract:
This paper investigates a method of automatic pronunciation scoring for use in computer-assisted language learning (CALL) systems. The method utilizes a likelihood-based 'Goodness of Pronunciation' (GOP) measure which is extended to include individual thresholds for each phone based on both averaged native confidence scores and on rejection statistics provided by human judges. Further improvements are obtained by incorporating models of the subject's native language and by augmenting the recognition networks to include expected pronunciation errors. The various GOP measures are assessed using a specially recorded database of non-native speakers which has been annotated to mark phone-level pronunciation errors. Since pronunciation assessment is highly subjective, a set of four performance measures has been designed, each of them measuring different aspects of how well computer-derived phone-level scores agree with human scores. These performance measures are used to cross-validate the reference annotations and to assess the basic GOP algorithm and its refinements. The experimental results suggest that a likelihood-based pronunciation scoring metric can achieve usable performance, especially after applying the various enhancements.
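A likelihood-based score with per-phone thresholds can be sketched as follows. The sketch assumes one common formulation of GOP, the duration-normalized absolute log-likelihood ratio between the target phone and the best-scoring phone over the same segment. The abstract does not spell out the formula, so the function names, the input layout, and the threshold values are all hypothetical.

```python
def gop_score(target_loglik, competitor_logliks, num_frames):
    """Duration-normalized absolute log-likelihood ratio for one phone segment.

    target_loglik: log-likelihood of the segment under the intended phone.
    competitor_logliks: log-likelihoods of the same segment under each
        candidate phone (normally including the intended phone itself).
    num_frames: number of acoustic frames in the segment.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    best = max(competitor_logliks)
    return abs(target_loglik - best) / num_frames


def flag_mispronunciations(segments, thresholds, default_threshold=1.0):
    """Return (phone, score, rejected) for each segment.

    segments: iterable of (phone, target_loglik, competitor_logliks, num_frames).
    thresholds: hypothetical per-phone rejection thresholds; phones missing
        from the mapping fall back to default_threshold.
    """
    results = []
    for phone, target_ll, competitor_lls, n_frames in segments:
        score = gop_score(target_ll, competitor_lls, n_frames)
        limit = thresholds.get(phone, default_threshold)
        results.append((phone, score, score > limit))
    return results


if __name__ == "__main__":
    demo = [
        ("ae", -50.0, [-50.0, -60.0], 10),  # target is also the best match
        ("th", -80.0, [-60.0, -80.0], 10),  # another phone fits much better
    ]
    for row in flag_mispronunciations(demo, {"th": 1.5}):
        print(row)
```

A score of zero means the intended phone was also the best acoustic match. Larger scores mean some other phone explained the segment better, and the per-phone threshold decides when that counts as an error.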
Abstract:
The identification of near-native protein-protein complexes among a set of decoys remains highly challenging. A strategy for improving the success rate of near-native detection is to enrich near-native docking decoys in a small number of top-ranked decoys. Recently, we found that a combination of three scoring functions (energy, conservation, and interface propensity) can predict the location of binding interface regions with reasonable accuracy. Here, these three scoring functions are modified and combined into a consensus scoring function called ENDES for enriching near-native docking decoys. We found that all individual scores result in enrichment for the majority of the 28 targets in the ZDOCK2.3 decoy set and the 22 targets in Benchmark 2.0. Among the three scores, the interface propensity score yields the highest enrichment in both sets of protein complexes. When these scores are combined into the ENDES consensus score, a significant increase in enrichment of near-native structures is found. For example, when 2000 docking decoys are reduced to 200 decoys by ENDES, the fraction of near-native structures among the docking decoys increases by a factor of about six on average. ENDES was implemented in a computer program that is available for download at http://sparks.informatics.iupui.edu.
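The enrichment figure quoted above can be illustrated with a short sketch. It assumes a generic consensus scheme, summing z-score-normalized component scores with lower values treated as better. This is not the ENDES formula, which the abstract does not give, and all names are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one score column; a constant column maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def consensus_rank(score_columns):
    """Rank decoy indices by summed z-scores, best (lowest total) first.

    score_columns: one list per scoring function, all of equal length.
    Assumes lower is better for every column.
    """
    normalized = [zscores(col) for col in score_columns]
    n_decoys = len(score_columns[0])
    totals = [sum(col[i] for col in normalized) for i in range(n_decoys)]
    return sorted(range(n_decoys), key=totals.__getitem__)


def enrichment_factor(ranked, near_native, keep):
    """Near-native fraction in the top `keep` decoys over the overall fraction."""
    if not near_native or keep <= 0:
        raise ValueError("need at least one near-native decoy and keep > 0")
    kept = ranked[:keep]
    frac_kept = sum(i in near_native for i in kept) / len(kept)
    frac_all = len(near_native) / len(ranked)
    return frac_kept / frac_all


if __name__ == "__main__":
    energy = [float(i) for i in range(10)]
    conservation = [float(i) for i in range(10)]
    propensity = [float(i) for i in range(10)]
    order = consensus_rank([energy, conservation, propensity])
    print(order, enrichment_factor(order, {0, 1}, keep=2))
```

On this reading, an enrichment factor of about six for a 2000-to-200 reduction means the retained 200 decoys contain near-native structures at roughly six times the rate of the full set.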
Abstract:
R. Daly and Q. Shen. A Framework for the Scoring of Operators on the Search Space of Equivalence Classes of Bayesian Network Structures. Proceedings of the 2005 UK Workshop on Computational Intelligence, pages 67-74.
Abstract:
The purpose of this preliminary study is to identify signs of fatigue in specific muscle groups that in turn directly influence accuracy in professional darts. Electromyography (EMG) sensors are employed to monitor the electrical activity produced by skeletal muscles of the trunk and upper limb during the throw. It is noted that the Flexor Pollicis Brevis muscle, which controls the critical release action during the throw, shows signs of fatigue. This is accompanied by an inherent increase in mean integral EMG amplitude for a number of other throw-related muscles, indicating an attempt to maintain constant applied throwing force. A strong correlation is shown to exist between average score and the decrease in mean integral EMG amplitude for the Flexor Pollicis Brevis.
Abstract:
Objectives: Genetic testing for the breast and ovarian cancer susceptibility genes BRCA1 and BRCA2 has important implications for the clinical management of people found to carry a mutation. However, genetic testing is expensive and may be associated with adverse psychosocial effects. To provide a cost-efficient and clinically appropriate genetic counselling service, genetic testing should be targeted at those individuals most likely to carry pathogenic mutations. Several algorithms that predict the likelihood of carrying a BRCA1 or a BRCA2 mutation are currently used in clinical practice to identify such individuals.
Abstract:
Computer-assisted pathological immunohistochemistry scoring is more time-effective than conventional scoring, but provides no analytical advantage.