979 results for Automatic evaluation
Abstract:
Teaching and automatic assessment through a Computer Based Assessment (CBA) system requires specialized software adapted to the types of activities to be covered and assessed. In this thesis, a CBA environment has been developed that facilitates the learning and assessment of the main topics of a database course. To that end, the existing tools for each of these topics (entity-relationship diagrams, class diagrams, relational database schemas, normalization, relational algebra queries, and the SQL language) were analyzed, and for each topic an automatic correction and assessment module that improves on the existing ones was analyzed, designed, and implemented. These modules have been integrated into a single environment that we have named ACME-DB.
Abstract:
Complex networks have been increasingly used in text analysis, including in connection with natural language processing tools, as important text features appear to be captured by the topology and dynamics of the networks. Following previous works that apply complex networks concepts to text quality measurement, summary evaluation, and author characterization, we now focus on machine translation (MT). In this paper we assess the possible representation of texts as complex networks to evaluate cross-linguistic issues inherent in manual and machine translation. We show that different quality translations generated by MT tools can be distinguished from their manual counterparts by means of metrics such as in-(ID) and out-degrees (OD), clustering coefficient (CC), and shortest paths (SP). For instance, we demonstrate that the average OD in networks of automatic translations consistently exceeds the values obtained for manual ones, and that the CC values of source texts are not preserved for manual translations, but are for good automatic translations. This probably reflects the text rearrangements humans perform during manual translation. We envisage that such findings could lead to better MT tools and automatic evaluation metrics.
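The network metrics named in this abstract can be illustrated with a minimal sketch. It assumes a simple word-adjacency model (a directed edge from each word to the word that follows it), which may differ from the preprocessing the authors actually used. The function names are hypothetical, and only average out-degree and clustering coefficient are covered.

```python
from collections import defaultdict
from itertools import combinations


def build_network(text):
    """Directed word-adjacency network: edge a -> b when b immediately follows a."""
    words = text.lower().split()
    succ = defaultdict(set)  # successors of each word
    pred = defaultdict(set)  # predecessors of each word
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore self-loops
            succ[a].add(b)
            pred[b].add(a)
    return set(words), succ, pred


def average_out_degree(nodes, succ):
    """Mean number of distinct successors per word."""
    if not nodes:
        return 0.0
    return sum(len(succ[n]) for n in nodes) / len(nodes)


def clustering_coefficient(nodes, succ, pred):
    """Mean local clustering coefficient, ignoring edge direction."""
    if not nodes:
        return 0.0
    total = 0.0
    for n in nodes:
        neighbours = succ[n] | pred[n]
        k = len(neighbours)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(
            1 for u, v in combinations(neighbours, 2)
            if v in succ[u] or u in succ[v]
        )
        total += 2 * links / (k * (k - 1))
    return total / len(nodes)
```

Comparing these values for a source text, a manual translation and an automatic translation would be one way to reproduce the kind of contrast the abstract describes.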
Abstract:
There is growing interest in the Computer Science education community in including testing concepts in introductory programming courses. To contribute to this issue, we introduce POPT, a Problem-Oriented Programming and Testing approach for introductory programming courses. POPT's main goal is to improve the traditional method of teaching introductory programming, which concentrates mainly on implementation and neglects testing. POPT extends the POP (Problem Oriented Programming) methodology proposed in the PhD thesis of Andrea Mendonça (UFCG). In both methodologies, POPT and POP, students' skills in dealing with ill-defined problems must be developed from the first programming courses. In POPT, however, students are encouraged to clarify ill-defined problem specifications, guided by the definition of test cases (in a table-like manner). This paper presents POPT and TestBoot, a tool developed to support the methodology. To evaluate the approach, a case study and a controlled experiment (which adopted the Latin Square design) were performed in an Introductory Programming course of the Computer Science and Software Engineering undergraduate programs at the Federal University of Rio Grande do Norte, Brazil. The study results show that, when compared to a Blind Testing approach, POPT encourages the implementation of programs of better external quality: the first program version submitted by POPT students passed twice as many (professor-defined) test cases as that of non-POPT students. Moreover, POPT students submitted fewer program versions and spent more time before submitting the first version to the automatic evaluation system, which leads us to think that POPT students are encouraged to think more carefully about the solution they are implementing. The controlled experiment confirmed the influence of the proposed methodology on the quality of the code developed by POPT students.
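The idea of clarifying an ill-defined specification through a table of test cases can be sketched as follows. This is only an illustration and not TestBoot itself: the problem ("average of the grades"), the table contents and the function names are all hypothetical.

```python
# Each row: (positional inputs, expected output). The second row pins down a
# detail the informal statement leaves open: what to return for no grades.
TEST_TABLE = [
    (([7, 9],), 8.0),
    (([],), 0.0),
    (([10],), 10.0),
]


def run_test_table(program, test_table):
    """Run `program` on every row and report passes and failures."""
    passed = 0
    failures = []
    for inputs, expected in test_table:
        try:
            actual = program(*inputs)
        except Exception as exc:  # a crash counts as a failed test case
            failures.append((inputs, expected, repr(exc)))
            continue
        if actual == expected:
            passed += 1
        else:
            failures.append((inputs, expected, actual))
    return passed, failures


def naive_average(grades):
    """First attempt that ignores the empty-list case."""
    return sum(grades) / len(grades)


def careful_average(grades):
    """Version written after the table clarified the empty-list case."""
    return sum(grades) / len(grades) if grades else 0.0
```

Running `run_test_table(naive_average, TEST_TABLE)` exposes the unhandled empty list, while `careful_average` passes every row.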
Abstract:
This work presents a new approach to the automatic evaluation of SQL queries. The approach proposes a solution to the challenge of encouraging learners to improve their solutions: seeking, beyond an answer that returns the correct result, a query whose complexity is close to that of the optimal solution. The proposal can be used in distance education environments or in face-to-face education in laboratory activities, including assessments. The proposed solution has the following advantages: (1) the learner receives instant feedback during the practical programming activity, which allows the learner to refactor the solution toward an optimal one; (2) full integration between the teaching of programming concepts and examples of program fragments that can be executed online; (3) monitoring of learner activities (how many examples were executed, how many execution attempts were made on each exercise, etc.). This work is a first step toward building a fully assisted environment (for example, with automatic evaluation) for teaching the SQL programming language, in which the teacher is freed from the arduous work of correcting SQL commands and can carry out more relevant pedagogical tasks. The method, grounded in statistics and software engineering metrics, can be adapted to other languages such as Java and Pascal. In addition, LabSQL serves as a laboratory for experimenting with two new techniques, one for evaluation and one for monitoring, that are being researched in parallel work: (a) automatic evaluation of discursive conceptual questions, in addition to the traditional objective questions; (b) a monitoring method based on building an evaluation rubric.
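A minimal sketch of the two checks described here, result correctness and closeness to a reference solution, might look like the following. It is not LabSQL: the schema, the function names and the token-count complexity proxy are assumptions made for illustration, since the abstract does not specify the metrics actually used.

```python
import re
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database (hypothetical schema, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (name TEXT, grade INTEGER)")
    conn.executemany(
        "INSERT INTO student VALUES (?, ?)",
        [("Ana", 9), ("Bruno", 4), ("Carla", 7)],
    )
    return conn


def token_count(sql):
    """Crude complexity proxy (an assumption): number of lexical tokens."""
    return len(re.findall(r"\w+|[^\w\s]", sql))


def evaluate(conn, student_sql, reference_sql):
    """Compare result sets (ignoring row order) and relative query size."""
    try:
        got = sorted(conn.execute(student_sql).fetchall(), key=repr)
    except sqlite3.Error as exc:
        # An invalid query is reported as incorrect, with the error as feedback.
        return {"correct": False, "size_ratio": None, "error": str(exc)}
    expected = sorted(conn.execute(reference_sql).fetchall(), key=repr)
    return {
        "correct": got == expected,
        "size_ratio": token_count(student_sql) / token_count(reference_sql),
        "error": None,
    }
```

A `size_ratio` well above 1 for a correct query would be the cue to suggest refactoring toward the reference solution. A real system would also need to sandbox student queries, which this sketch does not attempt.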
Abstract:
The realization that statistical physics methods can be applied to analyze written texts represented as complex networks has led to several developments in natural language processing, including automatic summarization and evaluation of machine translation. Most importantly, so far only a few metrics of complex networks have been used and therefore there is ample opportunity to enhance the statistics-based methods as new measures of network topology and dynamics are created. In this paper, we employ for the first time the metrics betweenness, vulnerability and diversity to analyze written texts in Brazilian Portuguese. Using strategies based on diversity metrics, a better performance in automatic summarization is achieved in comparison to previous work employing complex networks. With an optimized method the Rouge score (an automatic evaluation method used in summarization) was 0.5089, which is the best value ever achieved for an extractive summarizer with statistical methods based on complex networks for Brazilian Portuguese. Furthermore, the diversity metric can detect keywords with high precision, which is why we believe it is suitable to produce good summaries. It is also shown that incorporating linguistic knowledge through a syntactic parser does enhance the performance of the automatic summarizers, as expected, but the increase in the Rouge score is only minor. These results reinforce the suitability of complex network methods for improving automatic summarizers in particular, and treating text in general. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Optical coherence tomography (OCT) is a well-established imaging modality in ophthalmology and is used daily in the clinic. Automatic evaluation of such datasets requires an accurate segmentation of the retinal cell layers. However, due to the naturally low signal-to-noise ratio and the resulting poor image quality, this task remains challenging. We propose an automatic graph-based multi-surface segmentation algorithm that internally uses soft constraints to add prior information from a learned model. This improves the accuracy of the segmentation and increases the robustness to noise. Furthermore, we show that the graph size can be greatly reduced by applying a smart segmentation scheme. This allows the segmentation to be computed in seconds instead of minutes, without deteriorating the segmentation accuracy, making it ideal for a clinical setup. An extensive evaluation on 20 OCT datasets of healthy eyes was performed and showed a mean unsigned segmentation error of 3.05 ± 0.54 μm over all datasets when compared to the average observer, which is lower than the inter-observer variability. Similar performance was measured for the task of drusen segmentation, demonstrating the usefulness of using soft constraints as a tool to deal with pathologies.
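The reported figure is a mean unsigned error against an "average observer". A minimal sketch of that computation is given below. It assumes each surface is stored as one depth value per image column and that a known axial pixel spacing converts pixels to micrometres; both are assumptions, as are the function names.

```python
def average_observer(annotations):
    """Column-wise mean of several observers' surface positions."""
    if not annotations:
        raise ValueError("at least one annotation is required")
    n = len(annotations[0])
    if any(len(a) != n for a in annotations):
        raise ValueError("annotations must have the same length")
    return [sum(col) / len(annotations) for col in zip(*annotations)]


def mean_unsigned_error(predicted, reference, um_per_pixel=1.0):
    """Mean absolute distance between two surfaces, scaled to micrometres."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("surfaces must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return um_per_pixel * total / len(predicted)
```

The same function applied to two human annotations gives an inter-observer figure to compare against the algorithm's error.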
Abstract:
Support systems for programming education are widespread, but common standards for exchanging general (learning) content and tests do not meet the special requirements of programming exercises, such as handling complex submissions consisting of multiple files or combining different (automatic) grading procedures. As a result, exercises cannot be exchanged between systems, although this would be desirable given the high effort involved in developing good exercises. This paper presents an extensible XML-based format for exchanging programming exercises that is already being used in prototype form by several systems. The specification of the exchange format is available online [PFMA].
Abstract:
In the framework of the ACTRIS (Aerosols, Clouds, and Trace Gases Research Infrastructure Network) summer 2012 measurement campaign (8 June–17 July 2012), EARLINET organized and performed a controlled feasibility exercise to demonstrate its potential to perform operational, coordinated measurements and deliver products in near-real time. Eleven lidar stations participated in the exercise, which started on 9 July 2012 at 06:00 UT and ended 72 h later on 12 July at 06:00 UT. For the first time, the single calculus chain (SCC), the common calculus chain developed within EARLINET for the automatic evaluation of lidar data from raw signals up to the final products, was used. All stations sent measurements of 1 h duration in real time to the SCC server in a predefined NetCDF file format. The pre-processing of the data was performed in real time by the SCC, while the optical processing was performed in near-real time after the exercise ended. Of the files sent to the SCC, 98 % were successfully pre-processed and 79 % were successfully processed. These percentages are quite large considering that no cloud screening was performed on the lidar data. The paper draws the attention of present and future SCC users to the most critical parameters of the SCC product configuration and their possible optimal values, but also to the limitations inherent in the raw data. The continuous use of SCC direct and derived products in heterogeneous conditions is used to demonstrate two potential applications of the EARLINET infrastructure: the monitoring of a Saharan dust intrusion event and the evaluation of two dust transport models. The efforts made to define the measurement protocol and to properly configure the SCC pave the way for applying this protocol to specific applications such as the monitoring of special events, atmospheric modeling, climate research and calibration/validation activities of spaceborne observations.
Abstract:
This paper describes the experimental setup of a system composed of a set of wearable sensor devices for recording motion signals and software algorithms for signal analysis. The system is able to automatically detect and assess the severity of the motor symptoms bradykinesia, tremor, dyskinesia and akinesia. Based on the assessment of akinesia, the ON-OFF status of the patient is determined for each moment. The assessment performed through the automatic evaluation of akinesia is compared with the status reported by the patients in their diaries. Preliminary results with a total recording period of 32 hours with two PD patients are presented, where a good correspondence (88.2 ± 3.7 %) was observed. Best (93.7 %) and worst (87 %) correlation results are illustrated, together with the analysis of the automatic assessment of the akinesia symptom leading to the status determination. The results obtained are promising and, if confirmed with further data, this automatic assessment of PD motor symptoms will lead to a better adjustment of medication dosages and timing, cost savings and an improved quality of life for patients.
Abstract:
To support the efficient execution of post-genomic multi-centric clinical trials in breast cancer, we propose a solution that streamlines the assessment of the eligibility of patients for available trials. Assessing the eligibility of a patient for a trial requires evaluating whether each eligibility criterion is satisfied, and it is often a time-consuming, manual task. The main focus in the literature has been on proposing different methods for modelling and formalizing the eligibility criteria. However, the current adoption of these approaches in clinical care is limited. Less effort has been dedicated to the automatic matching of criteria to the patient data managed in clinical care. We address both aspects and propose a scalable, efficient and pragmatic patient screening solution enabling automatic evaluation of the eligibility of patients for a relevant set of trials. This covers the flexible formalization of criteria and of other relevant trial metadata, and the efficient management of these representations.
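Evaluating each criterion against patient data can be sketched as below. This is a hypothetical formalization, not the one proposed in the paper: the `Criterion` fields, the three-valued outcome and the example field names (`age`, `er_status`, `pregnant`) are all assumptions for illustration.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Criterion:
    field: str                           # key in the patient record
    test: Callable[[Any, Any], bool]     # comparison, e.g. operator.ge
    threshold: Any                       # value to compare against
    inclusion: bool = True               # False marks an exclusion criterion


def check(criterion, patient) -> Optional[bool]:
    """True/False if the criterion can be decided, None if data is missing."""
    value = patient.get(criterion.field)
    if value is None:
        return None
    satisfied = bool(criterion.test(value, criterion.threshold))
    return satisfied if criterion.inclusion else not satisfied


def screen(criteria, patient) -> str:
    """Combine per-criterion outcomes into a screening result."""
    outcomes = [check(c, patient) for c in criteria]
    if any(o is False for o in outcomes):
        return "ineligible"
    if any(o is None for o in outcomes):
        return "undetermined"  # missing data: needs manual review
    return "eligible"


TRIAL_CRITERIA = [
    Criterion("age", operator.ge, 18),
    Criterion("er_status", operator.eq, "positive"),
    Criterion("pregnant", operator.eq, True, inclusion=False),
]
```

Keeping "undetermined" separate from "ineligible" reflects the fact that clinical records are often incomplete, so a missing value should trigger review rather than rejection.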
Abstract:
Paper submitted to ACE 2013, 10th IFAC Symposium on Advances in Control Education, University of Sheffield, UK, August 28-30, 2013.
Abstract:
This thesis reviews the existing manufacturing control techniques and identifies their practical drawbacks when applied in a high-variety, low- and medium-volume environment. It argues that the significant drawbacks inherent in such systems could impair their application in such a manufacturing environment. The key weaknesses identified were: the capacity-insensitive nature of Material Requirements Planning (MRP); the centralised approach to planning and control applied in Manufacturing Resources Planning (MRP II); the fact that Kanban can only be used in repetitive environments; the inability of Optimised Productivity Techniques (OPT) to deal with transient bottlenecks, etc. On the other hand, cellular systems offer advantages in simplifying the control problems of manufacturing, and the thesis reviews systems designed for cellular manufacturing, including Distributed Manufacturing Resources Planning (DMRP) and Flexible Manufacturing System (FMS) controllers. It argues that a newly developed cellular manufacturing control methodology, which is fully automatic, capacity sensitive and responsive, has the potential to resolve the core manufacturing control problems discussed above. Its development is envisaged within the framework of a DMRP environment, in which each cell is provided with its own MRP II system and decision-making capability. It is a cellular-based closed-loop control system that revolves around a single-level Bill-Of-Materials (BOM) structure and hence provides better linkage between shop-level scheduling activities and the relevant entries in the MPS. This offers a better prospect of responding rapidly to changes in the status of manufacturing resources and to incoming enquiries. Moreover, it also permits automatic evaluation of capacity and due date constraints and hence facilitates the automation of the MPS within such a system.
A prototype cellular manufacturing control model was developed, based on the above concept, to demonstrate the underlying principles and operational logic of the cellular manufacturing control methodology. This was shown to offer significant advantages from the perspective of operational planning and control. Results of relevant tests showed that the model is capable of producing reasonable due dates and of automating the MPS. The overall performance of the model proved satisfactory and acceptable.
Abstract:
In this paper we study the generation of lace knitting stitch patterns using genetic programming. We devise a genetic representation of knitting charts that accurately reflects how they are used for hand knitting the pattern. We apply a basic evolutionary algorithm for generating the patterns, where the key to success is evaluation. We propose automatic evaluation of the patterns, without interaction with the user. We present some patterns generated by the method and then discuss further possibilities for bringing automatic evaluation closer to human evaluation. Copyright 2007 ACM.
Abstract:
A large number of methods have been published that aim to evaluate various components of multi-view geometry systems. Most of these have focused on the feature extraction, description and matching stages (the visual front end), since geometry computation can be evaluated through simulation. Many data sets are constrained to small scale scenes or planar scenes that are not challenging to new algorithms, or require special equipment. This paper presents a method for automatically generating geometry ground truth and challenging test cases from high spatio-temporal resolution video. The objective of the system is to enable data collection at any physical scale, in any location and in various parts of the electromagnetic spectrum. The data generation process consists of collecting high resolution video, computing accurate sparse 3D reconstruction, video frame culling and down sampling, and test case selection. The evaluation process consists of applying a test 2-view geometry method to every test case and comparing the results to the ground truth. This system facilitates the evaluation of the whole geometry computation process or any part thereof against data compatible with a realistic application. A collection of example data sets and evaluations is included to demonstrate the range of applications of the proposed system.
Abstract:
The tonic is a fundamental concept in Indian art music. It is the base pitch, which an artist chooses in order to construct the melodies during a rāg(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rāg recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we also show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.