950 resultados para Software repository mining. Process mining. Software developer contribution
Resumo:
Data mining is one of the most important analysis techniques to automatically extract knowledge from large amount of data. Nowadays, data mining is based on low-level specifications of the employed techniques typically bounded to a specific analysis platform. Therefore, data mining lacks a modelling architecture that allows analysts to consider it as a truly software-engineering process. Bearing in mind this situation, we propose a model-driven approach which is based on (i) a conceptual modelling framework for data mining, and (ii) a set of model transformations to automatically generate both the data under analysis (that is deployed via data-warehousing technology) and the analysis models for data mining (tailored to a specific platform). Thus, analysts can concentrate on understanding the analysis problem via conceptual data-mining models instead of wasting efforts on low-level programming tasks related to the underlying-platform technical details. These time consuming tasks are now entrusted to the model-transformations scaffolding. The feasibility of our approach is shown by means of a hypothetical data-mining scenario where a time series analysis is required.
Resumo:
Comunicación presentada en las XVI Jornadas de Ingeniería del Software y Bases de Datos, JISBD 2011, A Coruña, 5-7 septiembre 2011.
Resumo:
Imagine being told that your wage was going to be cut in half. Well, that’s what’s soon going to happen to those who make money from Bitcoin mining, the process of earning the online currency Bitcoin. The current expected date for this change is 11 July 2016. Many see this as the day when Bitcoin prices will rocket and when Bitcoin owners could make a great deal of money. Others see it as the start of a Bitcoin crash. At present no one quite knows which way it will go. Bitcoin was created in 2009 by someone known as Satoshi Nakamoto, borrowing from a whole lot of research methods. It is a cryptocurrency, meaning it uses digital encryption techniques to create bitcoins and secure financial transactions. It doesn’t need a central government or organisation to regulate it, nor a broker to manage payments. Conventional currencies usually have a central bank that creates money and controls its supply. Bitcoin is instead created when individuals “mine” for it by using their computers to perform complex calculations through special software. The algorithm behind Bitcoin is designed to limit the number of bitcoins that can ever be created. All Bitcoin transactions are recorded on a public database known as a blockchain. Every time someone mines for Bitcoin, it is recorded with a new block that is transmitted to every Bitcoin app across the network, like a bank updating its online records.
MINING AND VERIFICATION OF TEMPORAL EVENTS WITH APPLICATIONS IN COMPUTER MICRO-ARCHITECTURE RESEARCH
Resumo:
Computer simulation programs are essential tools for scientists and engineers to understand a particular system of interest. As expected, the complexity of the software increases with the depth of the model used. In addition to the exigent demands of software engineering, verification of simulation programs is especially challenging because the models represented are complex and ridden with unknowns that will be discovered by developers in an iterative process. To manage such complexity, advanced verification techniques for continually matching the intended model to the implemented model are necessary. Therefore, the main goal of this research work is to design a useful verification and validation framework that is able to identify model representation errors and is applicable to generic simulators. The framework that was developed and implemented consists of two parts. The first part is First-Order Logic Constraint Specification Language (FOLCSL) that enables users to specify the invariants of a model under consideration. From the first-order logic specification, the FOLCSL translator automatically synthesizes a verification program that reads the event trace generated by a simulator and signals whether all invariants are respected. The second part consists of mining the temporal flow of events using a newly developed representation called State Flow Temporal Analysis Graph (SFTAG). While the first part seeks an assurance of implementation correctness by checking that the model invariants hold, the second part derives an extended model of the implementation and hence enables a deeper understanding of what was implemented. The main application studied in this work is the validation of the timing behavior of micro-architecture simulators. The study includes SFTAGs generated for a wide set of benchmark programs and their analysis using several artificial intelligence algorithms. This work improves the computer architecture research and verification processes as shown by the case studies and experiments that have been conducted.
Resumo:
This paper presents the proposal for a reference model for developing software aimed at small companies. Despite the importance of that represent the small software companies in Latin America, the fact of not having its own standards, and able to meet their specific, has created serious difficulties in improving their process and also in quality certification. In this sense and as a contribution to better understanding of the subject they propose a reference model and as a means to validate the proposal, presents a report of its application in a small Brazilian company, committed to certification of the quality model MPS.BR.
Resumo:
Over the past years, component-based software engineering has become an established paradigm in the area of complex software intensive systems. However, many techniques for analyzing these systems for critical properties currently do not make use of the component orientation. In particular, safety analysis of component-based systems is an open field of research. In this chapter we investigate the problems arising and define a set of requirements that apply when adapting the analysis of safety properties to a component-based software engineering process. Based on these requirements some important component-oriented safety evaluation approaches are examined and compared.
Resumo:
PURPOSE: Fatty liver disease (FLD) is an increasing prevalent disease that can be reversed if detected early. Ultrasound is the safest and ubiquitous method for identifying FLD. Since expert sonographers are required to accurately interpret the liver ultrasound images, lack of the same will result in interobserver variability. For more objective interpretation, high accuracy, and quick second opinions, computer aided diagnostic (CAD) techniques may be exploited. The purpose of this work is to develop one such CAD technique for accurate classification of normal livers and abnormal livers affected by FLD. METHODS: In this paper, the authors present a CAD technique (called Symtosis) that uses a novel combination of significant features based on the texture, wavelet transform, and higher order spectra of the liver ultrasound images in various supervised learning-based classifiers in order to determine parameters that classify normal and FLD-affected abnormal livers. RESULTS: On evaluating the proposed technique on a database of 58 abnormal and 42 normal liver ultrasound images, the authors were able to achieve a high classification accuracy of 93.3% using the decision tree classifier. CONCLUSIONS: This high accuracy added to the completely automated classification procedure makes the authors' proposed technique highly suitable for clinical deployment and usage.
Resumo:
Introduction: A major focus of data mining process - especially machine learning researches - is to automatically learn to recognize complex patterns and help to take the adequate decisions strictly based on the acquired data. Since imaging techniques like MPI – Myocardial Perfusion Imaging on Nuclear Cardiology, can implicate a huge part of the daily workflow and generate gigabytes of data, there could be advantages on Computerized Analysis of data over Human Analysis: shorter time, homogeneity and consistency, automatic recording of analysis results, relatively inexpensive, etc.Objectives: The aim of this study relates with the evaluation of the efficacy of this methodology on the evaluation of MPI Stress studies and the process of decision taking concerning the continuation – or not – of the evaluation of each patient. It has been pursued has an objective to automatically classify a patient test in one of three groups: “Positive”, “Negative” and “Indeterminate”. “Positive” would directly follow to the Rest test part of the exam, the “Negative” would be directly exempted from continuation and only the “Indeterminate” group would deserve the clinician analysis, so allowing economy of clinician’s effort, increasing workflow fluidity at the technologist’s level and probably sparing time to patients. Methods: WEKA v3.6.2 open source software was used to make a comparative analysis of three WEKA algorithms (“OneR”, “J48” and “Naïve Bayes”) - on a retrospective study using the comparison with correspondent clinical results as reference, signed by nuclear cardiologist experts - on “SPECT Heart Dataset”, available on University of California – Irvine, at the Machine Learning Repository. For evaluation purposes, criteria as “Precision”, “Incorrectly Classified Instances” and “Receiver Operating Characteristics (ROC) Areas” were considered. Results: The interpretation of the data suggests that the Naïve Bayes algorithm has the best performance among the three previously selected algorithms. Conclusions: It is believed - and apparently supported by the findings - that machine learning algorithms could significantly assist, at an intermediary level, on the analysis of scintigraphic data obtained on MPI, namely after Stress acquisition, so eventually increasing efficiency of the entire system and potentially easing both roles of Technologists and Nuclear Cardiologists. In the actual continuation of this study, it is planned to use more patient information and significantly increase the population under study, in order to allow improving system accuracy.
Resumo:
Dissertação apresentada para a obtenção do Grau de Doutor em Informática pela Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do grau de Mestre em Engenharia Electrotécnica e de Computadores
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do grau de Mestre em Engenharia Electrotécnica e de Computadores
Resumo:
This paper discusses the results of applied research on the eco-driving domain based on a huge data set produced from a fleet of Lisbon's public transportation buses for a three-year period. This data set is based on events automatically extracted from the control area network bus and enriched with GPS coordinates, weather conditions, and road information. We apply online analytical processing (OLAP) and knowledge discovery (KD) techniques to deal with the high volume of this data set and to determine the major factors that influence the average fuel consumption, and then classify the drivers involved according to their driving efficiency. Consequently, we identify the most appropriate driving practices and styles. Our findings show that introducing simple practices, such as optimal clutch, engine rotation, and engine running in idle, can reduce fuel consumption on average from 3 to 5l/100 km, meaning a saving of 30 l per bus on one day. These findings have been strongly considered in the drivers' training sessions.
Resumo:
Worldwide electricity markets have been evolving into regional and even continental scales. The aim at an efficient use of renewable based generation in places where it exceeds the local needs is one of the main reasons. A reference case of this evolution is the European Electricity Market, where countries are connected, and several regional markets were created, each one grouping several countries, and supporting transactions of huge amounts of electrical energy. The continuous transformations electricity markets have been experiencing over the years create the need to use simulation platforms to support operators, regulators, and involved players for understanding and dealing with this complex environment. This paper focuses on demonstrating the advantage that real electricity markets data has for the creation of realistic simulation scenarios, which allow the study of the impacts and implications that electricity markets transformations will bring to the participant countries. A case study using MASCEM (Multi-Agent System for Competitive Electricity Markets) is presented, with a scenario based on real data, simulating the European Electricity Market environment, and comparing its performance when using several different market mechanisms.
Resumo:
Trabalho de Projeto apresentado como requisito parcial para obtenção do grau de Mestre em Estatística e Gestão de Informação
Resumo:
A Work Project, presented as part of the requirements for the Award of a Masters Degree in Management from the NOVA – School of Business and Economics