950 results for: Software repository mining. Process mining. Software developer contribution


Relevance: 100.00%

Abstract:

This paper presents a preliminary version of a support system in the air transport passenger domain. The system relies on an underlying ontological structure representing a normative framework to facilitate the provision of contextualized, relevant legal information. This information includes the passenger's rights, and it enhances self-litigation and the decision-making process of passengers. Our contribution attempts to render user-centric legal information grounded on case scenarios of the most pronounced incidents related to consumer complaints in the EU. A number of advantages with respect to current state-of-the-art services are discussed, and a case study illustrates a possible technological application.

Relevance: 100.00%

Abstract:

Speckle is being used as a characterization tool for analyzing the dynamics of slow-varying phenomena occurring in biological and industrial samples at the surface or near-surface regions. The retrieved data take the form of a sequence of speckle images. These images contain information about the inner dynamics of the biological or physical process taking place in the sample. Principal component analysis (PCA) is able to split the original data set into a collection of classes related to processes showing different dynamics. In addition, statistical descriptors of speckle images are used to retrieve information on the characteristics of the sample. These statistical descriptors can be calculated in almost real time and provide fast monitoring of the sample. PCA, on the other hand, requires a longer computation time, but its results contain more information related to spatial-temporal patterns associated with the process under analysis. This contribution merges both descriptions and uses PCA as a preprocessing tool to obtain a collection of filtered images, on each of which the statistical descriptors are evaluated. The method applies to slow-varying biological and industrial processes.
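
A minimal sketch of the merged pipeline described above, assuming the speckle sequence is stored as a NumPy array; the function name, the number of components and the choice of descriptors (mean intensity and speckle contrast) are our assumptions, not the paper's:

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_filtered_descriptors(stack, n_components=5):
    """PCA-filter a speckle image sequence, then compute simple
    statistical descriptors on each filtered frame.

    stack: array of shape (n_frames, height, width).
    """
    n_frames, h, w = stack.shape
    X = stack.reshape(n_frames, h * w).astype(float)

    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(X)  # per-frame weights of each component

    descriptors = []
    for k in range(n_components):
        # Reconstruct the sequence from component k alone: one "filtered"
        # movie per dynamic class identified by PCA.
        filtered = np.outer(scores[:, k], pca.components_[k]) + pca.mean_
        frames = filtered.reshape(n_frames, h, w)
        # Per-frame descriptors: mean intensity and speckle contrast.
        mean = frames.mean(axis=(1, 2))
        contrast = frames.std(axis=(1, 2)) / mean
        descriptors.append({"mean": mean, "contrast": contrast})
    return descriptors
```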

Relevance: 100.00%

Abstract:

Orthodox contingency theory links effective organisational performance to compatible relationships between the environment and organisation strategy and structure, and assumes that organisations have the capacity to adapt as the environment changes. Recent contributions to the literature on organisation theory claim that the key to effective performance is effective adaptation, which in turn requires the simultaneous reconciliation of efficiency and innovation afforded by a unique environment-organisation configuration. The literature on organisation theory recognises the continuing confusion caused by the fragmented and often conflicting results from cross-sectional studies. Although the case is made for longitudinal studies which comprehensively describe the evolving relationship between the environment and the organisation, there is little to suggest how such studies should be executed in practice. Typically the choice is between the approaches of the historicised case study and the statistical analysis of large populations, which examine the relationship between environment and organisation strategy and/or structure while ignoring the product-process relationship. This study combines the historicised case study with the multi-variable, ordinal-scale approach of statistical analysis to construct an analytical framework which tracks and exposes the environment-organisation-performance relationship over time. The framework examines changes in the environment, strategy and structure, and uniquely includes an assessment of the organisation's product-process relationship and its contribution to organisational efficiency and innovation. The analytical framework is applied to examine the evolving environment-organisation relationship of two organisations in the same industry over the same twenty-five year period, providing a sector perspective of organisational adaptation. The findings demonstrate the significance of the environment-organisation configuration to the scope and frequency of adaptation, and suggest that the level of sector homogeneity may be linked to the level of product-process standardisation.

Relevance: 100.00%

Abstract:

Baths containing sulphuric acid as primary catalyst, and others with selected secondary catalysts (methane sulphonic acid (MSA), SeO2, a KBrO3/KIO3 mixture, indium, uranium, and the commercial high-speed catalysts HEEF-25 and HEEF-405), were studied. The secondary catalysts influenced the cathodic current efficiency (CCE), brightness and cracking. Chromium deposition mechanisms were studied in Part II using potentiostatic and potentiodynamic electroanalytical techniques under stationary and hydrodynamic conditions. Sulphuric acid as primary catalyst, and MSA, HEEF-25, HEEF-405 and sulphosalicylic acid as co-catalysts, were explored for different rotation speeds and scan rates. The maximum current was resolved into diffusion-limited and kinetically limited components, and a contribution towards understanding the electrochemical mechanism is proposed. Reaction kinetics were further studied for H2SO4-, MSA- and methane disulphonic acid-catalysed systems, and their influence on the reaction mechanisms elaborated. The charge transfer coefficient and the electrochemical reaction orders for the first stage of the electrodeposition process were determined. A contribution was made towards understanding the influence of H2SO4 and MSA on the hydrogen evolution rate. Anodic dissolution of chromium in the chromic acid solution was studied with a number of techniques. An electrochemical dissolution mechanism is proposed, based on the results of rotating gold ring-disc experiments and scanning electron microscopy. Finally, significant increases in chromium electrodeposition rates under non-stationary conditions (PRC mode) were studied, and a deposition mechanism is elaborated based on experimental data and theoretical considerations.
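
Resolving a measured current into diffusion-limited and kinetically limited components from rotating-electrode data is conventionally done with a Koutecký-Levich analysis; the abstract does not quote the thesis's equations, so the standard textbook form is sketched here only for orientation:

```latex
% Koutecký-Levich decomposition: the measured current density i splits
% into a kinetic term i_k and a diffusion-limited (Levich) term i_d.
\[
  \frac{1}{i} \;=\; \frac{1}{i_k} + \frac{1}{i_d},
  \qquad
  i_d \;=\; 0.62\, n F D^{2/3} \nu^{-1/6} \omega^{1/2} C
\]
% n: electrons transferred, F: Faraday constant, D: diffusion coefficient,
% \nu: kinematic viscosity, \omega: electrode rotation rate, C: bulk
% concentration. At fixed potential, plotting 1/i against \omega^{-1/2}
% gives 1/i_k as the intercept and the diffusion term from the slope.
```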

Relevance: 100.00%

Abstract:

Atmospheric dust samples collected along a transect off the West African coast have been investigated for their lipid content and compound-specific stable carbon isotope compositions. The saturated hydrocarbon fractions of the organic solvent extracts consist mainly of long-chain n-alkanes derived from epicuticular wax coatings of terrestrial plants. Backward trajectories for each sampling day and location were calculated using a global atmospheric circulation model. The main atmospheric transport took place in the low-level trade-wind layer, except in the southern region, where long-range transport in the mid-troposphere occurred. Changes in the chain length distributions of the n-alkane homologous series are probably related to aridity, rather than temperature or vegetation type. The carbon preference of the leaf-wax n-alkanes shows significant variation, attributed to a variable contribution of fossil fuel- or marine-derived lipids. The effect of this nonwax contribution on the δ13C values of the two dominant n-alkanes in the aerosols, n-C29 and n-C31 alkane, is, however, insignificant. Their δ13C values were translated into a percentage of C4 vs. C3 plant type contribution, using a two-component mixing equation with isotopic end-member values from the literature. The data indicate that only regions with a predominant C4 type vegetation, i.e. the Sahara, the Sahel, and Gabon, supply C4 plant-derived lipids to dust organic matter. The stable carbon isotopic compositions of leaf-wax lipids in aerosols mainly reflect the modern vegetation type along their transport pathway. Wind abrasion of wax particles from leaf surfaces, enhanced by a sandblasting effect, is most probably the dominant process of terrigenous lipid contribution to aerosols.
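
The two-component mixing equation referred to is the standard linear isotope mass balance; written out (our formulation, since the study's exact end-member values are not given in this abstract):

```latex
% Percentage of C4-plant contribution from two isotopic end members:
\[
  \%\,\mathrm{C_4} \;=\;
  \frac{\delta^{13}\mathrm{C}_{\mathrm{sample}} - \delta^{13}\mathrm{C}_{\mathrm{C_3}}}
       {\delta^{13}\mathrm{C}_{\mathrm{C_4}} - \delta^{13}\mathrm{C}_{\mathrm{C_3}}}
  \times 100
\]
% Literature end members for leaf-wax n-alkanes are typically around
% -35 per mil (C3) and -21 per mil (C4); the values actually adopted in
% the study come from the literature it cites.
```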

Relevance: 80.00%

Abstract:

This project undertakes research both into finding predictors via clustering techniques and into reviewing free Data Mining software. The research is based on a case study, through which, in addition to the free KDD software used by the scientific community, a new free tool for pre-processing the data is presented. The predictors are intended for the e-learning domain, as the data from which these predictors have to be inferred are student qualifications from different e-learning environments. Through our case study, not only are clustering algorithms tested, but additional goals are also proposed.
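
As a minimal, hypothetical illustration of clustering student qualifications to find candidate predictors (the project's actual algorithms, features and tools are not named in this abstract):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical data: one row per student, one column per assessed
# activity; the project's real features and pre-processing differ.
grades = np.array([
    [7.5, 8.0, 6.5, 9.0],
    [4.0, 5.5, 3.0, 4.5],
    [8.5, 9.0, 9.5, 8.0],
    [3.5, 4.0, 5.0, 3.0],
])

X = StandardScaler().fit_transform(grades)
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Cluster labels can then be inspected as candidate predictors of
# student outcomes (e.g. pass/fail groups).
print(model.labels_)
```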

Relevance: 80.00%

Abstract:

We present a method to enhance fault localization for software systems, based on a frequent pattern mining algorithm. Our method relies on a large set of test cases for a given set of programs in which faults can be detected. The test executions are recorded as function call trees. Based on test oracles, the tests can be classified as successful or failing. A frequent pattern mining algorithm is used to identify frequent subtrees in successful and failing test executions. This information is used to rank functions according to their likelihood of containing a fault. The ranking suggests an order in which to examine the functions during fault analysis. We validate our approach experimentally using a subset of the Siemens benchmark programs.
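
The abstract gives neither the mining algorithm nor the ranking formula; as a simplified stand-in that replaces frequent subtrees with plain per-test call sets, a suspiciousness ranking might look like:

```python
from collections import Counter

def rank_functions(passing_calls, failing_calls):
    """Rank functions by how much more often they appear in failing
    than in passing test executions.

    passing_calls / failing_calls: lists of sets, one set of called
    function names per test execution (a simplification of the paper's
    frequent-subtree patterns).
    """
    in_pass = Counter(f for test in passing_calls for f in test)
    in_fail = Counter(f for test in failing_calls for f in test)
    n_pass, n_fail = len(passing_calls), len(failing_calls)

    funcs = set(in_pass) | set(in_fail)
    # Higher score = appears proportionally more often in failing runs.
    score = {f: in_fail[f] / n_fail - in_pass[f] / n_pass for f in funcs}
    return sorted(funcs, key=score.get, reverse=True)

# Example: "g" is only called when tests fail, so it ranks first.
print(rank_functions(
    passing_calls=[{"main", "f"}, {"main", "f", "h"}],
    failing_calls=[{"main", "f", "g"}, {"main", "g"}],
))
```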

Relevance: 80.00%

Abstract:

This report is the result of a project whose objective was to analyse the possible application of Process Mining techniques to AmI (Ambient Intelligence) environments. The analysis presents clearly the results extracted from the processes of a proposed use case, and applies those results to AmI applications such as task automation or agent-based social simulation. To make the analysis understandable to the reader, detailed explanations of the concepts involved and the techniques employed are given. In addition, the two most widely used process mining software tools, ProM and Disco, are analysed exhaustively, presenting the advantages and drawbacks of each as well as a comparison between the two. A methodology for process analysis with the aforementioned ProM tool was then developed, carefully explaining each step as well as the foundations of the algorithms used. Finally, the conclusions drawn from the work are presented, together with possible lines of continuation for the project.
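
ProM and Disco are interactive (GUI) tools; as a scriptable illustration of the same discovery step on an event log, the open-source pm4py library could be used. This substitution and the file name are ours, not part of the project:

```python
import pm4py

# Hypothetical event log; in the project the logs come from the AmI
# use case. pm4py reads XES event logs.
log = pm4py.read_xes("ami_events.xes")

# Discover a process model with the Inductive Miner, one of the
# discovery algorithms also available inside ProM.
net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)
pm4py.view_petri_net(net, initial_marking, final_marking)
```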

Relevance: 80.00%

Abstract:

Computer software plays an important role in business, government, society and the sciences. To solve real-world problems, it is very important to measure quality and reliability across the software development life cycle (SDLC). Software Engineering (SE) is the computing field concerned with designing, developing, implementing, maintaining and modifying software. The present paper gives an overview of the Data Mining (DM) techniques that can be applied to various types of SE data in order to address the challenges posed by SE tasks such as programming, bug detection, debugging and maintenance. A specific piece of DM software is also discussed: an analytical tool for analyzing data and summarizing the relationships that have been identified. The paper concludes that the proposed DM techniques within the domain of SE could be applied equally well in fields such as Customer Relationship Management (CRM), eCommerce and eGovernment. ACM Computing Classification System (1998): H.2.8.
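
As one concrete instance of DM applied to SE data (our illustration, not an example from the paper): mining co-change patterns from version-control history, a classic maintenance-support task:

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit history: the set of files changed per commit.
commits = [
    {"parser.c", "lexer.c"},
    {"parser.c", "lexer.c", "ast.h"},
    {"ui.c"},
    {"parser.c", "ast.h"},
]

pair_counts = Counter()
for files in commits:
    pair_counts.update(combinations(sorted(files), 2))

# File pairs that change together often are candidates for coupled
# maintenance ("if you edit one, check the other").
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```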

Relevance: 80.00%

Abstract:

Software product line modeling aims at capturing a set of software products in an economic yet meaningful way. We introduce a class of variability models that capture the sharing between the software artifacts forming the products of a software product line (SPL) in a hierarchical fashion, in terms of commonalities and orthogonalities. Such models are useful when analyzing and verifying all products of an SPL, since they provide a scheme for divide-and-conquer-style decomposition of the analysis or verification problem at hand. We define an abstract class of SPLs for which variability models can be constructed that are optimal w.r.t. the chosen representation of sharing. We show how the constructed models can be fed into a previously developed algorithmic technique for compositional verification of control-flow temporal safety properties, so that the properties to be verified are iteratively decomposed into simpler ones over orthogonal parts of the SPL, and are not re-verified over the shared parts. We provide tool support for our technique, and evaluate our tool on a small but realistic SPL of cash desks.
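
A toy rendering of the kind of hierarchical sharing structure the abstract describes, with names and semantics ours rather than the paper's formal definitions: each node contributes artifacts common to every product below it, and its children are the variants, so a compositional analysis can visit each shared set once instead of re-verifying it per product.

```python
from dataclasses import dataclass, field

@dataclass
class VariabilityNode:
    common: set[str]                          # artifacts shared at this node
    children: list["VariabilityNode"] = field(default_factory=list)

    def products(self) -> list[set[str]]:
        """Enumerate products: shared artifacts plus one variant choice."""
        if not self.children:
            return [set(self.common)]
        result = []
        for child in self.children:
            for p in child.products():
                result.append(self.common | p)
        return result

# Cash-desk-like toy SPL: a shared core with two orthogonal variants.
spl = VariabilityNode(
    common={"scanner", "cashbox"},
    children=[VariabilityNode({"card_payment"}),
              VariabilityNode({"cash_payment"})],
)
print(spl.products())
```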

Relevance: 80.00%

Abstract:

The nation's freeway systems are becoming increasingly congested. A major contributor to traffic congestion on freeways is traffic incidents: non-recurring events, such as accidents or stranded vehicles, that cause a temporary reduction in roadway capacity, and which can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic away from incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors; determining and understanding these factors can help in identifying and developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, yet with limited success.

This dissertation research attempts to improve on this previous effort by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid the decision making of traffic diversion during an ongoing incident. Multiple data mining analysis techniques were applied and evaluated: multiple linear regression analysis and a decision-tree-based method for the offline models, and a rule-based method and the M5P tree algorithm for the online models.

The results show that the models can, in general, achieve high prediction accuracy within acceptable intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.
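
A minimal sketch of an offline-style duration model. The FDOT database fields are not listed in the abstract, so the features and numbers below are invented for illustration, and sklearn's CART-style regressor stands in for the dissertation's tree methods (M5P itself is a Weka algorithm):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical incident records:
# [lanes_blocked, vehicles_involved, is_peak_hour, severity_level]
X = np.array([
    [1, 2, 1, 1],
    [3, 4, 1, 3],
    [1, 1, 0, 1],
    [2, 3, 0, 2],
    [3, 5, 1, 3],
])
y = np.array([25.0, 95.0, 15.0, 45.0, 120.0])  # duration in minutes

model = DecisionTreeRegressor(max_depth=3).fit(X, y)
print(model.predict([[2, 3, 1, 2]]))  # predicted duration, new incident
```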

Relevance: 70.00%

Abstract:

The principal topic of this work is the application of data mining techniques, in particular machine learning, to the discovery of knowledge in a protein database. In the first chapter a general background is presented. In section 1.1 we overview the methodology of a Data Mining project and its main algorithms. In section 1.2 an introduction to proteins and their supporting file formats is outlined. The chapter concludes with section 1.3, which defines the main problem we intend to address in this work: determining whether an amino acid is exposed or buried in a protein, in a discrete way (i.e., not continuous), for five exposition levels: 2%, 10%, 20%, 25% and 30%. In the second chapter, following the CRISP-DM methodology closely, the whole process of constructing the database that supported this work is presented. Namely, the process of loading data from the Protein Data Bank, DSSP and SCOP is described. Then an initial data exploration is performed and a simple prediction model (baseline) of the relative solvent accessibility of an amino acid is introduced. The Data Mining Table Creator, a program developed to produce the data mining tables required for this problem, is also introduced. In the third chapter the results obtained are analyzed with statistical significance tests. Initially the classifiers used (Neural Networks, C5.0, CART and CHAID) are compared, and it is concluded that C5.0 is the most suitable for the problem at stake. The influence of parameters such as the amino acid information level, the amino acid window size and the SCOP class type on the accuracy of the predictive models is also compared. The fourth chapter starts with a brief review of the literature on amino acid relative solvent accessibility. We then overview the main results achieved and finally discuss possible future work. The fifth and last chapter consists of appendices. Appendix A has the schema of the database that supported this thesis. Appendix B has a set of tables with additional information. Appendix C describes the software provided on the DVD accompanying this thesis, which allows the reconstruction of the present work.
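
The discretization implied by the five exposition levels can be sketched as simple thresholding of relative solvent accessibility (RSA); the RSA values and the boundary convention (strictly greater than) below are illustrative assumptions, not the thesis's definitions:

```python
# Label an amino acid "exposed" if its relative solvent accessibility
# (RSA) exceeds a threshold, "buried" otherwise.
THRESHOLDS = [0.02, 0.10, 0.20, 0.25, 0.30]  # the five exposition levels

def label_exposure(rsa: float, threshold: float) -> str:
    return "exposed" if rsa > threshold else "buried"

rsa_values = [0.01, 0.15, 0.28, 0.55]  # hypothetical residue RSAs
for t in THRESHOLDS:
    print(t, [label_exposure(r, t) for r in rsa_values])
```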

Relevance: 70.00%

Abstract:

This paper presents the Realistic Scenarios Generator (RealScen), a tool that processes data from real electricity markets to generate realistic scenarios that enable the modeling of electricity market players’ characteristics and strategic behavior. The proposed tool provides significant advantages to the decision making process in an electricity market environment, especially when coupled with a multi-agent electricity markets simulator. The generation of realistic scenarios is performed using mechanisms for intelligent data analysis, which are based on artificial intelligence and data mining algorithms. These techniques allow the study of realistic scenarios, adapted to the existing markets, and improve the representation of market entities as software agents, enabling a detailed modeling of their profiles and strategies. This work contributes significantly to the understanding of the interactions between the entities acting in electricity markets by increasing the capability and realism of market simulations.
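
The abstract does not name RealScen's algorithms; one plausible data-mining step of this kind, entirely our assumption, is clustering historical daily price profiles into representative scenarios:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: 200 days x 24 hourly market prices.
rng = np.random.default_rng(0)
profiles = rng.normal(50, 10, size=(200, 24))

# Cluster the daily profiles; centroids serve as representative
# scenarios that a market simulator could replay.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(profiles)
scenarios = kmeans.cluster_centers_   # four representative daily scenarios
print(scenarios.shape)
```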

Relevance: 70.00%

Abstract:

Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing the interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This fact has heightened the importance of Biomedical Text Mining approaches to the task. Also, owing to the lack of adequate standards, inconsistencies in gene and protein names and identifiers become common as the number of databases increases. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We then applied text mining techniques over this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities, and also to develop an approach that retrieves all the abstracts of the related scientific papers stored in PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides methods from the Biomedical Text Mining field, including an algorithm for Named Entity Recognition (NER), the extraction of relevant terms from publication abstracts, and the extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, this tool was extended to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.
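
A minimal sketch of the PubMed abstract-retrieval step, using Biopython's Entrez interface; the query string, email and result count are placeholders, and this is not the @Note/KREN pipeline itself:

```python
from Bio import Entrez

Entrez.email = "you@example.org"  # required by NCBI; placeholder address

# Find papers matching a gene-regulation query.
handle = Entrez.esearch(db="pubmed",
                        term="transcription factor gene regulation",
                        retmax=20)
ids = Entrez.read(handle)["IdList"]
handle.close()

# Fetch the matching abstracts as plain text for a text-mining corpus.
fetch = Entrez.efetch(db="pubmed", id=",".join(ids),
                      rettype="abstract", retmode="text")
abstracts = fetch.read()
fetch.close()
print(abstracts[:500])
```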