950 results for Software repository mining. Process mining. Software developer contribution
Abstract:
The performance of parts produced by Free Form Extrusion (FFE), an increasingly popular additive manufacturing technique, depends mainly on their dimensional accuracy, surface quality and mechanical performance. These attributes are strongly influenced by the evolution of the filament temperature and deformation during deposition and solidification. Consequently, the availability of adequate process modelling software would offer a powerful tool to support efficient process set-up and optimisation. This work examines the contribution to the overall heat transfer of various thermal phenomena developing during the manufacturing sequence, including convection and radiation with the environment, conduction with the support and between adjacent filaments, radiation between adjacent filaments and convection with entrapped air. The magnitude of the mechanical deformation is also studied. Once this exercise is completed, it is possible to select the material properties, process variables and thermal phenomena that should be taken into account for effective numerical modelling of FFE.
Abstract:
Software engineering, software measurement, software process engineering, capability, maturity
Abstract:
This paper presents a process of mining research and development abstract databases to profile current status and to project potential developments for target technologies. The process is called "technology opportunities analysis." This article steps through the process using a sample data set of abstracts from the INSPEC database on the topic of "knowledge discovery and data mining." The paper offers a set of specific indicators suitable for mining such databases to understand innovation prospects. In illustrating the uses of such indicators, it offers some insights into the status of knowledge discovery research.
Abstract:
The subject "Brazil" was analysed in the French thesis database DocThèses, covering the years 1969 to 1999. Data mining was used as a tool to obtain intelligence and knowledge. The software used to clean the DocThèses database was Infotrans, and Dataview was employed for data preparation. The results of the analysis were illustrated by applying the assumptions of Zipf's law, classifying the information as trivial, interesting or noise according to the frequency distribution. It is concluded that data mining combined with specialist software is a powerful ally in applying intelligence to decision-making at all levels, including the macro level, since it provides support for the consolidation, investment and development of actions and policies.
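The Zipf-based classification mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say where the boundaries between the three zones were drawn, so the function name `zipf_zones` and both thresholds (`trivial_share`, `noise_max_count`) are hypothetical choices made only for illustration, not the method used in the study.

```python
from collections import Counter


def zipf_zones(terms, trivial_share=0.1, noise_max_count=1):
    """Split terms into 'trivial', 'interesting' and 'noise' zones by frequency.

    Both thresholds are illustrative assumptions, not values from the study:
    - terms occurring at most `noise_max_count` times are treated as noise;
    - the top `trivial_share` of ranked terms are treated as trivial;
    - everything in between is treated as interesting.
    """
    ranked = Counter(terms).most_common()  # sorted by descending frequency
    n_trivial = max(1, int(len(ranked) * trivial_share))
    zones = {"trivial": [], "interesting": [], "noise": []}
    for rank, (term, count) in enumerate(ranked):
        if count <= noise_max_count:
            zones["noise"].append(term)
        elif rank < n_trivial:
            zones["trivial"].append(term)
        else:
            zones["interesting"].append(term)
    return zones


# Hypothetical keyword list standing in for indexed thesis descriptors.
sample = ["brasil"] * 10 + ["economia"] * 4 + ["amazonia"] * 3 + ["x", "y"]
zones = zipf_zones(sample)
```

In practice the zone boundaries would be chosen by inspecting the rank-frequency curve of the actual database rather than fixed in advance.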
Abstract:
The broad aim of biomedical science in the postgenomic era is to link genomic and phenotype information to allow deeper understanding of the processes leading from genomic changes to altered phenotype and disease. The EuroPhenome project (http://www.EuroPhenome.org) is a comprehensive resource for raw and annotated high-throughput phenotyping data arising from projects such as EUMODIC. EUMODIC is gathering data from the EMPReSSslim pipeline (http://www.empress.har.mrc.ac.uk/) which is performed on inbred mouse strains and knock-out lines arising from the EUCOMM project. The EuroPhenome interface allows the user to access the data via the phenotype or genotype. It also allows the user to access the data in a variety of ways, including graphical display, statistical analysis and access to the raw data via web services. The raw phenotyping data captured in EuroPhenome is annotated by an annotation pipeline which automatically identifies statistically different mutants from the appropriate baseline and assigns ontology terms for that specific test. Mutant phenotypes can be quickly identified using two EuroPhenome tools: PhenoMap, a graphical representation of statistically relevant phenotypes, and mining for a mutant using ontology terms. To assist with data definition and cross-database comparisons, phenotype data is annotated using combinations of terms from biological ontologies.
Abstract:
The book presents the state of the art in machine learning algorithms (artificial neural networks of different architectures, support vector machines, etc.) as applied to the classification and mapping of spatially distributed environmental data. Basic geostatistical algorithms are presented as well. New trends in machine learning and their application to spatial data are given, and real case studies based on environmental and pollution data are carried out. The book provides a CD-ROM with the Machine Learning Office software, including sample sets of data, that will allow both students and researchers to put the concepts rapidly into practice.
Abstract:
Machine Learning for geospatial data: algorithms, software tools and case studies
The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence that is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems for environmental data mining, including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks and mixture density networks. This set of models covers machine learning tasks such as classification, regression and density estimation.
Exploratory data analysis (EDA) is the initial and a very important part of data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are considered using both a traditional geostatistical approach, experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations, which helps to detect the presence of spatial patterns, at least those described by two-point statistics. A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a topical subject, namely the automatic mapping of geospatial data. General regression neural networks (GRNN) are proposed as an efficient model for this task. The performance of the GRNN model is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions.
The thesis consists of four chapters: theory, applications, software tools and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed over the last 15 years and have been used both in many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and in fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
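For readers unfamiliar with the GRNN named in this abstract, the core prediction step can be sketched as a Gaussian-kernel weighted average of the training values (the Nadaraya-Watson form). This is a minimal illustration and not the Machine Learning Office implementation; the function name `grnn_predict`, the 2-D coordinates and the single isotropic bandwidth `sigma` are assumptions made for the example.

```python
import math


def grnn_predict(train_xy, train_values, query, sigma=1.0):
    """Predict a value at `query` as a kernel-weighted mean of training values.

    train_xy     -- list of (x, y) sample locations
    train_values -- measured values at those locations
    query        -- (x, y) location to predict
    sigma        -- kernel bandwidth (assumed isotropic here); in practice it
                    would be tuned, for example by cross-validation
    """
    qx, qy = query
    weights = [
        math.exp(-((x - qx) ** 2 + (y - qy) ** 2) / (2.0 * sigma ** 2))
        for x, y in train_xy
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: the query is too far from every sample
        # for this bandwidth, so no meaningful prediction can be made.
        raise ValueError("query too far from training data for this sigma")
    return sum(w * v for w, v in zip(weights, train_values)) / total


# Two hypothetical measurement points and a query midway between them.
points = [(0.0, 0.0), (2.0, 0.0)]
values = [1.0, 3.0]
midpoint_prediction = grnn_predict(points, values, (1.0, 0.0), sigma=1.0)
near_first_prediction = grnn_predict(points, values, (0.0, 0.0), sigma=0.1)
```

A small `sigma` makes the prediction follow the nearest sample closely, while a large `sigma` smooths towards the global mean, which is why the bandwidth is the model's main tuning parameter.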
Abstract:
The software development industry is constantly evolving. The rise of agile methodologies in the late 1990s, together with new development tools and technologies, demands growing attention from everybody working within this industry. Organizations have, however, used a mixture of various processes and different process languages, since a standard software development process language has not been available. A promising process meta-model called Software & Systems Process Engineering Meta-Model (SPEM) 2.0 has been released recently. It is applied by tools such as Eclipse Process Framework Composer, which is designed for implementing and maintaining processes and method content. Its aim is to support a broad variety of project types and development styles. This thesis presents the concepts of software processes, models, traditional and agile approaches, method engineering, and software process improvement. Some of the most well-known methodologies (RUP, OpenUP, OpenMethod, XP and Scrum) are also introduced, with a comparison provided between them. The main focus is on the Eclipse Process Framework and SPEM 2.0, their capabilities, usage and modeling. As a proof of concept, I present a case study of modeling OpenMethod with EPF Composer and SPEM 2.0. The results show that the new meta-model and tool have made it possible to easily manage method content, publish versions with customized content, and connect project tools (such as MS Project) with the process content. Software process modeling also acts as a process improvement activity.
Abstract:
The main objective of the work was to bring out the most important factors affecting the efficiency of the release process. The study examined the subject from the point of view of a release project leader. The literature review covers the key theories of the software process, service quality and project management. Feedback from customers and from the sales and deployment organizations, together with expert interviews, was used as empirical material. The case product was operating-room management software resold by a large international company. The most important factors affecting the efficiency of the release process are management of the roadmap and of release package content, keeping to project schedules, honest and fast communication to the sales channel and customers, and well-executed testing. The work goes through example strategies for improving in these areas.
Abstract:
Software development tools use information from the source code produced by the developer. The information is used at different stages of a software project and for different purposes. In modern software projects the amount of information used can grow very large. Software tools have their own information models and access mechanisms. The amount of information and the separate tool information models make it very difficult to build a flexible tool environment, especially for a domain-specific software development process. In this work, basic information metamodels of the Unified Modeling Language, the Python programming language and the C++ programming language were analysed. The level of meta-information is restricted to the structural level. Executable constructs have been left out. The ModelBase metamodel has been combined from the existing analysed metamodels. This metamodel can be used in the future for developing software tools.
Abstract:
Software engineering is criticized as not being engineering or a 'well-developed' science at all. Software engineers seem not to know exactly how long their projects will last, what they will cost, or whether the software will work properly after release. Measurements have to be taken in software projects to improve this situation. It is of limited use to only collect metrics afterwards; the values of the relevant metrics have to be predicted, too. The predictions (i.e. estimates) form the basis for proper project management. One of the most painful problems in software projects is effort estimation. It has a clear and central effect on other project attributes like cost and schedule, and on product attributes like size and quality. Effort estimation can be used for several purposes. In this thesis only effort estimation in software projects for project management purposes is discussed. There is a short introduction to measurement issues, and some metrics relevant in the estimation context are presented. Effort estimation methods are covered quite broadly. The main new contribution of this thesis is the new estimation model that has been created. It makes use of the basic concepts of Function Point Analysis, but avoids the problems and pitfalls found in the method. It is relatively easy to use and learn. Effort estimation accuracy has significantly improved after taking this model into use. A major innovation related to the new estimation model is the identified need for hierarchical software size measurement. The author of this thesis has developed a three-level solution for the estimation model. All currently used size metrics are static in nature, but the newly proposed metric is dynamic. It makes use of the increased understanding of the nature of the work as specification and design work proceeds. It thus 'grows up' along with software projects. Developing the effort estimation model is not possible without gathering and analyzing history data.
However, there are many problems with data in software engineering. A major roadblock is the amount and quality of data available. This thesis shows some useful techniques that have been successful in gathering and analyzing the data needed. An estimation process is needed to ensure that methods are used in a proper way, that estimates are stored, reported and analyzed properly, and that they are used for project management activities. A higher-level mechanism called a measurement framework is also introduced briefly. The purpose of the framework is to define and maintain a measurement or estimation process. Without a proper framework, the estimation capability of an organization declines. It requires effort even to maintain an achieved level of estimation accuracy. Estimation results in several successive releases are analyzed. It is clearly seen that the new estimation model works and that the estimation improvement actions have been successful. The calibration of the hierarchical model is a critical activity. An example is shown to shed more light on the calibration and the model itself. There are also remarks about the sensitivity of the model. Finally, an example of usage is shown.
Abstract:
Agile software development methods attempt to answer the software development industry's need for lighter-weight, more agile processes that make it possible to react to changes during the software development process. The objective of this thesis is to analyze and experiment with the possibility of using agile methods or practices in small software projects as well, even in projects with only one developer. In the practical part of the thesis, a small software project was executed with those agile methods and practices that the theoretical part of the thesis had found applicable to the project. In the project, a Bluetooth proxy application that runs on the S60 smartphone platform and on a PC was developed further to include some new features. As a result, it was found that certain agile practices can be useful even in very small projects. The selection of suitable practices depends on the project and the size of the project team.
Abstract:
Software development is a complex process. One of the key factors in it is the requirements set for the software. These requirements are of many kinds and at different levels, from desired functionality to very detailed requirements. Managing these requirements is also very multifaceted, although in the literature it is presented as a clear process consisting of a series of distinct phases. The focus of the work was on managing changes to these requirements and feedback on the finished software, and on how requirements management software could help in these processes. Using a requirements management tool does not in itself solve any problems, but it provides a framework for improving requirements management. The benefits of using a tool include centralized storage of requirements, definition of user rights governing which users may view or modify information, management of the change management process, analysis of the impact of changes and traceability, and access to information with a web browser.
Abstract:
The importance of software to today's society is continuously growing. Many software projects are plagued by problems in keeping to schedule, maintaining high productivity and achieving sufficiently high quality. Large investments have been made in software process improvement to minimize these problems. The reason for the investments has been the assumption that software development capacity depends directly on product quality. The purpose of this study was to examine the possibilities of software process improvement. Existing models, techniques and methodologies for software development and software process improvement were presented. The applicability of the presented models, techniques and methodologies was analysed, and a recommendation on the use of the models was given.
Abstract:
The aim of requirements specification is to create a complete, consistent list of requirements for the desired system in order to define the requirements at a conceptual level. Business process modelling is quite useful in the early stages of requirements specification. This work studies business process modelling for the development of information systems. Today there are various techniques intended for business process modelling. This work reviews the principles and aspects of business process modelling as well as different modelling techniques. A new method, designed especially for small and medium-sized software projects, has been developed on the basis of process aspects and UML diagrams.