950 results for Software repository mining. Process mining. Software developer contribution
Abstract:
The new generation of multicore processors opens new perspectives for the design of embedded systems. Multiprocessing, however, poses new challenges to the scheduling of real-time applications, whose ever-increasing computational demands are constantly flanked by the need to meet critical time constraints. Many research works have contributed to this field by introducing advanced scheduling algorithms. However, although many of these works have solidly demonstrated their effectiveness, the actual support for multiprocessor real-time scheduling offered by current operating systems is still very limited. This dissertation deals with implementation aspects of real-time schedulers in modern embedded multiprocessor systems. The first contribution is an open-source scheduling framework capable of realizing complex multiprocessor scheduling policies, such as G-EDF, on conventional operating systems by exploiting only their native scheduler from user space. A set of experimental evaluations compares the proposed solution to other research projects that pursue the same goals by means of kernel modifications, highlighting comparable scheduling performance. The principles that underpin the operation of the framework, originally designed for symmetric multiprocessors, have been further extended, first to asymmetric ones, which are subject to major restrictions such as the lack of support for task migration, and later to re-programmable hardware architectures (FPGAs). In the latter case, this work introduces a scheduling accelerator, which offloads most of the scheduling operations to the hardware and exhibits extremely low scheduling jitter. The realization of a portable scheduling framework presented many interesting software challenges, one of which was timekeeping. In this regard, a further contribution is a novel data structure called the addressable binary heap (ABH). The ABH, which is conceptually a pointer-based implementation of a binary heap, shows very good average- and worst-case performance when applied to the problem of tick-less timekeeping with high-resolution timers.
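The ABH interface is not detailed in the abstract; as a hedged illustration of the role it plays (an addressable priority queue of armed timers, keyed by expiration time), the following minimal sketch supports insertion, cancellation through a handle, and extraction of the earliest timer. For brevity it uses an implicit array heap with per-entry handles, whereas the dissertation's ABH is explicitly pointer-based; all names and operations here are illustrative assumptions.

```python
"""Sketch of an addressable min-heap for tick-less, high-resolution timer management.

The real ABH is pointer-based; this array-backed version only illustrates the
operations such a structure must offer (insert, cancel-by-handle, pop-earliest).
"""

class TimerHeap:
    def __init__(self):
        # Implicit binary min-heap of handles; each handle is a dict with
        # "expires", "callback", and its current "index" in the array.
        self._heap = []

    def insert(self, expires, callback):
        handle = {"expires": expires, "callback": callback, "index": len(self._heap)}
        self._heap.append(handle)
        self._sift_up(handle["index"])
        return handle                      # caller keeps the handle to cancel the timer later

    def cancel(self, handle):
        """Remove an arbitrary timer in O(log n) via its handle (the 'addressable' part)."""
        i = handle["index"]
        last = self._heap.pop()
        if i < len(self._heap):            # the removed entry was not the last one
            self._heap[i] = last
            last["index"] = i
            self._sift_down(self._sift_up(i))

    def pop_earliest(self):
        """Return and remove the timer with the smallest expiration time."""
        earliest = self._heap[0]
        self.cancel(earliest)
        return earliest

    def _swap(self, i, j):
        h = self._heap
        h[i], h[j] = h[j], h[i]
        h[i]["index"], h[j]["index"] = i, j

    def _sift_up(self, i):
        while i > 0 and self._heap[i]["expires"] < self._heap[(i - 1) // 2]["expires"]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2
        return i

    def _sift_down(self, i):
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < len(self._heap) and self._heap[child]["expires"] < self._heap[smallest]["expires"]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest
```

A timer subsystem would call insert when a timer is armed and cancel with the returned handle when it is disarmed; constant-time access to an arbitrary entry is what distinguishes an addressable heap from a plain binary heap.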
Abstract:
Advances in sequencing technologies in recent years have made it possible to catalogue the genetic variants present in human samples, leading to new discoveries and insights in medical, pharmaceutical, and evolutionary research as well as in population studies. The amount of sequence data produced is very large, and identifying variants requires several stages of processing of the genetic information, with each step generating further information. Alongside this immense accumulation of data, the scientific community has felt the need to organize the data into repositories, at first only to share research results, and later to allow statistical studies directly on the genetic data. Large-scale studies involve data volumes on the order of petabytes, whose maintenance continues to pose a challenge to the infrastructure. Given the variety and quantity of data produced, databases play a role of primary importance in this challenge. Data models and data organization in this field can make the difference not only for scalability but, above all, for suitability for data mining. Indeed, the storage of these data in files with quasi-standard formats, the size of these files, and the computational requirements involved make it difficult to write efficient analysis software and discourage large-scale studies on heterogeneous data. Before designing the database, we therefore studied the evolution, over the last twenty years, of the quasi-standard formats for biological flat files, which contain heterogeneous metadata and the actual nucleotide sequences, with records lacking structural relations. Recently this evolution has culminated in the adoption of the XML standard, but delimited flat files remain the formats most widely supported by tools and online platforms. This was followed by an analysis of the internal data organization of public biological databases. These databases contain genes, genetic variants, protein structures, phenotype ontologies, disease-gene relations, and drug-gene relations. The public databases studied include OMIM, Entrez, KEGG, UniProt, and GO. The main objective in studying and modeling the genetic database was to structure the data so as to integrate the heterogeneous data produced and make data-mining processes computationally feasible. The choice of Hadoop/MapReduce technology is particularly effective in this case, given the scalability it guarantees and its efficiency in the most complex and parallel statistical analyses, such as those involving multi-locus allelic variants.
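The abstract describes the Hadoop/MapReduce choice only at the architectural level. As a hedged illustration, a minimal Hadoop Streaming job that counts alleles per locus over VCF-like records might look as follows; the field positions, input layout, and script name are assumptions made for the example, not the thesis's actual data model.

```python
#!/usr/bin/env python3
"""Minimal Hadoop Streaming sketch: count occurrences of each allele per locus.

Assumes tab-separated, VCF-like input lines:  CHROM  POS  ID  REF  ALT  ...
Run (hypothetically) as:
  hadoop jar hadoop-streaming.jar -mapper "allele_count.py map" -reducer "allele_count.py reduce" ...
"""
import sys
from itertools import groupby

def mapper(lines):
    for line in lines:
        if line.startswith("#"):                      # skip VCF header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], fields[1], fields[4]
        for allele in alt.split(","):                 # multi-allelic sites emit one record each
            print(f"{chrom}:{pos}:{allele}\t1")

def reducer(lines):
    parsed = (line.rstrip("\n").split("\t") for line in lines)
    # Hadoop delivers reducer input sorted by key, so identical keys arrive adjacently.
    for key, group in groupby(parsed, key=lambda r: r[0]):
        total = sum(int(count) for _, count in group)
        print(f"{key}\t{total}")

if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else "map"
    mapper(sys.stdin) if mode == "map" else reducer(sys.stdin)
```

The same pair of functions runs unchanged on a single machine (piped through sort) or on a cluster, which is the scalability property the abstract relies on for analyses over petabyte-scale variant data.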
Abstract:
During the last few decades an unprecedented technological growth has been at the center of the embedded systems design landscape, with Moore's Law being the leading factor of this trend. Today, in fact, an ever-increasing number of cores can be integrated on the same die, marking the transition from state-of-the-art multi-core chips to the new many-core design paradigm. Despite the extraordinarily high computing power, the complexity of many-core chips opens the door to several challenges. As a result of the increased silicon density of modern Systems-on-a-Chip (SoC), the design space exploration needed to find the best design has exploded, and hardware designers are in fact facing the problem of a huge design space. Virtual Platforms have always been used to enable hardware-software co-design, but today they must cope with the huge complexity of both hardware and software systems. In this thesis two different research works on Virtual Platforms are presented: the first is intended for the hardware developer, to allow complex cycle-accurate simulations of many-core SoCs with ease. The second exploits the parallel computing power of off-the-shelf General Purpose Graphics Processing Units (GPGPUs) with the goal of increasing simulation speed. The term Virtualization can be used in the context of many-core systems not only to refer to the aforementioned hardware emulation tools (Virtual Platforms), but also for two other main purposes: 1) to help the programmer achieve the maximum possible performance of an application by hiding the complexity of the underlying hardware; 2) to efficiently exploit the highly parallel hardware of many-core chips in environments with multiple active Virtual Machines. This thesis focuses on virtualization techniques with the goal of mitigating, and where possible overcoming, some of the challenges introduced by the many-core design paradigm.
Abstract:
In Western industrialized countries, breast cancer is the most common malignant tumor in women. Its worldwide share of all cancers in women amounts to about 21%. By now, one in nine women is at risk of developing breast cancer during her lifetime. The age-standardized mortality rate currently stands at just under 27%. Breast cancer has a relatively low growth rate. A diagnostic procedure with which all breast carcinomas under 10 mm in diameter could be detected and removed would practically eliminate death from breast cancer, since the 20-year survival rate for initial carcinomas of 5 to 10 mm in size is very high, at over 95%. Contrast-enhanced MRI provides a relatively young examination method that is sensitive enough to detect carcinomas from a diameter of 3 mm. The diagnostic methodology, however, is complex and error-prone, and it requires a long training period and therefore considerable experience on the part of the radiologist. Computer-aided diagnosis software can improve the quality of such a complex diagnosis, or at least speed up the process. The goal of this work is the development of fully automatic diagnosis software that can be used as a second-opinion system. To my knowledge, no such complete software exists to date. The software executes a chain of image processing steps modeled on the radiologist's approach and produces an independent diagnosis for every detected lesion: first, a 3D image registration eliminates motion artifacts as a preprocessing step in order to improve the image quality for the subsequent processing stages. Every contrast-enhancing object is then detected by a rule-based segmentation with adaptive thresholds. Kinetic and morphological features are computed to describe the contrast-agent uptake as well as the shape, margin, and texture properties of each object. Finally, based on the resulting feature vector, two trained neural networks classify each object either as an additional finding or as a benign or malignant lesion. The performance of the software was tested on image data from 101 female patients containing 141 histologically confirmed lesions. The prediction of whether these lesions were benign or malignant achieved a sensitivity of 88% and a specificity of 72%. These values are similar to the predictions of expert radiologists reported in the literature. The predictions contained on average 2.5 additional malignant findings per patient, which turned out to be misclassified artifacts.
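The final classification stage described above, a feature vector of kinetic and morphological descriptors fed to trained neural networks, can be sketched as follows. The feature names, network size, and use of scikit-learn's MLPClassifier are illustrative assumptions, not the thesis's actual implementation.

```python
"""Sketch of the last CAD stage: classify each segmented lesion from its feature vector."""
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical per-lesion features: kinetic (contrast uptake) and morphological (shape, margin, texture).
FEATURES = ["wash_in_rate", "wash_out_rate", "sphericity", "margin_sharpness", "texture_entropy"]

def train_lesion_classifier(X: np.ndarray, y: np.ndarray) -> Pipeline:
    """Train a small feed-forward network on labelled lesions (0 = benign, 1 = malignant)."""
    model = make_pipeline(
        StandardScaler(),                                  # features live on very different scales
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    )
    model.fit(X, y)
    return model

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(141, len(FEATURES)))         # stand-in for 141 annotated lesions
    y_demo = rng.integers(0, 2, size=141)                  # stand-in labels, random here
    clf = train_lesion_classifier(X_demo, y_demo)
    print(clf.predict(X_demo[:5]))                         # predicted class for the first five lesions
```

The thesis uses two networks (additional findings versus benign or malignant lesions); a second classifier of the same shape, trained on its own labels, would cover that second decision.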
Abstract:
Agile methodologies have become the standard approach to software development. The most popular and widely used one is Scrum. Scrum is a very simple and flexible framework that responds to unpredictability in a highly effective way. However, its implementation must be correct, and since Scrum tells you what to do but not how to do it, this is not trivial. In this thesis I will describe the Scrum framework, how to implement it, and a tool that can help to do so. The thesis is divided into three parts. The first part is called Scrum; here I will introduce the framework itself, its key concepts, and its components. In Scrum there are three kinds of components: roles, meetings, and artifacts. Each of these is meant to accomplish a series of specific tasks. After describing the “what to do”, in the second part, Best Practices, I will focus on the “how to do it”: for example, how to decide which items should be included in the next sprint, how to estimate tasks, and how the team workspace should be organized. Finally, in the third part, called Tools, I will introduce Visual Studio Online, a cloud service from Microsoft that offers Git and TFVC repositories and the ability to manage projects with Scrum.
Abstract:
Public participation is an important component of Michigan’s Part 632 Nonferrous Mining law and is identified by researchers as important to decision-making processes. The Kennecott Eagle Project, located near Marquette, Michigan, is the first mine permitted under Michigan’s new mining regulation, and this research examines how public participation is structured in regulations, how the permitting process unfolded for the Eagle Project, and how participants in the permitting process perceived their participation. To understand these issues, this research combined a review of the existing mining policy and public participation policy literature, an examination of documents related to the Kennecott Eagle Project, and semi-structured, ethnographic interviews with participants in the decision-making process. Interviewees identified issues with the structure of participation, the technical nature of the permitting process, concerns about the Michigan Department of Environmental Quality’s (DEQ) handling of mine permitting, and trust among participants. This research found that the permitting of the Kennecott Eagle Mine progressed as structured by regulation and collected technical input on the mine permit application, but did not meet the expectations of some participants who opposed the project. Findings indicated that current mining regulation in Michigan is resilient to public opposition, that there is a need for more transparency from the Michigan DEQ during the permitting process, and that current participatory structures limit the opportunities for some stakeholder groups to influence decision-making.
Abstract:
Many schools do not begin to introduce college students to software engineering until they have had at least one semester of programming. Since software engineering is a large, complex, and abstract subject, it is difficult to construct active learning exercises that build on the students’ elementary knowledge of programming and still teach basic software engineering principles. It is also the case that beginning students typically know how to construct small programs, but they have little experience with the techniques necessary to produce reliable and long-term maintainable modules. I have addressed these two concerns by defining a local standard (Montana Tech Method (MTM) Software Development Standard for Small Modules Template) that step by step directs students toward the construction of highly reliable small modules using well-known, best-practice software engineering techniques. A “small module” is here defined as a coherent development task that can be unit tested and can be carried out by a single software engineer (or a pair) in at most a few weeks. The standard describes the process to be used and also provides a template for the top-level documentation. The sequence of mini-lectures and exercises associated with the use of this (and other) local standards is used throughout the course, which perforce covers more abstract software engineering material using traditional reading and writing assignments. This sequence of mini-lectures and hands-on assignments (many of which are done in small groups) constitutes an instructional module that can be used in any similar software engineering course.
Abstract:
For popular software systems, the number of daily submitted bug reports is high. Triaging these incoming reports is a time-consuming task. Part of bug triage is the assignment of a report to a developer with the appropriate expertise. In this paper, we present an approach to automatically suggest developers who have the appropriate expertise for handling a bug report. We model developer expertise using the vocabulary found in their source code contributions and compare this vocabulary to the vocabulary of bug reports. We evaluate our approach by comparing the suggested experts to the persons who eventually worked on the bug. Using eight years of Eclipse development as a case study, we achieve 33.6% top-1 precision and 71.0% top-10 recall.
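As a hedged illustration of the vocabulary comparison described above, the sketch below ranks developers for a bug report using a simple TF-IDF and cosine-similarity formulation; the paper's actual term weighting and preprocessing are not reproduced here.

```python
"""Sketch: suggest developers for a bug report by comparing vocabularies."""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def suggest_developers(dev_vocab: dict[str, str], bug_report: str, top_n: int = 10) -> list[str]:
    """dev_vocab maps a developer name to the concatenated text of their code contributions."""
    names = list(dev_vocab)
    vectorizer = TfidfVectorizer(lowercase=True, token_pattern=r"[A-Za-z_]\w+")
    dev_matrix = vectorizer.fit_transform(dev_vocab[name] for name in names)   # one row per developer
    report_vec = vectorizer.transform([bug_report])
    scores = cosine_similarity(report_vec, dev_matrix).ravel()
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:top_n]]

if __name__ == "__main__":
    developers = {
        "alice": "parser tokenizer syntax tree visitor ast rewrite",
        "bob": "swt widget layout table viewer refresh paint",
    }
    print(suggest_developers(developers, "Table viewer does not refresh after layout change"))
```

Evaluating such a ranker as in the paper means checking whether the developer who eventually fixed the bug appears at rank 1 (top-1 precision) or within the first ten suggestions (top-10 recall).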
Abstract:
Software metrics offer us the promise of distilling useful information from vast amounts of software in order to track development progress, to gain insights into the nature of the software, and to identify potential problems. Unfortunately, however, many software metrics exhibit highly skewed, non-Gaussian distributions. As a consequence, usual ways of interpreting these metrics --- for example, in terms of "average" values --- can be highly misleading. Many metrics, it turns out, are distributed like wealth --- with high concentrations of values in selected locations. We propose to analyze software metrics using the Gini coefficient, a higher-order statistic widely used in economics to study the distribution of wealth. Our approach allows us not only to observe changes in software systems efficiently, but also to assess project risks and monitor the development process itself. We apply the Gini coefficient to numerous metrics over a range of software projects, and we show that many metrics not only display remarkably high Gini values, but that these values are remarkably consistent as a project evolves over time.
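As an illustration of the statistic the abstract builds on, the following minimal sketch computes the Gini coefficient of a per-class metric distribution; the choice of metric (methods per class) and the sample values are assumptions made only for the example.

```python
"""Minimal sketch: Gini coefficient of a software metric distribution."""
import numpy as np

def gini(values: np.ndarray) -> float:
    """Gini coefficient of non-negative values: 0 = perfectly even, values near 1 = highly concentrated."""
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    if n == 0 or v.sum() == 0:
        return 0.0
    index = np.arange(1, n + 1)
    # Standard closed form over the sorted values.
    return float((2 * index - n - 1) @ v / (n * v.sum()))

if __name__ == "__main__":
    methods_per_class = np.array([1, 2, 2, 3, 3, 4, 5, 8, 21, 55])   # hypothetical, heavily skewed metric
    print(f"Gini = {gini(methods_per_class):.2f}")                   # ~0.63 for this sample
```

Tracking this single number per metric and per release is what allows the approach to observe concentration changes efficiently as a project evolves.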
Abstract:
The article introduces the E-learning Circle, a tool developed to assure the quality of the software design process of e-learning systems, considering pedagogical principles as well as technology. The E-learning Circle consists of a number of concentric circles which are divided into three sectors. The content of the inner circles is based on pedagogical principles, while the outer circle specifies how the pedagogical principles may be implemented with technology. The circle’s centre is dedicated to the subject taught, ensuring focus on the specific subject’s properties. The three sectors represent the student, the teacher and the learning objectives. The strengths of the E-learning Circle are the compact presentation combined with the overview it provides, as well as the usefulness of a design tool dealing with complexity, providing a common language and embedding best practice. The E-learning Circle is not a prescriptive method, but is useful in several design models and processes. The article presents two projects where the E-learning Circle was used as a design tool.
Abstract:
The short technology cycles in the IT industry confront companies with the problem of providing their employees with further qualification at the right time and on the right topics. For education providers, this creates the challenge of identifying relevant training topics as early as possible, evaluating them economically, and bringing selected topics to market maturity in the form of suitable educational offerings. To handle this problem, an innovative analysis instrument was developed at the Hochschule für Telekommunikation Leipzig (HfTL), which is sponsored by Deutsche Telekom AG. With this instrument, the IT-KompetenzBarometer, job advertisements published online on job portals are harvested and examined using text mining methods. In this way, information can be obtained that provides differentiated insight into the qualitative competence requirements of key occupational profiles in the IT sector. This contribution presents results obtained from the analysis of more than 40,000 job advertisements for IT professionals collected from job portals between June and September 2012. These results provide an information basis for identifying market-relevant training topics, so that educational offerings can be designed and further developed successfully.
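The text mining step is not specified in the abstract; a minimal, hypothetical version of extracting competence keywords from harvested job-ad texts could look like the sketch below. The skill vocabulary and the matching rule are illustrative assumptions, not the IT-KompetenzBarometer's actual method.

```python
"""Sketch: count competence keywords across harvested job advertisements."""
import re
from collections import Counter
from typing import Iterable

# Hypothetical competence vocabulary; the real taxonomy behind the IT-KompetenzBarometer is not reproduced here.
SKILLS = ["java", "sql", "linux", "sap", "itil", "cloud", "python", "projektmanagement"]

def skill_frequencies(ads: Iterable[str]) -> Counter:
    """Count, per skill, in how many job advertisements it is mentioned at least once."""
    counts = Counter()
    for ad in ads:
        tokens = set(re.findall(r"[a-zäöüß]+", ad.lower()))
        counts.update(skill for skill in SKILLS if skill in tokens)
    return counts

if __name__ == "__main__":
    sample_ads = [
        "IT-Systemadministrator (m/w) mit Erfahrung in Linux und Cloud-Umgebungen",
        "Softwareentwickler Java / SQL, Kenntnisse in ITIL von Vorteil",
    ]
    for skill, n in skill_frequencies(sample_ads).most_common():
        print(f"{skill}: {n}")
```

Aggregating such counts over tens of thousands of advertisements and over time yields the kind of demand profile per occupational profile that the abstract describes.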
Abstract:
This contribution discusses a decentralized planning methodology, based on self-organization, for the rough planning of intralogistics systems. The methodology combines the field of agent systems from computer science with material flow planning. The article thereby contributes to the development of an intelligent, computer-aided assistance system for the planning of intralogistics systems.
Abstract:
In this paper we follow a theory-based approach to study the assimilation of compliance software in highly regulated multinational enterprises. These relatively new software products support the automation of controls that are associated with mandatory compliance requirements. We use institutional and success factor theories to explain the assimilation of compliance software. A framework for analyzing the assimilation of Access Control Systems (ACS), a special type of compliance software, is developed and used to reflect on the experiences obtained in four in-depth case studies. One result is that coercive, mimetic, and normative pressures significantly affect ACS assimilation. Quality aspects, on the other hand, have only a moderate impact at the beginning of the assimilation process; in later phases their impact may increase as performance and improvement objectives become more relevant. In addition, it turns out that the position of the enterprises and compatibility heavily influence the assimilation process.
Abstract:
Given the centrality of control for achieving success in outsourced software projects, past research has identified key exogenous factors that determine the choice of controls. This view of exogenously driven control choice rests on a number of assumptions; in particular, clients and vendors are seen as separate cognitive entities that combat opportunistic threats under environmental uncertainty through one-off choices or infrequent revisions of controls. In this paper we complement this perspective by acknowledging that an outsourced software project may be characterized as a collective, evolving process faced with the challenge of coping with the cognitive limitations of both client and vendor through a continuous process of learning. We argue that, viewed in this way, controls are less the subject of a deliberate choice and more the subject of endogenously driven change, i.e. controls evolve in close interaction with the evolving software project. Accordingly, we suggest a complementary model of endogenous control, in which controls mediate individual and collective learning processes. Our research contributes to a better understanding of the dynamics in outsourced software projects. It also spells out methodological implications that may help improve cross-sectional control research.