Biblioteca Digital

368 resultados para compiler backend

A Visual Analytics Approach to Comparing Cohorts of Event Sequences

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved upon. To further uncover how these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes about the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome. Current visual analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach for comparison. Visual analytics tools leverage humans' ability to easily see patterns and anomalies that they were not expecting, but is limited by uncertainty in findings. Statistical tools emphasize finding significant differences in the data, but often requires researchers have a concrete question and doesn't facilitate more general exploration of the data. Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery. Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security. My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.

Veja mais

Du terme prédicatif au cadre sémantique : méthodologie de compilation d'une ressource terminologique pour les termes arabes de l'informatique

Relevância:

10.00% 10.00%

Publicador:

Resumo:

La description des termes dans les ressources terminologiques traditionnelles se limite à certaines informations, comme le terme (principalement nominal), sa définition et son équivalent dans une langue étrangère. Cette description donne rarement d’autres informations qui peuvent être très utiles pour l’utilisateur, surtout s’il consulte les ressources dans le but d’approfondir ses connaissances dans un domaine de spécialité, maitriser la rédaction professionnelle ou trouver des contextes où le terme recherché est réalisé. Les informations pouvant être utiles dans ce sens comprennent la description de la structure actancielle des termes, des contextes provenant de sources authentiques et l’inclusion d’autres parties du discours comme les verbes. Les verbes et les noms déverbaux, ou les unités terminologiques prédicatives (UTP), souvent ignorés par la terminologie classique, revêtent une grande importance lorsqu’il s’agit d’exprimer une action, un processus ou un évènement. Or, la description de ces unités nécessite un modèle de description terminologique qui rend compte de leurs particularités. Un certain nombre de terminologues (Condamines 1993, Mathieu-Colas 2002, Gross et Mathieu-Colas 2001 et L’Homme 2012, 2015) ont d’ailleurs proposé des modèles de description basés sur différents cadres théoriques. Notre recherche consiste à proposer une méthodologie de description terminologique des UTP de la langue arabe, notamment l’arabe standard moderne (ASM), selon la théorie de la Sémantique des cadres (Frame Semantics) de Fillmore (1976, 1977, 1982, 1985) et son application, le projet FrameNet (Ruppenhofer et al. 2010). Le domaine de spécialité qui nous intéresse est l’informatique. Dans notre recherche, nous nous appuyons sur un corpus recueilli du web et nous nous inspirons d’une ressource terminologique existante, le DiCoInfo (L’Homme 2008), pour compiler notre propre ressource. Nos objectifs se résument comme suit. Premièrement, nous souhaitons jeter les premières bases d’une version en ASM de cette ressource. Cette version a ses propres particularités : 1) nous visons des unités bien spécifiques, à savoir les UTP verbales et déverbales; 2) la méthodologie développée pour la compilation du DiCoInfo original devra être adaptée pour prendre en compte une langue sémitique. Par la suite, nous souhaitons créer une version en cadres de cette ressource, où nous regroupons les UTP dans des cadres sémantiques, en nous inspirant du modèle de FrameNet. À cette ressource, nous ajoutons les UTP anglaises et françaises, puisque cette partie du travail a une portée multilingue. La méthodologie consiste à extraire automatiquement les unités terminologiques verbales et nominales (UTV et UTN), comme Ham~ala (حمل) (télécharger) et taHmiyl (تحميل) (téléchargement). Pour ce faire, nous avons adapté un extracteur automatique existant, TermoStat (Drouin 2004). Ensuite, à l’aide des critères de validation terminologique (L’Homme 2004), nous validons le statut terminologique d’une partie des candidats. Après la validation, nous procédons à la création de fiches terminologiques, à l’aide d’un éditeur XML, pour chaque UTV et UTN retenue. Ces fiches comprennent certains éléments comme la structure actancielle des UTP et jusqu’à vingt contextes annotés. La dernière étape consiste à créer des cadres sémantiques à partir des UTP de l’ASM. Nous associons également des UTP anglaises et françaises en fonction des cadres créés. Cette association a mené à la création d’une ressource terminologique appelée « DiCoInfo : A Framed Version ». Dans cette ressource, les UTP qui partagent les mêmes propriétés sémantiques et structures actancielles sont regroupées dans des cadres sémantiques. Par exemple, le cadre sémantique Product_development regroupe des UTP comme Taw~ara (طور) (développer), to develop et développer. À la suite de ces étapes, nous avons obtenu un total de 106 UTP ASM compilées dans la version en ASM du DiCoInfo et 57 cadres sémantiques associés à ces unités dans la version en cadres du DiCoInfo. Notre recherche montre que l’ASM peut être décrite avec la méthodologie que nous avons mise au point.

Veja mais

Provable Security for Cryptocurrencies

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The past several years have seen the surprising and rapid rise of Bitcoin and other “cryptocurrencies.” These are decentralized peer-to-peer networks that allow users to transmit money, tocompose financial instruments, and to enforce contracts between mutually distrusting peers, andthat show great promise as a foundation for financial infrastructure that is more robust, efficientand equitable than ours today. However, it is difficult to reason about the security of cryptocurrencies. Bitcoin is a complex system, comprising many intricate and subtly-interacting protocol layers. At each layer it features design innovations that (prior to our work) have not undergone any rigorous analysis. Compounding the challenge, Bitcoin is but one of hundreds of competing cryptocurrencies in an ecosystem that is constantly evolving. The goal of this thesis is to formally reason about the security of cryptocurrencies, reining in their complexity, and providing well-defined and justified statements of their guarantees. We provide a formal specification and construction for each layer of an abstract cryptocurrency protocol, and prove that our constructions satisfy their specifications. The contributions of this thesis are centered around two new abstractions: “scratch-off puzzles,” and the “blockchain functionality” model. Scratch-off puzzles are a generalization of the Bitcoin “mining” algorithm, its most iconic and novel design feature. We show how to provide secure upgrades to a cryptocurrency by instantiating the protocol with alternative puzzle schemes. We construct secure puzzles that address important and well-known challenges facing Bitcoin today, including wasted energy and dangerous coalitions. The blockchain functionality is a general-purpose model of a cryptocurrency rooted in the “Universal Composability” cryptography theory. We use this model to express a wide range of applications, including transparent “smart contracts” (like those featured in Bitcoin and Ethereum), and also privacy-preserving applications like sealed-bid auctions. We also construct a new protocol compiler, called Hawk, which translates user-provided specifications into privacy-preserving protocols based on zero-knowledge proofs.

Veja mais

EGCL: an extended G-Code Language with flow control, functions and mnemonic variables

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In the context of computer numerical control (CNC) and computer aided manufacturing (CAM), the capabilities of programming languages such as symbolic and intuitive programming, program portability and geometrical portfolio have special importance -- They allow to save time and to avoid errors during part programming and permit code re-usage -- Our updated literature review indicates that the current state of art presents voids in parametric programming, program portability and programming flexibility -- In response to this situation, this article presents a compiler implementation for EGCL (Extended G-code Language), a new, enriched CNC programming language which allows the use of descriptive variable names, geometrical functions and flow-control statements (if-then-else, while) -- Our compiler produces low-level generic, elementary ISO-compliant Gcode, thus allowing for flexibility in the choice of the executing CNC machine and in portability -- Our results show that readable variable names and flow control statements allow a simplified and intuitive part programming and permit re-usage of the programs -- Future work includes allowing the programmer to define own functions in terms of EGCL, in contrast to the current status of having them as library built-in functions

Veja mais

Improvements in Hardware Transactional Memory for GPU Architectures

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In the multi-core CPU world, transactional memory (TM)has emerged as an alternative to lock-based programming for thread synchronization. Recent research proposes the use of TM in GPU architectures, where a high number of computing threads, organized in SIMT fashion, requires an effective synchronization method. In contrast to CPUs, GPUs offer two memory spaces: global memory and local memory. The local memory space serves as a shared scratch-pad for a subset of the computing threads, and it is used by programmers to speed-up their applications thanks to its low latency. Prior work from the authors proposed a lightweight hardware TM (HTM) support based in the local memory, modifying the SIMT execution model and adding a conflict detection mechanism. An efficient implementation of these features is key in order to provide an effective synchronization mechanism at the local memory level. After a quick description of the main features of our HTM design for GPU local memory, in this work we gather together a number of proposals designed with the aim of improving those mechanisms with high impact on performance. Firstly, the SIMT execution model is modified to increase the parallelism of the application when transactions must be serialized in order to make forward progress. Secondly, the conflict detection mechanism is optimized depending on application characteristics, such us the read/write sets, the probability of conflict between transactions and the existence of read-only transactions. As these features can be present in hardware simultaneously, it is a task of the compiler and runtime to determine which ones are more important for a given application. This work includes a discussion on the analysis to be done in order to choose the best configuration solution.

Veja mais

Integration of a graph-based model indexer in commercial modelling tools

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Softeam has over 20 years of experience providing UML-based modelling solutions, such as its Modelio modelling tool, and its Constellation enterprise model management and collaboration environment. Due to the increasing number and size of the models used by Softeam’s clients, Softeam joined the MONDO FP7 EU research project, which worked on solutions for these scalability challenges and produced the Hawk model indexer among other results. This paper presents the technical details and several case studies on the integration of Hawk into Softeam’s toolset. The first case study measured the performance of Hawk’s Modelio support using varying amounts of memory for the Neo4j backend. In another case study, Hawk was integrated into Constellation to provide scalable global querying of model repositories. Finally, the combination of Hawk and the Epsilon Generation Language was compared against Modelio for document generation: for the largest model, Hawk was two orders of magnitude faster.

Veja mais

Modelling the infrastructure of the “Śivadharma Database”

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A Digital Scholarly Edition is a conceptually and structurally sophisticated entity. Throughout the centuries, diverse methodologies have been employed to reconstruct a text transmitted through one or multiple sources, resulting in various edition types. With the advent of digital technology in philology, these practices have undergone a significant transformation, compelling scholars to reconsider their approach in light of the web. In the digital age, philologists are expected to possess (too) advanced technical skills to prepare interactive and enriched editions, even though, in most cases, only mechanical or documentary editions are published online. The Śivadharma Database is a web Content Management System (CMS) designed to facilitate the preparation, publication, and updating of Digital Scholarly Editions. By providing scholars with a user-friendly CRUD web application to reconstruct and annotate a text, they can prepare their textus with additional components such as apparatus, notes, translations, citations, and parallels. It is possible by leveraging an annotation system based on HTML and graph data structure. This choice is made because the text entity is multidimensional and multifaceted, even if its sequential presentation constrains it. In particular, editions of South Asian texts of the Śivadharma corpus, the case study of this research, contain a series of phenomena that are difficult to manage formally, such as overlapping hierarchies. Hence, it becomes necessary to establish the data structure best suited to represent this complexity. In Śivadharma Database, the textus is an HTML file readily displayable. Textual fragments, annotated via an interface without requiring philologists to write code and saved in the backend, form the atomic unit of multiple relationships organised in a graph database. This approach enables the formal representation of complex and overlapping textual phenomena, allowing for good annotation expressiveness with minimal effort to learn the relevant technologies during the editing workflow.

Veja mais

Dragonlifter - Un lifter da p-code a C

Relevância:

10.00% 10.00%

Publicador:

Resumo:

L'analisi di codice compilato è un'attività sempre più richiesta e necessaria, critica per la sicurezza e stabilità delle infrastrutture informatiche utilizzate in tutto il mondo. Le tipologie di file binari da analizzare sono numerose e in costante evoluzione, si può passare da applicativi desktop o mobile a firmware di router o baseband. Scopo della tesi è progettare e realizzare Dragonlifter, un convertitore da codice compilato a C che sia estendibile e in grado di supportare un numero elevato di architetture, sistemi operativi e formati file. Questo rende possibile eseguire programmi compilati per altre architetture, tracciare la loro esecuzione e modificarli per mitigare vulnerabilità o cambiarne il comportamento.

Veja mais

368 resultados para compiler backend

Filtro por publicador