940 resultados para Graph analytics
Resumo:
Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved upon. To further uncover how these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes about the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome. Current visual analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach for comparison. Visual analytics tools leverage humans' ability to easily see patterns and anomalies that they were not expecting, but is limited by uncertainty in findings. Statistical tools emphasize finding significant differences in the data, but often requires researchers have a concrete question and doesn't facilitate more general exploration of the data. Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery. Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security. My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.
Resumo:
Kinematic structure of planar mechanisms addresses the study of attributes determined exclusively by the joining pattern among the links forming a mechanism. The system group classification is central to the kinematic structure and consists of determining a sequence of kinematically and statically independent-simple chains which represent a modular basis for the kinematics and force analysis of the mechanism. This article presents a novel graph-based algorithm for structural analysis of planar mechanisms with closed-loop kinematic structure which determines a sequence of modules (Assur groups) representing the topology of the mechanism. The computational complexity analysis and proof of correctness of the implemented algorithm are provided. A case study is presented to illustrate the results of the devised method.
Resumo:
Reconfigurable hardware can be used to build a multitasking system where tasks are assigned to HW resources at run-time according to the requirements of the running applications. These tasks are frequently represented as direct acyclic graphs and their execution is typically controlled by an embedded processor that schedules the graph execution. In order to improve the efficiency of the system, the scheduler can apply prefetch and reuse techniques that can greatly reduce the reconfiguration latencies. For an embedded processor all these computations represent a heavy computational load that can significantly reduce the system performance. To overcome this problem we have implemented a HW scheduler using reconfigurable resources. In addition we have implemented both prefetch and replacement techniques that obtain as good results as previous complex SW approaches, while demanding just a few clock cycles to carry out the computations. We consider that the HW cost of the system (in our experiments 3% of a Virtex-II PRO xc2vp30 FPGA) is affordable taking into account the great efficiency of the techniques applied to hide the reconfiguration latency and the negligible run-time penalty introduced by the scheduler computations.
Resumo:
Reconfigurable hardware can be used to build multi tasking systems that dynamically adapt themselves to the requirements of the running applications. This is especially useful in embedded systems, since the available resources are very limited and the reconfigurable hardware can be reused for different applications. In these systems computations are frequently represented as task graphs that are executed taking into account their internal dependencies and the task schedule. The management of the task graph execution is critical for the system performance. In this regard, we have developed two dif erent versions, a software module and a hardware architecture, of a generic task-graph execution manager for reconfigurable multi-tasking systems. The second version reduces the run-time management overheads by almost two orders of magnitude. Hence it is especially suitable for systems with exigent timing constraints. Both versions include specific support to optimize the reconfiguration process.
Resumo:
Part 5: Service Orientation in Collaborative Networks
Resumo:
Nowadays, the development of the photovoltaic (PV) technology is consolidated as a source of renewable energy. The research in the topic of maximum improvement on the energy efficiency of the PV plants is today a major challenge. The main requirement for this purpose is to know the performance of each of the PV modules that integrate the PV field in real time. In this respect, a PLC communications based Smart Monitoring and Communications Module, which is able to monitor at PV level their operating parameters, has been developed at the University of Malaga. With this device you can check if any of the panels is suffering any type of overriding performance, due to a malfunction or partial shadowing of its surface. Since these fluctuations in electricity production from a single panel affect the overall sum of all panels that conform a string, it is necessary to isolate the problem and modify the routes of energy through alternative paths in case of PV panels array configuration.
Resumo:
I Big Data stanno guidando una rivoluzione globale. In tutti i settori, pubblici o privati, e le industrie quali Vendita al dettaglio, Sanità, Media e Trasporti, i Big Data stanno influenzando la vita di miliardi di persone. L’impatto dei Big Data è sostanziale, ma così discreto da passare inosservato alla maggior parte delle persone. Le applicazioni di Business Intelligence e Advanced Analytics vogliono studiare e trarre informazioni dai Big Data. Si studia il passaggio dalla prima alla seconda, mettendo in evidenza aspetti simili e differenze.
Resumo:
Otto-von-Guericke-Universität Magdeburg, Fakultät für Informatik, Habilitationsschrift, 2016
Resumo:
This dissertation introduces a new approach for assessing the effects of pediatric epilepsy on the language connectome. Two novel data-driven network construction approaches are presented. These methods rely on connecting different brain regions using either extent or intensity of language related activations as identified by independent component analysis of fMRI data. An auditory description decision task (ADDT) paradigm was used to activate the language network for 29 patients and 30 controls recruited from three major pediatric hospitals. Empirical evaluations illustrated that pediatric epilepsy can cause, or is associated with, a network efficiency reduction. Patients showed a propensity to inefficiently employ the whole brain network to perform the ADDT language task; on the contrary, controls seemed to efficiently use smaller segregated network components to achieve the same task. To explain the causes of the decreased efficiency, graph theoretical analysis was carried out. The analysis revealed no substantial global network feature differences between the patient and control groups. It also showed that for both subject groups the language network exhibited small-world characteristics; however, the patient’s extent of activation network showed a tendency towards more random networks. It was also shown that the intensity of activation network displayed ipsilateral hub reorganization on the local level. The left hemispheric hubs displayed greater centrality values for patients, whereas the right hemispheric hubs displayed greater centrality values for controls. This hub hemispheric disparity was not correlated with a right atypical language laterality found in six patients. Finally it was shown that a multi-level unsupervised clustering scheme based on self-organizing maps, a type of artificial neural network, and k-means was able to fairly and blindly separate the subjects into their respective patient or control groups. The clustering was initiated using the local nodal centrality measurements only. Compared to the extent of activation network, the intensity of activation network clustering demonstrated better precision. This outcome supports the assertion that the local centrality differences presented by the intensity of activation network can be associated with focal epilepsy.
Resumo:
Softeam has over 20 years of experience providing UML-based modelling solutions, such as its Modelio modelling tool, and its Constellation enterprise model management and collaboration environment. Due to the increasing number and size of the models used by Softeam’s clients, Softeam joined the MONDO FP7 EU research project, which worked on solutions for these scalability challenges and produced the Hawk model indexer among other results. This paper presents the technical details and several case studies on the integration of Hawk into Softeam’s toolset. The first case study measured the performance of Hawk’s Modelio support using varying amounts of memory for the Neo4j backend. In another case study, Hawk was integrated into Constellation to provide scalable global querying of model repositories. Finally, the combination of Hawk and the Epsilon Generation Language was compared against Modelio for document generation: for the largest model, Hawk was two orders of magnitude faster.
Resumo:
L’échocardiographie et l’imagerie par résonance magnétique sont toutes deux des techniques non invasives utilisées en clinique afin de diagnostiquer ou faire le suivi de maladies cardiaques. La première mesure un délai entre l’émission et la réception d’ultrasons traversant le corps, tandis que l’autre mesure un signal électromagnétique généré par des protons d’hydrogène présents dans le corps humain. Les résultats des acquisitions de ces deux modalités d’imagerie sont fondamentalement différents, mais contiennent dans les deux cas de l’information sur les structures du coeur humain. La segmentation du ventricule gauche consiste à délimiter les parois internes du muscle cardiaque, le myocarde, afin d’en calculer différentes métriques cliniques utiles au diagnostic et au suivi de différentes maladies cardiaques, telle la quantité de sang qui circule à chaque battement de coeur. Suite à un infarctus ou autre condition, les performances ainsi que la forme du coeur en sont affectées. L’imagerie du ventricule gauche est utilisée afin d’aider les cardiologues à poser les bons diagnostics. Cependant, dessiner les tracés manuels du ventricule gauche requiert un temps non négligeable aux cardiologues experts, d’où l’intérêt pour une méthode de segmentation automatisée fiable et rapide. Ce mémoire porte sur la segmentation du ventricule gauche. La plupart des méthodes existantes sont spécifiques à une seule modalité d’imagerie. Celle proposée dans ce document permet de traiter rapidement des acquisitions provenant de deux modalités avec une précision de segmentation équivalente au tracé manuel d’un expert. Pour y parvenir, elle opère dans un espace anatomique, induisant ainsi une forme a priori implicite. L’algorithme de Graph Cut, combiné avec des stratégies telles les cartes probabilistes et les enveloppes convexes régionales, parvient à générer des résultats qui équivalent (ou qui, pour la majorité des cas, surpassent) l’état de l’art ii Sommaire au moment de la rédaction de ce mémoire. La performance de la méthode proposée, quant à l’état de l’art, a été démontrée lors d’un concours international. Elle est également validée exhaustivement via trois bases de données complètes en se comparant aux tracés manuels de deux experts et des tracés automatisés du logiciel Syngovia. Cette recherche est un projet collaboratif avec l’Université de Bourgogne, en France.
Resumo:
Analytics is the technology working with the manipulation of data to produce information able to change the world we live every day. Analytics have been largely used within the last decade to cluster people’s behaviour to predict their preferences of items to buy, music to listen, movies to watch and even electoral preference. The most advanced companies succeded in controlling people’s behaviour using analytics. Despite the evidence of the super-power of analytics, they are rarely applied to the big data collected within supply chain systems (i.e. distribution network, storage systems and production plants). This PhD thesis explores the fourth research paradigm (i.e. the generation of knowledge from data) applied to supply chain system design and operations management. An ontology defining the entities and the metrics of supply chain systems is used to design data structures for data collection in supply chain systems. The consistency of this data is provided by mathematical demonstrations inspired by the factory physics theory. The availability, quantity and quality of the data within these data structures define different decision patterns. Ten decision patterns are identified, and validated on-field, to address ten different class of design and control problems in the field of supply chain systems research.
Resumo:
The convergence between the recent developments in sensing technologies, data science, signal processing and advanced modelling has fostered a new paradigm to the Structural Health Monitoring (SHM) of engineered structures, which is the one based on intelligent sensors, i.e., embedded devices capable of stream processing data and/or performing structural inference in a self-contained and near-sensor manner. To efficiently exploit these intelligent sensor units for full-scale structural assessment, a joint effort is required to deal with instrumental aspects related to signal acquisition, conditioning and digitalization, and those pertaining to data management, data analytics and information sharing. In this framework, the main goal of this Thesis is to tackle the multi-faceted nature of the monitoring process, via a full-scale optimization of the hardware and software resources involved by the {SHM} system. The pursuit of this objective has required the investigation of both: i) transversal aspects common to multiple application domains at different abstraction levels (such as knowledge distillation, networking solutions, microsystem {HW} architectures), and ii) the specificities of the monitoring methodologies (vibrations, guided waves, acoustic emission monitoring). The key tools adopted in the proposed monitoring frameworks belong to the embedded signal processing field: namely, graph signal processing, compressed sensing, ARMA System Identification, digital data communication and TinyML.