956 resultados para Data compression (Computer science)
Resumo:
In dieser Arbeit werden dynamisch adaptive Mehrgitterverfahren parallelisiert. Bei dynamisch adaptiven Mehrgitterverfahren wird ein Gebiet mit einem Gitter überdeckt, und auf diesem Gitter wird gerechnet, indem Gitterpunkte in der Umgebung herangezogen werden, um den Wert des nächsten Zeitpunktes zu bestimmen. Dann werden gröbere und feinere Gitter erzeugt und verwendet, wobei die feineren Gitter sich auf Teilgebiete konzentrieren. Diese Teilgebiete ändern sich im Verlauf der Zeit. Durch die Verwendung der zusätzlichen Gitter werden die numerischen Eigenschaften verbessert. Die Parallelisierung solcher Verfahren geschieht in der Regel durch Bisektion. In der vorliegenden Arbeit wird die Umverteilung der Gebiete realisiert, indem Mengen von einzelnen Gitterpunkten verschickt werden. Das ist ein Scheduling-Verfahren. Die Mehrgitterstrukturen sind so aufgebaut, dass fast beliebige Gitterpunktverteilungen auf den Gitterebenen vorliegen können. Die Strukturen werden einmal erzeugt, und nur bei Bedarf geändert, sodass keine Speicherallokationen während der Iterationen nötig sind. Neben dem Gitter sind zusätzliche Strukturen, wie zum Beispiel die Randstrukturen, erforderlich. Eine Struktur Farbenfeld verzeichnet, auf welchem Kern sich ein Außenrandpunkt befindet. In der parallelen adaptiven Verfeinerung werden für einzelne durch ein Entscheidungskriterium ausgewählte Gitterpunkte 5 x 5 Punktüberdeckungen vorgenommen. Dazu werden die verfügbaren Entscheidungsinformationen zur Bestimmung von komplexeren Strukturen herangezogen. Damit muss das Verfeinerungsgitter nicht komplett abgebaut und dann wieder aufgebaut werden, sondern nur die Änderungen am Gitter sind vorzunehmen. Das spart viel Berechnungszeit. Der letzte Schritt besteht darin, den Lastausgleich durchzuführen. Zunächst werden die Lasttransferwerte bestimmt, die angeben, wie viele Gitterpunkte von wo nach wo zu verschicken sind. Das geschieht mit Hilfe einer PLB genannten Methode bzw. einer Variante. PLB wurde bisher vor allem für kombinatorische Probleme eingesetzt. Dann erfolgt eine Auswahl der zu verschickenden Gitterpunkte mit einer Strategie, welche Punkte eines Kerns zu welchen Nachbarkernen transferiert werden sollen. Im letzten Schritt werden schließlich die ausgewählten Punkte migriert, wobei alle Gitterpunktstrukturen umgebaut werden und solche Informationen gepackt werden müssen, sodass ein Umbau seiner Gitterpunktstrukturen bei dem Empfänger möglich wird. Neben den Gitterpunktstrukturen müssen auch Strukturen für die parallele adaptive Verfeinerung verändert werden. Es muss ein Weiterverschicken von Gitterpunkten möglich sein, wenn über die Lastkanten in mehreren Runden Last verschickt wird. Während des Lastausgleichs wird noch Arbeit durch eine Struktur Zwischenkorrektur durchgeführt, die es ermöglicht, das Farbenfeld intakt zu halten, wenn benachbarte Gitterpunkte gleichzeitig verschickt werden.
Resumo:
Parlare di KeyCrime significa rapportarsi con un software che si fonda prioritariamente su di un metodo scientifico che fa proprio il ragionamento conclusivo (conclusive reasoning), applicato al decison making , pertanto all’intelligence investigativa e alla predictive policing. Potremmo pensare a KeyCrime come un paradigma operativo che si pone in sinergia tra la filosofia, il cognitivismo giuridico e le scienze applicate (Romeo F., 2006). Quando analisi e decision making trovano in un unico contesto il terreno fertile dove svilupparsi, ma ancor più, creare presupposti di ragionamento, ecco che da queste è facile comprendere da quale altra condizione sono nate, attivate e soprattutto utilizzate ai fini di un risultato: questa non è altro che “l’osservazione”; se ben fatta, profonda e scientifica offre una sistematica quanto utile predisposizione alle indagini, specialmente di carattere preventivo per l’anticrimine e la sicurezza.
Resumo:
This document presents GEmSysC, an unified cryptographic API for embedded systems. Software layers implementing this API can be built over existing libraries, allowing embedded software to access cryptographic functions in a consistent way that does not depend on the underlying library. The API complies to good practices for API design and good practices for embedded software development and took its inspiration from other cryptographic libraries and standards. The main inspiration for creating GEmSysC was the CMSIS-RTOS standard, which defines an unified API for embedded software in an implementation-independent way, but targets operating systems instead of cryptographic functions. GEmSysC is made of a generic core and attachable modules, one for each cryptographic algorithm. This document contains the specification of the core of GEmSysC and three of its modules: AES, RSA and SHA-256. GEmSysC was built targeting embedded systems, but this does not restrict its use only in such systems – after all, embedded systems are just very limited computing devices. As a proof of concept, two implementations of GEmSysC were made. One of them was built over wolfSSL, which is an open source library for embedded systems. The other was built over OpenSSL, which is open source and a de facto standard. Unlike wolfSSL, OpenSSL does not specifically target embedded systems. The implementation built over wolfSSL was evaluated in a Cortex- M3 processor with no operating system while the implementation built over OpenSSL was evaluated on a personal computer with Windows 10 operating system. This document displays test results showing GEmSysC to be simpler than other libraries in some aspects. These results have shown that both implementations incur in little overhead in computation time compared to the cryptographic libraries themselves. The overhead of the implementation has been measured for each cryptographic algorithm and is between around 0% and 0.17% for the implementation over wolfSSL and between 0.03% and 1.40% for the one over OpenSSL. This document also presents the memory costs for each implementation.
Resumo:
This thesis builds a framework for evaluating downside risk from multivariate data via a special class of risk measures (RM). The peculiarity of the analysis lies in getting rid of strong data distributional assumptions and in orientation towards the most critical data in risk management: those with asymmetries and heavy tails. At the same time, under typical assumptions, such as the ellipticity of the data probability distribution, the conformity with classical methods is shown. The constructed class of RM is a multivariate generalization of the coherent distortion RM, which possess valuable properties for a risk manager. The design of the framework is twofold. The first part contains new computational geometry methods for the high-dimensional data. The developed algorithms demonstrate computability of geometrical concepts used for constructing the RM. These concepts bring visuality and simplify interpretation of the RM. The second part develops models for applying the framework to actual problems. The spectrum of applications varies from robust portfolio selection up to broader spheres, such as stochastic conic optimization with risk constraints or supervised machine learning.
Resumo:
Spatial data representation and compression has become a focus issue in computer graphics and image processing applications. Quadtrees, as one of hierarchical data structures, basing on the principle of recursive decomposition of space, always offer a compact and efficient representation of an image. For a given image, the choice of quadtree root node plays an important role in its quadtree representation and final data compression. The goal of this thesis is to present a heuristic algorithm for finding a root node of a region quadtree, which is able to reduce the number of leaf nodes when compared with the standard quadtree decomposition. The empirical results indicate that, this proposed algorithm has quadtree representation and data compression improvement when in comparison with the traditional method.
Resumo:
In this session we'll explore how Microsoft uses data science and machine learning across it's entire business, from Windows and Office, to Skype and XBox. We'll look at how companies across the world use Microsoft technology for empowering their businesses in many different industries. And we'll look at data science technologies you can use yourselves, such as Azure Machine Learning and Power BI. Finally we'll discuss job opportunities for data scientists and tips on how you can be successful!
Resumo:
Thesis (M. S.)--University of Illinois at Urbana-Champaign.
Resumo:
The focus of this thesis is placed on text data compression based on the fundamental coding scheme referred to as the American Standard Code for Information Interchange or ASCII. The research objective is the development of software algorithms that result in significant compression of text data. Past and current compression techniques have been thoroughly reviewed to ensure proper contrast between the compression results of the proposed technique with those of existing ones. The research problem is based on the need to achieve higher compression of text files in order to save valuable memory space and increase the transmission rate of these text files. It was deemed necessary that the compression algorithm to be developed would have to be effective even for small files and be able to contend with uncommon words as they are dynamically included in the dictionary once they are encountered. A critical design aspect of this compression technique is its compatibility to existing compression techniques. In other words, the developed algorithm can be used in conjunction with existing techniques to yield even higher compression ratios. This thesis demonstrates such capabilities and such outcomes, and the research objective of achieving higher compression ratio is attained.
Resumo:
The generation of heterogeneous big data sources with ever increasing volumes, velocities and veracities over the he last few years has inspired the data science and research community to address the challenge of extracting knowledge form big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision-support. It specifically concerns data processing with semantic harmonisation, low level fusion, analytics, knowledge modelling with high level fusion and reasoning. Such approaches will be introduced and presented in context of the TRIDEC project results on critical oil and gas industry drilling operations and also the ongoing large eVacuate project on critical crowd behaviour detection in confined spaces.
Resumo:
Abstract Heading into the 2020s, Physics and Astronomy are undergoing experimental revolutions that will reshape our picture of the fabric of the Universe. The Large Hadron Collider (LHC), the largest particle physics project in the world, produces 30 petabytes of data annually that need to be sifted through, analysed, and modelled. In astrophysics, the Large Synoptic Survey Telescope (LSST) will be taking a high-resolution image of the full sky every 3 days, leading to data rates of 30 terabytes per night over ten years. These experiments endeavour to answer the question why 96% of the content of the universe currently elude our physical understanding. Both the LHC and LSST share the 5-dimensional nature of their data, with position, energy and time being the fundamental axes. This talk will present an overview of the experiments and data that is gathered, and outlines the challenges in extracting information. Common strategies employed are very similar to industrial data! Science problems (e.g., data filtering, machine learning, statistical interpretation) and provide a seed for exchange of knowledge between academia and industry. Speaker Biography Professor Mark Sullivan Mark Sullivan is a Professor of Astrophysics in the Department of Physics and Astronomy. Mark completed his PhD at Cambridge, and following postdoctoral study in Durham, Toronto and Oxford, now leads a research group at Southampton studying dark energy using exploding stars called "type Ia supernovae". Mark has many years' experience of research that involves repeatedly imaging the night sky to track the arrival of transient objects, involving significant challenges in data handling, processing, classification and analysis.
Resumo:
The discovery of new materials and their functions has always been a fundamental component of technological progress. Nowadays, the quest for new materials is stronger than ever: sustainability, medicine, robotics and electronics are all key assets which depend on the ability to create specifically tailored materials. However, designing materials with desired properties is a difficult task, and the complexity of the discipline makes it difficult to identify general criteria. While scientists developed a set of best practices (often based on experience and expertise), this is still a trial-and-error process. This becomes even more complex when dealing with advanced functional materials. Their properties depend on structural and morphological features, which in turn depend on fabrication procedures and environment, and subtle alterations leads to dramatically different results. Because of this, materials modeling and design is one of the most prolific research fields. Many techniques and instruments are continuously developed to enable new possibilities, both in the experimental and computational realms. Scientists strive to enforce cutting-edge technologies in order to make progress. However, the field is strongly affected by unorganized file management, proliferation of custom data formats and storage procedures, both in experimental and computational research. Results are difficult to find, interpret and re-use, and a huge amount of time is spent interpreting and re-organizing data. This also strongly limit the application of data-driven and machine learning techniques. This work introduces possible solutions to the problems described above. Specifically, it talks about developing features for specific classes of advanced materials and use them to train machine learning models and accelerate computational predictions for molecular compounds; developing method for organizing non homogeneous materials data; automate the process of using devices simulations to train machine learning models; dealing with scattered experimental data and use them to discover new patterns.
Resumo:
This work proposes a method based on both preprocessing and data mining with the objective of identify harmonic current sources in residential consumers. In addition, this methodology can also be applied to identify linear and nonlinear loads. It should be emphasized that the entire database was obtained through laboratory essays, i.e., real data were acquired from residential loads. Thus, the residential system created in laboratory was fed by a configurable power source and in its output were placed the loads and the power quality analyzers (all measurements were stored in a microcomputer). So, the data were submitted to pre-processing, which was based on attribute selection techniques in order to minimize the complexity in identifying the loads. A newer database was generated maintaining only the attributes selected, thus, Artificial Neural Networks were trained to realized the identification of loads. In order to validate the methodology proposed, the loads were fed both under ideal conditions (without harmonics), but also by harmonic voltages within limits pre-established. These limits are in accordance with IEEE Std. 519-1992 and PRODIST (procedures to delivery energy employed by Brazilian`s utilities). The results obtained seek to validate the methodology proposed and furnish a method that can serve as alternative to conventional methods.
Resumo:
Computer viruses are an important risk to computational systems endangering either corporations of all sizes or personal computers used for domestic applications. Here, classical epidemiological models for disease propagation are adapted to computer networks and, by using simple systems identification techniques a model called SAIC (Susceptible, Antidotal, Infectious, Contaminated) is developed. Real data about computer viruses are used to validate the model. (c) 2008 Elsevier Ltd. All rights reserved.