37 resultados para Distributed coding
em Universidad Politécnica de Madrid
Resumo:
En este proyecto se hace un análisis en profundidad de las técnicas de ataque a las redes de ordenadores conocidas como APTs (Advanced Persistent Threats), viendo cuál es el impacto que pueden llegar a tener en los equipos de una empresa y el posible robo de información y pérdida monetaria que puede llevar asociada. Para hacer esta introspección veremos qué técnicas utilizan los atacantes para introducir el malware en la red y también cómo dicho malware escala privilegios, obtiene información privilegiada y se mantiene oculto. Además, y cómo parte experimental de este proyecto se ha desarrollado una plataforma para la detección de malware de una red en base a las webs, URLs e IPs que visitan los nodos que la componen. Obtendremos esta visión gracias a la extracción de los logs y registros de DNS de consulta de la compañía, sobre los que realizaremos un análisis exhaustivo. Para poder inferir correctamente qué equipos están infectados o no se ha utilizado un algoritmo de desarrollo propio inspirado en la técnica Belief Propagation (“Propagación basada en creencia”) que ya ha sido usada antes por desarrolladores cómo los de los Álamos en Nuevo México (Estados Unidos) para fines similares a los que aquí se muestran. Además, para mejorar la velocidad de inferencia y el rendimiento del sistema se propone un algoritmo adaptado a la plataforma Hadoop de Apache, por lo que se modifica el paradigma de programación habitual y se busca un nuevo paradigma conocido como MapReduce que consiste en la división de la información en conceptos clave-valor. Por una parte, los algoritmos que existen basados en Belief Propagation para el descubrimiento de malware son propietarios y no han sido publicados completamente hasta la fecha, por otra parte, estos algoritmos aún no han sido adaptados a Hadoop ni a ningún modelo de programación distribuida aspecto que se abordará en este proyecto. No es propósito de este proyecto desarrollar una plataforma comercial o funcionalmente completa, sino estudiar el problema de las APTs y una implementación que demuestre que la plataforma mencionada es factible de implementar. Este proyecto abre, a su vez, un horizonte nuevo de investigación en el campo de la adaptación al modelo MapReduce de algoritmos del tipo Belief Propagation basados en la detección del malware mediante registros DNS. ABSTRACT. This project makes an in-depth investigation about problems related to APT in computer networks nowadays, seeing how much damage could they inflict on the hosts of a Company and how much monetary and information loss may they cause. In our investigation we will find what techniques are generally applied by attackers to inject malware into networks and how this malware escalates its privileges, extracts privileged information and stays hidden. As the main part of this Project, this paper shows how to develop and configure a platform that could detect malware from URLs and IPs visited by the hosts of the network. This information can be extracted from the logs and DNS query records of the Company, on which we will make an analysis in depth. A self-developed algorithm inspired on Belief Propagation technique has been used to infer which hosts are infected and which are not. This technique has been used before by developers of Los Alamos Lab (New Mexico, USA) for similar purposes. Moreover, this project proposes an algorithm adapted to Apache Hadoop Platform in order to improve the inference speed and system performance. This platform replaces the traditional coding paradigm by a new paradigm called MapReduce which splits and shares information among hosts and uses key-value tokens. On the one hand, existing algorithms based on Belief Propagation are part of owner software and they have not been published yet because they have been patented due to the huge economic benefits they could give. On the other hand these algorithms have neither been adapted to Hadoop nor to other distributed coding paradigms. This situation turn the challenge into a complicated problem and could lead to a dramatic increase of its installation difficulty on a client corporation. The purpose of this Project is to develop a complete and 100% functional brand platform. Herein, show a short summary of the APT problem will be presented and make an effort will be made to demonstrate the viability of an APT discovering platform. At the same time, this project opens up new horizons of investigation about adapting Belief Propagation algorithms to the MapReduce model and about malware detection with DNS records.
Resumo:
Distributed real-time embedded systems are becoming increasingly important to society. More demands will be made on them and greater reliance will be placed on the delivery of their services. A relevant subset of them is high-integrity or hard real-time systems, where failure can cause loss of life, environmental harm, or significant financial loss. Additionally, the evolution of communication networks and paradigms as well as the necessity of demanding processing power and fault tolerance, motivated the interconnection between electronic devices; many of the communications have the possibility of transferring data at a high speed. The concept of distributed systems emerged as systems where different parts are executed on several nodes that interact with each other via a communication network. Java’s popularity, facilities and platform independence have made it an interesting language for the real-time and embedded community. This was the motivation for the development of RTSJ (Real-Time Specification for Java), which is a language extension intended to allow the development of real-time systems. The use of Java in the development of high-integrity systems requires strict development and testing techniques. However, RTJS includes a number of language features that are forbidden in such systems. In the context of the HIJA project, the HRTJ (Hard Real-Time Java) profile was developed to define a robust subset of the language that is amenable to static analysis for high-integrity system certification. Currently, a specification under the Java community process (JSR- 302) is being developed. Its purpose is to define those capabilities needed to create safety critical applications with Java technology called Safety Critical Java (SCJ). However, neither RTSJ nor its profiles provide facilities to develop distributed realtime applications. This is an important issue, as most of the current and future systems will be distributed. The Distributed RTSJ (DRTSJ) Expert Group was created under the Java community process (JSR-50) in order to define appropriate abstractions to overcome this problem. Currently there is no formal specification. The aim of this thesis is to develop a communication middleware that is suitable for the development of distributed hard real-time systems in Java, based on the integration between the RMI (Remote Method Invocation) model and the HRTJ profile. It has been designed and implemented keeping in mind the main requirements such as the predictability and reliability in the timing behavior and the resource usage. iThe design starts with the definition of a computational model which identifies among other things: the communication model, most appropriate underlying network protocols, the analysis model, and a subset of Java for hard real-time systems. In the design, the remote references are the basic means for building distributed applications which are associated with all non-functional parameters and resources needed to implement synchronous or asynchronous remote invocations with real-time attributes. The proposed middleware separates the resource allocation from the execution itself by defining two phases and a specific threading mechanism that guarantees a suitable timing behavior. It also includes mechanisms to monitor the functional and the timing behavior. It provides independence from network protocol defining a network interface and modules. The JRMP protocol was modified to include two phases, non-functional parameters, and message size optimizations. Although serialization is one of the fundamental operations to ensure proper data transmission, current implementations are not suitable for hard real-time systems and there are no alternatives. This thesis proposes a predictable serialization that introduces a new compiler to generate optimized code according to the computational model. The proposed solution has the advantage of allowing us to schedule the communications and to adjust the memory usage at compilation time. In order to validate the design and the implementation a demanding validation process was carried out with emphasis in the functional behavior, the memory usage, the processor usage (the end-to-end response time and the response time in each functional block) and the network usage (real consumption according to the calculated consumption). The results obtained in an industrial application developed by Thales Avionics (a Flight Management System) and in exhaustive tests show that the design and the prototype are reliable for industrial applications with strict timing requirements. Los sistemas empotrados y distribuidos de tiempo real son cada vez más importantes para la sociedad. Su demanda aumenta y cada vez más dependemos de los servicios que proporcionan. Los sistemas de alta integridad constituyen un subconjunto de gran importancia. Se caracterizan por que un fallo en su funcionamiento puede causar pérdida de vidas humanas, daños en el medio ambiente o cuantiosas pérdidas económicas. La necesidad de satisfacer requisitos temporales estrictos, hace más complejo su desarrollo. Mientras que los sistemas empotrados se sigan expandiendo en nuestra sociedad, es necesario garantizar un coste de desarrollo ajustado mediante el uso técnicas adecuadas en su diseño, mantenimiento y certificación. En concreto, se requiere una tecnología flexible e independiente del hardware. La evolución de las redes y paradigmas de comunicación, así como la necesidad de mayor potencia de cómputo y de tolerancia a fallos, ha motivado la interconexión de dispositivos electrónicos. Los mecanismos de comunicación permiten la transferencia de datos con alta velocidad de transmisión. En este contexto, el concepto de sistema distribuido ha emergido como sistemas donde sus componentes se ejecutan en varios nodos en paralelo y que interactúan entre ellos mediante redes de comunicaciones. Un concepto interesante son los sistemas de tiempo real neutrales respecto a la plataforma de ejecución. Se caracterizan por la falta de conocimiento de esta plataforma durante su diseño. Esta propiedad es relevante, por que conviene que se ejecuten en la mayor variedad de arquitecturas, tienen una vida media mayor de diez anos y el lugar ˜ donde se ejecutan puede variar. El lenguaje de programación Java es una buena base para el desarrollo de este tipo de sistemas. Por este motivo se ha creado RTSJ (Real-Time Specification for Java), que es una extensión del lenguaje para permitir el desarrollo de sistemas de tiempo real. Sin embargo, RTSJ no proporciona facilidades para el desarrollo de aplicaciones distribuidas de tiempo real. Es una limitación importante dado que la mayoría de los actuales y futuros sistemas serán distribuidos. El grupo DRTSJ (DistributedRTSJ) fue creado bajo el proceso de la comunidad de Java (JSR-50) con el fin de definir las abstracciones que aborden dicha limitación, pero en la actualidad aun no existe una especificacion formal. El objetivo de esta tesis es desarrollar un middleware de comunicaciones para el desarrollo de sistemas distribuidos de tiempo real en Java, basado en la integración entre el modelo de RMI (Remote Method Invocation) y el perfil HRTJ. Ha sido diseñado e implementado teniendo en cuenta los requisitos principales, como la predecibilidad y la confiabilidad del comportamiento temporal y el uso de recursos. El diseño parte de la definición de un modelo computacional el cual identifica entre otras cosas: el modelo de comunicaciones, los protocolos de red subyacentes más adecuados, el modelo de análisis, y un subconjunto de Java para sistemas de tiempo real crítico. En el diseño, las referencias remotas son el medio básico para construcción de aplicaciones distribuidas las cuales son asociadas a todos los parámetros no funcionales y los recursos necesarios para la ejecución de invocaciones remotas síncronas o asíncronas con atributos de tiempo real. El middleware propuesto separa la asignación de recursos de la propia ejecución definiendo dos fases y un mecanismo de hebras especifico que garantiza un comportamiento temporal adecuado. Además se ha incluido mecanismos para supervisar el comportamiento funcional y temporal. Se ha buscado independencia del protocolo de red definiendo una interfaz de red y módulos específicos. También se ha modificado el protocolo JRMP para incluir diferentes fases, parámetros no funcionales y optimizaciones de los tamaños de los mensajes. Aunque la serialización es una de las operaciones fundamentales para asegurar la adecuada transmisión de datos, las actuales implementaciones no son adecuadas para sistemas críticos y no hay alternativas. Este trabajo propone una serialización predecible que ha implicado el desarrollo de un nuevo compilador para la generación de código optimizado acorde al modelo computacional. La solución propuesta tiene la ventaja que en tiempo de compilación nos permite planificar las comunicaciones y ajustar el uso de memoria. Con el objetivo de validar el diseño e implementación se ha llevado a cabo un exigente proceso de validación con énfasis en: el comportamiento funcional, el uso de memoria, el uso del procesador (tiempo de respuesta de extremo a extremo y en cada uno de los bloques funcionales) y el uso de la red (consumo real conforme al estimado). Los buenos resultados obtenidos en una aplicación industrial desarrollada por Thales Avionics (un sistema de gestión de vuelo) y en las pruebas exhaustivas han demostrado que el diseño y el prototipo son fiables para aplicaciones industriales con estrictos requisitos temporales.
Resumo:
The use of modular or ‘micro’ maximum power point tracking (MPPT) converters at module level in series association, commercially known as “power optimizers”, allows the individual adaptation of each panel to the load, solving part of the problems related to partial shadows and different tilt and/or orientation angles of the photovoltaic (PV) modules. This is particularly relevant in building integrated PV systems. This paper presents useful behavioural analytical studies of cascade MPPT converters and evaluation test results of a prototype developed under a Spanish national research project. On the one hand, this work focuses on the development of new useful expressions which can be used to identify the behaviour of individual MPPT converters applied to each module and connected in series, in a typical grid-connected PV system. On the other hand, a novel characterization method of MPPT converters is developed, and experimental results of the prototype are obtained: when individual partial shading is applied, and they are connected in a typical grid connected PV array
Resumo:
Protein-coding gene families are sets of similar genes with a shared evolutionary origin and, generally, with similar biological functions. In plants, the size and role of gene families has been only partially addressed. However, suitable bioinformatics tools are being developed to cluster the enormous number of sequences currently available in databases. Specifically, comparative genomic databases promise to become powerful tools for gene family annotation in plant clades. In this review, I evaluate the data retrieved from various gene family databases, the ease with which they can be extracted and how useful the extracted information is.
Resumo:
With electricity consumption increasing within the UnitedStates, new paradigms of delivering electricity are required in order to meet demand. One promising option is the increased use of distributedpowergeneration. Already a growing percentage of electricity generation, distributedgeneration locates the power plant physically close to the consumer, avoiding transmission and distribution losses as well as providing the possibility of combined heat and power. Despite the efficiency gains possible, regulators and utilities have been reluctant to implement distributedgeneration, creating numerous technical, regulatory, and business barriers. Certain governments, most notable California, are making concerted efforts to overcome these barriers in order to ensure distributedgeneration plays a part as the country meets demand while shifting to cleaner sources of energy.
Resumo:
Work on distributed data management commenced shortly after the introduction of the relational model in the mid-1970's. 1970's and 1980's were very active periods for the development of distributed relational database technology, and claims were made that in the following ten years centralized databases will be an “antique curiosity” and most organizations will move toward distributed database managers [1]. That prediction has certainly become true, and all commercial DBMSs today are distributed.
Resumo:
The problem of fairly distributing the capacity of a network among a set of sessions has been widely studied. In this problem, each session connects via a single path a source and a destination, and its goal is to maximize its assigned transmission rate (i.e., its throughput). Since the links of the network have limited bandwidths, some criterion has to be defined to fairly distribute their capacity among the sessions. A popular criterion is max-min fairness that, in short, guarantees that each session i gets a rate λi such that no session s can increase λs without causing another session s' to end up with a rate λs/ <; λs. Many max-min fair algorithms have been proposed, both centralized and distributed. However, to our knowledge, all proposed distributed algorithms require control data being continuously transmitted to recompute the max-min fair rates when needed (because none of them has mechanisms to detect convergence to the max-min fair rates). In this paper we propose B-Neck, a distributed max-min fair algorithm that is also quiescent. This means that, in absence of changes (i.e., session arrivals or departures), once the max min rates have been computed, B-Neck stops generating network traffic. Quiescence is a key design concept of B-Neck, because B-Neck routers are capable of detecting and notifying changes in the convergence conditions of max-min fair rates. As far as we know, B-Neck is the first distributed max-min fair algorithm that does not require a continuous injection of control traffic to compute the rates. The correctness of B-Neck is formally proved, and extensive simulations are conducted. In them, it is shown that B-Neck converges relatively fast and behaves nicely in presence of sessions arriving and departing.
Resumo:
Membrane systems are computational equivalent to Turing machines. However, their distributed and massively parallel nature obtains polynomial solutions opposite to traditional non-polynomial ones. At this point, it is very important to develop dedicated hardware and software implementations exploiting those two membrane systems features. Dealing with distributed implementations of P systems, the bottleneck communication problem has arisen. When the number of membranes grows up, the network gets congested. The purpose of distributed architectures is to reach a compromise between the massively parallel character of the system and the needed evolution step time to transit from one configuration of the system to the next one, solving the bottleneck communication problem. The goal of this paper is twofold. Firstly, to survey in a systematic and uniform way the main results regarding the way membranes can be placed on processors in order to get a software/hardware simulation of P-Systems in a distributed environment. Secondly, we improve some results about the membrane dissolution problem, prove that it is connected, and discuss the possibility of simulating this property in the distributed model. All this yields an improvement in the system parallelism implementation since it gets an increment of the parallelism of the external communication among processors. Proposed ideas improve previous architectures to tackle the communication bottleneck problem, such as reduction of the total time of an evolution step, increase of the number of membranes that could run on a processor and reduction of the number of processors.
Resumo:
An extended 3D distributed model based on distributed circuit units for the simulation of triple‐junction solar cells under realistic conditions for the light distribution has been developed. A special emphasis has been put in the capability of the model to accurately account for current mismatch and chromatic aberration effects. This model has been validated, as shown by the good agreement between experimental and simulation results, for different light spot characteristics including spectral mismatch and irradiance non‐uniformities. This model is then used for the prediction of the performance of a triple‐junction solar cell for a light spot corresponding to a real optical architecture in order to illustrate its suitability in assisting concentrator system analysis and design process.
Resumo:
The consideration of real operating conditions for the design and optimization of a multijunction solar cell receiver-concentrator assembly is indispensable. Such a requirement involves the need for suitable modeling and simulation tools in order to complement the experimental work and circumvent its well-known burdens and restrictions. Three-dimensional distributed models have been demonstrated in the past to be a powerful choice for the analysis of distributed phenomena in single- and dual-junction solar cells, as well as for the design of strategies to minimize the solar cell losses when operating under high concentrations. In this paper, we present the application of these models for the analysis of triple-junction solar cells under real operating conditions. The impact of different chromatic aberration profiles on the short-circuit current of triple-junction solar cells is analyzed in detail using the developed distributed model. Current spreading conditions the impact of a given chromatic aberration profile on the solar cell I-V curve. The focus is put on determining the role of current spreading in the connection between photocurrent profile, subcell voltage and current, and semiconductor layers sheet resistance.
Resumo:
The manipulation and handling of an ever increasing volume of data by current data-intensive applications require novel techniques for e?cient data management. Despite recent advances in every aspect of data management (storage, access, querying, analysis, mining), future applications are expected to scale to even higher degrees, not only in terms of volumes of data handled but also in terms of users and resources, often making use of multiple, pre-existing autonomous, distributed or heterogeneous resources.
Resumo:
The goal of this paper is twofold. Firstly, to survey in a systematic and uniform way the main results regarding the way membranes can be placed on processors in order to get a software/hardware simulation of P-Systems in a distributed environment. Secondly, we improve some results about the membrane dissolution problem, prove that it is connected, and discuss the possibility of simulating this property in the distributed model. All this yields an improvement in the system parallelism implementation since it gets an increment of the parallelism of the external communication among processors. Also, the number of processors grows in such a way that is notorious the increment of the parallelism in the application of the evolution rules and the internal communica-tionsstudy because it gets an increment of the parallelism in the application of the evolution rules and the internal communications. Proposed ideas improve previous architectures to tackle the communication bottleneck problem, such as reduction of the total time of an evolution step, increase of the number of membranes that could run on a processor and reduction of the number of processors
Resumo:
Belief propagation (BP) is a technique for distributed inference in wireless networks and is often used even when the underlying graphical model contains cycles. In this paper, we propose a uniformly reweighted BP scheme that reduces the impact of cycles by weighting messages by a constant ?edge appearance probability? rho ? 1. We apply this algorithm to distributed binary hypothesis testing problems (e.g., distributed detection) in wireless networks with Markov random field models. We demonstrate that in the considered setting the proposed method outperforms standard BP, while maintaining similar complexity. We then show that the optimal ? can be approximated as a simple function of the average node degree, and can hence be computed in a distributed fashion through a consensus algorithm.
Resumo:
This article proposes a MAS architecture for network diagnosis under uncertainty. Network diagnosis is divided into two inference processes: hypothesis generation and hypothesis confirmation. The first process is distributed among several agents based on a MSBN, while the second one is carried out by agents using semantic reasoning. A diagnosis ontology has been defined in order to combine both inference processes. To drive the deliberation process, dynamic data about the influence of observations are taken during diagnosis process. In order to achieve quick and reliable diagnoses, this influence is used to choose the best action to perform. This approach has been evaluated in a P2P video streaming scenario. Computational and time improvements are highlight as conclusions.
Resumo:
Many of the emerging telecom services make use of Outer Edge Networks, in particular Home Area Networks. The configuration and maintenance of such services may not be under full control of the telecom operator which still needs to guarantee the service quality experienced by the consumer. Diagnosing service faults in these scenarios becomes especially difficult since there may be not full visibility between different domains. This paper describes the fault diagnosis solution developed in the MAGNETO project, based on the application of Bayesian Inference to deal with the uncertainty. It also takes advantage of a distributed framework to deploy diagnosis components in the different domains and network elements involved, spanning both the telecom operator and the Outer Edge networks. In addition, MAGNETO features self-learning capabilities to automatically improve diagnosis knowledge over time and a partition mechanism that allows breaking down the overall diagnosis knowledge into smaller subsets. The MAGNETO solution has been prototyped and adapted to a particular outer edge scenario, and has been further validated on a real testbed. Evaluation of the results shows the potential of our approach to deal with fault management of outer edge networks.