884 results for Hadoop distributed file system (HDFS)
Abstract:
This paper describes ExperNet, an intelligent multi-agent system developed under an EU-funded project to assist in the management of a large-scale data network. ExperNet helps network operators at various nodes of a WAN to detect and diagnose hardware failures and network traffic problems, and suggests the most feasible solution through a web-based interface. ExperNet is composed of intelligent agents capable of both local problem solving and social interaction with one another to coordinate problem diagnosis and repair. The current network state is captured and maintained by conventional network management and monitoring software components, which have been integrated into the system through dedicated information exchange interfaces. To implement the agents, a distributed Prolog system enhanced with networking facilities was developed. The agents' knowledge base is built on an extensible, reactive knowledge base system capable of handling multiple types of knowledge representation. ExperNet has been developed, installed and successfully tested in an experimental network zone in Ukraine.
Abstract:
Attacks on information networks are becoming increasingly sophisticated and demand constant evolution and improvement of detection techniques. To that end, this project designs and implements a cooperative platform for network-based intrusion detection. First, a theoretical study of the relevant technological framework was carried out, describing and characterizing the software used to attack systems (malware) and the methods used to deliver that software (attack vectors). The document also describes the so-called APTs, which are targeted attacks backed by a large investment of money and time and which can encompass all existing malware and attack vectors. To counter these attacks, intrusion detection and prevention systems are studied, with a brief description of the algorithms most commonly used today.
Second, a networked platform that analyzes packets and connections to detect possible intrusions was designed and developed. The system is aimed at SCADA (Supervisory Control And Data Acquisition) systems, although it works on any IPv4/IPv6 network; a SCADA system and its main parts are therefore defined beforehand. The system is implemented on low-power devices called Raspberry Pi, which are placed between the network and the end device to be analyzed. They run two purpose-built client-server applications (the central Raspberry runs the server application and the slaves run the client application) that work cooperatively using Hadoop distributed technology, which is explained beforehand. This technology makes it possible to build a fully scalable system. The server application provides a graphical interface for managing the analysis platform centrally, showing the alarms of each device and rating each packet by how dangerous it is. The algorithm developed in the application computes the rate of packets per unit time entering and leaving the end device, processing and analyzing the packets in light of their signaling information and building several databases that progressively improve the robustness of the system, thereby reducing the likelihood of external attacks. Finally, the initial project included cloud processing for the main application so that several infrastructures could be managed concurrently; because of the extra work required, the system has instead been left prepared for this functionality to be implemented. In the current experimental setup the server application runs on the main Raspberry, yielding a scalable, fast and fault-tolerant system.
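The abstract above mentions an algorithm that computes the rate of packets per unit time entering or leaving the end device. The thesis does not give its details here, so the following is only a minimal sketch of one common way to do this, a sliding-window counter. The class name `PacketRateMonitor`, the window length and the alarm threshold are all hypothetical and are not taken from the project.

```python
from collections import deque


class PacketRateMonitor:
    """Sliding-window packet-rate estimator (illustrative assumption only)."""

    def __init__(self, window_s: float, threshold_pps: float) -> None:
        self.window_s = window_s            # window length in seconds (assumed)
        self.threshold_pps = threshold_pps  # alarm threshold in packets/s (assumed)
        self._times = deque()               # timestamps of packets inside the window

    def rate(self) -> float:
        """Packets per second over the current window."""
        return len(self._times) / self.window_s

    def observe(self, timestamp: float) -> bool:
        """Record one packet; return True if the rate now exceeds the threshold."""
        self._times.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while self._times and self._times[0] <= timestamp - self.window_s:
            self._times.popleft()
        return self.rate() > self.threshold_pps
```

One monitor per direction (inbound and outbound) would give the two rates the abstract refers to; how those rates are combined with signaling information is not specified in the source.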
Abstract:
The application of pervasive computing is extending from field-specific to everyday use. The Internet of Things (IoT) is the most prominent example of its application and of its intrinsic complexity compared with classical application development. The main characteristic that differentiates pervasive computing from other forms of computing lies in how contextual information is used. Classical applications either use no contextual information at all or use only a small part of it, integrated in an ad hoc fashion through an application-specific implementation. This one-off handling stems from the difficulty of sharing context across applications. In fact, what counts as contextual information depends on the type of application. For an image editor, for instance, the image is the information and its metadata, such as the time of the shot or the camera settings, are the context; for a file system, the image together with its camera settings is the information, and the context consists of metadata external to the file, such as the modification date or the last-access timestamp. Contextual information is therefore hard to share, and a communication middleware that supports context explicitly eases application development in pervasive computing. At the same time, the use of context should not be mandatory; otherwise the communication middleware would be reduced to a context middleware and would no longer be compatible with non-context-aware applications.
SilboPS, our implementation of a content-based publish/subscribe system inspired by SIENA [11, 9], solves this problem by adding two new elements to the paradigm: the context and the context function. The context represents the actual contextual information of the message to be sent, or the information required by the subscriber in order to receive notifications, whereas the context function is evaluated using the publisher's context and the subscriber's context to decide whether the current message and context are useful for the subscriber. In this way, the logic that manages the context is decoupled from the logic of the context function, increasing the flexibility of communication across different applications. Since the default context is empty, context-aware and classical applications can use the same SilboPS, which resolves the syntactic mismatch between the two categories. The possible semantic mismatch remains, however, because it depends on how each application interprets the data and cannot be resolved by an agnostic third party.
The IoT environment poses challenges not only of context but also of scalability. The number of sensors, the volume of data they produce and the number of applications that could be interested in using such data are growing continuously. Today's response to this need is cloud computing, but it requires applications to be able not only to scale but to do so elastically [22]. Unfortunately, there is no slicing primitive for distributed systems that supports partitioning of internal state [33] together with hot swapping, and current cloud systems such as OpenStack or OpenNebula do not provide elastic monitoring out of the box. This results in a two-sided problem: (1) how to scale an application elastically and (2) how to monitor the application to know when it should scale in or out.
E-SilboPS is the elastic version of SilboPS. It is well suited as a solution to the monitoring problem thanks to its content-based publish/subscribe nature and, unlike other solutions [5], it scales efficiently to meet workload demand without overprovisioning or underprovisioning resources. It is also based on a newly designed algorithm that shows how to add elasticity to an application under different state constraints: stateless, isolated state with external coordination, and shared state with general coordination. Its evaluation shows that remarkable speedups can be achieved, with the network layer being the main limiting factor; the calculated efficiency (see Figure 5.8) shows how each configuration performs with respect to adjacent configurations. This gives insight into the current trend of the whole system and makes it possible to predict whether the gain in notification throughput of the next configuration will offset its cost. Particular attention was paid to evaluating same-cost deployments in order to find the best one for a given workload. Finally, the overhead introduced by the different configurations was estimated to identify the primary limiting factor for throughput. This helps to determine the intrinsic sequential part and base overhead [26] of an optimal versus a suboptimal deployment. Depending on the type of workload, the estimate can be as low as 10% in a local optimum or as high as 60% when an overprovisioned configuration is deployed for the workload. This estimate of the Karp-Flatt metric is important for system management because it indicates the direction (scale in or scale out) in which the deployment has to be changed to improve its performance, instead of simply applying a scale-out policy.
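For readers unfamiliar with the Karp-Flatt metric referred to above: it estimates the serial fraction e of a workload from a measured speedup ψ on p processing units, as e = (1/ψ - 1/p) / (1 - 1/p). The sketch below only evaluates that standard formula; the function name and any sample values are hypothetical and are not measurements from the thesis.

```python
def karp_flatt(speedup: float, units: int) -> float:
    """Experimentally determined serial fraction e for a measured speedup.

    speedup: measured speedup relative to a single unit (psi)
    units:   number of processing units used (p), must be at least 2
    """
    if units < 2:
        raise ValueError("the metric is undefined for fewer than 2 units")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0 / units) / (1.0 - 1.0 / units)
```

A value near 0 indicates almost perfectly parallel behaviour for that configuration, while a value that grows as units are added suggests that overhead, rather than a fixed sequential part, is limiting throughput.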
Abstract:
Working memory is the process of actively maintaining a representation of information for a brief period of time so that it is available for use. In monkeys, visual working memory involves the concerted activity of a distributed neural system, including posterior areas in visual cortex and anterior areas in prefrontal cortex. Within visual cortex, ventral stream areas are selectively involved in object vision, whereas dorsal stream areas are selectively involved in spatial vision. This domain specificity appears to extend forward into prefrontal cortex, with ventrolateral areas involved mainly in working memory for objects and dorsolateral areas involved mainly in working memory for spatial locations. The organization of this distributed neural system for working memory in monkeys appears to be conserved in humans, though some differences between the two species exist. In humans, as compared with monkeys, areas specialized for object vision in the ventral stream have a more inferior location in temporal cortex, whereas areas specialized for spatial vision in the dorsal stream have a more superior location in parietal cortex. Displacement of both sets of visual areas away from the posterior perisylvian cortex may be related to the emergence of language over the course of brain evolution. Whereas areas specialized for object working memory in humans and monkeys are similarly located in ventrolateral prefrontal cortex, those specialized for spatial working memory occupy a more superior and posterior location within dorsal prefrontal cortex in humans than in monkeys. As in posterior cortex, this displacement in frontal cortex also may be related to the emergence of new areas to serve distinctively human cognitive abilities.
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
Distributed digital control systems provide alternatives to conventional, centralised digital control systems. Typically, a modern distributed control system will comprise a multi-processor or network of processors, a communications network, an associated set of sensors and actuators, and the systems and applications software. This thesis addresses the problem of how to design robust decentralised control systems, such as those used to control event-driven, real-time processes in time-critical environments. Emphasis is placed on studying the dynamical behaviour of a system and identifying ways of partitioning the system so that it may be controlled in a distributed manner. A structural partitioning technique is adopted which makes use of natural physical sub-processes in the system, which are then mapped onto the software processes that control the system. However, communications are required between the processes because of the disjoint nature of the distributed (i.e. partitioned) state of the physical system. The structural partitioning technique, and recent developments in the theory of potential controllability and observability of a system, are the basis for the design of controllers. In particular, the method is used to derive a decentralised estimate of the state vector for a continuous-time system. The work is also extended to derive a distributed estimate for a discrete-time system. Emphasis is also given to the role of communications in the distributed control of processes and to the partitioning technique necessary to design distributed and decentralised systems with resilient structures. A method is presented for the systematic identification of necessary communications for distributed control. It is also shown that the structural partitions can be used directly in the design of software fault-tolerant concurrent controllers.
In particular, the structural partition can be used to identify the boundary of the conversation which can be used to protect a specific part of the system. In addition, for certain classes of system, the partitions can be used to identify processes which may be dynamically reconfigured in the event of a fault. These methods should be of use in the design of robust distributed systems.
Abstract:
The immune system is perhaps the largest yet most diffuse and distributed somatic system in vertebrates. It plays vital roles in fighting infection and in the homeostatic control of chronic disease. As such, the immune system in both pathological and healthy states is a prime target for therapeutic interventions by drugs, both small-molecule and biologic. Comprising both the innate and adaptive immune systems, human immunity is awash with potential unexploited molecular targets. Key examples include the pattern recognition receptors of the innate immune system and the major histocompatibility complex of the adaptive immune system. Moreover, the immune system is also the source of many current and, hopefully, future drugs, of which the prime example is the monoclonal antibody, the most exciting and profitable type of present-day drug moiety. This brief review explores the identity and synergies of the hierarchy of drug targets represented by the human immune system, with particular emphasis on the emerging paradigm of systems pharmacology. © the authors, publisher and licensee Libertas Academica Limited.
Abstract:
To explore the feasibility of processing Compact Muon Solenoid (CMS) analysis jobs across the wide area network, the FIU CMS Tier-3 center and the Florida CMS Tier-2 center designed a remote data access strategy. A Kerberized Lustre test bed was installed at the Tier-2 with the aim of providing storage resources to private-facing worker nodes at the Tier-3. However, the Kerberos security layer cannot authenticate resources behind a private network. As a remedy, an xrootd server was installed on a public-facing node at the Tier-3 to export the file system to the private-facing worker nodes. We report the performance of CMS analysis jobs processed by the Tier-3 worker nodes accessing data from a Kerberized Lustre file system. The processing performance of this configuration is benchmarked against a direct connection to the Lustre file system and, separately, against a configuration in which the xrootd server is near the Lustre file system.
Abstract:
This paper reviews radio-over-fiber (RoF) technology as a means of supporting the explosive growth of mobile broadband. An RoF system provides a platform for a distributed antenna system (DAS) as a fronthaul for long term evolution (LTE) technology. A higher splitting ratio from a macrocell is required to support a large DAS topology, which calls for a higher optical launch power (OLP). However, high OLP generates undesired nonlinearities, namely stimulated Brillouin scattering (SBS). Three different approaches to mitigating SBS are covered in this paper, and the solutions ultimately provided an additional 4 dB of link budget.
Abstract:
Formal Concept Analysis is an unsupervised machine learning technique that has been applied successfully to document organisation by treating documents as objects and keywords as attributes. The basic algorithms of Formal Concept Analysis then allow an intelligent information retrieval system to cluster documents according to keyword views. This paper investigates the scalability of this idea. In particular, we present the results of applying spatial data structures to large datasets in Formal Concept Analysis. Our experiments are motivated by the application of Formal Concept Analysis to the idea of a virtual filesystem [11,17,15], in particular the libferris [1] Semantic File System. This paper presents customizations to an RD-Tree Generalized Index Search Tree based index structure to better support the application of Formal Concept Analysis to large data sources.
Abstract:
This work aims to contribute to improving the efficiency of water transport and distribution systems, which can be achieved by recovering the potential energy that, in certain situations, exists in excess in gravity-fed pipelines. Although the issue has already been addressed in several studies, the energy savings it may bring justify analysing every opportunity, especially in our country, whose dependence on foreign energy is well known. However, implementing solutions that rely on installing turbines in water supply pipelines naturally causes some apprehension among the managing utilities, since it may jeopardize the integrity of the pipelines and, consequently, the water supply. In this context, the study of control models specific to such equipment may contribute to wider adoption of solutions that improve the efficiency of water supply systems through the installation of hydroelectric generators, which serve the dual function of flow control and energy production. The study and simulation of the control models presented in this work lead to the conclusion that it is possible to guarantee the safety of the pipelines while producing electricity with turbines installed in them. It is therefore worth pursuing this type of study further in order to obtain control models that, under the stated premises, allow energy production to be optimized.
Abstract:
Scheduling resolution requires the intervention of highly skilled human problem-solvers. This is a very hard and challenging domain because current systems are becoming more and more complex, distributed, interconnected and subject to rapid change. A natural evolution from current computing towards Autonomic Computing is to provide systems with self-managing abilities that require minimal human intervention. This paper addresses the resolution of complex scheduling problems using cooperative negotiation. A framework based on multi-agent, autonomic and meta-heuristic techniques, with self-configuring capabilities, is proposed.
Abstract:
Consider the problem of disseminating data from an arbitrary source node to all other nodes in a distributed computer system, such as a wireless sensor network (WSN). We assume that wireless broadcast is used and that nodes do not know the topology. We propose new protocols which disseminate data faster and use fewer broadcasts than the simple broadcast protocol.
Abstract:
Monitoring program behaviour during execution is necessary in many application contexts: for example, to check the use of computational resources during execution, to compute metrics that better characterize the profile of the application, to better identify the points in the execution that cause deviations from the desired behaviour of a program and, in other cases, to control the configuration of the application or of the system that supports its execution. This technique has been applied to both sequential and distributed programs. In the case of parallel computations in particular, given the complexity arising from their non-determinism, these techniques have been the best source of information for understanding the execution of the application, both in terms of its correctness and in assessing its performance and use of computational resources. The main difficulties in developing and adopting monitoring tools stem from the complexity of parallel and distributed computing systems and from the need to develop specific solutions for each platform, each architecture and each goal. Nevertheless, there are generic functionalities that, if present in all cases, can help in developing new tools and adapting them to different computing environments. This dissertation proposes a model to support the observation and control of parallel and distributed applications (DAMS - Distributed Applications Monitoring System). The model defines an abstract monitoring architecture based on a minimal core on which sets of services are built that provide the functionality required in each usage scenario. Its organization into abstraction layers and its capacity for modular extension support the development of sets of functionality that can be shared by different tools.
In addition, the proposed model eases the development of observation and control tools on different execution support platforms. The dissertation presents examples of the use of the model and of its supporting infrastructure in several observation and control scenarios. It also describes the experiments carried out with prototypes developed on two distinct computing platforms.
The utilization bound of non-preemptive rate-monotonic scheduling in controller area networks is 25%
Abstract:
Consider a distributed computer system comprising many computer nodes interconnected by a controller area network (CAN) bus. We prove that if priorities are assigned to message streams using the rate-monotonic (RM) policy and the requested capacity of the CAN bus does not exceed 25%, then all deadlines are met.
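The result above can be read as a simple sufficient schedulability test. The sketch below assumes that "requested capacity" means the sum of transmission time divided by period over all message streams, and that rate-monotonic assignment gives higher priority to shorter periods; the `Stream` type and its field names are hypothetical, and any further assumptions made in the paper (for example about deadlines or blocking) are not modelled.

```python
from typing import List, NamedTuple


class Stream(NamedTuple):
    """A periodic CAN message stream (field names are illustrative)."""
    name: str
    tx_time: float  # time to transmit one message on the bus
    period: float   # minimum time between message releases


def rm_priority_order(streams: List[Stream]) -> List[Stream]:
    """Rate-monotonic order: the shorter the period, the higher the priority."""
    return sorted(streams, key=lambda s: s.period)


def requested_capacity(streams: List[Stream]) -> float:
    """Fraction of bus time requested by all streams together."""
    return sum(s.tx_time / s.period for s in streams)


def passes_utilization_test(streams: List[Stream], bound: float = 0.25) -> bool:
    """Sufficient test: True if requested capacity does not exceed the bound."""
    return requested_capacity(streams) <= bound
```

Because the test is only sufficient, a stream set that fails it is not necessarily unschedulable; it simply falls outside the guarantee stated in the abstract.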