Biblioteca Digital

74 resultados para Loops parallelization

A tutorial on program development and optimization using the Ciao preprocessor

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We present in a tutorial fashion CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system libraries), to generate and simplify run-time tests, and to perform high-level program transformations such as multiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, non-failure, and bounds on resource consumption (time or space cost).

Nuevos dispositivos y criterios de análisis de señales de resonancia magnética para la caracterización humedad/porosidad de suelos y acuíferos superficiales

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Un estudio geofísico mediante resonancia se realiza mediante la excitación del agua del subsuelo a partir de la emisión de una intensidad variable a lo largo de un cable extendido sobre la superficie en forma cuadrada o circular. El volumen investigado depende del tamaño de dicho cable, lo cual, junto con la intensidad utilizada para la excitación del agua determina las diferentes profundidades del terreno de las que se va a extraer información, que se encuentran entre 10 y 100 m, habitualmente. La tesis doctoral presentada consiste en la adaptación del Método de Resonancia Magnética para su utilización en aplicaciones superficiales mediante bucles de tamaño reducido. Dicha información sobre el terreno en la escala desde decímetros a pocos metros es interesante en relación a la física de suelos y en general en relación a diferentes problemas de Ingeniería, tanto de extracción de agua como constructiva. Una vez realizada la revisión del estado de conocimiento actual del método en relación a sus aplicaciones usuales, se estudian los problemas inherentes a su adaptación a medidas superficiales. Para solventar dichos problemas se han considerado dos líneas de investigación principales: En primer lugar se realiza un estudio de la influencia de las características del pulso de excitación emitido por el equipo en la calidad de las medidas obtenidas, y las posibles estrategias para mejorar dicho pulso. El pulso de excitación es un parámetro clave en la extracción de información sobre diferentes profundidades del terreno. Por otro lado se busca la optimización del dispositivo de medida para su adaptación al estudio de los primeros metros del suelo mediante el equipo disponible, tratándose éste del equipo NumisLITE de la casa Iris Instruments. ABSTRACT Magnetic Resonance Sounding is a geophysical method performed through the excitation of the subsurface water by a variable electrical intensity delivered through a wire extended on the surface, forming a circle or a square. The investigated volume depends on the wire length and the intensity used, determining the different subsurface depths reached. In the usual application of the method, this depth ranges between 10 and 100 m. This thesis studies the adaptation of the above method to more superficial applications using smaller wire loops. Information about the subsurface in the range of decimeter to a few meters is interesting regarding physics of soils, as well as different Engineering problems, either for water extraction or for construction. After a review of the nowadays state of the art of the method regarding its usual applications, the special issues attached to its use to perform very shallow measures are studied. In order to sort out these problems two main research lines are considered: On the one hand, a study about the influence of the characteristics of the emitted pulse in the resulting measure quality is performed. Possible strategies in order to improve this pulse are investigated, as the excitation pulse is a key parameter to obtain information from different depths of the subsurface. On the other hand, the study tries to optimize the measurement device to its adaptation to the study of the first meters of the ground with the available instrumentation, the NumisLITE equipment from Iris Instruments.

Termination and cost analysis: Complexity and precision issues

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The research in this thesis is related to static cost and termination analysis. Cost analysis aims at estimating the amount of resources that a given program consumes during the execution, and termination analysis aims at proving that the execution of a given program will eventually terminate. These analyses are strongly related, indeed cost analysis techniques heavily rely on techniques developed for termination analysis. Precision, scalability, and applicability are essential in static analysis in general. Precision is related to the quality of the inferred results, scalability to the size of programs that can be analyzed, and applicability to the class of programs that can be handled by the analysis (independently from precision and scalability issues). This thesis addresses these aspects in the context of cost and termination analysis, from both practical and theoretical perspectives. For cost analysis, we concentrate on the problem of solving cost relations (a form of recurrence relations) into closed-form upper and lower bounds, which is the heart of most modern cost analyzers, and also where most of the precision and applicability limitations can be found. We develop tools, and their underlying theoretical foundations, for solving cost relations that overcome the limitations of existing approaches, and demonstrate superiority in both precision and applicability. A unique feature of our techniques is the ability to smoothly handle both lower and upper bounds, by reversing the corresponding notions in the underlying theory. For termination analysis, we study the hardness of the problem of deciding termination for a speci�c form of simple loops that arise in the context of cost analysis. This study gives a better understanding of the (theoretical) limits of scalability and applicability for both termination and cost analysis.

Effectiveness of combined sharing and freeness analysis using abstract interpretation

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents improved unification algorithms, an implementation, and an analysis of the effectiveness of an abstract interpreter based on the sharing + freeness domain presented in a previous paper, which was designed to accurately and concisely represent combined freeness and sharing information for program variables. We first briefly review this domain and the unification algorithms previously proposed. We then improve these algorithms and correct them to deal with some cases which were not well analyzed previously, illustrating the improvement with an example. We then present the implementation of the improved algorithm and evaluate its performance by comparing the effectiveness of the information inferred to that of other interpreters available to us for an application (program parallelization) that is common to all these interpreters. All these systems have been embedded in a real parallelizing compiler. Effectiveness of the analysis is measured in terms of actual final performance of the system: i.e. in terms of the actual speedups obtained. The results show good performance for the combined domain in that it improves the accuracy of both types of information and also in that the analyzer using the combined domain is more effective in the application than any of the other analyzers it is compared to.

StreamCloud: An elastic and scalable data streaming system

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Many applications in several domains such as telecommunications, network security, large scale sensor networks, require online processing of continuous data lows. They produce very high loads that requires aggregating the processing capacity of many nodes. Current Stream Processing Engines do not scale with the input load due to single-node bottlenecks. Additionally, they are based on static con?gurations that lead to either under or over-provisioning. In this paper, we present StreamCloud, a scalable and elastic stream processing engine for processing large data stream volumes. StreamCloud uses a novel parallelization technique that splits queries into subqueries that are allocated to independent sets of nodes in a way that minimizes the distribution overhead. Its elastic protocols exhibit low intrusiveness, enabling effective adjustment of resources to the incoming load. Elasticity is combined with dynamic load balancing to minimize the computational resources used. The paper presents the system design, implementation and a thorough evaluation of the scalability and elasticity of the fully implemented system.

An evolutionary algorithm for the surface structure problem

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Many macroscopic properties: hardness, corrosion, catalytic activity, etc. are directly related to the surface structure, that is, to the position and chemical identity of the outermost atoms of the material. Current experimental techniques for its determination produce a “signature” from which the structure must be inferred by solving an inverse problem: a solution is proposed, its corresponding signature computed and then compared to the experiment. This is a challenging optimization problem where the search space and the number of local minima grows exponentially with the number of atoms, hence its solution cannot be achieved for arbitrarily large structures. Nowadays, it is solved by using a mixture of human knowledge and local search techniques: an expert proposes a solution that is refined using a local minimizer. If the outcome does not fit the experiment, a new solution must be proposed again. Solving a small surface can take from days to weeks of this trial and error method. Here we describe our ongoing work in its solution. We use an hybrid algorithm that mixes evolutionary techniques with trusted region methods and reuses knowledge gained during the execution to avoid repeated search of structures. Its parallelization produces good results even when not requiring the gathering of the full population, hence it can be used in loosely coupled environments such as grids. With this algorithm, the solution of test cases that previously took weeks of expert time can be automatically solved in a day or two of uniprocessor time.

Comparison of V2IC control with voltage mode and current mode controls for high frequency (MHz) and very fast response applications

Relevância:

10.00% 10.00%

Publicador:

Resumo:

High switching frequencies (several MHz) allow the integration of low power DC/DC converters. Although, in theory, a high switching frequency would make possible to implement a conventional Voltage Mode control (VMC) or Peak Current Mode control (PCMC) with very high bandwidth, in practice, parasitic effects and robustness limits the applicability of these control techniques. This paper compares VMC and CMC techniques with the V2IC control. This control is based on two loops. The fast internal loop has information of the output capacitor current and the error voltage, providing fast dynamic response under load and voltage reference steps, while the slow external voltage loop provides accurate steady state regulation. This paper shows the fast dynamic response of the V2IC control under load and output voltage reference steps and its robustness operating with additional output capacitors added by the customer.

Two-wavelength switching with a distributed-feed back semiconductor optical amplifier (DFBSOA)

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Switching of a signal beam by another control beam at different wavelength is demonstrated experimentally using the optical bistability occurring in a 1.55 mm-distributed feedback semiconductor optical amplifier (DFBSOA) working in reflection. Counterclockwise (S-shaped) and reverse (clockwise) bistability are observed in the output of the control and the signal beam respectively, as the power of the input control signal is increased. With this technique an optical signal can be set in either of the optical input wavelengths by appropriate choice of the powers of the input signals. The switching properties of the DFBSOA are studied experimentally as the applied bias current is increased from below to above threshold and for different levels of optical power in the signal beam and different wavelength detunings between both input signals. Higher on-off extinction ratios, wider bistable loops and lower input power requirements for switching are obtained when the DFBSOA is operated slightly above its threshold value.

Modeling Reflective Bistability in Vertical-Cavity Semiconductor Optical Amplifiers

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The characteristics of optical bistability in a vertical- cavity semiconductor optical amplifier (VCSOA) operated in reflection are reported. The dependences of the optical bistability in VCSOAs on the initial phase detuning and on the applied bias current are analyzed. The optical bistability is also studied for different numbers of superimposed periods in the top distributed bragg reflector (DBR) that conform the internal cavity of the device. The appearance of the X-bistable and the clockwise bistable loops is predicted theoretically in a VCSOA operated in reflection for the first time, to the best of our knowledge. Moreover, it is also predicted that the control of the VCSOA’s top reflectivity by the addition of new superimposed periods in its top DBR reduces by one order of magnitude the input power needed for the assessment of the X- and the clockwise bistable loop, compared to that required in in-plane semiconductor optical amplifiers. These results, added to the ease of fabricating two-dimensional arrays of this kind of device could be useful for the development of new optical logic or optical signal regeneration devices.

Diseño de los sistemas de comunicación en plantas de exploración y producción de hidrocarburos

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Las plantas industriales de exploración y producción de petróleo y gas disponen de numerosos sistemas de comunicación que permiten el correcto funcionamiento de los procesos que tienen lugar en ella así como la seguridad de la propia planta. Para el presente Proyecto Fin de Carrera se ha llevado a cabo el diseño del sistema de megafonía PAGA (Public Address and General Alarm) y del circuito cerrado de televisión (CCTV) en la unidad de procesos Hydrocrcaker encargada del craqueo de hidrógeno. Partiendo de los requisitos definidos por las especificaciones corporativas de los grupos petroleros para ambos sistemas, PAGA y CCTV, se han expuesto los principios teóricos sobre los que se fundamenta cada uno de ellos y las pautas a seguir para el diseño y demostración del buen funcionamiento a partir de software específico. Se ha empleado las siguientes herramientas software: EASE para la simulación acústica, PSpice para la simulación eléctrica de las etapas de amplificación en la megafonía; y JVSG para el diseño de CCTV. La sonorización tanto de las unidades como del resto de instalaciones interiores ha de garantizar la inteligibilidad de los mensajes transmitidos. La realización de una simulación acústica permite conocer cómo va a ser el comportamiento de la megafonía sin necesidad de instalar el sistema, lo cual es muy útil para este tipo de proyectos cuya ingeniería se realiza previamente a la construcción de la planta. Además se comprueba el correcto diseño de las etapas de amplificación basadas en líneas de alta impedancia o de tensión constante (100 V). El circuito cerrado de televisión (CCTV) garantiza la transmisión de señales visuales de todos los accesos a las instalaciones y unidades de la planta así como la visión en tiempo real del correcto funcionamiento de los procesos químicos llevados a cabo en la refinería. El sistema dispone de puestos de control remoto para el manejo y gestión de las cámaras desplegadas; y de un sistema de almacenamiento de las grabaciones en discos duros (RAID-5) a través de una red SAN (Storage Area Network). Se especifican las diferentes fases de un proyecto de ingeniería en el sector de E&P de hidrocarburos entre las que se destaca: propuesta y adquisición, reunión de arranque (KOM, Kick Off Meeting), estudio in situ (Site Survey), plan de proyecto, diseño y documentación, procedimientos de pruebas, instalación, puesta en marcha y aceptaciones del sistema. Se opta por utilizar terminología inglesa dado al ámbito global del sector. En la última parte del proyecto se presenta un presupuesto aproximado de los materiales empleados en el diseño de PAGA y CCTV. ABSTRACT. Integrated communications for Oil and Gas allows reducing risks, improving productivity, reducing costs, and countering threats to safety and security. Both PAGA system (Public Address and General Alarm) and Closed Circuit Television have been designed for this project in order to ensure a reliable security of an oil refinery. Based on the requirements defined by corporate specifications for both systems (PAGA and CCTV), theoretical principles have been presented as well as the guidelines for the design and demonstration of a reliable design. The following software has been used: EASE for acoustic simulation; PSpice for simulation of the megaphony amplification loops; and JVSG tool for CCTV design. Acoustic for both the units and the other indoor facilities must ensure intelligibility of the transmitted messages. An acoustic simulation allows us to know how will be the performance of the PAGA system without installing loudspeakers, which is very useful for this type of project whose engineering is performed prior to the construction of the plant. Furthermore, it has been verified the correct design of the amplifier stages based on high impedance lines or constant voltage (100 V). Closed circuit television (CCTV) ensures the transmission of visual signals of all access to facilities as well as real-time view of the proper functioning of chemical processes carried out at the refinery. The system has remote control stations for the handling and management of deployed cameras. It is also included a storage system of the recordings on hard drives (RAID - 5) through a SAN (Storage Area Network). Phases of an engineering project in Oil and Gas are defined in the current project. It includes: proposal and acquisition, kick-off meeting (KOM), Site Survey, project plan, design and documentation, testing procedures (SAT and FAT), installation, commissioning and acceptance of the systems. Finally, it has been presented an estimate budget of the materials used in the design of PAGA and CCTV.

Simulación en tiempo real de vehículos industriales con modelos multicuerpo de gran complejidad

Relevância:

10.00% 10.00%

Publicador:

Resumo:

El objetivo de esta Tesis ha sido la consecución de simulaciones en tiempo real de vehículos industriales modelizados como sistemas multicuerpo complejos formados por sólidos rígidos. Para el desarrollo de un programa de simulación deben considerarse cuatro aspectos fundamentales: la modelización del sistema multicuerpo (tipos de coordenadas, pares ideales o impuestos mediante fuerzas), la formulación a utilizar para plantear las ecuaciones diferenciales del movimiento (coordenadas dependientes o independientes, métodos globales o topológicos, forma de imponer las ecuaciones de restricción), el método de integración numérica para resolver estas ecuaciones en el tiempo (integradores explícitos o implícitos) y finalmente los detalles de la implementación realizada (lenguaje de programación, librerías matemáticas, técnicas de paralelización). Estas cuatro etapas están interrelacionadas entre sí y todas han formado parte de este trabajo. Desde la generación de modelos de una furgoneta y de camión con semirremolque, el uso de tres formulaciones dinámicas diferentes, la integración de las ecuaciones diferenciales del movimiento mediante métodos explícitos e implícitos, hasta el uso de funciones BLAS, de técnicas de matrices sparse y la introducción de paralelización para utilizar los distintos núcleos del procesador. El trabajo presentado en esta Tesis ha sido organizado en 8 capítulos, dedicándose el primero de ellos a la Introducción. En el Capítulo 2 se presentan dos formulaciones semirrecursivas diferentes, de las cuales la primera está basada en una doble transformación de velocidades, obteniéndose las ecuaciones diferenciales del movimiento en función de las aceleraciones relativas independientes. La integración numérica de estas ecuaciones se ha realizado con el método de Runge-Kutta explícito de cuarto orden. La segunda formulación está basada en coordenadas relativas dependientes, imponiendo las restricciones por medio de penalizadores en posición y corrigiendo las velocidades y aceleraciones mediante métodos de proyección. En este segundo caso la integración de las ecuaciones del movimiento se ha llevado a cabo mediante el integrador implícito HHT (Hilber, Hughes and Taylor), perteneciente a la familia de integradores estructurales de Newmark. En el Capítulo 3 se introduce la tercera formulación utilizada en esta Tesis. En este caso las uniones entre los sólidos del sistema se ha realizado mediante uniones flexibles, lo que obliga a imponer los pares por medio de fuerzas. Este tipo de uniones impide trabajar con coordenadas relativas, por lo que la posición del sistema y el planteamiento de las ecuaciones del movimiento se ha realizado utilizando coordenadas Cartesianas y parámetros de Euler. En esta formulación global se introducen las restricciones mediante fuerzas (con un planteamiento similar al de los penalizadores) y la estabilización del proceso de integración numérica se realiza también mediante proyecciones de velocidades y aceleraciones. En el Capítulo 4 se presenta una revisión de las principales herramientas y estrategias utilizadas para aumentar la eficiencia de las implementaciones de los distintos algoritmos. En primer lugar se incluye una serie de consideraciones básicas para aumentar la eficiencia numérica de las implementaciones. A continuación se mencionan las principales características de los analizadores de códigos utilizados y también las librerías matemáticas utilizadas para resolver los problemas de álgebra lineal tanto con matrices densas como sparse. Por último se desarrolla con un cierto detalle el tema de la paralelización en los actuales procesadores de varios núcleos, describiendo para ello el patrón empleado y las características más importantes de las dos herramientas propuestas, OpenMP y las TBB de Intel. Hay que señalar que las características de los sistemas multicuerpo problemas de pequeño tamaño, frecuente uso de la recursividad, y repetición intensiva en el tiempo de los cálculos con fuerte dependencia de los resultados anteriores dificultan extraordinariamente el uso de técnicas de paralelización frente a otras áreas de la mecánica computacional, tales como por ejemplo el cálculo por elementos finitos. Basándose en los conceptos mencionados en el Capítulo 4, el Capítulo 5 está dividido en tres secciones, una para cada formulación propuesta en esta Tesis. En cada una de estas secciones se describen los detalles de cómo se han realizado las distintas implementaciones propuestas para cada algoritmo y qué herramientas se han utilizado para ello. En la primera sección se muestra el uso de librerías numéricas para matrices densas y sparse en la formulación topológica semirrecursiva basada en la doble transformación de velocidades. En la segunda se describe la utilización de paralelización mediante OpenMP y TBB en la formulación semirrecursiva con penalizadores y proyecciones. Por último, se describe el uso de técnicas de matrices sparse y paralelización en la formulación global con uniones flexibles y parámetros de Euler. El Capítulo 6 describe los resultados alcanzados mediante las formulaciones e implementaciones descritas previamente. Este capítulo comienza con una descripción de la modelización y topología de los dos vehículos estudiados. El primer modelo es un vehículo de dos ejes del tipo chasis-cabina o furgoneta, perteneciente a la gama de vehículos de carga medianos. El segundo es un vehículo de cinco ejes que responde al modelo de un camión o cabina con semirremolque, perteneciente a la categoría de vehículos industriales pesados. En este capítulo además se realiza un estudio comparativo entre las simulaciones de estos vehículos con cada una de las formulaciones utilizadas y se presentan de modo cuantitativo los efectos de las mejoras alcanzadas con las distintas estrategias propuestas en esta Tesis. Con objeto de extraer conclusiones más fácilmente y para evaluar de un modo más objetivo las mejoras introducidas en la Tesis, todos los resultados de este capítulo se han obtenido con el mismo computador, que era el top de la gama Intel Xeon en 2007, pero que hoy día está ya algo obsoleto. Por último los Capítulos 7 y 8 están dedicados a las conclusiones finales y las futuras líneas de investigación que pueden derivar del trabajo realizado en esta Tesis. Los objetivos de realizar simulaciones en tiempo real de vehículos industriales de gran complejidad han sido alcanzados con varias de las formulaciones e implementaciones desarrolladas. ABSTRACT The objective of this Dissertation has been the achievement of real time simulations of industrial vehicles modeled as complex multibody systems made up by rigid bodies. For the development of a simulation program, four main aspects must be considered: the modeling of the multibody system (types of coordinates, ideal joints or imposed by means of forces), the formulation to be used to set the differential equations of motion (dependent or independent coordinates, global or topological methods, ways to impose constraints equations), the method of numerical integration to solve these equations in time (explicit or implicit integrators) and the details of the implementation carried out (programming language, mathematical libraries, parallelization techniques). These four stages are interrelated and all of them are part of this work. They involve the generation of models for a van and a semitrailer truck, the use of three different dynamic formulations, the integration of differential equations of motion through explicit and implicit methods, the use of BLAS functions and sparse matrix techniques, and the introduction of parallelization to use the different processor cores. The work presented in this Dissertation has been structured in eight chapters, the first of them being the Introduction. In Chapter 2, two different semi-recursive formulations are shown, of which the first one is based on a double velocity transformation, thus getting the differential equations of motion as a function of the independent relative accelerations. The numerical integration of these equations has been made with the Runge-Kutta explicit method of fourth order. The second formulation is based on dependent relative coordinates, imposing the constraints by means of position penalty coefficients and correcting the velocities and accelerations by projection methods. In this second case, the integration of the motion equations has been carried out by means of the HHT implicit integrator (Hilber, Hughes and Taylor), which belongs to the Newmark structural integrators family. In Chapter 3, the third formulation used in this Dissertation is presented. In this case, the joints between the bodies of the system have been considered as flexible joints, with forces used to impose the joint conditions. This kind of union hinders to work with relative coordinates, so the position of the system bodies and the setting of the equations of motion have been carried out using Cartesian coordinates and Euler parameters. In this global formulation, constraints are introduced through forces (with a similar approach to the penalty coefficients) are presented. The stabilization of the numerical integration is carried out also by velocity and accelerations projections. In Chapter 4, a revision of the main computer tools and strategies used to increase the efficiency of the implementations of the algorithms is presented. First of all, some basic considerations to increase the numerical efficiency of the implementations are included. Then the main characteristics of the code’ analyzers used and also the mathematical libraries used to solve linear algebra problems (both with dense and sparse matrices) are mentioned. Finally, the topic of parallelization in current multicore processors is developed thoroughly. For that, the pattern used and the most important characteristics of the tools proposed, OpenMP and Intel TBB, are described. It needs to be highlighted that the characteristics of multibody systems small size problems, frequent recursion use and intensive repetition along the time of the calculation with high dependencies of the previous results complicate extraordinarily the use of parallelization techniques against other computational mechanics areas, as the finite elements computation. Based on the concepts mentioned in Chapter 4, Chapter 5 is divided into three sections, one for each formulation proposed in this Dissertation. In each one of these sections, the details of how these different proposed implementations have been made for each algorithm and which tools have been used are described. In the first section, it is shown the use of numerical libraries for dense and sparse matrices in the semirecursive topological formulation based in the double velocity transformation. In the second one, the use of parallelization by means OpenMP and TBB is depicted in the semi-recursive formulation with penalization and projections. Lastly, the use of sparse matrices and parallelization techniques is described in the global formulation with flexible joints and Euler parameters. Chapter 6 depicts the achieved results through the formulations and implementations previously described. This chapter starts with a description of the modeling and topology of the two vehicles studied. The first model is a two-axle chassis-cabin or van like vehicle, which belongs to the range of medium charge vehicles. The second one is a five-axle vehicle belonging to the truck or cabin semi-trailer model, belonging to the heavy industrial vehicles category. In this chapter, a comparative study is done between the simulations of these vehicles with each one of the formulations used and the improvements achieved are presented in a quantitative way with the different strategies proposed in this Dissertation. With the aim of deducing the conclusions more easily and to evaluate in a more objective way the improvements introduced in the Dissertation, all the results of this chapter have been obtained with the same computer, which was the top one among the Intel Xeon range in 2007, but which is rather obsolete today. Finally, Chapters 7 and 8 are dedicated to the final conclusions and the future research projects that can be derived from the work presented in this Dissertation. The objectives of doing real time simulations in high complex industrial vehicles have been achieved with the formulations and implementations developed.

Run-Time Scalable Hardware for Reconfigurable Systems

Relevância:

10.00% 10.00%

Publicador:

Resumo:

La optimización de parámetros tales como el consumo de potencia, la cantidad de recursos lógicos empleados o la ocupación de memoria ha sido siempre una de las preocupaciones principales a la hora de diseñar sistemas embebidos. Esto es debido a que se trata de sistemas dotados de una cantidad de recursos limitados, y que han sido tradicionalmente empleados para un propósito específico, que permanece invariable a lo largo de toda la vida útil del sistema. Sin embargo, el uso de sistemas embebidos se ha extendido a áreas de aplicación fuera de su ámbito tradicional, caracterizadas por una mayor demanda computacional. Así, por ejemplo, algunos de estos sistemas deben llevar a cabo un intenso procesado de señales multimedia o la transmisión de datos mediante sistemas de comunicaciones de alta capacidad. Por otra parte, las condiciones de operación del sistema pueden variar en tiempo real. Esto sucede, por ejemplo, si su funcionamiento depende de datos medidos por el propio sistema o recibidos a través de la red, de las demandas del usuario en cada momento, o de condiciones internas del propio dispositivo, tales como la duración de la batería. Como consecuencia de la existencia de requisitos de operación dinámicos es necesario ir hacia una gestión dinámica de los recursos del sistema. Si bien el software es inherentemente flexible, no ofrece una potencia computacional tan alta como el hardware. Por lo tanto, el hardware reconfigurable aparece como una solución adecuada para tratar con mayor flexibilidad los requisitos variables dinámicamente en sistemas con alta demanda computacional. La flexibilidad y adaptabilidad del hardware requieren de dispositivos reconfigurables que permitan la modificación de su funcionalidad bajo demanda. En esta tesis se han seleccionado las FPGAs (Field Programmable Gate Arrays) como los dispositivos más apropiados, hoy en día, para implementar sistemas basados en hardware reconfigurable De entre todas las posibilidades existentes para explotar la capacidad de reconfiguración de las FPGAs comerciales, se ha seleccionado la reconfiguración dinámica y parcial. Esta técnica consiste en substituir una parte de la lógica del dispositivo, mientras el resto continúa en funcionamiento. La capacidad de reconfiguración dinámica y parcial de las FPGAs es empleada en esta tesis para tratar con los requisitos de flexibilidad y de capacidad computacional que demandan los dispositivos embebidos. La propuesta principal de esta tesis doctoral es el uso de arquitecturas de procesamiento escalables espacialmente, que son capaces de adaptar su funcionalidad y rendimiento en tiempo real, estableciendo un compromiso entre dichos parámetros y la cantidad de lógica que ocupan en el dispositivo. A esto nos referimos con arquitecturas con huellas escalables. En particular, se propone el uso de arquitecturas altamente paralelas, modulares, regulares y con una alta localidad en sus comunicaciones, para este propósito. El tamaño de dichas arquitecturas puede ser modificado mediante la adición o eliminación de algunos de los módulos que las componen, tanto en una dimensión como en dos. Esta estrategia permite implementar soluciones escalables, sin tener que contar con una versión de las mismas para cada uno de los tamaños posibles de la arquitectura. De esta manera se reduce significativamente el tiempo necesario para modificar su tamaño, así como la cantidad de memoria necesaria para almacenar todos los archivos de configuración. En lugar de proponer arquitecturas para aplicaciones específicas, se ha optado por patrones de procesamiento genéricos, que pueden ser ajustados para solucionar distintos problemas en el estado del arte. A este respecto, se proponen patrones basados en esquemas sistólicos, así como de tipo wavefront. Con el objeto de poder ofrecer una solución integral, se han tratado otros aspectos relacionados con el diseño y el funcionamiento de las arquitecturas, tales como el control del proceso de reconfiguración de la FPGA, la integración de las arquitecturas en el resto del sistema, así como las técnicas necesarias para su implementación. Por lo que respecta a la implementación, se han tratado distintos aspectos de bajo nivel dependientes del dispositivo. Algunas de las propuestas realizadas a este respecto en la presente tesis doctoral son un router que es capaz de garantizar el correcto rutado de los módulos reconfigurables dentro del área destinada para ellos, así como una estrategia para la comunicación entre módulos que no introduce ningún retardo ni necesita emplear recursos configurables del dispositivo. El flujo de diseño propuesto se ha automatizado mediante una herramienta denominada DREAMS. La herramienta se encarga de la modificación de las netlists correspondientes a cada uno de los módulos reconfigurables del sistema, y que han sido generadas previamente mediante herramientas comerciales. Por lo tanto, el flujo propuesto se entiende como una etapa de post-procesamiento, que adapta esas netlists a los requisitos de la reconfiguración dinámica y parcial. Dicha modificación la lleva a cabo la herramienta de una forma completamente automática, por lo que la productividad del proceso de diseño aumenta de forma evidente. Para facilitar dicho proceso, se ha dotado a la herramienta de una interfaz gráfica. El flujo de diseño propuesto, y la herramienta que lo soporta, tienen características específicas para abordar el diseño de las arquitecturas dinámicamente escalables propuestas en esta tesis. Entre ellas está el soporte para el realojamiento de módulos reconfigurables en posiciones del dispositivo distintas a donde el módulo es originalmente implementado, así como la generación de estructuras de comunicación compatibles con la simetría de la arquitectura. El router has sido empleado también en esta tesis para obtener un rutado simétrico entre nets equivalentes. Dicha posibilidad ha sido explotada para aumentar la protección de circuitos con altos requisitos de seguridad, frente a ataques de canal lateral, mediante la implantación de lógica complementaria con rutado idéntico. Para controlar el proceso de reconfiguración de la FPGA, se propone en esta tesis un motor de reconfiguración especialmente adaptado a los requisitos de las arquitecturas dinámicamente escalables. Además de controlar el puerto de reconfiguración, el motor de reconfiguración ha sido dotado de la capacidad de realojar módulos reconfigurables en posiciones arbitrarias del dispositivo, en tiempo real. De esta forma, basta con generar un único bitstream por cada módulo reconfigurable del sistema, independientemente de la posición donde va a ser finalmente reconfigurado. La estrategia seguida para implementar el proceso de realojamiento de módulos es diferente de las propuestas existentes en el estado del arte, pues consiste en la composición de los archivos de configuración en tiempo real. De esta forma se consigue aumentar la velocidad del proceso, mientras que se reduce la longitud de los archivos de configuración parciales a almacenar en el sistema. El motor de reconfiguración soporta módulos reconfigurables con una altura menor que la altura de una región de reloj del dispositivo. Internamente, el motor se encarga de la combinación de los frames que describen el nuevo módulo, con la configuración existente en el dispositivo previamente. El escalado de las arquitecturas de procesamiento propuestas en esta tesis también se puede beneficiar de este mecanismo. Se ha incorporado también un acceso directo a una memoria externa donde se pueden almacenar bitstreams parciales. Para acelerar el proceso de reconfiguración se ha hecho funcionar el ICAP por encima de la máxima frecuencia de reloj aconsejada por el fabricante. Así, en el caso de Virtex-5, aunque la máxima frecuencia del reloj deberían ser 100 MHz, se ha conseguido hacer funcionar el puerto de reconfiguración a frecuencias de operación de hasta 250 MHz, incluyendo el proceso de realojamiento en tiempo real. Se ha previsto la posibilidad de portar el motor de reconfiguración a futuras familias de FPGAs. Por otro lado, el motor de reconfiguración se puede emplear para inyectar fallos en el propio dispositivo hardware, y así ser capaces de evaluar la tolerancia ante los mismos que ofrecen las arquitecturas reconfigurables. Los fallos son emulados mediante la generación de archivos de configuración a los que intencionadamente se les ha introducido un error, de forma que se modifica su funcionalidad. Con el objetivo de comprobar la validez y los beneficios de las arquitecturas propuestas en esta tesis, se han seguido dos líneas principales de aplicación. En primer lugar, se propone su uso como parte de una plataforma adaptativa basada en hardware evolutivo, con capacidad de escalabilidad, adaptabilidad y recuperación ante fallos. En segundo lugar, se ha desarrollado un deblocking filter escalable, adaptado a la codificación de vídeo escalable, como ejemplo de aplicación de las arquitecturas de tipo wavefront propuestas. El hardware evolutivo consiste en el uso de algoritmos evolutivos para diseñar hardware de forma autónoma, explotando la flexibilidad que ofrecen los dispositivos reconfigurables. En este caso, los elementos de procesamiento que componen la arquitectura son seleccionados de una biblioteca de elementos presintetizados, de acuerdo con las decisiones tomadas por el algoritmo evolutivo, en lugar de definir la configuración de las mismas en tiempo de diseño. De esta manera, la configuración del core puede cambiar cuando lo hacen las condiciones del entorno, en tiempo real, por lo que se consigue un control autónomo del proceso de reconfiguración dinámico. Así, el sistema es capaz de optimizar, de forma autónoma, su propia configuración. El hardware evolutivo tiene una capacidad inherente de auto-reparación. Se ha probado que las arquitecturas evolutivas propuestas en esta tesis son tolerantes ante fallos, tanto transitorios, como permanentes y acumulativos. La plataforma evolutiva se ha empleado para implementar filtros de eliminación de ruido. La escalabilidad también ha sido aprovechada en esta aplicación. Las arquitecturas evolutivas escalables permiten la adaptación autónoma de los cores de procesamiento ante fluctuaciones en la cantidad de recursos disponibles en el sistema. Por lo tanto, constituyen un ejemplo de escalabilidad dinámica para conseguir un determinado nivel de calidad, que puede variar en tiempo real. Se han propuesto dos variantes de sistemas escalables evolutivos. El primero consiste en un único core de procesamiento evolutivo, mientras que el segundo está formado por un número variable de arrays de procesamiento. La codificación de vídeo escalable, a diferencia de los codecs no escalables, permite la decodificación de secuencias de vídeo con diferentes niveles de calidad, de resolución temporal o de resolución espacial, descartando la información no deseada. Existen distintos algoritmos que soportan esta característica. En particular, se va a emplear el estándar Scalable Video Coding (SVC), que ha sido propuesto como una extensión de H.264/AVC, ya que este último es ampliamente utilizado tanto en la industria, como a nivel de investigación. Para poder explotar toda la flexibilidad que ofrece el estándar, hay que permitir la adaptación de las características del decodificador en tiempo real. El uso de las arquitecturas dinámicamente escalables es propuesto en esta tesis con este objetivo. El deblocking filter es un algoritmo que tiene como objetivo la mejora de la percepción visual de la imagen reconstruida, mediante el suavizado de los "artefactos" de bloque generados en el lazo del codificador. Se trata de una de las tareas más intensivas en procesamiento de datos de H.264/AVC y de SVC, y además, su carga computacional es altamente dependiente del nivel de escalabilidad seleccionado en el decodificador. Por lo tanto, el deblocking filter ha sido seleccionado como prueba de concepto de la aplicación de las arquitecturas dinámicamente escalables para la compresión de video. La arquitectura propuesta permite añadir o eliminar unidades de computación, siguiendo un esquema de tipo wavefront. La arquitectura ha sido propuesta conjuntamente con un esquema de procesamiento en paralelo del deblocking filter a nivel de macrobloque, de tal forma que cuando se varía del tamaño de la arquitectura, el orden de filtrado de los macrobloques varia de la misma manera. El patrón propuesto se basa en la división del procesamiento de cada macrobloque en dos etapas independientes, que se corresponden con el filtrado horizontal y vertical de los bloques dentro del macrobloque. Las principales contribuciones originales de esta tesis son las siguientes: - El uso de arquitecturas altamente regulares, modulares, paralelas y con una intensa localidad en sus comunicaciones, para implementar cores de procesamiento dinámicamente reconfigurables. - El uso de arquitecturas bidimensionales, en forma de malla, para construir arquitecturas dinámicamente escalables, con una huella escalable. De esta forma, las arquitecturas permiten establecer un compromiso entre el área que ocupan en el dispositivo, y las prestaciones que ofrecen en cada momento. Se proponen plantillas de procesamiento genéricas, de tipo sistólico o wavefront, que pueden ser adaptadas a distintos problemas de procesamiento. - Un flujo de diseño y una herramienta que lo soporta, para el diseño de sistemas reconfigurables dinámicamente, centradas en el diseño de las arquitecturas altamente paralelas, modulares y regulares propuestas en esta tesis. - Un esquema de comunicaciones entre módulos reconfigurables que no introduce ningún retardo ni requiere el uso de recursos lógicos propios. - Un router flexible, capaz de resolver los conflictos de rutado asociados con el diseño de sistemas reconfigurables dinámicamente. - Un algoritmo de optimización para sistemas formados por múltiples cores escalables que optimice, mediante un algoritmo genético, los parámetros de dicho sistema. Se basa en un modelo conocido como el problema de la mochila. - Un motor de reconfiguración adaptado a los requisitos de las arquitecturas altamente regulares y modulares. Combina una alta velocidad de reconfiguración, con la capacidad de realojar módulos en tiempo real, incluyendo el soporte para la reconfiguración de regiones que ocupan menos que una región de reloj, así como la réplica de un módulo reconfigurable en múltiples posiciones del dispositivo. - Un mecanismo de inyección de fallos que, empleando el motor de reconfiguración del sistema, permite evaluar los efectos de fallos permanentes y transitorios en arquitecturas reconfigurables. - La demostración de las posibilidades de las arquitecturas propuestas en esta tesis para la implementación de sistemas de hardware evolutivos, con una alta capacidad de procesamiento de datos. - La implementación de sistemas de hardware evolutivo escalables, que son capaces de tratar con la fluctuación de la cantidad de recursos disponibles en el sistema, de una forma autónoma. - Una estrategia de procesamiento en paralelo para el deblocking filter compatible con los estándares H.264/AVC y SVC que reduce el número de ciclos de macrobloque necesarios para procesar un frame de video. - Una arquitectura dinámicamente escalable que permite la implementación de un nuevo deblocking filter, totalmente compatible con los estándares H.264/AVC y SVC, que explota el paralelismo a nivel de macrobloque. El presente documento se organiza en siete capítulos. En el primero se ofrece una introducción al marco tecnológico de esta tesis, especialmente centrado en la reconfiguración dinámica y parcial de FPGAs. También se motiva la necesidad de las arquitecturas dinámicamente escalables propuestas en esta tesis. En el capítulo 2 se describen las arquitecturas dinámicamente escalables. Dicha descripción incluye la mayor parte de las aportaciones a nivel arquitectural realizadas en esta tesis. Por su parte, el flujo de diseño adaptado a dichas arquitecturas se propone en el capítulo 3. El motor de reconfiguración se propone en el 4, mientras que el uso de dichas arquitecturas para implementar sistemas de hardware evolutivo se aborda en el 5. El deblocking filter escalable se describe en el 6, mientras que las conclusiones finales de esta tesis, así como la descripción del trabajo futuro, son abordadas en el capítulo 7. ABSTRACT The optimization of system parameters, such as power dissipation, the amount of hardware resources and the memory footprint, has been always a main concern when dealing with the design of resource-constrained embedded systems. This situation is even more demanding nowadays. Embedded systems cannot anymore be considered only as specific-purpose computers, designed for a particular functionality that remains unchanged during their lifetime. Differently, embedded systems are now required to deal with more demanding and complex functions, such as multimedia data processing and high-throughput connectivity. In addition, system operation may depend on external data, the user requirements or internal variables of the system, such as the battery life-time. All these conditions may vary at run-time, leading to adaptive scenarios. As a consequence of both the growing computational complexity and the existence of dynamic requirements, dynamic resource management techniques for embedded systems are needed. Software is inherently flexible, but it cannot meet the computing power offered by hardware solutions. Therefore, reconfigurable hardware emerges as a suitable technology to deal with the run-time variable requirements of complex embedded systems. Adaptive hardware requires the use of reconfigurable devices, where its functionality can be modified on demand. In this thesis, Field Programmable Gate Arrays (FPGAs) have been selected as the most appropriate commercial technology existing nowadays to implement adaptive hardware systems. There are different ways of exploiting reconfigurability in reconfigurable devices. Among them is dynamic and partial reconfiguration. This is a technique which consists in substituting part of the FPGA logic on demand, while the rest of the device continues working. The strategy followed in this thesis is to exploit the dynamic and partial reconfiguration of commercial FPGAs to deal with the flexibility and complexity demands of state-of-the-art embedded systems. The proposal of this thesis to deal with run-time variable system conditions is the use of spatially scalable processing hardware IP cores, which are able to adapt their functionality or performance at run-time, trading them off with the amount of logic resources they occupy in the device. This is referred to as a scalable footprint in the context of this thesis. The distinguishing characteristic of the proposed cores is that they rely on highly parallel, modular and regular architectures, arranged in one or two dimensions. These architectures can be scaled by means of the addition or removal of the composing blocks. This strategy avoids implementing a full version of the core for each possible size, with the corresponding benefits in terms of scaling and adaptation time, as well as bitstream storage memory requirements. Instead of providing specific-purpose architectures, generic architectural templates, which can be tuned to solve different problems, are proposed in this thesis. Architectures following both systolic and wavefront templates have been selected. Together with the proposed scalable architectural templates, other issues needed to ensure the proper design and operation of the scalable cores, such as the device reconfiguration control, the run-time management of the architecture and the implementation techniques have been also addressed in this thesis. With regard to the implementation of dynamically reconfigurable architectures, device dependent low-level details are addressed. Some of the aspects covered in this thesis are the area constrained routing for reconfigurable modules, or an inter-module communication strategy which does not introduce either extra delay or logic overhead. The system implementation, from the hardware description to the device configuration bitstream, has been fully automated by modifying the netlists corresponding to each of the system modules, which are previously generated using the vendor tools. This modification is therefore envisaged as a post-processing step. Based on these implementation proposals, a design tool called DREAMS (Dynamically Reconfigurable Embedded and Modular Systems) has been created, including a graphic user interface. The tool has specific features to cope with modular and regular architectures, including the support for module relocation and the inter-module communications scheme based on the symmetry of the architecture. The core of the tool is a custom router, which has been also exploited in this thesis to obtain symmetric routed nets, with the aim of enhancing the protection of critical reconfigurable circuits against side channel attacks. This is achieved by duplicating the logic with an exactly equal routing. In order to control the reconfiguration process of the FPGA, a Reconfiguration Engine suited to the specific requirements set by the proposed architectures was also proposed. Therefore, in addition to controlling the reconfiguration port, the Reconfiguration Engine has been enhanced with the online relocation ability, which allows employing a unique configuration bitstream for all the positions where the module may be placed in the device. Differently to the existing relocating solutions, which are based on bitstream parsers, the proposed approach is based on the online composition of bitstreams. This strategy allows increasing the speed of the process, while the length of partial bitstreams is also reduced. The height of the reconfigurable modules can be lower than the height of a clock region. The Reconfiguration Engine manages the merging process of the new and the existing configuration frames within each clock region. The process of scaling up and down the hardware cores also benefits from this technique. A direct link to an external memory where partial bitstreams can be stored has been also implemented. In order to accelerate the reconfiguration process, the ICAP has been overclocked over the speed reported by the manufacturer. In the case of Virtex-5, even though the maximum frequency of the ICAP is reported to be 100 MHz, valid operations at 250 MHz have been achieved, including the online relocation process. Portability of the reconfiguration solution to today's and probably, future FPGAs, has been also considered. The reconfiguration engine can be also used to inject faults in real hardware devices, and this way being able to evaluate the fault tolerance offered by the reconfigurable architectures. Faults are emulated by introducing partial bitstreams intentionally modified to provide erroneous functionality. To prove the validity and the benefits offered by the proposed architectures, two demonstration application lines have been envisaged. First, scalable architectures have been employed to develop an evolvable hardware platform with adaptability, fault tolerance and scalability properties. Second, they have been used to implement a scalable deblocking filter suited to scalable video coding. Evolvable Hardware is the use of evolutionary algorithms to design hardware in an autonomous way, exploiting the flexibility offered by reconfigurable devices. In this case, processing elements composing the architecture are selected from a presynthesized library of processing elements, according to the decisions taken by the algorithm, instead of being decided at design time. This way, the configuration of the array may change as run-time environmental conditions do, achieving autonomous control of the dynamic reconfiguration process. Thus, the self-optimization property is added to the native self-configurability of the dynamically scalable architectures. In addition, evolvable hardware adaptability inherently offers self-healing features. The proposal has proved to be self-tolerant, since it is able to self-recover from both transient and cumulative permanent faults. The proposed evolvable architecture has been used to implement noise removal image filters. Scalability has been also exploited in this application. Scalable evolvable hardware architectures allow the autonomous adaptation of the processing cores to a fluctuating amount of resources available in the system. Thus, it constitutes an example of the dynamic quality scalability tackled in this thesis. Two variants have been proposed. The first one consists in a single dynamically scalable evolvable core, and the second one contains a variable number of processing cores. Scalable video is a flexible approach for video compression, which offers scalability at different levels. Differently to non-scalable codecs, a scalable video bitstream can be decoded with different levels of quality, spatial or temporal resolutions, by discarding the undesired information. The interest in this technology has been fostered by the development of the Scalable Video Coding (SVC) standard, as an extension of H.264/AVC. In order to exploit all the flexibility offered by the standard, it is necessary to adapt the characteristics of the decoder to the requirements of each client during run-time. The use of dynamically scalable architectures is proposed in this thesis with this aim. The deblocking filter algorithm is the responsible of improving the visual perception of a reconstructed image, by smoothing blocking artifacts generated in the encoding loop. This is one of the most computationally intensive tasks of the standard, and furthermore, it is highly dependent on the selected scalability level in the decoder. Therefore, the deblocking filter has been selected as a proof of concept of the implementation of dynamically scalable architectures for video compression. The proposed architecture allows the run-time addition or removal of computational units working in parallel to change its level of parallelism, following a wavefront computational pattern. Scalable architecture is offered together with a scalable parallelization strategy at the macroblock level, such that when the size of the architecture changes, the macroblock filtering order is modified accordingly. The proposed pattern is based on the division of the macroblock processing into two independent stages, corresponding to the horizontal and vertical filtering of the blocks within the macroblock. The main contributions of this thesis are: - The use of highly parallel, modular, regular and local architectures to implement dynamically reconfigurable processing IP cores, for data intensive applications with flexibility requirements. - The use of two-dimensional mesh-type arrays as architectural templates to build dynamically reconfigurable IP cores, with a scalable footprint. The proposal consists in generic architectural templates, which can be tuned to solve different computational problems. •A design flow and a tool targeting the design of DPR systems, focused on highly parallel, modular and local architectures. - An inter-module communication strategy, which does not introduce delay or area overhead, named Virtual Borders. - A custom and flexible router to solve the routing conflicts as well as the inter-module communication problems, appearing during the design of DPR systems. - An algorithm addressing the optimization of systems composed of multiple scalable cores, which size can be decided individually, to optimize the system parameters. It is based on a model known as the multi-dimensional multi-choice Knapsack problem. - A reconfiguration engine tailored to the requirements of highly regular and modular architectures. It combines a high reconfiguration throughput with run-time module relocation capabilities, including the support for sub-clock reconfigurable regions and the replication in multiple positions. - A fault injection mechanism which takes advantage of the system reconfiguration engine, as well as the modularity of the proposed reconfigurable architectures, to evaluate the effects of transient and permanent faults in these architectures. - The demonstration of the possibilities of the architectures proposed in this thesis to implement evolvable hardware systems, while keeping a high processing throughput. - The implementation of scalable evolvable hardware systems, which are able to adapt to the fluctuation of the amount of resources available in the system, in an autonomous way. - A parallelization strategy for the H.264/AVC and SVC deblocking filter, which reduces the number of macroblock cycles needed to process the whole frame. - A dynamically scalable architecture that permits the implementation of a novel deblocking filter module, fully compliant with the H.264/AVC and SVC standards, which exploits the macroblock level parallelism of the algorithm. This document is organized in seven chapters. In the first one, an introduction to the technology framework of this thesis, specially focused on dynamic and partial reconfiguration, is provided. The need for the dynamically scalable processing architectures proposed in this work is also motivated in this chapter. In chapter 2, dynamically scalable architectures are described. Description includes most of the architectural contributions of this work. The design flow tailored to the scalable architectures, together with the DREAMs tool provided to implement them, are described in chapter 3. The reconfiguration engine is described in chapter 4. The use of the proposed scalable archtieectures to implement evolvable hardware systems is described in chapter 5, while the scalable deblocking filter is described in chapter 6. Final conclusions of this thesis, and the description of future work, are addressed in chapter 7.

Model-based Self-awareness Patterns for Autonomy

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Los sistemas técnicos son cada vez más complejos, incorporan funciones más avanzadas, están más integrados con otros sistemas y trabajan en entornos menos controlados. Todo esto supone unas condiciones más exigentes y con mayor incertidumbre para los sistemas de control, a los que además se demanda un comportamiento más autónomo y fiable. La adaptabilidad de manera autónoma es un reto para tecnologías de control actualmente. El proyecto de investigación ASys propone abordarlo trasladando la responsabilidad de la capacidad de adaptación del sistema de los ingenieros en tiempo de diseño al propio sistema en operación. Esta tesis pretende avanzar en la formulación y materialización técnica de los principios de ASys de cognición y auto-consciencia basadas en modelos y autogestión de los sistemas en tiempo de operación para una autonomía robusta. Para ello el trabajo se ha centrado en la capacidad de auto-conciencia, inspirada en los sistemas biológicos, y se ha explorado la posibilidad de integrarla en la arquitectura de los sistemas de control. Además de la auto-consciencia, se han explorado otros temas relevantes: modelado funcional, modelado de software, tecnología de los patrones, tecnología de componentes, tolerancia a fallos. Se ha analizado el estado de la técnica en los ámbitos pertinentes para las cuestiones de la auto-consciencia y la adaptabilidad en sistemas técnicos: arquitecturas cognitivas, control tolerante a fallos, y arquitecturas software dinámicas y computación autonómica. El marco teórico de ASys existente de sistemas autónomos cognitivos ha sido adaptado para servir de base para este análisis de autoconsciencia y adaptación y para dar sustento conceptual al posterior desarrollo de la solución. La tesis propone una solución general de diseño para la construcción de sistemas autónomos auto-conscientes. La idea central es la integración de un meta-controlador en la arquitectura de control del sistema autónomo, capaz de percibir la estado funcional del sistema de control y, si es necesario, reconfigurarlo en tiempo de operación. Esta solución de metacontrol se ha formalizado en cuatro patrones de diseño: i) el Patrón Metacontrol, que define la integración de un subsistema de metacontrol, responsable de controlar al propio sistema de control a través de la interfaz proporcionada por su plataforma de componentes, ii) el patrón Bucle de Control Epistémico, que define un bucle de control cognitivo basado en el modelos y que se puede aplicar al diseño del metacontrol, iii) el patrón de Reflexión basada en Modelo Profundo propone una solución para construir el modelo ejecutable utilizado por el meta-controlador mediante una transformación de modelo a modelo a partir del modelo de ingeniería del sistema, y, finalmente, iv) el Patrón Metacontrol Funcional, que estructura el meta-controlador en dos bucles, uno para el control de la configuración de los componentes del sistema de control, y otro sobre éste, controlando las funciones que realiza dicha configuración de componentes; de esta manera las consideraciones funcionales y estructurales se desacoplan. La Arquitectura OM y el metamodelo TOMASys son las piezas centrales del marco arquitectónico desarrollado para materializar la solución compuesta de los patrones anteriores. El metamodelo TOMASys ha sido desarrollado para la representación de la estructura y su relación con los requisitos funcionales de cualquier sistema autónomo. La Arquitectura OM es un patrón de referencia para la construcción de una metacontrolador integrando los patrones de diseño propuestos. Este meta-controlador se puede integrar en la arquitectura de cualquier sistema control basado en componentes. El elemento clave de su funcionamiento es un modelo TOMASys del sistema decontrol, que el meta-controlador usa para monitorizarlo y calcular las acciones de reconfiguración necesarias para adaptarlo a las circunstancias en cada momento. Un proceso de ingeniería, complementado con otros recursos, ha sido elaborado para guiar la aplicación del marco arquitectónico OM. Dicho Proceso de Ingeniería OM define la metodología a seguir para construir el subsistema de metacontrol para un sistema autónomo a partir del modelo funcional del mismo. La librería OMJava proporciona una implementación del meta-controlador OM que se puede integrar en el control de cualquier sistema autónomo, independientemente del dominio de la aplicación o de su tecnología de implementación. Para concluir, la solución completa ha sido validada con el desarrollo de un robot móvil autónomo que incorpora un meta-controlador con la Arquitectura OM. Las propiedades de auto-consciencia y adaptación proporcionadas por el meta-controlador han sido validadas en diferentes escenarios de operación del robot, en los que el sistema era capaz de sobreponerse a fallos en el sistema de control mediante reconfiguraciones orquestadas por el metacontrolador. ABSTRACT Technical systems are becoming more complex, they incorporate more advanced functionalities, they are more integrated with other systems and they are deployed in less controlled environments. All this supposes a more demanding and uncertain scenario for control systems, which are also required to be more autonomous and dependable. Autonomous adaptivity is a current challenge for extant control technologies. The ASys research project proposes to address it by moving the responsibility for adaptivity from the engineers at design time to the system at run-time. This thesis has intended to advance in the formulation and technical reification of ASys principles of model-based self-cognition and having systems self-handle at runtime for robust autonomy. For that it has focused on the biologically inspired capability of self-awareness, and explored the possibilities to embed it into the very architecture of control systems. Besides self-awareness, other themes related to the envisioned solution have been explored: functional modeling, software modeling, patterns technology, components technology, fault tolerance. The state of the art in fields relevant for the issues of self-awareness and adaptivity has been analysed: cognitive architectures, fault-tolerant control, and software architectural reflection and autonomic computing. The extant and evolving ASys Theoretical Framework for cognitive autonomous systems has been adapted to provide a basement for this selfhood-centred analysis and to conceptually support the subsequent development of our solution. The thesis proposes a general design solution for building self-aware autonomous systems. Its central idea is the integration of a metacontroller in the control architecture of the autonomous system, capable of perceiving the functional state of the control system and reconfiguring it if necessary at run-time. This metacontrol solution has been formalised into four design patterns: i) the Metacontrol Pattern, which defines the integration of a metacontrol subsystem, controlling the domain control system through an interface provided by its implementation component platform, ii) the Epistemic Control Loop pattern, which defines a modelbased cognitive control loop that can be applied to the design of such a metacontroller, iii) the Deep Model Reflection pattern proposes a solution to produce the online executable model used by the metacontroller by model-to-model transformation from the engineering model, and, finally, iv) the Functional Metacontrol pattern, which proposes to structure the metacontroller in two loops, one for controlling the configuration of components of the controller, and another one on top of the former, controlling the functions being realised by that configuration; this way the functional and structural concerns become decoupled. The OM Architecture and the TOMASys metamodel are the core pieces of the architectural framework developed to reify this patterned solution. The TOMASys metamodel has been developed for representing the structure and its relation to the functional requirements of any autonomous system. The OM architecture is a blueprint for building a metacontroller according to the patterns. This metacontroller can be integrated on top of any component-based control architecture. At the core of its operation lies a TOMASys model of the control system. An engineering process and accompanying assets have been constructed to complete and exploit the architectural framework. The OM Engineering Process defines the process to follow to develop the metacontrol subsystem from the functional model of the controller of the autonomous system. The OMJava library provides a domain and application-independent implementation of an OM Metacontroller than can be used in the implementation phase of OMEP. Finally, the complete solution has been validated in the development of an autonomous mobile robot that incorporates an OM metacontroller. The functional selfawareness and adaptivity properties achieved thanks to the metacontrol system have been validated in different scenarios. In these scenarios the robot was able to overcome failures in the control system thanks to reconfigurations performed by the metacontroller.

Simulation of impulse response for indoor visible light communications using 3D CAD models

Relevância:

10.00% 10.00%

Publicador:

Resumo:

n this article, a tool for simulating the channel impulse response for indoor visible light communications using 3D computer-aided design (CAD) models is presented. The simulation tool is based on a previous Monte Carlo ray-tracing algorithm for indoor infrared channel estimation, but including wavelength response evaluation. The 3D scene, or the simulation environment, can be defined using any CAD software in which the user specifies, in addition to the setting geometry, the reflection characteristics of the surface materials as well as the structures of the emitters and receivers involved in the simulation. Also, in an effort to improve the computational efficiency, two optimizations are proposed. The first one consists of dividing the setting into cubic regions of equal size, which offers a calculation improvement of approximately 50% compared to not dividing the 3D scene into sub-regions. The second one involves the parallelization of the simulation algorithm, which provides a computational speed-up proportional to the number of processors used.

A scalable SIEM correlation engine and its application to the Olympic Games IT infrastructure

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The security event correlation scalability has become a major concern for security analysts and IT administrators when considering complex IT infrastructures that need to handle gargantuan amounts of events or wide correlation window spans. The current correlation capabilities of Security Information and Event Management (SIEM), based on a single node in centralized servers, have proved to be insufficient to process large event streams. This paper introduces a step forward in the current state of the art to address the aforementioned problems. The proposed model takes into account the two main aspects of this ?eld: distributed correlation and query parallelization. We present a case study of a multiple-step attack on the Olympic Games IT infrastructure to illustrate the applicability of our approach.

«
1
2
3
4
5
»