17 results for Hernández, María Amelia

at Universidad Politécnica de Madrid


Relevance: 80.00%

Abstract:

Complexity has always been one of the most important issues in distributed computing. From the first clusters to grid and now cloud computing, dealing correctly and efficiently with system complexity is the key to taking technology a step further. In this sense, global behavior modeling is an innovative methodology aimed at understanding grid behavior. The main objective of this methodology is to synthesize the grid's vast, heterogeneous nature into a simple but powerful behavior model, represented in the form of a single, abstract entity with a global state. Global behavior modeling has proved to be very useful in effectively managing grid complexity but, in many cases, deeper knowledge is needed. It generates a descriptive model that could be greatly improved if extended not only to explain behavior, but also to predict it. In this paper we present a prediction methodology whose objective is to define the techniques needed to create global behavior prediction models for grid systems. This global behavior prediction can benefit grid management, especially in areas such as fault tolerance or job scheduling. The paper presents experimental results obtained in real scenarios in order to validate this approach.
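The abstract does not spell out how a global prediction model is built. Purely as an illustration of the idea, the sketch below assumes the global state is summarized as a small vector of grid-wide aggregate metrics and fits a simple autoregressive predictor to its history; the function names, the averaging step, and the example metrics are all hypothetical.

```python
import numpy as np

def global_state(node_metrics):
    """Collapse per-node metrics (rows) into one global state vector
    by averaging each metric across the grid (a hypothetical choice)."""
    return np.asarray(node_metrics, dtype=float).mean(axis=0)

def fit_ar1(history):
    """Fit a first-order model x_{t+1} ~ A x_t + b to the sequence of
    global state vectors, via least squares."""
    X = np.array(history[:-1])                   # states at time t
    Y = np.array(history[1:])                    # states at time t+1
    X1 = np.hstack([X, np.ones((len(X), 1))])    # add bias column
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)   # solve for [A; b]
    return W

def predict_next(W, current_state):
    """Forecast the next global state from the current one."""
    return np.append(current_state, 1.0) @ W

# Example: three nodes reporting (cpu_load, queue_length) over four steps.
history = [global_state(m) for m in [
    [[0.2, 3], [0.3, 4], [0.1, 2]],
    [[0.4, 5], [0.5, 6], [0.3, 4]],
    [[0.6, 7], [0.7, 8], [0.5, 6]],
    [[0.8, 9], [0.9, 10], [0.7, 8]],
]]
W = fit_ar1(history)
print(predict_next(W, history[-1]))   # forecast of the next global state
```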

Relevance: 80.00%

Abstract:

The manipulation and handling of an ever-increasing volume of data by current data-intensive applications require novel techniques for efficient data management. Despite recent advances in every aspect of data management (storage, access, querying, analysis, mining), future applications are expected to scale to even higher degrees, not only in terms of the volumes of data handled but also in terms of users and resources, often making use of multiple pre-existing, autonomous, distributed or heterogeneous resources.

Relevance: 80.00%

Abstract:

In this introductory chapter we put in context and give a brief outline of the work that we present thoroughly in the rest of the dissertation. We consider this work divided into two main parts. The first part is the Firenze Framework, a knowledge-level description framework rich enough to express the semantics required for describing both semantic Web services and semantic Grid services. We start by defining what the Semantic Grid is and its relation to the Semantic Web, and the possibility of their convergence, since both initiatives have become mainly service-oriented. We also introduce the main motivations for the creation of this framework: one is to provide a valid description framework that works at the knowledge level; the other is to provide a description framework that takes into account the characteristics of Grid services, in order to be able to describe them properly. The other part of the dissertation is devoted to Vega, an event-driven architecture that, by means of the proposed knowledge-level description framework, is able to achieve large-scale provisioning of knowledge-intensive services. In this introductory chapter we portray the anatomy of a generic event-driven architecture and briefly enumerate its main characteristics, which are the reason we chose it.
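The chapter only names the event-driven style; as a generic illustration of its anatomy (not Vega's actual design), the minimal sketch below shows the core pattern: services subscribe to event types and react when events are published. All names and the example event are hypothetical.

```python
from collections import defaultdict

class EventBus:
    """Toy event bus: handlers register for event types and are invoked
    whenever a matching event is published."""
    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)

bus = EventBus()
# A hypothetical knowledge-intensive service reacting to provisioning events.
bus.subscribe("service.requested", lambda p: print("provisioning", p["service"]))
bus.publish("service.requested", {"service": "annotation-service"})
```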

Relevance: 80.00%

Abstract:

This paper first analyses the structure (shareholding), functionality (how they operate and issue their opinions) and behaviour (how they act in the markets) of the three most important agencies that rate financial assets, issuers and issues of financial assets. Secondly, it points out their barely, if at all, scientific assessments, based on concepts and practices as inadequate as the self-fulfilling prophecy, the lack of an expiry date and catastrophism, which they sometimes impose with gangster-like behaviour. It also highlights the importance and far-reaching impact of the opinions issued by these agencies, veritable ukases, which turn them, de facto, into lords who hold sway over the life and wealth of investors and issuers. In fact, their behaviour is so deplorable and disastrous that, like the daughters of Elena in the Spanish saying (hence the title of the paper), none of them is good. Finally, the paper presents, in the form of a decalogue, the irrefutable facts that justify calling these agencies (S&P, Moody's and Fitch) wicked, to the point that they are anti-Mephistophelianism personified: Mephistopheles defined himself as a poor devil who, intending to do evil, ended up doing good, exactly the opposite of the daughters of Elena.

Relevance: 80.00%

Abstract:

The popularity of the MapReduce programming model has increased interest in the research community in its improvement. Among the possible directions, fault tolerance, and concretely the failure detection issue, seems to be a crucial one, but one that has not yet reached a satisfactory level. Motivated by this, I decided to devote my main research during this period to a prototype system architecture of a MapReduce framework with a new failure detection service, comprising both an analytical (theoretical) part and an implementation part. I am confident that this work can lead the way to further contributions on failure detection in NoSQL application frameworks and in cloud storage systems in general.
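The abstract does not describe the detection mechanism itself. As an illustration only, the sketch below shows a minimal heartbeat-style failure detector of the kind commonly used in such frameworks: a worker is suspected when no heartbeat has been observed within a timeout. Names, the timeout value, and the worker identifiers are hypothetical.

```python
import time

class HeartbeatFailureDetector:
    """Minimal heartbeat-based detector: a worker is suspected when its
    last heartbeat is older than `timeout` seconds."""
    def __init__(self, timeout=10.0):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, worker_id, now=None):
        self.last_seen[worker_id] = now if now is not None else time.time()

    def suspected(self, now=None):
        now = now if now is not None else time.time()
        return [w for w, t in self.last_seen.items() if now - t > self.timeout]

detector = HeartbeatFailureDetector(timeout=10.0)
detector.heartbeat("worker-1", now=0.0)
detector.heartbeat("worker-2", now=0.0)
detector.heartbeat("worker-1", now=8.0)
print(detector.suspected(now=12.0))   # ['worker-2'] -- missed its heartbeats
```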

Relevance: 80.00%

Abstract:

Data grid services have been used to deal with the increasing needs of applications in terms of data volume and throughput. The large scale, heterogeneity and dynamism of grid environments often make management and tuning of these data services very complex. Furthermore, current high-performance I/O approaches are characterized by their high complexity and specific features that usually require specialized administrator skills. Autonomic computing can help manage this complexity. The present paper describes an autonomic subsystem intended to provide self-management features aimed at efficiently alleviating the I/O problem in a grid environment, thereby enhancing the quality of service (QoS) of data access and storage services in the grid. Our proposal takes into account that data produced in an I/O system is not usually required immediately. Therefore, performance improvements are related not only to current but also to any future I/O access, as the actual data access usually occurs later on. Nevertheless, the exact time of the next I/O operations is unknown. Thus, our approach relies on long-term prediction designed to forecast the future workload of grid components. This enables the autonomic subsystem to determine the optimal data placement to improve both current and future I/O operations.
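The paper's actual forecasting model is not given in the abstract. As a rough illustration of the long-term-prediction idea, the sketch below fits a linear trend to each node's observed I/O load, extrapolates it into the future, and places the next replica on the node expected to be least loaded; the field names, the trend model, and the example numbers are assumptions.

```python
import numpy as np

def forecast_workload(history, horizon=24):
    """Fit a linear trend to each node's observed I/O load and
    extrapolate `horizon` steps ahead (toy long-term forecast)."""
    history = np.asarray(history, dtype=float)   # shape: (time, nodes)
    t = np.arange(len(history))
    forecasts = []
    for node_load in history.T:
        slope, intercept = np.polyfit(t, node_load, 1)
        forecasts.append(slope * (len(history) + horizon) + intercept)
    return np.array(forecasts)

def place_replica(forecasts):
    """Place the next replica on the node expected to be least loaded."""
    return int(np.argmin(forecasts))

# Hypothetical observed I/O load (requests/s) on three storage nodes.
history = [[120, 40, 80], [130, 42, 85], [150, 41, 90], [160, 43, 95]]
f = forecast_workload(history, horizon=24)
print("forecast:", f, "-> place on node", place_replica(f))
```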

Relevance: 80.00%

Abstract:

Over the last decade, Grid computing paved the way for a new level of large-scale distributed systems. This infrastructure made it possible to securely and reliably take advantage of widely separated computational resources that belong to several different organizations. Resources can be incorporated into the Grid, building a theoretical virtual supercomputer. In time, cloud computing emerged as a new type of large-scale distributed system, inheriting and expanding the expertise and knowledge obtained so far. Some of the main characteristics of Grids naturally evolved into clouds, others were modified and adapted, and others were simply discarded or postponed. Regardless of these technical specifics, Grids and clouds together can be considered one of the most important advances in large-scale distributed computing of the past ten years; however, this step in distributed computing has come along with a completely new level of complexity. Grid and cloud management mechanisms play a key role, and correct analysis and understanding of system behavior are needed. Large-scale distributed systems must be able to self-manage, incorporating autonomic features capable of controlling and optimizing all resources and services. Traditional distributed computing management mechanisms analyze each resource separately and adjust specific parameters of each of them. When trying to adapt the same procedures to Grid and cloud computing, the vast complexity of these systems can make this task extremely complicated. But large-scale distributed system complexity could be just a matter of perspective. It could be possible to understand the Grid or cloud behavior as a single entity, instead of a set of resources. This abstraction could provide a different understanding of the system, describing large-scale behavior and global events that probably would not be detected by analyzing each resource separately. In this work we define a theoretical framework that combines both ideas, multiple resources and single entity, to develop large-scale distributed systems management techniques aimed at system performance optimization, increased dependability and Quality of Service (QoS). The resulting synergy could be the key to addressing the most important difficulties of Grid and cloud management.

Relevance: 80.00%

Abstract:

In just a few years cloud computing has become a very popular paradigm and a business success story, with storage being one of its key features. To achieve high data availability, cloud storage services rely on replication. In this context, one major challenge is data consistency. In contrast to traditional approaches that are mostly based on strong consistency, many cloud storage services opt for weaker consistency models in order to achieve better availability and performance. This comes at the cost of a high probability of stale data being read, as the replicas involved in the reads may not always have the most recent write. In this paper, we propose a novel approach, named Harmony, which adaptively tunes the consistency level at run-time according to the application requirements. The key idea behind Harmony is an intelligent estimation model of stale reads, allowing it to elastically scale up or down the number of replicas involved in read operations to maintain a low (possibly zero) tolerable fraction of stale reads. As a result, Harmony can meet the desired consistency of the applications while achieving good performance. We have implemented Harmony and performed extensive evaluations with the Cassandra cloud storage system on the Grid'5000 testbed and on Amazon EC2. The results show that Harmony can achieve good performance without exceeding the tolerated number of stale reads. For instance, in contrast to the static eventual consistency used in Cassandra, Harmony reduces the stale data being read by almost 80% while adding only minimal latency. Meanwhile, it improves the throughput of the system by 45% while maintaining the desired consistency requirements of the applications when compared to the strong consistency model in Cassandra.
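Harmony's actual estimation model is not reproduced in the abstract. The sketch below is a simplified stand-in that captures the idea: estimate the chance that a single replica is stale, assume contacted replicas lag independently, and pick the smallest number of read replicas that keeps the stale-read probability within the tolerated fraction. The arrival model, parameter values, and independence assumption are all hypothetical.

```python
import math

def estimate_p_stale(write_rate, replication_lag):
    """Toy estimate: probability that a write arrived within the last
    `replication_lag` seconds (Poisson arrivals) and has not propagated yet."""
    return 1.0 - math.exp(-write_rate * replication_lag)

def replicas_needed(p_stale_one, tolerated_fraction, max_replicas):
    """Smallest number of replicas to read from so that the chance of a
    stale read (all contacted replicas lagging) stays within tolerance.
    Assumes independent replicas: P(stale) ~= p_stale_one ** r."""
    for r in range(1, max_replicas + 1):
        if p_stale_one ** r <= tolerated_fraction:
            return r
    return max_replicas

p = estimate_p_stale(write_rate=50.0, replication_lag=0.002)  # 50 writes/s, 2 ms lag
print(p, "->", replicas_needed(p, tolerated_fraction=0.001, max_replicas=5))
```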

Relevance: 80.00%

Abstract:

The inherent complexity of modern cloud infrastructures has created the need for innovative monitoring approaches, as state-of-the-art solutions used for other large-scale environments do not address specific cloud features. Although cloud monitoring is nowadays an active research field, a comprehensive study covering all its aspects has not been presented yet. This paper provides a deep insight into cloud monitoring. It proposes a unified cloud monitoring taxonomy, based on which it defines a layered cloud monitoring architecture. To illustrate it, we have implemented GMonE, a general-purpose cloud monitoring tool which covers all aspects of cloud monitoring by specifically addressing the needs of modern cloud infrastructures. Furthermore, we have evaluated the performance, scalability and overhead of GMonE with the Yahoo! Cloud Serving Benchmark (YCSB), using the OpenNebula cloud middleware on the Grid'5000 experimental testbed. The results of this evaluation demonstrate the benefits of our approach, surpassing the monitoring performance and capabilities of alternatives present in state-of-the-art systems such as Amazon EC2 and OpenNebula.
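GMonE's internal design is not detailed in the abstract. As a generic illustration of a layered monitoring architecture, the sketch below uses a plugin per layer (infrastructure, application) and aggregates one snapshot across all registered layers; the class names and metrics are invented for the example.

```python
class MonitoringPlugin:
    """Hypothetical plugin interface: each monitored layer contributes
    its own metric collector."""
    def collect(self):
        raise NotImplementedError

class InfrastructureMonitor(MonitoringPlugin):
    def collect(self):
        return {"layer": "infrastructure", "cpu_util": 0.62, "net_mbps": 840}

class ApplicationMonitor(MonitoringPlugin):
    def collect(self):
        return {"layer": "application", "requests_per_s": 1200, "p99_ms": 35}

def monitoring_snapshot(plugins):
    """Aggregate one snapshot across all registered layers."""
    return [p.collect() for p in plugins]

print(monitoring_snapshot([InfrastructureMonitor(), ApplicationMonitor()]))
```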

Relevance: 80.00%

Abstract:

The shear strength of rock mass discontinuities is a topic that has been studied for many years by various authors. It is of great importance because it forms the basis for the design of slopes, underground works and other engineering works. This thesis is therefore presented as a contribution to the research carried out in this field. It is based on the comparison of four well-known shear strength criteria: Patton (1966), Ladanyi & Archambault (1970), Barton & Choubey (1976) and Maksimovic (1996). Using these four criteria, the influence of each parameter within each of them is analysed, in addition to observing the differences that arise when the same values of the corresponding independent parameters are used. Proceeding in this way provides a basis for knowing which parameters are decisive within each criterion. The criteria are then applied to a practical case. Finally, conclusions are provided on when each of the four criteria studied should be used.
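The abstract names the four criteria without giving their expressions. For orientation only, two of them are shown below in their commonly cited form, with $\sigma_n$ the effective normal stress, $\phi_b$ the basic and $\phi_r$ the residual friction angle, $i$ the asperity angle, $JRC$ the joint roughness coefficient and $JCS$ the joint wall compressive strength; the exact formulations used in the thesis should be checked against the original references.

```latex
% Patton (1966), low normal stress range (sliding over asperities):
\tau = \sigma_n \tan(\phi_b + i)

% Barton & Choubey (1976):
\tau = \sigma_n \tan\left(\phi_r + JRC \, \log_{10}\frac{JCS}{\sigma_n}\right)
```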

Relevance: 80.00%

Abstract:

This Master's Thesis project involves creating a software tool able to monitor and manage the activity of Hydra, a tool for managing distributed environments, so that its load balancing strategy conforms to the model created by GloBeM, an analysis methodology for distributed environments. GloBeM, an external methodology, can analyse a particular cloud system and derive a finite-state machine model from it. Hydra, also an external tool, is a recently developed, open-source management system for cloud environments, with an effective but somewhat limited load balancing system. The software built here takes the model created by GloBeM as input and analyses it. From there, it monitors Hydra's activity and the cloud system it manages in real time and at a given frequency, and reconfigures Hydra's parameters so that its performance conforms to the GloBeM model, thereby extending Hydra's original load balancing system.
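Neither the GloBeM model format nor Hydra's configuration interface is specified in the abstract. The sketch below only illustrates the described control loop: observe the cloud at a fixed frequency, map the observation to a state of a (hypothetical) finite-state machine model, and push the state's load-balancing parameters to the manager. The model contents and the `get_load`/`reconfigure` callbacks are stand-ins, not Hydra's real API.

```python
import time

# Hypothetical FSM model, as GloBeM might produce it: states with load
# thresholds and the load-balancing parameters to apply in each state.
MODEL = {
    "underloaded": {"max_load": 0.3, "params": {"vm_pool": 2}},
    "nominal":     {"max_load": 0.7, "params": {"vm_pool": 4}},
    "overloaded":  {"max_load": 1.0, "params": {"vm_pool": 8}},
}

def classify(load):
    """Map the observed global load to a model state."""
    for state, spec in MODEL.items():
        if load <= spec["max_load"]:
            return state
    return "overloaded"

def control_loop(get_load, reconfigure, period_s=30, iterations=3):
    """Periodically observe the cloud, pick the matching state, and push
    the corresponding parameters to the load balancer."""
    for _ in range(iterations):
        state = classify(get_load())
        reconfigure(MODEL[state]["params"])
        time.sleep(period_s)

# Stand-ins for the real monitoring and reconfiguration calls.
control_loop(get_load=lambda: 0.55,
             reconfigure=lambda p: print("applying", p),
             period_s=0, iterations=1)
```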

Relevance: 80.00%

Abstract:

Reproducible research in scientific workflows is often addressed by tracking the provenance of the produced results. While this approach allows inspecting intermediate and final results, improves understanding, and permits replaying a workflow execution, it does not ensure that the computational environment is available for subsequent executions to reproduce the experiment. In this work, we propose describing the resources involved in the execution of an experiment using a set of semantic vocabularies, so as to conserve the computational environment. We define a process for documenting the workflow application, management system, and their dependencies based on 4 domain ontologies. We then conduct an experimental evaluation using a real workflow application on an academic and a public Cloud platform. Results show that our approach can reproduce an equivalent execution environment of a predefined virtual machine image on both computing platforms.
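The paper's four domain ontologies are not reproduced in the abstract. Purely as an illustration of describing an execution environment with semantic vocabularies, the sketch below builds a small RDF graph with the rdflib library (assumed available) under an invented namespace; the terms, resource names and software versions are hypothetical, not the paper's actual vocabulary.

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical vocabulary; the paper's actual ontologies are not shown here.
ENV = Namespace("http://example.org/wf-environment#")

g = Graph()
g.bind("env", ENV)

vm = ENV["vm-node-1"]
g.add((vm, RDF.type, ENV.ComputationalResource))
g.add((vm, ENV.hasOperatingSystem, Literal("Ubuntu 12.04")))
g.add((vm, ENV.hasMemoryMB, Literal(4096)))

app = ENV["example-workflow"]
g.add((app, RDF.type, ENV.WorkflowApplication))
g.add((app, ENV.requiresSoftware, Literal("workflow-engine 4.x")))
g.add((app, ENV.deployedOn, vm))

print(g.serialize(format="turtle"))
```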

Relevance: 80.00%

Abstract:

Comparative study of the accelerated ageing process (promoted by the incorporation of a pro-degradant additive) exhibited by a high-density polyethylene and by an ethylene-undecenoic acid copolymer, assessing the effect of the presence of polar groups in the macromolecular architecture.

Relevance: 80.00%

Abstract:

Due to the enormous increase in digital data volumes, a new parallel computing paradigm for processing big data efficiently has arisen. Many of these systems, called data-intensive computing systems, follow the Google MapReduce programming model. The main advantage of these systems is the idea of sending the computation to where the data resides, trying to provide scalability and efficiency. In failure-free scenarios, these frameworks usually achieve good results. However, such scenarios are not realistic; consequently, these frameworks incorporate fault tolerance and dependability techniques as built-in features. On the other hand, dependability improvements are known to imply additional resource costs. This is reasonable, and providers offering these infrastructures are aware of it. Nevertheless, not all approaches provide the same trade-off between fault-tolerance capabilities (or, more generally, reliability capabilities) and cost. In this thesis, we have addressed the coexistence of reliability and resource efficiency in MapReduce-based systems, looking for methodologies that introduce minimal cost while guaranteeing an appropriate level of reliability. In order to achieve this, we have proposed: (i) a formalization of a failure detector abstraction; (ii) an alternative solution to the single points of failure of these frameworks; and (iii) a novel feedback-based resource allocation system at the container level. These generic contributions have been instantiated for the Hadoop YARN architecture, which is nowadays the reference framework in the data-intensive computing systems community. The thesis demonstrates how all our approaches outperform Hadoop YARN in terms of both reliability and resource efficiency.
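The thesis's actual allocation policy is not given in the abstract. As a toy illustration of feedback-based resource allocation at the container level, the sketch below resizes a container's memory toward its observed usage plus a safety headroom, clamped to cluster limits; the rule, thresholds and numbers are assumptions for the example.

```python
def adjust_container_memory(allocated_mb, used_mb, headroom=0.2,
                            min_mb=512, max_mb=8192):
    """Toy feedback rule: move the allocation toward observed usage plus a
    headroom, but only when the current allocation is clearly too small or
    too generous; clamp to cluster limits."""
    target = used_mb * (1.0 + headroom)
    if target > allocated_mb or target < 0.7 * allocated_mb:
        allocated_mb = min(max(target, min_mb), max_mb)
    return int(allocated_mb)

# Hypothetical feedback loop over observed per-container memory usage (MB).
allocation = 4096
for used in [1200, 1500, 3900, 4100]:
    allocation = adjust_container_memory(allocation, used)
    print("used", used, "MB -> allocate", allocation, "MB")
```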

Relevance: 80.00%

Abstract:

Reproducibility of scientific studies and results is a goal that every scientist must pursue when announcing research outcomes. The rise of computational science, as a way of conducting empirical studies using mathematical models and simulations, has opened a new range of challenges in this context. The adoption of workflows as a way of detailing the scientific procedure of these experiments, along with the experimental data conservation initiatives undertaken during recent decades, has partially eased this problem. However, in order to fully address it, the conservation and reproducibility of the computational environment related to these workflows must also be considered. The wide range of software and hardware resources required to execute a scientific workflow implies that a comprehensive description detailing what those resources are and how they are arranged is necessary. In this thesis we address the reproducibility of execution environments for scientific workflows by documenting them in a formalized way, which can later be used to obtain an equivalent environment. In order to do so, we propose a set of semantic models for representing and relating the relevant information about those environments, as well as a set of tools that use these models to generate a description of the infrastructure, and an algorithmic process that consumes these descriptions to derive a new execution environment specification, which can be enacted into an equivalent environment using virtualization solutions. We apply these three contributions to a set of representative scientific experiments, belonging to different scientific domains and exhibiting different software and hardware requirements. The results obtained demonstrate the feasibility of the proposed approach, successfully reproducing the target experiments under different virtualization environments.
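The thesis's derivation algorithm is not reproduced in the abstract. As a minimal sketch of the idea only, the code below turns a declarative environment description into a provisioning script (here a Dockerfile-like text) that virtualization tooling could enact; the field names, package names and output format are hypothetical.

```python
def derive_environment_spec(description):
    """Toy derivation step: map a declarative environment description to a
    Dockerfile-like provisioning script."""
    lines = [f"FROM {description['base_image']}"]
    for pkg in description.get("packages", []):
        lines.append(f"RUN apt-get install -y {pkg}")
    for var, value in description.get("environment", {}).items():
        lines.append(f"ENV {var}={value}")
    entrypoint = description["entrypoint"]
    lines.append(f'CMD ["{entrypoint}"]')
    return "\n".join(lines)

# Hypothetical description, e.g. as extracted from a semantic model.
description = {
    "base_image": "ubuntu:14.04",
    "packages": ["openjdk-7-jre", "workflow-engine"],
    "environment": {"WF_HOME": "/opt/workflow"},
    "entrypoint": "/opt/workflow/run.sh",
}
print(derive_environment_spec(description))
```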