821 results for cloud computing, hypervisor, virtualization, live migration, infrastructure as a service
Abstract:
Cumulon is a system aimed at simplifying the development and deployment of statistical analysis of big data in public clouds. Cumulon allows users to program in their familiar language of matrices and linear algebra, without worrying about how to map data and computation to specific hardware and cloud software platforms. Given user-specified requirements in terms of time, monetary cost, and risk tolerance, Cumulon automatically makes intelligent decisions on implementation alternatives, execution parameters, and hardware provisioning and configuration settings -- such as what type of machines to acquire and how many of them. Cumulon also supports clouds with auction-based markets: it effectively utilizes computing resources whose availability varies according to market conditions, and suggests the best bidding strategies for them. Cumulon explores two alternative approaches to supporting such markets, with different trade-offs between system and optimization complexity. An experimental study is conducted to show the efficiency of Cumulon's execution engine, as well as the optimizer's effectiveness in finding the optimal plan in the vast plan space.
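As a rough illustration of the kind of provisioning decision Cumulon's optimizer automates, the sketch below enumerates (machine type, cluster size) configurations and picks the cheapest one whose estimated runtime meets a deadline. The runtime model, machine specs, and prices are invented for the example and are not Cumulon's actual cost model or API.

```python
# Hypothetical sketch: choose the cheapest (machine type, count) configuration
# that meets a user deadline, in the spirit of Cumulon's provisioning decisions.
# Machine throughputs, prices, and the runtime model are illustrative assumptions.

def estimate_runtime(work_units, machines, per_machine_throughput):
    """Idealized runtime model: work divided evenly across machines."""
    return work_units / (machines * per_machine_throughput)

def cheapest_plan(work_units, deadline_hours, machine_types, max_machines=64):
    """Return the lowest-cost configuration whose estimated runtime meets the deadline."""
    best = None
    for name, (throughput, hourly_price) in machine_types.items():
        for n in range(1, max_machines + 1):
            runtime = estimate_runtime(work_units, n, throughput)
            if runtime > deadline_hours:
                continue
            cost = runtime * n * hourly_price
            if best is None or cost < best[0]:
                best = (cost, name, n, runtime)
    return best

machine_types = {          # (work units per hour, $ per hour) -- made-up figures
    "small": (10.0, 0.10),
    "large": (45.0, 0.40),
}
print(cheapest_plan(work_units=1000, deadline_hours=4, machine_types=machine_types))
```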
Abstract:
PURPOSE: Radiation therapy is used to treat cancer using carefully designed plans that maximize the radiation dose delivered to the target and minimize damage to healthy tissue, with the dose administered over multiple occasions. Creating treatment plans is a laborious process that presents an obstacle to more frequent replanning, which remains an unsolved problem. However, in between new plans being created, the patient's anatomy can change due to multiple factors, including reduction in tumor size and loss of weight, which results in poorer patient outcomes. Cloud computing is a newer technology that is slowly being adopted for medical applications with promising results. The objective of this work was to design and build a system that could analyze a database of previously created treatment plans, which are stored with their associated anatomical information in studies, to find the one with the most similar anatomy to a new patient. The analyses would be performed in parallel on the cloud to decrease the computation time of finding this plan. METHODS: The system used SlicerRT, a radiation therapy toolkit for the open-source platform 3D Slicer, for its tools to perform the similarity analysis algorithm. Amazon Web Services was used for the cloud instances on which the analyses were performed, as well as for storage of the radiation therapy studies and messaging between the instances and a master local computer. A module was built in SlicerRT to provide the user with an interface for directing the system on the cloud and for performing other related tasks. RESULTS: The cloud-based system outperformed previous methods of conducting the similarity analyses in terms of time, analyzing 100 studies in approximately 13 minutes, and produced the same similarity values as those methods. It also scaled to larger numbers of studies in the database with only a small increase in computation time of just over 2 minutes. CONCLUSION: This system successfully analyzes a large database of radiation therapy studies and finds the one that is most similar to a new patient, which represents a potential step forward in achieving feasible adaptive radiation therapy replanning.
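The master/worker messaging described above could be approximated by fanning out one job per stored study to cloud workers. The sketch below uses boto3 and an SQS queue; the queue URL, object keys, and message schema are assumptions for illustration, not the module's actual implementation.

```python
# Hypothetical sketch of dispatching per-study similarity jobs to cloud workers
# through an SQS queue, loosely mirroring the master/worker messaging described
# above. The queue URL, S3 keys, and message fields are placeholder assumptions.
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/similarity-jobs"  # placeholder

def dispatch_similarity_jobs(study_ids, new_patient_key):
    """Send one message per stored study; workers compute similarity and report back."""
    for study_id in study_ids:
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps({
                "study_id": study_id,            # study to compare against
                "patient_key": new_patient_key,  # storage key of the new patient's anatomy
            }),
        )

dispatch_similarity_jobs([f"study-{i:03d}" for i in range(100)], "patients/new-patient.nrrd")
```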
Abstract:
The physical location of data in cloud storage is a problem that attracts a lot of attention not only from cloud providers but also from end users, who have lately raised many concerns regarding the privacy of their data. It is common practice for cloud service providers to replicate users' data across multiple physical locations. However, moving data between countries effectively means that access rights are transferred according to the local laws of the destination country. In other words, when a cloud service provider stores users' data in a different country, the transferred data becomes subject to the data protection laws of the country where the servers are located. In this paper, we propose LocLess, a protocol based on a symmetric searchable encryption scheme for protecting users' data from unauthorized access even if the data is transferred to different locations. The idea behind LocLess is that "once data is placed on the cloud in an unencrypted form, or encrypted with a key that is known to the cloud service provider, data privacy becomes an illusion". Hence, the proposed solution is based solely on encrypting data with a key that is known only to the data owner.
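The principle LocLess builds on can be illustrated with plain client-side symmetric encryption: data is encrypted under a key the cloud provider never sees before it is uploaded, so replication to another jurisdiction does not expose it. The sketch below uses the Python cryptography library's Fernet primitive as a stand-in; the paper's actual construction is a symmetric searchable encryption scheme that additionally supports search over the ciphertexts.

```python
# Minimal client-side encryption sketch illustrating LocLess's guiding principle:
# data leaves the owner's machine only under a key the cloud provider never sees.
# Fernet is a stand-in here; it does not provide the searchable-encryption
# capability of the paper's scheme.
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # kept only by the data owner
cipher = Fernet(key)

plaintext = b"record 42: ..."
ciphertext = cipher.encrypt(plaintext)   # safe to replicate across any jurisdiction

# Later, only the key holder can recover the data, wherever it was stored.
assert cipher.decrypt(ciphertext) == plaintext
```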
Abstract:
Simulating the efficiency of business processes can reveal crucial bottlenecks for manufacturing companies and lead to significant optimizations, resulting in decreased time to market, more efficient resource utilization, and larger profits. While such business optimization software is widely used by larger companies, SMEs typically do not have the required expertise and resources to exploit these advantages efficiently. The aim of this work is to explore how simulation software vendors and consultancies can extend their portfolio to SMEs by providing business process optimization based on a cloud computing platform. By executing simulation runs on the cloud, software vendors and associated business consultancies can access large computing power and data storage capacity on demand, run large simulation scenarios on behalf of their clients, analyze simulation results, and advise their clients on process optimization. The solution is mutually beneficial for both the vendor/consultant and the end-user SME. End-user companies pay only for the service, without incurring large upfront costs for software licenses and expensive hardware. Software vendors can extend their business towards the SME market with potentially huge benefits.
Abstract:
How can applications be deployed on the cloud to achieve maximum performance? This question is challenging to address given the wide variety of cloud Virtual Machines (VMs) with different performance capabilities. The research reported in this paper addresses the above question by proposing a six-step benchmarking methodology in which a user provides a set of weights that indicate how important memory, local communication, computation, and storage related operations are to an application. The user can provide either a set of four abstract weights or eight fine-grain weights, based on their knowledge of the application. The weights, along with benchmarking data collected from the cloud, are used to generate two rankings: one based only on the performance of the VMs, and the other taking both performance and cost into account. The rankings are validated on three case study applications using two validation techniques. The case studies on a set of experimental VMs highlight that maximum performance is achieved by the three top-ranked VMs, and that maximum performance in a cost-effective manner is achieved by at least one of the top three VMs ranked by the methodology.
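A minimal sketch of the weighted-ranking idea follows: user weights for memory, local communication, computation, and storage are combined with normalized benchmark scores, and VMs are ranked once by raw score and once by score per dollar. The scoring formula, benchmark figures, and prices are illustrative assumptions, not the paper's exact methodology.

```python
# Illustrative sketch: combine user-supplied weights with normalized benchmark
# scores, then rank VMs by score and by score per dollar. Figures are made up.

weights = {"memory": 0.4, "communication": 0.1, "computation": 0.4, "storage": 0.1}

vms = {  # normalized benchmark scores (higher is better) and hourly price
    "m4.large":  {"memory": 0.7, "communication": 0.6, "computation": 0.5, "storage": 0.6, "price": 0.10},
    "c4.xlarge": {"memory": 0.6, "communication": 0.7, "computation": 0.9, "storage": 0.5, "price": 0.20},
    "r4.large":  {"memory": 0.9, "communication": 0.5, "computation": 0.5, "storage": 0.6, "price": 0.13},
}

def score(vm):
    """Weighted sum of the per-group benchmark scores."""
    return sum(weights[group] * vm[group] for group in weights)

perf_rank = sorted(vms, key=lambda name: score(vms[name]), reverse=True)
value_rank = sorted(vms, key=lambda name: score(vms[name]) / vms[name]["price"], reverse=True)
print("performance ranking:", perf_rank)
print("performance/cost ranking:", value_rank)
```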
Abstract:
The great success of cloud computing, and the speed with which cloud innovations have since found their way into practice, open up new competitive opportunities for industry. Of particular importance is the ability to adapt and execute cloud-based business processes dynamically, as a direct response to a customer order. This applies especially to cooperative, cross-company applications that consist of several IT services from different partners. This article presents a concept and an architecture for a central cloud platform for configuring, executing, and monitoring collaborative logistics processes. On this platform, business processes can be modeled and parameterized with respect to their privacy properties. The individual process elements are linked to IT services that are executed, for example, on external cloud platforms. A focus of the publication is the creation, implementation, and monitoring of privacy requirements.
Abstract:
The work developed stems from the creation, during an internship, of a small database, built from the initial search for data through the selection of relevant information and its subsequent storage. The goal of this thesis is to broaden the basic knowledge of the information world from a management perspective. Indeed, considering today's scenario, studying the customer through relevant information of various kinds is one of the fundamental skills in management engineering. The study method is based on understanding the different types of data present in the business world and, consequently, their connection with the web and above all with the most modern and widely used storage methods, adopted today by companies as well as by private individuals: cloud platforms. The thesis is divided into three distinct but closely related topics: the first part deals with how the most basic information should be collected and analyzed; the central section concerns the key theme of the internet as a storage medium and no longer merely as a platform for finding data; while the final chapter clarifies the concept of cloud computing, convenient, fast, and efficient, which for some years has been regarded as the meeting point between the first two topics. Specifically, some cases of real-world use of the cloud by companies such as Amazon, Google, and Facebook are presented, multinationals that have managed to turn the storage and manipulation of data, for industrial purposes, into one of their sources of revenue. The result is an overview of how information works and how it is used, starting from the most trivial piece of data and reaching the shared databases used, if not outright controlled, by the best-known national and international companies.
Abstract:
This paper deals with the combination of OSGi and cloud computing, two technologies that are mainly situated in the field of distributed computing. It discusses how different approaches from different institutions work and compares the approaches to each other.
Abstract:
In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud, such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has traditionally been used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, which provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud.
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading them onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
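The neighborhood-centric programming model can be sketched on a single machine as extracting each vertex's k-hop ego network and applying a user function to that subgraph, rather than programming vertex by vertex. The example below uses networkx as a stand-in for NSCALE's distributed execution engine and subgraph-extraction machinery.

```python
# Single-machine sketch of the neighborhood-centric model: run a user-supplied
# analysis function on each vertex's k-hop ego network instead of programming
# at the level of individual vertices. networkx stands in for NSCALE here.
import networkx as nx

def neighborhood_centric(graph, k, analyze):
    """Apply `analyze` to the k-hop neighborhood subgraph of every vertex."""
    results = {}
    for v in graph.nodes:
        ego = nx.ego_graph(graph, v, radius=k)   # k-hop neighborhood around v
        results[v] = analyze(ego)
    return results

g = nx.karate_club_graph()
# Example per-neighborhood task: edge density of each 1-hop ego network.
density = neighborhood_centric(g, k=1, analyze=nx.density)
print(sorted(density.items(), key=lambda kv: -kv[1])[:5])
```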
Abstract:
Elasticity is one of the best-known capabilities of cloud computing, and it is largely deployed reactively using thresholds. In this approach, maximum and minimum limits are used to drive resource allocation and deallocation actions, leading to the following problem statements: How can cloud users set the threshold values to enable elasticity in their cloud applications? And what is the impact of the application's load pattern on elasticity? This article tries to answer these questions for iterative high performance computing applications, showing the impact of both thresholds and load patterns on application performance and resource consumption. To accomplish this, we developed a reactive, PaaS-based elasticity model called AutoElastic and employed it on a private cloud to execute a numerical integration application. Here, we present an analysis of best practices and possible optimizations regarding the combination of elasticity and HPC. Considering the results, we observed that the maximum threshold influences the application time more than the minimum one. We concluded that threshold values close to 100% of CPU load are directly related to weaker reactivity, postponing resource reconfiguration in cases where triggering it earlier could be pertinent for reducing the application runtime.
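A reactive threshold controller of the kind discussed above can be sketched as follows: scale out by one VM when average CPU load exceeds the upper threshold and scale in when it falls below the lower one. The threshold values and the single-VM step are illustrative assumptions, not AutoElastic's exact policy.

```python
# Minimal reactive-elasticity sketch: add or remove one VM when the average CPU
# load crosses the upper or lower threshold. Values are illustrative only.

def reactive_step(avg_cpu_load, current_vms, upper=0.85, lower=0.30,
                  min_vms=1, max_vms=16):
    """Return the new VM count after one monitoring interval."""
    if avg_cpu_load > upper and current_vms < max_vms:
        return current_vms + 1          # scale out: load too high
    if avg_cpu_load < lower and current_vms > min_vms:
        return current_vms - 1          # scale in: resources underused
    return current_vms                  # load sits within the thresholds

vms = 2
for load in [0.95, 0.92, 0.70, 0.25, 0.20]:   # simulated monitoring samples
    vms = reactive_step(load, vms)
    print(f"load={load:.2f} -> {vms} VMs")
```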
Abstract:
The air-sea flux of greenhouse gases (e.g. carbon dioxide, CO2) is a critical part of the climate system and a major factor in the biogeochemical development of the oceans. More accurate and higher resolution calculations of these gas fluxes are required if we are to fully understand and predict our future climate. Satellite Earth observation is able to provide large spatial scale datasets that can be used to study gas fluxes. However, the large storage requirements needed to host such data can restrict its use by the scientific community. Fortunately, the development of cloud computing can provide a solution. Here we describe an open-source air-sea CO2 flux processing toolbox called the ‘FluxEngine’, designed for use on a cloud-computing infrastructure. The toolbox allows users to easily generate global and regional air-sea CO2 flux data from model, in situ and Earth observation data, and its air-sea gas flux calculation is user configurable. Its current installation on the Nephalae cloud allows users to easily exploit more than 8 terabytes of climate-quality Earth observation data for the derivation of gas fluxes. The resultant NetCDF output files contain more than 20 data layers covering the various stages of the flux calculation, along with process indicator layers to aid interpretation of the data. This paper describes the toolbox design, verifies the air-sea CO2 flux calculations, demonstrates the use of the tools for studying global and shelf-sea air-sea fluxes, and describes future developments.
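At its core, an air-sea gas flux calculation of this kind follows the standard bulk formulation F = k · K0 · (pCO2,water − pCO2,air), with k the gas transfer velocity and K0 the solubility. The sketch below shows that formula with illustrative numbers; FluxEngine's configurable processing chain adds unit handling, corrections, and the additional data layers described above.

```python
# Sketch of the standard bulk formulation behind air-sea CO2 flux calculations:
# F = k * K0 * (pCO2_water - pCO2_air). All numbers below are illustrative.

def co2_flux(k_cm_per_hr, solubility_mol_m3_uatm, pco2_water_uatm, pco2_air_uatm):
    """Return the air-sea CO2 flux (mol m^-2 hr^-1); positive means out of the ocean."""
    k_m_per_hr = k_cm_per_hr / 100.0
    return k_m_per_hr * solubility_mol_m3_uatm * (pco2_water_uatm - pco2_air_uatm)

# Illustrative inputs: a CO2-undersaturated ocean takes up CO2 (negative flux).
print(co2_flux(k_cm_per_hr=15.0,
               solubility_mol_m3_uatm=3.3e-5,
               pco2_water_uatm=380.0,
               pco2_air_uatm=400.0))
```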
Abstract:
The use of cloud computing is currently on the rise, and many providers offer services based on this technology. One of them is Amazon Web Services, which, through its Amazon EC2 service, offers different types of instances that we can use according to our needs. The AWS business model is pay-per-use: we only pay for the time the instances are used. In this work, an application is implemented on Amazon EC2 whose goal is to extract, from different information sources, data on the sales made by publishers and bookstores in Spain. These data are processed, loaded into a database, and used to generate statistical reports that will help clients make better decisions. Because the application processes a large amount of data, we propose the development and validation of a model that allows an optimal execution on Amazon EC2. This model takes into account execution time, cost of use, and a cost/performance metric. Additionally, Docker container technology is used to carry out a specific case of the application's deployment.
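In its simplest form, the cost/performance evaluation described above could look like the sketch below: for each candidate EC2 instance type, a measured execution time is combined with the hourly price into a cost and a combined cost×time metric. Instance names, prices, runtimes, the per-hour billing assumption, and the metric itself are placeholders, not the model developed in this work.

```python
# Illustrative cost/performance comparison across EC2 instance types.
# Runtimes, prices, per-hour billing, and the cost*time metric are assumptions.
import math

candidates = {  # instance type -> (measured runtime in hours, $ per hour)
    "t3.medium": (5.0, 0.0416),
    "m5.large":  (2.6, 0.0960),
    "c5.xlarge": (1.4, 0.1700),
}

def evaluate(runtime_hours, hourly_price):
    """Return (monetary cost, cost*time metric) for one instance type."""
    cost = math.ceil(runtime_hours) * hourly_price    # assuming per-hour billing
    return cost, cost * runtime_hours

for name, (runtime, price) in candidates.items():
    cost, metric = evaluate(runtime, price)
    print(f"{name}: runtime={runtime:.1f} h, cost=${cost:.2f}, cost*time={metric:.2f}")
```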