739 resultados para hybrid cloud computing
Resumo:
Postprint
Resumo:
Winner of best paper award.
Resumo:
Cumulon is a system aimed at simplifying the development and deployment of statistical analysis of big data in public clouds. Cumulon allows users to program in their familiar language of matrices and linear algebra, without worrying about how to map data and computation to specific hardware and cloud software platforms. Given user-specified requirements in terms of time, monetary cost, and risk tolerance, Cumulon automatically makes intelligent decisions on implementation alternatives, execution parameters, as well as hardware provisioning and configuration settings -- such as what type of machines and how many of them to acquire. Cumulon also supports clouds with auction-based markets: it effectively utilizes computing resources whose availability varies according to market conditions, and suggests best bidding strategies for them. Cumulon explores two alternative approaches toward supporting such markets, with different trade-offs between system and optimization complexity. Experimental study is conducted to show the efficiency of Cumulon's execution engine, as well as the optimizer's effectiveness in finding the optimal plan in the vast plan space.
Resumo:
Mobile Cloud Computing promises to overcome the physical limitations of mobile devices by executing demanding mobile applications on cloud infrastructure. In practice, implementing this paradigm is difficult; network disconnection often occurs, bandwidth may be limited, and a large power draw is required from the battery, resulting in a poor user experience. This thesis presents a mobile cloud middleware solution, Context Aware Mobile Cloud Services (CAMCS), which provides cloudbased services to mobile devices, in a disconnected fashion. An integrated user experience is delivered by designing for anticipated network disconnection, and low data transfer requirements. CAMCS achieves this by means of the Cloud Personal Assistant (CPA); each user of CAMCS is assigned their own CPA, which can complete user-assigned tasks, received as descriptions from the mobile device, by using existing cloud services. Service execution is personalised to the user's situation with contextual data, and task execution results are stored with the CPA until the user can connect with his/her mobile device to obtain the results. Requirements for an integrated user experience are outlined, along with the design and implementation of CAMCS. The operation of CAMCS and CPAs with cloud-based services is presented, specifically in terms of service description, discovery, and task execution. The use of contextual awareness to personalise service discovery and service consumption to the user's situation is also presented. Resource management by CAMCS is also studied, and compared with existing solutions. Additional application models that can be provided by CAMCS are also presented. Evaluation is performed with CAMCS deployed on the Amazon EC2 cloud. The resource usage of the CAMCS Client, running on Android-based mobile devices, is also evaluated. A user study with volunteers using CAMCS on their own mobile devices is also presented. Results show that CAMCS meets the requirements outlined for an integrated user experience.
Resumo:
Cloud computing offers massive scalability and elasticity required by many scien-tific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available for end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new oppor-tunities for application developers. This paper investigates how workflow sys-tems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure aware workflows is suggested and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution based on the MapReduce paradigm in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.
Resumo:
PURPOSE: Radiation therapy is used to treat cancer using carefully designed plans that maximize the radiation dose delivered to the target and minimize damage to healthy tissue, with the dose administered over multiple occasions. Creating treatment plans is a laborious process and presents an obstacle to more frequent replanning, which remains an unsolved problem. However, in between new plans being created, the patient's anatomy can change due to multiple factors including reduction in tumor size and loss of weight, which results in poorer patient outcomes. Cloud computing is a newer technology that is slowly being used for medical applications with promising results. The objective of this work was to design and build a system that could analyze a database of previously created treatment plans, which are stored with their associated anatomical information in studies, to find the one with the most similar anatomy to a new patient. The analyses would be performed in parallel on the cloud to decrease the computation time of finding this plan. METHODS: The system used SlicerRT, a radiation therapy toolkit for the open-source platform 3D Slicer, for its tools to perform the similarity analysis algorithm. Amazon Web Services was used for the cloud instances on which the analyses were performed, as well as for storage of the radiation therapy studies and messaging between the instances and a master local computer. A module was built in SlicerRT to provide the user with an interface to direct the system on the cloud, as well as to perform other related tasks. RESULTS: The cloud-based system out-performed previous methods of conducting the similarity analyses in terms of time, as it analyzed 100 studies in approximately 13 minutes, and produced the same similarity values as those methods. It also scaled up to larger numbers of studies to analyze in the database with a small increase in computation time of just over 2 minutes. CONCLUSION: This system successfully analyzes a large database of radiation therapy studies and finds the one that is most similar to a new patient, which represents a potential step forward in achieving feasible adaptive radiation therapy replanning.
Resumo:
Physical location of data in cloud storage is a problem that gains a lot of attention not only from the actual cloud providers but also from the end users' who lately raise many concerns regarding the privacy of their data. It is a common practice that cloud service providers create replicate users' data across multiple physical locations. However, moving data in different countries means that basically the access rights are transferred based on the local laws of the corresponding country. In other words, when a cloud service provider stores users' data in a different country then the transferred data is subject to the data protection laws of the country where the servers are located. In this paper, we propose LocLess, a protocol which is based on a symmetric searchable encryption scheme for protecting users' data from unauthorized access even if the data is transferred to different locations. The idea behind LocLess is that "Once data is placed on the cloud in an unencrypted form or encrypted with a key that is known to the cloud service provider, data privacy becomes an illusion". Hence, the proposed solution is solely based on encrypting data with a key that is only known to the data owner.
Resumo:
Simulating the efficiency of business processes could reveal crucial bottlenecks for manufacturing companies and could lead to significant optimizations resulting in decreased time to market, more efficient resource utilization, and larger profit. While such business optimization software is widely utilized by larger companies, SMEs typically do not have the required expertise and resources to efficiently exploit these advantages. The aim of this work is to explore how simulation software vendors and consultancies can extend their portfolio to SMEs by providing business process optimization based on a cloud computing platform. By executing simulation runs on the cloud, software vendors and associated business consultancies can get access to large computing power and data storage capacity on demand, run large simulation scenarios on behalf of their clients, analyze simulation results, and advise their clients regarding process optimization. The solution is mutually beneficial for both vendor/consultant and the end-user SME. End-user companies will only pay for the service without requiring large upfront costs for software licenses and expensive hardware. Software vendors can extend their business towards the SME market with potentially huge benefits.
Resumo:
How can applications be deployed on the cloud to achieve maximum performance? This question is challenging to address with the availability of a wide variety of cloud Virtual Machines (VMs) with different performance capabilities. The research reported in this paper addresses the above question by proposing a six step benchmarking methodology in which a user provides a set of weights that indicate how important memory, local communication, computation and storage related operations are to an application. The user can either provide a set of four abstract weights or eight fine grain weights based on the knowledge of the application. The weights along with benchmarking data collected from the cloud are used to generate a set of two rankings - one based only on the performance of the VMs and the other takes both performance and costs into account. The rankings are validated on three case study applications using two validation techniques. The case studies on a set of experimental VMs highlight that maximum performance can be achieved by the three top ranked VMs and maximum performance in a cost-effective manner is achieved by at least one of the top three ranked VMs produced by the methodology.
Resumo:
Durch den großen Erfolg des Cloud Computing und der hohen Geschwindigkeit, mit der Cloud-Innovationen seither Einzug in die Praxis finden, eröffnen sich für die Industrie neue Chancen im Wettbewerb. Von besonderer Bedeutung sind die Möglichkeiten, Cloud-gestützte Geschäftsprozesse dynamisch, als direkte Reaktion auf einen Kundenauftrag, anzupassen und auszuführen. Dies gilt insbesondere auch für kooperative und unternehmensübergreifende Anwendungen, welche aus mehreren IT-Diensten verschiedener Partner bestehen. Gegenstand dieses Artikels ist die Vorstellung eines Konzeptes und einer Architektur für eine zentrale Cloud-Plattform zur Konfiguration, Ausführung und Überwachung von kollaborativen Logistik-Prozessen. Auf dieser Plattform können Geschäftsprozesse modelliert und in ihren Privacy-Eigenschaften parametrisiert werden. Die einzelnen Prozesselemente werden dabei mit IT-Diensten verknüpft, die beispielsweise auf externen Cloud-Plattformen ausgeführt werden. Ein Schwerpunkt der Veröffentlichung liegt in der Betrachtung der Erstellung, Umsetzung und Überwachung von Privacy-Anforderungen.
Resumo:
Il lavoro sviluppato deriva dalla creazione, in sede di tirocinio, di un piccolo database, creato a partire dalla ricerca dei dati fino alla scelta di informazioni di rilievo e alla loro conseguente archiviazione. L’obiettivo dell’elaborato è rappresentato dalla volontà di ampliare quella conoscenza basilare posseduta sul mondo dell’informazione dal punto di vista gestionale. Infatti, considerando lo scenario odierno, si può affermare che lo studio del cliente attraverso delle informazioni rilevanti, di vario tipo, è una delle conoscenze fondamentali nel mondo dell’ingegneria gestionale. Il metodo di studio utilizzato è basato sulla comprensione delle diverse tipologie di dati presenti nel mondo aziendale e, di conseguenza, al loro legame con il mondo del web e soprattutto con i metodi di archiviazione più moderni e più utilizzati oggi sia dalle aziende, che non dai privati stessi; le piattaforme cloud. L’elaborato si suddivide in tre argomenti differenti ma strettamente collegati tra loro; la prima parte tratta di come l’informazione più basilare vada raccolta ed analizzata, la sezione centrale è legata al tema chiave dell’internet come mezzo di archiviazione e non più solo come piattaforma di ricerca del dato, mentre nel capitolo finale viene chiarito il concetto di cloud computing, comodo veloce ed efficiente, considerato da qualche anno il punto d’incontro fra i primi due argomenti. Nello specifico si andranno a presentare alcuni di applicazione reale del cloud da parte di aziende come Amazon, Google e Facebook, multinazionali che ad oggi sono riuscite a fare dell’archiviazione e della manipolazione dei dati, a scopi industriali, una delle loro fonti di guadagno. Il risultato è rappresentato da una panoramica sul funzionamento e sulle tecniche di utilizzo dell’informazione, partendo dal dato più irrilevante fino ad arrivare ai database condivisi utilizzati, se non addirittura controllati, dalle più rinomate aziende nazionali ed internazionali.
Resumo:
This paper deals with the combination of OSGi and cloud computing. Both technologies are mainly placed in the field of distributed computing. Therefore, it is discussed how different approaches from different institutions work. In addition, the approaches are compared to each other.
Resumo:
We present Dithen, a novel computation-as-a-service (CaaS) cloud platform specifically tailored to the parallel ex-ecution of large-scale multimedia tasks. Dithen handles the upload/download of both multimedia data and executable items, the assignment of compute units to multimedia workloads, and the reactive control of the available compute units to minimize the cloud infrastructure cost under deadline-abiding execution. Dithen combines three key properties: (i) the reactive assignment of individual multimedia tasks to available computing units according to availability and predetermined time-to-completion constraints; (ii) optimal resource estimation based on Kalman-filter estimates; (iii) the use of additive increase multiplicative decrease (AIMD) algorithms (famous for being the resource management in the transport control protocol) for the control of the number of units servicing workloads. The deployment of Dithen over Amazon EC2 spot instances is shown to be capable of processing more than 80,000 video transcoding, face detection and image processing tasks (equivalent to the processing of more than 116 GB of compressed data) for less than $1 in billing cost from EC2. Moreover, the proposed AIMD-based control mechanism, in conjunction with the Kalman estimates, is shown to provide for more than 27% reduction in EC2 spot instance cost against methods based on reactive resource estimation. Finally, Dithen is shown to offer a 38% to 500% reduction of the billing cost against the current state-of-the-art in CaaS platforms on Amazon EC2 (Amazon Lambda and Amazon Autoscale). A baseline version of Dithen is currently available at dithen.com.
Resumo:
Avec l’avènement des objets connectés, la bande passante nécessaire dépasse la capacité des interconnections électriques et interface sans fils dans les réseaux d’accès mais aussi dans les réseaux coeurs. Des systèmes photoniques haute capacité situés dans les réseaux d’accès utilisant la technologie radio sur fibre systèmes ont été proposés comme solution dans les réseaux sans fil de 5e générations. Afin de maximiser l’utilisation des ressources des serveurs et des ressources réseau, le cloud computing et des services de stockage sont en cours de déploiement. De cette manière, les ressources centralisées pourraient être diffusées de façon dynamique comme l’utilisateur final le souhaite. Chaque échange nécessitant une synchronisation entre le serveur et son infrastructure, une couche physique optique permet au cloud de supporter la virtualisation des réseaux et de les définir de façon logicielle. Les amplificateurs à semi-conducteurs réflectifs (RSOA) sont une technologie clé au niveau des ONU(unité de communications optiques) dans les réseaux d’accès passif (PON) à fibres. Nous examinons ici la possibilité d’utiliser un RSOA et la technologie radio sur fibre pour transporter des signaux sans fil ainsi qu’un signal numérique sur un PON. La radio sur fibres peut être facilement réalisée grâce à l’insensibilité a la longueur d’onde du RSOA. Le choix de la longueur d’onde pour la couche physique est cependant choisi dans les couches 2/3 du modèle OSI. Les interactions entre la couche physique et la commutation de réseaux peuvent être faites par l’ajout d’un contrôleur SDN pour inclure des gestionnaires de couches optiques. La virtualisation réseau pourrait ainsi bénéficier d’une couche optique flexible grâce des ressources réseau dynamique et adaptée. Dans ce mémoire, nous étudions un système disposant d’une couche physique optique basé sur un RSOA. Celle-ci nous permet de façon simultanée un envoi de signaux sans fil et le transport de signaux numérique au format modulation tout ou rien (OOK) dans un système WDM(multiplexage en longueur d’onde)-PON. Le RSOA a été caractérisé pour montrer sa capacité à gérer une plage dynamique élevée du signal sans fil analogique. Ensuite, les signaux RF et IF du système de fibres sont comparés avec ses avantages et ses inconvénients. Finalement, nous réalisons de façon expérimentale une liaison point à point WDM utilisant la transmission en duplex intégral d’un signal wifi analogique ainsi qu’un signal descendant au format OOK. En introduisant deux mélangeurs RF dans la liaison montante, nous avons résolu le problème d’incompatibilité avec le système sans fil basé sur le TDD (multiplexage en temps duplexé).
Resumo:
In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus, a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system, that utilizes workload-aware data placement and replication to minimize the number of distributed transactions that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data, and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud. The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks whose computation and execution models limit the user program to directly access the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading it onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over largescale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.