5 results for cloud-based applications
in DRUM (Digital Repository at the University of Maryland)
Abstract:
In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully.
I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions and incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has traditionally been used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud.
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the portions of the graph that are relevant to an analysis task and loading them into distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
Abstract:
Graphene has emerged as an extraordinary material with an array of remarkable electronic, mechanical and chemical properties. An extra-large surface-to-volume ratio gives graphene a highly flexible morphology, giving rise to intriguing observations such as ripples, wrinkles and folds as well as the potential to transform into other novel carbon nanostructures. Ultra-thin, mechanically tough, electrically conductive graphene films promise to enable a wealth of possible applications ranging from hydrogen storage scaffolds and electronic transistors to bottom-up material designs. Enthusiasm for graphene-based applications aside, there are still significant challenges to their realization, largely due to the difficulty of precisely controlling the graphene properties. Controlling the graphene morphology over large areas is crucial in enabling future graphene-based applications and material design. This dissertation aims to shed light on potential mechanisms to actively manipulate the graphene morphology and properties, and thereby enable material design principles that deliver desirable mechanical and electronic functionalities of graphene and its derivatives.
Abstract:
The surge of interest in graphene, as epitomized by the Nobel Prize in Physics in 2010, is attributed to its extraordinary properties. Graphene is ultrathin, mechanically tough, and has amenable surface chemistry. These features make graphene and graphene-based nanostructures ideal candidates for molecular mass manipulation. Controllable and programmable molecular mass manipulation is crucial in enabling future graphene-based applications, but is challenging to achieve. This dissertation studies several aspects of molecular mass manipulation, including mass transportation, patterning and storage. For molecular mass transportation, two methods based on carbon nanoscrolls are demonstrated to be effective: transportation assisted by torsional buckling instability, and radial shrinkage induced by surface energy. To achieve more controllable transportation, a fundamental law of directional transport of molecular mass by straining basal graphene is studied. For molecular mass patterning, we reveal a barrier effect of line defects in graphene, which can enable molecular confinement and patterning in a domain of desirable geometry. Such a strategy makes controllable patterning feasible for various types of molecules. For molecular mass storage, we propose a novel partially hydrogenated bilayer graphene structure that has a large capacity for mass uptake. The mass can be released simply by stretching the structure, so mass uptake and release are reversible. This kind of structure is crucial in enabling hydrogen-fuel-based technology. Lastly, spontaneous nanofluidic channel formation enabled by patterned hydrogenation is studied. This novel strategy enables programmable channel formation with pre-defined complex geometry.
Abstract:
With the proliferation of new mobile devices and applications, the demand for ubiquitous wireless services has increased dramatically in recent years. The explosive growth in wireless traffic requires wireless networks to be scalable so that they can be efficiently extended to meet the wireless communication demands. In a wireless network, the interference power typically grows with the number of devices in the absence of the necessary coordination among them. On the other hand, large-scale coordination is always difficult due to the low-bandwidth and high-latency interfaces between access points (APs) in traditional wireless networks. To address this challenge, the cloud radio access network (C-RAN) has been proposed, where a pool of baseband units (BBUs) is connected to the distributed remote radio heads (RRHs) via high-bandwidth and low-latency links (i.e., the front-haul) and is responsible for all the baseband processing. But insufficient front-haul link capacity may limit the scale of C-RAN and prevent it from fully utilizing the benefits made possible by the centralized baseband processing. As a result, the front-haul link capacity becomes a bottleneck in the scalability of C-RAN. In this dissertation, we explore scalable C-RAN in an effort to tackle this challenge. In the first aspect of this dissertation, we investigate the scalability issues in existing wireless networks and propose a novel time-reversal (TR) based scalable wireless network in which the interference power is naturally mitigated by the focusing effects of TR communications without coordination among APs or terminal devices (TDs). Owing to this feature, it is shown that the system can be easily extended to serve more TDs.
Motivated by the favorable properties of TR communications in providing scalable wireless networking solutions, in the second aspect of this dissertation, we apply TR-based communications to the C-RAN and discover the TR tunneling effects, which alleviate the traffic load in the front-haul links caused by the increase in the number of TDs. We further design waveforming schemes to optimize the downlink and uplink transmissions in the TR-based C-RAN, which are shown to improve the downlink and uplink transmission accuracies. Consequently, the traffic load in the front-haul links is further alleviated by reducing the re-transmissions caused by transmission errors. Moreover, inspired by the TR-based C-RAN, we propose a compressive quantization scheme that applies to the uplink of multi-antenna C-RAN so that more antennas can be utilized with the limited front-haul capacity, which provides rich spatial diversity so that a massive number of TDs can be served more efficiently.
Abstract:
Executing a cloud or aerosol physical properties retrieval algorithm on controlled synthetic data is an important step in retrieval algorithm development. Synthetic data can help answer questions about the sensitivity and performance of the algorithm or aid in determining how an existing retrieval algorithm may perform with a planned sensor. Synthetic data can also help in solving issues that may have surfaced in the retrieval results. Synthetic data become very important when other validation methods, such as field campaigns, are of limited scope. These tend to be of relatively short duration and often are costly. Ground stations have limited spatial coverage, while synthetic data can cover large spatial and temporal scales and a wide variety of conditions at a low cost. In this work I develop an advanced cloud and aerosol retrieval simulator for the MODIS instrument, the Multi-sensor Cloud and Aerosol Retrieval Simulator (MCARS). In close collaboration with the modeling community I have seamlessly combined the GEOS-5 global climate model with the DISORT radiative transfer code, widely used by the remote sensing community, and with the observations from the MODIS instrument to create the simulator. With the MCARS simulator it was then possible to solve the long-standing issue of MODIS aerosol optical depth retrievals having a low bias for smoke aerosols: the MODIS aerosol retrieval did not account for the effects of humidity on smoke aerosols. The MCARS simulator also revealed an issue that had not been recognized previously, namely, that the value of fine mode fraction could create a linear dependence between retrieved aerosol optical depth and land surface reflectance. MCARS provided the ability to examine aerosol retrievals against “ground truth” for hundreds of thousands of simultaneous samples for an area covered by only three AERONET ground stations.
Findings from MCARS are already being used to improve the performance of operational MODIS aerosol properties retrieval algorithms. The modeling community will use the MCARS data to create new parameterizations for aerosol properties as a function of properties of the atmospheric column and gain the ability to correct any assimilated retrieval data that may display similar dependencies in comparisons with ground measurements.