860 results for Distributed database systems


Relevance:

40.00%

Publisher:

Abstract:

An approach to building distributed decision support systems (DSS) is proposed. A framework for a distributed DSS is defined, and questions of problem formulation and problem solving using artificial intelligent agents in the system core are examined.

Relevance:

40.00%

Publisher:

Abstract:

We present a complex neural network model of user behavior in distributed systems. The model reflects both dynamical and statistical features of user behavior and consists of three components: an on-line model, an off-line model, and a change detection module. The on-line model reflects dynamical features by predicting user actions on the basis of previous ones. The off-line model is based on the analysis of statistical parameters of user behavior. In both cases neural networks are used to reveal uncharacteristic user activity. The change detection module is intended for trend analysis of user behavior. The efficiency of the complex model is verified on real data from users of the Space Research Institute of NASU-NSAU.

Relevance:

40.00%

Publisher:

Abstract:

Advances in the area of industrial metrology have generated new technologies that are capable of measuring components with complex geometry and large dimensions. However, no standard or best-practice guides are available for the majority of such systems. Therefore, these new systems require appropriate testing and verification in order for the users to understand their full potential prior to their deployment in a real manufacturing environment. This is a crucial stage, especially when more than one system can be used for a specific measurement task. In this paper, two relatively new large-volume measurement systems, the mobile spatial co-ordinate measuring system (MScMS) and the indoor global positioning system (iGPS), are reviewed. These two systems utilize different technologies: the MScMS is based on ultrasound and radiofrequency signal transmission and the iGPS uses laser technology. Both systems have components with small dimensions that are distributed around the measuring area to form a network of sensors allowing rapid dimensional measurements to be performed in relation to large-size objects, with typical dimensions of several decametres. The portability, reconfigurability, and ease of installation make these systems attractive for many industries that manufacture large-scale products. In this paper, the major technical aspects of the two systems are briefly described and compared. Initial results of the tests performed to establish the repeatability and reproducibility of these systems are also presented. © IMechE 2009.

Relevance:

40.00%

Publisher:

Abstract:

The Semantic Binary Data Model (SBM) is a viable alternative to the now-dominant relational data model. SBM would be especially advantageous for applications dealing with complex interrelated networks of objects provided that a robust efficient implementation can be achieved. This dissertation presents an implementation design method for SBM, algorithms, and their analytical and empirical evaluation. Our method allows building a robust and flexible database engine with a wider applicability range and improved performance.

Extensions to SBM are introduced and an implementation of these extensions is proposed that allows the database engine to efficiently support applications with a predefined set of queries. A New Record data structure is proposed. Trade-offs of employing Fact, Record and Bitmap Data structures for storing information in a semantic database are analyzed.

A clustering ID distribution algorithm and an efficient algorithm for object ID encoding are proposed. Mapping to an XML data model is analyzed and a new XML-based XSDL language facilitating interoperability of the system is defined. Solutions to issues associated with making the database engine multi-platform are presented. An improvement to the atomic update algorithm suitable for certain scenarios of database recovery is proposed.

Specific guidelines are devised for implementing a robust and well-performing database engine based on the extended Semantic Data Model.

Relevance:

40.00%

Publisher:

Abstract:

An electronic database support system for strategic planning activities can be built by providing conceptual and system specific information. The design and development of this type of system center around the information needs of strategy planners. Data that supply information on the organization's internal and external environments must be originated, evaluated, collected, organized, managed, and analyzed. Strategy planners may use the resulting information to improve their decision making.

Relevance:

40.00%

Publisher:

Abstract:

Many systems and applications continuously produce events. These events record the status of a system and trace its behavior. By examining them, system administrators can check for potential problems. If the temporal dynamics of the systems are further investigated, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict future system behavior or to mitigate potential risks. Moreover, system administrators can use the temporal patterns to set up event management rules that make the system more intelligent. With the popularity of data mining techniques in recent years, these events have gradually become more and more useful. Despite recent advances in data mining techniques, their application to system event mining is still at a rudimentary stage. Most works still focus on episode mining or frequent pattern discovery. These methods are unable to provide a brief yet comprehensible summary that reveals the valuable information from a high-level perspective. Moreover, they provide little actionable knowledge to help system administrators better manage the systems. To make better use of the recorded events, more practical techniques are required. From the perspective of data mining, three correlated directions are considered helpful for system management: (1) provide concise yet comprehensive summaries of the running status of the systems; (2) make the systems more intelligent and autonomous; (3) effectively detect abnormal system behavior. Owing to the richness of the event logs, all these directions can be addressed in a data-driven manner. In this way, the robustness of the systems can be enhanced and the goal of autonomous management can be approached.
This dissertation mainly focuses on the foregoing directions, which leverage temporal mining techniques to facilitate system management. More specifically, three concrete topics will be discussed, including event, resource demand prediction, and streaming anomaly detection. Besides the theoretical contributions, an experimental evaluation is presented to demonstrate the effectiveness and efficacy of the corresponding solutions.
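
As a rough illustration of the streaming anomaly detection topic named in this abstract, the sketch below flags values that deviate strongly from a running mean. It is not the dissertation's method: the class name `StreamingZScoreDetector`, the use of Welford's online mean and variance update, the warm-up length, and the threshold of three standard deviations are all assumptions chosen for illustration.

```python
import math


class StreamingZScoreDetector:
    """Hypothetical detector: flags values far from the running mean."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # assumed cut-off, in standard deviations
        self.warmup = warmup        # observations required before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Return True if value looks anomalous given history, then absorb it."""
        anomalous = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                anomalous = True
        # Welford's online update of mean and squared deviations.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return anomalous


def detect(values, threshold=3.0, warmup=5):
    """Return the indices of values flagged as anomalous in a single pass."""
    detector = StreamingZScoreDetector(threshold, warmup)
    return [i for i, v in enumerate(values) if detector.update(v)]
```

Each value is processed once and only three numbers of state are kept, which is what makes the approach suitable for a stream. Note that a flagged value is still absorbed into the running statistics, a simplification that a production detector might not want.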

Relevance:

40.00%

Publisher:

Abstract:

The future power grid will effectively utilize renewable energy resources and distributed generation to respond to energy demand while incorporating information technology and communication infrastructure for their optimum operation. This dissertation contributes to the development of real-time techniques for wide-area monitoring and for secure real-time control and operation of hybrid power systems.

To handle the increased level of real-time data exchange, this dissertation develops a supervisory control and data acquisition (SCADA) system that is equipped with a state estimation scheme based on the real-time data. This system is verified on a specially developed laboratory-based test bed facility, a hardware and software platform that emulates the actual scenarios of a real hybrid power system with the highest level of similarities and capabilities to practical utility systems. It includes phasor measurements at hundreds of measurement points on the system. These measurements were obtained from a specially developed laboratory-based Phasor Measurement Unit (PMU) that is used in conjunction with the interconnected system and alongside existing commercial PMUs. The tested studies included a new technique for detecting partially islanded microgrids, in addition to several real-time techniques for synchronization and parameter identification of hybrid systems.

Moreover, because many renewable energy resources are integrated through DC microgrids, this dissertation works through several practical cases to improve the interoperability of such systems. In addition, the increased number of small, dispersed generating stations and their need to connect quickly and properly to AC grids motivated this work to explore the challenges that arise in synchronizing generators to the grid and to introduce a Dynamic Brake system that improves the process of connecting distributed generators to the power grid.

Real-time operation and control requires data communication security. One research effort in this dissertation develops a Trusted Sensing Base (TSB) process for data communication security. The innovative TSB approach improves the security aspect of the power grid as a cyber-physical system. It is based on available GPS synchronization technology and provides protection against confidentiality attacks in critical power system infrastructures.

Relevance:

40.00%

Publisher:

Abstract:

Distributed Generation (DG) from alternate sources and smart grid technologies represent good solutions for the increase in energy demands. Employment of these DG assets requires solutions for the new technical challenges that accompany their integration and interconnection into operational power systems. A DG infrastructure comprising alternate energy sources in addition to conventional sources is developed as a test bed. The test bed is operated by synchronizing wind, photovoltaic, fuel cell, micro generator and energy storage assets, in addition to standard AC generators. Connectivity of these DG assets is tested for viability and for their operational characteristics. The control and communication layers for dynamic operations are developed to improve the connectivity of alternates to the power system. A real-time application for the operation of alternate sources in microgrids is developed. A multi-agent approach is utilized to improve stability, and sequences of actions for black start are implemented. Experiments on control and stability issues related to dynamic operation under load conditions have been conducted and verified.

Relevance:

40.00%

Publisher:

Abstract:

Distributed computing frameworks belong to a class of programming models that allow developers to launch workloads on large clusters of machines. Due to the dramatic increase in the volume of data gathered by ubiquitous computing devices, data analytic workloads have become a common case among distributed computing applications, making Data Science an entire field of Computer Science. We argue that a data scientist's concerns lie in three main components: a dataset, a sequence of operations they wish to apply to this dataset, and some constraints they may have related to their work (performance, QoS, budget, etc.). However, it is actually extremely difficult to perform data science without domain expertise. One needs to select the right amount and type of resources, pick a framework, and configure it. Also, users often run their applications in shared environments, ruled by schedulers that expect them to specify their resource needs precisely. Because of the distributed and concurrent nature of these frameworks, monitoring and profiling are hard, high-dimensional problems that prevent users from making the right configuration choices and determining the right amount of resources they need. Paradoxically, the system gathers a large amount of monitoring data at runtime, which remains unused.

In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit monitoring data to learn about workloads, and able to process user requests into a tailored execution context. In this work, we study different techniques that have been used to make steps toward such system awareness, and explore a new way to do so by implementing machine learning techniques to recommend a specific subset of system configurations for Apache Spark applications. Furthermore, we present an in-depth study of Apache Spark executor configuration, which highlights the complexity of choosing the best one for a given workload.
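
To give a sense of why executor configuration is a search problem, the sketch below enumerates candidate executor layouts for a homogeneous cluster. It is only an illustration under stated assumptions, not the method studied in this work: the function name `executor_layouts`, the reserved-resource defaults, and the even split of memory are hypothetical, and executor memory overhead is ignored. The keys `spark.executor.cores`, `spark.executor.memory` and `spark.executor.instances` are standard Spark configuration properties.

```python
def executor_layouts(node_cores, node_memory_gb, num_nodes,
                     reserved_cores=1, reserved_memory_gb=1):
    """List candidate executor layouts for identical worker nodes.

    Assumption: one core and 1 GB per node are set aside for the OS and
    daemons, and memory is split evenly among the executors on a node.
    """
    usable_cores = node_cores - reserved_cores
    usable_memory = node_memory_gb - reserved_memory_gb
    layouts = []
    for cores_per_executor in range(1, usable_cores + 1):
        executors_per_node = usable_cores // cores_per_executor
        memory_per_executor = usable_memory // executors_per_node
        if memory_per_executor < 1:
            continue  # not enough memory to give each executor 1 GB
        layouts.append({
            "spark.executor.cores": cores_per_executor,
            "spark.executor.memory": f"{memory_per_executor}g",
            "spark.executor.instances": executors_per_node * num_nodes,
        })
    return layouts


if __name__ == "__main__":
    for layout in executor_layouts(node_cores=16, node_memory_gb=64, num_nodes=4):
        print(layout)
```

Even this small enumeration yields many layouts, ranging from many thin executors to a few fat ones, and which of them performs best depends on the workload. That dependence is the difficulty the abstract points to.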

Relevance:

40.00%

Publisher:

Abstract:

As the world population continues to grow past seven billion people and global challenges continue to persist, including resource availability, biodiversity loss, climate change and human well-being, a new science is required that can address the integrated nature of these challenges and the multiple scales on which they are manifest. Sustainability science has emerged to fill this role. In the fifteen years since it was first called for in the pages of Science, it has rapidly matured; however, its place in the history of science and the way it is practiced today must be continually evaluated. In Part I, two chapters address this theoretical and practical grounding. Part II transitions to the applied practice of sustainability science in addressing the urban heat island (UHI) challenge, wherein the climate of urban areas is warmer than that of their surrounding rural environs. The UHI has become increasingly important within the study of earth sciences given the increased focus on climate change and because the majority of humans now live in urban areas.

In Chapter 2, a novel contribution to the historical context of sustainability is argued. Sustainability as a concept characterizing the relationship between humans and nature emerged in the mid to late 20th century as a response to findings that are also used to characterize the Anthropocene. Evidence is provided that sustainability, emerging from the human-nature relationships that came before it, was enabled by technology and a reorientation of world-view, and that it is unique in its global boundary, systematic approach and ambition for both well-being and the continued availability of resources and Earth system function. Sustainability is further an ambition that has wide appeal, making it one of the first normative concepts of the Anthropocene.

Despite its widespread emergence and adoption, sustainability science continues to suffer from definitional ambiguity within the academe. In Chapter 3, a review of efforts to provide direction and structure to the science reveals a continuum of approaches anchored at either end by differing visions of how the science interfaces with practice (solutions). At one end, basic science of societally defined problems informs decisions about possible solutions and their application. At the other end, applied research directly affects the options available to decision makers. While this dichotomy is clear from the literature, survey data further suggest that it is not as apparent in the minds of practitioners.

In Chapter 4, the UHI is first addressed at the synoptic, mesoscale. Urban climate is the most immediate manifestation of the warming global climate for the majority of people on earth. Nearly half of those people live in small to medium sized cities, an understudied scale in urban climate research. Widespread characterization would be useful to decision makers in planning and design. Using a multi-method approach, the mesoscale UHI in the study region is characterized and the secular trend over the last sixty years evaluated. Under isolated ideal conditions the findings indicate a UHI of 5.3 ± 0.97 °C to be present in the study area, the magnitude of which is growing over time.

Although urban heat islands (UHI) are well studied, there remain no panaceas for local scale mitigation and adaptation methods; continued attention to characterization of the phenomenon in urban centers of different scales around the globe is therefore required. In Chapter 5, a local scale analysis of the canopy layer and surface UHI in a medium sized city in North Carolina, USA is conducted using multiple methods including stationary urban sensors, mobile transects and remote sensing. Focusing on the ideal conditions for UHI development during an anticyclonic summer heat event, the study observes a range of UHI intensity depending on the method of observation: 8.7 °C from the stationary urban sensors; 6.9 °C from mobile transects; and 2.2 °C from remote sensing. Additional attention is paid to the diurnal dynamics of the UHI and its correlation with vegetation indices, dewpoint and albedo. Evapotranspiration is shown to drive dynamics in the study region.

Finally, recognizing that a bridge must be established between the physical science community studying the Urban Heat Island (UHI) effect and the planning community and decision makers implementing urban form and development policies, Chapter 6 evaluates multiple urban form characterization methods. Methods evaluated include local climate zones (LCZ), National Land Cover Database (NLCD) classes and urban cluster analysis (UCA), to determine their utility in describing the distribution of the UHI based on three standard observation types: 1) fixed urban temperature sensors, 2) mobile transects and 3) remote sensing. Bivariate, regression and ANOVA tests are used to conduct the analyses. Findings indicate that the NLCD classes are best correlated to the UHI intensity and distribution in the study area. Further, while the UCA method is not useful directly, the variables included in the method are predictive based on regression analysis, so the potential for better model design exists. Land cover variables including albedo, impervious surface fraction and pervious surface fraction are found to dominate the distribution of the UHI in the study area regardless of observation method.

Chapter 7 provides a summary of findings and offers a brief analysis of their implications for both the scientific discourse generally and the study area specifically. In general, the work undertaken does not achieve the full ambition of sustainability science; additional work is required to translate findings to practice and more fully evaluate adoption. The implications for planning and development in the local region are addressed in the context of a major light-rail infrastructure project, including several systems-level considerations such as human health and development. Finally, several avenues for future work are outlined. Within the theoretical development of sustainability science, these pathways include more robust evaluations of the theoretical and actual practice. Within the UHI context, these include development of an integrated urban form characterization model, application of the study methodology in other geographic areas and at different scales, and use of novel experimental methods including distributed sensor networks and citizen science.

Relevance:

40.00%

Publisher:

Abstract:

Intelligent Tutoring Systems (ITSs) are computerized systems for learning-by-doing. These systems provide students with immediate and customized feedback on learning tasks. An ITS typically consists of several modules that are connected to each other. This research focuses on the distribution of the ITS module that provides expert knowledge services. For the distribution of such an expert knowledge module we need to use an architectural style because this gives a standard interface, which increases the reusability and operability of the expert knowledge module. To provide expert knowledge modules in a distributed way we need to answer the research question: ‘How can we compare and evaluate REST, Web services and Plug-in architectural styles for the distribution of the expert knowledge module in an intelligent tutoring system?’. We present an assessment method for selecting an architectural style. Using the assessment method on three architectural styles, we selected the REST architectural style as the style that best supports the distribution of expert knowledge modules. With this assessment method we also analyzed the trade-offs that come with selecting REST. We present a prototype and architectural views based on REST to demonstrate that the assessment method correctly scores REST as an appropriate architectural style for the distribution of expert knowledge modules.

Relevance:

40.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

40.00%

Publisher:

Abstract:

A method is outlined for optimising graph partitions which arise in mapping unstructured mesh calculations to parallel computers. The method employs a relative gain iterative technique to both evenly balance the workload and minimise the number and volume of interprocessor communications. A parallel graph reduction technique is also briefly described and can be used to give a global perspective to the optimisation. The algorithms work efficiently in parallel as well as sequentially and when combined with a fast direct partitioning technique (such as the Greedy algorithm) to give an initial partition, the resulting two-stage process proves itself to be both a powerful and flexible solution to the static graph-partitioning problem. Experiments indicate that the resulting parallel code can provide high quality partitions, independent of the initial partition, within a few seconds. The algorithms can also be used for dynamic load-balancing, reusing existing partitions and in this case the procedures are much faster than static techniques, provide partitions of similar or higher quality and, in comparison, involve the migration of a fraction of the data.
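
The two-stage idea described in this abstract (a fast initial partition followed by iterative optimisation) can be sketched for the simplest case of splitting a graph into two parts. This is a simplified stand-in and not the paper's algorithm: the greedy stage here is a plain breadth-first growth, and the refinement stage uses Kernighan-Lin-style pairwise swaps rather than the relative gain technique. All function names are hypothetical, and vertices are assumed to be integers.

```python
from collections import deque


def greedy_bisection(adj):
    """Stage 1: grow part 0 breadth-first until it holds half the vertices."""
    vertices = sorted(adj)
    target = len(vertices) // 2
    part = {v: 1 for v in vertices}
    seen, grown = set(), 0
    for start in vertices:  # restarting handles disconnected graphs
        if grown == target:
            break
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        while queue and grown < target:
            v = queue.popleft()
            part[v] = 0
            grown += 1
            for u in sorted(adj[v]):
                if u not in seen:
                    seen.add(u)
                    queue.append(u)
    return part


def cut_size(adj, part):
    """Number of edges whose endpoints lie in different parts."""
    return sum(1 for v in adj for u in adj[v] if v < u and part[v] != part[u])


def gain(adj, part, v):
    """External minus internal edge count for vertex v."""
    external = sum(1 for u in adj[v] if part[u] != part[v])
    return external - (len(adj[v]) - external)


def refine(adj, part):
    """Stage 2: swap the best pair across the cut while it reduces the cut."""
    part = dict(part)
    while True:
        best, best_gain = None, 0
        for a in (v for v in adj if part[v] == 0):
            for b in (v for v in adj if part[v] == 1):
                # Swapping a and b reduces the cut by this amount.
                g = gain(adj, part, a) + gain(adj, part, b) - 2 * (b in adj[a])
                if g > best_gain:
                    best, best_gain = (a, b), g
        if best is None:
            return part
        a, b = best
        part[a], part[b] = 1, 0  # a swap keeps the two parts balanced
```

Because every swap strictly reduces the cut, the refinement terminates. The exhaustive pair search is quadratic per pass, which is acceptable for a sketch but not for the large meshes the paper targets.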

Relevance:

40.00%

Publisher:

Abstract:

In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud, such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. 
I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has traditionally been used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud. 
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models restrict the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the portions of the graph that are of interest to an analysis task and loading them onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
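
The neighborhood-centric idea can be illustrated on a single machine: extract the k-hop neighborhood of each vertex of interest, then run the analysis on that subgraph as a whole. The sketch below is not NSCALE's API or implementation; every function name is hypothetical, the graph is assumed to be an undirected adjacency mapping with integer vertices, and triangle counting stands in for a motif-counting task.

```python
from collections import deque


def k_hop_neighborhood(adj, center, k):
    """Return the set of vertices within k hops of center (breadth-first)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # do not expand beyond k hops
        for u in adj.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return set(dist)


def induced_subgraph(adj, vertices):
    """Keep only the edges whose endpoints both lie in vertices."""
    return {v: {u for u in adj.get(v, ()) if u in vertices} for v in vertices}


def count_triangles(sub):
    """Count each triangle once by visiting its vertices in increasing order."""
    return sum(
        1
        for v in sub
        for u in sub[v] if v < u
        for w in sub[u] if u < w and w in sub[v]
    )


def for_each_neighborhood(adj, k, analysis, select=lambda v: True):
    """Run analysis on the k-hop subgraph around every selected vertex."""
    return {
        v: analysis(induced_subgraph(adj, k_hop_neighborhood(adj, v, k)))
        for v in adj
        if select(v)
    }
```

The `analysis` callback receives a whole subgraph rather than the state of one vertex, which is the contrast with vertex-centric frameworks drawn in the abstract. The `select` predicate plays the role of specifying which subgraphs are of interest.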