904 resultados para Parallel and Distributed Processing
Resumo:
In this paper we evaluate and compare two representativeand popular distributed processing engines for large scalebig data analytics, Spark and graph based engine GraphLab. Wedesign a benchmark suite including representative algorithmsand datasets to compare the performances of the computingengines, from performance aspects of running time, memory andCPU usage, network and I/O overhead. The benchmark suite istested on both local computer cluster and virtual machines oncloud. By varying the number of computers and memory weexamine the scalability of the computing engines with increasingcomputing resources (such as CPU and memory). We also runcross-evaluation of generic and graph based analytic algorithmsover graph processing and generic platforms to identify thepotential performance degradation if only one processing engineis available. It is observed that both computing engines showgood scalability with increase of computing resources. WhileGraphLab largely outperforms Spark for graph algorithms, ithas close running time performance as Spark for non-graphalgorithms. Additionally the running time with Spark for graphalgorithms over cloud virtual machines is observed to increaseby almost 100% compared to over local computer clusters.
Resumo:
Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different applications domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining & knowledge discovery and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue resulting in many different approaches. However, a single, generally accepted methodology in academia or industry has not emerged providing ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. ^ This thesis describes a heterogeneous database system being developed at High-performance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity. That is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i) providing a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing Semantic Binary Object-oriented Data Model (Sem-ODM) and Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii) a methodology for semantic heterogeneity resolution by investigating into the extents of the meta-data constructs of component schemas. This methodology is shown to be correct, complete and unambiguous; (iii) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context-mediation; (iv) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v) design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process. This knowledge base acts as the interface between integration and query processing modules; (vi) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii) a framework for intelligent computing and communication on the Internet applying the concepts of our work. ^
Resumo:
This research aims at a study of the hybrid flow shop problem which has parallel batch-processing machines in one stage and discrete-processing machines in other stages to process jobs of arbitrary sizes. The objective is to minimize the makespan for a set of jobs. The problem is denoted as: FF: batch1,sj:Cmax. The problem is formulated as a mixed-integer linear program. The commercial solver, AMPL/CPLEX, is used to solve problem instances to their optimality. Experimental results show that AMPL/CPLEX requires considerable time to find the optimal solution for even a small size problem, i.e., a 6-job instance requires 2 hours in average. A bottleneck-first-decomposition heuristic (BFD) is proposed in this study to overcome the computational (time) problem encountered while using the commercial solver. The proposed BFD heuristic is inspired by the shifting bottleneck heuristic. It decomposes the entire problem into three sub-problems, and schedules the sub-problems one by one. The proposed BFD heuristic consists of four major steps: formulating sub-problems, prioritizing sub-problems, solving sub-problems and re-scheduling. For solving the sub-problems, two heuristic algorithms are proposed; one for scheduling a hybrid flow shop with discrete processing machines, and the other for scheduling parallel batching machines (single stage). Both consider job arrival and delivery times. An experiment design is conducted to evaluate the effectiveness of the proposed BFD, which is further evaluated against a set of common heuristics including a randomized greedy heuristic and five dispatching rules. The results show that the proposed BFD heuristic outperforms all these algorithms. To evaluate the quality of the heuristic solution, a procedure is developed to calculate a lower bound of makespan for the problem under study. The lower bound obtained is tighter than other bounds developed for related problems in literature. A meta-search approach based on the Genetic Algorithm concept is developed to evaluate the significance of further improving the solution obtained from the proposed BFD heuristic. The experiment indicates that it reduces the makespan by 1.93 % in average within a negligible time when problem size is less than 50 jobs.
Resumo:
Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different applications domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining & knowledge discovery and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue resulting in many different approaches. However, a single, generally accepted methodology in academia or industry has not emerged providing ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. This thesis describes a heterogeneous database system being developed at Highperformance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity. That is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i.) providing a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing Semantic Binary Object-oriented Data Model (Sem-ODM) and Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii.) a methodology for semantic heterogeneity resolution by investigating into the extents of the meta-data constructs of component schemas. This methodology is shown to be correct, complete and unambiguous; (iii.) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context-mediation; (iv.) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v.) design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process. This knowledge base acts as the interface between integration and query processing modules; (vi.) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii.) a framework for intelligent computing and communication on the Internet applying the concepts of our work.
Resumo:
The astonishing development of diverse and different hardware platforms is twofold: on one side, the challenge for the exascale performance for big data processing and management; on the other side, the mobile and embedded devices for data collection and human machine interaction. This drove to a highly hierarchical evolution of programming models. GVirtuS is the general virtualization system developed in 2009 and firstly introduced in 2010 enabling a completely transparent layer among GPUs and VMs. This paper shows the latest achievements and developments of GVirtuS, now supporting CUDA 6.5, memory management and scheduling. Thanks to the new and improved remoting capabilities, GVirtus now enables GPU sharing among physical and virtual machines based on x86 and ARM CPUs on local workstations,computing clusters and distributed cloud appliances.
Distributed and compressed MIKEY mode to secure end-to-end communications in the Internet of things.
Resumo:
Multimedia Internet KEYing protocol (MIKEY) aims at establishing secure credentials between two communicating entities. However, existing MIKEY modes fail to meet the requirements of low-power and low-processing devices. To address this issue, we combine two previously proposed approaches to introduce a new distributed and compressed MIKEY mode for the Internet of Things. Indeed, relying on a cooperative approach, a set of third parties is used to discharge the constrained nodes from heavy computational operations. Doing so, the preshared mode is used in the constrained part of network, while the public key mode is used in the unconstrained part of the network. Furthermore, to mitigate the communication cost we introduce a new header compression scheme that reduces the size of MIKEY’s header from 12 Bytes to 3 Bytes in the best compression case. Preliminary results show that our proposed mode is energy preserving whereas its security properties are preserved untouched.
Resumo:
Performance and scalability of model transformations are becoming prominent topics in Model-Driven Engineering. In previous works we introduced LinTra, a platform for executing model transformations in parallel. LinTra is based on the Linda model of a coordination language and is intended to be used as a middleware where high-level model transformation languages are compiled. In this paper we present the initial results of our analyses on the scalability of out-place model-to-model transformation executions in LinTra when the models and the processing elements are distributed over a set of machines.
Resumo:
This paper presents the recent finding by Muhlhaus et al [1] that bifurcation of crack growth patterns exists for arrays of two-dimensional cracks. This bifurcation is a result of the nonlinear effect due to crack interaction, which is, in the present analysis, approximated by the dipole asymptotic or pseudo-traction method. The nonlinear parameter for the problem is the crack length/ spacing ratio lambda = a/h. For parallel and edge crack arrays under far field tension, uniform crack growth patterns (all cracks having same size) yield to nonuniform crack growth patterns (i.e. bifurcation) if lambda is larger than a critical value lambda(cr) (note that such bifurcation is not found for collinear crack arrays). For parallel and edge crack arrays respectively, the value of lambda(cr) decreases monotonically from (2/9)(1/2) and (2/15.096)(1/2) for arrays of 2 cracks, to (2/3)(1/2)/pi and (2/5.032)(1/2)/pi for infinite arrays of cracks. The critical parameter lambda(cr) is calculated numerically for arrays of up to 100 cracks, whilst discrete Fourier transform is used to obtain the exact solution of lambda(cr) for infinite crack arrays. For geomaterials, bifurcation can also occurs when array of sliding cracks are under compression.
Resumo:
The present study investigates human visual processing of simple two-colour patterns using a delayed match to sample paradigm with positron emission tomography (PET). This study is unique in that we specifically designed the visual stimuli to be the same for both pattern and colour recognition with all patterns being abstract shapes not easily verbally coded composed of two-colour combinations. We did this to explore those brain regions required for both colour and pattern processing and to separate those areas of activation required for one or the other. We found that both tasks activated similar occipital regions, the major difference being more extensive activation in pattern recognition. A right-sided network that involved the inferior parietal lobule, the head of the caudate nucleus, and the pulvinar nucleus of the thalamus was common to both paradigms. Pattern recognition also activated the left temporal pole and right lateral orbital gyrus, whereas colour recognition activated the left fusiform gyrus and several right frontal regions. (C) 2001 Wiley-Liss, Inc.
Resumo:
Thc oen itninteureasc ttioo nb eo fa pcreangtmraal tiiscs uaen di ns ymnotadcetlisc ocfo nsestnrtaeinnctse processing. It is well established that object relatives (1) are harder to process than subject relatives (2). Passivization, other things being equal, increases sentence complexity. However, one of the functions of the passive construction is to promote an NP into the role of subject so that it can be more easily bound to the head NP in a higher clause. Thus, (3) is predicted to be marginally preferred over (1). Passiviazation in this instance may be seen as a way of avoiding the object relative construction. 1. The pipe that the traveller smoked annoyed the passengers. 2. The traveller that smoked the pipe annoyed the passengers. 3.The pipe that was smoked by the traveller annoyed the 4.The traveller that the pipe was smoked by annoyed the 5.The traveller that the lady was assaulted by annoyed the In (4) we have relativization of an NP which has been demoted by passivization to the status of a by-phrase. Such relative clauses may only be obtained under quite restrictive pragmatic conditions. Many languages do not permit relativization of a constituent as low as a by-phrase on the NP accessibility hierarchy (Comrie, 1984). The factors which determine the acceptability of demoted NP relatives like (4-5) reflect the ease with which the NP promoted to subject position can be taken as a discourse topic. We explored the acceptability of sentences such as (1-5) using pair-wise judgements of samddifferent meaning, accompanied by ratings of easeof understanding. Results are discussed with reference to Gibsons DLT model of linguistic complexity and sentence processing (Gibson, 2000)
Resumo:
In competitive electricity markets with deep concerns at the efficiency level, demand response programs gain considerable significance. In the same way, distributed generation has gained increasing importance in the operation and planning of power systems. Grid operators and utilities are taking new initiatives, recognizing the value of demand response and of distributed generation for grid reliability and for the enhancement of organized spot market´s efficiency. Grid operators and utilities become able to act in both energy and reserve components of electricity markets. This paper proposes a methodology for a joint dispatch of demand response and distributed generation to provide energy and reserve by a virtual power player that operates a distribution network. The proposed method has been computationally implemented and its application is illustrated in this paper using a 32 bus distribution network with 32 medium voltage consumers.
Resumo:
Environmental management is a complex task. The amount and heterogeneity of the data needed for an environmental decision making tool is overwhelming without adequate database systems and innovative methodologies. As far as data management, data interaction and data processing is concerned we here propose the use of a Geographical Information System (GIS) whilst for the decision making we suggest a Multi-Agent System (MAS) architecture. With the adoption of a GIS we hope to provide a complementary coexistence between heterogeneous data sets, a correct data structure, a good storage capacity and a friendly user’s interface. By choosing a distributed architecture such as a Multi-Agent System, where each agent is a semi-autonomous Expert System with the necessary skills to cooperate with the others in order to solve a given task, we hope to ensure a dynamic problem decomposition and to achieve a better performance compared with standard monolithical architectures. Finally, and in view of the partial, imprecise, and ever changing character of information available for decision making, Belief Revision capabilities are added to the system. Our aim is to present and discuss an intelligent environmental management system capable of suggesting the more appropriate land-use actions based on the existing spatial and non-spatial constraints.
Resumo:
A distributed, agent-based intelligent system models and simulates a smart grid using physical players and computationally simulated agents. The proposed system can assess the impact of demand response programs.
Resumo:
Recent changes in the operation and planning of power systems have been motivated by the introduction of Distributed Generation (DG) and Demand Response (DR) in the competitive electricity markets' environment, with deep concerns at the efficiency level. In this context, grid operators, market operators, utilities and consumers must adopt strategies and methods to take full advantage of demand response and distributed generation. This requires that all the involved players consider all the market opportunities, as the case of energy and reserve components of electricity markets. The present paper proposes a methodology which considers the joint dispatch of demand response and distributed generation in the context of a distribution network operated by a virtual power player. The resources' participation can be performed in both energy and reserve contexts. This methodology contemplates the probability of actually using the reserve and the distribution network constraints. Its application is illustrated in this paper using a 32-bus distribution network with 66 DG units and 218 consumers classified into 6 types of consumers.
Resumo:
Hyperspectral imaging has become one of the main topics in remote sensing applications, which comprise hundreds of spectral bands at different (almost contiguous) wavelength channels over the same area generating large data volumes comprising several GBs per flight. This high spectral resolution can be used for object detection and for discriminate between different objects based on their spectral characteristics. One of the main problems involved in hyperspectral analysis is the presence of mixed pixels, which arise when the spacial resolution of the sensor is not able to separate spectrally distinct materials. Spectral unmixing is one of the most important task for hyperspectral data exploitation. However, the unmixing algorithms can be computationally very expensive, and even high power consuming, which compromises the use in applications under on-board constraints. In recent years, graphics processing units (GPUs) have evolved into highly parallel and programmable systems. Specifically, several hyperspectral imaging algorithms have shown to be able to benefit from this hardware taking advantage of the extremely high floating-point processing performance, compact size, huge memory bandwidth, and relatively low cost of these units, which make them appealing for onboard data processing. In this paper, we propose a parallel implementation of an augmented Lagragian based method for unsupervised hyperspectral linear unmixing on GPUs using CUDA. The method called simplex identification via split augmented Lagrangian (SISAL) aims to identify the endmembers of a scene, i.e., is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The efficient implementation of SISAL method presented in this work exploits the GPU architecture at low level, using shared memory and coalesced accesses to memory.