988 results for parallel implementation
Abstract:
In the field of Transition P systems implementation, it has been established that knowing in advance how long the application of evolution rules in membranes takes is very important. Moreover, having time estimates for rule application in membranes makes it possible to take important decisions about hardware/software architecture design. The work presented here introduces an algorithm for applying active evolution rules in Transition P systems, based on the elimination of active rules. The algorithm meets the requirements of being nondeterministic and massively parallel and, more importantly, it is time-bounded because its execution time depends only on the number of evolution rules in the membrane.
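The idea of rule elimination can be illustrated with a short Python sketch. It is only one possible reading of the abstract, not the authors' published algorithm: multisets are modelled as `collections.Counter` objects, each rule is reduced to its antecedent (the objects it consumes), and the names `max_applications` and `apply_rules_by_elimination` are hypothetical. In each pass, every rule still active is applied a random number of times and the last active rule is then applied as often as possible, which eliminates it. The loop therefore runs exactly once per rule.

```python
import random
from collections import Counter


def max_applications(antecedent, multiset):
    """Largest k such that k copies of the antecedent fit in the multiset."""
    return min(multiset[obj] // need for obj, need in antecedent.items())


def consume(antecedent, multiset, times):
    """Remove `times` copies of the antecedent from the multiset, in place."""
    for obj, need in antecedent.items():
        multiset[obj] -= times * need


def apply_rules_by_elimination(antecedents, multiset, rng=random):
    """Hypothetical sketch: return (applications per rule, leftover multiset).

    Assumes every antecedent is a non-empty dict of positive multiplicities.
    """
    multiset = Counter(multiset)
    counts = [0] * len(antecedents)
    active = list(range(len(antecedents)))
    while active:
        last = active.pop()  # rule eliminated in this pass
        for i in active:
            # Nondeterministic choice between 0 and the current maximum.
            k = rng.randint(0, max_applications(antecedents[i], multiset))
            consume(antecedents[i], multiset, k)
            counts[i] += k
        # Applying the last rule maximally makes it inapplicable from now on,
        # because the multiset only shrinks in later passes.
        k = max_applications(antecedents[last], multiset)
        consume(antecedents[last], multiset, k)
        counts[last] += k
    return counts, multiset


rules = [{"a": 1}, {"a": 1, "b": 1}, {"b": 2}]
initial = {"a": 5, "b": 4}
counts, remaining = apply_rules_by_elimination(rules, initial)
```

Because each rule is eliminated after a maximal application, no rule is applicable to the leftover multiset, so the combined application is maximal while the per-rule counts remain random.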
Abstract:
This paper focuses on a parallel Java implementation of a processor defined in a Network of Evolutionary Processors. The processor description is based on JDOM, which provides a complete, Java-based solution for accessing, manipulating, and outputting XML data from Java code. Communication among different processors, needed to obtain a fully functional simulation of a Network of Evolutionary Processors, will be addressed in future work. A thread-safe model of processors performs all parallel operations, such as rules and filters. Non-deterministic behavior of processors is achieved with one thread for each rule and for each filter (input and output). Different results of a processor evolution are shown.
Abstract:
P systems, or Membrane Computing systems, are a type of distributed, massively parallel and non-deterministic system based on biological membranes. They are inspired by the way cells process chemical compounds, energy and information. These systems perform a computation through transitions between consecutive configurations. As is well known in membrane computing, a configuration consists of an m-tuple of multisets present at a given moment in the m regions of the system existing at that moment. Transitions between two configurations are performed by applying the evolution rules of each region of the system in a non-deterministic, maximally parallel manner. This work is part of an extensive line of research whose final objective is to implement a hardware system that evolves as a transition P system does. To achieve this objective, the generic system has been divided into several stages, each addressing specific issues. This paper develops the stage in charge of applying the active rules. Different algorithms exist for counting the number of times the active rules are applied. Here, an algorithm with improved characteristics is presented: the number of iterations needed to reach the final values is smaller than when each rule is applied step by step. Hence, the whole process requires fewer steps and therefore finishes in a shorter time.
Abstract:
ACM Computing Classification System (1998): D.2.11, D.1.3, D.3.1, J.3, C.2.4.
Abstract:
We describe a parallel multi-threaded approach for high-performance modelling of a wide class of phenomena in ultrafast nonlinear optics. A specific implementation has been realized using the highly parallel capabilities of a programmable graphics processor. © 2011 SPIE.
Abstract:
Large read-only or read-write transactions with a large read set and a small write set constitute an important class of transactions used in applications such as data mining, data warehousing, statistical applications, and report generators. Such transactions are best supported with optimistic concurrency, because locking large amounts of data for extended periods of time is not an acceptable solution. The abort rate in regular optimistic concurrency algorithms increases exponentially with the size of the transaction. The algorithm proposed in this dissertation solves this problem by using a new transaction scheduling technique that allows a large transaction to commit safely with a significantly greater probability, which can exceed that of regular optimistic concurrency algorithms by several orders of magnitude. A performance simulation study and a formal proof of serializability and external consistency of the proposed algorithm are also presented. This dissertation also proposes a new query optimization technique (lazy queries). Lazy Queries is an adaptive query execution scheme which optimizes itself as the query runs. Lazy queries can be used to find an intersection of sub-queries in a very efficient way, which requires neither full execution of large sub-queries nor any statistical knowledge about the data. An efficient optimistic concurrency control algorithm used in a massively parallel B-tree with variable-length keys is introduced. B-trees with variable-length keys can be effectively used in a variety of database types. In particular, we show how such a B-tree was used in our implementation of a semantic object-oriented DBMS. The concurrency control algorithm uses semantically safe optimistic virtual "locks" that achieve very fine granularity in conflict detection. This algorithm ensures serializability and external consistency by using logical clocks and backward validation of transactional queries. A formal proof of correctness of the proposed algorithm is also presented.
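Backward validation with a logical clock, as mentioned in the abstract, can be sketched in Python as follows. This is the generic textbook scheme, not the dissertation's algorithm: it is single-threaded, validates at whole-key granularity, and the class names `Store` and `Txn` are hypothetical. A transaction records what it reads and buffers its writes. At commit time it is checked against every transaction that committed after it started, and it aborts if any of them wrote a key it read.

```python
class Store:
    """Hypothetical in-memory store with a logical commit clock."""

    def __init__(self):
        self.data = {}
        self.clock = 0        # logical clock, advanced once per commit
        self.committed = []   # (commit timestamp, keys written)

    def begin(self):
        return Txn(self, self.clock)


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.reads = set()    # keys read from the store
        self.writes = {}      # buffered writes, invisible until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.reads.add(key)
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        """Backward validation: return True on commit, False on abort."""
        for commit_ts, written in self.store.committed:
            if commit_ts > self.start_ts and written & self.reads:
                return False  # a later commit overwrote something we read
        self.store.clock += 1
        self.store.data.update(self.writes)
        self.store.committed.append((self.store.clock, frozenset(self.writes)))
        return True


store = Store()
reader = store.begin()
reader.read("x")
writer = store.begin()
writer.write("x", 1)
writer_ok = writer.commit()   # commits: nothing conflicts with it
reader_ok = reader.commit()   # aborts: "x" changed after reader started
```

The sketch also shows why large read sets are penalized in the plain scheme: every key added to `reads` is one more chance for a conflicting commit to force an abort.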
Abstract:
Orthogonal Frequency-Division Multiplexing (OFDM) has proven to be a promising technology that enables transmission at higher data rates. Multicarrier Code-Division Multiple Access (MC-CDMA) is a transmission technique which combines the advantages of both OFDM and Code-Division Multiple Access (CDMA), so as to allow high transmission rates over severe time-dispersive multi-path channels without the need for a complex receiver implementation. MC-CDMA also exploits frequency diversity via the different subcarriers, and therefore allows high-code-rate systems to achieve good Bit Error Rate (BER) performance. Furthermore, the spreading in the frequency domain makes the time synchronization requirement much lower than in traditional direct-sequence CDMA schemes. There are still some problems when MC-CDMA is used. One is the high Peak-to-Average Power Ratio (PAPR) of the transmit signal. High PAPR leads to nonlinear distortion in the amplifier and results in inter-carrier self-interference plus out-of-band radiation. Suppressing the Multiple Access Interference (MAI) is another crucial problem in MC-CDMA systems. Imperfect cross-correlation characteristics of the spreading codes and multipath fading destroy the orthogonality among the users and thus cause MAI, which produces serious BER degradation in the system. Moreover, in the uplink the signals received at a base station are always asynchronous. This also destroys the orthogonality among the users and hence generates MAI, which degrades the system performance. Besides those two problems, interference should always be considered seriously in any communication system. In this dissertation, we design a novel MC-CDMA system which has low PAPR and mitigated MAI. New semi-blind channel estimation and multi-user data detection based on Parallel Interference Cancellation (PIC) have been applied in the system. Low-Density Parity-Check (LDPC) codes have also been introduced into the system to improve performance. Different interference models are analyzed in multi-carrier communication systems, and effective interference suppression for MC-CDMA systems is then employed in this dissertation. The experimental results indicate that our system not only significantly reduces the PAPR and MAI but also effectively suppresses outside interference with low complexity. Finally, we present a practical cognitive application of the proposed system on a software-defined radio platform.
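To make the PAPR problem concrete, the following Python sketch computes the PAPR of a single multicarrier symbol as the ratio of peak to mean instantaneous power after an inverse DFT. It is an illustration only, under assumptions not taken from the dissertation: no oversampling, no spreading codes, and the hypothetical helper names `idft` and `papr`.

```python
import cmath


def idft(symbols):
    """Inverse DFT: map subcarrier symbols to time-domain samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * cmath.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr(samples):
    """Peak-to-average power ratio (linear scale) of complex samples."""
    powers = [abs(x) ** 2 for x in samples]
    return max(powers) / (sum(powers) / len(powers))


# Worst case: all subcarriers carry the same symbol and add up coherently.
worst = papr(idft([1] * 8))
# Best case: a single active subcarrier gives a constant-envelope signal.
best = papr(idft([0, 1, 0, 0, 0, 0, 0, 0]))
```

With N identical subcarrier symbols the time-domain signal collapses into a single peak, so the PAPR equals N. This is why the PAPR of multicarrier signals grows with the number of subcarriers unless it is actively reduced.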
Abstract:
The main focus of this research is to design and develop a high-performance linear actuator based on a four-bar mechanism. The present work includes the detailed analysis (kinematics and dynamics), design, implementation and experimental validation of the newly designed actuator. High performance is characterized by the acceleration of the actuator end effector. The principle of the newly designed actuator is to network the four-bar rhombus configuration (where some bars are extended to form an X shape) to attain high acceleration. First, a detailed kinematic analysis of the actuator is presented and its kinematic performance is evaluated through MATLAB simulations. A dynamic equation of the actuator is obtained using the Lagrangian dynamic formulation. A SIMULINK control model of the actuator is developed using the dynamic equation. In addition, the Bond Graph methodology is presented for the dynamic simulation. The Bond Graph model comprises individual component models of the actuator along with its control. The required torque was simulated using the Bond Graph model. Results indicate that high acceleration (around 20g) can be achieved with modest (3 N-m or less) torque input. A practical prototype of the actuator was designed using SOLIDWORKS and then produced as a proof of concept. The design goal was to achieve a peak acceleration of more than 10g at the middle point of the travel length when the end effector travels the stroke length (around 1 m). The actuator is primarily designed to operate in standalone condition and later to be used in the 3RPR parallel robot. A DC motor is used to operate the actuator. A quadrature encoder is attached to the DC motor to control the end effector. The associated control scheme of the actuator is analyzed and integrated with the physical prototype. In standalone experiments with the actuator, the end effector achieved an acceleration of around 17g (stroke length was 0.2 m to 0.78 m). The results indicate that the predictions of the developed dynamic model are in good agreement with the experiments. Finally, a Design of Experiment (DOE) based statistical approach is introduced to identify the parametric combination that yields the greatest performance. Data are collected using the Bond Graph model. This approach is helpful in designing the actuator without much complexity.
Abstract:
Background: I conducted my research in the context of The National Literacy Strategy (DES, 2011), which maintains that every young person should be literate and outlines targets for improving literacy in schools from 2011 to 2020. There has been much debate on the teaching of literacy and in particular the teaching of reading. Clark (2014) outlines how learning to read should be a developmental language process and that the approaches in the early years of schooling will colour the children’s motivation and their perception of reading as a purposeful activity. The acquisition of literacy begins in the home, but this study focuses on the implementation of a literacy intervention, Station Teaching (ST), in the infant classes in primary school. Station Teaching occurs when a class is divided into four or five small groups of pupils who receive intensive tuition at four or five different Stations with the help of Support teachers: New Reading, Familiar Reading, Phonics, Writing and Oral Language. Research Questions: These research questions frame my study: How is Station Teaching implemented? What is the experience of the intervention Station Teaching from the participants’ point of view: teachers, pupils, parents? What notion of literacy is Station Teaching facilitating? Methods: I chose a pragmatic parallel mixed methods design as suggested by Mertens (2010). I collected and analysed both quantitative and qualitative data to answer the study’s research questions. The quantitative data were collected from a questionnaire issued to 21 schools in Ireland. I used Excel as a data management package and thematic analysis to analyse and present the data in themes. I collected qualitative data from a case study in a school. These data included observations of two classes over a period of a year; interviews with teachers, pupils and parents; children’s drawings, photographs, teachers’ diaries and video evidence. I analysed and presented the evidence from the qualitative data in themes. Main Findings: There are many skills and strategies that are essential to effective literacy teaching in the early years, including phonological awareness, phonics, vocabulary, fluency, comprehension and writing. These skills can be taught during Station Teaching. Early intervention in the early years is essential to pupils’ acquisition of literacy. The expertise of the teacher is key to improving the literacy achievement of pupils. Teachers and pupils enjoy participating in ST. Pupils are motivated to read and engage in meaningful activities during ST. Staff collaboration is vital for ST to succeed. ST facilitates small group work, and teachers can differentiate accordingly while including all pupils in the groups. Pupils’ learning is extended in ST, but extension activities need to be addressed in the Writing Station. More training should be provided for teachers on the implementation of ST, and more funding for resources should be available to schools. Significant contribution of the work: The main significance of the study includes insights into the classroom implementation of Station Teaching in infant classes and extensive research into the characteristics of an effective teacher of literacy.
Abstract:
We advocate the Loop-of-stencil-reduce pattern as a means of simplifying the implementation of data-parallel programs on heterogeneous multi-core platforms. Loop-of-stencil-reduce is general enough to subsume map, reduce, map-reduce, stencil, stencil-reduce, and, crucially, their usage in a loop in data-parallel applications, streaming applications, or a combination of both. The pattern makes it possible to deploy a single stencil computation kernel on different GPUs. We discuss the implementation of Loop-of-stencil-reduce in FastFlow, a framework for implementing applications based on parallel patterns. Experiments are presented to illustrate the use of Loop-of-stencil-reduce in developing data-parallel kernels running on heterogeneous systems.
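The shape of the pattern can be conveyed by a sequential Python sketch. It does not use or imitate the FastFlow API (FastFlow is a C++ framework) and runs on the CPU only. The function `loop_of_stencil_reduce` and its parameters are hypothetical names chosen for illustration. Each iteration applies a stencil to every element, reduces the old and new grids to one value, and lets a predicate on that value decide whether to loop again.

```python
def loop_of_stencil_reduce(grid, stencil, reduce_fn, keep_going, max_iters=1000):
    """Repeat stencil + reduce until keep_going(reduced value) is False."""
    iterations = 0
    while iterations < max_iters:
        # Stencil phase: each output element depends on a neighbourhood of
        # the input grid, so all elements could be computed in parallel.
        new_grid = [stencil(grid, i) for i in range(len(grid))]
        # Reduce phase: collapse the iteration's result into a single value.
        reduced = reduce_fn(grid, new_grid)
        grid = new_grid
        iterations += 1
        if not keep_going(reduced):
            break
    return grid, iterations


def average_neighbours(grid, i):
    """1-D Jacobi stencil with fixed boundary values."""
    if i == 0 or i == len(grid) - 1:
        return grid[i]
    return (grid[i - 1] + grid[i + 1]) / 2


def max_change(old, new):
    """Reduction: largest absolute change in this iteration."""
    return max(abs(a - b) for a, b in zip(old, new))


result, iterations = loop_of_stencil_reduce(
    [0.0, 0.0, 0.0, 0.0, 4.0],
    average_neighbours,
    max_change,
    keep_going=lambda change: change > 1e-6,
)
```

In this example the grid relaxes towards a straight line between the two fixed boundary values. Map, reduce and plain stencil are special cases: a stencil that reads only element `i` is a map, and a loop that stops after one iteration is a single stencil-reduce.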
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
In this paper, we develop a fast implementation of a hyperspectral coded aperture (HYCA) algorithm on different platforms using OpenCL, an open standard for parallel programming on heterogeneous systems. OpenCL covers a wide variety of devices, from dense multicore systems from major manufacturers such as Intel or ARM to newer accelerators such as graphics processing units (GPUs), field programmable gate arrays (FPGAs), the Intel Xeon Phi and other custom devices. Our proposed implementation of HYCA significantly reduces its computational cost. Our experiments have been conducted using simulated data and reveal considerable acceleration factors. Implementations of this kind, which use the same descriptive language on different architectures, are very important for realistically assessing the possibility of using heterogeneous platforms for efficient hyperspectral image processing in real remote sensing missions.
Abstract:
A large class of computational problems is characterised by frequent synchronisation and by computational requirements which change as a function of time. When such a problem is solved on a message passing multiprocessor machine [5], the combination of these characteristics leads to system performance which deteriorates over time. As the communication performance of parallel hardware steadily improves, load balance becomes a dominant factor in obtaining high parallel efficiency. Performance can be improved with periodic redistribution of the computational load; however, redistribution can sometimes be very costly. We study the issue of deciding when to invoke a global load re-balancing mechanism. Such a decision policy must actively weigh the costs of remapping against the performance benefits, and should be general enough to apply automatically to a wide range of computations. This paper discusses a generic strategy for Dynamic Load Balancing (DLB) in unstructured mesh computational mechanics applications. The strategy is intended to handle varying levels of load change throughout the run. The major issues involved in a generic dynamic load balancing scheme are investigated, together with techniques to automate the implementation of a dynamic load balancing mechanism within the Computer Aided Parallelisation Tools (CAPTools) environment, a semi-automatic tool for the parallelisation of mesh-based FORTRAN codes.
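The trade-off between remapping cost and performance benefit can be illustrated with a simple accumulate-and-compare heuristic in Python. This is not the strategy developed in the paper; it is an assumed, simplified policy in which the time lost per step is taken to be the gap between the most loaded processor and the average load, and a rebalance is triggered once the accumulated loss reaches the (assumed known) remap cost. The function name `rebalance_steps` is hypothetical, and the sketch replays a fixed load trace rather than modelling the effect of the rebalance itself.

```python
def rebalance_steps(per_step_loads, remap_cost):
    """Return the step indices at which a rebalance would be triggered.

    per_step_loads: one list of per-processor loads (in time units) per step.
    remap_cost: assumed cost of one global redistribution, in the same units.
    """
    lost = 0.0
    triggers = []
    for step, loads in enumerate(per_step_loads):
        # With frequent synchronisation every processor waits for the slowest
        # one, so the step wastes (max load - mean load) time units.
        lost += max(loads) - sum(loads) / len(loads)
        if lost >= remap_cost:
            triggers.append(step)
            lost = 0.0  # start accumulating again after the remap
    return triggers


# Three balanced steps followed by four badly imbalanced ones.
trace = [[1, 1, 1, 1]] * 3 + [[4, 0, 0, 0]] * 4
steps = rebalance_steps(trace, remap_cost=5)  # -> [4, 6]
```

A policy of this kind never remaps while the load stays balanced, and it remaps more often as the imbalance grows or the remap becomes cheaper, which is the qualitative behaviour the abstract asks of a decision policy.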