985 results for Parallel computation
Quick, Decentralized, Energy-Efficient One-Shot Max Function Computation Using Timer-Based Selection
Abstract:
In several wireless sensor networks, it is of interest to determine the maximum of the sensor readings and identify the sensor responsible for it. We propose a novel, decentralized, scalable, energy-efficient, timer-based, one-shot max function computation (TMC) algorithm. In it, the sensor nodes do not transmit their readings in a centrally pre-defined sequence. Instead, the nodes are grouped into clusters, and computation occurs over two contention stages. First, the nodes in each cluster contend with each other using the timer scheme to transmit their reading to their cluster-heads. Thereafter, the cluster-heads use the timer scheme to transmit the highest sensor reading in their cluster to the fusion node. One new challenge is that the use of the timer scheme leads to collisions, which can make the algorithm fail. We optimize the algorithm to minimize the average time required to determine the maximum subject to a constraint on the probability that it fails to find the maximum. TMC significantly lowers average function computation time, average number of transmissions, and average energy consumption compared to approaches proposed in the literature.
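The two contention stages can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the TMC algorithm as optimized in the paper: the linear reading-to-timer map, the parameters `t_max` and `delta` (the window within which two expiring timers are treated as a collision), and the function names are all hypothetical. Every cluster is assumed to be non-empty.

```python
def timer_select(readings, t_max=1.0, delta=0.01, lo=0.0, hi=1.0):
    """Return the index of the node whose timer expires first, or None on collision.

    Assumed mapping: a higher reading gives a shorter timer (linear in the reading).
    Assumed collision rule: the two earliest timers expire less than `delta` apart.
    """
    timers = [t_max * (hi - r) / (hi - lo) for r in readings]
    order = sorted(range(len(readings)), key=lambda i: timers[i])
    if len(order) >= 2 and timers[order[1]] - timers[order[0]] < delta:
        return None  # collision: the receiver cannot decode the winner
    return order[0]


def two_stage_max(clusters, **kw):
    """Stage 1: nodes contend within each cluster. Stage 2: cluster heads contend.

    Returns ((cluster_index, node_index), reading), or None if any stage collides.
    """
    head_values, head_ids = [], []
    for c, readings in enumerate(clusters):
        winner = timer_select(readings, **kw)
        if winner is None:
            return None
        head_values.append(readings[winner])
        head_ids.append((c, winner))
    winner = timer_select(head_values, **kw)
    if winner is None:
        return None
    return head_ids[winner], head_values[winner]


clusters = [[0.1, 0.5, 0.9], [0.2, 0.3], [0.7, 0.4]]
result = two_stage_max(clusters)
collided = two_stage_max([[0.5, 0.505]])
```

In this toy model a larger `delta` makes collisions (and hence failures) more likely, while a larger `t_max` spreads the timers out at the cost of a longer selection time, which is the kind of trade-off the abstract describes optimizing.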
Abstract:
An area-efficient, wideband RF frequency synthesizer, which simultaneously generates multiple local oscillator (LO) signals, is designed. It is suitable for parallel wideband RF spectrum sensing in cognitive radios. The frequency synthesizer consists of an injection-locked oscillator cascade (ILOC) in which all the LO signals are derived from a single reference oscillator. The ILOC is implemented in a 130-nm technology with an active area of . It generates 4 uniformly spaced LO carrier frequencies from 500 MHz to 2 GHz. This design is the first known implementation of a CMOS-based ILOC for wideband RF spectrum sensing applications.
Abstract:
Predicting the queue waiting times of jobs submitted to production parallel batch systems is important for providing overall estimates to users, and can also help meta-schedulers make scheduling decisions. In this work, we have developed a framework for predicting ranges of queue waiting times for jobs by employing multi-class classification of similar jobs in history. Our hierarchical prediction strategy first predicts the point wait time of a job using a dynamic k-nearest neighbor (kNN) method. It then performs a multi-class classification using support vector machines (SVMs) among all the classes of the jobs. The probabilities given by the SVM for the class predicted using kNN and its neighboring classes are used to provide a set of ranges of predicted wait times with probabilities. We have used these predictions and probabilities in a meta-scheduling strategy that distributes jobs to different queues/sites in a multi-queue/grid environment to minimize the wait times of the jobs. Experiments with different production supercomputer job traces show that our prediction strategies give correct predictions for about 77-87% of the jobs, and also result in about 12% improved accuracy compared to the next best existing method. Experiments with our meta-scheduling strategy, using different production and synthetic job traces for various system sizes, partitioning schemes and workloads, show that it performs much better than existing scheduling policies, reducing the overall average queue waiting time of the jobs by about 47%.
Abstract:
In concentrated solar power (CSP) generating stations, incident solar energy is reflected from a large number of mirrors or heliostats to a faraway receiver. In typical CSP installations, the mirror needs to be moved about two axes independently using two actuators in series, with the mirror effectively mounted at a single point. A three degree-of-freedom parallel manipulator, namely the 3-RPS parallel manipulator, is proposed to track the sun. The proposed 3-RPS parallel manipulator supports the load of the mirror, structure and wind loading at three points, resulting in less deflection, and thus a much larger mirror can be moved with the required tracking accuracy and without increasing the weight of the support structure. The kinematics equations that determine the motion of the actuated prismatic joints in the 3-RPS parallel manipulator, such that the sun's rays are reflected onto a stationary receiver, are developed. Using finite element analysis, it is shown that for the same mirror size, wind loading and maximum deflection requirement, the weight of the support structure is between 15% and 60% less with the 3-RPS parallel manipulator than with the azimuth-elevation or the target-aligned configurations.
Abstract:
With increasing energy demand, it is necessary to generate and transmit electrical power with minimal losses. High voltage power transmission is the most economical way of transmitting bulk power over long distances. The transmission insulator is one of the main components, used as a mechanical support and to electrically isolate the conductor from the tower. Corona from the hardware and conductors can significantly affect the performance of polymeric insulators. In the present investigation, a methodology is presented to evaluate the corona performance of the polymeric shed material under different environmental conditions for both ac and dc excitation. The results of the comprehensive analysis on various polymeric samples and the power released from the corona electrode for both ac and dc excitation are presented. The chemical analysis confirmed the presence of nitric acid species on the treated sample, which in the long term will affect the strength of the insulator. The morphological changes were also found to vary with the experimental conditions. (C) 2015 The Authors. Published by Elsevier Ltd.
Abstract:
Support vector machines (SVMs) are a popular class of supervised models in machine learning. The associated compute-intensive learning algorithm limits their use in real-time applications. This paper presents a fully scalable architecture of a coprocessor, which can compute multiple rows of the kernel matrix in parallel. Further, we propose an extended variant of the popular decomposition technique, sequential minimal optimization, which we call the hybrid working set (HWS) algorithm, to effectively utilize the benefits of cached kernel columns and the parallel computational power of the coprocessor. The coprocessor is implemented on a Xilinx Virtex 7 field-programmable gate array-based VC707 board and achieves a speedup of up to 25x for kernel computation over single-threaded computation on an Intel Core i5. An application speedup of up to 15x over the software implementation of LIBSVM and a speedup of up to 23x over SVMLight are achieved using the HWS algorithm in unison with the coprocessor. The reduction in the number of iterations and the sensitivity of the optimization time to variation in cache size using the HWS algorithm are also shown.
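Rows of the kernel matrix are independent of one another, which is what makes computing several of them at once possible. The following software sketch shows that structure only. It assumes an RBF kernel (the abstract does not name the kernel), uses a thread pool as a stand-in for the coprocessor, and the names `kernel_row` and `kernel_rows` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rbf(x, y, gamma):
    """RBF kernel value exp(-gamma * ||x - y||^2) (assumed kernel choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_row(X, i, gamma):
    """Row i of the kernel matrix: K[i][j] = k(X[i], X[j]) for every j."""
    return [rbf(X[i], xj, gamma) for xj in X]


def kernel_rows(X, indices, gamma=0.5, workers=4):
    """Compute the requested rows concurrently; each row depends only on X."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = list(pool.map(lambda i: kernel_row(X, i, gamma), indices))
    return dict(zip(indices, rows))


X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
rows = kernel_rows(X, [0, 2])  # rows for a working set of two samples
```

A decomposition method such as sequential minimal optimization only needs the rows (or columns) for the samples in its current working set, so a cache of previously computed rows can avoid repeated work.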
Abstract:
Graph algorithms have been shown to possess enough parallelism to keep several computing resources busy, even hundreds of cores on a GPU. Unfortunately, tuning their implementation for efficient execution on a particular hardware configuration of heterogeneous systems consisting of multicore CPUs and GPUs is challenging, time consuming, and error prone. To address these issues, we propose a domain-specific language (DSL), Falcon, for implementing graph algorithms that (i) abstracts the hardware, (ii) provides constructs to write explicitly parallel programs at a higher level, and (iii) can work with general algorithms that may change the graph structure (morph algorithms). We illustrate the usage of our DSL to implement local computation algorithms (that do not change the graph structure) and morph algorithms such as Delaunay mesh refinement, survey propagation, and dynamic SSSP on GPU and multicore CPUs. Using a set of benchmark graphs, we illustrate that the generated code performs close to the state-of-the-art hand-tuned implementations.
Abstract:
Quantum cellular automata (QCA) is a new nanometer-scale technology that has been considered as one of the alternatives to CMOS technology. In this paper, we describe the design and layout of a serial memory and a parallel memory, showing the layout of individual memory cells. Assuming that we can fabricate cells separated by 10 nm, memory capacities of over 1.6 Gbit/cm² can be achieved. Simulations of the proposed memories were carried out using QCADesigner, a layout and simulation tool for QCA. During the design, we have tried to reduce the number of cells as well as the area, which is found to be 86.16 sq mm and 0.12 nm² with the QCA-based memory cell. We have also achieved an increase in efficiency of 40%. These circuits are the building blocks of nano processors and help us understand the nano devices of the future.
Abstract:
The crystal structure of a tripeptide Boc-Leu-Val-Ac(12)c-OMe (1) is determined, which incorporates a bulky 1-aminocyclododecane-1-carboxylic acid (Ac(12)c) side chain. The peptide adopts a semi-extended backbone conformation for the Leu and Val residues, while the backbone torsion angles of the Cα,Cα-dialkylated residue Ac(12)c are in the helical region of the Ramachandran map. The molecular packing of 1 revealed a unique supramolecular twisted parallel β-sheet coiling into a helical architecture in crystals, with the bulky hydrophobic Ac(12)c side chains projecting outward from the helical column. This arrangement resembles the packing of peptide helices in crystal structures. Although short oligopeptides often assemble as parallel or anti-parallel β-sheets in crystals, twisted or helical β-sheet formation has been observed in a few examples of dipeptide crystal structures. Peptide 1 presents the first example of a tripeptide showing twisted β-sheet assembly in crystals. Copyright (c) 2016 European Peptide Society and John Wiley & Sons, Ltd.
Abstract:
Signals recorded from the brain often show rhythmic patterns at different frequencies, which are tightly coupled to the external stimuli as well as the internal state of the subject. In addition, these signals have very transient structures related to spiking or sudden onset of a stimulus, which have durations not exceeding tens of milliseconds. Further, brain signals are highly nonstationary because both behavioral state and external stimuli can change on a short time scale. It is therefore essential to study brain signals using techniques that can represent both rhythmic and transient components of the signal, something not always possible using standard signal processing techniques such as the short-time Fourier transform, multitaper method, wavelet transform, or Hilbert transform. In this review, we describe a multiscale decomposition technique based on an over-complete dictionary called matching pursuit (MP), and show that it is able to capture both a sharp stimulus-onset transient and a sustained gamma rhythm in the local field potential recorded from the primary visual cortex. We compare the performance of MP with other techniques and discuss its advantages and limitations. Data and codes for generating all time-frequency power spectra are provided.
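The greedy idea behind matching pursuit can be shown on a toy problem: at each step, pick the dictionary atom with the largest inner product with the residual, record its coefficient, and subtract its contribution. This is a minimal sketch, not the code provided with the review. The three-atom dictionary in two dimensions is an illustrative assumption (MP on brain signals typically uses a large dictionary of time-frequency atoms), and atoms are assumed to have unit norm.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, atoms, n_iter=10, tol=1e-9):
    """Greedy decomposition of `signal` over unit-norm `atoms`.

    Returns (decomposition, residual), where decomposition is a list of
    (atom_index, coefficient) pairs in the order they were selected.
    """
    residual = list(signal)
    decomposition = []
    for _ in range(n_iter):
        if math.sqrt(dot(residual, residual)) <= tol:
            break  # signal fully explained
        # Atom most correlated with the current residual
        k = max(range(len(atoms)), key=lambda i: abs(dot(residual, atoms[i])))
        c = dot(residual, atoms[k])
        if abs(c) <= tol:
            break  # no atom explains the remaining residual
        decomposition.append((k, c))
        residual = [r - c * a for r, a in zip(residual, atoms[k])]
    return decomposition, residual


s = 1 / math.sqrt(2)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]  # over-complete: 3 atoms in 2-D
decomposition, residual = matching_pursuit([2.0, 2.0], atoms)
```

Because the dictionary is over-complete, the signal `[2, 2]` is captured by the single diagonal atom rather than by two axis-aligned atoms, which mirrors how a richer dictionary can represent a transient or a rhythm with few terms.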
Abstract:
In this paper, a pressure correction algorithm for computing incompressible flows is modified and implemented on unstructured Chimera grids. The Schwarz method is used to couple the solutions of different sub-domains. A new interpolation to ensure consistency between primary variables and auxiliary variables is proposed. Other important issues such as global mass conservation and order of accuracy in the interpolations are also discussed. Two numerical simulations are successfully performed: one steady case, the lid-driven cavity, and one unsteady case, the flow around a circular cylinder. The results demonstrate a very good performance of the proposed scheme on unstructured Chimera grids. It prevents the decoupling of the pressure field in the overlapping region and requires only minor modification to the existing unstructured Navier–Stokes (NS) solver. The numerical experiments show the reliability and potential of this method for practical problems.
Abstract:
A material model, whose framework is parallel spring-bundles oriented in 3-D space, is proposed. Based on a discussion of the discrete schemes and optimum discretization of the solid angles, a 3-D network cell consisting of one-dimensional components is developed, with its geometrical and physical parameters calibrated. It is proved that the 3-D network model is able to exactly simulate materials with an arbitrary Poisson ratio from 0 to 1/2, overcoming the limitation that previous models in the literature are only suitable for materials with a Poisson ratio from 0 to 1/3. A simplified model is also proposed to achieve high computational accuracy at low computational cost. Examples demonstrate that the 3-D network model is particularly well suited to the simulation of short-fiber reinforced composites.