174 results for GPGPU Parallel Computing


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Analogue computers provide actual rather than virtual representations of model systems. They are powerful and engaging computing machines that are cheap and simple to build. This two-part Retronics article helps you build (and understand!) your own analogue computer to simulate the Lorenz butterfly that's become iconic for Chaos theory.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Proposed is a unique cell histogram architecture which processes k data items in parallel to compute 2^q histogram bins per time step. An array of m/2^q cells computes an m-bin histogram with a speed-up factor of k; k ⩾ 2 makes it faster than current dual-ported memory implementations. Furthermore, simple mechanisms for conflict-free storing of the histogram bins into an external memory array are discussed.
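The cell arrangement described in this abstract can be modelled in software as a rough behavioural sketch. It is not the authors' hardware design. It assumes that the bins-per-cell figure is a power of two (written `2 ** q` below), that each cell owns a block of consecutive bins, and that one batch of k items is consumed per time step. The function name `cell_array_histogram` is hypothetical.

```python
def cell_array_histogram(data, m, q, k):
    """Behavioural model: m // 2**q cells, each holding 2**q consecutive bins.

    Returns the flattened m-bin histogram and the number of time steps
    needed when k items are consumed per step.
    """
    bins_per_cell = 2 ** q
    if m % bins_per_cell != 0:
        raise ValueError("m must be a multiple of 2**q")
    num_cells = m // bins_per_cell
    cells = [[0] * bins_per_cell for _ in range(num_cells)]

    steps = 0
    for start in range(0, len(data), k):
        # One time step: k items are presented to the cell array together.
        for value in data[start:start + k]:
            cell, offset = divmod(value, bins_per_cell)
            cells[cell][offset] += 1
        steps += 1

    # Read-out: concatenate the cells' bins in order.
    histogram = [count for cell in cells for count in cell]
    return histogram, steps
```

In this model, doubling k halves the number of time steps for the same data, which is the speed-up factor of k mentioned in the abstract.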

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The adsorption of gases on microporous carbons is still poorly understood, partly because the structure of these carbons is not well known. Here, a model of microporous carbons based on fullerene-like fragments is used as the basis for a theoretical study of Ar adsorption on carbon. First, a simulation box was constructed, containing a plausible arrangement of carbon fragments. Next, using a new Monte Carlo simulation algorithm, two types of carbon fragments were gradually placed into the initial structure to increase its microporosity. Thirty-six different microporous carbon structures were generated in this way. Using the method proposed recently by Bhattacharya and Gubbins (BG), the micropore size distributions of the obtained carbon models and the average micropore diameters were calculated. For ten chosen structures, Ar adsorption isotherms (87 K) were simulated via the hyper-parallel tempering Monte Carlo simulation method. The isotherms obtained in this way were described by widely applied methods of microporous carbon characterisation, i.e. Nguyen and Do, Horvath-Kawazoe, high-resolution α_s plots, adsorption potential distributions and the Dubinin-Astakhov (DA) equation. From simulated isotherms described by the DA equation, the average micropore diameters were calculated using empirical relationships proposed by different authors and they were compared with those from the BG method.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The real-time parallel computation of histograms using an array of pipelined cells is proposed and prototyped in this paper with application to consumer imaging products. The array operates in two modes: histogram computation and histogram reading. The proposed parallel computation method does not use any memory blocks. The resulting histogram bins can be stored into an external memory block in a pipelined fashion for subsequent reading or streaming of the results. The array of cells can be tuned to accommodate the required data path width in a VLSI image processing engine, as found in many consumer imaging devices. FPGA syntheses of the architectures presented in this paper are shown to compute the real-time histogram of images streamed at over 36 megapixels at 30 frames/s by processing 1, 2 or 4 pixels in parallel per clock cycle.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

DNA G-quadruplexes are among the targets being actively explored for anti-cancer therapy through their inhibition by small molecules. This computational study was conducted to predict the binding strengths and orientations of a set of novel dimethyl-amino-ethyl-acridine (DACA) analogues that were designed and synthesized in our laboratory but did not diffract in synchrotron light. The crystal structure of the DNA G-quadruplex (TGGGGT)4 (PDB: 1O0K) was used as the target for their binding properties in our studies. We used both the force field (FF) and QM/MM-derived atomic charge schemes simultaneously to compare the predictions of drug binding modes and their energetics. This study evaluates the comparative performance of fixed point charge based Glide XP docking and the quantum polarized ligand docking schemes. These results will provide insights into the effects of including or ignoring the drug-receptor interfacial polarization events in molecular docking simulations, which in turn will aid the rational selection of computational methods at different levels of theory in future drug design programs. Plenty of molecular modelling tools and methods currently exist for modelling drug-receptor, protein-protein, or DNA-protein interactions at different levels of complexity. Yet enabling such tools to describe various physico-chemical properties more accurately is the next step in current research. In particular, the use of the most accurate quantum mechanics (QM) methods is severely restricted by their tedious nature. Though the use of massively parallel supercomputing environments has resulted in a tremendous improvement in molecular mechanics (MM) calculations such as molecular dynamics, they are still capable of dealing with only a few tens to hundreds of atoms for QM methods. One efficient strategy that utilizes the strengths of both MM and QM is the family of QM/MM hybrid methods.
Lately, attempts have been directed towards deploying several different QM methods to improve force field based simulations, within practical restrictions. One such method includes charge polarization events at the drug-receptor interface, which are not explicitly present in the MM FF.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Recently, major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism. The first contribution addresses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach to the frequent subgraph mining problem for distributed memory systems.
This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVMs), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. Finally, the last contribution focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Markowitz showed that assets can be combined to produce an 'Efficient' portfolio that will give the highest level of portfolio return for any level of portfolio risk, as measured by the variance or standard deviation. These portfolios can then be connected to generate what is termed an 'Efficient Frontier' (EF). In this paper we discuss the calculation of the Efficient Frontier for combinations of assets, again using the spreadsheet Optimiser. To illustrate the derivation of the Efficient Frontier, we use the data from the Investment Property Databank Long Term Index of Investment Returns for the period 1971 to 1993. Many investors might require a certain specific level of holding or a restriction on holdings in at least some of the assets. Such additional constraints may be readily incorporated into the model to generate a constrained EF with upper and/or lower bounds. This can then be compared with the unconstrained EF to see whether the reduction in return is acceptable. To see the effect that these additional constraints may have, we adopt a fairly typical pension fund profile, with no more than 20% of the total held in Property. The paper shows that it is now relatively easy to use the Optimiser available in at least one spreadsheet (EXCEL) to calculate efficient portfolios for various levels of risk and return, both constrained and unconstrained, so as to be able to generate any number of Efficient Frontiers.
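The effect of a holding cap on the attainable risk/return combinations can be illustrated with a minimal two-asset sketch. The paper itself uses a spreadsheet optimiser and the Investment Property Databank series. The return, volatility and correlation figures below are invented for illustration only, and the function names are hypothetical.

```python
import math

def portfolio_stats(w_property, mu, sigma, rho):
    """Expected return and standard deviation of a two-asset portfolio.

    w_property is the weight in asset 0; the remainder goes to asset 1.
    """
    w0, w1 = w_property, 1.0 - w_property
    ret = w0 * mu[0] + w1 * mu[1]
    var = (w0 * sigma[0]) ** 2 + (w1 * sigma[1]) ** 2 \
        + 2.0 * w0 * w1 * rho * sigma[0] * sigma[1]
    return ret, math.sqrt(var)

def opportunity_set(mu, sigma, rho, max_property=1.0, steps=100):
    """Risk/return points for asset-0 weights on a grid from 0 to max_property."""
    points = []
    for i in range(steps + 1):
        w = max_property * i / steps
        ret, risk = portfolio_stats(w, mu, sigma, rho)
        points.append((w, ret, risk))
    return points

def min_risk_point(points):
    """Grid point with the lowest standard deviation."""
    return min(points, key=lambda p: p[2])

# Hypothetical inputs: asset 0 stands in for Property, asset 1 for equities.
MU = (0.12, 0.15)
SIGMA = (0.10, 0.25)
RHO = 0.2

unconstrained = opportunity_set(MU, SIGMA, RHO)
capped = opportunity_set(MU, SIGMA, RHO, max_property=0.20)
```

Comparing `min_risk_point(unconstrained)` with `min_risk_point(capped)` shows how an upper bound of 20% on one asset shifts the lowest attainable risk. With more than two assets a numerical optimiser would be needed, as in the paper.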

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Both the (5,3) counter and (2,2,3) counter multiplication techniques are investigated for the efficiency of their operation speed and the viability of the architectures when implemented in a fast bipolar ECL technology. Implementations of the counters in series-gated ECL and in threshold logic are contrasted for speed, noise immunity and complexity, and are critically compared with the fastest practical design of a full-adder. A novel circuit technique is presented that uses negatively weighted inputs to overcome the need for high fan-in input weights in threshold circuits. The authors conclude that a (2,2,3) counter based array multiplier implemented in series-gated ECL should enable a significant increase in speed over conventional full-adder based array multipliers.
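The behaviour of the two counter types can be sketched at the bit level. This is a purely functional model, not the ECL or threshold-logic circuits the paper discusses. It assumes the usual reading of the notation: a (5,3) counter compresses five equal-weight bits into a 3-bit count, and a (2,2,3) counter takes two bits of weight 2 and two bits of weight 1 and produces a 3-bit sum. The function names are hypothetical.

```python
from itertools import product

def _three_bits(total):
    """Split an integer in 0..7 into (bit2, bit1, bit0)."""
    return (total >> 2) & 1, (total >> 1) & 1, total & 1

def counter_5_3(bits):
    """(5,3) counter: count the ones among five equal-weight input bits."""
    if len(bits) != 5:
        raise ValueError("expected five input bits")
    return _three_bits(sum(bits))

def counter_2_2_3(high, low):
    """(2,2,3) counter: two weight-2 bits plus two weight-1 bits."""
    if len(high) != 2 or len(low) != 2:
        raise ValueError("expected two bits per column")
    return _three_bits(2 * sum(high) + sum(low))

def to_int(out):
    """Interpret a (bit2, bit1, bit0) tuple as an integer."""
    b2, b1, b0 = out
    return 4 * b2 + 2 * b1 + b0

# Exhaustive check of the (2,2,3) counter over all 16 input combinations.
all_ok = all(
    to_int(counter_2_2_3((h1, h0), (l1, l0))) == 2 * (h1 + h0) + (l1 + l0)
    for h1, h0, l1, l0 in product((0, 1), repeat=4)
)
```

The largest (2,2,3) input sum is 6, so three output bits always suffice. A full adder, by comparison, is a (3,2) counter.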

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The authors compare various array multiplier architectures based on (p,q) counter circuits. The tradeoff in multiplier design is always between adding complexity and increasing speed. It is shown that by using a (2,2,3) counter cell it is possible to gain a significant increase in speed over a conventional full-adder, carry-save array based approach. The increase in complexity should be easily accommodated using modern emitter-coupled-logic processes.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault-tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
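The idea of replacing global communication with local exchanges can be illustrated with a generic gossip sketch. This is not the Epidemic K-Means algorithm from the paper. It assumes 1-D data, a reliable network, and plain pairwise averaging of per-cluster sums and counts, and all function names are hypothetical.

```python
import random

def local_stats(points, centroids):
    """Per-cluster [sum, count] of one node's points under the current centroids."""
    stats = [[0.0, 0.0] for _ in centroids]
    for x in points:
        nearest = min(range(len(centroids)), key=lambda c: abs(x - centroids[c]))
        stats[nearest][0] += x
        stats[nearest][1] += 1.0
    return stats

def gossip_average(node_stats, exchanges, rng):
    """Repeatedly pick two nodes at random and replace both states by their mean.

    Each exchange preserves the network-wide totals, so every node's state
    drifts towards the global average without any central coordinator.
    """
    n = len(node_stats)
    for _ in range(exchanges):
        a, b = rng.sample(range(n), 2)
        for j in range(len(node_stats[a])):
            for t in range(2):
                mean = (node_stats[a][j][t] + node_stats[b][j][t]) / 2.0
                node_stats[a][j][t] = mean
                node_stats[b][j][t] = mean
    return node_stats

def gossip_kmeans_step(node_points, centroids, exchanges=2000, seed=0):
    """One K-Means update in which each node estimates the centroids locally."""
    rng = random.Random(seed)
    stats = [local_stats(points, centroids) for points in node_points]
    gossip_average(stats, exchanges, rng)
    # averaged sum / averaged count equals global sum / global count
    return [
        [s / c if c > 0 else centroids[j] for j, (s, c) in enumerate(node)]
        for node in stats
    ]
```

After enough exchanges every node holds nearly the same centroid estimates as a centralised update would produce. Handling message loss and node failures, which the paper addresses, is outside this sketch.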