922 resultados para Parallel numerical algorithms


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Case-Based Reasoning (CBR) uses past experiences to solve new problems. The quality of the past experiences, which are stored as cases in a case base, is a big factor in the performance of a CBR system. The system's competence may be improved by adding problems to the case base after they have been solved and their solutions verified to be correct. However, from time to time, the case base may have to be refined to reduce redundancy and to get rid of any noisy cases that may have been introduced. Many case base maintenance algorithms have been developed to delete noisy and redundant cases. However, different algorithms work well in different situations and it may be difficult for a knowledge engineer to know which one is the best to use for a particular case base. In this thesis, we investigate ways to combine algorithms to produce better deletion decisions than the decisions made by individual algorithms, and ways to choose which algorithm is best for a given case base at a given time. We analyse five of the most commonly-used maintenance algorithms in detail and show how the different algorithms perform better on different datasets. This motivates us to develop a new approach: maintenance by a committee of experts (MACE). MACE allows us to combine maintenance algorithms to produce a composite algorithm which exploits the merits of each of the algorithms that it contains. By combining different algorithms in different ways we can also define algorithms that have different trade-offs between accuracy and deletion. While MACE allows us to define an infinite number of new composite algorithms, we still face the problem of choosing which algorithm to use. To make this choice, we need to be able to identify properties of a case base that are predictive of which maintenance algorithm is best. We examine a number of measures of dataset complexity for this purpose. These provide a numerical way to describe a case base at a given time. We use the numerical description to develop a meta-case-based classification system. This system uses previous experience about which maintenance algorithm was best to use for other case bases to predict which algorithm to use for a new case base. Finally, we give the knowledge engineer more control over the deletion process by creating incremental versions of the maintenance algorithms. These incremental algorithms suggest one case at a time for deletion rather than a group of cases, which allows the knowledge engineer to decide whether or not each case in turn should be deleted or kept. We also develop incremental versions of the complexity measures, allowing us to create an incremental version of our meta-case-based classification system. Since the case base changes after each deletion, the best algorithm to use may also change. The incremental system allows us to choose which algorithm is the best to use at each point in the deletion process.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article describes advances in statistical computation for large-scale data analysis in structured Bayesian mixture models via graphics processing unit (GPU) programming. The developments are partly motivated by computational challenges arising in fitting models of increasing heterogeneity to increasingly large datasets. An example context concerns common biological studies using high-throughput technologies generating many, very large datasets and requiring increasingly high-dimensional mixture models with large numbers of mixture components.We outline important strategies and processes for GPU computation in Bayesian simulation and optimization approaches, give examples of the benefits of GPU implementations in terms of processing speed and scale-up in ability to analyze large datasets, and provide a detailed, tutorial-style exposition that will benefit readers interested in developing GPU-based approaches in other statistical models. Novel, GPU-oriented approaches to modifying existing algorithms software design can lead to vast speed-up and, critically, enable statistical analyses that presently will not be performed due to compute time limitations in traditional computational environments. Supplementalmaterials are provided with all source code, example data, and details that will enable readers to implement and explore the GPU approach in this mixture modeling context. © 2010 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Adult humans, infants, pre-school children, and non-human animals appear to share a system of approximate numerical processing for non-symbolic stimuli such as arrays of dots or sequences of tones. Behavioral studies of adult humans implicate a link between these non-symbolic numerical abilities and symbolic numerical processing (e.g., similar distance effects in accuracy and reaction-time for arrays of dots and Arabic numerals). However, neuroimaging studies have remained inconclusive on the neural basis of this link. The intraparietal sulcus (IPS) is known to respond selectively to symbolic numerical stimuli such as Arabic numerals. Recent studies, however, have arrived at conflicting conclusions regarding the role of the IPS in processing non-symbolic, numerosity arrays in adulthood, and very little is known about the brain basis of numerical processing early in development. Addressing the question of whether there is an early-developing neural basis for abstract numerical processing is essential for understanding the cognitive origins of our uniquely human capacity for math and science. Using functional magnetic resonance imaging (fMRI) at 4-Tesla and an event-related fMRI adaptation paradigm, we found that adults showed a greater IPS response to visual arrays that deviated from standard stimuli in their number of elements, than to stimuli that deviated in local element shape. These results support previous claims that there is a neurophysiological link between non-symbolic and symbolic numerical processing in adulthood. In parallel, we tested 4-y-old children with the same fMRI adaptation paradigm as adults to determine whether the neural locus of non-symbolic numerical activity in adults shows continuity in function over development. We found that the IPS responded to numerical deviants similarly in 4-y-old children and adults. To our knowledge, this is the first evidence that the neural locus of adult numerical cognition takes form early in development, prior to sophisticated symbolic numerical experience. More broadly, this is also, to our knowledge, the first cognitive fMRI study to test healthy children as young as 4 y, providing new insights into the neurophysiology of human cognitive development.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An abstract of this work will be presented at the Compiler, Architecture and Tools Conference (CATC), Intel Development Center, Haifa, Israel November 23, 2015.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

User supplied knowledge and interaction is a vital component of a toolkit for producing high quality parallel implementations of scalar FORTRAN numerical code. In this paper we consider the necessary components that such a parallelisation toolkit should possess to provide an effective environment to identify, extract and embed user relevant user knowledge. We also examine to what extent these facilities are available in leading parallelisation tools; in particular we discuss how these issues have been addressed in the development of the user interface of the Computer Aided Parallelisation Tools (CAPTools). The CAPTools environment has been designed to enable user exploration, interaction and insertion of user knowledge to facilitate the automatic generation of very efficient parallel code. A key issue in the user's interaction is control of the volume of information so that the user is focused on only that which is needed. User control over the level and extent of information revealed at any phase is supplied using a wide variety of filters. Another issue is the way in which information is communicated. Dependence analysis and its resulting graphs involve a lot of sophisticated rather abstract concepts unlikely to be familiar to most users of parallelising tools. As such, considerable effort has been made to communicate with the user in terms that they will understand. These features, amongst others, and their use in the parallelisation process are described and their effectiveness discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Parallel computing is now widely used in numerical simulation, particularly for application codes based on finite difference and finite element methods. A popular and successful technique employed to parallelize such codes onto large distributed memory systems is to partition the mesh into sub-domains that are then allocated to processors. The code then executes in parallel, using the SPMD methodology, with message passing for inter-processor interactions. In order to improve the parallel efficiency of an imbalanced structured mesh CFD code, a new dynamic load balancing (DLB) strategy has been developed in which the processor partition range limits of just one of the partitioned dimensions uses non-coincidental limits, as opposed to coincidental limits. The ‘local’ partition limit change allows greater flexibility in obtaining a balanced load distribution, as the workload increase, or decrease, on a processor is no longer restricted by the ‘global’ (coincidental) limit change. The automatic implementation of this generic DLB strategy within an existing parallel code is presented in this chapter, along with some preliminary results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier–Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed including implicit residual smoothing and a multigrid full approximation storage scheme (FAS). Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable the execution in parallel using a single programme multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial scale aeronautical simulations. Copyright © 2000 John Wiley & Sons, Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper discusses preconditioned Krylov subspace methods for solving large scale linear systems that originate from oil reservoir numerical simulations. Two types of preconditioners, one being based on an incomplete LU decomposition and the other being based on iterative algorithms, are used together in a combination strategy in order to achieve an adaptive and efficient preconditioner. Numerical tests show that different Krylov subspace methods combining with appropriate preconditioners are able to achieve optimal performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The problem of deriving parallel mesh partitioning algorithms for mapping unstructured meshes to parallel computers is discussed in this chapter. In itself this raises a paradox - we seek to find a high quality partition of the mesh, but to compute it in parallel we require a partition of the mesh. In fact, we overcome this difficulty by deriving an optimisation strategy which can find a high quality partition even if the quality of the initial partition is very poor and then use a crude distribution scheme for the initial partition. The basis of this strategy is to use a multilevel approach combined with local refinement algorithms. Three such refinement algorithms are outlined and some example results presented which show that they can produce very high global quality partitions, very rapidly. The results are also compared with a similar multilevel serial partitioner and shown to be almost identical in quality. Finally we consider the impact of the initial partition on the results and demonstrate that the final partition quality is, modulo a certain amount of noise, independent of the initial partition.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The scheduling problem of minimizing the makespan for m parallel dedicated machines under single resource constraints is considered. For different variants of the problem the complexity status is established. Heuristic algorithms employing the so-called group technology approach are presented and their worst-case behavior is examined. Finally, a polynomial time approximation scheme is presented for the problem with fixed number of machines.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper considers scheduling problems for parallel dedicated machines subject to resource constraints. A fairly complete computational complexity classification is obtained, a number of polynomial-time algorithms are designed. For the problem with a fixed number of machines in which a job uses at most one resource of unit size a polynomial-time approximation scheme is offered.

Relevância:

30.00% 30.00%

Publicador:

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A comprehensive solution of solidification/melting processes requires the simultaneous representation of free surface fluid flow, heat transfer, phase change, nonlinear solid mechanics and, possibly, electromagnetics together with their interactions, in what is now known as multiphysics simulation. Such simulations are computationally intensive and the implementation of solution strategies for multiphysics calculations must embed their effective parallelization. For some years, together with our collaborators, we have been involved in the development of numerical software tools for multiphysics modeling on parallel cluster systems. This research has involved a combination of algorithmic procedures, parallel strategies and tools, plus the design of a computational modeling software environment and its deployment in a range of real world applications. One output from this research is the three-dimensional parallel multiphysics code, PHYSICA. In this paper we report on an assessment of its parallel scalability on a range of increasingly complex models drawn from actual industrial problems, on three contemporary parallel cluster systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider a knapsack problem to minimize a symmetric quadratic function. We demonstrate that this symmetric quadratic knapsack problem is relevant to two problems of single machine scheduling: the problem of minimizing the weighted sum of the completion times with a single machine non-availability interval under the non-resumable scenario; and the problem of minimizing the total weighted earliness and tardiness with respect to a common small due date. We develop a polynomial-time approximation algorithm that delivers a constant worst-case performance ratio for a special form of the symmetric quadratic knapsack problem. We adapt that algorithm to our scheduling problems and achieve a better performance. For the problems under consideration no fixed-ratio approximation algorithms have been previously known.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fractal video compression is a relatively new video compression method. Its attraction is due to the high compression ratio and the simple decompression algorithm. But its computational complexity is high and as a result parallel algorithms on high performance machines become one way out. In this study we partition the matching search, which occupies the majority of the work in a fractal video compression process, into small tasks and implement them in two distributed computing environments, one using DCOM and the other using .NET Remoting technology, based on a local area network consists of loosely coupled PCs. Experimental results show that the parallel algorithm is able to achieve a high speedup in these distributed environments.