996 resultados para parallel search


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The primary approaches for people to understand the inner properties of the earth and the distribution of the mineral resources are mainly coming from surface geology survey and geophysical/geochemical data inversion and interpretation. The purpose of seismic inversion is to extract information of the subsurface stratum geometrical structures and the distribution of material properties from seismic wave which is used for resource prospecting, exploitation and the study for inner structure of the earth and its dynamic process. Although the study of seismic parameter inversion has achieved a lot since 1950s, some problems are still persisting when applying in real data due to their nonlinearity and ill-posedness. Most inversion methods we use to invert geophysical parameters are based on iterative inversion which depends largely on the initial model and constraint conditions. It would be difficult to obtain a believable result when taking into consideration different factors such as environmental and equipment noise that exist in seismic wave excitation, propagation and acquisition. The seismic inversion based on real data is a typical nonlinear problem, which means most of their objective functions are multi-minimum. It makes them formidable to be solved using commonly used methods such as general-linearization and quasi-linearization inversion because of local convergence. Global nonlinear search methods which do not rely heavily on the initial model seem more promising, but the amount of computation required for real data process is unacceptable. In order to solve those problems mentioned above, this paper addresses a kind of global nonlinear inversion method which brings Quantum Monte Carlo (QMC) method into geophysical inverse problems. QMC has been used as an effective numerical method to study quantum many-body system which is often governed by Schrödinger equation. This method can be categorized into zero temperature method and finite temperature method. This paper is subdivided into four parts. In the first one, we briefly review the theory of QMC method and find out the connections with geophysical nonlinear inversion, and then give the flow chart of the algorithm. In the second part, we apply four QMC inverse methods in 1D wave equation impedance inversion and generally compare their results with convergence rate and accuracy. The feasibility, stability, and anti-noise capacity of the algorithms are also discussed within this chapter. Numerical results demonstrate that it is possible to solve geophysical nonlinear inversion and other nonlinear optimization problems by means of QMC method. They are also showing that Green’s function Monte Carlo (GFMC) and diffusion Monte Carlo (DMC) are more applicable than Path Integral Monte Carlo (PIMC) and Variational Monte Carlo (VMC) in real data. The third part provides the parallel version of serial QMC algorithms which are applied in a 2D acoustic velocity inversion and real seismic data processing and further discusses these algorithms’ globality and anti-noise capacity. The inverted results show the robustness of these algorithms which make them feasible to be used in 2D inversion and real data processing. The parallel inversion algorithms in this chapter are also applicable in other optimization. Finally, some useful conclusions are obtained in the last section. The analysis and comparison of the results indicate that it is successful to bring QMC into geophysical inversion. QMC is a kind of nonlinear inversion method which guarantees stability, efficiency and anti-noise. The most appealing property is that it does not rely heavily on the initial model and can be suited to nonlinear and multi-minimum geophysical inverse problems. This method can also be used in other filed regarding nonlinear optimization.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The amount of computation required to solve many early vision problems is prodigious, and so it has long been thought that systems that operate in a reasonable amount of time will only become feasible when parallel systems become available. Such systems now exist in digital form, but most are large and expensive. These machines constitute an invaluable test-bed for the development of new algorithms, but they can probably not be scaled down rapidly in both physical size and cost, despite continued advances in semiconductor technology and machine architecture. Simple analog networks can perform interesting computations, as has been known for a long time. We have reached the point where it is feasible to experiment with implementation of these ideas in VLSI form, particularly if we focus on networks composed of locally interconnected passive elements, linear amplifiers, and simple nonlinear components. While there have been excursions into the development of ideas in this area since the very beginnings of work on machine vision, much work remains to be done. Progress will depend on careful attention to matching of the capabilities of simple networks to the needs of early vision. Note that this is not at all intended to be anything like a review of the field, but merely a collection of some ideas that seem to be interesting.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A vernier offset is detected at once among straight lines, and reaction times are almost independent of the number of simultaneously presented stimuli (distractors), indicating parallel processing of vernier offsets. Reaction times for identifying a vernier offset to one side among verniers offset to the opposite side increase with the number of distractors, indicating serial processing. Even deviations below a photoreceptor diameter can be detected at once. The visual system thus attains positional accuracy below the photoreceptor diameter simultaneously at different positions. I conclude that deviation from straightness, or change of orientation, is detected in parallel over the visual field. Discontinuities or gradients in orientation may represent an elementary feature of vision.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a model for the general flow in the neocortex. The basic process, called "sequence-seeking," is a search for a sequence of mappings or transformations, linking source and target representations. The search is bi-directional, "bottom-up" as well as "top-down," and it explores in parallel a large numbe rof alternative sequences. This operation is implemented in a structure termed "counter streams," in which multiple sequences are explored along two separate, complementary pathways which seeking to meet. The first part of the paper discusses the general sequence-seeking scheme and a number of related processes, such as the learning of successful sequences, context effects, and the use of "express lines" and partial matches. The second part discusses biological implications of the model in terms of connections within and between cortical areas. The model is compared with existing data, and a number of new predictions are proposed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An effective approach of simulating fluid dynamics on a cluster of non- dedicated workstations is presented. The approach uses local interaction algorithms, small communication capacity, and automatic migration of parallel processes from busy hosts to free hosts. The approach is well- suited for simulating subsonic flow problems which involve both hydrodynamics and acoustic waves; for example, the flow of air inside wind musical instruments. Typical simulations achieve $80\\%$ parallel efficiency (speedup/processors) using 20 HP-Apollo workstations. Detailed measurements of the parallel efficiency of 2D and 3D simulations are presented, and a theoretical model of efficiency is developed which fits closely the measurements. Two numerical methods of fluid dynamics are tested: explicit finite differences, and the lattice Boltzmann method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This report describes Processor Coupling, a mechanism for controlling multiple ALUs on a single integrated circuit to exploit both instruction-level and inter-thread parallelism. A compiler statically schedules individual threads to discover available intra-thread instruction-level parallelism. The runtime scheduling mechanism interleaves threads, exploiting inter-thread parallelism to maintain high ALU utilization. ALUs are assigned to threads on a cycle byscycle basis, and several threads can be active concurrently. Simulation results show that Processor Coupling performs well both on single threaded and multi-threaded applications. The experiments address the effects of memory latencies, function unit latencies, and communication bandwidth between function units.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This technical report describes a new protocol, the Unique Token Protocol, for reliable message communication. This protocol eliminates the need for end-to-end acknowledgments and minimizes the communication effort when no dynamic errors occur. Various properties of end-to-end protocols are presented. The unique token protocol solves the associated problems. It eliminates source buffering by maintaining in the network at least two copies of a message. A token is used to decide if a message was delivered to the destination exactly once. This technical report also presents a possible implementation of the protocol in a worm-hole routed, 3-D mesh network.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis describes the design and implementation of an integrated circuit and associated packaging to be used as the building block for the data routing network of a large scale shared memory multiprocessor system. A general purpose multiprocessor depends on high-bandwidth, low-latency communications between computing elements. This thesis describes the design and construction of RN1, a novel self-routing, enhanced crossbar switch as a CMOS VLSI chip. This chip provides the basic building block for a scalable pipelined routing network with byte-wide data channels. A series of RN1 chips can be cascaded with no additional internal network components to form a multistage fault-tolerant routing switch. The chip is designed to operate at clock frequencies up to 100Mhz using Hewlett-Packard's HP34 $1.2\\mu$ process. This aggressive performance goal demands that special attention be paid to optimization of the logic architecture and circuit design.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Parallel shared-memory machines with hundreds or thousands of processor-memory nodes have been built; in the future we will see machines with millions or even billions of nodes. Associated with such large systems is a new set of design challenges. Many problems must be addressed by an architecture in order for it to be successful; of these, we focus on three in particular. First, a scalable memory system is required. Second, the network messaging protocol must be fault-tolerant. Third, the overheads of thread creation, thread management and synchronization must be extremely low. This thesis presents the complete system design for Hamal, a shared-memory architecture which addresses these concerns and is directly scalable to one million nodes. Virtual memory and distributed objects are implemented in a manner that requires neither inter-node synchronization nor the storage of globally coherent translations at each node. We develop a lightweight fault-tolerant messaging protocol that guarantees message delivery and idempotence across a discarding network. A number of hardware mechanisms provide efficient support for massive multithreading and fine-grained synchronization. Experiments are conducted in simulation, using a trace-driven network simulator to investigate the messaging protocol and a cycle-accurate simulator to evaluate the Hamal architecture. We determine implementation parameters for the messaging protocol which optimize performance. A discarding network is easier to design and can be clocked at a higher rate, and we find that with this protocol its performance can approach that of a non-discarding network. Our simulations of Hamal demonstrate the effectiveness of its thread management and synchronization primitives. In particular, we find register-based synchronization to be an extremely efficient mechanism which can be used to implement a software barrier with a latency of only 523 cycles on a 512 node machine.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Conventional parallel computer architectures do not provide support for non-uniformly distributed objects. In this thesis, I introduce sparsely faceted arrays (SFAs), a new low-level mechanism for naming regions of memory, or facets, on different processors in a distributed, shared memory parallel processing system. Sparsely faceted arrays address the disconnect between the global distributed arrays provided by conventional architectures (e.g. the Cray T3 series), and the requirements of high-level parallel programming methods that wish to use objects that are distributed over only a subset of processing elements. A sparsely faceted array names a virtual globally-distributed array, but actual facets are lazily allocated. By providing simple semantics and making efficient use of memory, SFAs enable efficient implementation of a variety of non-uniformly distributed data structures and related algorithms. I present example applications which use SFAs, and describe and evaluate simple hardware mechanisms for implementing SFAs. Keeping track of which nodes have allocated facets for a particular SFA is an important task that suggests the need for automatic memory management, including garbage collection. To address this need, I first argue that conventional tracing techniques such as mark/sweep and copying GC are inherently unscalable in parallel systems. I then present a parallel memory-management strategy, based on reference-counting, that is capable of garbage collecting sparsely faceted arrays. I also discuss opportunities for hardware support of this garbage collection strategy. I have implemented a high-level hardware/OS simulator featuring hardware support for sparsely faceted arrays and automatic garbage collection. I describe the simulator and outline a few of the numerous details associated with a "real" implementation of SFAs and SFA-aware garbage collection. Simulation results are used throughout this thesis in the evaluation of hardware support mechanisms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Euterpe is a real-time computer system for the modeling of musical structures. It provides a formalism wherein familiar concepts of musical analysis may be readily expressed. This is verified by its application to the analysis of a wide variety of conventional forms of music: Gregorian chant, Mediaeval polyphony, Back counterpoint, and sonata form. It may be of further assistance in the real-time experiments in various techniques of thematic development. Finally, the system is endowed with sound-synthesis apparatus with which the user may prepare tapes for musical performances.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

O software Eutils-search tem por objetivo trazer do banco de dados PubMed informações sobre artigos relacionados a genes de um organismo específico, de acordo com as regras referentes à taxa de acesso impostas pelo site. As informações trazidas são, então, armazenadas localmente em um banco de dados para acesso rápido. Além disso, o software também gera documentos XML correspondentes às informações do organismo requisitado. O eutils-search é uma ferramenta de apoio ao desenvolvimento de aplicações de mineração de textos voltadas para os domínios de biotecnologia e biologia molecular, baseada em informações textuais obtidas do banco de dados PubMed. Este documento apresenta os pré-requisitos e a descrição dos parâmetros necessários para utilização do software, bem como uma descrição de alguns aspectos internos do software, para melhor entendimento do processo que ele automatiza, além de alguns exemplos e uso.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Ferr?, S. and King, R. D. (2004) A dichotomic search algorithm for mining and learning in domain-specific logics. Fundamenta Informaticae. IOS Press. To appear

Relevância:

20.00% 20.00%

Publicador:

Resumo:

R. Daly and Q. Shen. A Framework for the Scoring of Operators on the Search Space of Equivalence Classes of Bayesian Network Structures. Proceedings of the 2005 UK Workshop on Computational Intelligence, pages 67-74.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Huelse, M, Barr, D R W, Dudek, P: Cellular Automata and non-static image processing for embodied robot systems on a massively parallel processor array. In: Adamatzky, A et al. (eds) AUTOMATA 2008, Theory and Applications of Cellular Automata. Luniver Press, 2008, pp. 504-510. Sponsorship: EPSRC