Biblioteca Digital

828 results for compression parallel

FTCS finite difference scheme GPGPU parallel computing for the heat conduction equation = Programación en paralelo GPGPU del método en diferencias finitas FTCS para la ecuación del calor

Relevance: 20.00%

Abstract:

This paper shows the advantages of parallel computing by numerically solving the two-dimensional heat conduction equation with the explicit forward-in-time, central-in-space (FTCS) finite difference method. Two levels of parallelization are considered and compared with a traditional serial procedure. The results highlight the importance of parallel computing for large problems that require so many calculations that a serial procedure is impractical because of its execution time. The first section briefly describes the basic concepts of parallel computing. The FTCS finite difference method for the parabolic heat conduction equation is then outlined, describing how the heat flow equation is derived in two dimensions and the particularities of the numerical technique considered. Next, a specific initial boundary value problem is solved with the FTCS method, and pseudocode is provided for one serial and two parallel implementations. Finally, the results are discussed and some conclusions are presented.
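As a rough illustration of the FTCS update this abstract refers to, the following minimal Python sketch advances a two-dimensional temperature grid with fixed boundary values. It is not the paper's pseudocode: the function names, the grid size, the test problem (a cold plate with one hot edge) and the value of `r` are all assumptions chosen for the example. Each interior point is computed only from values of the previous time step, which is the property that lets the loop body be distributed across GPU threads.

```python
def ftcs_step(u, r):
    """Advance a 2D grid by one FTCS time step with fixed boundary values.

    r = alpha * dt / h**2 for a uniform grid spacing h. With equal spacing
    in both directions the explicit scheme is stable only for r <= 0.25.
    """
    ny, nx = len(u), len(u[0])
    new = [row[:] for row in u]  # boundary rows and columns are copied unchanged
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            # Forward difference in time, central differences in space.
            new[j][i] = u[j][i] + r * (
                u[j][i + 1] + u[j][i - 1] + u[j + 1][i] + u[j - 1][i]
                - 4.0 * u[j][i]
            )
    return new


def solve(n=21, steps=200, r=0.2, hot=100.0):
    """Hypothetical test problem: plate initially at 0 with one edge held hot."""
    u = [[0.0] * n for _ in range(n)]
    u[0] = [hot] * n
    for _ in range(steps):
        u = ftcs_step(u, r)
    return u
```

Calling `solve()` returns the grid after 200 steps; choosing `r` above 0.25 would make this explicit scheme unstable.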

Parallel sets and morphological measurements of CT images of soil pore structure in a vineyard

Relevance: 20.00%

Abstract:

Important physical and biological processes in soil-plant-microbial systems are dominated by the geometry of soil pore space, and a correct model of this geometry is critical for understanding them. We analyze the geometry of soil pore space using X-ray computed tomography (CT) of intact soil columns. We present preliminary results of our investigation of Minkowski functionals of parallel sets as a way to characterize soil structure. We also show how the evolution of Minkowski morphological measurements of parallel sets may help to characterize the influence of conventional tillage and of a permanent cover crop of resident vegetation on soil structure in a Spanish Mediterranean vineyard.

Neurite, a finite difference large scale parallel program for the simulation of the electrical signal propagation in neurites under mechanical loading

Relevance: 20.00%

Abstract:

With the growing body of research on traumatic brain injury and spinal cord injury, computational neuroscience has recently focused its modeling efforts on neuronal functional deficits following mechanical loading. However, in most of these efforts, cell damage is characterized only by purely mechanistic criteria, expressed as functions of quantities such as stress, strain or their corresponding rates. The modeling of functional deficits in neurites as a consequence of macroscopic mechanical insults has rarely been explored. In particular, a quantitative mechanically based model of electrophysiological impairment in neuronal cells has only very recently been proposed (Jerusalem et al., 2013). In this paper, we present the implementation details of Neurite, the finite difference parallel program used in that reference. Following the application of a macroscopic strain at a given strain rate produced by a mechanical insult, Neurite is able to simulate the resulting neuronal electrical signal propagation, and thus the corresponding functional deficits. The simulation of the coupled mechanical and electrophysiological behaviors requires computationally expensive calculations that increase in complexity as the network of simulated cells grows. The solvers implemented in Neurite, both explicit and implicit, were therefore parallelized using graphics processing units in order to reduce the simulation cost of large scale scenarios. Cable Theory and Hodgkin-Huxley models were implemented to account for the electrophysiological passive and active regions of a neurite, respectively, whereas a coupled mechanical model accounting for the neurite mechanical behavior within its surrounding medium was adopted as a link between electrophysiology and mechanics (Jerusalem et al., 2013). This paper provides the details of the parallel implementation of Neurite, along with three different application examples: a long myelinated axon, a segmented dendritic tree, and a damaged axon. 
The capabilities of the program to deal with large scale scenarios, segmented neuronal structures, and functional deficits under mechanical loading are specifically highlighted.

Morphological Functions with Parallel Sets for the Pore Space of X-ray CT Images of Soil Columns

Relevance: 20.00%

Abstract:

During the last few decades, new imaging techniques like X-ray computed tomography have made available rich and detailed information of the spatial arrangement of soil constituents, usually referred to as soil structure. Mathematical morphology provides a plethora of mathematical techniques to analyze and parameterize the geometry of soil structure. They provide a guide to design the process from image analysis to the generation of synthetic models of soil structure in order to investigate key features of flow and transport phenomena in soil. In this work, we explore the ability of morphological functions built over Minkowski functionals with parallel sets of the pore space to characterize and quantify pore space geometry of columns of intact soil. These morphological functions seem to discriminate the effects on soil pore space geometry of contrasting management practices in a Mediterranean vineyard, and they provide the first step toward identifying the statistical significance of the observed differences.

El paralelo. Bosquejo de un método gráfico = The parallel. Sketch of a graphical method

Relevance: 20.00%

Abstract:

The graphical parallel has been, and continues to be, an exceptional method for knowing, learning, researching and disseminating architectural and urban form. Here we attempt to sketch the principles that govern its construction and to glance briefly at some of the milestones of its intense history, which deserves more unhurried attention.

Parallel Virtual Urban Workshops A “resasonable-cost” methodology for academic internationalization in problem-solving oriented postgraduate subjects

Relevance: 20.00%

Abstract:

Description and critical analysis of a postgraduate workshop methodology to be carried out between two universities, in English and with the support of new technologies.

RDSZ: an approach for lossless RDF stream compression

Relevance: 20.00%

Abstract:

In many applications (like social or sensor networks) the information generated can be represented as a continuous stream of RDF items, where each item describes an application event (social network post, sensor measurement, etc). In this paper we focus on compressing RDF streams. In particular, we propose an approach for lossless RDF stream compression, named RDSZ (RDF Differential Stream compressor based on Zlib). This approach takes advantage of the structural similarities among items in a stream by combining a differential item encoding mechanism with the general purpose stream compressor Zlib. Empirical evaluation using several RDF stream datasets shows that this combination produces gains in compression ratios with respect to using Zlib alone.
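The general idea of combining a differential item encoding with Zlib can be sketched as follows. This is only an illustration under stated assumptions, not the RDSZ encoding itself: each stream item is modelled as a flat Python dictionary of predicate/value pairs, the diff format (`"set"` and `"del"` keys serialized as JSON) is invented for the example, and the names `encode_stream` and `decode_stream` are hypothetical. Only `zlib.compressobj`, `zlib.Z_SYNC_FLUSH` and `zlib.decompress` are real library calls.

```python
import json
import zlib


def encode_stream(items):
    """Diff each item against its predecessor, then feed the diffs to zlib."""
    comp = zlib.compressobj()
    prev, chunks = {}, []
    for item in items:
        delta = {
            # Fields that are new or whose value changed since the last item.
            "set": {k: v for k, v in item.items() if k not in prev or prev[k] != v},
            # Fields present in the last item but missing from this one.
            "del": [k for k in prev if k not in item],
        }
        line = json.dumps(delta, sort_keys=True).encode("utf-8") + b"\n"
        chunks.append(comp.compress(line))
        # A sync flush makes each item decodable as soon as it is sent.
        chunks.append(comp.flush(zlib.Z_SYNC_FLUSH))
        prev = item
    chunks.append(comp.flush())
    return b"".join(chunks)


def decode_stream(blob):
    """Rebuild the original items by replaying the diffs in order."""
    prev = {}
    for line in zlib.decompress(blob).splitlines():
        delta = json.loads(line)
        item = dict(prev)
        item.update(delta["set"])
        for key in delta["del"]:
            item.pop(key, None)
        yield item
        prev = item
```

Items that share most of their structure produce very small diffs, so the general-purpose compressor receives less redundant input; whether this yields a net gain on a given dataset has to be measured.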

Multiphase Parallel Interleaved And Primary-Parallel Secondary-Series Forward Micro-Inverter Comparison.

Relevance: 20.00%

Abstract:

The multiphase technique is attractive in PV AC-module applications because phase shedding improves light-load efficiency and because it allows a low-profile implementation. This paper presents a comparison, in terms of size and efficiency, of the parallel interleaved and the parallel-series connected multiphase configurations, as a function of the number of phases, for a forward micro-inverter operated in DCM. 8-phase prototypes of both multiphase configurations are built and compared with each other and with the single-phase forward micro-inverter, validating the presented analysis.

Logarithmical hopping encoding: a low computational complexity algorithm for image compression

Relevance: 20.00%

Abstract:

LHE (logarithmical hopping encoding) is a computationally efficient image compression algorithm that exploits the Weber–Fechner law to encode the error between colour component predictions and the actual values of those components. More concretely, for each pixel, luminance and chrominance predictions are calculated as a function of the surrounding pixels, and the errors between the predictions and the actual values are then logarithmically quantised. The main advantage of LHE is that, although it is capable of achieving low-bit-rate encoding with high quality results in terms of peak signal-to-noise ratio (PSNR) and of full-reference (FSIM) and non-reference (blind/referenceless image spatial quality evaluator) image quality metrics, its time complexity is O(n) and its memory complexity is O(1). Furthermore, an enhanced version of the algorithm is proposed, in which the output codes provided by the logarithmical quantiser are used in a pre-processing stage to estimate the perceptual relevance of the image blocks. This allows the algorithm to downsample the blocks with low perceptual relevance, thus improving the compression rate. The performance of LHE is especially remarkable when the bit-per-pixel rate is low, showing much better quality, in terms of PSNR and FSIM, than JPEG and slightly lower quality than JPEG-2000 while being more computationally efficient.
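The notion of logarithmically quantising a prediction error can be illustrated with a toy single-row coder. This sketch is not the published LHE algorithm: the predictor (the previously reconstructed pixel), the hop magnitudes in `HOPS`, and the function names are all assumptions made for the example. It only shows why such a scheme runs in a single pass with constant state.

```python
# Hypothetical hop magnitudes with geometric spacing: fine steps for small
# errors, coarse steps for large ones.
HOPS = [0, 4, 8, 16, 32, 64]
LEVELS = sorted({sign * h for h in HOPS for sign in (-1, 1)})


def encode_row(pixels, first_prediction=128):
    """Return one level index per pixel, tracking the decoder's reconstruction."""
    codes, pred = [], first_prediction
    for p in pixels:
        err = p - pred
        # Pick the quantisation level closest to the prediction error.
        idx = min(range(len(LEVELS)), key=lambda i: abs(LEVELS[i] - err))
        codes.append(idx)
        # Predict the next pixel from the value the decoder will reconstruct.
        pred = max(0, min(255, pred + LEVELS[idx]))
    return codes


def decode_row(codes, first_prediction=128):
    """Rebuild the (lossy) pixel row from the level indices."""
    out, pred = [], first_prediction
    for idx in codes:
        pred = max(0, min(255, pred + LEVELS[idx]))
        out.append(pred)
    return out
```

Each pixel is visited once and the only state carried between pixels is the current prediction, which matches the linear-time, constant-memory behaviour the abstract describes, although the real predictor and quantiser differ.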

Comparative theoretical analysis between parallel and perpendicular geometries for 2D particle patterning in photovoltaic ferroelectric substrates

Relevance: 20.00%

Abstract:

This paper describes the dielectrophoretic potential created by the evanescent electric field acting on a particle near a photovoltaic crystal surface, as a function of the crystal cut. This electric field is obtained from the steady state solution of the Kukhtarev equations for the photovoltaic effect, where the diffusion term has been disregarded. First, the space charge field generated by a small, square light spot where d _ l (d being a side of the square and l the crystal thickness) is studied. The surface charge density generated in both geometries is calculated and compared, as their relation determines the different properties of the dielectrophoretic potential for the two cuts. The shape of the dielectrophoretic potential is obtained and compared for several distances to the sample. Afterwards, other light patterns are studied by the superposition of square spots, and the resulting trapping profiles are analysed. Finally, the surface charge densities and trapping profiles for different d/l relations are studied.

Advanced Control Strategies for a 6 DoF Hydraulic Parallel Robot Based on the Dynamic Model

Relevance: 20.00%

Abstract:

Robots have now made their way into real applications that were prohibitive and unthinkable thirty years ago, mainly because of the increase in computing power and the evolution of the theory of robotics and control. Although the current literature contains plenty of information on these topics, it is not easy to find clear guidance on how to design and implement a controller for a robot. In general, the design of a controller requires a complete understanding and knowledge of the system to be controlled. Therefore, for advanced control techniques, the system must first be identified. This task is cumbersome and never straightforward: it requires great expertise, and some criteria must be adopted. The problem of designing a controller is even more complex when dealing with Parallel Manipulators (PM), since their closed-loop structures give rise to a highly nonlinear system. The current work is developed on this basis, and it intends to summarize and gather all the concepts and experience involved in the control of a hydraulic Parallel Manipulator. The main objective of this thesis is to provide a guide to all the steps involved in designing advanced control techniques for PMs. The analysis of the PM under study is broken down to the core of the mechanism: the hydraulic actuators. The actuators are modeled and experimentally identified. Additionally, some considerations regarding traditional PID controllers are presented, and an adaptive controller is finally implemented. From a macro perspective, the kinematic and dynamic models of the PM are presented. Based on the model of the system and extending the adaptive controller of the actuator, a control strategy for the PM is developed and its performance is analyzed in simulation.

Development and Testing of a Fiber Bragg Grating Strain Sensor for Uniaxial Compression of Rock Specimens

Relevance: 20.00%

Abstract:

Optical fiber sensors are a technology that has matured in recent years; however, further development is needed for applications to natural materials such as rocks. As complex aggregates, rocks can contain mineral particles and fractures much larger than the electrical strain gauges traditionally used to measure strain in laboratory tests, so the results obtained may be unrepresentative. In this work, large-area, curved strain sensors based on fiber Bragg gratings (FBG) were designed, manufactured and tested in order to obtain representative measurements on rock samples containing minerals and structures of different compositions, sizes and directions. The report presents the manufacturing of the transducer, its mechanical characterization, its calibration and its evaluation in uniaxial compression tests on rock samples. To verify the efficiency with which rock strain is transmitted to the bonded sensor, a strain transfer analysis was also performed, including the effects of the adhesive, the sample and the transducer. The experimental results indicate that the developed sensor enables reliable measurement of the strain and of its transmission from rock to sensor, which makes it appropriate for rocks and other heterogeneous materials. This opens an interesting perspective for applications on irregular surfaces, since the size and shape of the measurement area can be increased at will, and it makes more reliable results possible on small samples. The research also suggests that the optical strain gauge is suitable for real-scale works, where traditional electrical systems have shown limitations.

Parallel Computer Vision Algorithms for Graphics Processing Units

Relevance: 20.00%

Abstract:

The evolution of smartphones, all equipped with digital cameras, is driving a growing demand for ever more complex applications that need to rely on real-time computer vision algorithms. However, video signals are only increasing in size, whereas the performance of single-core processors has somewhat stagnated in the past few years. 
Consequently, new computer vision algorithms will need to be parallel to run on multiple processors and be computationally scalable. One of the most promising classes of processors nowadays can be found in graphics processing units (GPU). These are devices offering a high degree of parallelism, excellent numerical performance and increasing versatility, which makes them attractive for scientific computation. In this thesis, we explore two computer vision applications with a high computational complexity that precludes them from running in real time on traditional uniprocessors. However, we show that by parallelizing subtasks and implementing them on a GPU, both applications attain their goals of running at interactive frame rates. In addition, we propose a technique for fast evaluation of arbitrarily complex functions, specially designed for GPU implementation. First, we explore the application of depth-image–based rendering techniques to the unusual configuration of two convergent, wide-baseline cameras, in contrast to the usual configuration in 3D TV of narrow-baseline, parallel cameras. By using a backward mapping approach with a depth inpainting scheme based on median filters, we show that these techniques are adequate for free viewpoint video applications. In addition, we show that referring depth information to a global reference system is ill-advised and should be avoided. Then, we propose a background subtraction system based on kernel density estimation techniques. These techniques are well suited to modelling complex scenes featuring multimodal backgrounds, but have not been widely used because of their huge computational and memory complexity. 
The proposed system, implemented in real time on a GPU, features novel proposals for dynamic kernel bandwidth estimation for the background model, selective update of the background model, update of the position of reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared to other state-of-the-art algorithms, demonstrate the high quality and versatility of our proposal. Finally, we propose a general method for the approximation of arbitrarily complex functions using continuous piecewise linear functions, specially formulated for GPU implementation by leveraging their texture filtering units, normally unused for numerical computation. Our proposal features a rigorous mathematical analysis of the approximation error as a function of the number of samples, as well as a method to obtain a near-optimal partition of the domain of the function to minimize approximation error.
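The last contribution, approximating a function by a continuous piecewise linear interpolant, can be sketched on the CPU as a lookup table with linear interpolation, which is the operation a GPU texture filtering unit performs in hardware. The sketch below assumes a uniform partition of the domain, whereas the thesis describes a method for choosing a better partition; the names `build_table` and `lerp_eval` are hypothetical.

```python
import math


def build_table(f, a, b, n):
    """Sample f at n + 1 equally spaced nodes on [a, b]."""
    h = (b - a) / n
    return [f(a + i * h) for i in range(n + 1)]


def lerp_eval(table, a, b, x):
    """Evaluate the piecewise linear interpolant of the table at x in [a, b]."""
    n = len(table) - 1
    t = (x - a) / (b - a) * n
    i = max(0, min(int(t), n - 1))  # clamp so that x == b uses the last segment
    frac = t - i
    return (1.0 - frac) * table[i] + frac * table[i + 1]


# Example: 64 segments for sin on [0, pi].
table = build_table(math.sin, 0.0, math.pi, 64)
approx = lerp_eval(table, 0.0, math.pi, 1.0)
```

For a twice-differentiable function on a uniform grid of spacing h, the classical interpolation bound gives an error of at most h²/8 times the maximum of |f''|, so doubling the number of samples cuts the worst-case error by roughly a factor of four.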

Grid-Connected Forward Microinverter With Primary-Parallel Secondary-Series Transformer

Relevance: 20.00%

Abstract:

This paper presents a primary-parallel secondary-series multicore forward microinverter for photovoltaic ac-module applications. The presented microinverter operates with a constant off-time boundary mode control, providing MPPT capability and unity power factor. The proposed multitransformer solution allows the use of low-profile transformers with a unity turns ratio. As a result, the transformers are better coupled and the overall performance of the microinverter is improved. With the multiphase solution the number of devices increases, but the current stress and losses per device are reduced, which eases thermal management. Furthermore, the decoupling capacitor is split among the phases, contributing to a low-profile solution without electrolytic capacitors that is suitable for mounting in the frame of a PV module. The proposed solution is compared to the classical parallel-interleaved approach, showing better efficiency over a wide power range and improving the weighted efficiency.

Optimizing communication by compression for Multi-GPU Scalable Breadth-First Searches

Relevance: 20.00%

Abstract:

Because data volumes keep growing in many current information systems, many of the algorithms that traverse these data structures lose performance when searching them. Since such data are often represented as vertex-and-edge structures (graphs), the Graph500 challenge was created in 2009. Earlier challenges such as Top500 measured performance in terms of the computing capacity of systems, using LINPACK tests. Graph500 instead measures performance by running a breadth-first search (BFS) traversal on graphs. BFS is one of the pillars of many other graph algorithms, such as SSSP, shortest path or betweenness centrality, so an improvement in BFS would help improve the algorithms that use it.

Problem analysis. The BFS algorithm used in high-performance computing (HPC) systems is usually a distributed version of the original sequential algorithm. The distributed version starts by partitioning the graph; each distributed processor then computes one part and distributes its results to the other systems. Because processing in each node is much faster than data transfer over the interconnection network, many approaches have been taken to reduce the performance lost in transfers. For the initial partitioning of the graph, the traditional approach (called a 1D-partitioned graph) assigns each node a fixed set of vertices to process. To reduce data traffic, another partitioning (2D) was proposed in which the distribution is based on the edges of the graph instead of the vertices. This partitioning reduced network traffic from O(N×M) to O(log(N)). Other approaches to reducing transfers exist, such as an initial reordering of the vertices to add locality within the nodes, or dynamic partitioning. The approach proposed in this work is to apply recent compression techniques from large data systems, such as high-volume databases or internet search engines, to compress the data transferred between nodes.

ABSTRACT

The Breadth First Search (BFS) algorithm is the foundation and building block of many higher graph-based operations such as spanning trees, shortest paths and betweenness centrality. The importance of this algorithm increases each day because it is a key requirement for many data structures that are becoming popular nowadays and that turn out to be graph structures internally. When the BFS algorithm is parallelized and the data is distributed across several processors, some research shows a performance limitation introduced by the interconnection network [31]. Hence, improvements in the area of communications may benefit the global performance of this key algorithm. This work presents an alternative compression mechanism. It differs from existing methods in that it is aware of characteristics of the data that may benefit the compression. Apart from this, we perform another test to see how this algorithm (in a distributed scenario) benefits from traditional instruction-based optimizations. Last, we review current supercomputing techniques and the related work being done in the area.
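To make the idea of compressing inter-node BFS traffic concrete, the sketch below encodes a frontier (the set of vertex identifiers sent to another node) as sorted gaps stored in variable-length bytes. This is a generic technique of the kind used in search-engine posting lists, chosen here as an assumption; the abstract does not say which codec the thesis uses, and the function names are hypothetical. Vertex identifiers are assumed to be non-negative integers.

```python
def encode_frontier(vertices):
    """Delta-encode sorted vertex ids, then store each gap as a varint."""
    out = bytearray()
    prev = 0
    for v in sorted(set(vertices)):
        gap = v - prev
        prev = v
        # Emit 7 bits per byte; the high bit marks that more bytes follow.
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_frontier(data):
    """Invert encode_frontier and return the sorted vertex ids."""
    vertices, prev, gap, shift = [], 0, 0, 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += gap
            vertices.append(prev)
            gap, shift = 0, 0
    return vertices
```

Sorting makes the gaps small when frontier vertices cluster, so most gaps fit in one byte instead of a fixed 4 or 8 bytes per identifier; the receiving node pays a decoding cost in exchange for less traffic on the interconnection network.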
