23 results for efficient algorithm

at Universidad Politécnica de Madrid


Relevance:

60.00%

Publisher:

Abstract:

This project starts the development of an efficient algorithm aimed at devices with low processing power that helps people without specific training to record a biological signal such as an electrocardiogram. The application must therefore guide the user through the recording, evaluate the quality of the acquired signal and, in pseudo real time, check whether that quality is good enough for later diagnosis, so that if the medical test needs to be repeated it can be repeated immediately. In addition, the algorithm must extract the most relevant features of the electrocardiographic signal, process them and derive a set of significant patterns that point towards the diagnosis of some of the most common pathologies that can be inferred from cardiac signals. For the extraction, evaluation and decision making of this stage prior to the generation of the diagnosis, the classic architecture of a pattern recognition system is followed, defining as many classes as pathologies are to be identified. The diagnostic information produced by the pattern recognition system could then guide the later review of the test by a qualified medical professional working remotely, avoiding travel to areas where, given the means available today and the poor economic conditions, the presence of healthcare staff is extremely unlikely.
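As a loose illustration of the kind of on-device signal check the abstract describes, here is a minimal Python sketch; the flatline/clipping tests, the threshold R-peak detector and the 60-100 bpm normal range are generic textbook choices, not details taken from this project:

```python
import numpy as np

def quick_ecg_check(sig, fs):
    """Toy quality gate and rate estimate for a single-lead ECG trace."""
    sig = np.asarray(sig, dtype=float)
    if np.ptp(sig) < 1e-3:                       # flat recording: retake now
        return {"ok": False, "reason": "flat signal"}
    amp = np.abs(sig)
    if np.mean(amp >= 0.99 * amp.max()) > 0.05:  # saturated samples: retake
        return {"ok": False, "reason": "clipping"}
    thr = sig.mean() + 2.0 * sig.std()           # crude R-peak threshold
    above = sig > thr
    peaks = np.flatnonzero(~above[:-1] & above[1:])
    if len(peaks) < 2:
        return {"ok": False, "reason": "no beats found"}
    bpm = 60.0 * fs / np.mean(np.diff(peaks))    # rate from RR intervals
    label = "bradycardia" if bpm < 60 else "tachycardia" if bpm > 100 else "normal"
    return {"ok": True, "bpm": round(bpm, 1), "pattern": label}
```

A real pipeline would replace the threshold detector with a robust QRS detector and feed morphological features to the pattern classifier described above.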

Relevance:

60.00%

Publisher:

Abstract:

The advection-diffusion partial differential equation with chemical reaction is the basis of atmospheric pollutant dispersion models, and the numerical methods used to solve it have been studied extensively. This Thesis presents the implementation of a new, exactly conservative method for the advective part of the equation within the mesoscale Eulerian chemistry transport model CHIMERE. The method combines a finite volume technique with a rational interpolation: the cell-integrated average is predicted via a flux formulation, computed as a weighted integral over each cell of the finite volume discretization, so the transported mass is exactly conserved by construction. Numerical results obtained from simulations with the conservative advection scheme implemented in CHIMERE have been compared with pollutant concentrations observed at the network of monitoring stations distributed across the Iberian Peninsula; the mean normalized bias and the mean normalized absolute error fall within the ranges proposed by the EPA for an accurate model. In addition, a new method is introduced for the advective-diffusive part of the equation: a high-order finite difference scheme for the diffusion term is coupled with the conservative rational method for advection in one and two dimensions. Results for a range of academic and real atmospheric problems have been compared with the analytical solution of the advection-diffusion equation, showing that the new method approximates the solution accurately and efficiently. Finally, a complete model covering advection and diffusion with chemical reaction has been developed by combining the previous methods with a second-order backward differentiation formula (BDF2). BDF2 is an implicit, second-order multistep scheme suited to the stiff systems of ordinary differential equations that model atmospheric chemical kinetics. A Gauss-Seidel iteration is used to solve the implicitly defined BDF2 solution; compared with the more commonly used modified Newton iteration, it provides faster computation and a low start-up cost and memory demand, which makes it well suited to operational models of atmospheric chemical kinetics.
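As a rough sketch of the BDF2 plus Gauss-Seidel combination described in the last paragraph, consider a linear stiff system y' = Ay; the matrix, step size and iteration count below are arbitrary illustrations, not values from the Thesis:

```python
import numpy as np

def bdf2_gauss_seidel(A, y0, y1, h, n_steps, sweeps=25):
    """Integrate y' = A y with BDF2,
        y_{n+2} = (4/3) y_{n+1} - (1/3) y_n + (2/3) h A y_{n+2},
    solving the implicit system M y = rhs with Gauss-Seidel sweeps
    instead of a Newton or direct solve."""
    d = len(y0)
    M = np.eye(d) - (2.0 * h / 3.0) * A
    ys = [np.asarray(y0, float), np.asarray(y1, float)]
    for _ in range(n_steps):
        rhs = (4.0 / 3.0) * ys[-1] - (1.0 / 3.0) * ys[-2]
        y = ys[-1].copy()                    # warm start from the last step
        for _ in range(sweeps):              # Gauss-Seidel iteration
            for i in range(d):
                s = M[i, :i] @ y[:i] + M[i, i + 1:] @ y[i + 1:]
                y[i] = (rhs[i] - s) / M[i, i]
        ys.append(y)
    return np.array(ys)

# Stiff toy problem: one fast and one slow decaying component.
A = np.array([[-1000.0, 1.0], [0.0, -0.5]])
y0 = np.array([1.0, 1.0])
y1 = y0 + 1e-3 * (A @ y0)   # crude explicit Euler bootstrap for the 2nd value
traj = bdf2_gauss_seidel(A, y0, y1, h=1e-3, n_steps=100)
```

Gauss-Seidel converges here because M is strongly diagonally dominant for the stiff (large negative diagonal) Jacobians typical of atmospheric kinetics, which is what makes it cheaper than a Newton iteration in both time and memory.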

Relevance:

60.00%

Publisher:

Abstract:

Embedded systems are becoming more common and more complex every day, so finding safe, efficient and inexpensive software development processes aimed specifically at this class of systems is more necessary than ever. Unlike in the recent past, current advances in microprocessor technology make it possible to build equipment powerful enough to run several software systems on a single machine. Moreover, some embedded systems have safety requirements: many lives and/or large investments may depend on their correct operation. Such software is designed and implemented according to very strict and demanding development standards, and in some cases certification of the software is also required. In these cases, mixed-criticality systems can be a very valuable alternative: applications with different criticality levels run on the same computer. However, it is often necessary to certify the entire system at the criticality level of the most critical application, which drives costs up sharply. Virtualization has emerged as a very interesting technology for containing those costs. It allows a set of virtual machines, or partitions, to run applications with very strong temporal and spatial isolation, which in turn allows each partition to be certified independently. The development of MCPS (Mixed-Criticality Partitioned Systems) requires updating traditional software development models, since they cover neither the new activities nor the new roles involved. For example, the system integrator has to define the partitions, and the application developer has to take into account the characteristics of the partition where the application will run. The V-model has traditionally been especially relevant in embedded systems development, so it has been adapted to scenarios such as the parallel development of applications or the addition of a new partition to an existing system. The objective of this PhD thesis is to improve the current technology for developing MCPS. To this end, a framework has been designed and implemented specifically to facilitate and improve the development processes for this class of systems, including an algorithm that generates the system partitioning automatically. The framework integrates all the activities required to develop a partitioned system, supporting the new roles and activities mentioned above. Its design is based on Model-Driven Engineering (MDE), which promotes the use of models as the fundamental elements of the development process; it therefore provides the tools needed to model and partition the system, to validate the results, and to generate the artifacts required to compile, build and deploy it. Extensibility and integration with validation tools were key factors in the design: support for new non-functional requirements can be incorporated, as can the generation of new artifacts such as documentation or additional programming languages. A key part of the framework is the partitioning algorithm. It has been designed to be independent of the types of application requirements and to let the system integrator implement new system requirements. This independence is achieved through partitioning constraints, which the algorithm guarantees will be satisfied by the resulting partitioned system. The constraints have enough expressive power that a small set of them can state most common non-functional requirements; they can be defined manually by the system integrator or generated automatically by a tool from the functional and non-functional requirements of an application. The partitioning algorithm takes the system models and the partitioning constraints as inputs and produces a deployment model that defines the required partitions; each partition in turn defines which applications run in it and the resources it needs to run correctly. The partitioning problem and its constraints are modelled mathematically as colored graphs, in which a proper vertex coloring represents a correct system partitioning, and the algorithm is also able to provide alternative partitionings if required. The framework, including the partitioning algorithm, has been successfully used in two industrial use cases: the UPMSat-2 satellite and a demonstrator of the control system of a wind turbine. The algorithm has also been validated by running numerous synthetic scenarios, including very complex ones with more than 500 applications.
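Since the abstract models partitioning as proper coloring of a constraint graph, a minimal greedy-coloring sketch may help fix ideas; the application names, the conflict relation and the Welsh-Powell ordering below are illustrative assumptions, not the thesis algorithm:

```python
from itertools import count

def partition(apps, conflicts):
    """Greedy proper coloring: applications that share a separation
    constraint (an edge in `conflicts`) must land in different
    partitions (colors)."""
    color = {}
    # Color highest-degree vertices first (Welsh-Powell heuristic).
    for app in sorted(apps, key=lambda a: -len(conflicts.get(a, set()))):
        used = {color[n] for n in conflicts.get(app, set()) if n in color}
        color[app] = next(c for c in count() if c not in used)
    partitions = {}
    for app, c in color.items():
        partitions.setdefault(c, []).append(app)
    return partitions

apps = ["nav", "telemetry", "payload", "logging"]
conflicts = {"nav": {"payload", "logging"}, "payload": {"nav"},
             "logging": {"nav"}, "telemetry": set()}
print(partition(apps, conflicts))   # e.g. {0: ['nav', 'telemetry'], 1: ...}
```

Each color class becomes one partition; alternative partitionings, as mentioned in the abstract, can be produced by re-running with a different vertex ordering.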

Relevance:

40.00%

Publisher:

Abstract:

A method for analyzing parabolic reflectors with an arbitrary piecewise rim is presented in this communication. Reflectors of this kind, when operating as collimators in compact range facilities, need to be large in terms of wavelengths. Analyzing them with full-wave/MoM techniques is very inefficient, designing them with PO techniques is not very practical either, and fast GO formulations do not offer enough accuracy to assess performance. The proposed algorithm is based on a hybrid GO-PWS scheme that uses both analytical and non-analytical formulations: on one side, the polygonal-rim reflector is treated analytically; on the other, the non-analytical computations rely on efficient operations such as an M²-order two-dimensional FFT. The combination of these two techniques gives the algorithm true ad hoc design capability, achieved through the resulting analysis speed-up. The purpose of the algorithm is to obtain an optimal conformal serrated-edge reflector design by analyzing the quality of the field within the quiet zone that the reflector generates in its forward half space.
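As generic background on why the FFT makes the non-analytical part efficient: the plane-wave spectrum (PWS) of a sampled aperture field is its two-dimensional Fourier transform, so a single 2-D FFT over the M x M aperture grid evaluates it on the whole spectral grid. A minimal numpy sketch with an arbitrary Gaussian illumination (not the paper's reflector model):

```python
import numpy as np

N = 512                                   # samples per axis (an N^2-point FFT)
dx = 0.01                                 # sample spacing in metres (assumed)
x = (np.arange(N) - N // 2) * dx
X, Y = np.meshgrid(x, x)
aperture = np.exp(-(X**2 + Y**2) / 0.5**2)           # toy aperture field
# PWS: 2-D Fourier transform of the tangential aperture field.
pws = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(aperture)))
kx = 2.0 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dx))  # wavenumber axis
```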

Relevance:

40.00%

Publisher:

Abstract:

Abstract interpretation has been widely used for the analysis of object-oriented languages and, in particular, Java source and bytecode. However, while most existing work deals with the problem of finding expressive abstract domains that accurately track the characteristics of a particular concrete property, the underlying fixpoint algorithms have received comparatively less attention. In fact, many existing (abstract interpretation based) fixpoint algorithms rely on relatively inefficient techniques for solving inter-procedural call graphs, or are specific and tied to particular analyses. We also argue that the design of an efficient fixpoint algorithm is pivotal to supporting the analysis of large programs. In this paper we introduce a novel algorithm for the analysis of Java bytecode which includes a number of optimizations in order to reduce the number of iterations. The algorithm is parametric, in the sense that it is independent of the abstract domain used and can be applied to different domains as "plug-ins"; it is also multivariant and flow-sensitive. Moreover, it is based on a program transformation, prior to the analysis, that results in a highly uniform representation of all the features in the language and therefore simplifies analysis. Detailed descriptions of decompilation solutions are given and discussed with an example. We also provide some performance data from a preliminary implementation of the analysis.
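The paper's optimizations aside, the skeleton that any such analysis iterates to stability is a generic worklist fixpoint; a domain-independent sketch follows (the set-based toy domain and graph are assumptions for illustration):

```python
def fixpoint(entry, init, successors, transfer, join, bottom):
    """Generic worklist fixpoint over a call/flow graph: propagate abstract
    values until nothing changes. Parametric in the domain, in the sense the
    paper uses: the domain enters only through transfer/join/bottom."""
    value = {entry: init}
    work = [entry]
    while work:
        node = work.pop()
        out = transfer(node, value[node])
        for succ in successors(node):
            old = value.get(succ, bottom)
            new = join(old, out)
            if new != old:              # value grew: re-analyse downstream
                value[succ] = new
                work.append(succ)
    return value

# Toy run on a cyclic graph with a set-union domain (terminates: finite sets).
succs = {"main": ["f"], "f": ["g"], "g": ["f"]}
result = fixpoint("main", frozenset({"entry"}),
                  lambda n: succs.get(n, []),
                  lambda n, v: v | {n},        # transfer: tag value with node
                  lambda a, b: a | b, frozenset())
```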

Relevance:

40.00%

Publisher:

Abstract:

Abstract interpretation has been widely used for the analysis of object-oriented languages and, more precisely, Java source and bytecode. However, while most of the existing work deals with the problem of finding expressive abstract domains that track accurately the characteristics of a particular concrete property, the underlying fixpoint algorithms have received comparatively less attention. In fact, many existing (abstract interpretation based) fixpoint algorithms rely on relatively inefficient techniques to solve inter-procedural call graphs or are specific and tied to particular analyses. We argue that the design of an efficient fixpoint algorithm is pivotal to supporting the analysis of large programs. In this paper we introduce a novel algorithm for analysis of Java bytecode which includes a number of optimizations in order to reduce the number of iterations. Also, the algorithm is parametric in the sense that it is independent of the abstract domain used and it can be applied to different domains as "plug-ins". It is also incremental in the sense that, if desired, analysis data can be saved so that only a reduced amount of reanalysis is needed after a small program change, which can be instrumental for large programs. The algorithm is also multivariant and flow-sensitive. Finally, another interesting characteristic of the algorithm is that it is based on a program transformation, prior to the analysis, that results in a highly uniform representation of all the features in the language and therefore simplifies analysis. Detailed descriptions of decompilation solutions are provided and discussed with an example.

Relevance:

40.00%

Publisher:

Abstract:

The HELLO protocol, or neighborhood discovery, is essential in wireless ad hoc networks: it defines the rules by which nodes claim their existence/aliveness. In the presence of node mobility, no fixed optimal HELLO frequency and optimal transmission range exist that maintain accurate neighborhood tables while reducing energy consumption and bandwidth occupation. Thus, a Turnover based Frequency and transmission Power Adaptation algorithm (TFPA) is presented in this paper. The method enables nodes in mobile networks to dynamically adjust both their HELLO frequency and their transmission range depending on the relative speed. In TFPA, each node monitors its neighborhood table to count new neighbors and calculate the turnover ratio. The relationship between relative speed and turnover ratio is formulated, and the optimal transmission range is derived from a battery consumption model so as to minimize the overall transmission energy. Building on the theoretical analysis, the HELLO frequency is adapted dynamically in conjunction with the transmission range to maintain accurate neighborhood tables and to allow substantial energy savings. The algorithm is simulated and compared with other state-of-the-art algorithms. The experimental results demonstrate that TFPA obtains high neighborhood accuracy with low HELLO frequency (at least an 11% average reduction) and with the lowest energy consumption. Moreover, TFPA does not require any additional GPS-like device to estimate the relative speed of each node, which reduces the hardware cost.
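A toy version of the turnover-driven adaptation loop may clarify the mechanism; the linear scaling and all the constants below are invented for illustration and are not the TFPA equations:

```python
def adapt_hello(table_size, new_neighbors, f_min=0.2, f_max=5.0, target=0.1):
    """Illustrative turnover-based HELLO frequency adaptation (Hz).
    turnover ~ share of the neighborhood that is new since the last HELLO;
    high turnover indicates high relative speed, so HELLOs must be sent
    more often to keep the neighborhood table accurate."""
    turnover = new_neighbors / max(table_size, 1)
    f = f_min + (f_max - f_min) * min(turnover / target, 1.0)
    return f, turnover

freq, t = adapt_hello(table_size=20, new_neighbors=4)   # high churn -> f_max
```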

Relevance:

30.00%

Publisher:

Abstract:

This thesis deals with the problem of efficiently tracking 3D objects in sequences of images. We tackle the efficient 3D tracking problem by using direct image registration, posed as an iterative optimization procedure that minimizes a brightness error norm. We review the most popular iterative methods for image registration in the literature, turning our attention to those algorithms that use efficient optimization techniques. Two forms of efficient registration algorithms are investigated. The first type comprises the additive registration algorithms, which incrementally compute the motion parameters by linearly approximating the brightness error function. We centre our attention on Hager and Belhumeur's factorization-based algorithm for image registration: we propose a fundamental requirement that factorization-based algorithms must satisfy to guarantee good convergence, and introduce a systematic procedure that automatically computes the factorization. We also present two warp functions, satisfying the requirement, for registering rigid and nonrigid 3D targets. The second type comprises the compositional registration algorithms, in which the brightness error function is written using function composition. We study the current approaches to compositional image alignment and emphasize the importance of the Inverse Compositional method, which is known to be the most efficient image registration algorithm. We introduce a new algorithm, Efficient Forward Compositional image registration, which avoids the need to invert the warping function and provides a new interpretation of the working mechanisms of inverse compositional alignment. Using this information, we propose two fundamental requirements that guarantee the convergence of compositional image registration methods. Finally, we support our claims with extensive experimental testing on synthetic and real-world data, and we propose a distinction between image registration and tracking when using efficient algorithms: we show that, depending on whether the fundamental requirements hold, some efficient algorithms are suitable for image registration but not for tracking.
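For reference, in Baker and Matthews' standard notation the inverse compositional method solves, at each iteration,

```latex
\min_{\Delta p} \sum_{\mathbf{x}} \Big[ T\big(\mathbf{W}(\mathbf{x};\Delta p)\big)
  - I\big(\mathbf{W}(\mathbf{x};p)\big) \Big]^2 ,
\qquad
\mathbf{W}(\mathbf{x};p) \;\leftarrow\; \mathbf{W}(\mathbf{x};p) \circ \mathbf{W}(\mathbf{x};\Delta p)^{-1},
```

so the Hessian of the template-side linearization is constant across iterations and can be precomputed. This is the source of the method's efficiency, and the inverted incremental warp in the update is exactly what the Efficient Forward Compositional algorithm avoids.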

Relevance:

30.00%

Publisher:

Abstract:

The failure detector class Omega (Ω) provides an eventual leader election functionality, i.e., eventually all correct processes permanently trust the same correct process. An algorithm is communication-efficient if the number of links that carry messages forever is bounded by n, where n is the number of processes in the system. An algorithm is crash-quiescent if it eventually stops sending messages to crashed processes. In this regard, it has recently been shown that Ω cannot be implemented crash-quiescently without a majority of correct processes. We say that the membership is unknown if each process pi only knows its own identity and the number of processes in the system (that is, i and n), but does not know the identity of the rest of the processes of the system. There is a type of link (denoted ADD link) over which a bounded (but unknown) number of consecutive messages can be delayed or lost. In this work we present the first implementation (to our knowledge) of Ω in partially synchronous systems with ADD links and unknown membership. Furthermore, it is the first implementation of Ω that combines two very interesting properties: communication-efficiency and crash-quiescence when the majority of processes are correct. Finally, with the same algorithm we also obtain a failure detector () such that every correct process eventually and permanently outputs the set of all correct processes.
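As background (and emphatically not the communication-efficient, crash-quiescent algorithm of the paper), the classic timeout-based idea behind Ω in partially synchronous systems can be sketched as:

```python
import time

class EventualLeader:
    """Trust the smallest id among processes heard from recently; with
    eventually timely links the timeout stabilises and all correct
    processes converge on the same correct leader."""
    def __init__(self, my_id, timeout=2.0):
        self.my_id = my_id
        self.timeout = timeout
        self.last_heard = {my_id: time.monotonic()}
    def on_heartbeat(self, sender_id):
        self.last_heard[sender_id] = time.monotonic()
    def leader(self):
        now = time.monotonic()
        alive = [p for p, t in self.last_heard.items()
                 if now - t <= self.timeout]
        return min(alive) if alive else self.my_id
```

Note that the membership can start out unknown here, since identities are learned only from received heartbeats.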

Relevance:

30.00%

Publisher:

Abstract:

Industrial applications of computer vision sometimes require the detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions and leave pheromone trails along their paths. The pheromone map is used to identify the exact number of clusters and to assign the pixels to these clusters using the pheromone gradient. We applied ASCA to the detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as the 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters does not need to be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.
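A very loose sketch of the ant metaphor follows (greedy density ascent plus pheromone deposit and evaporation); the grid resolution, ant count and rates are arbitrary, and the real ASCA rules differ:

```python
import numpy as np

def ant_pheromone_map(points, grid=32, ants=200, steps=300, rng=None):
    """Ants hill-climb towards denser cells of a data histogram and
    deposit pheromone; the pheromone map then reveals cluster locations."""
    rng = rng or np.random.default_rng(0)
    H, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=grid)
    pher = np.zeros_like(H)
    pos = rng.integers(0, grid, size=(ants, 2))
    for _ in range(steps):
        for a in range(ants):
            i, j = pos[a]
            cand = [(i, j)] + [(i + di, j + dj)
                               for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                               if 0 <= i + di < grid and 0 <= j + dj < grid]
            pos[a] = max(cand, key=lambda c: H[c])   # greedy density ascent
            pher[tuple(pos[a])] += 1.0               # pheromone deposit
        pher *= 0.95                                 # evaporation
    return pher

rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0.25, 0.05, (150, 2)),
                 rng.normal(0.75, 0.05, (150, 2))])  # two toy clusters
pher = ant_pheromone_map(pts)
```

Thresholding the pheromone map and counting its connected components would then yield the clusters and their number, without fixing it a priori.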

Relevance:

30.00%

Publisher:

Abstract:

Automatic visual object counting and video surveillance have important applications in home and business environments, such as security and the management of access points. However, to obtain satisfactory performance, these technologies need professional and expensive hardware, complex installations and setups, and the supervision of qualified workers. In this paper, an efficient visual detection and tracking framework is proposed for the tasks of object counting and surveillance which meets the requirements of consumer electronics: off-the-shelf equipment, easy installation and configuration, and unsupervised working conditions. This is accomplished by a novel Bayesian tracking model that can manage multimodal distributions without explicitly computing the association between tracked objects and detections. In addition, it is robust to erroneous, distorted and missing detections. The proposed algorithm is compared with a recent work, also focused on consumer electronics, demonstrating its superior performance.
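One standard way to track multimodal posteriors without hard detection-to-track association is a bootstrap particle filter, sketched below as a generic illustration (the random-walk motion model and Gaussian likelihood are assumptions, not the paper's model):

```python
import numpy as np

def pf_step(particles, weights, detections, rng, motion_std=1.0, meas_std=2.0):
    """One predict/update/resample cycle of a bootstrap particle filter.
    Every detection adds likelihood to every particle, so no explicit
    object-detection association is computed and several modes coexist."""
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    lik = np.zeros(len(particles))
    for d in detections:                      # mixture over all detections
        lik += np.exp(-np.sum((particles - d) ** 2, axis=1)
                      / (2.0 * meas_std ** 2))
    weights = weights * (lik + 1e-12)         # robust to missing detections
    weights /= weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

rng = np.random.default_rng(0)
parts = rng.uniform(0, 100, (500, 2))         # image-plane positions
w = np.full(500, 1.0 / 500)
parts, w = pf_step(parts, w,
                   [np.array([20.0, 30.0]), np.array([70.0, 60.0])], rng)
```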

Relevance:

30.00%

Publisher:

Abstract:

A fully 3D iterative image reconstruction algorithm has been developed for high-resolution PET cameras composed of pixelated scintillator crystal arrays and rotating planar detectors, based on the ordered subsets approach. The associated system matrix is precalculated with Monte Carlo methods that incorporate physical effects not included in analytical models, such as positron range effects and interaction of the incident gammas with the scintillator material. Custom Monte Carlo methodologies have been developed and optimized for modelling of system matrices for fast iterative image reconstruction adapted to specific scanner geometries, without redundant calculations. According to the methodology proposed here, only one-eighth of the voxels within two central transaxial slices need to be modelled in detail. The rest of the system matrix elements can be obtained with the aid of axial symmetries and redundancies, as well as in-plane symmetries within transaxial slices. Sparse matrix techniques for the non-zero system matrix elements are employed, allowing for fast execution of the image reconstruction process. This 3D image reconstruction scheme has been compared in terms of image quality to a 2D fast implementation of the OSEM algorithm combined with Fourier rebinning approaches. This work confirms the superiority of fully 3D OSEM in terms of spatial resolution, contrast recovery and noise reduction as compared to conventional 2D approaches based on rebinning schemes. At the same time it demonstrates that fully 3D methodologies can be efficiently applied to the image reconstruction problem for high-resolution rotational PET cameras by applying accurate pre-calculated system models and taking advantage of the system's symmetries.
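For reference, the standard ordered-subsets (OSEM) update into which such a precalculated system matrix is plugged is, with a_ij the probability that an emission in voxel j is detected along line of response i:

```latex
x_j^{(b+1)} \;=\; \frac{x_j^{(b)}}{\sum_{i \in S_b} a_{ij}}
\sum_{i \in S_b} a_{ij}\, \frac{y_i}{\sum_{k} a_{ik}\, x_k^{(b)}},
```

where y_i are the measured coincidences and S_b is the b-th subset of projections. As the abstract notes, the scanner symmetries let most a_ij values be read from a small stored portion of the matrix rather than recomputed.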

Relevance:

30.00%

Publisher:

Abstract:

A backtracking algorithm for AND-Parallelism and its implementation at the Abstract Machine level are presented: first, a class of AND-Parallelism models based on goal independence is defined, and a generalized version of Restricted AND-Parallelism (RAP) introduced as characteristic of this class. A simple and efficient backtracking algorithm for RAP is then discussed. An implementation scheme is presented for this algorithm which offers minimum overhead, while retaining the performance and storage economy of sequential implementations and taking advantage of goal independence to avoid unnecessary backtracking ("restricted intelligent backtracking"). Finally, the implementation of backtracking in sequential and AND-Parallel systems is explained through a number of examples.

Relevance:

30.00%

Publisher:

Abstract:

Applications that operate on meshes are very popular in High Performance Computing (HPC) environments. In the past, many techniques have been developed in order to optimize the memory accesses for these datasets. Different loop transformations and domain decompositions are commonly used for structured meshes. However, unstructured grids are more challenging. The memory accesses, based on the mesh connectivity, do not map well to the usual linear memory model. This work presents a method to improve the memory performance which is suitable for HPC codes that operate on meshes. We develop a method to adjust the sequence in which the data are used inside the algorithm, by means of traversing and sorting the mesh. This sorted mesh can be transferred sequentially to the lower memory levels and allows for minimum data transfer requirements. The method also reduces the lower memory requirements dramatically: up to 63% of the L1 cache misses are removed in a traditional cache system. We have obtained speedups of up to 2.58 on memory operations as measured in a general-purpose CPU. An improvement is also observed with sequential access memories, where we have observed reductions of up to 99% in the required low-level memory size.
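A minimal sketch of the traverse-and-sort idea follows, using a plain breadth-first ordering so that connected (and hence jointly accessed) nodes end up close in memory; the traversal choice and the toy mesh are assumptions, not the paper's exact method:

```python
from collections import deque

def bfs_order(adjacency, start=0):
    """Reorder mesh nodes by a breadth-first traversal of the connectivity
    graph, improving spatial locality of the subsequent mesh sweeps."""
    seen = {start}
    order = []
    q = deque([start])
    while q:
        v = q.popleft()
        order.append(v)
        for w in sorted(adjacency[v]):   # deterministic tie-breaking
            if w not in seen:
                seen.add(w)
                q.append(w)
    return order   # apply as a permutation to the node data arrays

# Toy mesh: a 1-D chain given in a scattered numbering.
adj = {0: [3], 3: [0, 1], 1: [3, 2], 2: [1]}
print(bfs_order(adj, start=0))   # [0, 3, 1, 2]
```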

Relevance:

30.00%

Publisher:

Abstract:

Monte Carlo techniques, which require the generation of samples from some target density, are often the only alternative for performing Bayesian inference. Two classic sampling techniques for drawing independent samples are the ratio of uniforms (RoU) and rejection sampling (RS). An efficient sampling algorithm is proposed that combines the RoU and polar RS (i.e., RS inside a sector of a circle, using polar coordinates). Its efficiency is shown in drawing samples from truncated Cauchy and Gaussian random variables, which have many important applications in signal processing and communications.
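As a simple illustration of the RoU ingredient (plain rectangle rejection rather than the paper's polar RS combination), a standard normal sampler: accept (u, v) uniform on (0, 1] x [-sqrt(2/e), sqrt(2/e)] whenever v² <= -4u² ln u, and return v/u.

```python
import math, random

def rou_normal(rng=random):
    """Draw one N(0,1) sample with the classic ratio-of-uniforms method:
    (u, v) uniform over {0 < u <= sqrt(f(v/u))} implies v/u ~ f, and for
    the Gaussian kernel this acceptance region sits inside the rectangle
    (0, 1] x [-sqrt(2/e), sqrt(2/e)]."""
    b = math.sqrt(2.0 / math.e)
    while True:
        u = rng.random()
        v = (2.0 * rng.random() - 1.0) * b
        if u > 0 and v * v <= -4.0 * u * u * math.log(u):
            return v / u

samples = [rou_normal() for _ in range(5)]
```

Truncation (as in the truncated Gaussian and Cauchy cases the paper targets) shrinks the acceptance region, which is where the proposed polar RS refinement improves the acceptance rate.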