44 resultados para GPU computing


Relevância:

20.00% 20.00%

Publicador:

Resumo:

En el presente artículo se muestran las ventajas de la programación en paralelo resolviendo numéricamente la ecuación del calor en dos dimensiones a través del método de diferencias finitas explícito centrado en el espacio FTCS. De las conclusiones de este trabajo se pone de manifiesto la importancia de la programación en paralelo para tratar problemas grandes, en los que se requiere un elevado número de cálculos, para los cuales la programación secuencial resulta impracticable por el elevado tiempo de ejecución. En la primera sección se describe brevemente los conceptos básicos de programación en paralelo. Seguidamente se resume el método de diferencias finitas explícito centrado en el espacio FTCS aplicado a la ecuación parabólica del calor. Seguidamente se describe el problema de condiciones de contorno y valores iniciales específico al que se va a aplicar el método de diferencias finitas FTCS, proporcionando pseudocódigos de una implementación secuencial y dos implementaciones en paralelo. Finalmente tras la discusión de los resultados se presentan algunas conclusiones. In this paper the advantages of parallel computing are shown by solving the heat conduction equation in two dimensions with the forward in time central in space (FTCS) finite difference method. Two different levels of parallelization are consider and compared with traditional serial procedures. We show in this work the importance of parallel computing when dealing with large problems that are impractical or impossible to solve them with a serial computing procedure. In the first section a summary of parallel computing approach is presented. Subsequently, the forward in time central in space (FTCS) finite difference method for the heat conduction equation is outline, describing how the heat flow equation is derived in two dimensions and the particularities of the finite difference numerical technique considered. Then, a specific initial boundary value problem is solved by the FTCS finite difference method and serial and parallel pseudo codes are provided. Finally after results are discussed some conclusions are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Currently, student dropout rates are a matter of concern among universities. Many research studies, aimed at discovering the causes, have been carried out. However, few solutions, that could serve all students and related problems, have been proposed so far. One such problem is caused by the lack of the "knowledge chain educational links" that occurs when students move onto higher studies without mastering their basic studies. Most regulated studies imparted at universities are designed so that some basic subjects serve as support for other, more complicated, subjects, thus forming a complicated knowledge network. When a link in this chain fails, student frustration occurs as it prevents him from fully understanding the following educational links. In this proposal we try to mitigate these disasters that stem, for the most part, the student?s frustration beyond his college stay. On one hand, we make a dissertation on the student?s learning process, which we divide into a series of phases that amount to what we call the "learning lifecycle." Also, we analyze at what stage the action by the stakeholders involved in this scenario: teachers and students; is most important. On the other hand, we consider that Information and Communication Technologies ICT, such as Cloud Computing, can help develop new ways, allowing for the teaching of higher education, while easing and facilitating the student?s learning process. But, methods and processes need to be defined as to direct the use of such technologies; in the teaching process in general, and within higher education in particular; in order to achieve optimum results. Our methodology integrates, as another actor, the ICT into the "Learning Lifecycle". We stimulate students to stop being merely spectators of their own education, and encourage them to take an active part in their training process. To do this, we offer a set of useful tools to determine not only academic failure causes, (for self assessment), but also to remedy these failures (with corrective actions); "discovered the causes it is easier to determine solutions?. We believe this study will be useful for both students and teachers. Students learn from their own experience and improve their learning process, while obtaining all of the "knowledge chain educational links? required in their studies. We stand by the motto "Studying to learn instead of studying to pass". Teachers will also be benefited by detecting where and how to strengthen their teaching proposals. All of this will also result in decreasing dropout rates.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

RESUMEN La realización de túneles de gran longitud para ferrocarriles ha adquirido un gran auge en los últimos años. En España se han abordado proyectos de estas características, no existiendo para su ejecución una metodología completa y contrastada de actuación. Las características geométricas, de observación y de trabajo en túneles hace que las metodologías que se aplican en otros proyectos de ingeniería no sean aplicables por las siguientes causas: separación de las redes exteriores e interiores de los túneles debido a la diferente naturaleza de los observables, geometría en el interior siempre desfavorable a los requerimientos de observación clásica, mala visibilidad dentro del túnel, aumento de errores conforme avanza la perforación, y movimientos propios del túnel durante su ejecución por la propia geodinámica activa. Los patrones de observación geodésica usados deben revisarse cuando se ejecutan túneles de gran longitud. Este trabajo establece una metodología para el diseño de redes exteriores. ABSTRACT: The realization of long railway tunnels has acquired a great interest in recent years. In Spain it is necessary to address projects of this nature, but ther is no corresponding methodological framework supporting them. The tunnel observational and working geometrical properties, make that former methodologies used may be unuseful in this case: the observation of the exterior and interior geodetical networks of the tunnel is different in nature. Conditions of visibility in the interior of the tunnels, regardless of the geometry, are not the most advantageous for observation due to the production system and the natural conditions of the tunnels. Errors increase as the drilling of the tunnel progresses, as it becomes problematical to perform continuous verifications along the itinerary itself. Moreover, inherent tunnel movements due to active geodynamics must also be considered. Therefore patterns for geodetic and topographic observations have to be reviewed when very long tunnels are constructed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A first-rate e-Health system saves lives, provides better patient care, allows complex but useful epidemiologic analysis and saves money. However, there may also be concerns about the costs and complexities associated with e-health implementation, and the need to solve issues about the energy footprint of the high-demanding computing facilities. This paper proposes a novel and evolved computing paradigm that: (i) provides the required computing and sensing resources; (ii) allows the population-wide diffusion; (iii) exploits the storage, communication and computing services provided by the Cloud; (iv) tackles the energy-optimization issue as a first-class requirement, taking it into account during the whole development cycle. The novel computing concept and the multi-layer top-down energy-optimization methodology obtain promising results in a realistic scenario for cardiovascular tracking and analysis, making the Home Assisted Living a reality.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents an approach to create what we have called a Unified Sentiment Lexicon (USL). This approach aims at aligning, unifying, and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. One problem related to the task of the automatic unification of different scores of sentiment lexicons is that there are multiple lexical entries for which the classification of positive, negative, or neutral {P, Z, N} depends on the unit of measurement used in the annotation methodology of the source sentiment lexicon. Our USL approach computes the unified strength of polarity of each lexical entry based on the Pearson correlation coefficient which measures how correlated lexical entries are with a value between 1 and -1, where 1 indicates that the lexical entries are perfectly correlated, 0 indicates no correlation, and -1 means they are perfectly inversely correlated and so is the UnifiedMetrics procedure for CPU and GPU, respectively. Another problem is the high processing time required for computing all the lexical entries in the unification task. Thus, the USL approach computes a subset of lexical entries in each of the 1344 GPU cores and uses parallel processing in order to unify 155802 lexical entries. The results of the analysis conducted using the USL approach show that the USL has 95.430 lexical entries, out of which there are 35.201 considered to be positive, 22.029 negative, and 38.200 neutral. Finally, the runtime was 10 minutes for 95.430 lexical entries; this allows a reduction of the time computing for the UnifiedMetrics by 3 times.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

As advanced Cloud services are becoming mainstream, the contribution of data centers in the overall power consumption of modern cities is growing dramatically. The average consumption of a single data center is equivalent to the energy consumption of 25.000 households. Modeling the power consumption for these infrastructures is crucial to anticipate the effects of aggressive optimization policies, but accurate and fast power modeling is a complex challenge for high-end servers not yet satisfied by analytical approaches. This work proposes an automatic method, based on Multi-Objective Particle Swarm Optimization, for the identification of power models of enterprise servers in Cloud data centers. Our approach, as opposed to previous procedures, does not only consider the workload consolidation for deriving the power model, but also incorporates other non traditional factors like the static power consumption and its dependence with temperature. Our experimental results shows that we reach slightly better models than classical approaches, but simul- taneously simplifying the power model structure and thus the numbers of sensors needed, which is very promising for a short-term energy prediction. This work, validated with real Cloud applications, broadens the possibilities to derive efficient energy saving techniques for Cloud facilities.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A novel GPU-based nonparametric moving object detection strategy for computer vision tools requiring real-time processing is proposed. An alternative and efficient Bayesian classifier to combine nonparametric background and foreground models allows increasing correct detections while avoiding false detections. Additionally, an efficient region of interest analysis significantly reduces the computational cost of the detections.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This is the final report on reproducibility@xsede, a one-day workshop held in conjunction with XSEDE14, the annual conference of the Extreme Science and Engineering Discovery Environment (XSEDE). The workshop's discussion-oriented agenda focused on reproducibility in large-scale computational research. Two important themes capture the spirit of the workshop submissions and discussions: (1) organizational stakeholders, especially supercomputer centers, are in a unique position to promote, enable, and support reproducible research; and (2) individual researchers should conduct each experiment as though someone will replicate that experiment. Participants documented numerous issues, questions, technologies, practices, and potentially promising initiatives emerging from the discussion, but also highlighted four areas of particular interest to XSEDE: (1) documentation and training that promotes reproducible research; (2) system-level tools that provide build- and run-time information at the level of the individual job; (3) the need to model best practices in research collaborations involving XSEDE staff; and (4) continued work on gateways and related technologies. In addition, an intriguing question emerged from the day's interactions: would there be value in establishing an annual award for excellence in reproducible research? Overview

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Abstract. Receptive fields of retinal and other sensory neurons show a large variety of spatiotemporal linear and non linear types of responses to local stimuli. In visual neurons, these responses present either asymmetric sensitive zones or center-surround organization. In most cases, the nature of the responses suggests the existence of a kind of distributed computation prior to the integration by the final cell which is evidently supported by the anatomy. We describe a new kind of discrete and continuous filters to model the kind of computations taking place in the receptive fields of retinal cells. To show their performance in the analysis of diferent non-trivial neuron-like structures, we use a computer tool specifically programmed by the authors to that efect. This tool is also extended to study the efect of lesions on the whole performance of our model nets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Many computer vision and human-computer interaction applications developed in recent years need evaluating complex and continuous mathematical functions as an essential step toward proper operation. However, rigorous evaluation of this kind of functions often implies a very high computational cost, unacceptable in real-time applications. To alleviate this problem, functions are commonly approximated by simpler piecewise-polynomial representations. Following this idea, we propose a novel, efficient, and practical technique to evaluate complex and continuous functions using a nearly optimal design of two types of piecewise linear approximations in the case of a large budget of evaluation subintervals. To this end, we develop a thorough error analysis that yields asymptotically tight bounds to accurately quantify the approximation performance of both representations. It provides an improvement upon previous error estimates and allows the user to control the trade-off between the approximation error and the number of evaluation subintervals. To guarantee real-time operation, the method is suitable for, but not limited to, an efficient implementation in modern Graphics Processing Units (GPUs), where it outperforms previous alternative approaches by exploiting the fixed-function interpolation routines present in their texture units. The proposed technique is a perfect match for any application requiring the evaluation of continuous functions, we have measured in detail its quality and efficiency on several functions, and, in particular, the Gaussian function because it is extensively used in many areas of computer vision and cybernetics, and it is expensive to evaluate.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Esta tesis presenta un modelo, una metodología, una arquitectura, varios algoritmos y programas para crear un lexicón de sentimientos unificado (LSU) que cubre cuatro lenguas: inglés, español, portugués y chino. El objetivo principal es alinear, unificar, y expandir el conjunto de lexicones de sentimientos disponibles en Internet y los desarrollados a lo largo de esta investigación. Así, el principal problema a resolver es la tarea de unificar de forma automatizada los diferentes lexicones de sentimientos obtenidos por el crawler CSR, porque la unidad de medida para asignar la intensidad de los valores de la polaridad (de forma manual, semiautomática y automática) varía de acuerdo con las diferentes metodologías utilizadas para la construcción de cada lexicón. La representación codificada de la estructura de datos de los términos presenta también una variación en la estructura de lexicón a lexicón. Por lo que al unificar en un lexicón de sentimientos se hace posible la reutilización del conocimiento recopilado por los diferentes grupos de investigación y se incrementa, a la vez, el alcance, la calidad y la robustez de los lexicones. Nuestra metodología LSU calcula un valor unificado de la intensidad de la polaridad para cada entrada léxica que está presente en al menos dos de los lexicones de sentimientos que forman parte de este estudio. En contraste, las entradas léxicas que no son comunes en al menos dos de los lexicones conservan su valor original. El coeficiente de Pearson resultante permite medir la correlación existente entre las entradas léxicas asignándoles un rango de valores de uno a menos uno, donde uno indica que los valores de los términos están perfectamente correlacionados, cero indica que no existe correlación y menos uno significa que están inversamente correlacionados. Este procedimiento se lleva acabo con la función de MetricasUnificadas tanto en la CPU como en la GPU. Otro problema a resolver es el tiempo de procesamiento que se requiere para realizar la tarea de unificación de la intensidad de la polaridad y con ello alcanzar una cobertura mayor de lemas en los lexicones de sentimientos existentes. Asimismo, la metodología LSU utiliza el procesamiento paralelo para unificar los 155 802 términos. El algoritmo LSU procesa mediante cargas iguales el subconjunto de entradas léxicas en cada uno de los 1344 núcleos en la GPU. Los resultados de nuestro análisis arrojaron un total de 95 430 entradas léxicas donde 35 201 obtuvieron valores positivos, 22 029 negativos y 38 200 neutrales. Finalmente, el tiempo de ejecución fue de 2,506 segundos para el total de las entradas léxicas, lo que permitió reducir el procesamiento de cómputo hasta en una tercera parte con respecto al algoritmo secuencial. De estos resultados se concluye que al lograr un lexicón de sentimientos unificado que permite homogeneizar la intensidad de la polaridad de las unidades léxicas (con valores positivos, negativos y neutrales) deriva no sólo en el análisis semántico del corpus basado en los términos con una mayor carga de polaridad, o del resumen de las valoraciones o las tendencias de neuromarketing, sino también en aplicaciones como el etiquetado subjetivo de sitios web o de portales sintácticos y semánticos, por mencionar algunas. ABSTRACT This thesis presents an approach to create what we have called a Unified Sentiment Lexicon (USL). This approach aims at aligning, unifying, and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. One problem related to the task of the automatic unification of different scores of sentiment lexicons is that there are multiple lexical entries for which the classification of positive, negative, or neutral P, N, Z depends on the unit of measurement used in the annotation methodology of the source sentiment lexicon. Our USL approach computes the unified strength of polarity of each lexical entry based on the Pearson correlation coefficient which measures how correlated lexical entries are with a value between 1 and - 1 , where 1 indicates that the lexical entries are perfectly correlated, 0 indicates no correlation, and -1 means they are perfectly inversely correlated and so is the UnifiedMetrics procedure for CPU and GPU, respectively. Another problem is the high processing time required for computing all the lexical entries in the unification task. Thus, the USL approach computes a subset of lexical entries in each of the 1344 GPU cores and uses parallel processing in order to unify 155,802 lexical entries. The results of the analysis conducted using the USL approach show that the USL has 95,430 lexical entries, out of which there are 35,201 considered to be positive, 22,029 negative, and 38,200 neutral. Finally, the runtime was 2.505 seconds for 95,430 lexical entries; this allows a reduction of the time computing for the UnifiedMetrics by 3 times with respect to the sequential implementation. A key contribution of this work is that we preserve the use of a unified sentiment lexicon for all tasks. Such lexicon is used to define resources and resource-related properties that can be verified based on the results of the analysis and is powerful, general and extensible enough to express a large class of interesting properties. Some applications of this work include merging, aligning, pruning and extending the current sentiment lexicons.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

During the last three decades, FPGA technology has quickly evolved to become a major subject of research in computer and electrical engineering as it has been identified as a powerful alternative for creating highly efficient computing systems. FPGA devices offer substantial performance improvements when compared against traditional processing architectures via custom design and reconfiguration capabilities.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Debido al creciente aumento del tamaño de los datos en muchos de los actuales sistemas de información, muchos de los algoritmos de recorrido de estas estructuras pierden rendimento para realizar búsquedas en estos. Debido a que la representacion de estos datos en muchos casos se realiza mediante estructuras nodo-vertice (Grafos), en el año 2009 se creó el reto Graph500. Con anterioridad, otros retos como Top500 servían para medir el rendimiento en base a la capacidad de cálculo de los sistemas, mediante tests LINPACK. En caso de Graph500 la medicion se realiza mediante la ejecución de un algoritmo de recorrido en anchura de grafos (BFS en inglés) aplicada a Grafos. El algoritmo BFS es uno de los pilares de otros muchos algoritmos utilizados en grafos como SSSP, shortest path o Betweeness centrality. Una mejora en este ayudaría a la mejora de los otros que lo utilizan. Analisis del Problema El algoritmos BFS utilizado en los sistemas de computación de alto rendimiento (HPC en ingles) es usualmente una version para sistemas distribuidos del algoritmo secuencial original. En esta versión distribuida se inicia la ejecución realizando un particionado del grafo y posteriormente cada uno de los procesadores distribuidos computará una parte y distribuirá sus resultados a los demás sistemas. Debido a que la diferencia de velocidad entre el procesamiento en cada uno de estos nodos y la transfencia de datos por la red de interconexión es muy alta (estando en desventaja la red de interconexion) han sido bastantes las aproximaciones tomadas para reducir la perdida de rendimiento al realizar transferencias. Respecto al particionado inicial del grafo, el enfoque tradicional (llamado 1D-partitioned graph en ingles) consiste en asignar a cada nodo unos vertices fijos que él procesará. Para disminuir el tráfico de datos se propuso otro particionado (2D) en el cual la distribución se haciá en base a las aristas del grafo, en vez de a los vertices. Este particionado reducía el trafico en la red en una proporcion O(NxM) a O(log(N)). Si bien han habido otros enfoques para reducir la transferecnia como: reordemaniento inicial de los vertices para añadir localidad en los nodos, o particionados dinámicos, el enfoque que se va a proponer en este trabajo va a consistir en aplicar técnicas recientes de compression de grandes sistemas de datos como Bases de datos de alto volume o motores de búsqueda en internet para comprimir los datos de las transferencias entre nodos.---ABSTRACT---The Breadth First Search (BFS) algorithm is the foundation and building block of many higher graph-based operations such as spanning trees, shortest paths and betweenness centrality. The importance of this algorithm increases each day due to it is a key requirement for many data structures which are becoming popular nowadays. These data structures turn out to be internally graph structures. When the BFS algorithm is parallelized and the data is distributed into several processors, some research shows a performance limitation introduced by the interconnection network [31]. Hence, improvements on the area of communications may benefit the global performance in this key algorithm. In this work it is presented an alternative compression mechanism. It differs with current existing methods in that it is aware of characteristics of the data which may benefit the compression. Apart from this, we will perform a other test to see how this algorithm (in a dis- tributed scenario) benefits from traditional instruction-based optimizations. Last, we will review the current supercomputing techniques and the related work being done in the area.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It has been shown that cloud computing brings cost benefits and promotes efficiency in the operations of the organizations, no matter what their type or size. However, few public organizations are benefiting from this new paradigm shift in the way the organizations consume and manage computational resources. The objective of this thesis is to analyze both internal and external factors that may influence the adoption of cloud computing by public organizations and propose possible strategies that can assist these organizations in their path to cloud usage. In order to achieve this objective, a SWOT analysis has been conducted, detecting internal factors (strengths and weaknesses) and external factors (opportunities and threats) that can influence the adoption of a governmental cloud. With the application of a TOWS matrix, by combining the internal and external factors, a list of possible strategies have been formulated to be used as a guide to decision-making related to the transition to a cloud environment.