961 results for graphics processing unit (GPU)
Abstract:
The aim of my thesis is to parallelize the Weighted Histogram Analysis Method (WHAM), a popular algorithm used to calculate the free energy of a molecular system in Molecular Dynamics simulations. WHAM works in post-processing in cooperation with another technique called Umbrella Sampling, which adds a biasing term to the potential energy of the system in order to force it to sample a specific region of configurational space. N independent simulations are performed to sample the whole region of interest, and the WHAM algorithm is then used to estimate the original system energy from the N atomic trajectories. The parallelization of WHAM has been carried out with CUDA, a platform that allows programs to run on the GPUs of NVIDIA graphics cards, which have a parallel architecture. The parallel implementation can substantially speed up WHAM execution compared to previous serial CPU implementations, whose running time becomes critical at very high numbers of iterations. The algorithm was written in C++ and executed on UNIX systems equipped with NVIDIA graphics cards. The results were satisfactory: performance increased when the model was executed on graphics cards of higher compute capability. Nonetheless, the GPUs used to test the algorithm were quite old and not designed for scientific computing, and a further performance increase would likely be obtained by running the algorithm on GPU clusters with high computational efficiency. The thesis is organized as follows: Chapter 1 describes the mathematical formulation of Umbrella Sampling and the WHAM algorithm, with their applications to the study of ion channels and to Molecular Docking; Chapter 2 presents the CUDA architectures used to implement the model; and Chapter 3 presents the results obtained on model systems.
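For reference, the standard WHAM formulation that such a thesis parallelizes couples two self-consistent equations (the notation below is the conventional one from the literature, not taken from the thesis itself). Window $i$ of the umbrella sampling run adds a bias $w_i(\xi)$ along the reaction coordinate $\xi$, and the unbiased distribution is recovered by iterating

$$P(\xi) = \frac{\sum_{i=1}^{N} n_i(\xi)}{\sum_{j=1}^{N} N_j \, e^{\beta [f_j - w_j(\xi)]}}, \qquad e^{-\beta f_i} = \int e^{-\beta w_i(\xi)}\, P(\xi) \, d\xi,$$

where $n_i(\xi)$ is the histogram of window $i$, $N_j$ the number of samples in window $j$, $\beta = 1/(k_B T)$, and the shifts $f_i$ are updated until self-consistency; the free-energy profile then follows as $F(\xi) = -k_B T \ln P(\xi)$. The loop over windows and histogram bins in each iteration is the part that maps naturally onto a GPU.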
Abstract:
Efficient image blurring techniques based on the pyramid algorithm can be implemented on modern graphics hardware; thus, image blurring with arbitrary blur width is possible in real time even for large images. However, pyramidal blurring methods do not achieve the image quality provided by convolution filters; in particular, the shape of the corresponding filter kernel varies locally, which potentially results in objectionable rendering artifacts. In this work, a new analysis filter is designed that significantly reduces this variation for a particular pyramidal blurring technique. Moreover, the pyramidal blur algorithm is generalized to allow for a continuous variation of the blur width. Furthermore, an efficient implementation for programmable graphics hardware is presented. The proposed method is named “quasi-convolution pyramidal blurring” since the resulting effect is very close to image blurring based on a convolution filter for many applications.
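To make the pyramid structure concrete, here is a minimal CPU sketch of plain pyramidal blurring (illustrative only: it uses a 2x2 box analysis filter and nearest-neighbour synthesis, not the quasi-convolution analysis filter this work designs, and a real implementation would run on graphics hardware):

#include <vector>
#include <cstddef>

using Image = std::vector<float>; // grayscale, row-major

// Analysis step: 2x2 box downsampling (w and h assumed even in this sketch).
Image downsample(const Image& src, std::size_t w, std::size_t h) {
    Image dst((w / 2) * (h / 2));
    for (std::size_t y = 0; y < h / 2; ++y)
        for (std::size_t x = 0; x < w / 2; ++x)
            dst[y * (w / 2) + x] = 0.25f *
                (src[(2 * y) * w + 2 * x]     + src[(2 * y) * w + 2 * x + 1] +
                 src[(2 * y + 1) * w + 2 * x] + src[(2 * y + 1) * w + 2 * x + 1]);
    return dst;
}

// Synthesis step: nearest-neighbour upsampling back to 2w x 2h; a real
// implementation would use a smoother synthesis filter.
Image upsample(const Image& src, std::size_t w, std::size_t h) {
    Image dst((2 * w) * (2 * h));
    for (std::size_t y = 0; y < 2 * h; ++y)
        for (std::size_t x = 0; x < 2 * w; ++x)
            dst[y * (2 * w) + x] = src[(y / 2) * w + x / 2];
    return dst;
}

// Pyramidal blur: `levels` analysis steps followed by `levels` synthesis
// steps; the blur width grows with the number of levels.
// (w and h are assumed divisible by 2^levels in this sketch.)
Image pyramidBlur(Image img, std::size_t w, std::size_t h, int levels) {
    std::size_t cw = w, ch = h;
    for (int i = 0; i < levels; ++i) { img = downsample(img, cw, ch); cw /= 2; ch /= 2; }
    for (int i = 0; i < levels; ++i) { img = upsample(img, cw, ch); cw *= 2; ch *= 2; }
    return img;
}

Blending the outputs of adjacent pyramid levels yields intermediate blur widths, which is the continuous generalization the abstract refers to.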
Abstract:
We present a high-performance yet low-cost system for multi-view rendering in virtual reality (VR) applications. In contrast to complex CAVE installations, which are typically driven by one render client per view, we arrange eight displays in an octagon around the viewer to provide a full 360° projection, and we drive these eight displays with a single PC equipped with multiple graphics processing units (GPUs). In this paper we describe the hardware and software setup, as well as the low-level and high-level optimizations necessary to fully exploit the parallelism of this multi-GPU multi-view VR system.
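A minimal sketch of one possible driving structure, under stated assumptions (four dual-output GPUs feeding the eight displays, one host thread per view; renderView is an illustrative placeholder, not the authors' API):

#include <thread>
#include <vector>

// Placeholder: select the context of GPU `gpu` and draw the 45-degree slice
// of the 360-degree panorama that belongs to display `view`.
void renderView(int gpu, int view) { (void)gpu; (void)view; }

int main() {
    const int numGPUs = 4, numViews = 8; // assumed layout: 4 dual-output GPUs, 8 displays
    std::vector<std::thread> workers;
    for (int v = 0; v < numViews; ++v)
        workers.emplace_back(renderView, v % numGPUs, v); // round-robin views over GPUs
    for (auto& t : workers) t.join(); // a real renderer would loop per frame
    return 0;
}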
Abstract:
In this paper we present a scalable software architecture for online multi-camera video processing that guarantees a good trade-off between computational power, scalability, and flexibility. The software system is modular and its main blocks are the Processing Units (PUs) and the Central Unit. The Central Unit works as a supervisor of the running PUs, and each PU manages the acquisition phase and the processing phase. Furthermore, an approach to easily parallelize the desired processing application is presented. In this paper, as a case study, we apply the proposed software architecture to a multi-camera system in order to efficiently manage multiple 2D object detection modules in a real-time scenario. System performance has been evaluated under different load conditions, such as the number of cameras and the image sizes. The results show that the software architecture scales well with the number of cameras and works easily with different image formats while respecting the real-time constraints. Moreover, the parallelization approach can be used to speed up the processing tasks with a low level of overhead.
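A minimal sketch of the PU/Central Unit split described above (all names illustrative, not the paper's interface): each Processing Unit pairs an acquisition thread feeding a queue with a processing thread draining it, and the Central Unit supervises the set of running PUs.

#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Frame { int camera = -1; /* pixel data omitted in this sketch */ };

// One Processing Unit: the acquisition thread pushes frames into a queue,
// the processing thread pops and processes them.
class ProcessingUnit {
public:
    ProcessingUnit(std::function<Frame()> acquire, std::function<void(const Frame&)> process)
        : acquire_(std::move(acquire)), process_(std::move(process)) {}

    void start() {
        running_ = true;
        acq_ = std::thread([this] {               // acquisition phase
            while (running_) {
                Frame f = acquire_();
                { std::lock_guard<std::mutex> lk(m_); q_.push(f); }
                cv_.notify_one();
            }
        });
        proc_ = std::thread([this] {              // processing phase
            while (running_) {
                Frame f;
                {
                    std::unique_lock<std::mutex> lk(m_);
                    cv_.wait(lk, [this] { return !q_.empty() || !running_; });
                    if (q_.empty()) continue;     // woken for shutdown
                    f = q_.front(); q_.pop();
                }
                process_(f);
            }
        });
    }
    void stop() { running_ = false; cv_.notify_all(); acq_.join(); proc_.join(); }

private:
    std::function<Frame()> acquire_;
    std::function<void(const Frame&)> process_;
    std::queue<Frame> q_;
    std::mutex m_;
    std::condition_variable cv_;
    std::thread acq_, proc_;
    std::atomic<bool> running_{false};
};

// The Central Unit owns the running PUs and supervises them as a group.
class CentralUnit {
public:
    void add(ProcessingUnit* pu) { pus_.push_back(pu); }
    void startAll() { for (auto* pu : pus_) pu->start(); }
    void stopAll()  { for (auto* pu : pus_) pu->stop(); }
private:
    std::vector<ProcessingUnit*> pus_;
};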
Abstract:
A novel GPU-based nonparametric moving object detection strategy for computer vision tools requiring real-time processing is proposed. An alternative, efficient Bayesian classifier that combines nonparametric background and foreground models increases correct detections while avoiding false ones. Additionally, an efficient region-of-interest analysis significantly reduces the computational cost of detection.
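The abstract does not spell the classifier out, but the generic Bayesian combination of a nonparametric background model $B$ and foreground model $F$ takes the form (a sketch of the usual construction, not necessarily the authors' exact formulation)

$$P(F \mid x) = \frac{p(x \mid F)\, P(F)}{p(x \mid F)\, P(F) + p(x \mid B)\, P(B)},$$

with both likelihoods estimated nonparametrically, e.g. by kernel density estimation over recent pixel samples, $p(x \mid B) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - b_i)$; a pixel is declared moving when $P(F \mid x)$ exceeds a threshold. Evaluating the test only inside regions of interest is what cuts the computational cost, and its per-pixel independence is what makes the method GPU-friendly.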
Abstract:
This thesis presents a model, a methodology, an architecture, several algorithms, and programs for building a Unified Sentiment Lexicon (USL) covering four languages: English, Spanish, Portuguese, and Chinese. The main objective is to align, unify, and expand the set of sentiment lexicons available on the Internet together with those developed over the course of this research. The main problem to solve is therefore the automated unification of the different sentiment lexicons gathered by the CSR crawler, because the unit of measurement used to assign polarity strength (manually, semi-automatically, or automatically) varies with the methodology used to build each lexicon, and the encoded representation of the term data structure also varies from lexicon to lexicon. Unifying them into a single sentiment lexicon makes it possible to reuse the knowledge collected by the different research groups and, at the same time, increases the coverage, quality, and robustness of the lexicons. Our USL methodology computes a unified polarity strength for each lexical entry that is present in at least two of the sentiment lexicons in this study; lexical entries that are not shared by at least two lexicons keep their original values. The resulting Pearson coefficient measures the correlation between lexical entries on a scale from one to minus one, where one indicates that the term values are perfectly correlated, zero indicates no correlation, and minus one means they are inversely correlated. This procedure is carried out by the MetricasUnificadas (UnifiedMetrics) function on both the CPU and the GPU. Another problem to solve is the processing time required for the polarity unification task, which limits the lemma coverage achievable over the existing sentiment lexicons. The USL methodology therefore uses parallel processing to unify the 155,802 terms: the algorithm distributes the lexical entries in equal loads across each of the 1344 GPU cores. The analysis yielded a total of 95,430 lexical entries, of which 35,201 obtained positive values, 22,029 negative, and 38,200 neutral. Finally, the runtime was 2.506 seconds for all the lexical entries, reducing the computation time to about a third of that of the sequential algorithm. We conclude that a unified sentiment lexicon that homogenizes the polarity strength of lexical units (with positive, negative, and neutral values) supports not only the semantic analysis of a corpus based on its most polarity-laden terms, or the summarization of opinions and neuromarketing trends, but also applications such as the subjective tagging of websites or of syntactic and semantic portals, to name a few. A key contribution of this work is that we preserve the use of a unified sentiment lexicon for all tasks: such a lexicon is used to define resources and resource-related properties that can be verified based on the results of the analysis, and it is powerful, general, and extensible enough to express a large class of interesting properties. Further applications of this work include merging, aligning, pruning, and extending the current sentiment lexicons.
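The core of the parallel step is easy to picture. Below is a minimal CUDA sketch of how lexical entries can be distributed in equal loads over the GPU's cores, as the abstract describes; the unification formula itself is not given in the abstract, so the mean of the available scores is used here as a stand-in, and all names are illustrative rather than taken from the thesis.

#include <cuda_runtime.h>

// Grid-stride loop: each thread handles a strided subset of entries, so the
// work is split in equal loads regardless of how many cores (e.g. 1344) the
// card provides.
__global__ void unifyPolarity(const float* scores,         // numEntries x numLexicons, row-major
                              const unsigned char* present, // 1 where a lexicon contains the entry
                              float* unified,
                              int numEntries, int numLexicons) {
    for (int e = blockIdx.x * blockDim.x + threadIdx.x; e < numEntries;
         e += gridDim.x * blockDim.x) {
        float sum = 0.0f;
        int count = 0;
        for (int l = 0; l < numLexicons; ++l) {
            if (present[e * numLexicons + l]) {
                sum += scores[e * numLexicons + l];
                ++count;
            }
        }
        // Entries present in at least two lexicons get a unified value;
        // single-lexicon entries keep their original (single) score.
        unified[e] = (count > 0) ? sum / count : 0.0f;
    }
}

// Illustrative launch for the 155,802 entries (lexicon count is assumed):
// unifyPolarity<<<64, 256>>>(dScores, dPresent, dUnified, 155802, numLexicons);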
Abstract:
Syntax denotes a rule system that allows one to predict the sequencing of communication signals. Despite its significance for both human speech processing and animal acoustic communication, the representation of syntactic structure in the mammalian brain has not been studied electrophysiologically at the single-unit level. In the search for a neuronal correlate for syntax, we used playback of natural and temporally destructured complex species-specific communication calls—so-called composites—while recording extracellularly from neurons in a physiologically well defined area (the FM–FM area) of the mustached bat’s auditory cortex. Even though this area is known to be involved in the processing of target distance information for echolocation, we found that units in the FM–FM area were highly responsive to composites. The finding that neuronal responses were strongly affected by manipulation in the time domain of the natural composite structure lends support to the hypothesis that syntax processing in mammals occurs at least at the level of the nonprimary auditory cortex.
Abstract:
The transporter associated with antigen processing (TAP) comprises two subunits, TAP1 and TAP2, each containing a hydrophobic membrane-spanning region (MSR) and a nucleotide binding domain (NBD). The TAP1/TAP2 complex is required for peptide translocation across the endoplasmic reticulum membrane. To understand the role of each structural unit of the TAP1/TAP2 complex, we generated two chimeras containing TAP1 MSR and TAP2 NBD (T1MT2C) or TAP2 MSR and TAP1 NBD (T2MT1C). We show that TAP1/T2MT1C, TAP2/T1MT2C, and T1MT2C/T2MT1C complexes bind peptide with an affinity comparable to wild-type complexes. By contrast, TAP1/T1MT2C and TAP2/T2MT1C complexes, although observed, are impaired for peptide binding. Thus, the MSRs of both TAP1 and TAP2 are required for binding peptide. However, neither NBD contains unique determinants required for peptide binding. The NBD-switched complexes, T1MT2C/T2MT1C, TAP1/T2MT1C, and TAP2/T1MT2C, all translocate peptides, but with progressively reduced efficiencies relative to the TAP1/TAP2 complex. These results indicate that both nucleotide binding sites are catalytically active and support an alternating catalytic sites model for the TAP transport cycle, similar to that proposed for P-glycoprotein. The enhanced translocation efficiency of TAP1/T2MT1C relative to TAP2/T1MT2C complexes correlates with enhanced binding of the TAP1 NBD-containing constructs to ATP-agarose beads. Preferential ATP interaction with TAP1, if occurring in vivo, might polarize the transport cycle such that ATP binding to TAP1 initiates the cycle. However, our observations that TAP complexes containing two identical TAP NBDs can mediate translocation indicate that distinct properties of the nucleotide binding site per se are not essential for the TAP catalytic cycle.
Abstract:
Most chloroplast genes in vascular plants are organized into polycistronic transcription units, which generate a complex pattern of mono-, di-, and polycistronic transcripts. In contrast, most Chlamydomonas reinhardtii chloroplast transcripts characterized to date have been monocistronic. This paper describes the atpA gene cluster in the C. reinhardtii chloroplast genome, which includes the atpA, psbI, cemA, and atpH genes, encoding the α-subunit of the coupling-factor-1 (CF1) ATP synthase, a small photosystem II polypeptide, a chloroplast envelope membrane protein, and subunit III of the CF0 ATP synthase, respectively. We show that promoters precede the atpA, psbI, and atpH genes, but not the cemA gene, and that cemA mRNA is present only as part of di-, tri-, or tetracistronic transcripts. Deletions introduced into the gene cluster reveal, first, that CF1-α can be translated from di- or polycistronic transcripts, and, second, that substantial reductions in mRNA quantity have minimal effects on protein synthesis rates. We suggest that posttranscriptional mRNA processing is common in C. reinhardtii chloroplasts, permitting the expression of multiple genes from a single promoter.
Abstract:
Proteins anchored to the cell membrane via a glycosylphosphatidylinositol (GPI) moiety are found in all eukaryotes. After NH2-terminal peptide cleavage of the nascent protein by the signal peptidase, a second COOH-terminal signal peptide is cleaved with the concomitant addition of the GPI unit. The proposed mechanism of the GPI transfer is a transamidation reaction that involves the formation of an activated carbonyl intermediate (enzyme-substrate complex) with the ethanolamine moiety of the preassembled GPI unit serving as a nucleophile. Other nucleophilic acceptors like hydrazine (HDZ) and hydroxylamine have been shown to be possible alternate substrates for GPI. Since GPI has yet to be purified, the use of readily available nucleophilic substitutes such as HDZ and hydroxylamine is a viable alternative to study COOH-terminal processing by the putative transamidase. As a first step in developing a soluble system to study this process, we have examined the amino acid requirements at the COOH terminus for the transamidation reaction using HDZ as the nucleophilic acceptor instead of GPI. The hydrazide-forming reaction shows identical amino acid requirement profiles to that of GPI anchor addition. Additionally, we have studied other parameters relating to the kinetics of the transamidation reaction in the context of rough microsomal membranes. The findings with HDZ provide further evidence for the transamidase nature of the enzyme and also provide a starting point for development of a soluble assay.
Abstract:
A parallel algorithm for image noise removal is proposed. The algorithm is based on the peer-group concept and uses a fuzzy metric. An optimization study on the use of the CUDA platform to remove impulsive noise with this algorithm is presented, together with an implementation of the algorithm on multi-core platforms using OpenMP. Performance is evaluated in terms of execution time, comparing the multi-core implementation, the GPU implementation, and the combination of both. A performance analysis with large images is conducted in order to determine how many pixels to allocate to the CPU and to the GPU. The observed times show that both devices should be kept busy, with most of the work assigned to the GPU. The results show that parallel implementations of denoising filters on GPUs and multi-core CPUs are very advisable, and they open the door to using such algorithms for real-time processing.
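The CPU/GPU pixel split the abstract analyzes can be sketched as follows (a hedged sketch: the function names and the row-wise split are illustrative, not the paper's code, and the peer-group fuzzy-metric filter itself is left as a placeholder):

#include <cuda_runtime.h>

// Placeholder for the peer-group fuzzy-metric filter on the first `rows` rows.
__global__ void denoiseGPU(float* img, int w, int rows) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= w * rows) return;
    // filter pixel i here
}

// Placeholder for the same filter applied to one image row on the CPU.
void denoiseCPU(float* img, int w, int row) { (void)img; (void)w; (void)row; }

// Split the image: a fraction `gpuShare` of the rows goes to the GPU, the
// rest is filtered on the multi-core CPU with OpenMP, in parallel with it.
void denoise(float* d_img, float* h_img, int w, int h, float gpuShare) {
    int gpuRows = static_cast<int>(h * gpuShare); // gpuShare near 0.8 leaves most work to the GPU
    if (gpuRows > 0)
        denoiseGPU<<<(gpuRows * w + 255) / 256, 256>>>(d_img, w, gpuRows); // asynchronous launch
    #pragma omp parallel for
    for (int row = gpuRows; row < h; ++row)
        denoiseCPU(h_img, w, row);
    cudaDeviceSynchronize(); // join the GPU part before merging the two halves
}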
Abstract:
Gasoline coming from the refinery fluid catalytic cracking (FCC) unit is a major contributor to the total commercial-grade gasoline pool. The contents of FCC gasoline are primarily paraffins, naphthenes, olefins, and aromatics, with undesirables such as sulfur and sulfur-containing compounds present in low quantities. The proportions of these components in the FCC gasoline invariably determine its quality as well as the performance of the associated downstream units. The increasing demand for cleaner and lighter fuels significantly influences the need not only for novel processing technologies but also for alternative refinery and petrochemical feedstocks. Current and future clean gasoline requirements include increased isoparaffin contents and reduced olefin, aromatics, benzene, and sulfur contents. The present study investigates the effect of processing an unconventional refinery feedstock, composed of a blend of vacuum gas oil (VGO) and low-density polyethylene (LDPE), on FCC full-range gasoline yields and compositional spectrum, including the distribution of its paraffin, isoparaffin, olefin, naphthene, and aromatics contents, within a range of operating variables of temperature (500–700 °C) and catalyst-to-feed-oil ratio (CFR 5–10), using a spent equilibrium FCC Y-zeolite-based catalyst in an FCC pilot plant operated at the University of Alicante’s Research Institute of Chemical Process Engineering (RICPE). Coprocessing the oil-polymer blend led to the production of gasoline with yields and compositions very similar to those obtained from the base oil, although in some cases the contributions of the feed polymer content and of the processing variables to the gasoline compositional spectrum were appreciable. Carbon content analysis showed a higher fraction of C9–C12 compounds at all catalyst rates employed and for both feedstocks. The gasoline’s paraffinicity, olefinicity, and degrees of branching of the paraffins and olefins were also affected in varying degrees by the severity of operation. In the majority of cases, the gasoline aromatics tended to decrease as the reactor temperature was increased. While the paraffin and isoparaffin gasoline contents were relatively stable at around 5 wt %, the olefin contents generally increased with increasing FCC reactor temperature.
Abstract:
Paper submitted to the XVIII Conference on Design of Circuits and Integrated Systems (DCIS), Ciudad Real, España, 2003.