844 results for GPU-friendly
Abstract:
In this thesis, we examine certain properties of distributed word representations and propose a technique for enlarging the vocabulary of neural machine translation systems. First, we consider a well-known analogy-resolution problem and examine the effect of position-dependent weights, the choice of combination function, and the impact of supervised learning. We then show that simple translation-based distributed representations can match or surpass the state of the art on the TOEFL synonym-detection test and on the recent SimLex-999 gold standard. Finally, motivated by impressive results obtained with distributed representations from small-vocabulary (30,000 words) neural translation systems, we present a GPU-friendly approach for increasing the vocabulary size by more than an order of magnitude. Although originally developed only to obtain the distributed representations, we show that this technique performs rather well on translation tasks, in particular English to French (WMT'14).
Abstract:
This thesis explores the capabilities of heterogeneous multi-core systems based on multiple Graphics Processing Units (GPUs) in a standard desktop framework. Multi-GPU accelerated desk-side computers are an appealing alternative to other high performance computing (HPC) systems: being composed of commodity hardware components fabricated in large quantities, their price-performance ratio is unparalleled in the world of high performance computing. Essentially bringing “supercomputing to the masses”, this opens up new possibilities for application fields where investing in HPC resources had been considered unfeasible before. One of these is the field of bioelectrical imaging, a class of medical imaging technologies that occupy a low-cost niche next to million-dollar systems like functional Magnetic Resonance Imaging (fMRI). In the scope of this work, several computational challenges encountered in bioelectrical imaging are tackled with this new kind of computing resource, striving to help these methods approach their true potential. Specifically, the following main contributions were made: Firstly, a novel dual-GPU implementation of parallel triangular matrix inversion (TMI) is presented, addressing a crucial kernel in the computation of multi-mesh head models for electroencephalographic (EEG) source localization. This includes not only a highly efficient implementation of the routine itself, achieving excellent speedups versus an optimized CPU implementation, but also a novel GPU-friendly compressed storage scheme for triangular matrices. Secondly, a scalable multi-GPU solver for non-Hermitian linear systems was implemented. It is integrated into a simulation environment for electrical impedance tomography (EIT) that requires frequent solution of complex systems with millions of unknowns, a task that this solution can perform within seconds. In terms of computational throughput, it outperforms not only a highly optimized multi-CPU reference but related GPU-based work as well.
Finally, a GPU-accelerated graphical EEG real-time source localization software was implemented. Thanks to the acceleration, it can meet real-time requirements at unprecedented anatomical detail while running more complex localization algorithms. Additionally, a novel implementation to extract anatomical priors from static Magnetic Resonance (MR) scans has been included.
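The abstract does not describe the thesis's compressed storage scheme itself. As a hypothetical illustration of the general idea behind packed triangular storage, a dense lower-triangular matrix can be flattened into a 1-D array of n(n+1)/2 entries with a closed-form index mapping; the names `packed_index` and `pack_lower` below are illustrative, not taken from the thesis:

```python
# Minimal sketch of row-major packed storage for a lower-triangular matrix.
# This halves memory versus dense storage; a GPU-friendly variant would
# additionally pad for coalesced access, which is not shown here.

def packed_index(i: int, j: int) -> int:
    """Offset of element (i, j), with i >= j, in the packed array."""
    return i * (i + 1) // 2 + j

def pack_lower(a):
    """Flatten the lower triangle of a square matrix into n*(n+1)/2 slots."""
    n = len(a)
    return [a[i][j] for i in range(n) for j in range(i + 1)]

A = [[1, 0, 0],
     [2, 3, 0],
     [4, 5, 6]]
packed = pack_lower(A)                    # [1, 2, 3, 4, 5, 6]
assert packed[packed_index(2, 1)] == 5    # element A[2][1]
```

The mapping is bijective on the lower triangle, so kernels can address packed elements in O(1) without reconstructing the dense matrix.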
Abstract:
Chemical cross-linking has emerged as a powerful approach for the structural characterization of proteins and protein complexes. However, the correct identification of covalently linked (cross-linked or XL) peptides analyzed by tandem mass spectrometry is still an open challenge. Here we present SIM-XL, a software tool that can analyze data generated through commonly used cross-linkers (e.g., BS3/DSS). Our software introduces a new paradigm for search-space reduction, which ultimately accounts for its increase in speed and sensitivity. Moreover, our search engine is the first to capitalize on reporter ions for selecting tandem mass spectra derived from cross-linked peptides. It also provides a 2D interaction map and a spectrum-annotation tool unmatched by any other tool of its kind. We show SIM-XL to be more sensitive and faster than a competing tool when analyzing a data set obtained from the human HSP90. The software is freely available for academic use at http://patternlabforproteomics.org/sim-xl. A video demonstrating the tool is available at http://patternlabforproteomics.org/sim-xl/video. SIM-XL is the first tool to support XL data in the mzIdentML format; all data are thus available from the ProteomeXchange consortium (identifier PXD001677).
Abstract:
A photometric procedure for the determination of ClO⁻ in tap water, employing a miniaturized multicommuted flow analysis setup and an LED-based photometer, is described. The analytical procedure was implemented using leucocrystal violet (LCV; 4,4′,4″-methylidynetris(N,N-dimethylaniline), C₂₅H₃₁N₃) as a chromogenic reagent. Solenoid micropumps employed for propelling the solutions were assembled together with the photometer to compose a compact unit of small dimensions. After optimization of the control variables, the system was applied to the determination of ClO⁻ in samples of tap water; aiming at accuracy assessment, the samples were also analyzed by an independent method. Applying the paired t-test to the results obtained with both methods, no significant difference at the 95% confidence level was observed. Other useful features include low reagent consumption (2.4 µg of LCV per determination), a linear response ranging from 0.02 up to 2.0 mg L⁻¹ ClO⁻, a relative standard deviation of 1.0% (n = 11) for samples containing 0.2 mg L⁻¹ ClO⁻, a detection limit of 6.0 µg L⁻¹ ClO⁻, a sampling throughput of 84 determinations per hour, and a waste generation of 432 µL per determination.
Abstract:
A green and highly sensitive analytical procedure was developed for the determination of free chlorine in natural waters, based on the reaction with N,N-diethyl-p-phenylenediamine (DPD). The flow system was designed with solenoid micro-pumps in order to improve mixing conditions by pulsed flows and to minimize reagent consumption as well as waste generation. A 100-cm optical path flow cell based on a liquid core waveguide was employed to increase sensitivity. A linear response was observed within the range 10.0 to 100.0 µg L⁻¹, with the detection limit, coefficient of variation and sampling rate estimated as 6.8 µg (99.7% confidence level), 0.9% (n = 20) and 60 determinations per hour, respectively. The consumption of the most toxic reagent (DPD) was reduced 20,000-fold and 30-fold in comparison to the batch method and flow injection with continuous reagent addition, respectively. The results for natural and tap water samples agreed with those obtained by the reference batch spectrophotometric procedure at the 95% confidence level. (C) 2010 Elsevier B.V. All rights reserved.
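Figures of merit like those reported in the two abstracts above (linear range, detection limit, coefficient of variation) follow from standard calibration arithmetic. A minimal sketch, using made-up calibration data and the common 3·s(blank)/slope convention for the detection limit, none of it taken from the papers themselves:

```python
# Ordinary least-squares calibration and detection-limit estimate.
# All numeric values below are hypothetical, for illustration only.

def linear_fit(x, y):
    """Least-squares slope and intercept of y vs. x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx

# Hypothetical calibration: absorbance vs. concentration (µg L⁻¹).
conc = [10.0, 25.0, 50.0, 75.0, 100.0]
absorbance = [0.021, 0.052, 0.103, 0.155, 0.206]
slope, intercept = linear_fit(conc, absorbance)

# Detection limit by the widely used 3·s(blank)/slope convention.
s_blank = 0.0005              # assumed standard deviation of blank readings
lod = 3 * s_blank / slope     # in µg L⁻¹
```

With real data one would also check linearity (residuals, R²) over the claimed working range before reporting the figures of merit.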
Abstract:
Glyoxalated soy flour adhesives for wood particleboard, blended with a much smaller proportion of glyoxalated lignin or tannin and without any addition of formaldehyde or formaldehyde-based resin, are shown to yield results satisfying the relevant standard specifications for interior wood boards. Adhesive resin formulations in which the total content of natural material is either 70 or 80% of the total resin solids content gave good results. The resins comprising 70% by weight of natural material can be used in a much lower proportion on wood chips and can afford pressing times fast enough to be significant under industrial panel pressing conditions. The best formulation of all those tried was the one based on glyoxalated precooked soy flour (SG), to which a condensed tannin (T) in water solution and a polymeric isocyanate (pMDI) were added, with the proportions of the components SG/T/pMDI being 54/16/30 by weight. (C) 2008 Wiley Periodicals, Inc.
Abstract:
For the last decade, elliptic curve cryptography has gained increasing interest in industry and in the academic community. This is especially due to the high level of security it provides with relatively small keys and to its ability to create very efficient and multifunctional cryptographic schemes by means of bilinear pairings. Pairings require pairing-friendly elliptic curves and, among the possible choices, Barreto-Naehrig (BN) curves arguably constitute one of the most versatile families. In this paper, we further expand the potential of the BN curve family. We describe BN curves that are not only computationally very simple to generate, but also especially suitable for efficient implementation in a very broad range of scenarios. We also present implementation results of the optimal ate pairing using such a curve defined over a 254-bit prime field. (C) 2001 Elsevier Inc. All rights reserved.
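What makes BN curves "computationally very simple to generate" is that the whole family is parameterized by a single integer u through fixed polynomials (these formulas are standard for BN curves; the particular 254-bit u used in the paper is not reproduced here, and the value below is an arbitrary illustrative one):

```python
# BN family parameterization (standard):
#   p(u) = 36u^4 + 36u^3 + 24u^2 + 6u + 1   field characteristic
#   t(u) = 6u^2 + 1                          Frobenius trace
#   n(u) = p(u) + 1 - t(u)                   curve group order
# Generating a curve amounts to searching u until p(u) and n(u) are prime.

def bn_params(u: int):
    p = 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1
    t = 6 * u**2 + 1
    return p, t, p + 1 - t

u = 2**62 - 2**54 + 2**44    # illustrative sparse value, NOT the paper's u
p, t, n = bn_params(u)

# Polynomial identity: n(u) = 36u^4 + 36u^3 + 18u^2 + 6u + 1.
assert n == 36 * u**4 + 36 * u**3 + 18 * u**2 + 6 * u + 1
```

Sparse (low Hamming weight) choices of u are popular in implementations because they speed up the Miller loop of the pairing; a real generator would additionally run primality tests on p(u) and n(u).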
Abstract:
An efficient and green synthesis of thiocarbamoyl-3,5-diaryl-4,5-dihydro-1H-pyrazoles via the condensation of chalcones with thiosemicarbazide in ethanol and KOH under ultrasound irradiation is reported. The products were isolated in good yields after short reaction times. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
The commercially available Jacobsen catalyst, Mn(salen), was occluded in hybrid polymeric membranes based on poly(dimethylsiloxane) (PDMS) and poly(vinyl alcohol) (PVA). The obtained systems were characterized by UV-vis spectroscopy and SEM techniques. The membranes were used as a catalytic barrier between two different phases: an organic substrate phase (cyclooctene or styrene) in the absence of solvent, and an aqueous solution of either t-BuOOH or H₂O₂. Membranes containing different percentages of PVA were prepared, in order to modulate their hydrophilic/hydrophobic swelling properties. The occluded complex proved to be an efficient catalyst for the oxidation of alkenes. The new triphasic system containing a cheap and easily available catalyst allowed substrate oxidation and easy product separation using "green" oxidants. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Background: Provision of health information to people with aphasia is inadequate. Current practice in providing printed health education materials to people with aphasia does not routinely take into consideration their language and associated reading difficulties. Aims: This study aimed to investigate whether people with aphasia can comprehend health information contained in printed health education materials and whether the application of aphasia-friendly principles is effective in assisting them to comprehend health information. It was hypothesised that participants with aphasia would comprehend significantly more information from aphasia-friendly materials than from existing materials. Other aims included investigating whether the effectiveness of the aphasia-friendly principles is related to aphasia severity, whether people with aphasia are more confident in responding to health information questions after they have read the aphasia-friendly material, whether they prefer to read the aphasia-friendly brochures, and whether they prefer to read the brochure type that resulted in the greatest increase in their knowledge. Methods & Procedures: Twelve participants with mild to moderately severe aphasia were matched according to their reading abilities. A pre- and post-test experimental design was employed with repeated measures ANOVA (p
Abstract:
Graphics processors were originally developed for rendering graphics but have recently evolved into an architecture for general-purpose computations. They are also expected to become important parts of embedded systems hardware -- not just for graphics. However, this necessitates the development of appropriate timing analysis techniques, because techniques developed for CPU scheduling are not applicable: we are not interested in how long it takes for any given GPU thread to complete, but rather how long it takes for all of them to complete. We therefore develop a simple method for finding an upper bound on the makespan of a group of GPU threads executing the same program and competing for the resources of a single streaming multiprocessor (whose architecture is based on NVIDIA Fermi, with some simplifying assumptions). We then build upon this method to formulate the derivation of the exact worst-case makespan (and the corresponding schedule) as an optimization problem. Addressing the issue of tractability, we also present a technique for efficiently computing a safe estimate of the worst-case makespan with minimal pessimism, which may be used when finding an exact value would take too long.
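The abstract does not reproduce the bounding method itself. As a deliberately crude sketch of the flavor of such a bound, under toy assumptions (one warp-instruction issued per cycle on the SM, every warp runs the same straight-line program, warps wait only for the issue slot): at most one instruction issues per cycle, so total issue slots bound the makespan, plus one worst-case latency for the last instruction to retire. All names and numbers below are hypothetical:

```python
# Toy safe upper bound on the makespan of W identical warps on one SM.
# NOT the paper's analysis; a real Fermi model must account for dual issue,
# memory contention, and divergence, none of which are modeled here.

def makespan_upper_bound(num_warps: int, instructions_per_warp: int,
                         worst_case_latency: int) -> int:
    """Issue-bound work (one instruction per cycle) plus one worst-case
    latency tail for the final instruction to complete."""
    return num_warps * instructions_per_warp + worst_case_latency

bound = makespan_upper_bound(num_warps=48, instructions_per_warp=200,
                             worst_case_latency=400)   # cycles
```

Such a bound is safe but pessimistic; the paper's optimization-based formulation targets exactly this gap between a trivially safe estimate and the true worst-case makespan.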
Abstract:
Graphics processing units (GPUs) today can be used for computations that go beyond graphics, and such use can attain performance that is orders of magnitude greater than that of a normal processor. The software executing on a graphics processor is composed of a set of (often thousands of) threads which operate on different parts of the data and thereby jointly compute a result that is delivered to another thread executing on the main processor. Hence the response time of a thread executing on the main processor depends on the finishing time of the threads executing on the GPU. We therefore present a simple method for calculating an upper bound on the finishing time of threads executing on a GPU, in particular NVIDIA Fermi. Developing such a method is nontrivial because threads executing on a GPU share hardware resources at very fine granularity.