15 results for Parallel Computation
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Modern GPUs are well suited to computationally intensive tasks and massively parallel computation. Sparse matrix multiplication and the linear triangular solver are the most important and heavily used kernels in scientific computation, and several challenges in developing a high-performance kernel with the two modules are investigated. The main interest is to solve linear systems derived from elliptic equations with triangular elements. The resulting linear system has a symmetric positive definite matrix. The sparse matrix is stored in the compressed sparse row (CSR) format. A CUDA algorithm is proposed to execute the matrix-vector multiplication directly on the CSR format. A dependence tree algorithm is used to determine which variables the linear triangular solver can compute in parallel. To increase the number of parallel threads, a graph coloring algorithm is implemented to reorder the mesh numbering in a pre-processing phase. The proposed method is compared with available parallel and serial libraries. The results show that the proposed method reduces the computational cost of the matrix-vector multiplication. The pre-processing associated with the triangular solver needs to be executed just once in the proposed method. The conjugate gradient method was implemented and showed a similar convergence rate for all the compared methods. The proposed method showed significantly smaller execution time.
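The abstract does not reproduce the kernel itself. As a rough illustration of what a matrix-vector product over the CSR format involves, here is a minimal sketch in plain Python rather than CUDA. The function name `csr_matvec` and the array names `values`, `col_idx`, and `row_ptr` are assumptions chosen for this example, not identifiers from the paper.

```python
from typing import List, Sequence


def csr_matvec(values: Sequence[float],
               col_idx: Sequence[int],
               row_ptr: Sequence[int],
               x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form (illustrative only)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    # Each row is independent of the others, which is what makes this loop
    # a natural candidate for one GPU thread (or group of threads) per row.
    for row in range(n_rows):
        acc = 0.0
        # Nonzeros of this row live in values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Small symmetric positive definite example:
# [[4, 1, 0],
#  [1, 3, 0],
#  [0, 0, 2]]
values = [4.0, 1.0, 1.0, 3.0, 2.0]
col_idx = [0, 1, 0, 1, 2]
row_ptr = [0, 2, 4, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

The outer loop carries no dependence between rows, so a GPU version could in principle assign rows to threads; how the paper's algorithm actually maps work to threads is not stated in the abstract.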
Abstract:
Despite their generality, conventional Volterra filters are inadequate for some applications, due to the huge number of parameters that may be needed for accurate modelling. When a state-space model of the target system is known, this can be assessed by computing its kernels, which also provides valuable information for choosing an adequate alternate Volterra filter structure, if necessary, and is useful for validating parameter estimation procedures. In this letter, we derive expressions for the kernels by using the Carleman bilinearization method, for which an efficient algorithm is given. Simulation results are presented, which confirm the usefulness of the proposed approach.
Abstract:
This long-term extension of an 8-week randomized, naturalistic study in patients with panic disorder with or without agoraphobia compared the efficacy and safety of clonazepam (n = 47) and paroxetine (n = 37) over a 3-year total treatment duration. Target doses for all patients were 2 mg/d clonazepam and 40 mg/d paroxetine (both taken at bedtime). This study reports data from the long-term period (34 months), following the initial 8-week treatment phase. Thus, total treatment duration was 36 months. Patients with a good primary outcome during acute treatment continued monotherapy with clonazepam or paroxetine, but patients with partial primary treatment success were switched to the combination therapy. At initiation of the long-term study, the mean doses of clonazepam and paroxetine were 1.9 (SD, 0.30) and 38.4 (SD, 3.74) mg/d, respectively. These doses were maintained until month 36 (clonazepam 1.9 [SD, 0.29] mg/d and paroxetine 38.2 [SD, 3.87] mg/d). Long-term treatment with clonazepam led to a small but significantly better Clinical Global Impression (CGI)-Improvement rating than treatment with paroxetine (mean difference: CGI-Severity scale -3.48 vs -3.24, respectively, P = 0.02; CGI-Improvement scale 1.06 vs 1.11, respectively, P = 0.04). Both treatments similarly reduced the number of panic attacks and severity of anxiety. Patients treated with clonazepam had significantly fewer adverse events than those treated with paroxetine (28.9% vs 70.6%, P < 0.001). The efficacy of clonazepam and paroxetine for the treatment of panic disorder was maintained over the long-term course. There was a significant advantage with clonazepam over paroxetine with respect to the frequency and nature of adverse events.
Abstract:
In this article, we introduce two new variants of the Assembly Line Worker Assignment and Balancing Problem (ALWABP) that allow parallelization of and collaboration between heterogeneous workers. These new approaches entail an additional level of complexity in the Line Design and Assignment process, but also higher flexibility, which may be particularly useful in practical situations where the aim is to progressively integrate slow or limited workers in conventional assembly lines. We present linear models and heuristic procedures for these two new problems. Computational results show the efficiency of the proposed approaches and the efficacy of the studied layouts in different situations. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Data visualization techniques are powerful in the handling and analysis of multivariate systems. One such technique, known as parallel coordinates, was used to support the diagnosis of an event, detected by a neural network-based monitoring system, in a boiler at a Brazilian Kraft pulp mill. Its appeal lies in the ability to visualize several variables simultaneously. The diagnostic procedure was carried out step by step, going through exploratory, explanatory, confirmatory, and communicative goals. This tool made the boiler dynamics easier to visualize than the commonly used univariate trend plots. In addition, it facilitated the analysis of other aspects, namely relationships among process variables, distinct modes of operation, and discrepant data. The whole analysis revealed, first, that the period involving the detected event was associated with a transition between two distinct normal modes of operation and, second, the presence of unusual changes in process variables at this time.
Abstract:
Consider the NP-hard problem of finding, given a simple graph G, a series-parallel subgraph of G with the maximum number of edges. The algorithm that, given a connected graph G, outputs a spanning tree of G is a 1/2-approximation. Indeed, if n is the number of vertices in G, any spanning tree in G has n-1 edges and any series-parallel graph on n vertices has at most 2n-3 edges. We present a 7/12-approximation for this problem and results showing the limits of our approach.
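The 1/2-approximation argument above can be checked on a small instance. The following is a minimal sketch, assuming a connected graph with at least two vertices labelled 0..n-1; the function name `spanning_tree_edges` and the choice of breadth-first search are illustrative assumptions, since the abstract only requires some spanning tree. It does not implement the 7/12-approximation.

```python
from collections import deque
from typing import List, Tuple


def spanning_tree_edges(n: int, edges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the edges of a BFS spanning tree of a connected graph on n vertices."""
    adj: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = [False] * n
    seen[0] = True
    tree: List[Tuple[int, int]] = []
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                tree.append((u, v))  # tree edge discovered by BFS
                queue.append(v)
    return tree


# Complete graph K4: the spanning tree has n - 1 = 3 edges, while any
# series-parallel graph on 4 vertices has at most 2n - 3 = 5 edges.
n = 4
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
tree = spanning_tree_edges(n, k4_edges)
upper_bound = 2 * n - 3
```

Since 2(n-1) = 2n-2 exceeds 2n-3, the tree always has more than half as many edges as the upper bound on the optimum, which is the ratio the abstract states.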
Abstract:
As in the case of most small organic molecules, the electro-oxidation of methanol to CO2 is believed to proceed through a so-called dual pathway mechanism. The direct pathway proceeds via reactive intermediates such as formaldehyde or formic acid, whereas the indirect pathway occurs in parallel and proceeds via the formation of adsorbed carbon monoxide (COad). Despite the extensive literature on the electro-oxidation of methanol, no study to date has distinguished the production of CO2 from the direct and indirect pathways. Working under far-from-equilibrium, oscillatory conditions, we were able to decouple, for the first time, the direct and indirect pathways that lead to CO2 during the oscillatory electro-oxidation of methanol on platinum. The CO2 production was followed by differential electrochemical mass spectrometry, and the individual contributions of the parallel pathways were identified by a combination of experiments and numerical simulations. We believe that our report opens some perspectives, particularly as a methodology for identifying the role played by surface modifiers in the relative weight of both pathways, a key issue for the effective development of catalysts for low-temperature fuel cells.
Abstract:
Objective: Gastric development depends directly on the proliferation and differentiation of epithelial cells, and these processes are controlled by multiple elements, such as diet, hormones, and growth factors. Protein restriction affects gastrointestinal functions, but its effects on gastric growth are not fully understood. Methods: The present study evaluated cell proliferation in the gastric epithelia of rats subjected to protein restriction since gestation. Because ghrelin is increasingly expressed from the fetal to the weaning stages and might be part of growth regulation, its distribution in the stomach of rats was investigated at 14, 30, and 50 d old. Results: Although the protein restriction at 8% increased the intake of food and body weight, the body mass was lower (P < 0.05). The stomach and intestine were also smaller but increased proportionately throughout treatment. Cell proliferation was estimated through DNA synthesis and metaphase indices, and lower rates (P < 0.05) were detected at the different ages. The inhibition was concomitant with a larger number of ghrelin-immunolabeled cells at 30 and 50 d postnatally. Conclusion: Protein restriction impairs cell proliferation in the gastric epithelium, and a ghrelin upsurge under this condition is parallel to lower gastric and body growth rates. (C) 2012 Elsevier Inc. All rights reserved.
Abstract:
We study a strongly interacting "quantum dot 1" and a weakly interacting "dot 2" connected in parallel to metallic leads. Gate voltages can drive the system between Kondo-quenched and non-Kondo free-moment phases separated by Kosterlitz-Thouless quantum phase transitions. Away from the immediate vicinity of the quantum phase transitions, the physical properties retain signatures of first-order transitions found previously to arise when dot 2 is strictly noninteracting. As interactions in dot 2 become stronger relative to the dot-lead coupling, the free moment in the non-Kondo phase evolves smoothly from an isolated spin-one-half in dot 1 to a many-body doublet arising from the incomplete Kondo compensation by the leads of a combined dot spin-one. These limits, which feature very different spin correlations between dot and lead electrons, can be distinguished by weak-bias conductance measurements performed at finite temperatures.
Abstract:
This work evaluates the efficiency of economic levels of theory for the prediction of (3)J(HH) spin-spin coupling constants, to be used when robust electronic structure methods are prohibitive. To that purpose, DFT methods like mPW1PW91, B3LYP and PBEPBE were used to obtain coupling constants for a test set whose coupling constants are well known. Satisfactory results were obtained in most cases, with the mPW1PW91/6-31G(d,p)//B3LYP/6-31G(d,p) leading the set. In a second step, B3LYP was replaced by the semiempirical methods PM6 and RM1 in the geometry optimizations. Coupling constants calculated with these latter structures were at least as good as the ones obtained by pure DFT methods. This is a promising result, because some of the main objectives of computational chemistry - low computational cost and time, allied to high performance and precision - were attained together. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
We analytically study the input-output properties of a neuron whose active dendritic tree, modeled as a Cayley tree of excitable elements, is subjected to Poisson stimulus. Both single-site and two-site mean-field approximations incorrectly predict a nonequilibrium phase transition which is not allowed in the model. We propose an excitable-wave mean-field approximation which shows good agreement with previously published simulation results [Gollo et al., PLoS Comput. Biol. 5, e1000402 (2009)] and accounts for finite-size effects. We also discuss the relevance of our results to experiments in neuroscience, emphasizing the role of active dendrites in the enhancement of dynamic range and in gain control modulation.
Abstract:
Measurement-based quantum computation is an efficient model to perform universal computation. Nevertheless, theoretical questions have been raised, mainly with respect to realistic noise conditions. In order to shed some light on this issue, we evaluate the exact dynamics of some single-qubit-gate fidelities using the measurement-based quantum computation scheme when the qubits which are used as a resource interact with a common dephasing environment. We report a necessary condition for the fidelity dynamics of a general pure N-qubit state, interacting with this type of error channel, to present an oscillatory behavior, and we show that for the initial canonical cluster state, the fidelity oscillates as a function of time. This state fidelity oscillatory behavior brings significant variations to the values of the computational results of a generic gate acting on that state depending on the instants we choose to apply our set of projective measurements. As we shall see, considering some specific gates that are frequently found in the literature, the fast application of the set of projective measurements does not necessarily imply high gate fidelity, and likewise the slow application thereof does not necessarily imply low gate fidelity. Our condition for the occurrence of the fidelity oscillatory behavior shows that the oscillation presented by the cluster state is due exclusively to its initial geometry. Other states that can be used as resources for measurement-based quantum computation can present the same initial geometrical condition. Therefore, it is very important for the present scheme to know when the fidelity of a particular resource state will oscillate in time and, if this is the case, what are the best times to perform the measurements.
Abstract:
This paper presents a new parallel methodology for calculating the determinant of matrices of order n, with computational complexity O(n), using the Gauss-Jordan Elimination Method and Chio's Rule as references. We intend to present our step-by-step methodology using clear mathematical language, where we will demonstrate how to calculate the determinant of a matrix of order n in an analytical format. We will also present a computational model with one sequential algorithm and one parallel algorithm in pseudocode.
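The abstract cites Chio's Rule only as a reference and does not give the paper's algorithm. As background, here is a minimal sequential sketch of classical Chio condensation, which shrinks an n x n determinant to an (n-1) x (n-1) one at each step. The function name `det_chio`, the row swap used when the leading entry is zero, and the use of exact fractions are assumptions made for this example. It does not reproduce the O(n) parallel complexity claimed above.

```python
from fractions import Fraction
from typing import List, Sequence


def det_chio(matrix: Sequence[Sequence[int]]) -> Fraction:
    """Determinant of a square matrix by repeated Chio condensation (sequential sketch)."""
    a: List[List[Fraction]] = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign = 1
    scale = Fraction(1)  # product of pivot powers that the final value is divided by
    while n > 1:
        # Chio's rule needs a nonzero leading entry; swap a suitable row up if needed.
        pivot_row = next((i for i in range(n) if a[i][0] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # first column is all zeros, so the determinant is 0
        if pivot_row != 0:
            a[0], a[pivot_row] = a[pivot_row], a[0]
            sign = -sign  # a row swap flips the sign of the determinant
        p = a[0][0]
        # Every entry of the condensed matrix is an independent 2x2 determinant,
        # which is the part a parallel version could compute concurrently.
        a = [[p * a[i][j] - a[i][0] * a[0][j] for j in range(1, n)]
             for i in range(1, n)]
        scale *= p ** (n - 2)  # det(A) = det(B) / p^(n-2)
        n -= 1
    return sign * a[0][0] / scale


d2 = det_chio([[2, 1], [1, 3]])
d3 = det_chio([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
```

Exact `Fraction` arithmetic is used here only to keep the division by pivot powers free of rounding error in a small demonstration; a practical implementation would more likely work in floating point.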
Abstract:
Parallel kinematic structures are considered very suitable architectures for positioning and orienting the tools of robotic mechanisms. However, developing dynamic models for this kind of system is sometimes a difficult task. In fact, the direct application of traditional robotics methods for modelling and analysing such systems usually does not lead to efficient and systematic algorithms. This work addresses this issue: it presents a modular approach to generating the dynamic model and shows how, through some convenient modifications, these methods can be made more applicable to parallel structures as well. Kane’s formulation for obtaining the dynamic equations is shown to be one of the easiest ways to deal with redundant coordinates and kinematic constraints, so that a suitable choice of a set of coordinates allows the remainder of the modelling procedure to be computer aided. The advantages of this approach are discussed in the modelling of a 3-dof parallel asymmetric mechanism.
Abstract:
Cutting and packing problems arise in a variety of industries, including garment, wood and shipbuilding. Irregular shape packing is a special case which admits irregular items and is much more complex due to the geometry of the items. In order to ensure that items do not overlap and that no item protrudes from the container, the collision-free region concept was adopted. It represents all possible translations for a new item to be inserted into a container with already placed items. To construct a feasible layout, the collision-free region for each item is determined through a sequence of Boolean operations over polygons. In order to improve the speed of the algorithm, a parallel version of the layout construction was proposed and applied to a simulated annealing algorithm used to solve bin packing problems. Tests were performed in order to determine the speed improvement of the parallel version over the serial algorithm.