918 results for parallel execution


Relevance:

20.00%

Abstract:

A global recursive bisection algorithm is described for computing the complex zeros of a polynomial. It has complexity O(n^3 p), where n is the degree of the polynomial and p the bit precision requirement. If n processors are available, it can be realized in parallel with complexity O(n^2 p); it can also be implemented using exact arithmetic. A combined Wilf-Hansen algorithm is suggested to reduce the complexity.

Relevance:

20.00%

Abstract:

The stability characteristics of parallel magnetic fields when fluid motions are present along the lines of force are studied. The stability criteria for both symmetric (m=0) and asymmetric (m=1) modes are discussed, and the results obtained by Trehan and Singh (1978) are amended in the present study. The results obtained for the cylindrical geometry are shown to play an important role for ka < 4, where k is the wave number and a is the radius of the cylinder, compared to the results obtained by Geronicolas (1977) for the slab geometry.

Relevance:

20.00%

Abstract:

Solutions of the exact characteristic equations for the title problem derived earlier by an extension of Bolotin's asymptotic method are considered. These solutions, which correspond to flexural modes with frequency factor, R, greater than unity, are expressed in convenient forms for all combinations of clamped, simply supported and free conditions at the remaining pair of parallel edges. As in the case of uniform beams, the eigenvalues in the CC case are found to be equal to those of elastic modes in the FF case provided that Kirchhoff's shear condition at a free edge is replaced by the condition. The flexural modes with frequency factor less than unity are also investigated in detail by introducing a suitable modification in the procedure. When Poisson's ratios are not zero, it is shown that the frequency factor corresponding to the first symmetric mode in the free-free case is less than unity for all values of side ratio and rigidity ratios. In the case of one edge clamped and the other free, it is found that modes with frequency factor less than unity exist for certain dimensions of the plate, a fact hitherto unrecognized in the literature.

Relevance:

20.00%

Abstract:

Novel Biginelli dihydropyrimidines of biological interest were prepared using p-toluene sulphonic acid as an efficient catalyst. All thirty-two synthesised dihydropyrimidines were evaluated for their in vitro antioxidant activity using the DPPH method. Only compounds 28 and 29 exhibited reasonably good antioxidant activity. Furthermore, the synthesised Biginelli compounds were tested for their in vitro anticancer activity against MCF-7 human breast cancer cells. The title compounds were tested at a concentration of 10 μg. In terms of percentage cytotoxicity, the compounds exhibited activity ranging from weak to moderate and from moderate to high. Among them, compounds 10 and 11 exhibited significant anticancer activity. In order to elucidate the three-dimensional structure–activity relationships (3D QSAR) for their anticancer activity, we subjected them to comparative molecular similarity indices analysis (CoMSIA). Their synthesis, analysis, antioxidant activity, anticancer activity and 3D QSAR study are described.

Relevance:

20.00%

Abstract:

An important question which has to be answered in evaluating the suitability of a microcomputer for a control application is the time it would take to execute the specified control algorithm. In this paper, we present a method of obtaining closed-form formulas to estimate this time. These formulas are applicable to control algorithms in which arithmetic operations and matrix manipulations dominate. The method does not require writing detailed programs for implementing the control algorithm. Using this method, the execution times of a variety of control algorithms on a range of 16-bit minicomputers and recently announced microcomputers are calculated. The formulas have been verified independently by an analysis program, which computes the execution time bounds of control algorithms coded in Pascal when they are run on a specified micro- or minicomputer.
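As a rough illustration of what a closed-form execution-time estimate can look like, the sketch below counts the multiplications and additions in a dense matrix-vector product and weights them by per-operation times. This is not the formula from the paper: the function names and the per-operation timings are hypothetical, and the sketch ignores loads, stores and loop overhead.

```python
def matvec_op_counts(rows, cols):
    """Arithmetic operation counts for a dense (rows x cols) matrix-vector product."""
    return {"mul": rows * cols, "add": rows * (cols - 1)}


def estimate_time_us(op_counts, op_times_us):
    """Closed-form estimate: sum over operations of count * time per operation."""
    return sum(count * op_times_us[op] for op, count in op_counts.items())


# Hypothetical per-operation times in microseconds (not measured values).
op_times_us = {"mul": 10.0, "add": 2.0}

# State feedback u = -K x with a 2 x 4 gain matrix K (illustrative sizes).
counts = matvec_op_counts(2, 4)
estimate = estimate_time_us(counts, op_times_us)
print(counts, estimate)
```

Because the estimate depends only on the problem dimensions and a table of operation times, it can be evaluated for many candidate machines without writing the control program itself.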

Relevance:

20.00%

Abstract:

Multiresolution synthetic aperture radar (SAR) image formation has been proven to be beneficial in a variety of applications such as improved imaging and target detection as well as speckle reduction. SAR signal processing traditionally carried out in the Fourier domain has inherent limitations in the context of image formation at hierarchical scales. We present a generalized approach to the formation of multiresolution SAR images using the biorthogonal shift-invariant discrete wavelet transform (SIDWT) in both range and azimuth directions. Particularly in azimuth, the inherent subband decomposition property of the wavelet packet transform is introduced to produce multiscale complex matched filtering without involving any approximations. This generalized approach also includes the formulation of multilook processing within the discrete wavelet transform (DWT) paradigm. The efficiency of the algorithm when executed in parallel to generate hierarchical-scale SAR images is shown. Analytical results and sample imagery of diffuse backscatter are presented to validate the method.

Relevance:

20.00%

Abstract:

Background: Ankylosing spondylitis (AS) is an immune-mediated arthritis particularly targeting the spine and pelvis and is characterised by inflammation, osteoproliferation and frequently ankylosis. Current treatments that predominately target inflammatory pathways have disappointing efficacy in slowing disease progression. Thus, a better understanding of the causal association and pathological progression from inflammation to bone formation, particularly whether inflammation directly initiates osteoproliferation, is required. Methods: The proteoglycan-induced spondylitis (PGISp) mouse model of AS was used to histopathologically map the progressive axial disease events, assess molecular changes during disease progression and define disease progression using unbiased clustering of semi-quantitative histology. PGISp mice were followed over a 24-week time course. Spinal disease was assessed using a novel semi-quantitative histological scoring system that independently evaluated the breadth of pathological features associated with PGISp axial disease, including inflammation, joint destruction and excessive tissue formation (osteoproliferation). Matrix components were identified using immunohistochemistry. Results: Disease initiated with inflammation at the periphery of the intervertebral disc (IVD) adjacent to the longitudinal ligament, reminiscent of enthesitis, and was associated with upregulated tumor necrosis factor and metalloproteinases. After a lag phase, established inflammation was temporospatially associated with destruction of IVDs, cartilage and bone. At later time points, advanced disease was characterised by substantially reduced inflammation, excessive tissue formation and ectopic chondrocyte expansion. These distinct features differentiated affected mice into early, intermediate and advanced disease stages. Excessive tissue formation was observed in vertebral joints only if the IVD was destroyed as a consequence of the early inflammation. Ectopic excessive tissue was predominantly chondroidal with chondrocyte-like cells embedded within collagen type II- and X-rich matrix. This corresponded with upregulation of mRNA for cartilage markers Col2a1, sox9 and Comp. Osteophytes, though infrequent, were more prevalent in later disease. Conclusions: The inflammation-driven IVD destruction was shown to be a prerequisite for axial disease progression to osteoproliferation in the PGISp mouse. Osteoproliferation led to vertebral body deformity and fusion but was never seen concurrent with persistent inflammation, suggesting a sequential process. The findings support that early intervention with anti-inflammatory therapies will be needed to limit destructive processes and consequently prevent progression of AS.

Relevance:

20.00%

Abstract:

A diastereomeric mixture of the tripeptide Boc-Ala-Ile-Aib-OMe crystallized in the space group P1 from CH3OH/H2O. The unit cell parameters are a = 10.593(2) Å, b = 14.377(3) Å, c = 17.872(4) Å, α = 104.41(2)°, β = 90.55(2)°, γ = 106.91(2)°, V = 2512.4 Å^3, Z = 4. X-ray crystallographic studies show the presence of four molecules in the asymmetric unit consisting of two pairs of diastereomeric peptides, Boc-L-Ala-L-Ile-Aib-OMe and Boc-L-Ala-D-Ile-Aib-OMe. The four molecules in the asymmetric unit form a rarely found mixed antiparallel and parallel β-sheet hydrogen bond motif. The Ala and (L,D)-Ile residues in all four molecules adopt extended conformations, while the φ, ψ values of the Aib residues are in the right-handed helical region. In one of the molecules the Ile side chain adopts the unusual gauche conformation about the Cβ-Cγ bond.

Relevance:

20.00%

Abstract:

We examine the interaction between commodity taxes and parallel imports in a two-country model with imperfect competition. While governments determine their commodity tax rates non-cooperatively, the volume of parallel imports is determined endogenously by the retailing sector. We compare the positive and normative implications of basing commodity taxes on the destination or the origin principle. We show that, as the volume of parallel imports increases, non-cooperative origin taxes converge, while destination taxes diverge. Moreover, origin taxes are more similar and lead to higher aggregate welfare levels than destination taxes.

Relevance:

20.00%

Abstract:

Emerging embedded applications are based on evolving standards (e.g., MPEG2/4, H.264/265, IEEE802.11a/b/g/n). Since most of these applications run on handheld devices, there is an increasing need for a single chip solution that can dynamically interoperate between different standards and their derivatives. In order to achieve high resource utilization and low power dissipation, we propose REDEFINE, a polymorphic ASIC in which specialized hardware units are replaced with basic hardware units that can create the same functionality by runtime re-composition. It is a "future-proof" custom hardware solution for multiple applications and their derivatives in a domain. In this article, we describe a compiler framework and supporting hardware comprising compute, storage, and communication resources. Applications described in a high-level language (e.g., C) are compiled into application substructures. For each application substructure, a set of compute elements (CEs) on the hardware are interconnected during runtime to form a pattern that closely matches the communication pattern of that particular application. The advantage is that the bounded CEs are neither processor cores nor logic elements as in FPGAs. Hence, REDEFINE offers the power and performance advantage of an ASIC and the hardware reconfigurability and programmability of an FPGA/instruction set processor. In addition, the hardware supports custom instruction pipelining. Existing instruction-set extensible processors determine a sequence of instructions that repeatedly occur within the application to create custom instructions at design time to speed up the execution of this sequence. We extend this scheme further, where a kernel is compiled into custom instructions that bear a strong producer-consumer relationship (and are not limited to frequently occurring sequences of instructions). 
Custom instructions, realized as hardware compositions effected at runtime, allow several instances of the same to be active in parallel. A key distinguishing factor in the majority of the emerging embedded applications is stream processing. To reduce the overheads of data transfer between custom instructions, direct communication paths are employed among custom instructions. In this article, we present an overview of the hardware-aware compiler framework, which determines the NoC-aware schedule of transports of the data exchanged between the custom instructions on the interconnect. The results for the FFT kernel indicate a 25% reduction in the number of loads/stores, and throughput improves by log(n) for n-point FFT when compared to a sequential implementation. Overall, REDEFINE offers flexibility and runtime reconfigurability at the expense of 1.16x in power and 8x in area when compared to an ASIC. The REDEFINE implementation consumes 0.1x the power of an FPGA implementation. In addition, the configuration overhead of the FPGA implementation is 1,000x more than that of REDEFINE.

Relevance:

20.00%

Abstract:

An axis-parallel k-dimensional box is a Cartesian product R_1 × R_2 × ... × R_k, where each R_i (for 1 ≤ i ≤ k) is a closed interval of the form [a_i, b_i] on the real line. For a graph G, its boxicity box(G) is the minimum dimension k such that G is representable as the intersection graph of (axis-parallel) boxes in k-dimensional space. The concept of boxicity finds applications in various areas such as ecology and operations research. A number of NP-hard problems are either polynomial-time solvable or have a much better approximation ratio on low-boxicity graphs. For example, the max-clique problem is polynomial-time solvable on bounded-boxicity graphs, and the maximum independent set problem for boxicity-d graphs, given a box representation, has a ⌊1 + 1/c log n⌋^(d-1) approximation ratio for any constant c ≥ 1 when d ≥ 2. In most cases, the first step is computing a low-dimensional box representation of the given graph. Deciding whether the boxicity of a graph is at most 2 is itself NP-hard. We give an efficient randomized algorithm to construct a box representation of any graph G on n vertices in ⌊(Δ + 2) ln n⌋ dimensions, where Δ is the maximum degree of G. This algorithm implies that box(G) ≤ ⌊(Δ + 2) ln n⌋ for any graph G. Our bound is tight up to a factor of ln n. We also show that our randomized algorithm can be derandomized to get a polynomial-time deterministic algorithm. Though our general upper bound is in terms of the maximum degree Δ, we show that for almost all graphs on n vertices, the boxicity is O(d_av ln n), where d_av is the average degree.
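The definition of a box representation can be made concrete with a minimal Python sketch. It only builds the intersection graph of a given set of axis-parallel boxes; it is not the randomized construction from the abstract. The function names and the example boxes are assumptions chosen for illustration.

```python
from itertools import combinations


def boxes_intersect(box_a, box_b):
    """Axis-parallel boxes intersect iff their closed intervals overlap in every dimension."""
    return all(lo_a <= hi_b and lo_b <= hi_a
               for (lo_a, hi_a), (lo_b, hi_b) in zip(box_a, box_b))


def intersection_graph(boxes):
    """Edge set of the intersection graph of a dict {vertex: [(lo, hi), ...]}."""
    return {frozenset((u, v))
            for u, v in combinations(boxes, 2)
            if boxes_intersect(boxes[u], boxes[v])}


# Hypothetical 2-dimensional box representation of the 4-cycle a-b-c-d.
boxes = {
    "a": [(0, 1), (0, 3)],
    "b": [(0, 3), (2, 3)],
    "c": [(2, 3), (0, 3)],
    "d": [(0, 3), (0, 1)],
}
edges = intersection_graph(boxes)
print(sorted(tuple(sorted(e)) for e in edges))
```

The 4-cycle is not an interval graph, so it has no 1-dimensional representation; the two-dimensional boxes above therefore show that its boxicity is exactly 2.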

Relevance:

20.00%

Abstract:

We consider single-source, single-sink multi-hop relay networks, with slow-fading Rayleigh links and single-antenna relay nodes operating under the half-duplex constraint. While two-hop relay networks have been studied in great detail in terms of the diversity-multiplexing tradeoff (DMT), few results are available for more general networks. In this two-part paper, we identify two families of networks that are multi-hop generalizations of the two-hop network: K-Parallel-Path (KPP) networks and Layered networks. In the first part, we initially consider KPP networks, which can be viewed as the union of K node-disjoint parallel paths, each of length > 1. The results are then generalized to KPP(I) networks, which permit interference between paths, and to KPP(D) networks, which possess a direct link from source to sink. We characterize the optimal DMT of KPP(D) networks with K >= 4, and KPP(I) networks with K >= 3. Along the way, we derive lower bounds for the DMT of triangular channel matrices, which are useful in the DMT computation of various protocols. As a special case, the DMT of the two-hop relay network without a direct link is obtained. Two key implications of the results in the two-part paper are that the half-duplex constraint does not necessarily entail rate loss by a factor of two, as previously believed, and that simple AF protocols are often sufficient to attain the best possible DMT.

Relevance:

20.00%

Abstract:

Parallel programming and effective partitioning of applications for embedded many-core architectures require optimization algorithms. However, these algorithms have to quickly evaluate thousands of different partitions. We present a fast performance estimator embedded in a parallelizing compiler for streaming applications. The estimator combines a single execution-based simulation and an analytic approach. Experimental results demonstrate that the estimator has a mean error of 2.6% and computes its estimation 2848 times faster than a cycle-accurate simulator.

Relevance:

20.00%

Abstract:

Computational docking of ligands to protein structures is a key step in structure-based drug design. Currently, the time required for each docking run is high and thus limits the use of docking in a high-throughput manner, warranting parallelization of docking algorithms. AutoDock, a widely used tool, has been chosen for parallelization. Near-linear increases in speed were observed with 96 processors. As an example, the time required for docking ligands to HIV-protease was reduced from 81 min on a single IBM Power-5 processor (1.65 GHz) to about 1 min on an IBM cluster with 96 such processors. This implementation would make it feasible to perform virtual ligand screening using AutoDock.
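The reported timings can be turned into the usual speedup and parallel-efficiency figures. The short Python sketch below does this arithmetic; since the abstract gives the parallel time only as "about 1 min", the resulting values are approximate, and the function names are illustrative.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / P."""
    return speedup(t_serial, t_parallel) / n_procs


# Timings from the abstract: 81 min on one processor, about 1 min on 96.
s = speedup(81.0, 1.0)
e = efficiency(81.0, 1.0, 96)
print(f"speedup ~ {s:.0f}x, efficiency ~ {e:.0%}")
```

A speedup of roughly 81 on 96 processors corresponds to an efficiency of about 84%, which is consistent with the abstract's description of near-linear scaling.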

Relevance:

20.00%

Abstract:

The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general-purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism, which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem, both scheduling and assignment of filters to processors, as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in the GPU, to exploit data parallelism, and the multiprocessors, to exploit task and pipeline parallelism. Further, it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single-threaded CPU.