36 results for Quadratic, sieve, CUDA, OpenMP, SOC, Tegrak1
in Chinese Academy of Sciences Institutional Repositories Grid Portal
Abstract:
Quadratic optical nonlinearity χ(2) can be exploited in femtosecond lasers and regarded as a significant new degree of freedom for the design of short-pulse sources. We review our recent progress in developing quadratic nonlinear technologies for femtosecond lasers. Our nonlinear laser technology offers new capabilities for femtosecond lasers, including an optical parametric amplifier with a novel working regime, efficient second-harmonic generation, and a time telescope.
Abstract:
The coupled differential recurrence equations for the corrections to the paraxial approximation solutions in transversely nonuniform refractive-index media are established in terms of the perturbation method. All the corrections (including the longitudinal field corrections) to the paraxial approximation solutions are presented in the weak-guidance approximation. As a concrete application, the first-order longitudinal field correction and the second-order transverse field correction to the paraxial approximation of a Gaussian beam propagating in a transversely quadratic refractive index medium are analytically investigated. (C) 1999 Optical Society of America [S0740-3232(99)00310-5].
Abstract:
We demonstrate theoretically and experimentally compensation for positive Kerr phase shifts with negative phases generated by cascade quadratic processes. Experiments show correction of small-scale self-focusing and whole-beam self-focusing in the spatial domain and self-phase modulation in the temporal domain. (C) 2001 Optical Society of America.
Abstract:
Dynamic Power Management (DPM) is a technique for reducing the power consumption of electronic systems by selectively shutting down idle components. In this article we introduce back-propagation networks and radial basis function networks into research on system-level power management policies. We propose two PM policies based on Artificial Neural Networks (ANN): Back Propagation Power Management (BPPM) and Radial Basis Function Power Management (RBFPM). Our experiments show that both policies greatly lower system-level power consumption and outperform traditional Power Management (PM) techniques: BPPM is 1.09-competitive and RBFPM is 1.08-competitive, versus 1.79-, 1.45-, and 1.18-competitive for traditional timeout PM, adaptive predictive PM, and stochastic PM, respectively.
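A minimal sketch of how a competitive ratio like those quoted above might be computed for a timeout policy. It assumes that "c-competitive" means the policy uses at most c times the energy of an ideal offline policy that knows each idle period in advance. The power model (`p_idle`, `e_wake`, zero power while asleep) and all function names are hypothetical; the abstract does not describe the authors' model or workload.

```python
def offline_energy(idle_periods, p_idle, e_wake):
    """Energy of an ideal offline policy (assumed model).

    For each idle period it picks the cheaper of staying idle for the
    whole period or sleeping immediately and paying the wake-up cost.
    """
    return sum(min(p_idle * t, e_wake) for t in idle_periods)


def timeout_energy(idle_periods, p_idle, e_wake, timeout):
    """Energy of a fixed-timeout policy under the same assumed model."""
    total = 0.0
    for t in idle_periods:
        if t <= timeout:
            # The period ends before the timeout fires: idle the whole time.
            total += p_idle * t
        else:
            # Idle until the timeout, then sleep and pay the wake-up cost.
            total += p_idle * timeout + e_wake
    return total


def competitive_ratio(idle_periods, p_idle, e_wake, timeout):
    """Ratio of the timeout policy's energy to the offline optimum."""
    return (timeout_energy(idle_periods, p_idle, e_wake, timeout)
            / offline_energy(idle_periods, p_idle, e_wake))


# Hypothetical workload: three idle periods, timeout set to e_wake / p_idle.
ratio = competitive_ratio([1.0, 2.0, 10.0], p_idle=1.0, e_wake=4.0, timeout=4.0)
print(round(ratio, 3))  # 1.571
```

A ratio closer to 1 means the policy wastes less energy relative to the ideal; in this toy model the timeout policy pays only on the one long idle period.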
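Abstract:
With the release of dual-core and even quad-core processors from Intel and AMD, parallel computing has reached the PC. To take full advantage of multiple cores, existing programs must be restructured for multithreading so that they benefit from the performance gains of multicore processing. This paper uses OpenMP, the industry standard for shared-memory programming, to parallelize the element computation subroutine used in the finite element method. Tests on a dual-CPU SMP node of a cluster show that shared-memory parallelization doubled the performance of this element subroutine.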
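Abstract:
OpenMP is a shared-memory parallel programming standard that supports Fortran and C/C++. It is based on the fork-join model of parallel execution, which divides a program into parallel regions and serial regions. In recent years OpenMP has been widely used for parallel programming on SMP (Symmetric Multi-Processing) and multicore architectures. With the development of multicore processors, how real applications can make full use of multiple processor cores to improve computational efficiency has also become a research focus. In scientific computing, loops are among the most central targets of parallelization. Taking into account load balance, scheduling overhead, synchronization overhead, and other factors, the OpenMP standard defines several strategies, including static, dynamic, guided, and runtime scheduling. To address the weakness that guided scheduling is ill-suited to decreasing loops, this work proposes an improved new_guided scheduling strategy and implements it in the OMPi compiler. The main idea of new_guided is to apply static scheduling to the first half of the loop iterations and guided scheduling to the second half. In addition, this work evaluates the different scheduling strategies on multicore processors for different loop structures. The results show that, in general, OpenMP's default static strategy performs worst; for regular loops and increasing loops, dynamic, guided, and new_guided scheduling differ little in performance; for decreasing loops, dynamic and new_guided scheduling perform comparably and both outperform guided scheduling; and for random loop structures whose workload is concentrated in the middle, such as computing the Mandelbrot set, dynamic scheduling outperforms the other strategies, with new_guided falling between dynamic and guided. With the advent and development of multicore processors, multithreaded programming has become unavoidable. Sparse matrix-vector multiplication (SpMV) is a very important scientific computing kernel that is called heavily. Memory accesses in SpMV are generally highly irregular, which makes existing SpMV algorithms relatively inefficient. The number of cores on multicore processor chips is steadily increasing, which makes parallel acceleration of SpMV on multicore processors very important. This work introduces two commonly used sparse matrix storage formats, CSR and BCSR, and implements multicore parallel SpMV with OpenMP. It also discusses optimization techniques such as register blocking and column-index compression, as well as the effect of different scheduling strategies on multithreaded SpMV. Tests on a Dawning Tiankuo S4800A1 server show that most matrices achieve scalable, even superlinear, speedups, but for some larger matrices the speedup is not significant. In our tests, SpMV optimized with register blocking ran on average 28.09% faster than the multithreaded CSR-based SpMV. For the multithreaded CSR-based SpMV, the program with the column-index optimization ran on average 13.05% faster than before the optimization. In addition, this work implements a scheduling strategy based on the number of nonzero elements, in which each thread processes almost the same number of nonzeros. We tested and analyzed it against the three scheduling strategies provided by the OpenMP standard. The results show that, compared with the strategies provided by OpenMP, scheduling based on the number of nonzeros achieves better load balance; dynamic and guided scheduling perform about the same in multithreaded SpMV, and both outperform static scheduling.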
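The nonzero-based scheduling idea can be illustrated with a small sequential sketch: rows of a CSR matrix are split into contiguous ranges holding roughly equal numbers of nonzeros, and each range would then be handed to one thread. This is not the authors' implementation; the function names, the bisection-based split, and the tiny test matrix are assumptions made for illustration, and the threads are only simulated by looping over the ranges.

```python
from bisect import bisect_left


def partition_rows_by_nnz(row_ptr, num_threads):
    """Return num_threads + 1 row boundaries with balanced nonzero counts.

    row_ptr is the CSR row-pointer array, so row_ptr[i] is the number of
    nonzeros stored before row i.
    """
    nnz = row_ptr[-1]
    n_rows = len(row_ptr) - 1
    bounds = [0]
    for t in range(1, num_threads):
        target = nnz * t / num_threads
        # First row boundary whose cumulative nonzero count reaches the target.
        r = bisect_left(row_ptr, target)
        bounds.append(min(max(r, bounds[-1]), n_rows))
    bounds.append(n_rows)
    return bounds


def spmv_csr_rows(row_ptr, col_idx, values, x, start, end):
    """Compute y[start:end] of y = A @ x for a CSR matrix A."""
    y = []
    for i in range(start, end):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Hypothetical 4x4 matrix: row 0 is dense, the other rows are sparse.
row_ptr = [0, 4, 5, 6, 8]
col_idx = [0, 1, 2, 3, 1, 2, 0, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [1.0, 1.0, 1.0, 1.0]

bounds = partition_rows_by_nnz(row_ptr, num_threads=2)
print(bounds)  # [0, 1, 4]

# Each (start, end) range would be processed by one thread.
y = []
for start, end in zip(bounds, bounds[1:]):
    y.extend(spmv_csr_rows(row_ptr, col_idx, values, x, start, end))
print(y)  # [10.0, 5.0, 6.0, 15.0]
```

With two threads, an equal-rows split would assign 5 and 3 nonzeros, whereas the nonzero-based split above assigns 4 and 4.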
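Abstract:
In scientific computing, loops are among the most important targets of parallelization. Taking into account load balance, scheduling overhead, and other factors, the OpenMP standard provides several strategies, including static, dynamic, guided, and runtime scheduling. To address the problem that guided scheduling is ill-suited to decreasing loops, an improved guided scheduling strategy, new_guided, is proposed and implemented in the OMPi compiler. The main idea of new_guided is to apply static scheduling to the first half of the loop iterations and guided scheduling to the second half. The different scheduling strategies are evaluated on multicore processors for different loop structures. The results show that, in general, OpenMP's default static strategy performs worst; for regular and increasing loops, dynamic, guided, and new_guided scheduling differ little in performance; for decreasing loops, dynamic and new_guided scheduling perform comparably and both outperform guided scheduling; and for some highly irregular random loops, dynamic scheduling clearly outperforms the other strategies, with new_guided falling between dynamic and guided.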
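The chunking behind new_guided can be sketched as follows. The abstract states only the main idea (static scheduling for the first half of the iterations, guided scheduling for the second half), so the details here are assumptions: the static part is split into one near-equal block per thread, and the guided chunk size is taken as the remaining iterations divided by the thread count, rounded up. The function names are hypothetical, and the actual OMPi implementation may differ.

```python
def guided_chunks(start, end, num_threads, min_chunk=1):
    """Chunks of decreasing size over [start, end) (assumed guided formula)."""
    chunks = []
    pos = start
    while pos < end:
        remaining = end - pos
        # Ceiling of remaining / num_threads, but never below min_chunk.
        size = max(min_chunk, -(-remaining // num_threads))
        size = min(size, remaining)
        chunks.append((pos, pos + size))
        pos += size
    return chunks


def new_guided_chunks(n, num_threads, min_chunk=1):
    """Static blocks for the first half of n iterations, guided for the rest."""
    half = n // 2
    base, extra = divmod(half, num_threads)
    chunks = []
    pos = 0
    for t in range(num_threads):
        # Near-equal contiguous blocks, one per thread.
        size = base + (1 if t < extra else 0)
        if size:
            chunks.append((pos, pos + size))
        pos += size
    return chunks + guided_chunks(half, n, num_threads, min_chunk)


print(new_guided_chunks(16, num_threads=2))
# [(0, 4), (4, 8), (8, 12), (12, 14), (14, 15), (15, 16)]
print(guided_chunks(0, 16, num_threads=2))
# [(0, 8), (8, 12), (12, 14), (14, 15), (15, 16)]
```

In this sketch plain guided scheduling hands the first eight iterations to a single thread, while new_guided splits them across both threads. For a decreasing loop those early iterations are the most expensive, which is presumably why the hybrid helps.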
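Abstract:
In recent years, with the rapid development of computer hardware, large-scale parallel cluster systems have been used increasingly in scientific research and related activities. As multicore CPU technology has matured, the capacity of multicore clusters for scientific computing has risen to unprecedented levels, and how to compute efficiently in parallel on the massive data of scientific computing, and how to assess the factors that affect algorithm performance, have become important research directions. The fast Fourier transform, recognized as one of the most important fundamental algorithms of the last century, is widely used in many fields, including large-scale scientific computing, digital signal processing, and graphics and image simulation. This work combines the Dawning 5000A, China's fastest supercomputer in 2008, with a fast Fourier transform algorithm on large-scale irregular domains, and studies in depth the algorithm's scalability in a very large-scale multicore parallel environment and the factors affecting its performance. The tests show that the algorithm performs well in existing very large-scale parallel environments: on the Dawning 5000A it reaches a speedup of 277 on 8192 cores. The other part of this work explores parallelizing the existing HFFT algorithm on GPGPU. GPUs have clear advantages over CPUs in processing power and memory bandwidth without a large penalty in cost or power consumption, which offers a new solution for data-parallel problems. Because graphics rendering is highly parallel, GPUs can raise processing power and memory bandwidth by adding parallel processing units and memory control units. In practice, Nvidia's CUDA is a parallel development environment for GPU computing, a new hardware and software architecture that allows GPUs to be used for complex computing problems in business, industry, and science. CUDA is a complete GPGPU solution that provides an interface for direct access to the hardware. Since GPUs are now widely studied in scientific research, this work explores a way to speed up the existing HFFT algorithm through GPU computing in order to exploit the GPU's data-parallel processing capability. The CUDA parallel algorithm was then tested in practice; the experimental results show that the GPU gives a 20% speedup for the parallel FFT part, while with I/O transfer excluded the program's speedup is 34.4 times.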
Abstract:
In this correspondence, we construct some new quadratic bent functions in polynomial forms by using the theory of quadratic forms over finite fields. The results improve some previous work. Moreover, we solve a problem left by Yu and Gong in 2006.
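For readers unfamiliar with the term, a Boolean function of n variables (n even) is bent when every Walsh coefficient has absolute value 2^(n/2). The sketch below checks this by brute force for the textbook quadratic function x1x2 + x3x4. It is not one of the constructions from the correspondence, which works with polynomial forms over finite fields, and the function names are hypothetical.

```python
from itertools import product


def walsh_spectrum(f, n):
    """Walsh coefficients of f: {0,1}^n -> {0,1}, one per linear mask a."""
    points = list(product((0, 1), repeat=n))
    spectrum = []
    for a in points:
        total = 0
        for x in points:
            # Inner product a.x over GF(2).
            dot = sum(ai & xi for ai, xi in zip(a, x)) & 1
            total += (-1) ** (f(x) ^ dot)
        spectrum.append(total)
    return spectrum


def is_bent(f, n):
    """True if every Walsh coefficient has absolute value 2^(n/2)."""
    if n % 2:
        return False
    target = 2 ** (n // 2)
    return all(abs(w) == target for w in walsh_spectrum(f, n))


def quadratic_example(x):
    """f(x) = x1*x2 + x3*x4 over GF(2), a standard quadratic bent function."""
    return (x[0] & x[1]) ^ (x[2] & x[3])


def linear_example(x):
    """f(x) = x1, which is linear and therefore not bent."""
    return x[0]


print(is_bent(quadratic_example, 4))  # True
print(is_bent(linear_example, 4))     # False
```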
Abstract:
Visual observation of the THF hydrate formation process in the presence of a 3A molecular sieve was made by microscopy at atmospheric pressure and at temperatures below zero. The results indicate that a 3A molecular sieve can induce the nucleation of THF hydrate and promote its growth. In the presence of a 3A molecular sieve, the growth rate of THF hydrate is between 0.01 and 0.05 μm/s. Compared with the system without a 3A molecular sieve, the growth rate increases by about 4 nm/s. After the THF hydrate grows into megacrysts, the crystals recombine and partially change under the same conditions.