832 resultados para bigdata, data stream processing, dsp, apache storm, cyber security
Resumo:
The hot deformation characteristics of IN 600 nickel alloy are studied using hot compression testing in the temperature range 850-1200-degrees-C and strain rate range 0.001-100 s-1. A processing map for hot working is developed on the basis of the data obtained, using the principles of dynamic materials modelling. The map exhibits a single domain with a peak efficiency of power dissipation of 48% occurring at 1200-degrees-C and 0.2 s-1, at which the material undergoes dynamic recrystallisation (DRX). These are the optimum conditions for hot working of IN 600. At strain rates higher than 1 s-1, the material exhibits flow localisation and its microstructure consists of localised bands of fine recrystallised grains. The presence of iron in the Ni-Cr alloy narrows the DRX domain owing to a higher temperature required for carbide dissolution, which is essential for the occurrence of DRX. The efficiency of DRX in Ni-Cr is, however, enhanced by iron addition.
Resumo:
Compressive stress-strain curves have been generated over a range of temperatures (900-1100-degrees-C and strain rates (0.001-100 s-1) for two starting structures consisting of lath alpha2 and equiaxed alpha2 in a Ti-24Al-11Nb alloy. The data from these tests have been analysed in terms of a dynamic model for processing. The results define domains of strain rate and temperature in which dynamic recrystallization of alpha2 occurs for both starting structures. The rate controlling process for dynamic recrystallization is suggested to be cross-slip in the alpha2 phase. A region of processing instability has also been defined within which shear bands form in the lath structure. Recrystallization of the beta phase is shown to occur for different combinations of strain rate and temperature from those in which the alpha2 phase recrystallizes dynamically
Resumo:
Laser processing of structure sensitive hypereutectic ductile iron, a cast alloy employed for dynamically loaded automative components, was experimentally investigated over a wide range of process parameters: from power (0.5-2.5 kW) and scan rate (7.5-25 mm s(-1)) leading to solid state transformation, all the way through to melting followed by rapid quenching. Superfine dendritic (at 10(5) degrees C s(-1)) or feathery (at 10(4) degrees C s(-1)) ledeburite of 0.2-0.25 mu m lamellar space, gamma-austenite and carbide in the laser melted and martensite in the transformed zone or heat-affected zone were observed, depending on the process parameters. Depth of geometric profiles of laser transformed or melt zone structures, parameters such as dendrile arm spacing, volume fraction of carbide and surface hardness bear a direct relationship with the energy intensity P/UDb2, (10-100 J mm(-3)). There is a minimum energy intensity threshold for solid state transformation hardening (0.2 J mm(-3)) and similarly for the initiation of superficial melting (9 J mm(-3)) and full melting (15 J mm(-3)) in the case of ductile iron. Simulation, modeling and thermal analysis of laser processing as a three-dimensional quasi-steady moving heat source problem by a finite difference method, considering temperature dependent energy absorptivity of the material to laser radiation, thermal and physical properties (kappa, rho, c(p)) and freezing under non-equilibrium conditions employing Scheil's equation to compute the proportion of the solid enabled determination of the thermal history of the laser treated zone. This includes assessment of the peak temperature attained at the surface, temperature gradients, the freezing time and rates as well as the geometric profile of the melted, transformed or heat-affected zone. Computed geometric profiles or depth are in close agreement with the experimental data, validating the numerical scheme.
Resumo:
The hot workability of an Al-Mg-Si alloy has been studied by conducting constant strain-rate compression tests. The temperature range and strain-rate regime selected for the present study were 300-550 degrees C and 0.001-1 s(-1), respectively. On the basis of true stress data, the strain-rate sensitivity values were calculated and used for establishing processing maps following the dynamic materials model. These maps delineate characteristic domains of different dissipative mechanisms. Two domains of dynamic recrystallization (DRX) have been identified which are associated with the peak efficiency of power dissipation (34%) and complete reconstitution of as-cast microstructure. As a result, optimum hot ductility is achieved in the DRX domains. The strain rates at which DRX domains occur are determined by the second-phase particles such as Mg2Si precipitates and intermetallic compounds. The alloy also exhibits microstructural instability in the form of localized plastic deformation in the temperature range 300-350 degrees C and at strain rate 1 s(-1).
Resumo:
Biomedical engineering solutions like surgical simulators need High Performance Computing (HPC) to achieve real-time performance. Graphics Processing Units (GPUs) offer HPC capabilities at low cost and low power consumption. In this work, it is demonstrated that a liver which is discretized by about 2500 finite element nodes, can be graphically simulated in realtime, by making use of a GPU. Present work takes into consideration the time needed for the data transfer from CPU to GPU and back from GPU to CPU. Although behaviour of liver is very complicated, present computer simulation assumes linear elastostatics. One needs to use the commercial software ANSYS to obtain the global stiffness matrix of the liver. Results show that GPUs are useful for the real-time graphical simulation of liver, which in turn is needed in simulators that are used for training surgeons in laparoscopic surgery. Although the computer simulation should involve rendering also, neither rendering, nor the time needed for rendering and displaying the liver on a screen, is considered in the present work. The present work is just a demonstration of a concept; the concept is not really implemented and validated. Future work is to develop software which can accomplish real-time and very realistic graphical simulation of liver, with rendered image of liver on the screen changing in real-time according to the position of the surgical tool tip approximated as the mouse cursor in 3D.
Resumo:
In this paper we propose a new method of data handling for web servers. We call this method Network Aware Buffering and Caching (NABC for short). NABC facilitates reduction of data copies in web server's data sending path, by doing three things: (1) Layout the data in main memory in a way that protocol processing can be done without data copies (2) Keep a unified cache of data in kernel and ensure safe access to it by various processes and kernel and (3) Pass only the necessary meta data between processes so that bulk data handling time spent during IPC can be reduced. We realize NABC by implementing a set of system calls and an user library. The end product of the implementation is a set of APIs specifically designed for use by the web servers. We port an in house web server called SWEET, to NABC APIs and evaluate performance using a range of workloads both simulated and real. The results show a very impressive gain of 12% to 21% in throughput for static file serving and 1.6 to 4 times gain in throughput for lightweight dynamic content serving for a server using NABC APIs over the one using UNIX APIs.
Resumo:
Processing maps have been developed for hot deformation of Mg-2Zn-1Mn alloy in as-cast condition and after homogenization with a view to evaluate the influence of homogenization. Hot compression data in the temperature range 300-500degreesC and strain rate range 0.001-100 s(-1) were used for generating the processing map. In the map for the as-cast alloy the domain of dynamic recrystallization occurring, at 450degreesC and 0.1 s(-1) has merged with another domain occurring at 500degreesC and 0.001 s(-1) representing grain boundary cracking. The latter domain is eliminated by homogenization and the dynamic recrystallization domain expanded with a higher peak efficiency occurring at 500 degreesC and 0.05 s(-1). The flow localization occurring at strain rates higher than 5 s(-1) is unaffected by homogenization.
Resumo:
In a statistical downscaling model, it is important to remove the bias of General Circulations Model (GCM) outputs resulting from various assumptions about the geophysical processes. One conventional method for correcting such bias is standardisation, which is used prior to statistical downscaling to reduce systematic bias in the mean and variances of GCM predictors relative to the observations or National Centre for Environmental Prediction/ National Centre for Atmospheric Research (NCEP/NCAR) reanalysis data. A major drawback of standardisation is that it may reduce the bias in the mean and variance of the predictor variable but it is much harder to accommodate the bias in large-scale patterns of atmospheric circulation in GCMs (e.g. shifts in the dominant storm track relative to observed data) or unrealistic inter-variable relationships. While predicting hydrologic scenarios, such uncorrected bias should be taken care of; otherwise it will propagate in the computations for subsequent years. A statistical method based on equi-probability transformation is applied in this study after downscaling, to remove the bias from the predicted hydrologic variable relative to the observed hydrologic variable for a baseline period. The model is applied in prediction of monsoon stream flow of Mahanadi River in India, from GCM generated large scale climatological data.
Resumo:
The paper presents an adaptive Fourier filtering technique and a relaying scheme based on a combination of a digital band-pass filter along with a three-sample algorithm, for applications in high-speed numerical distance protection. To enhance the performance of above-mentioned technique, a high-speed fault detector has been used. MATLAB based simulation studies show that the adaptive Fourier filtering technique provides fast tripping for near faults and security for farther faults. The digital relaying scheme based on a combination of digital band-pass filter along with three-sample data window algorithm also provides accurate and high-speed detection of faults. The paper also proposes a high performance 16-bit fixed point DSP (Texas Instruments TMS320LF2407A) processor based hardware scheme suitable for implementation of the above techniques. To evaluate the performance of the proposed relaying scheme under steady state and transient conditions, PC based menu driven relay test procedures are developed using National Instruments LabVIEW software. The test signals are generated in real time using LabVIEW compatible analog output modules. The results obtained from the simulation studies as well as hardware implementations are also presented.
Resumo:
The memory subsystem is a major contributor to the performance, power, and area of complex SoCs used in feature rich multimedia products. Hence, memory architecture of the embedded DSP is complex and usually custom designed with multiple banks of single-ported or dual ported on-chip scratch pad memory and multiple banks of off-chip memory. Building software for such large complex memories with many of the software components as individually optimized software IPs is a big challenge. In order to obtain good performance and a reduction in memory stalls, the data buffers of the application need to be placed carefully in different types of memory. In this paper we present a unified framework (MODLEX) that combines different data layout optimizations to address the complex DSP memory architectures. Our method models the data layout problem as multi-objective genetic algorithm (GA) with performance and power being the objectives and presents a set of solution points which is attractive from a platform design viewpoint. While most of the work in the literature assumes that performance and power are non-conflicting objectives, our work demonstrates that there is significant trade-off (up to 70%) that is possible between power and performance.
Resumo:
Abstract—A new breed of processors like the Cell Broadband Engine, the Imagine stream processor and the various GPU processors emphasize data-level parallelism (DLP) and threadlevel parallelism (TLP) as opposed to traditional instructionlevel parallelism (ILP). This allows them to achieve order-ofmagnitude improvements over conventional superscalar processors for many workloads. However, it is unclear as to how much parallelism of these types exists in current programs. Most earlier studies have largely concentrated on the amount of ILP in a program, without differentiating DLP or TLP. In this study, we investigate the extent of data-level parallelism available in programs in the MediaBench suite. By packing instructions in a SIMD fashion, we observe reductions of up to 91 % (84 % on average) in the number of dynamic instructions, indicating a very high degree of DLP in several applications. I.
Resumo:
We address the problem of recognition and retrieval of relatively weak industrial signal such as Partial Discharges (PD) buried in excessive noise. The major bottleneck being the recognition and suppression of stochastic pulsive interference (PI) which has similar time-frequency characteristics as PD pulse. Therefore conventional frequency based DSP techniques are not useful in retrieving PD pulses. We employ statistical signal modeling based on combination of long-memory process and probabilistic principal component analysis (PPCA). An parametric analysis of the signal is exercised for extracting the features of desired pules. We incorporate a wavelet based bootstrap method for obtaining the noise training vectors from observed data. The procedure adopted in this work is completely different from the research work reported in the literature, which is generally based on deserved signal frequency and noise frequency.
Resumo:
Instruction reuse is a microarchitectural technique that improves the execution time of a program by removing redundant computations at run-time. Although this is the job of an optimizing compiler, they do not succeed many a time due to limited knowledge of run-time data. In this paper we examine instruction reuse of integer ALU and load instructions in network processing applications. Specifically, this paper attempts to answer the following questions: (1) How much of instruction reuse is inherent in network processing applications?, (2) Can reuse be improved by reducing interference in the reuse buffer?, (3) What characteristics of network applications can be exploited to improve reuse?, and (4) What is the effect of reuse on resource contention and memory accesses? We propose an aggregation scheme that combines the high-level concept of network traffic i.e. "flows" with a low level microarchitectural feature of programs i.e. repetition of instructions and data along with an architecture that exploits temporal locality in incoming packet data to improve reuse. We find that for the benchmarks considered, 1% to 50% of instructions are reused while the speedup achieved varies between 1% and 24%. As a side effect, instruction reuse reduces memory traffic and can therefore be considered as a scheme for low power.
Resumo:
Today's SoCs are complex designs with multiple embedded processors, memory subsystems, and application specific peripherals. The memory architecture of embedded SoCs strongly influences the power and performance of the entire system. Further, the memory subsystem constitutes a major part (typically up to 70%) of the silicon area for the current day SoC. In this article, we address the on-chip memory architecture exploration for DSP processors which are organized as multiple memory banks, where banks can be single/dual ported with non-uniform bank sizes. In this paper we propose two different methods for physical memory architecture exploration and identify the strengths and applicability of these methods in a systematic way. Both methods address the memory architecture exploration for a given target application by considering the application's data access characteristics and generates a set of Pareto-optimal design points that are interesting from a power, performance and VLSI area perspective. To the best of our knowledge, this is the first comprehensive work on memory space exploration at physical memory level that integrates data layout and memory exploration to address the system objectives from both hardware design and application software development perspective. Further we propose an automatic framework that explores the design space identifying 100's of Pareto-optimal design points within a few hours of running on a standard desktop configuration.
Resumo:
Carbon nanotubes dispersed in polymer matrix have been aligned in the form of fibers and interconnects and cured electrically and by UV light. Conductivity and effective semiconductor tunneling against reverse to forward bias field have been designed to have differentiable current-voltage response of each of the fiber/channel. The current-voltage response is a function of the strain applied to the fibers along axial direction. Biaxial and shear strains are correlated by differentiating signals from the aligned fibers/channels. Using a small doping of magnetic nanoparticles in these composite fibers, magneto-resistance properties are realized which are strong enough to use the resulting magnetostriction as a state variable for signal processing and computing. Various basic analog signal processing tasks such as addition, convolution and filtering etc. can be performed. These preliminary study shows promising application of the concept in combined analog-digital computation in carbon nanotube based fibers. Various dynamic effects such as relaxation, electric field dependent nonlinearities and hysteresis on the output signals are studied using experimental data and analytical model.