166 resultados para execution traces
Resumo:
Security in a mobile communication environment is always a matter for concern, even after deploying many security techniques at device, network, and application levels. The end-to-end security for mobile applications can be made robust by developing dynamic schemes at application level which makes use of the existing security techniques varying in terms of space, time, and attacks complexities. In this paper we present a security techniques selection scheme for mobile transactions, called the Transactions-Based Security Scheme (TBSS). The TBSS uses intelligence to study, and analyzes the security implications of transactions under execution based on certain criterion such as user behaviors, transaction sensitivity levels, and credibility factors computed over the previous transactions by the users, network vulnerability, and device characteristics. The TBSS identifies a suitable level of security techniques from the repository, which consists of symmetric, and asymmetric types of security algorithms arranged in three complexity levels, covering various encryption/decryption techniques, digital signature schemes, andhashing techniques. From this identified level, one of the techniques is deployed randomly. The results shows that, there is a considerable reduction in security cost compared to static schemes, which employ pre-fixed security techniques to secure the transactions data.
Resumo:
Previous studies have shown that buffering packets in DRAM is a performance bottleneck. In order to understand the impediments in accessing the DRAM, we developed a detailed Petri net model of IP forwarding application on IXP2400 that models the different levels of the memory hierarchy. The cell based interface used to receive and transmit packets in a network processor leads to some small size DRAM accesses. Such narrow accesses to the DRAM expose the bank access latency, reducing the bandwidth that can be realized. With real traces up to 30% of the accesses are smaller than the cell size, resulting in 7.7% reduction in DRAM bandwidth. To overcome this problem, we propose buffering these small chunks of data in the on chip scratchpad memory. This scheme also exploits greater degree of parallelism between different levels of the memory hierarchy. Using real traces from the internet, we show that the transmit rate can be improved by an average of 21% over the base scheme without the use of additional hardware. Further, the impact of different traffic patterns on the network processor resources is studied. Under real traffic conditions, we show that the data bus which connects the off-chip packet buffer to the micro-engines, is the obstacle in achieving higher throughput.
Resumo:
A major concern of embedded system architects is the design for low power. We address one aspect of the problem in this paper, namely the effect of executable code compression. There are two benefits of code compression – firstly, a reduction in the memory footprint of embedded software, and secondly, potential reduction in memory bus traffic and power consumption. Since decompression has to be performed at run time it is achieved by hardware. We describe a tool called COMPASS which can evaluate a range of strategies for any given set of benchmarks and display compression ratios. Also, given an execution trace, it can compute the effect on bus toggles, and cache misses for a range of compression strategies. The tool is interactive and allows the user to vary a set of parameters, and observe their effect on performance. We describe an implementation of the tool and demonstrate its effectiveness. To the best of our knowledge this is the first tool proposed for such a purpose.
Resumo:
A polymorphic ASIC is a runtime reconfigurable hardware substrate comprising compute and communication elements. It is a ldquofuture proofrdquo custom hardware solution for multiple applications and their derivatives in a domain. Interoperability between application derivatives at runtime is achieved through hardware reconfiguration. In this paper we present the design of a single cycle Network on Chip (NoC) router that is responsible for effecting runtime reconfiguration of the hardware substrate. The router design is optimized to avoid FIFO buffers at the input port and loop back at output crossbar. It provides virtual channels to emulate a non-blocking network and supports a simple X-Y relative addressing scheme to limit the control overhead to 9 bits per packet. The 8times8 honeycomb NoC (RECONNECT) implemented in 130 nm UMC CMOS standard cell library operates at 500 MHz and has a bisection bandwidth of 28.5 GBps. The network is characterized for random, self-similar and application specific traffic patterns that model the execution of multimedia and DSP kernels with varying network loads and virtual channels. Our implementation with 4 virtual channels has an average network latency of 24 clock cycles and throughput of 62.5% of the network capacity for random traffic. For application specific traffic the latency is 6 clock cycles and throughput is 87% of the network capacity.
Resumo:
Modern database systems incorporate a query optimizer to identify the most efficient "query execution plan" for executing the declarative SQL queries submitted by users. A dynamic-programming-based approach is used to exhaustively enumerate the combinatorially large search space of plan alternatives and, using a cost model, to identify the optimal choice. While dynamic programming (DP) works very well for moderately complex queries with up to around a dozen base relations, it usually fails to scale beyond this stage due to its inherent exponential space and time complexity. Therefore, DP becomes practically infeasible for complex queries with a large number of base relations, such as those found in current decision-support and enterprise management applications. To address the above problem, a variety of approaches have been proposed in the literature. Some completely jettison the DP approach and resort to alternative techniques such as randomized algorithms, whereas others have retained DP by using heuristics to prune the search space to computationally manageable levels. In the latter class, a well-known strategy is "iterative dynamic programming" (IDP) wherein DP is employed bottom-up until it hits its feasibility limit, and then iteratively restarted with a significantly reduced subset of the execution plans currently under consideration. The experimental evaluation of IDP indicated that by appropriate choice of algorithmic parameters, it was possible to almost always obtain "good" (within a factor of twice of the optimal) plans, and in the few remaining cases, mostly "acceptable" (within an order of magnitude of the optimal) plans, and rarely, a "bad" plan. While IDP is certainly an innovative and powerful approach, we have found that there are a variety of common query frameworks wherein it can fail to consistently produce good plans, let alone the optimal choice. This is especially so when star or clique components are present, increasing the complexity of th- e join graphs. Worse, this shortcoming is exacerbated when the number of relations participating in the query is scaled upwards.
Resumo:
RECONNECT is a Network-on-Chip using a honeycomb topology. In this paper we focus on properties of general rules applicable to a variety of routing algorithms for the NoC which take into account the missing links of the honeycomb topology when compared to a mesh. We also extend the original proposal [5] and show a method to insert and extract data to and from the network. Access Routers at the boundary of the execution fabric establish connections to multiple periphery modules and create a torus to decrease the node distances. Our approach is scalable and ensures homogeneity among the compute elements in the NoC. We synthesized and evaluated the proposed enhancement in terms of power dissipation and area. Our results indicate that the impact of necessary alterations to the fabric is negligible and effects the data transfer between the fabric and the periphery only marginally.
Resumo:
The hydrolytic reactions of tetrasulphur tetranitride are studied in a homogeneous medium. Alkaline hydrolysis gives sulphite, thiosulphate, sulphate and sulphide whereas the products in acid hydrolysis are mainly sulphur dioxide, elemental sulphur and hydrogen sulphide, with traces of polythionates. Under optimum conditions, tetrasulphur tetranitride reacts with sulphite consuming 2 moles of sulphite per mole of sulphur nitride to give 2 moles of trithionate. The reaction of sulphur nitride with thiosulphuric acid gives pentathionate and tetrathionate.
Resumo:
A comparatively simple and rapid method for the identification, estimation and preparation of fatty acids has been developed, using reversed phase circular paper chromatography. The method is also suitable for the analysis of “Critical Pairs” of fatty acids and for the preparation of fatty acids. Further, when used at a higher temperature, the method is more sensitive in revealing the presence of even traces of higher fatty acids in the seeds of Adenanthera pavonina.
Resumo:
CMPs enable simultaneous execution of multiple applications on the same platforms that share cache resources. Diversity in the cache access patterns of these simultaneously executing applications can potentially trigger inter-application interference, leading to cache pollution. Whereas a large cache can ameliorate this problem, the issues of larger power consumption with increasing cache size, amplified at sub-100nm technologies, makes this solution prohibitive. In this paper in order to address the issues relating to power-aware performance of caches, we propose a caching structure that addresses the following: 1. Definition of application-specific cache partitions as an aggregation of caching units (molecules). The parameters of each molecule namely size, associativity and line size are chosen so that the power consumed by it and access time are optimal for the given technology. 2. Application-Specific resizing of cache partitions with variable and adaptive associativity per cache line, way size and variable line size. 3. A replacement policy that is transparent to the partition in terms of size, heterogeneity in associativity and line size. Through simulation studies we establish the superiority of molecular cache (caches built as aggregations of molecules) that offers a 29% power advantage over that of an equivalently performing traditional cache.
Resumo:
Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving clock speed, reducing energy consumption of the logic, and making the design simpler, it introduces extra overheads by way of inter-cluster communication. This communication happens over long global wires which leads to delay in execution and significantly high energy consumption.In this paper, we propose a new instruction scheduling algorithm that exploits scheduling slacks of instructions and communication slacks of data values together to achieve better energy-performance trade-offs for clustered architectures with heterogeneous interconnect. Our instruction scheduling algorithm achieves 35% and 40% reduction in communication energy, whereas the overall energy-delay product improves by 4.5% and 6.5% respectively for 2 cluster and 4 cluster machines with marginal increase (1.6% and 1.1%) in execution time. Our test bed uses the Trimaran compiler infrastructure.
Resumo:
One of the key problems in the design of any incompletely connected multiprocessor system is to appropriately assign the set of tasks in a program to the Processing Elements (PEs) in the system. The task assignment problem has proven difficult both in theory and in practice. This paper presents a simple and efficient heuristic algorithm for assigning program tasks with precedence and communication constraints to the PEs in a Message-based Multiple-bus Multiprocessor System, M3, so that the total execution time for the program is minimized. The algorithm uses a cost function: “Minimum Distance and Parallel Transfer” to minimize the completion time. The effectiveness of the algorithm has been demonstrated by comparing the results with (i) the lower bound on the execution time of a program (task) graph and (ii) a random assignment.
Resumo:
An endo-xylanase (1,4-β-d-xylanxylanohydrolase EC 3.2.1.8) was isolated from the culture filtrate of Paecilomyces varioti Bainier. The enzyme was purified 3.2 fold with a 60% yield by gel filtration and ion exchange chromatography. The purified enzyme had a molecular weight of 25,000 with a sedimentation coefficient of 2.2 S. The isoelectric point of the enzyme was 3.9. The enzyme was obtained in crystalline form. The optimum pH range was 5.5–7.0 and the temperature, 65°C. The Michaelis constant was 2.5 mg larchwood xylan/ml. The enzyme was found to degrade xylan by an endo mechanism producing arabinose, xylobiose, xylo- and arabinosylxylo-oligosaccharides, during the initial stages of hydrolysis. On prolonged incubation, xylotriose, arabinosylxylotriose and xylobiose were the major products with traces of xylotetraose, xylose and arabinose.
Resumo:
Short-time analytical solutions of solid and liquid temperatures and freezing front have been obtained for the outward radially symmetric spherical solidification of a superheated melt. Although results are presented here only for time dependent boundary flux, the method of solution can be used for other kinds of boundary conditions also. Later, the analytical solution has been compared with the numerical solution obtained with the help of a finite difference numerical scheme in which the grid points change with the freezing front position. An efficient method of execution of the numerical scheme has been discussed in details. Graphs have been drawn for the total solidification times and temperature distributions in the solid.
Resumo:
A new language concept for high-level distributed programming is proposed. Programs are organised as a collection of concurrently executing processes. Some of these processes, referred to as liaison processes, have a monitor-like structure and contain ports which may be invoked by other processes for the purposes of synchronisation and communication. Synchronisation is achieved by conditional activation of ports and also through port control constructs which may directly specify the execution ordering of ports. These constructs implement a path-expression-like mechanism for synchronisation and are also equipped with options to provide conditional, non-deterministic and priority ordering of ports. The usefulness and expressive power of the proposed concepts are illustrated through solutions of several representative programming problems. Some implementation issues are also considered.
Resumo:
Massively parallel SIMD computing is applied to obtain an order of magnitude improvement in the executional speed of an important algorithm in VLSI design automation. The physical design of a VLSI circuit involves logic module placement as a subtask. The paper is concerned with accelerating the well known Min-cut placement technique for logic cell placement. The inherent parallelism of the Min-cut algorithm is identified, and it is shown that a parallel machine based on the efficient execution of the placement procedure.