977 resultados para Scalable Nanofabrication


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Numerical Linear Algebra (NLA) kernels are at the heart of all computational problems. These kernels require hardware acceleration for increased throughput. NLA Solvers for dense and sparse matrices differ in the way the matrices are stored and operated upon although they exhibit similar computational properties. While ASIC solutions for NLA Solvers can deliver high performance, they are not scalable, and hence are not commercially viable. In this paper, we show how NLA kernels can be accelerated on REDEFINE, a scalable runtime reconfigurable hardware platform. Compared to a software implementation, Direct Solver (Modified Faddeev's algorithm) on REDEFINE shows a 29X improvement on an average and Iterative Solver (Conjugate Gradient algorithm) shows a 15-20% improvement. We further show that solution on REDEFINE is scalable over larger problem sizes without any notable degradation in performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Real-Time services are traditionally supported on circuit switched network. However, there is a need to port these services on packet switched network. Architecture for audio conferencing application over the Internet in the light of ITU-T H.323 recommendations is considered. In a conference, considering packets only from a set of selected clients can reduce speech quality degradation because mixing packets from all clients can lead to lack of speech clarity. A distributed algorithm and architecture for selecting clients for mixing is suggested here based on a new quantifier of the voice activity called “Loudness Number” (LN). The proposed system distributes the computation load and reduces the load on client terminals. The highlights of this architecture are scalability, bandwidth saving and speech quality enhancement. Client selection for playing out tries to mimic a physical conference where the most vocal participants attract more attention. The contributions of the paper are expected to aid H.323 recommendations implementations for Multipoint Processors (MP). A working prototype based on the proposed architecture is already functional.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With ever increasing network speed, scalable and reliable detection of network port scans has become a major challenge. In this paper, we present a scalable and flexible architecture and a novel algorithm, to detect and block port scans in real time. The proposed architecture detects fast scanners as well as stealth scanners having large inter-probe periods. FPGA implementation of the proposed system gives an average throughput of 2 Gbps with a system clock frequency of 100 MHz on Xilinx Virtex-II Pro FPGA. Experimental results on real network trace show the effectiveness of the proposed system in detecting and blocking network scans with very low false positives and false negatives.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often not observed from query words alone. Accurate identification of Intent from query words remains a challenging problem primarily because it is extremely difficult to discover these dimensions. The problem is often significantly compounded due to lack of representative training sample. We present a generic, extensible framework for learning the multi-dimensional representation of user intent from the query words. The approach models the latent relationships between facets using tree structured distribution which leads to an efficient and convergent algorithm, FastQ, for identifying the multi-faceted intent of users based on just the query words. We also incorporated WordNet to extend the system capabilities to queries which contain words that do not appear in the training data. Empirical results show that FastQ yields accurate identification of intent when compared to a gold standard.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We propose a power scalable digital base band for a low-IF receiver for IEEE 802.15.4-2006. The digital section's sampling frequency and bit width are used as knobs to reduce the power under favorable signal and interference scenarios, thus recovering the design margins introduced to handle worst case conditions. We propose tuning of these knobs based on measurements of Signal and the interference levels. We show that in a 0.13u CMOS technology, for an adaptive digital base band section of the receiver designed to meet the 802.15.4 standard specification, power saving can be up to nearly 85% (0.49mW against 3.3mW) in favorable interference and signal conditions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Structural Support Vector Machines (SSVMs) have recently gained wide prominence in classifying structured and complex objects like parse-trees, image segments and Part-of-Speech (POS) tags. Typical learning algorithms used in training SSVMs result in model parameters which are vectors residing in a large-dimensional feature space. Such a high-dimensional model parameter vector contains many non-zero components which often lead to slow prediction and storage issues. Hence there is a need for sparse parameter vectors which contain a very small number of non-zero components. L1-regularizer and elastic net regularizer have been traditionally used to get sparse model parameters. Though L1-regularized structural SVMs have been studied in the past, the use of elastic net regularizer for structural SVMs has not been explored yet. In this work, we formulate the elastic net SSVM and propose a sequential alternating proximal algorithm to solve the dual formulation. We compare the proposed method with existing methods for L1-regularized Structural SVMs. Experiments on large-scale benchmark datasets show that the proposed dual elastic net SSVM trained using the sequential alternating proximal algorithm scales well and results in highly sparse model parameters while achieving a comparable generalization performance. Hence the proposed sequential alternating proximal algorithm is a competitive method to achieve sparse model parameters and a comparable generalization performance when elastic net regularized Structural SVMs are used on very large datasets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Development towards the combination of miniaturization and improved functionality of RFIC has been stalled due to the lack of high-performance integrated inductors. To meet this challenge, integration of magnetic material with high permeability as well as low conductivity is a must. Ferrite films are excellent candidates for RF devices due to their low cost, high resistivity, and low eddy current losses. Unlike its bulk counterpart, nanocrystalline zinc ferrite, because of partial inversion in the spinel structure, exhibits novel magnetic properties suitable for RF applications. However, most scalable ferrite film deposition processes require either high temperature or expensive equipment or both. We report a novel low temperature (< 200 degrees C) solution-based deposition process for obtaining high quality, polycrystalline zinc ferrite thin films (ZFTF) on Si (100) and on CMOS-foundry-fabricated spiral inductor structures, rapidly, using safe solvents and precursors. An enhancement of up to 20% at 5 GHz in the inductance of a fabricated device was achieved due to the deposited ZFTF. Substantial inductance enhancement requires sufficiently thick films and our reported process is capable of depositing smooth, uniform films as thick as similar to 20 mu m just by altering the solution composition. The method is capable of depositing film conformally on a surface with complex geometry. As it requires neither a vacuum system nor any post-deposition processing, the method reported here has a low thermal budget, making it compatible with modern CMOS process flow.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In social choice theory, preference aggregation refers to computing an aggregate preference over a set of alternatives given individual preferences of all the agents. In real-world scenarios, it may not be feasible to gather preferences from all the agents. Moreover, determining the aggregate preference is computationally intensive. In this paper, we show that the aggregate preference of the agents in a social network can be computed efficiently and with sufficient accuracy using preferences elicited from a small subset of critical nodes in the network. Our methodology uses a model developed based on real-world data obtained using a survey on human subjects, and exploits network structure and homophily of relationships. Our approach guarantees good performance for aggregation rules that satisfy a property which we call expected weak insensitivity. We demonstrate empirically that many practically relevant aggregation rules satisfy this property. We also show that two natural objective functions in this context satisfy certain properties, which makes our methodology attractive for scalable preference aggregation over large scale social networks. We conclude that our approach is superior to random polling while aggregating preferences related to individualistic metrics, whereas random polling is acceptable in the case of social metrics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The ability to perform strong updates is the main contributor to the precision of flow-sensitive pointer analysis algorithms. Traditional flow-sensitive pointer analyses cannot strongly update pointers residing in the heap. This is a severe restriction for Java programs. In this paper, we propose a new flow-sensitive pointer analysis algorithm for Java that can perform strong updates on heap-based pointers effectively. Instead of points-to graphs, we represent our points-to information as maps from access paths to sets of abstract objects. We have implemented our analysis and run it on several large Java benchmarks. The results show considerable improvement in precision over the points-to graph based flow-insensitive and flow-sensitive analyses, with reasonable running time.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A power scalable receiver architecture is presented for low data rate Wireless Sensor Network (WSN) applications in 130nm RF-CMOS technology. Power scalable receiver is motivated by the ability to leverage lower run-time performance requirement to save power. The proposed receiver is able to switch power settings based on available signal and interference levels while maintaining requisite BER. The Low-IF receiver consists of Variable Noise and Linearity LNA, IQ Mixers, VGA, Variable Order Complex Bandpass Filter and Variable Gain and Bandwidth Amplifier (VGBWA) capable of driving variable sampling rate ADC. Various blocks have independent power scaling controls depending on their noise, gain and interference rejection (IR) requirements. The receiver is designed for constant envelope QPSK-type modulation with 2.4GHz RF input, 3MHz IF and 2MHz bandwidth. The chip operates at 1V Vdd with current scalable from 4.5mA to 1.3mA and chip area of 0.65mm2.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Structural Support Vector Machines (SSVMs) and Conditional Random Fields (CRFs) are popular discriminative methods used for classifying structured and complex objects like parse trees, image segments and part-of-speech tags. The datasets involved are very large dimensional, and the models designed using typical training algorithms for SSVMs and CRFs are non-sparse. This non-sparse nature of models results in slow inference. Thus, there is a need to devise new algorithms for sparse SSVM and CRF classifier design. Use of elastic net and L1-regularizer has already been explored for solving primal CRF and SSVM problems, respectively, to design sparse classifiers. In this work, we focus on dual elastic net regularized SSVM and CRF. By exploiting the weakly coupled structure of these convex programming problems, we propose a new sequential alternating proximal (SAP) algorithm to solve these dual problems. This algorithm works by sequentially visiting each training set example and solving a simple subproblem restricted to a small subset of variables associated with that example. Numerical experiments on various benchmark sequence labeling datasets demonstrate that the proposed algorithm scales well. Further, the classifiers designed are sparser than those designed by solving the respective primal problems and demonstrate comparable generalization performance. Thus, the proposed SAP algorithm is a useful alternative for sparse SSVM and CRF classifier design.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An efficient and scalable total synthesis of the architecturally challenging sesquiterpenoid (+/-)-penifulvin A has been accomplished via a 12-step sequence with an overall yield of 16%. For the construction of this structurally complex tetracyclic molecule, the key steps used included 1,4-conjugate addition, a Pd(0) catalyzed cross-coupling reaction between an enol phosphate and trimethyl aluminum, Claisen rearrangement using the Johnson orthoester protocol, Ti(III)-mediated reductive epoxide opening-cyclization, Lewis acid catalyzed epoxy-aldehyde rearrangement, and finally a substrate controlled oxidative cascade lactonization process.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It is essential to accurately estimate the working set size (WSS) of an application for various optimizations such as to partition cache among virtual machines or reduce leakage power dissipated in an over-allocated cache by switching it OFF. However, the state-of-the-art heuristics such as average memory access latency (AMAL) or cache miss ratio (CMR) are poorly correlated to the WSS of an application due to 1) over-sized caches and 2) their dispersed nature. Past studies focus on estimating WSS of an application executing on a uniprocessor platform. Estimating the same for a chip multiprocessor (CMP) with a large dispersed cache is challenging due to the presence of concurrently executing threads/processes. Hence, we propose a scalable, highly accurate method to estimate WSS of an application. We call this method ``tagged WSS (TWSS)'' estimation method. We demonstrate the use of TWSS to switch-OFF the over-allocated cache ways in Static and Dynamic NonUniform Cache Architectures (SNUCA, DNUCA) on a tiled CMP. In our implementation of adaptable way SNUCA and DNUCA caches, decision of altering associativity is taken by each L2 controller. Hence, this approach scales better with the number of cores present on a CMP. It gives overall (geometric mean) 26% and 19% higher energy-delay product savings compared to AMAL and CMR heuristics on SNUCA, respectively.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Here, we demonstrate an uninterrupted galvanic replacement reaction (GRR) for the synthesis of metallic (Ag, Cu and Sn) and bimetallic (Cu M, M=Ag, Au, Pt and Pd) sponges/dendrites by sacrificing the low reduction potential metals (Mg in our case) in acidic medium. The acidic medium prevents the oxide formation on Mg surface and facilitates the uninterrupted reaction. The morphology of dendritic/spongy structures is controlled by the volume of acid used for this reaction. The growth mechanism of the spongy/dendritic microstructures is explained by diffusion-limited aggregate model (DLA), which is also largely affected by the volume of acid. The significance of this method is that the yield can be easily predicted, which is a major challenge for the commercialization of the products. Furthermore, the synthesis is complete in 1-2 minutes at room temperature. We show that the sponges/dendrites efficiently act as catalysts to reduce 4-nitrophenol (4-NP) to 4-aminophenol (4-AP) using NaBH4-a widely studied conversion process.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We describe our novel LED communication infrastructure and demonstrate its scalability across platforms. Our system achieves 50 kilo bits per second on very simple SoCs and scales to megabits bits per second rates on dual processor based mobile phone platforms.