4 results for Extended-Range

at Massachusetts Institute of Technology


Relevance:

20.00%

Publisher:

Abstract:

Scheduling tasks to use the available processor resources efficiently is crucial to minimizing the runtime of applications on shared-memory parallel processors. One factor that contributes to poor processor utilization is the idle time caused by long-latency operations, such as remote memory references or processor synchronization operations. One way of tolerating this latency is to use a processor with multiple hardware contexts that can rapidly switch to executing another thread of computation whenever a long-latency operation occurs, thus increasing processor utilization by overlapping computation with communication. Although multiple contexts are effective for tolerating latency, this effectiveness can be limited by memory and network bandwidth, by cache interference effects among the multiple contexts, and by critical tasks sharing processor resources with less critical tasks. This thesis presents techniques that increase the effectiveness of multiple contexts by intelligently scheduling threads to make more efficient use of processor pipeline, bandwidth, and cache resources. This thesis proposes thread prioritization as a fundamental mechanism for directing the thread schedule on a multiple-context processor. A priority is assigned to each thread either statically or dynamically and is used by the thread scheduler to decide which threads to load in the contexts and which context to switch to on a context switch. We develop a multiple-context model that integrates both cache and network effects, and shows how thread prioritization can both maintain high processor utilization and limit increases in critical-path runtime caused by multithreading. The model also shows that, in order to be effective in bandwidth-limited applications, thread prioritization must be extended to prioritize memory requests.
We show how simple hardware can prioritize the running of threads in the multiple contexts, and the issuing of requests to both the local memory and the network. Simulation experiments show how thread prioritization is used in a variety of applications. Thread prioritization can improve the performance of synchronization primitives by minimizing the number of processor cycles wasted in spinning and devoting more cycles to critical threads. Thread prioritization can be used in combination with other techniques to improve cache performance and minimize cache interference between different working sets in the cache. For applications that are critical-path-limited, thread prioritization can improve performance by allowing processor resources to be devoted preferentially to critical threads. These experimental results show that thread prioritization is a mechanism that can be used to implement a wide range of scheduling policies.
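The two scheduling decisions named in the abstract (which threads to load into the hardware contexts, and which context to switch to) can be illustrated with a minimal software sketch. This is not the thesis's hardware mechanism: the `Thread` class, the function names, and the convention that a larger number means a more critical thread are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thread:
    name: str
    priority: int       # assumed convention: larger value = more critical
    ready: bool = True  # False while waiting on a long-latency operation


def load_contexts(threads: List[Thread], num_contexts: int) -> List[Thread]:
    """Choose the highest-priority threads to occupy the hardware contexts."""
    ranked = sorted(threads, key=lambda t: t.priority, reverse=True)
    return ranked[:num_contexts]


def next_context(contexts: List[Thread]) -> Optional[Thread]:
    """On a context switch, pick the ready loaded thread with the highest priority."""
    ready = [t for t in contexts if t.ready]
    if not ready:
        return None  # every loaded thread is stalled; the processor idles
    return max(ready, key=lambda t: t.priority)


# Hypothetical workload: four threads competing for two contexts.
threads = [
    Thread("spin", 1),
    Thread("critical", 9),
    Thread("worker", 5),
    Thread("background", 0),
]
contexts = load_contexts(threads, num_contexts=2)
running = next_context(contexts)
```

In this sketch, when the running thread stalls on a remote reference, setting its `ready` flag to `False` and calling `next_context` again hands the pipeline to the next most critical loaded thread.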

Relevance:

20.00%

Publisher:

Abstract:

Integration of inputs by cortical neurons provides the basis for the complex information processing performed in the cerebral cortex. Here, we propose a new analytic framework for understanding integration within cortical neuronal receptive fields. Based on the synaptic organization of cortex, we argue that neuronal integration is a systems-level process better studied in terms of local cortical circuitry than at the level of single neurons, and we present a method for constructing self-contained modules which capture (nonlinear) local circuit interactions. In this framework, receptive field elements naturally have dual (rather than the traditional unitary) influence, since they drive both excitatory and inhibitory cortical neurons. This vector-based analysis, in contrast to scalar approaches, greatly simplifies integration by permitting linear summation of inputs from both "classical" and "extraclassical" receptive field regions. We illustrate this by explaining two complex visual cortical phenomena, which are incompatible with scalar notions of neuronal integration.

Relevance:

20.00%

Publisher:

Abstract:

It is widely known that a significant fraction of bits are useless or even unused during program execution. Bit-width analysis aims to find the minimum number of bits needed for each variable in a program, preserving correct execution while saving resources. In this paper, we propose a static analysis method for bit-widths in general applications, which approximates conservatively at compile time and is independent of runtime conditions. While most related work focuses on integer applications, our method is also tailored and applicable to floating-point variables, and could be extended to transform floating-point numbers into fixed-point numbers together with precision analysis. We use more precise representations for the data value ranges of both scalar and array variables. Element-level analysis is carried out for arrays. We also suggest an alternative to the standard fixed-point iterations in bi-directional range analysis. These techniques are implemented on the Trimaran compiler infrastructure and tested on a set of benchmarks to show the results.
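As a rough illustration of how a value range translates into a bit-width, the sketch below propagates integer intervals forward through addition and multiplication and then computes the smallest width that holds the resulting range. It is a generic interval-arithmetic example, not the paper's method: the `Range` class, `bits_needed`, and the sample variables are hypothetical, and it covers neither floating-point variables, arrays, nor the bi-directional analysis mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Conservative integer value range [lo, hi] for a variable."""
    lo: int
    hi: int

    def __add__(self, other: "Range") -> "Range":
        return Range(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Range") -> "Range":
        # The extremes of a product lie among the products of the endpoints.
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Range(min(products), max(products))


def bits_needed(r: Range) -> int:
    """Minimum bits to hold every value in r (two's complement if r.lo < 0)."""
    if r.lo >= 0:
        return max(1, r.hi.bit_length())  # unsigned; at least one bit
    # Signed: need n with -2**(n-1) <= lo and hi <= 2**(n-1) - 1.
    magnitude_bits = max(max(r.hi, 0).bit_length(), (-r.lo - 1).bit_length())
    return magnitude_bits + 1  # plus the sign bit


# Hypothetical program fragment: z = x * y + x
x = Range(0, 100)
y = Range(0, 15)
z = x * y + x
```

Here `z` is bounded by `[0, 1600]`, so 11 bits suffice instead of a full machine word, which is the kind of saving bit-width analysis looks for.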

Relevance:

20.00%

Publisher:

Abstract:

We have discovered that the current protocols for assembling Au nanoparticles based on DNA hybridization do not work well with small metal nanoparticles (e.g., 5 nm Au, 3.6 nm Pt and 3.2 nm Ru particles). Further investigations revealed the presence of a strong interaction between the oligonucleotide backbone and the surface of the small metal nanoparticles. The oligonucleotides in this case are recumbent on the particle surface and are therefore not optimally oriented for hybridization. The nonspecific adsorption of oligonucleotides on small metal nanoparticles must be overcome before DNA hybridization can be accepted as a general assembly method. Two methods have been suggested as possible solutions to this problem. One is based on the use of stabilizer molecules which compete with the oligonucleotides for adsorption on the metal nanoparticle surface. Unfortunately, the reported success of this approach with small Au nanoparticles (using K₂BSPP) and Au films (using 6-mercapto-1-hexanol) could not be extended to the assembly of Pt and Ru nanoparticles by DNA hybridization. The second approach is simply to use larger metal particles. Indeed, most reports on the DNA-hybridization-induced assembly of Au nanoparticles have made use of relatively large particles (>10 nm), hinting at a weaker nonspecific interaction between the oligonucleotides and large Au nanoparticles. However, most current methods of nanoparticle synthesis are optimized to produce metal nanoparticles only within a narrow size range. We find that core-shell nanoparticles formed by the seeded growth method may be used to artificially enlarge the metal particles and so reduce the nonspecific binding of oligonucleotides. We demonstrate herein a core-shell assisted growth method to assemble Pt and Ru nanoparticles by DNA hybridization.
This method involves first synthesizing approximately 16 nm core-shell Ag-Pt and 21 nm core-shell Au-Ru nanoparticles from 9.6 nm Ag seeds and 17.2 nm Au seeds, respectively, by the seed-mediated growth method. The core-shell nanoparticles were then functionalized with complementary thiolated oligonucleotides, followed by aging in 0.2 M PBS buffer for 6 hours. The DNA-hybridization-induced bimetallic assembly of Pt and Ru nanoparticles could then be carried out in 0.3 M PBS buffer for 10 hours.