999 resultados para Scalable documents
Resumo:
In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build VSM for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.
Resumo:
We address the problem of designing distributed algorithms for large scale networks that are robust to Byzantine faults. We consider a message passing, full information model: the adversary is malicious, controls a constant fraction of processors, and can view all messages in a round before sending out its own messages for that round. Furthermore, each bad processor may send an unlimited number of messages. The only constraint on the adversary is that it must choose its corrupt processors at the start, without knowledge of the processors’ private random bits.
A good quorum is a set of O(logn) processors, which contains a majority of good processors. In this paper, we give a synchronous algorithm which uses polylogarithmic time and Õ(vn) bits of communication per processor to bring all processors to agreement on a collection of n good quorums, solving Byzantine agreement as well. The collection is balanced in that no processor is in more than O(logn) quorums. This yields the first solution to Byzantine agreement which is both scalable and load-balanced in the full information model.
The technique which involves going from situation where slightly more than 1/2 fraction of processors are good and and agree on a short string with a constant fraction of random bits to a situation where all good processors agree on n good quorums can be done in a fully asynchronous model as well, providing an approach for extending the Byzantine agreement result to this model.
Resumo:
This paper presents a scalable, statistical ‘black-box’ model for predicting the performance of parallel programs on multi-core non-uniform memory access (NUMA) systems. We derive a model with low overhead, by reducing data collection and model training time. The model can accurately predict the behaviour of parallel applications in response to changes in their concurrency, thread layout on NUMA nodes, and core voltage and frequency. We present a framework that applies the model to achieve significant energy and energy-delay-square (ED2) savings (9% and 25%, respectively) along with performance improvement (10% mean) on an actual 16-core NUMA system running realistic application workloads. Our prediction model proves substantially more accurate than previous efforts.