18 results for National Science Council (U.S.)
in Boston University Digital Common
Abstract:
This paper introduces BoostMap, a method that can significantly reduce retrieval time in image and video database systems that employ computationally expensive distance measures, metric or non-metric. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. Embedding construction is formulated as a machine learning task, where AdaBoost is used to combine many simple, 1D embeddings into a multidimensional embedding that preserves a significant amount of the proximity structure in the original space. Performance is evaluated in a hand pose estimation system, and a dynamic gesture recognition system, where the proposed method is used to retrieve approximate nearest neighbors under expensive image and video similarity measures. In both systems, BoostMap significantly increases efficiency, with minimal losses in accuracy. Moreover, the experiments indicate that BoostMap compares favorably with existing embedding methods that have been employed in computer vision and database applications, i.e., FastMap and Bourgain embeddings.
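The embedding idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each 1D embedding is the distance from an object to a chosen reference object, and the weights (which BoostMap would learn with AdaBoost) are supplied by hand. The names `embed`, `weighted_manhattan`, and `approximate_neighbors` are hypothetical.

```python
from typing import Callable, List, Sequence


def embed(x, references: Sequence, dist: Callable) -> List[float]:
    """Map x to a vector with one coordinate per reference object.

    Assumption: each 1D embedding is F_r(x) = dist(x, r).
    """
    return [dist(x, r) for r in references]


def weighted_manhattan(u: Sequence[float], v: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Weighted L1 distance between two embedded vectors."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, u, v))


def approximate_neighbors(query, database: Sequence, references: Sequence,
                          weights: Sequence[float], dist: Callable) -> List[int]:
    """Return database indices ordered by distance in the embedded space.

    In practice the database vectors would be computed once, offline, so a
    query costs only len(references) evaluations of the expensive distance.
    """
    q_vec = embed(query, references, dist)
    scored = [
        (weighted_manhattan(q_vec, embed(x, references, dist), weights), i)
        for i, x in enumerate(database)
    ]
    return [i for _, i in sorted(scored)]
```

With a toy stand-in for the expensive distance, such as `lambda a, b: abs(a - b)` on numbers, the first index returned is the approximate nearest neighbor of the query.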
Abstract:
The CIL compiler for core Standard ML compiles whole programs using a novel typed intermediate language (TIL) with intersection and union types and flow labels on both terms and types. The CIL term representation duplicates portions of the program where intersection types are introduced and union types are eliminated. This duplication makes it easier to represent type information and to introduce customized data representations. However, duplication incurs compile-time space costs that are potentially much greater than are incurred in TILs employing type-level abstraction or quantification. In this paper, we present empirical data on the compile-time space costs of using CIL as an intermediate language. The data shows that these costs can be made tractable by using sufficiently fine-grained flow analyses together with standard hash-consing techniques. The data also suggests that non-duplicating formulations of intersection (and union) types would not achieve significantly better space complexity.
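For readers unfamiliar with hash-consing, the sketch below shows the general technique: structurally identical nodes are created only once and shared, so duplicated subterms cost no extra space. It is a generic illustration, not code from the CIL compiler; `Node`, `mk`, and the table layout are hypothetical.

```python
class Node:
    """An immutable term or type node; children are themselves hash-consed."""
    __slots__ = ("op", "children")

    def __init__(self, op, children):
        self.op = op
        self.children = children


# Maps (operator, identities of children) to the unique shared node.
_table = {}


def mk(op, *children):
    """Return the unique node for (op, children), creating it on first use.

    Because children are already unique, comparing them by identity is enough
    to decide structural equality. The table keeps every node alive, so the
    identities used in the key stay valid.
    """
    key = (op, tuple(id(c) for c in children))
    node = _table.get(key)
    if node is None:
        node = Node(op, children)
        _table[key] = node
    return node
```

Building the same function type twice, for example `mk("->", mk("int"), mk("int"))`, yields the same object both times, which is why duplicated program fragments can share storage.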
Abstract:
A probabilistic, nonlinear supervised learning model is proposed: the Specialized Mappings Architecture (SMA). The SMA employs a set of several forward mapping functions that are estimated automatically from training data. Each specialized function maps certain domains of the input space (e.g., image features) onto the output space (e.g., articulated body parameters). The SMA can model ambiguous, one-to-many mappings that may yield multiple valid output hypotheses. Once learned, the mapping functions generate a set of output hypotheses for a given input via a statistical inference procedure. The SMA inference procedure incorporates an inverse mapping or feedback function in evaluating the likelihood of each of the hypotheses. Possible feedback functions include computer graphics rendering routines that can generate images for given hypotheses. The SMA employs a variant of the Expectation-Maximization algorithm for simultaneous learning of the specialized domains along with the mapping functions, and approximate strategies for inference. The framework is demonstrated in a computer vision system that can estimate the articulated pose parameters of a human’s body or hands, given silhouettes from a single image. The accuracy and stability of the SMA are also tested using synthetic images of human bodies and hands, where ground truth is known.
Abstract:
Formal correctness of complex multi-party network protocols can be difficult to verify. While models of specific fixed compositions of agents can be checked against design constraints, protocols which lend themselves to arbitrarily many compositions of agents, such as the chaining of proxies or the peering of routers, are more difficult to verify because they represent potentially infinite state spaces and may exhibit emergent behaviors which may not materialize under particular fixed compositions. We address this challenge by developing an algebraic approach that enables us to reduce arbitrary compositions of network agents into a behaviorally-equivalent (with respect to some correctness property) compact, canonical representation, which is amenable to mechanical verification. Our approach consists of an algebra and a set of property-preserving rewrite rules for the Canonical Homomorphic Abstraction of Infinite Network protocol compositions (CHAIN). Using CHAIN, an expression over our algebra (i.e., a set of configurations of network protocol agents) can be reduced to another behaviorally-equivalent expression (i.e., a smaller set of configurations). Repeated application of such rewrite rules produces a canonical expression which can be checked mechanically. We demonstrate our approach by characterizing deadlock-prone configurations of HTTP agents, as well as establishing useful properties of an overlay protocol for scheduling MPEG frames, and of a protocol for Web intra-cache consistency.
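The reduction step can be pictured as ordinary term rewriting applied until no rule matches. The sketch below is only an illustration of that mechanism: the agent names and the single rule (two adjacent proxies behave like one) are hypothetical and are not taken from CHAIN, where each rule must first be proven to preserve the property of interest. It also assumes every rule shortens the chain, so the loop terminates.

```python
from typing import List, Optional, Tuple

Chain = Tuple[str, ...]
Rule = Tuple[Chain, Chain]  # (left-hand side, shorter right-hand side)


def rewrite_once(chain: Chain, rules: List[Rule]) -> Optional[Chain]:
    """Apply the first matching rule at its leftmost position, or return None."""
    for lhs, rhs in rules:
        n = len(lhs)
        for i in range(len(chain) - n + 1):
            if chain[i:i + n] == lhs:
                return chain[:i] + rhs + chain[i + n:]
    return None


def canonicalize(chain: Chain, rules: List[Rule]) -> Chain:
    """Rewrite until no rule applies; the result is the canonical form."""
    while True:
        reduced = rewrite_once(chain, rules)
        if reduced is None:
            return chain
        chain = reduced


# Hypothetical rule for illustration: a run of proxies collapses to one proxy.
RULES: List[Rule] = [(("proxy", "proxy"), ("proxy",))]
```

Under this rule, any chain of a client, one or more proxies, and a server reduces to the same three-agent chain, so only that finite case needs to be model checked.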
Abstract:
BoostMap is a recently proposed method for efficient approximate nearest neighbor retrieval in arbitrary non-Euclidean spaces with computationally expensive and possibly non-metric distance measures. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. The key idea is formulating embedding construction as a machine learning task, where AdaBoost is used to combine simple, 1D embeddings into a multidimensional embedding that preserves a large amount of the proximity structure of the original space. This paper demonstrates that, using the machine learning formulation of BoostMap, we can optimize embeddings for indexing and classification, in ways that are not possible with existing alternatives for constructive embeddings, and without additional costs in retrieval time. First, we show how to construct embeddings that are query-sensitive, in the sense that they yield a different distance measure for different queries, so as to improve nearest neighbor retrieval accuracy for each query. Second, we show how to optimize embeddings for nearest neighbor classification tasks, by tuning them to approximate a parameter space distance measure, instead of the original feature-based distance measure.
Abstract:
As new multi-party edge services are deployed on the Internet, application-layer protocols with complex communication models and event dependencies are increasingly being specified and adopted. To ensure that such protocols (and compositions thereof with existing protocols) do not result in undesirable behaviors (e.g., livelocks) there needs to be a methodology for the automated checking of the "safety" of these protocols. In this paper, we present ingredients of such a methodology. Specifically, we show how SPIN, a tool from the formal systems verification community, can be used to quickly identify problematic behaviors of application-layer protocols with non-trivial communication models—such as HTTP with the addition of the "100 Continue" mechanism. As a case study, we examine several versions of the specification for the Continue mechanism; our experiments mechanically uncovered multi-version interoperability problems, including some which motivated revisions of HTTP/1.1 and some which persist even with the current version of the protocol. One such problem resembles a classic degradation-of-service attack, but can arise between well-meaning peers. We also discuss how the methods we employ can be used to make explicit the requirements for hardening a protocol's implementation against potentially malicious peers, and for verifying an implementation's interoperability with the full range of allowable peer behaviors.
Abstract:
We present new, simple, efficient data structures for approximate reconciliation of set differences, a useful standalone primitive for peer-to-peer networks and a natural subroutine in methods for exact reconciliation. In the approximate reconciliation problem, peers A and B respectively have subsets of elements S_A and S_B of a large universe U. Peer A wishes to send a short message M to peer B with the goal that B should use M to determine as many elements in the set S_B − S_A as possible. To avoid the expense of round trip communication times, we focus on the situation where a single message M is sent. We motivate the performance tradeoffs between message size, accuracy and computation time for this problem with a straightforward approach using Bloom filters. We then introduce approximation reconciliation trees, a more computationally efficient solution that combines techniques from Patricia tries, Merkle trees, and Bloom filters. We present an analysis of approximation reconciliation trees and provide experimental results comparing the various methods proposed for approximate reconciliation.
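The straightforward Bloom filter approach mentioned above can be sketched as follows. This is a minimal illustration under assumptions not stated in the abstract: elements are strings, the hash positions are derived from SHA-256, and the filter size and hash count are arbitrary defaults. The names `BloomFilter`, `build_message`, and `approximate_difference` are hypothetical.

```python
import hashlib
from typing import Iterable, Iterator, Set


class BloomFilter:
    """A fixed-size bit array supporting insertion and membership tests."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def build_message(s_a: Iterable[str], num_bits: int = 1024,
                  num_hashes: int = 4) -> BloomFilter:
    """Peer A: summarize S_A in a single message M."""
    message = BloomFilter(num_bits, num_hashes)
    for element in s_a:
        message.add(element)
    return message


def approximate_difference(s_b: Iterable[str], message: BloomFilter) -> Set[str]:
    """Peer B: elements of S_B that the filter reports as absent from S_A.

    Every returned element is truly in S_B - S_A, because a Bloom filter has
    no false negatives. False positives may hide some elements of the
    difference, which is what makes the reconciliation approximate.
    """
    return {element for element in s_b if element not in message}
```

A larger `num_bits` lowers the false positive rate and so recovers more of the difference, at the cost of a longer message; this is the size versus accuracy tradeoff the abstract refers to.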
Abstract:
Formal tools like finite-state model checkers have proven useful in verifying the correctness of systems of bounded size and for hardening single system components against arbitrary inputs. However, conventional applications of these techniques are not well suited to characterizing emergent behaviors of large compositions of processes. In this paper, we present a methodology by which arbitrarily large compositions of components can, if sufficient conditions are proven concerning properties of small compositions, be modeled and completely verified by performing formal verifications upon only a finite set of compositions. The sufficient conditions take the form of reductions, which are claims that particular sequences of components will be causally indistinguishable from other shorter sequences of components. We show how this methodology can be applied to a variety of network protocol applications, including two features of the HTTP protocol, a simple active networking applet, and a proposed web cache consistency algorithm. We also discuss its applicability to framing protocol design goals and to representing systems which employ non-model-checking verification methodologies. Finally, we briefly discuss how we hope to broaden this methodology to more general topological compositions of network applications.
Abstract:
Principality of typings is the property that for each typable term, there is a typing from which all other typings are obtained via some set of operations. Type inference is the problem of finding a typing for a given term, if possible. We define an intersection type system which has principal typings and types exactly the strongly normalizable λ-terms. More interestingly, every finite-rank restriction of this system (using Leivant's first notion of rank) has principal typings and also has decidable type inference. This is in contrast to System F where the finite rank restriction for every finite rank at 3 and above has neither principal typings nor decidable type inference. This is also in contrast to earlier presentations of intersection types where the status of these properties is not known for the finite-rank restrictions at 3 and above. Furthermore, the notion of principal typings for our system involves only one operation, substitution, rather than several operations (not all substitution-based) as in earlier presentations of principality for intersection types (of unrestricted rank). A unification-based type inference algorithm is presented using a new form of unification, β-unification.
Abstract:
In a recent paper (Changes in Web Client Access Patterns: Characteristics and Caching Implications by Barford, Bestavros, Bradley, and Crovella) we performed a variety of analyses upon user traces collected in the Boston University Computer Science department in 1995 and 1998. A sanitized version of the 1995 trace has been publicly available for some time; the 1998 trace has now been sanitized, and is available from: http://www.cs.bu.edu/techreports/1999-011-usertrace-98.gz ftp://ftp.cs.bu.edu/techreports/1999-011-usertrace-98.gz This memo discusses the format of this public version of the log, and includes additional discussion of how the data was collected, how the log was sanitized, what this log is and is not useful for, and areas of potential future research interest.
Abstract:
Co-release of the inhibitory neurotransmitter GABA and the neuropeptide substance-P (SP) from single axons is a conspicuous feature of the basal ganglia, yet its computational role, if any, has not been resolved. In a new learning model, co-release of GABA and SP from axons of striatal projection neurons emerges as a highly efficient way to compute the uncertainty responses that are exhibited by dopamine (DA) neurons when animals adapt to probabilistic contingencies between rewards and the stimuli that predict their delivery. Such uncertainty-related dopamine release appears to be an adaptive phenotype, because it promotes behavioral switching at opportune times. Understanding the computational linkages between SP and DA in the basal ganglia is important, because Huntington's disease is characterized by massive SP depletion, whereas Parkinson's disease is characterized by massive DA depletion.
Abstract:
Recent electrophysiological data inspired the claim that dopaminergic neurons adapt their mismatch sensitivities to reflect variances of expected rewards. This contradicts reward prediction error theory and most basal ganglia models. Application of learning principles points to a testable alternative interpretation of the same data that is compatible with existing theory.
Abstract:
Before choosing, it helps to know both the expected value signaled by a predictive cue and the associated uncertainty that the reward will be forthcoming. Recently, Fiorillo et al. (2003) found that the dopamine (DA) neurons of the SNc exhibit sustained responses related to the uncertainty that a cue will be followed by reward, in addition to phasic responses related to reward prediction errors (RPEs). This suggests that cue-dependent anticipations of the timing, magnitude, and uncertainty of rewards are learned and reflected in components of the DA signals broadcast by SNc neurons. What is the minimal local circuit model that can explain such multifaceted reward-related learning? A new computational model shows how learned uncertainty responses emerge robustly on single trials along with phasic RPE responses, such that both types of DA responses exhibit the empirically observed dependence on conditional probability, expected value of reward, and time since onset of the reward-predicting cue. The model includes three major pathways for computing: immediate expected values of cues, timed predictions of reward magnitudes (and RPEs), and the uncertainty associated with these predictions. The first two model pathways refine those previously modeled by Brown et al. (1999). A third, newly modeled, pathway is formed by medium spiny projection neurons (MSPNs) of the matrix compartment of the striatum, whose axons co-release GABA and a neuropeptide, substance P, both at synapses with GABAergic neurons in the SNr and with the dendrites (in SNr) of DA neurons whose somas are in ventral SNc. Co-release enables efficient computation of sustained DA uncertainty responses that are a non-monotonic function of the conditional probability that a reward will follow the cue.
The new model's incorporation of a striatal microcircuit allowed it to reveal that variability in striatal cholinergic transmission can explain observed differences, between monkeys, in the amplitude of the non-monotonic uncertainty function. Involvement of matriceal MSPNs and striatal cholinergic transmission implies a relation between uncertainty in the cue-reward contingency and action-selection functions of the basal ganglia. The model synthesizes anatomical, electrophysiological and behavioral data regarding the midbrain DA system in a novel way, by relating the ability to compute uncertainty, in parallel with other aspects of reward contingencies, to the unique distribution of SP inputs in ventral SN.
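To make the two kinds of signal concrete, the sketch below computes a phasic reward prediction error and a simple uncertainty measure for a cue that is followed by a unit reward with probability p. This is not the circuit model described above: the Bernoulli variance p(1 − p) is used here only as an assumed stand-in for a non-monotonic uncertainty function, and the function names are hypothetical.

```python
def prediction_error(reward: float, expected_value: float) -> float:
    """Phasic RPE: received reward minus the value predicted by the cue."""
    return reward - expected_value


def reward_uncertainty(p: float) -> float:
    """Assumed uncertainty measure: variance of a unit reward delivered with
    probability p. It is zero at p = 0 and p = 1 and largest at p = 0.5."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    return p * (1.0 - p)


# Expected value rises steadily with p, while uncertainty rises and then falls.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    expected = p * 1.0  # unit reward magnitude
    print(p, expected, reward_uncertainty(p))
```

The loop illustrates why uncertainty and expected value must be tracked separately: cues with p = 0.25 and p = 0.75 carry the same uncertainty under this measure but different expected values.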
Abstract:
Illusory contours can be induced along directions approximately collinear to edges or approximately perpendicular to the ends of lines. Using a rating scale procedure we explored the relation between the two types of inducers by systematically varying the thickness of inducing elements to result in varying amounts of "edge-like" or "line-like" induction. Inducers for our illusory figures consisted of concentric rings with arcs missing. Observers judged the clarity and brightness of illusory figures as the number of arcs, their thicknesses, and spacings were parametrically varied. Degree of clarity and amount of induced brightness were both found to be inverted-U functions of the number of arcs. These results mandate that any valid model of illusory contour formation must account for interference effects between parallel lines or between those neural units responsible for completion of boundary signals in directions perpendicular to the ends of thin lines. Line width was found to have an effect on both clarity and brightness, a finding inconsistent with those models which employ only completion perpendicular to inducer orientation.
Abstract:
The giant cholinergic interneurons of the striatum are tonically active neurons (TANs) that respond with characteristic pauses to novel events and to appetitive and aversive conditioned stimuli. Fluctuations in acetylcholine release by TANs modulate performance- and learning-related dynamics in the striatum. Whereas tonic activity emerges from intrinsic properties of these neurons, glutamatergic inputs from thalamic centromedian-parafascicular nuclei, and dopaminergic inputs from midbrain, are required for the generation of pause responses. No prior computational models encompass both intrinsic and synaptically-gated dynamics. We present a mathematical model that robustly accounts for behavior-related electrophysiological properties of TANs in terms of their intrinsic physiological properties and known afferents. In the model, balanced intrinsic hyperpolarizing and depolarizing currents engender tonic firing, and glutamatergic inputs from thalamus (and cortex) both directly excite and indirectly inhibit TANs. If the latter inhibition, presumably mediated by GABAergic interneurons, exceeds a threshold, its effect is amplified by a KIR current to generate a prolonged pause. In the model, the intrinsic mechanisms and external inputs are both modulated by learning-dependent dopamine (DA) signals and our simulations revealed that many learning-dependent behaviors of TANs are explicable without recourse to learning-dependent changes in synapses onto TANs. The "teaching signal" that modulates reinforcement learning at cortico-striatal synapses may be a sequence composed of an adaptively scaled DA burst, a brief ACh burst, and a scaled ACh pause. Such an interpretation is consistent with recent data on cholinergic control of LTD of cortical synapses onto striatal spiny projection neurons.