14 resultados para two-body problem

em Boston University Digital Common


Relevância:

40.00% 40.00%

Publicador:

Resumo:

A non-linear supervised learning architecture, the Specialized Mapping Architecture (SMA) and its application to articulated body pose reconstruction from single monocular images is described. The architecture is formed by a number of specialized mapping functions, each of them with the purpose of mapping certain portions (connected or not) of the input space, and a feedback matching process. A probabilistic model for the architecture is described along with a mechanism for learning its parameters. The learning problem is approached using a maximum likelihood estimation framework; we present Expectation Maximization (EM) algorithms for two different instances of the likelihood probability. Performance is characterized by estimating human body postures from low level visual features, showing promising results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

University of Pretoria / Dissertation / Department of Church History and Church Policy / Advised by Prof J W Hofmeyr

Relevância:

30.00% 30.00%

Publicador:

Resumo:

M.A. Thesis / University of Pretoria / Department of Practical Theology / Advised by Prof M Masango

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Google AdSense Program is a successful internet advertisement program where Google places contextual adverts on third-party websites and shares the resulting revenue with each publisher. Advertisers have budgets and bid on ad slots while publishers set reserve prices for the ad slots on their websites. Following previous modelling efforts, we model the program as a two-sided market with advertisers on one side and publishers on the other. We show a reduction from the Generalised Assignment Problem (GAP) to the problem of computing the revenue maximising allocation and pricing of publisher slots under a first-price auction. GAP is APX-hard but a (1-1/e) approximation is known. We compute truthful and revenue-maximizing prices and allocation of ad slots to advertisers under a second-price auction. The auctioneer's revenue is within (1-1/e) second-price optimal.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A well-known paradigm for load balancing in distributed systems is the``power of two choices,''whereby an item is stored at the less loaded of two (or more) random alternative servers. We investigate the power of two choices in natural settings for distributed computing where items and servers reside in a geometric space and each item is associated with the server that is its nearest neighbor. This is in fact the backdrop for distributed hash tables such as Chord, where the geometric space is determined by clockwise distance on a one-dimensional ring. Theoretically, we consider the following load balancing problem. Suppose that servers are initially hashed uniformly at random to points in the space. Sequentially, each item then considers d candidate insertion points also chosen uniformly at random from the space,and selects the insertion point whose associated server has the least load. For the one-dimensional ring, and for Euclidean distance on the two-dimensional torus, we demonstrate that when n data items are hashed to n servers,the maximum load at any server is log log n / log d + O(1) with high probability. While our results match the well-known bounds in the standard setting in which each server is selected equiprobably, our applications do not have this feature, since the sizes of the nearest-neighbor regions around servers are non-uniform. Therefore, the novelty in our methods lies in developing appropriate tail bounds on the distribution of nearest-neighbor region sizes and in adapting previous arguments to this more general setting. In addition, we provide simulation results demonstrating the load balance that results as the system size scales into the millions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An approach for estimating 3D body pose from multiple, uncalibrated views is proposed. First, a mapping from image features to 2D body joint locations is computed using a statistical framework that yields a set of several body pose hypotheses. The concept of a "virtual camera" is introduced that makes this mapping invariant to translation, image-plane rotation, and scaling of the input. As a consequence, the calibration matrices (intrinsics) of the virtual cameras can be considered completely known, and their poses are known up to a single angular displacement parameter. Given pose hypotheses obtained in the multiple virtual camera views, the recovery of 3D body pose and camera relative orientations is formulated as a stochastic optimization problem. An Expectation-Maximization algorithm is derived that can obtain the locally most likely (self-consistent) combination of body pose hypotheses. Performance of the approach is evaluated with synthetic sequences as well as real video sequences of human motion.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A fundamental task of vision systems is to infer the state of the world given some form of visual observations. From a computational perspective, this often involves facing an ill-posed problem; e.g., information is lost via projection of the 3D world into a 2D image. Solution of an ill-posed problem requires additional information, usually provided as a model of the underlying process. It is important that the model be both computationally feasible as well as theoretically well-founded. In this thesis, a probabilistic, nonlinear supervised computational learning model is proposed: the Specialized Mappings Architecture (SMA). The SMA framework is demonstrated in a computer vision system that can estimate the articulated pose parameters of a human body or human hands, given images obtained via one or more uncalibrated cameras. The SMA consists of several specialized forward mapping functions that are estimated automatically from training data, and a possibly known feedback function. Each specialized function maps certain domains of the input space (e.g., image features) onto the output space (e.g., articulated body parameters). A probabilistic model for the architecture is first formalized. Solutions to key algorithmic problems are then derived: simultaneous learning of the specialized domains along with the mapping functions, as well as performing inference given inputs and a feedback function. The SMA employs a variant of the Expectation-Maximization algorithm and approximate inference. The approach allows the use of alternative conditional independence assumptions for learning and inference, which are derived from a forward model and a feedback model. Experimental validation of the proposed approach is conducted in the task of estimating articulated body pose from image silhouettes. Accuracy and stability of the SMA framework is tested using artificial data sets, as well as synthetic and real video sequences of human bodies and hands.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Interdomain routing on the Internet is performed using route preference policies specified independently, and arbitrarily by each Autonomous System in the network. These policies are used in the border gateway protocol (BGP) by each AS when selecting next-hop choices for routes to each destination. Conflicts between policies used by different ASs can lead to routing instabilities that, potentially, cannot be resolved no matter how long BGP is run. The Stable Paths Problem (SPP) is an abstract graph theoretic model of the problem of selecting nexthop routes for a destination. A stable solution to the problem is a set of next-hop choices, one for each AS, that is compatible with the policies of each AS. In a stable solution each AS has selected its best next-hop given that the next-hop choices of all neighbors are fixed. BGP can be viewed as a distributed algorithm for solving SPP. In this report we consider the stable paths problem, as well as a family of restricted variants of the stable paths problem, which we call F stable paths problems. We show that two very simple variants of the stable paths problem are also NP-complete. In addition we show that for networks with a DAG topology, there is an efficient centralized algorithm to solve the stable paths problem, and that BGP always efficiently converges to a stable solution on such networks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A novel approach for estimating articulated body posture and motion from monocular video sequences is proposed. Human pose is defined as the instantaneous two dimensional configuration (i.e., the projection onto the image plane) of a single articulated body in terms of the position of a predetermined set of joints. First, statistical segmentation of the human bodies from the background is performed and low-level visual features are found given the segmented body shape. The goal is to be able to map these, generally low level, visual features to body configurations. The system estimates different mappings, each one with a specific cluster in the visual feature space. Given a set of body motion sequences for training, unsupervised clustering is obtained via the Expectation Maximation algorithm. Then, for each of the clusters, a function is estimated to build the mapping between low-level features to 3D pose. Currently this mapping is modeled by a neural network. Given new visual features, a mapping from each cluster is performed to yield a set of possible poses. From this set, the system selects the most likely pose given the learned probability distribution and the visual feature similarity between hypothesis and input. Performance of the proposed approach is characterized using a new set of known body postures, showing promising results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In many networked applications, independent caching agents cooperate by servicing each other's miss streams, without revealing the operational details of the caching mechanisms they employ. Inference of such details could be instrumental for many other processes. For example, it could be used for optimized forwarding (or routing) of one's own miss stream (or content) to available proxy caches, or for making cache-aware resource management decisions. In this paper, we introduce the Cache Inference Problem (CIP) as that of inferring the characteristics of a caching agent, given the miss stream of that agent. While CIP is insolvable in its most general form, there are special cases of practical importance in which it is, including when the request stream follows an Independent Reference Model (IRM) with generalized power-law (GPL) demand distribution. To that end, we design two basic "litmus" tests that are able to detect LFU and LRU replacement policies, the effective size of the cache and of the object universe, and the skewness of the GPL demand for objects. Using extensive experiments under synthetic as well as real traces, we show that our methods infer such characteristics accurately and quite efficiently, and that they remain robust even when the IRM/GPL assumptions do not hold, and even when the underlying replacement policies are not "pure" LFU or LRU. We exemplify the value of our inference framework by considering example applications.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Spotting patterns of interest in an input signal is a very useful task in many different fields including medicine, bioinformatics, economics, speech recognition and computer vision. Example instances of this problem include spotting an object of interest in an image (e.g., a tumor), a pattern of interest in a time-varying signal (e.g., audio analysis), or an object of interest moving in a specific way (e.g., a human's body gesture). Traditional spotting methods, which are based on Dynamic Time Warping or hidden Markov models, use some variant of dynamic programming to register the pattern and the input while accounting for temporal variation between them. At the same time, those methods often suffer from several shortcomings: they may give meaningless solutions when input observations are unreliable or ambiguous, they require a high complexity search across the whole input signal, and they may give incorrect solutions if some patterns appear as smaller parts within other patterns. In this thesis, we develop a framework that addresses these three problems, and evaluate the framework's performance in spotting and recognizing hand gestures in video. The first contribution is a spatiotemporal matching algorithm that extends the dynamic programming formulation to accommodate multiple candidate hand detections in every video frame. The algorithm finds the best alignment between the gesture model and the input, and simultaneously locates the best candidate hand detection in every frame. This allows for a gesture to be recognized even when the hand location is highly ambiguous. The second contribution is a pruning method that uses model-specific classifiers to reject dynamic programming hypotheses with a poor match between the input and model. Pruning improves the efficiency of the spatiotemporal matching algorithm, and in some cases may improve the recognition accuracy. The pruning classifiers are learned from training data, and cross-validation is used to reduce the chance of overpruning. The third contribution is a subgesture reasoning process that models the fact that some gesture models can falsely match parts of other, longer gestures. By integrating subgesture reasoning the spotting algorithm can avoid the premature detection of a subgesture when the longer gesture is actually being performed. Subgesture relations between pairs of gestures are automatically learned from training data. The performance of the approach is evaluated on two challenging video datasets: hand-signed digits gestured by users wearing short sleeved shirts, in front of a cluttered background, and American Sign Language (ASL) utterances gestured by ASL native signers. The experiments demonstrate that the proposed method is more accurate and efficient than competing approaches. The proposed approach can be generally applied to alignment or search problems with multiple input observations, that use dynamic programming to find a solution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A neural model is described of how the brain may autonomously learn a body-centered representation of 3-D target position by combining information about retinal target position, eye position, and head position in real time. Such a body-centered spatial representation enables accurate movement commands to the limbs to be generated despite changes in the spatial relationships between the eyes, head, body, and limbs through time. The model learns a vector representation--otherwise known as a parcellated distributed representation--of target vergence with respect to the two eyes, and of the horizontal and vertical spherical angles of the target with respect to a cyclopean egocenter. Such a vergence-spherical representation has been reported in the caudal midbrain and medulla of the frog, as well as in psychophysical movement studies in humans. A head-centered vergence-spherical representation of foveated target position can be generated by two stages of opponent processing that combine corollary discharges of outflow movement signals to the two eyes. Sums and differences of opponent signals define angular and vergence coordinates, respectively. The head-centered representation interacts with a binocular visual representation of non-foveated target position to learn a visuomotor representation of both foveated and non-foveated target position that is capable of commanding yoked eye movementes. This head-centered vector representation also interacts with representations of neck movement commands to learn a body-centered estimate of target position that is capable of commanding coordinated arm movements. Learning occurs during head movements made while gaze remains fixed on a foveated target. An initial estimate is stored and a VOR-mediated gating signal prevents the stored estimate from being reset during a gaze-maintaining head movement. As the head moves, new estimates arc compared with the stored estimate to compute difference vectors which act as error signals that drive the learning process, as well as control the on-line merging of multimodal information.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article describes how corollary discharges from outflow eye movement commands can be transformed by two stages of opponent neural processing into a head-centered representation of 3-D target position. This representation implicitly defines a cyclopean coordinate system whose variables approximate the binocular vergence and spherical horizontal and vertical angles with respect to the observer's head. Various psychophysical data concerning binocular distance perception and reaching behavior are clarified by this representation. The representation provides a foundation for learning head-centered and body-centered invariant representations of both foveated and non-foveated 3-D target positions. It also enables a solution to be developed of the classical motor equivalence problem, whereby many different joint configurations of a redundant manipulator can all be used to realize a desired trajectory in 3-D space.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A neural network model of 3-D visual perception and figure-ground separation by visual cortex is introduced. The theory provides a unified explanation of how a 2-D image may generate a 3-D percept; how figures pop-out from cluttered backgrounds; how spatially sparse disparity cues can generate continuous surface representations at different perceived depths; how representations of occluded regions can be completed and recognized without usually being seen; how occluded regions can sometimes be seen during percepts of transparency; how high spatial frequency parts of an image may appear closer than low spatial frequency parts; how sharp targets are detected better against a figure and blurred targets are detector better against a background; how low spatial frequency parts of an image may be fused while high spatial frequency parts are rivalrous; how sparse blue cones can generate vivid blue surface percepts; how 3-D neon color spreading, visual phantoms, and tissue contrast percepts are generated; how conjunctions of color-and-depth may rapidly pop-out during visual search. These explanations arise derived from an ecological analysis of how monocularly viewed parts of an image inherit the appropriate depth from contiguous binocularly viewed parts, as during DaVinci stereopsis. The model predicts the functional role and ordering of multiple interactions within and between the two parvocellular processing streams that join LGN to prestriate area V4. Interactions from cells representing larger scales and disparities to cells representing smaller scales and disparities are of particular importance.