18 resultados para Collapsed objects and Supernovae
em Boston University Digital Common
Resumo:
This paper presents a tool called Gismo (Generator of Internet Streaming Media Objects and workloads). Gismo enables the specification of a number of streaming media access characteristics, including object popularity, temporal correlation of request, seasonal access patterns, user session durations, user interactivity times, and variable bit-rate (VBR) self-similarity and marginal distributions. The embodiment of these characteristics in Gismo enables the generation of realistic and scalable request streams for use in the benchmarking and comparative evaluation of Internet streaming media delivery techniques. To demonstrate the usefulness of Gismo, we present a case study that shows the importance of various workload characteristics in determining the effectiveness of proxy caching and server patching techniques in reducing bandwidth requirements.
Resumo:
A mechanism is proposed that integrates low-level (image processing), mid-level (recursive 3D trajectory estimation), and high-level (action recognition) processes. It is assumed that the system observes multiple moving objects via a single, uncalibrated video camera. A novel extended Kalman filter formulation is used in estimating the relative 3D motion trajectories up to a scale factor. The recursive estimation process provides a prediction and error measure that is exploited in higher-level stages of action recognition. Conversely, higher-level mechanisms provide feedback that allows the system to reliably segment and maintain the tracking of moving objects before, during, and after occlusion. The 3D trajectory, occlusion, and segmentation information are utilized in extracting stabilized views of the moving object. Trajectory-guided recognition (TGR) is proposed as a new and efficient method for adaptive classification of action. The TGR approach is demonstrated using "motion history images" that are then recognized via a mixture of Gaussian classifier. The system was tested in recognizing various dynamic human outdoor activities; e.g., running, walking, roller blading, and cycling. Experiments with synthetic data sets are used to evaluate stability of the trajectory estimator with respect to noise.
Resumo:
This paper attempts two tasks. First, it sketches how the natural sciences (including especially the biological sciences), the social sciences, and the scientific study of religion can be understood to furnish complementary, consonant perspectives on human beings and human groups. This suggests that it is possible to speak of a modern secular interpretation of humanity (MSIH) to which these perspectives contribute (though not without tensions). MSIH is not a comprehensive interpretation of human beings, if only because it adopts a posture of neutrality with regard to the reality of religious objects and the truth of theological claims about them. MSIH is certainly an impressively forceful interpretation, however, and it needs to be reckoned with by any perspective on human life that seeks to insert its truth claims into the arena of public debate. Second, the paper considers two challenges that MSIH poses to specifically theological interpretations of human beings. On the one hand, in spite of its posture of religious neutrality, MSIH is a key element in a class of wider, seemingly antireligious interpretations of humanity, including especially projectionist and illusionist critiques of religion. It is consonance with MSIH that makes these critiques such formidable competitors for traditional theological interpretations of human beings. On the other hand, and taking the religiously neutral posture of MSIH at face value, theological accounts of humanity that seek to coordinate the insights of MSIH with positive religious visions of human life must find ways to overcome or manage such dissonance as arises. The goal of synthesis is defended as important, and strategies for managing these challenges, especially in light of the pluralism of extant philosophical and theological interpretations of human beings, are advocated.
Resumo:
A combined 2D, 3D approach is presented that allows for robust tracking of moving bodies in a given environment as observed via a single, uncalibrated video camera. Tracking is robust even in the presence of occlusions. Low-level features are often insufficient for detection, segmentation, and tracking of non-rigid moving objects. Therefore, an improved mechanism is proposed that combines low-level (image processing) and mid-level (recursive trajectory estimation) information obtained during the tracking process. The resulting system can segment and maintain the tracking of moving objects before, during, and after occlusion. At each frame, the system also extracts a stabilized coordinate frame of the moving objects. This stabilized frame is used to resize and resample the moving blob so that it can be used as input to motion recognition modules. The approach enables robust tracking without constraining the system to know the shape of the objects being tracked beforehand; although, some assumptions are made about the characteristics of the shape of the objects, and how they evolve with time. Experiments in tracking moving people are described.
Resumo:
Temporal locality of reference in Web request streams emerges from two distinct phenomena: the popularity of Web objects and the {\em temporal correlation} of requests. Capturing these two elements of temporal locality is important because it enables cache replacement policies to adjust how they capitalize on temporal locality based on the relative prevalence of these phenomena. In this paper, we show that temporal locality metrics proposed in the literature are unable to delineate between these two sources of temporal locality. In particular, we show that the commonly-used distribution of reference interarrival times is predominantly determined by the power law governing the popularity of documents in a request stream. To capture (and more importantly quantify) both sources of temporal locality in a request stream, we propose a new and robust metric that enables accurate delineation between locality due to popularity and that due to temporal correlation. Using this metric, we characterize the locality of reference in a number of representative proxy cache traces. Our findings show that there are measurable differences between the degrees (and sources) of temporal locality across these traces, and that these differences are effectively captured using our proposed metric. We illustrate the significance of our findings by summarizing the performance of a novel Web cache replacement policy---called GreedyDual*---which exploits both long-term popularity and short-term temporal correlation in an adaptive fashion. Our trace-driven simulation experiments (which are detailed in an accompanying Technical Report) show the superior performance of GreedyDual* when compared to other Web cache replacement policies.
Resumo:
Research on the construction of logical overlay networks has gained significance in recent times. This is partly due to work on peer-to-peer (P2P) systems for locating and retrieving distributed data objects, and also scalable content distribution using end-system multicast techniques. However, there are emerging applications that require the real-time transport of data from various sources to potentially many thousands of subscribers, each having their own quality-of-service (QoS) constraints. This paper primarily focuses on the properties of two popular topologies found in interconnection networks, namely k-ary n-cubes and de Bruijn graphs. The regular structure of these graph topologies makes them easier to analyze and determine possible routes for real-time data than complete or irregular graphs. We show how these overlay topologies compare in their ability to deliver data according to the QoS constraints of many subscribers, each receiving data from specific publishing hosts. Comparisons are drawn on the ability of each topology to route data in the presence of dynamic system effects, due to end-hosts joining and departing the system. Finally, experimental results show the service guarantees and physical link stress resulting from efficient multicast trees constructed over both kinds of overlay networks.
Resumo:
Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
Resumo:
A model of laminar visual cortical dynamics proposes how 3D boundary and surface representations of slated and curved 3D objects and 2D images arise. The 3D boundary representations emerge from interactions between non-classical horizontal receptive field interactions with intracorticcal and intercortical feedback circuits. Such non-classical interactions contextually disambiguate classical receptive field responses to ambiguous visual cues using cells that are sensitive to angles and disparity gradients with cortical areas V1 and V2. These cells are all variants of bipole grouping cells. Model simulations show how horizontal connections can develop selectively to angles, how slanted surfaces can activate 3D boundary representations that are sensitive to angles and disparity gradients, how 3D filling-in occurs across slanted surfaces, how a 2D Necker cube image can be represented in 3D, and how bistable Necker cuber percepts occur. The model also explains data about slant aftereffects and 3D neon color spreading. It shows how habituative transmitters that help to control developement also help to trigger bistable 3D percepts and slant aftereffects, and how attention can influence which of these percepts is perceived by propogating along some object boundaries.
Resumo:
How do humans use predictive contextual information to facilitate visual search? How are consistently paired scenic objects and positions learned and used to more efficiently guide search in familiar scenes? For example, a certain combination of objects can define a context for a kitchen and trigger a more efficient search for a typical object, such as a sink, in that context. A neural model, ARTSCENE Search, is developed to illustrate the neural mechanisms of such memory-based contextual learning and guidance, and to explain challenging behavioral data on positive/negative, spatial/object, and local/distant global cueing effects during visual search. The model proposes how global scene layout at a first glance rapidly forms a hypothesis about the target location. This hypothesis is then incrementally refined by enhancing target-like objects in space as a scene is scanned with saccadic eye movements. The model clarifies the functional roles of neuroanatomical, neurophysiological, and neuroimaging data in visual search for a desired goal object. In particular, the model simulates the interactive dynamics of spatial and object contextual cueing in the cortical What and Where streams starting from early visual areas through medial temporal lobe to prefrontal cortex. After learning, model dorsolateral prefrontal cortical cells (area 46) prime possible target locations in posterior parietal cortex based on goalmodulated percepts of spatial scene gist represented in parahippocampal cortex, whereas model ventral prefrontal cortical cells (area 47/12) prime possible target object representations in inferior temporal cortex based on the history of viewed objects represented in perirhinal cortex. The model hereby predicts how the cortical What and Where streams cooperate during scene perception, learning, and memory to accumulate evidence over time to drive efficient visual search of familiar scenes.
Resumo:
This paper introduces ART-EMAP, a neural architecture that uses spatial and temporal evidence accumulation to extend the capabilities of fuzzy ARTMAP. ART-EMAP combines supervised and unsupervised learning and a medium-term memory process to accomplish stable pattern category recognition in a noisy input environment. The ART-EMAP system features (i) distributed pattern registration at a view category field; (ii) a decision criterion for mapping between view and object categories which can delay categorization of ambiguous objects and trigger an evidence accumulation process when faced with a low confidence prediction; (iii) a process that accumulates evidence at a medium-term memory (MTM) field; and (iv) an unsupervised learning algorithm to fine-tune performance after a limited initial period of supervised network training. ART-EMAP dynamics are illustrated with a benchmark simulation example. Applications include 3-D object recognition from a series of ambiguous 2-D views.
Resumo:
The What-and-Where filter forms part of a neural network architecture for spatial mapping, object recognition, and image understanding. The Where fllter responds to an image figure that has been separated from its background. It generates a spatial map whose cell activations simultaneously represent the position, orientation, ancl size of all tbe figures in a scene (where they are). This spatial map may he used to direct spatially localized attention to these image features. A multiscale array of oriented detectors, followed by competitve and interpolative interactions between position, orientation, and size scales, is used to define the Where filter. This analysis discloses several issues that need to be dealt with by a spatial mapping system that is based upon oriented filters, such as the role of cliff filters with and without normalization, the double peak problem of maximum orientation across size scale, and the different self-similar interpolation properties across orientation than across size scale. Several computationally efficient Where filters are proposed. The Where filter rnay be used for parallel transformation of multiple image figures into invariant representations that are insensitive to the figures' original position, orientation, and size. These invariant figural representations form part of a system devoted to attentive object learning and recognition (what it is). Unlike some alternative models where serial search for a target occurs, a What and Where representation can he used to rapidly search in parallel for a desired target in a scene. Such a representation can also be used to learn multidimensional representations of objects and their spatial relationships for purposes of image understanding. The What-and-Where filter is inspired by neurobiological data showing that a Where processing stream in the cerebral cortex is used for attentive spatial localization and orientation, whereas a What processing stream is used for attentive object learning and recognition.
Resumo:
In a constantly changing world, humans are adapted to alternate routinely between attending to familiar objects and testing hypotheses about novel ones. We can rapidly learn to recognize and narne novel objects without unselectively disrupting our memories of familiar ones. We can notice fine details that differentiate nearly identical objects and generalize across broad classes of dissimilar objects. This chapter describes a class of self-organizing neural network architectures--called ARTMAP-- that are capable of fast, yet stable, on-line recognition learning, hypothesis testing, and naming in response to an arbitrary stream of input patterns (Carpenter, Grossberg, Markuzon, Reynolds, and Rosen, 1992; Carpenter, Grossberg, and Reynolds, 1991). The intrinsic stability of ARTMAP allows the system to learn incrementally for an unlimited period of time. System stability properties can be traced to the structure of its learned memories, which encode clusters of attended features into its recognition categories, rather than slow averages of category inputs. The level of detail in the learned attentional focus is determined moment-by-moment, depending on predictive success: an error due to over-generalization automatically focuses attention on additional input details enough of which are learned in a new recognition category so that the predictive error will not be repeated. An ARTMAP system creates an evolving map between a variable number of learned categories that compress one feature space (e.g., visual features) to learned categories of another feature space (e.g., auditory features). Input vectors can be either binary or analog. Computational properties of the networks enable them to perform significantly better in benchmark studies than alternative machine learning, genetic algorithm, or neural network models. Some of the critical problems that challenge and constrain any such autonomous learning system will next be illustrated. Design principles that work together to solve these problems are then outlined. These principles are realized in the ARTMAP architecture, which is specified as an algorithm. Finally, ARTMAP dynamics are illustrated by means of a series of benchmark simulations.
Resumo:
A neural network theory of :3-D vision, called FACADE Theory, is described. The theory proposes a solution of the classical figure-ground problem for biological vision. It does so by suggesting how boundary representations and surface representations are formed within a Boundary Contour System (BCS) and a Feature Contour System (FCS). The BCS and FCS interact reciprocally to form 3-D boundary and surface representations that arc mutually consistent. Their interactions generate 3-D percepts wherein occluding and occluded object completed, and grouped. The theory clarifies how preattentive processes of 3-D perception and figure-ground separation interact reciprocally with attentive processes of spatial localization, object recognition, and visual search. A new theory of stereopsis is proposed that predicts how cells sensitive to multiple spatial frequencies, disparities, and orientations are combined by context-sensitive filtering, competition, and cooperation to form coherent BCS boundary segmentations. Several factors contribute to figure-ground pop-out, including: boundary contrast between spatially contiguous boundaries, whether due to scenic differences in luminance, color, spatial frequency, or disparity; partially ordered interactions from larger spatial scales and disparities to smaller scales and disparities; and surface filling-in restricted to regions surrounded by a connected boundary. Phenomena such as 3-D pop-out from a 2-D picture, DaVinci stereopsis, a 3-D neon color spreading, completion of partially occluded objects, and figure-ground reversals are analysed. The BCS and FCS sub-systems model aspects of how the two parvocellular cortical processing streams that join the Lateral Geniculate Nucleus to prestriate cortical area V4 interact to generate a multiplexed representation of Form-And-Color-And-Depth, or FACADE, within area V4. Area V4 is suggested to support figure-ground separation and to interact. with cortical mechanisms of spatial attention, attentive objcect learning, and visual search. Adaptive Resonance Theory (ART) mechanisms model aspects of how prestriate visual cortex interacts reciprocally with a visual object recognition system in inferotemporal cortex (IT) for purposes of attentive object learning and categorization. Object attention mechanisms of the What cortical processing stream through IT cortex are distinguished from spatial attention mechanisms of the Where cortical processing stream through parietal cortex. Parvocellular BCS and FCS signals interact with the model What stream. Parvocellular FCS and magnocellular Motion BCS signals interact with the model Where stream. Reciprocal interactions between these visual, What, and Where mechanisms arc used to discuss data about visual search and saccadic eye movements, including fast search of conjunctive targets, search of 3-D surfaces, selective search of like-colored targets, attentive tracking of multi-element groupings, and recursive search of simultaneously presented targets.
Resumo:
We describe our work on shape-based image database search using the technique of modal matching. Modal matching employs a deformable shape decomposition that allows users to select example objects and have the computer efficiently sort the set of objects based on the similarity of their shape. Shapes are compared in terms of the types of nonrigid deformations (differences) that relate them. The modal decomposition provides deformation "control knobs" for flexible matching and thus allows for selecting weighted subsets of shape parameters that are deemed significant for a particular category or context. We demonstrate the utility of this approach for shape comparison in 2-D image databases; however, the general formulation is applicable to signals of any dimensionality.
Resumo:
Internet streaming applications are adversely affected by network conditions such as high packet loss rates and long delays. This paper aims at mitigating such effects by leveraging the availability of client-side caching proxies. We present a novel caching architecture (and associated cache management algorithms) that turn edge caches into accelerators of streaming media delivery. A salient feature of our caching algorithms is that they allow partial caching of streaming media objects and joint delivery of content from caches and origin servers. The caching algorithms we propose are both network-aware and stream-aware; they take into account the popularity of streaming media objects, their bit-rate requirements, and the available bandwidth between clients and servers. Using realistic models of Internet bandwidth (derived from proxy cache logs and measured over real Internet paths), we have conducted extensive simulations to evaluate the performance of various cache management alternatives. Our experiments demonstrate that network-aware caching algorithms can significantly reduce service delay and improve overall stream quality. Also, our experiments show that partial caching is particularly effective when bandwidth variability is not very high.