916 resultados para Set topology
Resumo:
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multi-dimensional function, that is solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nolinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
Resumo:
Amorphous computing is the study of programming ultra-scale computing environments of smart sensors and actuators cite{white-paper}. The individual elements are identical, asynchronous, randomly placed, embedded and communicate locally via wireless broadcast. Aggregating the processors into groups is a useful paradigm for programming an amorphous computer because groups can be used for specialization, increased robustness, and efficient resource allocation. This paper presents a new algorithm, called the clubs algorithm, for efficiently aggregating processors into groups in an amorphous computer, in time proportional to the local density of processors. The clubs algorithm is well-suited to the unique characteristics of an amorphous computer. In addition, the algorithm derives two properties from the physical embedding of the amorphous computer: an upper bound on the number of groups formed and a constant upper bound on the density of groups. The clubs algorithm can also be extended to find the maximal independent set (MIS) and $Delta + 1$ vertex coloring in an amorphous computer in $O(log N)$ rounds, where $N$ is the total number of elements and $Delta$ is the maximum degree.
Resumo:
X. Wang, J. Yang, R. Jensen and X. Liu, 'Rough Set Feature Selection and Rule Induction for Prediction of Malignancy Degree in Brain Glioma,' Computer Methods and Programs in Biomedicine, vol. 83, no. 2, pp. 147-156, 2006.
Resumo:
Feature selection aims to determine a minimal feature subset from a problem domain while retaining a suitably high accuracy in representing the original features. Rough set theory (RST) has been used as such a tool with much success. RST enables the discovery of data dependencies and the reduction of the number of attributes contained in a dataset using the data alone, requiring no additional information. This chapter describes the fundamental ideas behind RST-based approaches and reviews related feature selection methods that build on these ideas. Extensions to the traditional rough set approach are discussed, including recent selection methods based on tolerance rough sets, variable precision rough sets and fuzzy-rough sets. Alternative search mechanisms are also highly important in rough set feature selection. The chapter includes the latest developments in this area, including RST strategies based on hill-climbing, genetic algorithms and ant colony optimization.
Resumo:
R. Jensen, Q. Shen and A. Tuson, 'Finding Rough Set Reducts with SAT,' Proceedings of the 10th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, LNAI 3641, pp. 194-203, 2005.
Resumo:
Recent studies have noted that vertex degree in the autonomous system (AS) graph exhibits a highly variable distribution [15, 22]. The most prominent explanatory model for this phenomenon is the Barabási-Albert (B-A) model [5, 2]. A central feature of the B-A model is preferential connectivity—meaning that the likelihood a new node in a growing graph will connect to an existing node is proportional to the existing node’s degree. In this paper we ask whether a more general explanation than the B-A model, and absent the assumption of preferential connectivity, is consistent with empirical data. We are motivated by two observations: first, AS degree and AS size are highly correlated [11]; and second, highly variable AS size can arise simply through exponential growth. We construct a model incorporating exponential growth in the size of the Internet, and in the number of ASes. We then show via analysis that such a model yields a size distribution exhibiting a power-law tail. In such a model, if an AS’s link formation is roughly proportional to its size, then AS degree will also show high variability. We instantiate such a model with empirically derived estimates of growth rates and show that the resulting degree distribution is in good agreement with that of real AS graphs.
Resumo:
Wireless sensor networks are characterized by limited energy resources. To conserve energy, application-specific aggregation (fusion) of data reports from multiple sensors can be beneficial in reducing the amount of data flowing over the network. Furthermore, controlling the topology by scheduling the activity of nodes between active and sleep modes has often been used to uniformly distribute the energy consumption among all nodes by de-synchronizing their activities. We present an integrated analytical model to study the joint performance of in-network aggregation and topology control. We define performance metrics that capture the tradeoffs among delay, energy, and fidelity of the aggregation. Our results indicate that to achieve high fidelity levels under medium to high event reporting load, shorter and fatter aggregation/routing trees (toward the sink) offer the best delay-energy tradeoff as long as topology control is well coordinated with routing.
Resumo:
Effective engineering of the Internet is predicated upon a detailed understanding of issues such as the large-scale structure of its underlying physical topology, the manner in which it evolves over time, and the way in which its constituent components contribute to its overall function. Unfortunately, developing a deep understanding of these issues has proven to be a challenging task, since it in turn involves solving difficult problems such as mapping the actual topology, characterizing it, and developing models that capture its emergent behavior. Consequently, even though there are a number of topology models, it is an open question as to how representative the topologies they generate are of the actual Internet. Our goal is to produce a topology generation framework which improves the state of the art and is based on design principles which include representativeness, inclusiveness, and interoperability. Representativeness leads to synthetic topologies that accurately reflect many aspects of the actual Internet topology (e.g. hierarchical structure, degree distribution, etc.). Inclusiveness combines the strengths of as many generation models as possible in a single generation tool. Interoperability provides interfaces to widely-used simulation and visualization applications such as ns and SSF. We call such a tool a universal topology generator. In this paper we discuss the design, implementation and usage of the BRITE universal topology generation tool that we have built. We also describe the BRITE Analysis Engine, BRIANA, which is an independent piece of software designed and built upon BRITE design goals of flexibility and extensibility. The purpose of BRIANA is to act as a repository of analysis routines along with a user–friendly interface that allows its use on different topology formats.
Resumo:
We present new, simple, efficient data structures for approximate reconciliation of set differences, a useful standalone primitive for peer-to-peer networks and a natural subroutine in methods for exact reconciliation. In the approximate reconciliation problem, peers A and B respectively have subsets of elements SA and SB of a large universe U. Peer A wishes to send a short message M to peer B with the goal that B should use M to determine as many elements in the set SB–SA as possible. To avoid the expense of round trip communication times, we focus on the situation where a single message M is sent. We motivate the performance tradeoffs between message size, accuracy and computation time for this problem with a straightforward approach using Bloom filters. We then introduce approximation reconciliation trees, a more computationally efficient solution that combines techniques from Patricia tries, Merkle trees, and Bloom filters. We present an analysis of approximation reconciliation trees and provide experimental results comparing the various methods proposed for approximate reconciliation.
Resumo:
There has been considerable work done in the study of Web reference streams: sequences of requests for Web objects. In particular, many studies have looked at the locality properties of such streams, because of the impact of locality on the design and performance of caching and prefetching systems. However, a general framework for understanding why reference streams exhibit given locality properties has not yet emerged. In this work we take a first step in this direction, based on viewing the Web as a set of reference streams that are transformed by Web components (clients, servers, and intermediaries). We propose a graph-based framework for describing this collection of streams and components. We identify three basic stream transformations that occur at nodes of the graph: aggregation, disaggregation and filtering, and we show how these transformations can be used to abstract the effects of different Web components on their associated reference streams. This view allows a structured approach to the analysis of why reference streams show given properties at different points in the Web. Applying this approach to the study of locality requires good metrics for locality. These metrics must meet three criteria: 1) they must accurately capture temporal locality; 2) they must be independent of trace artifacts such as trace length; and 3) they must not involve manual procedures or model-based assumptions. We describe two metrics meeting these criteria that each capture a different kind of temporal locality in reference streams. The popularity component of temporal locality is captured by entropy, while the correlation component is captured by interreference coefficient of variation. We argue that these metrics are more natural and more useful than previously proposed metrics for temporal locality. We use this framework to analyze a diverse set of Web reference traces. We find that this framework can shed light on how and why locality properties vary across different locations in the Web topology. For example, we find that filtering and aggregation have opposing effects on the popularity component of the temporal locality, which helps to explain why multilevel caching can be effective in the Web. Furthermore, we find that all transformations tend to diminish the correlation component of temporal locality, which has implications for the utility of different cache replacement policies at different points in the Web.
Resumo:
We consider the problem of performing topological optimizations of distributed hash tables. Such hash tables include Chord and Tapestry and are a popular building block for distributed applications. Optimizing topologies over one dimensional hash spaces is particularly difficult as the higher dimensionality of the underlying network makes close fits unlikely. Instead, current schemes are limited to heuristically performing local optimizations finding the best of small random set of peers. We propose a new class of topology optimizations based on the existence of clusters of close overlay members within the underlying network. By constructing additional overlays for each cluster, a significant portion of the search procedure can be performed within the local cluster with a corresponding reduction in the search time. Finally, we discuss the effects of these additional overlays on spatial locality and other load balancing scheme.
Resumo:
Interdomain routing on the Internet is performed using route preference policies specified independently, and arbitrarily by each Autonomous System in the network. These policies are used in the border gateway protocol (BGP) by each AS when selecting next-hop choices for routes to each destination. Conflicts between policies used by different ASs can lead to routing instabilities that, potentially, cannot be resolved no matter how long BGP is run. The Stable Paths Problem (SPP) is an abstract graph theoretic model of the problem of selecting nexthop routes for a destination. A stable solution to the problem is a set of next-hop choices, one for each AS, that is compatible with the policies of each AS. In a stable solution each AS has selected its best next-hop given that the next-hop choices of all neighbors are fixed. BGP can be viewed as a distributed algorithm for solving SPP. In this report we consider the stable paths problem, as well as a family of restricted variants of the stable paths problem, which we call F stable paths problems. We show that two very simple variants of the stable paths problem are also NP-complete. In addition we show that for networks with a DAG topology, there is an efficient centralized algorithm to solve the stable paths problem, and that BGP always efficiently converges to a stable solution on such networks.
Resumo:
We present an online distributed algorithm, the Causation Logging Algorithm (CLA), in which Autonomous Systems (ASes) in the Internet individually report route oscillations/flaps they experience to a central Internet Routing Registry (IRR). The IRR aggregates these reports and may observe what we call causation chains where each node on the chain caused a route flap at the next node along the chain. A chain may also have a causation cycle. The type of an observed causation chain/cycle allows the IRR to infer the underlying policy routing configuration (i.e., the system of economic relationships and constraints on route/path preferences). Our algorithm is based on a formal policy routing model that captures the propagation dynamics of route flaps under arbitrary changes in topology or path preferences. We derive invariant properties of causation chains/cycles for ASes which conform to economic relationships based on the popular Gao-Rexford model. The Gao-Rexford model is known to be safe in the sense that the system always converges to a stable set of paths under static conditions. Our CLA algorithm recovers the type/property of an observed causation chain of an underlying system and determines whether it conforms to the safe economic Gao-Rexford model. Causes for nonconformity can be diagnosed by comparing the properties of the causation chains with those predicted from different variants of the Gao-Rexford model.
Resumo:
Recent advances in processor speeds, mobile communications and battery life have enabled computers to evolve from completely wired to completely mobile. In the most extreme case, all nodes are mobile and communication takes place at available opportunities – using both traditional communication infrastructure as well as the mobility of intermediate nodes. These are mobile opportunistic networks. Data communication in such networks is a difficult problem, because of the dynamic underlying topology, the scarcity of network resources and the lack of global information. Establishing end-to-end routes in such networks is usually not feasible. Instead a store-and-carry forwarding paradigm is better suited for such networks. This dissertation describes and analyzes algorithms for forwarding of messages in such networks. In order to design effective forwarding algorithms for mobile opportunistic networks, we start by first building an understanding of the set of all paths between nodes, which represent the available opportunities for any forwarding algorithm. Relying on real measurements, we enumerate paths between nodes and uncover what we refer to as the path explosion effect. The term path explosion refers to the fact that the number of paths between a randomly selected pair of nodes increases exponentially with time. We draw from the theory of epidemics to model and explain the path explosion effect. This is the first contribution of the thesis, and is a key observation that underlies subsequent results. Our second contribution is the study of forwarding algorithms. For this, we rely on trace driven simulations of different algorithms that span a range of design dimensions. We compare the performance (success rate and average delay) of these algorithms. We make the surprising observation that most algorithms we consider have roughly similar performance. We explain this result in light of the path explosion phenomenon. While the performance of most algorithms we studied was roughly the same, these algorithms differed in terms of cost. This prompted us to focus on designing algorithms with the explicit intent of reducing costs. For this, we cast the problem of forwarding as an optimal stopping problem. Our third main contribution is the design of strategies based on optimal stopping principles which we refer to as Delegation schemes. Our analysis shows that using a delegation scheme reduces cost over naive forwarding by a factor of O(√N), where N is the number of nodes in the network. We further validate this result on real traces, where the cost reduction observed is even greater. Our results so far include a key assumption, which is unbounded buffers on nodes. Next, we relax this assumption, so that the problem shifts to one of prioritization of messages for transmission and dropping. Our fourth contribution is the study of message prioritization schemes, combined with forwarding. Our main result is that one achieves higher performance by assigning higher priorities to young messages in the network. We again interpret this result in light of the path explosion effect.