Sampling Biases in IP Topology Measurements


Autoria(s): Lakhina, Anukool; Byers, John W.; Crovella, Mark; Xie, Peng
Data(s)

20/10/2011

20/10/2011

15/07/2002

Resumo

Considerable attention has been focused on the properties of graphs derived from Internet measurements. Router-level topologies collected via traceroute studies have led some authors to conclude that the router graph of the Internet is a scale-free graph, or more generally a power-law random graph. In such a graph, the degree distribution of nodes follows a distribution with a power-law tail. In this paper we argue that the evidence to date for this conclusion is at best insufficient. We show that graphs appearing to have power-law degree distributions can arise surprisingly easily, when sampling graphs whose true degree distribution is not at all like a power-law. For example, given a classical Erdös-Rényi sparse, random graph, the subgraph formed by a collection of shortest paths from a small set of random sources to a larger set of random destinations can easily appear to show a degree distribution remarkably like a power-law. We explore the reasons for how this effect arises, and show that in such a setting, edges are sampled in a highly biased manner. This insight allows us to distinguish measurements taken from the Erdös-Rényi graphs from those taken from power-law random graphs. When we apply this distinction to a number of well-known datasets, we find that the evidence for sampling bias in these datasets is strong.

National Science Foundation (ANI-9986397, ANI-0095988, ANI-0093296)

Identificador

http://hdl.handle.net/2144/1667

Idioma(s)

en_US

Publicador

Boston University Computer Science Department

Relação

BUCS Technical Reports;BUCS-TR-2002-021

Tipo

Technical Report