Sampling restricted access networks

28/01/2018 09:30
Some topics concerning networks of which structural information is not readily available but has to be collected by performing random walks or searches on them, starting at randomly chosen entry points.
The paper discusses some topics concerning the sampling of restricted access networks (RANs). These are networks that are usually too big to hold all structural information neatly accessible, say in the form of node and arc lists. Often they are very dynamic as well, so that it is impossible to obtain instant snapshots: they change while they are being explored. The internet is a key example of such networks, or social networks like Facebook and LinkedIn that are part of it. Structural information about such networks can only be obtained by sampling them. As there is no list of nodes and arcs available, such information has to be collected, for instance by random searches of the network. Such searches can be carried out in many ways.

It is interesting to apply them to the same RAN and compare the information collected by each method. For each node it is relatively easy to determine the outgoing arcs. But it is difficult to determine the arcs pointing to it. This requires information about the entire network, or a great part of it. An interesting question concerns the size of the sample in order to be able to say something about the structure of the network with sufficient precision. The paper only is a first exploration of the area. It is partly based on existing literature and partly on ideas from the author. It may very well be that these ideas have been described in the literature before, however unbeknownst to the present author.