COMP4880/8880, Semester 1, 2024
Assignment 1
Problem 1 (5 points). Non-random graph, Cayley Tree (Barabasi 3.11.4) A Cayley tree is a symmetric tree, constructed starting from a central node of degree k. Each node at distance d from the central node has degree k, until we reach the nodes at distance P that have degree one and are called leaves (see figure for a Cayley tree with k = 3 and P = 5).
Figure 1: A Caley tree with k = 3 and P = 5.
1. How many nodes are reachable in t steps from the central node?
2. What is the degree distribution of the network?
3. Calculate the diameter dmax of a (k, P) Cayley tree.
4. Express the diameter dmax in terms of the total number of nodes N.
5. Does the (k, P) Cayley tree display the small-world property, why or why not?
Problem 2 (3 points). Small component in G(N, p). Consider Erd¨os-R´enyi random graph G(N, p) in a supercritical state, i.e. Np > 1 with large enough N. Given a small component consisting of s nodes that is not part of the giant component, show that it is likely to be a tree. Noting that a tree consisting of s nodes has exactly s − 1 edges.
1. Compute the expected number of edges in addition to the s − 1 edges.
2. Denote the expected node degree < k >= (N − 1)p. Argue that there is at least some nodes with less than < k > connections in this small component. Consider adding one new edge to such a node, what is the probability that such a new edge is within the small component, or to the giant component?
Problem 3 (5 points). Consider a set of nodes, which represent individuals in a social network, that are placed on lattice points in an n×n square N = {(i, j) : i ∈ {1, . . . , n}, j ∈ {1, . . . , n}}. We define the lattice distance between two nodes (i, j) and (k, l) to be the number of “lattice steps” separating them: d((i, j),(k, l)) = |k − i| + |l − j|. Suppose every node has a directed edge to every other node within lattice distance 1 (local-contacts). For each node u, we also construct one directed edge to another node as its long-range contact: The probability that there is a directed edge from u to another node v is proportional to [d(u, v)]−2 . For two arbitrary nodes s and t, we want to deliver a message from s to t through the contacts. We assume the message holder in a given step knows of
• The set of local contacts among all nodes (i.e. the underlying grid structure);
• The location, on the lattice, of the target t.
• Its own long-range contact.
Prove (by contruction) that there is a decentralised algorithm to deliver the message within O((log(n))2 ) steps. Similar to the 1-dimensional case in EK book Chapter 20, one can use the following three-step breakdown to complete the proof.
1. Fix an arbitrary node u, show that d(u, v) −2 ≤ 4 ln(6n). (Hint: Convert the LHS into a summation over the lattice distances of contacts - what should be the longest distance? For a given distance j, show that there are at most 4j nodes that are j steps away from u.)
2. The decentralized algorithm is defined as follows: in each step, the current message holder u chooses a contact that is as close to the target t as possible, in the sense of lattice distance. For j > 0, we say that the execution of the algorithm is in phase j when the lattice distance from the current node to t is greater than 2j and at most 2j+1. Let the random variable Xj denote the total number of steps spent in phase j (i.e. The total number of steps to enter phase j − 1). Show that E[Xj ] ≤ 128 ln(6n). (Hint: Fix some j, derive a lower bound of the number of nodes with lattice distance at most 2j to t and an upper bound of the lattice distance between the current node and t. Then, what is the lower bound of the probability that we can enter into the next phase through a long contact? To get the upper bound of the expectation, thinking about geometric distribution might be helpful.)
3. Conclude the proof: Show that the algorithm could deliver the message within O((log2 (n))2 ) steps in expectation.
Problem 4 (7 points). Real-world network and small world Examine the following 5 networks. Treat them as undirected and unweighted network - if they are not, convert by assuming an edge in either direction connects two nodes, and ignore all edge weights.
(A) Neural Network of a C. elegans worm (http://opsahl.co.uk/tnet/datasets/celegans_n306.txt) Format of Data: Origin node (Neuron), destination node (Neuron), weight of link.
(B) The largest weakly connected component of the email network of a large European research institution (https://snap.stanford.edu/data/email-Eu-core.html)
(C) An Erd¨os-R´enyi network with the same number of nodes and average degree as network (A).
(D) An Erd¨os-R´enyi network with the same number of nodes and average degree as network (B).
(E) A Watts-Stogatz network with the same number of nodes and average degree as network (A).
1. Present a table containing the following 11 network measurements (rows) for the 5 networks above (columns). The number of nodes and edges, average degree, average clustering coefficient of all nodes, fraction of possible triangles that are closed (tran-sitivity), maximum and minimum degree, network diameter and radius, maximum closeness centrality, maximum betweenness centrality.
2. Present one strategy to make a simulated networks behave more like a real network – for this assignment, the target is network A the C Elegans neural network. Outline the reason of your proposed changes, and show whether it has worked using the 11 network metrics (and possibly others). Also show (via network metrics) whether the same strategy made the network resembling the email network (B) or not.