COMP9312 - 25T2 - Data Analytics for Graphs

Hello, if you have any need, please feel free to consult us, this is my wechat: wx91due 

COMP9312 - 25T2 - Data Analytics for Graphs

Project

Fix & Update (updates for the project are listed here):


Summary

Required Files Q1.zip and Q2.zip
Submission Submit your files on Moodle (Only the last submission will be used)
Deadline 9pm Friday 25 July (Sydney Time)
Marks 25 marks (25% toward your total mark for this course)
Late penalty

5% of the max accessment mark will be deducted for each additional day (24hr) after the specified submission time and date. No submission is accepted 5 days (120hr) after the deadline.

Question Entries

Q1 (10 marks)

Design a solution to compute the top-k shortest simple paths between two vertices in a large directed weighted graph.

Figure 1: Example Graph

Problem Statement

In this question, you need to design an online algorithm to return the top-k shortest simple (loopless) paths between a given pair of vertices s and t in a large directed weighted graph. Your implementation should return k distinct paths ordered by increasing total path weight.

In other words, any simple path from s to t that is not included in the top-k result must either have a greater total weight than the k-th returned path, or have the same weight but a later lexicographical order.

Background & Marking Criteria

You are expected to implement an online algorithm that returns the top-k shortest simple paths between two vertices in a directed weighted graph. Each returned path must be:

  • Simple — acyclic, with no repeated vertices
  • Distinct — no duplicate paths

You should aim to design an efficient solution that works on moderately large graphs. In addition to the implementation, a correct time complexity analysis is compulsory and will be marked.

Marks will be awarded based on:

  • Correctness (autotests): number of test cases passed
  • Theoretical analysis: clarity and correctness of your algorithm description and time complexity

If your code is incomplete or fails some cases, you may still receive partial marks for a clear and well-reasoned explanation. Include your explanation in Q1.pdf, covering your algorithm design, time complexity, and any assumptions or limitations.

Example

Consider the graph shown in Figure 1. Given source vertex s and target vertex t, the top-k shortest simple paths are:

  • k = 1: [[s, a, b, c, d, e, t]]
  • k = 2: [[s, a, b, c, d, e, t], [s, g, t]]
  • k = 3: [[s, a, b, c, d, e, t], [s, g, t], [s, h, t]]
  • k = 4: [[s, a, b, c, d, e, t], [s, g, t], [s, h, t]]

Doing this Project

Open the code template file Q1.ipynb and make a copy in your own Google Drive. You need to implement a class named KShortestPathsQuery with a function query(s, t, k).

We will extract your KShortestPathsQuery class (along with any necessary import statements) and test it independently. Your implementation must run correctly without modification in the Colab environment.

################################################################################ # You can import any Python Standard Library modules. ################################################################################ class KShortestPathsQuery(object): def __init__(self): pass @staticmethod def query(G, s, t, k): ################################################################################ # Input: # s: source vertex # t: target vertex # k: number of shortest simple paths to return # # Output: # A list of top-k shortest simple paths, each as a list of vertex IDs from s to t. # # Note: # Each path must be simple (no repeated vertices) and distinct. # Paths should be returned in ascending order of total weight. # If multiple paths have the same weight, break ties by lexicographical order. # # Please analyze the time complexity of your solution. ################################################################################ return []

Required Files

Compress all related files for Q1 (Q1.ipynb and Q1.pdf, which contains your explanation) into Q1.zip, and submit it on Moodle.

Other Marking Criteria

  1. A correct and efficient implementation of an online algorithm with proper documentation will receive full marks.
  2. Partial marks will be awarded for implementations that pass only a subset of the hidden test cases.
  3. Submissions must return simple paths. Paths containing cycles or repeated vertices will be considered incorrect.
  4. You must provide a clear explanation of your algorithm and its time complexity.

Notes

  1. Your algorithm should handle edge cases such as unreachable targets and cases where fewer than k valid paths exist.
  2. Do not change the input/output format of the provided class or method.
  3. We will test your submission on larger graphs than those shown in the examples.
  4. If multiple simple paths have the same total distance, break ties by lexicographical order of vertex IDs in the path. For example, prefer path 1→2→5 over 1→3→5.
  5. You can not import any external libraries or modules other than the Python Standard Library.

Q2 (15 marks)

�-core based structural diversity

Figure 1: Example Graph

Task Overview

Structural diversity is a crucial aspect of graph analysis, providing insights into the connectivity and importance of nodes within a network. Traditional structural diversity measures often focus on the number of components a node connects to, but they can be limited in their ability to capture the full complexity of graph structures. To enhance this analysis, we introduce �-core based structural diversity, which combines the concept of �-cores with structural diversity to provide a more nuanced understanding of node importance. In this project, you will implement an online efficient algorithm for �-core-based structural diversity, allowing for a deeper exploration of graph structures and their implications.

Problem Definition

Definition (Graph)

A simple, undirected, unweighted graph is a pair �=(�,�) where � is the vertex set and �⊆{{�,�}∣�,�∈�,�≠�} is the edge set. We use �[�′] to denote the subgraph of � induced by a subset �′⊆�.

Definition (�-core)

For an integer �≥0, the �-core of a graph �, denoted �(�), is the maximal connected induced subgraph �=(��,��) that satisfiesdeg�⁡(�)≥�for every �∈��.

Definition (Neighbour-induced subgraph)

For a vertex �∈�, its set of neighbours is �(�)={�∈�∣{�,�}∈�}. The subgraph induced by �(�) is called the neighbour-induced subgraph of � and is denoted nbr�=�[�(�)].

Definition (�-core-based Structural Diversity)

Let ��(�) denote the �-cores of a graph �. The �-core-based structural diversity of a vertex � is the number of �-cores in its neighbour-induced subgraph, denoted as ��(�), which is defined as��(�)=|��(nbr�)|.

Problem Statement

Given a graph �=(�,�) and an integer �≥0, compute ��(�) for every �∈�. The output is an array of length |�| storing, for each vertex, its �-core-based structural diversity value. You can assume that the vertex IDs in the graph are consecutive integers starting from 0 to |�|−1.

Example

Consider the graph in Figure 1. For vertex 5 we have the neighbour set�(5)={0,2,3,4,6,7,8,9}.

  • �=0
    • �0(5)=4
    • �0(nbr5)=[{2},{8},{0,3},{4,6,7,9}]
  • �=1
    • �1(5)=2
    • �1(nbr5)=[{0,3},{4,6,7,9}]
  • �=2
    • �2(5)=1
    • �2(nbr5)=[{4,6,7,9}]
  • �=4
    • �4(5)=0 (no 4-core)

Doing this Project

Open the code template file Q2.ipynb and make a copy in your own Google Drive. You need to implement a class named kCoreBaseStructuralDiversity with a function process(G, k).

We will extract your kCoreBaseStructuralDiversity class (along with any necessary import statements) and test it independently. Your implementation must run correctly without modification in the Colab environment.

class kCoreBaseStructuralDiversity(object): def __init__(self): pass @staticmethod def process(G, k): """ Parameters ---------- G : UndirectedUnweightedGraph k : int Returns ------- List[int] # τ_k(v) for all v """ # TODO τ = [0] * G.vertex_num return τ

Required Files

Compress all related files for Q2 (Q2.ipynb and Q2.pdf, which contains your explanation) into Q2.zip, and submit it on Moodle.

Other Marking Criteria

  1. A correct and efficient implementation of an online algorithm with proper documentation will receive full marks.
  2. Partial marks will be awarded for implementations that pass only a subset of the hidden test cases.
  3. Submissions must return Structural diversity values for all vertices in the graph.
  4. You must provide a clear explanation of your algorithm and its time complexity.

Notes

  1. Do not change the input/output format of the provided class or method.
  2. We will test your submission on larger graphs than those shown in the examples.
  3. Make sure to handle edge cases, such as graphs with no edges or isolated vertices.
  4. You can not import any external libraries or modules other than the Python Standard Library.

Submission Guide

1. Download the google colab notebook. (Go to 'File' on the menu, select 'Download', and click 'Download .ipynb')

2. Rename your notebook as Q1_zid.ipynb/Q2_zid.ipynb

3. Compress all the related files into Q1.zip/Q2.zip

4. Upload Q1.zip and Q2.zip to Moodle

发表评论

电子邮件地址不会被公开。 必填项已用*标注