COMPSCI4077 Health Data Analysis

Web Science (H) COMPSCI4077

Health Data Analysis

Date: 15th March 2024

Individual Assessment: Health Data Analysis

The coursework is marked out of 100 and is weighted at 20% of the final mark.

Coursework is due on Friday, March 15, 2024, at 4:30 PM (subject to LTC approval).

All submissions are through Moodle. Penalties will be applied for late submission.

Links to data from the SuicideWatch subreddit are given. Crawl the data and create a network-based analysis.

Your task is to develop a network analysis of this data set. I recommend that you use a Jupyter notebook and submit your code and outputs as an archive.

(i) Use the data to create graphs and a visualisation. [25]

In the report, explain how you organised the data.

• Data preparation & statistics (10 marks)

Discuss the data pre-processing and preparation steps, and justify them.

Discuss the data statistics: summarise the data here, as this description may be useful for later analysis.

• Create a global interaction graph. Explain the start nodes and end nodes (5 marks)

• Visualisation of the network (10 marks)

Gephi can be used to build the visualisation. The marks breakdown for the visualisation is:

Visualisation – 5 marks

Description and interpretation – 5 marks
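As a minimal sketch of how a global interaction graph might be assembled, the Python below assumes each crawled comment has already been reduced to a hypothetical (author, parent_author) pair, with an edge running from the replying user (start node) to the user being replied to (end node). The record layout, the function names and the decision to drop deleted accounts and self-replies are assumptions for illustration, not requirements of the coursework. The edge table uses Source, Target and Weight column headers, which Gephi's spreadsheet import recognises.

```python
import csv
import io
from collections import Counter


def build_interaction_edges(records):
    """Count directed reply edges (replying user -> user replied to)."""
    edges = Counter()
    for author, parent_author in records:
        # Assumption: skip missing or deleted accounts and self-replies.
        if not author or not parent_author:
            continue
        if "[deleted]" in (author, parent_author) or author == parent_author:
            continue
        edges[(author, parent_author)] += 1
    return edges


def edges_to_csv(edges):
    """Serialise weighted edges as CSV text with Source,Target,Weight headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical sample records: (comment author, author being replied to).
sample_records = [
    ("user_b", "user_a"),
    ("user_c", "user_a"),
    ("user_b", "user_a"),
    ("user_a", "user_b"),
    ("[deleted]", "user_a"),
    ("user_c", "user_c"),
]

edges = build_interaction_edges(sample_records)
print(edges_to_csv(edges))
```

Repeated replies between the same pair of users are collapsed into a single weighted edge here; keeping them as parallel edges is an equally valid design choice, provided the report justifies it.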

(ii) Carry out a graph/network analysis and identify the important properties of the graph. [45]

Which analysis? Base this on Lecture 6 (week starting 29/1/2023).

Identify the methodology you will use to analyse the data from a network perspective.

For each metric, provide a description, pseudo-code and an interpretation (15 marks)

Minimum of 3 metrics/approaches
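To illustrate the kind of description a metric might be given, the sketch below computes three common measures on a small directed graph: in-degree centrality, density and reciprocity. These three are examples chosen for illustration only and are not necessarily the metrics covered in the lecture. The function name and the sample edge list are hypothetical, and the code assumes a simple directed graph with no self-loops.

```python
from collections import defaultdict


def graph_metrics(edges):
    """Return (in-degree centrality per node, density, reciprocity).

    Assumes a simple directed graph given as (source, target) pairs,
    with no self-loops; duplicate pairs are ignored.
    """
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    n, m = len(nodes), len(edge_set)

    in_degree = defaultdict(int)
    for _, target in edge_set:
        in_degree[target] += 1

    # In-degree centrality: share of the other n - 1 nodes that point at a node.
    denominator = n - 1 if n > 1 else 1
    in_centrality = {node: in_degree[node] / denominator for node in nodes}

    # Density: observed edges over the n * (n - 1) possible directed edges.
    density = m / (n * (n - 1)) if n > 1 else 0.0

    # Reciprocity: fraction of edges whose reverse edge also exists.
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    reciprocity = mutual / m if m else 0.0

    return in_centrality, density, reciprocity


# Hypothetical example: three users reply to "a", and "a" replies to "b".
example_edges = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "a")]
in_centrality, density, reciprocity = graph_metrics(example_edges)
print(in_centrality, density, reciprocity)
```

In this example, node "a" receives replies from every other node, so its in-degree centrality is 1.0, while half of the edges are reciprocated. An interpretation in the report would relate such values back to how users interact in the data.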

(iii) [Open creativity tasks]

Students are encouraged to explore further and describe the characteristics of the data and the graph. [20]

(iv) Report Quality – 10 marks

a. Structuring and formatting - 3

b. Articulation of ideas - 3

c. Creativity in addressing the tasks - 4 [10]
