GT CS 7280: Network Science Assignment 4: Modeling Epidemics
Fall 2024
Overview
Submission
With Anaconda, you can do this by running: conda list -e > requirements.txt
Ensure all graphs and plots are properly labeled with unit labels and titles for x & y axes. Producing readable, interpretable graphics is part of the grade as it indicates understanding of the content – there may be point deductions if plots are not properly labeled.
Getting Started
You can install the library using: pip install EoN
This homework, especially parts 2 and 3, might take several minutes to run. Be aware of this and plan to complete it accordingly.
**IMPORTANT** As with prior assignments, the structure has been designed to have several subsections in each part. The first few subsections just define useful functions, and the final subsection of each part is where the functions are called and the analysis is done. If you are confused about how a function is meant to be used, check the final subsection in each part to see how it is being called. This should clear up a lot of potential points of confusion early on.
Part 1: Outbreak Modeling [40 Points]
Construct an undirected graph from that text file using functions in the NetworkX module. For the purpose of this assignment, we consider only the unweighted graph (i.e., you can ignore the third column).
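A minimal sketch of the graph construction, assuming a whitespace-separated edge list with two integer node IDs followed by a third column that is skipped. The helper name build_unweighted_graph and the sample lines are illustrative and not part of the assignment; for a file on disk, an open file handle could be passed in place of the list.

```python
import networkx as nx

def build_unweighted_graph(lines):
    """Build an undirected, unweighted graph from 'u v w' edge-list lines."""
    G = nx.Graph()
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # skip blank lines and comment lines
        u, v = int(parts[0]), int(parts[1])  # the third column is ignored
        G.add_edge(u, v)  # repeated edges collapse into one in nx.Graph
    return G

# Hypothetical sample data; the real file has its own node IDs.
sample = ["1 2 0.5", "2 3 1.0", "1 2 0.7", ""]
G = build_unweighted_graph(sample)
print(G.number_of_nodes(), G.number_of_edges())
```

Because nx.Graph stores at most one edge per node pair, duplicate rows in the file do not create parallel edges.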
1. [10 points] Suppose there is a pathogen with transmission rate of 0.01 and recovery rate of 0.5. Suppose that an outbreak started at node 325 (“patient-0”). Complete the simulate_outbreak function to simulate an outbreak under the SIS model using the provided parameters. The function should return a list of length n_iter containing simulation runs where n_iter is an argument to the function.
Important: When running your simulations, you will want to discard the outbreaks that died out stochastically. To do this, check whether the number of infected nodes at the last time step is 0; if it is, discard that run and replace it with a new simulation that does not die out. In total, you should have n_iter simulations that did not die out.
Additionally, complete the plot_outbreaks function to visualize the results of the simulate_outbreak function. Show the results for each of the simulations on a single plot and break each simulation into 2 lines, one for the number of infected and the other for number of susceptible over time. Make sure to properly label these lines and to create a legend identifying which lines are which.
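The discard-and-replace rule above can be sketched as a retry loop. This is only an illustration: collect_surviving_runs, run_once, and max_attempts are hypothetical names, and the simulator here is a stand-in. In the assignment, run_once would presumably wrap an EoN SIS call such as EoN.fast_SIS(G, tau, gamma, initial_infecteds=..., tmax=...), which returns the time, susceptible, and infected arrays; note that EoN uses the name tau for the transmission rate.

```python
import random

def collect_surviving_runs(run_once, n_iter, max_attempts=10_000):
    """Call run_once() until n_iter runs end with at least one infected node.

    run_once is assumed to return (t, S, I) sequences for one simulation.
    """
    runs = []
    attempts = 0
    while len(runs) < n_iter:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("too many outbreaks died out; check parameters")
        t, S, I = run_once()
        if I[-1] == 0:
            continue  # outbreak died out stochastically: discard and retry
        runs.append((t, S, I))
    return runs

# Stand-in simulator for illustration only: dies out about half the time.
def fake_run():
    survived = random.random() < 0.5
    I = [1, 3, 7] if survived else [1, 1, 0]
    S = [10 - i for i in I]
    return [0.0, 1.0, 2.0], S, I

random.seed(0)
runs = collect_surviving_runs(fake_run, n_iter=5)
print(len(runs))
```

The attempt cap is a design choice that keeps the loop from running forever when the parameters make survival very unlikely.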
Hint: scipy.optimize.curve_fit is a helpful function to fit the exponent to a curve.
Additionally, complete the plot_curve_fit function to plot both the actual number of infected and the theoretical curve given a value of τ (for values of Infected < 100). This function should also compute the r-squared between the two curves and print the value for τ and r-squared in the title of the plot. Again, make sure to label both curves and create a legend identifying which is which.
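A hedged sketch of the fitting step, assuming the early growth follows I(t) = I0 * exp(t / τ) with I0 = 1 (a single patient-0). The function name fit_tau, the starting guess, and the synthetic data are illustrative assumptions; the r-squared here is the usual 1 - SS_res / SS_tot computed over the points with fewer than 100 infected.

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_tau(t, infected, i0=1.0, cutoff=100, tau_guess=1.0):
    """Fit I(t) = i0 * exp(t / tau) to the points where infected < cutoff."""
    t = np.asarray(t, dtype=float)
    infected = np.asarray(infected, dtype=float)
    mask = infected < cutoff  # keep only the early growth phase

    def model(tt, tau):
        return i0 * np.exp(tt / tau)

    (tau,), _ = curve_fit(model, t[mask], infected[mask], p0=[tau_guess])

    # r-squared between the observed counts and the fitted curve
    predicted = model(t[mask], tau)
    ss_res = np.sum((infected[mask] - predicted) ** 2)
    ss_tot = np.sum((infected[mask] - infected[mask].mean()) ** 2)
    return tau, 1.0 - ss_res / ss_tot

# Synthetic, noise-free data generated with tau = 2.0 (illustration only).
t = np.linspace(0.0, 8.0, 40)
infected = np.exp(t / 2.0)
tau_hat, r2 = fit_tau(t, infected)
print(f"tau = {tau_hat:.3f}, r-squared = {r2:.4f}")
```

The returned pair can be formatted directly into a plot title string.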
○ The random distribution shown in the Lesson 9 Canvas lecture “SIS Model”.
○ The arbitrary distribution shown in the Lesson 9 Canvas lecture “Summary of SI, SIS, SIR Models with Arbitrary Degree Distribution”.
○ The arbitrary distribution from the textbook found in Ch. 10, Equation 10.21.
Additionally, complete the compare_taus function to show a boxplot of the distribution of sample τ values calculated from simulation runs (see 1.5 to understand where these come from). Visualize the theoretical calculations as dots on the box plot. Again, label each of these dots with the calculation used to generate them.
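The lecture formulas are not reproduced in this handout, so the sketch below only shows two forms that are commonly written for the SIS characteristic time: a homogeneous form, τ = 1 / (β<k> - μ), and a degree-block form, τ = <k> / (β<k^2> - μ<k>). Both are assumptions to check against the lecture slides and Equation 10.21 before use; the function names and the degree list are hypothetical.

```python
def degree_moments(degrees):
    """Return the first and second moments <k> and <k^2> of a degree list."""
    n = len(degrees)
    k1 = sum(degrees) / n
    k2 = sum(k * k for k in degrees) / n
    return k1, k2

def tau_random(beta, mu, k1):
    # Assumed homogeneous form: tau = 1 / (beta * <k> - mu)
    return 1.0 / (beta * k1 - mu)

def tau_arbitrary(beta, mu, k1, k2):
    # Assumed degree-block form: tau = <k> / (beta * <k^2> - mu * <k>)
    return k1 / (beta * k2 - mu * k1)

# Hypothetical degree list and rates, chosen only to make the numbers easy.
degrees = [2, 4, 6, 8]
k1, k2 = degree_moments(degrees)
print(k1, k2)
print(tau_random(0.2, 0.5, k1), tau_arbitrary(0.2, 0.5, k1, k2))
```

With a real graph, the degree list could come from [d for _, d in G.degree()].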
4. [10 points] Complete the calculate_theoretical_endemic_size function to compute the size of the population that remains infected at the endemic state.
Then, complete the compare_endemic_sizes function to plot the distribution of endemic sizes across several simulation runs as a boxplot, and compare it with the theoretical calculation for endemic size as a single dot, similarly to the previous subsection.
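As a rough illustration of the theoretical side, the sketch below assumes the homogeneous SIS result i(∞) = 1 - μ / (β<k>) for the infected fraction and multiplies it by the number of nodes. The function name and the example numbers are hypothetical, and the formula should be checked against the lecture derivation.

```python
def theoretical_endemic_size(n_nodes, beta, mu, mean_degree):
    """Expected number of infected nodes at the endemic state.

    Assumes the homogeneous SIS form i_inf = 1 - mu / (beta * <k>),
    clipped at zero when beta is below the epidemic threshold.
    """
    fraction = 1.0 - mu / (beta * mean_degree)
    return n_nodes * max(fraction, 0.0)

# Hypothetical network size and mean degree, for illustration only.
print(theoretical_endemic_size(1000, beta=0.01, mu=0.5, mean_degree=100.0))
print(theoretical_endemic_size(1000, beta=0.001, mu=0.5, mean_degree=100.0))
```

The clipping makes the function return zero below the threshold instead of a negative population size.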
5. [5 points] Run the code provided in cell 1.5 and look at the resulting figures. How good a fit is the exponential curve in section 1.2? Explain how the theoretical estimates in 1.3 & 1.4 compare to the empirical distributions and indicate which you would consider a reasonable fit for the data.
Part 2: Transmission Rate [25 Points]
1. [10 points] Complete the simulate_beta_sweep function to vary the transmission rate over a range of beta values between beta_min and beta_max, using beta_samples points. For each value of the transmission rate, run 5 simulations to reduce the effect of outliers.
You can reuse your simulate_outbreak function from Part 1 in this function.
Finally, complete the plot_beta_tau_curves function to show the exponential curve given by the τ values for each beta value. The x-axis is time and the y-axis is the number of infected people. Use a log scale on the y-axis and make sure that each line has its own color. This function should be similar to the plot_curve_fit function in part 1.2, but you will be showing a series of exponentials instead of comparing an experimental curve with a theoretical one.
2. [10 points] Complete the extract_average_endemic_size function to return a list of the average endemic size calculated over the five simulation runs for EACH beta value.
Next, complete the calculate_theoretical_endemic function to find the theoretical minimum values of the transmission rate beta for an epidemic to occur. Calculate this minimum based on the equations derived in lecture for both the random distribution and the arbitrary distribution. Also, calculate the theoretical endemic size for each value of beta under the assumption of a random distribution.
Finally, complete the compare_endemic_sizes_vs_beta function to plot the average endemic sizes and theoretical endemic sizes as a curve vs beta. Additionally, plot the minimum values for beta to start an epidemic as vertical lines. Make sure to label each line and provide a legend.
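A sketch of the threshold and endemic-size calculations, under stated assumptions: the minimum beta is taken as μ / <k> for the random case and μ<k> / <k^2> for the arbitrary case, and the endemic fraction as 1 - μ / (β<k>). These forms, the function names, and the numbers are assumptions for illustration; the versions derived in lecture take precedence if they differ.

```python
def beta_min_random(mu, k1):
    # Assumed homogeneous threshold: beta > mu / <k>
    return mu / k1

def beta_min_arbitrary(mu, k1, k2):
    # Assumed degree-block threshold: beta > mu * <k> / <k^2>
    return mu * k1 / k2

def endemic_sizes(betas, mu, k1, n_nodes):
    """Endemic size per beta, assuming i_inf = 1 - mu / (beta * <k>)."""
    return [n_nodes * max(1.0 - mu / (b * k1), 0.0) for b in betas]

# Hypothetical moments (<k> = 5, <k^2> = 30), recovery rate, and network size.
k1, k2, mu, n_nodes = 5.0, 30.0, 0.5, 100
betas = [0.05, 0.1, 0.2]
print(beta_min_random(mu, k1), beta_min_arbitrary(mu, k1, k2))
print(endemic_sizes(betas, mu, k1, n_nodes))
```

Under these assumptions the theoretical endemic size reaches zero exactly at the random-case threshold, which is a quick consistency check on the two formulas.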
3. [5 points] Run the code provided in cell 2.3 and look at the resulting figures. How similar are the theoretical and experimental endemic sizes? How well do the minimum beta values serve as a lower bound for the start of an epidemic?
Part 3: Patient-0 Centrality & τ [30 Points]
Additionally, complete the compute_centrality function to calculate the degree centrality, closeness centrality (with wf_improved=False), betweenness centrality, and eigenvector centrality of the graph. Remember to use the unweighted centrality metrics. Return the centralities for each node where the simulation was kept in the previous function.
Hint: We provide “nodes” as an argument which is meant to represent the second output of the previous function. You can use this to filter for centralities of only these nodes before you return them. Check the cell for 3.3 to see exactly how this is used.
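One possible shape for the centrality step, using the NetworkX functions for the four metrics and filtering to a given node list. The name compute_centralities, the dictionary return format, and the small test graph are assumptions for illustration; the notebook's expected return format may differ.

```python
import networkx as nx

def compute_centralities(G, nodes):
    """Return unweighted centralities restricted to `nodes`, in that order."""
    metrics = {
        "degree": nx.degree_centrality(G),
        "closeness": nx.closeness_centrality(G, wf_improved=False),
        "betweenness": nx.betweenness_centrality(G),  # weight=None by default
        "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
    }
    # Keep only the nodes whose simulations were retained.
    return {name: [values[n] for n in nodes] for name, values in metrics.items()}

# Hypothetical graph: a triangle (0, 1, 2) with a tail node 3 attached to 2.
G = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])
cent = compute_centralities(G, nodes=[2, 3])
print(cent["degree"], cent["closeness"], cent["betweenness"])
```

Computing each metric once on the full graph and filtering afterwards avoids recomputing centralities per node.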
Additionally, complete the plot_centrality_vs_tau function to create a scatter plot of the τ value that corresponds to each node against different centrality metrics of that node: degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality. Do this all as one figure with four subfigures. Include the Pearson correlation values as well as the corresponding p-values in the title for each scatter plot. Remember to use the unweighted centrality metrics.
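The correlation values for the subplot titles could be computed with scipy.stats.pearsonr, as in this sketch. The function name correlation_titles, the title wording, and the sample numbers are made up for illustration; each resulting string could then be passed to ax.set_title on one of the four subplots.

```python
from scipy.stats import pearsonr

def correlation_titles(taus, centralities):
    """Build one title per metric with Pearson r and its p-value."""
    titles = {}
    for name, values in centralities.items():
        r, p = pearsonr(values, taus)
        titles[name] = f"{name} centrality vs. tau (r = {r:.3f}, p = {p:.3g})"
    return titles

# Made-up values for five patient-0 nodes, for illustration only.
taus = [4.0, 3.0, 2.5, 2.0, 1.5]
centralities = {
    "degree": [0.10, 0.20, 0.30, 0.35, 0.50],
    "closeness": [0.30, 0.32, 0.41, 0.40, 0.47],
}
titles = correlation_titles(taus, centralities)
for title in titles.values():
    print(title)
```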
3. [5 points] Rank these centrality metrics based on Pearson’s correlation coefficient, and determine which metrics are better predictors of how fast an outbreak will spread from the initial node. Analyze your results. That is, do the results match your intuition? If they differ, why might that be?
Part 4: Knowledge Question [5 Points]
Prove that a non-negative linear combination of a set of submodular functions is also a submodular function.