MATH3802/MATH5802M practical 2023/24
Background
• The MATH3802 and MATH5802 modules are assessed by an examination (80%) and a practical (20%). This is the practical, worth 20% of your final module mark.
• Comments on this sheet which are only relevant to MATH5802M students are indicated by italic font.
• Reports must be clearly marked with your name, student ID, and module and should be no more than six sides of A4 paper in length (eight sides for MATH5802M). If you want to submit e-files, please name your report file as StudentID-Name-MATH3802 or StudentID-Name-MATH5802.
• The practical is given in week 6 (6th-10th Nov). You will have two hours to work on your practical in the supervised session. My colleagues will be available to answer general questions, but not technical questions.
• You can share ideas with other students, but the work you hand in must be your own.
• Please include a signed Declaration of Academic Integrity Form.
• You must hand in your solutions by 5pm on Friday, 24th Nov 2023. Please hand in an electronic version via Minerva → Assessment and Feedback → Submit My Work → Turnitin. Late work will be penalised at the rate of 5% of the available marks per calendar day.
Data
The data are taken from a database about slave voyages. Specifically, we will consider the yearly estimated number of slaves who embarked on a voyage across the Atlantic Ocean on a ship under a British flag.
For this practical you will need to obtain the data set X = (X_1, ..., X_n) from the file “slavery.RData”, which you can download from Minerva. Once you have saved the file “slavery.RData” in the folder where you are running R, use the command load("slavery.RData") to read the data into R. This will give you a data frame called xf which has two components: xf$year and xf$num. (Try summary(xf) or str(xf).)
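As a rough illustration of the loading step, the sketch below builds a small stand-in data frame with the same two components, saves it to a temporary .RData file, and reads it back with load(). The years, the values and the temporary file are made up purely for illustration; with the real file only the load("slavery.RData") call is needed.

```r
# Stand-in data: the real xf comes from slavery.RData; these values are made up.
xf <- data.frame(year = 1701:1710,
                 num  = c(120, 135, 128, 150, 161, 158, 170, 182, 175, 190))

# Save to a temporary file and remove the object, mimicking a fresh R session.
tmp <- tempfile(fileext = ".RData")
save(xf, file = tmp)
rm(xf)

# load() restores objects under their saved names and returns those names.
loaded <- load(tmp)
print(loaded)

str(xf)      # structure of the data frame: columns year and num
summary(xf)  # summary statistics for each column

# Extract the series to analyse.
X <- xf$num
n <- length(X)
```

Note that load() does not return the data frame itself; it recreates xf in the workspace, which is why the object name is fixed by whoever saved the file.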
The task
The task of this practical is to analyse these data, using the following list of points as a guide. Your analysis should consider the points below, but your report should not be laid out as “answer 1”, “answer 2”, etc. Instead, structure your report in whichever way you prefer to make it easy to read.
• Plot the data (we denote the variable num by X) and examine any prominent features. Comment on your findings. Use linear regression to remove any linear trend or seasonal effects which you believe to be present in the data. Denote the residuals after removing the trend and/or seasonal effects by Y.
• Inspect the process Y and comment on whether an MA or AR process might be suitable to explain any structure present.
• For each of p = 1, 2, 3, using the Yule-Walker equations, fit an AR(p) model to the time series Y. You can either use the R function ar or solve the Yule-Walker equations “by hand”.
• For each p = 1, 2, 3, consider the residuals of the AR(p) model. Plot a correlogram of the residuals, as well as a correlogram of the squared residuals, and comment on how well the three models fit the data. If you are not happy, you may want to start again with a transformation of the data. Choose the model with the best fit; refer to the corresponding residuals as Z.
• [MATH5802M only] For each of X, Y and Z, plot periodograms. Comment on your results.
• For your chosen model, re-fit the AR parameters using the command arima. Provide a summary of the final model for the original time series, including all the fitted parameters, and their standard errors.
• Using the command predict on the result of the ar fit, obtain a forecast for the next 5 years (i.e. 1808–1812). Interpret and discuss.
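The R calls involved in the steps above can be sketched on simulated data as follows. Everything here is an assumption made for illustration: the series is an artificial linear trend plus AR(2) noise, the index year is just 1, ..., n, and p = 2 is used in the later steps only because the data were simulated that way. It is a minimal sketch of the mechanics, not an analysis of the real data and not a recommendation of any model.

```r
set.seed(1)
pdf(NULL)  # null graphics device so the sketch runs non-interactively; omit to see plots

# Simulated stand-in series (hypothetical): linear trend plus AR(2) noise.
n    <- 200
year <- seq_len(n)
X    <- 50 + 0.3 * year + as.numeric(arima.sim(list(ar = c(0.6, -0.3)), n = n))

# Plot the series, then remove a linear trend by regression; Y holds the residuals.
plot(year, X, type = "l", xlab = "Year", ylab = "X")
trend <- lm(X ~ year)
Y     <- residuals(trend)

# Correlogram and partial correlogram of Y.
acf(Y)
pacf(Y)

# Yule-Walker fits for p = 1, 2, 3 with the order fixed (no AIC selection).
fits <- lapply(1:3, function(p)
  ar(Y, order.max = p, aic = FALSE, method = "yule-walker"))

# Yule-Walker "by hand" for p = 2: solve R phi = r using sample autocorrelations.
rho  <- acf(Y, lag.max = 2, plot = FALSE)$acf[, 1, 1]  # rho[1] is lag 0
phi2 <- solve(toeplitz(rho[1:2]), rho[2:3])

# Correlograms of the residuals and squared residuals for each p
# (the first p residuals are NA and are dropped).
for (p in 1:3) {
  Z <- as.numeric(na.omit(fits[[p]]$resid))
  acf(Z,   main = paste("AR", p, "residuals"))
  acf(Z^2, main = paste("AR", p, "squared residuals"))
}

# Raw periodogram of Y (no taper, no further detrending, linear scale).
spec.pgram(Y, taper = 0, detrend = FALSE, log = "no")

# Re-fit with arima and extract standard errors of the AR coefficients.
fit <- arima(Y, order = c(2, 0, 0), include.mean = FALSE)
se  <- sqrt(diag(fit$var.coef))

# Forecast Y five steps ahead from the ar fit, then add the fitted trend back.
fc        <- predict(fits[[2]], newdata = Y, n.ahead = 5)
forecastX <- predict(trend, newdata = data.frame(year = n + 1:5)) +
  as.numeric(fc$pred)

dev.off()
```

Passing newdata = Y to predict makes explicit which series the forecast continues. The standard errors in fc$se refer to Y only and ignore the uncertainty in the fitted trend.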
Guidance on the report
Don’t forget to look for ideas in the example R code shown in the files we have used in lectures (yule-walker.R, sim.ma.R, sim.ar.R, sim.arima2.R, arima.est.R and fft 202021.R). Of course, you may use the help system in R, e.g. help(predict).
You should take some care with the presentation of your results. This includes using a clear structure and layout, careful explanation of your results and how you obtained them, meaningful plots with appropriate labels, etc. You should start your report with a short (1 paragraph) summary of your findings, written in a style suitable for a non-statistician.
Note that large amounts of R output are not needed. Use R output sparingly or not at all. If you do include R output, make sure it is in a fixed-width font like this (Courier in Word, or verbatim/texttt in LaTeX). You do not need to attach your R script.
Do not repeat large sections of theory from the notes — I already know what is in them! Use your limited page count to describe your analysis and conclusions.
The data file can be found at http://www1.maths.leeds.ac.uk/~charles/ts/slavery.RData. If you are interested in the database, you can find out more at https://www.slavevoyages.org/