ST332/ST409 Medical Statistics 2023-24: Practical 2


Case-control study to assess risk factors for diagnosis of colorectal cancer in patients diagnosed with ulcerative colitis

The aim of this study was to assess whether there were any modifiable risk or preventative factors for developing colorectal cancer in patients who had been diagnosed with ulcerative colitis [UC] (an inflammatory bowel disease that causes inflammation and ulcers in the colon and rectum). The original study paper can be found in Aliment Pharmacol Ther 2000 Feb;14(2):145-53 (https://doi.org/10.1046/j.1365-2036.2000.00698.x), and there is a PDF on the ST332 Moodle page.
Cases and controls were matched on sex, age, extent and duration of UC. Primary interest here is whether taking 5-ASA (Aminosalicylic acids – drugs which inhibit the inflammatory process) could be preventative for developing colorectal cancer in this patient population.
Variables
case = "Case", "Control"
pair = number of matched pair
male = "Female", "Male"
ageg = age group ("<15","15-29","30-49",">50")
extent = extent of UC at diagnosis ("Proctitis","Left-sided","Subtotal/total")
("Caucasian","Asian")
("No 5-ASA","5-ASA")
("Never smoked","Current smoker","Ex-smoker")
From the ST332 Moodle page you should download the R dataset crccc2.RData. Note that this is a synthetic version of the case-control data, so your results will not be exactly the same as those reported in the paper but should be similar. To load it, use load("crccc2.RData") after setting your working directory to where you have saved it, i.e. setwd("YYY").
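The loading step can be sketched as follows. This is a minimal sketch, not a prescribed workflow: "YYY" remains a placeholder for your own folder, and the assumption that the first restored object is the data frame is mine, so check the printed names.

```r
# "YYY" is a placeholder: replace it with the folder holding the download.
# setwd("YYY")

if (file.exists("crccc2.RData")) {
  # load() restores the saved objects and returns their names
  loaded <- load("crccc2.RData")
  print(loaded)

  # Assumption: the first restored object is the data frame of interest
  dat <- get(loaded[1])
  str(dat)
  if (is.data.frame(dat)) {
    print(table(dat$case))  # 'case' is listed under Variables
  }
} else {
  loaded <- character(0)
  message("crccc2.RData not found in ", getwd())
}
```

Printing the names returned by load() is a quick way to find out what the object is called before starting the analysis.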
Useful R packages are epitools, epiDisplay, epiR and Epi. To install a package, type install.packages("XXX") and then load it with library("XXX").
1. In order to assess whether use of 5-ASA was a risk factor (either protective or harmful), calculate the Odds Ratio (OR) and its 95% CI from the case-control data – what does this tell you? [Hint: you might find the oddsratio and epitab functions in the epitools package useful, but you should also make sure that you can calculate the OR and 95% CI by hand/manually in R too!]
2. The case-control study was matched (1:1) – use the formula in the lecture slides to calculate an adjusted OR and 95% CI for 5-ASA – do the results change? [Hint: you might find it helpful to stratify the dataset by cases and controls in order to produce a cross-tabulation of the matched pairs first.]
3. Explore whether adjusting for smoking status (via stratification) might be important (given that smoking status may be linked to cancer development generally) – produce ORs and 95% CIs for 5-ASA for the different strata. Is there evidence that they are different? Obtain a Mantel-Haenszel estimate for the 5-ASA OR (and 95% CI) adjusting for smoking status – is it different to that obtained in either Q.1 or Q.2? [Hint: you might find the mhor function in the epiDisplay package useful for this, though you should also make sure that you can apply the methods covered in the lectures in order to do this “by hand”/manually in R!]
4. Fit a GLM to repeat the analysis in Q.1 in order to obtain an unadjusted estimate of the OR for 5-ASA use – is this similar to that obtained in Q.1? Extend the model to explore whether (i) ethnicity (Caucasian/Asian) and/or (ii) smoking status are important factors. Does this agree with your findings in Q.3?
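For the "by hand" part of Q.1, one possible sketch is the cross-product OR with a Wald (Woolf) interval on the log scale. The counts below are purely illustrative and are NOT taken from crccc2.RData; the helper name or_woolf is hypothetical. With the real data you would build the table with table() from the 5-ASA column and case.

```r
# Illustrative counts only (NOT from crccc2.RData).
# Rows = exposure, columns = case status.
tab <- matrix(c(10, 30, 20, 40), nrow = 2,
              dimnames = list(exposure = c("5-ASA", "No 5-ASA"),
                              status   = c("Case", "Control")))

# Hypothetical helper: cross-product OR with a Woolf (log-scale Wald) CI
or_woolf <- function(tab, level = 0.95) {
  ca_exp   <- tab[1, 1]  # exposed cases
  co_exp   <- tab[1, 2]  # exposed controls
  ca_unexp <- tab[2, 1]  # unexposed cases
  co_unexp <- tab[2, 2]  # unexposed controls
  or     <- (ca_exp * co_unexp) / (co_exp * ca_unexp)
  se_log <- sqrt(1 / ca_exp + 1 / co_exp + 1 / ca_unexp + 1 / co_unexp)
  z      <- qnorm(1 - (1 - level) / 2)
  c(OR = or,
    lower = exp(log(or) - z * se_log),
    upper = exp(log(or) + z * se_log))
}

res <- or_woolf(tab)
print(round(res, 3))
```

If you compare this with epitools::oddsratio, note that its default estimation method may not be the Wald one (check ?oddsratio), so small differences are to be expected unless you request the Wald method explicitly.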
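For Q.2, the hint about cross-tabulating matched pairs can be sketched as below. This assumes the standard 1:1 matched estimator (ratio of discordant pairs, with a log-scale Wald interval), which may or may not be written the same way in the lecture slides. The toy data are invented for illustration, and the column name asa is hypothetical; pair and case follow the Variables list.

```r
# Toy data (invented): 6 matched pairs, one case and one control each.
# 'asa' is a hypothetical column name for 5-ASA use.
toy <- data.frame(
  pair = rep(1:6, each = 2),
  case = rep(c("Case", "Control"), times = 6),
  asa  = c("5-ASA",    "5-ASA",
           "No 5-ASA", "5-ASA",
           "No 5-ASA", "5-ASA",
           "No 5-ASA", "5-ASA",
           "5-ASA",    "No 5-ASA",
           "No 5-ASA", "No 5-ASA")
)

# Split into cases and controls, then join them again by pair number
cases    <- toy[toy$case == "Case",    c("pair", "asa")]
controls <- toy[toy$case == "Control", c("pair", "asa")]
pairs <- merge(cases, controls, by = "pair",
               suffixes = c(".case", ".control"))

# Cross-tabulation of pairs: case exposure by control exposure
lv  <- c("No 5-ASA", "5-ASA")
tab <- table(case    = factor(pairs$asa.case,    levels = lv),
             control = factor(pairs$asa.control, levels = lv))
print(tab)

# Only discordant pairs contribute to the matched OR
n10 <- as.numeric(tab["5-ASA", "No 5-ASA"])  # case exposed, control not
n01 <- as.numeric(tab["No 5-ASA", "5-ASA"])  # control exposed, case not
or_matched <- n10 / n01
ci <- exp(log(or_matched) + c(-1, 1) * qnorm(0.975) * sqrt(1 / n10 + 1 / n01))
print(c(OR = or_matched, lower = ci[1], upper = ci[2]))
```

Fixing the factor levels before calling table() keeps the 2x2 layout intact even when one of the four cells happens to be empty.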
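For Q.3, a hedged sketch of the stratified analysis: stratum-specific ORs, the Mantel-Haenszel pooled OR computed manually, and a cross-check against mantelhaen.test from base R's stats package. The counts are illustrative only (NOT from crccc2.RData); only the stratum labels are taken from the Variables list.

```r
# Illustrative 2 x 2 x 3 table (invented counts).
# Within each stratum: rows = exposure, columns = case status.
tab3 <- array(
  c(12, 20, 30, 25,    # Never smoked
     5,  8, 10,  9,    # Current smoker
     9, 12, 15, 14),   # Ex-smoker
  dim = c(2, 2, 3),
  dimnames = list(exposure = c("5-ASA", "No 5-ASA"),
                  status   = c("Case", "Control"),
                  smoking  = c("Never smoked", "Current smoker", "Ex-smoker"))
)

# Stratum-specific cross-product ORs
stratum_or <- (tab3[1, 1, ] * tab3[2, 2, ]) / (tab3[1, 2, ] * tab3[2, 1, ])
print(round(stratum_or, 3))

# Mantel-Haenszel pooled OR: sum(a*d/n) / sum(b*c/n) over strata
n_k   <- apply(tab3, 3, sum)
or_mh <- sum(tab3[1, 1, ] * tab3[2, 2, ] / n_k) /
         sum(tab3[1, 2, ] * tab3[2, 1, ] / n_k)
print(or_mh)

# Cross-check with base R; also returns a 95% CI for the common OR
mh <- mantelhaen.test(tab3, correct = FALSE)
print(mh$estimate)
print(mh$conf.int)
```

The sketch does not test whether the stratum ORs differ from one another; that part of the question still needs the methods from the lectures.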
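For Q.4, the unadjusted OR can be obtained from a logistic regression (a binomial GLM with logit link). The sketch below uses invented toy data and a hypothetical column name asa. With a single binary covariate, the exponentiated coefficient should reproduce the cross-product OR, and the Wald interval should match the Woolf interval.

```r
# Toy data built from illustrative counts (NOT from crccc2.RData):
# 10 exposed cases, 20 exposed controls, 30 unexposed cases, 40 unexposed controls.
# 'asa' is a hypothetical column name for 5-ASA use.
counts <- c(10, 20, 30, 40)
toy <- data.frame(
  case = factor(rep(c("Case", "Control", "Case", "Control"), times = counts),
                levels = c("Control", "Case")),    # first level = reference
  asa  = factor(rep(c("5-ASA", "5-ASA", "No 5-ASA", "No 5-ASA"), times = counts),
                levels = c("No 5-ASA", "5-ASA"))   # unexposed = reference
)

# With a factor response, glm() models the probability of the second level ("Case")
fit <- glm(case ~ asa, family = binomial, data = toy)

# Exponentiate the log-odds coefficient to get the OR, with a Wald 95% CI
or_glm  <- exp(coef(fit)[["asa5-ASA"]])
ci_wald <- exp(confint.default(fit)["asa5-ASA", ])
print(round(c(OR = or_glm, ci_wald), 3))
```

Further covariates can be explored by adding terms to the formula, for example with update(fit, . ~ . + x), where x stands for whichever column holds ethnicity or smoking status in the real data.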
