STATS 201/8 Assignment 5


Due Date: 3pm Thursday 23rd May

1 Question 1 [16 Marks]

A researcher was interested in the effectiveness of three types of mosquito trap. A total of 180 traps were set in Chicago, 60 of each type. The traps were randomly allocated to locations around the city. The number of mosquitoes caught by each trap was recorded.

This data can be found in the file “Trap.csv”, and includes variables:

Variable    Description
Caught      The number of mosquitoes caught.
Trap        The trap type (CDC, GRAVID or SENTINEL).

The researcher wants to know:

• Can we state which single trap type is the least effective (tends to catch the lowest number of mosquitoes)? If so, state the trap type and quantify how much more effective the other trap types were compared to this trap type. If not, why not?

• Can we state which single trap type is the most effective (tends to catch the highest number of mosquitoes)? If so, state the trap type and quantify how much more effective the trap was compared to the other trap types. If not, why not?

Instructions:

• Make sure you change your name and UPI/ID number at the top of the assignment.

• Comment on the plot and summary statistics of the data.

• Fit a GLM modelling the number of mosquitoes caught as counts.

• Write appropriate Methods and Assumption Checks.

• Write an appropriate Executive Summary.

1.1 Questions of Interest:

We were interested in comparing the number of mosquitoes caught between three different trap types (CDC, GRAVID and SENTINEL).

1.2 Read in and inspect the data:

library(s20x)  # provides summaryStats()

Trap.df <- read.csv("Trap.csv", header = TRUE, stringsAsFactors = TRUE)

plot(Caught ~ Trap, horizontal = TRUE, data = Trap.df)

summaryStats(Caught ~ Trap, data = Trap.df)

##          Sample Size     Mean Median  Std Dev Midspread
## CDC               60 33.36667   21.5 31.10396     56.50
## GRAVID            60 12.55000    4.0 16.54724     11.75
## SENTINEL          60 25.06667   12.0 29.10523     36.50

1.3 Comment on the plot:

Comparing the centres, the CDC trap appears to catch the most mosquitoes and also shows the largest variation, while the GRAVID trap appears to catch the fewest. The plot and the summary statistics show that the counts are positively skewed and that, for every trap type, the variance is much larger than the mean.

Note: There is clear evidence of over-dispersion relative to the Poisson model. Under a Poisson model the variance should roughly equal the mean, but here it is the standard deviation that roughly equals the mean. A quasi-Poisson model is therefore likely to be a better choice for these data.
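The size of the over-dispersion can be read off the summary statistics directly. The following is a minimal sketch that uses the means and standard deviations copied from the summaryStats output rather than the raw data; the variable names are illustrative only.

```r
# Group means and standard deviations copied from the summaryStats output.
group_mean <- c(CDC = 33.36667, GRAVID = 12.55000, SENTINEL = 25.06667)
group_sd   <- c(CDC = 31.10396, GRAVID = 16.54724, SENTINEL = 29.10523)

# Under a Poisson model the variance equals the mean, so this ratio
# should be close to 1 for every trap type.
var_to_mean <- group_sd^2 / group_mean
round(var_to_mean, 1)
```

For all three trap types the variance is more than 20 times the mean, which is far from the ratio of 1 that a Poisson model assumes.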

1.4 Fit and check model:

Trap.fit1 = glm(Caught~Trap, family=poisson, data=Trap.df)

summary(Trap.fit1)

##

## Call:

## glm(formula = Caught ~ Trap, family = poisson, data = Trap.df)

##

## Deviance Residuals:

## Min 1Q Median 3Q Max

## -7.597 -4.526 -2.820 2.912 13.522

##

## Coefficients:

##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)   3.50756    0.02235 156.942   <2e-16 ***
## TrapGRAVID   -0.97784    0.04275 -22.875   <2e-16 ***
## TrapSENTINEL -0.28602    0.03412  -8.382   <2e-16 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## (Dispersion parameter for poisson family taken to be 1)

##

## Null deviance: 5258.0 on 179 degrees of freedom

## Residual deviance: 4663.1 on 177 degrees of freedom

## AIC: 5414.5

##

## Number of Fisher Scoring iterations: 5

plot(Trap.fit1,which=1)

1-pchisq(deviance(Trap.fit1), df = df.residual(Trap.fit1))

## [1] 0

# Refit as quasipoisson:

Trap.fit2 = glm(Caught~Trap, family=quasipoisson, data=Trap.df)

summary(Trap.fit2)

##

## Call:

## glm(formula = Caught ~ Trap, family = quasipoisson, data = Trap.df)

##

## Deviance Residuals:

## Min 1Q Median 3Q Max

## -7.597 -4.526 -2.820 2.912 13.522

##

## Coefficients:

##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)    3.5076     0.1187  29.552  < 2e-16 ***
## TrapGRAVID    -0.9778     0.2270  -4.307 2.73e-05 ***
## TrapSENTINEL  -0.2860     0.1812  -1.578    0.116

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## (Dispersion parameter for quasipoisson family taken to be 28.20377)

##

## Null deviance: 5258.0 on 179 degrees of freedom

## Residual deviance: 4663.1 on 177 degrees of freedom

## AIC: NA

##

## Number of Fisher Scoring iterations: 5
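The coefficient estimates are unchanged by the quasi-Poisson refit; only the standard errors change, and they are the Poisson standard errors multiplied by the square root of the estimated dispersion. A small sketch of that relationship, using values copied from the two summaries in this report (the variable names are illustrative):

```r
# Values copied from the summary(Trap.fit1) and summary(Trap.fit2) output.
se_poisson <- c("(Intercept)" = 0.02235, TrapGRAVID = 0.04275, TrapSENTINEL = 0.03412)
dispersion <- 28.20377

# Quasi-Poisson standard errors are the Poisson ones scaled by sqrt(dispersion).
se_quasi <- se_poisson * sqrt(dispersion)
round(se_quasi, 4)
```

The scaled values agree with the Std. Error column of the quasi-Poisson summary (0.1187, 0.2270 and 0.1812), which is why the SENTINEL versus CDC comparison is no longer significant after the refit.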

confint(Trap.fit2)

## Waiting for profiling to be done...

##                   2.5 %      97.5 %
## (Intercept)   3.2655347  3.73150810
## TrapGRAVID   -1.4398289 -0.54593910
## TrapSENTINEL -0.6451463  0.06710844

exp(confint(Trap.fit2))

## Waiting for profiling to be done...

##                   2.5 %     97.5 %
## (Intercept)  26.1941144 41.7420120
## TrapGRAVID    0.2369683  0.5792975
## TrapSENTINEL  0.5245858  1.0694114

100*(exp(confint(Trap.fit2))-1)

## Waiting for profiling to be done...

##                   2.5 %      97.5 %
## (Intercept)  2519.41144 4074.201199
## TrapGRAVID    -76.30317  -42.070249
## TrapSENTINEL  -47.54142    6.941143

# Relevel factor to change baseline:

Trap.df = within(Trap.df, {Trap.R = factor(Trap, levels = c("GRAVID", "CDC", "SENTINEL"))})

Trap.fit3 = glm(Caught~Trap.R, family=quasipoisson, data=Trap.df)

summary(Trap.fit3)

##

## Call:

## glm(formula = Caught ~ Trap.R, family = quasipoisson, data = Trap.df)

##

## Deviance Residuals:

## Min 1Q Median 3Q Max

## -7.597 -4.526 -2.820 2.912 13.522

##

## Coefficients:

##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)      2.5297     0.1935  13.072  < 2e-16 ***
## Trap.RCDC        0.9778     0.2270   4.307 2.73e-05 ***
## Trap.RSENTINEL   0.6918     0.2371   2.918  0.00398 **

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## (Dispersion parameter for quasipoisson family taken to be 28.20377)

##

## Null deviance: 5258.0 on 179 degrees of freedom

## Residual deviance: 4663.1 on 177 degrees of freedom

## AIC: NA

##

## Number of Fisher Scoring iterations: 5

exp(confint(Trap.fit3))

## Waiting for profiling to be done...

##                   2.5 %    97.5 %
## (Intercept)    8.371342 17.929840
## Trap.RCDC      1.726229  4.219974
## Trap.RSENTINEL 1.267467  3.224323

100*(exp(confint(Trap.fit3))-1)

## Waiting for profiling to be done...

##                    2.5 %    97.5 %
## (Intercept)    737.13416 1692.9840
## Trap.RCDC       72.62287  321.9974
## Trap.RSENTINEL  26.74671  222.4323

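# Or in the opposite direction (GRAVID relative to each other trap type).
# Negating the limits swaps their order, so the first column now holds
# the limit closer to zero:

100*(exp(-confint(Trap.fit3))-1)

## (Intercept)    -88.05448 -94.42271
## Trap.RCDC      -42.07025 -76.30317
## Trap.RSENTINEL -21.10249 -68.98574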
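The conversion from log-scale confidence limits to percentage changes, and the reversal of the comparison, can be checked by hand. This is a minimal sketch using the TrapGRAVID limits copied from confint(Trap.fit2); the variable names are illustrative and the raw data are not needed.

```r
# Profile-likelihood limits for TrapGRAVID on the log scale,
# copied from confint(Trap.fit2) (GRAVID relative to CDC).
ci_log <- c(-1.4398289, -0.54593910)

# Percentage change in the expected count for GRAVID relative to CDC.
pct_gravid_vs_cdc <- 100 * (exp(ci_log) - 1)

# Reversing the comparison negates the log-scale limits; sorting puts
# the lower limit first again.
pct_cdc_vs_gravid <- 100 * (exp(sort(-ci_log)) - 1)

round(pct_gravid_vs_cdc, 1)
round(pct_cdc_vs_gravid, 1)
```

The first interval matches the TrapGRAVID row of the percentage output for Trap.fit2, and the second matches the Trap.RCDC row for Trap.fit3, so releveling the factor and negating the limits give the same answer.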

1.5 Methods and Assumption Checks:

A Poisson log-linear model was fitted to the counts of mosquitoes. The explanatory variable was a single factor, the type of trap used.

The residual plot shows that we seem to have mostly captured the trend. However, the residual deviance (4663.1) is much bigger than its degrees of freedom (177), which is roughly the value we would expect if the Poisson model were adequate. This is confirmed by the Chi-square test for the residual deviance, which gives a P-value that is 0 to machine precision. We therefore have very strong evidence that the data are over-dispersed, i.e. that there is more variation than we would expect from the Poisson model, so the model was refitted as a quasi-Poisson model. There was strong evidence of a difference between trap types (P-value = 2.73e-05 for GRAVID versus CDC). Finally, the factor Trap was re-ordered to quantify the differences relative to a base level of GRAVID.
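The deviance check described here can be reproduced from the reported numbers alone. The sketch below assumes only the residual deviance and degrees of freedom printed in summary(Trap.fit1); it also evaluates the tail probability on the log scale, because the P-value itself is too small to be represented as a double.

```r
# Values copied from the summary(Trap.fit1) output in this report.
resid_dev <- 4663.1
resid_df  <- 177

# Under an adequate Poisson model this ratio should be close to 1.
resid_dev / resid_df

# Upper-tail probability; far too small to represent, so it is reported as 0.
pchisq(resid_dev, df = resid_df, lower.tail = FALSE)

# The same tail probability on the natural-log scale, which does not underflow.
log_p <- pchisq(resid_dev, df = resid_df, lower.tail = FALSE, log.p = TRUE)
log_p
```

Using lower.tail = FALSE avoids the subtraction in 1 - pchisq(...), and log.p = TRUE shows that the reported 0 is an underflow rather than an exact zero.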

The final model was $\log(\mu_i) = \beta_0 + \beta_1 \times \text{TypeGravid}_i + \beta_2 \times \text{TypeSentinel}_i$,

where $\mu_i$ is the mean number of mosquitoes caught by the i-th trap, $\text{TypeGravid}_i$ is 1 if the i-th trap is a gravid trap and 0 otherwise, and $\text{TypeSentinel}_i$ is 1 if the i-th trap is a sentinel trap and 0 otherwise. $\text{Caught}_i$, the number of mosquitoes caught by the i-th trap, has an over-dispersed distribution with mean $\mu_i$. (Note: this is not a Poisson distribution in this case, as we used a quasi-Poisson model.)

Alternatively, we can write the final model formula as follows:

$\log(\mu_{ij}) = \mu + \alpha_i$

where $\mu_{ij}$ is the mean number of mosquitoes caught by the j-th trap of type i, $\mu$ is the overall intercept, and $\alpha_i$ is the effect of trap type i. Here, i ∈ {CDC, Gravid, Sentinel}.
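To see what the model formula means on the original scale, the fitted coefficients can be back-transformed into expected counts for each trap type. A minimal sketch, using the estimates copied from summary(Trap.fit1) with CDC as the baseline (the variable names are illustrative):

```r
# Coefficient estimates copied from summary(Trap.fit1); the quasi-Poisson
# fit has the same estimates, only the standard errors differ.
beta0 <- 3.50756   # intercept (CDC baseline)
beta1 <- -0.97784  # TrapGRAVID
beta2 <- -0.28602  # TrapSENTINEL

# Expected counts on the original scale for each trap type.
mu <- c(CDC = exp(beta0),
        GRAVID = exp(beta0 + beta1),
        SENTINEL = exp(beta0 + beta2))
round(mu, 2)
```

These agree with the sample means in the summary statistics (33.37, 12.55 and 25.07), as expected for a log-linear model with a single factor.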

1.6 Executive Summary:

A researcher was interested in the effectiveness of three types of mosquito trap.

We have evidence that the gravid trap is the least effective trap in catching mosquitoes.

We estimate that the expected number of mosquitoes caught using a gravid trap is somewhere between:

• 42% and 76% lower than when using a CDC trap

• 21% and 69% lower than when using a sentinel trap.

Alternatively, we estimate that, compared to the gravid trap, the expected number of mosquitoes caught using a

• CDC trap will be between 73% and 322% higher.

• sentinel trap will be between 27% and 222% higher.

We cannot choose a single best trap as there is no evidence that the expected number of mosquitoes caught differs between sentinel and CDC traps.
