Hello, if you have any need, please feel free to consult us, this is my wechat: wx91due
STA220H1 The Practice of Statistics I (Fall 2024)
Assignment 1 Instructions
Instructions
This is an individual assignment. You are expected to work on this independently. While you may discuss ideas and concepts, please do not share your code or written answers. It is expected that all code and written work should be written by yourself. Please note, this assignment is fairly open, so the context of most of the work completed here should not match your peers.
Submission Format and Instructions
For question 1, your PDF file will need to show (1) R code, (2) R output/figures, and (3) your
- Use Microsoft Word to type out your answers. Screenshot your R output and place these images throughout the document. For the R code, either copy/paste as text or screenshot.
- Use an app like Notability, OneNote, etc., where you can write/type your answers and include screenshots of your R code and output.
- Use RMarkdown and knit to a PDF. Alternatively, you can knit to an HTML file and then save it as a PDF.
Late Penalty
As described on the course syllabus, late work will be deducted 10% per hour.
Question 1 (30 marks)
• Name: The name of the property listing.• host_identity_verified: Indicates whether the host's identity has been verified.• host.name: The name of the host.• neighbourhood.group: The larger neighborhood group where the listing is located.• neighbourhood: The specific neighborhood where the listing is located.• lat: The latitude of the listing, based on the WGS84 geographic coordinate system.• long: The longitude of the listing, based on the WGS84 geographic coordinate system.• instant_bookable: Specifies whether the listing can be instantly booked.• cancellation_policy: The type of cancellation policy applied to the listing.• room.type: The type of room available for the listing (e.g., entire home, private room).• construction.year: The year the property was constructed.• price: The price of the listing per night, in USD.• service.fee: The service fee per night, in USD.• minimum.nights: The minimum number of nights required to book the listing.• number.of.reviews: The total number of reviews received in the last 12 months.• reviews.per.month: The average number of reviews received each month over the listing’s lifetime.• review.rate.number: The average review rating of the listing.• calculated.host.listings.count: The number of listings managed by the host, as calculated by Airbnb.• availability.365: The number of days the listing is available for booking within the next 365 days.
Your task is to explore and analyze the Airbnb dataset using graphical summaries created using R. Create a minimum of 4 properly labelled plots/graphs/figures that summarize some of the patterns in the data. It is recommended that you use at least 3 different types of plots. For each plot, write a few sentences explaining the patterns and/or trends you see in the plot.
You will be graded on not only your ability to create plots in R, but also your ability to use the plots to highlight important and/or compelling information in the data.
Question 1 Rubric
|
Inadequate |
Fair |
Good |
Excellent |
Plots (10 marks) |
0-4 marks
Does not meetthe requirement of 4+ plots.
|
5-6 marks
Required plots are provided, but plots do not highlight theimportant information or show a varietyof trends. Thetype of plots chosen are not ideal for thesituation.
|
7-8 marks
Required plotsare provided,and mostlyshows that the student is able to create a plot relevant for the situation. A variety of plots are used, and plots are labelled properly.
|
9-10 marks
Required plots are provided, and a lot of thought was put into creating the plot. Plots are interesting, compelling, and communicate well to the viewer. |
Written descriptions (10 marks) |
0-4 marks
Written descriptions are not sufficient in describing the plots. Writing is unclear.
|
5-6 marks
Written descriptions are provided butcontain major errors. The descriptions do not accurately describe the plots or are misleading.
Writing is somewhat unclear.
|
7-8 marks
Written descriptions are provided and shows that student is able toproperly interpret plots.
Writing is generally clear.
|
9-10 marks
Written descriptions are excellent.
Student exceeds expectations and highlights the important trends within the data. Writing is clear and compelling.
|
R code (10 marks) |
0-4 marks
R code is not shown or has many major errors.
|
5-6 marks
R code is provided but is difficult to follow.
|
7-8 marks
R code is provided, but does not show a variety of plot types.
|
9-10 marks
R code is provided. A variety of plot types are shown.
|
|
|
Mild |
Severe |
||
|
|
Survived |
Deceased |
Survived |
Deceased |
Treatment |
A |
150 |
850 |
30 |
70 |
B |
5 |
45 |
100 |
400 |