In [ ]:
import datetime

import numpy as np
import pandas as pd
- reservation_date: The date that the reservation was booked for. In other words, this is the date when the customer will dine.
- reservation_time: The time that the reservation was booked for.
- reservation_party_size: The size of the party for the corresponding reservation, i.e., the number of diners.
- reservation_date_booked: The date on which the reservation was made.
- datetime_booked: The date and time corresponding to when the reservation was made (in UTC). This column has missing values, which have been entered as "#N/A". The restaurant is located in the Pacific time zone.
Your goal in this final will be to understand how customers schedule reservations at this restaurant.
Part 1
Your first task is to read in the data and do the following:
- delete rows with missing values
- convert datetime_booked to a datetime column with Pacific time zone
- combine reservation_date and reservation_time to create a new column called reservation_datetime, a datetime column in the Pacific time zone. The final data frame you return should therefore have 6 columns in total.
- Only keep reservations made at the following 8 time slots: 17:30, 17:45, 18:00, 18:15, 20:45, 21:00, 21:15, 21:30
Return this modified version of the original data frame.
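The steps listed above can be sketched on a small made-up table. This is only an illustration under stated assumptions: the toy rows, the helper name `clean_reservations`, the string formats of the date and time columns, and the use of "America/Los_Angeles" as the Pacific time zone are all assumptions, not part of the exam data.

```python
import datetime

import pandas as pd

# Made-up rows standing in for the real CSV (values are illustrative only).
toy = pd.DataFrame({
    "reservation_date": ["2021-07-01", "2021-07-01", "2021-07-02"],
    "reservation_time": ["17:30:00", "19:00:00", "21:15:00"],
    "reservation_party_size": [2, 4, 3],
    "reservation_date_booked": ["2021-06-20", "2021-06-25", "2021-06-30"],
    "datetime_booked": ["2021-06-20 18:05:00", None, "2021-07-01 02:30:00"],
})

# The eight time slots named in the task.
SLOTS = [
    datetime.time(17, 30), datetime.time(17, 45),
    datetime.time(18, 0), datetime.time(18, 15),
    datetime.time(20, 45), datetime.time(21, 0),
    datetime.time(21, 15), datetime.time(21, 30),
]


def clean_reservations(df):
    """Hypothetical helper: apply the four cleaning steps to a copy of df."""
    df = df.dropna().copy()
    # datetime_booked is recorded in UTC: parse as UTC, then convert to Pacific.
    df["datetime_booked"] = pd.to_datetime(
        df["datetime_booked"], utc=True
    ).dt.tz_convert("America/Los_Angeles")
    # The reservation date and time are already local, so localize (not convert).
    df["reservation_datetime"] = pd.to_datetime(
        df["reservation_date"] + " " + df["reservation_time"]
    ).dt.tz_localize("America/Los_Angeles")
    # Keep only reservations at the eight time slots.
    keep = df["reservation_datetime"].dt.time.isin(SLOTS)
    return df[keep].reset_index(drop=True)


cleaned = clean_reservations(toy)
```

Note the asymmetry: `tz_convert` shifts the UTC booking timestamps to Pacific wall-clock time, while `tz_localize` only attaches the Pacific zone to reservation times that are already local.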
In [ ]:
def Read_Data():
    # Treat the "#N/A" placeholder as a missing value when reading the file
    df_res = pd.read_csv("data/Reservation_Data.csv", na_values=["#N/A"])
    return df_res
In [ ]:
df_res = Read_Data()
assert df_res.shape == (3359, 6)
In [ ]:
assert np.isclose(df_res.datetime_booked.dt.hour.mean(), 13.55969)
Part 2
In this next part, we will write two functions to understand basic patterns in the data.
The first function takes as input one of the eight reservation time slots that we are considering, given as a datetime.time. It outputs the day of the week (as a string) with the smallest average party size among all reservations made for the inputted time slot.
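One possible way to approach this is to filter to the time slot, group by day name, and take the label of the smallest mean. The sketch below runs on made-up rows; the helper name `smallest_avg_party_day` and the toy values are assumptions for illustration, and the real data may need tie handling that is not shown here.

```python
import datetime

import pandas as pd

# Made-up reservations: 2021-07-05 and 2021-07-12 are Mondays, 2021-07-06 a Tuesday.
toy = pd.DataFrame({
    "reservation_datetime": pd.to_datetime([
        "2021-07-05 17:30", "2021-07-06 17:30",
        "2021-07-12 17:30", "2021-07-06 21:00",
    ]).tz_localize("America/Los_Angeles"),
    "reservation_party_size": [4, 2, 2, 6],
})


def smallest_avg_party_day(df, res_time):
    """Hypothetical helper: day name with the smallest mean party size at res_time."""
    at_slot = df[df["reservation_datetime"].dt.time == res_time]
    day_names = at_slot["reservation_datetime"].dt.day_name()
    avg_by_day = at_slot.groupby(day_names)["reservation_party_size"].mean()
    # idxmin returns the index label (the day name) of the smallest average.
    return avg_by_day.idxmin()


day_1730 = smallest_avg_party_day(toy, datetime.time(17, 30))
day_2100 = smallest_avg_party_day(toy, datetime.time(21, 0))
```

On the toy rows, Monday averages 3 diners at 17:30 and Tuesday averages 2, so the 17:30 call yields "Tuesday".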
In [ ]:
def Get_Avg_Party_Size(res_time):
    smallest_dow = None
    raise NotImplementedError()
    return smallest_dow
In [ ]:
res_time = datetime.time(hour=17, minute=30)
assert Get_Avg_Party_Size(res_time) == "Sunday"
In [ ]:
res_time = datetime.time(hour=21, minute=0)
assert Get_Avg_Party_Size(res_time) == "Tuesday"
The second function takes as input a parameter called num_days (you may assume this is an integer). It returns two values: the fraction of reservations made by parties of size 1 or 2 that were booked at most num_days in advance, and the same fraction for parties of size 3 or 4.
NOTE: The number of days in advance that a reservation is made is (reservation_datetime - datetime_booked).
In [ ]:
def Booked_in_Advance(num_days):
    df_res = Read_Data()
    # Placeholders: replace with the two computed fractions
    party_one_two, party_three_four = None, None
    return party_one_two, party_three_four
In [ ]:
result = Booked_in_Advance(7)
assert np.isclose(result[0], 0.27188)
In [ ]:
result = Booked_in_Advance(100)
assert np.isclose(result[1], 0.860534)
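The lead-time fraction used by the second function in Part 2 can be sketched on made-up rows. The helper name `fraction_within` and the toy values are assumptions. The sketch also assumes that "at most num_days in advance" means the full timedelta is no larger than num_days days; comparing whole days instead (via `.dt.days`) is another reading and can give different results.

```python
import pandas as pd

TZ = "America/Los_Angeles"  # assumed name for the Pacific time zone

# Made-up rows: lead times are 2d 6h, 39d 9h, 1d 1h, and exactly 9d.
toy = pd.DataFrame({
    "reservation_datetime": pd.to_datetime([
        "2021-07-10 18:00", "2021-07-10 18:00",
        "2021-07-10 21:00", "2021-07-10 21:00",
    ]).tz_localize(TZ),
    "datetime_booked": pd.to_datetime([
        "2021-07-08 12:00", "2021-06-01 09:00",
        "2021-07-09 20:00", "2021-07-01 21:00",
    ]).tz_localize(TZ),
    "reservation_party_size": [2, 1, 3, 4],
})


def fraction_within(df, num_days, sizes):
    """Hypothetical helper: share of reservations with party size in `sizes`
    whose lead time is at most num_days days."""
    lead = df["reservation_datetime"] - df["datetime_booked"]
    in_group = df["reservation_party_size"].isin(sizes)
    within = lead[in_group] <= pd.Timedelta(days=num_days)
    # The mean of a boolean Series is the fraction of True values.
    return within.mean()


small_7 = fraction_within(toy, 7, [1, 2])
large_9 = fraction_within(toy, 9, [3, 4])
large_1 = fraction_within(toy, 1, [3, 4])
```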
Part 3
Let's assume that there are two service periods:
- Service Period 1 consists of time slots 17:30, 17:45, 18:00, 18:15
- Service Period 2 consists of time slots 20:45, 21:00, 21:15, 21:30
The next function takes as input res_time, which is one of the eight times above as a datetime.time. Given the inputted time, you will focus this analysis exclusively on either Service Period 1 or Service Period 2. For example, if the inputted res_time is 17:45, you should only consider reservations made during Service Period 1. For each day, find the first reservation that books at res_time (datetime_booked tells you the order in which reservations were scheduled). Within the Service Period of interest, compute how many reservations had already been booked for the given day before this reservation for res_time was made. If k-1 reservations had already been made, this means that res_time was first booked as the k-th reservation. We will say the "rank" of reservation time res_time on this day was k. So, if res_time was the first time booked on a particular day for the relevant Service Period, its rank on this day is 1. Return the average rank of res_time across all days on which a reservation for res_time was booked.
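The rank definition above can be traced on a small made-up table. This is a sketch, not the intended solution: the helper name `average_rank` and the toy rows are assumptions, and ties in datetime_booked are broken by sort order here, which the task does not specify.

```python
import datetime

import pandas as pd

TZ = "America/Los_Angeles"  # assumed name for the Pacific time zone
PERIOD_1 = [datetime.time(17, 30), datetime.time(17, 45),
            datetime.time(18, 0), datetime.time(18, 15)]
PERIOD_2 = [datetime.time(20, 45), datetime.time(21, 0),
            datetime.time(21, 15), datetime.time(21, 30)]

# Made-up bookings for two dining days (2021-07-10 and 2021-07-11).
toy = pd.DataFrame({
    "reservation_datetime": pd.to_datetime([
        "2021-07-10 18:00", "2021-07-10 17:30", "2021-07-10 18:00",
        "2021-07-10 17:45", "2021-07-11 17:45", "2021-07-11 21:00",
        "2021-07-11 17:30",
    ]).tz_localize(TZ),
    "datetime_booked": pd.to_datetime([
        "2021-07-01 09:00", "2021-07-02 09:00", "2021-07-03 09:00",
        "2021-07-04 09:00", "2021-07-01 10:00", "2021-07-02 10:00",
        "2021-07-03 10:00",
    ]).tz_localize(TZ),
})


def average_rank(df, res_time):
    """Hypothetical helper: mean rank of res_time within its service period."""
    period = PERIOD_1 if res_time in PERIOD_1 else PERIOD_2
    in_period = df[df["reservation_datetime"].dt.time.isin(period)].copy()
    in_period["day"] = in_period["reservation_datetime"].dt.date
    # Number the bookings for each dining day in the order they were made.
    in_period = in_period.sort_values("datetime_booked")
    in_period["order"] = in_period.groupby("day").cumcount() + 1
    # The rank on a day is the booking order of the first reservation at res_time.
    at_slot = in_period[in_period["reservation_datetime"].dt.time == res_time]
    return at_slot.groupby("day")["order"].min().mean()


rank_1745 = average_rank(toy, datetime.time(17, 45))
rank_1730 = average_rank(toy, datetime.time(17, 30))
rank_1800 = average_rank(toy, datetime.time(18, 0))
```

In the toy data, 17:45 is the fourth Service Period 1 booking for July 10 and the first for July 11, so its average rank is (4 + 1) / 2 = 2.5. The 21:00 booking on July 11 belongs to Service Period 2 and does not count toward the Period 1 ranks.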
In [ ]:
def Get_Avg_Rank(res_time):
    df_res = Read_Data()
In [ ]:
res_time = datetime.time(hour=21, minute=30)
assert np.isclose(Get_Avg_Rank(res_time), 3.9035)
In [ ]:
res_time = datetime.time(hour=20, minute=45)
assert np.isclose(Get_Avg_Rank(res_time), 1.45714)