Parking Violations in NYC
Data
Parking violations in 2025
NYC Open Data has data on all parking violations issued in NYC since 2014. The updated dataset provided for 2025 currently includes about 26 million observations. To make the assignment manageable, I have reduced it to tickets issued in January 2025 in Manhattan precincts only, which leaves about 200,000 tickets.
Two support files are also included in the parking subfolder:
- the descriptions of all variables
- the dictionary of violation codes
Police Precincts
Exercise
1. Data exploration
Before focusing on the spatial part of the data, let's explore the basic patterns in the data.
Add the violation code descriptions and fine amounts to the data file. For simplicity, ignore the differing fine amounts above and below 96th Street and use the fine amount for below 96th Street for all violations.
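The join described above can be sketched as a simple lookup. This is a minimal illustration in plain Python, not a prescribed solution: the column names (`violation_code`, `code`, `definition`, `fine_below_96th`) and the sample rows, including the fine amounts, are assumptions made up for the example, so check them against the variable descriptions and the code dictionary.

```python
# Hypothetical ticket rows; the real file has many more columns.
tickets = [
    {"summons_number": "A1", "violation_code": "40"},
    {"summons_number": "A2", "violation_code": "21"},
    {"summons_number": "A3", "violation_code": "999"},  # code missing from the dictionary
]

# Hypothetical dictionary rows; the fine amounts are placeholders, not official values.
codes = [
    {"code": "40", "definition": "FIRE HYDRANT", "fine_below_96th": 115, "fine_above_96th": 115},
    {"code": "21", "definition": "STREET CLEANING", "fine_below_96th": 65, "fine_above_96th": 45},
]


def add_code_info(tickets, codes):
    """Left-join the code dictionary onto the tickets by violation code."""
    lookup = {row["code"]: row for row in codes}
    merged = []
    for ticket in tickets:
        info = lookup.get(ticket["violation_code"])
        merged.append({
            **ticket,
            "violation_description": info["definition"] if info else None,
            # Use the below-96th-Street fine for every ticket, as the exercise asks.
            "fine_amount": info["fine_below_96th"] if info else None,
        })
    return merged


merged = add_code_info(tickets, codes)

# Tickets whose code was not found keep their row but get no fine; inspect them.
unmatched = [t["summons_number"] for t in merged if t["fine_amount"] is None]
print(unmatched)
```

Keeping unmatched tickets (a left join) rather than dropping them makes it easy to see how many violations have no entry in the dictionary before you start summarizing.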
Provide a visual overview of the top 10 most common types of violations. Compare how this ranking differs if we focus on the total amount of revenue generated.
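To see why the two rankings can differ, here is a small sketch that ranks violation types once by ticket count and once by total fines. The field names `violation_description` and `fine_amount` and the sample rows are hypothetical; they stand in for whatever your merged data uses.

```python
from collections import Counter

# Hypothetical rows after the descriptions and fines have been attached.
tickets = (
    [{"violation_description": "NO STANDING", "fine_amount": 50}] * 3
    + [{"violation_description": "DOUBLE PARKING", "fine_amount": 200}]
)


def rank_violations(tickets, n=10):
    """Return the top-n violation types by ticket count and by total fines."""
    counts = Counter()
    revenue = Counter()
    for ticket in tickets:
        counts[ticket["violation_description"]] += 1
        revenue[ticket["violation_description"]] += ticket["fine_amount"]
    return counts.most_common(n), revenue.most_common(n)


by_count, by_revenue = rank_violations(tickets)
print(by_count)
print(by_revenue)
```

In this toy example the most frequent violation is not the one that raises the most money, which is exactly the kind of contrast the two charts should make visible.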
Find an appropriate visualization to show the average fine amount by vehicle color and vehicle year. Restrict your attention to non-commercial vehicles (see the vehicle plate type). Briefly describe your findings.
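One possible way to prepare the data for this plot is to compute the mean fine for each (color, year) cell, which maps naturally onto a heatmap or a small-multiples line chart. The sketch below makes several assumptions that you should verify in the data: that plate type `PAS` marks passenger vehicles, that a vehicle year of 0 means "missing", and the hypothetical field names.

```python
from collections import defaultdict
from statistics import mean

# Assumption: "PAS" identifies passenger (non-commercial) plates.
NON_COMMERCIAL_PLATE_TYPES = {"PAS"}

# Hypothetical sample rows.
tickets = [
    {"plate_type": "PAS", "vehicle_color": "BLK", "vehicle_year": 2018, "fine_amount": 65},
    {"plate_type": "PAS", "vehicle_color": "BLK", "vehicle_year": 2018, "fine_amount": 115},
    {"plate_type": "PAS", "vehicle_color": "WH", "vehicle_year": 2020, "fine_amount": 50},
    {"plate_type": "COM", "vehicle_color": "WH", "vehicle_year": 2020, "fine_amount": 115},
    {"plate_type": "PAS", "vehicle_color": "GY", "vehicle_year": 0, "fine_amount": 65},
]


def average_fine_by_color_year(tickets):
    """Mean fine per (color, year) for non-commercial vehicles with a known year."""
    groups = defaultdict(list)
    for ticket in tickets:
        if ticket["plate_type"] not in NON_COMMERCIAL_PLATE_TYPES:
            continue  # drop commercial and other plate types
        if ticket["vehicle_year"] <= 0:
            continue  # assumption: year 0 encodes a missing value
        key = (ticket["vehicle_color"], ticket["vehicle_year"])
        groups[key].append(ticket["fine_amount"])
    return {key: mean(fines) for key, fines in groups.items()}


averages = average_fine_by_color_year(tickets)
print(averages)
```

Vehicle colors are typically entered as free text, so you will likely want to collapse spelling variants into a handful of main colors before plotting.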
2. Map by Precincts
Read in the shape files for the police precincts and remove all precincts outside of Manhattan.
Provide three choropleth maps showing:
- the total number of tickets
- the total amount of fines
- the average amount of fines
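All three maps draw on the same per-precinct summary, which can then be joined to the precinct shapes. A minimal sketch of that aggregation, assuming hypothetical field names `violation_precinct` and `fine_amount`:

```python
from collections import defaultdict

# Hypothetical sample rows.
tickets = [
    {"violation_precinct": 19, "fine_amount": 115},
    {"violation_precinct": 19, "fine_amount": 65},
    {"violation_precinct": 1, "fine_amount": 50},
]


def summarize_by_precinct(tickets):
    """Ticket count, total fines, and average fine for each precinct."""
    totals = defaultdict(lambda: {"n_tickets": 0, "total_fines": 0})
    for ticket in tickets:
        row = totals[ticket["violation_precinct"]]
        row["n_tickets"] += 1
        row["total_fines"] += ticket["fine_amount"]
    return {
        precinct: {**row, "avg_fine": row["total_fines"] / row["n_tickets"]}
        for precinct, row in totals.items()
    }


summary = summarize_by_precinct(tickets)
print(summary)
```

When joining this summary to the shapes, make sure the precinct identifier has the same type on both sides (for example, integer 19 versus the string "19").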
I have added a grouping of the violation codes called violation_group to guide your visualization. Provide a faceted set of choropleth maps to capture each of these subgroups and show where different types of violations are more or less common. Choose a color scheme that allows you to highlight the main trends in the data.
3. Focus on the Upper East
Precinct 19 identifies the Upper East Side. The data currently does not provide latitude and longitude of the violation locations (and I am not sure what these street_code variables are for).
Restrict your data to parking violations related to fire hydrants (Violation Code = 40). Using the variables Street Name and House Number as well as the knowledge that these addresses are in the Upper East Side of Manhattan, geocode at least 500 addresses. Include a data table of these addresses and the latitude and longitude of these addresses in the output.
Provide an interactive map of the violations you geocoded using leaflet. Provide at least three pieces of information on the parking ticket in a popup.
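Geocoders work best with complete, de-duplicated address strings, so it can help to build those before calling any service. The sketch below only prepares the strings and makes no network requests; the field names, the sample rows, and the `", New York, NY"` suffix are assumptions for illustration, and the geocoding service itself is up to you.

```python
# Hypothetical sample rows.
tickets = [
    {"violation_code": "40", "violation_precinct": 19, "house_number": "220", "street_name": "E 72nd St"},
    {"violation_code": "40", "violation_precinct": 19, "house_number": "220", "street_name": "E 72nd St"},
    {"violation_code": "40", "violation_precinct": 19, "house_number": None, "street_name": "Park Ave"},
    {"violation_code": "40", "violation_precinct": 19, "house_number": "1130", "street_name": "Park Ave"},
    {"violation_code": "21", "violation_precinct": 19, "house_number": "10", "street_name": "E 80th St"},
    {"violation_code": "40", "violation_precinct": 1, "house_number": "1", "street_name": "Wall St"},
]


def build_addresses(tickets, precinct=19, code="40", limit=600):
    """Unique full address strings for hydrant tickets in one precinct."""
    seen = set()
    addresses = []
    for ticket in tickets:
        if ticket["violation_code"] != code or ticket["violation_precinct"] != precinct:
            continue
        house = (ticket.get("house_number") or "").strip()
        street = (ticket.get("street_name") or "").strip()
        if not house or not street:
            continue  # an address without both parts cannot be geocoded reliably
        address = f"{house} {street}, New York, NY"
        if address not in seen:
            seen.add(address)
            addresses.append(address)
        if len(addresses) == limit:
            break
    return addresses


addresses = build_addresses(tickets)
print(addresses)
```

The default `limit` is set above 500 on purpose: some addresses will fail to geocode, so submitting a few extra leaves a margin for reaching the required 500.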
Create another variable called luxury_car in which you identify luxury car brands using the Vehicle Make variable.
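A simple way to build this flag is a membership test against a hand-picked set of makes. The set below is an assumption and deliberately incomplete: the abbreviations are guesses at how makes might be recorded, so tabulate the Vehicle Make values in your data first and adjust the list to match.

```python
# Assumption: illustrative, incomplete list of abbreviated luxury makes.
LUXURY_MAKES = {"BMW", "ME/BE", "LEXUS", "AUDI", "PORSC", "INFIN", "ACURA", "JAGUA", "TESLA"}


def is_luxury(make):
    """True if the (possibly messy) make string is in the luxury set."""
    if not make:
        return False  # missing make: treat as not luxury
    return make.strip().upper() in LUXURY_MAKES


# Hypothetical sample rows.
tickets = [
    {"vehicle_make": "bmw "},
    {"vehicle_make": "TOYOT"},
    {"vehicle_make": None},
]

for ticket in tickets:
    ticket["luxury_car"] = is_luxury(ticket["vehicle_make"])

print([t["luxury_car"] for t in tickets])
```

Which brands count as "luxury" is a judgment call, so state your definition briefly in the write-up.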
Start with the previous map. Distinguish the points by whether the car is a luxury car. Add a legend informing the user about the color scheme. Also make sure that the added information about the car type is now contained in the popup information. Show this map.
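Whatever mapping library you use, each marker needs a color and a popup string derived from the ticket. The sketch below shows one way to derive both; the field names, the two colors, and the popup layout are hypothetical choices, and the output would still have to be passed to your map's marker and legend functions.

```python
from html import escape

# Assumption: arbitrary color choice; reuse the same mapping for the legend.
COLORS = {True: "red", False: "blue"}


def marker_style(ticket):
    """Return (color, popup_html) for one geocoded ticket."""
    color = COLORS[bool(ticket["luxury_car"])]
    lines = [
        f"Address: {ticket['house_number']} {ticket['street_name']}",
        f"Issue date: {ticket['issue_date']}",
        f"Vehicle make: {ticket['vehicle_make']}",
        f"Luxury car: {'yes' if ticket['luxury_car'] else 'no'}",
    ]
    # Escape the values so stray characters in the data cannot break the popup HTML.
    popup_html = "<br>".join(escape(line) for line in lines)
    return color, popup_html


# Hypothetical sample row.
ticket = {
    "house_number": "220",
    "street_name": "E 72nd St",
    "issue_date": "2025-01-05",
    "vehicle_make": "BMW",
    "luxury_car": True,
}

color, popup_html = marker_style(ticket)
print(color)
print(popup_html)
```

Deriving the legend from the same `COLORS` mapping keeps the legend and the markers from drifting apart if you change the color scheme later.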
Add marker clustering, so that zooming in will reveal the individual locations but the zoomed out map only shows the clusters. Show the map with clusters.
Submission
Please follow the instructions to submit your homework. The homework is due on Monday, March 17.