DS2000 Fall 2024 Homework 2



Assigned: September 20th, 2024

Deadline: September 27th, 2024 @9pm eastern

Submit each program as a .py file in gradescope (filenames are specified below). You may submit multiple times right up until the deadline.

You may submit up to 48 hours late for no penalty. This policy exists for those times you're having a tough week, are feeling sick, or are falling behind in your work; we won't make any exceptions to this policy.

Your solution will be graded according to the DS2000 general rubric and style guide.

You may not use dictionaries or functions on this homework (other than main). The entire assignment can be accomplished with material we’ve covered in lecture.

Submit Plots

Problems 2 and 3 ask you to create visualizations; submit them in gradescope along with your code.

Style Guide Focus

(pay attention to these items in particular for this Homework, but the entire style guide will be used during grading, so please make sure you review it!)

● Communication (plots in particular), Spacing, and Variables and Functions

The 30-minute Guideline

If you get stuck on a homework problem, come by office hours, post on Piazza, or take a break! We recommend you spend about 30 minutes trying to figure out a problem -- enough time that you can try a few things to get unstuck, but not SO much time that you’re banging your head against the wall. Try for 30 minutes, then take a break, take a walk, and/or ask us. :)

Review the Autograder Output

We’ll use different .txt files for testing than you’ll use when developing your homework; make sure your code is flexible! When you submit your solution, gradescope will run your code and print out the results so you can see what we’ll see when grading. Look at this output! It serves as a sanity-check to make sure your code produces what you wanted; if it doesn’t, you can make revisions and resubmit up until the deadline.

It’s fine to work with friends and share ideas with each other; it is not fine to share code. Do not show your code to classmates or post code on piazza.

Files (find links to these on the course website, under this homework)

Plot images to submit:

● satistfaction_trend_2016.png

● satistfaction_trend_big.png

● dissatistfaction_change.png

Source: Massachusetts Bay Transportation Authority https://www.mbta.com/performance-metrics/customer-satisfaction 

● (Problem 1) green_line.txt, red_line.txt, blue_line.txt, orange_line.txt

● (Problem 2) somewhat_satisfied_2016.txt, somewhat_satisfied_big.txt

● (Problem 3) somewhat_dissatisfied_big.txt

Problem 1 - Ridership

Filename: ridership.py

Data: green_line.txt, red_line.txt, blue_line.txt, orange_line.txt

For this first problem, we’re interested in MBTA ridership over the four years from 2019 to 2022. (Partly because Felix is obsessed with the MBTA, partly because public transit is linked to housing and social justice issues!)

Save the name of the data file, a string, in a constant. The first line in the file is the name of the T line (red, orange, green, or blue); after that there is one line per year, from 2019 to 2022, giving the total ridership count. You can always count on there being exactly 5 lines in each file.

Using this information, compute and report the answers to the following questions, related to the ridership statistics reported in these files.

● What is the name of this T line?

● What is the maximum number of riders reported for this line?

● What is the minimum number of riders reported for this line?

● What is the difference between the maximum number of riders and the minimum number of riders?

 

Test your program on each of the different files by modifying only the string name of the file (e.g., "orange_line.txt") and nothing else in your source code.

For full credit under documentation, write a test case in a comment at the top of your file, something like this but with your own examples. We recommend making your test case into an input file and running your code on it! Do this before you start coding so you know your program works! It's okay if the format of your test case is different from your program's output.

test file: chartreuse_line.txt

chartreuse

10

200

50

100

 

program output:

max: 200

min: 10

difference: 190
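
As a rough illustration of the test case above, here is one possible sketch, not a model solution. It writes the chartreuse test file and then reads it back. The temporary-folder path, the NUM_YEARS constant, the use of a list, and returning values from main are assumptions made so the sketch is self-contained and easy to check; your own program may be organized differently.

```python
import os
import tempfile

# Assumption: the test file goes in a temporary folder so this sketch is
# self-contained; in the homework the file would sit next to ridership.py.
FILENAME = os.path.join(tempfile.gettempdir(), "chartreuse_line.txt")
NUM_YEARS = 4  # hypothetical constant: one ridership line per year


def main():
    # Create the five-line test file from the test case above.
    with open(FILENAME, "w") as outfile:
        outfile.write("chartreuse\n10\n200\n50\n100\n")

    # First line is the name of the T line; the rest are rider counts.
    with open(FILENAME, "r") as infile:
        line_name = infile.readline().strip()
        riders = []
        for _ in range(NUM_YEARS):
            riders.append(int(infile.readline()))

    most = max(riders)
    fewest = min(riders)
    difference = most - fewest

    print("Name of line:", line_name)
    print("max:", most)
    print("min:", fewest)
    print("difference:", difference)

    # Returned only so the numbers can be checked against the test case.
    return line_name, most, fewest, difference


if __name__ == "__main__":
    main()
```

With the chartreuse file, the reported values match the test case: max 200, min 10, difference 190.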

Here’s an example of what happens when I run my program. (Your output doesn’t need to look exactly the same):

Name of line: orange

Maximum number of riders: 60451057

Minimum number of riders: 22237495

Difference between max and min: 38213562

When you turn in your code, make sure it's configured to run on one of the four provided input files (we're counting on you using one of these names, but we'll test your code with different file contents).

 

Problem 2 - Customer Satisfaction

Filename: satisfaction_trend.py

Data: somewhat_satisfied_2016.txt, somewhat_satisfied_big.txt

For the next problem, we’re interested in figuring out what percent of riders are satisfied with the MBTA. Start with the data file somewhat_satisfied_2016.txt.

Prompt the user for the name of the data file. Your data starts in February 2016, and has one line per month (the second line represents March 2016, the third April 2016, etc.). Each line has the percent of survey respondents who said they were Somewhat Satisfied in that month.

Using this information, compute and report the answers to the following questions and create the graph requested related to the customer satisfaction survey results.

● Plot the overall trend of satisfaction across time, one point per number in the file. Your y-axis represents the percent of responses with this level of satisfaction. Your x-axis represents the row number (your first point's x-value will be 1, the second will be 2, etc.).

○ Save your graph as satistfaction_trend_2016.png either with the "save" icon in the plot window or programmatically (using plt.savefig(filename) before plt.show()).

○ Verify your program: re-run your program using the somewhat_satisfied_big.txt file and save your graph as satistfaction_trend_big.png.

● Using print statements:

○ Report any changes you had to make to your program for it to run correctly with the big file. If you did not make any changes, report why you did not need to make changes.

○ Report any overall trends you observe in this plot and give one hypothesis for what may have caused these patterns. 

Make sure that you prompt the user exactly once, just for the filename! (if you don't, the autograder will break)
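
The plotting steps above could be sketched roughly as follows. This is a hedged example on made-up numbers, not the expected solution: the file names toy_satisfied.txt and toy_trend.png, the data values, and the axis labels are all hypothetical, and the filename is hard-coded so the sketch runs unattended (the real program would get it from a single input() call).

```python
import os
import tempfile

import matplotlib.pyplot as plt

# Assumptions: a tiny made-up data file and output image, both placed in a
# temporary folder. The real program would read the name with input().
DATA_FILE = os.path.join(tempfile.gettempdir(), "toy_satisfied.txt")
PLOT_FILE = os.path.join(tempfile.gettempdir(), "toy_trend.png")


def main():
    # Made-up values, one per line, only for illustration.
    with open(DATA_FILE, "w") as outfile:
        outfile.write("0.30\n0.32\n0.28\n0.35\n0.31\n")

    # Read every line, however many there are, so the same code
    # works for a short file and a long one.
    percents = []
    with open(DATA_FILE, "r") as infile:
        for line in infile:
            if line.strip() != "":
                percents.append(float(line))

    # Row numbers start at 1, one per value read.
    rows = list(range(1, len(percents) + 1))

    plt.plot(rows, percents, marker="o")
    plt.title("Somewhat Satisfied responses over time (toy data)")
    plt.xlabel("Row number (1 = first month in the file)")
    plt.ylabel("Somewhat Satisfied (value from file)")
    plt.savefig(PLOT_FILE)  # save first ...
    plt.close()             # ... a real run would call plt.show() here

    # Returned only so the values can be checked.
    return rows, percents


if __name__ == "__main__":
    main()
```

Because the loop reads until the file ends, nothing in the sketch depends on how many months the file contains.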

Problem 3 - Customer Satisfaction Change

Filename: dissatisfaction_change.py

Data: somewhat_dissatisfied_big.txt

For the next problem, we’re interested in the change in survey results over time (rather than the raw trend graphed in Problem 2).

Read in your data. This data starts in February 2016, and has one line per month (the second line represents March 2016, the third April 2016, etc.). For this problem, you'll be calculating the percent change from the previous month. Do not prompt the user for the file name!

For example, if your first month's percentage of Somewhat Dissatisfied customers is 0.1 and the second month's percentage is 0.2, that's a percent change of 100%. If the first month's value was 0.15 and the second month's was 0.12, that's a percent change of -20%.
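
The two examples above follow the usual percent change formula, (current - previous) / previous * 100. A small sketch that checks both pairs is shown here; the variable names and the choice to return the list from main are assumptions made for illustration.

```python
def main():
    # The two example pairs from the paragraph above.
    previous_values = [0.1, 0.15]
    current_values = [0.2, 0.12]

    changes = []
    for i in range(len(previous_values)):
        prev = previous_values[i]
        curr = current_values[i]
        # Percent change relative to the previous month's value.
        change = (curr - prev) / prev * 100
        changes.append(change)
        print(f"{prev} -> {curr}: {change:.1f}%")

    # Returned only so the values can be checked.
    return changes


if __name__ == "__main__":
    main()
```

Rounded to one decimal place, this prints 100.0% for the first pair and -20.0% for the second. The unrounded values may differ by a tiny floating-point error, which is why the output is formatted.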

Produce the following:

● A bar graph visualizing the change in dissatisfaction across time.

○ Your y-axis represents the percent change from the previous month. Your x-axis represents the number of the row read in (your first point's x-value will be 2, the second will be 3, etc.). Note that you'll start graphing the data from the second line in the file, not the first!

○ Turn in dissatistfaction_change.png

● Use print statements to report what you observe in this graph and one hypothesis for the patterns you see. Consider what a negative versus a positive percent change means! Take into account both your observations from problem 2 and any real-world knowledge you have found when giving your hypothesis.
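
One way the bar graph's x-values and y-values might line up is sketched below on made-up numbers. The values list, the file name toy_change.png, and the labels are hypothetical; the real data would come from somewhat_dissatisfied_big.txt.

```python
import os
import tempfile

import matplotlib.pyplot as plt

# Assumption: the image goes in a temporary folder under a made-up name.
PLOT_FILE = os.path.join(tempfile.gettempdir(), "toy_change.png")


def main():
    # Made-up monthly values standing in for the file contents.
    values = [0.10, 0.20, 0.15, 0.12]

    rows = []
    changes = []
    # Start at index 1: the first value has no previous month to compare to.
    for i in range(1, len(values)):
        prev = values[i - 1]
        changes.append((values[i] - prev) / prev * 100)
        rows.append(i + 1)  # file rows count from 1, so index 1 is row 2

    plt.bar(rows, changes)
    plt.title("Monthly change in Somewhat Dissatisfied (toy data)")
    plt.xlabel("Row number in file")
    plt.ylabel("Percent change from previous month")
    plt.savefig(PLOT_FILE)  # save first ...
    plt.close()             # ... a real run would call plt.show() here

    # Returned only so the values can be checked.
    return rows, changes


if __name__ == "__main__":
    main()
```

Four input values give three bars, at x = 2, 3, and 4, because each bar compares a row with the one before it.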

For all plots, make sure you've followed the guidelines in the DS2000 style guide! You will turn in your plots as .png files, so make sure to save them and double-check the contents of each file. Please do not turn in screenshots (why? because the resolution of a screenshot is much lower than that of a plot saved directly, and you'll want high-resolution plots so your graders can see how awesome they are!).

Note that Problems 1 and 3 do not ask the user for input; Problem 2 does.
