Econ 145 - Fall 2024 Homework 15: Long: Functions and Rmd



Prompt

The goal of this assignment is to practice writing for loops, if statements, functions, and clean documents using R Markdown.

Suppose that you are going to leave your current company in a month. Before you leave, you are asked to write a few functions and to document them so that whoever takes over can easily understand them. Remember, your write-up must be written as a narrative, but it can be longer than a page for this assignment.

Your task in this assignment is to write functions and documentation (what input is allowed and not allowed, what the output is, what the function does, an example of how to use each function, etc.) for each function using R Markdown. An example of clearly written function documentation can be found by running ?function_name in R.
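As an illustration of the level of detail to aim for, the sketch below documents a small hypothetical function, percent_change, using roxygen-style comments. The function, its argument names, and its input checks are assumptions made for illustration only; they are not part of this assignment.

```r
#' Percent change between two numbers
#'
#' @param old A single non-zero number: the starting value.
#' @param new A single number: the ending value.
#' @return A single number: the change from `old` to `new`, in percent.
#' @examples
#' percent_change(50, 75)
percent_change <- function(old, new) {
  # Reject inputs that are not allowed, with an informative error
  if (!is.numeric(old) || !is.numeric(new)) {
    stop("`old` and `new` must both be numeric.")
  }
  if (length(old) != 1 || length(new) != 1) {
    stop("`old` and `new` must each be a single number.")
  }
  if (old == 0) {
    stop("`old` must be non-zero.")
  }
  (new - old) / old * 100
}

# Example of use: going from 50 to 75 is a 50 percent increase
percent_change(50, 75)
```

Stating the allowed inputs in the comments and enforcing them in the body keeps the documentation and the behavior in agreement.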

To create an Rmd file, go to the R server, click File, then New File, and then R Markdown. Parts of how to work with an R Markdown file are covered in the lecture slides.

To Receive Credit - READ CAREFULLY!
  • There is no autograder for this assignment. We will check your code from the Rmd file. Your write-up must be written in R Markdown; submit the Rmd file and the corresponding knitted .pdf file to receive credit.
  • Submit the Rmd file through CANVAS under Homework 15 - (Rmd file).
  • Submit the corresponding knitted pdf file through Gradescope under Homework 15 - Long (Write-up).
  • Be sure to include your first name, last name, and perm number on your pdf write-up and Rmd file.
  • The function description, the function code, and the code and output for the examples of how to use your functions should all be printed in the knitted pdf file. (You can refer to most of the previous homework prompts for an example of a pdf file with R code printed on it.)
  • Your code should be easy to understand.
  • An example of the report format is shown at the end of this prompt.
Here are the functions that you need to write:

Function 1

(Private Question) Write a function called name_cleaner that takes two arguments: string and list_of_words. The function should return a string with no punctuation, no extra spaces (before, after, or in the middle of the string), and in “title case”, where only the first letter of each word is capitalized. Additionally, every word in list_of_words should be removed from the string, regardless of its position in the string or its capitalization. For example, if the string is “The quick brown fox jumps over the lazy dog!” and the list_of_words is c("the", "over"), then the function should return “Quick Brown Fox Jumps Lazy Dog”. To summarize, your function should
a) Remove any punctuation from the string.
b) Remove all of the words from `list_of_words` from the string, regardless of their capitalization.
c) Remove any extra spaces from the string.
d) Convert the string to "title case", where the first letter of each word is capitalized.
  • You can use any R function you need.
  • Hint 1: There exist functions in the stringr package that can help you remove punctuation and extra spaces from a string and convert a string to title case.
  • Hint 2: To remove all the words, regardless of their position, you should remember the regular expression symbols ^ denoting the beginning of a string and $ denoting the end of a string. Also remember that | can be used in regular expressions to denote “or”.
  • Note: You will actually get to use this function in your Final Assignment. Make sure that it is correct!
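The hints above can be sketched step by step on a made-up string. This is one possible approach under stated assumptions, not the required solution: the variable names and the exact regular expression are illustrative, and the pattern assumes the words to drop contain no special regex characters.

```r
library(stringr)

messy <- "  The cat, then the hat; and THE bat!  "
drop_words <- c("the", "and")

# Step a: delete every punctuation character
no_punct <- str_replace_all(messy, "[[:punct:]]", "")

# Step b: match a listed word only when it stands alone, i.e. it is
# preceded by the start of the string (^) or a space, and followed by
# a space or the end of the string ($). The | joins the alternatives.
word_pattern <- regex(
  paste0("(^|\\s)(", paste(drop_words, collapse = "|"), ")(?=\\s|$)"),
  ignore_case = TRUE
)
no_words <- str_replace_all(no_punct, word_pattern, "")

# Step c: trim the ends and collapse repeated spaces
squished <- str_squish(no_words)

# Step d: capitalize the first letter of each word
result <- str_to_title(squished)
print(result)
```

Because the pattern requires a space or a string boundary on both sides, "then" is left alone even though it begins with "the".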
To see if your function is correct, test your function with the following strings and compare your results to the correct outputs below.

The output for the string: ‘Hello, my name is John Smith!, and I am a student at UCSB who lives in Isla Vista.’, and the list_of_words c('is', 'I', 'John') is:

Hello My Name Smith And Am A Student At Ucsb Who Lives In Isla Vista

The output for the string: ‘The thing that I like to do most is to go to the Goleta beach. . . . There, I sit out with friends and talk with them!’, and the list_of_words: c('the', 'I') is:

Thing That Like To Do Most Is To Go To Goleta Beach There Sit Out With Friends And Talk With Them

Functions 2 and 3

In this question, you will be tasked with calculating the population of the United States by adding up the state populations recorded in each Census. First, download the file state_populations.zip from the Canvas class site, move it to the directory of your R project, and unzip the zip file to get a folder called state_populations. You will notice that state_populations contains 612 .csv files. Each csv file contains data on the population and number of representatives to the US Congress for a given state in a given Census.

To calculate the US population for each Census year, you would need to read each csv file into R before adding the population of each state together. As you can imagine, completing this manually would be very time consuming and tedious, making this a perfect task for automating using code. To make your code easier to read and edit, you will write two functions to complete this task.

Function 2

First, write a function called populationDataReader that reads in the csv files found in the state_populations folder and merges the rows into a single tibble. The function should take two arguments: folder_path and years. The folder_path argument should be the path to the state_populations folder, and the years argument should be a numeric vector of the Census years you want to read in. The default value of years should be all of the Census years found in the state_populations folder. The function should return a tibble containing one row for each of the 50 states (and the District of Columbia!) for each of the years specified in the years argument.
  • Hint 1: Remember that R has a built-in list of the names of the 50 states, and the function seq() may be helpful to create a vector of Census years, spaced out by a decade.
  • Hint 2: Notice that the file names all have the same pattern/format. Using loops, you can read in all of the files by using the pattern found in the file names.
  • Note: Reading data into R is a time-intensive process, so it may take a minute or so for your code to read in all 612 data files.
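The two hints can be sketched as follows. Two things here are assumptions to verify against the actual folder: that the Census years run every decade from 1910 to 2020 (which is consistent with 612 = 51 × 12 files), and the file name pattern, which is purely hypothetical and must be replaced with the pattern the real files use. The sketch only builds file paths; it does not read any data.

```r
# Assumed range of Census years, one per decade (check the folder)
census_years <- seq(1910, 2020, by = 10)

# state.name is R's built-in vector of the 50 state names;
# the District of Columbia has to be added by hand
places <- c(state.name, "District of Columbia")

# Placeholder folder path
folder_path <- "state_populations"

paths <- character(0)
for (year in census_years) {
  for (place in places) {
    # HYPOTHETICAL file name pattern, for illustration only
    file_name <- paste0(place, "_", year, ".csv")
    paths <- c(paths, file.path(folder_path, file_name))
  }
}

length(paths)  # 612 under these assumptions
```

Inside a real reader function, each path would be passed to a csv-reading function and the resulting rows combined into one tibble.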

Function 3

Second, write a function called censusCalculator that takes one argument: data. The data argument should be a tibble of Census data with the columns name, year, resident_population, and number_of_representatives, as returned by the populationDataReader function. The function should return a table containing the US population for each year of the data, the name of the state with the highest ratio of representatives to population in each year, and the name of the state with the lowest ratio of representatives to population in each year. Your output table should be formatted/“look pretty” using the kableExtra, tinytable, or gt packages previously discussed in the “Cross Tabulations and Happy Tables” lecture. You may choose which package/functions and format you like to make your table look pretty.
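The per-year summary described above can be sketched on a tiny made-up tibble. The data values, the two placeholder names "A" and "B", and the output column names are all invented for illustration; the table formatting step is left out.

```r
library(dplyr)
library(tibble)

# Made-up toy data with the four expected columns
toy <- tribble(
  ~name, ~year, ~resident_population, ~number_of_representatives,
  "A",   2010,  1000,                 2,
  "B",   2010,  3000,                 3,
  "A",   2020,  1200,                 2,
  "B",   2020,  2800,                 6
)

summary_tbl <- toy %>%
  # Representatives per resident, computed row by row
  mutate(ratio = number_of_representatives / resident_population) %>%
  group_by(year) %>%
  summarise(
    us_population = sum(resident_population),
    highest_ratio = name[which.max(ratio)],
    lowest_ratio  = name[which.min(ratio)]
  )

print(summary_tbl)
```

Note that which.max() and which.min() return only the first match, so ties would need separate handling if they matter.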

By writing these two functions, you can call them in sequence to calculate the US population for each Census year and determine which states are most and least represented in Congress, all in a single line of code. For example, if you wanted to find this information for the last four Censuses, you could call: populationDataReader(folder_path, years = seq(1990, 2020, by = 10)) %>% censusCalculator().

  • The message R prints when reading a csv file should not appear in your final R Markdown file. There are options for your code chunk that will silence these messages.
  • Print the output of the censusCalculator function for the last 4 Census years to your R Markdown file.
  • Print the output of the censusCalculator function for every other Census year from 1910 to 2010 to your R Markdown file (i.e. for 1910, 1930, 1950, etc.).
  • You can use any R function you need.
  • You can report Functions 2 and 3 together using the report format below.
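The two sets of years requested above can be built with seq(), as in this short sketch. The variable names are illustrative. The comments mention the knitr chunk option message = FALSE as one way to hide messages; treat that as a suggestion to check against the lecture slides.

```r
# In the chunk header, an option such as {r, message = FALSE}
# keeps messages (for example, from reading csv files) out of the pdf.

# The last four Census years
last_four <- seq(1990, 2020, by = 10)

# Every other Census year from 1910 to 2010: step by 20 instead of 10
alternate_years <- seq(1910, 2010, by = 20)

print(last_four)
print(alternate_years)
```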
An example of the report format is shown on the next page.

Note: you don’t need the fancy header for your report. We understand that you are not R Markdown experts, but we expect you to write as clean a report as you can. What matters most is that we can clearly read your name, PERMID, discussion, code, and outputs; do not worry much about other formatting details.

Full Name: . . . .
PERMID: . . . ..

Function 1 - name_cleaner:

Function description:

. . . . describe your function here (what does the function do? what are the inputs and outputs? discuss the testing results as examples of how to use your function, and include any other relevant information that the user should know, etc.) . . . .
Function code: name_cleaner
# Note: Your pdf report must show this code
# set the seed to your PERMID here
......
# Write the name_cleaner function here:
name_cleaner <- function(...){
# write function algorithm here
}
Testing name_cleaner: Test 1:
name_cleaner(string = 'Hello, my name is John Smith!, and I am a student at UCSB who lives in Isla Vista.', list_of_words = c('is', 'I', 'John'))
Testing name_cleaner: Test 2:
# Test 2:
# Call your function here with the correct input and print the output below
... call your function here for the test .......
**FOLLOW THE SAME FORMAT FOR FUNCTIONS 2/3**
