INFO201 Problem Set: Accessing files, rmarkdown, data frames
January 24, 2025
Instructions
This assignment is much more advanced than the previous one. In particular, here we ask you to write what you do in rmarkdown, knit the result, and submit the resulting html file.
The PS has multiple aims:
- Be able to write and knit rmarkdown, and change the basic code chunk properties.
- Be able to locate files across the file system and load those into R.
- Be mostly proficient in using if, else, ifelse, any and all.
Requirements:
- Please include question numbers before your answers! (In markdown.)
- Ensure that your code produces suitable output. Now we expect you to comment on your results in markdown, not in code comments. Answers that do not produce legible output will not count!
- Ensure your code runs, and works correctly. We recommend testing it (knitting) frequently.
Submission
When done, knit your file into html. Submit the following files:
- your rmarkdown file
- the knitted html file
These things constitute your submission.
Enjoy!
1 Markdown
Consult The Course Book 10.3 Markdown syntax.
1. Format the problem set as follows:
- The title should be something like “PS3” (in the yaml title field that is automatically generated by rstudio).
- Sections should be second level headers (use ##). Sections are labeled like 1 Markdown, 2 Working with files, ...
- Subsections should be 3rd level headers (subsections are labeled as 2.1 Working directory, 2.2 List files, ...). Not all sections have subsections.
- Use markdown fourth level headers for questions. Questions are labeled like 1. Format the problem set..., 2. Create a bullet list..., 3. Create a numbered list..., ... You only need to mark the section/question number, not the name or the question itself.
2. Create a bullet list of (at least 3) colors
3. Create a numbered list of (at least 3) cities.
4. Create a pre-formatted text (at least 3 lines)
5. Write a sentence that includes both bold and italic text.
6. Include an image in the text using markdown
2 Working with files
2.1 Working directory
Before we get into any work with data, we have to talk about the working directory.
1. What is the working directory of your RStudio console? How can you find it?
Note: you cannot find it in your rmarkdown code! Your rmarkdown may give the correct answer, but it is still conceptually wrong. See Course book 10.7.2 Knitting is a separate process.
2. Why can't you find it in your rmarkdown?
3. What is the folder where your homework .rmd file is saved?
4. Run getwd() in your rmarkdown file. What does it return? Is it the same folder as where your .rmd is saved in Q 3? Is it the same as the working directory in RStudio console in Q 1? Why does it matter?
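A minimal sketch of the call involved, using base R only. The same lines can report different folders depending on whether they run in the RStudio console or in a knitted chunk, because knitting is a separate process. The file name "ps03.rmd" is a hypothetical placeholder, not a file the assignment provides.

```r
# Print the working directory of the current R process.
wd <- getwd()
print(wd)

# Check whether a (hypothetical) file name resolves inside that directory.
# "ps03.rmd" is a placeholder; replace it with your own file name.
in_wd <- file.exists(file.path(wd, "ps03.rmd"))
print(in_wd)
```

If the check prints FALSE for your own .rmd file, the process you ran it in has a different working directory than the folder where the file is saved.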
2.2 List files
1. Now create a folder on your Desktop where you put
- At least 3 pdf files
- At least 3 picture files
- At least 3 other files
- At least 3 directories (you can create new empty directories)
What did you name that folder?
2. Sketch the file system tree of your computer, where you include:
- the folder where your rmarkdown is saved
- the folder that you just created on your desktop
- a few other files/folders.
Include the sketch in your markdown file using markdown tools. If you made the sketch on paper, you can take a photo of it and include it here.
Mark the path from the folder where you have your rmarkdown file to the new folder on the desktop.
See Course book 9.1.1
3. What is the relative path of that folder with respect to the rmarkdown working directory (the directory you printed in Q 2.1.3)?
See The course book 9.1.2.
4. Include a screenshot of the folder (in a file explorer) in this document as an image. Use the markdown tools to include the image!
5. Now use the relative path you wrote in item 3 to list all files/folders from within R. Store the list of files into a variable, and print it!
See The course book 9.2.2.
6. Do you see the same files as in the image?
7. Do you see the complete file name (including extensions) in the file explorer image?
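The idea of listing files through a relative path can be sketched as follows. This is an illustration under stated assumptions, not the expected answer: the folder layout ("project", "Desktop/stuff") and the file names are hypothetical and are created in a temporary directory so the example runs anywhere.

```r
# Build a small throwaway folder tree; all names are hypothetical.
base <- tempfile("demo")
dir.create(file.path(base, "project"), recursive = TRUE)
dir.create(file.path(base, "Desktop", "stuff"), recursive = TRUE)
file.create(file.path(base, "Desktop", "stuff", c("a.pdf", "b.png")))

# Pretend the rmarkdown file lives in "project": make it the working directory.
old <- setwd(file.path(base, "project"))

# ".." moves one level up; from there the path descends into Desktop/stuff.
files <- list.files("../Desktop/stuff")
print(files)

# Restore the previous working directory.
setwd(old)
```

list.files() returns the full names including extensions, which a file explorer may hide.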
2.3 How big are the files?
1. Write a for loop over these files. Inside the loop:
(a) Get the file info (you can use file.info() function).
(b) If the file is a directory print “<file name> is a directory” (replace “<file name>” with the actual name).
Hint: you use the $isdir component along these lines:
info <- file.info("ps03-markdown-df.rnw")
info$isdir  # is it a folder?
## [1] FALSE
(c) If it is not a directory, find the file size.
Hint: you can use the $size component as:
info$size
## [1] 18551
(d) Print the file name, followed by its size in a pretty manner, using commas to separate thousands. The final line should look something like
## ps03-markdown-df.rnw: 18,551
Feel free to phrase it better, but ensure the size is printed prettily!
Hint: check out the function prettyNum()!
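A short sketch of how prettyNum() formats a number, reusing the file name and size from the hints above. The variable names are illustrative only.

```r
# File name and size as shown in the hints above.
name <- "ps03-markdown-df.rnw"
size <- 18551

# big.mark inserts the separator between groups of three digits.
pretty_size <- prettyNum(size, big.mark = ",")

# Glue name and size together and print the line.
line <- paste0(name, ": ", pretty_size)
cat(line, "\n")
```

Inside a loop, the same two steps would be applied to each file's own name and size.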
2.4 Display the files (extra credit, 5pt)
Warning: unfortunately, the jupyterhub server does not let you install either the magick or the imager package. So you cannot easily display pdf files on the server; you can still use markdown tools to display images though.
1. Write a loop over all files in the folder you made.
Hint: instead of writing the loop over a sequence like 1:10
for(i in 1:10) {
...
}
write it over the file names.
In the loop:
- If the file is an image (file name ends with jpg/png/heic/...), then load it and display it in your final file (either using code or markdown tools to show the image). Print the file name above the plot in bold!
  Hint: there are several R packages that can display images.
  You can use endsWith() function to check the file name extension, or the corresponding stringr functions.
- If the file is a pdf (file name ends with .pdf), then load its 2nd page and plot it (hence you need at least 2-page pdfs...). Print the file name underneath the plot in italics.
- If the file is something else, then print its name in quotes with a remark “cannot be displayed”, for instance: “resume.doc” cannot be displayed
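Looping over names rather than over an index sequence can be sketched like this. It only shows the loop shape and the endsWith() check. The file names are hypothetical, and no image or pdf is actually loaded.

```r
# Hypothetical file names; in practice these would come from list.files().
files <- c("photo.JPG", "notes.pdf", "resume.doc")

# The loop variable takes each file name in turn, not a number.
for (f in files) {
  # endsWith() is case-sensitive, so lower-case the name before checking.
  if (endsWith(tolower(f), ".jpg")) {
    cat(f, "looks like a jpg image\n")
  } else {
    cat('"', f, '" is not a jpg\n', sep = "")
  }
}
```

Further branches for other extensions can be added with else if.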
3 Control structures
1. Take the vector of file names in your sample folder you created above. Transform each file name into a sentence where you tell the file type for each file. You should distinguish between
a) images; b) pdf-s; c) everything else. The output should look something like:
picture.jpg is an image file
document.pdf is a pdf file
output.log is something else
. . .
Do this using ifelse(), do not use loops or indexing!
Hint: you may need to put one ifelse inside the other one.
You can use endsWith() function to check the file name extension, or the corresponding stringr functions.
2. Do you have any .png files included? Compute the number of png files, and print the corresponding sentence. Depending on the number of png files, you should print either:
There are no png files
or
There is a single png file
or
There are n png files
where n is the number of png files! This should be done using if/else, not hardcoded!
3. Extra credit (1pt): Use sum() and a logical expression to count png files. See Course book 4.6. Computing with vectors.
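The techniques named in this section (nested ifelse(), if/else on a count, and sum() over a logical vector) can be sketched on a different example. The file names and the csv/R categories below are hypothetical and deliberately differ from the image/pdf task.

```r
# Hypothetical file names, used only for illustration.
files <- c("data.csv", "analysis.R", "readme.txt", "more.csv")

# Nested ifelse(): the inner call handles everything the outer test rejects.
kind <- ifelse(endsWith(files, ".csv"), "a data file",
               ifelse(endsWith(files, ".R"), "an R script", "something else"))
sentences <- paste(files, "is", kind)
print(sentences)

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches.
n_csv <- sum(endsWith(files, ".csv"))

# if/else picks the wording from the computed count, nothing is hardcoded.
if (n_csv == 0) {
  cat("There are no csv files\n")
} else if (n_csv == 1) {
  cat("There is a single csv file\n")
} else {
  cat("There are", n_csv, "csv files\n")
}
```

ifelse() works on the whole vector at once, while if/else expects a single TRUE or FALSE, which is why the count is computed first.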