Homework 8 - Data Frames

Due: Mar 29 by 11:59pm

Submission Instructions: Create a zip file of all the files in your R project folder for this assignment, then submit the zip file to the corresponding assignment on Blackboard.

Weight: This assignment is worth 4% of your final grade.

Purpose: The purposes of this assignment are:

  • To practice creating data frames in R.
  • To practice merging and slicing data frames in R.

Assessment: Each question indicates its share of the assignment grade; the shares sum to 100%. The credit for each question will be assigned as follows:

  • 0% for not attempting a response.
  • 50% for attempting the question but with major errors.
  • 75% for attempting the question but with minor errors.
  • 100% for correctly answering the question.

The reflection portion is always worth 10% and graded for completion.

Rules:

  • Problems marked SOLO may not be worked on with other classmates, though you may consult instructors for help.
  • For problems marked COLLABORATIVE, you may work in groups of up to 3 students who are in this course this semester. You may not split up the work; everyone must work on every problem. You may not simply copy code from one another; work through each problem together, and then each submit your own solutions.

Readings

The readings from the last week will serve as a helpful reference as you complete this assignment. You can review them here:

1) Staying organized [SOLO, 5%]

Download and use this template for your assignment. Inside the “hw8” folder, open and edit the R script called hw8.R and fill out your name, GW netID, and the names of anyone you worked with on this assignment.

Using good style

For this assignment, you must use good style to receive full credit. Follow the best practices described in this style guide.

2) Inspect package data [SOLO, 15%]

Write R code to install the dslabs package from CRAN, then write code to load the package. Loading the package makes the movielens data frame available. Using some of the techniques we saw in class, write some code to preview and inspect movielens. For each of the following questions, write code to find your answer and leave a detailed response in a comment:

  • What is this dataset about?
  • How many observations are in the data frame?
  • What is the original source of the data?
  • What type of data is each variable?
  • What are the years of the earliest and most recent observations in the data set?
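As a reminder of what inspecting a data frame can look like, here is a minimal sketch. It uses mtcars, a data frame that ships with base R, as a stand-in (an assumption made purely for illustration; it is not the dataset for this question), and the variable names are hypothetical.

```r
# Illustrative only: mtcars ships with base R and stands in for any data frame
df_example <- mtcars

# Preview the first rows and the overall structure
head(df_example)
str(df_example)

# Number of observations (rows) and variables (columns)
n_rows <- nrow(df_example)
n_cols <- ncol(df_example)

# Data type of each variable
col_types <- sapply(df_example, class)

# Smallest and largest values of one numeric column
mpg_range <- range(df_example$mpg)

# The help page of a packaged dataset usually describes its contents and source
# ?mtcars
```

The same functions work on any data frame, so they can be pointed at a different object and column as needed.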

3) Answer questions about the data [COLLABORATIVE, 25%]

For each of the following questions, write code to find your answer and leave a detailed response in a comment:

  • What are the minimum, mean, and maximum ratings in the data set?
  • How many observations received the maximum rating?
  • What percentage of total observations received the maximum rating?
  • What is the title of the observation with the longest title (in terms of the number of letters in the title)?
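The kinds of calculations these questions call for can be sketched on a small made-up data frame. Everything below (the toy data frame and its name and score columns) is hypothetical and unrelated to the assignment data; it only shows one possible pattern for summary statistics, counting matches, percentages, and string lengths.

```r
# Hypothetical toy data, not taken from the assignment
toy <- data.frame(
    name  = c("Ann", "Bartholomew", "Cy", "Dee"),
    score = c(3.5, 5.0, 2.0, 5.0),
    stringsAsFactors = FALSE
)

# Summary statistics for one column
score_min  <- min(toy$score)
score_mean <- mean(toy$score)
score_max  <- max(toy$score)

# Count the rows that share the maximum value
n_max <- sum(toy$score == score_max)

# Express that count as a percentage of all rows
pct_max <- 100 * n_max / nrow(toy)

# Find the name with the most characters (nchar counts every character,
# including spaces and punctuation)
longest_name <- toy$name[which.max(nchar(toy$name))]
```

If a column contains missing values, functions such as min(), mean(), and max() return NA unless na.rm = TRUE is supplied, so it is worth checking for NAs first.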

4) Loading and inspecting external data [SOLO, 20%]

Write R code to read in the prisoners2019.csv file located in the data folder. Store the object as df. Write some code to preview and inspect the df data frame using some of the techniques we saw in class. For each of the following questions, write code to find your answer and leave a detailed response in a comment:

  • What do you think this dataset is about?
  • How many observations are in the data frame?
  • What type of data is each variable?
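For reference, reading a CSV file into a data frame might look like the sketch below. To stay self-contained it first writes a tiny hypothetical file to a temporary location; the file contents, the toy_df name, and the use of base R's read.csv() are all assumptions for illustration, and the approach shown in class may differ.

```r
# Write a tiny hypothetical CSV to a temporary file so the example runs anywhere
csv_path <- tempfile(fileext = ".csv")
writeLines(
    c("region,count", "North,10", "South,25"),
    csv_path
)

# Read the file into a data frame
toy_df <- read.csv(csv_path, stringsAsFactors = FALSE)

# Preview and inspect the result
head(toy_df)
str(toy_df)
dim(toy_df)
```

Inside an R project, the path would normally be a relative one built from the project folder, for example with file.path() and the folder and file names.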

5) Answer questions about the data [COLLABORATIVE, 25%]

For each of the following questions, write code to find your answer and leave a detailed response in a comment:

  • Which states have the highest and lowest total prison population?
  • Which states have the highest and lowest total prison population as a percentage of the total state population?
  • According to the 2020 U.S. Census, only 12.4% of the U.S. population is black, but some states have imprisoned more black people than any other race. Which states fit this description?
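Questions like these usually come down to finding the row with the largest or smallest value, deriving a percentage column, and comparing columns row by row. Here is a minimal sketch of those three patterns on made-up data; the toy data frame and its column names are hypothetical and do not reflect the structure of the assignment file.

```r
# Hypothetical toy data, not taken from the assignment
toy <- data.frame(
    region  = c("North", "South", "East"),
    group_a = c(10, 40, 20),
    group_b = c(30, 15, 25),
    total   = c(1000, 5000, 4000),
    stringsAsFactors = FALSE
)

# Rows with the largest and smallest combined count
toy$combined <- toy$group_a + toy$group_b
largest  <- toy$region[which.max(toy$combined)]
smallest <- toy$region[which.min(toy$combined)]

# Combined count as a percentage of a total
toy$pct <- 100 * toy$combined / toy$total
largest_pct <- toy$region[which.max(toy$pct)]

# Rows where one column strictly exceeds another
a_exceeds_b <- toy$region[toy$group_a > toy$group_b]
```

Note that which.max() and which.min() return only the first matching row, so ties need a logical comparison such as toy$combined == max(toy$combined) instead.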

6) Read and reflect [SOLO, 10%]

Read and reflect on the following readings to preview what we will be covering next:

Afterwards, in a comment (#) in your .R file, write a short reflection on what you’ve learned and any questions or points of confusion you have about what we’ve covered thus far. This can just be a few sentences related to this assignment, next week’s readings, things going on in the world that remind you of something from class, etc. If there’s anything that jumped out at you, write it down.

Submit

Instructions for how to submit your assignment are at the top of this page.