HW05: Debugging and practice working with functions
Overview
Due by 11:59 pm on July 5th
The goal of this assignment is to practice debugging common errors in code, and writing/using functions with social science data.
Accessing the hw05 repository
- Go to this link to accept and create your private hw05 repository on GitHub. Once you do so, your repository will be built in a few seconds. It follows the naming convention hw05-<USERNAME>.
- Once your repository has been created, click on the link you see, which will take you to your repository.
- Finally, clone the repository to your computer (or R workbench) following the process below.
Cloning your hw05 repository
After you have accessed the hw05 repository (see above), follow the
same steps you completed for hw1 to clone
the repository.
General workflow
Your general workflow will be:
- Accept the repo and clone it (see above)
- Make changes locally to the files in RStudio
- Save your changes
- Stage-Commit-Push: stage and commit your changes to your local Git repo; then push them online to GitHub. You can complete these steps using the Git GUI integrated into RStudio. In general, you do not want to directly modify your online GitHub repo (if you do so, remember to pull first); instead modify your local Git repo, then stage-commit-push your changes up to your online GitHub repo.
Part 1: Using functions in social science data analysis
The World Bank publishes extensive socioeconomic data on countries and
economies worldwide. In the data_world_bank folder included in this
assignment, I put a subset (n = 20) of the World Bank’s csv data files
with economic indicators for each country
(https://data.worldbank.org/indicator). Each csv file contains the data
for a given country's economy.
Your task is to edit the functions.Rmd file to write and call a function
(give it a meaningful name) that imports each data file and renames some
of the columns in each data file:
- Your function should import a SINGLE data file (i.e., do not try to run an iterative operation inside the function; technically this can work, but it is far harder to fix errors and write the body of the function if you are performing both tasks simultaneously). The function should take a single argument: the file path to the data file. Given this path, the function should import and rename the data, and return the cleaned data as output.
- Your function should rename the following four variables: “Country Name”, “Country Code”, “Indicator Name”, “Indicator Code”, as country, country_code, indicator, indicator_code.
- Before writing your function, make sure to inspect a few of the csv files. For example, when you import the data you want to skip the first four rows, etc.
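The requirements above can be sketched roughly as follows. This is only an illustration, not the expected solution. The function name `import_indicators` is hypothetical, and the toy file (four placeholder metadata lines, then a header row, with made-up values) is an assumption about the layout, so inspect the real csv files before relying on it. Base R is used to keep the sketch self-contained; `readr::read_csv()` with `skip = 4` followed by `dplyr::rename()` would follow the same pattern.

```r
# Toy stand-in file (assumed layout, made-up values):
# four metadata lines, then the header row, then the data.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "metadata line 1",
  "metadata line 2",
  "metadata line 3",
  "metadata line 4",
  '"Country Name","Country Code","Indicator Name","Indicator Code","1960","1961"',
  '"Country A","AAA","Population, total","SP.POP.TOTL",100,110'
), toy_path)

# Hypothetical function: import ONE file and rename four columns.
import_indicators <- function(path) {
  # skip = 4 drops the metadata lines; check.names = FALSE keeps
  # headers such as "Country Name" and "1960" exactly as written.
  raw <- read.csv(path, skip = 4, check.names = FALSE,
                  stringsAsFactors = FALSE)

  old_names <- c("Country Name", "Country Code",
                 "Indicator Name", "Indicator Code")
  new_names <- c("country", "country_code",
                 "indicator", "indicator_code")

  # Fail early if the file does not have the expected headers.
  stopifnot(all(old_names %in% names(raw)))
  names(raw)[match(old_names, names(raw))] <- new_names

  raw
}

# Call the function on a single file path.
cleaned <- import_indicators(toy_path)
names(cleaned)
```

Keeping the function to a single file means you can call it on one path, check the result, and fix any problem before any iteration is involved.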
Once you have written this function, demonstrate that it works by
importing the data files and combining them into a single data frame
using an iterative operation. Follow the instructions provided in the
functions.Rmd file for more.
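One possible shape for the iterative step is sketched below, under stated assumptions. The helper `import_one` is a hypothetical, simplified stand-in for your own import function (it skips the renaming). The folder `toy_world_bank` and its two files are made-up examples, not the assignment data. The sketch uses base R `lapply()` and `rbind()`; `purrr::map()` followed by `dplyr::bind_rows()` would be a tidyverse equivalent.

```r
# Made-up toy folder with two files that share the same columns.
toy_dir <- file.path(tempdir(), "toy_world_bank")
dir.create(toy_dir, showWarnings = FALSE)

write_toy <- function(file_name, country, code, value) {
  writeLines(c(
    "metadata line 1", "metadata line 2",
    "metadata line 3", "metadata line 4",
    '"Country Name","Country Code","Indicator Name","Indicator Code","1960"',
    sprintf('"%s","%s","Population, total","SP.POP.TOTL",%d',
            country, code, value)
  ), file.path(toy_dir, file_name))
}
write_toy("toy_a.csv", "Country A", "AAA", 100L)
write_toy("toy_b.csv", "Country B", "BBB", 200L)

# Hypothetical stand-in for your own single-file import function.
import_one <- function(path) {
  read.csv(path, skip = 4, check.names = FALSE, stringsAsFactors = FALSE)
}

# List every csv file, apply the function to each path,
# then stack the resulting data frames into one.
paths <- list.files(toy_dir, pattern = "\\.csv$", full.names = TRUE)
all_data <- do.call(rbind, lapply(paths, import_one))

nrow(all_data)
```

Because the function handles one file at a time, the iteration only has to supply the file paths and combine the results.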
Part 2: Debugging code
The repository contains a file called fix-errors.Rmd. This script
includes code to conduct analysis of baby name popularity in the United
States using the babynames
package.
Its author made some mistakes and the script currently does not work. Fix the errors/warnings in the script to generate the desired output.
Submit the assignment
To submit the assignment, simply push to your repository the last
version of your assignment before the deadline. Then copy your
repository URL (e.g., https://github.com/css-fall22/hw05-brinasab) and
submit it to Canvas under HW05 before the deadline.
Make sure to stage-commit-push:
- the revised fix-errors.Rmd (from this file, also generate and submit a fix-errors.md file)
- the completed functions.Rmd (from this file, also generate and submit a functions.md file)
Rubric
Needs improvement: The errors script has not been successfully fixed. The function to import the data has not been fully set up, and/or is used incorrectly. The code does not run and/or only partially runs. Partial or insufficient attention to standards of reproducible research.
Satisfactory: Solid effort. Hits all the elements. Finished all components of the assignment with only minor deficiencies. Easy to follow (both the code and the output).
Excellent: Finished all assignment components correctly and used efficient code to complete the exercises. The solutions adopted went beyond what was strictly required. The code is well-documented (both self-documented and with additional comments as necessary). The function is written succinctly/comprehensibly and used correctly. Used multiple commits to back up and show a progression in work.
For further details, see the general rubric we adopt for grading.