General Questions & Info
Should I take this course?
Meet some of the types of students you will find in this class.
Jeri
- Starting points
- Ph.D. student in Sociology
- Has experience analyzing data in Stata
- Feels comfortable with regression and other stats methods
- Tried to learn Git on her own once, quickly became frustrated and gave up
- Needs
- Wants to transition from Stata to R
- Will be analyzing a large-scale dataset for her dissertation
- Seeks a reproducible workflow to manage her data projects
Ryan
- Starting points
- Entering the MAPSS program
- Undergraduate degree in journalism
- Hasn’t taken a statistics class in years
- Took an online introductory R course, but hasn’t used R in his day-to-day work
- Needs
- Writing a master’s thesis in a single year
- Expects to analyze a collection of published news articles
- Wants to understand code samples found online and adapt them to his own work
Fernando
- Starting points
- Third-year undergraduate student
- Majoring in political science
- Has taken general education math/stats courses
- Does not have programming experience, but isn’t afraid to tackle a new challenge
- Needs
- Wants to work as a research assistant on a project exploring the onset of civil conflict, which is run exclusively in R
- Will start contributing to a new research paper next quarter
- Wants to produce high-quality visualizations
Fang
- Starting points
- First-year graduate student
- Background in psychology, plans to apply for doctoral programs in marketing
- Has experience using Excel, SPSS, and Stata
- Needs
- Is going to analyze data collected by her lab members in the next six months
- Wants to produce analysis notebooks that are easily shareable with her colleagues
- Expects to take courses in machine learning and statistics which require a background in R
General description
This course is open to any graduate (or advanced undergraduate) student at the University of Chicago. I anticipate drawing students from a wide range of departments such as Information Science, Sociology, Psychology, and Political Science. Typically these students are looking to learn basic computational and analytic skills they can apply to master’s projects or dissertation research.
If you have never programmed before or don’t know what the shell is, prepare for a shock. This class will prove to be very beneficial if you stick with it, but that will require you to commit for the full quarter. I do not presume any prior programming experience, so everyone starts from the same knowledge level. I guarantee that the first few weeks and assignments will be rough - but the good news is that they will be rough for everyone! Your classmates are struggling with you and you can lean on one another to get through the worst part of the learning curve.
Textbooks/Readings
Required
- R for Data Science – Garrett Grolemund and Hadley Wickham. We will be reading several chapters from this book. The open-source online version is available for free; the hardcover version is available for purchase online.
Completing the exercises in the book? No official solution manual exists, but several can be found online. I recommend this version by Jeffrey B. Arnold. Your exact solutions may vary, but these can be a good starting point.
Additional resources
- ggplot2: Elegant Graphics for Data Analysis, 3rd Edition – Hadley Wickham. Excellent resource for the ggplot2 graphics library.
- Advanced R – Hadley Wickham. A deep dive into R as a programming language, not just a tool for data science.
- An Introduction to Statistical Learning: with Applications in R – Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. A thorough introduction to statistical learning and machine learning methods, focusing on the fundamentals of how these methods work and the assumptions that go into them. See also ISLR tidymodels Labs. This site demonstrates how to implement all the labs.
- RStudio Cheatsheets - printable cheat sheets for common R tasks and features.
Resources for under-represented groups in programming
Thanks to Angela Li for compiling these recommendations:
- R LGBTQ Twitter: Affinity group for queer people in the R community – Twitter often promotes events, panels and talks by and for queer R users.
- Gayta Science Twitter: Alliance that uses data science techniques to give LGBTQ+ experiences a voice – Twitter will often share data-driven work concerning the LGBTQ+ community.
- RLadies Community Slack: A global programming meetup for non-binary, trans, and female R users.
- RLadies Remote Twitter: Remote chapter of R Ladies – has Slack coffee chats to discuss programming topics in a supportive environment.
- People of Color Code Meetup: A meetup for POC software developers – has events where POC developers can work on personal projects, collaborate, and learn.
- R Forwards: A task force set up by the R Foundation to address under-representation in the R community – collects representation data in the R community, produces workshops and teaching materials
- R Community Diversity and Inclusion Working Group: Working group set up by the R Consortium to encourage and support diversity and inclusion across a variety of events and platforms in the R community
What do I need for this course?
You will need to bring a computer to class each day. Class sessions are a mix of lecture, demonstration, and live coding. It is essential to have a computer so you can follow along and complete the exercises.
By the end of the first week, you should make sure you can access the following software:
- R - the easiest approach is to select a pre-compiled binary appropriate for your operating system.
- RStudio’s IDE - this is a powerful user interface for programming in R.
- Git - Git is a version control system which is used to manage projects and track changes in computer files. Once installed, it can be integrated into RStudio to manage your course assignments and other projects.
Comprehensive instructions for downloading and setting up this software can be found here.
How will I be evaluated?
Students will complete a series of (roughly) weekly programming assignments linked to class materials. Each assignment will be evaluated by me or a TA.
Assignments will initially come with starter code, or an initial version of the program where you need to fill in the blanks to make it work. As the quarter moves on and your skills become more developed, less help upfront will be provided.
While students are encouraged to assist one another in debugging programs and solving problems in these assignments, it is imperative students also learn how to do this for themselves. That is, students need to understand, write, and submit their own work.
Academic integrity
Each student in this course is expected to abide by the University of Chicago Code of Academic Integrity. Under the provisions of the Code, anyone who gives or receives unauthorized assistance in the preparation of work at home or during tests in class will be subject to disciplinary action. A student’s name on any piece of work is our assurance that they have neither given nor received any unauthorized help in its preparation. Students may assist each other on assignments by answering questions and explaining various concepts. However, one student should not allow another student to copy their work directly. All University policies with respect to cheating will be enforced.
Statement on diversity, inclusion, and disability
The University of Chicago (as an institution) and I (as a human being and instructor of this course) are committed to diversity and rigorous inquiry from multiple perspectives. The MAPSS, CIR, and Computation programs share this commitment and seek to foster productive learning environments based upon inclusion in education, open communication, and mutual respect for a diverse range of identities, experiences, and positions.
Services and reasonable accommodations are available to persons with temporary and permanent disabilities, to students with DACA or undocumented status, to students facing mental health or other personal challenges, and to students with other kinds of learning challenges. Please feel free to let me know if there are circumstances affecting your ability to participate in class.
If you have, or think you may have, a disability, please contact Student Disability Services (SDS) for a confidential discussion and to request accommodation. Once SDS approves your accommodation, it will be emailed to both you and me. Please follow up with me to discuss the necessary logistics of your accommodations. If you need immediate accommodation, please speak with me after class or send an email message to me and SDS.
Any suggestions for how we might further such objectives both in and outside the classroom are appreciated and will be given serious consideration. Please share your suggestions or concerns with your instructors, your preceptor, or your program’s Diversity and Inclusion representatives: Darcy Heuring (MAPSS), Matthias Staisch (CIR), and Chad Cyrenne (MACSS). You are also welcome and encouraged to contact the Faculty Director of your program.
Covid-19 Policies
All students on campus are required to adhere to the University of Chicago guidelines. See UChicago Go Forward for the latest updates.
Acknowledgments
- Stock photos of student learners by Generated Photos
- This page has been developed starting from Benjamin Soltoff’s “Computing for the Social Sciences” course materials, licensed under the CC BY-NC 4.0 Creative Commons License.