
Chapter 4 Running a DataTrail cohort
It’s a good idea to hold a weekly team meeting with the case manager, instructor, tutors, and program administrator so that any issues that arise during the cohort can be discussed as a team.
We assume and advise that, if possible, you run your cohort with in-person office hours; however, the Baltimore DataTrail program did at times run a cohort completely virtually.
4.1 Welcome week
The first week of the program should have the following goals:
- Scholars get to know staff and their fellow scholars – don’t underestimate how important human connection is!
- Set up expectations and motivations for the program
- Collect signed DataTrail training agreements from each scholar. See our template
- Assess employment skills and make a plan for preparing for work
- Assess each scholar’s potential barriers in more detail in one-on-ones
- Set up each scholar with a Chromebook (it is advisable to run software updates before handing these out)
- Set up payments and calendar
- Set up a Google Classroom for your cohort. You can request that here
- Invite all scholars, tutors, and the case manager to the Google Classroom
4.2 Components of the curriculum
Relevant Roles: Lead instructor, tutors, and Case Manager.
The course material is here: https://datatrail-jhu.github.io/DataTrail/ (quizzes and swirl modules are not available at this link).
The course is split into 8 overall sections, numbered 00 through 07:
- 00 Intro - (note this section doesn’t have a project!)
- 01 Forming Questions
- 02 Getting Data
- 03 Cleaning the Data
- 04 Plot the Data
- 05 Get the Stats
- 06 Share Results
- 07 Build your Resume - This project isn’t an Rmd but a website
For scholars to get credit and payment for work on each section they must:
- Attend all office hours
- Complete and submit the associated projects
- Complete and submit the associated quizzes
- Complete and submit the associated swirl modules
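One way to keep track of these requirements is a small checklist per scholar and section. The Python sketch below is illustrative only: the `SectionRecord` class, its field names, and the example values are hypothetical and not part of the DataTrail materials. It simply encodes the rule that every requirement must be met, and it assumes sections without a project (such as 00 Intro) skip the project check.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SectionRecord:
    """Hypothetical per-scholar record for one curriculum section."""

    section: str
    attended_all_office_hours: bool
    project_submitted: bool
    quizzes_submitted: bool
    swirl_modules_submitted: bool
    has_project: bool = True  # assumption: 00 Intro would set this to False

    def unmet_requirements(self) -> list[str]:
        """Return the names of requirements that are not yet met."""
        checks = {
            "office hours": self.attended_all_office_hours,
            "quizzes": self.quizzes_submitted,
            "swirl modules": self.swirl_modules_submitted,
        }
        if self.has_project:
            checks["project"] = self.project_submitted
        return [name for name, done in checks.items() if not done]

    def earns_credit(self) -> bool:
        """Credit requires that no requirement is left unmet."""
        return not self.unmet_requirements()


# Example with made-up values: quizzes are still outstanding.
record = SectionRecord("02 Getting Data", True, True, False, True)
print(record.unmet_requirements())
print(record.earns_credit())
```

A list of unmet items, rather than a single yes/no answer, gives the Case Manager and instructors something concrete to discuss with the scholar.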
4.3 Supporting students to meet requirements
The Case Manager, lead instructor, and tutors work together to keep tabs on the scholars and determine whether they have met the requirements for each section. If a scholar hasn’t met the requirements, the Case Manager and instructors will want to talk with the scholar and explore what type of support they may need to reach them.
Unfortunately, not all DataTrail scholars will complete the program. Sometimes life circumstances impede their ability to complete it. Other times, the scholar, with the help of the Case Manager, may simply determine that the DataTrail program is not the right fit for them at this time.
The scholar, the Case Manager, and, when appropriate, the instructors should discuss what the scholar most needs at this time to be supported. Sometimes the best way to support a scholar is for them to discontinue the program, or to attempt it again at a later date if a particular circumstance might have improved.
TODO: This needs Liz and/or Simone to make this section better and more detailed.
4.4 Example schedule
A weekly schedule during a cohort for a given scholar may look like this:
| Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|---|
| | 2.5 hr individual work; 30 min meeting with case manager | 2 hr individual work; 1.5 hr for office hours | 2 hr individual work; 1 hr for tutor one on one | 2 hr individual work; 1.5 hr for office hours | 2 hr individual work | |
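Adding up the example schedule gives a sense of the weekly time commitment for a scholar. The Python sketch below only sums the hours listed in the schedule above; the variable names are illustrative.

```python
# Hours for each scheduled day, copied from the example schedule.
# Each tuple is (individual work, meetings / office hours / tutoring).
weekly_sessions = [
    (2.5, 0.5),  # individual work + 30 min meeting with case manager
    (2.0, 1.5),  # individual work + office hours
    (2.0, 1.0),  # individual work + tutor one on one
    (2.0, 1.5),  # individual work + office hours
    (2.0, 0.0),  # individual work only
]

individual_hours = sum(work for work, _ in weekly_sessions)
supported_hours = sum(other for _, other in weekly_sessions)
total_hours = individual_hours + supported_hours

print(f"Individual work: {individual_hours} hr")  # 10.5 hr
print(f"Meetings, office hours, tutoring: {supported_hours} hr")  # 4.5 hr
print(f"Total per week: {total_hours} hr")  # 15.0 hr
```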
Suggested structure for each office hour:
- Start off by asking what questions folks have and if there’s anything from the material they’d like to go over as a group
- Dive into an exercise – you can use the ones below, or something else if students have requested a topic during the week
Session | Relevant course link | Suggested office hour material | Objectives |
---|---|---|---|
Welcome week | TODO: LINK TO COMPLETED CAREER READINESS | Make sure they are on Slack and Google Classroom; have them open up Zoom, practice sharing screens, and show that they have access to the Google Classroom; practice sending Zoom chat messages as well | Comfort with the platforms and with Google Calendar |
Office Hours 1 | Intro 1 | Ice breakers and introductions; introduce what office hours will be like | Make sure everyone is on Slack and Google Classroom and is able to use their Chromebook; try to make everyone comfortable with chatting – keep it informal and chill |
Office Hours 2 | Intro 2 | Ask folks what data science is; address any lingering tech issues | Everyone should be on Basecamp and have an RStudio account |
Office Hours 3 | Forming Questions 1 | Discuss the steps of data science; walk through a data science example; discuss How to Learn techniques | Set up the mentality that data science is all about questions, so questions are encouraged, and that data science can be frustrating, which is expected and okay! |
Office Hours 4 | Forming Questions 2 | Take a tour through RStudio and demonstrate live how R Markdowns work; cover the basics of the first project in RStudio | Discuss what questions can be answered by data science; get familiar with what an R Markdown looks like |
Office Hours 5 | Getting Data 1 | Go over R objects; leave a lot of time for R questions | Ask how the first project went; understand the basics of R objects; cover debugging tips |
Office Hours 6 | Getting Data 2 | Cover data frames exercise 1; data frames exercise 2 | Understand data frames; leave more time for R questions |
Office Hours 7 | Getting Data 3 | Cover file paths with RStudio demonstrations; demonstrate R Markdown functionality | Understand file paths; be able to read in a CSV file; be able to use an R Markdown; be able to upload a file to RStudio; practice loading libraries with library() |
Office Hours 8 | Getting Data 4 | Preview the Getting Data project, specifically making sure everyone has made their Leanpub data googlesheet | Prep for Getting Data project; look out for googlesheets credential issues; practice using googlesheets4 |
Office Hours 9 | Cleaning the Data 1 | Leave some time for wrapping up the Getting Data project; start on Tidying exercise 1 | Have Getting Data Project wrapped up; get folks comfortable with the idea of tidy data; show them TidyDataTutor |
Office Hours 10 | Cleaning the Data 2 | Leave time for wrapping up Tidying exercise 1; can start Tidying exercise 2 | Become more comfortable manipulating strings for data cleaning |
Office Hours 11 | Cleaning the Data 3 | Can finish up Tidying exercise 2; can introduce a tidytuesday case if ahead of schedule | Become more comfortable with cleaning data |
Office Hours 12 | Cleaning the Data 4 | Introduce Cleaning the Data Project; as a group, walk through the joining exercises in the material | Understand joins; be ready to clean up a dataset from soup to nuts |
Office Hours 13 | Plotting the data 1 | Leave time for covering Cleaning Data Project wrap up; cover what makes a good plot; start Data Viz exercise 1 | Be comfortable with the goal of data viz; understand the basic formula of ggplots |
Office Hours 14 | Plotting the data 2 | Start up data viz exercise 2 | Become comfortable with ggplot2 |
Office Hours 15 | Plotting the data 3 | Start up data viz exercise 3 | Become more comfortable with ggplot2 |
Office Hours 16 | Plotting the data 4 | Wrap up any of the unfinished data viz exercises; introduce Plotting the data Project | Be able to ask a question and then make a viz to answer it |
Office Hours 17 | Getting Statistics 1 | Introduce the concepts behind translating questions to stats; go through the In practice chapter as a group | Set up the mentality for statistics and how one might use them in data science – emphasize that memorization is not needed! |
Office Hours 18 | Getting Statistics 2 | Discuss the difference between Descriptive, Exploratory and Inferential statistics by going through those chapters as a group | Understand the different kinds of statistical questions |
Office Hours 19 | Getting Statistics 3 | Go through the “Playing with Stats” exercise | Try to get an intuitive sense for statistics and distributions |
Office Hours 20 | Getting Statistics 4 | Introduce the Get the Stats Project | Help students get comfortable with applying the stats |
Office Hours 21 | Sharing Results 1 | Go through what version control means and why it aids in reproducibility (can use this exercise); demonstrate how to link RStudio with GitHub | Help students learn about the importance of GitHub; help students be able to version control their projects by linking to GitHub in RStudio |
Office Hours 22 | Sharing Results 2 | Demonstrate how to file a pull request and cover GitHub terms and workflow using this chapter | Leave time open for GitHub terms; emphasize that using version control and GitHub is a series of habits they can develop over time |
Office Hours 23 | Sharing Results 3 | Introduce the Final project and tell students to start thinking about what kind of data science question they’d like to ask and begin looking for datasets that fit. Ask Davon to show his example. | Encourage them to ask a data science question they are interested in! |
Office Hours 24 | Sharing Results 4 | Help students brainstorm their projects and find data | Set up students for their final project! |
Office Hours 25 | Sharing Results 5 | Keep this office hour fairly open for students to show their progress on their final project and ask for help where needed | |
Office Hours 26 | Building a resume 1 | Encourage students to begin putting their final project in a presentation. Show examples of good presentations. Ask Davon to show his example presentation. | Prepare the students for presenting their project at graduation! |
Office Hours 27 | Building a resume 2 | Introduce the Create your portfolio project; have additional time for any help needed for their final project presentations | Help the students prepare how to show off the work they’ve done; leave time open for any questions about data science careers in general |
Office Hours 28 | Building a resume 3 | Have students take turns showing off their portfolio website! Wrap up any more help they need with their final project presentations | Celebrate the work they’ve done! |
Graduation | | Celebrate their amazing work! | Have the students celebrate the large amount of amazing work they’ve done in such a short amount of time – do not give critical feedback on the scholar presentations; these are all about encouragement and celebration! |
4.4.1 Wrapping up a cohort
As you near the end of a cohort, you have likely learned a lot about your scholars. In this time, they’ve likely learned a lot about data science and are interested in and ready for an internship, which will allow them to further hone their skills!
TODO: How to do internship placements
4.4.2 To contribute to the curriculum:
We are always looking to improve the DataTrail curriculum. If you encounter issues or bugs, or otherwise find things in the curriculum that could use improvement, please let the curriculum developers know.
- You can email or Slack csavonen@fredhutch.org with recommendations/problems/concerns.
- You can also post GitHub issues here: https://github.com/datatrail-jhu/DataTrail/issues
- All the associated DataTrail GitHub repositories are here: https://github.com/datatrail-jhu if you’d like to file pull requests.
* Note that for self-learners (not a part of a cohort), a Leanpub version of this material is available for certification here.