The Story So Far

Pretend you are a new professor at Hogwarts. Their LMS is even more arcane than their subject material, but you have managed to get it to give you the student table and gradebook downloads.

The only problem is they are basically illegible, and you can't read all those lines yourself. Sure, there are only a thousand or so students - but what if there were 10 thousand? 20 thousand? The files quickly become too big to open in most text editors, and don't even try Excel.

What we need is some way to go through the data, line by line, and extract only the data we want.

And what data do we want? Well, like everyone else in wizarding Britain at this time, we are interested in one Harry James Potter. Your goal - and the graded component of this worksheet - is to figure out which classes Harry is enrolled in, and what his grades are.

Fortunately, you have some versatile magic at your disposal - grep.

Let's start by taking a look inside the student data. Loading all of it at once is a fool's errand, so let us look at the top to see what is available:

head Data/students.csv
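
If ten lines is more (or less) than you want, head takes a line count. The sketch below is only an illustration: it builds a throwaway file in a temporary folder, and the file name, columns, and rows are invented, so don't expect the real students.csv to look like this.

```shell
# Build a tiny stand-in file; every name and column here is made up.
sandbox=$(mktemp -d)
printf '%s\n' 'id,first,last' '201,Tilda,Marsh' '202,Owen,Brack' '203,Matilda,Quill' > "$sandbox/sample.csv"

# head prints the first 10 lines by default; -n asks for a different count.
top_two=$(head -n 2 "$sandbox/sample.csv")
echo "$top_two"
```

The first line of a CSV is usually the header, so even head -n 1 is often enough to learn what columns are available.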

Typing "Data/" all the time is wasted effort. We can navigate to the Data folder:

cd Data

Now we can access the files more directly.

All we can remember about this student is his name. Fortunately, grep can help us there:

grep "John" students.csv

No, that's not quite right. But you can figure it out.
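
If your search returns too much or too little, it helps to remember that grep matches the pattern anywhere on the line. Here is a small sketch on an invented file (the names and layout are assumptions, not the real student table) showing a loose pattern, then a tighter one with -i to ignore case.

```shell
# Throwaway data for illustration only; the rows are invented.
sandbox=$(mktemp -d)
printf '%s\n' 'id,first,last' '201,Tilda,Marsh' '202,Owen,Brack' '203,Matilda,Quill' > "$sandbox/sample.csv"

# grep prints every line containing the pattern, anywhere on the line,
# so "ilda" catches both Tilda and Matilda.
loose=$(grep "ilda" "$sandbox/sample.csv")

# A more specific pattern narrows the result; -i ignores upper/lower case.
tight=$(grep -i "tilda,marsh" "$sandbox/sample.csv")

echo "$loose"
echo "$tight"
```

Including part of the surrounding line in the pattern, as the second search does, is often the quickest way to get rid of near misses.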

Question 1: Figure out Harry Potter's student ID number.

Let's stash this information for later! We can make a new file, harrydata.csv, using the file writing operator > from yesterday:

grep "John" students.csv > harrydata.csv

Oh wait, that's John again. You can fix that.
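
Fixing it is painless because > starts the file over every time. A minimal sketch, using invented rows and a hypothetical output file called found.csv:

```shell
# Invented sample rows; found.csv is a hypothetical output file.
sandbox=$(mktemp -d)
printf '%s\n' '201,Tilda,Marsh' '202,Owen,Brack' > "$sandbox/sample.csv"

grep "Tilda" "$sandbox/sample.csv" > "$sandbox/found.csv"
# Running a second search with > replaces the earlier contents entirely.
grep "Owen" "$sandbox/sample.csv" > "$sandbox/found.csv"

cat "$sandbox/found.csv"
```

Only the result of the last search survives, which is exactly what you want when the first attempt was a mistake.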

The good news is now we know a lot more about the kid. The bad news is that the classes he is enrolled in aren't in this file. They're hiding somewhere else.

Take a look, using head, at one of the other files.

You should be able to figure out how to search one of these files for Harry now - but how to search all of them?

The second piece of lost magic that got you this job, and perhaps one of the most powerful in your toolkit, is the wildcard. The simple asterisk * stands in for any run of characters in a file name, including none at all.

For example, cat * will concatenate every file in the current directory and throw it at you. head *.txt will show you the tops of all the .txt files.
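
To see what a wildcard actually does, it can help to try it on files you control. The sketch below is an illustration with made-up files (one.txt, two.txt, notes.md) in a temporary folder; none of them exist in the worksheet's Data folder.

```shell
# Three invented files in a temporary folder.
sandbox=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' > "$sandbox/one.txt"
printf '%s\n' 'gamma' 'beta' > "$sandbox/two.txt"
printf '%s\n' 'beta' > "$sandbox/notes.md"

# The shell expands *.txt to the matching names before the command runs.
matched=$(cd "$sandbox" && echo *.txt)
echo "$matched"

# Given several files, grep prefixes each hit with the file it came from.
hits=$(cd "$sandbox" && grep "beta" *.txt)
echo "$hits"
```

Notice that notes.md never shows up: the pattern *.txt only matches names ending in .txt, so the part of the name you spell out works as a filter.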

Putting together what you know, you can figure out which class files contain data for Harry. Make sure to skip the students.csv file - we already have that data. Then figure out how to get the class names from those files, all from the terminal.

We want to store this data in harrydata.csv, but we don't want to lose what we already had - so we can use the file append operator >> instead to append to the file.
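
The difference between the two operators is easy to check on a scratch file. A quick sketch, with a hypothetical file named notes.csv:

```shell
# notes.csv is a scratch file invented for this demonstration.
sandbox=$(mktemp -d)

echo "first line" > "$sandbox/notes.csv"    # > creates (or overwrites) the file
echo "second line" >> "$sandbox/notes.csv"  # >> adds to the end instead

cat "$sandbox/notes.csv"
```

Both lines are still there afterwards. Had the second command used > as well, only "second line" would remain.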

Question 2: What classes has Harry taken?

The last matter is figuring out what kind of student this Potter kid is.

Question 3: On the standard Purdue grading scale, figure out Harry's Grade Point Average:

GPA    Letter Grade   Percentage
4.33   A+             97%-100%
4.0    A              93%-96%
3.6    A-             90%-92%
3.3    B+             87%-89%
3.0    B              83%-86%
2.6    B-             80%-82%
2.3    C+             77%-79%
2.0    C              73%-76%
1.6    C-             70%-72%
1.3    D+             67%-69%
1.0    D              63%-66%
0.6    D-             60%-62%
0      F              0%-59%
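
If you would rather not do the arithmetic by hand, the table translates directly into a few lines of awk. This is only a sketch under two assumptions: the three percentages are invented, not Harry's, and the GPA is taken as a plain unweighted average of the grade points, since the worksheet does not mention credit hours.

```shell
# Hypothetical percentages, one per course; not taken from the worksheet data.
scores="95 88 71"

gpa=$(printf '%s\n' $scores | awk '
  # Map a percentage to grade points using the table above.
  function points(p) {
    if (p >= 97) return 4.33
    if (p >= 93) return 4.0
    if (p >= 90) return 3.6
    if (p >= 87) return 3.3
    if (p >= 83) return 3.0
    if (p >= 80) return 2.6
    if (p >= 77) return 2.3
    if (p >= 73) return 2.0
    if (p >= 70) return 1.6
    if (p >= 67) return 1.3
    if (p >= 63) return 1.0
    if (p >= 60) return 0.6
    return 0
  }
  # Add up the points for each score, then print the unweighted mean.
  { total += points($1); n++ }
  END { if (n > 0) printf "%.2f\n", total / n }
')
echo "$gpa"
```

For these made-up scores the grade points are 4.0, 3.3, and 1.6, so the average is 8.9 / 3, which rounds to 2.97.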

Advanced Exercise: Combining BASH Commands

We have only scratched the surface of what piping can do.

Reading the help for grep, head, and xargs, can you write a single line that takes a student id and prints out the courses that student is enrolled in?

Taking it a step further with sed - a "stream editor" - can you write a single line that takes in a student's name and prints out the courses they are enrolled in?
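
As a starting point rather than a solution, here is roughly what piping into sed looks like. The file and its layout are invented for the example (the real files may put the ID somewhere else), and the sed expression is just one simple substitution.

```shell
# Invented rows; assumes, for illustration only, that the id is the first field.
sandbox=$(mktemp -d)
printf '%s\n' '201,Tilda,Marsh' '202,Owen,Brack' > "$sandbox/sample.csv"

# The pipe | feeds the output of grep to sed, which deletes everything from
# the first comma onward and so keeps only the leading field.
first_field=$(grep "Owen" "$sandbox/sample.csv" | sed 's/,.*//')
echo "$first_field"
```

The value that comes out the end of one pipeline can become the search pattern for the next command, which is the idea both exercises are built on.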


Advanced Exercise: Bash Scripting

One of the powerful things about bash is its scripting capabilities.

Following the tutorial from the folks at linuxconfig.org, can you make a script that takes in a student's name and tells you their GPA?


Department of Mathematics, Purdue University
Copyright© 2018, Purdue University, all rights reserved.