Tuesday, October 8, 2013

Online Course Reviews: Coursera's Machine Learning and Probabilistic Graphical Models

Whoops, I haven't posted in a while.

In May, I started a new job. It has nothing to do with data science, but it has given me experience in supervising other writers, and it's also kept me quite busy. That explains why I haven't posted recently. It also explains the one caveat I have to add to the reviews I'm about to give you: I was never able to finish all the material for either course. I got busy with the new job and moving my family into a new house, and by the time I came up for air, it was too late to finish.

As I mentioned in my April post, I signed up for Coursera's Probabilistic Graphical Models, Machine Learning, and An Introduction to Interactive Programming in Python. I dropped An Introduction to Interactive Programming in Python almost immediately, after realizing that the course's focus on programming video games made it less useful for my purposes than I had hoped.

Probabilistic Graphical Models was taught by Stanford professor and Coursera co-founder Daphne Koller. Coursera hasn't yet listed a new iteration of it, but if the previous pattern holds up, it should be offered again next year. As I mentioned before, I took this course because it includes Bayesian and Markov models, both of which show up in many job ads for data scientists. I decided not to take the optional programming track, figuring that it probably wasn't a good idea to be writing programs for two different courses in a language I was just learning (both this course and Machine Learning use MATLAB and/or the very similar Octave).

Machine Learning was taught by Andrew Ng, also a Stanford professor and Coursera co-founder, and is one of Coursera's best-known and most popular courses. It's also been taught by the University of Washington's Pedro Domingos, but Ng's version will be offered again starting October 14th. I signed up for the course because machine learning is one of the basic skills of data science, but I also wanted the chance to learn one of the most commonly used statistical programming languages, MATLAB/Octave.

As I said in my last post, Daphne Koller is not the most charismatic lecturer, and her explanations can be confusing. What I didn't say last time is that I don't think Koller entirely understands the medium in which she's working. In the classroom, asking questions of the professor can make up for a confusing lecture; Koller seems to be giving the same lecture she would give in the classroom, but without the opportunity to stop her and ask questions about each topic before moving on to the next, that same lecture doesn't work very well.

While the lectures are less than ideal, the quizzes are particularly troubling: rather than presenting a simple test of the material covered in the lecture, the quizzes ask students to move beyond the lecture material, drawing out implications on their own. Asking students to do this is a great pedagogical technique, especially in a graduate-level class. However, it works a lot better when the students have discussed the material in class, giving them the opportunity to start down that path together, with the professor's guidance. None of this is possible in an online class. Even with discussion forums, the rules that bar students from giving one another answers prevent full exploration of the quiz topics. Part of the problem is that students can see the quiz questions before beginning their discussion, rather than receiving a quiz or homework assignment only after the classroom discussion is over.

It might help to begin each quiz with more straightforward questions, giving students a little practice, before moving on to the ones that require additional thinking. Far from adopting this model, Koller actually exacerbates the problem by adopting an unusually strict rule (by MOOC standards) for retaking quizzes: any attempt after the second is penalized. Because of this, I found myself taking quizzes I had no way to prepare for, because they introduced concepts for the first time, and I had no way to practice applying those concepts beforehand.

I want to stress here that I'm not simply some idiot who was in over my head. I'm trained in statistics, and I have experience using structural equation and time series models, both of which share similarities with probabilistic graphical models—and I was really interested in the course material. Koller acknowledges in her lectures that the course is challenging, and even seems to take pride in that fact. However, while the material is indeed challenging, the course is hard partly because it's badly taught. It's also possible that Koller is trying to cover too much material for the online format—the lack of classroom discussion not only makes individual topics more difficult, but increases the time required to cover each topic, since the teacher has to provide a much more detailed lecture, rather than relying on student questions to fill in holes.

While I didn't pursue the programming track, other reviewers have complained that they spent more time trying to figure out how to read in the data than they did conducting the analysis. Mind you, this is a problem that data scientists face in the real world, and so the criticism might not be completely fair.

Andrew Ng's Machine Learning is another beast altogether. Ng is in fact a charismatic, and very clear, lecturer; indeed, Koller uses a couple of his lectures in areas where the material in the two courses overlaps. Not only does Ng convey his topics clearly, but he stresses the practical aspects of the methods he's teaching, and provides useful tips about how to apply them in the real world. While Ng pulls students along at a pace much gentler than Koller's, he's still able to teach methods that, he insists, are advanced enough to be unfamiliar to many practicing data scientists. I should add that the automated system used to grade programming assignments works quite well. If I do have one criticism, it's that the programming assignments probably involve a little more copying and pasting than might be ideal for learning Octave, but then, copying and pasting isn't uncommon in real programming.

In short, this is a very good course, and I strongly recommend signing up for the session that starts October 14th. Now that things have calmed down a bit for me, I might even sign up for it myself.


  1. "I don't think Koller entirely understands the medium in which she's working." I could not agree more!

  2. Agree. Probabilistic graph model is a very interesting topic, I hope she can do it better next time
