I used to have this weird idea: that once I finished my PhD I would have all the technical skills I needed for my career, and the rest would just be a matter of doing the research.

I don’t think I could have been more wrong.  I need to learn the following:

  • Data Mining – finding those nuggets of information in a large data set that can be used to predict student behavior and therefore design interventions for them.
  • Predictive Modeling and Analytics – seems like my job is going to be all about statistical crystal balls.  While this is often considered to include data mining, the rough distinction is that data mining tells us what is going on, while predictive analytics suggests how to use that information.
  • Markov Chain modeling – suggested for predicting enrollment and student behavior more accurately
  • R – extremely comprehensive and growing statistical programming language.  Harder than SPSS, but far more complete as well.
  • Bayesian analysis – A different statistical approach from what is called the frequentist approach of traditional hypothesis testing.

But here’s my question:  How do I go about learning all those things?  None of them were covered in my program.  (I’m pretty sure that I would have needed a PhD in statistics to see these things covered.)  I’ve taken a short program on R that helped a little, but it really is a programming language rather than a menu-driven tool, and it is going to take some work before it feels as easy to use as SPSS; work I often don’t have time for when I just need to get something done.  The other areas (the first 3 specifically) are all things I need for my job and have to learn.

Now, obviously, there are books.  Lots and lots of books.  But how do I figure out not just which ones are the most accurate but also which are the easiest for self-teaching?  I know what some of the best are (such as Gelman’s book on Bayesian Data Analysis) but don’t have that warm fuzzy feeling that says I know enough to learn it that way.

MIT Open Courseware has a Data Mining course from 2003, but it’s only partial and (more importantly) doesn’t include solutions for tests or homework.  How can I tell if I understood it?  And what has changed in the field in the last 8 years that might impact my work?

Other options include industry-oriented training programs.  There is one 2-day program on predictive analytics, but it isn’t offered until October in SF.  It’s $1400, and I may talk to the folks at work about paying for it, but when you add travel it starts to get pretty expensive.  Another is in August in Minneapolis for $1500 and is really only part of their full week (which would actually be $3500 for the whole thing).

And then there are academic programs.  I found a Master’s (Northwestern) and two certificate programs (Stanford at $3900 per course and UCSD at a more modest $695 per course).  But is that falling back on what I know?  After all, it’s safe to say that anyone who gets a PhD is “good at school”.  Regardless, the Master’s repeats too much of what I already know, so that’s out.

Another issue is that some of these technical courses have prereqs of calculus and matrix algebra.  Now, I took business calculus as an undergrad, but I would be an idiot to think I remember any of it.  I’ve seen matrix algebra a couple of times, but don’t truly get it.  More importantly, these are things needed to understand the theory behind the techniques, not to perform the techniques.  The academic in me thinks they are important, but the professional who needs to show results is less convinced.

And here’s the real rub of the whole thing.  I feel as though a PhD should have prepared me to be able to learn anything, but in reality I only feel that confidence in relation to non-technical subjects.  I taught myself psychology in a summer, and while there is still more to learn I feel relatively conversant in it.

These types of highly technical topics, however, strike me as much harder to teach oneself.  Teaching myself from a book seems to require a better grounding in the prereqs than I really have.

And then there is the ego factor.  I feel a bit awkward asking my boss to pay to send me to a 2-day course in predictive analytics when I’ve already been dabbling in it and have all this statistics background.  It’s like I shouldn’t need them to do that.

So I’ll pose this to those of you who read this far.  How would you go about learning a new highly technical skill/area?