I’m sitting at my desk trying to write some very general Markov Chain Monte Carlo (MCMC) code. I’m aware that there are libraries out there which could be used, but when I investigated them they were all so difficult to use that it just made sense to write my own code. At that time, I just wrote some code that was very specific to one problem. Now, I’m trying to generalize that code to be applicable to any problem. That is where things get hard.
I have written previously about the shortcomings of my formal coding education. This is something of a continuation of that post, in that the problem is also likely due to the way programming is taught. In my one and only formal course, I was taught how to write code to solve a single problem. On top of that, we were rarely encouraged to factor out portions of the code as functions. So, for years, my primary instinct when sitting down to write code has been to write code that solves a specific problem. This leads to re-inventing the wheel every time I need to solve a different problem, even if it is closely related to one I previously solved.
I have been getting better about this. I wrote the HARPPI library so that now I don’t have to re-write the same functions with minor changes for each piece of code that uses a parameter file. By moving to code that uses parameter files I have been able to use the same code for a multitude of closely related problems without even having to recompile. I’ve started creating header files and associated implementation files with functions that I frequently need, so that I don’t have to write the code over and over again, or even copy and paste it. This has made my programming more efficient, yet sometimes writing very general code can be quite difficult.
This brings us to my MCMC code. As I mentioned at the top, there are libraries out there which are quite general. I could learn to use one of those, but you still have to define your model and your posterior distribution, which is the vast majority of the work given that the Metropolis-Hastings algorithm takes very few lines to code. Probably 99.9% of the time, I’m going to be using a Gaussian likelihood, which allows me to make my code a little less general than the available libraries. (I would have said I will always be using a Gaussian likelihood, but there may come a time when I need something else, just not in the foreseeable future.) In the end, the only thing that I want to be able to change is the model that I am fitting to some data with this procedure.
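To make the "very few lines" point concrete, here is a minimal sketch of a random-walk Metropolis sampler (the symmetric-proposal special case of Metropolis-Hastings) with a Gaussian likelihood, where the model is the only thing passed in from outside. All of the names (Model, Data, logLikelihood, runChain) are made up for illustration, and the flat priors and independent Gaussian proposals are simplifying assumptions rather than a description of the code discussed in this post.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Hypothetical names throughout; a sketch, not a definitive implementation.
// A model maps (x, parameters) to a predicted value.
using Model = std::function<double(double, const std::vector<double>&)>;

struct Data {
    std::vector<double> x, y, sigma;  // measurements and their 1-sigma uncertainties
};

// Gaussian log-likelihood up to an additive constant: -chi^2 / 2.
double logLikelihood(const Model& model, const Data& d,
                     const std::vector<double>& theta) {
    double chisq = 0.0;
    for (std::size_t i = 0; i < d.x.size(); ++i) {
        const double r = (d.y[i] - model(d.x[i], theta)) / d.sigma[i];
        chisq += r * r;
    }
    return -0.5 * chisq;
}

// Random-walk Metropolis. Assumes flat priors and an independent Gaussian
// proposal of width step[j] for each parameter.
std::vector<std::vector<double>> runChain(const Model& model, const Data& d,
                                          std::vector<double> theta,
                                          const std::vector<double>& step,
                                          std::size_t nSteps, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    std::vector<std::vector<double>> chain;
    chain.reserve(nSteps);
    double logL = logLikelihood(model, d, theta);

    for (std::size_t n = 0; n < nSteps; ++n) {
        std::vector<double> trial = theta;
        for (std::size_t j = 0; j < trial.size(); ++j)
            trial[j] += step[j] * gauss(gen);

        const double logLTrial = logLikelihood(model, d, trial);
        // Accept with probability min(1, L_trial / L_current).
        if (std::log(unif(gen)) < logLTrial - logL) {
            theta = trial;
            logL = logLTrial;
        }
        chain.push_back(theta);  // a rejected step repeats the current point
    }
    return chain;
}
```

With this layout, fitting a different model only means passing a different callable, for example `[](double x, const std::vector<double>& p) { return p[0] * x + p[1]; }` for a straight line.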
I am writing a function that calculates how far a trial value of a parameter should be allowed to stray from its current value so that the acceptance rate is near the “optimal” 0.234 (many people seem to say this is the optimal MCMC acceptance rate, though there are those with other opinions, but that’s a topic for another post). Trying to make that function general is a bit of a pain and is requiring more mental energy than I can seem to muster at present (hence the writing of this post). I feel that one thing introductory programming courses should stress is that you should strive to keep your code as general as possible.
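One possible shape for such a tuning function, offered only as a sketch: keep it ignorant of the model entirely by having it take a callback that runs a short batch of proposals at a given width and reports the fraction accepted. The names tuneStepSize and trialRun are hypothetical, and the clamped multiplicative update is a simple heuristic chosen for illustration, not an established routine from any library.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>

// Hypothetical helper. trialRun(scale) is assumed to run a short batch of
// proposals with the given proposal width and return the fraction accepted.
double tuneStepSize(const std::function<double(double)>& trialRun,
                    double scale, double target = 0.234,
                    double tolerance = 0.01, int maxRounds = 100) {
    for (int round = 0; round < maxRounds; ++round) {
        const double rate = trialRun(scale);
        if (std::fabs(rate - target) <= tolerance) break;

        // Accepting too often suggests the steps are too small, so widen them;
        // accepting too rarely suggests they are too large, so narrow them.
        // The factor is clamped to [0.5, 2] to avoid wild jumps.
        const double factor = std::min(2.0, std::max(0.5, rate / target));
        scale *= factor;
    }
    return scale;  // best effort if maxRounds runs out first
}
```

Because a measured acceptance rate is noisy, the tolerance may never be met exactly in practice, which is why the loop is capped by maxRounds. Nothing in the function refers to a particular model or likelihood, so the same tuner could be reused across problems.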
I realize from my teaching experience that students like concrete things, and writing very general code can take a lot of abstract thinking. However, I think the goal can be accomplished in a way that also helps structure the course and makes the students very good programmers, which should be the goal of an introductory course. Essentially, an introductory programming course could be taught so that each program a student writes builds on the last, or at least re-uses some of its code, and so that students learn to put useful functions into separate pieces of code that are easy to pull into their next program. That way we could teach students from the beginning to think in abstract ways that make their code more general.
For example, you could have students fairly early on write a program that simply reads a list of names and grades and stores them in a map as key-value pairs. You could then later have them write a program which calculates the average grade in the class. Next, have them extend their code to store grades for various assignments for each student, then compute the average of each student’s grades to calculate their final grade. Each of these programs could build on the last. The students could be creating a header and implementation file, such as gradebook.h and gradebook.cpp, with functions for calculating averages and medians, converting numeric grades to letter grades, et cetera. Aside from the conversion from numeric to letter grades, these functions, if kept general enough, could be used in other programs as well. You could teach your students to recognize that, and maybe push them towards creating a stats.h and stats.cpp that hold those functions along with some others that they might need, like standard deviation and χ², for example. You could have them calculate the standard deviation, and see how well the final grades, or even grades from specific assignments, fit a Normal distribution.
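As a rough sketch of what such a stats.h might contain (the function names and the choice of std::vector<double> are assumptions made for illustration), note that nothing below knows anything about grades:

```cpp
// stats.h (hypothetical): general helpers with no knowledge of grades.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

inline double mean(const std::vector<double>& v) {
    if (v.empty()) throw std::invalid_argument("mean: empty input");
    double sum = 0.0;
    for (double x : v) sum += x;
    return sum / static_cast<double>(v.size());
}

// Takes its argument by value so the caller's data is not reordered.
inline double median(std::vector<double> v) {
    if (v.empty()) throw std::invalid_argument("median: empty input");
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    return (v.size() % 2 == 1) ? v[mid] : 0.5 * (v[mid - 1] + v[mid]);
}

// Sample standard deviation (divides by N - 1).
inline double standardDeviation(const std::vector<double>& v) {
    if (v.size() < 2) throw std::invalid_argument("standardDeviation: need N >= 2");
    const double mu = mean(v);
    double sumSq = 0.0;
    for (double x : v) sumSq += (x - mu) * (x - mu);
    return std::sqrt(sumSq / static_cast<double>(v.size() - 1));
}

// Pearson chi-squared between observed and expected bin counts.
// Expected counts are assumed to be nonzero.
inline double chiSquared(const std::vector<double>& observed,
                         const std::vector<double>& expected) {
    if (observed.size() != expected.size())
        throw std::invalid_argument("chiSquared: size mismatch");
    double chisq = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double diff = observed[i] - expected[i];
        chisq += diff * diff / expected[i];
    }
    return chisq;
}
```

The grade-specific pieces, such as the conversion to letter grades, would stay in gradebook.cpp and simply call these functions, which is what lets the same header drop straight into an unrelated project.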
Later on, you could have them write another program that also needs to calculate averages, medians, standard deviations and χ², so they can see that by keeping the code that does those things general, they just have to add it to their new project and get going. If done well, this could show students how valuable it is to recognize when code can be kept general, to see when some problem they are trying to solve may be something they need to solve over and over again, and how they can write their code to be general from the start so that later on they save a lot of time.
I wish that my introductory C++ course had taught me those lessons instead of me having to learn them on my own much further down the road. Then, maybe I would have thought about how to keep my original MCMC code very general and I wouldn’t be stuck doing it now. Maybe I would have been in the habit of writing code as generally as possible, and keeping the main function of my code quite high level with only a few function calls. Maybe it wouldn’t be so hard to do all of these things now if I had been doing them for a great number of years.
Of course, there is always a chance that others have had very different experiences. Were you taught to code like I described above? Do you think that introductory courses should be taught like I described, or should these concepts be taught in higher level programming classes? Let me know in the comments below!