Should I Do My PhD In The Open?

By max, Thu 06 November 2014, in category Essays

"Whenever a work's structure is intentionally one of its own themes, another of its themes is art." ~Annie Dillard


It was a warm afternoon in Paradies - the park in Jena, Thuringia. Exuberant children were pretending to be snakes and crocodiles, and I was attempting to understand what I wanted to pretend to be. My current thought, as I explained to my mentor Daniel Mietchen, was that a PhD in the computer science / information science realm with a focus on Free Culture was a path forwards. Neither persuaded nor unconvinced, he socratically proposed, like the Free Open Culture advocate that he is, to open the problem up. That is, he suggested that I should do an "Open PhD". Its first component, he said, should be a blog post entitled "Should I Do My PhD in the Open?", which would serve as a basic argument that I could come back to in the eventuality of a PhD-induced depression.

Let's start with the fundamental question of motivation. Since I let this post sit in draft-phase too long, and started my applications before finishing it, I can no longer provide application-untainted answers. Let us use that fact to design a new avenue of inquiry. To answer the questions of "what?" and "why?" we can analyse my responses to the basic components of a typical PhD application: Statement of Purpose, Personal History, CV, and Letters of Recommendation. To answer the question of "how?", I will take the advice of Annie Dillard's quote, and use an open git repository both to share my applications and to express my opinions about them.

Research Agenda - What I Aim To Do

I am pleased with the research proposal that I've laid out in my Statement of Purpose. It's sufficiently grand and challenging, and the start of the path lies right where my past has delivered me. I lay out:

"[...] a research agenda to classify the already-known social biases as they appear in the socio-technical fora, and then to search for unidentified phenomena using those classifications. As an explanatory example, create a statistical model of how the known skewed distributions of gender, race, and nationality exist in Wikidata, and then inspect all the property distributions for properties that match the biased patterns. The project grows more complex by allowing property-pairs (e.g. gender by race), different socio-technical communities (e.g. Freebase, OpenStreetMap), and different models of bias (e.g. editorship-measures). If successful we may find overlooked stigmas and phobias in society, all the while building a massive Open Dataset of comparative indexes of known social biases."

What I particularly like about this proposal is that it drives at the limits of our knowledge - the term I use for it is "Rumsfeldian Unknown Unknowns" (not that I ever had an opinion about his politics, but it's a stand-up phrase). The problem excites me because it's in the vein of "what is reality?" but lays out a plan to answer that more scientifically than philosophically. It's also got enough holes, in that I need to learn more about machine learning, and enough strong points, in that I've already been parsing and analysing these current data sets. The fact that the question touches and expands on the currently hotly discussed gender gap is an added bonus for application-appeal, but is only coincidental, because I really had been thinking about the bias problem in abstract form before I knew about the gender gap.

Personal Ambition - Why I Aim To Do

The personal statement - which at the time of this writing remains a few scribbled notes - should, when finished, attest to two strands: mind expansion, and a battle with self-discipline.

I wept the first time I witnessed in clear terms my internalized racism. Well, actually I ran away from a protest when I became cognizant that I was listening to the black and female speakers less than the others, and then later I cried. In the underground computer lab, pouring my thoughts into a text editor, I clarified that my thoughts were a product of unquestioned norms, but verifiably wrong nonetheless.

The other strand is about materializing an internal cattle-prod to make my productivity commensurate with whatever natural sharpness I have and however much middle-class privilege it might have derived from. It took some time to start trusting the feeling that I was as dependent on a master as a dog, and to stop trying to please schoolteachers (even if they were actually my bosses). But now I've quit my job to live frugally on open source contracts, and research for autodidactic pleasure.

Unless you take extreme measures to avoid being involved with any institutions, you'll always need a CV. In creating mine so far I seem to over-mention "Wikipedia." It's not a terrible thing, but a reminder to focus the idea on 'socio-technical' systems in general and not on any one of them in particular. This is as true for a reframing of my thoughts as it is for a single document.

In seeing what I'm focusing on in my personal history, I have to remember the importance of using my privilege correctly. Typically the notion of a personal history is to talk about the overcoming of disadvantages. Being middle-income, male, white, tall, relatively straight, and a native-English speaker means prejudice is lower in my life. The real disadvantage that comes with being so advantaged is how difficult it is to admit or check your privilege. It's hard to champion change in the world when you'd be the one who benefits least from it - but that is the only interesting course of history from here. This is why what I want to do is hold a statistical mirror to the internet, and maybe that will awaken a few disbelievers. (Get 'em where it hurts - right in the logic, I know their weakness.)

Lesson Needed - Who I Aim To Impress

With my applications I am clearly trying to impress someone or something, and a more pointed question is: what do I want from them? Most of what I think a program would give me is more education, more professional experience and career preparation, and the pressure-shelter to focus on answering my research agenda. Ultimately, then, I want things out of myself.

On the education point, there are some clear fields of study to improve. Studying social science is to put a finer-tipped question onto the fumbly one I am asking now about bias. Studying machine learning and statistics is aimed at automating the pattern recognition tasks that I've proposed in my research agenda. Given my mathematics background, I anticipate little trouble in learning these things. In fact I secretly look forward to the endorphin rush of expanding my repertoire. I only partly fear my personality rejecting the classroom setting and its hierarchy.

In terms of career preparedness, the main thing I am trying to avoid is having a boss, or more accurately, needing a boss. That is not supposed to mean that I would like to learn the entrepreneurial art of "being your own boss," since you would still have one at that point; it would just be you. If I could meet, be advised, teach, read, and work all in the direction of pursuing a goal that I've freely made for myself by myself, then I would have met the goal that I'm making for myself right now.

To use the purpose of a doctoral study program to answer a long-standing personal question would be a new height of self-fulfilment. It's only a useful coincidence that other people will certainly benefit from the results along the way. However, if I were doing it for the side-effect then I may have taken the wrong turn. But how will I know if I've lost my way? This is where the point of openness reigns.

The Point of Openness - How I Aim To Do

When I see the title "Should I Do My PhD In The Open?" I already see a misplaced emphasis. The question rhetorically centres around openness - that is, whether to apply the principles and ethos of Open and Libre culture to my academic pursuits. That aspect, however, is the most trivial. "Yes" is the answer.

I want to expand the idea of Open Notebook Science one step further. Imagine a whole PhD as an Open Notebook project. Even its pre-formulations should go online 'as it happens'. Today, the emotions surrounding, and the meta-questions of, my PhD are online before I've even applied.

The point of Linus' law is validity through mass scrutiny. The point of mass scrutiny is the topic of my thesis. And the topic of my thesis is the point of the important questions in my life.

Please follow me as I follow this line of logic for the next few years.