WIGI, an Inspire Grantee

WIGI, the Wikipedia Gender Index, my project that looks at gender representation in Wikipedia biography articles, has won an Inspire Grant.

Over the last six months, fellow Wikipedians and I prototyped this research and extended it into a paper, "Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the 'WIGI' Index". One aspect of the biography gender gap we were not able to observe, however, was the trend over time in female and nonbinary biographies. We were only ever looking at a single point in time, because it is too computationally complex to compare all the histories of the Wikipedias together at once.

Asking Ever Bigger Questions With Wikidata

This is a guest blog post I wrote for Wikimedia Deutschland, copied here:

German summary (translated): Maximilian Klein uses Wikidata as a data trove for statistical analyses of the world's knowledge. In his article he describes how he searches Wikidata for answers to the big questions.

Guest post by Maximilian Klein

A New Era

Simultaneous discovery can sometimes be considered an indication of a paradigm shift in knowledge, and last month Magnus Manske and I seem to have had a very similar idea at the same time. We both wanted to look at gender statistics in Wikidata and to slice them up by date of birth, citizenship, and language.

Which Index Is WIGI Most Closely Related To?

In my latest paper, "Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the 'WIGI' Index", my co-author Piotr Konieczny and I proposed a gender index. WIGI, the Wikipedia Gender Inequality Index, is composed of many indicators, but one in particular, the "nation-WIGI", was designed to be comparable with other well-known indices. The nation-WIGI ranks each nation by the proportion of female biographies among the biography articles about citizens of that nation. Designed in this way, it can be correlated with other indices. And potentially, we thought, given enough indices and high enough correlations, we could get a sense of what WIGI is measuring in terms of those other indices.
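To make the idea concrete, here is a minimal sketch of a nation-level female share and its rank correlation with another index. It is not the paper's actual pipeline: the toy biographies, the nation labels A, B and C, and the `other_index` scores are all made up for illustration, and the rank correlation is computed as a Pearson correlation of ranks (that is, Spearman's method).

```python
import pandas as pd

# Hypothetical toy data: one row per biography, with citizenship and gender.
bios = pd.DataFrame({
    "citizenship": ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C"],
    "gender": ["female", "male", "male", "male", "female", "male",
               "female", "female", "female", "male"],
})

# Share of female biographies among all biographies for each nation.
female_share = (bios["gender"] == "female").groupby(bios["citizenship"]).mean()

# Rank nations so that the highest female share gets rank 1.
nation_rank = female_share.rank(ascending=False)

# A made-up external index keyed by the same nation labels (an assumption).
other_index = pd.Series({"A": 0.55, "B": 0.70, "C": 0.90})

# Spearman rank correlation, computed as the Pearson correlation of the ranks.
rank_corr = female_share.rank().corr(other_index.rank())

print(female_share)
print(nation_rank)
print(rank_corr)
```

With real data, the same correlation step could be repeated against each external index to see which one tracks the nation-level share most closely.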

Joining many DataFrames at once in Pandas: “n-ary Join”

Joining many DataFrames at once with Reduce

In my last project I wanted to compare many different Gender Inequality Indexes at once, including the one I had just come up with, called “WIGI”. The problem was that the rank and score data for each index was in a separate DataFrame, so I needed to perform repeated SQL-style joins. In this case I actually only had to join 5 DataFrames, for 5 indices. But later, while I was helping my partner with her research, she came across the same problem and needed to join more than 100. In my mind I saw that what we wanted to accomplish was an n-ary join.
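One way to sketch such an n-ary join is to fold a list of DataFrames with `functools.reduce`, merging them pairwise from left to right. The helper name `nary_join`, the `country` key column, and the toy scores below are assumptions for illustration, not the data or code from the project.

```python
from functools import reduce

import pandas as pd


def nary_join(frames, on, how="outer"):
    """Merge a list of DataFrames pairwise, left to right, on a shared key."""
    return reduce(lambda left, right: pd.merge(left, right, on=on, how=how), frames)


# Hypothetical toy data: each index lives in its own DataFrame.
wigi = pd.DataFrame({"country": ["A", "B", "C"], "wigi_score": [0.20, 0.15, 0.30]})
index_x = pd.DataFrame({"country": ["A", "B"], "x_score": [0.70, 0.60]})
index_y = pd.DataFrame({"country": ["B", "C"], "y_score": [0.50, 0.40]})

# An outer join keeps every country; scores missing from an index become NaN.
joined = nary_join([wigi, index_x, index_y], on="country")
print(joined)
```

The same call works unchanged for a list of 5 or 100 DataFrames, as long as each one carries the key column and the score columns have distinct names. Passing `how="inner"` would instead keep only the countries present in every index.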

Preliminary Results From WIGI, The Wikipedia Gender Inequality Index

This is a preliminary list of results from a research project that is being compiled into a full paper on the subject.

The full paper, in its academic form, is now available on arXiv.


WIGI is the Wikipedia Gender Inequality Index, a project whose purpose is to gain insight into the gender gap by understanding which humans are represented in Wikipedia. Professor Piotr Konieczny and I thought that, whereas some gender gap research focuses on the editors of Wikipedia directly, we would view the content and metadata of articles as a proxy measure for those editing.

Personal Statement: In Full

The “Personal Statement” for the graduate school application is an attempt to explain how you will make a difference, not just in your research, but in making the University, as an organism, more equitable.

Luckily, my proposed research is precisely about the kinds of social division that the instructions for this document ask you to address:

Please describe how your personal background and experiences inform your decision to pursue a graduate degree.
In this section, you may also include any relevant information on how you have overcome barriers to access higher education, evidence of how you have come to understand the barriers faced by others, evidence of your academic service to advance equitable access to higher education for women, racial minorities, and individuals from other groups that have been historically underrepresented in higher education, evidence of your research focusing on underserved populations or related issues of inequality, or evidence of your leadership among such groups.

I interviewed with this past week through the fact that they are wanting to be more ‘Wiki’, and I am looking for a way to fund my Wiki-based research. After a few video chats, it didn’t quite work out between us, as they are not set up for housing pure research just yet. But some quizzical results around user trust came out of the interview process.

I did not want to take their standard programming test because, on good advice, you never should. Instead I suggested that, as a work trial, I run the collaborativeness measures I developed this year (accepted to CSCW '15) on their data and report the results.

Should I Do My PhD In The Open?

“Whenever a work’s structure is intentionally one of its own themes, another of its themes is art.” ~Annie Dillard

My axe became stuck attempting to split this wood by myself.


It was a warm afternoon in Paradies – the park in Jena, Thuringia – exuberant children were pretending to be snakes and crocodiles, and I was attempting to understand what I wanted to pretend to be. My current thought, as I explained to my mentor Daniel Mietchen, was that a PhD in the computer science / information science realm with a focus on Free Culture was a path forward. Neither persuaded nor unconvinced, he Socratically proposed, like the Free Open Culture advocate that he is, to open the problem up.