German summary (translated): Maximilian Klein uses Wikidata as a data trove for statistical analyses of the world's knowledge. In his article he describes how he searches Wikidata for answers to the big questions.
Simultaneous discovery can sometimes be considered an indication of a paradigm shift in knowledge, and last month Magnus Manske and I seem to have had a very similar idea at the same time. Our ideas were to look at gender statistics in Wikidata and to slice them up by date of birth, citizenship, and language.… Read the rest
In my latest paper “Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the ‘WIGI’ Index” (blog post and on arxiv.org), my co-author Piotr Konieczny and I proposed a gender index. WIGI, the Wikipedia Gender Inequality Index, is composed of many indicators, but one in particular, the “nation-WIGI”, was designed to be comparable with other well-known indices. The nation-WIGI ranks each nation by the proportion of biography articles about its citizens that are about women. Designed in this way, it can be correlated with other indices. And potentially, we thought, given enough indices with high enough correlations, we could get a sense of what WIGI is measuring in terms of those other indices.… Read the rest
In my last project I wanted to compare many different Gender Inequality Indexes at once, including the one I had just come up with, called “WIGI”. The problem was that the rank and score data for each index were in a separate DataFrame, so I needed to perform repeated SQL-style joins. In this case I only had to join 5 DataFrames, for 5 indices. But later, while I was helping my partner with her research, she came across the same problem and needed to join more than 100. In my mind I saw that what we wanted to accomplish was an n-ary join.
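One way to sketch such an n-ary join is to fold a list of DataFrames with `functools.reduce` and `pandas.merge`. This is a minimal illustration, not necessarily the approach the full post takes: the `join_all` helper, the `country` key column, and the three toy index tables are all made up for the example.

```python
from functools import reduce

import pandas as pd


def join_all(frames, on="country", how="outer"):
    """Merge a list of DataFrames pairwise on a shared key column.

    `join_all` is a hypothetical helper name; `on` and `how` are passed
    straight through to pandas.merge.
    """
    return reduce(
        lambda left, right: pd.merge(left, right, on=on, how=how),
        frames,
    )


# Hypothetical index tables; the country codes and scores are invented.
wigi = pd.DataFrame({"country": ["A", "B", "C"], "wigi_score": [0.20, 0.15, 0.18]})
gdi = pd.DataFrame({"country": ["A", "B"], "gdi_score": [0.95, 0.90]})
gei = pd.DataFrame({"country": ["B", "C"], "gei_score": [0.7, 0.6]})

# One row per country, one score column per index; missing scores become NaN.
combined = join_all([wigi, gdi, gei])
print(combined)
```

The same call works unchanged for 5 or for 100 DataFrames, as long as each one carries the key column and the other column names do not collide.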
WIGI is the Wikipedia Gender Inequality Index, a project whose purpose is to attempt to gain insight into the gender gap through understanding which humans are represented in Wikipedia. Professor Piotr Konieczny and I thought that, whereas some gender gap research focuses on the editors of Wikipedia directly, we would view the content and metadata of articles as a proxy measure for those editing.… Read the rest
The “Personal Statement” for the graduate school application is the attempt to explain how you will make a difference, not just in your research, but in making the University as an organism more equitable.
Luckily, my proposed research is precisely about the kinds of social division that the instructions to this document ask you to address:
Please describe how your personal background and experiences inform your decision to pursue a graduate degree.
In this section, you may also include any relevant information on how you have overcome barriers to access higher education, evidence of how you have come to understand the barriers faced by others, evidence of your academic service to advance equitable access to higher education for women, racial minorities, and individuals from other groups that have been historically underrepresented in higher education, evidence of your research focusing on underserved populations or related issues of inequality, or evidence of your leadership among such groups.
I interviewed with Genius.com this past week because they want to be more ‘Wiki’, and I am looking for a way to fund my Wiki-based research. After a few video chats, it didn’t quite work out between us, as they are not set up for housing pure research just yet. But some quizzical results around user trust came out of the interview process.
“Whenever a work’s structure is intentionally one of its own themes, another of its themes is art.” ~Annie Dillard
It was a warm afternoon in Paradies – the park in Jena, Thuringia – exuberant children were pretending to be snakes and crocodiles, and I was attempting to understand what I wanted to pretend to be. My current thought was that a PhD in the computer science / information science realm with a focus on Free Culture was a path forwards, as I explained to my mentor Daniel Mietchen. Neither persuaded nor unconvinced, he socratically proposed, like the Free Open Culture advocate that he is, to open the problem up.… Read the rest
In what could easily be a recurring annual trip, Matt Senate and I came to Berlin this week to participate in Open Knowledge Festival. We spoke at csv,conf, a fringe event in its first year, ostensibly about comma-separated values, but more so about unusual data hacking. On behalf of the WikiProject Open Access – Signalling OA-ness team, we generalized our experience in data-munging with Wikimedia projects for the new user. We were asked to make the talk more story-oriented than technical, and since we were in Germany, we decided to use that famous narrative of Häskell and Grepl.… Read the rest
Wiki-Class is a Python package that can determine the quality of a Wikipedia page using machine learning. It is the open-sourcing of the Random Forest algorithm used by SuggestBot. SuggestBot is an opt-in recommender for Wikipedia editors, offering pages that need work and that look like pages they’ve worked on before. Similarly, with this package, you get a function that accepts a string of wikitext and returns a Wikipedia Class (‘Stub’, ‘C-Class’, ‘Featured Article’, etc.). Wiki-Class is currently in alpha according to its packager and developer [@halfak](https://twitter.com/halfak), and although I had to make a few patches to get some examples to work, it’s ready to start classifying your wikitext.