The Universal Empathy Machine: Nonviolent Communication Explained with Mathematics and Computer Science

0. The Universal Empathy Machine

Empathy is not sympathy. What’s the difference? Think of the Universal Turing Machine. It is a machine that accepts a program and data, and runs that program on that data. In this way it can simulate all programs on all data. Let us think of a human as a program and human experience as data. Sympathy, then, is running your program on someone else’s data. Empathy is running their program on their data. As you can see, the results of the sympathy and empathy computations are not guaranteed to be identical. In a nutshell, Nonviolent Communication is about becoming the Universal Empathy Machine, to be able to emulate the architecture of an arbitrary person given an arbitrary experience.
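
The distinction can be sketched in a few lines of Python. Everything below is a hypothetical illustration: the two "person programs" and the sample experience are invented for the example, and a real person is of course not a one-line function.

```python
from typing import Callable

# A "person" is modelled as a program: a function from an experience to a reaction.
Person = Callable[[str], str]

def me(experience: str) -> str:
    # Hypothetical program: I shrug off being interrupted.
    return "unbothered" if experience == "interrupted in a meeting" else "curious"

def them(experience: str) -> str:
    # Hypothetical program: they take interruptions hard.
    return "hurt" if experience == "interrupted in a meeting" else "curious"

def sympathy(my_program: Person, their_data: str) -> str:
    """Run MY program on THEIR data."""
    return my_program(their_data)

def empathy(their_program: Person, their_data: str) -> str:
    """Run THEIR program on THEIR data."""
    return their_program(their_data)

their_experience = "interrupted in a meeting"
sympathetic_result = sympathy(me, their_experience)
empathetic_result = empathy(them, their_experience)
```

Same data, different program, different result. The hard part, which the sketch takes for granted, is getting hold of `them` in the first place.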



Cover of Nonviolent Communication, replete with sunflower

Nonviolent Communication (abbrv’d NVC), is a theory by Marshall Rosenberg and the title of a book which has an unfortunate cover. Dressed up in a sunflower, you would associate it with self-help pseudoscience and may not allow it to surprise you. I only popped it open because a) it could be pirated on The Pirate Bay, and b) it was the reciprocating recommendation to me after I had been proselytizing my then-favourite-read to a friend, and so I felt obliged. As you can see neither of those reasons should really have you running to the library.

Its insight-olives are sparse in its ciabatta. But however rare they are, those morsels were for me escape plans for decade-long arguments. I felt so resourceful having a theory of dealing with conflict, where I never had one before. The only problem was I couldn’t chat to my friends about it, let alone recommend it. It’s not intended for anyone that would use the terms logical, or reasonable to describe themselves. They’d be seeking different analogies, examples, and want it to be quite a bit shorter.

Well this is that version, your very short introduction to Nonviolent Communication, abridged and explained through mathematics and computer science analogies. I’ll translate it into the realm of motivations, axioms, communication protocols, and finally foundational flaws.

1. The Intention of Nonviolent Communication is Connection

Any good sceptic should immediately be asking the purpose-question. What we are interested in is the family of problems characterized by the set of disharmonies, disagreements, and arguments.

Now it must be noted that nonviolent communication admits to there being intractable arguments. Not every argument is solvable, and like the halting problem, there is no way of deciding whether an argument will run forever without just trying to solve it.

The main approach we use to arguments is finding connections. A connection is a relation aRb between persons a and b, not necessarily distinct, such that a and b are ready to resolve the disagreement. (What is it when one person is not ready to resolve? We address that later.)

Argument resolution often never starts because it does not aim to find connection. In some cases we are talking past each other, and we need to connect on what the main topics are. In other cases we are discussing the same topic but cannot connect on a mutually agreeable answer; here NVC says to connect on observations, feelings, and needs.

Protip: Notice connection is not always with “others”, because often we want to change abusive self-dialogue.

2. Axioms of Nonviolent communication

There are three strong axioms; sorry Wittgenstein fans.

2.1 Feelings Are Connected to Needs

We suppose a Connection map, which maps feelings to needs.

C \colon F \to N

Where F is the set of all feelings, and N is the set of all needs. Note that C is not necessarily injective but is surjective. The intuition here is that whatever feelings are observed can be mapped to a need – probably unmet. This need usually becomes our focus.
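
As a rough sketch of what "surjective but not necessarily injective" means here, consider a toy Connection map in Python. The feelings, needs, and pairings below are hypothetical stand-ins that I made up for illustration; the real F and N are far larger and the real C is not something anyone has written down.

```python
# Hypothetical toy sets standing in for F (feelings) and N (needs).
feelings = {"scared", "anxious", "hurt", "lonely"}
needs = {"security", "respect", "companionship"}

# C maps every feeling to exactly one need (a total function on F).
C = {
    "scared": "security",
    "anxious": "security",
    "hurt": "respect",
    "lonely": "companionship",
}

def is_surjective(mapping, codomain):
    """True when every need is the image of at least one feeling."""
    return set(mapping.values()) == set(codomain)

def is_injective(mapping):
    """True when no two feelings share the same need."""
    values = list(mapping.values())
    return len(values) == len(set(values))

surjective = is_surjective(C, needs)
injective = is_injective(C)
```

In this toy map both "scared" and "anxious" point at "security", so C is not injective, yet every need is reached by some feeling, so it is surjective.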

2.2 All Needs Matter

The set of needs and important needs are exactly equal.

\forall n \colon n \in \{ \text{needs} \} \iff n \in \{ \text{important needs} \}

Taking as an axiom that all needs are important needs allows participants to declare needs without fear. The Universal Empathy Machine is a system of how to accept needs as important which do not appear important, to you.

2.3 There Is Always a Choice

The empty set is not contained in the set of all choices.

S := \{ s \in \mathcal{P}(\text{choices}) \mid s \neq \emptyset \}

Let’s consider this our “Axiom of Choice” – we always have one. NVC asks us to accept a strong theory of free will. We have a nonempty, and possibly infinite, set of reactions for every interpersonal interaction.

3. Communication Protocols

In trying to connect we will have to in some way communicate with each other, let’s call this messaging. NVC says that it’s important to do this in a specific way. The messages that we pass between objects, probably humans, not necessarily distinct, are a 4-tuple containing:

messaging tuple := (observations, feelings, needs, requests)

This quadruple through the unveiling of each element, produces a flow from empiricism, to emotion, humanism, and finally to action.

Not every message needs to contain all four parts, for brevity often they can be omitted. When starting though it can be useful to be exhaustive for practice.
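
A minimal sketch of the messaging tuple as a Python data structure, assuming each part is free text and any part may be omitted. The class name `Message`, the helper `missing_parts`, and the sample wording are my own inventions for illustration, not anything NVC specifies.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

@dataclass(frozen=True)
class Message:
    """The NVC 4-tuple; every part is optional, as parts may be omitted for brevity."""
    observations: Optional[str] = None
    feelings: Optional[str] = None
    needs: Optional[str] = None
    requests: Optional[str] = None

    def missing_parts(self) -> List[str]:
        # List the omitted components, in tuple order.
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

# An exhaustive message, useful when practising (hypothetical wording).
full = Message(
    observations="We agreed to meet at 7:30, and I saw you arrive at 7:45.",
    feelings="anxious",
    needs="respect",
    requests="Please message me if you will be more than ten minutes late.",
)

# A brief message that omits two parts.
brief = Message(feelings="hurt", requests="Please tell me what you heard me say.")
```

Here `full.missing_parts()` is empty, while `brief.missing_parts()` names the observations and needs that were left out.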

3.1 Observations

"Observing without Evaluation" 
~ NVC Chapter Title 3

In the canonical specification of Nonviolent Communication, this chapter is literally titled “Observing without Evaluation”. Little do they know just how apt that is. In this messaging block we are transmitting our observations rather than our opinions. The assumption NVC operates with is that when viewing the world, we sense our observable universe and then evaluate it to return opinions. But it is not clear that those opinions are useful yet, so let’s not naturally default to eager evaluation. (Well you may not, depending on how much of a functional purist you are.) Taking a cue from the lazy evaluation model, we don’t have to return opinions until they are necessary. In fact, evaluation is not necessary until we send feelings.
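
For readers who have not met lazy evaluation, here is a small Python sketch of the idea. The `evaluate` function and the `LazyOpinion` class are hypothetical names of my own; the point is only that the opinion is not computed until somebody demands it.

```python
# Records every evaluation that actually runs, so we can see when it happens.
evaluations_run = []

def evaluate(observation: str) -> str:
    """Hypothetical evaluation step: turns an observation into an opinion."""
    evaluations_run.append(observation)
    return f"opinion about: {observation}"

class LazyOpinion:
    """Holds an observation and only evaluates it when the opinion is demanded."""

    def __init__(self, observation: str):
        self.observation = observation
        self._opinion = None

    def force(self) -> str:
        # Evaluate on first demand, then reuse the cached result.
        if self._opinion is None:
            self._opinion = evaluate(self.observation)
        return self._opinion

thought = LazyOpinion("You arrived at 7:45 on Monday.")
before = list(evaluations_run)  # nothing has been evaluated yet
opinion = thought.force()       # evaluation happens here
thought.force()                 # cached, so no second evaluation
```

Constructing `thought` transmits the observation without any opinion attached; only `force()` pays the cost of evaluating.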

If you are not a fan of fixed evaluation strategies, another way to think about the observation section is that it’s where we make our imports. Here we are providing the populated namespaces, libraries, and constants that we will reference in the rest of our communication. The context. Since sensory perception varies from human to human, we cannot rely on the exterior universe to be observed equally; thus we pass along our context. The point is that we do not want any of our further statements to be received ambiguously, so our definitions must be precise. We make only natural-philosophy-style remarks that evaluate to exactly true. “You’re always late,” has a truth value in the open interval (0,1) depending on which human is observing. “We agreed to meet at 7:30, and I saw you arrive on Monday at 7:45, and on Tuesday at 8:05” has a truth value of just 1. Now the rest of our program, or proof – whichever side of the isomorphism you prefer – can refer to lateness with no misgivings.

Exercises: Observation or Opinion?

1. “Your email signature is 41 lines long, rendering for me as over 4 screenfuls, whereas your last 5 messages to the list were each less than 41 lines long.”
2. “Dante often does not wash his dishes in the hackerspace.”
3. “Allesandra told me that I was not good at identifying contrapositives.”
4. “Our group facilitator controls the meetings.”


1. This is an observation, which is entirely verifiable.
2. This is an opinion because “often” is not defined.
3. This is an observation if Allesandra literally said so, but not if Allesandra was only referring to a specific time the speaker did not identify a contrapositive; in that case the speaker would be making an evaluation.
4. This is an opinion because “controls” is open to interpretation.

Protip: when you are having difficulty finding observations on which to base what you want to say, and your communication is a reply, it is OK, and even encouraged, to literally repeat what your partners have said: $ echo what-they-said. More about this in section 4, Receiving.

3.2 Feelings

There is the counter-intuitive Rosenberg law: "Expressing our vulnerabilities can help resolve conflict."

After having carefully preserved the pre-evaluation observation, it is finally time to also give the results of our evaluations – feelings. To explain what feelings are we explore a classic gotcha – pseudofeelings. Pseudofeelings unfortunately do pass duck-typing tests. The key difference is that, with respect to the feeler, feelings are internal, and pseudofeelings are external.

Some examples of pseudofeelings are “I feel unimportant”, “I feel misunderstood”, or “I feel ignored”. Re-expressed as feelings these would be, respectively: “I feel discouraged because I observed I was not part of [important decision]”, “I feel anxious because you doing [action] doesn’t reflect that you understood me”, and “I feel hurt, because I perceive I am being ignored”. These re-expressions take an external feeling and talk about which external event led to what you feel internally. This is important, because a statement about yourself can never be blaming, and it allows others to see your perspective on your observations.

Exercises: Feeling or Pseudofeeling?

1. “I feel scared when you talk about forking”.
2. “When you don’t cite me, I feel neglected.”
3. “I’m happy that you found time to come to Wikimania.”
4. “I feel disappointed by the fact that you did not publish your dataset, because I had to recreate it.”


1. Feeling. Scared describes the internal state of the feeler.
2. Pseudo-feeling. Neglect is a thought about the exterior world. Feeler is probably depressed about not having their work recognized.
3. Feeling. The user is happy, and said so.
4. Pseudo-feeling. Despite very clear reasoning, disappointment is not a feeling, but a pseudofeeling. User is probably feeling aggravated because of needless extra work.

3.3 Needs

"God gave us the universal needs, man created the rest" 

We have described the outside world, and stated how we feel about it, but we’ll require one more step to expose our Connection API. Needs are those sufficient and necessary conditions for your life. According to NVC all humans come preloaded with immutable natural needs which are factory defaults. Because needs are very low-level, primitive objects, it’s likely that the communicators will have some of these in common. And with common needs, connection can be found.

Needs are typically very basic, like: autonomy, celebration, creativity, appreciation, love, respect, play, peace, food, rest, sex. Communicating these needs may seem weak, irrational, and impossible to admit out loud, but the whole point is to open up. This point is opensourcing ourselves to the very lowest machine-level. There are two ways in which this Richard Stallman doctrine aids us with emotion. First, the open code of ourselves is a signal for our partners to work with us, and the mystery of how we work disappears. Secondly, when we disclose our code, all bugs are shallow. It’s scary that others will be delving into our innermost code, but like Heartbleed, it is the only route to long term security. Remember, to further our opening up we have Axiom 2.2, the safety mechanism that all needs are important.

Exercises: When Are Needs Being Expressed?

  1. “I feel angry when you talk about transhumanists that way, because I am wanting respect for my own destiny and I hear your words as an insult.”
  2. “I’m discouraged because I would have liked to have progressed further in my work by now.”
  3. “I feel disappointed because you assigned yourself to those bugs, but didn’t squash them.”
  4. “I’m sad that you won’t be meeting me at the vegan restaurant for dinner because I was hoping we could chat about anarchism together.”


  1. Wanting respect for way of life is a basic need, whatever it may be.
  2. This is close enough to a need. It’s implied that the need is for the speaker to feel fulfilment from progressing through work. This is actually an exercise verbatim out of the original NVC book.
  3. No need is being expressed here. Perhaps the speaker needs the mental comfort of having no outstanding issues, or needs the security that comes with trustworthy friends – we don’t know and it isn’t clear.
  4. Human contact is a need. Maybe they also need a tempeh gyro.

3.4 Requests

Max Ogden analogizes callbacks to the numbers given to you at restaurants that tell waitrons what to do with your food after it's been cooked. In our case, it's more like the waitron telling you that their job has many harsh realities, which makes them feel very oppressed, crushing their need for economic freedom, and finally telling you to help them smash the capitalist wage-labour sociopolitical complex.

At last we can try to alter the world with requests. Requests are callbacks we send to our communicating partner. They indicate what we’d like our partner to do, and when. Like asynchronous JavaScript, there are a lot of security issues. Co-communicators won’t want to run malware. That’s why the best request-callbacks are verified non-malicious by being the conclusion of an observation-feeling-needs-request syllogism.

Protip: By Axiom 2.3 we always have a choice, and so it is impossible to be in a situation where only the other party can break a stalemate.

Issuing precise requests clarifies what we want from our partner. If it feels difficult to articulate what we want from others, that’s typically because it’s not an action. NVC says that more specific actions make better requests; otherwise we’re issuing requests that the receiver can’t know whether they’ve fulfilled. If we shout at our colleague that a project is behind schedule, and we know they can’t speed it up – we’re not asking them to speed it up, but merely to acknowledge our anger. In this case the callback request might be “give receipt of my frustration”. Colloquially this would be known as venting. It’s nice to have no ambiguity about when the roles are just to listen, or to actually address a behavioural pattern.
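
The callback analogy can be made concrete with a small Python sketch. The function names and the reply text are hypothetical; what the sketch tries to show is that a request is a specific action handed to the receiver, and that the receiver, by Axiom 2.3, chooses whether to run it.

```python
from typing import Callable, Optional

def receive(request: Callable[[], str], willing: bool) -> Optional[str]:
    """The receiver, not the sender, decides whether the callback runs."""
    return request() if willing else None

def acknowledge_frustration() -> str:
    # Hypothetical concrete request: "give receipt of my frustration".
    return "I hear that you are frustrated about the schedule."

accepted = receive(acknowledge_frustration, willing=True)
declined = receive(acknowledge_frustration, willing=False)
```

Because the callback names one specific action, both parties can tell whether it was carried out: `accepted` holds the acknowledgement and `declined` holds nothing.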

Exercises: Request or No Request:

  1. “I want you to grok me.”
  2. “I’d like for you to indicate one moment in my presentation that you appreciated.”
  3. “I would like you to walk more slowly in the airport and tell me where you’re going before you walk off.”
  4. “I want you to be proud of your organizing work.”


  1. Not a request, because grok is not a specific action. It could however be illustrated by asking for the receiver to paraphrase the speaker. (See how to do this well in 4. Receiving)
  2. This request is asking for something concrete, empathetic reception. These kinds of requests are made to seem ridiculous in the modern era, but that is just the long term cultural effect of “guess culture”.
  3. A little bit exasperated, but quite clear actioning in the request. This isn’t not not one of my pet peeves.
  4. How is the speaker going to know when the receiver is being proud? “I want you to tell my friends about your organizing work,” is more direct if that’s what they’re looking for.

4. Receiving

NVC’s messaging protocol is two-way, and now that we’ve covered “expressing honestly” there’s still “receiving empathetically”. Receiving empathetically can be understood as the process of parsing unstructured conversation text into the formal grammar of NVC. The conversation text is what the other person is telling you and our target grammar is the observation-feelings-needs-request 4-tuple. We are not guaranteed to get all of the components, and not in any specific order. We’ve just got a really difficult parsing problem on our hands.

The reason that people are offended when they are asked, “Did you hear me? What did I say?” is because it’s actually difficult to paraphrase what may already be a hard message to hear. We will attempt to do better, to translate their communication into an NVC object. Once we can be sure we have their experience (data), and how they are dealing with that experience (program), we can become the Universal Empathy Machine (section 0), and be fully empathetic. When done right the empathetic effect will come alive. Firstly, and very clearly their need to be just-listened-to will be fulfilled. More subtly, our partners can fine-tune their thinking by seeing how closely our Empathy Machine mirrors their Identity function. That is, how well our reflection matches their intended expression. This lets them know if we are missing any of their points, or they haven’t stressed what they would like to.

Protip: Even if someone starts trying to brute force attack us with volume and vitriol, we can still receive empathetically. Intimidating messages are also people asking us to meet their needs. Try for instance, “It seems like you’re really angry about my deleting the private key; because you need more security about what’s happening in your life.” Likewise, if we are engaged with someone not ready to resolve, see if an application of empathy towards them helps.

Counter-intuitively this paraphrasing saves time, even though it takes time. A typical pitfall in a time-saving mentality of receiving is the bad habit of trying to short circuit the conversation by offering unsolicited advice to people. Offering unsolicited advice would be as if a parser (a) took input, (b) maybe did or didn’t parse the input, (c) did not verify the meaning of the maybe-parsed result, and then (d) returned advice based on exogenous heuristics. Returning that computation to the speaker would understandably be nonplussing if not absolutely frustrating as it is devoid of any indication that it related to what they said. Giving advice is only useful iff advice is what the speaker is asking for. By assuming they want a “fix-it” response, we are only engaging in the folly of mansplaining.

Receiving empathetically is to parse our partner’s messages and run them through our Universal Empathy Machine.
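
To make the parsing picture a little more tangible, here is a deliberately naive Python sketch. The marker phrases and function names are assumptions of mine, and real conversation is nothing like this tidy. What it does show is the difference from the advice-giving parser: the parse is handed back to the speaker for verification instead of being acted on.

```python
from typing import Dict

# Hypothetical marker phrases; real speech rarely labels its parts this neatly.
MARKERS = {
    "feelings": "i feel ",
    "needs": "i need ",
    "requests": "would you ",
}

def parse(utterance: str) -> Dict[str, str]:
    """Collect whichever NVC components appear, in any order."""
    parts: Dict[str, str] = {}
    for sentence in utterance.split("."):
        text = sentence.strip()
        for component, marker in MARKERS.items():
            if text.lower().startswith(marker):
                parts[component] = text[len(marker):]
    return parts

def paraphrase(parts: Dict[str, str]) -> str:
    """Reflect the parse back as a question so the speaker can verify it."""
    if not parts:
        return "I'm not sure I understood. Could you say more?"
    pieces = [f"{name}: {value}" for name, value in sorted(parts.items())]
    return "Did I hear you right? " + "; ".join(pieces) + "?"

heard = parse("I need more security. I feel angry about the deleted key.")
reply = paraphrase(heard)
```

Note that the components arrive out of order and one (the request) is missing entirely, which is the normal case.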

Exercises: Empathetic Reception (Y/N)?

  1. Person A: Counting error in Ultimate Street Fighter IV finals? How could I do something so stupid? Person B: Nobody’s perfect, don’t be too hard on yourself.
  2. Person A: You’re a delusional utopian.
    Person B: Are you feeling frustrated because you would like me to admit that there could be other ways of interpreting the Black Lives Matter movement?
  3. Person A: Oh I’m being SO BAD! I NEVER eat cupcakes! Person B: Maybe exercising more would help you.
  4. Person A: When friends of a friend of a friend join our camp without showing commitment, I feel encroached on. It’s like how fraccing companies squeeze me with anti-protest tactics. Person B: I know how you feel. I used to feel that way too.
  5. Person A: I’m unhappy with the grant’s status because you should have made more impact by now. Person B: I know you’re unhappy, but we’ve been slowed by bureaucratic process.


  1. B is giving advice to A, which is not an empathetic response. “You sound like you’re enraged by your lapse of concentration,” is more along the NVC lines.
  2. Empathetic response since B is trying to ascertain from A’s perspective why A might be lashing out.
  3. Again B is advising A, even though the tone is lighter. B might want to try and understand what feelings are behind A’s not being neutral about food.
  4. Not an empathetic response, a sympathetic response. Same data, but whose program is being applied?
  5. Trick empathy. Just saying you understand is not the same as demonstrating you understand. B left A’s comment about impact on the floor, which B could have used to empathise with.

5. Criticisms of NVC

How many different input methods can we use to write an email? Maybe with a physical keyboard, or a virtual one on a phone, different auto-complete schemes, speech-to-text, and maybe we’ve even had the pleasure of tapping one out T9 stylee. Even though we may aim to transcribe the same thoughts, based on the technique used, the final text will be different. If the text of our emails is altered, so too are the conversations. Now, different formats of email will benefit from specific ways of writing. We might be happy arranging dinner plans tapping on glass, but conforming to the standards of a formal letter begs for the old clickety clack.

As input methods change a conversation, so does NVC. Since it is very literally a theory of discourse, using NVC will necessarily bring with it prediscursive bias. The format of the discussion is not a variable that is discussed. But fair enough, any communication strategy would come attached with its own biases. The question then really becomes, since NVC is a prescribed conversation format, which speakers does it benefit?

NVC’s founding theorist was an American white man born in 1934 as the son of Russian Jews. What does that mean specifically for who NVC benefits in conversation? My reading turned up no mention of how Rosenberg’s personal background might affect his theorizing. In my opinion – I am myself a white man with similar citizenship and ancestry – it imports notions of classical logic, a Maslowian need hierarchy, and western rationality.

To expand, the technique has an orderly system to follow. This system is static and procedural, where it could be more goal-directed. The ontology presupposes the universality of basic needs. This could be interpreted as the hubris of someone currently with privilege assuming that others are like them. And lastly it does not make large mention of how it would fit in a multiplicity of different communication strategies, as a pluralist might.


NVC has a grand concept, which works at times and is undermined by its flaws at others. It was useful for me because it was the first conflict strategy I’d heard of, and it “made sense” to me. It turned out not to be a persuasion-hack, but it did teach me the concept of empathy. Understanding empathy for the first time was truly a dose of mind expansion. I’ve kept the format of NVC’s exercises at the end of each chapter, because as contrived as it seems, the questions are hard, empathy is not intuitive, and practice is vital. It’s really more praxis than theory. In fact, practising empathy has been the inroad to new ideologies for me like: feminism, anti-racism, LGBTQ-allyship, and other social movements for which I am not the affected demographic. I hope you, person who likes mathematical analogies, can glean something from studying it too.

Notconfusing rules for conversation: 2 rules and a jumpstart.

Meeting people can be a slog. “Hello, what’s your name?”, “Where are you from?”, “What do you do?”, “How do you yawn?”. Yawn? Sorry I was nodding off just writing about how repetitive and tiresome modern meeting and greeting can be. Owing to the way that social networks store information about us, we’re used to thinking about people in a list-of-attributes “forms” structure. Trans-inclusive feminism has already laid out how select-a-value gender is problematic for self-determination, and it has even subtler consequences in meeting people. We’ve come to assume the next person we meet is some combinatoric permutation of drop-down menus. How are we supposed to meet that person that is our life long friend, but at the moment just looks like one more INTJ or Virgo?

In fact the disillusionment from these gruelling social interactions is exactly the motivation for having friends, as a commiserating shelter. How do we let humans do the human thing and wow us with their outstanding creative expression of self from the moment we first meet? I submit notconfusing’s two rules for conversation.

  1. Ask questions that reflect choices people have or could make.
  2. Ask questions that have never been asked before.

Asking questions that reflect choices or decisions is a way to understand a person’s values and principles, which is more informative than part of their current happenstance. Even though this point is supposed to cause a deeper understanding, the questions need not be heavy. “When you’re sleeping on your favourite side, are you facing towards your alarm clock?” might tell you a bit about how much someone wants to combat their own habits without asking “how cognisant are you of your habits and how do you want to combat them?” The analysis of their choices can be done together out loud or both parties can be trusted to do so internally. In either case the point is to revel in the complexity of your partner, while gifting them a bit of Rogerian psychology.

Notice that just Rule 1 by itself could still allow for a “What are your hobbies?” variant, so Rule 2 is brought in to stem the tedium. At first it might seem impossible to ask an entirely unique question to every person, but – as I will prove – there really are an infinite number of these types of questions. Here are a few strategies.

The first strategy is analogous to an infinite game I learned called “Uses for…” where you try to come up with as many uses as you can for a specific item. The example I recall reading about is a bed sheet. So let’s play: It can be used as a tablecloth, as an escape rope for climbing out windows, as a substitute for an all-white painting, as a shooting target for short-sighted people, as a stencil for papier-mâché bed etc. etc. Try and come up with 5 more.

Now apply creative riffing to the things you notice about your partner. For instance these are the topics I brought up from the last ice-breaking conversations I’ve had: reminiscing over video rental returns (standing near a letterbox), a comparison of how different tapes will tear when you don’t have scissors (electrical taped wallet), how often I think about life from a bird’s-eye view (standing at different levels), and the history of the vulcanization of rubber (rode with a flat tyre). Going off-script and generating questions based on the partner and surroundings guarantees freshness. The way your associate engages gives you some understanding of their gestalt person-ness.

Even if you are feeling like you filled out pointless forms all day at work so that you are sapped of your free-associativity, there is always the abstraction “meta” trick. Assume that you have racked your brain, and “where are you from?” is the absolute best question you can come up with because you are only meeting people out of some hateful obligation. You can apply question-abstraction to ask them “what does a person’s answer to <absolute best question I can muster> mean about a person’s personality?” Yes, use your own staleness as a weapon. Since the result of the question-abstraction is also a question, it can be infinitely applied to itself to yield infinitely many unique questions. QED. (If you think this a sad proof, then I encourage you to really try it. I imagine you’ll become loopy enough by the hypnotic repetition of speaking that your co-discusser will either join in with you in your recursion – great fun – or they will have walked away, which is just as well.)
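
For the sceptics, the "proof" can be sketched as a Python generator. The names `abstract` and `questions` are mine, and the template is lightly adapted from the wording above. Each application wraps the previous question in more text, so every question is strictly longer than the last and no two can coincide.

```python
from itertools import islice
from typing import Iterator

def abstract(question: str) -> str:
    """Wrap a question in the 'meta' template from the text."""
    return f"What does a person's answer to '{question}' mean about a person's personality?"

def questions(seed: str) -> Iterator[str]:
    """Yield seed, abstract(seed), abstract(abstract(seed)), ... without end."""
    current = seed
    while True:
        yield current
        current = abstract(current)

# Take only the first four; the generator itself never runs out.
first_four = list(islice(questions("Where are you from?"), 4))
```

Whether anyone will still be listening by the fourth element is outside the scope of the sketch.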

A last technique, if you want to borrow a bit, is to use my growing list of ice-breakers.  I’ve created them as group introductions when I was facilitating Sudo Room hackerspace meetings. As they are targeted to a tech-y crowd you may need to customize a bit –  exactly the point that I’m trying to champion.

With the application of these 2 rules you begin to transgress social mores for great good. You ought to explode small talk to eschew complacency. Then you can make more and better friends. Although, ironically, making this kind of conversation may have the effect of pinning you as a weirdo. Yet disobey the laziness of phone alienation as Saul Williams does in Talk to Strangers: “… that square box don’t represent the sphere that we live in. The earth is not a flat screen, I aint trying to fit in.”

List of Yoga Quotations

Here is a list of Yoga quotations that I’ve compiled from my 200-hour yoga teacher training, other classes I’ve attended, and various yoga books.

Jon Isaacs

  • “The pose begins once you want to leave it”.
  • “Who went to their first yoga class because their life was going really well‽”
  • “I had a hedgefund guy fire me once because I was talking about greed in class.”

Sean Feit

  • On taking non-harming literally, “I take an antibiotic – genocide.”

Jean Mazzei

  • “You can have peace or mind, but not peace of mind because the mind’s purpose is to think.”

Cora Wen

  • “You probably think you have a knee, there’s nothing there. There’s no knee.”
  • “The knee is the prisoner of the hip and the ankle.”

Stacey Swan

  • On being a good teacher, “It’s not about putting your foot behind your head, but keeping it out of your mouth.”
  • “The american way is ‘no pain, no gain’, but yoga is ‘no pain, no pain'”
  • “A good yoga class should be like a Seinfeld episode,” (in that it should come full circle at the end).

Karen Macklin

  • “Vinyasa can also mean how you sequence your life.”

Adrianna Webster

  • “On an inhale, breathe out”.

Leslie Kaminoff – Yoga Anatomy

  • On the spine, “The full glory of nature’s ingenuity is apparent in the human spine…From an engineering perspective it is clear that we have the smallest base of support, the highest center of gravity, and the heaviest cranium (proportional to our body weight) of any other mammal.  As the only true bipeds on the planet, we are also earth’s least mechanically stable creatures.”
  • On breathing, “The energy expended in breathing produces a shape change that lowers the pressure in the chest cavity and permits the air to be pushed into the body by the weight of the planet’s atmosphere. In other words, you create the space and the universe fills it.”
  • Paraphrase on hand balances, “4/5th of the foot is dedicated to weight-bearing  and 1/5th is dedicated to dexterity. The hand (on the other hand) is 1/5th weight-bearing, 4/5th dexterous.”

Rudolf von Laban

  • “Each bodily movement is embedded in a chain of infinite happenings from which we distinguish only the immediate steps and, occasionally, those which immediately follow… In every trace form created by the body, both infinity and eternity are hidden.”

Joel Kramer – Yoga as Self-Transformation

  • “The essence of yoga is not attainment, but how awarely you work with your limits.”
  • “If you’re running from the feeling, it’s pain.” (Otherwise it’s just intensity.)
  • “Yesterday’s Level of Flexibility”. The (unhelpful) concept which I call YLF.

Desikachar –  The Heart of Yoga

  • Yoga defined, “attempting to do something you haven’t before.”


On headstand, “It’s like Wu-Tang says, you gotta ‘protect ya neck.'”

On stepping onto your mat, “Let’s go for a magic carpet ride.”

Travis Judd

  • “Make a conscious choice about what kind of practitioner you want to be right now.”


I can’t recall the provenance of these quotes sadly. Let me know if you can.

  • “The idea that we are ever not moving is an illusion.”
  • “asana is a process not a product otherwise we could say ‘not in a pose’ if head isn’t touching knee, but that is false.”
  • ‘Yoga’ has the root ‘Yuj’ which is the root for the English word ‘Yoke.’
  • Like humans, “water is transparent and reflective but don’t see those properties when in motion.”
  • “If you feel like you’re being inauthentic start telling the truth.”

WIGI, an Inspire Grantee

WIGI, the Wikipedia Gender Index, my project which looks at the gender representation in Wikipedia Biography articles, has won an Inspire Grant.

Over the last six months along with fellow Wikipedians we prototyped and extended this research into a paper “Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the ‘WIGI’ Index”. One aspect of the biography gender gap we were not able to observe however was the trend of female and nonbinary biography. We were only ever looking at a single point in time because it’s too computationally complex to compare all the histories of the Wikipedias together at once. Now, with $22,500 and a small team, our aim is to sample this data weekly thereby gathering some longitudinal data on the way that Wikipedians are representing biographies.

Our project’s form is to create a data portal which will display visualisations of the state of gender in biographies. The underlying data, which associates biography gender with Wikipedia language, date of birth/death, citizenship, profession, and celebrity status, will be purposefully published under an open license. We hope that other researchers can make use of this social indicator, much in the same way one can use the United Nations’ Gender Inequality Index.

The project will be managed entirely on github, and should be completed in about 6 months.

It promises to be,



Asking Ever Bigger Questions With Wikidata

This is a guest blog post I wrote for Wikimedia Deutschland, copied here:

Summary (translated from German): Maximilian Klein uses Wikidata as a data pool for statistical analyses of the world’s knowledge. In his article he describes how he searches Wikidata for answers to the big questions.

Asking Ever Bigger Questions with Wikidata

Guest post by Maximilian Klein

A New Era

Simultaneous discovery can sometimes be considered an indication of a paradigm shift in knowledge, and last month Magnus Manske and I seem to have had a very similar idea at the same time. Our ideas were to look at gender statistics in Wikidata and to slice them up by date of birth, citizenship, and language. (Magnus’ blog post, and my own.) At first it seems like quite an elementary and naïve analysis, especially 14 years into Wikipedia, but only within the last year has this type of research become feasible. Like a baby taking its first steps, Wikidata and its tools ecosystem are maturing. That challenges us to creatively use the data in front of us.

Describing 5 stages of Wikidata, Markus Krötzsch foresaw this analysis in his presentation at Wikimania 2014. The stages, which range from Know to Understand, are: Read, Browse, Query, Display, and Analyse (see image). Most likely you have read Wikidata, and perhaps have even browsed with Reasonator, queried with autolist, or displayed with histropedia. I care to focus on analyse – the most understand-y of the stages. In fact the example given for analyse was my first exploration of gender and language, where I analysed the ratio of female biographies by Wikipedia language: English and German are around 15%, and Japanese, Chinese and Korean are each closer to 25%.

To do biography analysis before Wikidata was much harder. To know the gender of an article you’d resort to natural language processing or hacks like counting gendered categories and guessing based on first name. Even more, the effort had to be duplicated for each language that had to be translated. Now the promise of language-free semantic data is here, along with tools like Wikidata Query and Wikidata Toolkit. The process is easier because it is more database-like: select, group by, apply, and combine.
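To make the database-like process concrete, here is a minimal sketch in plain Python. It is not the code behind the analysis; the records and the function name `female_share_by_language` are made up for illustration, and a real run would read from a Wikidata dump instead.

```python
from collections import defaultdict

# Made-up records standing in for Wikidata items; not real data.
records = [
    {"language": "enwiki", "gender": "female"},
    {"language": "enwiki", "gender": "male"},
    {"language": "enwiki", "gender": "male"},
    {"language": "enwiki", "gender": "male"},
    {"language": "jawiki", "gender": "female"},
    {"language": "jawiki", "gender": "male"},
    {"language": "jawiki", "gender": None},  # gender not recorded
]

def female_share_by_language(records):
    """Select items with a known gender, group them by language,
    apply a share calculation to each group, and combine into one dict."""
    groups = defaultdict(list)
    for record in records:
        if record["gender"] is not None:                         # select
            groups[record["language"]].append(record["gender"])  # group by
    return {
        language: sum(g == "female" for g in genders) / len(genders)  # apply
        for language, genders in groups.items()                  # combine
    }

shares = female_share_by_language(records)
```

With the toy records above, `shares` holds one female share per language, computed only over the items whose gender is known.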

With this new simplicity, let’s review what we have imagined so far. Here’s a non-exhaustive introduction to the state of creative question-asking:

Pushing Ourselves to Think Even Bigger

Can we think even bigger if we use more of the available data? Thinking about the fact that every claim may have an attached reference, Markus Krötzsch always wants to know: for a given set of claims, what references must be believed in order to believe the set of claims? With that notion we could look at all the claims associated with all the items of a given language, and thus the required belief system of that language. At this point we could ask: what are the differences in the belief systems of any two languages?

Another way we could test the fundamental principles of knowledge and culture is to consider the chains made by the subclass of, instance of, or cause of properties. Every language is present at different links of each chain. So we can look at the differences in ways in which languages organize a hierarchy of concepts – or if they think it’s a hierarchy at all.

Much fun for logicians and epistemologists. But we can also ask more socially important questions, questions about how language and society relate. What biases do we have that we aren’t even aware of? The method, for which I’ve proposed a PhD, could be conducted as follows. We’re aware of sexism in our societies, and as you’ve seen we’ve started to build a statistical profile of how it manifests in Wikidata. Likewise we’re cognizant of racism and homophobia. We might next look at the rates at which people appear in Wikidata by race and desire. Let’s assume we could train a model to say that these kinds of distributions are types of social biases. Next we could search every property in Wikidata to see if it indicated social bias. If successful we may find overlooked stigmas and phobias in society.

I claim that our theoretical question-answering ability has paradigmatically shifted with the growing up of Wikidata. Soon enough you won’t even need to be a sophisticated programmer to whisper your questions into the system. So next time you’re reading, browsing, querying or displaying Wikidata, challenge yourself to think about how to analyse it too.

Which Index Is WIGI Most Closely Related To?

In my latest paper “Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the ‘WIGI’ Index” (blog post and on, my co-author Piotr Konieczny and I proposed a gender index. WIGI, the Wikipedia Gender Inequality Index, is composed of many indicators, but one in particular, the “nation-WIGI”, was designed to be comparable with other well-known indices. The nation-WIGI ranks each nation by the ratio of female biographies among the biography articles about citizens of that nation. Designed in this way, it is possible to correlate WIGI with other indexes. And potentially, we thought, given enough indexes and high enough correlations, we could get a sense for what WIGI is measuring in terms of other indices.

Due to word-count limits, we were unable to submit this research question with the rest of the paper, so it is included here. Formally, we formulated it thus:

RQ4: Of the other Gender Indices which also divide by nation, which index is Wikipedia most closely related to?

First let’s recap the four other nation divided indices we are inspecting (see section 3 of our paper for more detail).

  • GDI
    • The UNDP’s Gender-related Development Index (GDI) introduced only in 1995.
    • A gender-focused extension of the Human Development Index. GDI’s primary focus lies in gender gaps in life expectancy, education, and incomes.
  • GEI
    • The Gender Equity Index (GEI) introduced by Social Watch in 2005.
    • Developed to measure all situations that are unfavourable to women, it ranks countries on three dimensions: education, economic participation and empowerment.
  • GGGI
    • The Global Gender Gap Index (GGGI) developed by the World Economic Forum in 2006.
    • Intended to allow comparison of the gender gap across different countries and years, it focuses on four areas: economic participation and opportunity, educational attainment, political empowerment, and health and survival statistics.
  • SIGI
    • The Social Institutions and Gender Index (SIGI) of the OECD Development Centre from 2007.
    • A composite indicator of gender equality that solely focuses on social institutions (norms, values and attitudes), as well as on the four dimensions of family code, physical integrity, ownership rights and civil liberties.

    Comparison Data:

    With each of the above four foreign indices we have a ranking associating a nation (sometimes referred to as an economy) with an ordinal position. We would like to understand how close two indices are, for which we use the Spearman rank correlation coefficient. Two other technical points need addressing. First, we must use the intersection of nations covered by each index to avoid missing-data problems. Second, we run a calibration step to find the start decade of Wikidata data that maximises the correlation in question.

    The full source code of this calculation is available on github. Also, as an aside, I have another blog post on a functional-programming solution to joining many dataframes at once, which was useful in computing these results.
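As a rough illustration of the comparison step, here is a minimal sketch in plain Python. It is not the code from the github repository: the function names, the single-letter nations and all the scores are made up, and Spearman's coefficient is computed as the Pearson correlation of average ranks.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

def spearman_on_intersection(index_a, index_b):
    """Both arguments map nation -> score; only shared nations are compared."""
    shared = sorted(set(index_a) & set(index_b))
    a = [index_a[nation] for nation in shared]
    b = [index_b[nation] for nation in shared]
    return pearson(ranks(a), ranks(b))

def calibrate_start_decade(wigi_by_decade, other_index):
    """wigi_by_decade maps a start decade -> {nation: female ratio}.
    Returns the start decade with the highest correlation, and that correlation."""
    correlations = {
        decade: spearman_on_intersection(scores, other_index)
        for decade, scores in wigi_by_decade.items()
    }
    best = max(correlations, key=correlations.get)
    return best, correlations[best]

# Made-up toy scores, NOT the real indices.
wigi_by_decade = {
    1800: {"A": 0.20, "B": 0.10, "C": 0.15, "D": 0.05},
    1900: {"A": 0.12, "B": 0.25, "C": 0.18, "D": 0.30},
}
other_index = {"A": 0.60, "B": 0.80, "C": 0.70, "E": 0.90}
best_decade, best_rho = calibrate_start_decade(wigi_by_decade, other_index)
```

Nations D and E drop out because each appears in only one of the two indices, which is the intersection rule in action. The sketch leaves out the significance test.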

    Finally we produced a comparison table of indices,  their correlation, the correlation significance, and the maximizing start decade.  We present it ordered by correlation:

    [Table: National-WIGI compared to Alternative Indexes. Columns: Spearman Correlation, Calibrated Start Decade.]


    Each alternative index shows some statistically significant moderate correlation with our nation-WIGI index. This suggests that the female ratio of Wikidata humans associated with a country is, at minimum, a legitimate addition to the landscape of gender inequality indexes.

    Additionally, the fact that each alternative index most highly correlates when we consider only those biographies starting around 1900 is a positive sanity check for our data. Intuitively this makes sense in the light of the fact that traditional indexes talk about modern history only.

    Still, how should we interpret the fact that our nation-WIGI is most highly correlated with GEI, and least with GDI? What do GEI and GDI measure that shows what WIGI is measuring? We dig further into the methodologies of these indices.

    Social Watch’s GEI explains itself that:

    “In Education, GEI looks at the gender gap in enrolment at all levels and in literacy; economic participation computes the gaps in income and employment and empowerment measures the gaps in highly qualified jobs, parliament and senior executive positions.”

    And the UN’s GDI reports itself as:

    “The new GDI measures gender gap in human development achievements in three basic dimensions of human development: health, measured by female and male life expectancy at birth; education, measured by female and male expected years of schooling for children and female and male mean years of schooling for adults ages 25 and older; and command over economic resources, measured by female and male estimated earned income.”

    So we find that both indexes use indicators connected to education and economic activity. The differing factor ultimately is that the GEI additionally measures empowerment by positions of power, whereas the GDI additionally measures life expectancy. This suggests that the ratio of female biographies by nation in Wikidata is more highly correlated with women’s positions of power by country than with life expectancy by country. That, at first glance, is commensurate with Wikipedia’s notability policies. Notability in Wikipedia essentially defers to inclusion or absence in the journalistic and scholarly record. That means that humans in positions of power, as GEI covers, would tend to be in Wikipedias in greater proportion. Thinking about GDI’s unique focus on life expectancy, one does not obviously see a strong reason that those with greater life expectancy would be more covered in Wikipedia.

    Clearly this is a very rough investigation, and our conclusions can only be limited. Yet we still have some evidence for Wikipedia’s notability policy affecting gender representation. That link might be clear with some feminist reasoning, but the data also supports the notion. Surely this is a nice fact to know for those who criticize the notability inclusion as it stands.

    For questions or suggestions, contact me on twitter – @notconfusing.


Joining many DataFrames at once in Pandas: “n-ary Join”

Joining many DataFrames at once with Reduce

In my last project I wanted to compare many different Gender Inequality Indexes at once, including the one I had just come up with, called “WIGI”. The problem was that the rank and score data for each index was in a separate DataFrame. I needed to perform repeated SQL-style joins. In this case I actually only had to join 5 dataframes, for 5 indices. But later, in helping my partner with her research, she came across the same problem and needed to join more than 100. In my mind I saw that we wanted to accomplish an n-ary join. Mathematically I wanted this type of operation, which I couldn’t find in pandas.

The answer I enjoyed implementing, perhaps because I saw it as this type of repeated operation, is the reduce of functional programming.

Ok, say we have these two data sets:

In [5]:
                    Rank     Score
Republic of China      1  0.356890
Kingdom of Denmark     2  0.347826
Sweden                 3  0.345212
South Korea            4  0.343662
Hong Kong              5  0.342857
In [6]:
         Rank   Score
Iceland     1  0.8594
Finland     2  0.8453
Norway      3  0.8374
Sweden      4  0.8165
Denmark     5  0.8025

We’d probably join them like this:

In [7]:
wigi.join(world_economic_forum, how='outer', lsuffix='_wigi', rsuffix='_wef')
                    Rank_wigi  Score_wigi  Rank_wef  Score_wef
Denmark                   NaN         NaN         5     0.8025
Finland                   NaN         NaN         2     0.8453
Hong Kong                   5    0.342857       NaN        NaN
Iceland                   NaN         NaN         1     0.8594
Kingdom of Denmark          2    0.347826       NaN        NaN
Norway                    NaN         NaN         3     0.8374
Republic of China           1    0.356890       NaN        NaN
South Korea                 4    0.343662       NaN        NaN
Sweden                      3    0.345212         4     0.8165

But we want to generalize. Notice here we also inject the name of the DataFrame into the column names to avoid “suffix-hell” as I would like to term it.

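In [1]:
import glob
import pandas

def make_df(filename):
    # DataFrame.from_csv has been removed from pandas; read_csv with
    # index_col=0 also uses the first column (the country) as the index.
    df = pandas.read_csv(filename, index_col=0)
    # 'gdi.csv' -> 'gdi', appended to every column name to avoid clashes
    name = filename.split('.')[0]
    df.columns = ['{}_{}'.format(col, name) for col in df.columns]
    return df

# Plain-Python stand-in for the IPython magic `!ls`; assumes the working
# directory holds one CSV file per index.
filenames = sorted(glob.glob('*.csv'))

dfs = [make_df(filename) for filename in filenames]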

Now here’s the reducer. I actually end up wanting an inner join in the end, but the type of join is not important to illustrate the fact.

Here we join 5 DataFrames at once.

In [2]:
from functools import reduce  # reduce is no longer a builtin in Python 3

def join_dfs(ldf, rdf):
    return ldf.join(rdf, how='inner')

final_df = reduce(join_dfs, dfs) #that's the magic
             Score_gdi  Rank_gdi  Score_gei  Rank_gei  Rank_sigi  Score_sigi  Rank_wdf  Score_wdf  Rank_wef  Score_wef
Nicaragua        0.912       102         74        37         53      0.8405        13   0.272727         6     0.7894
Rwanda           0.950        80         77        19         43      0.8661       134   0.096154         7     0.7854
Philippines      0.989        17         76        26         57      0.8235         6   0.322785         9     0.7814
Belgium          0.977        38         79        12          1      0.9984        73   0.163734        10     0.7809
Latvia           1.033        52         77        19         24      0.9489        82   0.157623        15     0.7691

I really like the elegance of this solution. I admit there may be other ways to go about it with pandas only, and I understand the R mentality of “no for loops”. Still this is precisely why I like pandas in python – you still get the freedom to play as you wish if it makes more sense to you.

Cyberwizard Institute: Retrospective


Cyber Wizard Institute

The Cyberwizard Institute  (CWI) was a free programming school based out of Sudo Room, running for the month of January 2015. The proclamation that I saw on their website before I volunteered to teach there was:

The idea is to be an anti-bootcamp. Anyone can participate. It’s free. We’re going to try hard to have lecture notes, assignments, and lecture livestreams up online. It will be primarily self-directed, but with guidance from higher level wizards.

As a founding member of sudoroom since 2011, but suffering from a recent malaise in my hacktivism, this was the perfect project to reinvigorate my involvement. What most appealed to me was the idea of an anti-bootcamp, because I’ve wanted to make clear to the world the distinction I care about between start-up culture and technology. I wanted to do something metaphorically akin to hijacking the stereo system at a $4-coffee-wifi-shack and making a public service announcement that computers are not just fancy TVs, but programmable instruments of self-empowerment, which, in addition, can be used for non-commercial purposes.

Meeting Every Day

Without any formal advertising, each sudoer leading CWI was pleasantly surprised when 27 wizardlings showed up on the first day (14 women and 13 men by my count). When I remarked on this to CWI’s originator @marinakukso, she responded that “when you offer a free programming class, with no experience required – people want that”.

I recall some apprehension when we introduced ourselves, and there was the occasional naïve posturing  of people who claimed themselves as programmers with the phrase “I know HTML”. But the need to impress quickly disappeared as we sat down to struggle with them in installing Linux on the laptops they’d brought.

The next day I was nervous that I would arrive at an empty room, since all we had shown fresh minds was that computer programming was about inexplicable Ubuntu hurdles. Still, with only slightly leaky attendance, most wizards did come back for more. And we went right on with teaching them bash.

We continued to meet for 5 hours daily with lectures and hackerspace-esque hands-on floating help from higher level wizards, which we dubbed “social code”. Our rhythm was found quickly, and only half way through the month CWI was feeling so magical, it received coverage in the East Bay Express:

“Many coding bootcamps in the Bay Area charge tens of thousands of dollars in fees, which can be seen as restricting access to what has become essential for finding a job in technology, let alone moving up in Silicon Valley’s so-called “meritocracy.” Kukso explained that Cyber Wizard Institute’s mission is very much aligned with that of Sudo Room, which is to give everyday folks the opportunity to understand and create the technology in their lives. “For a lot people who consider themselves nontechnical,” Kukso said, “a lot things relating to technology or coding seem mystical or secret, our perspective is … everyone can learn these types of things.’

Pedagogical Questions

Yet towards the end, I started to question the effectiveness and importance of CWI. From the beginning, as facilitators we quipped that “anti-bootcamp” really meant “bootcamp”. And the calendar began by reflecting that.

  • Day 1: Install Linux
  • Day 2: Unix and Bash
  • Day 3: vim
  • Day 4: HTML
  • Day 5: javascript
  • Day 6: Networking
  • Day 7: Node.js
  • Day 8: Git
  • etc…

Which is exactly the way that substack, Oakland’s pre-eminent “unix philosopher,” would have it. Yet, that was before the collaborative aspects took over and I began to try and think about how I would teach a less trained non-programmer version of myself what I know now. I mixed in:

(click to view the recorded lectures)

Where substack was spreading his knowledge of artisanal web-buildery, I was attempting to proselytize a world of mathematical elegance. At times I was worried this felt interfering and competitive to the wizards.

However the final projects did come to life, instigated solely by the intrinsic motivation of the new wizards. On the last day arduino hacks and personal-itch websites really had materialized. When I spoke to those who made it all the way through the month, they offered a brighter perspective than my own: perhaps we inadvertently succeeded at being an anti-bootcamp.

The Medium Was Always The Message

As another facilitator, @Johnnyscript, said at the closing Cyberpunk Masquerade Wizard Initiation Ceremony, we showed them what coding is actually like – many differently opinionated hackers running around without too much top-down organization. We delivered the essence of the hackerspace more accessibly than just happening upon a room of silent geeks staring down. Our package, despite being a bit dishevelled, did form a solid curriculum, although it was not as refined as something that you might pay $17,000 for. Yet it also was not an altar for silicon-valley start-up-ism.

Taken together, we find a point that I am surprised that I missed. Whereas  programming bootcamps are normally Cathedrals, as Eric Raymond might put it, we built a Bazaar.

Notconfusingly yours,

Your humble newb-druid.

Cyberwizard Institute II

“Will there be another Cyberwizard Institute?” many are asking. Likely, but it is as-yet unplanned because volunteer work is tiring. If you have the initiative or want to hear about an initiative, join our discussion tracker on github.


Preliminary Results From WIGI, The Wikipedia Gender Inequality Index

This is a preliminary list of results from a research project that is being compiled into a full paper on the subject.

The full paper, in its academic form, is now available on arxiv.


WIGI is the Wikipedia Gender Inequality Index, a project whose purpose is to attempt to gain insight into the gender gap through understanding which humans are represented in Wikipedia. Professor Piotr Konieczny and I thought that, whereas some gender gap research focuses on the editors of Wikipedia directly, we would view the content and metadata of articles as a proxy measure for those editing. Although the notion of analysing Wikipedia content seems quite old, I believe the advent of Wikidata allows a new range of ambitious questions to be asked.


We use Wikidata, the new semantic database that feeds Wikipedia. By inspecting its weekly data dumps, we are able to inspect all the semantic properties associated with every Wikipedia page in any language, all at once. In this case we focus on any article that is about a person, and any data recorded for the properties gender, date of birth, date of death, place of birth, citizenship, and ethnic group (example). We do this courtesy of an excellent tool known as the Wikidata Toolkit.

We compare the found data to historical census data and the World Economic Forum’s Gender Gap Index.

For other computations we also supplement the original data with aggregation maps, made using Mechanical Turk, that derive cultures from place of birth, citizenship, and ethnic group.

This project has been conducted in an Open Notebook Science way, where we have been posting our results and receiving feedback as we work. You can chat with us on-wiki, or on-github where all the code and data needed to reproduce this research is available.

Let’s begin:

 Summary Statistics

As of October 14 2014 we inspected a total of 2,561,999 or about 2.5 million “human” items, that is any Wikidata item with the property “instance of: Q5 (human)”.

On each of those items we looked for the following additional properties, and found them on the following number of items.

Property                 % of total   Items with property
ethnic group                   0.30                 7,772
country*                      23.47               601,361
place of birth                23.93               613,092
date of death                 28.79               737,522
citizenship                   41.44             1,061,634
culture**                     45.20             1,158,086
date of birth                 57.92             1,484,003
gender                        89.40             2,290,433
at least one site link        99.05             2,537,545
a “Q” ID                     100.00             2,561,999


*country is determined by checking whether the place of birth is a country or, if it is a city, whether the city has a country property.

**culture is determined by translating ethnic group, place of birth, and citizenship into 1 of 9 world cultures, as per the Inglehart-Welzel map of the world, with Mechanical Turk. Then we take the consensus of the three aggregated variables. (Actually there were no disagreements between the three variables.) All aggregation maps are available for inspection on github.
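The two footnotes can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the lookup structures and the example places are hypothetical, and the real aggregation maps are the ones on github.

```python
def country_of_birth(place, countries, country_of_city):
    """Return the place itself if it is a country; otherwise fall back to
    the country recorded for that city, or None if nothing is known."""
    if place in countries:
        return place
    return country_of_city.get(place)

def consensus(cultures):
    """Given the cultures derived from ethnic group, place of birth and
    citizenship (None where missing), return the culture they agree on,
    or None if there is no data or the values disagree."""
    known = {culture for culture in cultures if culture is not None}
    return known.pop() if len(known) == 1 else None

# Hypothetical lookups, for illustration only.
countries = {"Denmark", "United States"}
country_of_city = {"Oakland": "United States"}
```

For example, `country_of_birth("Oakland", countries, country_of_city)` falls through to the city lookup, while `consensus(["Confucian", None, "Confucian"])` ignores the missing value.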

Now the first derived and naive statistic of interest – the total gender breakdown. As we’ve seen above 10.3% is of unknown gender, otherwise we encounter 75.7% male, 13.9% female, and <.01% nonbinary which is perhaps better described as 152 cases.

Sanity Checking With Historical Data

We want some sanity checking that the data from Wikidata reflects the world at large. To do this we compared our total population per year, calculated by date of birth, versus the world population.

Comparing the Wikidata data to historical census data, we find a highly significant correlation in total population – Pearson correlation coefficient = .983 with p<0.01. This lends some credence to the notion that this dataset reflects the world at large. (By the way, the historical data trends backwards to 10,000 BCE, but the earliest date of birth in Wikidata is about 4,000 BCE.)

Total Biographies Over Time

These graphs show the absolute volume of items by date of birth and death by gender, over all time, and from 1800 onwards.

This first visualization of the gender gap shows how Wikipedia’s retroactive focus on history has been consistent in its bias in representing females. It’s also generally quite a smooth curve, save for some noticeable spikes around World War I and II.

It’s intriguing to contemplate how we might expect date of birth and death to be related. If they were equally well recorded – and barring extreme events like wars – the death curve would look like a right-shifted birth curve. However we see empirically that this is not true. At all times the death curve remains absolutely smaller than the birth curve, by a factor of about two-thirds. So we can see a bias in recording the date of birth more often than the date of death.

Gender Ratios Over Time

The indication of visual skew in gender prodded me to look at how the ratio of male, female and nonbinary genders develops over time.

Note: From here I aggregate the nonbinary genders into a single class, not for philosophical reasons, but for ease of visualising the many dimensions they represent. I consider it important to be descriptive about what is found in the data, and not to lose any perspective because of personal assumptions about gender. If you think there are better ways to describe this data, I would be glad to hear from you.

We adjust our viewing window to start at 1400 CE because the earlier data is too sparse to provide meaningful visual data.


Curiously, from about 1800 to the present, the female ratio of biographies is greater when using the date of birth measure than the date of death measure. What is the interpretation? Somehow recording female date of birth is more prominent in a way that recording date of death isn’t. Although both ratios are rising, somehow date of birth is outstripping date of death. It would be great to investigate how much this is owed to recording practices and how much it is owed to social phenomena.

Notice that after about 1990 the spike is very large, and even crosses 50%. This is more statistical anomaly than anything else, since the number of humans with date of birth around 1990 is very small, as you can see in the volumes plot. There are only 12,000 entries with date of birth in 1990 and only 199 biographies born in the year 2000. Even discounting the very recent trends of the last 20 years, which describe humans that are just entering adulthood or younger, the female ratio is rising exponentially. I was expecting to fit a logistic curve to the female percentage so that we could predict when we might reach parity, however that notion does not make sense with what is being shown. Although it may not necessarily indicate equity, by fitting an exponential model to this percentage we can calculate when the female percentage would reach 50%. By our calculations it would be February 2034 when the exponential extrapolation reaches 50% female representation. But of course predicting the growth of percentages can lead to nonsensical results (as humorously shown in this xkcd comic). I suspect we will see a logistic model, but simply haven’t encountered the inflection point of slowing rate of growth yet.
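For readers who want to see the shape of such an extrapolation, here is a minimal sketch in plain Python. It is not the calculation that produced the February 2034 figure: it assumes a simple least-squares fit on the logarithm of the share, and the years and shares below are made up so that the share doubles every 50 years.

```python
import math

def fit_exponential(years, shares):
    """Least-squares fit of ln(share) = intercept + rate * year."""
    logs = [math.log(share) for share in shares]
    n = len(years)
    mean_year = sum(years) / n
    mean_log = sum(logs) / n
    rate = (
        sum((t - mean_year) * (l - mean_log) for t, l in zip(years, logs))
        / sum((t - mean_year) ** 2 for t in years)
    )
    intercept = mean_log - rate * mean_year
    return intercept, rate

def year_reaching(target_share, intercept, rate):
    """Solve intercept + rate * year = ln(target_share) for year."""
    return (math.log(target_share) - intercept) / rate

# Made-up illustrative shares, NOT the WIGI data.
years = [1800, 1850, 1900, 1950]
shares = [0.02, 0.04, 0.08, 0.16]

intercept, rate = fit_exponential(years, shares)
parity_year = year_reaching(0.5, intercept, rate)
```

An exponential has no ceiling, so the same fit would eventually predict more than 100%, which is exactly why a logistic model is the more plausible long-run description.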

Aggregating Cultures

First some caveats as to method which we use in the next section.

  • There is no good way to aggregate cultures perfectly. Aggregation in general assumes some loss of fidelity. The point in doing so is to gain a broader-stroke picture, and in this case simplify visualizations.
  • The method we used for aggregation – starting from the Inglehart-Welzel map of the world (right), and then “mechanical turking” in the rest of the values – comes loaded with its own cultural baggage and perspective.
    By DancingPhilosopher [CC BY-SA 3.0], via Wikimedia Commons
  • Inglehart-Welzel map only really makes sense for modern geopolitical boundaries. For instance the notion of having a Protestant and Catholic world before Protestantism and Catholicism, does not make sense. We use those soft modern boundaries superimposed over the geographical region to determine historical values. So if you were born in ancient Greece, you are known as Orthodox in this method.
  • Some ethnicities were a mixture of two cultures, like “Thai-American”, in those cases we took the modifier, so we’d use “Thai” -> South Asian. There are two ways to do this and both of them are not very good, we make a compromise to get a rough picture. The full data is available for more munging if you would like to fine tune it.
  • We aggregate the 9 cultures from 3 similar but different Wikidata properties: citizenship, ethnic group, and place of birth. Since each of those is a different concept, a conflict may arise – however in this research we did not find a case where different property aggregations gave different world cultures.

Gender Ratios By Culture

We make a cross-tabulation of gender by culture. A Chi-squared test shows the observed distributions of gender by culture to be significantly different. We now graph the female percentage of biographies by culture.
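As a reminder of what the test computes, here is a small sketch of the Chi-squared statistic for a cross-tabulation, in plain Python. The function name and the counts are made up for illustration and are not the WIGI data. The p-value is then read off the Chi-squared distribution with the returned degrees of freedom; `scipy.stats.chi2_contingency` does the whole job in one call if SciPy is available.

```python
def chi_squared(table):
    """table is a list of rows of observed counts (e.g. one row per gender,
    one column per culture). Returns (statistic, degrees_of_freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if gender and culture were independent
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

# Made-up counts: rows are genders, columns are two cultures.
skewed = [[10, 20], [20, 10]]
independent = [[10, 20], [20, 40]]
```

A table whose rows are proportional to each other, like `independent`, gives a statistic of zero; the further the observed counts drift from that, the larger the statistic.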


More than anything, I think what astounds most is the large difference in the absolute number of biographies by culture. The European and English-speaking world dominates by a large amount here. Although, it might be that European and English-speaking biographies are simply more likely to be described in Wikidata at the moment, by some sort of quirk of the volunteer import process. Later we’ll see how that affects German and Austrian items.

If we do inspect the female percentages as-is, we find a very high showing for the Confucian culture. After talking to some Confucian-world Wikipedians on twitter (who I can’t find now to credit) and fellow Wikipedia researcher Hai-Yi Zhu from the University of Minnesota, we produced the hypothesis that this is because the phenomenon of celebrity is larger in those cultures, and celebrity is more evenly gender-distributed. We will investigate the celebrity hypothesis in a bit. If you have another hypothesis, we welcome your input for testing.

We provide the graph of nonbinary percentages of biographies by culture too. The cultures are ordered in the same way as the female graph for ease of comparison. Notice that the ordering is relatively similar to the female graph – so on the surface, recording female biographies is linked to recording nonbinary genders too.


 Gender Ratios Over Time

Let’s mix all these variables now, by viewing the culture ratio trends over time. To note our sample size as we continue, only 951,101 or about 37% of total records have all of date of birth, culture, and gender data.



You can see that the period around 1800 is a low point for female recognition, in all cultures and relative to most of the past 3 millennia. Likewise it is visually evident that historical trends in different cultures have, while not reaching 50%, peaked at much higher percentages. In the modern historical graph, we can see a rise occurring for all cultures, and super-linear growth even for the Confucian and South Asian countries. The sky-rocketing ratios after 1990 are less significant, as noted above.

Gender by Wikipedia Language

Now let us recall that there is one more dimension we have recorded, the sitelink dimension, which indicates whether or not for an item a Wikipedia language has an entry for it. To be clear, say for instance that Finnish Wikipedia has an article about a Japanese human; we would be commenting on Finnish Wikipedia. With this data we can analyse the female and nonbinary tendencies of a Language, not a nationality or culture.

Here we have plots that show the relative frequencies of female articles per Wikipedia language, versus the size of the language.


And again for nonbinary humans.


Notice in general there is no simple trend linking Wikipedia size to female representation. The visual technique with which I investigate here is to look for the points whose magnitude from the origin is greatest. Mostly I see a relatively flat constant rate, with a few Wikipedias standing out a bit, like the Japanese, Chinese and Tagalog. So again we are seeing some evidence for Confucian and South Asian cultures being less gender biased when following the sitelink method of analysis.

Gender by Aggregated Wikipedia Language

To shore up the idea of cultural influence in the sitelinks analysis we aggregate the languages into the nine World Cultures as before. In this case, since there are only about 280 languages, I assigned all of the languages by hand, rather than resorting to Mechanical Turk.


To clarify, the technique used here is that every Wikidata item counts towards a culture if a sitelink exists in at least one language associated with that culture. So if an article has language links to the English, Chinese, and Japanese Wikipedias, that item counts only once towards each of the English-speaking and Confucian categories.
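The counting rule can be sketched in a few lines of Python. This is a minimal illustration rather than the notebook's actual code: the `LANGUAGE_TO_CULTURE` mapping is a tiny hypothetical excerpt of the hand-made assignment, and the items are made up.

```python
from collections import Counter

# Hypothetical excerpt of the hand-made language -> culture assignment.
LANGUAGE_TO_CULTURE = {
    "enwiki": "English-speaking",
    "zhwiki": "Confucian",
    "jawiki": "Confucian",
    "kowiki": "Confucian",
}


def cultures_of(sitelinks):
    """Return the set of cultures that have at least one sitelink to the item."""
    return {LANGUAGE_TO_CULTURE[s] for s in sitelinks if s in LANGUAGE_TO_CULTURE}


def count_items_by_culture(items):
    """Count each item once per culture, however many of its languages link to it."""
    counts = Counter()
    for sitelinks in items:
        # A set is used so that two Confucian languages still add only 1.
        counts.update(cultures_of(sitelinks))
    return counts


# Made-up items, each given as the list of wikis that have an article on it.
items = [
    ["enwiki", "zhwiki", "jawiki"],  # once for English-speaking, once for Confucian
    ["jawiki", "kowiki"],            # once for Confucian
    ["enwiki"],                      # once for English-speaking
]
counts = count_items_by_culture(items)
```

Using a set per item is what keeps an article with both Chinese and Japanese sitelinks from being counted twice in the Confucian total.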

Now we have a more coherent picture about which types of Wikipedias by language are focusing on female articles. And we do continue to see a high Confucian showing.

Let us test our celebrity hypothesis. For the Chinese, Japanese, Korean, Tagalog, Urdu, German and English Wikipedias, we retrieved the page content of each biography with a date of birth from 1930 until 1989 (recall that there are very few biographies with a date of birth of 1990 or later).

We search for the English or foreign language words that are associated with celebrity. The dictionary used is:

{'jawiki': [u'俳優', u'選手', u'歌手', u'ミュージシャン', u'モデル', u'アイドル'],

'zhwiki': [u'演員', u'運動員', u'歌手', u'音乐家', u'模特兒', u'偶像'],

'kowiki' : [u'배우', u'선수', u'가수', u'음악가', u'모델', u'우상'],

'tlwiki': [u'artista', 'aktor', u'player', u'mang-aawit', u'musikero', u'modelo', u'idolo'],

'urwiki': [u'اردو', u'کھلاڑ', u'گلوکار' , u'موسیقار' , u'ماڈل', u'بت'],

'dewiki': [u'schauspieler' , u'spieler', u'Musiker', u'Sänger', u'Modell', u'Idol'],

'enwiki' :[u'actor', u'actress', u'player', u'singer', u'musician', u'model', u'idol']}

If you can provide better translations than Google’s software, let me know. We consider a celebrity to be a biography that contains one of the above words within the first 200 characters of its Wikipedia entry.
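As a rough sketch of that rule, the check below looks for any dictionary word in the first 200 characters of a page. It is an illustration only: the function name `is_celebrity` is hypothetical, only the English word list is repeated here, and the case-insensitive substring matching is an assumption, since the post does not say how the words were matched.

```python
# English entry copied from the dictionary above; other wikis would be added the same way.
CELEBRITY_WORDS = {
    "enwiki": ["actor", "actress", "player", "singer", "musician", "model", "idol"],
}


def is_celebrity(page_text, wiki, words=CELEBRITY_WORDS, window=200):
    """True if any celebrity word occurs in the first `window` characters.

    Assumption: matching is a case-insensitive substring test.
    """
    lead = page_text[:window].lower()
    return any(word.lower() in lead for word in words[wiki])


# Made-up page openings for illustration.
print(is_celebrity("Jane Doe (born 1950) is an American singer and songwriter.", "enwiki"))
print(is_celebrity("John Roe (born 1940) is a British physicist.", "enwiki"))
```

A plain substring test is crude (it would also match "model" inside "modelling"), so a word-boundary match might be preferable if the results looked noisy.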

Then we make a heatmap comparing the language, the decade, the gender, and the celebrity percentage.


Using visual inspection, at first glance we can see that the female matrix is darker in general than the other two matrices. So recorded females are more likely to be celebrities among these languages.

Likewise you can see that in general the heatmap transitions to being darker at the top than bottom, so we have shifted to being more celebrity conscious in most languages in recent years.

Lastly we see some vertical-striped features showing that for instance Tagalog is prone to being celebrity conscious across gender and time.

To determine the significance of the effects we perform a logistic regression analysis in predicting the celebrity percentage variable. The coefficient matrix is printed below.

                coef   std err        z    P>|z|   [95.0% Conf. Int.]
decade        0.0236     0.013    1.823    0.068     -0.002     0.049
enwiki        0.0509     0.875    0.058    0.954     -1.664     1.766
jawiki        0.7763     0.837    0.927    0.354     -0.865     2.418
kowiki        1.3834     0.832    1.662    0.097     -0.248     3.015
tlwiki        3.0009     0.945    3.176    0.001      1.149     4.853
urwiki        0.8901     0.869    1.025    0.306     -0.813     2.593
zhwiki        0.5383     0.846    0.637    0.524     -1.119     2.196
female        1.3580     0.453    2.999    0.003      0.471     2.245
intercept   -47.9056    25.368   -1.888    0.059    -97.626     1.815


Which effects count as significant depends on the (arbitrary) threshold you choose, but at least the female and Tagalog variables are significant with p<0.05. If we loosen the threshold slightly to p<0.10, decade and Korean also become predictors. This lends a lot of credence to the notion that when women are recorded in Wikipedias, they have a strong tendency to be celebrities.
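To see where the z and p columns come from, here is a small sanity check that recomputes them from the coefficient and standard error of four rows of the table. It assumes the usual Wald test with a two-sided normal p-value, which is an assumption about how the regression output was produced; small differences from the printed values are expected because the standard errors in the table are rounded.

```python
import math

# (coef, std err) pairs copied from the coefficient table above.
TABLE = {
    "decade": (0.0236, 0.013),
    "kowiki": (1.3834, 0.832),
    "tlwiki": (3.0009, 0.945),
    "female": (1.3580, 0.453),
}


def wald_summary(coef, std_err, z_crit=1.96):
    """Recompute z, the two-sided p-value and the 95% interval for one row."""
    z = coef / std_err
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    ci = (coef - z_crit * std_err, coef + z_crit * std_err)
    return {"z": z, "p": p, "ci": ci}


summaries = {name: wald_summary(*row) for name, row in TABLE.items()}
for name, s in summaries.items():
    print(name, round(s["z"], 3), round(s["p"], 3),
          round(s["ci"][0], 3), round(s["ci"][1], 3))
```

With these inputs, female and tlwiki fall below 0.05 while decade and kowiki land between 0.05 and 0.10, which matches the reading of the table given in the text.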

Connections To The World Economic Forum Index

Indexes are useful, but they are more useful as a group of compatible and comparable indexes. We compared our place of birth and citizenship data, as it relates to gender, to the World Economic Forum Gender Gap Index. The World Economic Forum uses its own methodology to produce a scalar value on the interval (0,1) to rank the gender equality of a country. To match that format, we take the Wikidata data in the form of the female composition of biographies by country.

We performed a calibration step to see which time window of data would make our ranking of countries correlate most closely with the World Economic Forum's. If the Wikidata dataset is restricted to biographies with a date of birth between 1890 and 1990, the Spearman rank correlation is 0.31 with a p-value of 0.03. That means there is some foundation for accepting the female composition of Wikidata items of humans associated with a country as an inequality index, because it is significantly correlated with another respected inequality index.
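The calibration step can be sketched as follows: compute a Spearman rank correlation for each candidate birth-year window and keep the window that correlates best. This is an illustrative, dependency-free version, not the code used for the analysis. The helper names and all the numbers in it are made up, and it does not compute a p-value (SciPy's `scipy.stats.spearmanr` returns both the correlation and the p-value if that library is available).

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_window(wef_scores, female_share_by_window):
    """Pick the birth-year window whose country scores track the WEF scores best."""
    return max(female_share_by_window,
               key=lambda w: spearman(wef_scores, female_share_by_window[w]))


# Made-up scores for five hypothetical countries, in the same country order.
wef = [0.9, 0.8, 0.7, 0.6, 0.5]
female_share_by_window = {
    (1890, 1990): [0.30, 0.35, 0.20, 0.25, 0.10],
    (1700, 1990): [0.10, 0.30, 0.20, 0.35, 0.25],
}
chosen = best_window(wef, female_share_by_window)
```

Because only the ranks matter, the very different scales of the two indexes (WEF scores near 0.8, biography shares near 0.2) do not affect the comparison.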

Here is a sample of the two rankings side by side. We display the top 10 as per the World Economic Forum rank, and then the top 10 as per the Wikipedia (WIGI) rank. You’ll also see the raw scores for each, and the difference in the rankings (WEF rank minus Wikipedia rank).

World Economic Forum Top 10

Country                     WEF Rank  Wikipedia Rank  WEF Score  Wikipedia Score  Rank Difference
Iceland                            1              30     0.8594           0.1895              -29
Finland                            2              39     0.8453           0.1807              -37
Norway                             3              22     0.8374           0.2142              -19
Sweden                             4               1     0.8165           0.3452                3
Denmark                            5              20     0.8025           0.2149              -15
Nicaragua                          6               9     0.7894           0.2727               -3
Rwanda                             7             108     0.7854           0.0962             -101
Ireland                            8              64     0.7850           0.1586              -56
Philippines                        9               3     0.7814           0.3228                6
Belgium                           10              58     0.7809           0.1637              -48


WIGI Top 10

Country                     WEF Rank  Wikipedia Rank  WEF Score  Wikipedia Score  Rank Difference
Sweden                             4               1     0.8165           0.3452                3
South Korea                      117               2     0.6403           0.3437              115
Philippines                        9               3     0.7814           0.3228                6
Bahrain                          124               4     0.6261           0.3171              120
Mauritius                        106               5     0.6541           0.2941              101
People’s Republic of China        87               6     0.6830           0.2812               81
Australia                         24               7     0.7409           0.2760               17
Japan                            104               8     0.6584           0.2732               96
Nicaragua                          6               9     0.7894           0.2727               -3
Swaziland                         92              10     0.6772           0.2593               82

We see that the rankings bear some similarity, but that the correlation is mild. Still, we can take away that what the WEF is driving at with its measure and the share of female biographies about humans from a country are somewhat related ideas.

Data Reliability

The question of how accurately Wikidata reflects all Wikipedias is important to settle before addressing the question of how well Wikipedias reflect the world at large.

During our research, we found a curious quirk in the way that nationality is recorded, and the story is instructive in showing that Wikidata still has a few artefacts of its bot-imported nature. A more in-depth analysis is available in a post I previously wrote, “Wikidata and the Measure of Nationality”.

In short, the idea centres around an early finding which indicated that Protestant European humans seemed to disappear in the 1930s, when we were determining culture just using the “Place of Birth” property. It looked like this:



This is what led us to investigate how nationalities were being classified on Wikidata. The next graphs show which humans have which classification method (by place of birth, citizenship, or both) for nationality. For Germanic humans we saw a large shift:


And for all other populations we witness no such thing:


After publishing these findings, a Wikimedian wrote in to explain that the import of Germanic human data into Wikidata occurred through a bot called “FischBot”, and that the shift is likely only related to the way that software operated. The moral is that we should stay vigilant about the data quality in Wikidata.


It is not my intention to draw any large scale conclusions at the moment. For that I will wait until the publication of the paper for which this analysis is intended. Still I would be glad to hear any insights you might see until then.


We finished writing the paper. An excerpt from the conclusion there:

Our research confirms that gender inequality is a phenomenon with a long history, but whose patterns can be analyzed and quantified on a larger scale than previously thought possible. Through the use of Inglehart-Welzel cultural clusters, we show that gender inequality can be analyzed with regards to world’s cultures. In the dimension studied (coverage of females and other genders in reference works) we show a steadily improving trend, through one with aspects that deserve careful follow up analysis (such as the surprisingly high ranking of the Confucian and South Asian clusters).


Tweet at me @notconfusing .


I programmed all of this research using the IPython notebook, and it’s all entirely open source and hopefully reproducible from

I plan to start parsing and filtering Wikidata monthly to provide updated data, which should be coming soon.

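Note on the regression coefficient table: its last two columns are the lower and upper bounds of the 95.0% confidence interval.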