OpenSym 2015 – Wikipedia in the World of Global Gender Inequality Indices

In a previous post I outlined the process of correlating the gender bias from Wikipedia with other gender inequality indices. Tomorrow I will present a poster on the same topic at OpenSym 2015. I’ll be explicating how Wikipedia’s biographical bias is closer to the gender bias in highly-qualified jobs than in longevity. It’s part of what I’ve been discovering during my Grant with the Wikimedia Foundation. You can read more in the preprint and poster.


Wikipedia in the World of Global Gender Inequality Indices: What The Biography Gender Gap Is Measuring


OpenSym 2015 Poster

The Universal Empathy Machine: Nonviolent Communication Explained with Mathematics and Computer Science

0. The Universal Empathy Machine

Empathy is not sympathy. What’s the difference? Think of the Universal Turing Machine. It is a machine that accepts a program and data, and runs that program on that data. In this way it can simulate all programs on all data. Let us think of a human as a program and human experience as data. Sympathy then, is running your program on someone else’s data. Empathy is running their program on their data. As you can see the results of the sympathy and empathy computations are not guaranteed to be identical. In a nutshell Nonviolent Communication is about becoming the Universal Empathy Machine, to be able to emulate the architecture of an arbitrary person given an arbitrary experience.
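A minimal sketch of the distinction, assuming we model a person as a plain function from an experience to a reaction. The names `alice`, `bob`, `sympathy`, and `empathy` are invented for illustration and are not from the NVC book.

```python
from typing import Callable

# A "person" is modelled as a program: it maps an experience (data) to a reaction.
Person = Callable[[str], str]


def alice(experience: str) -> str:
    """Hypothetical person who shrugs off lateness."""
    return "relaxed" if "late" in experience else "neutral"


def bob(experience: str) -> str:
    """Hypothetical person for whom lateness is stressful."""
    return "anxious" if "late" in experience else "neutral"


def sympathy(me: Person, their_experience: str) -> str:
    """Run MY program on THEIR data."""
    return me(their_experience)


def empathy(them: Person, their_experience: str) -> str:
    """Run THEIR program on THEIR data."""
    return them(their_experience)


bobs_experience = "the train was late"
print(sympathy(alice, bobs_experience))  # alice's program applied to bob's data
print(empathy(bob, bobs_experience))     # bob's program applied to bob's data
```

The two calls receive the same data and differ only in which program is applied, which is why their results are not guaranteed to agree.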



Cover of Nonviolent Communication, replete with sunflower

Nonviolent Communication (abbrv’d NVC), is a theory by Marshall Rosenberg and the title of a book which has an unfortunate cover. Dressed up in a sunflower, you would associate it with self-help pseudoscience and may not allow it to surprise you. I only popped it open because a) it could be pirated on The Pirate Bay, and b) it was the reciprocating recommendation to me after I had been proselytizing my then-favourite-read to a friend, and so I felt obliged. As you can see neither of those reasons should really have you running to the library.

Its insight-olives are sparse in its ciabatta. But however rare they are, those morsels were escape plans for decade-long arguments. I felt so resourceful having a theory of dealing with conflict where I never had one before. The only problem was I couldn’t chat to my friends about it, let alone recommend it. (Update: which now turns out to be a common phenomenon). It’s not intended for anyone that would use the terms logical, or reasonable to describe themselves. They’d be seeking different analogies, examples, and want it to be quite a bit shorter.

Well this is that version, your very short introduction to Nonviolent Communication, abridged and explained through mathematics and computer science analogies. I’ll translate it into the realm of motivations, axioms, communication protocols, and finally foundational flaws.

1. The Intention of Nonviolent Communication is Connection

Any good sceptic should immediately be asking the purpose-question. What we are interested in is the family of problems characterized by the set of disharmonies, disagreements, and arguments.

Now it must be noted that nonviolent communication admits to there being intractable arguments. Not every argument is solvable, and like the halting problem, there is no way of deciding whether an argument will run forever without just trying to solve it.

The main approach we use to arguments is finding connections. A connection is a relation aRb between persons a and b, not necessarily distinct, such that a and b are ready to resolve the disagreement. (What is it when one person is not ready to resolve? We address that later).

Argument resolution often never starts because it does not aim to find connection. In some cases we are talking past each other, we need to connect onto what the main topics are. In other cases we are discussing the same topic, but cannot connect onto a mutually agreeable answer, here NVC says to connect on observations and feelings and needs.

Protip: Notice connection is not always with “others”, because often we want to change abusive self-dialogue.

2. Axioms of Nonviolent communication

There are three strong axioms; sorry Wittgenstein fans.

2.1 Feelings Are Connected to Needs

We suppose a Connection map, which maps feelings to needs.

C \colon F \to N

Where F is the set of all feelings, and N is the set of all needs. Note that C is not necessarily injective but is surjective. The intuition here is that whatever feelings are observed can be mapped to a need – probably unmet. This need usually becomes our focus.
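To make "surjective but not necessarily injective" concrete, here is a small sketch using a dictionary as the Connection map. The particular feelings, needs, and pairings are invented for illustration; NVC does not prescribe this table.

```python
# Hypothetical feelings and needs; the pairings are illustrative only.
FEELINGS = {"scared", "anxious", "lonely", "bored"}
NEEDS = {"security", "connection", "play"}

# C maps every feeling to a need.
C = {
    "scared": "security",
    "anxious": "security",   # two feelings share one need, so C is not injective
    "lonely": "connection",
    "bored": "play",
}


def is_total(mapping, domain):
    """Every feeling maps to some need."""
    return set(mapping) == set(domain)


def is_surjective(mapping, codomain):
    """Every need is reached by at least one feeling."""
    return set(mapping.values()) == set(codomain)


def is_injective(mapping):
    """No two feelings map to the same need."""
    values = list(mapping.values())
    return len(values) == len(set(values))


print(is_total(C, FEELINGS), is_surjective(C, NEEDS), is_injective(C))
```

Because several feelings can land on the same need, knowing the need does not tell us which feeling produced it, but every observed feeling still leads somewhere.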

2.2 All Needs Matter

The set of needs and important needs are exactly equal.

\forall n \colon n \in \{ \text{needs} \} \iff n \in \{ \text{important needs} \}

Taking as an axiom that all needs are important needs allows participants to declare needs without fear. The Universal Empathy Machine is a system of how to accept needs as important which do not appear important, to you.

2.3 There Is Always a Choice

The empty set is not contained in the set of all choices.

S := \{ s \in \mathcal{P}(\text{choices}) \mid s \neq \emptyset \}

Let’s consider this our “Axiom of Choice” – we always have one. NVC asks us to accept a strong theory of free will. We have a nonempty, and possibly infinite set of reactions for all interpersonal interaction.
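For a finite set of choices, the nonempty subsets can simply be enumerated. This is a sketch only; `nonempty_subsets` and the three example reactions are made up for illustration.

```python
from itertools import combinations


def nonempty_subsets(choices):
    """All subsets of `choices` except the empty set."""
    items = sorted(choices)
    return [
        set(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


reactions = {"listen", "paraphrase", "walk away"}
S = nonempty_subsets(reactions)
print(len(S))  # 2**3 - 1 nonempty subsets
```

Starting the subset size at 1 rather than 0 is what excludes the empty set, the "no choice" option.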

3. Communication Protocols

In trying to connect we will have to in some way communicate with each other, let’s call this messaging. NVC says that it’s important to do this in a specific way. The messages that we pass between objects, probably humans, not necessarily distinct, are a 4-tuple containing:

messaging tuple := (observations, feelings, needs, requests)

This quadruple through the unveiling of each element, produces a flow from empiricism, to emotion, humanism, and finally to action.

Not every message needs to contain all four parts, for brevity often they can be omitted. When starting though it can be useful to be exhaustive for practice.
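One way to picture the messaging tuple with optional parts is a small record type. This is only a sketch; the class name `Message` and the helper `missing_parts` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

PARTS = ("observations", "feelings", "needs", "requests")


@dataclass(frozen=True)
class Message:
    """The NVC 4-tuple; any part may be omitted for brevity."""
    observations: Optional[str] = None
    feelings: Optional[str] = None
    needs: Optional[str] = None
    requests: Optional[str] = None

    def missing_parts(self):
        """Names of the parts that were left out, in protocol order."""
        return [name for name in PARTS if getattr(self, name) is None]


brief = Message(
    observations="We agreed to meet at 7:30 and I saw you arrive at 7:45.",
    requests="Would you message me when you are running behind?",
)
print(brief.missing_parts())
```

When practising, an empty list from `missing_parts()` corresponds to the exhaustive form of the message.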

3.1 Observations

"Observing without Evaluation" 
~ NVC Chapter Title 3

In the canonical specification of Nonviolent Communication, this chapter is literally titled “Observing without Evaluation”. Little do they know just how apt that is. In this messaging block we are transmitting our observations rather than our opinions. The assumption NVC operates with is that when viewing the world, we sense our observable universe and then evaluate it to return opinions. But it is not clear that those opinions are useful yet, so let’s not naturally default to eager evaluation. (Well you may not, depending on how much of a functional purist you are.) Taking a cue from the lazy evaluation model, we don’t have to return opinions until they are necessary. In fact, evaluation is not necessary until we send feelings.
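A rough sketch of the lazy model in Python, which is eager by default, so the deferral is written out by hand. `LazyOpinion` and `judge_lateness` are hypothetical names, and times are encoded as minutes after midnight purely for convenience.

```python
class LazyOpinion:
    """Holds an observation and evaluates it into an opinion only on demand."""

    def __init__(self, observation, evaluate):
        self.observation = observation
        self._evaluate = evaluate
        self._opinion = None
        self.evaluated = False

    def force(self):
        """Evaluate at most once, the first time an opinion is needed."""
        if not self.evaluated:
            self._opinion = self._evaluate(self.observation)
            self.evaluated = True
        return self._opinion


def judge_lateness(observation):
    """The evaluation step: turns raw times into an opinion."""
    minutes_late = observation["arrived"] - observation["agreed"]
    return "late" if minutes_late > 0 else "on time"


# Times are minutes after midnight: 7:30 -> 450, 7:45 -> 465.
monday = LazyOpinion({"agreed": 450, "arrived": 465}, judge_lateness)
print(monday.observation)  # the observation can be sent as-is
print(monday.evaluated)    # nothing has been evaluated yet
print(monday.force())      # evaluation happens only when asked for
```

The observation travels unevaluated, and the opinion only comes into existence when something downstream asks for it.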

If you are not a fan of fixed evaluation strategies, another way to think about the observation section is that it’s where we make our imports. Here we are providing the populated namespaces, libraries, Connection API and constants that we will reference in the rest of our communication. The context. Since sensory perception varies from human to human, we cannot rely on the exterior universe to be observed equally, thus we pass along our context. The point is that we do not want any of our further statements to be received ambiguously, so our definitions must be precise. We make only natural-philosophy-style remarks that equate to be exactly true. “You’re always late,” has truth value in the open interval (0,1) depending on which human is observing. “We agreed to meet at 7:30, and I saw you arrive on Monday at 7:45, and on Tuesday at 8:05” has a truth value of just 1. Now the rest of our program, or proof – whichever side of the isomorphism you prefer – can refer to lateness with no misgivings.

Exercises: Observation or Opinion?

1. “Your email signature is 41 lines long, rendering for me as over 4 screenfuls, whereas your last 5 messages to the list were each less than 41 lines long.”
2. “Dante often does not wash his dishes in the hackerspace.”
3. “Allesandra told me that I was not good at identifying contrapositives.”
4. “Our group facilitator controls the meetings.”


1. This is an observation, which is entirely verifiable.
2. This is an opinion because “often” is not defined.
3. This is an observation if Allesandra literally said so, but not if Allesandra was only referring to a specific time the speaker did not identify a contrapositive, in that case the speaker would be making an evaluation.
4. This is an opinion because “controls” is open to interpretation.

Protip: when you are having difficulty finding observations on which to base what you want to say, and your communication is a reply, it is OK, and even encouraged, to literally repeat what your partners have said. $echo what-they-said. More about this in section 4, Receiving.

3.2 Feelings

There is the counter-intuitive Rosenberg law: "Expressing our vulnerabilities can help resolve conflict."

After having carefully preserved the pre-evaluation observation, it is finally time to also give the results of our evaluations – feelings. To explain what feelings are we explore a classic gotcha – pseudofeelings. Pseudofeelings unfortunately do pass duck-typing tests. The key difference is that, with respect to the feeler, feelings are internal, and pseudofeelings are external.

Some examples of pseudofeelings are “I feel unimportant”, “I feel misunderstood”, or “I feel ignored”. Re-expressed as feelings these would be, respectively: “I feel discouraged because I observed I was not part of [important decision]”, “I feel anxious because you doing [action] doesn’t reflect that you understood me”, “I feel hurt, because I perceive I am being ignored.” These re-expressions take an external feeling, and talk about what external event made you feel internally. This is important, because a statement about yourself can never be blaming, and allows others to see your perspective on your observations.

Exercises: Feeling or Pseudofeeling?

1. “I feel scared when you talk about forking”.
2. “When you don’t cite me, I feel neglected.”
3. “I’m happy that you found time to come to Wikimania.”
4. “I feel disappointed by the fact that you did not publish your dataset, because I had to recreate it.”


1. Feeling. Scared describes the internal state of the feeler.
2. Pseudo-feeling. Neglect is a thought about the exterior world. Feeler is probably depressed about not having their work recognized.
3. Feeling. The user is happy, and said so.
4. Pseudo-feeling. Despite very clear reasoning, disappointment is not a feeling, but a pseudofeeling. User is probably feeling aggravated because of needless extra work.

3.3 Needs

"God gave us the universal needs, man created the rest" 

We have described the outside world, and stated how we feel about it, but we’ll require one more step to expose our “Connection API”. Needs are the necessary and sufficient conditions for your life. According to NVC all humans come preloaded with immutable natural feelings which are factory defaults. Because needs are very low-level, primitive objects, it is likely that the communicators will have some of these in common. And with common needs, connection can be found.

Needs are typically very basic, like: autonomy, celebration, creativity, appreciation, love, respect, play, peace, food, rest, sex. Communicating these needs may seem weak, irrational, and impossible to admit out loud, but the whole point is to open up. This point is opensourcing ourselves to the very lowest machine-level. There are two ways in which this Richard Stallman doctrine aids us with emotion. First, the open code of ourselves is a signal for our partners to work with us, and the mystery of how we work disappears. Secondly, when we disclose our code, all bugs are shallow. It is scary that others will be delving into our innermost code, but like Heartbleed, it is the only route to long term security. Remember, in order to further our opening we have Axiom 2.2, the safety mechanism that all needs are important.

Exercises: When Are Needs Being Expressed?

  1. “I feel angry when you talk about transhumanists that way, because I am wanting respect for my own destiny and I hear your words as an insult.”
  2. “I’m discouraged because I would have liked to have progressed further in my work by now.”
  3. “I feel disappointed because you assigned yourself to those bugs, but didn’t squash them.”
  4. “I’m sad that you won’t be meeting me at the vegan restaurant for dinner because I was hoping we could chat about anarchism together.”


  1. Wanting respect for way of life is a basic need, whatever it may be.
  2. This is close enough to a need. It is implied that the need is for the speaker to feel fulfilment from progressing through work. This is actually an exercise verbatim out of the original NVC book.
  3. No need is being expressed here. Perhaps the speaker needs the mental comfort of having no outstanding issues, or needs the security that comes with trustworthy friends – we don’t know and it isn’t clear.
  4. Human contact is a need. Maybe they also need a tempeh gyro.

3.4 Requests

Max Ogden analogizes callbacks to the numbers given to you at restaurants that tell waitrons what to do with your food after it has been cooked. In our case, it is more like the waitron telling you that their job has many harsh realities, which makes them feel very oppressed, crushing their need for economic freedom, and finally telling you to help them smash the capitalist wage-labour sociopolitical complex.

At last we can try to alter the world with requests. Requests are callbacks we send to our communicating partner. They indicate what we’d like our partner to do, and when. Like asynchronous JavaScript, there are a lot of security issues. Co-communicators won’t want to run malware. That’s why the best request-callbacks are verified non-malicious by being the conclusion of an observation-feeling-needs-request syllogism.
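The callback picture can be sketched as follows, under the loose assumption that a receiver only runs a request when it arrives backed by an observation, a feeling, and a need, and when the receiver chooses to. The function `receive` and the dictionary layout are invented for this illustration.

```python
def receive(message, willing=True):
    """Run the request callback only if it is grounded and the receiver agrees.

    `willing` models the receiver's own choice: a request is not a demand.
    """
    grounded = all(message.get(part) for part in ("observation", "feeling", "need"))
    callback = message.get("request")
    if callback is None or not grounded or not willing:
        return None
    return callback()


def paraphrase_back():
    """A concrete, checkable action the sender is asking for."""
    return "I hear that you are frustrated about the schedule."


grounded_message = {
    "observation": "The release date moved twice this month.",
    "feeling": "frustrated",
    "need": "predictability",
    "request": paraphrase_back,
}
bare_message = {"request": paraphrase_back}

print(receive(grounded_message))  # callback runs
print(receive(bare_message))      # ungrounded request is not run
```

The `willing` flag is there because the receiver always keeps the option of declining, even when the request is well formed.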

Protip: By Axiom 2.3 we always have a choice, and so it is impossible to be in a situation where only the other party can break a stalemate.

Issuing precise requests clarifies what we want from our partner. If it feels difficult to articulate what we want from others, that’s typically because it is not an action. NVC says that more specific actions make better requests, otherwise we’re issuing requests that the receiver can’t know whether they’ve fulfilled. If we shout at our colleague that a project is behind schedule, and we know they can’t speed it up – we’re not asking them to speed it up, but merely to acknowledge our anger. In this case the callback request might be “give receipt of my frustration”. Colloquially this would be known as venting. It is nice to have no ambiguity about when the roles are just to listen, or to actually address a behavioural pattern.

Exercises: Request or No Request:

  1. “I want you to grok me.”
  2. “I’d like for you to indicate one moment in my presentation that you appreciated.”
  3. “I would like you to walk more slowly in the airport and tell me where you’re going before you walk off.”
  4. “I want you to be proud of your organizing work.”


  1. Not a request, because grok is not a specific action. It could however be illustrated by asking for the receiver to paraphrase the speaker. (See how to do this well in 4. Receiving)
  2. This request is asking for something concrete, empathetic reception. These kinds of requests are made to seem ridiculous in the modern era, but that is just the long term cultural effect of “guess culture”.
  3. A little bit exasperated, but quite clear actioning in the request. This isn’t not not one of my pet peeves.
  4. How is the speaker going to know when the receiver is being proud? “I want you to tell my friends about your organizing work,” is more direct if that’s what they’re looking for.

4. Receiving

NVC’s messaging protocol is two-way, and now that we’ve covered “expressing honestly” there’s still “receiving empathetically”. Receiving empathetically can be understood as the process of parsing unstructured conversation text into the formal grammar of NVC. The conversation text is what the other person is telling you and our target grammar is the observation-feelings-needs-request 4-tuple. We are not guaranteed to get all of the components, and not in any specific order. We’ve just got a really difficult parsing problem on our hands.
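As a deliberately naive sketch of that parsing problem, the toy parser below sorts sentences into the four slots using a few cue phrases. Real conversation is far messier; the cue list, the `parse` function, and the fallback of treating everything else as an observation are all assumptions made for illustration.

```python
import re

# Toy cue phrases, checked in order; anything unmatched is filed as an observation.
CUES = [
    ("requests", re.compile(r"^(would you|could you|please)\b", re.I)),
    ("feelings", re.compile(r"^i feel\b", re.I)),
    ("needs", re.compile(r"^i need\b", re.I)),
]


def parse(text):
    """Sort sentences into the 4-tuple slots, regardless of the order they arrive in."""
    parts = {"observations": [], "feelings": [], "needs": [], "requests": []}
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]
    for sentence in sentences:
        for slot, cue in CUES:
            if cue.search(sentence):
                parts[slot].append(sentence)
                break
        else:
            parts["observations"].append(sentence)
    return parts


heard = "Would you tell me before you walk off? I feel anxious. You left the gate at 7:45."
result = parse(heard)
print(result["requests"])
print(result["needs"])  # empty: not every component is guaranteed to be present
```

The empty slots are the useful output here: they are what a listener would ask about when paraphrasing back.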

The reason that people are offended when they are asked, “Did you hear me? What did I say?” is because it is actually difficult to paraphrase what may already be a hard message to hear. We will attempt to do better, to translate their communication into an NVC object. Once we can be sure we have their experience (data), and how they are dealing with that experience (program), we can become the Universal Empathy Machine (section 0), and be fully empathetic. When done right the empathetic affect will come alive. Firstly, and very clearly their need to be just-listened-to will be fulfilled. More subtly, our partners can fine-tune their thinking by seeing how closely our Empathy Machine mirrors their Identity function. That is, how well our reflection matches their intended expression. This lets them know if we are missing any of their points, or they haven’t stressed what they would like to.

Protip: Even if someone starts trying to brute force attack us with volume and vitriol, we can still receive empathetically. Intimidating messages are also people asking us to meet their needs. Try for instance, “It seems like you’re really angry about my deleting the private key; because you need more security about what’s happening in your life.” Likewise, if we are engaged with someone not ready to resolve, see if an application of empathy towards them helps.

Counter-intuitively this paraphrasing saves time, even though it takes time. A typical pitfall in a time-saving mentality of receiving is the bad habit of trying to short circuit the conversation by offering unsolicited advice to people. Offering unsolicited advice would be as if a parser (a) took input, (b) maybe did or didn’t parse the input, (c) did not verify the meaning of the maybe-parsed result, and then (d) returned advice based on exogenous heuristics. Returning that computation to the speaker would understandably be nonplussing if not absolutely frustrating as it is devoid of any indication that it related to what they said. Giving advice is only useful iff advice is what the speaker is asking for. By assuming they want a “fix-it” response, we are only engaging in the folly of mansplaining.

Receiving empathetically is to parse our partner’s messages and run them through our Universal Empathy Machine.

Exercises: Empathetic Reception (Y/N)?

  1. Person A: Counting error in Ultimate Street Fighter IV finals? How could I do something so stupid? Person B: Nobody’s perfect, don’t be too hard on yourself.
  2. Person A: You’re a delusional utopian.
    Person B: Are you feeling frustrated because you would like me to admit that there could be other ways of interpreting the Black Lives Matter movement?
  3. Person A: Oh I’m being SO BAD! I NEVER eat cupcakes! Person B: Maybe exercising more would help you.
  4. Person A: When friends of a friend of a friend join our camp without showing commitment, I feel encroached on. It’s like how fraccing companies squeeze me with anti-protest tactics. Person B: I know how you feel. I used to feel that way too.
  5. Person A: I’m unhappy with the grant’s status because you should have made more impact by now. Person B: I know you’re unhappy, but we’ve been slowed by bureaucratic process.


  1. B is giving advice to A, which is not an empathetic response. “You sound like you’re enraged by your lapse of concentration,” is more along the NVC lines.
  2. Empathetic response since B is trying to ascertain from A’s perspective why A might be lashing out.
  3. Again B is advising A, even though the tone is lighter. B might want to try and understand what feelings are behind A’s not being neutral about food.
  4. Not an empathetic response, but a sympathetic response. Same data, but whose program is being applied?
  5. Trick empathy. Just saying you understand is not the same as demonstrating you understand. B left A’s comment about impact on the floor, which B could have used to empathise with.

5. Criticisms of NVC

How many different input methods can we use to write an email? Maybe with a physical keyboard, or a virtual one on a phone, different auto-complete schemes, speech-to-text, and maybe we’ve even had the pleasure of tapping one out T9 stylee. Even though we may aim to transcribe the same thoughts, based on the technique used, the final text will be different. If the text of our emails is altered, so too are the conversations. Now, different formats of email will benefit specific ways of writing. We might be happy arranging dinner plans tapping on glass, but conforming to the standards of a formal letter begs for the old clickety clack.

As input methods change a conversation, so does NVC. Since it is very literally a theory of discourse, using NVC will necessarily bring with it prediscursive bias. The format of the discussion is not a variable that is discussed. But fair enough, any communication strategy would come attached with its own biases. The question then really becomes, since NVC is a prescribed conversation format, which speakers does it benefit?

NVC’s founding theorist was an American white man born in 1930 as the son of Russian Jews. What does that mean specifically for who NVC benefits in conversation? My reading turned up no mention of how Rosenberg’s personal background might affect his theorizing. In my opinion – I am myself a white man with similar citizenry and ancestry – it imports notions of classical logic, a Maslovian hierarchy of needs, and western rationality.

To expand, the technique has an orderly system to follow. This system is static and procedural, where it could be more goal-directed. The ontology presupposes the universality of basic needs. This could be interpreted as the hubris of someone currently with privilege assuming that others are like them. And lastly it does not make large mention of how it would fit in a multiplicity of different communication strategies, as a pluralist might.


NVC has a grand concept, which works at times and is undermined by its flaws at others. It was useful for me because it was the first conflict strategy I’d heard of, and it “made sense” to me. It turned out not to be a persuasion-hack, but it did teach me the concept of empathy. Understanding empathy for the first time was truly a dose of mind expansion. I’ve kept the format of NVC’s exercises at the end of each chapter, because as contrived as it seems, the questions are hard, empathy is not intuitive, and practice is vital. It’s really more praxis than theory. In fact, practising empathy has been the inroad to new ideologies for me like: feminism, anti-racism, LGBTQ-allyship, and other social movements for which I am not the affected demographic. I hope you, person who likes mathematical analogies, can glean something from studying it too.

Notconfusing rules for conversation: 2 rules and a jumpstart.

Meeting people can be a slog. “Hello, what’s your name?”, “Where are you from?”, “What do you do?”, “How do you yawn?”. Yawn? Sorry I was nodding off just writing about how repetitive and tiresome modern meeting and greeting can be. Owing to the way that social networks store information about us, we’re used to thinking about people in a list of attributes “forms” structure. Trans-inclusive feminism has already laid out how select-a-value gender is problematic for self-determination, and it has even subtler consequences in meeting people. We’ve come to assume the next person we meet is some combinatoric permutation of drop-down menus. How are we supposed to meet the person who is our lifelong friend, but at the moment just looks like one more INTJ or Virgo?

In fact the disillusionment from these gruelling social interactions is exactly the motivation for having friends, as a commiserating shelter. How do we let humans do the human thing and wow us with their outstanding creative expression of self from the moment we first meet? I submit notconfusing’s two rules for conversation.

  1. Ask questions that reflect choices people have or could make.
  2. Ask questions that have never been asked before.

Asking questions that reflect choices or decisions is a way to understand a person’s values and principles, which is more informative than part of their current happenstance. Even though this point is supposed to cause a deeper understanding, the questions need not be heavy. “When you’re sleeping on your favourite side, are you facing towards your alarm clock?” might tell you a bit about how much someone wants to combat their own habits without asking “how cognisant are you of your habits and how do you want to combat them?” The analysis of their choices can be done together out loud or both parties can be trusted to do so internally. In either case the point is to revel in the complexity of your partner, while gifting them a bit of Rogerian psychology.

Notice that just Rule 1 by itself could still allow for a “What are your hobbies?” variant, so Rule 2 is brought in to stem the tedium. At first it might seem impossible to ask an entirely unique question to every person, but – as I will prove – there really are an infinite number of these types of questions. Here are a few strategies.

The first strategy is analogous to an infinite game I learned called “Uses for…” where you try to come up with as many uses as you can for a specific item. The example I recall reading about is a bed sheet. So let’s play: It can be used as a tablecloth, as an escape rope for climbing out windows, as a substitute for an all-white painting, as a shooting target for short-sighted people, as a stencil for papier-mâché bed etc. etc. Try and come up with 5 more.

Now apply creative riffing to the things you notice about your partner. For instance these are the topics I brought up from the last ice-breaking conversations I’ve had: reminiscing over video rental returns (standing near a letterbox), a comparison of how different tapes will tear when you don’t have scissors (electrical taped wallet), how often I think about life from a bird’s-eye view (standing at different levels), and the history of the vulcanization of rubber (rode with a flat tyre). Going off-script and generating questions based on the partner and surroundings guarantees freshness. The way your associate engages gives you some understanding of their gestalt person-ness.

Even if you are feeling like you filled out pointless forms all day at work so that you are sapped of your free-associativity, there is always the abstraction “meta” trick. Assume that you have racked your brain, and “where are you from?” is the absolute best question you can come up with because you are only meeting people out of some hateful obligation. You can apply question-abstraction to ask them “what does a person’s answer to <absolute best question I can muster> mean about a person’s personality?” Yes, use your own staleness as a weapon. Since the result of the question-abstraction is also a question, it can be infinitely applied to itself to yield infinitely many unique questions. QED. (If you think this a sad proof, then I encourage you to really try it. I imagine you’ll become loopy enough by the hypnotic repetition of speaking that your co-discusser will either join in with you in your recursion – great fun – or they will have walked away, which is just as well.)
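The question-abstraction argument can be sketched as a generator that keeps applying the same wrapper to its own output. The names `abstract` and `questions` are made up for this illustration.

```python
from itertools import islice


def abstract(question):
    """Question-abstraction: wrap a question in a question about its answers."""
    return f"What does a person's answer to '{question}' mean about a person's personality?"


def questions(seed):
    """Yield the seed question, then its abstraction, then the abstraction of that, forever."""
    current = seed
    while True:
        yield current
        current = abstract(current)


first_three = list(islice(questions("Where are you from?"), 3))
for q in first_three:
    print(q)
```

Each application makes the question strictly longer, so no two questions in the sequence can be equal, which is the uniqueness the proof relies on.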

A last technique, if you want to borrow a bit, is to use my growing list of ice-breakers. I’ve created them as group introductions when I was facilitating Sudo Room hackerspace meetings. As they are targeted to a tech-y crowd you may need to customize a bit – exactly the point that I’m trying to champion.

With the application of these 2 rules you begin to transgress social mores for great good. You ought to explode small talk to eschew complacency. Then you can make more and better friends. Although ironically making this kind of conversation may have the effect of pinning you as a weirdo. Yet disobey the laziness of phone alienation as Saul Williams does in Talk to Strangers: “… that square box don’t represent the sphere that we live in. The earth is not a flat screen, I aint trying to fit in.”

List of Yoga Quotations

Here is a list of Yoga quotations that I’ve compiled from my 200-hour yoga teacher training, other classes I’ve attended, and various yoga books.

Jon Isaacs

  • “The pose begins once you want to leave it”.
  • “Who went to their first yoga class because their life was going really well‽”
  • “I had a hedgefund guy fire me once because I was talking about greed in class.”

Sean Feit

  • On taking non-harming literally, “I take an antibiotic – genocide.”

Jean Mazzei

  • “You can have peace or mind, but not peace of mind because the mind’s purpose is to think.”

Cora Wen

  • “You probably think you have a knee, there’s nothing there. There’s no knee.”
  • “The knee is the prisoner of the hip and the ankle.”

Stacey Swan

  • On being a good teacher, “It’s not about putting your foot behind your head, but keeping it out of your mouth.”
  • “The american way is ‘no pain, no gain’, but yoga is ‘no pain, no pain'”
  • “A good yoga class should be like a Seinfeld episode,” (in that it should come full circle at the end).

Karen Macklin

  • “Vinyasa can also mean how you sequence your life.”

Adrianna Webster

  • “On an inhale, breathe out”.

Leslie Kaminoff – Yoga Anatomy

  • On the spine, “The full glory of nature’s ingenuity is apparent in the human spine…From an engineering perspective it is clear that we have the smallest base of support, the highest center of gravity, and the heaviest cranium (proportional to our body weight) of any other mammal.  As the only true bipeds on the planet, we are also earth’s least mechanically stable creatures.”
  • On breathing, “The energy expended in breathing produces a shape change that lowers the pressure in the chest cavity and permits the air to be pushed into the body by the weight of the planet’s atmosphere. In other words, you create the space and the universe fills it.”
  • Paraphrase on hand balances, “4/5th of the foot is dedicated to weight-bearing  and 1/5th is dedicated to dexterity. The hand (on the other hand) is 1/5th weight-bearing, 4/5th dexterous.”

Rudolf von Laban

  • “Each bodily movement is embedded in a chain of infinite happenings from which we distinguish only the immediate steps and, occasionally, those which immediately follow… In every trace form created by the body, both infinity and eternity are hidden.”

Joel Kramer – Yoga as Self-Transformation

  • “The essence of yoga is not attainment, but how awarely you work with your limits.”
  • “If you’re running from the feeling, it’s pain.” (Otherwise it’s just intensity.)
  • “Yesterday’s Level of Flexibility”. The (unhelpful) concept which I call YLF.

Desikachar –  The Heart of Yoga

  • Yoga defined, “attempting to do something you haven’t before.”


On headstand, “It’s like Wu-Tang says, you gotta ‘protect ya neck.'”

On stepping onto your mat, “Let’s go for a magic carpet ride.”

Travis Judd

  • “Make a conscious choice about what kind of practitioner you want to be right now.”


I can’t recall the provenance of these quotes sadly. Let me know if you can.

  • “The idea that we are ever not moving is an illusion.”
  • “asana is a process not a product otherwise we could say ‘not in a pose’ if head isn’t touching knee, but that is false.”
  • ‘Yoga’ has the root ‘Yuj’ which is the root for the English word ‘Yoke.’
  • Like humans, “water is transparent and reflective but don’t see those properties when in motion.”
  • “If you feel like you’re being inauthentic start telling the truth.”

















WIGI, an Inspire Grantee

WIGI, the Wikipedia Gender Index, my project which looks at the gender representation in Wikipedia Biography articles, has won an Inspire Grant.

Over the last six months, along with fellow Wikipedians, we prototyped and extended this research into a paper, “Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the ‘WIGI’ Index”. One aspect of the biography gender gap we were not able to observe, however, was the trend over time in female and nonbinary biographies. We were only ever looking at a single point in time because it’s too computationally complex to compare all the histories of the Wikipedias together at once. Now, with $22,500 and a small team, our aim is to sample this data weekly, thereby gathering longitudinal data on the way that Wikipedians are representing biographies.

Our project’s form is to create a data portal which will display visualisations of the state of gender in biographies. The underlying data, which associates biography gender with Wikipedia language, date of birth/death, citizenship, profession, and celebrity status, will be purposefully published under an open license. We hope that other researchers can make use of this social indicator, in much the same way one can use the United Nations’ Gender Inequality Index.

The project will be managed entirely on github, and should be completed in about 6 months.

It promises to be,



Asking Ever Bigger Questions With Wikidata

This is a guest blog post I wrote for Wikimedia Deutschland, copied here:

Summary (translated from German): Maximilian Klein uses Wikidata as a data trove for statistical analyses of the world’s knowledge. In his article he describes how he searches Wikidata for answers to the big questions.

Asking Ever Bigger Questions with Wikidata

Guest post by Maximilian Klein

A New Era

Simultaneous discovery can sometimes be considered an indication of a paradigm shift in knowledge, and last month Magnus Manske and I seemed to have both had a very similar idea at the same time. Our ideas were to look at gender statistics in Wikidata and to slice them up by date of birth, citizenship, and language. (Magnus’ blog post, and my own.) At first it seems like quite an elementary and naïve analysis, especially 14 years into Wikipedia, but only within the last year has this type of research become feasible. Like a baby taking its first steps, Wikidata and its tools ecosystem are maturing. That challenges us to creatively use the data in front of us.

Describing 5 stages of Wikidata, Markus Krötzsch foresaw this analysis in his presentation at Wikimania 2014. The stages, which range from Know to Understand, are: Read, Browse, Query, Display, and Analyse (see image). Most likely you have read Wikidata, and perhaps even browsed with Reasonator, queried with autolist, or displayed with histropedia. I want to focus on analyse – the most understand-y of the stages. In fact the example given for analyse was my first exploration of gender and language, where I analysed the ratio of female biographies by Wikipedia Language: English and German are around 15% and Japanese, Chinese and Korean are each closer to 25%.

To do biography analysis before Wikidata was much harder. To know the gender of an article you’d resort to natural language processing or hacks like counting gendered categories and guessing based on first name. Moreover, the effort had to be duplicated for every language you wanted to cover. Now the promise of language-free semantic data, and tools like Wikidata Query and Wikidata Toolkit, are here. The process is easier because it is more database-like: select, group by, apply, and combine.

With this new simplicity, let’s review what we have imagined so far. Here’s a non-exhaustive introduction to the state of creative question-asking:
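To make the “select, group by, apply, and combine” pattern concrete, here is a minimal pandas sketch. The table layout, the column names and the handful of rows are made up for illustration; they are not real Wikidata figures and this is not the code behind the percentages quoted above.

```python
import pandas as pd

# Hypothetical toy data: one row per (biography, Wikipedia language) pair.
# Column names and values are assumptions for illustration only.
biographies = pd.DataFrame({
    "language": ["en", "en", "en", "en", "ja", "ja", "ja", "ja"],
    "gender": ["female", "male", "male", "male",
               "female", "female", "male", "male"],
})

# select: keep only the rows that actually record a gender
known = biographies[biographies["gender"].notna()]

female_share = (
    known.assign(is_female=known["gender"].eq("female"))
         .groupby("language")["is_female"]   # group by language
         .mean()                             # apply the mean, combine into one Series
)

print(female_share)
```

With semantic data the same few lines work for every language edition, which is the simplification the paragraph above is pointing at.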

Pushing Ourselves to Think Even Bigger

Can we think even bigger if we use more of the available data? Thinking about the fact that every claim may have an attached reference, Markus Krötzsch always wants to know: for a given set of claims, what references must be believed in order to believe the set of claims? With that notion we could look at all the claims associated with all the items of a given language, and thus the required belief system of that language. At this point we could ask: what are the differences in the belief systems of any two languages?

Another way we could test the fundamental principles of knowledge and culture is to consider the chains made by the subclass of, instance of, or cause of properties. Every language is present at different links of each chain. So we can look at the differences in ways in which languages organize a hierarchy of concepts – or if they think it’s a hierarchy at all.

Much fun for logicians and epistemologists. But we can also ask more socially important questions, questions about how language and society relate. What biases do we have that we aren’t even aware of? The method, for which I’ve proposed a PhD, could be conducted as follows. We’re aware of sexism in our societies, and as you’ve seen we’ve started to build a statistical profile of how it manifests in Wikidata. Likewise we’re cognizant of racism and homophobia. We might next look at the rates at which people appear in Wikidata by race and desire. Let’s assume we could train a model to say that these kinds of distributions are types of social biases. Next we could search every property in Wikidata to see if it indicated social bias. If successful we may find overlooked stigmas and phobias in society.

I claim that our theoretical question-answering ability has paradigmatically shifted with the growing up of Wikidata. Soon enough you won’t even need to be a sophisticated programmer to whisper your questions into the system. So next time you’re reading, browsing, querying or displaying Wikidata, challenge yourself to think about how to analyse it too.

Which Index Is WIGI Most Closely Related To?

In my latest paper, “Gender Gap Through Time and Space: A Journey Through Wikipedia Biographies and the ‘WIGI’ Index” (see the blog post), my co-author Piotr Konieczny and I proposed a gender index. WIGI, the Wikipedia Gender Inequality Index, is composed of many indicators, but one in particular, the “nation-WIGI”, was designed to be comparable with other well-known indices. The nation-WIGI ranks each nation by the ratio of female biographies among the biography articles about citizens of that nation. Designed in this way, it is possible to correlate WIGI with other indexes. And potentially, we thought, given enough indexes and high enough correlations, we could get a sense of what WIGI is measuring in terms of other indices.

Due to word-count limits, we were unable to submit this research question with the rest of the paper, so it is included here. Formally, we formulated it thus:

RQ4: Of the other gender indices which are also divided by nation, which index is Wikipedia most closely related to?

First let’s recap the four other nation divided indices we are inspecting (see section 3 of our paper for more detail).

  • GDI
    • The UNDP’s Gender-related Development Index (GDI) introduced only in 1995.
    • A gender-focused extension of the Human Development Index. GDI’s primary focus lies in gender-gaps in life expectancy, education, and incomes.
  • GEI
    • The Gender Equity Index (GEI) introduced by Social Watch in 2005.
    • Developed to measure all situations that are unfavourable to women, it ranks countries on three dimensions: education, economic participation and empowerment.
  • GGGI
    • The Global Gender Gap Index (GGGI) developed by the World Economic Forum in 2006.
    • Intended to allow comparison of the gender gap across different countries and years, it focuses on four areas: economic participation and opportunity, educational attainment, political empowerment, and health and survival statistics.
  • SIGI
    • The Social Institutions and Gender Index (SIGI) of the OECD Development Centre from 2007.
    • A composite indicator of gender equality that solely focuses on social institutions (norms, values and attitudes), as well as on the four dimensions of family code, physical integrity, ownership rights and civil liberties.

    Comparison Data:

    With each of the above four foreign indices we have a ranking associating a nation (sometimes referred to as an economy) with an ordinal position. We would like to understand how close two indices are, for which we use the Spearman rank correlation coefficient. Two other technical points need addressing. First, we must use the intersection of nations covered by each index to avoid missing-data problems. Second, we perform a calibration step to find the start decade of Wikidata data that maximises the correlation in question.

    The full source code of this calculation is available on github. Also, as an aside, I have another blog post on a functional-programming solution to joining many dataframes at once, which was useful in computing these results.
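The comparison described above can be sketched roughly as follows. This is a simplified illustration, not the project’s actual code (that lives in the github repository mentioned in this post). The function names, the single-letter nation labels and every number are hypothetical, and the sketch does not compute the significance of the correlation.

```python
import pandas as pd

def spearman_on_shared_nations(index_a, index_b):
    """Spearman correlation over the nations covered by both indices."""
    shared = index_a.index.intersection(index_b.index)
    # Spearman's rho is the Pearson correlation of the rank-transformed scores.
    ranks_a = index_a.loc[shared].rank()
    ranks_b = index_b.loc[shared].rank()
    return ranks_a.corr(ranks_b)

def calibrate_start_decade(wigi_by_decade, other_index):
    """Return (decade, rho) for the start decade that correlates best."""
    results = {decade: spearman_on_shared_nations(scores, other_index)
               for decade, scores in wigi_by_decade.items()}
    best = max(results, key=results.get)
    return best, results[best]

# Hypothetical toy scores keyed by nation; "E" is missing from the other index
# and "D" is missing from the WIGI scores, so both drop out of the comparison.
other = pd.Series({"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6})
wigi_by_decade = {
    1800: pd.Series({"A": 0.10, "B": 0.30, "C": 0.20, "E": 0.50}),
    1900: pd.Series({"A": 0.30, "B": 0.20, "C": 0.10, "E": 0.50}),
}

best_decade, best_rho = calibrate_start_decade(wigi_by_decade, other)
print(best_decade, best_rho)
```

Restricting both series to the shared nations before ranking matters: ranking first and intersecting afterwards would leave gaps in the ranks and distort the coefficient.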

    Finally we produced a comparison table of indices,  their correlation, the correlation significance, and the maximizing start decade.  We present it ordered by correlation:

    Table: National-WIGI compared to Alternative Indexes (columns: Spearman Correlation, Calibrated Start Decade)

    Each alternative index shows some statistically significant, moderate correlation with our nation-WIGI index. This suggests that the female ratio of Wikidata humans associated with a country is, at minimum, a legitimate addition to the landscape of gender inequality indexes.

    Additionally, the fact that each alternative index most highly correlates when we consider only those biographies starting around 1900 is a positive sanity check for our data. Intuitively this makes sense in the light of the fact that traditional indexes talk about modern history only.

    Still, how should we interpret the fact that our nation-WIGI is most highly correlated with GEI, and least with GDI? What do GEI and GDI measure that could show what WIGI is measuring? We dig further into the methodologies of these indices.

    Social Watch describes its GEI as follows:

    “In Education, GEI looks at the gender gap in enrolment at all levels and in literacy; economic participation computes the gaps in income and employment and empowerment measures the gaps in highly qualified jobs, parliament and senior executive positions.”

    And the UN’s GDI reports itself as:

    “The new GDI measures gender gap in human development achievements in three basic dimensions of human development: health, measured by female and male life expectancy at birth; education, measured by female and male expected years of schooling for children and female and male mean years of schooling for adults ages 25 and older; and command over economic resources, measured by female and male estimated earned income.”

    So we find that both indexes use indicators connected to education and economic activity. The differing factor ultimately is that the GEI additionally measures empowerment by positions of power, whereas the GDI additionally measures life expectancy. This suggests that the ratio of female biographies by nation in Wikidata is more highly correlated with women’s positions of power by country than with life expectancy by country. That, at first glance, is commensurate with Wikipedia’s notability policies. Notability in Wikipedia essentially defers to inclusion or absence in the journalistic and scholarly record. That means that humans in positions of power, as GEI covers, would tend to be in Wikipedias in greater proportion. Thinking about GDI’s unique use of life expectancy, one does not obviously see a strong reason that those with greater life expectancy would be more covered in Wikipedia.

    Clearly this is a very rough investigation, and our conclusions can only be limited. Yet we still have some evidence for Wikipedia’s notability policy affecting the gender representation. That link might be clear with some feminist reasoning, but the data also supports the notion. Surely this is a nice fact to know for those who criticize the notability inclusion as it stands.

    For questions or suggestions, contact me on twitter – @notconfusing.


Joining many DataFrames at once in Pandas: “n-ary Join”

Joining many DataFrames at once with Reduce

In my last project I wanted to compare many different Gender Inequality Indexes at once, including the one I had just come up with, called “WIGI”. The problem was that the rank and score data for each index was in a separate DataFrame. I needed to perform repeated SQL-style joins. In this case I actually only had to join 5 dataframes, for 5 indices. But later, while I was helping my partner with her research, she came across the same problem and needed to join more than 100. In my mind I saw that we wanted to accomplish this n-ary join. Mathematically I wanted this type of operation, which I couldn’t find in pandas’ join.

The answer I enjoyed implementing, perhaps because I saw it as this type of repeated operation, is the reduce of functional programming.

Ok, say we have these two data sets:

In [5]:
Rank Score
Republic of China 1 0.356890
Kingdom of Denmark 2 0.347826
Sweden 3 0.345212
South Korea 4 0.343662
Hong Kong 5 0.342857
In [6]:
Rank Score
Iceland 1 0.8594
Finland 2 0.8453
Norway 3 0.8374
Sweden 4 0.8165
Denmark 5 0.8025

We’d probably join them like this:

In [7]:
wigi.join(world_economic_forum, how='outer', lsuffix='_wigi', rsuffix='_wef')
Rank_wigi Score_wigi Rank_wef Score_wef
Denmark NaN NaN 5 0.8025
Finland NaN NaN 2 0.8453
Hong Kong 5 0.342857 NaN NaN
Iceland NaN NaN 1 0.8594
Kingdom of Denmark 2 0.347826 NaN NaN
Norway NaN NaN 3 0.8374
Republic of China 1 0.356890 NaN NaN
South Korea 4 0.343662 NaN NaN
Sweden 3 0.345212 4 0.8165

But we want to generalize. Notice here we also inject the name of the DataFrame into the column names to avoid “suffix-hell” as I would like to term it.

In [1]:
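import pandas

def make_df(filename):
    # DataFrame.from_csv has been removed from pandas;
    # read_csv with index_col=0 is its replacement.
    df = pandas.read_csv(filename, index_col=0)
    # e.g. 'gdi.csv' -> 'gdi'
    name = filename.split('.')[0]
    # inject the name into every column, e.g. 'Rank' -> 'Rank_gdi'
    df.columns = ['{}_{}'.format(col, name) for col in df.columns]
    return df

# IPython shell magic: lists the files in the working directory
filenames = !ls

dfs = [make_df(filename) for filename in filenames]

Now here’s the reducer. I actually end up wanting an inner join in the end, but the type of join is not important to illustrate the point.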

Here we join 5 DataFrames at once.

In [2]:
from functools import reduce  # reduce is not a builtin in Python 3

def join_dfs(ldf, rdf):
    return ldf.join(rdf, how='inner')

final_df = reduce(join_dfs, dfs) #that's the magic
Score_gdi Rank_gdi Score_gei Rank_gei Rank_sigi Score_sigi Rank_wdf Score_wdf Rank_wef Score_wef
Nicaragua 0.912 102 74 37 53 0.8405 13 0.272727 6 0.7894
Rwanda 0.950 80 77 19 43 0.8661 134 0.096154 7 0.7854
Philippines 0.989 17 76 26 57 0.8235 6 0.322785 9 0.7814
Belgium 0.977 38 79 12 1 0.9984 73 0.163734 10 0.7809
Latvia 1.033 52 77 19 24 0.9489 82 0.157623 15 0.7691

I really like the elegance of this solution. I admit there may be other ways to go about it with pandas only, and I understand the R mentality of “no for loops”. Still this is precisely why I like pandas in python – you still get the freedom to play as you wish if it makes more sense to you.
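For comparison, one pandas-only route is pandas.concat along the columns with join='inner'. The sketch below is self-contained and uses three tiny, made-up DataFrames (the names, ranks and scores are illustrative assumptions, not the real index data) to check that it yields the same table as the reduce approach.

```python
from functools import reduce

import pandas as pd

def with_name(df, name):
    # Inject the index name into each column, e.g. 'Rank' -> 'Rank_wigi'.
    return df.rename(columns=lambda col: '{}_{}'.format(col, name))

# Hypothetical toy data, indexed by country.
wigi = pd.DataFrame({'Rank': [1, 2, 3], 'Score': [0.35, 0.34, 0.33]},
                    index=['Sweden', 'Denmark', 'Iceland'])
wef = pd.DataFrame({'Rank': [1, 2, 3], 'Score': [0.86, 0.82, 0.84]},
                   index=['Iceland', 'Sweden', 'Norway'])
gei = pd.DataFrame({'Rank': [1, 2, 3], 'Score': [89, 87, 88]},
                   index=['Sweden', 'Iceland', 'Finland'])

dfs = [with_name(wigi, 'wigi'), with_name(wef, 'wef'), with_name(gei, 'gei')]

# n-ary inner join, expressed as a fold over pairwise joins
reduced = reduce(lambda ldf, rdf: ldf.join(rdf, how='inner'), dfs)

# the same join in one pandas call: align on the index, keep the shared rows
concatenated = pd.concat(dfs, axis=1, join='inner')

# sort the rows before comparing, since the two routes may order them differently
print(reduced.sort_index().equals(concatenated.sort_index()))
```

The reduce version still has the edge when the pairwise step needs custom logic, since join_dfs can be any function of two DataFrames.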

Cyberwizard Institute: Retrospective


Cyber Wizard Institute

The Cyberwizard Institute  (CWI) was a free programming school based out of Sudo Room, running for the month of January 2015. The proclamation that I saw on their website before I volunteered to teach there was:

The idea is to be an anti-bootcamp. Anyone can participate. It’s free. We’re going to try hard to have lecture notes, assignments, and lecture livestreams up online. It will be primarily self-directed, but with guidance from higher level wizards.

As a founding member of sudoroom since 2011, but suffering from a recent malaise in my hacktivism, I found this the perfect project to reinvigorate my involvement. What most appealed to me was the idea of an anti-bootcamp, because I’ve wanted to make clear to the world the distinction I care about between start-up culture and technology. I wanted to do something metaphorically akin to hijacking the stereo system at a $4-coffee-wifi-shack and making a public service announcement that computers are not just fancy TVs, but programmable instruments of self-empowerment, which, in addition, can be used for non-commercial purposes.

Meeting Every Day

Without any formal advertising, each sudoer leading CWI was pleasantly surprised when 27 wizardlings showed up on the first day (14 women and 13 men by my count). When I remarked on this to CWI’s originator @marinakukso, she responded that “when you offer a free programming class, with no experience required – people want that”.

I recall some apprehension when we introduced ourselves, and there was the occasional naïve posturing  of people who claimed themselves as programmers with the phrase “I know HTML”. But the need to impress quickly disappeared as we sat down to struggle with them in installing Linux on the laptops they’d brought.

The next day I was nervous that I would arrive at an empty room, since all we had shown fresh minds was that computer programming was about inexplicable Ubuntu hurdles. Still, with only slightly leaky attendance, most wizards did come back for more. And we went right on with teaching them bash.

We continued to meet for 5 hours daily with lectures and hackerspace-esque hands-on floating help from higher level wizards, which we dubbed “social code”. Our rhythm was found quickly, and only half way through the month CWI was feeling so magical, it received coverage in the East Bay Express:

“Many coding bootcamps in the Bay Area charge tens of thousands of dollars in fees, which can be seen as restricting access to what has become essential for finding a job in technology, let alone moving up in Silicon Valley’s so-called “meritocracy.” Kukso explained that Cyber Wizard Institute’s mission is very much aligned with that of Sudo Room, which is to give everyday folks the opportunity to understand and create the technology in their lives. “For a lot people who consider themselves nontechnical,” Kukso said, “a lot things relating to technology or coding seem mystical or secret, our perspective is … everyone can learn these types of things.’

Pedagogical Questions

Yet towards the end, I started to question the effectiveness and importance of CWI. From the beginning, as facilitators, we quipped that “anti-bootcamp” really meant “bootcamp”. And the calendar began by reflecting that.

  • Day 1: Install Linux
  • Day 2: Unix and Bash
  • Day 3: vim
  • Day 4: HTML
  • Day 5: javascript
  • Day 6: Networking
  • Day 7: Node.js
  • Day 8: Git
  • etc…

Which is exactly the way that substack, Oakland’s pre-eminent “unix philosopher,” would have it. Yet, that was before the collaborative aspects took over and I began to try and think about how I would teach a less trained non-programmer version of myself what I know now. I mixed in:

(click to view the recorded lectures)

Where substack was spreading his knowledge of artisanal web-buildery, I was attempting to proselytize a world of Mathematical elegance. At times I was worried this felt interfering and competitive to the wizards.

However the final projects did come to life, instigated solely by the intrinsic motivation of the new wizards. On the last day arduino hacks and personal-itch websites really had materialized. Those who made it all the way through the month spoke of a brighter perspective than my own: perhaps we inadvertently succeeded at being an anti-bootcamp.

The Medium Was Always The Message

As another facilitator, @Johnnyscript, said at the ending Cyberpunk Masquerade Wizard Initiation Ceremony, we showed them what coding is actually like – many differently opinionated hackers running around without too much top-down organization. We delivered the essence of the hackerspace more accessibly than just happening upon a room of silent geeks staring down. Our package, despite being a bit dishevelled, did form a solid curriculum, although it was not as refined as something that you might pay $17,000 for. Yet it also was not an altar for silicon-valley start-up-ism.

Taken together, we find a point that I am surprised that I missed. Whereas programming bootcamps are normally Cathedrals, as Eric Raymond might put it, we built a Bazaar.

Notconfusingly yours,

Your humble newb-druid.

Cyberwizard Institute II

“Will there be another Cyberwizard Institute?” many are asking. Likely, but it is as-yet unplanned because volunteer work is tiring. If you have the initiative or want to hear about an initiative, join our discussion tracker on github.