Principle of charity and n-step theories of mind
[Oct. 17th, 2012 | 02:35 am]
I had a hard time writing this article, and I decided to try expressing some concepts in pseudocode. There are at least three problems with this. First, it looks pretentious. Second, I have zero programming training and the code is probably terrible. Third, I hate it when other people try to express simple non-mathematical concepts in math or pseudocode and I imagine other people will hate mine. If you hate it, let me know in the comments and I will try not to do it again. While I'm disclaiming, let me also apologize for pretending that "mental algorithms" are a real thing, and for being super-sloppy in pretending to represent them. I just had a really hard time putting this idea into language and settled for poor communication over none at all.
EDIT: wildeabandon suggests is-ought distinction might work better than code/variable distinction.
Principle of charity
Suppose you're a Democrat who supports Obamacare. In fact, suppose we have an exact printout of your mental algorithm, and it reads:
D: support($POLICY) iff (helps_people_in_need($POLICY) = 1); helps_people_in_need(Obamacare) = 1
That is, you support a policy if and only if the policy helps people in need, and you believe Obamacare does this.
You go out and meet a Republican who opposes Obamacare. There are at least two possible mental algorithms the Republican might use to produce this position:
R1: support($POLICY) iff (helps_people_in_need($POLICY) = 0); helps_people_in_need(Obamacare) = 1
R2: support($POLICY) iff (helps_people_in_need($POLICY) = 1); helps_people_in_need(Obamacare) = 0
That is, the Republican could agree with you that Obamacare helps people in need, but oppose helping needy people; or the Republican could support helping needy people but believe Obamacare does not do so.
The Principle of Charity tells us to prefer R2 to R1. This suggests a possible nonstandard phrasing of the Principle: prefer explanations of opposing positions in which they share the same algorithm, but disagree on the values of variables.
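For anyone who would rather poke at this than read it, here is a minimal Python sketch of D, R1, and R2. The `Mind` class, the two rule functions, and `charity_score` are names I made up for illustration; nothing here is meant as a real model of cognition. The sketch only shows that both R1 and R2 predict the same behavior, and that the Principle of Charity breaks the tie in favor of the model that shares your rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Mind:
    """Hypothetical container: a decision rule plus beliefs about policies."""
    rule: Callable[[bool], bool]            # maps "helps people in need?" to "support?"
    helps_people_in_need: Dict[str, bool]   # the "variables"

    def support(self, policy: str) -> bool:
        return self.rule(self.helps_people_in_need[policy])


def support_if_helps(helps: bool) -> bool:
    """Rule used by D and R2: support exactly the policies that help."""
    return helps


def support_if_not_helps(helps: bool) -> bool:
    """Rule used by R1: support exactly the policies that do not help."""
    return not helps


democrat = Mind(support_if_helps, {"Obamacare": True})
r1 = Mind(support_if_not_helps, {"Obamacare": True})
r2 = Mind(support_if_helps, {"Obamacare": False})


def charity_score(me: Mind, other: Mind) -> int:
    """1 if the other mind shares my rule, else 0 (an assumed, crude measure)."""
    return 1 if other.rule is me.rule else 0


# Both candidate models predict opposition to Obamacare,
# so prefer the one that shares the Democrat's rule.
candidates = {"R1": r1, "R2": r2}
preferred = max(candidates, key=lambda name: charity_score(democrat, candidates[name]))
```

Here `preferred` comes out as R2: the same rule as the Democrat, with a different value for `helps_people_in_need`.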
Theory of mind
Theory of mind is a concept from the psychologists, who tend to worry a lot about when children develop it and whether autistic people might not have it. It mostly means the ability to model other people as possibly thinking differently than you do.
The classic example involves two young children, Jane and John. An experimenter enters the room and hides a toy under the bed. Then she takes John out of the room and, with only Jane watching, takes the toy out from under the bed and hides it in a box. Then the experimenter brings John back into the room and asks Jane where John will look for the toy.
If Jane is very young or developmentally disabled, she predicts John will look for the toy in the box, since that is where she would look for the toy. If Jane is older, she will predict John will look for the toy under the bed. Her theory of mind has successfully predicted that John, who only saw the toy hidden under the bed, will still expect it to be there, even though she herself feels differently.
Bah. There's a better picture here.
Jane's mental algorithm looks like this:
$LAST_KNOWN_LOCATION_OF_TOY = box
Jane correctly assumes she can model John with her own mental algorithm; that is, he will think the same way she does. However, before she has theory of mind, she assumes she and John share all variables. Once she develops theory of mind, she realizes that although John will use the same mental algorithm, his variables may have different values.
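The Jane and John case can be sketched the same way. I am assuming the shared rule is "look wherever you last saw the toy", which the pseudocode above does not spell out; the function and variable names are likewise my own. The only thing theory of mind changes in this sketch is whose variable Jane plugs into the rule.

```python
def look_for_toy(last_known_location: str) -> str:
    """The shared rule (assumed): search wherever you last saw the toy."""
    return last_known_location


# Each child's value for $LAST_KNOWN_LOCATION_OF_TOY.
jane_last_known_location = "box"            # Jane watched the toy being moved
john_last_known_location = "under the bed"  # John left before the move


def jane_predicts_john(has_theory_of_mind: bool) -> str:
    """Jane always runs her own rule; theory of mind only changes
    whose variable she feeds into it."""
    if has_theory_of_mind:
        return look_for_toy(john_last_known_location)
    return look_for_toy(jane_last_known_location)
```

Without theory of mind Jane predicts the box; with it she predicts under the bed, using exactly the same rule both times.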
Supposedly theory of mind is the sort of thing you either have or you don't, and as long as you're older than four and neurotypical, you have it. But I wonder. There are some people who it seems just don't get the Principle of Charity, and I'm starting to wonder if there's a theory of mind correlation.
Barry The Bully and the 1-step theory
Let me illustrate with a story that is very loosely based around a fight I vividly remember between a very old group of acquaintances, in about the same way the Chronicles of Narnia are very loosely based on the New Testament:
Barry is a bully. Whenever he doesn't like anyone, he picks on them and insults them until they cry or run away. For example, a while back he did this to my friend Bob, and Bob got pretty traumatized.
Mary and Sarah are not bullies. They're both awesome people, and I like both of them. But unfortunately Sarah hurt Mary a long time ago, and Mary hates Sarah viciously.
One day Barry, Mary, Sarah and I happen to be stuck together. Sarah says something, and Mary, who really really hates her, gives her the cold shoulder and makes it clear that she's not talking to her.
Barry, who likes Sarah, gets enraged at Mary and starts cursing at her, telling her what a worthless person she is and how no decent person would ever treat someone the way she's treating Sarah and how she should be ashamed of herself. This goes on and on, until it gets awkward.
Later, I encounter Barry and he still wants to tell me how much of a jerk Mary is for being mean to Sarah. I point out "Barry, come on. You're mean to people all the time. Like Bob!" And Barry says "Yeah, but Bob deserved it! Bob is a terrible person. Mary is mean because she bullied Sarah even though Sarah was nice!"
Barry's algorithm for being mean to someone looks something like:
B: bully($VICTIM) iff (hate($VICTIM) = 1);
hate(Bob) = 1, hate(Sarah) = 0
In other words, Barry bullies someone only if he hates them, and he hates Bob but doesn't hate Sarah.
If Barry had no theory of mind at all, he would sit there puzzled, wondering why Mary was bullying Sarah when in fact it was Bob she hated and Sarah looked nothing like Bob. But in fact, Barry does have a theory of mind. He realizes that Mary hates Sarah just as he hates Bob, so he's not surprised when she bullies Sarah. So far, so good. But then I ask Barry: given that you and Mary are following the same cognitive algorithm, aren't you morally equivalent? Barry says no, and we imagine his algorithm as follows:
B2: hate($VICTIM) = $is_bad_person($VICTIM); $is_bad_person(Bob) = 1, $is_bad_person(Sarah) = 0
Here Barry says that he only hates someone if they are a bad person. So he's explaining the values he assigned hate(Bob) and hate(Sarah) in the last step of his algorithm. And in the process, he's rejecting my claim that his algorithm and Mary's algorithm are equivalent. He's saying "No, my algorithm tells me to only hate bad people; Mary's algorithm tells her to hate Sarah even though she's perfectly nice."
But here we have a glaring theory-of-mind failure. Barry is failing to realize that Mary may be following the same algorithm, but with different values in the variables:
M2: hate($VICTIM) = $is_bad_person($VICTIM);
$is_bad_person(Bob) = 0, $is_bad_person(Sarah) = 1
In other words, Mary also hates only bad people, just like Barry. She just has a different idea of who the bad people are.
I would argue that here Barry displays a one-step theory of mind. He's capable of understanding that people can think one step differently than he does; otherwise he wouldn't be able to understand why Mary bullies Sarah instead of Bob. But people who think differently than he does two steps back? That's crazy talk!
Therefore, Mary must be an evil person working off a completely different mental algorithm that tells her to hate nice people. And so he fails at Principle of Charity.
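A rough Python sketch of the two levels, again with made-up names and no claim to being how anyone's head actually works. Barry and Mary run the same `hate` and `bully` rules from B, B2, and M2. The difference between the one-step and two-step views is only whose `is_bad_person` values get used to judge Mary's hatred.

```python
def hate(is_bad_person: bool) -> bool:
    """B2 / M2: hate someone exactly when you believe they are a bad person."""
    return is_bad_person


def bully(hates: bool) -> bool:
    """B: bully someone exactly when you hate them."""
    return hates


# Each person's own beliefs about who is a bad person.
barry_is_bad_person = {"Bob": True, "Sarah": False}
mary_is_bad_person = {"Bob": False, "Sarah": True}


def targets(is_bad_person: dict) -> list:
    """Run the shared two-level algorithm on one person's beliefs."""
    return sorted(name for name, bad in is_bad_person.items() if bully(hate(bad)))


def looks_evil_to_barry(victim: str) -> bool:
    """One-step view: accept that Mary hates the victim, but judge that
    hatred against Barry's own is_bad_person values."""
    return hate(mary_is_bad_person[victim]) and not barry_is_bad_person[victim]


def looks_evil_two_steps(victim: str) -> bool:
    """Two-step view: judge Mary's hatred against Mary's own values."""
    return hate(mary_is_bad_person[victim]) and not mary_is_bad_person[victim]
```

In this sketch `looks_evil_to_barry("Sarah")` is true and `looks_evil_two_steps("Sarah")` is false, which is the whole disagreement between Barry and me.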
Biased creationists and the 2-step theory
Take a look at this article on confirmation bias and in particular at their discussion of creationism. We imagine the evolutionist who wrote the article to be using an algorithm something like this:
E: support($THEORY) iff seems_true($THEORY) = 1;
seems_true(evolution) = 1, seems_true(creation) = 0
That is, support a theory if and only if it seems to be true, and evolution seems to be true and creationism doesn't, so support evolution.
Now the author of that article certainly has theory of mind. She's not accusing the creationists of lying, promoting creationism for their own sinister purposes even though they secretly know evolution is right. She charitably and correctly models the creationists' algorithms as:
C: support($THEORY) iff seems_true($THEORY) = 1;
seems_true(evolution) = 0, seems_true(creation) = 1
That is, the creationists also support the theory that seems true to them, but it's creationism that seems true in their case.
But just as the difference between Barry and Mary lay in the deeper level of how they calculated their hate($VICTIM) function, so we go a little deeper and investigate how the evolutionists and creationists calculate their seems_true function.
E2: seems_true($THEORY) = good_evidence($THEORY) & confirmation_bias($RIVAL_THEORY);
good_evidence(evolution) = 1, confirmation_bias(creation) = 1, good_evidence(creation) = 0, confirmation_bias(evolution) = 0
In other words, a theory will seem true if it has good evidence and if the rival theory is due mostly to confirmation bias. Evolution has good evidence behind it, and creationism only stays standing because of the confirmation bias of its supporters, so evolution should seem true.
But the article writer is more charitable than Barry. She is not a 1-step theory-of-minder. She understands that, even though the creationists are wrong about it, they do believe they have good evidence for their creationism, and they do believe that the evolutionists are suffering from confirmation bias. So the creationists use the algorithm:
C2: seems_true($THEORY) = good_evidence($THEORY) & confirmation_bias($RIVAL_THEORY);
good_evidence(creation) = 1, confirmation_bias(evolution) = 1, good_evidence(evolution) = 0, confirmation_bias(creation) = 0
Here the author is being charitable enough to agree that the creationists are following the same mental algorithms she is even up to the point of what makes a theory seem true to her. But we can still go one level deeper. Let's investigate this term "confirmation_bias" from the evolutionists' point of view.
E3: confirmation_bias($THEORY) = good_response($THEORY, $CRITICISM) = 0; good_response(creationism, angiosperm_pollen) = 0, good_response(evolution, irreducible_complexity) = 1
And here, the author says, the evolutionists and creationists are finally using genuinely different algorithms. The evolutionists only accuse a theory of having confirmation bias if it has no good response to the criticisms against it. Creationism has no good response to the argument presented about angiosperms.
The article author didn't provide an example of a "fact" creationists might bring up that they believe evolutionists have no good response for (note that this is hilariously ironic in an article on confirmation bias of all things), so I chose one myself; the claim that certain structures like the bacterial flagellum are "irreducibly complex" and could not have evolved by "mere chance". RationalWiki, of course, believes evolutionists do have an answer to that one, which I have represented by good_response(evolution, irreducible_complexity) = 1.
But let's go back to the example with the Republican at the top of the page. There are two possible explanations for why the creationists might keep accusing the evolutionists of bias at this point rather than accepting their own bias:
C3-a: confirmation_bias($THEORY) = good_response($THEORY, $CRITICISM) = 1; good_response(creationism, angiosperm_pollen) = 0, good_response(evolution, irreducible_complexity) = 1
C3-b: confirmation_bias($THEORY) = good_response($THEORY, $CRITICISM) = 0; good_response(creationism, angiosperm_pollen) = 1, good_response(evolution, irreducible_complexity) = 0
In other words, algorithm a admits the evolutionists have much better arguments that they can't disprove, admits that the evolutionists have demolished their flimsy objection about irreducible complexity, but screw it, they're going to accuse the evolutionists of being the ones with the bias anyway! Algorithm b says that no, the creationists have the same criteria for bias as everyone else, but they believe they're right about the angiosperms and they're right about irreducible complexity.
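Here is a small Python sketch of E3, C3-a, and C3-b, under the simplifying assumption that each theory faces exactly one criticism (angiosperm pollen for creationism, irreducible complexity for evolution). The function names are mine. What it shows is that C3-a and C3-b produce identical accusations of bias, so the accusations alone cannot tell you which algorithm the creationists are running, just as the Republican's opposition alone could not distinguish R1 from R2.

```python
def bias_if_no_good_response(good_response: bool) -> bool:
    """E3 / C3-b rule: a theory is biased when it has no good response."""
    return not good_response


def bias_if_good_response(good_response: bool) -> bool:
    """C3-a rule: the perverse version, biased when it does have a good response."""
    return good_response


def accusations(rule, good_response: dict) -> dict:
    """Apply a bias rule to one side's beliefs about who answered their critics."""
    return {theory: rule(answered) for theory, answered in good_response.items()}


# good_response values as each algorithm sets them.
e3 = accusations(bias_if_no_good_response, {"creationism": False, "evolution": True})
c3_a = accusations(bias_if_good_response, {"creationism": False, "evolution": True})
c3_b = accusations(bias_if_no_good_response, {"creationism": True, "evolution": False})
```

Both `c3_a` and `c3_b` accuse evolution and clear creationism, while `e3` does the reverse. C3-b gets there with the evolutionists' rule and different values; C3-a needs a different rule.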
Ten seconds of Google searching shows that the creationists actually have a very complex literature explaining why they think the evolutionists' angiosperm pollen argument collapses and why in fact they think the pollen record is a strong argument for creation (see for example Pollen Order; Pollen Spores and Modern Vascular Plants in the Cambrian and Pre-Cambrian here). And here's a whole page where the guy who originated the idea of irreducible complexity responds to evolutionists' objections and "shows" why they're "wrong". So ten seconds of searching shows it is painfully obvious that, much as the Principle of Charity would predict, the creationists are using algorithm C3-b, the one that is exactly the same as the evolutionists' algorithm, the one where they agree that you only get to call "bias" if you believe your arguments are better than your opponent's - but they believe their arguments are.
Note for the person (people?) who combs my blog looking for things she can take out of context to discredit Less Wrong (oh yes, this happens): I am not claiming that the creationists are right, or that their arguments are "just as good as the evolutionist arguments" or anything stupid like that. This isn't a post about rightness, it's a post about charity. I am claiming that the creationists understand the principle of "if someone criticizes your theory, you need to refute the criticism or abandon the theory", and that they are using the same coarse-grained mental algorithms as the evolutionists, at least at this shallow three-step-down level.
The n-step theory of mind
The three-year-old child has a 0-step theory of mind. She can never imagine anyone thinking differently than she does at all.
Barry the Bully had a 1-step theory of mind. He could imagine someone thinking differently than him, but he couldn't imagine they might have a reason for doing so. They must just hate nice people.
Whoever wrote that wiki page has a 2-step theory of mind. She can imagine creationists thinking differently than she does, AND she can imagine them having reasons to do so, but she can't imagine them justifying those reasons. They must just not care if their arguments get demolished.
How deep does the rabbit hole go? I think I have at least a 3-step theory of mind; when I read that RationalWiki article I immediately thought "No, that's not right, the creationists probably believe they can support their own arguments". I don't know if it's possible to have an unlimited-step theory of mind. I expect it is: I don't think more steps take more processing power past a certain point, just willingness to resist the temptation to be uncharitable.
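One way to make "n-step" concrete, as a sketch under my own assumptions: list the levels of the algorithm from the outside in, and say that an n-step modeler grants "same rule, different values" for the first n levels and then assumes a different rule at the next one. The function name and the two label strings are invented for this illustration, and the sketch only covers n of 1 or more (the 0-step child does not imagine any difference at all).

```python
SAME = "same rule, different values"
DIFFERENT = "different rule"


def model_opponent(levels: list, n_steps: int) -> dict:
    """What an n-step modeler (n_steps >= 1) assumes about an opponent
    at each level, listed outermost first."""
    if n_steps < 1:
        raise ValueError("this sketch only covers n_steps >= 1")
    model = {}
    for depth, level in enumerate(levels, start=1):
        if depth <= n_steps:
            model[level] = SAME
        else:
            # Charity runs out here; deeper levels are never examined.
            model[level] = DIFFERENT
            break
    return model


barry = model_opponent(["bully", "hate"], n_steps=1)
wiki_author = model_opponent(["support", "seems_true", "confirmation_bias"], n_steps=2)
three_step = model_opponent(["support", "seems_true", "confirmation_bias"], n_steps=3)
```

Barry grants Mary the `bully` rule but not the `hate` rule; the wiki author grants the creationists `support` and `seems_true` but not `confirmation_bias`; a 3-step modeler grants all three.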
I think if someone did have an unlimited-step theory of mind, the way it would feel from the inside is that they and their worthier opponents have pretty much the same base-level mental algorithms all the time, but their opponents just consistently have worse epistemic luck.
Yes, I agree. But I still feel like there's something meaningful going on here. Can you think of a better analogy that preserves what I'm trying to say but avoids the vague-algorithm/variable distinction problem?
I'm not sure that it's an analogy per se, but I think it's related to the is/ought dichotomy, so that someone with an nth-degree theory of mind would assume that those who disagree would have similar ought beliefs, but different is beliefs.
I don't know if this actually sheds any light, but I think there's quite a lot of thought that has already gone into is/ought, and someone more knowledgeable than me might be able to make meaningful inferences.