Utilitarianism for engineers
[Jan. 21st, 2013|06:16 pm]
(title a reference to this SMBC comic)
I've said before that it's impossible to compare interpersonal utilities in theory but pretty easy in practice. Every time you give up your seat on the subway to an old woman with a cane, you're doing a quick little interpersonal utility calculation, and as far as I can tell you're getting it right.
The lack of the theory still grates, though, and I appreciate it whenever people come up with something halfway between theory and practice; some hack that lets people measure utilities rigorously enough to calculate surprising results, but not so rigorously that you run up against the limits of the math. The best example of this is the health care concept of QALYs, Quality Adjusted Life Years.
The Life Year part is pretty simple. If you only have $20,000 to spend on health care, and you can buy malaria drugs for $1,000 or cancer drugs for $10,000, what do you do? Suppose on average one out of every ten doses of malaria drugs save the life of a child who goes on to live another sixty years. And suppose on average every dose of cancer drug saves the life of one adult who goes on to live another twenty years.
In that case, each dose of malaria drug saves on average six life years, and each dose of cancer drug saves on average twenty life years. Given the cost of both drugs, your $20,000 invested in malaria could save 120 life years, and your $20,000 invested in cancer could save 40 life years. So spend the money on malaria (all numbers are made up, but spending health resources on malaria is usually a good decision).
The Quality Adjusted part is a little tougher. Suppose that the malaria drug also made everyone who used it break out in hideous blue boils, but the cancer drug made them perfectly healthy in every way. We would want to penalize the malaria drug for this. How much do we penalize it? Some amount based on how much people disvalue hideous blue boils versus being perfectly healthy versus dying of malaria. A classic question is "If you were covered in hideous blue boils, and there were a drug that had an X% chance of making you perfectly healthy but a (100 - X)% chance of killing you, would you take it?" And if people on average say yes when X = 50, then we may value a life-year spent with hideous blue boils at only 50% that of a life-year spent perfectly healthy.
So now instead of being 120 LY from malaria versus 40 LY from cancer, it's 60 LY from malaria versus 40 from cancer; we should still spend the money on the malaria drug, but it's not quite as big a win any more.
[I have gone back and edited parts of this post three times, and each time I read that last sentence, I think of a spaceship a hundred twenty light years away from the nearest malaria parasite.]
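The arithmetic above can be sketched in a few lines. All the numbers are the made-up figures from the example, not real epidemiology:

```python
BUDGET = 20_000  # total health care budget in dollars

def life_years_per_budget(cost_per_dose, ly_saved_per_dose, quality=1.0):
    """Quality-adjusted life years bought by spending the whole budget."""
    doses = BUDGET // cost_per_dose
    return doses * ly_saved_per_dose * quality

# Malaria: $1,000/dose, one in ten doses saves a child with 60 years left,
# and (hypothetically) survivors live with boils valued at 50% of full health.
malaria_ly   = life_years_per_budget(1_000, 0.1 * 60)       # life years
malaria_qaly = life_years_per_budget(1_000, 0.1 * 60, 0.5)  # quality-adjusted

# Cancer: $10,000/dose, every dose saves an adult with 20 years left.
cancer_qaly = life_years_per_budget(10_000, 20)

print(malaria_ly, malaria_qaly, cancer_qaly)  # 120.0 60.0 40.0
```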
Some public policy experts actually use utilitarian calculations over QALYs to make policy. I read an excellent analysis once by some surgeons arguing which of two treatment regimens for colon cancer was better. One treatment regimen included much stronger medicine that had much worse side effects. The surgeon supporting it laboriously went through the studies showing increased survival rates, subtracted out QALYs for years spent without a functional colon, found the percent occurrence of each side effect and subtracted out QALYs based on its severity, and found that on average the stronger medicine gained patients more utility than the weaker medicine - let's say 0.5 extra QALYs.
Then he compared the cost of the medicine to the cost of other interventions that on average produced 0.5 extra QALYs. He found that his medicine was more cost-effective than many other health care interventions that returned the same benefits, and therefore recommended it both to patients and insurance bureaucrats.
As far as I can tell, prescribing that one colon cancer medicine is now on sounder epistemological footing than any other decision any human being has ever made.
Towards A More General Hand-Wavy Pseudotheory
So if we can create a serviceable hack that lets us sort of calculate utility in medicine, why can't we do it for everything else?
I'm not saying QALYs are great. In fact, when other people tried the colon cancer calculation they got different results by about an order of magnitude.
But a lot of our social problems seem to be things where the two sides differ by at least an order of magnitude - I don't think even the most conservative mathematician could figure out a plausible way to make the utilitarian costs of gay marriage appear to exceed the benefits. Even a biased calculation would improve political debate: people would be forced to say which term in the equation was wrong, instead of talking about how the senator proposing it had an affair or something. And it could in theory provide the same kind of imperfect-but-useful-for-coordination focal point as a prediction market.
Okay, sorry. I'm done trying to claim this is a useful endeavor. I just think it would be really fun to try. If I need to use the excuse that I'm doing it for a constructed culture in a fictional setting I'm designing, I can pull that one out too (it is in fact true). So how would one create a general measure as useful as the QALY?
Start with a bag of five items, all intended to be good in some way very different from that in which the others are good:
1. $10,000 right now.
2. +5 IQ points
3. Sex with Scarlett Johansson
4. Saving the Amazon rainforest
5. Landing a man on Mars
A good hand-wavy pseudotheory of utility would have to be able to value all five of these goods in a common currency, and by extension relative to one another. We imagine asking several hundred people a certain question, and averaging their results. In some cases the results would be wildly divergent (for example, values of 3 would differ based on sex and sexual orientation) but they might still work as a guide, in the same way that believing each person to have one breast and one testicle would still allow correct calculation of the total number of breasts and testicles in society.
Let's start with the most impossible problem first: what question would we be asking people and then averaging the results of?
The VNM axioms come with a built in procedure for part of this - a tradeoff of probabilities. Would you rather save the Amazon rainforest, or have humankind pull off a successful Mars mission? If you prefer saving the rainforest, your next question is: would you rather have a 50% chance of saving the rainforest, or a 100% chance of a successful Mars mission? If you're indifferent between the second two, we can say that saving the Amazon is worth twice as many utils as a Mars mission for you. If you'd also be indifferent between a 50% chance at a Mars mission and a 100% chance of $10,000, then we can say that - at least within those three things - the money is worth 1 util, the Mars landing is worth 2 utils, and the rainforest is worth 4 utils.
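The bookkeeping for that chain of tradeoffs is simple. The indifference points below are the post's examples, not survey data:

```python
# If you're indifferent between a p% chance of X and a sure thing worth
# `base_util`, then by VNM expected utility X is worth base_util / p utils.

def utils_from_indifference(base_util, indifference_prob):
    return base_util / indifference_prob

money  = 1.0                                    # fix $10,000 at 1 util
mars   = utils_from_indifference(money, 0.5)    # 50% Mars ~ sure $10k -> 2
amazon = utils_from_indifference(mars, 0.5)     # 50% Amazon ~ sure Mars -> 4

print(money, mars, amazon)  # 1.0 2.0 4.0
```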
The biggest problem here is that - as has been remarked ad nauseam - this is only ordinal rather than cardinal and so makes interpersonal utility comparisons impossible. It may be that I have stronger desires than you on everything, and this method wouldn't address that. What can we turn into a utility currency that can be compared across different people?
The economy uses money here, and it seems to be doing pretty well for itself. But the whole point of this exercise is to see if we can do better, and money leaves much to be desired. Most important, it weights people's utility in proportion to how much money they have. A poor person who really desperately wants a certain item will be outbid by a rich person who merely has a slight preference for it. This produces various inefficiencies (if you can call, for example, a global famine killing millions an "inefficiency") and is exactly the sort of thing we want a hand-wavy pseudotheory of utility to be able to outdo.
We could give everyone 100 Utility Points, no more, no less, and allow these to be used as currency in exactly the same way the modern economy uses money as currency. But is utility a zero sum game within agents? Suppose I want a plasma TV. Then I get cancer. Now I really really want medical treatment. Is there some finite amount of wanting that my desire for cancer treatment takes up, such that I want a plasma TV less than I did before? I'm not sure.
Just as you can assign logarithmic scoring rules to beliefs to force people to make them correspond to probabilities, maybe you can assign them to wants as well? So we could ask people to assign 100% among the five goods in our basket, with the percent equalling the probability that each event will happen, and use some scoring rule to prevent people from assigning all probability to the event they want the most? Mathematicians, back me up on this?
The problem here is that there's no intuitive feel for it. We'd just be assigning numbers. Just as people's probability calibration is poor, I bet their utility calibration is poor too. Also, comparing things specific to me (like me getting $10,000) with things general to the world (saving the rainforest) is hard.
What about just copying the QALY metric completely? How many years (days?) of life would you give up for a free $10,000? How about to save the rainforest? This one has the advantage of being easy-to-understand and being a real choice that someone could ponder on. And since most people have similar expected lifespans, it's more directly comparable than money.
But this too has its problems. I visualize the last few years of my life being spent in a nursing home - I would give those up pretty easily. The next few decades are iffy. And it would take a lot to make me take forty years off my life, since that would bring my death very close to the present. On the other hand, some things I want more than this scale could represent; if I would gladly give my own life to solve poverty in Africa, how many QALYs is that? The infinity I would be willing to give, or the fifty or so I've actually got? If we limit me to fifty, that suggests I place the same value on solving poverty in Africa as on solving poverty all over the world, which is just dumb.
Someone in the Boston Less Wrong meetup group yesterday suggested pain. How many seconds of some specific torture would you be willing to undergo in order to gain each good? This has the advantage of being testable: we can for example offer a randomly selected sample of people the opportunity to actually undergo torture in order to get $10,000 or whatever in order to calibrate their assessments ("Excuse me, Ms. Johansson, would you like to help us determine people's utility functions?")
But pain probably scales nonlinearly, different tortures are probably more or less painful to different people, and as I mentioned the last time this was brought up society would get taken over by a few people with Congenital Insensitivity To Pain Disorder.
Maybe the best option would be simple VNM comparisons with a few fixed interpersonal comparison points that we expect to be broadly the same among people. A QALY would be one. A certain amount of pain might be another. If we were really clever, we could come up with a curve representing the utility of money at different wealth levels, and use the utility of money transformed via that curve as a third.
Then we just scale everyone's curve so that the comparison points are as close to other people's comparison points as possible, stick it on the interval between one and zero, and call that a cardinal utility function.
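One way that scaling step might look in practice: for each person, fit an affine map a·u + b that brings their raw VNM utilities at the shared anchor goods (a QALY, a fixed dose of pain, a fixed wealth change) as close as possible to agreed reference values. This is entirely a sketch of the idea, not an established method; the anchor numbers are invented:

```python
def affine_rescale(raw_anchors, target_anchors):
    """Least-squares fit of target ~= a*raw + b over the anchor points.
    Assumes at least two distinct raw anchor values."""
    n = len(raw_anchors)
    mx = sum(raw_anchors) / n
    my = sum(target_anchors) / n
    var = sum((x - mx) ** 2 for x in raw_anchors)
    a = sum((x - mx) * (y - my)
            for x, y in zip(raw_anchors, target_anchors)) / var
    b = my - a * mx
    return a, b

# Alice's raw VNM utils for the three anchors vs. the shared reference values:
raw    = [2.0, 6.0, 10.0]
target = [1.0, 3.0, 5.0]
a, b = affine_rescale(raw, target)
rescaled = [a * u + b for u in raw]
print(rescaled)  # [1.0, 3.0, 5.0] -- exact here, since the map is affine
```

VNM utilities are only defined up to a positive affine transformation anyway, so rescaling this way doesn't distort anyone's within-person preferences; it only pins down the scale and zero point.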
Among the horrible problems that would immediately ruin everything are:
- massive irresolvable individual differences (like the sexual orientation thing, or value of money at different wealth levels)
- people exaggerating in order to inflate the value of their preferred policies, difficulty specifying the situation (what exactly needs to occur for the Amazon to be considered "saved"?)
- separating base-level preferences from higher-level preferences (do you have a base level preference against racism, or is your base level preference for people living satisfactory lives and you think racism makes people's lives worse; if the latter we risk double-counting against racism)
- people who just have stupid preferences not based on smart higher-level preferences (THE ONLY THING I CARE ABOUT IS GAY PEOPLE NOT MARRYING!!!)
- scaling the ends of the function (if I have a perfectly normal function but then put "making me supreme ruler of Earth" as 10000000000000000000x more important than everything else, how do we prevent that from making it into the results without denying that some people may really have things they value very very highly?)
- a sneaking suspicion that the scaling process might not be as mathematically easy as I, knowing nothing about mathematics, assume it ought to be.
I'd be very interested if anyone has better ideas along these lines, or stabs at solutions to any of the above problems. I'm not going to commit to actually designing a system like this, but it's been on my list of things to do if I ever get a full month semi-free, and if I can finish Dungeons and Discourse in time I might find myself in that position.
I have a preference over ice cream flavors. Using the VNM axioms, you can measure how strongly I prefer chocolate over vanilla, measured in units of percent-risk-of-coffee-flavor. Similarly, I have a preference over experiences; and you can measure how strongly I prefer dancing over dishwashing in units of stubbed toes. But given these two measurements, you can't generate an exchange rate between chocolate and dancing without an additional judgment on my part, and there's no principled way to make that judgment.
So it seems to be with interpersonal utility comparison. The failure modes commonly ascribed to interpersonal utility comparison could also be applied to internal utility comparison; just replace the Utility Monster with a Utility Tumor, and all preferences are replaced with maximizing probability of one thing. But each person can generate some sort of exchange rate between all their preferences, and we accept this as authoritative (unless flagrantly stupid) because they're better positioned to know what they should have than we are. But when deciding on the exchange rate between A getting chocolate instead of vanilla and B getting to dance instead of dishwash, there is no longer an obvious person to make authoritative.
Except, of course, for the person making decisions that affect A and B. And while it's not a principled or formalizable method, I think what we really do is blur out details until we find something sufficiently analogous to terms in our own utility functions, and then act as though others shared our utility functions, with in-category substitutions like which ice cream flavor is which. But when we encounter a preference that can't be readily mapped to one of our own, we value it at the same level as aesthetics, our weak, catch-all preference.
in the same way that believing each person to have one breast and one testicle would still allow correct calculation of the total number of breasts and testicles in society.
Men have breasts!
"His name is Robert Paulson"
[nitpick]And people who've had mastectomies have fewer than 2! Unless they've had restorative surgery![/nitpick]
[nitpick]And trans people may have neither set or both sets![/nitpick]
Granted, I have no idea what the ratio between transmen and transwomen might be. Perhaps there is no net effect on the population average.
* In general allowing people to set their own utility functions leaves things open for exploitation - I'm probably just going to pick whatever thing is at the top of the list of things I might get and tell you I value it to the exclusion of all else. For example, it would be trivial for me to arrange my claimed utility function to value sex with person A more than person A values not having sex with me. I'm not sure how to correct for this, but it might be doable.
* A similar problem definitely has a solution: http://en.wikipedia.org/wiki/Scoring_rule#Proper_scoring_rules. A proper scoring rule, of which we have some examples, is such that you maximize your score by presenting your true expectations. It would be nice to have a thing whereby you maximized your utility by presenting your true utility, but unfortunately this is a game (in the game-theoretic sense) and so this is much, much harder.
* I think, whatever you do, you're forced to assign everyone the same amount of Utility Points. Not doing so leads to optimizing yourself for whatever case gets extra Points to spend.
* You've implicitly accepted a variant preference utilitarianism, where we attempt to satisfy people's personal utility functions. Not all strains of utilitarianism work like this, and it's not obvious to me that they should.
* You haven't even really touched on average vs total utilitarianism.
"I think, whatever you do, you're forced to assign everyone the same amount of Utility Points"
This is not a well-defined suggestion. I'm guessing that you mean that for each person's utility function, the sum of the utilities of each possible outcome is a constant (e.g. everyone's utility function sums to 100 utils across all states of the universe). If you meant something else, please explain further, because I couldn't figure out any other way to interpret your suggestion. I assume you also want no outcomes to have negative utility? Otherwise, you can still make your utilities much greater in magnitude and still get them to add to 100. What if there are an infinite number of possible outcomes? In this case, the sum of the utilities of each possible outcome might not even converge, so there is no way to normalize the sum to 100.
Even in cases where there are a finite number of outcomes, a simple example is enough to show that the suggestion makes no sense. Suppose there are 2 people: Alice and Bob, and 3 outcomes: X, Y, and Z. Alice likes Z best and is indifferent between X and Y. Bob thinks Z is worst and is indifferent between X and Y. Intuitively, it doesn't look like this gives us reason to prefer Z over X and Y in aggregate. But if we take your suggestion, Alice's utility function gives 0 utils to X and Y and 100 utils to Z, and Bob's utility function gives 0 utils to Z and 50 each to X and Y. In total, X and Y each get 50 utils, and Z gets 100, and is declared the best option. Effectively, Bob is being punished for his preferences being more easily achievable (there are 2 options that Bob likes, and only 1 option that Alice likes).
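The Alice/Bob example above reduces to a few lines of arithmetic, which make the pathology plain: normalizing each person's utilities to sum to 100 punishes Bob for liking two outcomes instead of one.

```python
def normalize_to_100(prefs):
    """Rescale a person's outcome utilities so they sum to 100 utils."""
    total = sum(prefs.values())
    return {k: 100 * v / total for k, v in prefs.items()}

alice = normalize_to_100({'X': 0, 'Y': 0, 'Z': 1})   # only likes Z
bob   = normalize_to_100({'X': 1, 'Y': 1, 'Z': 0})   # likes X and Y equally

aggregate = {k: alice[k] + bob[k] for k in 'XYZ'}
print(aggregate)  # {'X': 50.0, 'Y': 50.0, 'Z': 100.0} -- Z "wins"
```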
2013-01-22 03:27 am (UTC)
http://dresdencodak.com/2006/12/03/dungeons-and-discourse/ and http://dresdencodak.com/2009/01/27/advanced-dungeons-and-discourse/ were turned into an actual game: http://www.raikoth.net/Stuff/ddisplayer.pdf
2013-01-22 02:27 pm (UTC)
This is so totally awesome. If you guys ever need to test drive it I'm in.
2013-01-22 04:01 am (UTC)
"Ordinal" vs "cardinal" is the problem that VNM solves. When people say that it's just ordinal, they mean that you can determine that someone prefers saving the Amazon to landing on Mars, but they deny that you can determine that the person prefers it "twice as much." That is, you can order the outcomes, but you can't quantify them. Because probability doesn't exist. That's why they're called Austrian economists, rather than Austro-Hungarian economists.
But, yes, even after VNM, there remains the problem of scaling Alice's preferences to be all stronger than Bob's preferences.
I'm sure you know everything I have to say, but I think your phrasing undersells the role of QALYs in the world today.
Some public policy experts actually use utilitarian calculations over QALYs to make policy.
Although this sentence contains the phrase "make policy," it is undercut in my ears by the word "experts," which makes me read this as people lobbying for the future use of QALYs in policy. Indeed, your examples sound like starry-eyed academics, who may have great influence over how doctors practice, but little direct influence on explicit policy, particularly money.
But QALYs are much more popular than that. Today they are a standard part of how drug companies lobby insurers, especially European governments, to pay for their new drugs.
If we were really clever, we could come up with a curve representing the utility of money at different wealth levels, and use the utility of money transformed via that curve as a third.
It's standard in economics to use a wealth utility function of the form f = sqrt(x) or f = ln(x), or more generally f = x^c for 0 < c < 1 a constant and x = total wealth. The Wikipedia article on utility is misleadingly abstract.
I have some articles on the subject written up but can't blog them until I find and learn to use a decent graphing software system.
2013-01-22 04:44 pm (UTC)
Oh, really? A progressive tax wouldn't make sense with those utility functions.
Why not? The main criteria you want in a wealth utility function are f increasing and f' decreasing. It's easy to show that progressive taxation doesn't affect that. Welfare traps are a different matter, of course.
I'm thinking that a flat tax rate of r gives each person approximately -rxf'(x) utilons. If f is logarithmic, the flat tax would be fair in that it would give just as much disutility to the rich as to the poor. A progressive tax rate would be unfair to the rich.
So I'm thinking that someone who thinks that a progressive tax is fair must think that the utility function grows even slower than a logarithm.
If f is logarithmic, a flat tax removes a constant amount of utilons. The rich person with 10 utilons loses one; the poor individual with two utilons loses one. I'm not sure that you can even meaningfully reason in those terms, but it's certainly not clear to me that that's a 'fair' situation.
Logarithmic utility has the problem that destitution is rated at -infinity utilons, of course.
If you use power-function utility, a flat tax of rate r multiplies all utilities by (1 - r)^c, which seems fair as a first pass.
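Both claims check out numerically. Under log utility a flat tax of rate r costs everyone the same -ln(1 - r) utils regardless of wealth; under power utility x^c, after-tax wealth (1 - r)x yields utility (1 - r)^c · x^c, so every utility is multiplied by the same factor:

```python
import math

r, c = 0.25, 0.5  # arbitrary tax rate and utility exponent

for x in (10_000, 1_000_000):
    # utils lost under f(x) = ln(x): ln(x) - ln((1-r)x) = -ln(1-r)
    log_loss = math.log(x) - math.log((1 - r) * x)
    # ratio of after-tax to pre-tax utility under f(x) = x**c
    power_ratio = ((1 - r) * x) ** c / x ** c
    print(round(log_loss, 6), round(power_ratio, 6))
# both wealth levels print the same pair: -ln(0.75) and 0.75**0.5
```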
2013-01-23 06:36 am (UTC)
You and gjm have changed my thinking on this matter.
Or have an explict term in their global utility function for equality (hence, in particular, not have a global utility function that's just a sum of individual agents' utilities).
Or think that richer people will be more effective at minimizing their tax burden, and therefore want a nominally progressive system with the goal of getting approximately flat payments in reality.
Or value poor people more than rich people for some reason.
Or think that, quite aside from individual utilities, transferring money from richer people to poorer people tends to have a stimulative effect on the economy because poor people spend more of their money than rich people do.
Or, I suspect, quite a lot of other possibilities. So it's far from clear to me that there's any real conflict between preferring a progressive tax regime and thinking that utility grows roughly logarithmically with wealth.
(My own intuition, incidentally, says that utility typically grows a bit slower than logarithmically for perfectly selfish agents -- the step from a net worth of $10k to a net worth of $1M seems bigger than that from a net worth of $10M to one of $1B -- and almost linearly for perfectly altruistic ones because they can use N times as much wealth to help N times as many people. I'm not sure what the right way is to combine these for real people who are mostly selfish but a bit altruistic. You might think it would be large_constant * sublogarithmic + small_constant * linear, so that the linear term would dominate, but that may be overoptimistic about altruism levels.)
2013-01-23 05:32 am (UTC)
The goal is not some abstract ideal of fairness (ask the fat man how "fair" the solution to the trolley problem is). The goal is maximizing utility.
We tax people because we need money to run the government (and the tax-funded government provides net positive utility). The idea is to get that money while minimizing the disutility of taxation. At a given level of revenue, a progressive tax will produce less disutility than a flat tax for any wealth utility function f(x) with f'(x) decreasing.
Also, I hate to point this out, but the problem of utility scaling with personal wealth does solve, almost tautologically, most of those other problems--partially and probabilistically, but that seems to be the best we can do a lot of the time. What I mean is, people with a lot of money have demonstrated that they are less likely to do stupid, destructive things with it, and conversely struggling to accumulate wealth conditions people to avoid stupid, destructive behavior. Money acts as a rationality signal.
There are tons and tons of caveats and circumstances where this breaks down, but I'm skeptical that anything more efficient is likely to come along (pre-singularity anyway). It's basically the Economic Calculation Problem.
The UK's National Health Service explicitly uses QALYs (and cost-per-QALY) to make decisions about health care treatments. (See for example this page from NICE, the body that assesses new treatments and decides whether the NHS should pay for them: http://www.nice.org.uk/newsroom/features/measuringeffectivenessandcosteffectivenesstheqaly.jsp)
It's not perfect, and subject to local variation and political pressure (eg a standard approach for manufacturers of treatments deemed to be too expensive is to lobby politicians and campaign to have the decision overturned, and this sometimes works), but I'm fairly impressed.
I don't think even the most conservative mathematician could figure out a plausible way to make the utilitarian costs of gay marriage appear to exceed the benefits.
That's a very interesting question. I agree with the sentiment, that gay marriage does a lot of good and doesn't really do any harm, and that's why it should be enacted. However, some people would disagree that adding up the utilities is the right way to decide anything, and that seems to be the root of my disagreement with them in the first place. (I would argue that even if you take a more personal-virtue approach to ethics, gay marriage is still massively the way to go.)
On the other hand, there are times when I do agree with quibbles along the lines of "OK, it looks utility-positive, but I'm pretty sure X is going to come back and bite us later, even if I can't explain exactly why."
It's easy. Just note that straight couples outnumber gay couples by a factor of roughly 50:1, which is nearly two orders of magnitude. If you can argue that the harm done by gay marriage to each straight marriage is even 5% as bad as the harm done to each gay couple by denying them marriage rights, the costs win and it's a no-brainer. And 'civil partnerships' go a long way to offsetting the right-hand side of that inequality.
2013-01-23 07:59 pm (UTC)
Personally, I think that if people were able to choose their culture more, a lot of similar problems would disappear. But I don't see that happening.
people would be forced to say which term in the equation was wrong, instead of talking about how the senator proposing it had an affair or something.
You don't really even believe in this sentence yourself, do you? ;)
>Just as you can assign logarithmic scoring rules to beliefs to force people to make them correspond to probabilities, maybe you can assign them to wants as well? So we could ask people to assign 100% among the five goods in our basket, with the percent equalling the probability that each event will happen, and use some scoring rule to prevent people from assigning all probability to the event they want the most? Mathematicians, back me up on this?
Yes, something like that will work in the abstract. Here's how you force an unrealistically rational agent, with preferences that are linear and increasing in the amount of each good it has, to give you its exchange rate amongst the goods. Give the agent a lump of clay to be divided into piles corresponding to each good. The agent will be rewarded with an amount of each good equal to the log of the amount of clay it put into that pile. The value-maximizing solution is to allocate the clay proportionally to how much the agent values each good.
This works because if the agent values apples, say, twice as much as bananas, the point at which there are no opportunities to gain value by moving clay from one pile to the other is exactly the point at which there is twice as much clay in the apple pile as the banana pile, because this is the point where adding a smidgen of clay to the apple pile will yield half as many marginal apples as the marginal bananas you'd get by adding the clay to the banana pile. This happens because (d/dx)ln(x) = 1/x, or in English: under this scoring rule, the marginal amount of some good you can get by adding another smidgen of clay to its pile is inversely proportional to the amount of clay already there.
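The claim is easy to verify numerically: with the log scoring rule, the proportional allocation maximizes total value, so no small transfer of clay between any two piles can improve it. A quick sanity check (with made-up valuations):

```python
import math

def score(values, clay):
    """Agent's total value: sum of v_i * log(clay_i) over the piles."""
    return sum(v * math.log(c) for v, c in zip(values, clay))

values = [2.0, 1.0, 1.0]   # apples worth twice bananas or cherries
total = 100.0
proportional = [v / sum(values) * total for v in values]  # [50, 25, 25]

# Verify: moving a smidgen of clay between any two piles never helps.
best = score(values, proportional)
eps = 0.01
for i in range(3):
    for j in range(3):
        if i != j:
            trial = list(proportional)
            trial[i] += eps
            trial[j] -= eps
            assert score(values, trial) <= best

print(proportional)  # [50.0, 25.0, 25.0]
```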
The practical approach that I've had in mind is a simpler / less mathy version of your suggestion. Give everyone a list of (say) 15 things which are good in diverse ways, and about the same order of magnitude (you could also include some bad things and flip the sign). Have each person rank the 15 things according to their preferences. Define whatever item they rank as 8th to be 1 util for them, and assume that everyone gets the same value from 1 util.
Now you can use standard tradeoff questions to sketch out each individual's utility function, and then calculate away with this cardinal interpersonal utility function (made from ordinal preferences). The advantage of using just the median item (in addition to its simplicity) is that some people might have weird, extreme reactions to a few of the 15 items, but the strength of their preference for their median item should usually be roughly similar to other people's (assuming that people are roughly similar to each other).
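The median-item trick can be sketched in a couple of lines. The raw utility numbers below are invented purely to show the point: two people with the same preference shape but arbitrarily different raw scales come out identical after normalizing by their 8th-ranked item.

```python
def normalize_by_median(raw_utils):
    """Divide a person's raw VNM utilities for the 15 goods by the
    utility of their median (8th-ranked) good, defined as 1 util."""
    median = sorted(raw_utils)[len(raw_utils) // 2]  # 8th of 15
    return [u / median for u in raw_utils]

alice_raw = [i * 2.0 for i in range(1, 16)]    # raw scale: 2..30
bob_raw   = [i * 10.0 for i in range(1, 16)]   # same shape, bigger numbers

# After normalizing, the arbitrary raw scales cancel out:
print(normalize_by_median(alice_raw) == normalize_by_median(bob_raw))  # True
```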
the strength of their preference for their median item should usually be roughly similar to other people's
Not if people's priorities follow something like a power law, and the exponent varies. e.g. compare religious ascetics who value God >>> everything else vs. cosmopolitan try-anything-once types.