Welfare economics for humans

At the moment I’m doing a PhD in political economy. I am about six months in and as part of the process I have to write up a precis of my thesis and its core arguments. I thought it might be nice to share it here, so the thesis can benefit from the wisdom and sharpness of my readership ;-). Astute readers may have noticed I have already shared some stuff on my thesis, but my thinking has changed a lot in the last few months.

1. Non-specialist introduction

Related topics: Political economy, welfare economics, philosophy of science- economics, philosophy of science- psychology, philosophy of measurement, ethics, political philosophy.

A person’s welfare is how well that person’s life is going for them. Welfare economics is the study of the effects of economic policy on welfare. Often welfare economics has been carried out in isolation from the empirical study of human well-being- as the study of human well-being belongs to psychology rather than economics. Instead welfare has been evaluated through the metrics of revealed preferences and the rational choice theory of behaviour, and sometimes merely as economic abundance (for example, GDP). A typical project in welfare economics might be proving that this or that state of affairs is Pareto optimal, or that the winners of a policy make so much money from it that they could, in principle, compensate the losers. Our thesis is that a fully adequate welfare economics must be rooted in empirical, especially psychological, concepts and measures of well-being. Doing so will allow it to answer the real questions of welfare economics- not just who gains and who loses, but how much one life is improved, or another diminished.

2. The object of study of welfare economics

Our thesis is:

That welfare economics should focus on psychologically grounded and psychometrically measured concepts of well-being.

This includes concepts like happiness, life satisfaction etc., as measured by psychometric instruments. Old welfare economics represented a combination- a psychologically rich theory of welfare that was nonetheless tied tightly to economic choice. At least as Bruni and Sugden (2007) have it, such a combination was possible because of the prevalence of psychological hedonism. As British psychological hedonism withered away, it became necessary to choose one of these elements. Our contention in this thesis is that the road taken (to borrow a metaphor from Bruni and Sugden) by Robbins, Pareto, Hicks etc.- new welfare economics stripped of its rich psychological foundation- was the wrong one because it is too impoverished a theory of what welfare is to answer the fundamental question of welfare economics- which economic policies increase welfare.

The empirical study of well-being and its relationship with economic policies and circumstances is a better model for welfare economics. Such a model is better placed to overcome many of the traditional antinomies which plague the field including the problem of interpersonal comparison and the problem of the role of values in science. The central point of divergence for our paradigm is splitting the original conflation between utility understood as revealed preference fulfilment and welfare which can be understood in a variety of different ways including life satisfaction, happiness and objective goods. 

The concept of utility as employed by economists (and even philosophers) often conflates:

1) Preferences revealed in decision making 

2) Welfare

Of course there is a theoretical awareness that these things are different within economics, but in practice they do tend to be run together.

It might seem that on at least one common account of welfare, this conflation is no real problem, because welfare just is preference satisfaction. However even the preference satisfaction theorist about welfare typically makes stipulations that preferences must be informed, and perhaps also rational. My desire to drink bleach, thinking it will protect me from COVID-19, does not count towards my welfare if fulfilled, even though it is clearly the satisfaction of a preference. 

During the heyday of hedonistic psychology conflating welfare and utility was perhaps not so dangerous- at least according to Bruni and Sugden’s account (2007), as the basis of decision-making was taken to be hedonic, cardinal, interpersonally comparable and, as Cooter and Rappoport (1984) have shown, directly linked to material needs. Once hedonistic psychology was defeated however the conflation between welfare and utility meant welfare became tied to a very thin conception of decision making. Such a knot couldn’t support critical functions- e.g. interpersonal comparison. 

Hands (2010) has argued that the conflation between welfare and decision making was deliberately sustained within the ordinalism endorsed by luminaries such as Robbins. A great deal of effort went into constructing a mentalistic “half-way house” between behaviourism and hedonistic psychology. The purpose of this half-way house was to sustain the normative significance of choice, without also having to purchase the full package of hedonistic psychology. Excursus that won’t go in the finished thesis: We can perhaps afford to be a little cynical about this, and might suggest that this was a deliberate effort to save ethically preferred parts of the old implicit psychological theory while dodging their potentially redistributive and egalitarian implications. Robbins was close with both Hayek & Von Mises while Hicks once wrote that it was good to be free from the constraints of utilitarianism.

One idea I am interested in developing in my thesis is that psychometric approaches to measurement may be better placed to deal with interpersonal aggregation, and this gives us a strong reason to pursue the psychometric approach. Of course no science in the world can prove that the preference fulfilment theory of welfare (a view which ties together decision making and welfare) is incorrect because this is a normative question. But surely it would have been natural, upon deciding that preference fulfilment could not be interpersonally measured using the tools available to social science, to aim to measure something else of ethical interest that could be? Even if it was only thought to be of ethical interest because it correlated with real, underlying preference fulfilment? 

We don’t even need to give up on the preference fulfilment theory of welfare, or the measurement of a kind of preference fulfilment. There are reasons to think that life satisfaction is a reasonable and interpersonally comparable measure of something like preference fulfilment(). Incidentally, this is why Angner’s (2013) objection that the psychometric approach requires a hedonistic theory of welfare is wrong.

3. Motivations: Adjacent literatures

Some key adjacent literatures include:

A) The empiricist/rationalist debate in the methodology and philosophy of economics.

B) Methodological debates about the role of psychology in economics.

C) Contemporary discussion about inequality, its measurement and evaluation.

4. Motivations: The failures of traditional welfare economics

Of course our thesis would be incomplete without a critique of the inadequacies of traditional welfare economics.

Suppose we are evaluating some policy using the standard tools of welfare economics, trying to decide whether it is welfare increasing. This policy has both winners and losers, so holding ourselves to evaluating Pareto improvements is out. The two most prominent approaches in welfare economics for exactly this question are the Kaldor-Hicks criterion and the use of a Social Welfare function, and as we shall see, neither is entirely satisfactory.


The Kaldor-Hicks approach is based on the concept of a potential Pareto improvement. If the winners gain enough that they could, hypothetically, compensate the losers, then the policy has resulted in a potential Pareto improvement. Of course it is very unlikely that the winners will compensate the losers in this way, but such is life. 

For many people it will be obvious that such an approach is not equitable. To give the billionaire another hundred dollars, at the cost of the last ninety dollars of a poor person scarcely seems fair. But in addition to being inequitable it is arguably inefficient in a special sense- on all plausible conceptions of welfare it decreases total welfare. This is because welfare- whatever definition we use- is surely not linear in income.
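To make the point concrete, here is a minimal sketch in Python. The figures are invented for illustration, and log income is only a stand-in for welfare (any concave function would make the same point). I start the poor person on $100 rather than $90 so that the logarithm stays defined after the loss.

```python
import math

def kaldor_hicks_passes(income_changes):
    """Potential Pareto improvement: money gains outweigh money losses."""
    return sum(income_changes) > 0

def total_log_welfare(incomes):
    """Illustrative assumption: welfare is the sum of log incomes."""
    return sum(math.log(y) for y in incomes)

# Hypothetical figures: a billionaire and a person with $100.
before = [1_000_000_000.0, 100.0]
changes = [100.0, -90.0]  # the policy gives $100 to one, takes $90 from the other
after = [y + d for y, d in zip(before, changes)]

kh_passes = kaldor_hicks_passes(changes)
welfare_change = total_log_welfare(after) - total_log_welfare(before)

print("Kaldor-Hicks test passed:", kh_passes)
print("Change in summed log-welfare:", welfare_change)
```

Under these assumptions the policy passes the Kaldor-Hicks test, since the net money gain is ten dollars, while summed log-welfare falls.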

That welfare is not linear in money has been known for a long time. Sometimes the insight is attributed to the discovery of the St Petersburg Paradox(—). However the fundamental problem with the Kaldor-Hicks approach has been known about for far longer than that. As the New Testament tells it:

“He sat down opposite the treasury and observed how the crowd put money into the treasury. Many rich people put in large sums. A poor widow also came and put in two small coins worth a few cents. Calling his disciples to himself, he said to them, ‘Amen, I say to you, this poor widow put in more than all the other contributors to the treasury. For they have all contributed from their surplus wealth, but she, from her poverty, has contributed all she had, her whole livelihood.’”

This failure is not merely a normative failure though. Because the Kaldor-Hicks criterion is insensitive to declining marginal utility in income, it fails to do what welfare economics “says it will do on the tin”- provide us with information about the effects of economic policies on welfare. Kaldor-Hicks can “go up” while overall welfare goes down, on any plausible conception of welfare.


The SWF approach, in the way it is most commonly practiced, counsels us to declare some function over individual resources equal to social welfare:

For example, Social welfare = the sum, over all individuals, of log(willingness to pay + income).
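Read literally, the example function can be written in a few lines of Python. This is only a sketch: the function name, the pair layout and the numbers are my own illustrative assumptions, not an established specification.

```python
import math

def social_welfare(people):
    """Sum over individuals of log(willingness to pay + income).

    `people` is a list of (willingness_to_pay, income) pairs; this layout
    is an illustrative assumption.
    """
    return sum(math.log(wtp + income) for wtp, income in people)

# Two invented societies with the same total resources (100,000).
equal_split = [(0.0, 50_000.0), (0.0, 50_000.0)]
unequal_split = [(0.0, 90_000.0), (0.0, 10_000.0)]

print(social_welfare(equal_split), social_welfare(unequal_split))
```

Because the logarithm is concave, this particular function scores the equal split above the unequal one. The question is where the authority for choosing log, rather than some other curve, is supposed to come from.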

This approach suffers a dilemma. Is the social welfare function founded on an empirically grounded theory of the causes and degrees of human flourishing, or is it based on raw, direct intuitions about the justice of particular resource distributions? If the former, it merges with the approach we advocate in this thesis, and the research program we defend is good to go. If the latter, how likely is it that there will be consensus around some specific “fair” function from allocated resources to goodness? Could anyone really hold, for example, that social welfare is equal to the formula above and this is a brute ethical truth that would remain so even if human nature were very different?

5. Is psychometrics good enough? Epistemic risk and the selection of assumptions

Scientific investigation always requires epistemic risk- making questionable assumptions-, and certain fields can be quite parochial in what kinds of risks they view as mundane (for example, modelling agents as fully rational and taking the results proven as important) and which risks they view as irresponsible (relying on questionnaire data). As our example indicates, the methodological question of the selection of risks is integrally connected to any debate about using psychometric methods to assess welfare. I intend to spend a chapter in my thesis considering the question of whether our social objectives can, and should, inform what risks we are willing to take.

One of the traditional reasons for economists’ reluctance to use “thicker” measures of welfare than revealed preference is a lack of understanding about the ingenious methods (and applied philosophy of science) that psychologists have developed in order to measure intangibles like happiness.

There are two quite separate approaches to measurement prominent in the social sciences (Angner 2013), the representational theory of measurement and the psychometric approach.

The representational theory of measurement in the context of welfare posits that preferences revealed in behaviour meet certain consistency and completeness axioms. A homomorphism between human behaviour and some set of numbers and permissible relations upon them is then established. On the psychometric approach, by contrast, we establish that a measure of some posited feature of humans (like, for example, “extroversion” or “happiness”) is reliable, and has the right sort of statistical relationships with observable outcomes and other measurements. This nexus of relationships is taken as validating the original measure.

For example, suppose that I want to measure tendency towards violence. I come up with a questionnaire that contains three questions, each of which is to be rated on a scale 1 to 5:

1. I often feel so angry I can’t control myself

2. Other people deserve a good beating

3. I want respect and I don’t care how I get it

I then add up people’s scores and compare them with a variety of other things: for example, lifetime incidence of being charged with violent crimes, mentions of violence in school disciplinary records, the family’s assessment of how violent the respondent is, a psychologist’s assessment of how violent the respondent is after a structured interview, and so on. If I find enough things that the measure correlates with, I have thereby provided evidence that it really is a measure of the variable in question.
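The procedure just described can be sketched in a few lines of Python. Everything here is hypothetical: the six respondents, their answers and the school-record counts are invented purely to show the mechanics, and a real validation study would use far more data and more than one criterion.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented data: one row per respondent, ratings (1 to 5) on the three items.
responses = [
    (1, 1, 2),
    (2, 1, 1),
    (3, 2, 3),
    (4, 4, 3),
    (5, 4, 5),
    (2, 3, 2),
]
scores = [sum(row) for row in responses]  # questionnaire total per respondent

# Invented criterion: mentions of violence in each respondent's school record.
school_record_mentions = [0, 0, 1, 3, 4, 1]

r_school = pearson(scores, school_record_mentions)
print("Correlation with school records:", round(r_school, 2))
```

A strong positive correlation with several independent criteria of this kind is what counts, on the psychometric approach, as evidence that the questionnaire total tracks the underlying trait.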

The psychometric approach to measurement is tied to the approach we are supporting in this thesis. This is usually thought to be because no one has yet found a satisfactory way to apply the representational theory of measurement to psychological unobservables (Angner 2013)- for example, happiness. The representational theory of measurement is tied to classical welfare economics, as it can be used to define an ordinal utility function- or a cardinal function through von Neumann and Morgenstern’s approach (von Neumann and Morgenstern 1944), although this is often ignored or wrongly dismissed as ordinal (Andreas 2011). One topic I want to explore in further depth in my thesis (following a suggestion from Latty 2020, personal communication) is the question of whether the representational theory might, after all, be applied to psychometrics, thus dissolving the traditional distinction.

Historically economists have preferred the representational approach as more rigorous because of its axiomatic quality and foundation in the observable- on this view, psychometrics’ foundations are both less rigorous and less precise. In truth both approaches have their own problems. The representational approach to measurement suffers from the inconsistency of human preferences. One can define preferences sufficiently narrowly so as to ensure agents are always consistent (—) but doing so risks triviality for the theory (—).

Engaging in science requires us to use methodological frameworks. These frameworks are often quite distant from direct confirmation or dis-confirmation using empirical evidence (which is not to say they are wholly impervious to such).  Thus the selection of framework is a risk. In choosing which risks to accept, it seems to me that one factor we should be sensitive to is the degree to which the framework will allow us to pursue socially valuable investigations.

In an ideal world both approaches to welfare- revealed preference based and psychometric- would be developed to their maximum limit. However in a world of constrained resources, it does not seem to me unreasonable to hold that one factor in determining which research agenda to follow should be the degree to which the candidate research agenda can answer the questions which matter to us. This includes politically and ethically relevant questions.

Because the psychometric approach to measurement can answer a wider array of questions- for example, questions about interpersonal levels, thus allowing us to answer questions like “which distributional effects are net welfare increasing”- it seems superior if the object of welfare economics is to give us information relevant to the evaluation and selection of policies. Yet welfare economics as it is currently taught focuses on proving theorems using heroic assumptions, and even so of limited application. See, for example, the fundamental theorem of welfare economics which shows that, under certain extremely restrictive assumptions that are never met, the results of market

We are not suggesting that the psychometric approach to measurement should be used to divvy up human wellbeing on pure faith alone. We are merely suggesting that it represents a more promising line of inquiry from the point of view of addressing the concerns that led to the creation of welfare economics in the first place. The role of values in prioritising scientific research is long acknowledged, e.g. Kitcher (—), and should not be mischaracterized as placing blind faith before evidence or reason.

6. Is the interpersonal utility comparison problem purely an artefact of anti-psychometric approaches to welfare measurement?

The interpersonal utility comparison problem is the problem of comparing the intensity of your preference for A with the intensity of my preference for B. There is no clear formal procedure for doing so. The problem has long dominated economists’ thinking about social welfare, perhaps to an unhealthy degree. Here is a point about the interpersonal utility comparison problem that hasn’t been made often enough- and is one of the major questions I want to tackle in my thesis. It is not clear that welfare, understood psychometrically and not in terms of revealed preferences, has a parallel problem.

Suppose one were concerned about the interpersonal comparability of subjective well-being as evaluated by psychometric measures- how could we demonstrate it? Well suppose that someone was concerned about the inter-location comparability of temperature,  worried that temperatures may mean different things near the poles than the tropics. The way to resolve this issue would be simple. We determine whether everything acts like it would at its equivalent temperature in different locations- for example, do specific chemical reactions happen at the same temperatures? 

Similarly, were interpersonal comparison of psychometrically derived happiness, life satisfaction or capabilities lists based on nothing- so that there was no reason to think your seven meant something remotely similar to my seven, we would not expect well-being to be a predictor of anything across different people. Yet as will be discussed in this thesis, well-being constructs are well validated against a variety of measures.

Indeed, the process of psychometric validation is, almost necessarily, a process of establishing interpersonal comparability, insomuch as it works by showing that measurements M of a trait T covary with measurements M’ of a second trait T’ where one would expect T & T’ to covary on theoretical grounds. From this an abductive inference is made that M is an accurate measure of T. While in principle such a validation could be carried out in a within-person diachronic model only, in practice, validation almost always depends on comparing scores between persons. If such comparisons worked despite your, my and Bob’s seven all not bearing the slightest similarity to each other, it would be a miracle.

It is true that one could reject entirely the psychometric theory of measurement and thus the construct validation which makes interpersonal comparison possible. However psychometrics is, by the standards of the social sciences, a well developed discipline with notable empirical successes (—). Such an argument, then, will be difficult to make.

There is perhaps a semantic question about what exactly it means for Jill to be happier than Jane. In the absence of a fully adequate semantics for mental state terms, this seems like an unavoidable problem. It is unclear to me though that this is a special problem of comparison, rather than a general problem of understanding what mental state terminology means. Suppose we had no clear account of what heat was: would the difficulties in articulating what was meant by saying one object was hotter than another be of any special interest, or would they be simple symptoms of that larger problem?

7. Excursus: Equal ignorance

One topic I want to consider in my thesis is the role of so-called “equal ignorance” in comparisons between persons.

Suppose that we have a group of agents with positive but diminishing marginal utility in some resource (say money). Say that we have no further information about the magnitude or shape of their utility curves. Lerner’s equal ignorance result (Lerner —, c.f. also Sen —) shows that, under such conditions, equal distribution is best if our aim is to create as large as possible a pool of expected total utility. The fundamental insight here is that if we don’t have information about how to set a “rate” of interpersonal comparison, we should not arbitrarily assume that one agent’s wants are much more intense than another’s. This insight- that in the absence of additional information we should assume about equal intensity- is likely essential to establishing a principled way to compare utilities.
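A toy version of the equal ignorance argument can be checked numerically. The sketch below is my own illustration, not Lerner’s proof: the two utility curves, the total of 100 units and the fifty-fifty odds over who has which curve are all assumptions chosen for simplicity.

```python
import math

TOTAL = 100  # units of the resource to divide between two agents

def u_sqrt(x):
    """One hypothetical concave utility curve."""
    return math.sqrt(x)

def u_log(x):
    """A second hypothetical concave curve, with a different shape and scale."""
    return 3 * math.log1p(x)

def expected_total_utility(x):
    """Agent 1 receives x and agent 2 receives TOTAL - x.

    We do not know which agent has which curve, so each assignment is
    given probability one half.
    """
    case_a = u_sqrt(x) + u_log(TOTAL - x)
    case_b = u_log(x) + u_sqrt(TOTAL - x)
    return 0.5 * case_a + 0.5 * case_b

# Best whole-number split when we are ignorant of who has which curve.
best_share = max(range(TOTAL + 1), key=expected_total_utility)

# Best split if we somehow knew agent 1 had the log curve.
best_if_known = max(range(TOTAL + 1), key=lambda x: u_log(x) + u_sqrt(TOTAL - x))

print("Best share under equal ignorance:", best_share)
print("Best share if curves were known:", best_if_known)
```

Under ignorance the even split maximises expected total utility. The optimum moves away from equality only once we are given information about whose curve is whose.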

Now Latty (2020, personal communication) has argued- for reasons I won’t get into in this precis- that in the logical limit, the psychometrician is forced to rely on equal ignorance type reasoning as well, just as much as the comparer of decision utilities. In my thesis I want to grapple with the question of who needs equal ignorance type reasoning, and to what degree. If I ultimately conclude that the psychometrician does require equal ignorance type reasoning to get interpersonal comparison of subjective well-being off the ground in the same way that utility theorists need equal ignorance, I will further grapple with this question: if both approaches depend on the assumption of equal ignorance at their core, then why has this assumption been so much less controversial in one field than the other?

8. The cardinality problem in decision utility and in subjective wellbeing

The cardinality of a measure is the strength of its scale. Let us explain with examples. Consider a scale that consists of 0 for male, 1 for female and 2 for other. We could easily change these numbers, making 0 represent female and 1 represent male for example, and no information would be lost or changed. We call a scale like this nominal. Now consider a scale like 0=angry, 1=average, 2=peaceful. Here, because “average” is closer to both angry and to peaceful than angry is to peaceful, the order matters- we can’t change that at will. However we could still make peaceful=3, and no real information would be lost- order matters, but not the size of the gaps- we call this scale ordinal. Now consider a scale like temperature. Here the size of the gap between two numbers matters. The difference between 10°C and 20°C is the same size as the difference between 20°C and 30°C. A scale of this strength is called interval. Scales of strength interval and above are sometimes called cardinal scales. Above interval scales there are ratio scales, in which there is a single and non-arbitrary zero point, and there are further gradations above this (for example, if there is a non-arbitrary unit as well as a non-arbitrary zero).

There is a longstanding debate in economics about whether utilities are ordinal or cardinal.

In this thesis I want to consider a further interesting question: the extent to which focusing on psychometrics allows us to avoid the ordinalist/cardinalist debate in the theory of utility. Whether preferences are defined up to a cardinal scale (or ratio scale for that matter) is an intriguing question in its own right, but if we treat welfare as at least partly separate from preferences- at least in the way economics treats preference fulfilment- we avoid much of the practical motivation for the controversy. Of course this escape clause will not apply if we accept the preference fulfilment theory of welfare. On the other hand, many of the strategies we use to establish the cardinality of psychometric welfare measurements might also be useful in studying behaviourally defined utilities, and vice-versa, so the issues are perhaps not so separable.

There is a cardinality problem in psychometrics (Ng, exact reference forgotten; Kristoffersen 2010). Let us say that we have two sets of happiness scores- {4, 6, 8} and {5, 6, 7}- which represents the greater aggregate happiness? The sum and average of both are equal, but can we be certain that the gap in responses between 5 and 6 is exactly half the size of the gap between 4 and 6?
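The worry can be made vivid with a little arithmetic in Python. The two rescaling functions below are arbitrary stand-ins that I have picked for illustration. Nothing here says that reported happiness really relates to underlying happiness in either of these ways.

```python
import math

group_a = [4, 6, 8]
group_b = [5, 6, 7]

def linear(score):
    """Reported scores taken at face value (the cardinality assumption)."""
    return score

def concave(score):
    """Hypothetical case: each extra point is worth less higher up the scale."""
    return math.sqrt(score)

def convex(score):
    """Hypothetical case: each extra point is worth more higher up the scale."""
    return score ** 2

def total(scores, rescale):
    """Aggregate happiness after mapping reported scores to 'true' units."""
    return sum(rescale(s) for s in scores)

for rescale in (linear, concave, convex):
    print(rescale.__name__, total(group_a, rescale), total(group_b, rescale))
```

Taken at face value the two groups tie. If the true scale is concave in reported scores the second group comes out ahead, and if it is convex the first group does, so the ranking depends on an assumption the raw numbers cannot settle.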

There are a number of potential ways to address this problem, including approaches based on studying the time and risk tradeoffs participants would accept (building on both von Neumann and Morgenstern 1944 and Kahneman —), examining patterns of score change in test/retest over a time period sufficiently short that any change should be due to pure error, item response theory, comparison with another scale that is thought to be linear, and other approaches. I intend to study these in detail in my thesis.

I have also done, along with Kieran Latty, some unpublished sensitivity analysis of what happens if the cardinality assumption is violated and the size of say a 1 point difference is not constant across the scale, using various functions to represent different underlying scales, some of them very different from the assumption of linearity. We found that it makes little difference to the ranking of countries and regions. This is significant because one of the main reasons we want to establish cardinality is so that we can compare outcomes, through averages. Space permitting I will discuss this research.

9. A typology of approaches to welfare in welfare economics

As part of the thesis I want to give a taxonomy of methodological approaches to welfare economics. While the concepts I trace are old concepts, the taxonomy, as far as I can tell, is novel.

By Divertism I mean an approach which urges us to stop talking about welfare altogether, and start talking about something else. For example, changes to income plus or minus non-income changes as measured by willingness to pay (Kaldor-Hicks). On no plausible theory of welfare is this a measure of welfare- the closest theory it comes to is the preference fulfilment theory, but the preference fulfilment theory acknowledges declining marginal utility in income. While divertism imposes a heavy cost in urging that welfare economics should not discuss welfare, its advantages include that it is very easily measurable, for the most part, without the use of measures or variables outside the realm of economics proper. Hicks (—) in effect advocated for this approach.

Stipulationism is similar to divertism, but subtly different. It can be seen as an extreme expression of the operationalist (—) tradition in the philosophy of measurement. We simply stipulate some function- for example “utility equals log income”- perhaps because it has “interesting” properties, or because we think it will be appealing to policy makers. We declare that “welfare” in some specific context just is this measure, similar to (—)’s classic statement about intelligence, that it is just whatever intelligence tests measure. Archibald (1959) represents this tradition.

Paretianism is an approach which aims to say as little about welfare as possible. It does this by using a purely ordinal scale, assumed to increase in some numeraire like money. We can assume, surely (goes the reasoning), that however welfare is defined, it increases when at least one person is richer and no one is poorer. The main, and long observed, problem is that the Pareto improvement relation is only a partial ordering of states of affairs and most plausible policy options leave at least one person worse off. A secondary problem is that relative income effects may mean that apparent “Pareto improvements” defined, say, in terms of income may actually make certain people worse off. Pareto is of course the classical source for something (—) like this approach.
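The partial-ordering problem is easy to see in a small Python sketch. The income figures are invented, and measuring each person’s position by income alone is exactly the simplifying assumption questioned above.

```python
def pareto_improvement(before, after):
    """True if nobody is worse off in `after` and at least one person is better off."""
    pairs = list(zip(before, after))
    nobody_worse = all(new >= old for old, new in pairs)
    someone_better = any(new > old for old, new in pairs)
    return nobody_worse and someone_better

# Hypothetical incomes for three people under three states of affairs.
status_quo = (10, 10, 10)
policy_x = (12, 10, 10)   # one person gains, nobody loses
policy_y = (50, 50, 9)    # two people gain a lot, one loses a little

print(pareto_improvement(status_quo, policy_x))
print(pareto_improvement(status_quo, policy_y))
print(pareto_improvement(policy_y, status_quo))
```

The first policy is a Pareto improvement on the status quo. The second is neither better nor worse than the status quo by the Pareto relation, so the Paretian has nothing to say about it, however large the gains to the winners.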

Realism is the approach that I am defending in this thesis. It is the view that welfare is a real, albeit unobserved, variable. This variable is generally conceived of as psychological, but there are exceptions, c.f., for example, the material welfare school (—). It may be life satisfaction, happiness, some collection of objective goods, or something else entirely, and different realists will vary on this. Realism can be seen as a contemporary continuation of old welfare economics. While the realist has to grapple with both ethical and psychometric questions, if they can answer these questions they are best positioned to contribute to the fundamental task of welfare economics- viz, the practical evaluation of which policies best advance human welfare. Of contemporary theorists, Ng (—) is probably the most famous advocate of this approach. 

A brief note that to advocate for realism doesn’t necessarily imply the thesis that I am defending in this book. [Explicate: The VNM utility account as a potential form of realism grounded only in economic behaviour with no need for psychometrics. Its main problems are: its assumptions make it either over-idealised and false, or trivial, and it has no natural mechanism for interpersonal comparison. Nonetheless it is both realist and in a sense non-psychological].

10. Between the Scylla of sectarianism and the Charybdis of faux-neutrality

Welfare economics has long held an ambiguous status between science and normative investigation. Any methodology for welfare economics must grapple with the question of the proper role of norms in that science. Existing approaches, such as the Kaldor-Hicks criterion, are value laden, whatever some of their defenders might claim. In this section I consider our psychological approach from the point of view of the problem of values in science, arguing that our approach is well equipped to deal with attendant problems.

There are many concerns about the role of ethical values in welfare economics, but one way to usefully split them might be as follows:

First there is what we might call the doctrinal problem of values in economics.

Secondly there is what we might call the political problem of values in economics.

The doctrinal problem is an objection in principle to the idea of value judgements in economics. The political problem is a concern with the effects of including controversial value statements in a science that has to seek legitimacy.

In this thesis we will not be concerned with the doctrinal problem in any great detail, because it has been reviewed extensively elsewhere. Sciences as diverse as medicine and conservation biology have successfully incorporated value judgements. Alexandrova (2017) argues, I believe successfully, that no science of well-being can avoid the incorporation of value judgements, and that to refuse to study well-being in the name of an abstract principle would be absurd. C.f. chapter x where we deal in detail with the ethical problems of ‘avoiding’ difficult ethical questions in welfare economics for further information about why we consider value judgements unavoidable in welfare economics. In brief, refusing to be explicit about measuring values has its own problems. It can lead to the importation of values that very few would agree with- for example, treating the marginal value of a dollar as the same for everyone (c.f. chapter). A more moderate approach attempting to delimit the points at which values can validly enter our science, for example in the form of a value framework given by policy makers, faces the difficulty that values need to be constantly reinterpreted and applied in the process of public policy research and this process is itself value-laden (Machlup —).

In this thesis we are primarily concerned with the political problem of values in economics: how to incorporate value judgements in a way which isn’t sectarian, and which doesn’t reduce the broad acceptability of economic social science generally, and welfare economics specifically. We acknowledge this is a problem of special concern. As controversial as the value judgements sometimes made in medicine and conservation biology are, they pale in comparison to the value judgements potentially made in economics. Disagreement over such values once brought the world to the brink of nuclear war.

So we need a way to navigate between the Scylla of sectarian ethical frameworks, which hold interest only for the fanatics of each sect, and the Charybdis of faux ethical positivism, which smuggles in ethical judgements under the cover of neutrality.

What is needed then is a framework for measuring welfare which is broadly acceptable, or a collection of frameworks which are jointly broadly acceptable. We might disagree on whether it is optimal, but it needs to contain enough morally relevant information for enough people that it is difficult to argue with the value of research conducted under its aegis.

Let us say that an information-gathering paradigm is broadly acceptable when a significant portion of the population concerned with decision-making values the information it generates, and this portion exceeds the portion who reject it, for example by arguing that its results are inherently misleading. Given the diversity of ethical opinion, is a broadly acceptable information-gathering paradigm about subjective well-being possible? In what follows, I argue yes. I make a number of points which, taken jointly, show that while the assessment of welfare in an economic context will always face disagreement over values, these disagreements are unlikely to be fatal to the quest for broad acceptability.
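The definition above can be written as a simple check. What follows is a minimal sketch only: the function name and the `significance_threshold` value are illustrative assumptions of mine, since the definition deliberately leaves open what counts as a "significant portion".

```python
def is_broadly_acceptable(share_valuing, share_rejecting, significance_threshold=0.25):
    """Return True if a paradigm meets the two conditions in the definition.

    share_valuing:   proportion of the decision-relevant population that values
                     the information the paradigm generates.
    share_rejecting: proportion that rejects it (e.g. as inherently misleading).
    significance_threshold: ASSUMED cut-off for a "significant portion";
                     the text does not fix a number.
    """
    if not (0.0 <= share_valuing <= 1.0 and 0.0 <= share_rejecting <= 1.0):
        raise ValueError("shares must be proportions between 0 and 1")
    is_significant = share_valuing >= significance_threshold
    outnumbers_rejecters = share_valuing > share_rejecting
    return is_significant and outnumbers_rejecters
```

Note that the test does not require a majority, let alone unanimity: a paradigm valued by 40% and rejected by 10% passes, while one valued by 40% and rejected by 50% does not.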

10.1. Welfare economics as decision maker vs welfare economics as information provider

There is a serious, often unremarked disagreement in welfare economics between approaches which see welfare economics as a decision-making tool (e.g., Social Welfare Functions, meant to represent the ethical principles of “the decision-maker”), and approaches which see it as an information-gathering exercise (most approaches to cost-benefit analysis, at least in principle). Our approach falls within the tradition of welfare economics as an information-gathering tool. We will argue such approaches avoid many of the pitfalls of values in welfare economics.

Where our approach differs from past approaches that see welfare economics as about information gathering is that it aims to collect information on variables of clear ethical importance, selected for their ethical importance. In the past, approaches which view welfare economics as an information-gathering exercise have often tended to eschew explicit ethical talk (e.g., cost-benefit analysis), while explicitly ethically charged approaches, like the Bergson-Samuelson SWF, have often been closer to the welfare-economics-as-decision-making-tool side of the argument.

Admittedly the distinction is imprecise. Kaldor-Hicks could be intended as a decision-making criterion, while social welfare functions can be interpreted as tracking just one criterion of interest. The distinction has more to do with the rhetorical construction of what the measure is intended for than with its formal content. Nonetheless there is value in being clear on exactly how we are interpreting a measure. Is it useful information, or is it a decision-making heuristic? Our choice will set the justificatory standard.

The problem with approaches that see welfare economics as, first and foremost, an ethical decision-making procedure is that ethical decision-making procedures over social policy are, by their nature, almost as controversial as anything can be. A method for gathering and bundling information needs only to clear more modest barriers. We do not need to establish that everyone wants access to a certain form of information to justify gathering it. We do not need to establish that it is superior to all other approaches.

This is because, while there is a logical tension between different decision rules, there is no such logical tension between different information-gathering paradigms: they are non-rivalrous. If I want to gather information on, say, the human rights impacts of a policy in service of some capabilities-based approach, that does not undermine someone else (or even myself on another occasion) gathering information on the life-satisfaction effects of a policy.

10.2. Being clear on what we mean by measuring values

Recurring value differences over what it means to measure welfare can plausibly be ameliorated if we regard what we measure as only guaranteed to be correlates of what matters, not necessarily the variables that matter in themselves. Hausman and — make this point in —. For example, suppose you think that what it means for someone’s life to go well for them is some complex blend of preference satisfaction and hedonic state. Well, even if you only directly measure hedonic state, you’re going to also track life satisfaction implicitly, since being satisfied with your life is highly correlated with happiness.

This has many useful consequences. For example, consider the three main theories of what it means for someone’s life to go well for them: life satisfaction, happiness and objective lists. Since all three are closely empirically related, the partisan of any one of them should probably prefer the use of any of the others over no welfare measurement at all. This makes information gathering around such measures even less controversial.

Of course these variables do not always intercorrelate sufficiently for there to be no practical difference between them. Happiness and life satisfaction respond differently to income, for example, with life satisfaction more affected by income than happiness (—). Thus measures of wellbeing are not interchangeable. This is a pity, but in a way it reinforces the need for a rich research program studying a multiplicity of well-being measures.

10.3. The welfarist interest proposition

Welfarism is the ethical position that welfare is all that matters in ethical decision-making. If progress in welfare economics required us to establish that welfarism is true, it is unlikely we would ever make progress.

Adler proposes what he terms partial welfarism to deal, to some degree, with these concerns. A partial welfarist is a kind of consequentialist who places at least some ethical weight on the welfare effects of any action. Presumably many people who are not welfarists can agree with partial welfarism.

I want to suggest what I call the Welfarist interest proposition, an even less demanding concept than partial welfarism. The welfarist interest proposition is:

Almost all ethical viewpoints desire information about the welfare effects of policies, so long as such information can be obtained reasonably cheaply.

Consider, for example, a Nozickean prioritising a list of policies. Among them are equivalence classes of policies which all violate their deontic principles equally. The Nozickean is not a partial welfarist, since they do not “weigh” welfare effects against their deontic principles. Even for a Nozickean though, welfare information might be valuable as a tiebreaker. Thus even the Nozickean “desires information about the welfare effects of policies, so long as such information can be obtained reasonably cheaply”, at least in many cases.
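The tiebreaker role of welfare information can be sketched as a lexicographic ordering: policies are ranked first by how badly they violate the deontic principles, and welfare is consulted only within a tie. The `Policy` fields, the numeric scores and the example policies below are all hypothetical, introduced purely for illustration.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    name: str
    deontic_violation: int  # hypothetical score: lower means fewer violations
    welfare_effect: float   # hypothetical score: higher means better welfare outcome


def nozickean_ranking(policies):
    """Rank policies lexicographically.

    Deontic violations always take priority; welfare is never traded off
    against them and only breaks ties inside an equivalence class.
    """
    return sorted(policies, key=lambda p: (p.deontic_violation, -p.welfare_effect))


policies = [
    Policy("A", deontic_violation=1, welfare_effect=5.0),
    Policy("B", deontic_violation=0, welfare_effect=1.0),
    Policy("C", deontic_violation=0, welfare_effect=3.0),
]
ranked_names = [p.name for p in nozickean_ranking(policies)]
```

Here B and C fall in the same equivalence class, so welfare decides between them, while A ranks last despite having the largest welfare effect. No weighting occurs, yet the welfare information still changes the final ordering.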

It seems a modest demand then that we assemble information about the effect of policies on variables which many (all?) people care about at least as instrumental indicators at some point in the decision making process.

10.4. Democratic measurement

One final strategy for ensuring political legitimacy in the study of welfare which deserves greater consideration is outsourcing the selection of exactly what to study to the public. Focus groups, polls etc. can be used to select appropriate measures to include in batteries, or even to design composite measures, as was done in the UK.

The UK Office for National Statistics found that what the public valued in terms of welfare was a composite measure. In the words of Alexandrova:

“The outcome of this exercise is a measure of UK’s well-being that contains both subjective indicators—happiness, life satisfaction, sense of meaning—and also objective indicators, such as life expectancy, educational achievements, safety, and so on. A colourful wheel where each indicator is a spoke makes up for the fact that the measure basically includes everything but the kitchen sink”

Such an approach does leave open questions. For example, can we be certain that the participants understand the difference between things which are intrinsically versus instrumentally connected to happiness? Nonetheless, democracy represents a powerful potential response to those who say that the implicit values of the science of wellbeing in an economic context have no legitimate foundation.

[Reread Nyborg to check this gloss is correct] Indeed Nyborg argues that the great failure of traditional cost-benefit analysis is that it is undemocratic. [Complete]

Kitcher argues that the values of the democratic polis need to guide a research program at many points, including standards of evidence (a la Rudner —) and priorities. The idea of democratic oversight of science, without substituting folk wisdom for expertise, is thus not new. If a shift towards empirical welfare economics is accompanied by greater democratic openness in how the field sets its research agenda, so be it.

We will note, though, that the dream of economists making no value judgments themselves, and simply taking as inputs the values of others, is unlikely to be achieved. As Machlup (1969) notes, where there is incomplete specification of values the economist will have to fill in the gaps. Also, as Alexandrova notes, the interweaving of moral and positive ideas is often complex, and scientists working in areas related to wellbeing, including economics, might often have special insights into what forms wellbeing, for example, might take in certain contexts.

10.5. Evidence about what people value with regard to welfare

The results of the UK Office for National Statistics’ study in and of themselves suggest that people value a full complement of objective and subjective well-being measures, but are these findings more broadly replicable? Can we show that people value the things that are thought to make one’s life go well? Wholly unsurprisingly, yes. Let us consider the example of happiness and positive hedonic states. In the European Social Survey (2016), ~69% of the sample said that it was at least somewhat like them to think it is important to have a good time. Approximately 66% of the sample said that it was at least somewhat like them to think that having fun and seeking pleasure are important.

10.6. Section conclusion

To summarise, there are broadly two kinds of concerns about values in welfare economics: broad doctrinal concerns about the role of values in science, and narrower concerns about what is politically acceptable and can feasibly be legitimated. Here we are focused on the second concern. We show that the program we advocate can deal with it by arguing:

1. That it avoids many of the traditional problems of explicitly ethically charged approaches by treating welfare economics as an information gathering rather than decision making exercise.

2. That the project need not be committed to any one concept of wellbeing or welfare as true or correct over the others, because they are all heavily intercorrelated, and thus gathering information about one gives us information about the others.

3. That information about welfare is of interest to an enormous range of different ethical perspectives.

4. That we have the option of legitimating our approach by appeal to democratic surveys.

5. That evidence suggests that the public value what is measured by common wellbeing constructs, such as happiness.

11. The problem of diversity in welfare concepts

One of the major difficulties with a psychologically grounded approach to studying welfare is that there are many different, normatively grounded, conceptions of what it means for someone’s life to go well for them, all of which need to be fleshed out using different measures. 

Of course, we do have the option of setting our sights narrower, abandoning claims to be definitively measuring how well people’s lives are going (welfare) and instead measuring happiness, life satisfaction or collections of objective goods or capabilities for their own value, irrespective of whether or not they are rightly considered welfare. Alexandrova supports what she calls contextualism about claims regarding welfare. The social worker who wishes to ensure her client has material supports, the friend who wants to check in on whether you are following your dreams and the psychologist who aims to ensure that you are not plagued by anxiety and depression are all investigating different things, but in each specific context it is appropriate to call them welfare evaluations. We could maintain that welfare economics is measuring welfare in a specific context, and that in that specific context certain choices are defensible that would be harder to defend as general principles about welfare. Perhaps the economist’s tendency to favour preference satisfaction measures, for example, might reflect real aspects of the economic policy context, and might be appropriately met by focusing on life satisfaction in welfare economics, even if life satisfaction is not so good a measure of welfare in other contexts.

But there’s another option for dealing with the multiplicity of theories about what it means for a life to go well for someone, based on the work of Hausman and McPherson (2009). Measures based on popular theories of welfare, such as desire satisfaction, hedonism and objective list theory, are usually highly inter-correlated with each other. Because they are inter-correlated, information about one will give us evidence about the others. If someone is happy, they probably also have many of their most intense desires satisfied. If someone is safe, financially and socially secure, educated and healthy, the odds are pretty good that they are happy.
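One way to make the evidential point concrete is the following sketch. It rests on an assumption the argument itself does not make: that two measures, say standardised happiness $H$ and standardised life satisfaction $L$ (each with mean 0 and variance 1), are jointly normally distributed with correlation $\rho$. Under that assumption:

```latex
\mathbb{E}[L \mid H = h] = \rho\, h,
\qquad
\operatorname{Var}(L \mid H = h) = 1 - \rho^{2}
```

The higher the correlation, the more an observation of one measure narrows our uncertainty about the other. With $\rho = 0$ the observation tells us nothing, and with $\rho$ close to 1 the two measures are nearly interchangeable. The realistic case lies in between, which is why one measure counts as evidence about another rather than as a substitute for it.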

While the hedonist might not feel completely comfortable with life satisfaction as a measure, given its correlation with their preferred outcome, they will certainly prefer it to, for example, a decision rule like Kaldor-Hicks. 

“Thus we should stop thinking about preference satisfaction, happiness and items on an objective list constitute well-being and rather think of them as evidence for well-being (Hausman and McPherson 2009)” (Reiss 2013)

12. Practical applications

I want to conclude my thesis by considering the practical application of the framework I have described. 

The most popular applied form of welfare economics is cost-benefit analysis, which sums the changes a policy will cause in income (positive and negative) and the willingness to pay for those changes, or for those changes not to occur. As we have already noted, such a procedure faces the related but distinct difficulties that it is unfair and that it does not maximise wellbeing on any common conception of wellbeing, because such conceptions invariably assume wellbeing is not linear in money.

It also faces a special difficulty in the context of a democratic society that, to the best of my knowledge, previous authors have not called out explicitly. It is effectively an income-weighted voting procedure! As such its legitimacy in a democratic society is tenuous.

It is possible to weight CBA by income. This can be done in two ways: (1) on the basis of direct intuitions about the value of certain kinds of income distributions, or (2) on the basis of an empirically grounded assessment of the function between income and human flourishing. It will come as no surprise that we support the second strategy.
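To illustrate the difference weighting can make, here is a minimal sketch comparing an unweighted sum of willingness to pay with an income-weighted one. The weighting function is an assumption of mine, not something the argument above commits to: it takes the marginal value of money to be proportional to income raised to the power of minus `eta`, one common functional form. The incomes, willingness-to-pay figures, `reference_income` and the value of `eta` are all made up for the example. Under the second strategy, `eta` would be estimated from data on income and wellbeing rather than chosen by intuition.

```python
def unweighted_net_benefit(people):
    """Standard CBA: add up willingness to pay, a dollar counting the same for everyone."""
    return sum(wtp for _income, wtp in people)


def distributional_weight(income, reference_income, eta):
    """ASSUMED weight: marginal value of money relative to someone on the reference income."""
    return (reference_income / income) ** eta


def weighted_net_benefit(people, reference_income, eta):
    """Income-weighted CBA: scale each person's willingness to pay by their weight."""
    return sum(
        distributional_weight(income, reference_income, eta) * wtp
        for income, wtp in people
    )


# Hypothetical policy: a low earner would pay 100 to stop it,
# a high earner would pay 150 to have it go ahead.
people = [(20_000, -100.0), (80_000, 150.0)]

unweighted = unweighted_net_benefit(people)                                # 50.0
weighted = weighted_net_benefit(people, reference_income=40_000, eta=1.0)  # -125.0
```

Unweighted, the policy passes because the high earner's willingness to pay outvotes the low earner's. Once each dollar is scaled by its assumed marginal value, the sign of the result flips.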

The most prominent applied example of the second strategy in action is contained in the UK’s Green Book on cost-benefit analysis. I conclude with a review of such approaches, and a discussion of directions for future research.

If you enjoyed this article please consider joining our mailing list: https://forms.gle/TaQA3BN5w3rgpyqeA also, a collection of my best writing between 2018 and early 2020 is available as a free e-book “Something to read in quarantine: Essays 2018-2020”. You can grab it here.

Bibliography (only about one quarter complete)

Alexandrova, A. (2017). A philosophy for the science of well-being. Oxford University Press.

Andreas, J. (2011). The cardinalist manifesto: The epistemology of the measurability of utility. Journal of the History of Economic Thought, 33(4), 559–561. https://doi.org/10.1017/S1053837211000356

Angner, E. (2013). Is it possible to measure happiness? The argument from measurability. European Journal for Philosophy of Science, 3(2), 221–240. https://doi.org/10.1007/s13194-013-0065-2

Archibald, G. C. (1959). Welfare economics, ethics, and essentialism. Economica, 26(104), 316–327. https://doi.org/10.2307/2550868

Bruni, L., & Sugden, R. (2007). The road not taken: How psychology was removed from economics, and how it might be brought back. The Economic Journal, 117(516), 146–173. https://doi.org/10.1111/j.1468-0297.2007.02005.x

Cooter, R., & Rappoport, P. (1984). Were the ordinalists wrong about welfare economics? Journal of Economic Literature, 22(2), 507–530.

Hands, D. W. (2010). Economics, psychology and the history of consumer choice theory. Cambridge Journal of Economics, 34(4), 633–648. https://doi.org/10.1093/cje/bep045

Kristoffersen, I. (2010). The metrics of subjective wellbeing: Cardinality, neutrality and additivity. Economic Record, 86(272), 98–123. https://doi.org/10.1111/j.1475-4932.2009.00598.x

Von Neumann-Morgenstern 1944 (placeholder)
