How exploitation launched human cooperation

Behavioral Ecology and Sociobiology

(2019) 73:78


How exploitation launched human cooperation
Rahul Bhui¹ & Maciej Chudek² & Joseph Henrich³,⁴
Received: 11 January 2018 / Revised: 20 March 2019 / Accepted: 28 March 2019 © Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract Cooperation plays a crucial role in primate social life. However, the evolution of large-scale human cooperation from the cognitive fundamentals found in other primates remains an evolutionary puzzle. Most theoretical work focuses on positive reciprocity (helping) or coordinated punishment by assuming well-defined social roles or institutions (e.g., punishment pools), sophisticated cognitive abilities for navigating these, and sufficiently harmonious communities to allow for mutual aid. Here we explore the evolutionary and developmental origins of these assumed preconditions by showing how negative indirect reciprocity (NIR)—tolerated exploitation of those with bad reputations—can suppress misbehavior to foster harmonious communities, favor the cognitive abilities often assumed by other models, and support costly adherence to norms (including contributing to public goods). With minimal cognitive prerequisites, NIR sustains cooperation when exploitation is inefficient (victims suffer greatly; exploiters gain little), which is more plausible earlier in human evolutionary history than the efficient helping found in many models. Moreover, as auxiliary opportunities to improve one’s reputation become more frequent, the communal benefits provided in equilibrium grow, although NIR becomes harder to maintain. This analysis suggests that NIR could have fostered prosociality across a broader spectrum of primate societies and set the stage for the evolution of more complex forms of positive cooperation.
Significance statement The evolutionary origins of human cooperation and prosociality remain an evolutionary puzzle. Theoretical models exploring the dynamics which shaped our ancestors’ interactions stimulate empirical investigations by anthropologists, primatologists, psychologists, archeologists and others, whose results in turn refine and direct theoretical inquiry. Common experience has focused this scholarly synergy on positive cooperation (cooperating by helping) and largely neglected the distinct and important challenge of negative cooperation (cooperating by not exploiting). Our contribution puts negative cooperation back in the spotlight. We outline what makes negative cooperation, especially negative indirect reciprocity, different and potentially more potent than positive cooperation, and present a simple model of how it emerges, shapes interactions, and can form a dynamic foundation that catalyzes more sophisticated forms of cooperation.
Keywords Evolution · Negative indirect reciprocity · Cooperation · Reputation · Social norms

Communicated by F. Amici
This article is a contribution to the Topical Collection An evolutionary perspective on the development of primate sociality - Guest Editors: Federica Amici and Anja Widdig
Electronic supplementary material The online version of this article contains supplementary material, which is available to authorized users.
* Joseph Henrich
1 Department of Psychology, Harvard University, Cambridge, MA, USA
2 Tulita, Canada
3 Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
4 Canadian Institute for Advanced Research, Toronto, ON, Canada

On a small island in the northwestern corner of the Fijian archipelago, subsistence-oriented farmers and fishers cooperate intensely in many domains of life. The villagers on Yasawa Island reliably show up to work on communal projects such as cleaning up the village, constructing communal buildings, and preparing for public feasts. Such collective activities happen at least weekly, and Yasawans work hard with good cheer and laughter. Yasawan geniality is evident even in experimental paradigms used to measure prosociality; they make equitable offers in dictator, ultimatum, and third-party punishment games, approaching those of Western populations (Henrich and Henrich 2014); yet, unlike Westerners, they generally will not pay to punish or sanction in these experiments. This way of life stands in stark contrast to many other small-scale populations—like the Matsigenka of Peru or the Mapuche of Chile—where folks are wary of communal work and collective action in large groups, making it virtually impossible to assemble labor forces to perform tasks similar to those routinely performed in Yasawan villages; not surprisingly, people in these populations are far less equitable to their fellow villagers in experiments compared with Yasawans (Henrich et al. 2001, 2005; Henrich and Smith 2004).
How is Yasawan cooperation maintained? Some classic theories about the evolution of cooperation imply that prosociality can be driven by direct reciprocity or costly punishment, that is, by overt retaliation in the same kind of economic interaction or by individually costly actions taken by observers. But while this behavior is systematically observed in experiments with Western participants (Ensminger and Henrich 2014), it is far less common or non-existent among the Yasawans (Henrich and Henrich 2014). Instead, systemic interviews and vignette studies reveal that in rare instances where an individual consistently does not contribute to village affairs, their reputation is damaged by gossip, and they are sanctioned by anonymous punishment such as the theft of their crops, often carried out by those with preexisting grudges. Such acts, which provide benefits to the punishers, would normally be investigated by the community—but when the targeted individual has a bad reputation, the community looks the other way. In this world, it is only bad to do bad things to good (or well-reputed) people. In this paper, we formally explore how this mechanism of negative indirect reciprocity can simultaneously control harmful exploitative behaviors and sustain norm adherence (including socially beneficial cooperation) in other domains.
From a wider perspective, human cooperation is peculiar in several ways. Humans not only cooperate more broadly and intensively than other species, but the extent of this cooperation also varies dramatically across diverse domains (e.g., in fishing, house building, and war) as well as among societies, including those inhabiting identical environments. Moreover, the scale of human cooperation has expanded dramatically over the last 12 millennia in patterns and at speeds that cannot be accounted for by genetic evolution (Henrich and Henrich 2007; Chudek and Henrich 2011). Consequently, a proper evolutionary approach to human cooperation must seat our species within the natural world, subject to both natural selection and phylogenetic constraints, while at the same time proposing hypotheses that account for the unique evolutionary, developmental, psychological, and historical features of human cooperation.
Aiming to address the puzzle of human ultra-sociality, many formal evolutionary models of cooperation make assumptions about the cognitive abilities of potential cooperators. Some, such as kinship (Hamilton 1964) and direct reciprocity (Trivers 1971; Axelrod and Hamilton 1981), presuppose few cognitive prerequisites but only explain cooperation under special conditions: among kin, or in very small groups (Boyd and Richerson 1988a, b). Other models tackle the challenge of explaining distinctly human forms of cooperation but do so by presupposing a cognitively sophisticated, highly cultural species. For instance, important models assume that people can establish sophisticated institutions (Sigmund et al. 2010), interpret one another’s signals of cooperative intent (Boyd et al. 2010), or coordinate their community-wide definitions of deserving “recipients” and responsible “donors” (Leimar and Hammerstein 2001; Panchanathan and Boyd 2004; Boyd et al. 2010). By emphasizing the evolution of positive cooperation (reciprocal helping), these models also presuppose relatively harmonious communities where the benefits of mutual aid can accumulate and shape long term fitness without being rapidly undermined by opportunistic exploitation, such as theft or rape.
Though they demonstrate how human cooperation may have rapidly escalated, these models gloss over the critical earliest stages of the emergence of human cooperation, since harmonious communities which coordinate complex cognitive representations (e.g., who is a “donor”), establish institutions, and dynamically signal their behavioral intentions in novel domains are themselves impressive cooperative accomplishments. Explaining the origins of such communities while assuming only minimal cognitive prerequisites (consistent with what is known about primate cognition) remains an outstanding challenge. To address this challenge, we detail an evolutionary mechanism that rapidly coordinates expectations and behavior in arbitrary domains (e.g., hunting, sharing information, trade) and yet can arise without preexisting capacities for coordinating complex institutions or socially prescribed roles.
Of these approaches to human cooperation, one important class of models is based on “indirect reciprocity” (IR; e.g., Nowak and Sigmund 1998, 2005; Leimar and Hammerstein 2001; Panchanathan and Boyd 2003). Prima facie IR models assume only that (i) individuals have opinions of one another and that these opinions (ii) influence how individuals treat each other and (iii) can be culturally transmitted. Since many primates form coalitions with non-kin (Silk 2002; Watts 2002; Langergraber et al. 2007; Perry and Manson 2008; Higham and Maestripieri 2010), the first two assumptions are plausible socio-cognitive pre-adaptations in our Pliocene ancestors. The third assumption is also plausible if our early cognitive adaptations for cultural learning (e.g., for acquiring food preferences) spilled over into other domains, producing individuals who sometimes culturally acquired their opinions of one another (Henrich and Gil-White 2001). The cultural transmission of social opinions can transform pairwise coalitional affiliations into community-wide “reputations”. Once reputations had fitness consequences, they could begin shaping behavior in any reputation-relevant domain (Panchanathan and Boyd 2004), stabilizing conformity to arbitrary community norms and providing the substrate for the more complex cooperation-sustaining mechanisms that presuppose coordinated communities (Chudek and Henrich 2011; Henrich 2016, Chap. 11). Crucially, such culture-driven forms of genetic evolution do not emerge in most species due to the barriers to evolving cumulative cultural evolution (Boyd and Richerson 1996; Henrich 2016, Chap. 16).
However, existing IR models make substantially stronger assumptions about the cognitive sophistication and social coordination capacities of our ancestors. Framed in the context of reciprocal helping, these models assume that sometimes someone has an opportunity to help but does nothing, and that their reputation worsens as a consequence of their inaction. This seemingly innocuous assumption implies that their peers cognitively represent, and coordinate their representations of, both the abstract opportunity to act and the significance of inaction. This is a sophisticated cognitive feat. Noting this issue, Leimar and Hammerstein (2001) write that IR models assume “a reasonably fair and efficient mechanism of assigning donors and recipients […] a well-organized society, with a fair amount of agreement between its members as to which circumstances define [these] roles”. Most IR models implicitly mirror these assumptions (Nowak and Sigmund 1998; Panchanathan and Boyd 2003).
Here we ask whether IR is plausible without assuming coordinated reactions to “inaction”. We develop a general model of IR, which incorporates the possibility that reputations are regularly buffeted by random external influences, but inaction never changes reputations. Our results show that IR is nevertheless plausible under these circumstances and can support adherence to community norms in other domains. We demonstrate how early proto-reputations (byproducts of cultural learning and coalitional psychology) can escalate in importance until they form the substrate of more complex forms of cooperation.
Since we are interested in modeling the earliest forms of distinctly human cooperation, we focus on “negative indirect reciprocity” (hereafter, NIR), which has rarely been the focus of study. “Negative reciprocity” broadly denotes retaliation in response to another’s uncooperative behavior (e.g., Fehr and Gächter 2000). NIR extends this retaliation to depend on the other person’s reputation, and hence indirectly on their behavior. Such punitive interactions take place in negative cooperative dilemmas, where “defecting” means gainfully exploiting someone and “cooperating” means seeing such an opportunity to exploit someone, but passing it up (doing nothing). Note, though, that reputations (and hence retribution) are allowed to be contingent on behavior in other positive dilemmas in addition to the focal negative one. Typical models treat negative dilemmas as merely the symmetrical flip-side of standard (positive) cooperative dilemmas due to their equivalent payoff matrices. However, there are both theoretical and empirical reasons to think that negative dilemmas are psychologically distinct scenarios that were particularly potent early in the evolution of human cooperation:
1. Substantial positive cooperation presupposes harmonious communities: Before more complex forms of mutual aid, defense, and helping can emerge, the ubiquitous opportunities to exploit each other (particularly the old, weak, and injured) must be brought under control. Otherwise, exploitation and cycles of revenge will undermine positive cooperation. A degree of harmony must come first.
2. Positive cooperation creates or exacerbates negative dilemmas (but not the reverse): Positive cooperation will often create an abundance of exploitable resources, both tangible (e.g., food caches) and intangible (e.g., trust). If cooperation has not first been stabilized in negative dilemmas, escalating opportunities for exploitation can quickly sap these benefits, sabotaging the viability of positive cooperation. For example, our band might cooperate to create a community store of food for the winter. But, then, over several wintery months, nightly thieves might slowly pilfer it away.
3. Escalating returns: Prior to the emergence of complex institutions like debt, money, or police, if a well-reputed individual is helped multiple times (i.e., by multiple peers), they are likely to experience diminishing marginal returns. A little food when you are starving provides a huge benefit, whereas a lot of food when you are full provides only incremental benefits. On the other hand, repeated exploitation (e.g., stealing someone’s resources) can put victims in ever more dire situations with escalating fitness consequences (e.g., the repeated theft of food from the hungry and weak). This suggests that in the IR context, where many community members respond to a focal well- or ill-reputed individual, negative dilemmas likely generate steeper selection gradients. This was likely most relevant earlier in our evolutionary history, before widespread food-sharing norms emerged (likely an early form of positive cooperation).
4. No chicken and egg problem: In a positive cooperative dilemma, when inaction is unobservable or there is a lack of sufficient agreement about what constitutes “inaction”, an individual’s reputation can endogenously rise (by helping), but it cannot effectively fall through inaction. Though an individual’s reputation might fall accidentally, selection will not favor individuals who take deliberate costly actions to worsen their reputation. Clearly, reputation has little value until it can fall as well as rise; but without complex culturally evolved institutions or cognitive abilities to establish agreement about what constitutes “inaction”, it is not clear how positive indirect reciprocity gets off the ground: there is a chicken and egg situation. Negative dilemmas lack the chicken and egg quality because “defections” (e.g., stealing food from the injured) are salient and observable actions.
5. Relevance to culture: The cooperative dilemma of cultural learning (whether to trust information shared by others, and whether to share information honestly) is a major hurdle to more sophisticated institutional forms of cooperation and is a fundamentally negative dilemma. Individuals must pass up opportunities to gainfully deceive their credulous conspecifics. This dilemma is all the worse for more culture-dependent species. Negative dilemmas related to sharing cultural information must be solved to unleash powerful forms of cumulative cultural evolution (Henrich 2016).
6. Preadaptations are more plausible: The cognitive capacities for navigating negative dilemmas (noticing and responding to opportunities to gain benefits by exploiting others) yield individual advantages and so were likely better honed by selection earlier than those for navigating positive dilemmas (noticing opportunities to pay costs for others’ welfare).
7. Supported by psychological evidence: Much contemporary psychological evidence points to the relevance of negative dilemmas. People today are more sensitive to harm than helping (negativity bias), and to harm by commission than by omission.
Harmful or aversive actions, events, or stimuli have more and stronger effects on contemporary humans than their positive or beneficial counterparts (for reviews, see Cacioppo and Berntson 1994; Baumeister et al. 2001; Rozin and Royzman 2001). Of particular relevance, negative information (i.e., about others’ harmful acts) has a far more potent effect on reputations than positive information (Fiske 1980; Skowronski and Carlston 1987; Rozin and Royzman 2001), and people judge that others caused negative outcomes more intentionally than positive ones (Knobe 2003, 2010). Young children and even three-month-old infants find wrongdoers more aversive than they find helpers appealing (Hamlin et al. 2010; Tasimi et al. 2017). If our ancestors were as negativity-biased as we are, negative cooperative dilemmas would have dwarfed positive ones in determining the long-run distribution of reputations. People condemn others’ moral transgressions more severely when they are the result of deliberate actions, compared with equal but intentional inactions (Spranca et al. 1991; Baron and Ritov 2004; Cushman et al. 2006). Correspondingly, people seem less disposed to transgress by commission than omission (Ritov and Baron 1999), especially if they might be punished by others (DeScioli et al. 2011). These effects, which seem peculiar to negative commissions (Spranca et al. 1991) not positive ones, support our model’s emphasis on negative cooperation by commission alone.

We are interested in whether detrimental exploitation can be curbed with a simple form of reputation that demands only limited cognitive capacities, and whether this can be used to sustain communal contributions and adherence to norms in other interactions. To tackle this puzzle, we construct a model of negative indirect reciprocity (NIR) where we analyze interactions between very different kinds of individuals, such as reputation-contingent cooperators who always cooperate with well-reputed individuals or obligate defectors who exploit at every turn. We can thus reason formally about what kinds of strategies would be favored by selective evolutionary processes, whether via genetic or cultural evolution. Fig. 1 lays out the basic elements of our NIR model. We first solve the model and describe its properties, and then discuss the degree of public goods provisioning that NIR supports.
To begin, imagine a single, large population of individuals who each have a “reputation” (a community-wide opinion of them that can influence others’ behavior), which can be either “good” or “bad”. We represent this reputation as a binary stochastic variable whose stationary distribution (denoted G) is the probability of being “good” on average. Reputations are determined by a person’s actions in two kinds of social situations: with probability (1 − ρ), chance furnishes each individual in the population with an opportunity to gainfully exploit (and potentially be exploited by) a random peer; with probability ρ, individuals instead face an opportunity to improve their reputation by paying a cost. We refer to the former as the “theft game” and the latter as the “contribution game”. The parameter ρ expresses the relative frequency with which each scenario occurs.
In the theft game, people can choose either to exploit their peers (X = 1) to accrue a personal gain (the takings, t) at the expense of the victim who suffers harm (damage, d), or do nothing (X = 0). Important reputational implications follow in each case. If an individual chooses exploitation, we assume that the thief’s reputation declines only if the victim has a good reputation in the community: people do not care about what happens to poorly regarded victims. Thus, in this model and under IR more generally, individuals with “good” reputations are defined as those publicly well-liked enough, with enough friends, allies, or social connections, that actions directed towards them carry reputational consequences. If you exploit someone with a good reputation you acquire a bad reputation. If an individual chooses instead not to exploit a potential target, we assume that no one notices their inaction and nothing changes (assuming their propriety is correctly perceived). This novel assumption lessens the cognitive sophistication assumed by our model relative to existing IR models. With probability η, an individual’s reputation is misperceived such that someone who refrains from exploitation is mistakenly thought to have defected.

Fig. 1 The NIR decision tree. The probability of each branch is described by blue parameters, and evolving dispositions are represented by green variables (Y, disposition to pay reputation improvement costs; and X, disposition to exploit well-reputed victims). Red text at terminal nodes describes the consequences of each outcome. Branches: a social opportunity is either the contribution game (an opportunity for deliberate reputation improvement: do nothing, with peers either apathetic so that nothing changes or disappointed so that reputation worsens; or pay cost c for public benefit b, possibly misperceived) or the theft game (an opportunity to exploit: if the target has a bad reputation, exploit, inflicting damage d and earning takings t; if the target has a good reputation, either do nothing, so that nothing changes unless misperceived, or exploit, inflicting damage, earning takings, and worsening reputation)
In the contribution game, people can choose to either pay to improve their reputation (Y = 1) by contributing a public benefit b at personal cost c or do nothing (Y = 0). To deliberately improve your peers’ opinion of you, you need to know what pleases them as a group. This naturally suggests provisioning public goods (providing for a public feast, communal defense, or chasing away pests or predators) but could also include conformity to others’ preferred behavioral standards and imitation of the best-reputed individuals (and so b need not be positive). Here, to better understand how the socio-ecology of NIR unfolds once norms have become established, we consider the possibility that forfeiting an opportunity to improve one’s reputation (e.g., by not sharing a fortunate day’s catch), whether deliberately or by accident, actually worsens one’s reputation (with probability ζ). As ζ increases, voluntary cooperative contributions become mandatory or normatively cooperative actions (think about giving to charity versus paying taxes). This parameter also nests the possibility that inaction is ignored as before (when ζ = 0). Additionally, following Panchanathan and Boyd (2004), we allow for positive assortment in group formation with strength r, so that the probability of encountering another person of the same type (equivalently, the expected fraction of individuals of the same type in the group) is r + (1 − r)p where p is the frequency of that strategy in the population (and the complementary probability is (1 − r)(1 − p)). Finally, we assume that individuals who try to improve their reputation can accidentally be misperceived with probability ε as having made no such attempt, though the cost is still exacted and the benefit still produced.
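The assortment rule in the preceding paragraph can be checked numerically. The following is a minimal sketch, not part of the original model code; the function name `encounter_probs` is a hypothetical helper introduced only for illustration.

```python
def encounter_probs(p, r):
    """Return (same_type, other_type) encounter probabilities.

    p: population frequency of the focal strategy (0 <= p <= 1)
    r: strength of positive assortment (0 <= r <= 1)
    Both formulas are taken from the text; the function name is illustrative.
    """
    same_type = r + (1 - r) * p          # probability of meeting one's own type
    other_type = (1 - r) * (1 - p)       # complementary probability
    return same_type, other_type


# With no assortment (r = 0) the chance of meeting one's own type is just p;
# with full assortment (r = 1) it is 1 regardless of p.
same, other = encounter_probs(p=0.3, r=0.2)
```

The two probabilities always sum to one, since r + (1 − r)p + (1 − r)(1 − p) = r + (1 − r).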
We consider four different strategies defined by their behavior in each game:
1) Obligate defectors (D) who exploit everyone and never contribute (X = 1; Y = 0),
2) Reputational cooperators (R) who never exploit the well-reputed and always contribute (X = 0; Y = 1),
3) Stingy types (S) who never exploit the well-reputed but also do not contribute (X = 0; Y = 0), and
4) Mafiosos (M) who exploit everyone but also contribute (X = 1; Y = 1).
Since obligate cooperators (who contribute and never exploit anyone regardless of their reputation) are dominated by reputational cooperators (see section 4 of the supplemental materials), we do not consider them further. Our main analysis establishes conditions under which a population of reputational cooperators is stable against rare invaders of each type (stability conditions for all other strategies are provided in section 5 of the supplemental materials).
Stability of reputational cooperator population against defector invasion
In a population of common R with rare D playing the contribution game, an individual with strategy R gains benefit b from interaction with other Rs and always pays the contribution cost c. In the theft game, they gain takings t when encountering another individual who is in bad standing and suffer damage d when they are themselves in bad standing (since that is the only time other Rs will exploit them). The (long-run mean) fitness of R here is thus
$$w_R = \rho\{b(r + (1-r)p_R) - c\} + (1-\rho)\{t(1-G_R) - d(1-G_R)\},$$
where pR ≈ 1 is the population frequency of R, and GR is the (steady state) probability that an R strategist is in good standing. An individual with strategy D also gains b when they interact with Rs, but never pays c in the contribution game. They always exploit others in the theft game and hence always gain t, but lose d when they are in bad standing. The fitness of D is thus
$$w_D = \rho\{b(1-r)p_R\} + (1-\rho)\{t - d(1-G_D)\},$$
where GD is the probability that someone playing D is in good standing.
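To make the two fitness expressions concrete, here is a small sketch that evaluates them for given parameter values. It takes the good-standing probabilities as inputs rather than deriving them; the function names and the example numbers are assumptions chosen for illustration and do not come from the paper.

```python
def fitness_R(b, c, t, d, r, rho, p_R, G_R):
    """Long-run mean fitness of a reputational cooperator (R)."""
    contribution = b * (r + (1 - r) * p_R) - c      # benefit from Rs minus own cost
    theft = t * (1 - G_R) - d * (1 - G_R)           # exploit bad Rs; exploited when bad
    return rho * contribution + (1 - rho) * theft


def fitness_D(b, t, d, r, rho, p_R, G_D):
    """Long-run mean fitness of a rare obligate defector (D)."""
    contribution = b * (1 - r) * p_R                # free-rides on Rs, never pays c
    theft = t - d * (1 - G_D)                       # always exploits; exploited when bad
    return rho * contribution + (1 - rho) * theft


# Illustrative values only: R is common (p_R = 1) and defectors are never in
# good standing (G_D = 0), as derived later in the text.
params = dict(b=2.0, t=1.0, d=3.0, r=0.1, rho=0.5, p_R=1.0)
advantage = fitness_R(c=1.0, G_R=0.9, **params) - fitness_D(G_D=0.0, **params)
```

With p_R = 1 and G_D = 0, the difference reduces to ρ(rb − c) + (1 − ρ)(d − t)G_R, which is the quantity whose sign decides whether defectors can invade.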
In the long run, the probability of an agent having a good reputation is well approximated by the mean of its stationary distribution; that is, $G = \frac{P_g}{P_g + P_b}$, where $P_g$ and $P_b$ are the probabilities of good and bad reputational transitions. An individual arrives at good standing only by paying for reputation and being correctly perceived as such, so $P_g = \rho Y(1-\varepsilon)$. They fall to bad standing by failing to pay when the community cares or by stealing from someone in good standing (or being misperceived as having committed either transgression), so in a population of Rs, $P_b = \rho[(1-Y) + Y\varepsilon]\zeta + (1-\rho)G_R[X + (1-X)\eta]$. Thus,

$$G_i = \frac{\rho Y_i(1-\varepsilon)}{\rho[Y_i(1-\varepsilon)(1-\zeta) + \zeta] + (1-\rho)G_R[X_i + (1-X_i)\eta]},$$

so $G_D = 0$, and $G_R = \frac{\rho(1-\varepsilon)}{\rho(1-\varepsilon(1-\zeta)) + (1-\rho)G_R\eta}$ is the solution to the quadratic equation $(1-\rho)\eta G_R^2 + \rho(1-\varepsilon(1-\zeta))G_R - \rho(1-\varepsilon) = 0$. This solution is opaque and hard to interpret analytically (though written out in section 3 of the supplemental materials), so, in what follows, we will develop bounds that approximate the solution and depict its properties more clearly. Note that when errors are small ($\varepsilon, \eta \to 0$), $G_R \to 1$. Intuitively, this happens because Rs never intentionally do anything that would place them in bad standing, and always pay to improve their reputation.
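The steady-state reputation can also be computed numerically. The sketch below solves the quadratic for G_R and evaluates the general expression G = P_g / (P_g + P_b) for any strategy (X, Y) that is rare in a population of reputational cooperators. Function names are hypothetical, and the code assumes 0 < ρ ≤ 1 and error rates in [0, 1).

```python
import math


def good_standing_R(rho, eps, eta, zeta):
    """Root in [0, 1] of (1-rho)*eta*G^2 + rho*(1-eps*(1-zeta))*G - rho*(1-eps) = 0."""
    a = (1 - rho) * eta
    b = rho * (1 - eps * (1 - zeta))
    c = -rho * (1 - eps)
    if a == 0:                      # eta = 0 or rho = 1: the equation is linear in G
        return -c / b
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)


def good_standing(X, Y, G_R, rho, eps, eta, zeta):
    """G_i = P_g / (P_g + P_b) for a rare strategy with dispositions X and Y."""
    p_good = rho * Y * (1 - eps)
    p_bad = rho * ((1 - Y) + Y * eps) * zeta + (1 - rho) * G_R * (X + (1 - X) * eta)
    total = p_good + p_bad
    return p_good / total if total > 0 else 0.0


# Illustrative parameter values (assumptions, not taken from the paper).
G_R = good_standing_R(rho=0.5, eps=0.1, eta=0.1, zeta=0.5)
G_D = good_standing(X=1, Y=0, G_R=G_R, rho=0.5, eps=0.1, eta=0.1, zeta=0.5)
```

Feeding the solved G_R back into the general expression with X = 0 and Y = 1 returns the same value, which is a quick self-consistency check on the fixed point.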

R is stable against invasion by D ($w_R > w_D$) when

$$\rho\{rb - c\} + (1-\rho)\{t(1 - G_R - 1) - d(1 - G_R - (1 - G_D))\} > 0$$

$$(1-\rho)\{d - t\}G_R > \rho\{c - rb\}$$

$$\frac{d-t}{c-rb} > \frac{\rho}{1-\rho}\,\frac{1}{G_R} \tag{1}$$

This holds assuming that c > rb. If rb > c, cooperation will evolve simply via the non-random association captured in r. So, this formulation shows how NIR can expand the conditions favorable to cooperation beyond r. This expression is closely related to the basin of attraction for the R regime, $p_R > \frac{c-rb}{d-t}\,\frac{\rho}{1-\rho}\,\frac{1}{G_R}$, as shown in section 2 of the supplemental materials, which also includes basins of attraction for strategy trios. To obtain a refined approximation of $\frac{1}{G_R}$, we first expand out its expression and subsequently assume that errors are small. By the preceding computations we have that

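$$\frac{1}{G_R} = \frac{\rho(1-\varepsilon(1-\zeta)) + (1-\rho)G_R\eta}{\rho(1-\varepsilon)} = \left[1 + \zeta\frac{\varepsilon}{1-\varepsilon}\right] + \frac{1-\rho}{\rho}\,\frac{\eta}{1-\varepsilon}\,G_R,$$

meaning that the right-hand side of the stability condition is

$$\frac{\rho}{1-\rho}\,\frac{1}{G_R} = \frac{\rho}{1-\rho}\left[1 + \zeta\frac{\varepsilon}{1-\varepsilon}\right] + \frac{\eta}{1-\varepsilon}\,G_R.$$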

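When errors are small, so $G_R \to 1$, the stability condition for R to resist D is approximately

$$\frac{d-t}{c-rb} > \frac{\rho}{1-\rho}\left[1 + \zeta\frac{\varepsilon}{1-\varepsilon}\right] + \frac{\eta}{1-\varepsilon} \tag{2}$$

Here $\frac{d-t}{c-rb}$ is the ratio of net costs from the two games, $\frac{\rho}{1-\rho}$ is the odds of the contribution game relative to the theft game, and the remaining terms capture the impact of the errors and judgements.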

This reflects an upper bound on the right-hand side since GR is bounded above by 1, therefore whenever our approximation (2) is satisfied, the exact condition (1) is always also satisfied; the two conditions coincide exactly when η = 0. The simulations depicted in Fig. 2 illustrate the accuracy and conservative nature of the approximation, especially when errors are small (see section 1 of the supplemental materials for extensive simulations).
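The relationship between the exact condition (1) and the approximation (2) can be illustrated with a short numerical sketch. It computes the minimum value of (d − t)/(c − rb) required by each condition; the function names and the parameter values used are assumptions for illustration, and the code assumes 0 < ρ < 1.

```python
import math


def good_standing_R(rho, eps, eta, zeta):
    """Root in [0, 1] of (1-rho)*eta*G^2 + rho*(1-eps*(1-zeta))*G - rho*(1-eps) = 0."""
    a = (1 - rho) * eta
    b = rho * (1 - eps * (1 - zeta))
    c = -rho * (1 - eps)
    if a == 0:                      # eta = 0: the equation is linear in G
        return -c / b
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)


def exact_threshold(rho, eps, eta, zeta):
    """Right-hand side of condition (1): rho/(1-rho) * 1/G_R."""
    return rho / (1 - rho) / good_standing_R(rho, eps, eta, zeta)


def approx_threshold(rho, eps, eta, zeta):
    """Right-hand side of condition (2), obtained by setting G_R = 1 in the eta term."""
    return rho / (1 - rho) * (1 + zeta * eps / (1 - eps)) + eta / (1 - eps)


# Illustrative values: the approximation is never below the exact threshold,
# so satisfying (2) guarantees that (1) holds as well.
exact = exact_threshold(rho=0.5, eps=0.1, eta=0.1, zeta=0.5)
approx = approx_threshold(rho=0.5, eps=0.1, eta=0.1, zeta=0.5)
```

Setting eta to zero in both functions gives identical thresholds, matching the statement that the two conditions coincide when η = 0.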
This stability condition (2) holds a number of meaningful implications. First, defectors will struggle to invade when exploitation is more inefficient, yielding relatively less benefit to the exploiter (t) than the harm it does their victim (d). Intuitively, d > t when the strong and healthy steal from or injure the weak, old, and sick. Second, with positive assortment (r > 0), the most stable arrangements are those in which the contributed public benefits (b) are sufficiently large relative to the cost of provision (c), as will be discussed later. That said, even neutral or harmful norms (where b ≤ 0) can be maintained under certain (more stringent) conditions. For example, both b and r can be zero and R can still be stable. Third, public contributions can only be sustained by the disciplining force of the theft game. Hence, the latter must occur sufficiently often relative to the former, meaning ρ cannot be too large. If ρ = 0, the condition holds as long as d > t. Fourth, errors are always detrimental to stability, as the right-hand side terms are increasing in ε and η. Their multiplicative relationship also implies the errors compound each other, as the effect of η (doing nothing misperceived as exploitation) is increasing in ε (contribution misperceived as inaction). Finally, intriguingly, the propensity for the community to frown on non-contribution has an adverse effect on the stability of R. Intuitively, this happens because defectors never have good reputations in the long-run, so punishment for non-contribution harms mostly cooperators that are erroneously perceived to have shirked their communal duties; this is made clear by observing that the effect of ζ relies entirely on its interaction with ε. Thus NIR appears most effective at staving off defectors in early societies, before more complex cognitive faculties have developed. But as we will see later, selection pressures entail that when people are strongly expected to contribute, the public benefits produced in equilibrium tend to be more highly valued.

Fig. 2 Minimum threshold values of (d − t)/(c − rb) required for reputational cooperation to be stable against rare defectors, shown for the approximation and the exact condition. Separate panels vary ρ, ζ, η, and ε in turn. Non-varied parameters are set at ρ = 1/2, […] 1/10
Stability of reputational cooperator population against stingy invasion
In a population of common R with rare S, an individual with strategy R again has fitness
$$w_R = \rho\{b(r + (1-r)p_R) - c\} + (1-\rho)\{t(1-G_R) - d(1-G_R)\}.$$
An S does not pay in the contribution game, and so earns b only when meeting Rs. They exploit only those in bad standing in the theft game and are exploited when they are themselves in bad standing. The fitness of strategy S is thus
$$w_S = \rho\{b(1-r)p_R\} + (1-\rho)\{t(1-G_R) - d(1-G_S)\}.$$
Since Ss never pay for reputational improvements, they have no other way to achieve good standing and hence G_S = 0. Thus, assuming that c > rb, R is stable against invasion by S (w_R > w_S) when
$$\frac{d}{c - rb} > \frac{\rho}{1-\rho}\,\frac{1}{G_R}. \tag{3}$$
Since t > 0, this is a less stringent version of the stability condition against defectors. Therefore, when a population of R is stable against D, it is also stable against S, and the results of the previous section apply here equivalently.
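As a rough numerical check on condition (3), the sketch below solves the steady-state equation for G_R (the expression recalled in the Mafioso analysis) and compares the two fitness expressions directly. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def good_standing_R(rho, eps, eta, zeta):
    """Steady-state share G_R of reputational cooperators in good standing.

    Solves G = rho(1-eps) / (rho(1-eps(1-zeta)) + (1-rho) G eta) for its
    non-negative root. Assumes 0 < rho <= 1.
    """
    qa = (1 - rho) * eta                 # coefficient on G^2
    qb = rho * (1 - eps * (1 - zeta))    # coefficient on G
    qc = -rho * (1 - eps)                # constant term
    if qa == 0:                          # no theft-game errors: linear case
        return -qc / qb
    return (-qb + math.sqrt(qb * qb - 4 * qa * qc)) / (2 * qa)


def fitness_R(b, c, d, t, r, rho, p_R, G_R):
    # w_R = rho{b(r + (1-r)p_R) - c} + (1-rho){t(1-G_R) - d(1-G_R)}
    return rho * (b * (r + (1 - r) * p_R) - c) + (1 - rho) * (t - d) * (1 - G_R)


def fitness_S(b, d, t, r, rho, p_R, G_R, G_S=0.0):
    # w_S = rho{b(1-r)p_R} + (1-rho){t(1-G_R) - d(1-G_S)}, with G_S = 0
    return rho * b * (1 - r) * p_R + (1 - rho) * (t * (1 - G_R) - d * (1 - G_S))


def stable_against_stingy(b, c, d, r, rho, G_R):
    """Condition (3); only meaningful when c > r*b and rho < 1."""
    return d / (c - r * b) > rho / ((1 - rho) * G_R)


# Illustrative (assumed) parameter values.
b, c, d, t, r, rho = 2.0, 1.0, 1.0, 0.5, 0.1, 0.5
G_R = good_standing_R(rho, eps=0.1, eta=0.1, zeta=0.9)
w_R = fitness_R(b, c, d, t, r, rho, p_R=1.0, G_R=G_R)
w_S = fitness_S(b, d, t, r, rho, p_R=1.0, G_R=G_R)
print(f"G_R = {G_R:.4f}, w_R - w_S = {w_R - w_S:.4f}")
print("condition (3) holds:", stable_against_stingy(b, c, d, r, rho, G_R))
```

The fitness gap reduces to ρ(rb − c) + (1 − ρ)dG_R, so the direct comparison and condition (3) should always agree whenever c > rb.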
Stability of reputational cooperator population against Mafioso invasion
In a population of common R with rare Mafiosos, an individual with strategy R always gains b and pays c in any contribution event, exploits only the ill reputed in the theft game, and is exploited only when in ill repute. The fitness of R here is thus
$$w_R = \rho\{b - c\} + (1-\rho)\{t(1-G_R) - d(1-G_R)\}.$$
An M also gains b and pays c in the contribution game but exploits everyone in the theft game and is exploited when in bad standing, and hence has fitness
$$w_M = \rho\{b - c\} + (1-\rho)\{t - d(1-G_M)\}.$$
Thus, R is stable against invasion by M (w_R > w_M) when
$$\begin{aligned}
t(1 - G_R - 1) - d\bigl(1 - G_R - (1 - G_M)\bigr) &> 0 \\
d(G_R - G_M) &> t\,G_R \\
\frac{d}{t} &> \frac{G_R}{G_R - G_M}.
\end{aligned}$$




… is … the R regime,
$$p_R > \frac{t}{d-t}\,\frac{G_M}{G_R - G_M},$$
as shown in section 2 of the supplemental materials.

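Here, Ms are in good standing some of the time:
$$G_M = \frac{\rho(1-\varepsilon)}{\rho(1-\varepsilon(1-\zeta)) + (1-\rho)G_R},$$
and recall that
$$G_R = \frac{\rho(1-\varepsilon)}{\rho(1-\varepsilon(1-\zeta)) + (1-\rho)G_R\,\eta}.$$
Hence
$$\frac{G_R - G_M}{G_R} = 1 - \frac{G_M}{G_R} = 1 - \frac{\rho(1-\varepsilon(1-\zeta)) + (1-\rho)G_R\,\eta}{\rho(1-\varepsilon(1-\zeta)) + (1-\rho)G_R} = \frac{(1-\rho)G_R(1-\eta)}{\rho(1-\varepsilon(1-\zeta)) + (1-\rho)G_R},$$
and its reciprocal is
$$\frac{G_R}{G_R - G_M} = \frac{1}{1-\eta}\left[1 + \frac{\rho}{1-\rho}\,\frac{1-\varepsilon(1-\zeta)}{G_R}\right].$$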

As before, for added insight we expand out G_R, and as shown in the appendix we obtain the approximate (upper bound) stability condition:
$$\frac{d}{t} > \frac{1}{1-\eta}\left[1 + \frac{\rho}{1-\rho}\,\frac{(1-\varepsilon(1-\zeta))^2}{1-\varepsilon} + \eta\right]. \tag{5}$$

The simulations depicted in Fig. 3 indicate that this approximation mimics the properties of the exact solution (and it is indeed exact when η = 0), and several other bounds laid out in section 1 of the supplemental materials converge on similar predictions.
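To make the comparison between the exact threshold and approximation (5) concrete, here is a minimal sketch that computes both from the steady-state expressions for G_R and G_M. It is not the authors' simulation code; the function names are hypothetical, and the default parameter values (ρ = 1/2, ζ = 9/10, ε = 1/10) are assumptions chosen to resemble the Fig. 3 settings.

```python
import math


def good_standing_R(rho, eps, eta, zeta):
    """Non-negative root of G = rho(1-eps) / (rho(1-eps(1-zeta)) + (1-rho) G eta)."""
    qa = (1 - rho) * eta
    qb = rho * (1 - eps * (1 - zeta))
    qc = -rho * (1 - eps)
    if qa == 0:                      # eta = 0 (or rho = 1): the equation is linear
        return -qc / qb
    return (-qb + math.sqrt(qb * qb - 4 * qa * qc)) / (2 * qa)


def good_standing_M(rho, eps, zeta, G_R):
    """Steady-state share of Mafiosos in good standing, given G_R."""
    return rho * (1 - eps) / (rho * (1 - eps * (1 - zeta)) + (1 - rho) * G_R)


def exact_threshold(rho, eps, eta, zeta):
    """Exact minimum d/t, i.e. G_R / (G_R - G_M). Assumes 0 < rho < 1, eta < 1."""
    G_R = good_standing_R(rho, eps, eta, zeta)
    G_M = good_standing_M(rho, eps, zeta, G_R)
    return G_R / (G_R - G_M)


def approx_threshold(rho, eps, eta, zeta):
    """Right-hand side of the approximate (upper bound) condition (5)."""
    A = 1 - eps * (1 - zeta)
    return (1 + rho / (1 - rho) * A * A / (1 - eps) + eta) / (1 - eta)


# Assumed illustrative values: rho = 1/2, zeta = 9/10, eps = 1/10.
for eta in (0.0, 0.1, 0.2):
    ex = exact_threshold(0.5, 0.1, eta, 0.9)
    ap = approx_threshold(0.5, 0.1, eta, 0.9)
    print(f"eta={eta:.1f}  exact={ex:.4f}  approx={ap:.4f}  gap={ap - ex:.4f}")
```

Because G_R can be no larger than its η = 0 value (1 − ε)/(1 − ε(1 − ζ)), the approximation should sit at or above the exact threshold, with the gap vanishing at η = 0.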
This stability condition (5) has several interesting implications. First, as in the case of the defector invasion, the existence of reputation-based cooperation requires exploitation to be inefficient (d > t). Second, the costs and benefits in the contribution game are not relevant here because both types pay for reputation. Third, as before, contributions are sustained by the threat of punishment via exploitation in the theft game, so 1 − ρ must be reasonably large. Fourth, positive expectations of contributing still make cooperation harder to sustain: the derivative of the right-hand side with respect to ζ is
$$\frac{\rho}{1-\rho}\,\frac{2\varepsilon}{1-\eta}\,\frac{1-\varepsilon(1-\zeta)}{1-\varepsilon},$$
which is always positive and crucially dependent on ε.

More surprisingly, in some cases, errors can be beneficial for reputational cooperators. While η always has a strong adverse effect that magnifies the threshold, a higher ε can actually be advantageous. Intuitively, this happens because although errors in the contribution game are bad for both strategies, they can be even worse for Mafiosos because they often fall into disrepute due to their exploitative ways, and are thus more in need of a reliable path back to good standing. This effect turns out to be beneficial on net when non-contribution is not penalized, that is, when ζ is low, so that reputational cooperators are not punished too harshly for others' mistaken perceptions. To illustrate this mathematically, observe that the key middle term (1 − ε(1 − ζ))²/(1 − ε) reflecting the interaction is 1 − ε when ζ = 0 (which is decreasing in ε) and 1/(1 − ε) when ζ = 1 (which is increasing in ε). More generally, the derivative of the right-hand side with respect to ε is
$$\frac{1}{1-\eta}\,\frac{\rho}{1-\rho}\,\frac{\bigl(2\zeta - (1-\varepsilon(1-\zeta))\bigr)\bigl(1-\varepsilon(1-\zeta)\bigr)}{(1-\varepsilon)^2},$$
which is negative when ζ < (1 − ε)/(2 − ε). In the small error limit where ε → 0, this inequality simplifies to ζ < 1/2. Figure S3 in the supplemental materials shows how the minimum stability threshold for d/t changes with each parameter when ζ is small, depicting the reversal of ε's effect. This further indicates that conditions are most favorable for NIR when ζ is small.
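The reversal of ε's effect can be checked numerically. The following sketch evaluates the right-hand side of condition (5), compares a finite-difference derivative in ε with the closed-form derivative, and reports its sign on either side of the cutoff ζ = (1 − ε)/(2 − ε). The function names and the default values ρ = 1/2 and η = 1/10 are assumptions made for illustration.

```python
def rhs5(eps, zeta, rho=0.5, eta=0.1):
    """Right-hand side of the approximate stability condition (5)."""
    A = 1 - eps * (1 - zeta)
    return (1 + rho / (1 - rho) * A * A / (1 - eps) + eta) / (1 - eta)


def d_rhs5_d_eps(eps, zeta, rho=0.5, eta=0.1):
    """Closed-form derivative of rhs5 with respect to eps."""
    A = 1 - eps * (1 - zeta)
    return (rho / (1 - rho)) * (2 * zeta - A) * A / ((1 - eps) ** 2 * (1 - eta))


def eps_lowers_threshold(eps, zeta):
    """True when a marginal increase in eps reduces the d/t threshold."""
    return zeta < (1 - eps) / (2 - eps)


h = 1e-6  # step for the central finite difference
for zeta in (0.1, 0.9):
    eps = 0.1
    numeric = (rhs5(eps + h, zeta) - rhs5(eps - h, zeta)) / (2 * h)
    analytic = d_rhs5_d_eps(eps, zeta)
    print(f"zeta={zeta}: numeric={numeric:.6f}, analytic={analytic:.6f}, "
          f"eps helps cooperators: {eps_lowers_threshold(eps, zeta)}")
```

With ε = 0.1 the cutoff is 0.9/1.9, roughly 0.47, so the derivative should come out negative for ζ = 0.1 and positive for ζ = 0.9.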

Stages of NIR and sustainable cooperation
What are the consequences of NIR on cooperative outcomes? Through the lens of our model, we envision three progressive stages of socio-cognitive complexity, embodied in special cases of our parameters, which generate different levels of cooperation. Fig. 4 presents the logic of our perspective.

We begin with a plausible situation, early in our evolutionary history. The cognitive and behavioral prerequisites for reputations are in place: individuals selectively like or dislike their peers, and care or, selectively, do not care about how third parties treat them. The cultural transmission of reputations (opinions about others) is new, on evolutionary timescales. Here, however, second-order strategic responses to the existence of fitness-relevant reputations have not arisen yet: individuals do not actively monitor others' opinions of them or seek out opportunities to improve their reputation. In this earliest, least cognitively demanding stage, reputations were improved only by good fortune, not by deliberate effort. In such an environment, even if inaction is unobservable, selection can sustain harmony. This stage 1 occurs in our model when ρ → 0; inequality (2) reveals that reputation-based reciprocity is then stable whenever d > t. Here, NIR can establish more harmonious communities that limit exploitation of others (the weak, injured, sick, and elderly), though no public goods are provided in this first stage.

Fig. 3 Minimum threshold values of d/t required for reputational cooperation to be stable against rare Mafiosos. Non-varied parameters are set at ρ = 1/2, ζ = 9/10, … = 1/10. Panels plot the d/t threshold against ρ, ζ, η, and ε, comparing the approximation with the exact solution.

Even when individuals are unaware of their own reputations, oblivious to inaction and to anything that happens to the ill-reputed, the dynamics of the first stage can coordinate the weighty fitness consequences of community-wide exploitation. This opens up a new selective landscape, where selection favors monitoring one's own reputation and deliberately acting to improve it. We explore the unfolding of NIR dynamics by opening up the possibility that individuals notice costly opportunities to improve their reputation, which happens when ρ increases above zero. In particular, we consider what happens if opportunities for reputational improvement can be ignored without adverse consequences (ζ → 0). In this socio-ecology of stage 2, your peers are delighted if you share food with them, but barely notice if you instead keep it for yourself. Here, expression (2) entails that cooperation can be stable when
$$\frac{d-t}{c-rb} > \frac{\rho}{1-\rho}.$$
A positive amount of reputational norm adherence occurs, but the resulting public benefits must be large enough to resist defectors. Specifically, rearranging the inequality reveals that we need
$$rb > c - \frac{1-\rho}{\rho}(d-t). \tag{6}$$

This inequality shows how the theft game eases the standard conditions for cooperation created by non-random association (rb > c). The smaller ρ is and the more inefficient theft is (the larger d − t), the easier it is to maintain cooperation. The right-hand side of (6) is increasing in ρ (supposing d > t), as its derivative with respect to ρ is
$$-\frac{\partial}{\partial\rho}\left(\frac{1-\rho}{\rho}\right)(d-t) = \frac{d-t}{\rho^2} > 0,$$
meaning that selection pressures enforce a higher minimum benefit provided in equilibrium as stage 2 progresses. Figure 5 shows that this property is shared by the exact solution (including both types of errors). Though neutral or even harmful behaviors can potentially be sustained when the right-hand side of the inequality is negative, positive contributions will be particularly favored. We view this voluntary public goods provisioning as a key transitional phase, where selection begins to favor individuals who pay closer attention to their reputation and opportunities to improve it, and therefore to their community's behavioral expectations. To deliberately improve your reputation, you need to know what pleases your peers. Stage 2 provides a plausible cognitive foundation for the emergence of social norms (Chudek and Henrich 2011; Henrich 2016).
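Inequality (6) can be read as a minimum public benefit b that rises with ρ. The sketch below tabulates that minimum for a few values of ρ in the error-free stage 2 case. The function name is hypothetical, and the defaults d = 1, t = 1/2, c = 1, r = 1/10 are assumed values read from the partly garbled Fig. 5 caption.

```python
def min_public_benefit(rho, c=1.0, d=1.0, t=0.5, r=0.1):
    """Smallest b satisfying inequality (6): r*b > c - ((1 - rho)/rho) * (d - t).

    Error-free stage 2 approximation; requires rho > 0 and r > 0.
    """
    return (c - (1 - rho) / rho * (d - t)) / r


# A negative value means even neutral or harmful norms (b <= 0) can be sustained.
for rho in (0.2, 0.4, 0.5, 0.8, 1.0):
    print(f"rho = {rho:.1f}: b must exceed {min_public_benefit(rho):.2f}")
```

Under these assumed values the threshold is negative for small ρ, reaches about 5 at ρ = 1/2, and approaches c/r = 10 as ρ → 1, which is the standard rb > c condition without the theft game.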

Fig. 4 Socio-cognitive stages of NIR

Fig. 5 Increase in the minimum stability threshold for public benefit provision b across stages of NIR. Parameters are set at d = 1, t = 1/2, c = 1, r = 1/10, ε = η = 1/10, … = 1/2, … = 3/10. Panels plot the b threshold against ρ and against ζ, comparing the approximation with the exact solution.