What is an example of cultural transmission?

Around the world, there are thousands of different cultures. A culture is made up of the ways people think and act, together with the material objects that shape their way of life (Macionis, 2017). I am a Central Illinois girl who is fortunate to live in a very positive culture with many opportunities.
In a society, many sociological concepts make up a culture and its way of life. One element of culture is symbols. Symbols are things that carry a particular meaning recognized by the people who share a culture (Macionis, 2017). In my community, recognized visual symbols include the hornet, which is our school mascot, and the dome of the courthouse, which represents …
Throughout Eureka, we often use school and business marquees to communicate special events and recognitions in our community. Cultural transmission is how one generation passes culture to the next (Macionis, 2017). Where I live, we have very strong family, religious, and community ties. Culture is passed on through consistent interaction with family members, including grandparents and the elderly, and through involvement in the church and in community activities such as 4-H and school organizations. Values are the standards that people use to determine what is desirable, good, and beautiful, and that serve as guidelines for social living (Macionis, 2017). We value trust, hard work, family, and God. In Eureka we trust that people will be there for each other when needed and that they will do the right thing. It is very common for people to leave their cars running and doors unlocked. We are raised from a young age to work hard, whether it’s at school, in sports, or at a job. We don’t rely on taking the easy way out. God and family are very important values. We support our family and our commitment to God. Beliefs are specific statements that people hold to be true …
Expectations for youth include working hard in school and graduating from high school. This is reflected in my high school class, where a majority of students have a GPA above 3.0 and we have 14 valedictorians. Mores are norms that are widely observed and have great moral significance (Macionis, 2017). Abstinence is an example. With religion being a pillar of our community, there is a strong emphasis on waiting until marriage to engage in sexual activity. Folkways are norms for routine or casual interaction (Macionis, 2017). Our community has a large population of Apostolic Faith believers, and it is common practice for women who are church members to wear long skirts and their hair in a bun. Social control is society’s attempt to regulate people’s thoughts and behavior (Macionis, 2017). As noted before, our community has a strong Apostolic presence. It is very common for parents and church elders to encourage and persuade the youth to repent and join the church. Sometimes they use the fear of going to hell to persuade them. Technology is the knowledge that people use to make a way of life in their surroundings (Macionis, 2017). Many older individuals in our culture probably do not use the advances of …


    Cultural transmission is the way a group of people within a society or culture tend to learn and pass on new information.

    Learning Objectives

    • Analyze the importance of cultural transmission, particularly in terms of learning styles

    Key Points

    • Learning styles are greatly influenced by how a culture socializes its children and young people.
    • The process by which a child acquires his or her own culture is referred to as enculturation.
    • On the basis of cultural learning, people create, remember, and deal with ideas. They understand and apply specific systems of symbolic meaning.
    • A meme is “an idea, behavior or style that spreads from person to person within a culture.” The term was coined by the British evolutionary biologist Richard Dawkins in The Selfish Gene (1976).
    • Intercultural competence is the ability to communicate successfully with people of other cultures.

    Key Terms

    • Intercultural Competence: The ability to communicate successfully with people of other cultures.
    • Cultural Transmission: The way a group of people or animals within a society or culture tend to learn and pass on new information.
    • Symbolic Meaning: Meaning that is conveyed through language; when one knows that X means Y.

    Cultural transmission is the way a group of people or animals within a society or culture tend to learn and pass on new information. Learning styles are greatly influenced by how a culture socializes its children and young people. The key aspect of culture is that it is not passed on biologically from the parents to the offspring, but rather learned through experience and participation. The process by which a child acquires his or her own culture is referred to as “enculturation.” Cultural learning allows individuals to acquire skills that they would be unable to acquire independently over the course of their lifetimes.

    Cultural learning is believed to be particularly important for humans. Humans are weaned at an early age compared to the emergence of adult dentition. The immaturity of dentition and the digestive system, the time required for growth of the brain, and the rapid skeletal growth needed for the young to reach adult height and strength mean that children have special digestive needs and are dependent on adults for a long period of time. This time of dependence also allows time for cultural learning to occur before passage into adulthood.

    On the basis of cultural learning, people create, remember, and deal with ideas. They understand and apply specific systems of symbolic meaning. Cultures have been compared to sets of control mechanisms, plans, recipes, rules, or instructions. Cultural differences have been found in academic motivation, achievement, learning style, conformity, and compliance. Cultural learning is dependent on innovation or the ability to create new responses to the environment and the ability to communicate or imitate the behavior of others. A meme is “an idea, behavior or style that spreads from person to person within a culture.” A meme acts as a unit for carrying cultural ideas, symbols or practices, which can be transmitted from one mind to another through writing, speech, gestures, rituals, or other imitable phenomena. The term was coined by the British evolutionary biologist Richard Dawkins in The Selfish Gene (1976).

    Intercultural competence is the ability to communicate successfully with people of other cultures. In interactions with people from foreign cultures, a person who is interculturally competent understands culture-specific concepts in perception, thinking, feeling, and acting. The interculturally competent person considers earlier experiences free from prejudices, and has an interest in, and motivation towards, continued learning.

    The development of intercultural competence is mostly based on the individual’s experiences while communicating with different cultures. While interacting with people from other cultures, the individual generally faces certain obstacles, which are caused by differences in cultural understanding between the two people in question. Such experiences motivate the individual to work on skills that can help them communicate their point of view to an audience from a completely different cultural background. For example, a thumbs-up means “everything’s okay” in certain parts of the world, while in some Islamic countries it is understood as a rude sexual sign. Additionally, the thumb is held up to signify “one” in France and certain other European countries, whereas the index finger signifies “one” in other cultures. In India and Indonesia, the thumbs-up is often regarded as wishing “all the best.”

    In studies of cultural evolution, the fidelity of cultural transmission is often considered a core component of the emergence and maintenance of traditions. In this study, we address the assumptions regarding the underlying mechanisms driving transmission fidelity and present a simple model to investigate the influence of trial-and-error on these processes. We first refer to various studies that interpret fidelity as relating to imitation—learning the precise form of demonstrated actions. We suggest that the particular interpretation of imitation common to such studies does not address the role of trial-and-error processes in skill learning and its potential contribution to cultural transmission. In the next section, we focus on the inherent contradiction between the notion of fidelity as a process of exact copying and the necessity for cultural transmission to withstand different sources of variance, namely copying errors and environmental variability. We outline some of the solutions to this tension that have been posited in the literature and present a simple model in which socially mediated trial-and-error learning can lead to successful transmission, especially when copying errors and environmental variability are taken into account. Finally, we discuss the potential benefits of socially mediated trial-and-error learning and suggest that considering it as an inherent part of cultural transmission may advance our understanding of the evolution of culture.

    Social learning can lead to the emergence of cultural traditions by facilitating the spread of group-specific behavioural patterns and maintaining them in the population over the course of successive generations [1]. It is often assumed that the establishment and stability of such traditions require certain levels of copying fidelity [2–9]. The logic behind this is rather intuitive: when innovations (the invention of new behaviours or novel solutions [10]) appear within the population, faithful copying allows them to spread and persist. However, when copying is imprecise, such innovations may quickly disappear as each individual may develop its own behavioural pattern, introducing variance that can impede the formation of shared traditions. Theoretical models exploring cultural transmission dynamics suggest that the probability that an innovation will be established as a stable tradition is positively correlated with transmission fidelity [11] and that fidelity has a strong influence on the build-up of cumulative culture [7,12].

    The means by which high transmission fidelity is achieved is often attributed to imitation [2,3,5,8,9,13], emphasizing the importance of precisely copying the detailed actions of the demonstrator (reviewed in [8,14]). Imitation is often presented in striking contrast to individual learning processes, and as distinct from other mechanisms of social learning (such as emulation [15,16] and stimulus enhancement [17]), in which learners pay attention to other aspects of the task rather than to the demonstrated behaviour itself, and then fill in the gaps using trial-and-error learning [14,16]. The exact copying of demonstrated actions is assumed to bypass the potential diluting effect embedded in such mechanisms and to allow culture to be sustained over time.

    From a mechanistic point of view, the direct link between exact imitation and the fidelity of cultural transmission may not be as straightforward as it often seems. Recent accounts of imitation as an associative mechanism suggest that it develops gradually through experience [18–20]. As such, it may as well be shaped by trial-and-error learning, as attempts to replicate an observed behaviour are likely to involve deviations from it (for instance, due to memory constraints, inaccuracy in performance or even an inherent tendency for exploration). Furthermore, theories addressing the imitation of complex behaviours suggest that the same trial-and-error learning processes with which imitation is often contrasted may actually be embedded in it. Byrne & Russon [21] suggested that imitation may occur at the ‘programme level’, as individuals copy the hierarchic structure of behaviour rather than its surface form [21,22]. The extent of the similarity between observed and performed behaviours can thus vary, depending on the hierarchical level being copied; and trial-and-error learning may be important for fine-tuning the details of performance. Notably, Galef [23] suggested that imitation involves a template-matching process, in which observers create a representation of the demonstrated behaviour in their memory, and then try to match their own behaviour to this stored representation (in a process similar to song learning in birds [23,24], see also [25]). This matching phase is likely to involve trial-and-error learning, especially when the imitated behaviours are novel or complex.

    Furthermore, both empirical studies and theoretical models suggest that social learning processes involving trial-and-error learning can lead to the establishment of viable traditions [26–29]. Examples of such traditions in natural populations are accumulating: young rat pups in Israeli pine forests learn from their mothers a specific pinecone stripping technique [30]; young passerine birds copy the mating songs of adults, in a process that leads to the emergence of local dialects ([31,32], also see [33–35] for examples of vocal learning in mammals); tufted capuchins learn nut-cracking procedures socially [36]; and in various chimpanzee populations, the youngsters learn from adults how to use sticks to dig termites out of their mounds [37] or stones to crack nuts [38] (also see [39] and [40] for descriptions of a range of group-specific behaviours in chimpanzee and orangutan populations). While it is often unclear whether such instances involve imitation or emulation, in all of these documented cases individuals acquire the shared behavioural variants in a long process in which the social information is supplemented by trial-and-error learning [23].

    Similarly, human skill learning may also depend on trial-and-error. In young children, exploration play may be seen as self-generated opportunities for learning about environmental affordances [41]. As such, trial-and-error learning can be used for generative hypothesis testing, and the imitation of instrumental skills may involve variability and even innovation rather than strict and high-fidelity copying [42]. In adults, proficiency in tool manufacturing is highly dependent on the extent of previous tool-making experience [43,44] and cannot be established through observation alone as it requires deliberate practice and experimentation [45]. Thus, attempts to use or manufacture tools can also involve a prolonged exploration period, in which trial-and-error helps to guide affordance and perceptual learning [44,46]. Finally, laboratory experiments attempting to replicate cumulative cultural learning have indicated that precise imitation may not be a necessary requirement, as exposure to the end-product alone may also lead to high-fidelity copying [47] and cumulative improvements [48,49].

    Under realistic conditions, the notion of precise imitation as fundamental to cultural transmission raises additional challenges. Social learners are constantly exposed to different sources of noise and variance, which can lead to copying errors of various types. Inaccurate copying may occur due to perceptual errors and variability in demonstration quality (and visibility), as well as to differences in body size, or differences in strength and motor-coordination between observers and demonstrators, all of which may cause individuals to imprecisely perceive different aspects of the observed behaviours [50,51]. Thus, a certain level of flexibility in replication is of utmost importance in order to compensate for such errors and ensure successful transmission.

    Environmental variability is also likely to contribute to erroneous copying, because it may often lead to differences between the circumstances in which the demonstrated behaviours were observed, and those encountered during subsequent performances by social learners [52]. A young chimpanzee observing its mother as it digs termites out of the mound [37] will later face a slightly different mound or use a slightly different stick for the same purpose. It will thus be required to modify some of the details of the observed actions in order to be able to dig out the termites successfully for itself. A capuchin monkey observing a conspecific cracking a nut [36] will need to use a different stone, on a nut that differs in its size or that may be positioned at a slightly different angle, in its own attempts to crack nuts. Similarly, a hominin attempting to create a hand-axe needs to translate the observation of the construction of an axe by a fellow hominin and adjust it to the stone in hand. An ability to compensate for different sorts of perceptual errors and flexibility in applying observed behaviours to the current state are hence essential for robust and successful transmission.

    Attempts to bridge the gap between cultural stability and the effect of copying errors on transmission fidelity have linked such stability to different factors. At the population level, it has been suggested that individual variation, copying errors and low-fidelity transmission may lead to cultural stability and cumulative cultural adaptation, if they are accompanied by transmission biases and demographic factors [50,53–55]. For instance, when individuals copy the behaviour of the majority (conformity) or of the most successful individuals (prestige bias), such biases may balance the potential negative effects of inaccurate copying and, together with population size, dictate whether the culture will be preserved or lost [53,55]. At the individual level, it has been argued that culture is maintained by a process of reconstruction, in which copying errors are corrected through the use of intrinsic attractors (e.g. [56,57], a similar effect may be produced by inductive biases [58]); or, alternatively, that cultural stability is maintained through different pedagogical adaptations [59,60].

    Here, we offer a different approach and suggest that transmission fidelity is naturally achieved: not through exact copying but, rather, by compensating for copying errors. This can be realized by integrating trial-and-error into the process of social learning, which countervails the effect of inaccurate copying and leads to robust cultural transmission. We further suggest that social learning that is mediated through trial-and-error provides learners with valuable information regarding the connections between their actions and related consequences. Such information may help them to improve their performance of the specific behaviour being copied and to cope with environmental variability. We exemplify these ideas in a simple model of cultural transmission of skills that entail interaction with the physical world.

    In the following simulations, we exemplify a process in which culturally transmitted skills are spread in the population through social learning mediated by different degrees of trial-and-error. The learners in our models range from being ‘perfect imitators’—individuals that copy the demonstration precisely—to individuals that apply varying degrees of trial-and-error in their attempts to replicate the observed behaviour.

    In the simulations, social learning is modelled as a process in which an observer attempts to copy a successful demonstrator. In doing so, the observer creates a template of the demonstrated behaviour in its memory and then tries to replicate this behaviour by matching its own performance to the stored template. The observer's learning process encompasses two sources of variance: first, the template represented by the observer may be inaccurate, involving some degree of copying error; and second, the actions performed by the observer may also be inaccurate and entail deviations from the stored template, thus creating a trial-and-error range. In our simulations, we explored the interaction between these sources of variation, focusing on the process in which observers attempt to replicate the cultural trait. We allowed each observer to try repeatedly to perform the socially transmitted behaviour until it had reached a threshold criterion of successful skill acquisition. We then measured the resulting rate of spread of this skill in the population.

    The socially mediated trial-and-error learning process, through which the learners acquire the observed behaviour, is guided by feedback in the form of value to the learner. The value of the behaviour might be extrinsic—the ability to extract the nut being cracked from its shell or to successfully create a fully functioning hand-axe. However, the value may also be governed by an intrinsic motivation to copy the behavioural variant being demonstrated, for instance—due to its social value, or to an inherent motivation to copy precisely [23,61]. Note that while such a model may not suffice for retaining arbitrary cultural variants [5], under more realistic natural conditions the copying of instrumental behaviours is often goal-directed and involves the attribution of value of some sort [42,62].

    Let b represent a behaviour implementing a certain skill, and Z(b) the value of the behaviour to the performing individual, which can be attributed to the reward obtained (environmental or other, see above). We set the optimal behaviour at 0 (an arbitrarily chosen reference point), and thus, Z receives a maximum value at Z(0) and declines with the distance of b from 0. We assume that when an individual j observes demonstrator i performing b_i, it may attempt to copy it if Z(b_i) ≥ W (where W is a fixed threshold value at or above which the behaviour is considered successful, see further below). In this case, the observer j will acquire a template b_i′ for the copied behaviour (figure 1a). However, b_i′ is not a precise copy of b_i owing to a copy inaccuracy, and b_i′ ∼ N(b_i, σc). Individual j then repeatedly tries to implement b_i′ by performing b_j ∼ N(b_i′, σI). This is a trial-and-error process that ends when Z(b_j) ≥ W (i.e. j has been successful in performing b). Note that for any specific value function Z (with a constant value W, for ‘sufficiently skilled’), two parameters control the dynamics of this process: σc, representing the copy inaccuracy, and σI, representing the learner's trial-and-error range. When σc = 0, the template is completely accurate, and when σI = 0, the template is exactly reproduced.
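The template-matching process described above can be sketched for a single observer. This is a minimal illustration, not the authors' code: the unit-width Gaussian used for Z, the function names, and the attempt cap are all assumptions, since the source does not preserve the explicit value function.

```python
import math
import random

def value(b, width=1.0):
    """Assumed Gaussian value function with its peak at the optimum b = 0.
    The width is hypothetical; the source does not give it."""
    return math.exp(-(b ** 2) / (2 * width ** 2))

W = 0.95 * value(0.0)  # success threshold: 95% of the maximum value

def attempts_until_success(b_demo, sigma_c, sigma_i, rng, max_attempts=10_000):
    """Copy a demonstrated behaviour, then retry until the value reaches W.

    Returns the number of attempts needed, or None if the (hypothetical)
    attempt cap is reached first.
    """
    template = rng.gauss(b_demo, sigma_c)    # b_i' ~ N(b_i, sigma_c)
    for attempt in range(1, max_attempts + 1):
        b_j = rng.gauss(template, sigma_i)   # b_j ~ N(b_i', sigma_i)
        if value(b_j) >= W:
            return attempt
    return None
```

With sigma_i = 0 the learner is an exact imitator: it either succeeds on the first attempt or repeats the same flawed template indefinitely.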

    Figure 1. Model value function and illustrated template-matching process. (a) The value function in Models 1 and 2 is a bell curve encompassing an optimal behaviour, giving a value Z for a behaviour x [equation not preserved in the source]. The yellow area depicts a range within which the performance of the behaviour is considered a successful skill performance. Vertical lines illustrate a process of social learning: red: demonstrator's actual performance; green: observer's copied template; grey (dotted): observer's trial-and-error attempts; grey (dashed): observer's successful attempt. (b) Illustration of the value function for Model 3. The value is a function of both the behaviour and the encountered environment, giving a value Z for a behaviour x and environment E [equation not preserved in the source].

    As we are interested in the interplay between σc and σI, we set the value function as a Gaussian [equation not preserved in the source], and set W as 95% of max(Z) (figure 1a). We consider a population of 100 individuals and inspect the spread of a single innovation in that population. In each iteration, each uninformed individual (observer) encounters a randomly selected peer (demonstrator). If the peer is a successful demonstrator (i.e. obtained the value of at least W in the previous iteration), the observer will copy the demonstrator's behaviour, obtaining a template for future implementation. In each iteration, each informed individual, which has already obtained a template, attempts to implement its acquired template. If its last attempt has been successful (rewarded with at least W), it will replicate its previous behaviour (by repeating its previous offset from the template: this constitutes a precise replication, in Models 1 and 2, and will depend on the current environment in Model 3; see below). Otherwise, it will try to replicate its template with a possible error, determined by σI. This trial-and-error process involves stochastic attempts, but may also incorporate learning (in Models 2 and 3), as the observer gradually updates its template according to the reward obtained by its own actions. We measure the assimilation time of the skill: the time until at least 95% of the individuals have demonstrated a skilful behaviour (i.e. were rewarded with at least W for their behaviour) and inspect various combinations of σc and σI.
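The population loop described above can be sketched as follows, for the purely stochastic case. It is a simplified reading of the text rather than the published simulation: the unit-width Gaussian, the seeding of the innovation in individual 0, the synchronous update, and the 500-iteration cap (taken from the Figure 2 caption) are assumptions or simplifications.

```python
import math
import random

def value(b, width=1.0):
    """Assumed Gaussian value function with its peak at b = 0."""
    return math.exp(-(b ** 2) / (2 * width ** 2))

def assimilation_time(sigma_c, sigma_i, n=100, max_iters=500, seed=0):
    """Iterations until at least 95% of n individuals perform successfully.

    Returns None if the threshold is not reached within max_iters.
    """
    rng = random.Random(seed)
    w = 0.95 * value(0.0)
    template = [None] * n    # None marks an uninformed individual
    behaviour = [None] * n   # last performed behaviour
    success = [False] * n    # whether the last attempt reached W
    # Seed the innovation: individual 0 performs the optimal behaviour.
    template[0], behaviour[0], success[0] = 0.0, 0.0, True

    for t in range(1, max_iters + 1):
        # Snapshot the previous iteration so all updates are synchronous.
        prev_template = template[:]
        prev_behaviour = behaviour[:]
        prev_success = success[:]
        for j in range(n):
            if prev_template[j] is None:
                i = rng.randrange(n)  # encounter a random peer
                if prev_success[i]:
                    # Inaccurate copy of the demonstrated behaviour.
                    template[j] = rng.gauss(prev_behaviour[i], sigma_c)
            elif not prev_success[j]:
                # Trial-and-error attempt around the stored template.
                behaviour[j] = rng.gauss(prev_template[j], sigma_i)
                success[j] = value(behaviour[j]) >= w
            # Successful individuals simply repeat their behaviour.
        if sum(success) >= 0.95 * n:
            return t
    return None
```

Calling `assimilation_time(sigma_c, sigma_i)` over a grid of the two parameters gives the kind of map the text refers to, although the numbers depend on the assumed value function.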

    We outline three levels of complexity:

    We first simulate the model as described above, where individuals learn socially through inaccurate copying and then apply trial-and-error in a stochastic manner (depending on the value of σI). Repeated attempts are conducted until the learner manages to perform the observed behaviour successfully. Note that in this model, the observer does not learn from its own attempts, but simply continues to try to replicate the socially acquired template, until it reaches the threshold criterion. This model can exemplify, for instance, a young capuchin trying to copy a skilful adult cracking a nut. The template the youngster acquires might represent the strength of the strike, or some other relevant behavioural feature, and could be inaccurate, depending on the value of σc. This inaccuracy (or copying error) can be caused by differences in viewing angle, size appreciation and physical differences between the two capuchins, or the size of the stone being used, etc. The young capuchin then repeatedly attempts to grab a nut and crack it, attempting to reproduce the strike strength represented by the template it has acquired. The similarity between the strength of its strikes and its stored template varies stochastically, depending on the value of σI.

    In the second version of the model, we consider the more realistic possibility that individuals learn during the process of trial-and-error and can thus gradually improve and direct their attempts based on previous experience. Following a short initial sampling period (of 10 attempts), the learners in this model begin to update their template constantly to the mean of their most successful attempts so far (in the simulations: the top-rewarded 25% attempts). If we return to our capuchin example, the learner now gradually adjusts the strength of its strikes according to its most productive attempts. This might be a more realistic scenario than that proposed in Model 1, as individuals constantly learn about the consequences of their actions on the environment.
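The template-updating rule of this second version can be sketched as below. The 10-attempt sampling period and the top 25% rule come from the text; the unit-width Gaussian, the function names, and the attempt cap are assumptions made for illustration.

```python
import math
import random

def value(b, width=1.0):
    """Assumed Gaussian value function with its peak at b = 0."""
    return math.exp(-(b ** 2) / (2 * width ** 2))

def updated_template(attempts, current_template, warmup=10, top_fraction=0.25):
    """Mean of the top-rewarded attempts once the sampling period is over."""
    if len(attempts) < warmup:
        return current_template
    ranked = sorted(attempts, key=value, reverse=True)
    best = ranked[:max(1, int(len(ranked) * top_fraction))]
    return sum(best) / len(best)

def learn(template, sigma_i, rng, max_attempts=1000):
    """Trial-and-error with template updating.

    Returns (number of attempts, final template); the count is None if the
    hypothetical attempt cap is reached without success.
    """
    w = 0.95 * value(0.0)
    attempts = []
    for n in range(1, max_attempts + 1):
        b = rng.gauss(template, sigma_i)
        if value(b) >= w:
            return n, template
        attempts.append(b)
        template = updated_template(attempts, template)
    return None, template
```

Because the template drifts towards the learner's own best attempts, a smaller trial-and-error range can still close the gap left by an inaccurate copy.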

    Note that while this ‘updating’ method may work well in a stable environment in which the task the learner faces is always identical, it may be less efficient when the environment is variable. As ‘no man ever steps into the same river twice’ (Heraclitus of Ephesus), the learner may encounter a slightly different version of the task every time it tries to perform it. In the third, following, version of the model, we sought to test the learners' achievements under such varying environmental conditions.

    In the third model, we consider the effect of environmental variability: in each attempt the learner encounters a slightly different variant of the task and needs to adjust its behaviour accordingly. In its attempts to perform the task, the learner can now gain information regarding the relationship between its own behaviour, the relevant environmental features towards which the behaviour is being applied and the consequent reward. Learning about such relationships may help it to direct its future attempts more efficiently.

    In this model, the characteristic value function is a two-dimensional function (Z(b, E)) that is also dependent on a varying environmental factor E that is normally distributed (figure 1b; the distribution of E is the same in all simulations). Thus, in each attempt, the learner encounters a slightly different environmental feature. For the sake of simplicity, we model the relationship between the behaviour and its consequences as a linear relationship between the peak of the value function and the environmental factor E. Similarly to Model 2, in its attempts to perform the task the learner can update its template according to its accumulating experience. However, in this model, the template of the behaviour is a linear function of the environmental factor. Thus, rather than simply averaging its top-rewarded attempts, the learner performs a linear regression on these attempts (to estimate the slope a of this linear relation [equation not preserved in the source]), and uses the relation between the behaviour and the environment in order to update the template. Note that when the environment is constant, Model 3 reduces to Model 2, i.e. ∀b Z(b, E) = Z(b, 0). Returning to our hypothetical example, as the young capuchin attempts to crack nuts, it may encounter nuts of different sizes. Now, rather than searching for the optimal strike strength, the capuchin should fit each strike to the specific nut at hand (E now represents some relevant feature of the nut, such as its size). In its attempts to solve the task, it may learn that the bigger the nut, the stronger the effective strike strength must be.
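The regression step of this third version can be illustrated with a small sketch. The explicit formulas are not preserved in the source, so everything specific here is an assumption: a Gaussian whose peak sits at `true_slope * env`, a template of the form b = a·E + c, and the helper names.

```python
import math

def value(b, env, true_slope=1.0, width=1.0):
    """Assumed two-dimensional value function: Gaussian in b, with the
    peak located at true_slope * env (hypothetical parameters)."""
    return math.exp(-((b - true_slope * env) ** 2) / (2 * width ** 2))

def top_rewarded(history, fraction=0.25):
    """Keep the best-rewarded (env, behaviour) attempts, at least two."""
    ranked = sorted(history, key=lambda p: value(p[1], p[0]), reverse=True)
    return ranked[:max(2, int(len(ranked) * fraction))]

def fit_line(points):
    """Ordinary least squares for b = a * E + c over (E, b) points.

    With a constant environment the slope cannot be estimated, so the
    fit falls back to the mean behaviour, as in the Model 2 update.
    """
    n = len(points)
    mean_e = sum(e for e, _ in points) / n
    mean_b = sum(b for _, b in points) / n
    var_e = sum((e - mean_e) ** 2 for e, _ in points)
    if var_e == 0:
        return 0.0, mean_b
    cov = sum((e - mean_e) * (b - mean_b) for e, b in points)
    a = cov / var_e
    return a, mean_b - a * mean_e

def predict(a, c, env):
    """Template behaviour for the environment currently encountered."""
    return a * env + c
```

In the capuchin example, `env` would stand for nut size and the fitted slope for how much harder to strike as nuts get bigger.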

    Finally, we used Model 3 to simulate a situation in which, following an initial learning period, the variance of the environmental factor sharply increases. This is analogous to a situation in which the young capuchin learns the nut-cracking skill in a specific location (such as in the vicinity of a limited variety of trees), or at a specific time of the year, and then encounters a greater variety of relevant trees or experiences seasonal changes.

    The simulations demonstrate how under even mild copying inaccuracy, a certain degree of trial-and-error is necessary for the spread of a skill in the population. In Models 1 and 2, ‘exact imitators’ (learners whose σI = 0) fail to assimilate the seeded innovation, unless the copying error is exceptionally small (figure 2a,b). In Model 3, exact imitation does not lead to the assimilation of the culture, regardless of the learners' copying error (figure 2c,d). Nevertheless, large trial-and-error deviations also impede the assimilation process.


    Figure 2. The effect of copying inaccuracy and trial-and-error on assimilation rates of the seeded innovation. Assimilation time (number of iterations until 95% of the population acquired the skill) for different combinations of σI (trial-and-error range) and σc (copy inaccuracy) in simulations of (a) Model 1: stochastic trial-and-error; (b) Model 2: trial-and-error with template updating and (c) Model 3: trial-and-error learning in a variable environment. Colours indicate the assimilation time. The white area depicts simulations in which the innovation did not reach the assimilation threshold after 500 iterations. The small circles and their respective regression lines present the fastest assimilation for each value of σc. (d) Assimilation time for different values of σI when σc = 0 in Model 3. Note that, in this model, some degree of trial-and-error produces the most efficient cultural transmission, even without any copying error.

    Intuitively, in all three models, the highest acquisition rates are encountered when σc is correlated with σI (figure 2; regression lines). When σI is too small, the learners fail to correct their copying errors. When σI is too large, the attempts to reproduce the copied behaviour are noisy, reducing the chances of reaching a successful performance. In Model 1, where trial-and-error is applied in a stochastic manner, the highest rate is achieved, not surprisingly, when σI = σc.
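The interplay between the two noise terms can be illustrated with a minimal sketch. This is not the authors' Model 1: the one-dimensional behaviour, the Gaussian noise, the `tolerance` success window and the fixed number of `attempts` are all assumptions made for illustration. A learner copies the target with error σc (`sigma_c`) and then repeatedly deviates from that fixed template with spread σI (`sigma_i`).

```python
import random


def acquires_skill(sigma_c, sigma_i, rng, target=0.0, tolerance=0.1, attempts=50):
    """Return True if a learner hits the target within the allotted attempts.

    All parameters other than sigma_c and sigma_i are hypothetical choices.
    """
    # Copying step: the template is a noisy copy of the demonstrated behaviour.
    template = target + rng.gauss(0.0, sigma_c)
    for _ in range(attempts):
        # Trial-and-error step: each attempt deviates from the fixed template.
        attempt = template + rng.gauss(0.0, sigma_i)
        if abs(attempt - target) <= tolerance:
            return True
    return False


def success_rate(sigma_c, sigma_i, learners=2000, seed=0):
    """Fraction of independent learners that acquire the skill."""
    rng = random.Random(seed)
    hits = sum(acquires_skill(sigma_c, sigma_i, rng) for _ in range(learners))
    return hits / learners


if __name__ == "__main__":
    for sigma_i in (0.0, 0.5, 5.0):
        print(sigma_i, success_rate(sigma_c=0.5, sigma_i=sigma_i))
```

Under these assumptions, exact imitators (σI = 0) succeed only when the copy itself happens to land within the tolerance, whereas a very large σI scatters most attempts far from the target.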

    In Model 2, when individuals incorporate learning from their attempts to replicate the desired behaviour, trial-and-error enables faster skill acquisition than in Model 1 (33% shorter assimilation time on average in the simulations) and within a larger range of σI (figure 2b). Specifically, in this scenario, each learner tries to replicate its most beneficial actions, which leads to a constant improvement in its performance. This gradually reduces the gap between the template and the target behaviour, allowing lower values of σI to lead to efficient acquisition. In this learning process, an overly widespread sample (high σI values) encapsulates relatively little information as it includes many unrewarded behaviours, while too small σI values slow down the template updating process. Overall, trial-and-error, in this scenario, provides the learner with a spread-out sample of the possible behavioural variants and their resulting rewards, allowing a rapid convergence towards a successful exhibition of the learned skill.
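Template updating can be sketched in the same spirit. The code below is an illustrative assumption rather than the authors' Model 2: it supposes that the learner perceives a graded reward that falls with distance from the target, and that it moves its template to the best attempt made so far. The names `attempts_to_success`, `fraction_successful`, `tolerance` and `max_attempts` are hypothetical.

```python
import random


def attempts_to_success(sigma_c, sigma_i, rng, update_template,
                        target=0.0, tolerance=0.1, max_attempts=500):
    """Number of attempts until success, or None if the learner never succeeds."""
    # Noisy copy of the demonstrated behaviour.
    template = target + rng.gauss(0.0, sigma_c)
    # Assumed graded reward: a smaller distance to the target is more rewarding.
    best_error = abs(template - target)
    for n in range(1, max_attempts + 1):
        attempt = template + rng.gauss(0.0, sigma_i)
        error = abs(attempt - target)
        if error <= tolerance:
            return n
        if update_template and error < best_error:
            # Replicate the most beneficial action so far (template updating).
            template, best_error = attempt, error
    return None


def fraction_successful(sigma_c, sigma_i, update_template, learners=1000, seed=1):
    """Fraction of learners that succeed within max_attempts."""
    rng = random.Random(seed)
    results = [attempts_to_success(sigma_c, sigma_i, rng, update_template)
               for _ in range(learners)]
    return sum(r is not None for r in results) / learners


if __name__ == "__main__":
    print(fraction_successful(1.0, 0.1, update_template=False))
    print(fraction_successful(1.0, 0.1, update_template=True))
```

In this sketch, a small σI that cannot bridge a large copying error on its own becomes sufficient once the template is allowed to drift towards better-rewarded attempts.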

    Model 3 demonstrates the power of trial-and-error learning to uncover hidden connections between environmental factors and successful behaviours. When the environment is variable, the learner has an opportunity to relate the outcome of its behaviour to environmental features. In such circumstances, the efficiency of a specific trial-and-error variance (σI) depends more on environmental variability (which we held constant across all simulations) than on the extent of copy inaccuracy (σc), and the effect of the copying error is reduced (figures 2c and 3a). Even when the template is accurately copied (σc = 0), some degree of trial-and-error is necessary for the culture to be assimilated (figure 2d).

    In this scenario, a spread-out sample of the interactions between the behaviour and the environment facilitates a better estimation of their relationship. Note that when the learner does not take into account the environmental variability (as it did in Models 1 and 2), its learning will be inefficient and the skill will not be assimilated into the population (figure 3c).
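Why a spread-out sample helps can be shown with a small sketch. It is not the authors' Model 3: the linear dependence of the successful behaviour on the environment (`true_slope`), the uniform environmental factor, the `tolerance` window and the least-squares fit are all assumptions chosen for illustration. The learner copies the template exactly (σc = 0) and then fits a line to the attempts that happened to succeed.

```python
import random


def estimated_slope(sigma_i, trials=2000, seed=2, true_slope=2.0, tolerance=0.1):
    """Least-squares slope of behaviour on environment over successful attempts.

    Returns None if there are too few successes to fit a line.
    """
    rng = random.Random(seed)
    # Behaviour copied exactly (sigma_c = 0) from a demonstrator at environment = 1.
    template = true_slope * 1.0
    successes = []
    for _ in range(trials):
        env = rng.uniform(0.5, 1.5)  # the environmental factor varies between trials
        attempt = template + rng.gauss(0.0, sigma_i)
        # Assumed rule: the behaviour succeeds only if it matches the environment.
        if abs(attempt - true_slope * env) <= tolerance:
            successes.append((env, attempt))
    if len(successes) < 2:
        return None
    mean_env = sum(e for e, _ in successes) / len(successes)
    mean_beh = sum(b for _, b in successes) / len(successes)
    cov = sum((e - mean_env) * (b - mean_beh) for e, b in successes)
    var = sum((e - mean_env) ** 2 for e, _ in successes)
    return cov / var if var > 0 else None


if __name__ == "__main__":
    print("exact imitator:", estimated_slope(sigma_i=0.0))
    print("trial-and-error learner:", estimated_slope(sigma_i=0.5))
```

In this sketch, an exact imitator only ever observes one behaviour, so its successes carry no information about how the behaviour should change with the environment, while a learner with some trial-and-error variance collects successes along the whole relationship.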

    Figure 3. Simulations of Model 3 under different conditions. (a) Examples of four simulations of Model 3, with different combinations of σI and σc. (b) The effect of changes in environmental variance: examples of three simulations of Model 3, with different combinations of σI and σc, where the variance of the environmental factor increased sharply in iteration no. 250. Note how the top line shows a recovering population, the middle one a partially recovering population, and the lowest line a collapse of the cultural trait. (c) Simulations of Model 3 with different types of learners. Blue: learners that consider environmental variability (as defined in Model 3); red: learners that update their template as in Model 2 (and ignore environmental variability); yellow: a stochastic trial-and-error observer (as in Model 1).

    Populations with different σI levels react differently to a sudden increase in environmental variability (figure 3b). In populations with low levels of trial-and-error variance (σI), most skilful individuals lose their skills owing to the low quality of their constructed templates (i.e. they fail to generalize the relationship between the behaviour and the environment). By contrast, in populations with high σI values, the skilful individuals quickly adapt to the newly encountered environments. This is due to a combination of a higher-quality template (i.e. a better estimation of the underlying environment–behaviour interactions) and the flexibility inherent to their tendency to deviate from the template.

    The outlined models illustrate that trial-and-error can promote robust cultural transmission of seeded innovations as it helps learners to correct copying inaccuracies. When faced with such inaccuracies, certain levels of trial-and-error increase the rate at which the innovation spreads within the population, while ‘exact imitators’ fail to assimilate the observed behaviour. Our models further illustrate that trial-and-error enables individuals to gain information regarding the structure of the cultural trait and can help learners to cope better with environmental variability.

    In these models, one of the main contributors to the failure to assimilate the learned behaviours is the difficulty of compensating for reduced template quality. In nature, the construction of a poor-quality template may result from a range of potential perceptual errors and differences between observers and demonstrators (e.g. physical attributes or abilities, as noted above). However, the quality of the template is also affected by different cognitive capacities, which may substantially vary between species [63]. Animals may differ in their motivation to copy the behaviour of others, and in the extent to which they pay attention, or give weight, to relevant social stimuli [64,65]. Such input mechanisms will dictate which aspects of the observed behaviour will most attract the learner's attention, perhaps causing its representation of this behaviour to be incomplete. For instance, the learner might only note the body movements of the demonstrator, the manipulated objects, other elements of the behaviour or any combination of these possibilities (e.g. [16], also see [25]). Finally, the quality of the template is likely to be affected by memory constraints, which may also differ among species [64,65]. As noted above, when trial-and-error learning is incorporated into the social learning process, it can mitigate the effect of copying inaccuracies, at least to some extent. Notably, when copying errors are constantly balanced by trial-and-error, stable cultures may arise even in the absence of social biases such as conformity or prestige-based copying (e.g. [50,53]).

    Alongside the compensation for copying inaccuracies, socially mediated trial-and-error can provide learners with additional advantages. First, this process yields a spread-out sample of the possible behavioural variants and their consequences (as illustrated in Model 2), and can enhance the understanding of object affordances and states and their interaction with bodily movements (e.g. [41,66]). Second, having a diverse sample can enable the learner to evaluate the covariance between environmental factors and successful actions. Realization of such dependencies facilitates generalization, and coping with environmental variability and instability (as illustrated in Model 3), and allows individuals to adjust their behaviour to changes in circumstances (e.g. [67]). This may be even more pronounced when the behaviour and the environment are multidimensional, and the covariance may encompass multiple dependencies. Furthermore, trial-and-error may also enable deliberate experimentation [41,45] and provide information about the conditions that involve only partial success, or even failure, in performance. Finally, social learning itself can be more effective when the information is gained through self-experience [68].

    Trial-and-error may be especially important when social learning opportunities are confined in time or space. In many species, the period in which social learning of new skills is particularly likely to occur is during the early life stages, when young individuals follow their knowledgeable parents (or other adults). At this stage, they are often exposed to social information that may not be available later on in life (for instance, owing to reduced social tolerance, solitary life stages, etc.; e.g. [30,37,69–71]). These young individuals may also only be exposed to a limited part of the environment, for instance owing to their development during a specific temporal season or within a confined home range. Thus, the environment they experience when opportunities for social learning abound may not encompass the whole range of variability they are likely to encounter later on in life. Social learning that is mediated through trial-and-error may be especially beneficial in such cases, as it can be more robust and expand the learners' acquaintance with relevant environmental features. Such learning can thus be perceived as part of the general tendency of young individuals (including human children) to explore and to engage in play behaviour [72].

    Note that the extent of trial-and-error may not be fixed and may vary between contexts. In human children, the fidelity of imitation has been shown to decrease when socially acquired behaviours are presented in instrumental rather than conventional contexts [42,73,74], to change with age [74,75] and to vary with demonstration efficacy [75]. Naturally, an excessive degree of trial-and-error is likely to diminish the influence of social information and impede the process of cultural transmission.

    Furthermore, trial-and-error processes are often assumed to incur energetic costs and may lead individual learners to less profitable outcomes or to potentially dangerous exploration [2,76]. Yet, socially mediated trial-and-error learning can bypass some of these costs, when the template is not too far from the observed behaviour, and the range of attempts is restricted. Such learning promotes exploration but limits it to a specific part of the environment, which helps learners to avoid dangerous situations and refrain from futile behaviours (e.g. [77]).

    Finally, accounts of social learning as a process involving trial-and-error may also be important for our understanding of the evolution of cumulative culture. In many of the current models of cumulative culture, advancement appears either through copying errors combined with social learning biases (e.g. [50,53]) or through processes of innovation, modification and the combination of traits (e.g. [7,78–80]). Trial-and-error can facilitate a rich representation of the world and an ability to generalize among contexts, essential for creative processes [81]. As such, it can lead to both accidental innovations and creative modifications, and contribute to the gradual improvement of cumulative culture.

    Data deposited in Dryad: http://dx.doi.org/10.5061/dryad.4m518 [82].

    Both authors contributed equally to this work.

    The authors have no competing interests.

    We received no funding for this study.

    We thank Arnon Lotem, Oren Kolodny, Na'ama Aljadeff and Asaf Moran for useful comments and fruitful discussions.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    References

    • 1

      Laland KN, Hoppitt W. 2003 Do animals have culture? Evol. Anthropol. 12, 150–159. (doi:10.1002/evan.10111)

    • 2

      Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.

    • 3

      Tomasello M, Kruger AC, Ratner HH. 1993 Cultural learning. Behav. Brain Sci. 16, 495–511. (doi:10.1017/S0140525X0003123X)

    • 4

      Whiten A, McGuigan N, Marshall-Pescini S, Hopper LM. 2009 Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee. Phil. Trans. R. Soc. B 364, 2417–2428. (doi:10.1098/rstb.2009.0069)

    • 5

      Claidiere N, Sperber D. 2010 Imitation explains the propagation, not the stability of animal culture. Proc. R. Soc. B 277, 651–659. (doi:10.1098/rspb.2009.1615)

    • 6

      Mesoudi A. 2011 Cultural evolution: how Darwinian theory can explain human culture and synthesize the social sciences. Chicago, IL: University of Chicago Press.

    • 7

      Lewis HM, Laland KN. 2012 Transmission fidelity is the key to the build-up of cumulative culture. Phil. Trans. R. Soc. B 367, 2171–2180. (doi:10.1098/rstb.2012.0119)

    • 8

      Dean LG, Vale GL, Laland KN, Flynn E, Kendal RL. 2014 Human cumulative culture: a comparative perspective. Biol. Rev. 89, 284–301. (doi:10.1111/brv.12053)

    • 9

      Fridland ER. In press. Do as I say and as I do: imitation, pedagogy and cumulative culture. Mind. Lang.

    • 11

      Enquist M, Strimling P, Eriksson K, Laland K, Sjostrand J. 2010 One cultural parent makes no culture. Anim. Behav. 79, 1353–1362. (doi:10.1016/j.anbehav.2010.03.009)

    • 12

      Kempe M, Lycett SJ, Mesoudi A. 2014 From cultural traditions to cumulative culture: parameterizing the differences between human and nonhuman culture. J. Theor. Biol. 359, 29–36. (doi:10.1016/j.jtbi.2014.05.046)

    • 13

      Tennie C, Call J, Tomasello M. 2009 Ratcheting up the ratchet: on the evolution of cumulative culture. Phil. Trans. R. Soc. B 364, 2405–2415. (doi:10.1098/rstb.2009.0052)

    • 14

      Zentall TR. 2001 Imitation in animals: evidence, function, and mechanisms. Cybernet. Syst. 32, 53–96. (doi:10.1080/019697201300001812)

    • 15

      Custance D, Whiten A, Fredman T. 1999 Social learning of an artificial fruit task in capuchin monkeys (Cebus apella). J. Comp. Psychol. 113, 13–23. (doi:10.1037/0735-7036.113.1.13)

    • 16

      Hoppitt W, Laland KN. 2008 Social processes influencing learning in animals: a review of the evidence. Adv. Stud. Behav. 38, 105–165. (doi:10.1016/s0065-3454(08)00003-x)

    • 17

      Heyes CM. 1994 Social-learning in animals: categories and mechanisms. Biol. Rev. Camb. Phil. Soc. 69, 207–231. (doi:10.1111/j.1469-185X.1994.tb01506.x)

    • 18

      Heyes CM, Ray ED. 2000 What is the significance of imitation in animals? Adv. Stud. Behav. 29, 215–245. (doi:10.1016/S0065-3454(08)60106-0)

    • 19

      Cook R, Bird G, Catmur C, Press C, Heyes C. 2014 Mirror neurons: from origin to function. Behav. Brain Sci. 37, 177–192. (doi:10.1017/S0140525X13000903)

    • 20

      Oostenbroek J, Suddendorf T, Nielsen M, Redshaw J, Kennedy-Costantini S, Davis J, Clark S, Slaughter V. 2016 Comprehensive longitudinal study challenges the existence of neonatal imitation in humans. Curr. Biol. 26, 1334–1338. (doi:10.1016/j.cub.2016.03.047)

    • 21

      Byrne RW, Russon AE. 1998 Learning by imitation: a hierarchical approach. Behav. Brain Sci. 21, 667–684. (doi:10.1017/s0140525X98001745)

    • 22

      Byrne RW. 1999 Imitation without intentionality. Using string parsing to copy the organization of behaviour. Anim. Cogn. 2, 63–72. (doi:10.1007/s100710050025)

    • 23

      Galef BG. 2015 Laboratory studies of imitation/field studies of tradition: towards a synthesis in animal social learning. Behav. Processes 112, 114–119. (doi:10.1016/j.beproc.2014.07.008)

    • 24

      Galef BG. 2013 Imitation and local enhancement: detrimental effects of consensus definitions on analyses of social learning in animals. Behav. Processes 100, 123–130. (doi:10.1016/j.beproc.2013.07.026)

    • 25

      Truskanov N, Lotem A. 2017 Trial-and-error copying of demonstrated actions reveals how fledglings learn to ‘imitate’ their mothers. Proc. R. Soc. B 284, 2744. (doi:10.1098/rspb.2016.2744)

    • 26

      Franz M, Matthews LJ. 2010 Social enhancement can create adaptive, arbitrary and maladaptive cultural traditions. Proc. R. Soc. B 277, 3363–3372. (doi:10.1098/rspb.2010.0705)

    • 27

      Matthews LJ, Paukner A, Suomi SJ. 2010 Can traditions emerge from the interaction of stimulus enhancement and reinforcement learning? An experimental model. Am. Anthropol. 112, 257–269. (doi:10.1111/j.1548-1433.2010.01224.x)

    • 28

      Alem S, Perry CJ, Zhu XF, Loukola OJ, Ingraham T, Sovik E, Chittka L. 2016 Associative mechanisms allow for social learning and cultural transmission of string pulling in an insect. PLoS Biol. 14, 28. (doi:10.1371/journal.pbio.1002564)

    • 29

      van der Post DJ, Franz M, Laland KN. 2017 The evolution of social learning mechanisms and cultural phenomena in group foragers. BMC Evol. Biol. 17, 49. (doi:10.1186/s12862-017-0889-z)

    • 30

      Aisner R, Terkel J. 1992 Ontogeny of pine-cone opening behavior in the black rat, Rattus rattus. Anim. Behav. 44, 327–336. (doi:10.1016/0003-3472(92)90038-B)

    • 31

      Marler P, Tamura M. 1964 Culturally transmitted patterns of vocal behavior in sparrows. Science 146, 1483–1486. (doi:10.1126/science.146.3650.1483)

    • 32

      Catchpole CK, Slater PJ. 2003 Bird song: biological themes and variations. Cambridge, UK: Cambridge University Press.

    • 33

      Tyack PL. 2008 Convergence of calls as animals form social bonds, active compensation for noisy communication channels, and the evolution of vocal learning in mammals. J. Comp. Psychol. 122, 319. (doi:10.1037/a0013087)

    • 34

      Garland EC, Goldizen AW, Rekdahl ML, Constantine R, Garrigue C, Hauser ND, Poole MM, Robbins J, Noad MJ. 2011 Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Curr. Biol. 21, 687–691. (doi:10.1016/j.cub.2011.03.019)

    • 35

      Prat Y, Azoulay L, Dor R, Yovel Y. 2017 Crowd vocal learning induces vocal dialects in bats: playback of conspecifics shapes fundamental frequency usage by pups. PLoS Biol. 15, e2002556. (doi:10.1371/journal.pbio.2002556)

    • 36

      Coelho CG, Falotico T, Izar P, Mannu M, Resende BD, Siqueira JO, Ottoni EB. 2015 Social learning strategies for nut-cracking by tufted capuchin monkeys (Sapajus spp.). Anim. Cogn. 18, 911–919. (doi:10.1007/s10071-015-0861-5)

    • 37

      Lonsdorf EV. 2006 What is the role of mothers in the acquisition of termite-fishing behaviors in wild chimpanzees (Pan troglodytes schweinfurthii)? Anim. Cogn. 9, 36–46. (doi:10.1007/s10071-005-0002-7)

    • 38

      Biro D, Inoue-Nakamura N, Tonooka R, Yamakoshi G, Sousa C, Matsuzawa T. 2003 Cultural innovation and transmission of tool use in wild chimpanzees: evidence from field experiments. Anim. Cogn. 6, 213–223. (doi:10.1007/s10071-003-0183-x)

    • 39

      Whiten A, Goodall J, McGrew WC, Nishida T, Reynolds V, Sugiyama Y, Tutin CEG, Wrangham RW, Boesch C. 1999 Cultures in chimpanzees. Nature 399, 682–685. (doi:10.1038/21415)

    • 40

      van Schaik CP, Ancrenaz M, Borgen G, Galdikas B, Knott CD, Singleton I, Suzuki A, Utami SS, Merrill M. 2003 Orangutan cultures and the evolution of material culture. Science 299, 102–105. (doi:10.1126/science.1078004)

    • 41

      Lockman JJ. 2000 A perception–action perspective on tool use development. Child Dev. 71, 137–144. (doi:10.1111/1467-8624.00127)

    • 42

      Legare CH, Nielsen M. 2015 Imitation and innovation: the dual engines of cultural learning. Trends Cogn. Sci. 19, 688–699. (doi:10.1016/j.tics.2015.08.005)

    • 43

      Geribàs N, Mosquera M, Vergès JM. 2010 What novice knappers have to learn to become expert stone toolmakers. J. Archaeol. Sci. 37, 2857–2870. (doi:10.1016/j.jas.2010.06.026)

    • 44

      Nonaka T, Bril B, Rein R. 2010 How do stone knappers predict and control the outcome of flaking? Implications for understanding early stone tool technology. J. Hum. Evol. 59, 155–167. (doi:10.1016/j.jhevol.2010.04.006)

    • 45

      Stout D. 2011 Stone toolmaking and the evolution of human culture and cognition. Phil. Trans. R. Soc. B 366, 1050–1059. (doi:10.1098/rstb.2010.0369)

    • 46

      Stout D, Bril B, Roux V, DeBeaune S, Gowlett J, Keller C, Wynn T, Stout D. 2002 Skill and cognition in stone tool production: an ethnographic case study from Irian Jaya. Curr. Anthropol. 43, 693–722. (doi:10.1086/342638)

    • 47

      Caldwell CA, Schillinger K, Evans CL, Hopper LM. 2012 End state copying by humans (Homo sapiens): implications for a comparative perspective on cumulative culture. J. Comp. Psychol. 126, 161–169. (doi:10.1037/a0026828)

    • 48

      Caldwell CA, Millen AE. 2009 Social learning mechanisms and cumulative cultural evolution: is imitation necessary? Psychol. Sci. 20, 1478–1483. (doi:10.1111/j.1467-9280.2009.02469.x)

    • 49

      Zwirner E, Thornton A. 2015 Cognitive requirements of cumulative culture: teaching is useful but not essential. Sci. Rep. 5, e16781. (doi:10.1038/srep16781)

    • 50

      Henrich J, Boyd R. 2002 On modeling cognition and culture: why cultural evolution does not require replication of representations. J. Cogn. Cult. 2, 87–112. (doi:10.1163/156853702320281836)

    • 51

      Eerkens JW, Lipo CP. 2005 Cultural transmission, copying errors, and the generation of variation in material culture and the archaeological record. J. Anthropol. Archaeol. 24, 316–334. (doi:10.1016/j.jaa.2005.08.001)

    • 52

      Lotem A, Halpern JY, Edelman S, Kolodny O. 2017 The evolution of cognitive mechanisms in response to cultural innovations. Proc. Natl Acad. Sci. USA 114, 7915–7922. (doi:10.1073/pnas.1620742114)

    • 53

      Henrich J. 2004 Demography and cultural evolution: How adaptive cultural processes can produce maladaptive losses: the Tasmanian case. Am. Antiquity 69, 197–214. (doi:10.2307/4128416)

    • 54

      Powell A, Shennan S, Thomas MG. 2009 Late Pleistocene demography and the appearance of modern human behavior. Science 324, 1298–1301. (doi:10.1126/science.1170165)

    • 55

      Derex M, Beugin M-P, Godelle B, Raymond M. 2013 Experimental evidence for the influence of group size on cultural complexity. Nature 503, 389. (doi:10.1038/nature12774)

    • 56

      Sperber D. 1996 Explaining culture: a naturalistic approach. Oxford, UK: Blackwell.

    • 57

      Claidière N, Sperber D. 2007 The role of attraction in cultural evolution. J. Cogn. Cult. 7, 89–111. (doi:10.1163/156853707X171829)

    • 58

      Griffiths TL, Kalish ML, Lewandowsky S. 2008 Theoretical and empirical evidence for the impact of inductive biases on cultural evolution. Phil. Trans. R. Soc. B 363, 3503–3514. (doi:10.1098/rstb.2008.0146)

    • 59

      Castro L, Toro MA. 2004 The evolution of culture: from primate social learning to human culture. Proc. Natl Acad. Sci. USA 101, 10 235–10 240. (doi:10.1073/pnas.0400156101)

    • 60

      Andersson C. 2013 Fidelity and the emergence of stable and cumulative sociotechnical systems. PaleoAnthropology 2013, 88–103. (doi:10.4207/PA.2013.ART81)

    • 61

      Gadagkar V, Puzerey PA, Chen R, Baird-Daniel E, Farhang AR, Goldberg JH. 2016 Dopamine neurons encode performance error in singing birds. Science 354, 1278–1282. (doi:10.1126/science.aah6837)

    • 62

      Galef BG. 1995 Why behavior patterns that animals learn socially are locally adaptive. Anim. Behav. 49, 1325–1334. (doi:10.1006/anbe.1995.0164)

    • 63

      Shettleworth SJ. 2010 Cognition, evolution, and behavior, 2nd edn. Oxford, UK: Oxford University Press.

    • 64

      Heyes C. 2012 What's social about social learning? J. Comp. Psychol. 126, 193–202. (doi:10.1037/a0025180)

    • 65

      Lotem A, Halpern JY. 2012 Coevolution of learning and data-acquisition mechanisms: a model for cognitive evolution. Phil. Trans. R. Soc. B 367, 2686–2694. (doi:10.1098/rstb.2012.0213)

    • 66

      Visalberghi E, Addessi E, Truppa V, Spagnoletti N, Ottoni E, Izar P, Fragaszy D. 2009 Selection of effective stone tools by wild bearded capuchin monkeys. Curr. Biol. 19, 213–217. (doi:10.1016/j.cub.2008.11.064)

    • 67

      Tumer EC, Brainard MS. 2007 Performance variability enables adaptive plasticity of ‘crystallized’ adult birdsong. Nature 450, 1240. (doi:10.1038/nature06390)

    • 68

      Truskanov N, Lotem A. 2015 The importance of active search for effective social learning: an experimental test in young passerines. Anim. Behav. 108, 165–173. (doi:10.1016/j.anbehav.2015.07.031)

    • 69

      Thornton A, McAuliffe K. 2006 Teaching in wild meerkats. Science 313, 227–229. (doi:10.1126/science.1128727)

    • 70

      Slagsvold T, Wiebe KL. 2007 Learning the ecological niche. Proc. R. Soc. B 274, 19–23. (doi:10.1098/rspb.2006.3663)

    • 71

      Dell'Mour V, Range F, Huber L. 2009 Social learning and mother's behavior in manipulative tasks in infant marmosets. Am. J. Primatol. 71, 503–509. (doi:10.1002/ajp.20682)

    • 72

      Bateson P, Martin P. 2013 Play, playfulness, creativity and innovation. Cambridge, UK: Cambridge University Press.

    • 73

      Legare CH, Wen NJ, Herrmann PA, Whitehouse H. 2015 Imitative flexibility and the development of cultural learning. Cognition 142, 351–361. (doi:10.1016/j.cognition.2015.05.020)

    • 74

      Clegg JM, Legare CH. 2016 Instrumental and conventional interpretations of behavior are associated with distinct outcomes in early childhood. Child Dev. 87, 527–542. (doi:10.1111/cdev.12472)

    • 75

      Carr K, Kendal RL, Flynn EG. 2015 Imitate or innovate? Children's innovation is influenced by the efficacy of observed behaviour. Cognition 142, 322–332. (doi:10.1016/j.cognition.2015.05.005)

    • 76

      Kendal RL, Coolen I, van Bergen Y, Laland KN. 2005 Trade-offs in the adaptive use of social and asocial learning. In Advances in the study of behavior, vol. 35 (eds Slater PJB, Snowdon CT, Brockmann HJ, Roper TJ, Naguib M), pp. 333–379. San Diego, CA: Elsevier Academic Press.

    • 77

      Derex M, Feron R, Godelle B, Raymond M. 2015 Social learning and the replication process: an experimental investigation. Proc. R. Soc. B 282, 20150719. (doi:10.1098/rspb.2015.0719)

    • 78

      Enquist M, Ghirlanda S, Eriksson K. 2011 Modelling the evolution and diversity of cumulative culture. Phil. Trans. R. Soc. B 366, 412–423. (doi:10.1098/rstb.2010.0132)

    • 79

      Kolodny O, Creanza N, Feldman MW. 2015 Evolution in leaps: the punctuated accumulation and loss of cultural innovations. Proc. Natl Acad. Sci. USA 112, E6762–E6769. (doi:10.1073/pnas.1520492112)

    • 80

      Davis SJ, Vale GL, Schapiro SJ, Lambeth SP, Whiten A. 2016 Foundations of cumulative culture in apes: improved foraging efficiency through relinquishing and combining witnessed behaviours in chimpanzees (Pan troglodytes). Sci. Rep. 6, 35953. (doi:10.1038/srep35953)

    • 81

      Kolodny O, Edelman S, Lotem A. 2015 Evolved to adapt: a computational approach to animal innovation and creativity. Curr. Zool. 61, 350. (doi:10.1093/czoolo/61.2.350)

    • 82

      Truskanov N, Prat Y. 2018 Cultural transmission in an ever-changing world: trial-and-error copying may be more robust than precise imitation. Dryad Digital Repository. (doi:10.5061/dryad.4m518)


    Populational models—such as dual-inheritance theory and cultural epidemiology—put minds at the heart of cultural evolution. Purely historical approaches take whole cultures as their units of analysis and ask about the forces that move these massive, mind-free entities from one condition to the next. By contrast, populational or ‘kinetic’ models take cultural change to be change in the frequencies of types in a population as the aggregate consequence of innumerable episodes of social learning: of episodes in which one mind acquires information from one or more other minds.

    Given this spotlight on the mental, it is surprising that cognitive science rarely makes an appearance at the lively interdisciplinary party of cultural evolutionary studies. The hosts—evolutionary biology, mathematics and anthropology—are often joined by archaeology, economics, ecology, environmental sciences and philosophy. Psychology is certainly not excluded, but the invitations (or perhaps the acceptances) are not uniformly distributed across the discipline. They reach areas—such as comparative, developmental and social psychology—that are rooted in our common sense or ‘folk psychological’ understanding of the mind: in the blend of wisdom and old wives’ tales that explain behaviour with reference to the thoughts and feelings, beliefs and desires, of whole agents (e.g. [1]). But the invitations rarely get through to areas of psychology that are more fully integrated with cognitive science, for example, cognitive psychology, behavioural and cognitive neuroscience, experimental psychology and psychophysics.

    The term ‘cognitive science’ has been used since the early 1970s to refer to research in psychology, computer science, linguistics, neuroscience and philosophy that likens the mind to a computer. It casts thinking as ‘information processing’ and seeks to explain behaviour at a ‘sub-personal’ level [2,3]. That is, in contrast with folk psychology, which takes mental states of the whole agent (e.g. beliefs and desires) to be the drivers of behaviour, cognitive science typically explains behaviour as due to the activities of parts of the mind and of the interactions between these parts. For example, ‘Stephanie said “blue” when she saw BLUE written in red ink because two parts of her mind—one responsible for naming colours, and the other for reading words—competed for control of Stephanie's speech mechanisms, and the reading part won the contest.’ The sub-personal explanations offered by cognitive science are not familiar or intuitive, but they burrow deeper into the mind than folk psychology, and many have survived rigorous experimental tests [4].

    In this article, I suggest that cultural evolutionists and cognitive scientists should party together more often because we need each other. Cultural evolution needs cognitive science for many reasons (for example, to test hypotheses about conformist bias), but especially to get empirical traction on a fundamental question: Are the conditions necessary for Darwinian evolution met in the cultural domain? Cognitive science needs cultural evolution to address another fundamental question: What are the origins of distinctively human cognitive processes? My primary focus here will be on the first question, on what cognitive science can do for cultural evolution. After some reflection on the question itself—on the possibility of ‘third-way cultural selection’ (§2)—I turn to a distinction, between ‘replication’ and ‘reconstruction’, which has been used by cultural epidemiologists to argue, against dual-inheritance theorists, that cultural change is not a selection process. I argue that the replication/reconstruction distinction, although inspired by research in psychology, is being used in a way that prevents cognitive science from informing debate about third-way cultural selection (§3). Making a first excursion into cognitive science, I reconstruct the replication/reconstruction distinction to root it more firmly in research on the sub-personal processes involved in social learning. This exercise suggests that cultural epidemiologists are right in thinking that replication has higher fidelity than reconstruction, but wrong to assume that replication is rare (§4). If replication is not rare, an important requirement for cultural selection—‘one-shot fidelity’—is likely to be met. However, there are two other requirements, often overlooked by dual-inheritance theorists, for ‘dumb choices’ and ‘recurrent fidelity’ (§5). A second excursion into cognitive science suggests that these requirements can be met by ‘metacognitive social learning strategies’ (§6). To conclude, I offer a glimpse of the reciprocal relationship—what cultural evolution can do for cognitive science—using the origins of metacognitive social learning strategies as an example (§7).

    The most fully developed populational account of cultural evolution is known as dual-inheritance theory [5–8], or the ‘California school’ [9,10]. This impressive body of work is ‘evolutionary’ in at least three respects. First, it assumes that social learning—or, at least, the kinds of social learning that drive large-scale changes in human populations—is built on a set of genetic adaptations; natural selection acting on genetic variants has given humans psychological mechanisms—called ‘learning biases’ or ‘decision rules’—that are specialized for learning from others. Second, dual-inheritance theory is very much concerned with how genetic evolution interacts with cultural change, with ‘gene–culture coevolution’. This kind of coevolution occurs when a change in the socially learned characteristics of a population provokes a change in genetically inherited characteristics, or vice versa. The classic example of gene–culture coevolution is lactose tolerance [11]. Third, dual-inheritance theory is evolutionary at a methodological level: it borrows techniques from the study of genetic evolution, applying to socially learned characteristics mathematical models that were initially developed in population genetics. Thus, on the dual-inheritance view, cultural change is evolutionary at least by virtue of its relationships of interdependence with genetic evolution: it is made possible by genetically inherited psychological mechanisms, is in continuous interaction with genetic evolution, and is subject to analysis using mathematical tools developed by geneticists.

    But is dual-inheritance theory ‘evolutionary’ in a stronger sense? Does it claim not only that cultural change is closely related to genetic evolution, but that the conditions required for the occurrence of Darwinian selection—variation, heritability and differential fitness—are present in the cultural domain [12]? Elucidating this distinction, and building on Godfrey-Smith's [13] analysis of ‘Darwinian populations', Sterelny [9] points out that ‘selective’ explanations of cultural change are a subset of populational explanations. All populational explanations suggest that the frequency of types in a population at time T + N is largely determined by their frequency at time T. However, selective explanations further suggest that ‘the frequency of types at T + N is importantly determined by selection on those types at previous time steps, with selectively favoured types at one step increasing in frequency at the next in virtue of that success, together with some mechanism (replication or otherwise) supporting resemblance between parent and offspring’ ([9, p 43]). In common with Lewens [12], Sterelny doubts that dual-inheritance theory is designed to offer selective explanations because, in his view, the members of the California school ‘do not seem to think of selection and fitness in causally robust ways’ ([9, p 43]). For example, they rarely address cui bono questions [14]: when dual-inheritance theorists suggest that one cultural trait is fitter than another, they rarely specify who or what benefits from this fitness, or what is the nature of the benefit.
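
    The contrast between a merely populational and a selective explanation can be made concrete with a toy calculation. The sketch below is not a model from the dual-inheritance literature; it is a minimal, deterministic replicator-style update, and the variant names and fitness values are invented for illustration. With equal fitness values the frequencies at T + N simply reproduce those at T; with unequal values the favoured type gains ground at each step.

```python
def step(freqs, fitness):
    """One time step: reweight each type's share by its fitness, then renormalise."""
    weighted = {t: f * fitness[t] for t, f in freqs.items()}
    total = sum(weighted.values())
    return {t: w / total for t, w in weighted.items()}


def run(freqs, fitness, n_steps):
    """Apply the update n_steps times, i.e. go from time T to time T + N."""
    for _ in range(n_steps):
        freqs = step(freqs, fitness)
    return freqs


# Hypothetical starting frequencies at time T.
start = {"variant_a": 0.5, "variant_b": 0.5}

# Populational only: no type is favoured, so frequencies at T + N match those at T.
neutral = run(start, {"variant_a": 1.0, "variant_b": 1.0}, 10)

# Selective: variant_a is (by stipulation) 10% fitter, so its share grows each step.
selected = run(start, {"variant_a": 1.1, "variant_b": 1.0}, 10)

print(neutral)
print(selected)
```

    The fitness values here are simply stipulated. Nothing in the calculation says who or what benefits when variant_a spreads, which is the kind of cui bono gap that Sterelny identifies.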

    Sterelny and Lewens, philosophers of biology with a deep understanding of evolutionary theory and the contemporary literature on cultural evolution, may well be right about this, but my hunch is different. I think dual-inheritance theory is intended to provide selective explanations, and is committed to the idea that cultural change can be Darwinian in its own right, but the California school has not got round to addressing the questions that would make this hypothesis ‘causally robust’. This hunch is based, in part, on the fact that the early development of dual-inheritance theory was much influenced by Donald T. Campbell, and his view of cultural evolution was unambiguously, and indeed evangelically, selectionist [15,16]. Also, to this day, when commenting on their project as a whole, and glossing the results of particular models, the members of the California school write as if they are aiming for selective explanations. They emphasize the ‘Darwinian’ character of cultural evolution, refer to ‘cultural adaptations’, use the term ‘selection’ repeatedly and make explicit statements such as ‘The logic of natural selection applies to culturally transmitted variation every bit as much as it applies to genetic variation’ ([7, p 76]).

    Only if the dual-inheritance project offers selective explanations does it have the potential to show that there is a ‘third way’ in which human thought and behaviour can become adapted, can achieve a better fit with their environments. We know from sociobiology, evolutionary psychology and human behavioural ecology that human thought and behaviour can become adapted to their environments via natural selection operating on genetic variants (the first way; [17]). In humans, as in other animals, genetic evolution has produced behavioural propensities and cognitive processes that enhance survival and reproduction. We know from Enlightenment philosophy, experimental psychology and everyday experience that human thought and behaviour can come to fit their environments through the operation of cognitive processes lodged in individual heads (the second way). Some of these processes—known collectively as ‘learning’, ‘intelligence’, ‘insight’ or ‘foresight’—make individuals, or, as in science, groups of humans working together, smart enough to come up with new solutions to old problems, to distinguish better from worse solutions and selectively to adopt the good ones. The crucial question is whether there is another way, a third way, in which human thought and behaviour can become adapted: a process that selects among cultural rather than genetic variants, and in which the adaptiveness of the selection does not depend on individuals or groups being smart enough to design novel solutions or to recognize what works and what does not [10,18,19]. Thus, the ‘third-way’ question is: Are human thought and behaviour made adaptive—made to fit their environments—not only by genetic selection and intelligence but, at least sometimes, by cultural selection?

    I believe that dual-inheritance theory offers an affirmative answer to this question, whereas Lewens and Sterelny are not so sure. Only time (and members of the California school) can tell us who is right, but in the meantime it is clear that the third-way question is of fundamental importance. It is analogous to the challenge faced, and met, by Darwin. Darwin asked whether ‘intelligent design’ by God was the only way in which morphological characteristics could become adapted to their environments. The third-way cultural selection question asks whether natural selection operating on genetic variants and ‘intelligent design’ by human minds are the only ways in which behavioural characteristics can become adapted to their environments.

    An alternative to dual-inheritance theory, ‘cultural epidemiology’ or the ‘Paris school’, has been gaining ground since the 1990s [20,21]. Like dual-inheritance theory, cultural epidemiology is a populational approach to cultural change. However, the Paris school takes itself to disagree with its Californian cousins on the subject of selection. Paris argues that California is committed to third-way cultural selection—a positive answer to the third-way question—and that California makes cultural selection appear plausible by assuming, wrongly, that cultural inheritance typically involves ‘replication’. By contrast, the Paris school denies there is a process of cultural selection producing improvement or adaptation of cultural traits—it offers a negative answer to the third-way question—on the grounds that, in fact, cultural inheritance typically involves ‘reconstruction’ rather than ‘replication’ [20–22].

    What is the difference between replication and reconstruction, and why does it matter? The second of these questions has been given a much more satisfactory answer than the first. It is widely assumed—by dual-inheritance theorists, cultural epidemiologists and others—that the distinction matters because replicative processes have higher-fidelity products than reconstructive processes, and high-fidelity products, although not strictly necessary for selection [4,23,24], make selection more likely to happen, and a more powerful generator of adaptations when it occurs. In other words, and more slowly, it is assumed that replication and reconstruction are both psychological processes, or sets of psychological processes, in which cultural entities—ideas, behaviours and artefacts—play a causal role in the production of new, more-or-less similar entities. The products of these psychological processes are high-fidelity when the new entities closely resemble the old ones, for example, when the idea you form as a result of reading my words is very similar to the idea that inspired me to write them. High-fidelity inheritance enhances the probability and the power of third-way or ‘cumulative’ cultural selection—gradual improvement or adaptation of cultural variants over successive generations—because it preserves small improvements (the analogue of beneficial mutations), and thereby makes them available for further improvement in the future [25,26].

    If this is correct, the distinction between replication and reconstruction matters a great deal—it is a key to answering the third-way question. To find out whether cultural selection is likely to occur, or under what conditions it is likely to occur, we just need to work out whether the social learning processes that mediate cultural inheritance are replicative or reconstructive, and that seems to be an eminently tractable empirical question. Indeed, there are already a number of laboratory experiments that appear to have made progress in answering the question (e.g. [22,27,28]). But there is a problem. Although many cultural evolutionists write confidently about replication and reconstruction, no one has characterized the difference between them such that replication and reconstruction could be distinguished empirically in psychological experiments and used as indicators of the fidelity of cultural inheritance in the real world.

    The word ‘replication’ comes from the lexicon of molecular genetics, where it refers to a process of ‘splitting and reassembly’ of DNA, which occurs at cell division [29]. As far as I can tell, no one is claiming—or claiming that others are claiming—that cultural inheritance involves a precise analogue of this kind of splitting and reassembly. Rather, replication is almost invariably defined with reference to ‘copying’, but, as Godfrey-Smith [30] and Lewens [31] have noted, without an accompanying explanation of what is meant by ‘copying’ [21,32]. Consistent with everyday usage, and the way the term is used in research on social learning, ‘copying’ could be understood as any process in which entities play a causal role in the production of new, similar entities. However, this approach would bind the process of copying/replication too closely to its products [33]. In effect, it would define replication in terms of its relatively high-fidelity products and thereby squander the opportunity offered by the replication/reconstruction distinction: the opportunity to find out about the fidelity of cultural inheritance by examining the features of the psychological processes through which it occurs. Only if we know about the processes of cultural replication, can we work out the range of inputs over which there is a match between input and output sufficient to support cultural selection.

    Thus, the distinction between replication and reconstruction has considerable promise when replication and reconstruction are viewed as two different types of psychological process, one of which, by hypothesis, yields higher-fidelity cultural inheritance than the other. In this case, the likely fidelity of cultural inheritance in a given domain across time could be assessed using data that are readily available: data from humans alive today which tell us about the psychological processes mediating social learning in various domains. I have argued that this promise is not being fulfilled because replication and reconstruction are being defined not as types of psychological process—in terms of the operations, or sequences of events, that each instantiates—but by the extent to which their products resemble their social inputs. This approach conflates processes with products (replication and reconstruction with high and low fidelity), makes the argument circular and prevents cognitive science from getting a handle on a fundamental question about cultural evolution: does it involve third-way cultural selection?

    I think the potential value of the replication/reconstruction distinction can be recovered by using dual-system theory to develop Sperber's [20,33] suggestion that replication is ‘stimulus-driven’, whereas reconstruction is inferential. Dual-system models (not to be confused with dual-inheritance theory) have provided a framework for research on cognition ever since psychology became an empirical science [34], and they continue to inspire some of the most rigorous, cumulative work in the field [35,36]. These models vary in detail, but they are united in suggesting that thought, and especially human thought, is controlled by two systems, or types of process, that interact with one another. The operation of System 1 is typically characterized as bottom-up (or stimulus-driven), fast, involuntary, parallel, unavailable to conscious awareness, and based on information derived from genetic inheritance and associative learning. The operation of System 2 is top-down, slow, effortful, serial, available to conscious awareness, and based on information both from System 1, and generated by its own activity. System 2 acts as a more-or-less successful ‘supervisor’ or ‘executive’ with respect to System 1 [37]; it schedules, harnesses and augments the activities of System 1. The activities of System 1 lend themselves to characterization at the sub-personal level, whereas the activities of System 2 are more naturally characterized at the personal level, as things that are done by the whole agent.

    Viewed from the perspective of dual-systems theory, social learning is replicative to the extent that information from another agent is picked up or encoded by System 1—in a fast, involuntary and possibly unconscious way—and reconstructive to the extent that encoding of information from another agent is done or supervised by System 2—in a slow, deliberate, conscious way. This is a reconstructed version of the replication/reconstruction distinction—it is not the same as Sperber's version—but it is consistent with his suggestion that replication is stimulus-driven, and with the connotations of ‘replication’ that waft over from genetics. It makes cultural replication into a process that occurs ‘all by itself’. Like genetic replication, it is not ‘done by’ the recipient of the ideas/alleles; it just happens. To make clear when I am using the dual-systems, reconstructed version of the replication/reconstruction distinction, I will refer to ‘replication1’ and ‘reconstruction2’.

    There is plenty of evidence of replication1 in the cognitive science literature on social learning in humans and other animals. For example, there are many demonstrations that, in controlled laboratory conditions and when talking casually to others, humans engage in ‘automatic imitation’ or ‘mimicry’. We copy the gestures of others—the way in which parts of the body move relative to one another—when we do not intend to copy: when copying interferes with us discharging our intentions and when we are apparently unaware of the other person's gestures or our own imitation of them [38,39]. Similarly, there is compelling evidence that, like other animals, humans readily acquire preferences and aversions through ‘observational conditioning’—a form of unsupervised associative learning, and therefore solidly part of System 1. After seeing another person's face spontaneously wincing in the presence of an object, or showing disgust in reaction to a smell, the observer becomes fearful of the object, or apt to avoid eating anything with that, now nasty, smell [40,41]. Another kind of replication1, rote learning, is evident in everyday life. Living as I do in the unusual world of an Oxford college, I have heard a particular prayer, a Latin grace, said many times by others. I do not understand Latin, and I never intended to learn the sequence of sounds, but when the time came for me to say grace, I could utter the words ‘parrot-fashion’.

    The same kinds of content—sequences of body movements, aversions and sequences of sounds—can also be socially learned by reconstruction2. As a lousy tennis player, with a very limited repertoire of skilled tennis moves, I might try to copy the pro's serve by laboriously describing it to myself while watching—trying to capture in words the topography and timing of the action components—and then rehearsing this description in my mind as I grasp the racket and try to repeat the pro's performance. This would be an intentional, reconstructive2 (and probably doomed) form of body movement imitation. Similarly, in episodes of what cultural epidemiologists call ‘ostensive communication’, I could acquire an aversion by hearing you say ‘touching a hot iron is painful’ or ‘spinach is disgusting’. And, given the right education, people can certainly learn to say a Latin grace in the time-honoured, reconstructive2 way: with the firm intention to learn and full command of the tongue of Ancient Rome.

    The foregoing examples support two things that cultural epidemiologists have claimed about replication and reconstruction, but run counter to a third. They are broadly consistent with the idea that replication1 is typically of higher fidelity than reconstruction2; on average, System 1 social learning processes yield products that more closely resemble their inputs than System 2 social learning processes. Replicative copying of novel sequences of body movements can be very precise [42], but our action vocabularies are so limited that imitation-by-verbal-description is likely to be grossly inaccurate for all but the most topographically simple actions. Likewise, observational conditioning may be more likely than verbal instruction to result in the receiver developing an aversion to the same category of objects as the transmitter. An observationally conditioned aversion generalizes only to physically similar objects—for example, from a flat iron to a steam iron—but an instruction such as ‘touching an iron is painful’ could be taken to mean it is risky to contact any tool made of iron. And if a receiver understands the language in which a formula is expressed, they are more likely to ‘correct’ a component they regard as wrong, or to produce an utterance that means the same but sounds different, than if they learn by rote a sequence of phonemes that is, for them, meaningless. Thus, while it may be possible to make the fidelity of reconstruction2 comparable with that of replication1—for example, through extended periods of teaching, such as those involved in science education—it is likely that, on average, replication1 is of higher fidelity than reconstruction2.

    The foregoing examples also support the cultural epidemiologists' denial that replication, when it occurs, depends on psychological mechanisms that are genetic adaptations for culture—that evolved genetically for high-fidelity cultural inheritance [21]. Imitation of body movement topography used to be thought to depend on such a genetic adaptation, or an ‘innate module’ [43]. However, the foundation of the innate module view was recently undermined by a large-scale study showing that human newborns do not imitate [44], and there is now a substantial body of evidence from adults, infants and nonhuman animals indicating that, rather than being genetically inherited, the imitation mechanism is constructed in the course of development through learning [45,46]. As for observational conditioning and verbal instruction, the former is a species of associative learning—a cognitive capacity that is far too ancient, in phylogenetic terms, to be an adaptation for culture—and even those who regard language as a human-specific genetic adaptation do not claim that it evolved specifically for high-fidelity cultural inheritance by verbal instruction.

    However, the foregoing examples suggest that the Paris school is wrong in thinking that replication is rare. I suspect their preoccupation with cultural traits that are transmitted via language (e.g. religious beliefs, folk lore and fairy tales), combined with their rich Gricean view of how much System 2 inference is involved in linguistic communication, has led cultural epidemiologists to overlook a substantial body of research in cognitive science showing that sub-personal, System 1 processes can mediate the cultural inheritance of gestures, skills, preferences and, with the appropriate social support for rote learning, linguistic entities that are in an important sense meaningless for those who utter them [38–42].

    In summary, reconstructing the distinction between replication and reconstruction so that it is more firmly rooted in cognitive science, and does not merely define replication as high-fidelity transmission, suggests that the Paris school is right on two counts and wrong on a third: replication1 is more likely than reconstruction2 to support high-fidelity inheritance—to result in the receiver receiving something similar to what the sender sent (deliberately or inadvertently)—and this is not because replication1 mechanisms are genetic adaptations for cultural inheritance. However, there is no reason to think that replication1 is rare. Indeed, the ease with which automatic imitation, observational conditioning and rote learning can be observed in the laboratory and in everyday life suggests that cultural replication1 is a pervasive feature of human lives.

    Several commentators have recently argued that too much fuss is being made about the differences between dual-inheritance theory and cultural epidemiology, and that disagreements between the California and Paris schools are more apparent than real [24,27,47]. For example, surveying the results of transmission chain experiments, Acerbi & Mesoudi [27] conclude that there is enough evidence that cultural inheritance can be replicative (they use the term ‘preservative’), for us to be confident that, at least in some domains and at certain levels of granularity, there is selection on cultural variants. This may well be true if one takes ‘selection’ to be no more than a synonym for ‘choice’. In that case, to say that there has been ‘selection on cultural variants', means only that the frequency of types in a population at T + N has been influenced by learners' choices among variants to copy at previous time steps. However, if one is interested in the third-way question, cultural selection means more than this. In the third-way context, cultural selection occurs when (i) a change in the frequency of types in a population constitutes improvement or adaptation (i.e. the frequency of types that do a better job, with respect to human purposes, increases more than that of types that do the same job less well) and (ii) this improvement is not due solely to smart choices by agents; to learners choosing to copy the better variants because they, the learners, recognize the ‘betterness’ of the better variants. If the improvement is due to smart choices by learners—for example, if people use durable rather than disposable shopping bags because they understand the former to be better for the environment—thought and behaviour are becoming adapted in the second way, not the third way [13,18].

    Third-way cultural selection requires a good deal more than replication, or even replication1. As the previous paragraph indicates, one additional requirement is for ‘dumb’, blind or trusting choices by learners, which nonetheless make better variants more likely to be copied than inferior variants. These choices could be made with deliberation, and via sophisticated cognitive processes, but they must not depend on learners detecting, individually or collectively via foresight, the betterness of better variants [10]. Intelligence in the sense of insight into what will and will not ‘work’, whether uniform or highly variable within a population, is a threat to third-way cultural selection; it increases the chances that adaptation will occur in the second, rather than the third way.

    A second additional requirement is for another kind of fidelity. Replication1 delivers ‘one-shot fidelity’; processes such as imitation, observational conditioning and rote learning make it likely that, in the course of a particular episode of social learning, the receiver will acquire an idea or behaviour similar to that of the model agent. But for improvements to accumulate—for cultural selection in the strong sense—‘recurrent fidelity’ is also needed; the idea or behaviour must remain similar to that of the model, in memory and over episodes of activation or use, until it is passed on to one or more other learners. A little more formally: ‘one-shot fidelity’ is the fidelity with which a trait, t, is initially learned from an expert, A, by a novice, B. A fair degree of fidelity at this initial stage is undoubtedly necessary for cultural selection, but it is radically insufficient. For improvements to accumulate, ‘recurrent fidelity’ is also needed: B must retain t—keep doing what A did, or keep believing what A believed—until C, a novice of the next cultural generation, acquires t from B. The t needs to be insulated from loss or modification between acquisition and re-transmission [48–50].
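
    A rough calculation shows why one-shot fidelity alone is radically insufficient. The sketch below rests on simplifying assumptions that are not in the text above: the trait t either survives a stage intact or is lost, the two stages are independent, and each link in a chain (A to B, B to C, and so on) consists of one acquisition followed by one period of retention. The function name and the probabilities are hypothetical.

```python
def chain_survival(one_shot, recurrent, links):
    """Probability that trait t is still intact after `links` cultural generations.

    one_shot  -- chance that a novice acquires t faithfully from the model
    recurrent -- chance that t stays unmodified in memory and use until passed on
    Both stages must succeed at every link, so the per-link chance is their product.
    """
    return (one_shot * recurrent) ** links


# Same high one-shot fidelity in both cases; only recurrent fidelity differs.
fragile = chain_survival(one_shot=0.95, recurrent=0.50, links=5)
robust = chain_survival(one_shot=0.95, recurrent=0.95, links=5)

print(f"weak recurrent fidelity:   {fragile:.3f}")
print(f"strong recurrent fidelity: {robust:.3f}")
```

    On these assumed numbers, excellent one-shot fidelity leaves very little of t after five links when retention is poor, whereas matching it with comparable recurrent fidelity preserves t in more than half of such chains.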

    Many of the processes or ‘decision rules’ that dual-inheritance theory regards as integral to cultural evolution—such as ‘direct bias’, ‘guided variation’ and ‘conformist bias’—are consistent with the idea that cultural change is a function of choice, but are threats to the possibility of third-way cultural selection. They militate in favour of selection in the weak sense—choice—and against selection in the strong sense. For example, direct bias is a threat to the requirement for dumb choices. In direct bias [5], later called ‘content bias’ [7], learners are supposed to survey all traits in the population, to evaluate their efficiency relative to other traits and, based on this evaluation, preferentially to copy the better traits. Although direct bias is clearly a selection mechanism in the weak sense—it relates to choices among cultural traits [27]—it involves (incredibly) smart choices by learners, and therefore any improvement or adaptation resulting from this bias would be due, not to third-way cultural selection, but to individual intelligence or insight. Similarly, guided variation, which occurs when cultural variants are modified by learning between acquisition and re-transmission [5], is a threat to recurrent fidelity; it reduces the chances that small improvements will be preserved as platforms for further improvement.

    Faced with the many requirements for third-way cultural selection, and threats against their fulfilment, cultural epidemiologists are sceptical about a third way, arguing that cultural change is rarely, if ever, a process of adaptation. By contrast, dual-inheritance theorists appear to remain optimistic about the possibility of cultural selection, but have not explained how the requirements could be met in spite of the threats [10,23]. This may have been part of what Sterelny ([9, p 43]) had in mind when he said that dual-inheritance theorists ‘do not seem to think of selection and fitness in causally robust ways' (see §2 above). I share the optimism of the California school and believe that cognitive science can help us to think about cultural selection in more ‘causally robust ways’; it can help us to explain how, against the odds, the requirements for third-way cultural selection could be met.

    Let's take as an example the ‘decision rules’—sometimes called ‘social learning strategies’—that are, according to dual-inheritance theory, the basis on which learners choose which cultural variants to copy. These rules are a fundamental part of dual-inheritance theory; they explain directional change in the frequencies of variants in the population, but they have been consistently ‘blackboxed’ by the California school [51]. With some resolution, dual-inheritance theorists have refused to ask what social learning strategies are ‘made of’—how they are implemented at the cognitive level. Opening the black box, and combing through research on social learning strategies in animals, children and adults, I recently found evidence that, from a cognitive science perspective, two kinds of rule guide choices about when, what and whom to copy [52]. The first, ‘planetary’ kind of decision rule is implemented by relatively simple, taxon- and domain-general psychological processes; mechanisms of attention and associative learning that are present in a broad range of species, come online early in development and process information from the social and inanimate worlds via the same computations. For example, agents who grab more attention because they are large, noisy or standing close to desirable objects are more likely to be copied than agents who grab less attention. Empirical regularities of this kind can be characterized by rules—such as copy older individuals (who tend to be larger), or copy the successful (who tend to be located near desirable objects)—but these rules, like the rules of planetary motion, are in the minds of researchers, not in the minds of the entities or agents the researchers are studying. The second, ‘cook-like’ kind of decision rule is implemented by complex ‘metacognitive’ processes: System 2 psychological processes that represent ‘who knows’. More specifically, System 2 metacognitive processes represent the accuracy and reliability with which other cognitive processes, in the self and in others, represent the world [53]. The evidence suggests that these metacognitive social learning strategies are found only in humans, come online late in development and process social information in a domain-specific way. For example, they specify that, when building a boat, one should copy the boat-builder with the largest fleet, and when struggling with information technology, one should copy digital natives. Metacognitive social learning strategies are full-blooded rules. They are consciously represented in the minds of choosing agents, guiding their behaviour in the way that a cook uses a recipe.

    Unlike planetary rules, metacognitive social learning strategies have the potential to meet the requirements for cultural selection identified in the previous section—the need for dumb choices and recurrent fidelity.

    (i) Dumb choices. Although mediated by sophisticated psychological processes, metacognitive social learning strategies are dumb in the sense that is important for third-way cultural selection: they bias an agent towards copying better variants without the agent being smart enough to know which variants are better and which are worse. They are alternatives to direct/content bias that leave room for cultural selection, rather than individual intelligence, to do the adaptive work. If I copy the boat-builder with the biggest fleet, there is a good chance I will copy a design that is especially successful. This is because fleets remain large when they are made up of boats that are unlikely to sink. But, crucially, I do not need to know this in order to make the right—the adaptive—choice of which boat design to copy. I do not need to be smart enough to know what makes a good boat good, or to have any theory about why the builder with the biggest fleet knows best. As long as I, along with other novices, slavishly follow the rule copy the boat-builder with the biggest fleet, adaptive innovations are likely to become more widespread and to form the basis for further improvements in boat design.

    (ii) Recurrent fidelity. Metacognitive social learning strategies can guide learners towards knowledgeable models with great precision, specifying the individual or type of person to copy in each of a range of task domains. As a result, they create conditions conducive to the development and evolution of processes that promote high-fidelity cultural inheritance. When there is a good chance that you are going to copy an adaptive variant, it is worthwhile investing time and energy in copying accurately and in detail. The processes that promote one-shot fidelity, replication₁ processes, include automatic imitation and rote learning (§4). The processes that promote recurrent fidelity are those that discourage guided variation, i.e. changing a cultural variant in the light of further experience between acquisition and re-transmission. As far as I am aware, no one has studied these processes from a cognitive science perspective. My guess is that they involve a variety of low-level processes (System 1) supervised by culturally inherited beliefs (System 2) about the importance of conserving cultural traits for group identity, or more specifically, about who is and who is not allowed to innovate in particular domains.

    As an example relating to group identity, I inherited from my mother the belief that Maids of Kent (women born to the east of the River Medway in the English county of Kent) decorate their apple pies with pastry in the shapes of oak, ash and elm leaves. Superstitiously afraid of being mistaken for a Kentish Maid (born to the west of the Medway), an identity with no practical consequences in my lifetime, I have been prevented by this belief from deviating. Every apple pie I have ever made has been decorated with an oak, an ash and an elm leaf. Consequently, there has been no opportunity for me to discover through reinforcement learning (also known as ‘trial-and-error’) that alternatives are quicker to assemble, more pleasing to the eye, or garner more compliments. And had I failed in childhood to suppress my System 1 inclination to innovate, no doubt my mother or grandmother would have restored recurrent fidelity by punishing my tinkering with a pained expression and a pastry knife.
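
    The two requirements can be made concrete with a toy simulation. The following is a minimal Python sketch, not a model taken from dual-inheritance theory or from this article: the function names, the fleet-size formula, the `default` design that sloppy copies drift towards, and all numbers are assumptions made for illustration. Learners choose a model by fleet size alone (a dumb choice) and copy the chosen design with a given fidelity.

```python
"""Toy model of 'copy the boat-builder with the biggest fleet'.

All names and numbers are illustrative assumptions, not taken from the article.
"""


def expected_fleet_size(quality, seasons=20):
    """Expected fleet after `seasons` seasons.

    Assumption: a builder launches one boat per season, and each boat
    survives any given season with probability `quality`.
    """
    return sum(quality ** age for age in range(seasons))


def copy_biggest_fleet(qualities, seasons=20):
    """Dumb choice: pick a model by fleet size alone.

    The learner compares fleets, never the hidden `quality` values, yet
    walks away with the design that produces the largest fleet.
    """
    fleets = [expected_fleet_size(q, seasons) for q in qualities]
    return qualities[fleets.index(max(fleets))]


def transmit(quality, fidelity, default=0.3):
    """Copy a design with a given fidelity (1.0 means an exact copy).

    Assumption: whatever is not copied faithfully is rebuilt from the
    learner's own `default` idea of a boat.
    """
    return fidelity * quality + (1.0 - fidelity) * default


def simulate(qualities, fidelity, generations=30, step=0.01):
    """Return the best design quality after `generations` rounds of copying."""
    for _ in range(generations):
        model = copy_biggest_fleet(qualities)
        qualities = [transmit(model, fidelity) for _ in qualities]
        # Assumption: one novice per generation stumbles on a small improvement.
        qualities[0] = min(0.99, qualities[0] + step)
    return max(qualities)


founders = [0.3, 0.5, 0.4]  # hidden design qualities of three founding builders
print("faithful copying:", round(simulate(founders, fidelity=1.0), 3))
print("sloppy copying:  ", round(simulate(founders, fidelity=0.5), 3))
```

    Under these assumptions, the learner ends up with the best available design without ever inspecting design quality, and small chance improvements accumulate across generations only when copying is faithful. With sloppy copying, each improvement is largely lost before the next one arrives.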

    Thus, thinking about social learning strategies from a cognitive science perspective reveals that there are two kinds of decision rules, and that the metacognitive kind, found only in humans, has the potential to overcome many of the threats to third-way cultural selection identified by the California and Paris schools. Of course, this analysis raises the question of where metacognitive social learning strategies come from, and how they get to be so wise. I will take up these questions in the latter part of the next section.

    So far, this article has considered only what cognitive science can do for cultural evolution. Now I want to consider, albeit briefly, the reciprocal relationship: what cultural evolution can do for cognitive science. This topic has been the focus of my work for the last few years [54]. I suggest that cognitive science needs cultural evolutionary theory to explain the origins and adaptiveness of distinctively human cognitive mechanisms—mechanisms such as causal understanding, imitation, language and mindreading (or ‘theory of mind’), which are present in mature adult humans, but absent, or found only in nascent form, in other animals.

    Evolutionary psychology—or, at least, the Santa Barbara school of evolutionary psychology [55]—suggests that genetic evolution is the architect of the human mind. According to this ‘cognitive instinct’ view, distinctively human ways of thinking are inborn. A human baby does not enter the world understanding causality, capable of imitating any action she sees, talking in complete sentences and understanding all about other minds, but she contains in her genes very specific programmes for the development of these capacities; programmes that are capable of building distinctively human, domain-specific cognitive mechanisms with minimal help from learning. The environment in which a child grows up is seen as merely ‘triggering’ or ‘evoking’ cognitive development.

    The cognitive instinct view had some plausibility when it was introduced more than 20 years ago. For example, at that time there seemed to be compelling evidence that human newborns can imitate [56], Chomsky's ‘universal grammar’ account of language was still dominant among linguists [57], and it was widely accepted that autistic individuals have difficulty ascribing thoughts and feelings because they lack an innate module for theory of mind [58]. But, in the ensuing years, and partly through the emergence of social cognitive neuroscience (a potent blend of social psychology, cognitive psychology and brain imaging), the cognitive instinct hypothesis has become less and less plausible. We now know that human newborns do not imitate [44]; ‘universal grammar’ has been pared down to the point where Chomsky's claim is either untestable or indistinguishable from the alternative, pragmatic or constructivist, view of language [59,60]; and there is evidence that autistic individuals have many cognitive impairments, some of which, like ‘weak central coherence’ and problems with executive function, are much more domain-general than theory of mind [61].

    But if distinctively human cognitive mechanisms are not products of genetic evolution, where do they come from? No doubt ‘learning’, broadly construed, is a large part of the answer to this question, but it cannot possibly be the whole answer. People grow up in a broad range of environments. Therefore, if each developing human built his or her own specialized cognitive mechanisms through experience, it would be a staggering coincidence to find, as we do, that most people—at least, most people within any given culture—end up with the same set of mechanisms; for example, with mechanisms of causal understanding, language and theory of mind, each of which functions in much the same way as it does in other adults of the same social group. Furthermore, the ‘learning’ answer, by itself, does not explain why these shared cognitive mechanisms do their jobs reasonably well—why causal understanding gives us some insights into the workings of the inanimate world; language enables us to communicate fairly effectively and theory of mind allows us to predict what others are going to do. Learning alone cannot explain why, in this sense, distinctively human cognitive mechanisms are adaptive.

    To explain why distinctively human cognitive mechanisms are both shared and adaptive, cognitive science needs cultural evolutionary theory. Until now, cultural evolutionary analysis has been applied only to ‘grist’; it has been used to explain variation in, and the adaptiveness of, the products of thought—behaviour, skills and artefacts. I am proposing that it should also be applied to ‘mills’, to the mechanisms of thought—like causal understanding, language and mindreading—that control behaviour, mediate skills and, through those skills, produce artefacts. This kind of analysis, ‘cultural evolutionary psychology’, embraces the now plentiful evidence that the development of distinctively human cognitive mechanisms depends crucially, not merely on learning, but on social learning. Humans have a genetic starter kit consisting of enhanced social motivation, attentional biases (e.g. to faces and voices) and souped-up domain-general mechanisms of learning and memory. This starter kit allows complex, domain-specific ‘modules’ to be constructed in the course of development through social interaction. Distinctively human cognitive mechanisms are not merely learned, but culturally inherited, from members of the child's social group. They are shared within social groups because members ‘catch’ them from one another, and to the extent that they are adaptive—do their jobs well—it is because variant cognitive mechanisms have been winnowed by third-way cultural selection [54].

    This kind of cultural evolutionary analysis can explain why, by hypothesis, there are metacognitive social learning strategies that promote third-way cultural selection of grist. The picture is of a population of social groups—groups of people defined, not by the genes they carry, but by geography and/or cultural characteristics such as language. The members of each social group subscribe to a common set of metacognitive social learning strategies. The decision rules are shared within groups because their inheritance is ‘distributed’, i.e. the rules are learned not only from biological parents (vertical transmission) and unrelated members of the parental generation (oblique transmission), but also from peers (horizontal transmission). Different social groups subscribe to different sets of metacognitive decision rules [62–65]. For example, group A's set of rules might differentiate more finely among task domains, or among potential models within each domain, than group B's set of rules. To the extent that the more precise rules really identify ‘who knows’—the right people to copy in each domain—group A will be better able than group B to preserve adaptive innovations in the task domains for which they have more precise rules, and this will enable group A to develop, through third-way cultural selection, better boats, fish hooks or methods of baking bread. The resulting benefits to group A's living conditions make group A more likely than group B to persist, to expand through biological reproduction and immigration, and consequently to ‘bud’, producing offspring groups with similar metacognitive social learning strategies. Thus, group A is fitter than group B, where the fitness of a social group can be understood in relation to the number of descendant individuals (Type 1 fitness), or descendant groups (Type 2 fitness), that inherit the group's metacognitive social learning strategies [66].
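
    The group-level argument can be sketched in the same hedged spirit. In the Python fragment below, the three potential models, their skill scores, the growth rule and the function names are all hypothetical. Group A's rule names a different model for each task domain, whereas group B's rule names a single all-rounder, and group size serves as a crude stand-in for Type 1 fitness.

```python
"""Toy comparison of two groups' social learning rules.

All names, skills and rates are hypothetical, chosen only for illustration.
"""

DOMAINS = ["boats", "fish hooks", "bread"]

# Hypothetical skill (0 to 1) of three potential models in each task domain.
SKILLS = {
    "Ada": {"boats": 0.9, "fish hooks": 0.4, "bread": 0.3},
    "Ben": {"boats": 0.5, "fish hooks": 0.8, "bread": 0.4},
    "Cleo": {"boats": 0.6, "fish hooks": 0.6, "bread": 0.7},
}


def group_a_rule(domain):
    """Fine-grained rule: a different 'who knows' for every domain.

    The lookup stands in for inherited lore about whom to copy; the
    learner is not assumed to evaluate the skills directly.
    """
    return max(SKILLS, key=lambda person: SKILLS[person][domain])


def group_b_rule(domain):
    """Coarse rule: copy the best all-rounder, whatever the domain."""
    return max(SKILLS, key=lambda person: sum(SKILLS[person].values()))


def mean_copied_skill(rule):
    """Average skill a group acquires when its learners follow `rule`."""
    return sum(SKILLS[rule(d)][d] for d in DOMAINS) / len(DOMAINS)


def grow(size, copied_skill, generations=10, rate=0.1):
    """Assumption: each generation a group grows in proportion to the skill it copied."""
    for _ in range(generations):
        size *= 1.0 + rate * copied_skill
    return size


skill_a = mean_copied_skill(group_a_rule)
skill_b = mean_copied_skill(group_b_rule)
print("group A size:", round(grow(100.0, skill_a)))
print("group B size:", round(grow(100.0, skill_b)))
```

    With these made-up numbers, group A copies a higher average level of skill than group B and therefore grows faster. The sketch ignores budding, migration and the inheritance of the rules themselves.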

    Many metacognitive social learning strategies are sources of what dual-inheritance theorists call ‘indirect bias’ [5]. They instruct learners to decide what to copy, not by evaluation of the traits themselves (direct/content bias), e.g. how swiftly a boat moves through the water, but on the basis of model characteristics, e.g. which potential model agent has the largest number of boats, cows or publications. Compared with direct/content bias and guided variation, indirect bias is certainly a friend of third-way cultural selection. It involves choices that are dumb in the relevant sense, and it does not militate against recurrent fidelity. However, indirect bias has been found, not only in humans, but in a broad range of other species for which there is no evidence of cumulative or adaptive cultural change. For example, vervet monkeys are more inclined to copy females, the philopatric sex, than males [67]. Therefore, by itself, the occurrence of indirect bias in human populations is not sufficient grounds for optimism about cultural selection.

    It is only when we focus on cognitive mechanisms, recognizing that, in humans, indirect bias can be implemented by System 2, cook-like rules, as well as by System 1, planetary rules, that we begin to see how indirect bias can support cultural selection. Planetary social learning strategies can change as a function of the user's own, recent experience; for example, if a monkey finds that information from females has yielded higher payoffs recently, it will turn its attention from males to females. By contrast, because they can be expressed in language and thereby culturally inherited, metacognitive social learning strategies can distil the experience of many agents over an extended period of time. In other words, metacognitive social learning strategies tend to be ‘wise’ (see §6), to promote third-way cultural selection of behaviour, skills and artefacts, not merely because they implement indirect bias, but because they are themselves products of third-way cultural selection [51,52].

    I have argued that conflict between populational models of cultural evolution—between dual-inheritance theory and cultural epidemiology—is important to the extent that it concerns the third-way question: Are human thought and behaviour made adaptive, not only by genetic selection and intelligence, but by cultural selection? Where this is the question at issue, easy attempts to reconcile California and Paris—by suggesting that California has quietly given up on third-way cultural selection, or by conflating weak and strong senses of ‘selection’—are in danger of drawing attention away from a fundamental question about cultural change. I have also suggested that cognitive science, and especially the kind of psychology that concerns itself with sub-personal mechanisms, can help cultural evolutionary theory to address the third-way question (i) by refining the distinction between replication and reconstruction, so that it can be used more effectively to assess the one-shot fidelity of cultural inheritance, and (ii) by casting a spotlight on metacognitive social learning strategies. These decision rules, unlike their planetary counterparts, have the potential to meet the dumb choice and recurrent fidelity requirements for third-way cultural selection. In a coda (§7), I suggested that cognitive science needs cultural evolution at least as much as cultural evolution needs cognitive science: to explain the origins and adaptiveness of distinctively human cognitive processes. If that is correct, third-way cultural selection is much less likely to have been crowded out by natural selection on genetic variants, by the ‘first way’, than the Paris school assumes in its discussions of cultural attraction. But that is another story, to be told at another interdisciplinary party. Thanks for the invitation.

    This article has no additional data.

    I have no competing interests.

    This research was supported by All Souls College, University of Oxford.

    I am grateful to Ellen Clarke, Tim Lewens, Kim Sterelny and three anonymous referees for their astute comments on an earlier version of the manuscript.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    References

    • 1

      Tomasello M. 2014 A natural history of human thinking. Cambridge, MA: Harvard University Press.

    • 2

      Dennett DC. 1981 Three kinds of intentional psychology. In Reduction, time and reality (ed. R Healey), pp. 37–61. Cambridge, UK: Cambridge University Press.

    • 3

      Frankish K, Ramsey W. 2012 The Cambridge handbook of cognitive science. Cambridge, UK: Cambridge University Press.

    • 4

      Proctor RW, Vu KPL. 2006 Stimulus–response compatibility principles: data, theory, and application. Boca Raton, FL: CRC Press.

    • 5

      Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.

    • 6

      Cavalli-Sforza LL, Feldman MW. 1981 Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press.

    • 7

      Richerson PJ, Boyd R. 2005 Not by genes alone. Chicago, IL: University of Chicago Press.

    • 8

      Henrich J. 2015 The secret of our success: how culture is driving human evolution, domesticating our species, and making us smarter. Princeton, NJ: Princeton University Press.

    • 9

      Sterelny K. 2017 Cultural evolution in California and Paris. Stud. Hist. Philos. Biol. Biomed. Sci. 62, 42–50.

    • 10

      Clarke E, Heyes C. 2017 The swashbuckling anthropologist: Henrich on the secret of our success. Biol. Philos. 32, 289–305. (doi:10.1007/s10539-016-9554-y)

    • 11

      Holden C, Mace R. 2009 Phylogenetic analysis of the evolution of lactose digestion in adults. Hum. Biol. 81, 597–619. (doi:10.3378/027.081.0609)

    • 12

      Lewens T. 2015 Cultural evolution: conceptual challenges. Oxford, UK: Oxford University Press.

    • 13

      Godfrey-Smith P. 2009 Darwinian populations and natural selection. Oxford, UK: Oxford University Press.

    • 14

      Dennett DC. 2001 The evolution of culture. Monist 84, 305–324. (doi:10.5840/monist200184316)

    • 15

      Campbell DT. 1965 Variation and selective retention in socio-cultural evolution. Soc. Change Dev. Areas 19, 26–27.

    • 16

      Campbell DT. 1974 Evolutionary epistemology. In The philosophy of Karl Popper (ed. Schilpp PA), pp. 413–463. LaSalle, IL: Open Court.

    • 17

      Pinker S. 2010 The cognitive niche: coevolution of intelligence, sociality, and language. Proc. Natl Acad. Sci. USA 107, 8993–8999. (doi:10.1073/pnas.0914630107)

    • 18

      Amundson R. 1989 The trials and tribulations of selectionist explanations. In Issues in evolutionary epistemology (eds Hahlweg K, Hooker CA), pp. 413–432. Albany, NY: State University of New York Press.

    • 19

      Godfrey-Smith P. 2012 Darwinism and cultural change. Phil. Trans. R. Soc. B 367, 2160–2170. (doi:10.1098/rstb.2012.0118)

    • 20

      Sperber D. 1996 Explaining culture. Oxford, UK: Blackwell Publishers.

    • 21

      Morin O. 2015 How traditions live and die. Oxford, UK: Oxford University Press.

    • 22

      Scott-Phillips T. 2017 A (simple) experimental demonstration that cultural evolution is not replicative, but reconstructive – and an explanation of why this difference matters. J. Cogn. Cult. 17, 1–11. (doi:10.1163/15685373-12342188)

    • 23

      Henrich J, Boyd R. 2002 On modeling cognition and culture: why cultural evolution does not require replication of representations. J. Cogn. Cult. 2, 87–112. (doi:10.1163/156853702320281836)

    • 24

      Richerson PJ. In press. Recent critiques of dual inheritance theory. Evol. Stud. Imaginat. Culture. (doi:10.26613/esic/1.1.27)

    • 25

      Dawkins R. 1996 Climbing mount improbable. New York, NY: W. W. Norton.

    • 26

      Sterelny K. 2011 Darwinian spaces: Peter Godfrey-Smith on selection and evolution. Biol. Philos. 26, 489–500. (doi:10.1007/s10539-010-9244-0)

    • 27

      Acerbi A, Mesoudi A. 2015 If we are all cultural Darwinians what's the fuss about? Clarifying recent disagreements in the field of cultural evolution. Biol. Philos. 30, 481–503. (doi:10.1007/s10539-015-9490-2)

    • 28

      Eriksson K, Coultas JC. 2014 Corpses, maggots, poodles and rats: emotional selection operating in three phases of cultural transmission of urban legends. J. Cogn. Cult. 14, 1–26. (doi:10.1163/15685373-12342107)

    • 29

      Hull DL, Langman RE, Glenn SS. 2001 A general account of selection: biology, immunology, and behavior. Behav. Brain Sci. 24, 511–528.

    • 30

      Godfrey-Smith P. 2000 The replicator in retrospect. Biol. Philos. 15, 403–423. (doi:10.1023/A:1006704301415)

    • 32

      Dawkins R. 1982 The extended phenotype. Oxford, UK: Oxford University Press.

    • 33

      Sperber D. 2000 An objection to the memetic approach to culture. In Darwinizing culture: the status of memetics as a science (ed. Aunger R), pp. 163–173. Oxford, UK: Oxford University Press.

    • 34

      James W. 1890 The principles of psychology. New York, NY: Holt and Company.

    • 35

      Evans JSB, Stanovich KE. 2013 Dual-process theories of higher cognition: advancing the debate. Perspect. Psychol. Sci. 8, 223–241. (doi:10.1177/1745691612460685)

    • 36

      Kahneman D. 2003 A perspective on judgment and choice: mapping bounded rationality. Am. Psychol. 58, 697. (doi:10.1037/0003-066X.58.9.697)

    • 37

      Norman DA, Shallice T. 1986 Attention to action: willed and automatic control of behaviour. In Consciousness and self-regulation (Advances in research and theory, vol. 4) (eds Davidson RJ et al.), pp. 1–18. New York, NY: Plenum.

    • 38

      Chartrand TL, Bargh JA. 1999 The chameleon effect: the perception–behavior link and social interaction. J. Pers. Soc. Psychol. 76, 893–910. (doi:10.1037/0022-3514.76.6.893)

    • 39

      Heyes C. 2011 Automatic imitation. Psychol. Bull. 137, 463–483. (doi:10.1037/a0022288)

    • 40

      De Houwer J, Thomas S, Baeyens F. 2001 Association learning of likes and dislikes: a review of 25 years of research on human evaluative conditioning. Psychol. Bull. 127, 853. (doi:10.1037/0033-2909.127.6.853)

    • 41

      Olsson A, Phelps EA. 2007 Social learning of fear. Nat. Neurosci. 10, 1095–1102. (doi:10.1038/nn1968)

    • 42

      Bird G, Heyes C. 2005 Effector-dependent learning by observation of a finger movement sequence. J. Exp. Psychol. 31, 262. (doi:10.1037/0096-1523.31.2.262)

    • 43

      Meltzoff AN, Moore MK. 1997 Explaining facial imitation: a theoretical model. Early Dev. Parent. 6, 179. (doi:10.1002/(SICI)1099-0917(199709/12)6:3/4<179::AID-EDP157>3.0.CO;2-R)

    • 44

      Oostenbroek J et al. 2016 Comprehensive longitudinal study challenges the existence of neonatal imitation in humans. Curr. Biol. 26, 1334–1338. (doi:10.1016/j.cub.2016.03.047)

    • 45

      Catmur C, Walsh V, Heyes CM. 2009 Associative sequence learning: the role of experience in the development of imitation and the mirror system. Phil. Trans. R. Soc. B 364, 2369–2380. (doi:10.1098/rstb.2009.0048)

    • 46

      Heyes C. 2016 Homo imitans? Seven reasons why imitation couldn't possibly be associative. Phil. Trans. R. Soc. B 371, 20150069. (doi:10.1098/rstb.2015.0069)

    • 47

      Buskell A. 2017 What are cultural attractors? Biol. Philos. 32, 377–394.

    • 48

      Shea N. 2009 Imitation as an inheritance system. Phil. Trans. R. Soc. B 364, 2429–2443. (doi:10.1098/rstb.2009.0061)

    • 49

      Heyes CM. 1993 Imitation, culture and cognition. Anim. Behav. 46, 999–1010. (doi:10.1006/anbe.1993.1281)

    • 50

      Heyes C. In press. Human nature, natural pedagogy, and evolutionary causal essentialism. In Why we disagree about human nature (eds Lewens T, Hannon E). Oxford, UK: Oxford University Press.

    • 51

      Heyes C. 2016 Blackboxing: social learning strategies and cultural evolution. Phil. Trans. R. Soc. B 371, 20150369. (doi:10.1098/rstb.2015.0369)

    • 52

      Heyes CM. 2016 Who knows? Metacognitive social learning strategies. Trends Cogn. Sci. 20, 204–213. (doi:10.1016/j.tics.2015.12.007)

    • 53

      Shea N, Boldt A, Bang D, Yeung N, Heyes C, Frith CD. 2014 Supra-personal cognitive control and metacognition. Trends Cogn. Sci. 18, 186–193. (doi:10.1016/j.tics.2014.01.006)

    • 54

      Heyes C. 2017 Cognitive gadgets: the cultural evolution of thinking. Cambridge, MA: Harvard University Press.

    • 55

      Barkow JH, Cosmides L, Tooby J. (eds). 1995 The adapted mind: evolutionary psychology and the generation of culture. Oxford, UK: Oxford University Press.

    • 56

      Meltzoff AN, Moore MK. 1977 Imitation of facial and manual gestures by human neonates. Science 198, 75–78. (doi:10.1126/science.198.4312.75)

    • 57

      Chomsky N. 1988 Language and problems of knowledge: the Managua lectures, vol. 16. Cambridge, MA: MIT Press.

    • 58

      Baron-Cohen S, Leslie AM, Frith U. 1985 Does the autistic child have a ‘theory of mind’? Cognition 21, 37–46. (doi:10.1016/0010-0277(85)90022-8)

    • 59

      Christiansen MH, Chater N. 2016 Creating language: integrating evolution, acquisition, and processing. Cambridge, MA: MIT Press.

    • 60

      Moore R. 2017 The evolution of syntactic structure. Biol. Philos. 32, 599–613.

    • 61

      Brunsdon VE et al. 2015 Exploring the cognitive features in children with autism spectrum disorder, their co-twins, and typically developing children within a population-based sample. J. Child Psychol. Psychiat. 56, 893–902. (doi:10.1111/jcpp.12362)

    • 62

      Berl RE, Hewlett BS. 2015 Cultural variation in the use of overimitation by the Aka and Ngandu of the Congo Basin. PLoS ONE 10, e0120180.

    • 63

      Corriveau KH, Kim E, Song G, Harris PL. 2013 Young children's deference to a consensus varies by culture and judgment setting. J. Cogn. Cult. 13, 367–381. (doi:10.1163/15685373-12342099)

    • 64

      Glowacki L, Molleman L. 2017 Subsistence styles shape human social learning strategies. Nat. Human Behav. 1, 0098. (doi:10.1038/s41562-017-0098)

    • 65

      Mesoudi A, Chang L, Murray K, Lu HJ. 2015 Higher frequency of social learning in China than in the West shows cultural variation in the dynamics of cultural evolution. Proc. R. Soc. B 282, 20142209. (doi:10.1098/rspb.2014.2209)

    • 66

      Okasha S. 2005 Multilevel selection and the major transitions in evolution. Phil. Sci. 72, 1013–1025. (doi:10.1086/508102)

    • 67

      van de Waal E, Renevey N, Favre CM, Bshary R. 2010 Selective attention to philopatric models causes directed social learning in wild vervet monkeys. Proc. R. Soc. B 277, 2105–2111. (doi:10.1098/rspb.2009.2260)


    Understanding how the capacity for language evolved is a difficult problem, because it is unique to humans, because its computational nature and brain basis are poorly understood, and because the successive stages of its evolution left no fossil record. The problems of the evolution of language and of the cognitive capacities that support it extend over a wide range of disciplines, from comparative ethology to neuroscience, psychology and linguistics, but loom particularly large in evolutionary biology, where the emergence of language has been described as one of a few ‘major evolutionary transitions’ [1]. What made this transition possible? Answering this question is like finding a missing piece of a puzzle from which many other pieces are also missing, and over which there is no consensus as to how the ‘big picture’ should look. The shape of the sought-after piece can still be pondered, but to do so meaningfully requires that assumptions be made about the other missing pieces and how they might fit together.

    Some of these pieces are highlighted by different questions that may be asked about the origin of language, each portraying a different perspective on the topic. What is language? What are the cognitive mechanisms that are involved in language learning and use? What are the roles of cultural exposure and of neural plasticity in the development of an individual's linguistic capacity? What was language's original function? What features of language are common to humans and to other apes' behaviour, and what are different—qualitatively or quantitatively? How much specific evolutionary adaptation was required for language, and how much of it relied on domain-general mechanisms?

    In this paper, we address a very specific, yet critically important aspect of the question of language emergence: what was the ecological–behavioural context that made proto-language adaptive and triggered its emergence? We begin by stating and motivating, in §2, our assumptions regarding the nature of language, which differ in important respects from those that underlie the two major classes of linguistic theories. In §3, we discuss some open questions in language evolution in light of our view of language and of the standard methodology in computational cognitive science, which calls for multiple levels of explanation. In §4, we revisit our central question and mention some of the answers that have been proposed for it. Section 5 briefly discusses some of the relevant findings on the brain basis of language. In §6, we lay out our hypothesis, the Cognitive Coupling (COCO) hypothesis: in the context of the teaching of tool use, brain mechanisms involved in hierarchical planning and in sequential control of actions were coupled with brain mechanisms in charge of social communication, giving rise to the capacity for language (figure 1). Section 7 lists evidence that supports this hypothesis. Finally, in §8 we summarize our argument and outline directions for future work.

    Figure 1. A schematic illustration of the premises and the reasoning underlying the proposed Cognitive Coupling account of language evolution.

    The problem of language evolution looks particularly daunting when seen through the lens of either of the two reigning paradigms in linguistics: the formalist and the functionalist (see [2] for the motivation of these terms). In this section, we briefly describe these two approaches, then outline a third, which we favour and which we claim offers a better foundation for an evolutionary understanding of the emergence of language.

    On the formalist account, language is a formal system (in the technical sense of the phrase; [3])—a set of generative rules or grammar that licenses a discrete infinity of well-formed strings of symbols or sentences, to each of which it assigns a hierarchical tree-like structure. The primary use of language is taken to be the structuring of thoughts, with its use for communication being considered ‘ancillary’ [4]. Although both thinking and communication are presumed to involve the generation and analysis of ‘meanings’, formalist linguistics does not favour a specific approach to semantics, which is seen conceptually as secondary to syntax and which, as a discipline, appears to lag behind theories of syntax (e.g. [5]). For a detailed critical discussion of the formalist approach to language, see Edelman [6,7].

    Because the possession of a formal generative grammar implies an ability to generate an infinity of structured sentences that no finite amount of training data can possibly motivate without a ‘quantum leap’ of generalization,1 formalists have traditionally shunned questions of evolution altogether, or else resorted to ‘singular mutation’ accounts (such as the one behind the Merge operation; e.g. [9,10]). Chomsky, in particular, favours viewing the emergence of language as having been due to a unique ‘discontinuity’ (for a recent recap of that view, see [4]), perhaps acting on his own earlier advice (‘It is perfectly safe to attribute this development [of innate mental structure] to ‘natural selection’, so long as we realize that there is no substance to this assertion’; [11, p. 97]). We note that the continued tendency to regard language evolution as a ‘mystery’ (as per the title of [12]) is actually a reasonable stance for the formalists, not only because of the problematicity of the all-important point mutation as the key explanatory move, but also because of the dearth of unequivocal empirical (behavioural and neurobiological) support for the generative hypothesis (see [6], §5a, for a detailed discussion and references).

    On the functionalist account, language is ‘for’ communication, which makes it inherently social and also elevates semantics and, more generally, concepts and cognition, to a position of primary importance in linguistic theory. Grammar still plays a central role as the tool for encoding meanings into a form that can be easily transmitted via the available gestural (acoustic or other) channel and for decoding the messages on the receiving end. Because they assume that communication between cognitive agents is the overarching goal, functionalist approaches to grammar do, however, tend to be better integrated with theories of cognition, by positing computational mechanisms that are shared between conceptual and linguistic processes (e.g. [13]).

    Because functionalist theories still involve the concepts of grammar and well-formedness, they pose the same conceptual problems, and give rise to the same tensions with regard to empirical findings, as do formalist theories (examples include positing intricate rules that involve hidden structures and having to garner behavioural and neural evidence for these rules and structures; see [7] for details and discussion). This problematicity extends to, and is amplified in, the context of the origins of language, where functionalists need to explain not only the emergence of well-formedness (and of course infinite generativity), but also of the kind of code coordination that would make the sharing of meanings possible.

    While the evolution of both these traits has been demonstrated in agent-based computational simulations (e.g. [14,15]), these had been geared from the outset towards rewarding effective and properly (i.e. compositionally) structured messaging, leaving open the question of our main concern: the initial ecological context and adaptive value of these traits for agents who prior to the emergence of full-blown language had had no use for either structure or compositional meaning.

    In contrast to the formalist and functionalist approaches, we suggest an alternative view of language and adopt it as a working hypothesis ([6,7]; see [16,17] for similar earlier approaches). It does not postulate the existence of infinitely generative rules (a grammar), nor does it assume that people speak in complete, well-formed sentences that possess a uniquely linguistic kind of recursive structure. Furthermore, we do not accept that language is primarily ‘for’ either thinking (as per the current Minimalist Grammar version of formalist linguistic theory) or communication, construed as exchanging coded meanings (as per the functionalist approach). Rather, we assume that language constitutes a toolkit that effectively supports an individual in influencing the state and the behaviour of others (and perhaps also self; this notion is discussed, e.g. in [14,18–21]). This system's sophistication transcends anything found in the rest of the animal kingdom, yet its basic computational principles and neural mechanisms, including those that give rise to sequential and hierarchical structure, are closely related to those underlying animal signalling and social interaction.

    Importantly, this view highlights the role of language as a means of communication, but, as opposed to some previous accounts, communication is construed as the interlocutors being able to influence each other's (and their own) cognitive and behavioural processes by linguistic means [22]. This take on communication works not just for the many types of situations that require ‘mere’ steering of thinking or behaviour in a desirable direction, but also for conveying complex information, such as the design for a complex mechanical contraption or an entire strategic hunting plan. In these latter scenarios, people rely heavily on conceptual (that is, perceptual, abstract and motor) knowledge that their interlocutors possess ahead of time and that need not be (and cannot be) included in the ‘messages’ that are being exchanged (cf. [16]).

    The construal of language as an instance of a system of social interaction suggests that in order to understand the evolution of the capacity for language from, and based on, the cognitive substrates that preceded it, it would be instructive to consider the elements that it shares with communication systems in other animals, particularly apes, which were likely to have been in existence in our most recent common ancestor, as well as those aspects in which language differs from such communication systems (cf. [23]). As noted by Ackermann et al. [24], ape vocalizations lack the kind of sequential (let alone hierarchical) structure that characterizes language; in comparison, the emotional modulation and social uses of vocal gestures are similar in apes and humans, and apes, like humans, are capable of planning and carrying out structured behaviours in domains other than language.

    This makes it feasible for the coupling of the capacity for structured behaviour with communication-related traits (such as the intent for affecting the state of others) to have emerged in the hominin line by a sequence of minimal steps (changes in brain circuitry), each of which was adaptive. A relevant observation here is that the primary features of language that distinguish it from ape communication systems are supported by neural mechanisms that play prominent roles in other areas of ape behaviour. In particular, the processing of serial order and hierarchical structure, including long-distance dependencies in space and time, is necessary for behaviours such as primate tool use, navigation, foraging and social action.

    Following the hypothesized sequence of adaptations, language became characterized by serial order and hierarchical structure and dependencies, imposed over vocal (or other physical) gestures that may be individually and/or jointly referential [25]. Language also acquired a number of distinctive social characteristics (in addition to those that it shares with other behaviours [26]). Some of these specifically human traits are the intentional use of language to affect others' states and behaviours [27], while tailoring the verbal means to the recipient; propensity for extended dialogic interaction [28]; a vast, learning-intensive, socially acquired component that dwarfs its innate foundation [29]; and a critical role in humans’ capacity for innovation [30]. Importantly, elements of each of these, from the so-called theory of mind to a combination of innate, learned, and invented behaviours, can be found in non-human ape communication, but are severely constrained and limited compared to their ubiquitous role in humans [31–34].

    Giving up the formalist conceptions of grammar and well-formedness makes it easier for us (as it would for certain other functionalist approaches) to resolve other problematic ideas associated with these conceptions. One such idea that we can safely set aside is that babies learn infinite productivity and structural perfection, traits found in no other species, from finite data; another one is that our species evolved such unique traits from scratch. Furthermore, because we hold that the basic use of language is social in a manner that allows for, but does not necessitate, the exchange of perfectly structured information, our approach is well suited to serve as a bridge between, on the one hand, theories of animal communication and its evolution, and, on the other hand, linguistics and the evolution of language.

    With the literature on language evolution growing apace (see the recent special issue of Psychonomic Bulletin & Review for a sample, [23], as well as [35]), the range of research questions that are being entertained is very broad. To impose some structure on these, we invoke a methodological consideration that has long been standard in evolutionary, behavioural and cognitive sciences: the notion of levels of explanation. In evolutionary biology, these include Mayr's distinction between proximate and ultimate causes [36,37]. In ethology, there is Tinbergen's [38] fourfold distinction among questions of survival value, ontogeny, evolution and causal powers of a trait. Finally, in computational cognitive science, there are the three levels of explanation identified by Marr & Poggio [39–41]: the levels of problem; representation and algorithm; and implementation (see [42] for a discussion of evolution in this context).

    The various levels of explanation are not independent: as one of us pointed out in connection to vision [39,43] and language [6,7], a specific assumption made on one level has implications, sometimes negative, for the directions in which inquiry on other levels proceeds. With regard to language, in particular, getting the problem-level answer or assumption wrong can prevent us from understanding how it works in the computational sense, or how it has evolved [6].

    The interdependence of the levels of explanation suggests that the answers to the many questions arising from language evolution are also connected to one another. For instance, a saltatory as opposed to a gradualist account is made more plausible by assuming that there is one key feature (such as the Merge operation that purportedly allows recursion; [4]) that makes language what it is (and different from other animal signalling and communication systems). Given that Merge and recursion, as well as the broader notion of grammar, have no bearing on the question of how gestures can be, or have become, referential, it is not surprising that the formalist linguistic theory built around these concepts posits that language is primarily ‘for’ internal thought, largely skirting the problem of reference. Rejecting, as we do, the formal notion of grammar (and the conceptual centrality of Merge) thus leads to a completely different set of takes on the questions of saltation/gradualism and reference.

    Adopting the popular alternative assumption regarding what language is ‘for’—communication, as per functionalist linguistics—suggests other answers to these questions and brings to the fore yet another set of issues. In particular, it becomes critically important to determine how language relates to honest signalling [44–46]. Furthermore, given that communication is by definition a social/cultural activity, understanding the interaction between the genetic evolution of language users and the cultural evolution of language becomes key, giving rise, in turn, to questions about the relation between the evolution of the cognitive capacity for language and the ways in which language evolves (cf. Hurford's [47] discussion of the notion of ‘glossogeny’, which dates back to Jacobson and earlier work in linguistics). It also brings to mind queries as to which aspects of language acquisition are developmental, dependent on cultural exposure, and which are innate, a product of evolutionary adaptation.

    As already noted, our take on language diverges from the standard functionalist one (albeit not so drastically as it differs from that of formalist linguistics). On the abstract computational level, our view holds that language users aim to influence the cognitive state, and accordingly the behaviour of others (and of themselves, when resorting to internal speech). This implies that the key computational problem that the cognitive system needs to address is choosing what to say next (note that this is a subset of the general problem in the control of behaviour, which is deciding what to do next; [7,48]). On the level of representation and algorithms, language production is supported by a system of options structured like a graph, whose nodes correspond to the discrete combinatorial elements of language—phonemes on one level; morphemes or words on another [49–51]. Paths through this graph, which correspond to the (multimodal; [48]) utterances, are constructed by the speaker ‘on the fly’ during production, by an algorithm that is akin to competitive queuing [52], subject to a number of dynamically applied constraints, which include real-time social feedback; the listener's own version of the graph mediates the processes (and the resulting effects) on the receiving side. Finally, on the level of neural implementation, the use of language, similar to any other behaviour, involves activity that is distributed over most of the brain; some particularly relevant circuits are discussed in §5 below (see also [7]).
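The representation-and-algorithm level described above can be illustrated with a toy sketch. It is a minimal illustration under stated assumptions, not the authors' model: the function name `competitive_queuing`, the example graph and the activation values are all hypothetical, and the dynamically applied constraints (such as real-time social feedback) are left out. At each step the successors of the current node compete, the most active one is produced, and it is then suppressed so that it cannot be selected again.

```python
from typing import Dict, List, Set


def competitive_queuing(
    graph: Dict[str, Set[str]],
    activation: Dict[str, float],
    start: str,
    max_len: int = 10,
    threshold: float = 0.0,
) -> List[str]:
    """Build one path through `graph` by repeated winner-take-all selection.

    Hypothetical toy version: `graph` maps each node (e.g. a word) to the
    nodes that may follow it, and `activation` holds a fixed planning
    gradient over nodes. Real-time constraints are not modelled here.
    """
    act = dict(activation)   # work on a copy so the caller's plan is untouched
    path = [start]
    act[start] = 0.0         # suppress the item that has just been produced
    current = start
    while len(path) < max_len:
        # Candidates: successors of the current node that are still active.
        candidates = [
            node for node in graph.get(current, set())
            if act.get(node, 0.0) > threshold
        ]
        if not candidates:
            break
        # Competitive choice; sorting first makes tie-breaking deterministic.
        winner = max(sorted(candidates), key=lambda node: act[node])
        path.append(winner)
        act[winner] = 0.0    # suppress the winner after it is produced
        current = winner
    return path


# Hypothetical example: a tiny graph of words and a planning gradient.
graph = {
    "take": {"the", "a"},
    "the": {"stone", "flake"},
    "a": {"stone"},
    "stone": {"now"},
    "flake": set(),
    "now": set(),
}
activation = {
    "take": 1.0, "the": 0.9, "a": 0.4,
    "stone": 0.8, "flake": 0.3, "now": 0.2,
}
utterance = competitive_queuing(graph, activation, "take")
print(" ".join(utterance))
```

In this sketch the social feedback mentioned in the text could be approximated by adjusting the activation values between steps, but that extension is an assumption and is not implemented here.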

    We note that our problem-level view of the nature of language places it much closer than other accounts to other behaviours, including those of other species; the same holds for the algorithms that shape language behaviour and the brain circuits that implement them. Thus, we expect our approach to be more amenable to integration with evolutionary theory than formalist and functionalist ones. With the just-stated conception of language in mind, we now turn to the specific question of the adaptive value of language and the context of its emergence.

    A popular way of framing this question is by focusing on ‘the adaptive value of the first word’ ([53, p. 165]). Assuming, as we do, that language is ‘for’ influencing others [14,16,18,20] implies that the context we are interested in must be a social one, but this leaves open a rather broad range of possible ecological–behavioural contexts. A principled exploration of this range may be facilitated by the explicit layout of evolutionary considerations that need to be taken into account (e.g. [54]).

    First and foremost, the ecological–behavioural context within which the capacity for language evolved (henceforth the context) was necessarily one whose occurrence did not rely on the pre-existence of language. In other words: although once language existed it may have turned out to be useful, and perhaps necessary, in many contexts, the fundamental setting in which language evolved must have been one that initially was not facilitated by language at all.

    Second, this context must have been one in which alternative outcomes of the involved individuals' behaviour had consequences for their fitness, and this context had to recur on a frequent basis, in the lives of many, and over multiple generations. In order to drive evolutionary change that would give rise to language, these outcomes must have been influenced by the extent to which communication between the individuals was successful.

    Third, the context we seek is one in which the interests of the involved individuals are largely aligned. This stems from the fact that language does not have an inherent mechanism that assures honest signalling [44], and it seems highly unlikely that such an elaborate communication system, which is demanding of all individuals involved, and which has the potential to be easily used for deception, would evolve outside of a context in which all involved stand to benefit greatly from successful communication and transfer of knowledge. Although not exclusively, a situation in which most communicators are kin seems the most likely in this context [54–56].

    Finally, although not absolutely necessary, it seems reasonable to assume that the context that gave rise to language, given language's flexibility, open-endedness and being itself a cultural construct, would have been one that would change at a rate that requires cultural transmission of knowledge, perhaps because genetic adaptation to the environmental challenge would be too slow. This further suggests that the setting may be one in which niche construction and co-evolution of the context and the language that is involved in its facilitation may take place. It also seems likely that steps along the process occurred in the form of a Baldwin effect: that plasticity in the use of the communication system supported successful coping with the challenge provided by the ecological context, and that selection favoured individuals who happened to be better cognitively pre-adapted to this means of communication, gradually giving rise to the innate capacity for language that characterizes modern humans.

    As is the case with other behaviours, language engages a broad coalition of brain areas and circuits ([57–60]; for a summary, see [7], §5). Although this view is supported by a large and growing set of neuropsychological findings (in addition to the general evidence for massive sharing of brain areas across tasks; [61]), these are commonly still given a conservative interpretation. This interpretation distinguishes between ‘core’ language areas (such as Broca's), where a lesion results in a major loss of function, and ‘optional’ ones (e.g. the cerebellum), where lesions may seem to have little, if any effect on normal functioning.

    A closer consideration reveals, however, that both these observations are open to challenge. First, lesions in a core language area typically result in a loss of other functions in addition to language (e.g. [62] on Broca's area lesions). Second, lesions in other areas may impair language in ways that standard tests can easily miss (e.g. [63] on cerebellar lesions).2 These observations suggest that the brain basis of language is much broader than commonly thought—a notion that is entirely in line with Anderson's [61] Neural Reuse hypothesis, according to which ‘neural elements originally developed for one purpose are put to multiple uses’. (See also the broader notion of evolutionary exaptation, [64–66].) Language evolution, in particular, is thus seen as having been facilitated by the hijacking of existing brain mechanisms, which came to be reused for new tasks, while retaining their old use [57,67,68].3

    The category of tasks to which Broca's area, in particular, makes a critical contribution offers a hint as to how such hijacking may have occurred. In addition to its classically documented involvement in the structural (syntactic) aspects of language, Broca's area supports hierarchical structural processing that may or may not be temporal, as, for instance, in task decomposition [62,69]. As such, it serves as a functional hub whose capacity for hierarchical computation can be used both for planning sophisticated action selection in manual tasks and for sophisticated communication.4 Furthermore, there is abundant evidence suggesting that action selection and the temporal sequencing of elementary actions are mediated by circuits linking cortical areas (including those in the frontal lobe, such as Broca's) and the thalamus with the basal ganglia (e.g. [52,70]). It has been proposed that these circuits are central to language (e.g. [71–73]). Indeed, the striatum, which is the first destination of the cortical projections to the basal ganglia, appears to be absolutely necessary for language acquisition [74,75].

    A detailed two-systems view of language that focuses on the above circuits has been advanced by Ackermann et al. [24], who posit one set of mechanisms for the ‘digital’ content of utterances and the other for the ‘analogue’ prosodic/affective information that they include. They argue that, although most of the components of each of these two systems are present both in humans and in other primates, non-human primates lack both the kind of coordination that allows independent control over the two and some circuits that may be crucial.

    The emergence and subsequent epigenetic and genetic entrenchment of any new circuitry5 would have been particularly favoured by selection if the early hominin brain was being ‘prepared’ for it by having the relevant functional connectivity ramped up by the behavioural contexts and tasks involving proto-language (cf. the Baldwin effect; [76–78]). Interestingly, in modern humans, learning to read—a task that came into existence recently in human evolution and is unlikely to have had time to significantly influence human cognition via selection on genetic variants—causes, over a period of just several months, a significant shift in the pattern of functional connectivity in the brain [79], demonstrating the potential impact of a novel task that is accommodated by pre-existing circuitry.6 Similarly, with regard to manual behaviour, changes in functional connectivity were also found in the brains of human subjects who underwent several months of training in stone tool knapping [82].

    The preceding observations may help us understand how the human capacity for language could emerge from the primate-general mechanisms. The hypothesis that the brain basis of language was shaped by the hijacking/reuse of existing circuits, which already had well-established uses outside of the context of communication, may be considered in conjunction with two specific such mechanisms (over and above the general pattern of brain connectivity, such as the centrality of the pulvinar, noted in endnote 6). The first is the utilization of the cortical (prefrontal and motor)/basal ganglia circuits for learning visuo-motor sequences [83]—a function that is critical for, among other tasks, the learning of tool use. The second is the use—indeed, the indispensability—of the same circuits for reinforcement learning, which aims to maximize cumulative expected reward [84–86].

    In light of these considerations, we may supplement our list of evolutionary considerations with the requirement that the proposed ecological–behavioural context for the evolution of the capacity for language was a setting that would have supported co-option and temporal coupling of the circuits involved in temporal sequencing and hierarchical processing with those involved in communication. Such a context must be one that does not initially rely on the coupled action of these mechanisms in order to exist, but which is likely to reliably prompt their simultaneous use again and again throughout an individual's lifetime and across generations. This simultaneous use would have had to be advantageous, such that individuals in whom the mechanisms tended to couple with one another more readily had an advantage.

    One scenario that meets these requirements is the teaching of tool use—a task whose brain basis, perceptual/motor and social/communicative aspects are close enough to those of language to suggest that it may have made protolanguage adaptive [54,87], thereby facilitating the reuse and temporal coupling of brain mechanisms and driving the evolution of language as we know it. Interestingly, this coupling may have been facilitated by the dual role of the cortical/basal ganglia circuits both in perceptual-motor behaviour and in reinforcement learning, since feedback for effective dyadic communication and successful learning would provide immediate social reward alongside the attendant advantages of social transmission of knowledge. In the next section, we describe this scenario in some detail and discuss its implications.

    The problem of explaining the origin of language has been eloquently posed by Premack [88, p. 282]:

    Human language is an embarrassment for evolutionary theory because it is vastly more powerful than one can account for in terms of selective fitness. A semantic language with simple mapping rules, of a kind one might suppose that the chimpanzee would have, appears to confer all the advantages one normally associates with discussions of mastodon hunting or the like. For discussions of that kind, syntactic classes, structure-dependent rules, recursion and the rest, are overly powerful devices, absurdly so.

    A number of suggestions have been put forth regarding the origin of language, framing the question in various ways and accordingly focusing on different aspects that may be related to it [23,45]. One widely cited model, the gossip theory, suggests that language evolved in the context of increasing hominid group size, allowing this growth by providing an efficient means to exchange information about non-present individuals [89,90]. According to this approach, this is a necessity that arises when the group reaches a size at which most individuals cannot spend the majority of their time with one another.

    Other approaches, along similar or related lines of reasoning, suggest that language emerged as a means of communication that would replace one-on-one grooming or would increase group cohesion through ritual [46,89,91]. Another model, the hunting coordination theory [92,93], suggests that language emerged as a means of coordinating complex activities that demanded planning ahead of time and during which the participants could not communicate due to limited physical proximity. Yet another model suggests that language, similar to bird song and courtship displays, evolved as a response to pressures of sexual selection or for the establishment of social status ([94], and see [95,96]).

    A detailed discussion of these ideas is beyond the scope of the present paper. We believe that each of these theories has some merit and the mechanisms they posit may have had a role in some phase of language evolution, particularly once language had already developed. However, each of them fails to account for the origin of language, for at least one of a number of reasons. The majority of these theories propose ultimate explanations for language, i.e. they explain what its use may have been once it emerged, but ignore the proximate mechanism by which language could have evolved, in such a manner that every step in the process of its evolution was independently advantageous. Thus, it may be useful to talk about a person, object or ambush plan that is out of sight, but it is unclear how such an ability could have gradually evolved [54].

    Another point of weakness in some theories such as sexual selection is that it is unclear how and why language would become referential, or why these selective forces, similar to those that many species experience, would lead to language in humans but not in other organisms (discussed in, e.g. [23]). We suggest that the scenarios regarding the origins of language that have been proposed thus far either do not take into account the evolutionary considerations outlined in §4 or do not fulfil the requirement that the context include simultaneous use of brain circuits involved in temporal sequencing of behaviour and in communication.

    An ongoing debate is whether language evolution included an early phase of gestural communication that was later replaced by vocal utterances [20,31,92] or whether it arose directly from vocal communication skills that preceded it (discussed in [32]). It has been pointed out that the dichotomy between gestural and vocal origins may be false, because both may have taken place contemporaneously [67]. We return to discuss this topic below, in light of the scenario for the origin of language that we propose.

    Here is how the sequence of events underlying and driving language evolution could have proceeded, according to the scenario we just outlined. At the beginning of the evolutionary transition in question, our hominin ancestors possessed brain mechanisms that enabled them (a) to engage in the hierarchical planning and sequential control of actions and (b) to perform elementary communication acts consisting of isolated manual and vocal gestures. The main components of the mechanisms supporting (a) and (b) were homologous, respectively, to the cortical/basal ganglia circuits mentioned above and the primate-general ‘limbic communication system’ [24].

    One kind of behaviour that is enabled by hierarchical planning and sequential control and may benefit from concurrent communication is instruction in tool-making and tool use [54,97]. We propose a process that plays out on two timescales. The first occurs along an individual's lifetime: individuals who learned and taught tool-making while at the same time engaging in elementary communication learned and taught more efficiently, produced better tools and enjoyed a selective advantage. Developmentally, this would have led to the coupling of the circuits involved (as per Hebb's idea that neurons that ‘fire together, wire together’). Such yoking of the two sets of mechanisms is likely to have occurred unconsciously, reinforced directly by the increase in instructional success. Over an evolutionary timescale, natural selection favoured the lineages of individuals in whose brains the coordination between these two activities was more readily facilitated by changing functional connectivity. Since variation also exists and occasionally arises in the innate aspects of brain architecture, selection also would have favoured lineages in which greater coupling of communication and serial order was genetically determined via newly emerging circuitry.

    Related ideas regarding the context in which language arose have been put forward previously, albeit not in a form that links them with the requirement of co-occurrence of communication and temporal sequencing of actions, which could have led to the language-unique coupling of brain circuits. Laland [54] and others [87,97] proposed, on the basis of evolutionary considerations that highlight honest signalling and cultural niche construction, that the context of language evolution was in teaching among kin. Stout and others [67,98,99] hypothesized that tool-making and its instruction gave rise to language; they provide an elaborate analysis of the neural correlates of the two behaviours and use experiments to demonstrate the effects of learning stone knapping on functional connectivity in the brain and of tool production on neural activation.7 We endorse these accounts, and suggest that combining them leads to a powerful working hypothesis regarding the origins of the capacity for language.

    The account we proposed is in line with the evolutionary considerations we have laid out; importantly, it also provides an explanation for the fact that language evolved only in humans, as an outcome (and perhaps also a driver) of humans' increasing reliance on a culturally constructed niche for their survival. This is a context that changes rapidly, relying on learned knowledge, in which teaching and effective information transfer are highly advantageous. Such learning and teaching requirements, coupled with the anchoring of some of the most critical human skills in the physical realm of sequential behaviour in tool production, offer a natural context that bridges communication and sequential and hierarchical processing, which are largely separate from one another in other apes. It also accounts for the referential nature of language and suggests a way in which it could have arisen gradually, as each incremental improvement in the ability to refer to seen and—later—unseen aspects of the demonstrated process would have been independently advantageous. This is particularly true given that this cultural context is distinguished by the need for precision in some of its components.8

    The debate regarding the hypothesis of the gestural origin of language can now be reconsidered. The ecological context of language origin that we propose assigns a major role to physical manipulation and motor behaviour, the fundamental tenets of the gestural hypothesis [20,21]. Importantly, these are called upon in our scenario, at least in the primal phase of language evolution, in a functional context without a communicative intent, namely in the service of producing or manipulating tools. However, the recruitment of gestural behaviour, alongside vocal communication, would have been natural and probably occurred early on, for instance in the form of repeating a functional manual gesture multiple times for emphasis, or miming a functional manipulation before it takes place to aid the learner in parsing and interpreting the sequence of actions. In this context, not only are the gestural and vocal approaches not mutually exclusive, they complement each other. The joint involvement of both modalities in the emergence of language is, according to our proposal, to be expected. This notion is also consistent with the multimodal nature of language [48,102].

    Although alternative hypotheses of the origin of language invoke different behavioural settings, it would be naive to suggest that any single ecological context such as the teaching of tool production to kin was the exclusive driver of language evolution. First, the teaching of tool production is but one of many social behaviours that share features that we hold to be crucial in supporting the evolution of language. Coordinated manipulation of objects, as in constructing a shelter or using a large hunting net, for example, is one of these (see discussion in [103]). Second, multiple contexts that an individual experiences throughout life are likely to have shaped each of the systems involved in the emergence of language. Thus, for example, the cultural practice of tool use and production, which requires the learning and honing of very precise, highly intentional and sometimes non-intuitive, motor manipulation, and which had probably evolved in the hominin lineage for millions of years before the emergence of language [21], is likely to have primed and selected for brain mechanisms that later—through coupling with the communication circuitry—gave rise to language. Finally, once language or proto-language began to develop and as it developed, it would have been rapidly applied to additional contexts, which—in turn—would have added to the sum of selective pressures that act on this composite trait [104]. We suggest that the context of teaching tool production and use served as a primal and primary context for the evolution of language and the capacity for it, but additional contexts necessarily played a role even in the early stages of language evolution and even more so as this capacity evolved.

    We now proceed to list, and briefly discuss the significance of, sources of evidence in favour of the COCO hypothesis, drawing on studies in archaeology, cognitive and developmental psychology and neuroscience. Many of the points that we make here recap material already mentioned in §§2–5.

    Language does not fossilize in a manner that leaves tangible archaeological footprints. Accordingly, any link between language and archaeological findings is inferential, and is often deep in the realm of speculation. Various artefacts have been viewed as evidence of complex cognitive abilities and effective means of cultural transmission; some researchers have postulated that certain findings, such as composite tools, cave paintings and artefacts suggestive of symbolic thought such as beads, were unlikely to have been produced in the absence of language [93,105–108]. Even if such inference is taken at face value, it provides, at best, a limit regarding the time frame in which language may have emerged [107,108], but does not tell us much about the context of its origin.

    It has been argued that morphological changes in the vocal apparatus or brain structures, whose existence may be inferred from the skull's shape, can serve as evidence regarding language use [107]. We find this unlikely, in light of the fact that language can be carried out fully over a gestural modality as in sign language, and that multiple brain regions are involved in language use while no brain region serves language alone (footnote 9).

    However, a new discipline—experimental archaeology—is rapidly bridging the gap between the archaeological record and the neural and behavioural dynamics that it may reflect. This approach involves the study of apes (human and non-human) as they manipulate and produce tools and while they teach and learn these skills [97,110]. Aided by neuroimaging technology, these studies explore the brain mechanisms involved, and, coupled with studies of comparative neuroanatomy, highlight the changes in brain activation and connectivity patterns in response to and in support of tool use, on timescales ranging from seconds to years [82]. They also allow comparison between the brain structures involved in tool use by humans and other apes, fleshing out the possible results of natural selection on this behaviour [111,112].

    These studies have provided some of the evidence of the overlap discussed earlier between brain mechanisms that support tool use and those that support language. They are also in line with the idea that producing certain tools, but not others, might be associated with qualitatively different cognitive mechanisms, and that these mechanisms, in turn, may be in accord with a related linguistic capacity: Stout et al. [110,113,114] find that Oldowan tool production requires the activation of praxis-oriented neural circuitry in both humans and bonobos, while the production of late Acheulean tools incorporates, in addition to these, circuitry associated with hierarchical organization, abstract action representation, semantic/syntactic integration and long-distance syntactic dependencies. This makes much sense, as the production of Acheulean tools requires the achievement of technological sub-goals, which are interdependent in a manner similar to syntactic dependencies. They also discovered that the same circuits are activated upon observing another individual's production of Acheulean tools, and that in experienced stone knappers who observe another individual's work, neural circuitry associated with intention attribution is activated [67]. We suggest that although these findings cannot prove that language evolved in the context of technological pedagogy, the coordinated neural activation of the primary elements of linguistic communication, sequential and hierarchical processing and inference of another individual's internal state, during tool production or observation, constitutes compelling support for this hypothesis.

    The behavioural contexts of communal tool use and tool-making jointly activate cognitive mechanisms that are also employed in a range of other situations and tasks, both in humans and, to various extents, in other primates. These include, on the one hand, sequentially and hierarchically structured action control and, on the other hand, communication and social manipulation (see [7], for behavioural and computational analyses, concrete examples of relevant tasks, and extensive references). Socially shared tool-related activities, and especially tool-making instruction, are therefore prime candidates for the kind of context that first brought together previously unrelated abilities that had existed in the hominin line, making their integration and the subsequent emergence of modern capacity for language initially and continually adaptive.

    The tool-making pedagogy hypothesis of language origin fits naturally with the view of language as a skill (e.g. [115,116]). None of the skills that distinguish humans from other animals—with the exception of the skill to learn language and other complex behaviours—are available at birth: all must be learned. Just like the learning of other complex skills, language acquisition involves social interaction [117–119] and possibly even instruction [120]. The presumably innate foundation over which the specifically linguistic skills are built during ontogeny, the ease with which humans combine communicative influence and reference with complex sequential hierarchical structuring of actions, may be considered an ‘evolutionary fossil’ of the social skill-transmission context in which the two components of this combination first came together. A less indirect type of evidence for the existence of a common neural substrate for gesturing, object manipulation, tool use and communication can be obtained by studying the trajectories of development of these behaviours throughout childhood, with particular stress on early phases of their acquisition (e.g. studies discussed in [98]).

    The evolutionary convergence, as well as the behavioural, computational and developmental commonalities between language and other advanced human skills, would be difficult to explain were the brain basis of language confined to the cortical Broca's and Wernicke's areas, as most introductory textbooks still have it. The understanding that emerges from dozens of studies is, however, that most brain areas, both cortical and subcortical, participate in supporting most tasks that have been considered [61], with language processing being no less distributed ([58,121]; see [7], §5, for a brief synthesis). The critical question is, of course, what made the brain mechanisms underlying various relevant behaviours that had been in place prior to the emergence of (proto-) language particularly (and, among all species, uniquely) suitable for supporting such emergence. The answer, based on the discussion of the brain basis of language in §5 above, is twofold. First, the relevant behaviours, as per the tool-making pedagogy hypothesis, included sequential hierarchical planning and control, together with social engagement and attempted influence or ‘communication’. Second, the circuits that controlled these behaviours (as mentioned in §5) became functionally interdependent, a development that facilitated their subsequent greater anatomical affinity, as documented in modern humans (see [67] for a detailed survey of the relevant circuits and a discussion of the significance of the anatomical and functional findings for the ‘technological’ hypothesis of language origin).

    A number of predictions can be derived from the COCO hypothesis, which may direct analysis and interpretation of empirical findings and can suggest directions for further exploration.

    (1) We predict structural similarities (of the kind that exists between tutor and juvenile songs in the zebra finch; [122]) between (i) transcripts of manual skill pedagogy and (ii) child-directed speech (CDS). In particular, various structural characteristics of CDS, such as the prevalence of variation sets (e.g. [123]), are known to change over time as infants grow older and become more proficient in language (e.g. [124]). Given the parallels between linguistic development and skill instruction mentioned earlier, we expect a similar developmental trajectory to be found in both cases. Ideally, both language and skill pedagogy episodes should be analysed in their full multimodal complexity, in full detail [125], and using state-of-the-art computational tools [126].

    (2) If some component of the cognitive coupling has not become innate and remains dependent on experience, we should expect that exposure to linguistic sequences that are coupled with dynamics in the visuo-motor modality, such that these circuits' activation is coordinated, would be a particularly effective way of supporting language learning. This may be testable experimentally.

    (3) The COCO hypothesis is founded on the assertion that communication that is coupled with tool use would facilitate its social transmission. This can be tested in experiments, particularly with children. Dimensions of interest would include whether (and how) children choose to incorporate coupled communication in their demonstration of a physical skill, when instructed to teach the skill or to help another child acquire it; whether doing so increases transmission efficacy; how communication that accompanies the physical demonstration changes along transmission chains that include multiple children; what kind of communication is added, e.g. gestural or vocal, linguistic or non-linguistic; and how this changes as a function of age, earlier experience or the details of the instructions provided.

    (4) It may be useful to study the manner in which human cognition accommodates evolutionarily novel tasks other than language and how new skills emerge when provided with a cultural context that makes them beneficial or that reinforces them socially. Such are the skills related to reading and writing, complex mathematics and the perception of virtual reality settings and the behaviour in them (e.g. [127,128]). In particular, it may be interesting to study whether different timing of the exposure to these settings along the individual's cognitive developmental trajectory affects the manner in which they are accommodated by neural circuitry.

    (5) The COCO hypothesis ascribes a prominent role in language to coupling between neural circuits, which may occur with greater ease if linguistic exposure begins early in cognitive development. In this light, perhaps some differences between first and second languages, which are learned at a later age, stem from different efficacy of coupling between circuits that were evolutionarily involved in non-linguistic communication and circuits related to serial order and hierarchical processing. This possibility suggests that there should be prominent differences between first and second languages in the ease and fluency with which individuals process and produce utterances that include complex or high-order syntactic dependencies. An indication that this is the case may be found in existing studies (e.g. [129,130], but see also a different result, perhaps due to the influence of explicit language instruction, in [131]), but targeted exploration of this possibility may yield interesting insight.

    To summarize, our hypothesis on the origin of language, the COCO hypothesis, rests on the notion that language is a means by which individuals influence one another's state, through communicating information in a manner that is characterized by serial order and hierarchical structure, imposed over referential vocal (or other physical) gestures. This view of language highlights its reliance on two sets of pre-existing neural mechanisms, those involved in communication and those involved in serial behaviour, giving rise to the central tenet of our theory: that the coupling of these mechanisms triggered the emergence of language. Adopting an evolutionary perspective in searching for a plausible proximate mechanism and an ecological context in which language could have emerged, we suggest that the most likely setting that induced this neural coupling was instruction in tool use and tool production.

    Our COCO hypothesis provides an important missing link connecting three previously suggested takes on language and its evolution: theories that propose that language evolved in the context of teaching among kin [54,67,87,97]; theories that highlight the reliance of language on pre-existing neural substrates of communication and of hierarchical and serial order processing (e.g. [57,61,98]); and observations showing that these neural circuits are common to language and to tool production and use (e.g. [67,98]). These three takes, respectively, suggest a selective force that could favour efficient communication, provide an invaluable mechanistic description of the solution that evolved in response to this challenge and point out the ‘suspicious coincidence’ that is inherent in the sharing of neural circuits between language and tool production. However, even when taken together, these ideas leave open the question of the concrete step-by-step evolutionary dynamics that (i) plausibly led to language as we know it and not to some other solution to the evolutionary challenge (footnote 10), and (ii) would offer a causal explanation of the ‘suspicious coincidence’ on the level of neural implementation.

    The COCO hypothesis fills this lacuna: it proposes an account of specific dynamics that would have taken place while teaching a structured skill such as tool production and that would include neural coupling of existing initially independent neural mechanisms—a coupling that occurred in individuals during development, via neural plasticity. Heritable variation among individuals in the efficiency of this coupling would have allowed natural selection to gradually increase the innate coupling between the mechanisms or select for factors that increase the rapidity of the developmental coupling. This would eventually give rise to a communication system that, completely non-coincidentally, shares neural correlates of structure and serial order with tool production: human language.

    Finally, the human-unique reliance on a cultural niche in which interaction with the environment is heavily mediated by tools, and which entirely depends on culturally acquired knowledge, may explain why language evolved only in the hominin lineage. Language not only stems from this unique cultural niche; it also supports it, suggesting that the two have co-evolved in a tight loop of positive feedback, playing a prominent role in the shaping of humanity as we know it. The emergence of language from the humble origins of simple tool production has thus set off the process that boosted the cognitive ability of one species over all others and has allowed it, for good and for bad, to transform the face of the planet.

    This article has no additional data.

    Aspects of both authors' language and meta-language skills underwent selective pressure and concomitant change in the course of this collaboration, giving rise to complementary contributions to all aspects of the study.

    The authors have no competing interests.

    O.K. is supported by the John Templeton Fund and by the Stanford Center for Computational, Evolutionary, and Human Genomics.

    We thank Arnon Lotem, Marc Feldman and three anonymous reviewers for helpful comments.

    Footnotes

    1 Cf. Chomsky [8, p. 380]: ‘Human language is based on an elementary property that also seems to be biologically isolated: the property of discrete infinity, which is exhibited in its purest form by the natural numbers 1, 2, 3, … Children do not learn this property of the number system. Unless the mind already possesses the basic principles, no amount of evidence could provide them; and they are completely beyond the intellectual range of other organisms.’

    2 Cf. Koziol et al. [63, p. 156]: ‘Expressive language impairments include word finding difficulties and abnormal syntax with agrammatism, long latency and brief responses, reluctance to engage in conversation. Verbal fluency is decreased, affecting phonemic (letter) more than semantic (category) naming. Mutism occurs following acute injury such as surgery involving the vermis, mostly in children but also to varying degrees in adults. Poor control of volume, pitch and tone can produce high-pitched, hypophonic speech.’

    3 Neural reuse also occurs, necessarily, in dealing with tasks that did not accompany our species throughout most of its evolutionary history and that are unlikely to have had time to affect our innate brain structures. Examples of such tasks are reading and writing, carrying out complex mathematical calculations and accommodating settings such as virtual reality, from watching television to actively engaging in it, as in gaming.

    4 As in ‘animal communication’; our use of the term does not constitute an endorsement of the construal of language as message-passing.

    5 The specific hypothesis of Ackermann et al. [24], which we do not necessarily endorse, is that the new circuit consisted of a monosynaptic connection between cortical ‘language’ areas and the motor area that controls the laryngeal muscles.

    6 Notably, in addition to the cortex and the midbrain visual attention circuits, changes following learning to read were found in the pulvinar—a higher-order thalamic nucleus, which is reciprocally connected to the entire cortex [80], as well as to many subcortical areas, including the basal ganglia [81], and which can therefore mediate large-scale functional plasticity that eventually becomes translated into new connections.

    7 Cf. Engels [100], writing in 1876: ‘First labour, after it and then with it speech—these were the two most essential stimuli under the influence of which the brain of the ape gradually changed into that of man.’

    8 Notably, hypotheses that highlight the advantages conferred by a propensity to learn and by the social status associated with linguistic proficiency are compatible with this scenario [101].

    9 Additionally, recent findings show that earlier work significantly underestimated the productive abilities of monkeys' vocal tracts. In fact, monkey vocal tracts appear to be speech-ready [109].

    10 That some of the subject matter of communication during teaching was structured (tool production) does not imply that the communication itself had to be structured identically. Conceivably, for example, a communication system could be improved through an increase in referentiality, without incorporating structure. Even communicating about hierarchically structured tools is possible using non-hierarchical referential sequences. Such communication may indeed have been an intermediate phase of language evolution [132].

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    References

    • 1

      Szathmary E, Smith JM. 1995 The major evolutionary transitions. Nature 374, 227. (doi:10.1038/374227a0)

    • 3

      Smullyan R. 1961 The theory of formal systems, Annals of Mathematics Studies, vol. 47. Princeton, NJ: Princeton University Press.

    • 4

      Everaert MBH, Huybregts MAC, Chomsky N, Berwick RC, Bolhuis JJ. 2015 Structures, not strings: linguistics as part of the cognitive sciences. Trends Cogn. Sci. 19, 729–743. (doi:10.1016/j.tics.2015.09.008)

    • 5

      Pietroski PM. 2003 The character of natural language semantics. In Epistemology of language (ed. A. Barber), pp. 217–256. Oxford, UK: Oxford University Press.

    • 6

      Edelman S. In preparation. Verbal behavior without syntactic structures: beyond Skinner and Chomsky. In Chomsky's legacy (ed. C. Behme).

    • 7

      Edelman S. 2017 Language and other complex behaviors: unifying characteristics, computational models, neural mechanisms. Lang. Sci. 62, 91–123. (doi:10.1016/j.langsci.2017.04.003)

    • 8

      Chomsky N. 2004 Language and mind: current thoughts on ancient problems. In Variation and universals in biolinguistics (ed. Jenkins L), pp. 379–405. Amsterdam, The Netherlands: Elsevier.

    • 9

      Bolhuis JJ, Tattersall I, Chomsky N, Berwick RC. 2014 How could language have evolved? PLoS Biol. 12, e1001934. (doi:10.1371/journal.pbio.1001934)

    • 10

      Chomsky N. 2005 Three factors in language design. Linguist. Inq. 36, 1–22. (doi:10.1162/0024389052993655)

    • 11

      Chomsky N. 1972 Language and mind. Cambridge, UK: Cambridge University Press.

    • 12

      Hauser MD, Yang C, Berwick RC, Tattersall I, Ryan MJ, Watumull J, Chomsky N, Lewontin RC. 2014 The mystery of language evolution. Front. Psychol. 5, 401. (doi:10.3389/fpsyg.2014.00401)

    • 13

      Langacker RW. 2016 Working toward a synthesis. Cogn. Linguist. 27, 465–477. (doi:10.1515/cog-2016-0004)

    • 14

      Scott-Phillips TC, Kirby S. 2010 Language evolution in the laboratory. Trends Cogn. Sci. 14, 411–417. (doi:10.1016/j.tics.2010.06.006)

    • 15

      Kirby S. 2017 Culture and biology in the origins of linguistic structure. Psychon. Bull. Rev. 24, 118–137. (doi:10.3758/s13423-016-1166-7)

    • 16

      Ramscar M, Baayen H. 2013 Production, comprehension, and synthesis: a communicative perspective on language. Front. Psychol. 4, 233. (doi:10.3389/fpsyg.2013.00233)

    • 17

      LaPolla RJ. 2015 On the logical necessity of a cultural and cognitive connection for the origin of all aspects of linguistic structure. In Language structure and environment: social, cultural, and natural factors (eds R De Busser, RJ LaPolla), pp. 31–44. Amsterdam, The Netherlands: John Benjamins.

    • 18

      Sperber D, Origgi G. 2012 A pragmatic perspective on the evolution of language. In Meaning and relevance (eds Wilson D, Sperber D), p. 331. Cambridge, UK: Cambridge University Press.

    • 19

      Sperber D, Wilson D. 1986 Relevance: communication and cognition. Oxford, UK: Blackwell.

    • 20

      Arbib MA. 2005 From monkey-like action recognition to human language: an evolutionary framework for neurolinguistics. Behav. Brain Sci. 28, 105–124. (doi:10.1017/S0140525X05000038)

    • 21

      Arbib MA. 2017 Toward the language-ready brain: biological evolution and primate comparisons. Psychon. Bull. Rev. 24, 142–150. (doi:10.3758/s13423-016-1098-2)

    • 22

      Scott-Phillips TC. 2017 Pragmatics and the aims of language evolution. Psychon. Bull. Rev. 24, 186–189. (doi:10.3758/s13423-016-1061-2)

    • 23

      Fitch WT. 2017 Empirical approaches to the study of language evolution. Psychon. Bull. Rev. 24, 3–33. (doi:10.3758/s13423-017-1236-5)

    • 24

      Ackermann H, Hage SR, Ziegler W. 2014 Brain mechanisms of acoustic communication in humans and nonhuman primates: an evolutionary perspective. Behav. Brain Sci. 37, 529–546. (doi:10.1017/S0140525X13003099)

    • 25

      Everett DL. 2016 Grammar came later: triality of patterning and the gradual evolution of language. J. Neurolinguist. 43, 133. (doi:10.1016/j.jneuroling.2016.11.001)

    • 26

      Seyfarth RM, Cheney DL. 2017 Precursors to language: social cognition and pragmatic inference in primates. Psychon. Bull. Rev. 24, 79–84. (doi:10.3758/s13423-016-1059-9)

    • 27

      Mercier H, Sperber D. 2011 Why do humans reason? Arguments for an argumentative theory. Behav. Brain Sci. 34, 57–74. (doi:10.1017/S0140525X10000968)

    • 28

      Du Bois JW. 2014 Towards a dialogic syntax. Cogn. Linguist. 25, 359–410. (doi:10.1515/cog-2014-0024)

    • 29

      Tomasello M. 2006 Acquiring linguistic constructions. In Handbook of child psychology (eds R Siegler, D Kuhn), pp. 1–48. Oxford, UK: Wiley.

    • 30

      Clark A. 1998 Magic words: how language augments human computation. In Language and thought: interdisciplinary themes (eds Carruthers P, Boucher J), pp. 162–183. Cambridge, UK: Cambridge University Press.

    • 31

      Cartmill EA, Beilock S, Goldin-Meadow S. 2012 A word in the hand: action, gesture and mental representation in humans and non-human primates. Phil. Trans. R. Soc. B 367, 129–143. (doi:10.1098/rstb.2011.0162)

    • 32

      Byrne RW, Cochet H. 2017 Where have all the (ape) gestures gone? Psychon. Bull. Rev. 24, 68–71. (doi:10.3758/s13423-016-1071-0)

    • 33

      Byrne R, Whiten A. 1989 Machiavellian intelligence: social expertise and the evolution of intellect in monkeys, apes, and humans. Oxford, UK: Oxford University Press.

    • 34

      Leavens DA, Hostetter AB, Wesley MJ, Hopkins WD. 2004 Tactical use of unimodal and bimodal communication by chimpanzees, Pan troglodytes. Anim. Behav. 67, 467–476. (doi:10.1016/j.anbehav.2003.04.007)

    • 35

      Steele J, Ferrari PF, Fogassi L. 2012 From action to language: comparative perspectives on primate tool use, gesture and the evolution of human language. Phil. Trans. R. Soc. B 367, 4–9. (doi:10.1098/rstb.2011.0295)

    • 36

      Mayr E. 1961 Cause and effect in biology. Science 134, 1501–1506. (doi:10.1126/science.134.3489.1501)

    • 37

      Laland KN, Sterelny K, Odling-Smee J, Hoppitt W, Uller T. 2011 Cause and effect in biology revisited: is Mayr's proximate-ultimate dichotomy still useful? Science 334, 1512–1516. (doi:10.1126/science.1210879)

    • 38

      Tinbergen N. 1963 On aims and methods of ethology. Ethology 20, 410–433.

    • 39

      Edelman S. 2012 Vision, reanimated and reimagined. Perception 41, 1116–1127. (doi:10.1068/p7274)

    • 40

      Marr D. 1982 Vision: a computational investigation into the human representation and processing of visual information. San Francisco, CA: WH Freeman.

    • 41

      Marr D, Poggio T. 1977 From understanding computation to understanding neural circuitry. Neurosci. Res. Progr. Bull. 15, 470–488.

    • 42

      Poggio T. 2012 The levels of understanding framework, revised. Perception 41, 1017–1023. (doi:10.1068/p7299)

    • 43

      Edelman S. In press. Perception of object shapes. In The Oxford handbook of computational perceptual organization (eds Gepshtein S, Maloney L). New York, NY: Oxford University Press.

    • 44

      Lachmann M, Szamado S, Bergstrom CT. 2001 Cost and conflict in animal signals and human language. Proc. Natl Acad. Sci. USA 98, 13 189–13 194. (doi:10.1073/pnas.231216498)

    • 45

      Számadó S, Szathmáry E. 2006 Selective scenarios for the emergence of natural language. Trends Ecol. Evol. 21, 555–561. (doi:10.1016/j.tree.2006.06.021)

    • 46

      Knight C. 1998 Ritual/speech coevolution: a solution to the problem of deception. In Approaches to the evolution of language (eds Hurford JR, Studdert-Kennedy M, Knight C), pp. 68–91. Cambridge, UK: Cambridge University Press.

    • 47

      Hurford JR. 1990 Nativist and functional explanations in language acquisition. In Logical issues in language acquisition (ed. Roca IM), pp. 85–136. Dordrecht, The Netherlands: Foris Publications.

    • 48

      Kolodny O, Edelman S. 2015 The problem of multimodal concurrent serial order in behavior. Neurosci. Biobehav. Rev. 56, 252–265. (doi:10.1016/j.neubiorev.2015.07.009)

    • 49

      Solan Z, Horn D, Ruppin E, Edelman S, McClelland JL. 2005 Unsupervised learning of natural languages. Proc. Natl Acad. Sci. USA 102, 11 629–11 634. (doi:10.1073/pnas.0409746102)

    • 50

      Edelman S. 2008 Computing the mind: how the mind really works. Oxford, UK: Oxford University Press.

    • 51

      Kolodny O, Lotem A, Edelman S. 2015 Learning a generative probabilistic grammar of experience: a process-level model of language acquisition. Cogn. Sci. 39, 227–267. (doi:10.1111/cogs.12140)

    • 52

      Bullock D. 2004Adaptive neural models of queuing and timing in fluent action. Trends Cogn. Sci. 8, 426–433. (doi:10.1016/j.tics.2004.07.003) Crossref, PubMed, ISI, Google Scholar

    • 53

      Bickerton D. 2009Adam's tongue: how humans made language, how language made humans. Basingstoke, UK: Macmillan. Google Scholar

    • 54

      Laland KN. 2017The origins of language in teaching. Psychon. Bull. Rev. 24, 225–231. (doi:10.3758/s13423-016-1077-7) Crossref, PubMed, ISI, Google Scholar

    • 55

      Fitch WT. 2004Kin selection and ‘mother tongues’: a neglected component in language evolution. In Evolution of communication systems: a comparative approach (eds Oller D, Griebel U), pp. 275–296. Cambridge, MA: MIT Press. Google Scholar

    • 57

      Anderson ML. 2016Précis of after phrenology: neural reuse and the interactive brain. Behav. Brain Sci. 39, e120. (doi:10.1017/S0140525X15000631) Crossref, PubMed, ISI, Google Scholar

    • 58

      Hagoort P. 2014Nodes and networks in the neural architecture for language: Broca's region and beyond. Curr. Opin. Neurobiol. 28, 136–141. (doi:10.1016/j.conb.2014.07.013) Crossref, PubMed, ISI, Google Scholar

    • 59

      Friederici AD. 2012The cortical language circuit: from auditory perception to sentence comprehension. Trends Cogn. Sci. 16, 262–268. (doi:10.1016/j.tics.2012.04.001) Crossref, PubMed, ISI, Google Scholar

    • 60

      Thompson-Schill SL. 2005Dissecting the language organ: a new look at the role of Broca's area in language processing. In Twenty-first century psycholinguists: four cornerstones (ed. Cutler A), pp. 173–189. Hillsdale, NJ: Lawrence Erlbaum Associates. Google Scholar

    • 61

      Anderson ML. 2010Neural reuse: a fundamental organizational principle of the brain. Behav. Brain Sci. 33, 245–266. (doi:10.1017/S0140525X10000853) Crossref, PubMed, ISI, Google Scholar

    • 62

      Koechlin E, Jubault T. 2006Broca's area and the hierarchical organization of human behavior. Neuron 50, 963–974. (doi:10.1016/j.neuron.2006.05.017) Crossref, PubMed, ISI, Google Scholar

    • 63

      Koziol LFet al.2014Consensus paper: the cerebellum's role in movement and cognition. The Cerebellum 13, 151–177. (doi:10.1007/s12311-013-0511-x) Crossref, PubMed, ISI, Google Scholar

    • 64

      Gould SJ, Vrba ES. 1982Exaptation—a missing term in the science of form. Paleobiology 8, 4–15. (doi:10.1017/S0094837300004310) Crossref, ISI, Google Scholar

    • 65

      Dehaene S, Cohen L. 2007Cultural recycling of cortical maps. Neuron 56, 384–398. (doi:10.1016/j.neuron.2007.10.004) Crossref, PubMed, ISI, Google Scholar

    • 66

      Dehaene Set al.2010How learning to read changes the cortical networks for vision and language. Science 330, 1359–1364. (doi:10.1126/science.1194140) Crossref, PubMed, ISI, Google Scholar

    • 67

      Stout D, Chaminade T. 2012Stone tools, language and the brain in human evolution. Phil. Trans. R. Soc. B 367, 75–87. (doi:10.1098/rstb.2011.0099) Link, ISI, Google Scholar

    • 68

      Fitch WT. 2011The evolution of syntax: an exaptationist perspective. Front. Evol. Neurosci. 3, 9. (doi:10.3389/fnevo.2011.00009) Crossref, PubMed, Google Scholar

    • 69

      Botvinick MM. 2008Hierarchical models of behavior and prefrontal function. Trends Cogn. Sci. 12, 201–208. (doi:10.1016/j.tics.2008.02.009) Crossref, PubMed, ISI, Google Scholar

    • 70

      Jin X, Tecuapetla F, Costa RM. 2014Basal ganglia subcircuits distinctively encode the parsing and concatenation of action sequences. Nat. Neurosci. 17, 423–430. (doi:10.1038/nn.3632) Crossref, PubMed, ISI, Google Scholar

    • 71

      Lieberman P. 2002Human language and our reptilian brain: the subcortical bases of speech, syntax, and thought. Boston, MA: Harvard University Press. Google Scholar

    • 72

      Longworth CE, Keenan SE, Barker RA, Marslen-Wilson WD, Tyler LK. 2005The basal ganglia and rule-governed language use: evidence from vascular and degenerative conditions. Brain 128, 584–596. (doi:10.1093/brain/awh387) Crossref, PubMed, ISI, Google Scholar

    • 73

      Ullman MT. 2006Is Broca's area part of a basal ganglia thalamocortical circuit?Cortex 42, 480–485. (doi:10.1016/S0010-9452(08)70382-4) Crossref, PubMed, ISI, Google Scholar

    • 74

      Sidtis DVL, Pachana N, Cummings JL, Sidtis JJ. 2006Dysprosodic speech following basal ganglia insult: toward a conceptual framework for the study of the cerebral representation of prosody. Brain Lang. 97, 135–153. (doi:10.1016/j.bandl.2005.09.001) Crossref, PubMed, ISI, Google Scholar

    • 75

      Darkins AW, Fromkin VA, Benson DF. 1988A characterization of the prosodic loss in Parkinson's disease. Brain Lang. 34, 315–327. (doi:10.1016/0093-934X(88)90142-3) Crossref, PubMed, ISI, Google Scholar

    • 76

      Baldwin JM. 1896A new factor in evolution. Am. Nat. 30, 441–451. (doi:10.1086/276408) Crossref, Google Scholar

    • 77

      Weber BH, Depew DJ. 2003Evolution and learning: the Baldwin effect reconsidered. Cambridge, MA: MIT Press. Google Scholar

    • 78

      Iriki A, Taoka M. 2012Triadic (ecological, neural, cognitive) niche construction: a scenario of human brain evolution extrapolating tool use and language from the control of reaching actions. Phil. Trans. R. Soc. B 367, 10–23. (doi:10.1098/rstb.2011.0190) Link, ISI, Google Scholar

    • 79

      Skeide MA, Kumar U, Mishra RK, Tripathi VN, Guleria A, Singh JP, Eisner F, Huettig F. 2017Learning to read alters cortico-subcortical cross-talk in the visual system of illiterates. Sci. Adv. 3, e1602612. (doi:10.1126/sciadv.1602612) Crossref, PubMed, ISI, Google Scholar

    • 80

      Sherman SM. 2016Thalamus plays a central role in ongoing cortical functioning. Nat. Neurosci. 16, 533–541. (doi:10.1038/nn.4269) Crossref, ISI, Google Scholar

    • 81

      Barron DS, Eickhoff SB, Clos M, Fox PT. 2015Human pulvinar functional organization and connectivity. Hum. Brain Mapp. 36, 2417–2431. (doi:10.1002/hbm.22781) Crossref, PubMed, ISI, Google Scholar

    • 82

      Hecht EE, Gutman DA, Khreisheh N, Taylor SV, Kilner J, Faisal AA, Bradley BA, Chaminade T, Stout D. 2015Acquisition of Paleolithic toolmaking abilities involves structural remodeling to inferior frontoparietal regions. Brain Struct. Funct. 220, 2315–2331. (doi:10.1007/s00429-014-0789-6) Crossref, PubMed, ISI, Google Scholar

    • 83

      Nakahara H, Doya K, Hikosaka O. 2001Parallel cortico-basal ganglia mechanisms for acquisition and execution of visuomotor sequences—a computational approach. J. Cogn. Neurosci. 13, 626–647. (doi:10.1162/089892901750363208) Crossref, PubMed, ISI, Google Scholar

    • 84

      Báez-Mendoza R, Schultz W. 2013The role of the striatum in social behavior. Front. Neurosci. 7, 233. (doi:10.3389/fnins.2013.00233) Crossref, PubMed, ISI, Google Scholar

    • 85

      Sutton RS, Barto AG. 1998Introduction to reinforcement learning, vol. 135. Cambridge, MA: MIT Press. Crossref, Google Scholar

    • 86

      Fareri DS, Delgado MR. 2014The importance of social rewards and social networks in the human brain. Neuroscientist 20, 387–402. (doi:10.1177/1073858414521869) Crossref, PubMed, ISI, Google Scholar

    • 87

      Csibra G, Gergely G. 2011Natural pedagogy as evolutionary adaptation. Phil. Trans. R. Soc. B 366, 1149–1157. (doi:10.1098/rstb.2010.0319) Link, ISI, Google Scholar

    • 88

      Premack D. 1985‘Gavagai!’ or the future history of the animal language controversy. Cognition 19, 207–296. (doi:10.1016/0010-0277(85)90036-8) Crossref, PubMed, ISI, Google Scholar

    • 89

      Aiello LC, Dunbar RIM. 1993Neocortex size, group size, and the evolution of language. Curr. Anthropol. 34, 184–193. (doi:10.1086/204160) Crossref, ISI, Google Scholar

    • 90

      Dunbar R. 1998Grooming, gossip, and the evolution of language. Cambridge, MA: Harvard University Press. Google Scholar

    • 91

      Dunbar RIM. 2009Why only humans have language. Oxford Scholarship Online (doi:10.1093/acprof:oso/9780199545872.001.0001) Google Scholar

    • 92

      Hewes GWet al.1973Primate communication and the gestural origin of language [and comments and reply]. Curr. Anthropol. 14, 5–24. (doi:10.1086/201401) Crossref, ISI, Google Scholar

    • 93

      Jaynes J. 1976The evolution of language in the late Pleistocene. Ann. NY Acad. Sci. 280, 312–325. (doi:10.1111/j.1749-6632.1976.tb25496.x) Crossref, ISI, Google Scholar

    • 94

      Miller G. 2000The mating mind: how sexual choice shaped the evolution of human nature. New York, NY: Doubleday. Google Scholar

    • 95

      Deacon TW. 1998The symbolic species: the co-evolution of language and the brain. New York, NY: WW Norton & Company. Google Scholar

    • 96

      Darwin C. 1871The descent of Man and selection in relation to sex. London, UK: John Murray. Google Scholar

    • 97

      Morgan TJHet al.2015Experimental evidence for the co-evolution of hominin tool-making teaching and language. Nat. Commun. 6, 6029. (doi:10.1038/ncomms7029) Crossref, PubMed, ISI, Google Scholar

    • 98

      Greenfield PM. 1991Language, tools and brain: the ontogeny and phylogeny of hierarchically organized sequential behavior. Behav. Brain Sci. 14, 531–551. (doi:10.1017/S0140525X00071235) Crossref, ISI, Google Scholar

    • 99

      Stout D, Chaminade T. 2009Making tools and making sense: complex, intentional behaviour in human evolution. Cambridge Archaeol. J. 19, 85–96. (doi:10.1017/S0959774309000055) Crossref, ISI, Google Scholar

    • 100

      Engels F. 1895The part played by labour in the transition from ape to man. Stuttgart, Germany: Die Neue Zeit. English translation available at https://www.marxists.org/archive/marx/works/1876/part-played-labour/index.htm. Google Scholar

    • 101

      Tallerman M. 2013Kin selection, pedagogy, and linguistic complexity: whence protolanguage. In The evolutionary emergence of human language (eds Botha R, Everaert M), pp. 77–96. Oxford, UK: Oxford University Press. Google Scholar

    • 102

      Vigliocco G, Perniss P, Vinson D. 2014Language as a multimodal phenomenon: implications for language learning, processing and evolution. Phil. Trans. R. Soc. B 369, 20130292. (doi:10.1098/rstb.2013.0292). Link, ISI, Google Scholar

    • 103

      Reynolds PC. 1993The complementation theory of language and tool use. In Tools, language and cognition in human evolution (eds Gibson KR, Ingold T), pp. 407–428. Cambridge, UK: Cambridge University Press. Google Scholar

    • 104

      Lotem A, Halpern J, Edelman S, Kolodny O. 2017The evolution of cognitive mechanisms in response to cultural innovations. Proc. Natl Acad. Sci. USA 114, 7915–7922. (doi:10.1073/pnas.1620742114) Crossref, PubMed, ISI, Google Scholar

    • 105

      Montagu A. 1976Toolmaking, hunting, and the origin of language. Ann. NY Acad. Sci. 280, 266–274. (doi:10.1111/j.1749-6632.1976.tb25493.x) Crossref, ISI, Google Scholar

    • 106

      Isaac GL. 1976Stages of cultural elaboration in the Pleistocene: possible archaeological indicators of the development of language capabilities. Ann. NY Acad. Sci. 280, 275–288. (doi:10.1111/j.1749-6632.1976.tb25494.x) Crossref, ISI, Google Scholar

    • 107

      Klein RG. 2017Language and human evolution. J. Neurolinguist. 43, 204–221. (doi:10.1016/j.jneuroling.2016.11.004) Crossref, ISI, Google Scholar

    • 108

      Tattersall I. 2017How can we detect when language emerged?Psychon. Bull. Rev. 24, 64–67. (doi:10.3758/s13423-016-1075-9) Crossref, PubMed, ISI, Google Scholar

    • 109

      Fitch WT, de Boer B, Mathur N, Ghazanfar AA. 2016Monkey vocal tracts are speech-ready. Sci. Adv. 2, e1600723. (doi:10.1126/sciadv.1600723) Crossref, PubMed, ISI, Google Scholar

    • 110

      Stout D, Chaminade T. 2007The evolutionary neuroscience of tool making. Neuropsychologia 45, 1091–1100. (doi:10.1016/j.neuropsychologia.2006.09.014) Crossref, PubMed, ISI, Google Scholar

    • 111

      Hecht EE, Gutman DA, Bradley BA, Preuss TM, Stout D. 2015Virtual dissection and comparative connectivity of the superior longitudinal fasciculus in chimpanzees and humans. Neuroimage 108, 124–137. (doi:10.1016/j.neuroimage.2014.12.039) Crossref, PubMed, ISI, Google Scholar

    • 112

      Hecht EE, Gutman DA, Preuss TM, Sanchez MM, Parr LA, Rilling JK. 2013Process versus product in social learning: comparative diffusion tensor imaging of neural systems for action execution–observation matching in macaques, chimpanzees, and humans. Cereb. Cortex 23, 1014–1024. (doi:10.1093/cercor/bhs097) Crossref, PubMed, ISI, Google Scholar

    • 113

      Stout D. 2011Stone toolmaking and the evolution of human culture and cognition. Phil. Trans. R. Soc. B 366, 1050–1059. (doi:10.1098/rstb.2010.0369) Link, ISI, Google Scholar

    • 114

      Stout D, Toth N, Schick K, Chaminade T. 2008Neural correlates of Early Stone Age toolmaking: technology, language and cognition in human evolution. Phil. Trans. R. Soc. B 363, 1939–1949. (doi:10.1098/rstb.2008.0001) Link, ISI, Google Scholar

    • 115

      Moerk EL. 1990Three-term contingency patterns in mother-child verbal interactions during first-language acquisition. J. Exp. Anal. Behav. 54, 293–305. (doi:10.1901/jeab.1990.54-293) Crossref, PubMed, ISI, Google Scholar

    • 116

      Chater N, McCauley SM, Christiansen MH. 2016Language as skill: intertwining comprehension and production. J. Mem. Lang. 89, 244–254. (doi:10.1016/j.jml.2015.11.004) Crossref, ISI, Google Scholar

    • 117

      Bruner J. 1981The social context of language acquisition. Lang. Commun. 1, 155–178. (doi:10.1016/0271-5309(81)90010-0) Crossref, Google Scholar

    • 118

      Goldstein MHet al.2010General cognitive principles for learning structure in time and space. Trends Cogn. Sci. 14, 249–258. (doi:10.1016/j.tics.2010.02.004) Crossref, PubMed, ISI, Google Scholar

    • 119

      Nelson K. 2015A bio-social-cultural approach to early cognitive development: entering the community of minds. In Emerging trends in the social and behavioral sciences (eds Scott R, Kosslyn S), pp. 1–14. New York, NY: John Wiley and Sons. Google Scholar

    • 120

      Moerk EL. 1996Input and learning processes in first language acquisition. Adv. Child Dev. Behav. 26, 181–228. (doi:10.1016/S0065-2407(08)60509-1) Crossref, PubMed, ISI, Google Scholar

    • 121

      Silbert LJ, Honey CJ, Simony E, Poeppel D, Hasson U. 2014Coupled neural systems underlie the production and comprehension of naturalistic narrative speech. Proc. Natl Acad. Sci. USA 111, E4687–E4696. (doi:10.1073/pnas.1323812111) Crossref, PubMed, ISI, Google Scholar

    • 122

      Menyhart O, Kolodny O, Goldstein MH, DeVoogd TJ, Edelman S. 2015Juvenile zebra finches learn the underlying structural regularities of their fathers’ song. Front. Psychol. 6, 571. (doi:10.3389/fpsyg.2015.00571) Crossref, PubMed, ISI, Google Scholar

    • 123

      Onnis L, Waterfall HR, Edelman S. 2008Learn locally, act globally: learning language from variation set cues. Cognition 109, 423–430. (doi:10.1016/j.cognition.2008.10.004) Crossref, PubMed, ISI, Google Scholar

    • 124

      Huttenlocher J, Vasilyeva M, Cymerman E, Levine S. 2002Language input and child syntax. Cogn. Psychol. 45, 337–374. (doi:10.1016/S0010-0285(02)00500-5) Crossref, PubMed, ISI, Google Scholar

    • 125

      Steffensen SV. 2016Cognitive probatonics: towards an ecological psychology of cognitive particulars. New Ideas Psychol. 42, 29–38. (doi:10.1016/j.newideapsych.2015.07.003) Crossref, ISI, Google Scholar

    • 126

      Anderson DJ, Perona P. 2014Toward a science of computational ethology. Neuron 84, 18–31. (doi:10.1016/j.neuron.2014.09.005) Crossref, PubMed, ISI, Google Scholar

    • 127

      Lee KM, Jung Y. 2005Evolutionary nature of virtual experience. J. Cult. Evol. Psychol. 3, 159–176. (doi:10.1556/JCEP.3.2005.2.4) Crossref, Google Scholar

    • 128

      Reiner M. 2004The role of haptics in immersive telecommunication environments. IEEE Trans. Circuits Syst. Video Technol. 14, 392–401. (doi:10.1109/TCSVT.2004.823399) Crossref, ISI, Google Scholar

    • 129

      Johnson JS, Newport EL. 1989Critical period effects in second language learning: the influence of maturational state on the acquisition of English as a second language. Cogn. Psychol. 21, 60–99. (doi:10.1016/0010-0285(89)90003-0) Crossref, PubMed, ISI, Google Scholar

    • 130

      Newport EL. 1990Maturational constraints on language learning. Cogn. Sci. 14, 11–28. (doi:10.1207/s15516709cog1401_2) Crossref, ISI, Google Scholar

    • 131

      Dąbrowska E, Street J. 2006Individual differences in language attainment: comprehension of passive sentences by native and non-native English speakers. Lang. Sci. 28, 604–615. (doi:10.1016/j.langsci.2005.11.014) Crossref, ISI, Google Scholar

    • 132

      Jackendoff R, Wittenberg E. 2017Linear grammar as a possible stepping-stone in the evolution of language. Psychon. Bull. Rev. 24, 219–224. (doi:10.3758/s13423-016-1073-y) Crossref, PubMed, ISI, Google Scholar


    The concept of keystone species, originally suggested by Robert Paine to describe species whose impact on their ecosystem is much greater than their part in it [1–3], has been recently adopted by animal behaviour researchers to describe individuals whose impact on the population they live in is much greater than their proportion in it, and whose removal from the population would result in a profound and lasting effect on group dynamics [4]. While the general concept is relatively new, effects of such individuals have been noted and documented over decades and across social species by more situation-specific titles, such as dominants, tutors or leaders (see detailed review in [4]). Recent studies utilizing the keystone individuals concept have shown, for example, that the presence of a few bold individuals in colonies of social spiders, and the quality of the knowledge these individuals possess, affects the colony's foraging behaviour and success [5,6], and that an ant colony's nest site selection is faster and more accurate when it includes highly exploratory individuals [7]. The keystone framework is also gaining some traction in conservation biology: it has recently been proposed that identification of keystone individuals and analysis of their effect on the population is valuable in conservation and management of social species [8].

    In the context of cultural evolution, we may consider innovators of behaviours that spread in a population, and individuals who serve as a popular copying model, to be keystones [4]. Theoretical and experimental work assessing the role of innovation in cultural evolution has focused on the conditions favouring social learning over innovation (or individual learning) and vice versa (e.g. [9–17]), as well as on the diffusion of innovations [18–20]. Recently, a series of models has turned the spotlight onto the way different types of innovations may shape the evolution of culture [21–23]. The question of how innovations differ in nature relates to a longstanding dispute in the animal behaviour literature. While it is intuitively clear that not all innovations are similar in their inception and impact, how can we define the differences between them in general terms?

    In a recent paper, we approached this issue by describing behavioural innovations as measured by their magnitude [24]. Relying on a previous definition [25], we suggested that any new behaviour, no matter how similar to behaviours already in the population's behavioural repertoire, should be considered an innovation; however, we argued that these innovations may differ in how close to or far from the population's mean behaviour they are. Innovations that are far from the mean are considered high-magnitude innovations, while innovations that are close to the mean are considered of low magnitude (see detailed discussion in [24]). Offering a great increase in the population's fitness, high-magnitude innovations that spread in the population may therefore be viewed as a cultural ‘leap' [21].

    High-magnitude innovations, by definition, differ significantly from familiar behaviours. They may include the introduction of a new object to interact with [26], a new territory to forage in [27], a new feeding method to use [28], or a new song to replicate [29]. Viewing others interact with an unfamiliar object may allow neophobic individuals to overcome their fear, or simply draw attention to an object that copiers have not noted before [30,31]. Thus, high-magnitude innovations allow copiers of the innovation to explore a new domain and perhaps modify it by innovating themselves. Models focusing on the effect of different innovation types on human cultural evolution have used the latter idea to account for the punctuated evolutionary pattern found in the archaeological record of human artefacts [21–23]. High-magnitude innovators may therefore not only serve as keystone individuals by generating cultural leaps, but also by facilitating socially induced innovations that further modify their own.

    In this study, I expand upon our previous work on the magnitude of innovation in social animals [24], to include cultural evolution. I investigate whether a trait allowing socially induced innovation can evolve, examine the effect of such a trait on the evolution of independent innovation and on the magnitude of innovation, and finally, analyse how all these traits interact to shape the progression of culture.

    I simulated a population of individuals genetically varying in their (i) tendencies to innovate and to copy others; (ii) innovation magnitude; and (iii) tendency to modify high-magnitude innovations they have copied. A generation's life began with a series of T = 10 discrete learning steps; in each of them individuals acquired one new behaviour either by innovating or by copying the innovations that others produced during that specific learning step t (0 < t ≤ T). Individuals who copied high-magnitude innovations in step t could, based on their genetic tendency, be ‘inspired’ to innovate in the next learning step (t + 1), to produce a modification of the copied innovation. This modified innovation could be copied by others during that step (t + 1 ≤ T) as any innovation, and could serve as a basis for further socially induced innovations in the next learning step (t + 2 ≤ T) in the same manner. After the T steps of the learning phase, individuals applied the behaviours they had acquired, with greater weight given to higher-paying behaviours. Individuals then produced offspring in proportion to the relative payoff they had accumulated during their lifetime, and died. The mean of the highest paying behaviours learned by parents was defined as their generation's cultural contribution, and considered the new generation's behavioural baseline for cultural evolution calculations.

    A population of n = 100 individuals was modelled, with each individual characterized by three focal genes: L (learning gene), I (innovation magnitude gene) and C (socially induced innovation gene). The learning gene, L, determined the probability that the individual would, at each learning step, produce an independent innovation rather than copy a conspecific's innovation. There were 11 possible alleles in this gene: 0, 0.1, 0.2, …, 1, where 0 coded for full-time copying (social learning), 1 for full-time independent innovation, and all other alleles for a combination of the two (e.g. a carrier of the 0.3 allele spent 30% of the time, on average, innovating independently and the remaining 70% copying). The innovation magnitude gene, I, affected how far from the population's norm an individual's innovations would be. There were again 11 possible alleles in this gene: 0, 0.1, 0.2, …, 1, which represented standard deviations from the population's mean behaviour; an innovation's value was drawn from a normal distribution whose mean was the population's mean behaviour and whose standard deviation was the individual's I allele (see below). The socially induced innovation gene, C, determined the probability that, after copying a high-magnitude innovation, the copier would proceed to modify this innovation in its next learning step. This gene included three alleles: C0, for zero probability, i.e. no effect; Csqrt, for the square root of the individual's probability to innovate as set by its L allele, i.e. an increase in innovation probability that increases with the genetic tendency for independent innovation; and C1, for a probability of 1, i.e. the individual was certain to innovate. Just like independent innovation, the magnitude of an individual's socially induced innovation was determined by its genotype in the I gene.
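
    The three-gene genotype described above can be sketched as a small data structure. This is only an illustrative sketch: the names Genotype, induced_innovation_probability and random_genotype are hypothetical, and drawing the initial alleles uniformly at random is an assumption, since the text does not state how the starting population was initialized.

```python
import math
import random
from dataclasses import dataclass

# Allele pools as described in the text: 0, 0.1, ..., 1 for L and I; three alleles for C.
L_ALLELES = [i / 10 for i in range(11)]
I_ALLELES = [i / 10 for i in range(11)]
C_ALLELES = ("C0", "Csqrt", "C1")


@dataclass
class Genotype:
    L: float  # probability of independent innovation at each learning step
    I: float  # standard deviation of the innovation-value distribution
    C: str    # socially induced innovation allele


def induced_innovation_probability(g: Genotype) -> float:
    """Probability of modifying a copied high-magnitude innovation."""
    if g.C == "C0":
        return 0.0
    if g.C == "C1":
        return 1.0
    return math.sqrt(g.L)  # Csqrt: square root of the innovation probability


def random_genotype(rng: random.Random) -> Genotype:
    # Assumption: uniform random initialization (not specified in the text).
    return Genotype(rng.choice(L_ALLELES), rng.choice(I_ALLELES), rng.choice(C_ALLELES))
```

    Note that a Csqrt carrier whose L allele is 0 has an induced-innovation probability of 0, which makes it behave like a C0 carrier.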

    All individuals in the population had a limited number of learning steps, T = 10. In each of these steps they acquired one new behaviour, either by innovation or by copying an innovation a conspecific had produced at that specific step (our previous model, which analysed the case of T = 100, found no significant differences between the two cases; see [24]). At the beginning of each step, it was determined for each individual whether it would innovate or copy, based on the probability dictated by its L genotype. Individuals who were to innovate generated a new behaviour. The value of this innovative behaviour (i.e. its payoff) was drawn from a normal distribution whose mean was the population's mean behaviour, and whose standard deviation was the innovator's allele in the I gene; for convenience, the population's mean behaviour value was set to 0. Then, individuals who were to copy in this learning step copied the behaviours generated by innovators. All innovations were ranked according to their value, and which innovations would be copied depended on the selectivity of social learning in the population (which was kept constant per population). The selectivity of social learning was controlled using the variable D, defined as 1 – [the fraction of demonstrators copied]. When selectivity was high (high D) only innovations with the highest value were copied (e.g. when D = 0.9 only the top 10% of innovations were copied); as the selectivity of social learning became lower, copying became more random (and was completely random at D = 0).
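
    A single learning step, as described above, might be sketched as follows. The function names are hypothetical, and two details are assumptions rather than part of the model description: the number of copyable innovations is rounded to the nearest integer with at least one kept, and a copier picks uniformly at random among the copyable innovations.

```python
import random


def draw_innovation(i_allele: float, rng: random.Random, population_mean: float = 0.0) -> float:
    """Draw an innovation's value from a normal distribution with s.d. equal to the I allele."""
    return rng.gauss(population_mean, i_allele)


def copyable_innovations(values: list[float], D: float) -> list[float]:
    """Return the top (1 - D) fraction of innovations, ranked by value."""
    if not values:
        return []
    ranked = sorted(values, reverse=True)
    # Assumption: round to the nearest count and always keep at least one innovation.
    keep = max(1, round((1 - D) * len(ranked)))
    return ranked[:keep]


def copy_one(values: list[float], D: float, rng: random.Random) -> float:
    """Assumption: a copier chooses uniformly among the copyable innovations."""
    return rng.choice(copyable_innovations(values, D))
```

    With D = 0 every innovation is copyable, so copying is random with respect to value; with D = 0.9 and ten innovations only the single best one can be copied.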

    In cases where individuals copied a high-magnitude innovation, defined as an innovation whose value was greater than 1 (putting it at a distance greater than 1 s.d. from the population's mean behaviour), it was determined whether they would modify this innovation in the following learning step, t + 1, based on their socially induced innovation (C) allele. If they were to innovate, the magnitude of their innovation was set by their I allele. These individuals produced an innovation at the beginning of step t + 1 along with independent innovators (described above). However, for these socially induced innovators, the value of their innovation was added to the value of the high-magnitude innovation they copied in their previous learning step (t), to yield a new innovation for step t + 1. This innovation was then ranked along with all innovations and copied by individuals who, in step t + 1, were copying others, as described above. The choice to have socially induced innovation triggered only by the copying of high-magnitude innovations, rather than the copying of any innovation, was made in order to set these innovations apart from independent innovations (see Discussion).
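
    The rule for socially induced innovation can be illustrated with a short sketch. The function name and signature are hypothetical; p_induced stands for the probability set by the copier's C allele, and the modification is assumed to be drawn from a zero-mean normal distribution with the copier's I allele as its standard deviation, in line with the description of independent innovation.

```python
import random
from typing import Optional

HIGH_MAGNITUDE_THRESHOLD = 1.0  # innovations with value > 1 count as high-magnitude


def induced_innovation(copied_value: float, p_induced: float, i_allele: float,
                       rng: random.Random) -> Optional[float]:
    """Return the modified innovation for step t + 1, or None if no modification occurs."""
    if copied_value <= HIGH_MAGNITUDE_THRESHOLD:
        return None  # only high-magnitude innovations can trigger a modification
    if rng.random() >= p_induced:
        return None  # the copier was not 'inspired' this time
    # The new draw is added to the value of the innovation copied in step t.
    return copied_value + rng.gauss(0.0, i_allele)
```

    Because the new draw is added to an already high value, a chain of such modifications across steps t + 1, t + 2, … can carry behaviours far from the population mean within a single generation.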

    After acquiring the behaviours, individuals apply these behaviours and will tend to use them with a frequency directly proportional to the payoff they offer. To calculate the proportion of time allotted to each behaviour, and since payoffs can be negative as well as positive, an exponential transformation of the form

    $$p_x = \frac{e^{\sigma \beta_x}}{\sum_{i=1}^{j} e^{\sigma \beta_i}} \qquad (2.1)$$

    was used, where px is the proportion of time spent using behaviour x, βx is the payoff of behaviour x, i = 1 … j are the behaviours the individual has acquired during its learning phase (j = T) and σ is the application sensitivity: the degree to which agents can distinguish between payoffs in choosing which behaviours to apply. This value is the same for all agents. Following previous analysis [24], σ was set to its high value (σ = 3.3), such that agents spend a higher proportion of their time applying the highest paying behaviour and little to no time applying low value behaviours. Note that due to the stochastic process used in the simulation to generate new behaviours, unless there is no innovation in the population, behaviours 1 … j will each be unique.
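
    As a rough illustration of the time allocation, the sketch below assumes the exponential transformation is a softmax over payoffs scaled by σ, which is the usual form for such a transformation but is an assumption here. The function name time_allocation is hypothetical.

```python
import math


def time_allocation(payoffs: list[float], sigma: float = 3.3) -> list[float]:
    """Proportion of time spent on each acquired behaviour (assumed softmax form)."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    top = max(payoffs)
    weights = [math.exp(sigma * (b - top)) for b in payoffs]
    total = sum(weights)
    return [w / total for w in weights]
```

    With σ = 3.3, a behaviour whose payoff is one unit higher than another receives about e^3.3 ≈ 27 times as much application time, which is why low-value behaviours are barely used.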

    The payoff accumulated from applying the learned behaviours, WA, was then calculated by summing up the multiplications of each behaviour's payoff and the proportion of time spent applying it,

    $$W_A = \sum_{i=1}^{j} \beta_i \, p_i \qquad (2.2)$$

    To calculate the total payoff to individuals in the population, WT, the payoffs obtained during the learning phase, WL (the sum of the payoffs of all behaviours learned), and during the application phase, WA, were summed using a weight factor α = 0.1 to account for the relative time allocated to the learning phase compared to the application phase,

    $$W_T = \alpha W_L + (1 - \alpha) W_A \qquad (2.3)$$

    Payoff received for behaviours was included in the learning phase payoff calculation (in the form of WL) regardless of whether they were applied, as it is assumed that agents perform behaviours when they are learning them, in order to experience their exact payoff.
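
    The payoff bookkeeping can be sketched as below. The application payoff follows the verbal description directly; the total payoff assumes the weighted sum is a convex combination, WT = αWL + (1 − α)WA, which is one plausible reading of the weight factor α and is flagged as an assumption. Function names are hypothetical, and the time proportions are taken as an input.

```python
def application_payoff(payoffs: list[float], proportions: list[float]) -> float:
    """W_A: sum over behaviours of payoff times the proportion of time spent applying it."""
    return sum(b * p for b, p in zip(payoffs, proportions))


def total_payoff(payoffs: list[float], proportions: list[float], alpha: float = 0.1) -> float:
    """W_T, assuming the form alpha * W_L + (1 - alpha) * W_A."""
    w_learning = sum(payoffs)  # W_L: every learned behaviour is performed once while learning
    w_application = application_payoff(payoffs, proportions)
    return alpha * w_learning + (1 - alpha) * w_application
```

    For example, with payoffs 1 and 2 applied for equal time, WA = 1.5 and WL = 3, giving WT = 0.1 × 3 + 0.9 × 1.5 = 1.65 under this assumed form.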

    Individuals then reproduced, producing a number of offspring proportional to their total payoff relative to the payoff of all other individuals in the population. Since the total payoff could be negative, we again used an exponential transformation of the form

    $$r_y = \frac{e^{\lambda W_{T,y}}}{\sum_{k=1}^{n} e^{\lambda W_{T,k}}} \qquad (2.4)$$

    where ry is the probability of reproduction for individual y, and λ is the strength of selection. Following previous analysis [24], λ was set to its high value (λ = 3.3), to generate strong selection: individuals who obtained higher total payoff had much higher chances to reproduce than individuals who obtained a lower payoff. Among the offspring, mutation occurred at a rate of μ = 1/n in all genes. Mutation was random and the new variant was drawn from each gene's pre-defined allele pool.
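
    Reproduction and mutation might be sketched as follows. The reproduction probabilities assume the same softmax-style exponential transformation, now scaled by λ; sampling parents with replacement so that population size stays constant is also an assumption, as are all function names.

```python
import math
import random


def reproduction_probabilities(total_payoffs: list[float], lam: float = 3.3) -> list[float]:
    """Probability of reproduction for each individual (assumed softmax form)."""
    top = max(total_payoffs)  # subtracting the maximum avoids overflow
    weights = [math.exp(lam * (w - top)) for w in total_payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def sample_parents(total_payoffs: list[float], rng: random.Random, lam: float = 3.3) -> list[int]:
    """Assumption: n parent indices are drawn with replacement, one per offspring."""
    n = len(total_payoffs)
    probs = reproduction_probabilities(total_payoffs, lam)
    return rng.choices(range(n), weights=probs, k=n)


def mutate(allele, pool, mu: float, rng: random.Random):
    """With probability mu, replace the allele by a random draw from the gene's allele pool."""
    return rng.choice(pool) if rng.random() < mu else allele
```

    With n = 100 and μ = 1/n, each gene mutates on average once per generation across the whole population.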

    After parents were selected, each parent's highest paying behaviour was recorded. The mean of all these behaviours from the parental generation was then counted as that generation's cultural contribution. This assumption accounted for a situation where a full repertoire of behaviours was transferred to the new generation, and not just one. This mean was then viewed as the new generation's mean behaviour. However, since values of behaviour were arbitrary, the actual value of this mean did not matter for purposes of innovation in the next generation; furthermore, using it as the mean of the distribution from which the next generation draws innovations would inflate cultural evolution rates. This cultural contribution was therefore set aside, and the actual mean used to draw innovations was zero for all generations. These cultural contributions were then used cumulatively to calculate the progress of cultural evolution. For example, if generation 1's contribution was 1.5, and generation 2's contribution was 0.5, the final value of culture for generation 2 was 1.5 + 0.5 = 2, and so on for following generations. The choice to use the mean of parents' highest paying behaviours was conservative: using only the single highest paying behaviour for each generation would have resulted in higher cultural rates.
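
    The cumulative calculation of culture described above amounts to a running sum of per-generation contributions. A minimal sketch, with hypothetical function names:

```python
from itertools import accumulate


def generation_contribution(parents_best_payoffs: list[float]) -> float:
    """Mean of the highest paying behaviour of each selected parent."""
    return sum(parents_best_payoffs) / len(parents_best_payoffs)


def cumulative_culture(contributions: list[float]) -> list[float]:
    """Value of culture after each generation: the running sum of contributions."""
    return list(accumulate(contributions))
```

    Applied to the example in the text, cumulative_culture([1.5, 0.5]) gives 1.5 after generation 1 and 2.0 after generation 2.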

    Allele frequencies in the socially induced innovation gene, C, changed with social learning selectivity, D (figure 1). The C1 allele, setting the probability of socially induced innovation to 1, had a clear advantage when the selectivity of social learning was low (D ≤ 0.1). The allele enhancing the probability of innovation, Csqrt, was also selected at D = 0, although at a much lower frequency, with some advantage over C0 (the allele coding for no effect); this advantage of Csqrt over C0 disappeared when D = 0.1. When selectivity was higher, C1 was found at low frequencies, while Csqrt and C0 appeared at similar frequencies (between 40% and 50% each). It should be noted that in that range of social learning selectivity, D, the rate of independent innovation, set by the L gene, was close to zero (figure 2a): in most generations individuals had an independent innovation rate of zero, and therefore the Csqrt allele would have no effect on them, similar to C0.

    Figure 1. Mean frequency of alleles in the socially induced innovation gene, C, as a function of social learning selectivity, D. C0 allele codes for no socially induced innovation; C1 allele codes for certain socially induced innovation; Csqrt sets the probability of socially induced innovation to the square root of the probability of independent innovation (L genotype). Means and standard errors are calculated for generations 4001–5000, across 100 repeats of each simulation.

    Figure 2. Effect of the socially induced innovation gene, C, on the frequency and magnitude of innovation. (a) Mean frequency of independent innovation based on mean genotype in the L gene; (b) mean magnitude of innovation among individuals with the genetic potential of independent and/or socially induced innovation. Means and standard errors are calculated for generations 4001–5000, across 100 repeats of each simulation.

    A comparison of the genetic probability of independent innovation in the presence and in the absence of the C gene shows an effect changing with the selectivity of social learning, D (figure 2a). While in the absence of C the genetic probability represents the expected probability of innovation in the population, in the presence of C, the actual rate of independent innovation may be lower than the genetic probability, as individuals may use some of their learning steps for socially induced innovation, instead of drawing between innovation and social learning based on their L allele. When the selectivity of social learning was at its lowest—where copying is completely random—socially induced innovation significantly decreased the rate of independent innovation. When social learning selectivity was poor while still eliminating the worst innovations (D = 0.1), the rate of independent innovation was the same with and without C. When selectivity was higher but still in the low range (0.2 ≤ D ≤ 0.5), the rate of independent innovation was slightly higher in the presence of the C gene. As the effect is very small, and due to the complicated frequency-dependent interaction between the three genes, it is difficult to determine whether this is due to noise created by drift in the C gene, because socially induced innovations increase the benefit of independent innovation by increasing the competition, because carriers of the Csqrt allele benefit when also carrying an L allele with a value that is higher than zero, or some combination of these. However, more selective social learning resulted in similar, close to zero rates of independent innovation, with and without socially induced innovation.

    When the selectivity of social learning was at its lowest—where copying was completely random—the magnitude of innovation was lower in the presence of socially induced innovation, although still very high (0.91 compared to 0.99; figure 2b). In the medium range of social learning selectivity, however, the magnitude of innovation was consistently higher in the presence of socially induced innovation, and as in the absence of socially induced innovation, decreased as selectivity in social learning increased.

    Culture as measured by the accumulation of innovations was higher when the selectivity in social learning (D) was lower (figure 3). Socially induced innovation (the C gene) increased the rate of cultural evolution; this effect was found even in the case of random copying (D = 0), where the rate of independent innovation was much lower in the presence of socially induced innovation than in its absence (figure 2a).

    Figure 3. Effect of socially induced innovation gene, C, on cumulative culture. Means and standard errors are calculated over 100 repeats of each simulation.

    Socially induced innovations would seem to have a clear advantage: building on a known high-magnitude innovation, they offer the possibility of generating an even better innovation, with a lower risk compared to independent innovation. That is, even if the socially induced innovation results in a lower value behaviour compared to the independent innovation it builds upon, it is still less likely to be below the population's mean value of behaviour, unlike independent innovations. Still, socially induced innovations do not evolve when the selectivity of social learning is high: in that situation, others are likely to copy a high-magnitude socially induced innovation, without incurring the possible cost of producing a low-magnitude innovation. The cost for the socially induced innovator here is not only in having a lower value behaviour in its repertoire, but also in missing the chance of copying a better behaviour produced by another individual at that time step. This opportunity cost stems from the assumptions of the model, whereby individuals must perform the behaviour in order to learn it, know its exact payoff, and be ‘inspired' to modify it further with their own innovation.

    Most significant is the effect of socially induced innovation on the rate of independent innovation when copying is random (D = 0). In that condition, in the absence of the C gene, the rate of independent innovation is up to 0.64 ± 0.02, but when incorporating the C gene, the rate goes down to 0.13 ± 0.01. The magnitude of innovation is also somewhat lower. The dominating allele in the C gene at this time is C1, guaranteeing a socially induced innovation whenever a high-magnitude innovation is copied. This combination of traits is, perhaps unsurprisingly, ‘safer' than a high rate of independent innovation alone, for the reason discussed above. It should be noted that, while this result is found when social learning selectivity is low, the selectivity in application of behaviour is high, thus individuals do not blindly use behaviours; they are simply unable to judge the value of a behaviour without performing it first themselves. Regardless of the specific condition, it demonstrates how socially induced innovation may affect independent innovation, in a situation where independent innovation would otherwise be highly favoured. While lowering the rate of independent innovation, and the magnitude of all innovations, the C gene also leads to a much higher rate of cultural evolution: socially induced innovations may be copied by others, who may subsequently use them as a basis for further socially induced innovations, resulting in a cascade of innovations. Altogether, socially induced innovation, which can only act in the presence of high-magnitude independent innovation, selects here against high-magnitude independent innovators, and by lowering their frequency makes their role, as initiators of the innovation cascade, more crucial. In other words, it makes them keystone individuals.

    The definition of keystone individuals, as discussed by Modlmeier et al. [4], asserts that keystones cannot be ‘generic’: if removed, their niche cannot simply be filled by others. In the model presented here, individuals may be genetically identical, but few may, by chance, produce a high-magnitude independent innovation, while others may copy it and modify it. Their role as keystones is determined based on the result of their actions. Their independent innovations are chance events, and within a generation lifetime do not depend on whether others have or have not produced high-magnitude independent innovations of their own. Thus, the removal of a specific keystone individual would indeed not result in another individual in the population producing a high-magnitude innovation in its place.

    The results of the model provide, through proof of concept, insight into the coevolution of independent and socially induced innovation. As human technology is undoubtedly made of cascades of innovations [32], the finding that socially induced innovations may select against independent innovation is highly relevant, and fits nicely with results of models that combine these two types of innovations, to demonstrate how human culture may have evolved in ‘bursts', composed of initial ‘lucky leap' innovations that are followed by further innovations inspired by the leap [21–23]. Furthermore, the results presented here demonstrate how socially induced innovation may help maintain independent innovations, or lucky leaps, at a low frequency, when it is difficult to gauge the payoff of a behaviour without first-hand experience (see discussion of selectivity above).

    While the model aims to be general, cumulative culture in nonhuman animals, to the extent that it exists, is difficult to track. Some exceptions to this rule, however, are bird song [29] and whale song [33], where populations have been documented evolving unique vocal repertoires. Studies in bird song suggest possible costs to song innovation (e.g. the signal not conveying the signaler's intended information [29]), as well as benefits (adjusting song to new ecological circumstances, e.g. songs that travel better in an urban environment [34–36]). They also suggest that innovations often arise through copying errors [29]. This is especially interesting in the context of socially induced innovation, as a novel song (i.e. an innovation), only performed by a single individual, would seem more likely to be replicated with errors by listeners (i.e. lead to socially induced innovation), compared to a song performed by many in the population (i.e. the mean behaviour).

    Is the concept of keystone individuals conducive to our understanding of the evolution of culture? What if, for example, individuals were induced to innovate by copying any innovation, regardless of its magnitude? In such a case, socially induced innovations would have no benefit over independent innovations: if the original innovation they innovate upon were not of high value, socially induced innovations would be just as likely to result in below-average behaviour as an independent innovation. Thus, cultural evolution rates with such indiscriminate socially induced innovations are likely to be the same as in their absence. Having the keystone concept in mind contributed much to the design of the model presented in this paper, and in turn, to its insight into the possible evolutionary interaction between independent innovation, socially induced innovation and innovation magnitude, and how this interaction can shape the evolution of culture.

    The Matlab code for the simulations used in this paper has been uploaded as part of the electronic supplementary material.

    This article has no additional data.

    I declare I have no competing interests.

    I received no funding for this study.

    I wish to thank two anonymous reviewers for their thoughtful, constructive comments on a previous version of this manuscript.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3965457.

    References

    1. Paine RT. 1969 A note on trophic complexity and community stability. Am. Nat. 103, 91–93. (doi:10.1086/282586)
    2. Paine RT. 1995 A conversation on refining the concept of keystone species. Conserv. Biol. 9, 962–964. (doi:10.1046/j.1523-1739.1995.09040962.x)
    3. Mills LS, Soulé ME, Doak DF. 1993 The keystone-species concept in ecology and conservation: management and policy must explicitly consider the complexity of interactions in natural systems. Bioscience 43, 219–224. (doi:10.2307/1312122)
    4. Modlmeier AP, Keiser CN, Watters JV, Sih A, Pruitt JN. 2014 The keystone individual concept: an ecological and evolutionary overview. Anim. Behav. 89, 53–62. (doi:10.1016/j.anbehav.2013.12.020)
    5. Pruitt JN, Wright CM, Keiser CN, DeMarco AE, Grobis MM, Pinter-Wollman N. 2016 The Achilles' heel hypothesis: misinformed keystone individuals impair collective learning and reduce group success. Proc. R. Soc. B 283, 20152888. (doi:10.1098/rspb.2015.2888)
    6. Pruitt JN, Keiser CN. 2014 The personality types of key catalytic individuals shape colonies' collective behaviour and success. Anim. Behav. 93, 87–95. (doi:10.1016/j.anbehav.2014.04.017)
    7. Hui A, Pinter-Wollman N. 2014 Individual variation in exploratory behaviour improves speed and accuracy of collective nest selection by Argentine ants. Anim. Behav. 93, 261–266. (doi:10.1016/j.anbehav.2014.05.006)
    8. Swan GJF, Redpath SM, Bearhop S, McDonald RA. 2017 Ecology of problem individuals and the efficacy of selective wildlife management. Trends Ecol. Evol. 32, 518–530. (doi:10.1016/j.tree.2017.03.011)
    9. Arbilly M, Motro U, Feldman MW, Lotem A. 2011 Evolution of social learning when high expected payoffs are associated with high risk of failure. J. R. Soc. Interface 8, 1604–1615. (doi:10.1098/rsif.2011.0138)
    10. Rendell L et al. 2010 Why copy others? Insights from the social learning strategies tournament. Science 328, 208–213. (doi:10.1126/science.1184719)
    11. Rendell L, Fogarty L, Laland KN. 2010 Rogers' paradox recast and resolved: population structure and the evolution of social learning strategies. Evolution 64, 534–548. (doi:10.1111/j.1558-5646.2009.00817.x)
    12. Enquist M, Eriksson K. 2007 Critical social learning: a solution to Rogers's paradox of nonadaptive culture. Am. Anthropol. 109, 727–734. (doi:10.1525/AA.2007.109.4.727)
    13. Rogers AR. 1988 Does biology constrain culture? Am. Anthropol. 90, 819–831. (doi:10.1525/aa.1988.90.4.02a00030)
    14. Franz M, Nunn CL. 2009 Rapid evolution of social learning. J. Evol. Biol. 22, 1914–1922. (doi:10.1111/j.1420-9101.2009.01804.x)
    15. Aoki K, Wakano JY, Feldman MW. 2005 The emergence of social learning in a temporally changing environment: a theoretical model. Curr. Anthropol. 46, 334–340. (doi:10.1086/428791)
    16. Borenstein E, Feldman MW, Aoki K. 2008 Evolution of learning in fluctuating environments: when selection favors both social and exploratory individual learning. Evolution 62, 586–602. (doi:10.1111/j.1558-5646.2007.00313.x)
    17. Lehmann L, Feldman MW. 2009 Coevolution of adaptive technology, maladaptive culture and population size in a producer-scrounger game. Proc. R. Soc. B 276, 3853–3862. (doi:10.1098/rspb.2009.0724)
    18. Cavalli-Sforza LL, Feldman MW. 1981 Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press.
    19. Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.
    20. Rogers EM. 2003 Diffusion of innovations, 5th edn. New York, NY: Free Press.
    21. Kolodny O, Creanza N, Feldman MW. 2015 Evolution in leaps: the punctuated accumulation and loss of cultural innovations. Proc. Natl Acad. Sci. USA 112, E6762–E6769. (doi:10.1073/pnas.1520492112)
    22. Kolodny O, Creanza N, Feldman MW. 2016 Game-changing innovations: how culture can change the parameters of its own evolution and induce abrupt cultural shifts. PLoS Comput. Biol. 12, 1–15. (doi:10.1371/journal.pcbi.1005302)
    23. Creanza N, Kolodny O, Feldman MW. 2017 Greater than the sum of its parts? Modelling population contact and interaction of cultural repertoires. J. R. Soc. Interface 14, 20170171. (doi:10.1098/rsif.2017.0171)
    24. Arbilly M, Laland KN. 2017 The magnitude of innovation and its evolution in social animals. Proc. R. Soc. B 284, 20162385. (doi:10.1098/rspb.2016.2385)
    26. Visalberghi E, Fragaszy DM. 1995 The behaviour of capuchin monkeys, Cebus apella, with novel food: the role of social context. Anim. Behav. 49, 1089–1095. (doi:10.1006/anbe.1995.0137)
    27. Laland KN, Williams K. 1997 Shoaling generates social learning of foraging information in guppies. Anim. Behav. 53, 1161–1169. (doi:10.1006/anbe.1996.0318)
    28. Allen J, Weinrich M, Hoppitt W, Rendell L. 2013 Network-based diffusion analysis reveals cultural transmission of lobtail feeding in humpback whales. Science 340, 485–488. (doi:10.1126/science.1231976)
    29. Slater PJB, Lachlan RF. 2003 Is innovation in bird song adaptive? In Animal innovation (eds Reader SM, Laland KN), pp. 117–135. Oxford, UK: Oxford University Press.
    30. Greenberg R. 2003 The role of neophobia and neophilia in the development of innovative behaviour of birds. In Animal innovation (eds Reader SM, Laland KN), pp. 175–196. Oxford, UK: Oxford University Press.
    31. Brosnan SF, Hopper LM. 2014 Psychological limits on animal innovation. Anim. Behav. 92, 325–332. (doi:10.1016/j.anbehav.2014.02.026)
    32. Basalla G. 1988 The evolution of technology. Cambridge, UK: Cambridge University Press.
    33. Whitehead H, Rendell L. 2015 The cultural lives of whales and dolphins. Chicago, IL: University of Chicago Press.
    34. Brumm H. 2004 The impact of environmental noise on song amplitude in a territorial bird. J. Anim. Ecol. 73, 434–440. (doi:10.1111/j.0021-8790.2004.00814.x)
    35. Templeton CN, Zollinger SA, Brumm H. 2016 Traffic noise drowns out great tit alarm calls. Curr. Biol. 26, R1173–R1174. (doi:10.1016/j.cub.2016.09.058)
    36. Slabbekoorn H, den Boer-Visser A. 2006 Cities change the songs of birds. Curr. Biol. 16, 2326–2331. (doi:10.1016/j.cub.2006.10.008)


    In the study of cultural evolution, patterns of among-group cultural differences and similarities are often analysed in the context of time (i.e. history) and space (i.e. geography) in an attempt to infer the processes that have generated such patterns (e.g. [1–8]). Under the umbrella of a Darwinian evolutionary model of ‘descent with modification’ [2,6,9–14], processes such as innovation, copying errors, cultural drift, selection, dispersal, population splitting and population contact all play potentially important roles in generating observed patterns of cultural diversity. Hence, despite differences in the specifics of the mechanisms, the processes of cultural differentiation, change and assimilation are, for all intents and purposes, analogous to the processes of micro- and macroevolutionary change observed in biology. As such, analytical and theoretical approaches developed in the study of evolutionary biology have been widely adopted and assimilated into the study of cultural evolutionary change (e.g. [6,7,11,14–18]).

    Owing to the similarity of mechanisms and process in the operation of both cases, researchers examining cultural variation in modern human populations face many of the same issues as those studying patterns of human biological diversity. Unlike other primate species (and mammals in general), Homo sapiens is characterized by being an extremely geographically widespread (yet comparatively young) species, dispersing across all major continental landmasses and island systems in a relatively short timeframe (e.g. [19]). As a consequence, the specific histories of biological and linguistic diversification have played out over relatively vast geographical areas, resulting in an intimate correlation between global patterns of genetic, morphological and linguistic diversity [20–24]. Therefore, linguistic affinities are often used as a means of modelling the historical relationships between populations (e.g. [1,2,5,8]), when detailed genetic or other genealogical data are not available.

    Several studies of human material culture variation have sought to understand the connection between spatial, historical and cultural patterns by statistically assessing the correlations between linguistic, geographical and cultural distance matrices (e.g. [2,3,25]). Such studies have sometimes been situated within the context of debates regarding the relative importance of population splitting (branching) versus assimilation or borrowing (blending) in the generation of observed cultural affinity patterns [2,25–29]. Bifurcating phylogenetic (tree-like) models are widely applied in the study of biological and cultural evolution under the assumption that groups evolve via divergence from existing groups and via the accumulation of novel-derived traits or attributes [28,30,31]. While phylogenetic tree approaches are often deemed appropriate in the study of biological speciation at macroevolutionary levels, their application to the study of biological diversity at lower taxonomic levels may be problematic on the grounds of gene flow, introgressive hybridization and horizontal gene transfer between geographically proximate lineages [32]. This is likely to be particularly true at the intraspecific level, such as the relationships among human populations, where the potential for gene flow between contiguous populations is constant [33–36]. Similar criticisms have been levelled at the use of tree models in the study of cultural diversity [26,37,38], given the potential for extensive borrowing, convergence and exchange (i.e. ‘culture flow’ [39]) among geographically contiguous groups [40]. Such criticisms may not be warranted in cases where cultural lineages are more akin to interspecific (or higher taxonomic) comparisons, although it is not always possible to judge a priori what the potential for borrowing and exchange between different groups actually is. Accordingly, some researchers argue for a ‘tangled web’ model to represent the evolutionary relationships between populations, whereby channels (lineages) split and flow into each other as a function of their historical relatedness and geographical propinquity.

    Broadly speaking, recent evolutionary analyses of human material culture patterns have taken two major (but not mutually exclusive) methodological approaches to the question of historical divergence and geographical patterning (see also [29]). One approach has been to fit bifurcating tree models representing the phylogenetic (branching) history of individual groups to material culture datasets (e.g. [28,30,31,41,42]). For example, Collard et al. [28] showed that the goodness of fit of various material culture datasets to a phylogenetic tree model was approximately the same as for biological datasets representing a range of different animal taxa. They concluded that while both ‘branching’ and ‘blending’ are important processes in generating cultural diversity patterns, there was little evidence that the effects of ‘blending’ processes were so strong as to obscure the underlying signal of historical divergence among groups.

    The other major analytical approach applied to such questions involves the statistical comparison of affinity matrices representing cultural, geographical or linguistic affiliations using Mantel tests [43] (e.g. [2,5,18,25,44,45]). This approach mirrors that often taken by biological anthropologists interested in understanding the relationship between intraspecific biological distances and various explanatory factors such as geographical distance, time and climate (e.g. [46–49]). As noted above, tree-building (phylogenetic) methods are often deemed appropriate when analysing biological datasets at the supra-specific level, under the assumption that evolving lineages do not exchange genetic material once they have diverged from a common ancestor. Therefore, in the case of modern human population history, where the potential for gene flow and reticulation between population lineages is high, anthropologists have often called upon the analysis of among-group affinity matrices instead of tree-building methods as an alternative method. However, affinity matrices violate the basic statistical assumptions of traditional correlation and regression techniques due to the non-independence of their elements. As a result, Mantel tests [43] and their derivatives are generally employed to compare matrices, whereby significance values are assigned via random permutation of the matrix rows and columns to produce a distribution of values against which the observed correlation can be compared. Extensions of the Mantel test include partial tests, whereby the effects of a third matrix (or more) are held constant, as well as the multiple Mantel test, whereby multiple independent matrices can be regressed against a single dependent matrix [50–52].
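    The permutation logic of a simple Mantel test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in the studies cited: it uses Pearson's r on the upper triangles, a one-tailed test, and small made-up matrices; all function names are hypothetical.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle values after relabelling rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def mantel_test(a, b, permutations=999, seed=0):
    """Return (observed r, one-tailed permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    b_vec = upper_triangle(b, identity)
    observed = pearson(upper_triangle(a, identity), b_vec)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of `a` together
        if pearson(upper_triangle(a, order), b_vec) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Made-up symmetric distance matrices for four groups.
geography = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
culture = [[0, 2, 4, 7], [2, 0, 2, 4], [4, 2, 0, 3], [7, 4, 3, 0]]
r, p = mantel_test(culture, geography, permutations=199)
print(f"r = {r:.3f}, p = {p:.3f}")
```

    Rows and columns are permuted together so that each permuted matrix is still a valid distance matrix among the same groups; only the pairing of groups between the two matrices is randomized.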

    While intuitively simple to implement and interpret, Mantel tests have been criticized for having low statistical power and elevated type I error rates, particularly when one or more matrices are spatially (or phylogenetically) autocorrelated [53,54]. However, others have defended the use of Mantel tests [52] by pointing out that elevated type I errors can, to some extent, be corrected by adjusting the α-levels and by implementing suitable research design, such as using partial Mantel tests when testing specific hypotheses. An alternative, more statistically powerful, approach to Mantel tests involves the use of Procrustes rotation of two matrices to compare their structure and assess congruence (e.g. [48,55,56]). Here, statistical significance is assigned using a permutation test and sum-of-squares differences between the configurations of two matrices. The advantage of this method is that the effect of each individual population can be assessed independently of the whole dataset, while the main disadvantage is the lack of control over additional matrices, the effects of which may actually be driving a spurious correlation between two matrices being rotated.

    Here, we showcase an alternative method, the hierarchical Mantel test (HMT), that quantifies the independent effects of multiple matrices explicitly, thereby taking spatial autocorrelation into consideration. To illustrate the problem of spatial autocorrelation, assume that we have three matched affinity matrices for a sample of human populations: (i) genetic distance, (ii) geographical distance and (iii) climatic differences. Using simple Mantel tests, we learn that all three matrices are significantly and strongly correlated with each other; i.e. genetics correlates with both geography and climate, and geography correlates with climate. So what can we conclude from these results? Is climate driving genetic differentiation? Or does climate correlate with genetic distance because climate and genetics are both spatially autocorrelated? To disentangle what is going on, we could perform a partial Mantel test and learn that the correlation between genetics and climate disappears once we account for geography. So, spatial autocorrelation has been accounted for. However, what partial Mantel tests do not allow us to do is quantify the extent to which geography (or indeed climate) potentially explains genetic distance. It could well be the case that geography is the major determinant of among-group genetic distances, but that climate also plays a minor but additive role in generating diversity. This subtlety would not be picked up by the partial Mantel test alone. A similar problem is faced in cases where biological variation correlates with geographical distance, but there is also a strong correlation between geography and population history. In such cases, it is impossible to tell whether groups have differentiated under a classic model of ‘isolation-by-distance’ [57], or whether their differentiation is primarily due to a hierarchical history of shared common ancestry, which also happens to be spatially autocorrelated [21,34,58].

    Disentangling the effects of these two geographically mediated processes (i.e. phylogenetic divergence versus gene flow) in biology mirrors the stumbling blocks faced in resolving the ‘branching–blending’ debate in cultural evolution. Hence, the HMT method has wide applicability across biological and cultural datasets, where there is a strong co-correlation between geographical distance and a bifurcating tree model representing population history. The HMT was adapted from evolutionary ecology [51] by de Campos Telles & Diniz-Filho [59] to partition the effects of contemporary gene flow and historical effects of population divergence among 10 populations of the Brazilian tree species Eugenia dysenterica. They found that while all three matrices (biology, geography and history) were significantly correlated with each other, when they applied the hierarchical version of the Mantel test, almost 22% of biological variation was explained by ‘history’ alone with only an additional 1.5% explained by geographically mediated gene flow. Thus, it was possible to conclude that historical divergence (i.e. branching) was the more potent force in generating among-population genetic differences, with recent gene flow (i.e. blending) playing a relatively minor role.

    The HMT works as follows: among-group biological/cultural variation is assumed to be partitioned into the effects of (a) pure ‘history’, (b) pure ‘geography’, (c) interaction between history and geography and (d) residual variation. This latter residual variation reflects imperfections in the models of ‘history’ and ‘geography’ used, as well as additional explanatory factors not considered in the HMT model. First, in order to qualify for the HMT, all three matrices (biology/culture distance (D), history (H) and geography (G)) must correlate with each other. Thereafter, two simple Mantel tests (I, II) and one multiple Mantel test (III) are performed: (I) D × H, (II) D × G and (III) D × (H + G). The resultant correlation coefficients are converted into coefficients of determination (R²) and the partitions are calculated as follows:

    (a) = R^2_{III} - R^2_{II}
    (b) = R^2_{III} - R^2_{I}
    (c) = R^2_{I} + R^2_{II} - R^2_{III}
    (d) = 1 - R^2_{III}
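    The partition arithmetic can be sketched in Python. This assumes the standard variance-partitioning scheme, in which each "pure" fraction is the variation explained by the multiple test (III) minus the variation explained by the other predictor alone; the function name and the input correlations are hypothetical and are not results from either case study.

```python
def hmt_partition(r_history, r_geography, r_multiple):
    """Partition variation from three Mantel correlation coefficients.

    r_history   -- simple Mantel r for test (I),  D x H
    r_geography -- simple Mantel r for test (II), D x G
    r_multiple  -- multiple Mantel R for test (III), D x (H + G)
    """
    r2_h = r_history ** 2
    r2_g = r_geography ** 2
    r2_hg = r_multiple ** 2
    return {
        "pure_history": r2_hg - r2_g,       # (a)
        "pure_geography": r2_hg - r2_h,     # (b)
        "interaction": r2_h + r2_g - r2_hg, # (c)
        "residual": 1.0 - r2_hg,            # (d)
    }


# Illustrative correlations only.
parts = hmt_partition(r_history=0.6, r_geography=0.5, r_multiple=0.65)
for name, value in parts.items():
    print(f"{name}: {value:.4f}")
```

    By construction the four fractions sum to 1, which provides a quick sanity check on any set of inputs.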

    Here, we illustrate the potential utility of the HMT method using two human datasets, one biological and one cultural. The biological case study employs a recently published global craniometric dataset [60] representing populations from all major continents. The cultural case study employs material culture data collated by Welsch et al. [1] for coastal communities from northern New Guinea. Notably, in the biological case, the geographical scale is global and large-scale historical factors due to the out-of-Africa dispersal are likely to be strong in structuring among-group patterns (e.g. [21,34]). Conversely, in the case of the cultural dataset, which operates at a far more local level geographically, patterns of inter-group cultural exchange have been proposed by some [1] to be the most pertinent factor in constructing among-group affinity patterns.

    In the case of both the biological and the cultural case studies, three different types of among-group distance or dissimilarity matrices were required, representing (i) ‘geography’, (ii) ‘history’ and (iii) the among-group affinity patterns of interest (i.e. biological distance or cultural similarity).

    The biological case study was based on modern human craniometric affinity patterns (figure 1) for 17 globally distributed populations [60].

    Figure 1. (a) Global map showing geographical location of each cranial population. Stars represent the following waypoints used to calculate geographical distances among groups: Cairo, Egypt (30.0, 31.0); Istanbul, Turkey (41.0, 28.0); Phnom Penh, Cambodia (11.0, 104.0); Anadyr, Russia (64.0, 177.0); Prince Rupert, Canada (54.0, −130.0) and Panama City, Panama (9.1, −79.4). (b) Tree topology representing a hierarchical model of historical divergence among populations based on genetic information. ‘History’ distances simply reflect the number of nodes connecting pairs of populations and branch lengths are not shown to scale.

    (i) The geographical distance matrix comprised between-population great-circle distances in kilometres based on the geographical coordinates provided in table 1. Distances were calculated via the waypoints shown in figure 1 using the Geographic Distance Matrix Generator v. 1.2.3 [61].
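
    The waypoint-based distance described above can be sketched in a few lines of Python. This is an illustration only, not the Geographic Distance Matrix Generator: the haversine formula, the Earth radius constant, the helper names `great_circle_km` and `path_km`, and the choice of which waypoints a given pair of populations is routed through are all assumptions made for the example. Coordinates are taken from table 1 and the figure 1 caption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not given in the text


def great_circle_km(a, b):
    """Haversine great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def path_km(points):
    """Total length of a route that visits the given points in order."""
    return sum(great_circle_km(p, q) for p, q in zip(points, points[1:]))


# Population coordinates from table 1; waypoints from the figure 1 caption.
zulu = (-28.0, 31.0)
italian = (46.0, 10.0)
cairo = (30.0, 31.0)
istanbul = (41.0, 28.0)

direct = great_circle_km(zulu, italian)
# Hypothetical routing: Africa to Europe via Cairo and Istanbul.
via_waypoints = path_km([zulu, cairo, istanbul, italian])
print(round(direct), round(via_waypoints))
```

    Routing through waypoints can only lengthen a path relative to the direct great-circle distance, which is the intended effect: distances follow plausible overland routes rather than crossing oceans.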

    Table 1. Human population craniometric samples employed.

    population       museum^a                  N    latitude, longitude
    1. San           NHM, MH, AMNH, NHMW, DC   31   −21.0, 20.0
    2. Biaka         NHM, MH                   21   4.0, 17.0
    3. Ibo           NHM                       30   7.5, 5.0
    4. Zulu          NHM                       30   −28.0, 31.0
    5. Berber        MH                        30   32.0, 3.0
    6. Italian       NHMW                      30   46.0, 10.0
    7. Basque        MH                        30   43.0, 0.0
    8. Russian       NHMW                      30   61.0, 40.0
    9. Australian    DC                        30   −22.0, 126.0
    10. Andaman      NHM                       28   12.4, 92.8
    11. Mongolian    MH                        30   45.0, 111.0
    12. Chinese      NHMW                      30   32.5, 114.0
    13. Japanese     MH                        30   38.0, 138.0
    14. Alaskan      AMNH                      30   69.0, −158.0
    15. Greenland    SNMNH                     30   70.5, −53.0
    16. Hawikuh      SNMNH                     30   33.5, −109.0
    17. Chubut       MLP                       30   −43.7, −68.7

    (ii) Among-group distances due to historical divergence were constructed based on the branching topology of a consensus genetic tree of population relatedness (figure 1b). This tree was informed primarily by the neighbour-joining analysis of 246 neutral microsatellites published by Pemberton et al. [62], with additional information regarding the branching relationships of the four New World populations derived from Reich et al. [63], and the relationship between the Australians and Andamanese inferred from Rasmussen and co-workers [47,64]. Once the fully resolved genetic tree was constructed, a distance matrix reflecting these hierarchical branching relationships was compiled, whereby each node separating two populations was counted as one unit. Hence, if two populations were separated by four nodes, their ‘history’ distance was coded as 4, and so forth. The resultant history distance matrix was analysed using a neighbour-joining [65] algorithm to confirm the hierarchical branching relationship shown in figure 1b.
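
    The node-counting rule can be illustrated with a small sketch. The five-leaf tree below is hypothetical and is not the tree in figure 1b; the `parent` mapping and the function names are likewise illustrative. Each internal node lying on the path between two populations counts as one unit, so sister populations have a distance of 1.

```python
def ancestors(parent, leaf):
    """Internal nodes from a leaf's parent up to the root, nearest first."""
    chain = []
    node = leaf
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def node_distance(parent, a, b):
    """Number of internal nodes on the path connecting leaves a and b."""
    if a == b:
        return 0
    up_a = ancestors(parent, a)
    up_b = ancestors(parent, b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            steps_b = up_b.index(node)
            # Nodes below the shared node on each side, plus the shared node itself.
            return steps_a + steps_b + 1
    raise ValueError("leaves do not share a common node")


# Hypothetical tree ((A,B),(C,(D,E))) stored as child -> parent links.
parent = {
    "A": "n1", "B": "n1", "n1": "root",
    "D": "n3", "E": "n3", "n3": "n2",
    "C": "n2", "n2": "root",
}
leaves = ["A", "B", "C", "D", "E"]
history = [[node_distance(parent, x, y) for y in leaves] for x in leaves]
for row in history:
    print(row)
```

    Because only topology is used, branch lengths play no part in the resulting ‘history’ matrix.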

    (iii) Craniometric distance was calculated as the pairwise Procrustes distances among populations, following a geometric morphometric analysis of cranial landmark configurations. Table 1 provides the sample sizes and museum repositories for each cranial population sampled. Only adults were measured and sex was ascertained using standard osteological protocols [66]. One hundred and thirty-five cranial landmarks were digitized from each cranium by N.v.C.-T. (see electronic supplementary material, table S1 for anatomical descriptions) using a Microscribe 3DX™ digitizer. The full cranial landmark configuration was also subdivided into three standard functional–developmental modules (face, cranial vault and basicranium), each of which was also analysed separately. Each cranial landmark configuration was aligned using Generalized Procrustes Analysis and tangent space projection in MorphoJ 1.06 [67], and the resultant scaled Procrustes shape variables were employed to calculate the Procrustes distance matrices representing biological distances between groups. This resulted in four biological distance matrices representing the shape of the entire cranium, the face, cranial vault and the basicranium.

    The cultural case study was based on material culture affinity patterns (figure 2) for 10 Austronesian-speaking groups distributed along the northern coast of New Guinea [1]. These data were collated from a larger dataset compiled by Welsch et al. [1] of material culture variability across 31 Austronesian and Papuan communities, speaking languages from as many as seven different language families. We chose to focus solely on the communities speaking Austronesian languages primarily because this allowed us, in the absence of genetic data, to use linguistic information within a single language family to construct a hypothesis of historical divergence. The other Papuan language families each contained fewer community samples within families and, given the uncertainty of how language families are related to one another [68], constructing a higher-level phylogeny for all 31 communities was not feasible. Moreover, the Austronesian communities differed from the Papuan ones in being geographically widespread across the study area (figure 2) with communities in the far west and east end of the sampled region. There were three additional communities in the original dataset that were reported to speak both an Austronesian and a Papuan language, which we excluded from the present dataset, given uncertainties as to their primary linguistic affiliation. This resulted in a final dataset compiled for 10 material culture samples from Austronesian-speaking villages (table 2).

    Figure 2. (a) Map showing geographical location of each of the Austronesian communities sampled along the northern coast of New Guinea. (b) Tree topology representing the hierarchical model of historical divergence among populations based on linguistic information. Community names are those given by Welsch et al. [1] as taken from museum records at the time of sampling and do not necessarily reflect local village names used today (table 2).

    Table 2. Austronesian villages sampled for cultural traits.

    community^a                      language spoken^c   latitude, longitude^d
    1. Humboldt (Yos Sudarso) Bay    Yotafa (Tobati)     −2.56, 140.71
    2. Sissano                       Sissano             −3.00, 142.06
    3. Malol                         Sissano^e           −3.10, 142.24
    4. Tumleo^b                      Tumleo              −3.12, 142.40
    5. Ali^b                         Ali (Kap)^f         −3.13, 142.47
    6. Seleo^b                       Ali (Kap)           −3.14, 142.49
    7. Angel^b                       Ali (Kap)           −3.16, 142.49
    8. Wogeo (Vokeo)^b               Wogeo               −3.22, 144.10
    9. Koil^b                        Wogeo               −3.36, 144.21
    10. Kadowar (Kadovar)^b          Bam (Biem)          −3.61, 144.59

    (i) The geographical distance matrix was constructed from the distances (in kilometres) between communities published by Welsch et al. [1]. We provide estimates of the geographical coordinates for each community based on village names given by Welsch et al. [1] in table 2. Owing to their coastal position (figure 2), the geographical distances provided by Welsch et al. [1] assume that interaction between communities was by water (via canoe) rather than overland. Therefore, the fact that some communities were island-based and some from the New Guinea mainland (table 2) should not affect the results of our analyses, as it was assumed that all communication between groups was via ‘water distances’.

    (ii) Distances representing historical divergence between groups were based on the linguistic similarity measures published by Welsch et al. [1], as shown in figure 2b. The coding system used for language similarities also follows Welsch et al. [1], and the resultant linguistic similarity matrix used is shown in the electronic supplementary material, table S2. Use of this particular linguistic taxonomy allowed for a more direct comparison to be made between the results obtained here and those of the original study by Welsch et al. [1]. To convert the language similarity matrix to a distance matrix, we inverted the coding (i.e. 95% becomes 5%, 30% becomes 70% etc.). The resultant history distance matrix based on language was analysed using a neighbour-joining algorithm [65] to confirm the hierarchical branching relationship shown in figure 2b.

    (iii) Cultural similarity between communities was quantified based on the presence or absence of 44 classes of material culture in the 10 communities (see electronic supplementary material, table S3). Of the original 47 traits coded by Welsch et al. [1], three were found to be absent in all 10 Austronesian communities and were therefore removed from the dataset as uninformative. A matrix representing the pairwise similarities among groups based on the remaining 44 cultural traits was calculated in PAST v. 3.07 [69] as Jaccard distances, which are particularly appropriate for binary (presence/absence) data [2,5,44].
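
    For binary presence/absence data, the Jaccard distance between two communities is one minus the share of traits present in both out of the traits present in at least one. A minimal sketch follows; the trait vectors are invented for illustration and are not taken from table S3, and the handling of two all-absent vectors is a convention chosen here, not necessarily the one used by PAST.

```python
def jaccard_distance(x, y):
    """Jaccard distance between two equal-length 0/1 trait vectors."""
    if len(x) != len(y):
        raise ValueError("trait vectors must have the same length")
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    if either == 0:
        return 0.0  # convention chosen here: two empty inventories are identical
    return 1.0 - both / either


# Hypothetical inventories for two villages over five trait classes.
village_1 = [1, 1, 0, 1, 0]
village_2 = [1, 0, 0, 1, 1]
print(jaccard_distance(village_1, village_2))  # 2 shared of 4 present -> 0.5
```

    Traits absent from both communities do not enter the calculation, which is why shared absences do not inflate similarity.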

    In both the biological and cultural datasets, the matrices representing ‘geography’ and ‘history’ were found to be significantly correlated using a Mantel test [43]: biology: r = 0.592, p < 0.001; culture: r = 0.811, p = 0.001. This confirms the strong association between the geographical positions of individual groups and their patterns of relatedness as shown in figures 1 and 2. Therefore, in both datasets, the patterns of historical divergence among groups are also strongly spatially correlated. As noted earlier, the HMT partitions out the unique contribution of two geographically mediated processes (historical divergence (VarHist) and gene/culture flow (VarGeo)), as well as quantifying the contribution of the interaction between these two processes (VarHist × Geo). Assuming three distance matrices (biology, history and geography), the test uses standard and multiple Mantel regressions to partition the effects of history, geography and the interaction between history/geography as follows:
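
    One way to write this partition in code is sketched below. It assumes the standard variation-partitioning scheme for two correlated predictors and is not the authors' supplementary R function; the function names are illustrative. The R² for test (III) is obtained from the textbook two-predictor regression formula, which is an assumption about how the multiple Mantel regression behaves on the unfolded distance vectors. The example inputs are back-calculated from table 4 under this assumed scheme (R²(I) = 0.0866 + 0.0173, R²(II) = 0.3547 + 0.0173, R²(III) = 0.4586) and are not values reported in the text.

```python
def r2_two_predictors(r_dh, r_dg, r_hg):
    """R-squared of D regressed on H and G, from the three pairwise correlations."""
    return (r_dh ** 2 + r_dg ** 2 - 2 * r_dh * r_dg * r_hg) / (1 - r_hg ** 2)


def hmt_partition(r2_i, r2_ii, r2_iii):
    """Split variation given R-squared from tests (I) D x H, (II) D x G, (III) D x (H + G)."""
    return {
        "history": r2_iii - r2_ii,               # unique to history
        "geography": r2_iii - r2_i,              # unique to geography
        "history_x_geography": r2_i + r2_ii - r2_iii,  # shared by both
        "residual": 1.0 - r2_iii,                # unexplained
    }


# Back-calculated inputs for the cultural case study (see lead-in; assumption).
parts = hmt_partition(r2_i=0.1039, r2_ii=0.3720, r2_iii=0.4586)
for name, value in parts.items():
    print(f"{name}: {value:.4f}")
```

    The three explained components always sum to R²(III), so the residual is simply what the combined history-plus-geography model leaves unexplained.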

    Standard and multiple Mantel tests were conducted in R v. 3.1 using the multi.mantel and the mantel functions from the phytools [70] and the vegan [71] packages, respectively. An R function to implement the HMT is provided in the electronic supplementary material.
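
    For readers working outside R, a simple Mantel test can be sketched in plain Python. This is an illustrative one-tailed permutation version written for this text, not a port of the vegan or phytools functions; the function names, the seed, the number of permutations and the toy matrices are all assumptions.

```python
import math
import random


def upper_triangle(m):
    """Flatten the strict upper triangle of a square matrix into a list."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(a, b, permutations=999, seed=0):
    """Return (r, p): matrix correlation and a one-tailed permutation p-value."""
    n = len(a)
    va = upper_triangle(a)
    r_obs = pearson(va, upper_triangle(b))
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of b together; a stays fixed.
        permuted = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(va, upper_triangle(permuted)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five groups placed along a line.
positions = [0.0, 1.0, 3.0, 6.0, 10.0]
geo = [[abs(p - q) for q in positions] for p in positions]
bio = [[2.0 * d for d in row] for row in geo]  # exactly proportional to geo

r, p = mantel(bio, geo)
print(r, p)
```

    Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling group labels preserves that dependence under the null hypothesis.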

    Table 3 presents the results of the HMT for the biological datasets. For all four cranial matrices tested, historical divergence explains a much higher proportion of the among-group biological variation than pure geographically mediated gene flow. Depending on the cranial region under investigation, history explains between 15 and 22% of the overall variation, compared with less than 2% explained by pure geography. It should be noted that the interaction between history and geography also explained a proportion of overall variation in most cases, although this varied substantially between different regions of the skull (approx. 1% for the basicranium to approximately 15% for the vault). Contrasting the pattern of results for the basicranium and the face reveals some interesting differences in the extent to which different cranial features track phylogenetic signals compared to reflecting the past action of gene flow. The strongest historical signal was found in the basicranium (approx. 22%), with virtually no additional variation explained by geography or the interaction between history and geography. Conversely, for the face, the pure historical signal was a little weaker (19%), but with the same amount of biological variance (19%) explained by the interaction between history and geography. This may reflect the additional effect of environmental factors such as climate or diet in creating among-group facial differentiation [72] that are currently not being accounted for by the HMT.

    Table 3. Results of HMT for global cranial data. Data are proportions of among-group biological distance explained by the models of historical divergence, geographical distance and the interaction between these two factors (p-values in parentheses).

                   history            geography          history × geography   total (%)
    cranium        0.1979 (<0.001)    0.0002 (0.007)     0.1194 (<0.001)       31.75
    vault          0.1515 (<0.001)    0.0089 (0.013)     0.1533 (0.002)        31.37
    face           0.1930 (<0.001)    0.0101 (<0.001)    0.1900 (0.001)        39.31
    basicranium    0.2268 (<0.001)    0.0198 (0.080)     0.0109 (<0.001)       25.75

    Table 4 presents the analogous HMT results for the cultural dataset. By contrast with the biological case study, by far the strongest predictor of among-group cultural similarity was geography (35%), with history explaining only an additional 8% of cultural variation. Notably, the effect of the interaction between history and geography was not particularly strong for the cultural case study (approx. 1.7%), indicating that the patterns of cultural similarities represented in the Austronesian dataset are primarily driven by geographical distance rather than by the historical relatedness of these groups, as measured by linguistic similarity.

    Table 4. Results of HMT for Austronesian cultural data. Data are proportions of among-group cultural similarity explained by the models of historical divergence, geographical distance and the interaction between these two factors (p-values in parentheses).

               history           geography         history × geography   total (%)
    culture    0.0866 (0.049)    0.3547 (0.010)    0.0173 (0.006)        45.86

    The results for the craniometric and cultural case studies differed primarily in the relative contributions of historical divergence and recent gene/culture flow to observed patterns of among-group affinity, despite a strong correlation between geographical distance and historical divergence in both cases. In the craniometric case study, the tree-like model of historical divergence explained a much larger proportion of population diversity than geographically mediated gene flow. It is worth noting, however, that the relative proportions of variation explained by history and geography differed depending on the cranial region under investigation. This highlights how some areas of morphology track phylogenetic signals of population divergence more accurately, while other regions may be influenced by additional (e.g. environmental) factors not considered as part of the hierarchical Mantel model [72]. The fact that the cranial data were better explained by a tree model of successive common ancestors is not surprising given that the dataset considered here is global rather than regional in focus. It is well established that global patterns of human genetic and morphological variation were largely established as a result of the relatively rapid migration out-of-Africa at least 70 kya [19], explaining the close correspondence between biological structure and global geography [21,34,36,47]. While population history is still important at a more local/regional level, population structure at that scale is relatively more likely to be driven by recent gene flow between geographically proximate populations, as well as to harbour the signals of more ancient population structure resulting from large-scale dispersals or recent migrations [34,36].
It is worth noting that there are other ways in which this ‘history’ matrix could have been constructed, including using actual genetic distance data which would reflect both the hierarchical relationships (numbers of nodes) and the relative genetic distances (branch lengths) between populations. In this case, the history model was informed by genetic data, but the lack of a single integrated genetic database for all populations considered precluded the use of primary genetic distance data.

    In the case of the cultural case study, the opposite pattern was observed, whereby geographically mediated culture flow was the more prominent factor explaining among-group affinity patterns, while ‘history’, as measured by linguistic affiliation, played a relatively minor role. Our results support Welsch et al.'s [1] original conclusions that geography played a strong role in generating the observed patterns of cultural variation, at least among the Austronesian communities in their dataset. Welsch et al. [1] downplayed the role of linguistic affiliation, stating that material culture diversity appeared to be ‘unrelated to the linguistic relationships of these communities'. This assessment was later challenged by analyses carried out by Moore and co-workers [73,74], all of which demonstrated that both geography and linguistic affiliation were important for explaining cultural patterning. It is difficult to directly compare our results to those generated by the aforementioned studies, as we focused only on the Austronesian rather than the complete dataset which included communities from several Papuan language families. However, a subsequent study by Shennan & Collard [25] sought to better understand the impact of population history (branching) and geographical propinquity (blending) on this dataset by comparing pairs of Austronesian-speaking communities against neighbouring Papuan groups. The rationale was that Austronesian-speaking peoples are relatively recent migrants to New Guinea, arriving approximately 3000 years ago as part of a rapid demographic expansion across the Pacific region, originating in Taiwan approximately 5500 years ago [75,76]. Therefore, Austronesian-speaking communities form a coherent group with a particular linguistic and genetic history [77] that might also be reflected in their cultural affinity patterns. 
While our results concur with those of Shennan & Collard [25] in highlighting that the Austronesian communities form a coherent phylogenetic grouping, our findings also emphasize the strong role played by geographically mediated processes such as borrowing, exchange and cultural assimilation with neighbouring Papuan-speaking communities. This history of interaction is also reflected in genetic studies, which show that rates of language borrowing and exchange between Papuan and Austronesian speakers were fast and pervasive, compared with relatively low rates of genetic admixture [75]. However, it should also be noted that shore-dwelling populations in Oceania are generally more admixed than inland populations, as dispersal along shorelines via watercraft makes movement easier [75]. Therefore, it is likely that the effects of cultural ‘blending’ were relatively important in generating the among-group cultural patterns observed in the coastal Austronesian communities analysed here. One future avenue of inquiry in this regard may be to use linguistic data such as phoneme variation to reconstruct a ‘history’ matrix that compares across language families (Austronesian and Papuan), given the close association found between phonemic and genetic diversity patterns at a global level [22].

    Our results suggest that, at least in the case of the material culture dataset analysed here, ‘blending’ forces such as cultural contact, borrowing and exchange were more potent in generating among-group affinity patterns than phylogenetic ‘branching’. However, there are a couple of issues that need to be considered when interpreting the results in the context of debates regarding the relative importance of ‘branching’ and ‘blending’ in driving patterns of material culture evolution. It should be made clear that the hypothesis of phylogenetic history entered into our model uses linguistic affinities to model the population history of the people, not the phylogenetic histories of the cultural attributes themselves, which may or may not amount to the same histories. Indeed, it has been shown that even different classes of artefacts from the same populations can be subject to differing cultural evolutionary forces, leading to contrasting affinity patterns in their attributes among those same communities [18]. In this case, however, we are asking whether the presence/absence patterns of multiple artefacts exhibited across different groups of people potentially tell us something about the cultural history of those communities played out across time and space. When phylogenetic methods are applied to material culture attributes directly [41], a strong tree-like signal may well be recovered, but this may or may not reflect the phylogenetic history of the particular populations of people that created the material culture patterns, as would be measured using genetic or linguistic data. Accordingly, it is possible that material culture patterns follow a strong tree-like model of successive bifurcations, yet this model does not match patterns of genetic or linguistic affiliation.
Hence, it is worth noting that our results do not negate the presence of a phylogenetic history in the cultural dataset, or more specifically sub-elements of it, but rather suggest that due to geographically mediated cultural exchange among groups, the historical signal of population splitting has been overridden by a more recent signal of culture flow [39]. In some instances, it can be shown at the outset that language and geography are uncorrelated (e.g. [18]), which means that the HMT method would not be required to disentangle the effects of isolation-by-distance from historical branching under such conditions. However, as shown here, such fortuitous circumstances are not always the case.

    Regardless of the precise details of these two case studies, the results serve to illustrate the usefulness of the HMT for disentangling and explicitly quantifying the relative impact of two distinct geographically mediated processes: historical divergence and recent gene/culture flow. Recent studies have acknowledged that conceptualizing these two processes as dichotomous is problematic [7], due to the complexities of inferring instances of ‘branching’ and ‘blending’ in morphological [60] or cultural (e.g. [29]) datasets. Hence, recent studies have employed other methods such as network analyses (e.g. [78–81]) to explore among-group affinities or agent-based simulations [29] to better understand the effects of fission (splitting), innovation (mutation) and horizontal transfer on observed affinity patterns. We advocate that the HMT provides a simple yet effective adjunct to existing methodological approaches for examining the spatial and phylogenetic causes of cultural diversity patterns, especially when several data matrices are correlated with each other, meaning that an independent measure of history is elusive. It is also important to note that ‘history’ can be modelled in several different ways, depending on the research questions at hand. Here, we chose to employ the same linguistic taxonomy used by Welsch et al. [1] as a means of maintaining continuity with the original study, but historical relationships could also be modelled using actual genetic distances or linguistic data such as phonemic variation. As mentioned earlier, the HMT method is likely to be most effective where both the cultural and phylogenetic patterns are spatially autocorrelated. Its primary value lies in the explicit quantification of the effects of phylogeny (history) and geographical propinquity in causing observed patterns, and assumes a priori that historical and geographical factors are related. 
In that sense, it avoids the pitfall of false dichotomy in assuming that affinity patterns must be caused by either branching or blending, rather than acknowledging that both processes are likely conspiring to create observed among-group differences under such conditions.

    Extensions of the HMT method to include additional causal factors, such as ecological data, are also possible, although each additional factor would increase the number of interaction terms quantified. However, such an extension may prove useful in the cases of particular material culture datasets where there are clear environmental or climatic correlates (i.e. food preparation, clothing and shelters). Another possible extension of the HMT to human cultural datasets could involve testing the goodness of fit of alternative phylogenetic models for explaining particular cultural patterns. For example, von Cramon-Taubadel et al. [60] recently used the HMT to test all possible phylogenetic positions for the prehistoric Paleoamerican population from Lagoa Santa (Brazil) within the context of modern human craniometric diversity. By using the HMT to hold the effects of gene flow constant, it was possible to test alternative migration scenarios into the New World. Hence, extrapolating this method to cultural datasets may be useful in cases where the relative phylogenetic position of an individual population is uncertain within the context of a ‘known’ phylogeny, yet allows one to control for the effects of geographically mediated culture flow.

    Here, we showcased a simple extension of standard matrix correlation methods, termed the HMT, to explicitly quantify the relative impact of population history (phylogeny) and among-group contact (gene/culture flow) in generating observed patterns of population affinities, using both a biological and material cultural case study. In both case studies, the among-group biological/cultural distances and the phylogenetic model employed were spatially autocorrelated, illustrating the often intimate interconnections between space (geography) and time (history) in generating human diversity patterns. In the biological case study, the results showed that historical factors (phylogeny) played the more potent role in generating observed patterns of global among-population craniometric diversity. Conversely, in the cultural case study, geographically mediated culture flow between contiguous populations explained a greater proportion of the differences in the presence or absence of material culture attributes among 10 Austronesian communities from the northern coast of New Guinea. These case studies serve to illustrate the utility of this simple and intuitive method for disentangling the relative effects of history (i.e. branching) and geography (i.e. blending) in explaining observed cultural affinity patterns.

    This article has no additional data.

    N.v.C.-T. and S.J.L. performed research conception and design. N.v.C.-T. carried out data acquisition and analysis. N.v.C.-T. and S.J.L. wrote the paper. All authors approved the final manuscript for submission.

    We have no competing interests.

    We are grateful to the Research Foundation of the State University of New York for funding support.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3965460.

    References

    • 1

      Welsch RL, Terrell J, Nadolski JA. 1992 Language and culture on the north coast of New Guinea. Am. Anthropol. 94, 568–600. (doi:10.1525/aa.1992.94.3.02a00030)

    • 2

      Jordan P, Shennan S. 2003 Cultural transmission, language, and basketry traditions amongst the California Indians. J. Anthropol. Archaeol. 22, 42–74. (doi:10.1016/S0278-4165(03)00004-7)

    • 3

      Mace R, Holden CJ, Shennan SJ. 2005 The evolution of cultural diversity: a phylogenetic approach. London, UK: UCL Press.

    • 4

      Freckleton RP, Jetz W. 2009 Space versus phylogeny: disentangling phylogenetic and spatial signals in comparative data. Proc. R. Soc. B 276, 21–30. (doi:10.1098/rspb.2008.0905)

    • 5

      Lycett SJ. 2014 Dynamics of cultural transmission in Native Americans of the High Great Plains. PLoS ONE 9, e112244. (doi:10.1371/journal.pone.0112244)

    • 6

      Lycett SJ. 2015 Cultural evolutionary approaches to artifact variation over time and space: basis, progress, and prospects. J. Archaeol. Sci. 56, 21–31. (doi:10.1016/j.jas.2015.01.004)

    • 7

      Jordan P. 2015 Technology as human social tradition: cultural transmission among hunter-gatherers. Oakland, CA: University of California Press.

    • 8

      Lycett SJ. 2017 Cultural patterns within and outside of the post-contact Great Plains as revealed by parfleche characteristics: implications for areal arrangements in artifactual data. J. Anthropol. Archaeol. 48, 87–101. (doi:10.1016/j.jaa.2017.07.003)

    • 9

      Mesoudi A, Whiten A, Laland KN. 2004 Perspective: is human cultural evolution Darwinian? Evidence reviewed from the perspective of The Origin of Species. Evol. Anthropol. 58, 1–11. (doi:10.1111/j.0014-3820.2004.tb01568.x)

    • 10

      Eerkens JW, Lipo CP. 2007 Cultural transmission theory and the archaeological record: providing context to understanding variation and temporal changes in material culture. J. Archaeol. Res. 15, 239–274. (doi:10.1007/s10814-007-9013-z)

    • 11

      O'Brien MJ, Lyman RL. 2000 Applying evolutionary archaeology: a systematic approach. New York, NY: Kluwer Academic/Plenum.

    • 12

      Shennan S. 2011 Descent with modification and the archaeological record. Phil. Trans. R. Soc. B 366, 1070–1079. (doi:10.1098/rstb.2010.0380)

    • 13

      Lycett SJ. 2011 ‘Most beautiful and most wonderful’: those endless stone tool forms. J. Evol. Psychol. 9, 143–171. (doi:10.1556/JEP.9.2011.23.1)

    • 14

      Mesoudi A. 2011 Cultural evolution: how Darwinian theory can explain culture and synthesize the social sciences. Chicago, IL: Chicago University Press.

    • 15

      Cavalli-Sforza LL, Feldman MW. 1981 Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press.

    • 16

      Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.

    • 17

      Hamilton MJ, Buchanan B. 2009 The accumulation of stochastic copying errors causes drift in culturally transmitted technologies: quantifying Clovis evolutionary dynamics. J. Anthropol. Archaeol. 28, 55–69. (doi:10.1016/j.jaa.2008.10.005)

    • 18

      Lycett SJ. 2015 Differing patterns of material culture intergroup variation on the high plains: quantitative analyses of parfleche characteristics vs. moccasin decoration. Am. Antiq. 80, 714–731. (doi:10.7183/0002-7316.80.4.714)

    • 19

      Eriksson A, Betti L, Friend AD, Lycett SJ, Singarayer JS, von Cramon-Taubadel N, Valdes PJ, Balloux F, Manica A. 2012 Late Pleistocene climate change and the global expansion of anatomically modern humans. Proc. Natl Acad. Sci. USA 109, 16 089–16 094. (doi:10.1073/pnas.1209494109)

    • 20

      Cavalli-Sforza LL, Menozzi P, Piazza A. 1994 The history and geography of human genes. Princeton, NJ: Princeton University Press.

    • 21

      Ramachandran S, Deshpande O, Roseman CC, Rosenberg NA, Feldman MW, Cavalli-Sforza LL. 2005 Support from the relationship of genetic and geographic distance in human populations for the serial founder effect originating in Africa. Proc. Natl Acad. Sci. USA 102, 15 942–15 947. (doi:10.1073/pnas.0507611102)

    • 22

      Creanza N, Ruhlen M, Pemberton TJ, Rosenberg NA, Feldman MW, Ramachandran S. 2015 A comparison of worldwide phonemic and genetic variation in human populations. Proc. Natl Acad. Sci. USA 112, 1265–1272. (doi:10.1073/pnas.1424033112)

    • 23

      Reyes-Centeno H, Harvati K, Jäger G. 2016 Tracking modern human population history from linguistic and cranial phenotype. Sci. Rep. 6, 36645. (doi:10.1038/srep36645)

    • 24

      Henn BM, Cavalli-Sforza LL, Feldman MW. 2012 The great human expansion. Proc. Natl Acad. Sci. USA 109, 17 758–17 764. (doi:10.1073/pnas.1212380109)

    • 25

      Shennan SJ, Collard M. 2005 Investigating processes of cultural evolution on the north coast of New Guinea with multivariate and cladistic analyses. In The evolution of cultural diversity: a phylogenetic approach (eds Mace R, Holden CJ, Shennan SJ), pp. 133–164. London, UK: UCL Press.

    • 26

      Moore JH. 1994 Putting anthropology back together again: the ethnogenetic critique of cladistic theory. Am. Anthropol. 96, 925–948. (doi:10.1525/aa.1994.96.4.02a00110)

    • 27

      Bellwood P. 1996 Phylogeny vs reticulation in prehistory. Antiquity 70, 881–890. (doi:10.1017/S0003598X00084131)

    • 28

      Collard M, Shennan SJ, Tehrani JJ. 2006 Branching, blending, and the evolution of cultural similarities and differences among human populations. Evol. Hum. Behav. 27, 169–184. (doi:10.1016/j.evolhumbehav.2005.07.003)

    • 29

      Crema ER, Kerig T, Shennan SJ. 2014 Culture, space, and metapopulation: a simulation-based study for evaluating signals of blending and branching. J. Archaeol. Sci. 43, 289–298. (doi:10.1016/j.jas.2014.01.002)

    • 30

      Tehrani J, Collard M. 2002 Investigating cultural evolution through biological phylogenetic analyses of Turkmen textiles. J. Anthropol. Archaeol. 21, 443–463. (doi:10.1016/S0278-4165(02)00002-8)

    • 31

      Mace R, Holden CJ. 2005 A phylogenetic approach to cultural evolution. Trends Ecol. Evol. 20, 116–121. (doi:10.1016/j.tree.2004.12.002)

    • 32

      Arnold ML. 2016 Divergence with genetic exchange. Oxford, UK: Oxford University Press.

    • 33

      Sherry ST, Batzer MA. 1997 Modelling human evolution—to tree or not to tree? Genome Res. 7, 947–949. (doi:10.1101/gr.7.10.947)

    • 34

      Hunley KL, Healy ME, Long JC. 2009 The global pattern of gene identity variation reveals a history of long-range migrations, bottlenecks, and local mate exchange: implications for biological race. Am. J. Phys. Anthropol. 139, 35–46. (doi:10.1002/ajpa.20932)

    • 35

      Pickrell JK, Pritchard JK. 2012 Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, e1002967. (doi:10.1371/journal.pgen.1002967)

    • 36

      Hunley KL, Cabana GS. 2016 Beyond serial founder effects: the impact of admixture and localized gene flow on patterns of regional genetic diversity. Hum. Biol. 88, 219–231. (doi:10.13110/humanbiology.88.3.0219)

    • 37

      Kroeber AL. 1948 Anthropology: race, language, culture, psychology, pre-history. New York, NY: Harcourt Brace.

    • 38

      Terrell J. 1988 History as a family tree, history as an entangled bank: constructing images and interpretations of prehistory in the South Pacific. Antiquity 62, 642–657. (doi:10.1017/S0003598X00075049)

    • 39

      Lycett SJ, von Cramon-Taubadel N. 2016 Transmission of biology and culture among post-contact Native Americans on the western Great Plains. Sci. Rep. 6, 25695. (doi:10.1038/srep25695)

    • 40

      Borgerhoff Mulder M, Nunn CL, Towner MC. 2006 Cultural macroevolution and the transmission of traits. Evol. Anthropol. 15, 52–64. (doi:10.1002/evan.20088)

    • 41

      O'Brien MJ, Lyman RL. 2003 Cladistics and archaeology. Salt Lake City, UT: University of Utah Press.

    • 42

      Buchanan B, Collard M. 2007 Investigating the peopling of North America through cladistic analyses of Early Paleoindian projectile points. J. Anthropol. Archaeol. 26, 366–393. (doi:10.1016/j.jaa.2007.02.005)

    • 43

      Mantel NA. 1967 The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 209–220.

    • 44

      Rogers DS, Ehrlich PR. 2008 Natural selection and cultural rates of change. Proc. Natl Acad. Sci. USA 105, 3416–3420. (doi:10.1073/pnas.0711802105)

    • 45

      Hart JP. 2012 The effects of geographic distances on pottery assemblage similarities: a case study from northern Iroquoia. J. Archaeol. Sci. 39, 128–134. (doi:10.1016/j.jas.2011.09.010)

    • 46

      Hubbe M, Neves WA, Harvati K. 2010 Testing evolutionary and dispersion scenarios for the settlement of the New World. PLoS ONE 5, e11105. (doi:10.1371/journal.pone.0011105)

    • 47

      Reyes-Centeno H, Ghirotto S, Détroit F, Grimaud-Hervé D, Barbujani G, Harvati K. 2014 Genomic and cranial phenotype data support multiple modern human dispersals from Africa and a southern route into Asia. Proc. Natl Acad. Sci. USA 111, 7248–7253. (doi:10.1073/pnas.1323666111)

    • 48

      von Cramon-Taubadel N. 2016 Population biodistance in global perspective: assessing the influence of population history and environmental effects on patterns of craniomandibular variation. In Biological distance analysis: forensic and bioarchaeological perspectives (eds Pilloud MA, Hefner JT), pp. 425–445. London, UK: Elsevier.

    • 49

      Pilloud MA, Hefner JT. 2016 Biological distance analysis: forensic and bioarchaeological perspectives. London, UK: Academic Press.

    • 50

      Smouse PE, Long JC, Sokal RR. 1986 Multiple regression and correlation extensions of the Mantel test of matrix correspondence. Syst. Zool. 35, 627–632. (doi:10.2307/2413122)

    • 52

      Diniz-Filho JAF, Soares TN, Lima JS, Dobrovolski R, Landeiro VL, Pires de Campos Telles M, Rangel TF, Bini LM. 2013 Mantel test in population genetics. Genet. Mol. Biol. 36, 475–485. (doi:10.1590/S1415-47572013000400002)

    • 53

      Harmon LJ, Glor RE. 2010 Poor statistical performance of the Mantel test in phylogenetic comparative analyses. Evolution 64, 2173–2178. (doi:10.1111/j.1558-5646.2010.00973.x)

    • 54

      Guillot G, Rousset F. 2013 Dismantling the Mantel tests. Meth. Ecol. Evol. 4, 336–344. (doi:10.1111/2041-210x.12018)

    • 55

      Relethford JH. 2010 Population-specific deviations of global human craniometric variation from a neutral model. Am. J. Phys. Anthropol. 142, 105–111. (doi:10.1002/ajpa.21207)

    • 56

      Peres-Neto PR, Jackson DA. 2001 How well do multivariate datasets match? The advantages of a Procrustean superimposition approach over the Mantel test. Oecologia 129, 169–178. (doi:10.1007/s004420100720)

    • 58

      Meirmans PG. 2012 The trouble with isolation by distance. Mol. Ecol. 21, 2839–2846. (doi:10.1111/j.1365-294X.2012.05578.x)

    • 59

      de Campos Telles MP, Diniz-Filho JAF. 2005 Multiple Mantel tests and isolation-by-distance, taking into account long-term historical divergence. Genet. Mol. Res. 4, 742–748.

    • 60

      von Cramon-Taubadel N, Strauss A, Hubbe M. 2017 Evolutionary population history of early Paleoamerican cranial morphology. Sci. Adv. 3, e1602289. (doi:10.1126/sciadv.1602289)

    • 61

      Ersts PJ. 2015 Geographic distance matrix generator v. 1.2.3. New York, NY: American Museum of Natural History, Center for Biodiversity and Conservation.

    • 62

      Pemberton TJ, DeGiorgio M, Rosenberg NA. 2013 Population structure in a comprehensive genomic data set on human microsatellite variation. G3 Genes Genomes Genet. 3, 891–907. (doi:10.1534/g3.113.005728)

    • 63

      Reich D et al. 2012 Reconstructing native American population history. Nature 488, 370–375. (doi:10.1038/nature11258)

    • 64

      Rasmussen M et al. 2011 An aboriginal Australian genome reveals separate human dispersals into Asia. Science 334, 94–98. (doi:10.1126/science.1211177)

    • 65

      Saitou N, Nei M. 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.

    • 66

      Buikstra JE, Ubelaker DH. 1994 Standards for data collection from human skeletal remains. Fayetteville, AR: Arkansas Archaeological Survey Research Series No. 44.

    • 67

      Klingenberg CP. 2011 MorphoJ: an integrated software package for geometric morphometrics. Mol. Ecol. Res. 11, 353–357. (doi:10.1111/j.1755-0998.2010.02924.x)

    • 68

      Kayser M. 2010 The human genetic history of Oceania: near and remote views of dispersal. Curr. Biol. 20, R194–R201. (doi:10.1016/j.cub.2009.12.004)

    • 69

      Hammer Ø, Harper DAT, Ryan PD. 2001 Paleontological statistics software package for education and data analysis. Paleontol. Electron. 4, 1–9.

    • 70

      Revell LJ. 2012 phytools: an R package for phylogenetic comparative biology (and other things). Meth. Ecol. Evol. 3, 217–223. (doi:10.1111/j.2041-210X.2011.00169.x)

    • 71

      Oksanen J et al. 2016 vegan: Community Ecology Package. In R package version 2.3-3. https://cran.r-project.org/web/packages/vegan/vegan.pdf.

    • 72

      von Cramon-Taubadel N. 2014 Evolutionary insights into global patterns of human cranial diversity: population history, climatic and dietary effects. J. Anthropol. Sci. 92, 43–77. (doi:10.4436/jass.91010)

    • 73

      Moore CC, Romney AK. 1994 Material culture, geographic propinquity, and linguistic affiliation on the north coast of New Guinea: a reanalysis of Welsch, Terrell, and Nadolski (1992). Am. Anthropol. 96, 370–392. (doi:10.1525/aa.1994.96.2.02a00050)

    • 74

      Roberts JM et al. 1995 Predicting similarity in material culture among New Guinea villages from propinquity and language: a log-linear approach. Curr. Anthropol. 36, 769–788. (doi:10.1086/204431)

    • 75

      Friedlaender JS et al. 2008 The genetic structure of Pacific islanders. PLoS Genet. 4, e19. (doi:10.1371/journal.pgen.0040019)

    • 76

      Bellwood P. 2013 First migrants: ancient migration in global perspective. Chichester, UK: John Wiley & Sons.

    • 77

      Lipson M, Loh P-R, Patterson N, Moorjani P, Ko Y-C, Stoneking M, Berger B, Reich D. 2014 Reconstructing Austronesian population history in Island Southeast Asia. Nat. Comm. 5, 4689. (doi:10.1038/ncomms5689)

    • 78

      Cochrane EE, Lipo CP. 2010 Phylogenetic analyses of Lapita decoration do not support branching evolution or regional population structure during colonization of remote Oceania. Phil. Trans. R. Soc. B 365, 3889–3902. (doi:10.1098/rstb.2010.0091)

    • 79

      Heggarty P, Maguire W, McMahon A. 2010 Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories. Phil. Trans. R. Soc. B 365, 3829–3843. (doi:10.1098/rstb.2010.0099)

    • 80

      Buchanan B, Hamilton MJ, Kilby JD, Gingerich JAM. 2016 Lithic networks reveal early regionalization in late Pleistocene North America. J. Archaeol. Sci. 65, 114–121. (doi:10.1016/j.jas.2015.11.003)

    • 81

      Hart JP, Shafie T, Birch J, Dermarker S, Williamson RF. 2016 Nation building and social signaling in southern Ontario: A.D. 1350–1650. PLoS ONE 11, e0156178. (doi:10.1371/journal.pone.0156178)



    Indentured servants, including exiled political prisoners and those who willingly entered servitude in search of a better life, were the primary source of English migrants to the British colonies during the seventeenth century [1]. Between 1645 and 1650, the English Civil War led to the exodus of at least 8000 indentured servants to the West Indian colonies, a movement of labour that coincided with the sugar boom of the 1640s and 1650s [2]. Suriname was one such English colony from 1650 to 1667, before being ceded to the Dutch. In that period, a Creole language known as Sranan (or Sranan Tongo, ‘Surinamese tongue’) developed and has survived to the present day. Most of the basic words in modern Sranan have English origins; the lexical sources of a 200-word basic vocabulary list are English (77.14%), Portuguese (3.7%), Dutch (17.58%) and African (1.59%) [3]. However, after the Suriname colony was ceded to the Dutch in 1667 under the Treaty of Breda, there was no further significant contact with England or the English, and, in the years after this cession, the population of English people in Suriname dwindled to only 39 individuals [3–6]. Therefore, the English features presently found in Sranan were likely introduced during the 1650–1667 formation period. Of course, we cannot be certain that particular features of Sranan are inherited from seventeenth-century English dialects, rather than from, for example, internal linguistic developments within Suriname over the intervening centuries, other European or African languages present in the early period or, indeed, recent movement of people between Suriname and neighbouring Guyana, where a related but distinct Creole with English vocabulary is used. However, as outlined in our Material and methods section, we have attempted to reduce the likelihood that alternative accounts are more plausible than that of seventeenth-century origins.

    As Creole languages primarily develop out of contact among speakers of several languages, studies of Creoles typically attempt to specify the origins of specific linguistic features found in a particular Creole. Multiple hypotheses have been proposed to account for the processes of the formation and transmission of Creoles [3,5,7–18]. Alleyne, in his work, Comparative Afro-American [19], even while arguing for a predominantly African rather than English origin for the features in Atlantic Creoles such as Sranan, accepts the importance of English regional dialects in the Creole language formation process. In doing so, Alleyne proposes, for consideration, an early version of what we term the pan-dialectal (PD) hypothesis when he states, ‘one would have to consider a whole range of dialects from the period, particularly those in the port areas of Bristol, Liverpool, etc.’ (p. 224). The pan-dialectal hypothesis suggests that many English dialects influenced the formation of Sranan.

    Another type of PD hypothesis crops up, as well, in work which argues the opposite position on the centrality of English influence in the Creole formation process. In The ecology of language evolution [20], Mufwene proposed an analogy between languages and biological species, which he applied to the origins of Atlantic Creole languages such as Sranan. A major facet of this analogy is the hypothesis that the features of a Creole language could be the ‘fittest’ of the ‘gene pool’ of linguistic features of the European language that is the main source of the Creole vocabulary. Thus, from Mufwene's perspective, the seventeenth-century dialects of English in England would have created in Sranan a situation that would favour ‘the more salient, more regular, or more transparent alternatives winning over the less common, less transparent and less salient alternatives' [20, p. 57]. By implication, the entire pool of linguistic features introduced by English migrants, irrespective of how large or small in number or influence the users of a particular feature might be, constituted the feature pool. This represents Mufwene's broad version of the PD hypothesis.

    We can test this PD hypothesis only after we add specificity to the hypothesis, even to the point of adding details that are not explicitly specified by Mufwene or others. First, the linguistic features undergoing competition and selection may be ‘words, pronunciations and grammatical rules' [21], whereas the present study is concerned largely with pronunciations of words, and not with grammatical rules. (For a recent study of the latter, see [21,22].) Second, for a given set of languages and language features, Mufwene's theory is open as to the relative impact of key non-linguistic influencing factors, such as the numbers of speakers or the number of languages in the contact situation employing a specific feature, and the number of interactions among speakers. In the present study, we ‘close’ some of these theoretical aspects by adding assumptions to the Mufwene version of the PD hypothesis, so as to render it testable with the present data. The resulting models of linguistic ‘selection’ in the contact situation are perhaps oversimplified realizations of Mufwene's theory, but we hope that testing these models will clarify the role of some of the relevant factors.

    In the more extreme framing of the PD hypothesis, all features across all dialects in England were in competition within the ‘gene pool’ of linguistic features, with some more ‘fit’ dialect features inherently more likely to be included in the Creole. Thus, this hypothesis posits that the relative abundances of different English dialects in the migrant group would be unrelated to the likelihood that features of these dialects are present in Sranan. Instead, features that were more common across English dialects, more common across human languages or more similar to features of the West African languages in the contact situation would be more likely to become part of the Creole, even if these features were rare in the migrant population of English origin. In the second and less extreme framing of the PD hypothesis, the likelihood that a particular feature of English dialects occurs in Sranan would be proportional to the number of speakers with that feature living in Suriname and, therefore, would be positively related to the frequency of the use of the regional dialect feature during the formation period of Sranan.
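The difference between the two framings can be made concrete with a toy calculation. The sketch below is only an illustration under assumptions that are not in the source: the dialect labels, feature codes and migrant counts are invented, "likelihood" is reduced to a plain proportion, and the more extreme framing is simplified to frequency across English dialects alone (it leaves out commonness across human languages and similarity to West African languages).

```python
# Toy illustration of the two framings of the PD hypothesis.
# All dialect labels, codes and migrant counts are invented for this sketch.

# 1 = the dialect has the feature (e.g. word-initial [h]), 0 = it does not.
feature_by_dialect = {"A": 1, "B": 1, "C": 1, "D": 0}

# Hypothetical numbers of migrants from each dialect area.
migrants_by_dialect = {"A": 5, "B": 5, "C": 10, "D": 80}


def share_across_dialects(features):
    """More extreme framing: every dialect counts once, whatever its share of migrants."""
    return sum(features.values()) / len(features)


def share_among_migrants(features, migrants):
    """Less extreme framing: each dialect is weighted by its number of migrants."""
    total = sum(migrants.values())
    return sum(features[d] * migrants[d] for d in features) / total


across = share_across_dialects(feature_by_dialect)
among = share_among_migrants(feature_by_dialect, migrants_by_dialect)
print(f"share of dialects with the feature: {across:.2f}")
print(f"share of migrants with the feature: {among:.2f}")
```

In this invented case the feature is widespread across dialects but rare among migrants, so the two framings make opposite predictions about whether it should surface in the Creole.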

    A sharply contrasting hypothesis, that of the emergence of a standard dialect (SD) from the dialects of a specific region, hereafter the SD hypothesis, has been suggested by Smith [23,24]. According to Smith's proposal, English in the seventeenth century was undergoing a standardization process focused around the forms of speech in southeastern localities in England. This meant that speakers from all localities across England were becoming familiar with the emerging SD of London and the southeast and would use it to communicate with people who were not from the local community of the speaker in question. Thus, when speakers from various regions in England mixed in Suriname, they would have opted for forms approximating this emerging standard that had started to form in England. In this view, the origin of Sranan's seventeenth-century linguistic features was Early Modern (London) English [23,24], which would constitute an SD based on a regional dialect of the southeast. In contrast to the PD hypotheses above, this SD proposal suggests that the features of English that are included in Sranan do not represent either the most ‘fit’ features or the most common features of the dialects spoken by the migrants to Suriname. Instead, the features of English incorporated into Sranan represent those of a single emerging standardized version of English based in the southeast region in and around London, with this version being chosen as the preferred mode of communication when members of different dialect groups interacted with one another in the Caribbean and Suriname.

    Here, we propose an alternative hypothesis that generalizes the regional element of the SD hypothesis: the sources of the linguistic features surviving in Sranan are neither from all over England nor from a standardizing London version of English, but instead reflect the influence of a small number of specific communities, districts or regions that may include London and southeast England. Dialects of English within England today vary considerably from one (sub)-region to another, and this variation was likely even greater in the seventeenth century. However, if some subset of dialect features (e.g. the ‘fittest’ features) has been relatively stable within England, then it might be possible, using modern English dialect data, to statistically detect a signal of the specific local and regional English dialects that most influenced the origins of Sranan, whether these be in the southeast or elsewhere. We refer to the possibility of such signals as the multiple specific sources (MSS) hypothesis. To test whether the MSS hypothesis is supported over the PD and SD hypotheses described above, we compared Sranan word-forms with those of cognates across twentieth-century dialects of English in England. To do this, we compiled data on Sranan, primarily from the Dictionary of Sranan Tongo [25], and data on the dialects of England from the Survey of English Dialects (SED) [26]. This survey, covering 313 localities across England and the Isle of Man during the mid-twentieth century, is considered a gold standard for regional dialect surveys of spoken language. Of particular importance to the present analysis are the measures taken by the SED to survey only those persons who were most likely to preserve the ‘traditional dialect’. Given these measures, which are outlined below, we propose that our understanding of the evolution of Sranan can be advanced by the use of the SED as a proxy for seventeenth-century English.

    What is the role of geography in Creole formation? The PD hypothesis contains no explicit claims about the role of geography in the selection of features from source languages that are incorporated in the formation of Creoles. However, the available historical data on the geographical origins of migrants from England in seventeenth-century Suriname allow us to examine the role of their geographical place of origin in the formation of Sranan, and the SED gives us information about the relative frequency of particular linguistic features across regional dialects. Accordingly, we attempt to derive, under the PD hypothesis, reasonable and testable implications about the role of geography. Under this hypothesis, the likelihood that a given linguistic feature occurs in Sranan is positively related to the frequency of the dialect feature: (i) in the more extreme version, across all of the dialects and languages that may have contributed to the formation of Sranan and (ii) in the less extreme version, within the speech of the group of migrants present in the formation period. As noted earlier, under the more extreme version, the likelihood that a given feature would occur in Sranan is unrelated to the frequency of that feature within the dialects of the migrant group and, therefore, is unrelated to the frequencies of dialect locations represented in the migrant group. In other words, in the more extreme version of the PD hypothesis, the geographical origins of the English-speaking migrants to Suriname would not be related to feature formation in Sranan.

    In the less extreme version of the PD hypothesis, we expect that the likelihood that a feature would occur in Sranan is related to its frequency in the migrant population. Given the well-known evidence of significant variation in dialect features across geographical locations in England and Wales, the frequency of a feature in the migrant population is likely to depend on the locations from which the migrants originated; thus, the geographical origins of the migrants to Suriname from England would be related to the feature formation within Sranan. Under the other hypotheses introduced above, namely the SD and MSS hypotheses, the dependence of feature formation within Sranan on geographical location is explicit, because these hypotheses propose that either the convergence among the English-speaking migrants in Suriname to the dialects spoken in a specific region of England (London and the southeast) or the geographical origins of migrants directly influenced Sranan's formation. In what follows, we will consider various ways of establishing the relevance of geography in Creole formation.

    There are no spoken language samples of the seventeenth-century migrants from England. However, from the SED, we have samples of English from the middle of the twentieth century, 300 years later, taken primarily from rural areas with stable populations and mainly from working men and women whose grandparents were born in the same area [26]. The SED describes language features of 313 locations selected for geographical spread and representativeness, from across England, some parts of Wales, the Isle of Wight and the Isle of Man. These locations were chosen, not because it was thought that there was some definite number (e.g. 313) of English dialects, but rather to ensure a relatively even spread geographically and across established administrative county entities in England. Labelling these as ‘dialects' is based on the fact that no two of the 313 locations produce an identical combination of variants for the hundreds of linguistic variables included in the SED. Therefore, each variant combination is unique to a particular one of the 313 geographical locations and can be regarded as belonging to and identifying each of these geographical locations. (Although it would be more appropriate to refer to the speech sample at each location as a ‘lect’, we use ‘dialect’ herein for the sake of convenience.) The aim of the survey was the elicitation of archaic, ‘lexical features … in the form of … phonological variants … and grammatical features’ [26, p. 45]. In this survey, the researchers took as their target population ‘men and women, sixty and over … because they were considered to be more likely to best preserve the traditional dialect; [and therefore] … the questionnaire [used in the SED, was] … constructed for the farmer’ [26, p. 44, 46].

    As is the case with the English dialects, there are no seventeenth-century data for the earliest forms of Sranan Tongo. This study, therefore, relied on the Dictionary of Sranan Tongo, documenting the vocabulary of modern Sranan Tongo, supplemented by a number of early eighteenth-century wordlists documenting sometimes archaic speech forms. These documents, alongside Smith's [23] historical phonology, provided information regarding the etymology of the English origin words in Sranan, the phonological processes they went through while being adapted into Sranan and sound changes they may have gone through since [27]. Using these sources [25,28–31], we checked for all words in Sranan of potential English origin. In addition, words similar in form and meaning across Sranan and English were checked with dictionaries of the other European languages known to have influenced Sranan, notably Portuguese and Dutch [32,33], to eliminate words that could possibly have originated from these languages as opposed to English.

    We then checked the word list compiled from the above-mentioned Sranan sources against the SED's Index of Keywords [26] and their lexical and phonetic regional variant forms. Where no match was found, actual keyword headings in the books of SED responses [26] were checked for any variants that might match items not found in the Index of Keywords, because regionally variable realizations of a given etymon were not always presented in this index. When no SED entries could be matched to a Sranan word form, we removed that word from the wordlist. We then pared down the resulting list of 497 Sranan–English matching items to remove those that showed no linguistic variation across the English dialects surveyed in the SED in the features of interest. Through this process, we identified 45 Sranan words that matched an entry in the SED for which there was relevant variability across 313 localities in twentieth-century English dialects [26]. (Although each of the 313 dialects has a unique combination of features in the SED, we found two dialects that are identical in this much smaller dataset, Ullesthorpe and Carlton Curlieu, both in Leicestershire.) These 45 words were grouped according to their variable phonetic features:

    • a. Variation in post-vocalic rhoticity (±PVR), e.g. ‘burn’ [b…n] or [b…_n],

    • b. Variation in word-initial phonemic /h/ ([h]∼[j]∼[w]∼Ø), e.g. ‘hot’ [h…t] or [_…t]; ‘help’ [hɛlp], [_ɛlp] or [jɛlp]; ‘woman’ [h…man], [w…man] or [_…man],

    • c. Variation with /f/ and /v/ (±labial voicing), e.g. ‘four’ [fo…] or [vo…],

    • d. Variation with /θ/ and /f/, and /ð/ and /v/ (±labial), e.g. ‘broth’ [b…f] or [b…θ] and ‘brother’ [b…d…], [b…ð…] or [b…v…],

    • e. Variation with [j]∼Ø (±word-initial palatal), e.g. ‘yesterday’ [jɛst…de] or [_ɛst…de],

    • f. Variation with word-final [ks]∼[sk] (±consonant cluster reversal), e.g. ‘ask’ [a:sk] and [aks],

    • g. Variation between forms approximating [o] and those approximating [au], e.g. ‘old’ [old] or [auld].
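The paring-down step described before this list, in which matched items are kept only if their recorded forms vary across localities, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the three-locality data are invented, "ten" is a hypothetical invariant item, and the names `recorded_forms` and `has_variation` are assumptions of the sketch.

```python
# Keep only items whose recorded form varies across localities.
# Forms for 'ask' and 'old' follow the examples in the list above;
# 'ten' is a hypothetical item with no variation, and real data
# would have one entry per SED locality (313) rather than three.

recorded_forms = {
    "ask": ["a:sk", "aks", "a:sk"],
    "old": ["old", "auld", "auld"],
    "ten": ["ten", "ten", "ten"],
}


def has_variation(forms):
    """True if at least two distinct forms occur across the localities."""
    return len(set(forms)) > 1


variable_items = sorted(item for item, forms in recorded_forms.items() if has_variation(forms))
print(variable_items)
```

Items such as the hypothetical "ten" carry no information about which dialects resemble Sranan, which is why they drop out before any comparison is made.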

    From these data, we constructed a reference dataset that includes all of the recorded SED phonetic forms from the 313 localities for each of the 45 items, as well as the phonetic forms found in Sranan for each of the 45 items (see table 1 and electronic supplementary material, table S1). As can be seen in table 1, an ‘item’ is a Sranan word, expressed in English orthography, plus an associated phonetic option or ‘slot’ that can be coded on a binary (0/1) scale. For 33 of the 45 items, the phonetic option referred to the pronunciation at a single word location, e.g. word-initial [h], coded as ‘1’ if +[h], and as ‘0’ if –[h]; or rhoticity, coded as ‘1’ if +[r], and as ‘0’ if –[r]. Thus, for each of the 313 SED locations, and for Sranan, the item, ‘house’, varying in word-initial ±[h], is coded ‘1’, if in that dialect ‘house’ is pronounced with a word-initial [h], e.g. [ha…s], and ‘0’, if it lacks the word-initial [h], e.g. [a…s]. These two options associated with ‘house’ are referred to as linguistic ‘features’ (occasionally, ‘variants’). For simplicity, we adopted the convention throughout of assigning the code of ‘0’ whenever the feature in question was not produced in the interview data for a given SED location. For example, in the pronunciation of ‘hog’, a ‘0’ is coded both in localities where ‘hog’ is pronounced without the initial [h] sound and in those where ‘pig’ is the choice of lexical item rather than ‘hog’.

    Table 1. List of the 45 items used to define the similarity between Sranan and the speech of informants in the 313 locations in the SED. Each item is a Sranan word, expressed in English orthography, plus associated phonetic variables that determine a binary code for that item in each dialect. Also shown for each item are the phonetic form and features of, and the binary code assigned to, its Sranan reflex. For 33 of the items, a single phonetic variable, e.g. [r] or [h], is coded for presence (‘+’ or ‘1’) or absence (‘–’ or ‘0’). For each of the remaining 12 items, two variables are coded for presence or absence, and the resulting four combinations of features are reduced to a binary scale by assigning ‘1’ to the item, if its pronunciation corresponds to that in Sranan, and ‘0’, if the pronunciation is different from that of Sranan. By definition, the Sranan code for these 12 items is ‘1’. For all items, a ‘0’ is assigned if the English word containing the phonological variable in question was not produced in that particular dialect (as when ‘pig’ is produced instead of ‘hog’ in relation to studying [h] absence or presence). See the text for details.

    item (English orthography) | Sranan reflex: phonetic form | Sranan reflex: phonetic feature(s) | Sranan input code
    --- | --- | --- | ---
    hand (n) | [han] | +[h] | 1
    head (n) | [hedi] | +[h] | 1
    help (n) | [helpi] | +[h] | 1
    herring (n) | [heren] | +[h] | 1
    hot (adj) | [hati] | +[h] | 1
    hog (n) | [hagu] | +[h] | 1
    house (n) | [hoso] | +[h] | 1
    hungry (adj) | [hangri] | +[h] | 1
    eyes (n) | [hai] | +[h] | 1
    woman (n) | [(w/h)uman] | +[w] or +[h] | 1
    arse (n) | [ras] | +[r] | 1
    burn (v) | [bron] | +[r] | 1
    care (v) | [ke] | −[r] | 0
    corn (n) | [karu] | +[r] | 1
    curse (n) | [kosi] | −[r] | 0
    door (n) | [doro] | +[r] | 1
    gutter (n) | [gotro] | +[r] | 1
    iron (n) | [aje] | −[r] | 0
    master (n) | [masra] | +[r] | 1
    more (comp) | [moro] | +[r] | 1
    more (quan) | [moro] | +[r] | 1
    remember (v) | [memere] | +[r] | 1
    star (n) | [stari] | +[r] | 1
    turn (n) | [tron] | +[r] | 1
    wear (v) | [weri] | +[r] | 1
    work (n) | [woroko] | +[r] | 1
    work (v) | [woroko] | +[r] | 1
    teeth (n) | [tifi] | +[f] | 1
    broth (n) | [brafu] | +[f] | 1
    mouth (n) | [mofo] | +[f] | 1
    old (adj) | [ouru] | +[au] | 1
    cold (adj) | [kouru] | +[au] | 1
    gold (adj) | [gouru] | +[au] | 1
    hare (n) | [he] | +[h] and −[r] | 1
    hurt (v) | [hati] | +[h] and −[r] | 1
    horse (n) | [hasi] | +[h] and −[r] | 1
    hear (n) | [jeri] | +[h] and +[r] | 1
    yesterday (n) | [esrede] | −[j] and +[r] | 1
    ears (n) | [jesi] | +[j] and −[r] | 1
    finger (n) | [fiŋga] | +[f] and −[r] | 1
    fire (n) | [faija] | +[f] and −[r] | 1
    first (adj) | [fosi] | +[f] and −[r] | 1
    four (n) | [fo] | +[f] and −[r] | 1
    ask (v) | [hakisi] | +[h] and +[ks] | 1
    brother (n) | [brada] | +[ð] or +[d], and −[r] | 1

    For the remaining 12 of the 45 items, the phonetic option was derived from the pronunciation at two word locations. For example, the word ‘ask’ was coded according to word-initial ±[h] and word-final [ks/sk], and the word ‘hurt’ was coded according to word-initial ±[h] and rhoticity. In these cases, the four possible combinations of two pronunciations at each of two locations were reduced to a binary option by assigning ‘1’ to a dialect (including Sranan) if the pronunciation of the English reflex was the same as that of the Sranan word; and ‘0’ if the pronunciation of the English reflex differed at either location from that of the Sranan word, or if the English reflex was not produced in the dialect. In summary, the linguistic profile of each locality, including Sranan, is represented in the dataset as a 45-element row vector of 0s and 1s, according to the linguistic features of the dialect of that locality.
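The reduction of a two-location item to a single binary code can be illustrated with a minimal sketch. The function name `code_item` and the tuple representation of features are hypothetical and chosen only for illustration; they are not taken from the study's own materials.

```python
def code_item(dialect_form, sranan_form):
    """Return 1 if the dialect matches Sranan at every coded location, else 0.

    dialect_form: tuple of booleans, one per coded word location (True = feature
    present), or None when the English reflex was not produced in the dialect.
    sranan_form: tuple of booleans describing the Sranan reflex.
    """
    if dialect_form is None:
        # Convention from the text: a missing reflex is coded '0'.
        return 0
    return int(tuple(dialect_form) == tuple(sranan_form))


# 'hurt' in Sranan is +[h] and -[r] (table 1).
sranan_hurt = (True, False)
print(code_item((True, False), sranan_hurt))  # same at both locations
print(code_item((True, True), sranan_hurt))   # differs in rhoticity
print(code_item(None, sranan_hurt))           # reflex not produced
```

A dialect therefore scores ‘1’ on such an item only when both locations agree with Sranan.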

    The similarity between two dialects can be defined in terms of the 45-element vectors representing the dialects. Here, we adopt the simplest definition of similarity as ‘per cent agreement between two vectors'—i.e. the number of positions at which the two vectors have the same element, 0 or 1, divided by 45. Because we are treating the SED speech forms for each locality as a proxy for the seventeenth-century English speech forms produced for that locality, we regard the SED dialect (defined as speech forms for each locality) with the highest per cent agreement to Sranan as being most likely to be among the set of source dialects for Sranan. The technical assumption justifying this use of the SED as a proxy is that if the (unobserved) seventeenth-century English dialects were ranked according to their (unobserved) similarity to Sranan, that ranking would be similar to the ranking of the (observed) corresponding SED dialects with respect to their (observed) similarity to Sranan. While we have no empirical support for or against this assumption, we believe that, pending future work, the use of the SED as a proxy is a reasonable starting point for analysis into the origins of Sranan.
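As a minimal sketch of the per cent agreement measure defined above, assuming each dialect is stored as a list of 0/1 codes (the vectors below are made-up toy data, not values from the dataset):

```python
def percent_agreement(u, v):
    """Percentage of positions at which two equal-length 0/1 vectors agree."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    matches = sum(a == b for a, b in zip(u, v))
    return 100.0 * matches / len(u)


# Toy 45-element vectors: the hypothetical dialect agrees with the
# hypothetical target on 27 of 45 positions.
target = [1] * 45
dialect = [1] * 27 + [0] * 18
print(percent_agreement(target, dialect))
```

With 27 agreeing positions out of 45 this gives 60%, the same ratio as the 27/45 figure reported for the best-matching dialect.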

    The port of Bristol was likely the main port of departure for English indentured labour to the English colonies in the Americas in the second half of the seventeenth century [34]. To compare our linguistic dataset with historical records of migration from England to Suriname, we therefore used the Bristol Register of Servants to Foreign Plantations [35] to categorize the place of origin for indentured labourers arriving in the colonies and have assumed that this sample of migrants is representative of the subset that settled in Suriname. We extracted the origin and destination of indentured servants travelling between 1654 and 1666 with destinations in the Caribbean from which Suriname was settled [35]. Suriname was not listed as a destination, and the historical record is consistent with the presumption that indentured persons would have first landed at these other destinations, including Barbados, Antigua, Nevis and St Christopher [36–38].

    With the 45 linguistic features from Sranan and 313 English dialects, we tested each of the hypotheses proposed in the introduction. The PD hypothesis—according to which features of English found in Sranan were either selected by the commonality criterion proposed by Mufwene or selected out of the entire range of local and/or regional dialects and thus likely sampled in proportion to their frequency—implies that these selected features would be the ones most common across the various dialects at the time. By extension, these same features might be the ones that are most common across modern dialects. Note that this method can give us insights into whether frequency-dependent feature selection might have occurred, but it cannot account for other factors that might influence whether a feature is selected, such as prestige of the individuals with certain features. With this proxy, we first tested whether the most common variant of each linguistic feature was predictive of the variant in Sranan.

    Next, to test the alternate framing of the PD hypothesis, we combined the linguistic dataset with the information from the Bristol Register to test whether features of Sranan were predicted by the most common linguistic variant from regions of England that were likely source locations for migrants. Third, to test the London/southeast SD hypothesis, which posits that a standardized version of London English would be most likely to contribute language features to Sranan, we compared Sranan with the two modern London/Middlesex dialects in the SED. Fourth, to test our MSS hypothesis that Sranan was shaped by a relatively small number of source dialects from regions of high migration to Suriname, we compared Sranan with each local dialect in turn and found the one with highest overlap of language features. We then determined whether input from another local dialect would substantially improve the overlap with Sranan.
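The two-step comparison used for the MSS hypothesis (best single dialect, then best supplement for the features it misses) might be sketched as follows. This is an illustrative reading of the procedure, not the authors' code; the function names and the toy dialects `A`, `B` and `C` are hypothetical.

```python
def n_matches(dialect, target, positions):
    """Number of the given positions at which dialect and target agree."""
    return sum(dialect[i] == target[i] for i in positions)


def best_two_sources(dialects, target):
    """dialects: dict mapping a name to a 0/1 list; target: 0/1 list.

    Returns (first, second, per cent agreement after combining both).
    Ties are broken by dictionary order, which is an assumption.
    """
    all_positions = range(len(target))
    first = max(dialects, key=lambda k: n_matches(dialects[k], target, all_positions))
    # Positions where the best single dialect does not match the target.
    unmatched = [i for i in all_positions if dialects[first][i] != target[i]]
    others = [k for k in dialects if k != first]
    second = max(others, key=lambda k: n_matches(dialects[k], target, unmatched))
    covered = (len(target) - len(unmatched)
               + n_matches(dialects[second], target, unmatched))
    return first, second, 100.0 * covered / len(target)


target = [1, 1, 1, 0, 0, 1]
dialects = {
    "A": [1, 1, 1, 0, 1, 0],  # matches 4 of 6 positions
    "B": [0, 0, 1, 1, 0, 0],  # matches one of A's two misses
    "C": [0, 0, 0, 1, 1, 0],  # matches none of A's misses
}
print(best_two_sources(dialects, target))
```

Because several dialects can tie as the best supplement, a fuller version would return all tied candidates rather than a single name.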

    In addition, we examined whether the distribution of the dialects in linguistic space is ‘similar’ to the distribution of the corresponding locations in geographical space. We first reduced the dimensionality of our language dataset by performing a principal components analysis [39] on the data matrix of binary features for English dialects and Sranan, and extracting the first four principal components (PCs). Then, to quantify the similarity in ‘shape’ between the linguistic and geographical distributions, we created a two-dimensional linguistic distribution using the first and second PCs, and compared it to the two-dimensional (latitude and longitude) geographical distribution of the dialects (excluding Sranan, the geographical coordinates of which are an outlier in the present context). The comparison was done using a Procrustes analysis [40,41]. Specifically, after Procrustes analysis, we denoted by D the minimized sum of squared Euclidean distances, scaled to have minimum 0 and maximum 1, between the linguistic and geographical representations of the SED dialects, and calculated a similarity statistic, $t_0 = \sqrt{1 - D}$. We then calculated empirical p-values for these $t_0$ values over $10^6$ permutations of geographical locations, thus determining how likely it would be for any random assignment of geographical locations to be more similar to the language-feature PCs than the actual dialect locations. A significant association between language features of local English dialects and their respective locations would provide support for hypotheses that link the emergence of the linguistic features of Sranan to the geographical origins of the speakers of English who lived in Suriname in the seventeenth century. The SD and MSS hypotheses, as well as the less extreme version of the PD hypothesis, are examples of such hypotheses, whereas the more extreme version of the PD hypothesis is not.
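A permutation test of this kind can be sketched in plain Python for two-dimensional configurations. The sketch assumes that the similarity statistic has the form t0 = sqrt(1 − D), that both configurations are centred and scaled to unit sum of squares, and that rotation, scaling and reflection are all allowed; the add-one correction in the p-value is also an assumption. All names and the toy coordinates are hypothetical.

```python
import random


def _standardize(points):
    """Centre 2-D points and scale them to unit sum of squares (as complex numbers)."""
    z = [complex(x, y) for x, y in points]
    mean = sum(z) / len(z)
    z = [p - mean for p in z]
    norm = sum(abs(p) ** 2 for p in z) ** 0.5
    return [p / norm for p in z]


def procrustes_t0(a, b):
    """t0 = sqrt(1 - D), where D is the minimized squared distance in [0, 1]."""
    za, zb = _standardize(a), _standardize(b)
    # Best fit using rotation and scaling only.
    rotation = abs(sum(p * q.conjugate() for p, q in zip(za, zb)))
    # Best fit when b is additionally reflected.
    reflection = abs(sum(p * q for p, q in zip(za, zb)))
    return min(1.0, max(rotation, reflection))


def permutation_p_value(linguistic, geographic, n_perm=1000, seed=0):
    """Share of random location assignments that fit at least as well as the real one."""
    rng = random.Random(seed)
    observed = procrustes_t0(linguistic, geographic)
    shuffled = list(geographic)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if procrustes_t0(linguistic, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


geo = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (3.0, 1.0), (4.0, 4.0), (0.5, 3.0)]
# Toy "linguistic" coordinates: geo rotated by 90 degrees, doubled and shifted.
ling = [(-2.0 * y + 5.0, 2.0 * x - 1.0) for x, y in geo]
print(permutation_p_value(ling, geo, n_perm=200))
```

In the toy example the two configurations differ only by a similarity transformation, so t0 is essentially 1; real data would give a smaller value.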

    Finally, with the presence and absence of the 45 features as input data, we constructed a language phylogeny that included English dialects and Sranan using TreeMix [42]. From the input data, TreeMix constructs a maximum-likelihood phylogeny in which dialects that share more features should be closer together on the tree. The algorithm then tests for statistical support for migration/mixture events between branches. For example, in genetic analyses, an admixture (or migration) event between two distantly related populations would result in offspring that have some stretches of DNA that are more closely related to one population and other stretches of DNA that are more closely related to the other population. In a traditional phylogenetic analysis, an admixed sample might be difficult to resolve but would likely appear near one of the two parental populations on the tree; however, it might seem to be somewhat distantly related to that parental population because large parts of its genome are more closely related to a different population. In this case, the ancestry of the admixed sample is better explained by input from two independent populations. Thus, TreeMix tests whether branches on the tree are better explained by hypothesizing input from another branch on the tree, and it draws an arrow connecting branches when these putative migrations significantly improve the likelihood score of the tree. This type of analysis can also be performed with our language features as inputs instead of genetic data: specifically, language features are treated here as analogous to genetic alleles or polymorphisms, dialects are treated as analogous to genotyped populations and branch length represents linguistic distance instead of genetic distance. 
This analysis is useful for our discussion of Sranan, because it provides another way to test whether the features of Sranan are better explained by multiple specific dialect inputs (inferred, for example, when a migration event to the Sranan branch improves the likelihood score of the tree) than by just one dialect input (e.g. when there is no statistical support for a migration event to the Sranan branch).

    In 1650, Suriname was settled from other English colonies in the Caribbean, including Barbados, Antigua, Nevis and St Christopher. We used the Bristol Register of Servants to Foreign Plantations [35] to establish the origins within England of persons migrating to these colonies, and we assume that this sample of migrants is representative of the subset that settled in Suriname. Of the 3077 persons with destinations listed as English colonies in the Caribbean, 1410 have their places of origin listed; those locations that could be assigned to specific locations in England are listed in table 2 and visualized in figure 1. Bristol and the surrounding counties contribute 64% of the indentured servants whose place of origin is recorded, and counties in Wales contribute another 11.7%; the remainder of the indentured servants originated primarily from counties in the East, South and West of England (see figure 2 for regions).

    Figure 1. (a) Circles represent the location of origins listed in the Bristol Register of Servants to Foreign Plantations [35]; the area of the circle is proportional to the number of individuals from that location. Red circles indicate locations in England, and blue circles indicate locations in Wales. Bristol is marked by a yellow star and London by a cyan star. (b) Similarity of each dialect to Sranan. The most similar dialect, Blagdon, is indicated by a red arrow.

    Figure 2. Locations of the dialects of English studied in the SED. The classification of locations into counties and geographical regions are reproduced from the SED.

    Table 2. Origin locations of indentured servants from England listed in the Bristol Register, 1654–1666. Locations near Bristol and the surrounding counties are highlighted in bold; locations in Wales are noted in italics.

    location of origin | no. servants | percentage
    --- | --- | ---
    Somersetshire | 208 | 14.8
    Bristol | 152 | 10.8
    Gloucestershire | 149 | 10.6
    Monmouthshire | 137 | 9.7
    Wiltshire | 108 | 7.7
    Herefordshire | 82 | 5.8
    Glamorganshire | 73 | 5.2
    Devon | 39 | 2.8
    Dorset | 27 | 1.9
    London | 39 | 2.8
    Carmarthenshire | 36 | 2.6
    Shropshire | 33 | 2.3
    Worcestershire | 32 | 2.3
    Pembrokeshire | 30 | 2.1
    Brecknock/Brecon | 26 | 1.8
    Cornwall | 15 | 1.1
    Hampshire | 11 | 0.8
    Middlesex | 10 | 0.7
    Oxfordshire | 10 | 0.7
    Staffordshire | 10 | 0.7
    Kent | 8 | 0.6
    other locations | 175 | 12.4
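The regional shares quoted in the text can be checked against the counts in table 2. In the sketch below, the assignment of counties to the two groups is an assumption, chosen because it reproduces the quoted 64% and 11.7%; the variable names are hypothetical.

```python
servants = {
    "Somersetshire": 208, "Bristol": 152, "Gloucestershire": 149,
    "Monmouthshire": 137, "Wiltshire": 108, "Herefordshire": 82,
    "Glamorganshire": 73, "Devon": 39, "Dorset": 27, "London": 39,
    "Carmarthenshire": 36, "Shropshire": 33, "Worcestershire": 32,
    "Pembrokeshire": 30, "Brecknock/Brecon": 26, "Cornwall": 15,
    "Hampshire": 11, "Middlesex": 10, "Oxfordshire": 10,
    "Staffordshire": 10, "Kent": 8, "other locations": 175,
}

# Assumed groupings (not stated explicitly in this copy of the table).
near_bristol = ["Somersetshire", "Bristol", "Gloucestershire", "Monmouthshire",
                "Wiltshire", "Herefordshire", "Devon", "Dorset"]
wales = ["Glamorganshire", "Carmarthenshire", "Pembrokeshire", "Brecknock/Brecon"]

total = sum(servants.values())


def share(names):
    """Percentage of all recorded origins that fall in the named locations."""
    return 100.0 * sum(servants[n] for n in names) / total


print(total, round(share(near_bristol), 1), round(share(wales), 1))
```

The counts sum to 1410, matching the number of persons whose place of origin is recorded.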

    With this distribution of origin locations from the historical record and the dataset of language features from Sranan and 313 dialects of English, we tested the proposed hypotheses. First, we note that the English features of Sranan are not identical to any one local dialect of English. One prominent feature characterizing English dialects is the production or non-production of the /r/ sound after vowels (post-vocalic rhoticity). In general, many modern dialects of English tend to show either predominantly presence or predominantly absence of post-vocalic rhoticity; however, Sranan shows evidence of influence from both rhotic and non-rhotic dialects. For example, the Sranan word [moro] (more) suggests that the English input forms had a post-vocalic /r/. However, other words in Sranan such as ‘four’ [f…] suggest that other English input forms lacked a post-vocalic /r/. This pattern is also observed in Saramaccan and Aukan, two other Creole languages in Suriname that derive their vocabulary from English and whose histories overlap with that of Sranan.

    For rhoticity in particular, we must observe the caveat that dialects in the 1950s, including those sampled in the SED [26], might have different features from dialects in the seventeenth century. While the loss of rhoticity in English dialects became common after 1790, there were sporadic losses of /r/ sounds beginning in the 1300s and a more general weakening of /r/ from 1640 onwards [43]. Furthermore, there is evidence, such as from studies of rhyming, that the early losses of post-vocalic rhoticity were occurring in at least some regions that are non-rhotic in the SED [44]. For example, Wyld cites many instances of r-loss from the years 1441 to 1674 and states that ‘it would appear that the weakening and disappearance of r before another consonant, especially, at first, before [s, ∫], had taken place by the middle of the fifteenth century at any rate in Essex and Suffolk, that a hundred years later London speakers of the humbler sort (Machyn), as well as more highly placed and better educated persons in various walks of life, pronounced the sound but slightly, if at all; that the tendency is more and more marked, not only before [s, ∫], but before other consonants also, until by the middle of the next century it seems that the pronunciation among the upper classes … was very much the same as at present’ [45, p. 299]. More recently, a Cambridge University survey of diversity of English in England has suggested that rhoticity has declined rapidly since the 1950s because of the influence of the non-rhotic dialects of London and the southeast [46]. Finally, although the SED data for each location generally include multiple subjects who were chosen carefully and interviewed thoroughly, we must also note the caveats that (i) no single set of features can accurately summarize the natural variation in a dialect or language and (ii) we cannot account for contact- and context-dependent dialect changes that might occur when individuals interact in a new location.

    To the extent that it is possible to test alternate hypotheses of the origin of linguistic features in Sranan using the modern lexical data, we find substantially more support for our MSS hypothesis than for the previously proposed PD hypothesis or SD hypothesis. In the simplest version of the PD hypothesis, the variant of each binary linguistic option that is found in Sranan is predicted to be the variant that is more frequent across the 313 SED dialects. We found that the per cent agreement, across the 45 linguistic options, between the Sranan variant and the modal variant was 33.3%. Because of the spatial correlation among dialects, we have no simple formula for calculating a confidence interval for this percentage. However, we developed a bootstrapped interval by sampling 313 dialects with replacement from the SED set, calculating the per cent agreement on each sample for a total of 10 000 samples and then extracting the 2.5th and 97.5th percentiles from the bootstrapped percentages. The resulting bootstrapped 95% confidence interval is (31.1%, 42.2%). We use this interval as the basis for a rule-of-thumb: differences of approximately 10% or more in per cent agreement are substantial. Applying this rule to the percentages in table 3, we conclude that restricting the dialects used to compute per cent agreement to those dialects from (i) regions contributing at least 5% of migrants, or at least one migrant, in the Bristol Register (alternate versions of the PD hypothesis) or (ii) Harmondsworth or Hackney in the Middlesex/London SED region (alternate versions of the SD hypothesis) does not increase the per cent agreement substantially. By contrast, restricting the comparison set to a single local dialect increases per cent agreement substantially; Blagdon in southern England was the best single match, with 60% similarity to Sranan (27/45 features).
Finally, we considered the effects of incorporating two independent source dialects as inputs to Sranan, first using Blagdon to compute per cent agreement and then finding the best match to the remaining 40% of features that were not present in Blagdon. In six dialects, nine out of the remaining 18 features were present, increasing the per cent agreement substantially from 60 to 80% (table 3). Four of these additional dialects were in eastern England: High Easter, Docking, Doddinghurst and Canewdon.
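The bootstrap for the modal-variant comparison can be sketched as below. This is a minimal illustration under stated assumptions: ties in the modal variant are resolved to 0, the percentiles are taken as simple order statistics, and the function names and toy data are hypothetical.

```python
import random


def modal_agreement(dialects, target):
    """Per cent of positions where the most common code across dialects equals the target."""
    n = len(dialects)
    agree = 0
    for i, t in enumerate(target):
        ones = sum(d[i] for d in dialects)
        modal = 1 if 2 * ones > n else 0  # ties go to 0 (an assumption)
        agree += int(modal == t)
    return 100.0 * agree / len(target)


def bootstrap_interval(dialects, target, n_boot=10000, seed=0):
    """Approximate 95% percentile interval, resampling dialects with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        modal_agreement(rng.choices(dialects, k=len(dialects)), target)
        for _ in range(n_boot)
    )
    low = stats[(25 * n_boot) // 1000]
    high = stats[(975 * n_boot) // 1000 - 1]
    return low, high


toy_dialects = [[1, 0, 1], [1, 1, 0], [1, 0, 0]]
toy_target = [1, 1, 0]
print(modal_agreement(toy_dialects, toy_target))
print(bootstrap_interval(toy_dialects, toy_target, n_boot=500))
```

Resampling whole dialects keeps each dialect's 45 codes together, but it does not model the spatial correlation between neighbouring dialects that the text mentions.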

    Table 3. Testing hypotheses of the origin of English features in Sranan. In the last column, each of the two percentages in boldface is substantially greater than the percentages above it.

    hypothesis | comparison | similarity to Sranan
    --- | --- | ---
    PD hypothesis (e.g. Mufwene [20]) | comparison to most common variant of each feature across English dialects | 33.3%
    PD hypothesis (e.g. Mufwene [20]) | comparison to most common variant of regions contributing at least 5% of migrants in the Bristol Register | 35.6%
    PD hypothesis (e.g. Mufwene [20]) | comparison to most common variant of regions contributing at least one migrant in the Bristol Register | 42.2%
    standard London/southeast dialect (SD) hypothesis (e.g. Smith [23]) | comparison to dialect from Middlesex and London SED region: Harmondsworth | 37.8%
    standard London/southeast dialect (SD) hypothesis (e.g. Smith [23]) | comparison to dialect from Middlesex and London SED region: Hackney | 35.6%
    standard London/southeast dialect (SD) hypothesis (e.g. Smith [23]) | comparison to dialect from Harmondsworth, then from Hackney for Sranan features not found in Harmondsworth | 46.7%
    MSS hypothesis (this manuscript) | comparison to most similar dialect to Sranan (Blagdon) | 60%
    MSS hypothesis (this manuscript) | comparison to most similar dialect to Sranan (Blagdon), then most similar dialects for Sranan features not found in Blagdon (High Easter, Docking, Doddinghurst, Canewdon [Eastern England], Farningham [Southern England], Llanfrechfa [Wales]) | 80%

    The hypothesis that a cluster of local dialects, other than the emerging standardized dialect from London and the southeast, is a possible source of English-derived word-forms in Sranan is supported by a geographical analysis of different types of features. When we look across all 45 features studied, dialects near the port of Bristol appear to be the most similar to Sranan (figure 1b). However, this overall similarity masks interesting regional patterns that are revealed when we examine different types of features separately. For example, as mentioned above, Sranan appears to show inputs from both rhotic and non-rhotic dialects of English. If we use only the rhotic word-forms to calculate the similarity between Sranan and English dialects, we see a striking contrast in similarity that separates the rhotic and non-rhotic dialects (figure 3a). The regions of England where post-vocalic rhoticity is common are clearly delineated by their number of matches to this set of words; in these features, Sranan matches well to the areas around Bristol, which were the source locations for many of the migrants to Suriname. An opposite pattern emerges when we calculate similarity using only the non-rhotic subset of features in Sranan (figure 3b). For these non-rhotic features, the areas in England around Bristol (starred in figure 1a) are not similar to Sranan. Thus, we predict that the non-rhotic word-forms in Sranan could have been influenced by a different subset of dialects; these could have been from Wales, eastern England or parts of northern England (see the yellow points in figure 3b). We can visualize other dialect features in a similar manner: figure 3c shows the distribution of a diphthong in words ending in ‘old’, which is present in Sranan and widespread in the central parts of England, and figure 3d shows the distribution of matches to Sranan in words including a word-initial [h] sound, which is often present in Sranan.

    Figure 3. Each English local dialect is represented by a circle in its geographical location. The features analysed in each panel are presented in the upper right. The colour of each circle indicates the number of these features that are shared with Sranan, corresponding to the colour scale on each panel. The form present in Sranan is indicated in brackets. The types of features considered are: (a) presence of post-vocalic rhoticity, (b) absence of post-vocalic rhoticity, (c) presence of a diphthong and (d) presence of word-initial phonemic [h].

    In figure 3, we observe that there is a strong geographical component to the distribution of linguistic features within England; post-vocalic rhoticity, certain diphthongs and word-initial [h] forms all show geographical clustering in their presence. Our PC analysis of the raw language-feature data reinforces these findings, as can be seen in figure 4. The goal of this analysis is to reduce the dimensionality of the linguistic space from 45, i.e. the number of items on which dialects may vary in our dataset, to a much smaller number of dimensions, the PCs, such that these PCs (i) account for much of the variation across dialects and (ii) are readily interpretable in linguistic terms. The interpretation of the PCs naturally depends heavily on the phonological features of the 45 items used in the analysis, and it would be expected to change if, for example, a different distribution of features were used, or if lexical items were added. As can be seen in table 1, 28 of the present 45 items contained ±[r], 14 items contained ±[h], three items contained ±[au] and three items contained ±[f].

    Figure 4. PCs and the correlation of each language feature to PC scores. (a) PC1 correlates well to post-vocalic rhoticity in local dialects, and we can observe that many dialects in the South (blue) and West (green) are rhotic and are on the positive side of the PC1 axis, while many dialects in the East (magenta) are non-rhotic and are on the negative side of this axis (see figure 3 for comparison). The North (gold) has both rhotic and non-rhotic local dialects, which are correspondingly found on both the positive and negative sides of the PC1 axis. Sranan (red) contains both rhotic and non-rhotic forms, and is near the centre of the PC1 axis. PC2 correlates well with the presence of a word-initial [h] sound. Most dialects in the South (blue) and West (green) lack this sound and are found on the negative side of this axis. Sranan, along with some of the dialects in the South (blue), East (magenta) and North (gold), shows the presence of these word-initial [h] features and is correspondingly found on the positive side of the PC2 axis. (b) The loading (defined as a correlation) of each SED item on PC2 is plotted against the item's loading on PC1. Items with high positive (negative) PC1 loadings are those that are coded as +[r] (−[r]) in relatively many dialects. The PC2 loadings of the items can be interpreted similarly in terms of h-fullness.

    The first principal component (PC1) extracted from the data explained 39.2% of the variance in features across dialects. An examination of the coordinates of the 45 items in figure 4b shows that, in general, rhotic items load positively and non-rhotic items load negatively on PC1. Thus, this first principal component combines the rhotic features highlighted in figure 3a and the non-rhotic features highlighted in figure 3b into a single bivalent dimension that can reasonably be interpreted as ‘rhoticity’. This finding is not surprising, given that more than half of the items contained ±[r]. The second principal component (PC2) in figure 4b explained 11.1% of the variance in the sample, and it was strongly correlated to features describing the presence or absence of the word-initial phonemic ±[h], i.e. the features highlighted in figure 3d. In figure 4a, we plot the 313 SED dialects and Sranan in this two-dimensional linguistic space (rhoticity, h-fullness), the dimensions of which together explain 50.3% of the variation among dialects. The PC1 score of a dialect can be viewed as the extent, ‘averaged’ mainly over the 28 items containing ±[r], to which the dialect is rhotic versus non-rhotic; and the PC2 score of the dialect can be interpreted similarly in terms of h-fullness. It can be seen in figure 4a that Sranan is closer to Blagdon (ID = ‘240’) and other dialects in the South (see figure 2 for the locations corresponding to the IDs in figure 4a) than it is to the dialects in Essex and Norfolk. This finding is consistent with the results reported in table 3.

    The third principal component (PC3, not shown in figure 4) explained 5.6% of the variance in the sample and correlated primarily to features describing the presence of the diphthong, ±[au], in words ending in ‘old’, i.e. the features highlighted in figure 3c. Finally, the fourth principal component (PC4, not shown) explained 4.5% of the variance in the sample and correlated primarily to the presence of ±[f] in words ending in ‘th’, such as ‘broth’ and ‘teeth’.

    To determine whether the features of English dialects were significantly associated with their geographical location, we performed another PC analysis without Sranan, extracted the first two PCs and then conducted a Procrustes analysis (following [40]). This Procrustes analysis indicated that, for the SED dialects, the two-dimensional plot of PC1 versus PC2 was significantly associated with the geographical locations of the dialects: we found significant concordance ($p < 10^{-6}$) between the first two PCs of dialect feature data and geographical locations for 313 dialects in the SED database (Procrustes $t_0$ = 0.46). This result suggests that there is significant geographical structuring in the dialect features, which further supports the graphical representations in figure 3.

    Finally, our phylogenetic analysis using TreeMix [42] suggests that Sranan is most closely related to local dialects in Somerset (near the port of Bristol), but with strong evidence for mixture from a dialect in High Easter, Essex, in the east of England (figure 5). These results are consistent with the regions in table 3 that contribute to maximum overlap with Sranan: Blagdon, the dialect with the most matching features with Sranan, is in Somerset, and three of the local dialects that matched best to the remaining features are from Essex (High Easter, Doddinghurst and Canewdon). We note that Essex was specifically mentioned by Wyld [45] as a location where loss of rhoticity was occurring by the fifteenth century.

    Figure 5. Phylogeny of English dialects based on the 45 lexical forms included in this study, generated with TreeMix. The backbone of the tree—the branches shown in black lines—represents the maximum-likelihood phylogeny. With this phylogeny, the TreeMix algorithm tests whether a putative mixing and/or migration event between branches significantly improves the statistical likelihood of the tree. An arrow connecting branches indicates that a dialect near the base of the arrow is predicted to have influenced the language features of the dialect at the tip of the arrow. The colour scale on the left indicates the strength of this prediction. This tree indicates that Sranan is most closely related to dialects in Somerset, near Bristol, with putative mixing with dialects in Essex, in the east of England.

    This pattern was observed in multiple replications of the TreeMix algorithm. When we conducted a TreeMix analysis with one migration event allowed, the algorithm predicted a mixture from a dialect in Essex to Sranan. When we increased the number of migration events allowed, we observed additional predicted mixture events that were less strongly supported than the one involving Sranan (see electronic supplementary material, figure S1). The interpretation of these additional mixture events is unclear. For example, in one replication, an additional mixture event was signified by an arrow from Blagdon in the West to Great Strickland in the North. After the fact, we examined the PC scores of the dialects and noted that Great Strickland was placed on a branch with other northern dialects that all have roughly the same PC2, PC3 and PC4 scores. However, Great Strickland was lower in PC1 scores (rhoticity) than the other northern dialects on the branch. It is possible that the algorithm interpreted this pattern as evidence that Great Strickland was influenced by dialects outside of the northern region, but why Blagdon was the best-fit source of influence is unclear.

    In this paper, we performed a set of statistical analyses to compare features of a Creole language, Sranan in Suriname, with potential source dialects in English. We then interpreted these analyses alongside historical data to develop a fuller picture of the formation of this Creole.

    With the obvious caveat that Sranan and all of the dialects in question have accumulated changes since the seventeenth century, we used the variation in modern dialect features as a proxy for variation in the source dialects, and we used these modern dialect data to test multiple hypotheses about the formation of Sranan. The first hypothesis tested, the PD hypothesis, posits that features of Sranan could potentially be sourced from all dialects of England in proportion to the frequencies of the features across local dialects in the early contact situation. The second hypothesis tested, that of a regional London/southeast dialect source, states that a standardized form of London English was in the process of being formed during the seventeenth century, and that dialects approximating the emerging regional London/southeast SD of English would likely have been produced when settlers from multiple dialects spoke with one another, thus influencing Sranan. By contrast, we propose a more general form of the SD hypothesis according to which Sranan was primarily influenced by MSS, which we propose to be two distinct regions of England: the southwest, Somerset in particular, and the east, specifically Essex.

    The analyses of dialect variation in the 1950s presented in this manuscript support our hypothesis: features of Sranan are not well predicted by the most common features across English dialects or by the features of London-area English, whereas these features are better predicted by the inputs from two regional dialects in English. The results of these comparisons, shown in table 3, are further supported by our statistical analyses. First, we showed that the variation of language features across the dialects of English shows strong signals of regional patterning, and that Sranan exhibits features consistent with input from multiple regions (figure 3). Of the different types of features studied (table 1), rhoticity and word-initial [h] were strongly correlated with the first two PCs of the dialect data, respectively, together explaining approximately 50% of the variance in features across dialects. Our comparisons show that Sranan was most similar to the Blagdon dialect and other dialects close to the port of Bristol, and this finding is consistent with the source locations for indentured servants travelling from England to the Caribbean (figure 1). However, some features of Sranan, most notably the lack of post-vocalic rhoticity found in multiple word-forms, suggest input from another source outside of this region. Several regions of England with modern-day non-rhotic dialects were represented in the migration to Suriname through Bristol, such as Wales and the east of England (figures 1 and 3). Therefore, dialects from these regions could be putative secondary sources for features of Sranan. These observations are supported by a maximum-likelihood phylogeny of the language features that can account for potential mixture between branches. This phylogeny suggests that Sranan is most closely related to dialects near Bristol, and that there is significant evidence of mixture with a dialect from Essex in the east of England.

    It is still an open question as to whether the present data might allow us to infer the degree to which the migrants to Suriname from the different locations in England shared a rough approximation to a well-formed language variety, i.e. a koiné, out of which Sranan evolved. On the one hand, our rejection of the extreme version of the PD hypothesis suggests that the features of Sranan were not merely a random or frequency-dependent selection from the regional dialect speech of the English settlers. On the other hand, the present conclusion that there likely were two main English sources of Sranan is consistent with the view that some degree of koinéisation (i.e. of dialect levelling or linguistic accommodation and convergence) may have occurred. Resolution of this issue may depend on an analysis of the amount of variation in the occurrence of particular features, e.g. +[r], across the relevant SED items. In the present analysis, we focused on the average tendency across the relevant SED items to observe the feature, as this tendency is indexed by the associated principal component score of the dialects. An extension to the analysis of the variation in feature occurrence seems warranted.

    The present basis of 45 Sranan items used to define ‘similarity’ or ‘relatedness’ among dialects is essentially phonological (rather than, e.g. lexical or grammatical). When we reduce this basis to the familiar subgroups of items, namely rhotic, non-rhotic, diphthong (±[au]) and h-fullness (±[h]), the results in figure 3a–d show, as expected, that the similarity between Sranan and a given cluster of SED dialects depends on the specific subgroup of items. This raises the question of how robust the present conclusions about the origins of Sranan are to an expansion of the basis, or set of items, used to define similarity. For example, would our conclusions change if lexical or grammatical features were included in the basis, or if, following Szmrecsanyi [22], one were to generalize the definition of similarity to include the relative frequency of each feature in the different localities, rather than the simpler binary code, ‘present’ versus ‘absent’? These questions merit consideration in future research.

    Taken together, our analyses support the hypothesis that Sranan was influenced by certain distinct dialects from widely separated regions in England. To explain the features of Sranan, we did not need to invoke processes based on frequency of feature occurrence across dialects, or on the use of features of an emerging standardized English dialect based on the speech of London and the southeast. Instead, we could explain the features of Sranan by hypothesizing input from two distinct locations within England that were known source locations for indentured servants migrating to Suriname. In addition, our analyses shed light on the particular features, rhoticity and h-fullness, that explain most of the dialect variation in English, and confirm previous studies showing that this variation is significantly associated with geography [26,47,48]. In summary, our results not only provide support for a new hypothesis of Sranan's formation, but also suggest that a method of combining known historical data with statistical analyses of language features could be fruitful in future studies of Creole formation and, more generally, language variation.

    The datasets supporting this article have been uploaded as part of the electronic supplementary material.

    A.C.S. and H.D. conceived of the study and collected data. N.C., A.C.S., H.D. and E.A.C.T. designed and performed analyses, wrote the article and approved it for submission.

    We have no competing interests.

    E.A.C.T. received research support as a Bass University Fellow in Undergraduate Education at Stanford University. N.C. received research support from the Ruth Landes Memorial Research Fund and the Stanford Center for Computational, Evolutionary, and Human Genetics.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3972828.

    References

    • 1

      Esposito B, Wood L. 1982 Prison slavery. Washington, DC: Committee to Abolish Prison Slavery.

    • 2

      Brewer J, Staves S. 1996 Early modern conceptions of property. London, UK: Routledge.

    • 3

      Smith N, Veenstra T. 2001 Creolization and contact. Amsterdam, The Netherlands: John Benjamins.

    • 4

      Rens LLE. 1953 The historical and social background of Surinam’s Negro-English. Amsterdam, The Netherlands: North-Holland.

    • 5

      Braun M. 2009 Word-formation and creolisation: the case of early Sranan. Berlin, Germany: Walter de Gruyter.

    • 6

      Carlin E, Arends J. 2002 Atlas of the languages of Suriname. Seattle, WA: University of Washington Press.

    • 7

      Hall RA. 1966 Pidgin and creole languages. Ithaca, NY: Cornell University Press.

    • 8

      Carden G, Stewart WA. 1988 Binding theory, bioprogram, and creolization: evidence from Haitian Creole. J. Pidgin Creole Lang. 3, 1–67. (doi:10.1075/jpcl.3.1.02car)

    • 9

      Bickerton D. 1988 Creole languages and the bioprogram. In Linguistics: the Cambridge survey 2 (ed. FJ Newmeyer), pp. 268–284. Cambridge, UK: Cambridge University Press.

    • 10

      Mühlhäusler P. 1997 Pidgin and creole linguistics. London, UK: Battlebridge Publications.

    • 11

      McWhorter JH. 2001 The world's simplest grammars are creole grammars. Ling. Typol. 5, 125–166. (doi:10.1515/lity.2001.001)

    • 12

      McWhorter JH. 1997 Towards a new model of creole genesis. Bern, Switzerland: Peter Lang.

    • 13

      McWhorter J. 2002 The rest of the story: restoring pidginization to creole genesis theory. J. Pidgin Creole Lang. 17, 1–48. (doi:10.1075/jpcl.17.1.02mcw)

    • 14

      Blasi DE, Michaelis SM, Haspelmath M. 2017 Grammars are robustly transmitted even during the emergence of creole languages. Nat. Hum. Behav. 1, 723–729. (doi:10.1038/s41562-017-0192-4)

    • 15

      Migge B. 2003 Creole formation as language contact: the case of the Suriname creoles. Amsterdam, The Netherlands: John Benjamins.

    • 16

      Arends J. 2009 A demographic perspective on creole formation. In The handbook of pidgin and creole studies (eds S Kouwenberg, JV Singler), pp. 309–331. Chichester, UK: Wiley-Blackwell.

    • 17

      Bakker P, Daval-Markussen A, Parkvall M, Plag I. 2011 Creoles are typologically distinct from non-creoles. J. Pidgin Creole Lang. 26, 5–42.

    • 18

      Mufwene SS. 1996 The founder principle in creole genesis. Diachronica 13, 83–134. (doi:10.1075/dia.13.1.05muf)

    • 19

      Alleyne MC. 1980 Comparative Afro-American: an historical-comparative study of English-based Afro-American dialects of the new world. Ann Arbor, MI: Karoma Publishers.

    • 20

      Mufwene SS. 2001 The ecology of language evolution. Cambridge, UK: Cambridge University Press.

    • 21

      Mufwene SS. 2002 Competition and selection in language evolution. Selection 3, 45–56. (doi:10.1556/Select.3.2002.1.5)

    • 22

      Szmrecsanyi B. 2012 Grammatical variation in British English dialects: a study in corpus-based dialectometry. Cambridge, UK: Cambridge University Press.

    • 23

      Smith N. 1987 The genesis of the Creole languages of Suriname. PhD thesis, University of Amsterdam.

    • 24

      Smith NSH. 2009 Creole phonology. In The handbook of pidgin and creole studies (eds S Kouwenberg, JV Singler), pp. 98–129. Chichester, UK: Wiley-Blackwell.

    • 25

      Wilner J. 2007 Wortubuku Ini Sranan Tongo (Sranan Tongo – English Dictionary), 5th edn. Paramaribo, Suriname: Summer Institute of Linguistics.

    • 26

      Orton H, Dieth E. 1962 Survey of English dialects: Introduction. Leeds, UK: EJ Arnold & Son.

    • 27

      Sherriah AC. 2013 Identifying the dialect sources of the lexico-phonetic inputs from England into Sranan. PhD thesis, University of the West Indies, Jamaica.

    • 28

      Fermin P. 1769 Géographie générale, historique, géographique et physique de la colonie de Surinam. Amsterdam, The Netherlands: E. van Harrevelt.

    • 29

      Herlin JD. 1718 Beschryvinge van de Volk-plantinge Suriname. Leeuwarden, The Netherlands: Meindert Injema.

    • 30

      Stichting V. 1980 Woordenlijst Sranan-Nederlands-English: met een lijst van planten- en dierennamen. Paramaribo, Suriname: Vaco.

    • 31
    • 32

      Vieyra A. 1773 A dictionary of the Portuguese and English languages, in two parts: Portuguese and English: and English and Portuguese … in two volumes. London, UK: J Nourse.

    • 33

      van Holtrop JSE. 1823 John Holtrop's English and Dutch Dictionary: Engelsch en Néderduitsch Woordenboek. Containing the English before the Dutch. Dordrecht, The Netherlands: Blussé en van Braam.

    • 34

      Beckles H. 1989 White servitude and black slavery in Barbados, 1627–1715. Knoxville, TN: University of Tennessee Press.

    • 35

      Coldham PW. 1988 The Bristol registers of servants: sent to foreign plantation 1654–1686. Brooklyn Park, MN: Clearfield Company.

    • 36

      Sainsbury WN. 1860 Calendar of State Papers: Colonial series. London, UK: Longman.

    • 37

      Kambel E-R, MacKay F. 1999 The rights of indigenous peoples and Maroons in Suriname. Copenhagen, Denmark: IWGIA.

    • 38

      Whitehead NL. 1996 Native peoples confront colonial regimes in northeastern South America (c. 1500–1900). In The Cambridge history of the native peoples of the Americas: South America, part 2 (ed. F Salomon), pp. 382–442. Cambridge, UK: Cambridge University Press.

    • 39

      Pearson K. 1901 On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 2, 559–572.

    • 40

      Wang C, Szpiech ZA, Degnan JH, Jakobsson M, Pemberton TJ, Hardy JA, Singleton AB, Rosenberg NA. 2010 Comparing spatial maps of human population-genetic variation using procrustes analysis. Stat. Appl. Genet. Mol. Biol. 9, Article 13. (doi:10.2202/1544-6115.1493)

    • 41

      Dryden IL, Mardia KV. 1998 Statistical shape analysis. New York, NY: Wiley.

    • 42

      Pickrell JK, Pritchard JK. 2012 Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, e1002967. (doi:10.1371/journal.pgen.1002967)

    • 43

      Lass R. 1997 Historical linguistics and language change. Cambridge, UK: Cambridge University Press.

    • 44

      Beal JC. 2002 English pronunciation in the eighteenth century: Thomas Spence’s grand repository of the English language. New York, NY: Oxford University Press.

    • 45

      Wyld HC. 1920 A history of modern colloquial English. New York, NY: EP Dutton.

    • 46

      Leemann A. 2017 Using smartphone apps to map phonetic variation in British English, German, and Swiss German. J. Acoust. Soc. Am. 141, 3909–3909. (doi:10.1121/1.4988811)

    • 47

      Upton C, Widdowson JDA. 2013 An atlas of English dialects: region and dialect. London, UK: Routledge.

    • 48

      Trudgill P. 2000 The dialects of England. New York, NY: Wiley-Blackwell.


    Understanding how human populations acquire, and use, social information is one of the central challenges of cultural evolution and the focus of a highly active, interdisciplinary debate [1]. Social learning, or cultural transmission, is defined as learning that is facilitated by observations of, or interactions with, another individual or their cultural products [2,3]. It supports the spread of adaptive information, accumulated over generations [4–6], yet also bears the risk of transmitting outdated, misleading or inappropriate information, especially in changing environmental conditions [7]. But there is no unique way in which social information can be acquired; in fact, a large number of social learning processes have been identified in the literature (e.g. [5,6,8]). Research aimed at identifying learning processes in human populations can be roughly divided into two groups: experimental laboratory-based and theoretical modelling-based approaches. Laboratory-based experiments, in particular ‘microsocieties’ (e.g. [9–12]) and diffusion chain experiments (e.g. [13–16]), have focused on uncovering the variety and subtlety of human social learning strategies, providing a powerful framework for studying cultural evolution empirically (see [1,17] for comprehensive review).

    In the following, we focus on theoretical modelling-based approaches. These evolutionary models of learning have mainly focused on understanding which individual and social learning strategies are expected to have evolved in spatially and temporally changing environments (see [18] for a comprehensive review of this literature). These models provide an elegant characterization of the long-term outcomes of evolution through natural selection, as well as their associated evolutionary trajectories, and therefore produce predictions of which learning processes are expected to be present in the population.

    However, in order to verify those predictions, social learning processes would need to be observed directly so that fine-grained individual-level data detailing who learns from whom can be generated. But outside of controlled experimental conditions, large longitudinal datasets of this kind are difficult to obtain, especially in historical or anthropological contexts (e.g. [19], but see [20,21] for two cultural evolutionary case studies and [22] for a research program dedicated to addressing this issue). This is not to say that no such data exist, but in many case studies of interest the available data are in the form of frequencies of different variants of a cultural trait in the population at one or several points in time. While many modern datasets possess a rich temporal resolution (e.g. those describing the choice of first names in modern populations, which record the number of instances of a specific name each year), prehistoric or anthropological datasets—the focus of this paper—typically describe the frequencies of different cultural variants in sparse samples from the whole population.

    So if we want to infer social learning processes from available data, we face a classical inverse problem: we can only observe aggregated, population-level frequency data but aim at identifying the underlying individual-level learning processes that gave rise to them. Recent approaches to address this inverse problem have focused on, among other things: the shape of adoption curves (e.g. [4,23,24]); the comparison between observed levels of cultural diversity or cultural accumulation and the ones expected under various processes of social learning (in particular unbiased transmission (or neutral evolution) (e.g. [25–28])); the shape of rank-abundance distributions (e.g. [29–31]); and the comparison between observed turnover rates and the ones expected under unbiased transmission (e.g. [32,33]) or phylogenetic analyses (e.g. [34]). This research has clearly shown that robustly inferring the underlying processes of social learning from population-level frequency data becomes a challenging task, especially in the light of equifinality, i.e. in situations where various learning processes can result in very similar population-level characteristics (e.g. [35,36]).

    However, the inverse problem described above is of course not unique to cultural evolution. In fact, other scientific fields have successfully overcome similar challenges, in particular population genetics, which aims to understand the evolutionary mechanisms that produced the allele frequency distributions observed both now and in the past. Here, recent developments have provided elegant means for building complex evolutionary models, and allowing the application of efficient generative inference frameworks, which made possible the statistical testing of increasingly realistic demographic hypotheses. In general, the generative approach proceeds by building a fully specified probabilistic model, in which the hypothesized causal mechanisms are explicitly defined. This model is then used to repeatedly simulate pseudo-datasets under known parameter values, such that their expected distribution can be statistically compared with observed data, through techniques such as approximate Bayesian computation (ABC). This comparison allows certain hypothesized mechanisms to be rejected as inconsistent with the empirical data, and the estimation of model parameters that provide the best fit.

    Our goal in this paper is to demonstrate how the generative inference approach can help answer the question of how human populations use social information, based on observable empirical evidence. We note that the general idea of generative modelling has already been applied to socio-cultural evolution. Significant early examples include Schelling's segregation model [37] and the influential agent-based economic modelling framework of Sugarscape [38]. These approaches and subsequent generalizations (see e.g. [39]) investigated the effects of explicitly defined individual-level causal mechanisms on population-level outcomes, which could then be compared with observed data. While one of the major advantages of this line of work is that the complex nature of the models considered allowed for more realistic expected outcomes, the principal limitation has been the lack of a robust statistical methodology capable of comparing these outcomes to empirical data. However, careful application of techniques like ABC, as mentioned above, is beginning to remove this limitation to inference (e.g. [40]).

    We believe that the generative inference approach reviewed in this paper may link theoretical and empirical work in cultural evolution closer together by providing a framework that is able to evaluate the consistency between different individual-level processes and observed population-level patterns; in our case, between different processes of social learning and observed patterns of cultural change. Similarly to population genetics, such an inference framework consists of building a generative model that establishes a causal link between individual-level learning processes and observable population-level frequency data that then are evaluated for statistical consistency. The outcome of this approach is not only the identification of the most likely underlying learning process given the empirical data but a description of the breadth of processes that could have produced these data equally well, which in turn can be interpreted as an informal measure of the level of equifinality. Additionally, the inference framework may provide insight into the temporal and/or spatial resolution of the population-level frequency data that are needed to reliably distinguish between different processes of social learning (see [41] for a related discussion). In §1a, we briefly review some of the relevant key developments in population genetics, before exploring the applicability of the generative inference approach to cultural frequency data in §2.

    Classical population genetics—with its prospective approach [42]—provided many important theoretical insights into how the processes of mutation, drift, selection, migration and demographic change may shape the genetic variation expected in a population at equilibrium (e.g. [43–45]). But the development of coalescent theory in the early 1980s [46] (see also [47–49]) offered an alternative retrospective view of genetic evolution, providing a statistical model for the genealogical relationships between just a sample of individuals rather than the entire population. One major advantage of this coalescent framework is that, given an explicit model of demographic history and a mutation model, it allows for very efficient simulation of genetic—or genomic-scale—data for an observed sample with no a priori assumption of equilibrium. This has proved very useful in inferring population history, and while there is a wide array of other methodological approaches (e.g. [50–55]), the generative approach—in which simulated genetic data are statistically compared to the observed data—is growing in popularity, with the models of demographic history becoming increasingly complex and realistic (e.g. [56–58]).

    However, generative inference crucially relies on the ability to make an evaluation of the quality of the model used. Rather than simply rejecting those demographic models or hypotheses that generate genetic variation inconsistent with what we observe (as in [59,60]), there exists a large and growing body of statistical techniques that allow for the explicit comparison of competing scenarios and the estimation of their underlying parameters. One such approach, ABC [61,62], was developed by statistical and population geneticists to circumvent the difficulty, or impossibility, of specifying the likelihood functions for complex models. ABC relies on repeatedly simulating pseudo-data under an explicitly specified model and, by retaining just those parameter values that generate data ‘close’ to the observed data, allows estimation of their posterior distributions (full details are given in §2b). A number of researchers have used this pairing of coalescent-based simulation and ABC to answer diverse questions about human demographic history, from early population differentiation in sub-Saharan Africa [63], to the global expansion of modern humans during the Late Pleistocene [58], to hunter–gatherer population replacement in Europe [64] and the initial colonization of the Americas [57] at the end of the last Ice Age.

    In the following, we demonstrate how generative inference procedures can be constructed and used to infer social learning processes from cultural data in the form of time series detailing the usage or occurrence frequencies of different cultural variants. Similar to the population genetic applications, the inference procedure consists of two steps. First, we develop a non-equilibrium generative model capturing the main cultural and demographic dynamics of the considered system. This model describes the frequency evolution of different cultural variants present in a population at given time points under an assumed social learning hypothesis. Second, ABC techniques are used to derive conclusions about which (mixtures of) learning strategies are consistent with the observable frequency data and which are not. The aim of this framework is to allow researchers to ‘reverse engineer’ which learning strategies are likely to have been used in current or past populations, given knowledge of how frequencies have changed over time, independent of optimality or equilibrium assumptions. Figure 1 summarizes the steps of the generative inference framework described in this section.

    Figure 1. Schematic representation of the proposed generative inference framework. This non-equilibrium framework requires multiple observations (i.e. at least two) of cultural data D(tj), in our case population-level frequencies of different cultural variant types, at known times tj. D⋆(tj) denotes the theoretical data produced under the social learning process described by θ. The generative model is initialized with the data observed at the beginning of the time series t1. (Online version in colour.)

    We stress that this particular inference framework is designed to analyse the temporal dynamic of cultural change, defined as the change in frequency of different variants of cultural traits. If the observed data are of a different nature, e.g. describing the continuous variation of certain attributes of cultural artefacts, such as the dimensions of an arrowhead, then researchers have to first construct a hypothesis about the relationship between temporal variation of the attribute and the social learning processes considered in order to apply a similar inference procedure.

    In §2a,b we describe the two steps of the inference framework and discuss in §2c the theoretical limits to inference; specifically, we ask how much information about underlying social learning processes we should expect to infer from population-level frequency data of a given temporal resolution. In §2d(ii) we show how the generative approach has been applied to cultural case studies. Lastly, in §2e we discuss some issues researchers should consider before applying the proposed, or a similar, inference framework.

    As mentioned above, the generative model aims at capturing the main cultural and demographic dynamics of the cultural system. Importantly, the generative model has to produce pseudo-data—in our case, population-level frequencies of different variants of a cultural trait at different points in time conditioned on the assumed social learning process—so that theoretical predictions can be compared to empirical observations. Thereby different learning processes are expressed by different model parameterizations; the model parameters are denoted by θ = (θ1, …, θk) in the following. In other words, the generative model establishes an explicit causal relationship between the assumed processes of social learning defined by θ and observable population-level patterns of cultural change.

    We note that there are no restrictions on the type of generative model used. Models ranging from systems of partial differential equations to agent-based simulations have been used successfully; in fact, a number of the models mentioned in §1 could, with an appropriate choice of generative model, feasibly be adapted for use within this inference framework. As we want to generate frequency data at different time points, we advocate the use of non-equilibrium models, which can also account for temporal changes in demographic properties of the cultural system (e.g. variations in the total size of the population of cultural variants). This modelling choice aims at reducing the risk of misinterpreting non-equilibrium dynamics as evidence for the presence or absence of particular social learning processes (see [65] for a detailed discussion). For instance, the rejection of the hypothesis of neutral cultural evolution, based on empirical data, has usually been interpreted as evidence for the existence of selective biases in the population. But it has been pointed out that such a rejection can also be indicative of non-equilibrium dynamics or simply violations of the inherent assumptions of the neutral model (e.g. [28,66]). We note, however, that relaxing the equilibrium assumption requires accurate knowledge about, e.g., the time points at which the observed frequencies are recorded. We return to this issue in §2e.
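
    As an illustration only, a generative model of this kind could be sketched as follows. This is not the model used in any of the cited case studies: the function name `simulate_frequencies`, the frequency-dependence parameter `b`, the innovation probability `mu` and the copying rule (variants chosen in proportion to frequency raised to the power 1 + b) are all assumptions made for the example. The sketch is initialized with the counts observed at t1 and allows the population size to change between time steps.

```python
import random


def simulate_frequencies(initial_counts, pop_sizes, b, mu, rng):
    """Simulate variant counts at successive time steps (illustrative sketch).

    initial_counts: counts of each variant observed at t1 (initializes the model)
    pop_sizes: population size at each later time step (may vary over time)
    b: hypothetical frequency-dependence parameter (0 = unbiased copying)
    mu: hypothetical probability that a learner innovates a new variant
    rng: a random.Random instance, so that runs are reproducible
    """
    counts = list(initial_counts)
    history = [counts[:]]
    for n in pop_sizes:
        total = sum(counts)
        # Assumed copying rule: weight of a variant is frequency ** (1 + b).
        weights = [(c / total) ** (1.0 + b) if c > 0 else 0.0 for c in counts]
        new_counts = [0] * len(counts)
        for _ in range(n):
            if rng.random() < mu:
                new_counts.append(1)  # innovation: a brand-new variant
            else:
                i = rng.choices(range(len(weights)), weights=weights)[0]
                new_counts[i] += 1
        counts = new_counts
        history.append(counts[:])
    return history


rng = random.Random(1)
history = simulate_frequencies([50, 30, 20], [100, 120, 150], b=0.0, mu=0.01, rng=rng)
```

    Each entry of `history` is a vector of variant counts at one time point, which is the kind of pseudo-data that can be compared with observed frequencies; under these assumptions, different values of `b` and `mu` stand for different social learning hypotheses.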

    To infer which learning strategies are consistent with the observed data, we would ideally determine the likelihood function of the generative model. However, in many cases (if not most in reality) the likelihood functions cannot be determined easily. As introduced in §1a, ABC [61,62] was developed to circumvent this difficulty. Given observed data D, this likelihood-free approach directly approximates the joint posterior density of the model parameters P(θ | D). It does this through repeatedly simulating data D⋆ under a generative model with parameter values drawn from their prior distributions P(θ). These prior distributions describe the possible values that the parameters can assume or summarize all prior knowledge researchers may have. Retaining those parameter sets that generate data sufficiently ‘close’ to the observed data D, and rejecting the rest, results in a random sample from the distribution P(θ | d(D, D⋆) ≤ ɛ), where d( · , · ) is a distance metric between the observed and simulated data, and ɛ is a tolerance level determining the approximation to the true posterior P(θ | D). Modal values and credible intervals for each model parameter can then be obtained from this approximate joint posterior.
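
    The basic rejection scheme described above can be sketched in a few lines. The example is a toy under stated assumptions: the data D are taken to be the number of 100 learners using a variant A, θ is the probability that a learner adopts A, the prior is uniform on [0, 1], and the names `abc_rejection` and `simulate_count` are hypothetical.

```python
import random


def abc_rejection(observed, simulate, sample_prior, distance, eps, n_sims, rng):
    """Return the parameter draws whose simulated data lie within eps of the observed data."""
    accepted = []
    for _ in range(n_sims):
        theta = sample_prior(rng)          # draw theta from the prior P(theta)
        simulated = simulate(theta, rng)   # pseudo-data D* generated under theta
        if distance(observed, simulated) <= eps:
            accepted.append(theta)         # retain theta because d(D, D*) <= eps
    return accepted


def simulate_count(theta, rng, n=100):
    """Toy generative model: how many of n learners adopt variant A."""
    return sum(rng.random() < theta for _ in range(n))


rng = random.Random(42)
posterior = abc_rejection(
    observed=30,
    simulate=simulate_count,
    sample_prior=lambda r: r.uniform(0.0, 1.0),
    distance=lambda d, d_star: abs(d - d_star),
    eps=2,
    n_sims=5000,
    rng=rng,
)
posterior_mean = sum(posterior) / len(posterior)
```

    The retained values in `posterior` form a sample from P(θ | d(D, D⋆) ≤ ɛ). Lowering `eps` tightens the approximation to P(θ | D) at the cost of fewer accepted draws, which is the inefficiency that later extensions of the algorithm try to reduce.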

    Due to the high dimensionality of most real-world datasets, the data D are often reduced to a summary statistic (or a set of summary statistics) S, so that we are really sampling from P(θ | d(S, S⋆) ≤ ɛ) to approximate the posterior P(θ | S). The choice of appropriate summary statistics to maximize sufficiency (i.e. such that P(θ | S) = P(θ | D)) is not straightforward, and is an active area of statistical research (e.g. [67] and see also §2e). There have been many extensions to this initial basic (and inefficient) rejection algorithm, including weighting the retained parameter sets dependent on their exact distances d( · , · ) through regression methods (e.g. [61,68]) or increasing the efficiency of sampling from the prior distributions (e.g. [69,70]).

    The output of any ABC procedure is the joint posterior distribution of the model parameters θ = (θ1, …, θk) (and derived from that the marginal posterior distributions), indicating the range of the parameter space that is able to produce frequency data within a given tolerance level ɛ of the observed data, and consequently the learning strategies that are consistent with the data. We stress that the obtained posterior distribution is only a good approximation of the ‘true’ posterior distribution for small tolerance levels ɛ. Therefore, if the obtained ɛ is large and cannot be improved upon, the inferred parameter spaces are likely not meaningful. This situation may point to an inadequacy of the model, and therefore the assumed social learning processes, to explain the data. The explanatory value of the obtained posterior distribution can be investigated by posterior predictive checks [71]. These assess how well the parameter ranges specified by the posterior distribution explain the observed data (see [65] for further detail and §2d(i)). Additionally, cross validation tests or coverage plots have been developed to further investigate the accuracy of the results of the ABC analysis [40,72,73]. In practice, performing ABC analyses has been made relatively straightforward since the release of software such as DIY-ABC [74], ABCtoolbox [75], and the R packages abc [72], abctools [76] and EasyABC [77].

    Finally, we note that as well as estimating parameters, ABC has also been used to test between multiple competing models, by estimating Bayes factors from the relative proportions of simulations accepted from each model (e.g. [62]). While it has been shown that this approach is not theoretically justified [78] when reducing the data D to summary statistics S—as owing to the loss of information this approximation does not necessarily converge on the true Bayes factors—a number of authors have successfully applied various simulation-based power analyses to mitigate this problem (see for example [63,64,79]). And more recently another approach utilizing machine learning algorithms—and in particular random forests—has begun to prove successful for complex ABC model selection [80,81].
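
    The acceptance-proportion idea can be illustrated with a toy comparison. The sketch below is an assumption-laden example, not a method taken from the cited work: the two models, the observed count, the tolerance and the helper name acceptance_count are hypothetical, and the full data (a single count) are compared directly, so the caveat about summary statistics does not arise here.

```python
import numpy as np

def acceptance_count(observed, simulate, eps, n_sims, rng):
    """Number of simulations from one model that land within eps of the observed value."""
    sims = np.array([simulate(rng) for _ in range(n_sims)])
    return int(np.count_nonzero(np.abs(sims - observed) <= eps))

rng = np.random.default_rng(7)
observed, n_trials = 108, 200   # hypothetical data: 108 successes in 200 trials

# Model 0: success probability fixed at 0.5; model 1: probability drawn from a uniform prior.
acc0 = acceptance_count(observed, lambda rng: rng.binomial(n_trials, 0.5), 3, 20000, rng)
acc1 = acceptance_count(observed, lambda rng: rng.binomial(n_trials, rng.uniform()), 3, 20000, rng)

# With equal prior model probabilities and equal numbers of simulations,
# the ratio of acceptance counts approximates the Bayes factor of model 0 over model 1.
bayes_factor_01 = acc0 / acc1
print(acc0, acc1, bayes_factor_01)
```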

    It is well known that efforts to understand learning processes based on population-level data may be confounded by equifinality (see e.g. [35,36] for a recent discussion). The inference framework introduced above generates posterior distributions of the model parameters describing the learning strategies that are consistent with the observed data. Therefore, the widths of these distributions, or their credible intervals, may provide an informal measure of the level of equifinality [82]. If the posterior distributions are narrow then only a small region of the parameter space is consistent with the data and therefore a large number of learning processes are not able to produce the observed frequency changes. In this case, the data carry a relatively strong signature of the underlying processes of social learning. By contrast, if the distributions are wide, a large region of the parameter space is consistent with the data and therefore many social learning processes are able to generate very similar population-level frequency patterns.

    In this way the inference framework itself provides a way of exploring the inferential limits of population-level data of a given temporal resolution. For this, the generative model is used to simulate frequency data with a specific parameterization θ, i.e. under a known process of social learning. Applying the inference procedure to these data produces posterior distributions, and while we know that the data have been generated with a specific parameter value, these distributions indicate all other values (and therefore learning processes) that could produce the ‘observed’ frequency changes equally well. Wide posterior distributions then mean that researchers should not expect cultural data with a similar temporal resolution, which are likely to be noisier than pseudo-data produced by the generative model, to provide much information about underlying mechanisms.

    But when do we consider a marginal posterior distribution narrow? One possibility is to compare the widths of prior and posterior distributions of the parameter in question. As mentioned above, the prior distribution describes the possible values that the parameter can assume or summarizes all prior knowledge researchers may have (see blue, solid line in figure 2 for an example of a uniform, uninformative prior distribution). If the parameter range covered by the posterior distribution is smaller compared to the range covered by the prior distribution (see the red, dashed line in figure 2 as an example) then the inference procedure led to the exclusion of some learning hypotheses: social learning processes described by parameter values not covered by the posterior distribution cannot generate theoretical data sufficiently close to the observed data and are consequently not considered to be consistent with the observations. Naturally, the smaller the credible interval the more the pool of potential hypotheses can be reduced, and the stronger the signature of underlying social learning processes in the observed population-level data. If, however, the parameter ranges covered by prior and posterior distributions are almost identical (see the red, dotted line in figure 2 as an example), then a priori knowledge of the researchers cannot be improved by analysing such data at the given resolution.
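
    One way to make this comparison concrete is to compute the ratio of credible-interval widths. The following is a minimal sketch under assumptions: the helper names, the 95% level and the three sample sets are hypothetical stand-ins for a prior and for two possible marginal posteriors.

```python
import numpy as np

def interval_width(samples, level=0.95):
    """Width of the central credible interval at the given level."""
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(samples, [tail, 100.0 - tail])
    return upper - lower

def width_ratio(prior_samples, posterior_samples, level=0.95):
    """Posterior interval width relative to the prior's; values near 1 mean little was learned."""
    return interval_width(posterior_samples, level) / interval_width(prior_samples, level)

rng = np.random.default_rng(0)
prior = rng.uniform(-1.0, 1.0, size=10000)             # uniform, uninformative prior
informative = rng.normal(0.2, 0.05, size=10000)        # hypothetical narrow posterior
uninformative = rng.uniform(-0.95, 0.95, size=10000)   # hypothetical posterior close to the prior

ratio_narrow = width_ratio(prior, informative)
ratio_wide = width_ratio(prior, uninformative)
print(ratio_narrow, ratio_wide)
```

    A small ratio corresponds to a substantially reduced pool of hypotheses; a ratio close to 1 corresponds to data that add little to the prior knowledge.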

    Figure 2. Blue solid line: uninformative, uniform prior distribution for parameter θ1; red dashed and red dotted lines: potential marginal posterior distributions for θ1. (Online version in colour.)

    Additionally, cross validation analyses as suggested in [72] provide an alternative way of demonstrating how informative the data are about underlying social learning processes. In this context, we showed in [83] that we should not expect to be able to distinguish between unbiased transmission and moderately strong frequency-dependent selection based on frequency information of a population of cultural variants at two different points in time.

    To demonstrate the applicability and utility of the generative inference framework described above, we summarize in the following the analysis of a cultural dataset from the earliest-known farming population in Central Europe, the so-called Linearbandkeramik (LBK) from approximately 7500–7000 years ago (see [84] for the complete analysis). The dataset records the frequencies of different types of decorated vessels at seven different points in time, denoted by tj, j = 1, … , 7, defining six phases of cultural change that vary in duration. The aim of this study was to explore whether observed frequency changes in different types of pottery between the beginning and the end of each of the six phases are consistent with a specific hypothesis about the underlying social learning processes, in particular unbiased transmission, frequency-dependent selection and pro-novelty selection. For the sake of brevity, we consider in the following unbiased transmission and frequency-dependent selection only.

    The first step of the inference framework is the development of the generative model. To make use of all available archaeological information, we used a simulation approach that accounted for the fact that the observed frequencies describe a sample and not the population of pottery types. Starting from observed data, the absolute frequencies D(tj) = [n1, …, nk] of k different variant types in the sample of size n(tj) at the beginning of the phase, tj, we generated a population of cultural variants P(tj) = [R1, …, Rk, Rk+1] from which the sample could have been drawn at random using the Dirichlet distribution approach [71]. The variables Ri represent the absolute frequency of variant type i in the population. Importantly, the population consists of k + 1 variant types, where the type k + 1 contains all variants of types not observed in the sample at tj.
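
    The population-generation step can be sketched as follows. This is a hedged illustration: it assumes a uniform Dirichlet prior over the k + 1 types (prior_weight = 1) followed by a multinomial draw of absolute frequencies, which may differ from the exact procedure in [71,84]; the function name and the sample counts are hypothetical.

```python
import numpy as np

def draw_population(sample_counts, population_size, rng, prior_weight=1.0):
    """Draw one population [R1..Rk, Rk+1] from which the sample could plausibly have come."""
    # Append a zero count for type k+1, which pools all variants not observed in the sample.
    counts = np.append(np.asarray(sample_counts, dtype=float), 0.0)
    proportions = rng.dirichlet(counts + prior_weight)      # plausible population proportions
    return rng.multinomial(population_size, proportions)    # absolute frequencies R_i

rng = np.random.default_rng(42)
sample = [12, 7, 3, 1]   # hypothetical counts n_1..n_k
population = draw_population(sample, population_size=500, rng=rng)
print(population, population.sum())
```

    Repeating the draw yields many candidate populations, so that uncertainty about the unobserved population composition is carried through to the later simulation steps.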

    Based on this population P(tj) = [R1, … , Rk, Rk+1] and an estimate of the population size N(tj) at time tj (if no other information is available the population size N(tj) at time tj is inferred from the size of the sample at this time), we generated population-level frequencies of the k + 1 variant types conditioned on a specific process of social learning at each time step t = 1, … , tj+1 − tj. For that, we assumed that in each time step a fraction r of the population of cultural variants is removed and new variants are subsequently added (in this way the framework can accommodate temporal changes in population size). While the removal process is random, the replacement process is defined by the assumed process of social learning. In detail, a variant type i, i = 1, … , k is chosen to be added to the population according to the probability

    [Equation (2.1), giving the probability that a variant of type i is added, is missing from this copy.]

    where N(t) denotes the population size at time t, Ni(t) is the number of variants of type i, u is the total number of variants removed at this time step, ui is the number of variants removed of type i, bfreq controls the strength of frequency-dependent selection and [symbol missing] is the number of variant types present at time t. Importantly, choosing bfreq = 0 in equation (2.1) models unbiased transmission, whereas bfreq > 0 describes a selective advantage for high-frequency variant types and bfreq < 0 a selective advantage for low-frequency types. Further, the variable μ defines the probability with which a novel variant type not previously seen in the population is introduced into the system. A similar probability as in equation (2.1) is defined for variant type k + 1, containing all variant types not observed in the sample at t1 and, per definition, all subsequent innovations.
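
    The removal and replacement step can be sketched in code. The weighting used below is an illustrative assumption and should not be read as equation (2.1): it is simply one form that reduces to copying in proportion to frequency when bfreq = 0, favours common types when bfreq > 0 and rare types when bfreq < 0. The function names are hypothetical, the last entry of counts stands for the pooled type k + 1, and the population size is kept constant for simplicity.

```python
import numpy as np

def addition_probabilities(counts, removed, b_freq, mu):
    """Probability of adding each type after removal (assumed form, not equation (2.1))."""
    remaining = counts - removed
    freq = remaining / remaining.sum()          # (N_i - u_i) / (N - u)
    n_types = np.count_nonzero(remaining)       # number of variant types still present
    weights = np.clip(freq * (1.0 + b_freq * (freq - 1.0 / n_types)), 0.0, None)
    probs = (1.0 - mu) * weights / weights.sum()
    probs[-1] += mu                             # innovations are assigned to the pooled type
    return probs

def time_step(counts, r, b_freq, mu, rng):
    """Remove a fraction r at random, then add the same number according to the learning rule."""
    u = int(round(r * counts.sum()))
    removed = rng.multivariate_hypergeometric(counts, u)   # random removal without replacement
    probs = addition_probabilities(counts, removed, b_freq, mu)
    added = rng.multinomial(u, probs)
    return counts - removed + added

rng = np.random.default_rng(3)
population = np.array([50, 30, 15, 5])   # hypothetical counts; last entry is the pooled type
for _ in range(10):
    population = time_step(population, r=0.1, b_freq=0.5, mu=0.01, rng=rng)
print(population)
```

    Adding a number of variants different from u in each step would let the same scheme follow a changing population size.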

    Lastly, to generate theoretical samples at the end of the phase tj+1, we randomly drew n(tj+1) cultural variants from the (theoretical) populations P(tj + t), t = 1, … , tj+1 − tj.

    In summary, the output of this framework is the sample frequencies, at the end of the phase tj+1, of the variant types that were present at the beginning of the phase tj, together with an additional type containing all unobserved variants, conditioned on the social learning process specified by the parameter bfreq in equation (2.1), which controls the strength of frequency-dependent selection.

    To infer the learning processes consistent with the observed changes in frequency between the beginning and the end of the phases, we applied an ABC procedure—specifically SMC ABC (e.g. [70])—and determined the joint posterior distributions of (bfreq, r). The replacement fraction r cannot be estimated from external sources and therefore has to be inferred from the data as well. Thereby the comparison between empirical and theoretical patterns was based on the absolute difference of the theoretical and observed frequencies of the k variant types present at the beginning of the simulation. Additionally, we required the same number of initially present variant types to have gone extinct at the end of the phase. The general scheme of the proposed generative inference framework is illustrated in figure 3.
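
    The comparison between theoretical and observed samples can be sketched as a distance function. This is a minimal illustration under assumptions: it sums absolute differences over the k initially present types and treats a mismatch in the number of extinct types as an automatic rejection; the function name and the example counts are hypothetical.

```python
import numpy as np

def sample_distance(observed_end, simulated_end):
    """Sum of absolute frequency differences; infinite if the extinction counts differ."""
    observed_end = np.asarray(observed_end)
    simulated_end = np.asarray(simulated_end)
    extinct_observed = np.count_nonzero(observed_end == 0)
    extinct_simulated = np.count_nonzero(simulated_end == 0)
    if extinct_observed != extinct_simulated:
        return np.inf    # extinction requirement not met, so the simulation is never accepted
    return float(np.abs(observed_end - simulated_end).sum())

d_match = sample_distance([10, 0, 5], [8, 0, 7])      # same number of extinct types
d_mismatch = sample_distance([10, 0, 5], [8, 1, 6])   # extinction counts differ
print(d_match, d_mismatch)
```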

    Figure 3. Schematic representation of the proposed inference framework. As processes of social learning act on the population-level (as opposed to the sample-level), we generate population distributions from which the observed sample of cultural variants at the beginning of the phase, t1, could have been drawn at random using the Dirichlet distribution approach. Subsequently, we transform these populations over time until the end of the phase, t2, conditioned on the assumed social learning process. To obtain theoretical samples at time t2 that can be compared to the observed sample we randomly select cultural variants from the generated populations between t1 and t2. (Online version in colour.)

    Applying this analysis to all six phases, we concluded that

    • (i) frequency-dependent selection does not describe the cultural dataset from the earliest farming population in Central Europe better than unbiased transmission. In fact, the credible intervals of all six marginal posterior distributions for bfreq contained the value 0, which means that unbiased transmission cannot be excluded as a potential explanation of the data by this analysis (see figure 4a,b for an example);

    • (ii) frequency-dependent selection and unbiased transmission may not be the best models to explain the observed data, as the achieved tolerance levels (i.e. the ‘distance’ between theoretical and observed patterns) of the ABC analysis were relatively large.

    Point (ii) suggests that the social learning hypotheses considered are not consistent with the data, which requires a re-evaluation of the generative model. Indeed, we showed in [84] that pro-novelty selection, which captures the preference for ‘young’, or recently introduced, cultural variant types, is able to replicate the observed frequency changes between the different phases and is therefore a possible explanation of the data.

    Figure 4. (a) Joint posterior distribution for the selection strength bfreq and the replacement fraction r for phase V of the considered dataset. (b) Corresponding marginal distribution for the selection strength bfreq. The value bfreq = 0 is contained in the credible interval. (c) Outcome of the posterior predictive check: 95% prediction intervals of the marginal frequency distribution of the different variant types at the end of the phase. The observed frequencies are indicated by the ‘*’ symbol. Many observations lie outside the prediction intervals, pointing to an inadequate description of the data by frequency-dependent selection and unbiased transmission. (Online version in colour.)

    Posterior predictive checks further highlighted the problem raised in point (ii). To perform this, we sampled values of the model parameters from the joint posterior distribution, inserted these into the generative model and produced theoretical frequencies at the end of each phase. Repeating this procedure generated theoretical expectations of the frequency ranges for each individual variant type based on the joint posterior distribution. The comparison of the observed frequencies of each variant type with these frequency ranges allowed the explanatory power of the derived posterior distribution to be assessed. If observations are outside the theoretical expectations then the inferred social learning processes cannot replicate all aspects of the dynamic of cultural change, indicating a mismatch between theory and data. This analysis also has the potential to reveal single variant types whose temporal frequency patterns deviate from the general population trend (see [65] for more details). Applying the posterior predictive check to the case study showed that a number of observations were outside their (theoretical) frequency ranges as determined by the joint posterior distributions (see figure 4c for an example).
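
    The check described above can be sketched as follows. This is an illustration under assumptions rather than the procedure of [65,84]: the posterior draws, the multinomial stand-in for the generative model, the observed frequencies and the function name are all hypothetical.

```python
import numpy as np

def posterior_predictive_check(observed, posterior_draws, simulate, rng, n_rep=1000, level=0.95):
    """Return per-type prediction intervals and a mask of observations lying outside them."""
    picks = rng.integers(0, len(posterior_draws), size=n_rep)   # sample from the joint posterior
    sims = np.array([simulate(posterior_draws[i], rng) for i in picks])
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(sims, [tail, 100.0 - tail], axis=0)
    observed = np.asarray(observed)
    return lower, upper, (observed < lower) | (observed > upper)

rng = np.random.default_rng(11)
posterior_draws = rng.dirichlet([50, 30, 20], size=500)   # hypothetical posterior over proportions
simulate = lambda theta, rng: rng.multinomial(100, theta)  # stand-in generative model
observed = [48, 10, 42]                                    # hypothetical end-of-phase frequencies

lower, upper, outside = posterior_predictive_check(observed, posterior_draws, simulate, rng)
print(lower, upper, outside)
```

    Variant types flagged in the mask are those whose observed frequencies the inferred parameter ranges fail to reproduce.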

    In the last section, we demonstrated the application of a generative inference framework to a specific archaeological dataset. Traditionally, Bayesian inference in archaeology has been largely limited to age estimation via 14C analyses (e.g. [85,86]), but recently the scope of inference techniques has been vastly broadened, with ABC approaches enjoying increasing popularity (e.g. [65,82,87–90]). In one of the first archaeological applications, Crema et al. [87] studied frequency changes of weaponry types in the Jura region of southeast France. The dataset comprises arrowheads of 20 types attributed to 9 chronologically distinct phases. The aim of this study was to analyse whether the temporal frequency change of the different arrowhead types contained evidence for, or against, unbiased transmission or frequency-dependent selection. Using an agent-based simulation as their generative model, the authors produced frequency change patterns under different hypotheses of social learning and under the assumption that the cultural system is at equilibrium. They compared these theoretical patterns to the observed data by measuring the dissimilarity between assemblages. Applying an ABC model selection framework, they concluded that both unbiased transmission and negative frequency-dependent selection could have generated the observed frequency differences within the phases and therefore excluded positive frequency-dependent selection as a possible mechanism of cultural evolution.

    But ABC frameworks have not been exclusively used to infer underlying social learning strategies. Porčić & Nikolić [88] analysed the demographic properties of the Mesolithic–Neolithic transition in the Central Balkan region, in particular growth rates and population size estimates for the Lepenski Vir population. Their model generated the expected number of accumulated houses for a large range of demographic scenarios which could then be compared to that observed in the archaeological record. The analysis revealed higher initial growth rates compared to other populations undergoing the Neolithic demographic transition and an increase in population size over time.

    In order to highlight the breadth of questions that can be addressed within a generative inference framework we outline two further applications, one to historical studies (a field with no strong tradition of quantitative treatments) and one to linguistics. Rubio-Campillo [91] investigated the evolution of combat. He explored the validity of different versions of Lanchester's law predicting the casualties of two enemy forces engaged in a land battle, with a dataset comprising the total number of combatants and casualties from 1080 land battles spanning from the middle of the seventeenth to the beginning of the twentieth century. The three most common formulations of Lanchester's law (linear, squared and logarithmic) can be operationalized using difference equations, and iterating these until one of the forces has suffered as many casualties as recorded in the historical record allowed for the comparison between theoretical and observed data. Besides confirming well-known results, the ABC framework pointed to a gradual decrease in the relevance of individual fighting abilities, suggesting that the plausibility of the models is not constant over the different periods.
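
    The iteration of such difference equations can be sketched as follows. The code uses the textbook linear and square formulations only (the logarithmic one is omitted), which may differ in detail from the versions used in [91]; the function name, the effectiveness parameters and the force sizes are hypothetical.

```python
def lanchester(a0, b0, kill_a, kill_b, casualties_a, casualties_b, law="square", max_steps=100000):
    """Iterate a Lanchester difference equation until either side reaches its recorded casualties.

    kill_a is the effectiveness of force A against B; kill_b that of B against A.
    Returns the simulated casualties (loss_a, loss_b).
    """
    a, b = float(a0), float(b0)
    for _ in range(max_steps):
        if a0 - a >= casualties_a or b0 - b >= casualties_b:
            break
        if law == "square":      # losses proportional to the size of the opposing force
            loss_a, loss_b = kill_b * b, kill_a * a
        elif law == "linear":    # losses proportional to the product of both force sizes
            loss_a, loss_b = kill_b * a * b, kill_a * a * b
        else:
            raise ValueError("law must be 'square' or 'linear'")
        a, b = max(a - loss_a, 0.0), max(b - loss_b, 0.0)
    return a0 - a, b0 - b

# Hypothetical battle: 1000 against 800 combatants with equal effectiveness.
loss_a, loss_b = lanchester(1000, 800, 0.01, 0.01, casualties_a=300, casualties_b=300)
print(loss_a, loss_b)
```

    In an ABC setting the effectiveness parameters would be drawn from priors and the simulated casualties compared with the recorded ones.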

    Lastly, Thouzeau et al. [81] investigated the coevolution between genes and languages at a regional scale. They simulated population genetic and cognate data under various historical models encompassing divergences and multiple borrowings and admixture events between linguistic groups. They applied an ABC framework using linguistic and genetic data from across Central Asia, and were able to reconstruct the partly differing evolutionary scenarios underlying linguistic and genetic differentiation in the region.

    Naturally, the application of the generative inference framework presented here has to proceed with caution. It is, after all, an analysis based on an underlying model of cultural change. If this model does not capture the main cultural and demographic processes contributing to the observed temporal frequency changes, the inferences obtained will likely be misleading. In the following, we outline some issues researchers should consider before applying this, or a similar, inference framework.

    In this paper, we advocate the use of non-equilibrium frameworks. While this modelling choice allows us to include knowledge about, for example, temporal changes in demographic properties and to initialize the model with observed variant frequencies, it also introduces a time-dependency. The inference framework evaluates whether frequency changes between different time points are consistent with the changes expected under a specific learning process (instead of evaluating whether statistics such as the level of cultural diversity at each point in time are consistent with the equilibrium diversity prediction). Misspecifications of time points, and consequently of the duration of the period over which the frequency changes are measured, can therefore produce erroneous theoretical expectations. Crema et al. [65] argue that the equilibrium assumption should serve as a hypothesis to be tested, rather than simply held a priori. They applied equilibrium and non-equilibrium versions of the generative model of cultural change to a dataset similar to the one described in §2d(i). They concluded that the cultural system was likely not at equilibrium and found hints for shifts between negative and positive frequency-dependent selection for different phases of the archaeological record.

    In the archaeological case study described in §2d(i), the temporal change in population size between the beginning and the end of each archaeological phase has been inferred from the change in sample size, and any increase or decrease was assumed to occur in a linear fashion over the relevant time interval. While the assumption of linear change seems plausible, especially in the absence of other information, drastic, unobserved demographic events such as population bottlenecks may be an alternative scenario. Similar to the discussion about equilibrium versus non-equilibrium models, such hidden demographic events have the potential to influence the dynamic of cultural change (e.g. [92]). As they are not included in the generative model, their influences may be mistakenly attributed to social learning processes that are able to produce a similar effect at the population level. But this potential pitfall is also itself amenable to testing with the generative inference framework. Researchers can at least evaluate the extent to which posterior distributions change when assuming a population bottleneck between the beginning and the end of the phase.

    The accuracy of ABC inference depends partly on how the difference between observed and simulated data is calculated and on the achieved tolerance level. Calculating the difference based on summary statistics S instead of the full data D results in discarding likely useful information [93]. If a summary statistic (or set of) is not sufficient—as is generally the case in practice—the resulting posterior distribution will not be equal to that computed with the full data [94] (see also [93] for a review of strategies dealing with this issue). While the impact of using insufficient statistics on inference results can be mitigated by careful application, we note that by using the actual frequencies for calculating the difference between observed and simulated data this problem is circumvented entirely. Further, any posterior distribution with large tolerance levels does not approximate the ‘true’ posterior distribution and should be treated with caution. In this case, the generative model may not produce data that are sufficiently close to the observed data. Additional procedures such as posterior predictive checks, cross validation tests or coverage plots offer additional insights into the accuracy of the inference results.

    Lastly, we point to the relationship between data quality or completeness and inferential accuracy. A recent study [31] revealed the importance of rare variants for inferring underlying processes. Using the progeny distribution (which records the frequencies of cultural variant types that produce k new variants over a fixed period of time) as a statistic, the authors showed that analyses based on only the most popular variants, as is often necessarily the case in cultural evolutionary studies, can provide misleading evidence for underlying transmission hypotheses. Especially in archaeological case studies, the observed frequencies describe the composition of often relatively small samples of cultural variants, and consequently rare variant types are likely not sampled and therefore absent from the data. Even though statistical techniques such as the Dirichlet distribution approach mentioned in §2d(i) are available, the number of rare variant types, i.e. types that are not contained in the observed sample, is likely to be misspecified (e.g. [95]) and future work is needed to understand the influence of missing data on the accuracy of the generative inference frameworks presented in this paper.

    Relatively recent developments in population genetics—namely coalescent modelling and ABC—have made generative inference possible, and shown it to be a powerful inferential framework for understanding the human past (e.g. [57,58,63,64]). Cultural evolutionary theory has been greatly advanced by adopting concepts and modelling paradigms originating in population genetics. In this spirit, the aim of this paper was to demonstrate how analogous generative inference frameworks can be applied to cultural frequency data, potentially allowing us to close the gap between theoretical modelling work and empirical work in cultural evolution.

    In particular, we focused on the topic of inferring how human populations use social information based on the available empirical evidence. In many case studies of interest, the available data are in the form of frequencies of different variants of a cultural trait in the population at one or several points in time, which means that we face a classical inverse problem. Naturally, attempting to address this problem leads to the question of how much information about underlying processes of social learning can in fact be extracted from cultural frequency data of a given resolution. The framework outlined here allows us to address this equifinality problem. At the heart of this framework is a generative model, which captures the main cultural and demographic properties of the system considered. As noted, there are no restrictions on the type of model used, with the one described in §2d(i) simply an example tailored specifically to the observed population-level frequency data. Whatever their form, these models establish a causal link between model parameters controlling the strengths of underlying evolutionary processes and observable population-level patterns; in our case, between parameters controlling the strengths of social learning processes and population-level frequencies of cultural variant types. Bayesian inference techniques, such as ABC, can then evaluate whether this specific process of social learning is able to produce frequency patterns consistent with the observed ones.

    The outcome of this inference approach is posterior distributions of the model parameters describing the learning processes that are consistent with the observed data. As discussed in §2e, while there are a number of important factors potentially influencing the accuracy of the analysis to consider, the widths of the posterior distributions may be indicative of the amount of information about the underlying social learning processes contained in the data. Narrow posterior distributions indicate that the data carry a relatively strong signature of these processes, while wider distributions suggest that the data are largely uninformative or that the models considered do not provide an adequate description of the cultural system. Therefore, this approach does not only allow for the identification of the most likely underlying learning process given the empirical data, but also for a description of the breadth of processes that could have produced these data equally well.

    Revealing the presence of equifinality may appear to be a negative result, but we stress that one should not expect a unique mapping between (sparse) population-level frequency data and underlying processes of cultural evolution [4,36]. Nevertheless, the analysis of such data will help in excluding social learning processes that could not have produced the observed data. In this way inference frameworks will lead to a reduction in the pool of potential hypotheses (even though the level of reduction might vary from case study to case study) and to an understanding of which kinds of scientific questions can be answered by which kinds of data. Additionally, we note that generative inference frameworks inform about the consistency of a limited set of possible underlying mechanisms with the data while not excluding the possibility that other mechanisms may be consistent as well. However, this should not necessarily be seen as a weakness, and as pointed out by Csilléry et al. [93, p. 413], ‘in reality scientific arguments often revolve around a limited number of hypotheses or scenarios without the need to consider an infinite set of alternative models. Models can always be improved and refined by other authors, allowing an open discussion that can greatly increase our understanding of the problem being studied.’

    Undoubtedly, more research is needed to further develop and improve the statistical tools and to explore the influence of e.g. unobserved changes in the demographic properties of the system considered or of the quality of the observed data on the accuracy of generative inference frameworks, but we believe this is an exciting and promising new direction in cultural evolution that has already begun to produce interesting results.

    This article has no additional data.

    We declare we have no competing interests.

    No funding has been received for this article.

    We thank Nicole Creanza, Oren Kolodny and Mark Feldman for inviting us to contribute to this special issue. Further, we thank two anonymous reviewers for their constructive comments and criticisms, which helped us improve this manuscript and members of the department of Human Behavior, Ecology and Culture at the Max Planck Institute for Evolutionary Anthropology for helpful comments on an earlier version of the manuscript.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    References

    • 1

      Rendell L, Fogarty L, Hoppitt WJE, Morgan TJH, Webster MM, Laland KN. 2011Cognitive culture: theoretical and empirical insights into social learning strategies. Trends. Cogn. Sci. (Regul. Ed.) 15, 68–76. (doi:10.1016/j.tics.2010.12.002) Google Scholar

    • 2

      Heyes CM. 1994Social learning in animals: categories and mechanisms. Biol. Rev. 69, 207–231. (doi:10.1111/j.1469-185X.1994.tb01506.x) Crossref, PubMed, ISI, Google Scholar

    • 3

      Hoppitt W, Laland KN. 2013Social learning: an introduction to mechanisms, methods, and models. Princeton, NJ: Princeton University Press. Crossref, Google Scholar

    • 4

      Cavalli-Sforza LL, Feldman MW. 1981Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press. Google Scholar

    • 5

      Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.

    • 6

      Rendell L et al. 2010 Why copy others? Insights from the social learning strategies tournament. Science 328, 208–213. (doi:10.1126/science.1184719)

    • 7

      Giraldeau L-A, Valone TJ, Templeton JJ. 2002 Potential disadvantages of using socially acquired information. Phil. Trans. R. Soc. Lond. B 357, 1559–1566. (doi:10.1098/rstb.2002.1065)

    • 8

      Laland KN. 2004 Social learning strategies. Anim. Learn. Behav. 32, 4–14. (doi:10.3758/BF03196002)

    • 9

      Coultas JC. 2004 When in Rome … an evolutionary perspective on conformity. Group. Process. Intergroup. Relat. 7, 317–331. (doi:10.1177/1368430204046141)

    • 10

      Baum WM, Richerson PJ, Efferson CM, Paciotti BM. 2004 Cultural evolution in laboratory microsocieties including traditions of rule giving and rule following. Evol. Hum. Behav. 25, 305–326. (doi:10.1016/j.evolhumbehav.2004.05.003)

    • 11

      McElreath R, Bell AV, Efferson C, Lubell M, Richerson PJ, Waring T. 2008 Beyond existence and aiming outside the laboratory: estimating frequency-dependent and pay-off-biased social learning strategies. Phil. Trans. R. Soc. B 363, 3515–3528. (doi:10.1098/rstb.2008.0131)

    • 12

      Morgan TJH, Rendell LE, Ehn M, Hoppitt W, Laland KN. 2012 The evolutionary basis of human social learning. Proc. R. Soc. B 279, 653–662. (doi:10.1098/rspb.2011.1172)

    • 13

      Mesoudi A, O'Brien MJ. 2008 The cultural transmission of Great Basin projectile-point technology (i): an experimental simulation. Am. Antiq. 73, 3–28. (doi:10.2307/25470521)

    • 14

      Caldwell CA, Millen AE. 2009 Social learning mechanisms and cumulative cultural evolution: is imitation necessary? Psychol. Sci. 20, 1478–1483. (doi:10.1111/j.1467-9280.2009.02469.x)

    • 15

      Kirby S, Cornish H, Smith K. 2008 Cumulative cultural evolution in the laboratory: an experimental approach to the origins of structure in human language. Proc. Natl Acad. Sci. USA 105, 10 681–10 686. (doi:10.1073/pnas.0707835105)

    • 16

      Schillinger K, Mesoudi A, Lycett SJ. 2015 The impact of imitative versus emulative learning mechanisms on artifactual variation: implications for the evolution of material culture. Evol. Hum. Behav. 36, 446–455. (doi:10.1016/j.evolhumbehav.2015.04.003)

    • 17

      Whiten A, Caldwell CA, Mesoudi A. 2016 Cultural diffusion in humans and other animals. Curr. Opin. Psychol. 8, 15–21. (doi:10.1016/j.copsyc.2015.09.002)

    • 18

      Aoki K, Feldman MW. 2014 Evolution of learning strategies in temporally and spatially variable environments: a review of theory. Theor. Popul. Biol. 91, 3–19. (doi:10.1016/j.tpb.2013.10.004)

    • 19

      Guglielmino CR, Viganotti C, Hewlett B, Cavalli-Sforza LL. 1995 Cultural variation in Africa: role of mechanisms of transmission and adaptation. Proc. Natl Acad. Sci. USA 92, 7585–7589. (doi:10.1073/pnas.92.16.7585)

    • 20

      Beheim BA, Thigpen C, McElreath R. 2014 Strategic social learning and the population dynamics of human behavior: the game of Go. Evol. Hum. Behav. 35, 351–357. (doi:10.1016/j.evolhumbehav.2014.04.001)

    • 21

      Henrich J, Broesch J. 2011 On the nature of cultural transmission networks: evidence from Fijian villages for adaptive learning biases. Phil. Trans. R. Soc. B 366, 1139–1148. (doi:10.1098/rstb.2010.0323)

    • 22

      McElreath R. 2018 A long-form research program in human behavior, ecology, and culture. White paper available under http://www.eva.mpg.de/fileadmin/content_files/staff/richard_mcelreath/pdf/HBEC_whitepaper.pdf (last accessed 9 January 2018).

    • 23

      Henrich J. 2004 Cultural group selection, coevolutionary processes and large-scale cooperation. J. Econ. Behav. Organ. 53, 3–35. (doi:10.1016/S0167-2681(03)00094-5)

    • 24

      Rogers EM. 1995 Diffusion of innovations. New York, NY: Free Press.

    • 25

      Neiman FD. 1995 Stylistic variation in evolutionary perspective: inferences from decorative diversity and interassemblage distance in Illinois woodland ceramic assemblages. Am. Antiq. 60, 7–36. (doi:10.2307/282074)

    • 26

      Shennan SJ, Wilkinson JR. 2001 Ceramic style change and neutral evolution: a case study from neolithic Europe. Am. Antiq. 66, 577–593. (doi:10.2307/2694174)

    • 27

      Kohler TA, VanBuskirk S, Ruscavage-Barz S. 2004 Vessels and villages: evidence for conformist transmission in early village aggregations on the Pajarito Plateau, New Mexico. J. Anthropol. Archaeol. 23, 100–118. (doi:10.1016/j.jaa.2003.12.003)

    • 28

      Premo LS. 2014 Cultural transmission and diversity in time-averaged assemblages. Curr. Anthropol. 55, 105–114. (doi:10.1086/674873)

    • 29

      Hahn MW, Bentley RA. 2003 Drift as a mechanism for cultural change: an example from baby names. Proc. R. Soc. Lond. B 270(Suppl. 1), S120–S123. (doi:10.1098/rsbl.2003.0045)

    • 30

      Bentley RA, Hahn MW, Shennan SJ. 2004 Random drift and culture change. Proc. R. Soc. Lond. B 271, 1443–1450. (doi:10.1098/rspb.2004.2746)

    • 31

      O'Dwyer JP, Kandler A. 2017 Inferring processes of cultural transmission: the role of rare variants for distinguishing neutrality from novelty biases. Phil. Trans. R. Soc. B 372, 20160426. (doi:10.1098/rstb.2016.0426)

    • 32

      Bentley RA, Lipo CP, Herzog HA, Hahn MW. 2007 Regular rates of popular culture change reflect random copying. Evol. Hum. Behav. 28, 151–158. (doi:10.1016/j.evolhumbehav.2006.10.002)

    • 33

      Acerbi A, Bentley RA. 2014 Biases in cultural transmission shape the turnover of popular traits. Evol. Hum. Behav. 35, 228–236. (doi:10.1016/j.evolhumbehav.2014.02.003)

    • 34

      Eerkens JW, Bettinger RL, McElreath R. 2006 Cultural transmission, phylogenetics, and the archaeological record. In Mapping our ancestors: Phylogenetic methods in anthropology and prehistory (eds Lipo CP, O'Brien MJ, Collard M, Shennan SJ), pp. 169–83. New York, NY: Aldine.

    • 35

      Von Bertalanffy L. 1969 General system theory: foundations, development, applications (revised edition). New York, NY: George Braziller Inc.

    • 36

      Premo LS. 2010 Equifinality and explanation: the role of agent-based modeling in postpositivist archaeology. In Simulating change: Archaeology into the twenty-first century (eds Costopoulos A, Lake MW), pp. 28–37. Salt Lake City, UT: University of Utah Press.

    • 37

      Schelling TC. 1971 Dynamic models of segregation. J. Math. Sociol. 1, 143–186. (doi:10.1080/0022250X.1971.9989794)

    • 38

      Epstein JM, Axtell R. 1996 Growing artificial societies: social science from the bottom up. Washington, DC: Brookings Institution Press.

    • 39

      Epstein JM. 2006 Generative social science: Studies in agent-based computational modeling. Princeton, NJ: Princeton University Press.

    • 40

      van der Vaart E, Beaumont MA, Johnston ASA, Sibly RM. 2015 Calibration and evaluation of individual-based models using approximate Bayesian computation. Ecol. Modell. 312, 182–190. (doi:10.1016/j.ecolmodel.2015.05.020)

    • 41

      Kandler A, Wilder B, Fortunato L. 2017 Inferring individual-level processes from population-level patterns in cultural evolution. R. Soc. open. sci. 4, 170949. (doi:10.1098/rsos.170949)

    • 42

      Ewens WJ. 2004 Mathematical population genetics 1: Theoretical introduction. Berlin, Germany: Springer Science & Business Media.

    • 43

      Fisher RA. 1930 The genetical theory of natural selection: a complete variorum edition. Oxford, UK: Oxford University Press.

    • 45

      Moran PAP. 1958 Random processes in genetics. Math. Proc. Cambridge Philos. Soc. 54, 60–71. (doi:10.1017/S0305004100033193)

    • 46

      Kingman JFC. 1982 The coalescent. Stoch. Process. Their Appl. 13, 235–248. (doi:10.1016/0304-4149(82)90011-4)

    • 47

      Hudson RR. 1983 Properties of a neutral allele model with intragenic recombination. Theor. Popul. Biol. 23, 183–201. (doi:10.1016/0040-5809(83)90013-8)

    • 49

      Tavaré S. 1984 Line-of-descent and genealogical processes, and their applications in population genetics models. Theor. Popul. Biol. 26, 119–164. (doi:10.1016/0040-5809(84)90027-3)

    • 50

      Li H, Durbin R. 2011 Inference of human population history from individual whole-genome sequences. Nature 475, 493–496. (doi:10.1038/nature10231)

    • 51

      Schiffels S, Durbin R. 2014 Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925. (doi:10.1038/ng.3015)

    • 52

      Pickrell JK, Pritchard JK. 2012 Inference of population splits and mixtures from genome-wide allele frequency data. PLoS. Genet. 8, e1002967. (doi:10.1371/journal.pgen.1002967)

    • 53

      Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, Genschoreck T, Webster T, Reich D. 2012 Ancient admixture in human history. Genetics 192, 1065–1093. (doi:10.1534/genetics.112.145037)

    • 54

      Hellenthal G, Busby GBJ, Band G, Wilson JF, Capelli C, Falush D, Myers S. 2014 A genetic atlas of human admixture history. Science 343, 747–751. (doi:10.1126/science.1243518)

    • 55

      Gronau I, Hubisz MJ, Gulko B, Danko CG, Siepel A. 2011 Bayesian inference of ancient human demography from individual genome sequences. Nat. Genet. 43, 1031–1034. (doi:10.1038/ng.937)

    • 56

      Currat M, Ray N, Excoffier L. 2004 Splatche: a program to simulate genetic diversity taking into account environmental heterogeneity. Mol. Ecol. Resour. 4, 139–142. (doi:10.1046/j.1471-8286.2003.00582.x)

    • 57

      Ray N, Wegmann D, Fagundes NJR, Wang S, Ruiz-Linares A, Excoffier L. 2009 A statistical evaluation of models for the initial settlement of the American continent emphasizes the importance of gene flow with Asia. Mol. Biol. Evol. 27, 337–345. (doi:10.1093/molbev/msp238)

    • 58

      Eriksson A, Betti L, Friend AD, Lycett SJ, Singarayer JS, von Cramon-Taubadel N, Valdes PJ, Balloux F, Manica A. 2012 Late Pleistocene climate change and the global expansion of anatomically modern humans. Proc. Natl Acad. Sci. USA 109, 16 089–16 094. (doi:10.1073/pnas.1209494109)

    • 59

      Bramanti B et al. 2009 Genetic discontinuity between local hunter–gatherers and central Europe's first farmers. Science 326, 137–140. (doi:10.1126/science.1176869)

    • 60

      Bollongino R, Nehlich O, Richards MP, Orschiedt J, Thomas MG, Sell C, Fajkošová Z, Powell AT, Burger J. 2013 2000 years of parallel societies in stone age central Europe. Science 342, 479–481. (doi:10.1126/science.1245049)

    • 62

      Pritchard JK, Seielstad MT, Perez-Lezaun A, Feldman MW. 1999 Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16, 1791–1798. (doi:10.1093/oxfordjournals.molbev.a026091)

    • 63

      Veeramah KR, Wegmann D, Woerner A, Mendez FL, Watkins JC, Destro-Bisol G, Soodyall H, Louie L, Hammer MF. 2011 An early divergence of KhoeSan ancestors from those of other modern humans is supported by an ABC-based analysis of autosomal resequencing data. Mol. Biol. Evol. 29, 617–630. (doi:10.1093/molbev/msr212)

    • 64

      Posth C et al. 2016 Pleistocene mitochondrial genomes suggest a single major dispersal of non-Africans and a late glacial population turnover in Europe. Curr. Biol. 26, 827–833. (doi:10.1016/j.cub.2016.01.037)

    • 65

      Crema ER, Kandler A, Shennan SJ. 2016 Revealing patterns of cultural transmission from frequency data: equilibrium and non-equilibrium assumptions. Sci. Rep. 6, 39122. (doi:10.1038/srep39122)

    • 66

      Steele J, Glatz C, Kandler A. 2010 Ceramic diversity, random copying, and tests for selectivity in ceramic production. J. Archaeol. Sci. 37, 1348–1358.

    • 67

      Harrison J, Baker R. 2017 An automatic adaptive method to combine summary statistics in approximate Bayesian computation. (http://arxiv.org/abs/1703.02341v1)

    • 68

      Blum MGB, François O. 2010 Non-linear regression models for approximate Bayesian computation. Stat. Comput. 20, 63–73. (doi:10.1007/s11222-009-9116-0)

    • 69

      Marjoram P, Molitor J, Plagnol V, Tavaré S. 2003 Markov chain Monte Carlo without likelihoods. Proc. Natl Acad. Sci. USA 100, 15 324–15 328. (doi:10.1073/pnas.0306899100)

    • 70

      Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH. 2009 Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface 6, 187–202. (doi:10.1098/rsif.2008.0172)

    • 71

      Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. 2013 Bayesian data analysis. 3rd edn. Boca Raton, FL: CRC Press.

    • 72

      Csilléry K, François O, Blum MGB. 2012 abc: an R package for approximate Bayesian computation (ABC). Methods Ecol. Evol. 3, 475–479. (doi:10.1111/j.2041-210X.2011.00179.x)

    • 73

      Prangle D, Blum MGB, Popovic G, Sisson SA. 2014 Diagnostic tools for approximate Bayesian computation using the coverage property. Aust. N. Z. J. Stat. 56, 309–329. (doi:10.1111/anzs.12087)

    • 74

      Cornuet J-M, Santos F, Beaumont MA, Robert CP, Marin J-M, Balding DJ, Guillemaud T, Estoup A. 2008 Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24, 2713. (doi:10.1093/bioinformatics/btn514)

    • 75

      Wegmann D, Leuenberger C, Neuenschwander S, Excoffier L. 2010 ABCtoolbox: a versatile toolkit for approximate Bayesian computations. BMC Bioinformatics 11, 116. (doi:10.1186/1471-2105-11-116)

    • 76

      Nunes MA, Prangle D. 2015 abctools: an R package for tuning approximate Bayesian computation analyses. R. J. 7, 189–205.

    • 77

      Jabot F, Faure T, Dumoulin N. 2013 EasyABC: performing efficient approximate Bayesian computation sampling schemes using R. Methods Ecol. Evol. 4, 684–687. (doi:10.1111/2041-210X.12050)

    • 78

      Robert CP, Cornuet J-M, Marin J-M, Pillai NS. 2011 Lack of confidence in approximate Bayesian computation model choice. Proc. Natl Acad. Sci. USA 108, 15 112–15 117. (doi:10.1073/pnas.1102900108)

    • 79

      Beaumont M. 2008 Joint determination of topology, divergence time and immigration in population trees. In Simulations, genetics and human prehistory (eds Matsumura S, Forster P, Renfrew C), pp. 135–154. Cambridge, UK: McDonald Institute for Archaeological Research.

    • 80

      Pudlo P, Marin J-M, Estoup A, Cornuet J-M, Gautier M, Robert CP. 2015 Reliable ABC model choice via random forests. Bioinformatics 32, 859–866. (doi:10.1093/bioinformatics/btv684)

    • 81

      Thouzeau V, Mennecier P, Verdu P, Austerlitz F. 2017 Genetic and linguistic histories in central Asia inferred using approximate Bayesian computations. Proc. R. Soc. B 284, 20170706. (doi:10.1098/rspb.2017.0706)

    • 82

      Kandler A, Laland KN. 2013 Tradeoffs between the strength of conformity and number of conformists in variable environments. J. Theor. Biol. 332, 191–202. (doi:10.1016/j.jtbi.2013.04.023)

    • 83

      Kandler A, Powell A. 2015 Inferring learning strategies from cultural frequency data. In Learning strategies and cultural evolution during the Palaeolithic (eds Mesoudi A, Aoki K), pp. 85–101. Berlin, Germany: Springer.

    • 84

      Kandler A, Shennan SJ. 2015 A generative inference framework for analysing patterns of cultural change in sparse population data with evidence for fashion trends in LBK culture. J. R. Soc. Interface 12, 20150905. (doi:10.1098/rsif.2015.0905)

    • 85

      Ramsey CB. 2009 Bayesian analysis of radiocarbon dates. Radiocarbon 51, 337–360. (doi:10.1017/S0033822200033865)

    • 86

      Buck CE, Kenworthy JB, Litton CD, Smith AFM. 1991 Combining archaeological and radiocarbon information: a Bayesian approach to calibration. Antiquity 65, 808–821. (doi:10.1017/S0003598X00080534)

    • 87

      Crema ER, Edinborough K, Kerig T, Shennan SJ. 2014 An approximate Bayesian computation approach for inferring patterns of cultural evolutionary change. J. Archaeol. Sci. 50, 160–170. (doi:10.1016/j.jas.2014.07.014)

    • 88

      Porčić M, Nikolić M. 2016 The approximate Bayesian computation approach to reconstructing population dynamics and size from settlement data: demography of the mesolithic–neolithic transition at Lepenski Vir. Archaeol. Anthropol. Sci. 8, 169–186. (doi:10.1007/s12520-014-0223-2)

    • 89

      Edinborough K, Shennan SJ, Crema ER, Kerig T. 2015 An ABC of lithic arrowheads: a case study from south-eastern France. In Neolithic Diversities. Acta Archaeologica Lundensia, Series 8o, vol. 65 (eds Brink K, Hydén S, Jennbert K, Larsson L, Olausson D), pp. 213–223. Department of Archaeology and Ancient History, Lund University: Sweden.

    • 90

      Kovacevic M, Shennan SJ, Vanhaeren M, d'Errico F, Thomas MG. 2015 Simulating geographical variation in material culture: were early modern humans in Europe ethnically structured? In Learning strategies and cultural evolution during the Palaeolithic (eds Mesoudi A, Aoki K), pp. 103–120. Berlin, Germany: Springer.

    • 91

      Rubio-Campillo X. 2016 Model selection in historical research using approximate Bayesian computation. PLoS ONE 11, e0146491. (doi:10.1371/journal.pone.0146491)

    • 92

      Rorabaugh AN. 2014 Impacts of drift and population bottlenecks on the cultural transmission of a neutral continuous trait: an agent based model. J. Archaeol. Sci. 49, 255–264. (doi:10.1016/j.jas.2014.05.016)

    • 93

      Csilléry K, Blum MGB, Gaggiotti OE, François O. 2010 Approximate Bayesian computation (ABC) in practice. Trends. Ecol. Evol. (Amst.) 25, 410–418. (doi:10.1016/j.tree.2010.04.001)

    • 94

      Marjoram P, Tavaré S. 2006 Modern computational approaches for analysing molecular genetic variation data. Nat. Rev. Genet. 7, 759–770. (doi:10.1038/nrg1961)

    • 95

      Mao CX, Colwell RK. 2005 Estimation of species richness: mixture models, the role of rare species, and inferential challenges. Ecology 86, 1143–1153. (doi:10.1890/04-1078)



    Archaeology generates vast amounts of empirical data related to material outcomes of human social learning. These data range in scale from individual artefact traits (e.g. decorations on pots) to global trait distributions (the spread of farming) and records of technological change spanning millions of years (stone tool ‘modes', sensu Clarke [1]). Archaeologists also routinely gather data that give environmental, demographic and social context to the evolution of material culture. These anthropic and contextual data are particularly well suited to studying long-run effects of distinct cultural transmission mechanisms on material cultural evolution; identifying phylogenetic relationships among artefacts; tracking long-term cultural stability, rates of change and diffusions of innovations; and exploring cultural extinctions, instances of convergent cultural evolution and other evolutionary outcomes that may not be predicted by current models. Nonetheless, many of these are still relatively infrequent (or entirely unexplored) subjects of archaeological analysis using Darwinian concepts and methods (cf. [2]). This is partly because archaeological data are complex and more often present a fragmentary record of aggregated events than a clear and detailed account of cultural evolutionary forces acting over long timespans; we perceive a gulf between the person-to-person exchanges that drive cultural transmission and the much coarser grain of the archaeological record. But while the gulf is both real and consequential, a similar one exists between genetics and palaeontology, whose complementarity and mutual relevance to evolutionary biology are, today, undeniable. Indeed, relatively recent advances in archaeology, and examinations of archaeological data by scholars in other fields (e.g. [3–5]), show that there are a variety of ways archaeology might contribute to the development of cultural evolutionary theory. First, as we continue to identify archaeologically relevant units of observation and analysis, we increase the potential for archaeological data to provide independent tests of existing models' predictions, as demonstrated here through an original study of projectile points from the US Southwest. I argue, further, that we should pursue avenues of research that leverage archaeology's unique perspectives and rich, if complicated, records of social learning in real-world contexts to generate novel cultural evolutionary hypotheses.

    Archaeological definitions of cultural evolution have varied through time and have only recently come to include concepts and methods informed by the modern evolutionary synthesis (sensu Huxley [6]; for archaeological histories see [2,7,8]). Three such approaches currently applied in archaeology are human behavioural ecology (HBE), phylogenetics and cultural transmission theory. To date, HBE has enjoyed the widest application in archaeology, particularly in the study of prehistoric hunter–gatherers' foraging decisions. Reasons for this are both practical and historical: the bulk of material remains associated with prehistoric hunter–gatherers are subsistence related, and HBE's optimal foraging theory is in many ways compatible with archaeology's dominant (processualist) paradigm [7]. Nonetheless, many archaeologists are drawn to seminal texts by Cavalli-Sforza & Feldman [9] and Boyd & Richerson [10], which describe ways to analyse cultural data using techniques derived from evolutionary biology, genetics and population ecology, providing empirically testable predictions related to cultural change in social contexts. And, while expressly archaeological studies of gene–culture coevolution and cultural transmission theories remain few, a relatively small group of archaeologists is developing ways to interpret archaeological data in these terms [11–19]. These developments include use of cladistics to map possible phylogenetic relationships among artefacts, and quantification of artefact variability to identify learning biases. Both approaches use established cultural evolutionary models to infer evolutionary processes from archaeological patterns, which, in turn, provide independent tests of the models' predictions.

    Archaeologists first incorporated cladistics in the 1990s to explore possible phylogenetic relationships among artefacts and to ‘identify which character state changes are homologous—the result of inheritance—and which are analogous—the result of adaptation' [20, p. 728]. Since then, the approach has produced compelling evidence of ancestor–descendant relationships within classes of prehistoric technology including stone projectile points (e.g. arrowheads and spear points) and pottery [11,20,21]. It has also provoked stimulating discussions related to units and modes of cultural transmission, and the primacy of selective forces [22–24]. Granting that many phylogenetic methods will produce lineage trees whether or not a true evolutionary relationship exists [25], and that cladistics is premised on vertical cultural transmission while much social learning follows other forms (e.g. horizontal, oblique; [20,22]), cladistics can be a powerful and principled means of identifying and quantifying the evolutionary relationships that are sometimes simply assumed in other approaches (see also [2,17]).

    Where cladistics uses similarity to establish historically meaningful artefact taxa, a second approach to cultural evolution uses measures of difference (patterns of artefact variation) to identify specific modes of cultural transmission. This approach leverages the fact that the archaeological record is, in a sense, simply a spatially and temporally expansive account of artefact variation, interpreted in terms of human behaviour. Inferring past behaviours from patterns of artefact variation requires a reliable measure of variability and a body of theory equipped to distinguish its potential sources. Bettinger & Eerkens [12–15] have been particularly influential in this arena, exploring a variety of measures of within- and between-group variability (where groups are sets of artefacts, sites, etc.) to ‘construct models that produce objective, explicit predictions about how variability should behave under different natural and cultural forces at various spatial and temporal scales' [15, p. 38].

    The natural and cultural forces to which Eerkens and Bettinger refer include things like raw material quality and abundance, artefact makers' proficiency, tool functional requirements, cultural attitudes towards variation and learning biases. In essence, within- and between-group variability reflects the strength of such forces; relatively low variation indicates tighter constraints on artefact form and relatively high variation, looser or no constraints. Distinguishing specific forces is an exercise in modelling, discussed below. Measuring variability is relatively simple, though, and the coefficient of variation (CV) is a robust and reliable means of quantifying variation to determine the degree to which artefacts of a kind were standardized [14]. Itself a standardized measure, CV can be used to make comparisons both within and between sets of artefacts, across space and through time to test predictions of cultural evolutionary theory, such as the strength of a particular learning bias under certain socioeconomic conditions, as explored in the Case study below (see §2c).
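The CV described above is simply the standard deviation expressed as a percentage of the mean. A minimal sketch follows, assuming the sample standard deviation is used; the function name and the measurements are hypothetical and are not taken from the study.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV as a percentage: 100 * sample SD / mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical maximum-width measurements (mm) for two sets of points.
set_a = [14.1, 13.8, 14.4, 14.0, 13.9, 14.3]  # tightly clustered
set_b = [12.0, 16.5, 13.2, 18.1, 10.9, 15.4]  # widely scattered

cv_a = coefficient_of_variation(set_a)
cv_b = coefficient_of_variation(set_b)
print(f"set A: CV = {cv_a:.1f}%")
print(f"set B: CV = {cv_b:.1f}%")
```

Because the CV is scaled by the mean, it is unchanged if every measurement is multiplied by a constant, which is what allows comparison between attributes or assemblages of different absolute size.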

    Correlation is another simple measure that can be used to detect archaeological signals of cultural transmission. For example, Bettinger & Eerkens [13] argue that different learning biases—guided variation and indirect bias (sensu [10]) in this case—should produce distinct patterns of attribute correlation. Their study centres on stone projectile points whose attributes include length, width, thickness, weight and distances between particular landmarks. The authors hypothesize that guided variation, whereby individuals acquire cultural traits largely through trial and error, should be characterized by weaker attribute correlation than indirectly biased transmission—model-based biased transmission in this case, whereby social learners copy the behaviours of prestigious or successful individuals. When suites of traits are inherited together because they have been copied more or less faithfully from a single social model, any variation in their expression (differences in point length and width, for example) should be correlated; when individuals learn largely through trial and error, trait variation should be uncorrelated or only weakly so. The authors find support for these predictions among projectile points produced during the well-documented transition from atlatl-and-dart to bow-and-arrow technology in the US Great Basin (ca 1350 BP). Their analysis of a large database of projectile point attributes shows significant attribute correlation among points from central Nevada, whereas attributes vary largely independently among points from eastern California. Based on this result, the authors argue that bow-and-arrow technology ‘was maintained, and may have spread initially' by indirect bias in central Nevada and by guided variation in eastern California ([13], p. 235), a hypothesis later supported by behavioural experiments [26] and simulations [27].
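The contrast in attribute correlation can be illustrated with a plain Pearson coefficient. This is a sketch only: the two small data sets are invented to mimic the expected patterns (one strongly correlated, one weakly correlated) and do not reproduce the Great Basin measurements or the authors' statistical procedure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical point lengths and widths (mm).
lengths = [20.0, 22.0, 24.0, 26.0, 28.0]
widths_copied = [10.1, 10.9, 12.2, 12.8, 14.1]  # width tracks length
widths_trial = [12.5, 10.2, 13.9, 10.8, 12.6]   # width varies independently

r_copied = pearson_r(lengths, widths_copied)
r_trial = pearson_r(lengths, widths_trial)
print(f"copied from one model: r = {r_copied:.2f}")
print(f"trial and error:       r = {r_trial:.2f}")
```

Under the hypothesis described above, an assemblage shaped by indirect bias would resemble the first case and one shaped by guided variation the second.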

    Development of these explicitly archaeological hypotheses has helped us identify and interpret patterns in material culture. However, even well-defined archaeological patterns of variation can sometimes be difficult to interpret in terms of cultural evolutionary processes. In some cases, for example, data are insufficient to draw comparisons between artefact types, sites or regions, and the significance of isolated CVs can be difficult to gauge. In their initial discussion of CV as a tool for scaling artefact variability, Eerkens & Bettinger [14] suggest independent standards to which archaeological CVs can be compared. To define the lower boundary of variation—the lowest CV we should expect in the absence of an external standard to which artefact makers compared their products (e.g. a ruler or template)—the authors cite a threshold of human perception, the ‘Weber fraction'. As Ernst Weber first observed in the 1800s, objects' linear measurements must differ by approximately 3% before the difference is perceptible to humans [28–30]. Artefact makers limited only by this sensory threshold should produce artefacts that differ by an average of 3% in any dimension; the ratio of any attribute's standard deviation to its mean should be approximately 1.7% (CV ≈ 1.7, assuming a uniform distribution and a range of 6% around a sample's mean). Lower CVs would suggest use of an external standard. Conversely, ‘high' variation can be interpreted relative to the CV of a uniform distribution whose range is 200% of the mean (CV = 57.7), which is what we would expect if attribute values were chosen at random (see [14] for full discussion). Higher CVs may be indicative of deliberate attempts to make each object distinct (e.g. a social preference for self-expression or functional need for hunters' points to be differentiable).
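Both benchmark values follow from the standard deviation of a uniform distribution, which is its total range divided by the square root of 12. The short check below assumes only that relationship; the function name is illustrative.

```python
import math


def uniform_cv_percent(range_as_fraction_of_mean):
    """CV (%) of a uniform distribution whose total range is the given
    fraction of its mean; the SD of a uniform distribution is range / sqrt(12)."""
    return 100.0 * range_as_fraction_of_mean / math.sqrt(12)


# Lower benchmark: values spread +/-3% around the mean, a total range of 6%.
lower = uniform_cv_percent(0.06)
# Upper benchmark: values spread from 0 to twice the mean, a total range of 200%.
upper = uniform_cv_percent(2.0)

print(round(lower, 2), round(upper, 1))  # 1.73 57.7
```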

    These independent standards provide guidelines for interpreting very high and very low CVs. Intermediate values are less readily explained, particularly in the absence of comparative collections—groups of artefacts whose production histories are well enough known to provide benchmark CVs for manufacture under specific conditions. Such comparative collections can be simulated, though, providing a control to which empirical data can be compared. The following case study uses simulations informed by rich contextual data to gauge the amount of standardization reflected in intermediate CVs associated with a collection of projectile points from a late prehistoric site in the US Southwest.
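One generic way to build a simulated comparison is to draw many artificial collections under a hypothesized level of production error and see where an empirical CV falls relative to them. The sketch below is not the simulation used in the case study: the normal error model, the hypothesized CV of 5%, the collection size and the observed value are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_cvs(true_cv_percent, n_artefacts, n_collections, mean=20.0, seed=1):
    """Simulate collections with normally distributed attribute values and
    return the sorted sample CVs (%), one per simulated collection."""
    rng = random.Random(seed)
    sd = mean * true_cv_percent / 100.0
    cvs = []
    for _ in range(n_collections):
        sample = [rng.gauss(mean, sd) for _ in range(n_artefacts)]
        cvs.append(100.0 * statistics.stdev(sample) / statistics.fmean(sample))
    return sorted(cvs)


def central_interval(sorted_values, coverage=0.95):
    """Return the bounds of the central `coverage` share of sorted values."""
    tail = (1.0 - coverage) / 2.0
    low_index = int(tail * len(sorted_values))
    high_index = int((1.0 - tail) * len(sorted_values)) - 1
    return sorted_values[low_index], sorted_values[high_index]


simulated = simulate_cvs(true_cv_percent=5.0, n_artefacts=100, n_collections=2000)
low, high = central_interval(simulated)

observed_cv = 11.0  # hypothetical empirical CV to be compared
consistent = low <= observed_cv <= high
print(f"simulated 95% interval: {low:.2f} to {high:.2f}")
print("observed CV consistent with this scenario:", consistent)
```

If the observed CV falls outside the simulated interval, the hypothesized production scenario can be set aside and an alternative (for example, a looser constraint) simulated in its place.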

    Projectile points from the Henderson site (figures 1 and 2; N = 1029) are of two primary types: Washita (28%) and Fresno (27%; an additional 37% are indeterminate). Fresnos are excluded from the present study because it is unclear whether they are a distinct point type or simply Washita ‘preforms' (unfinished points) [31]. An analysis of Washita point variability shows that several attributes' CVs are in the intermediate range (table 1), but they suggest relatively tight production constraints, since they are much closer to 1.7 (the Weber fraction) than to 57.7 (random choice). Still, even the lowest CV (11%, maximum width; listed as the proportion 0.11 in table 1) is difficult to interpret in isolation. The points' archaeological context, which indicates an increase in both the socioeconomic importance of bison and, perhaps, incentive to advertise group membership during the late prehistoric period, suggests a variety of plausible, testable hypotheses regarding cultural evolutionary mechanisms that might account for observed patterns of artefact variability. A discussion of these hypotheses follows this brief description of the site.

    Figure 1. Map of the US Southwest and westernmost southern High Plains (UT, Utah; CO, Colorado; AZ, Arizona; NM, New Mexico). The Henderson, Garnsey Bison Kill and Bloom Mound sites are located in close proximity to one another in the area indicated by the star. Base map modified from ‘North America second level political division 2 and Greenland.svg', by Alex Covarrubias [CC BY-SA 2.5 (http://creativecommons.org/licenses/by-sa/2.5)], via Wikimedia Commons.

    Figure 2. (a) Archetypal Washita and Fresno projectile points. Average dimensions for Henderson site Washita points are provided in table 1. (b) Attributes considered in this study: mid, midline length; nw, neck width; ml, maximum length; bl, blade length; hl, haft length; w, maximum width; bw, base width. The illustrated point's maximum and base widths are the same; this is true of many, but not all, of the archaeological samples. (Point illustrations by Emily Wolfe.)

    Table 1. Summary statistics for Washita points from the Henderson site. The rows labelled ‘CV = 10% (5%, 3%)' list percentages of simulated variances greater than the archaeological variance when simulated using the corresponding CV. Weight is measured in grams; all linear measurements are in millimetres. Differences in attribute sample size owe to the fact that several of the points in the Henderson collection are broken. To maximize the data available for study, any point complete in a particular dimension (e.g. width) was included in analysis of that dimension. For example, points with broken tips were not included in analyses of maximum length but, if their bases were intact, their widths were included.

                weight  thickness  max. length  midline length  max. width  base width  blade length  haft length  neck width
    N samples       87        259          136             136         162         157           135          227         249
    mean          0.61       3.04        21.29           20.15       11.86       11.53         14.86         6.91        6.58
    s.d.          0.20       0.58         4.11            3.93        1.33        1.52          3.81         1.35        1.18
    CV            0.33       0.19         0.19            0.20        0.11        0.13          0.26         0.19        0.18
    CV = 10%        59         79           95              94         100         100            79           96          99
    CV = 5%         44         43           65              65          89          81            55           65          70
    CV = 3%         41         38           54              55          64          60            49           52          55
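
    The CV row of table 1 is the ratio of each attribute's standard deviation to its mean, reported as a proportion (the text quotes the same values as percentages). The Python sketch below recomputes the ratios from the rounded means and standard deviations in the table. Because the inputs are already rounded, a recomputed value can differ from the reported one in the second decimal place; that the published CVs were calculated from unrounded data is an assumption.

```python
# (mean, s.d., reported CV) transcribed from table 1; inputs are rounded values.
table1 = {
    "weight":         (0.61, 0.20, 0.33),
    "thickness":      (3.04, 0.58, 0.19),
    "max. length":    (21.29, 4.11, 0.19),
    "midline length": (20.15, 3.93, 0.20),
    "max. width":     (11.86, 1.33, 0.11),
    "base width":     (11.53, 1.52, 0.13),
    "blade length":   (14.86, 3.81, 0.26),
    "haft length":    (6.91, 1.35, 0.19),
    "neck width":     (6.58, 1.18, 0.18),
}


def cv(mean: float, sd: float) -> float:
    """Coefficient of variation as a proportion (s.d. divided by mean)."""
    return sd / mean


for name, (mean, sd, reported) in table1.items():
    print(f"{name:15s} recomputed CV = {cv(mean, sd):.3f} (reported {reported:.2f})")
```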

    The Henderson site represents the remains of a modestly sized residential complex occupied between AD 1250 and AD 1350 by a relatively small group of hunter–farmers. Like the Puebloan groups to the west, Henderson's occupants grew and ate corn, but bison hunting appears to have been more important, both economically and socially [32]. For example, projectile points are abundant while milling equipment, used to process corn and other agricultural products, is relatively scarce and crudely made, and often incorporated into domestic architecture before its useable life was exhausted. Dental caries are infrequent among Henderson's human skeletal remains and isotopic signatures on bones indicate modest reliance on C4 plants, both in contrast to patterns seen among committed farmers [33,34]. Moreover, a virtual absence of bison ribs and vertebrae from the Henderson assemblage as well as that of a nearby, peri-contemporaneous bison kill site (Garnsey; figure 1) suggests that dried bison meat and hides were traded, likely with Puebloan groups to the west. Beyond their economic value, bison may also have been socially important: bison bones are found almost exclusively in roasting features located in public plazas, while bones of other species are primarily found in hearths associated with individual households [32].

    Bison's centrality to economic and social life at Henderson may have translated to hunters' prestige, which, in turn, may have biased the transmission of information related to projectile point production. As mentioned above, the Washita points found at the Henderson site are more standardized than we might expect if point production were under loose or no constraints (e.g. a deliberately individualistic enterprise). Hunters, especially consistently successful ones, may have been preferentially copied, a bias perhaps facilitated by public ‘feasting' events where people may have had greater access to hunters and their gear. A reasonable hypothesis, then, is that a restricted pool of social models and a strong learning bias would lead to relatively high projectile point standardization (i.e. attribute CVs approaching 1.7).

    Alternatively, point production might have been influenced by group-affiliative norms, perhaps even an incentive to advertise group membership. Southeastern New Mexico, where Henderson is located, appears to have been a boundary zone between the farming Pueblos to the west and mobile bison hunters of the southern High Plains and Edwards Plateau to the east (figure 1). McElreath and colleagues [35–37] have argued that ecological boundaries promote the evolution of ethnic markers—characteristics that readily identify members of a group—a phenomenon that may be amplified at tense boundaries where social differentiation can have even greater fitness implications. Archaeological evidence from the Bloom Mound, a contemporaneous site roughly 1 mile from Henderson (figure 1), suggests that Henderson area groups may have been at odds with Plains groups over access to bison and/or trade partnerships with Puebloans [38]. Accordingly, a second hypothesis regarding point production in the Henderson area is that standardization was a form of ethnic marking. Points were almost certainly not a primary means of advertising group membership, but ethnographic studies suggest that functional classes of artefacts including projectile points do sometimes serve this purpose [39].

    As a preliminary test of these hypotheses, and to gauge the significance of Henderson points' intermediate CVs, I simulated point attribute data for a variety of learning scenarios. To test the first hypothesis, where all point makers modelled their points on those produced by the most successful hunters, I assume that the target value for each attribute is reflected in the archaeological sample's mean value for that attribute. If such a bias were strong, attribute values should vary only according to knapper skill, raw material quality and human perception (the Weber fraction); attribute CVs should be quite low due to a small model pool and strong learning bias. Under the second hypothesis, where point variability at Henderson was constrained by group-affiliative norms, the incentive to standardize may have been stronger than under the first hypothesis but, depending on how learning individuals acquired information, we might expect higher point variability. That is, if the relevant information and skills were acquired within households, attribute CVs would likely be higher than if knappers copied a small number of successful hunters, even if knapper skill and material quality were invariant within the community. I modelled the second hypothesis assuming within-household learning at three levels of transmission fidelity: (i) CV = 10%, which is twice the hypothesized limit described in (ii); (ii) CV = 5%, the hypothesized ‘limit of human ability to standardize manually produced artifacts' [30, p. 667], that is, the ratio of standard deviation to mean expected assuming the minimum error introduced by limitations of perception (the Weber fraction), motor skill and memory when artefacts are produced without the aid of external measures like rulers or scales; and (iii) CV = 3%, the threshold of human visual perception (Weber fraction, assuming a normal rather than a uniform distribution).

    Speth [32] estimates that the Henderson site has approximately 100 ‘room blocks', rectangular dwellings thought to have housed single nuclear or small extended families. Calibrated radiocarbon data indicate that the site was occupied for approximately 100 years (ca AD 1250 to AD 1350), assumed here to represent four learning generations. To gauge the significance of Henderson points' intermediate CVs, I simulated point attribute data under each learning scenario and compared the results with the real data. For the first simulation, I assigned each room block (house) a starting target value for each point attribute (e.g. maximum length) by taking a single random draw from that attribute's archaeological distribution. In this simulation, each house then ‘produced' four points per generation, a sample equivalent to the average archaeological density of Washita points per room block at Henderson. The points' attribute values were drawn from normal distributions with standard deviations equal to 10% of the corresponding mean (CV = 10%). Each distribution's first-generation mean was determined by a single random draw as described above; subsequent generation means were the average of the preceding generation's four point attributes. For each attribute, I ran 1000 simulations of within-house point production across four generations in each of 100 houses, with each run beginning from new, randomly drawn within-house starting target values. I recorded each run's variance and then repeated the routine using CVs of 5% and 3%. (The point data and R code for this simulation are available as electronic supplementary material.)
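
    The routine described above can be sketched in a few lines of Python. This is not the author's R code, which is in the electronic supplementary material. It is a minimal re-expression under stated assumptions: each run's variance is taken over all simulated points pooled across houses and generations, and the `arch_widths` list is synthetic placeholder data generated from the table 1 mean and standard deviation for maximum width, standing in for the real measurements. The function names are illustrative.

```python
import random
import statistics


def simulate_run(arch_values, cv, n_houses=100, n_generations=4,
                 points_per_generation=4, rng=random):
    """One run of within-house transmission.

    Returns the variance of all simulated points, pooled across houses and
    generations (pooling is an assumption of this sketch).
    """
    simulated = []
    for _ in range(n_houses):
        # Starting target: one random draw from the archaeological values.
        target = rng.choice(arch_values)
        for _ in range(n_generations):
            # Each point is drawn from a normal distribution whose standard
            # deviation is cv times the current target (mean).
            points = [rng.gauss(target, cv * target)
                      for _ in range(points_per_generation)]
            simulated.extend(points)
            # The next generation's target is the mean of this generation's points.
            target = statistics.fmean(points)
    return statistics.variance(simulated)


def percent_greater(arch_values, cv, n_runs=1000, seed=1):
    """Percentage of simulated run variances that exceed the archaeological variance."""
    rng = random.Random(seed)
    arch_var = statistics.variance(arch_values)
    sim_vars = [simulate_run(arch_values, cv, rng=rng) for _ in range(n_runs)]
    return 100 * sum(v > arch_var for v in sim_vars) / n_runs


# Placeholder data only: 162 synthetic "maximum width" values (mm).
placeholder_rng = random.Random(0)
arch_widths = [placeholder_rng.gauss(11.86, 1.33) for _ in range(162)]

for cv in (0.10, 0.05, 0.03):
    share = percent_greater(arch_widths, cv, n_runs=100)
    print(f"CV = {cv:.0%}: {share:.0f}% of simulated variances exceed the sample variance")
```

    Because the input here is synthetic, the printed percentages are not expected to match the bottom three rows of table 1; the sketch only illustrates how such percentages could be produced.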

    Distributions of the simulated samples' attribute variances provide a framework for interpreting archaeological attribute variances (figure 3a). Henderson site Washita attribute variances most closely resemble those of samples simulated assuming within-house (vertical) transmission and a CV of 3% (figure 3b). Preliminarily, this can be interpreted as extremely high-fidelity copying, limited only by the makers' ability to perceive differences between their points and those they copied (the Weber fraction). Considering the points' broader archaeological context, it is plausible that such high-fidelity copying was motivated by a strong group-affiliative norm or incentive to advertise group membership in light of tensions between Henderson area groups and Plains groups to the east.

    Figure 3. (a) Comparison of simulated (distributions) and archaeological (vertical black lines) Henderson site Washita point attribute variances. Each distribution describes 1000 simulations. The best-fitting model (CV of 3%, 5% or 10%) for each attribute is indicated by close alignment of the mean simulation variance and empirical variance. Each model's fit is summarized as a density plot in (b), which shows the standardized distances of attributes' simulated mean variances from the same attributes' archaeological variances. Henderson site Washita attribute variances most closely resemble those of samples simulated assuming within-house (vertical) transmission and a CV of 3%, tentatively interpreted as extremely high-fidelity copying, limited only by the makers' ability to perceive differences between their points and those they copied (the Weber fraction).

    Alternative explanations are possible, but not well supported by available data. Despite the potential for a strong prestige bias given the socioeconomic importance of bison, it does not appear that reverence for successful hunters was the primary driver of point standardization (see above). However, the pattern at Henderson could be explained by strong functional constraints on artefact form—perhaps points whose attributes differed significantly from the mean negatively affected hunting success—or a division of labour that delegated point production to a group of skilled artisans. Other evidence from the site contradicts these alternative hypotheses, however. For example, several points in the Roswell assemblage cannot be assigned to a named type, nor do they show any standardization among them, suggesting that functional constraints did not preclude the production and use of unstandardized points. Such constraints, if they existed, appear not to have been sufficiently strong to produce the observed level of standardization among Washita points. Moreover, the prevalence and distribution of knapping debris at the site suggest that at least some point production was done at home whereas a group of skilled artisans might have camped near the stone source instead, to reduce transport-related production costs [40,41].

    While certainly not conclusive, the preliminary finding that point standardization at Henderson most closely resembles faithful within-house vertical transmission, possibly motivated by a strong group-affiliative norm, is both a compelling alternative to more traditional ecological explanations and a hypothesis that could be tested using other lines of evidence. For example, assessment of a region-wide spatial distribution of point variability relative to social and ecological boundaries would be instructive. If projectile points were, in essence, a form of ethnic marking, the incentive to standardize should have relaxed with increased distance from boundaries and I would predict clinal variation in CVs, with the lowest near boundaries and the highest towards the centre of a group's territory.

    This case study shows how simulation, informed by cultural evolutionary models, available archaeological data and relevant contextual information, can generate ‘comparative collections' for use in the interpretation of artefact standardization and the assessment of cultural transmission in archaeological contexts. In turn, archaeological analyses can provide independent tests of established models' predictions. Moreover, the measures of artefact variability used in this and other case studies [13–15,26,27] could be used in novel ways to explore evolutionary outcomes that may not be predicted by existing models. Following a brief discussion of persistent challenges associated with archaeological data, I suggest potential lines of inquiry that integrate simulation, measures of diversity and archaeology's rich—if complex—record of real-world social learning to generate new cultural evolutionary hypotheses.

    As we pursue a greater role for archaeology in the continued development of cultural evolutionary theory, it is worth bearing certain limitations in mind. There is some concern, for instance, that archaeology's coarse-grained, typically aggregated record of human behaviour is inadequate for evaluating cultural evolutionary models that are based on person-to-person transmission of cultural information [42]. Two common phenomena affecting the archaeological record, ‘time averaging' and preservation biases, are particularly problematic in this regard.

    Archaeological tests of cultural evolutionary theory can be complicated by time averaging, whereby artefacts produced at different times become spatially associated, giving the false impression of contemporaneity [43,44]. This is problematic because it can artificially inflate measures of variability or diversity: variants produced at one time come to rest beside distinct variants produced at other times, decreasing the likelihood that observed archaeological patterns accurately reflect human behaviours and evolutionary processes [44]. The magnitude of these effects can scale with the duration of site occupation because, as more time elapses, there is both more opportunity for the record to be affected and, potentially, more variants to be admixed. However, time averaging is more directly related to environmental factors—primarily soil erosion, which conflates distinct deposits, but also agents that promote vertical mixing (e.g. burrowing animals); time averaging is a form of preservation bias.

    Preservation biases affect the likelihood that material remains (e.g. artefacts) and their original spatial relationships are preserved in the archaeological record. Biasing factors include environmental variables such as those mentioned above, as well as physical properties of the remains themselves (e.g. organic materials are less likely to preserve than inorganic ones) and the original location of a deposit (e.g. riverside sites are generally more susceptible to destruction than sites on less dynamic landforms [45,46]). Like time averaging, differential preservation reduces the integrity of the archaeological record, disproportionately affecting certain materials and geographic regions, potentially biasing our understanding of how and why material culture associated with different groups—or particular demographics within groups (e.g. gender- or age-specific artefacts)—evolve at different rates or by different means.

    The record's limitations are a perennial archaeological concern but, while time averaging and biased preservation complicate interpretations, they need not paralyse cultural evolutionary research. Archaeological studies that assess variability among continuous data (e.g. projectile point lengths) can readily incorporate simple tests to detect potential biases. For example, if preferences for different point lengths changed through time, this may present as multiple modes in a time-averaged assemblage's length distribution. Of course, attribute variability can itself vary through time if preferences, people's tolerance of variation, or learning biases change. This would be much more difficult to detect archaeologically. Nonetheless, in many cases enough is known about a region and its record that independent evidence can help identify and minimize the effects of these confounding factors. Additionally, in some instances modelling and simulation can be used to approximate the effects of data lost to time-averaging or preservation biases to estimate how pronounced an archaeological pattern would have to be before relevant biasing factors are identifiable by available means.
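
    One possible screening test of the kind mentioned above is Sarle's bimodality coefficient, which combines a sample's skewness and kurtosis; values above 5/9 (the value for a uniform distribution) are conventionally read as a hint of more than one mode. The source does not name a specific test, so this choice is an assumption, and the Python sketch below uses simple uncorrected moment estimators. The two samples are synthetic point lengths (mm) invented for illustration.

```python
import random
import statistics


def bimodality_coefficient(values):
    """Sarle's bimodality coefficient from uncorrected moment estimators.

    b = (skewness**2 + 1) / (excess_kurtosis + 3(n-1)^2 / ((n-2)(n-3)))
    """
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return (skewness ** 2 + 1) / (excess_kurtosis + correction)


rng = random.Random(42)

# Hypothetical assemblage made under one stable length preference.
single_phase = [rng.gauss(21.0, 2.0) for _ in range(300)]

# Hypothetical time-averaged assemblage mixing two successive preferences.
two_phases = ([rng.gauss(17.0, 1.5) for _ in range(150)]
              + [rng.gauss(25.0, 1.5) for _ in range(150)])

THRESHOLD = 5 / 9
for label, sample in (("single phase", single_phase), ("two phases", two_phases)):
    b = bimodality_coefficient(sample)
    flag = "possible mixing" if b > THRESHOLD else "no sign of mixing"
    print(f"{label}: b = {b:.2f} ({flag})")
```

    A coefficient of this kind is only a screen: it can miss modes that overlap heavily, so it would complement, not replace, independent stratigraphic or chronological evidence.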

    Even when not affected by time-averaging or preservation biases, the archaeological record's resolution is often mis-aligned with cultural evolutionary questions posed in other fields. Most cultural deposits are aggregated samples of multiple years at best, and more often of centuries or millennia, while cultural information is transmitted on much shorter timescales. Likewise, while cultural transmission theory centres on individuals' learning biases, the archaeological record most often represents group-level products of these biases. Rather than projecting predictions derived from existing models, simulations and laboratory experiments directly onto the archaeological record, we should continue efforts to identify archaeologically relevant units of observation and analysis, as described in the previous section, and find new ways to capitalize on the record's greatest strengths: large-scale and long-term perspectives of both cultural change and the social and ecological contexts in which it occurred, as discussed below.

    The metrics and methods described in the Current archaeological strengths section (§2) and Case study (§2c above) can be used to explore a range of topics that build on existing cultural evolutionary theory and use archaeology's unique perspective to full advantage. For example, simply mapping spatial distributions of trait variation (CVs, correlations) is likely to reveal patterns that suggest social and ecological barriers to (or conduits for) transmission, which can then be used to formulate new hypotheses. Similarly, observing how trait variation tracks with other social and ecological phenomena (e.g. increased environmental productivity and high CVs; population contraction and patterns of variation consistent with conformism) can inform our understanding of long-term patterns of material cultural evolution. This kind of exploratory analysis has the potential to both reveal and mitigate issues associated with low-resolution archaeological records described in the previous section, particularly when the analyses are performed at large spatial and temporal scales. At smaller scales, studies of trait variation can address whether different classes of artefacts (e.g. projectile weaponry versus grinding stones used to process food) evolved at different rates, perhaps as a function of their visibility (e.g. household equipment may evolve more slowly because it is less publicly visible). Differential rates of change among artefact classes might, in turn, suggest different evolutionary mechanisms. Lastly, measures of variation can be incorporated into more complex evolutionary models, as described below.

    Modelling and simulation form the foundation of modern approaches to cultural evolution. These methods have obvious advantages including their potential for exploring causal relationships through isolation and manipulation of variables, and their capacity for replication and repetition. Laboratory experiments designed to gauge humans' adherence to modelled expectations under controlled conditions [47–49] are a complement to modelling and simulation, offering some of the same benefits (reproducibility, repeatability) while exploring the effects of humanity on cultural evolution. Archaeology has the potential to further enhance our understanding through observations of cultural evolution ‘in the wild' [50]. That is, archaeological records capture real-world, high-stakes, long-run outcomes of evolutionary processes, which can be very different from short-run outcomes and model predictions [51,52]. This is partly because cultural change is a complex process involving interactions among social, ecological and demographic variables. Archaeological projects routinely generate data that provide direct or approximate measures of these variables, which can be used to develop models that incorporate archaeology's large-scale and long-term perspectives.

    Interactions and feedback among cultural, ecological and demographic systems can dilute (or amplify) ‘pure' effects of change in one system on another. For example, culture (e.g. technologies, behaviours and institutions) can mitigate environmental pressures, raise local carrying capacities and improve survivorship and fertility, stimulating population growth, which can then feed back and effect subsequent cultural change [53]. Attempts to understand material cultural evolution as simply an adaptive response to changed conditions or a product of biased cultural transmission may fail to account for important variables. Nonetheless, archaeological explanations have historically centred on prime movers or singular causes whose relationship to cultural change is assumed to be direct and unambiguous. To maximize archaeological records' potential, we should augment traditional approaches with multi-system models of cultural evolution that incorporate rich contextual evidence related to the social and ecological contexts of evolution, as in the following example.

    Toolkit richness, or the number of different kinds of tools in an archaeological assemblage, has been used as a proxy for cultural complexity in recent debates surrounding the evolutionary role of demographics [54–60]. To model the relative effects of ecological, demographic and cultural variables on the evolution of toolkit complexity, we might first identify all relevant variables: e.g. ecological: diet breadth, food density (high- and low-ranked food patches per km²), food dispersion (an index of resource clumping), food richness (number of food ‘types' sensu Bettinger and colleagues [7]) and food availability (number of available calories per km²); demographic: population size, density and connectivity among groups (e.g. number of shared cultural elements); cultural: artefact variability, as described in the previous section. A number of plausible hypotheses can then be identified (e.g. toolkit richness is a function of (H1) population size; (H2) population size + diet breadth; … (Hi) population size + diet breadth + population density + between-group connectivity + food density + food dispersion + etc.), and models fit to available archaeological and paleoenvironmental data can then be compared using formal information criteria to identify those with the best predictive power. This exercise can be repeated for multiple types of data from within a single site to understand rates of change among different artefact classes (e.g. projectile weaponry versus grinding stones, as mentioned above), or at a global scale to understand conditions that promote material cultural diversification and the trend of increasing technological and social complexity that began during the late Pleistocene. By reframing archaeological approaches to include questions that pertain directly to evolutionary context, we might provide a more nuanced understanding of cultural evolution.
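
    As a rough illustration of comparing candidate models with an information criterion, the Python sketch below fits two least-squares models to synthetic data and compares them using the Akaike information criterion (AIC). The text does not say which criterion would be used, so AIC is an assumption, as are the variable names and the invented data (toolkit richness generated as a linear function of log population size plus noise). Only a null model and H1 are shown; richer hypotheses would need a multiple-regression routine.

```python
import math
import random


def aic_least_squares(residuals, n_params):
    """AIC for a Gaussian least-squares fit: n * ln(RSS / n) + 2k.

    Additive constants shared by all models fitted to the same data are dropped.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic placeholder data, not archaeological measurements.
rng = random.Random(7)
log_population = [rng.uniform(4, 9) for _ in range(40)]
richness = [5 + 3 * p + rng.gauss(0, 2) for p in log_population]

# Null model: richness is constant (parameters: mean, error variance).
mean_richness = sum(richness) / len(richness)
null_residuals = [r - mean_richness for r in richness]
aic_null = aic_least_squares(null_residuals, n_params=2)

# H1: richness depends on population size (intercept, slope, error variance).
intercept, slope = fit_line(log_population, richness)
h1_residuals = [r - (intercept + slope * p)
                for p, r in zip(log_population, richness)]
aic_h1 = aic_least_squares(h1_residuals, n_params=3)

print(f"AIC null model: {aic_null:.1f}")
print(f"AIC H1 (population size): {aic_h1:.1f}")
print("Preferred model:", "H1" if aic_h1 < aic_null else "null")
```

    The model with the lower AIC is preferred; the 2k term penalizes each added variable, which is what keeps the longest hypothesis (Hi) from winning by default.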

    Archaeological data are not only well suited to examining the complex dynamics of cultural evolution, they are essential: they are often our only means of empirically testing the long-run effects of distinct evolutionary mechanisms. Nonetheless, archaeology's potential to help advance our understanding of cultural evolution has been largely unrealized to this point. Our contribution likely lies in the vast amounts of anthropological and contextual data we generate, which can be used to develop evolutionary models specifically tailored to archaeological circumstances and that account for real-world messiness including interactions among cultural, ecological and demographic variables. Ultimately, archaeology's better integration with the broader field of cultural evolution (sensu Cavalli-Sforza & Feldman [9], Boyd & Richerson [10]) is critical for assessing the effects of evolutionary mechanisms on long time scales.

    The primary data and all R code associated with the simulation and analysis are available as electronic supplementary material.

    I declare that I have no competing interests.

    Research funding was provided by the Regents of the University of Michigan and UM Department of Anthropology. Participation in the ‘New Perspectives in Cultural Evolution' workshop was funded by the John Templeton Foundation.

    The author thanks Marcus Feldman, Nicole Creanza, and Oren Kolodny for the invitation to participate in the ‘New perspectives in cultural evolution' workshop and to contribute to this special issue; John Speth for access to the Henderson site collections, and enlightening and engaging discussions of the region's prehistory; Edward Potchen, Laura Kochlefl, Gordon Beeman and Theodore Stern for collecting the projectile point data; Emily Wolfe for the projectile point illustrations in figure 2; Andrew Marshall for assistance with the simulation and figure 3; and two anonymous reviewers for their feedback on a draft of this paper.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3965853.

    References

    • 1

      Clarke G. 1969World prehistory: a new outline. 2nd edn. Cambridge, UK: Cambridge University Press. Google Scholar

    • 2

      Lyman RL. 2008Cultural transmission in North American anthropology and archaeology, ca. 1895–1965. In Cultural transmission and archaeology: issues and case studies (ed. O'Brien MJ), pp. 10–20. Washington, DC: Society for American Archaeology. Google Scholar

    • 3

      Henrich J. 2004Demography and cultural evolution: why adaptive cultural processes produce maladaptive losses in Tasmania. Am. Antiquity 69, 197–214. (doi:10.2307/4128416) Crossref, ISI, Google Scholar

    • 4

      Kolodny O, Creanza N, Feldman M. 2015Evolution in leaps: the punctuated accumulation and loss of cultural innovations. Proc. Natl Acad. Sci. USA 112, E6762–E6769. (doi:10.1073/pnas.1520492112) Crossref, PubMed, ISI, Google Scholar

    • 5

      Kolodny O, Creanza N, Feldman M. 2016Game-changing innovations: how culture can change the parameters of its own evolution and induce abrupt cultural shifts. PLoS ONE 12, e1005302. (doi:10.1371/journal.pcbi.1005302) Google Scholar

    • 6

      Huxley JS. 1942Evolution: the modern synthesis. London, UK: Allen & Unwin. Google Scholar

    • 7

      Bettinger RL, Garvey R, Tushingham S. 2015Hunter-gatherers: archaeological and evolutionary theory, 2nd edn. New York, NY: Springer. Crossref, Google Scholar

    • 8

      Shennan S. 2008Evolution in archaeology. Annu. Rev. Anthropol. 37, 75–91. (doi:10.1146/annurev.anthro.37.081407.085153) Crossref, ISI, Google Scholar

    • 9

      Cavalli-Sforza L, Feldman M. 1981Cultural transmission and evolution: a quantitative approach. Princeton, NJ: Princeton University Press. Google Scholar

    • 10

      Boyd RM, Richerson PJ. 1985Culture and the evolutionary process. Chicago, IL: University of Chicago Press. Google Scholar

    • 11

      Bentley RA, Shennan SJ. 2003Cultural transmission and stochastic network growth. Am. Antiquity 68, 459–485. (doi:10.2307/3557104) Crossref, ISI, Google Scholar

    • 12

      Bettinger RL, Eerkens JW. 1997Evolutionary implications of metrical variation in Great Basin projectile points. In Rediscovering Darwin: evolutionary theory in archaeological explanation (eds Barton CM, Clark GA), pp. 177–191. Arlington, VA: American Anthropological Association. Google Scholar

    • 13

      Bettinger RL, Eerkens JW. 1999Point typologies, cultural transmission, and the spread of bow-and-arrow technology in the prehistoric Great Basin. Am. Antiquity 64, 231–242. (doi:10.2307/2694276) Crossref, ISI, Google Scholar

    • 14

      Eerkens JW, Bettinger RL. 2001Techniques for assessing standardization in artifact assemblages: can we scale material variability?Am. Antiquity 66, 493–504. (doi:10.2307/2694247) Crossref, ISI, Google Scholar

    • 15

      Eerkens JW, Bettinger RL. 2008Cultural transmission and the analysis of stylistic and functional variation. In Cultural transmission and archaeology: issues and case studies (ed. O'Brien MJ), pp. 21–38. Washington, DC: Society for American Archaeology. Google Scholar

    • 16

      Eerkens JW, Lipo CP. 2005Cultural transmission, copying errors, and the generation of variation in material culture and the archaeological record. J. Anthropol. Archaeol. 24, 316–334. (doi:10.1016/j.jaa.2005.08.001) Crossref, ISI, Google Scholar

    • 17

      Lipo C, O'Brien MJ, Shennan S, Collard M (eds). 2006Mapping our ancestors: phylogenetic approaches in anthropology and prehistory. New York, NY: Aldine. Google Scholar

    • 18

      O'Brien MJ, Darwent J, Lyman RL. 2001Cladistics is useful for reconstructing archaeological phylogenies: palaeoindian points from the southeastern United States. J. Archaeol. Sci. 28, 1115–1136. (doi:10.1006/jasc.2001.0681) Crossref, ISI, Google Scholar

    • 19

      O'Brien M, Lyman RL, Collard M, Holden C, Gray RD, Shennan SJ. 2008 Transmission, phylogenetics, and the evolution of cultural diversity. In Cultural transmission and archaeology: issues and case studies (ed. O'Brien MJ), pp. 39–58. Washington, DC: Society for American Archaeology.

    • 20

      O'Brien MJ, Boulanger M, Buchanan B, Bentley RA, Lyman RL, Lipo C, Madsen M, Eren M. 2016 Design space and cultural transmission: case studies from Paleoindian eastern North America. J. Archaeol. Method Theory 23, 692–740. (doi:10.1007/s10816-015-9258-7)

    • 21

      Neiman FD. 1995 Stylistic variation in evolutionary perspective: inferences from decorative diversity and interassemblage distance in Illinois woodland ceramic assemblages. Am. Antiquity 60, 7–36. (doi:10.2307/282074)

    • 22

      Bettinger RL. 2008 Cultural transmission and archaeology. In Cultural transmission and archaeology: issues and case studies (ed. O'Brien MJ), pp. 1–9. Washington, DC: Society for American Archaeology.

    • 23

      Borgerhoff Mulder M, Nunn CL, Towner MC. 2006 Cultural macroevolution and the transmission of traits. Evol. Anthropol. 15, 52–64. (doi:10.1002/evan.20088)

    • 24

      Boyd RM, Borgerhoff Mulder M, Durham WH, Richerson PJ. 1997 Are cultural phylogenies possible? In Human by nature: between biology and the social sciences (eds Weingart SD, Mitchell P, Richerson PJ, Maasen S), pp. 355–386. Mahwah, NJ: Erlbaum.

    • 25

      Jordan P. 2009 Linking pattern to process in cultural evolution: investigating material culture diversity among the northern Khanty of northwest Siberia. In Pattern and process in cultural evolution (ed. Shennan S), pp. 61–83. Berkeley, CA: University of California Press.

    • 26

      Mesoudi A, O'Brien M. 2008 The cultural transmission of Great Basin projectile point technology I: an experimental simulation. Am. Antiquity 73, 3–28. (doi:10.1017/S0002731600041263)

    • 27

      Mesoudi A, O'Brien M. 2008 The cultural transmission of Great Basin projectile-point technology II: an agent-based computer simulation. Am. Antiquity 73, 627–644. (doi:10.1017/S0002731600047338)

    • 28

      Weber EH. 1834 De pulsu, resorptione, auditu et tactu: annotationes anatomicae et physiologicae. Leipzig, Germany: Kohler.

    • 29

      Coren S, Ward LM, Enns JT. 1994 Sensation and perception, 4th edn. Fort Worth, TX: Harcourt Brace.

    • 30

      Eerkens JW. 2000 Practice makes within 5% of perfect: visual perception, motor skills, and memory in artifact variation. Curr. Anthropol. 41, 663–668. (doi:10.1086/317394)

    • 31

      Adler M, Speth J. 2004 Projectile points from the Henderson site (1980–1981). In Life on the periphery: economic change in late prehistoric southeastern New Mexico (ed. Speth J), pp. 350–367. Museum of Anthropology, University of Michigan Memoirs, No. 37. Ann Arbor, MI: Regents of the University of Michigan.

    • 32

      Speth J. 2004 The Henderson site. In Life on the periphery: economic change in late prehistoric southeastern New Mexico (ed. Speth J), pp. 4–66. Museum of Anthropology, University of Michigan Memoirs, No. 37. Ann Arbor, MI: Regents of the University of Michigan.

    • 33

      Speth J. 2004 Life on the periphery: economic and social change in southeastern New Mexico. In Life on the periphery: economic change in late prehistoric southeastern New Mexico (ed. Speth J), pp. 420–429. Museum of Anthropology, University of Michigan Memoirs, No. 37. Ann Arbor, MI: Regents of the University of Michigan.

    • 34

      Rocek T, Speth J. 1986 The Henderson site burials: glimpses of a late prehistoric population in the Pecos Valley. Technical Report 18. Ann Arbor, MI: Museum of Anthropology, University of Michigan.

    • 35

      McElreath R, Boyd R, Richerson PJ. 2003 Shared norms and the evolution of ethnic markers. Curr. Anthropol. 44, 122–129. (doi:10.1086/345689)

    • 36

      Boyd RM, Richerson PJ. 1987 The evolution of ethnic markers. Curr. Anthropol. 2, 65–79.

    • 37

      Boyd RM, Richerson PJ, McElreath R. 2005 Shared norms and the evolution of ethnic markers. In The origin and evolution of cultures (eds Boyd RM, Richerson PJ), pp. 118–131. Oxford, UK: Oxford University Press.

    • 38

      Speth J, Newander K. 2012 Plains–Pueblo interaction: a view from the ‘middle’. In The Toyah phase of central Texas: late prehistoric economic and social processes (eds Kenmotsu N, Boyd D), pp. 152–180. College Station, TX: Texas A&M University Press.

    • 39

      Wiessner P. 1983 Style and social information in Kalahari San projectile points. Am. Antiquity 48, 253–276. (doi:10.2307/280450)

    • 40

      Beck C, Taylor A, Jones G, Fadem C, Cook C, Millward S. 2002 Rocks are heavy: transport costs and paleoarchaic quarry behavior in the Great Basin. J. Anthropol. Archaeol. 21, 481–507. (doi:10.1016/S0278-4165(02)00007-7)

    • 41

      Garvey R. 2015 A model of lithic raw material procurement. In Lithic technological systems and evolutionary theory (eds Goodale N, Andrefsky W), pp. 156–171. Cambridge, UK: Cambridge University Press.

    • 42

      Cochrane E. 2009 Evolutionary explanation and the record of interest: using evolutionary archaeology and dual inheritance theory to explain the archaeological record. In Pattern and process in cultural evolution (ed. Shennan S), pp. 113–132. Berkeley, CA: University of California Press.

    • 43

      Kidwell SM, Behrensmeyer AK (eds). 1993 Taphonomic approaches to time resolution in fossil assemblages. Short Courses in Paleontology, no. 6. Knoxville, TN: Paleontological Society.

    • 44

      Premo L. 2014 Cultural transmission and diversity in time-averaged assemblages. Curr. Anthropol. 55, 105–114. (doi:10.1086/674873)

    • 45

      Garvey R. 2015 Probabilistic survey and prehistoric patterns of land and resource use in Mendoza Province, Argentina. Intersecciones Antro. 16, 301–312.

    • 46

      Garvey R, Bettinger RL. 2017 A regional approach to prehistoric landscape use in west-central Argentina. J. Archaeol. Sci. Rep. (doi:10.1016/j.jasrep.2017.03.013)

    • 47

      Baum WM, Richerson PJ, Efferson CM, Paciotti BM. 2004 Cultural evolution in laboratory microsocieties including traditions of rule giving and rule following. Evol. Hum. Behav. 25, 305–326. (doi:10.1016/j.evolhumbehav.2004.05.003)

    • 48

      Mesoudi A, Whiten A. 2004 The hierarchical transformation of event knowledge in human cultural transmission. J. Cogn. Culture 4, 1–24. (doi:10.1163/156853704323074732)

    • 49

      Mesoudi A. 2008 The experimental study of cultural transmission and its potential for explaining archaeological data. In Cultural transmission and archaeology: issues and case studies (ed. O'Brien MJ), pp. 91–101. Washington, DC: Society for American Archaeology.

    • 50

      Boyd RM, Richerson PJ, Henrich J. 2013 The cultural evolution of technology: facts and theories. In Cultural evolution: society, technology, language, and religion (eds Richerson PJ, Christiansen M), pp. 119–142. Cambridge, MA: MIT Press.

    • 51

      Eerkens JW, Bettinger RL, Richerson PJ. 2014 Cultural transmission theory and hunter–gatherer archaeology. In The Oxford handbook of the archaeology and anthropology of hunter–gatherers (eds Cummings V, Jordan P, Zvelebil M), pp. 1127–1142. Oxford, UK: Oxford University Press.

    • 52

      Gingrich P. 1982 Time resolution in mammalian evolution: sampling, lineages and faunal turnover. In Proc. 3rd N. Am. Paleont. Convention, Montreal, vol. 1, pp. 205–210. Toronto, Ontario: NAPC-3.

    • 53

      Garvey R. 2018 Cultural transmission and sources of diversity: a comparison of temperate maritime foragers of the Northern and Southern Hemispheres. In Foraging in the past: archaeological studies in hunter–gatherer diversity (ed. Lemke A). Boulder, CO: University Press of Colorado.

    • 54

      Baldini R. 2015 Revisiting the effect of population size on cumulative cultural evolution. J. Cogn. Cul. 15, 320–326. (doi:10.1163/15685373-12342153)

    • 55

      Collard M, Kemery M, Banks S. 2005 Causes of tool kit variation among hunter-gatherers: a test of four competing hypotheses. Can. J. Archaeol. 29, 1–19.

    • 56

      Collard M, Buchanan B, O'Brien M, Scholnick J. 2013 Risk, mobility or population size? Drivers of technological richness among contact-period western North American hunter-gatherers. Phil. Trans. R. Soc. B 368, 20120412. (doi:10.1098/rstb.2012.0412)

    • 57

      Kline M, Boyd R. 2010 Population size predicts technological complexity in Oceania. Proc. R. Soc. B 277, 2559–2564. (doi:10.1098/rspb.2010.0452)

    • 58

      Read D. 2012 Population size does not predict artifact complexity: analysis of data from Tasmania, Arctic hunter-gatherers, and Oceania fishing groups. UC Los Angeles: Human Complex Systems.

    • 59

      Richerson PJ, Boyd R, Bettinger R. 2009 Cultural innovations and demographic change. Hum. Biol. 81, 211–235. (doi:10.3378/027.081.0306)

    • 60

      Shennan S. 2015 Demography and cultural evolution. In Emerging trends in the social and behavioral sciences: An interdisciplinary, searchable, and linkable resource (eds Scott RA, Kosslyn SM). Dynamic online publication. (doi:10.1002/9781118900772)


    Page 9

    Humans stand out among other animals because we adapt to new environments both by being clever innovators [1] and through the accumulation of cultural knowledge across generations [2,3]. Social learning, including intensive forms such as teaching [4–6], can facilitate cumulative cultural evolution. In fact, low-cost social learning mechanisms, as well as sources of innovation, are prerequisites for the evolution of cumulative culture. For this reason, social learning mechanisms are central to the understanding of cultural evolution—and cultural evolution is key to explaining why and how human ontogeny is so very flexible.

    Culture is a human universal: all societies have shared knowledge, practices, beliefs and rituals that are transmitted socially. At the same time, culture is also a source of psychological and behavioural variation both within and across populations. Developmental processes that are sensitive to socio-environmental influences are one way that flexibility can evolve [7,8], and evolution can produce developmental processes that vary in adaptive ways in terms of the degree and nature of their flexibility [9]. Elaborating on the relationship between culture and development first requires recognizing that evolution and development are not mutually exclusive, then building on that insight to explore how evolved developmental mechanisms that are sensitive to cultural influence can create psychological and behavioural variation across and within societies [8].

    Despite the importance of culture to development, developmental psychology as a field retains a near-absolute focus on development in relatively wealthy Western, English-speaking populations. Henrich et al. [9] term general psychology's participant pool ‘WEIRD:’ Western, educated, industrialized, rich and democratic. A recent review provides evidence that this is also the case in leading developmental psychology journals: more than 90% of study populations represented there are from the USA, Europe and/or are English-speaking [10]. The rest of the world is vastly underrepresented, with only approximately 7% of participant populations coming from non-Western human populations (the remainder are non-human animal populations). In this context, developmental psychologists who pursue cross-cultural research are wisely expanding the scope of research to include participants beyond predominantly Western, upper middle class and often ethnically white participants [9,11,12]. We applaud these efforts—anything less would only perpetuate an incomplete and inaccurate picture of human development.

    Poor sampling, however, is not the only problem in the field. Arnett [11], and Meadon & Spurrett [13] address a lack of inclusivity in the broader practice of psychology: theories, studies and publications in the American Psychological Association journals are all overwhelmingly created, reviewed and edited by this same subset of the world's population. This is one reason why the sampling problem in developmental psychology is not likely to be solved by laboratory-based researchers making the decision to take on cross-cultural work unilaterally, in the short term. Dropping in on communities with unfamiliar cultures to run brief, one-off studies without a long-term reciprocal relationship with the community can be ethically dubious [14], especially where there is a power differential. Further, interpreting results in isolation from a population's daily cultural context can produce more confusion than answers [15]. And yet avoiding these pitfalls requires investing what can be a prohibitive amount of time, effort and funding to start and maintain a field site. A more plausible way to ameliorate psychology's WEIRD problem is to recruit, support, include and collaborate with more scientists from beyond the WEIRD populations that have created the bias in the first place [11,13]. Alternatively, researchers can work with non-university populations nearby, to explore variation among people in their own local context [14]. More generally, researchers who study WEIRD populations must also recognize that their populations are also influenced by culture and should consider carefully how to define the specific population from which they recruit participants. Both these strategies fit with a broader, theoretically motivated approach to expand the inclusiveness of sampling in developmental psychology. 
This paper aims to show why developmental psychology needs this change, and establish some guidelines for how to study culture's role in development, no matter how near or far from home the study site may be.

    Cross-cultural data are expensive to get, but valuable to have. Their rarity in developmental psychology is due to more than a lack of interest in cross-cultural sampling, and we cannot dissolve those very real barriers in this paper. Instead, our goals in this paper are twofold. First, we aim to convince researchers in the field of developmental psychology that considerations of culture are relevant to their work, even if they do not do far-flung fieldwork themselves. Second, for cross-cultural developmental psychologists, we aim to leverage cultural evolutionary theory to enrich the central role of cross-cultural data to developmental psychology as a field. To achieve these aims, we highlight four common but false assumptions in present-day approaches to cultural variation in developmental psychology, and critique each in turn by drawing on cultural evolutionary theory and empirical findings. This step of identifying and refuting these assumptions will help to integrate the ‘cross-cultural’ niche within developmental psychology, in general, by demonstrating how culture and culture-based assumptions underlie some of the basic ideas that motivate research in developmental psychology. Those assumptions are that: (i) universality and uniformity are equivalent: that what is universal must necessarily follow a uniform pattern of development; (ii) Western populations are central in human psychology; (iii) differences among populations in development are always indicative of deficits; (iv) methods can automatically be transported across cultural contexts and yet maintain validity. We critique each assumption in turn, by drawing both on cultural evolutionary theory and on positive examples from the developmental psychology literature. 
In our conclusion section (§8), we summarize a general strategy for research that eschews these assumptions, and argue that this approach can pave the way for an improved science of developmental psychology by placing the cultural nature of humans at its centre.

    The universality assumption is the belief that observed uniformity is evidence for species-wide, biologically based universality. By contrast, any variation is regarded as evidence for culturally derived differences. By ‘universal’, we mean core mental or behavioural attributes shared by humans everywhere [16]. This assumption sometimes takes the form of an explicit claim that uniformity implies genetic underpinnings (often miscategorized as ‘biological’ or ‘evolutionary’), while variation necessarily indicates ‘cultural’ influences [17]. In all its forms, this assumption rests on the false nature/nurture dichotomy, that culture and biology are separate, opposite and competing explanations. In reality, human cultural capacities are part of our biology [18,19]. Equating psychological or behavioural variation with cultural influence precludes a deeper understanding of human behaviour, because a universally shared developmental process can function to produce behavioural or psychological variation. Instead, developmental flexibility and culture are both parts of the biology of human development, not alternative explanations—culture is a part of human biology and development [8].

    This false dichotomy between nature and nurture produces two versions of the universality-as-uniformity assumption: (a) that variation is equivalent to a lack of universality, and (b) that psychological/behavioural similarity is equivalent to universality. For the sake of clarity, we address each in turn.

    This assumption is often implicit in data analysis and study interpretation. For example, researchers conduct cross-site comparisons and conclude that any between-site difference is ‘cultural’, without explaining how culture produces differences in psychology and behaviour. In addition, researchers often treat whole cultures as if they are a single experimental condition, without considering the influence of environmental factors, such as resource availability, wealth or differences in the interpretation of the method (see §6 below). For example, directly comparing norms for anonymous sharing among wealthy Americans with those among poor, food-insecure Polynesian populations may result in differences—but those differences may be due to circumstances specific to resource scarcity, rather than some underspecified aspect of culture. This line of reasoning is not considered sufficient for studies of culture in other animals, and leads to energetic debates about sources of behavioural variation even in our closest living relatives (e.g. [20–22]). However, the same logic is rarely questioned in cross-cultural comparisons of human psychology. While cross-cultural comparisons do contribute to our knowledge of the range of variation in human behaviour, most fall short of understanding the sources and the scale of variation that can emerge via developmental processes—the real question at hand.

    The other side of the universality assumption consists of a belief that uniformity in behaviour and psychology is indicative of universally ‘innate’ traits that develop without cultural inputs.

    When developmental psychologists ask whether a feature is innate, and then seek to show that it emerges early and reliably across human populations, they rely upon assumptions that equate sameness, universality and innateness. By contrast, biologists have recognized notions of innateness as useless in ecology, biology and behaviour since the early 1990s [19]. This rests on a recognition, as Barrett [8, p.157] writes, that ‘…[t]here are not two kinds of things, the innate and the non-innate, but only one, the developmental process itself.’ Put simply, genes rely upon the environment in order to create an organism, and vice versa. In humans, culture is part of that ever-present environment.

    The equation of sameness with universality, and the desire to describe a general human psychology in these terms, have long been a driving philosophy in American psychology [11,16,23]. While valuable as a first pass, documenting similarities across sociocultural contexts is a subpar strategy for data collection when the goal is to understand culture's role in shaping development, or vice versa. Cultural evolutionary theory offers an alternative perspective for shaping research questions: that genes and culture have co-evolved in humans. Because of this ‘dual-inheritance’ system, both genetic and cultural information are essential ingredients in any explanation of human biology. Most developmental psychologists would not argue with this stance, but putting it into action in a research programme is still a challenge. Cultural evolutionary theory is useful in this practical sense, because it provides a working definition of culture that can inform quantitative work: ‘[c]ulture is information capable of affecting individuals' behaviour, that they acquire from other members of their species through teaching, imitation, and other forms of social transmission’ [19, p. 5].

    Cultural evolution's distinction of culture as socially learned information is useful as a research tool because it means developmental psychologists need not ask whether any particular trait is universal, biological and innate, versus cultural. When biology and culture are not opposites, this either/or is a meaningless, and therefore unanswerable, question. Instead, developmental psychology can embrace a transformed question: what is the relative influence of environmental, cultural and other contextual factors on shaping development of specific traits, in a particular population? In other words, how variable and flexible is the development of this trait? Answering this context-rich question through studies that theorize about the functional role of variation will produce a body of evidence on how human psychological development varies. From this, researchers can build a more complete map of human psychological development.

    This view, rooted in cultural evolutionary theory, places flexibility at the centre of understanding what is universal about human psychological development. This provides a theoretically motivated way to predict when and how culture ought to impact development, rather than simply checking Western-based work against non-Western populations and lumping traits that are the ‘same’ as universal, and those that are ‘different’ as cultural.

    Studies of human language acquisition and socialization provide evidence for both variation in a cultural context, and shared developmental processes. Geographically and culturally disparate populations typically speak different languages, and in some cases even show variation in the neurological underpinnings necessary to master and use different languages [24]. The cultural expectations for children as language learners are shaped by their cultural contexts, and in some ways are inseparable from socialization more generally [25]. Language acquisition processes illustrate that developmental processes themselves—such as statistical learning [26]—can constitute universal learning mechanisms, which in turn generate behavioural and psychological variation. The same can be said for children's early learning environments: there are both shared and variable features, cross-culturally. For example, Broesch & Bryant [27,28] find that mothers and fathers across disparate societies routinely modify the properties of their speech when addressing young infants compared to when they address adults, yet they do so in different ways [28]. Despite identifying the existence of infant-directed speech by caregivers in North America, Kenya, Fiji and Vanuatu, they also find that parents vary cross-culturally in the form their infant-directed speech takes. Mothers across diverse societies and rural Vanuatu fathers modified their speech by adjusting features of the perceived pitch of their speech to infants. However, fathers in North America only slowed down the rate of their speech, without adjusting the perceived pitch [27]. The results of this study demonstrate why researchers cannot simply search for universality by equating it with similarity: it is too broad a question, and would lead us to ignore key details about the flexible nature of developmental processes.

    The Western centrality assumption is the belief that Western populations represent a normal and/or healthy standard against which development in all societies can and should be compared. This assumption literally fits the original definition of ethnocentric [29], in that it divides global populations into two rough categories, ‘the West’ and ‘the Rest,’ with Western societies at the centre of everything. This assumption is rarely if ever made explicit in print, but it is worked into the foundation of much developmental research, including the cognitive and medical milestones that serve as guidelines for both Western parents and international health agencies.

    From a cultural evolutionary perspective, lumping Western and non-Western societies into two broad categories of analysis is simply throwing data away. The study of cultural evolution is necessarily built on the study of the cultural history of societies all over the world, because explaining cultural variation requires a breadth of data across socioecological environments ([19]; see e.g. the range of sites included in Mace et al.'s edited volume [30]). From this perspective, every cultural context is an equally valid study site, and the importance of a particular site is down to its specific cultural features and their relevance to the research question. For example, Polynesia's history of step-wise settlement by ocean-faring canoe and its estimable rates of contact among societies make its cultural history an excellent case study on how population interconnectedness can influence the accumulation of complex material culture [31,32]. The key message from cultural evolutionary theory here is that these studies stand alone, and do not require a Western comparison sample to lend them value.

    The Western centrality assumption directly damages the accuracy and usefulness of developmental research. For example, Karasik et al. [33] review how developmental textbooks and medical guidelines employ standards for motor development that are built exclusively on American middle-class samples as prescriptive milestones. Karasik et al.'s data, drawn from six different societies, document within- and between-population variation in both the timing of the motor development of sitting and the social and material contexts that contribute to those differences. This establishes a causal link between context and developmental trajectories. Karasik et al. conclude that using American-centric guidelines as if they are universal has ‘led to a gross misrepresentation of motor development’ (p. 1033). Treating Western samples as a universal measuring stick for development is, unfortunately, a pervasive practice. Greenfield et al. [34] review evidence that developmental trajectories derived from the study of Western populations, with their focus on independence, are unlikely to match how children learn and grow in sociocultural contexts where interdependence is prioritized. This is particularly true for social development. For example, while adolescence may be a transition to autonomy in independence-focused societies, in an interdependent society it is instead a relational shift that makes sense only in the context of kinship and community [34]. Likewise, classic theories of attachment [35] presuppose that the end goal of child development is independence and autonomy, rather than locally appropriate integration into kinship- and community-based interdependent relationships. 
In a review, Keller [36] questions whether these theories hold up when used to explain behaviour in cultural contexts beyond Western societies, and argues that incorporating data from additional populations requires revising existing theory along lines suggested by cultural and evolutionary theories of development.

    The deficit assumption is that population-level differences in developmental timing or outcomes are necessarily caused by something lacking, typically in parenting or educational systems. This line of reasoning allows for no flexibility, and assumes a single, inflexible developmental outcome. The assumption rides the coattails of the Western centrality assumption, in that the timeline that distinguishes ‘normal’ development from ‘delayed’ development is typically anchored on data from Western populations. However, the deficit assumption can also apply to Western populations. For example, Lancy [37] argues that excessive levels of teaching in Western societies may impinge on the development of a child's autonomy. The deficit assumption is also sometimes applied to subpopulations within Western societies, and so has recently become an important domain for self-critique in the field of developmental psychology (see [38]). However, the deficit assumption differs from the Western centrality assumption in two important ways. First, the deficit assumption carries an extra layer of interpretation in comparison to the Western centrality assumption. By this we mean that researchers simultaneously judge a given pattern in development as deviant and also attribute that difference to something that is lacking or missing from a family's or a population's way of raising children. This carries with it a value judgement that goes beyond a scientific approach to describing and explaining variation, and in doing so obscures the science itself. Second, the Western centrality assumption functions only in one direction. By contrast, the deficit assumption can lead researchers to claim that Western children are somehow worse off than non-Western ones. Often this takes the form of arguing that Western children are coddled, spoiled or excessively dependent on direct parent intervention.

    In assuming that group-level developmental differences are due to what is lacking in schooling or parenting, researchers frequently (a) fail to give any evidence for this mechanism beyond handwaving that ‘culture’ is the cause, and (b) in doing so, fail to consider the many specific axes of variation that make up between-population differences. When researchers fail to give a specific cultural mechanism yet attribute differences to ‘culture,’ some of the variation may be due to situation (e.g. resource insecurity) rather than culturally inherited differences (e.g. collective ownership norms). Where this is the case, it is a serious challenge to the validity of cross-cultural comparisons, in that it fails to account for potential confounding variables. Recognizing and controlling for potential confounds is accepted as a crucial component of high-quality research in developmental psychology, with particular attention to detail in experimental studies. The same standard should be applied at the level of cross-cultural comparisons. The risk of neglecting to recognize a confounding variable decreases with a research team's expertise in the local context at their study site. Finally, the deficit assumption reinforces a deeper-seated assumption, (c) that there is one shared, correct outcome for various stages of development, and that this does not vary across populations or across societies.
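
    The confounding problem described above can be made concrete with a toy calculation. The following is only a minimal sketch under stated assumptions: the two sites, the sample sizes and the sharing values are all hypothetical and invented for illustration, and the only rule built in is that the share offered depends on resource security, never on the site. A naive between-site comparison nevertheless shows a gap, which disappears once participants are compared within the same resource-security stratum.

```python
from statistics import mean

# Hypothetical records of (site, resource_secure, share_offered).
# Every number here is invented for illustration; none comes from a real study.
def make_site(site, n_secure, n_insecure):
    # Built-in assumption: the share offered depends only on resource
    # security (0.40 if secure, 0.20 if insecure), never on the site itself.
    secure = [(site, True, 0.40)] * n_secure
    insecure = [(site, False, 0.20)] * n_insecure
    return secure + insecure

# Site A is mostly resource-secure; site B is mostly resource-insecure.
records = make_site("A", 80, 20) + make_site("B", 20, 80)

def site_mean(site, secure=None):
    """Mean share at a site, optionally restricted to one security stratum."""
    return mean(
        share
        for s, sec, share in records
        if s == site and (secure is None or sec == secure)
    )

raw_gap = site_mean("A") - site_mean("B")                     # naive comparison
gap_secure = site_mean("A", True) - site_mean("B", True)      # secure only
gap_insecure = site_mean("A", False) - site_mean("B", False)  # insecure only

print(f"raw between-site gap:      {raw_gap:.2f}")
print(f"gap among resource-secure: {gap_secure:.2f}")
print(f"gap among insecure:        {gap_insecure:.2f}")
```

    In this constructed case the raw gap is non-zero while both within-stratum gaps are zero, so labelling the raw gap ‘cultural’ would misattribute a difference that the sketch generates entirely from resource security. Real data would call for a proper statistical model and local expertise to identify which variables to stratify on.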

    Cultural evolutionary theory instead presents a functionalist perspective. This means that the focus is on how different domains of development fit into both physical maturity and context-dependent social, emotional and relational factors. This emphasis on function in context is shared with dynamic systems theories [39], but an evolutionary approach is further motivated by understanding how developmental processes have emerged over an evolutionary timespan and in comparison to other species. From this perspective, developmental flexibility, including social learning, is part of what allows human culture to evolve faster than the human gene pool [40], and this in turn makes humans adaptable over short timescales [2]. (In contrast with dynamic systems theory, the term ‘adapt’ is almost never used in cultural evolutionary theory to refer to the timescale of a single individual behaving flexibly, but rather it is a population-level concept.) As a result, psychological development is pluralistic by design, and this evolved because flexibility is incredibly useful for a wide-ranging, invasive species like Homo sapiens. Barrett [8] has coined the term ‘designed emergence’ to capture the idea that developmental processes are flexible as a result of evolution by natural selection. Simply put, this means there is a range of healthy, functional outcomes that emerge from developmental processes. Outside of that range, pathology is still possible, especially in cases of extreme abuse or neglect that fall outside the breadth of typical human experience. Specific outcomes are not predetermined by genes, but are instead shaped by the interaction between genes and environment in ways that have been manufactured by natural selection. For developmental psychologists, the take-home message here is that shared processes of human development have a variety of outcomes, and this flexibility in outcomes is a feature rather than a bug. Developmental researchers can leverage this insight to create and evaluate hypotheses about how the form and developmental timing of psychological phenomena fit in functional ways with children's roles in varying sociocultural contexts.

    For example, psychologists have long assumed that direct, active teaching (often characterized by the verbal communication of abstract ideas) is the most efficient way to scaffold learning, and that therefore it must be present in all human societies (for review see [6]). By contrast, some anthropologists have often conflated direct instruction with involuntary, forced transmission, which replaces more enjoyable and (by this account) effective forms of learning by participation ([37,41,42]; see [6] for review). For both accounts, at least some societies have got the wrong answer to how children learn best—and children in those societies are at a deficit.

    Kline [6,43] uses cultural evolutionary theory as a foundation to argue that there are many functionally distinct types of teaching, which can be mixed and matched with learning problems. From this perspective, no single type always provides a ‘best’ outcome for the learner, because it depends on the learning problem at hand. This approach treats development as an integral working part of evolutionary processes, and prioritizes functional and causal explanations of variation. This is in contrast with other evolutionary accounts that explain why humans, and only humans, teach by referring to constraints in other animals. When successful, a cultural evolutionary approach uses the rich and culturally specific interpretations offered by ethnographic research as insights that can inform broader claims about the evolution and nature of human developmental psychology. Taking a functionalist, cultural evolutionary perspective offers power for generating and testing hypotheses in developmental psychology by incorporating the full range of human variation into what developmental psychologists term ‘typical’ development.

    The equivalency assumption is that using identical research methods, scales or questions will automatically produce equivalent and externally valid data, even across disparate cultural contexts. Arnett [11] elaborates on this rationale as the predominant philosophy of science in experimental American psychology: that in the laboratory, it does not matter who the participants are, or where or how they live; it matters only that the procedures within the experiment itself are sufficiently controlled. The equivalency assumption is demonstrably false when taken to the extreme: written methods must be translated, and translation inevitably brings up questions of whether or not there are shared concepts and meanings across sociolinguistic contexts. Non-linguistic methods may avoid the problem of translation, but the question of whether methods and stimuli map to shared concepts, social context and expected behaviour across cultural groups is still an important one. Such comparisons are only useful when the meaning of the protocol is comparable across societies [44–46]. Further, assuming equivalency also means that researchers may fail to account for culturally specific environmental factors in development that are either present in WEIRD contexts but not at their study site, or that are absent in WEIRD contexts and therefore may be unrecognized as important factors at their study sites. For example, while direct verbal instruction may be rare in many non-Western societies, ethnographic studies of development in these contexts reveal a rich, interactive social context in which learning happens via participant observation and inclusion of children in everyday activities [37,41,47]. The social learning mechanisms vary, but learning and developmental change happen in all cultural contexts.

    Cultural evolutionary theory treats the human brain, mind and behaviour as having evolved in the context of human interaction with the world, rich with social and cultural context. Ignoring that this cultural context affects how participants understand and respond to methods is particularly problematic when transporting methodologies across sociocultural contexts that differ in broad ways [16,44,48,49]. This is a problem even for developmental psychologists who do not venture to do cross-cultural work, because it means their methods and their results may be culture-bound and therefore limited in ways they have not explored.

    The equivalency assumption raises a particularly difficult challenge for cross-cultural comparisons in developmental psychology. The standards for experimental control are stringent and technically demanding. For example, effect sizes and statistical significance for studies with infants can depend on looking times that differ in terms of milliseconds. These tasks often require electricity, delicate equipment, trained personnel and quiet laboratory space to run effectively. However, even a perfectly replicated and controlled methodology cannot guarantee that participants from two different sociocultural contexts are interpreting the situation in similar ways and therefore the behaviours observed may not be comparable.

    As Heine and co-workers [44,50] conclude, there is no straightforward solution for this broad problem of context-specific methodological validity. Instead, establishing real comparability across populations requires more context, not less—and this means bringing ethnography into the picture as a standard resource to inform the design and interpretation of studies in developmental psychology. Cultural evolutionary research may seem an unlikely resource for addressing this methodological challenge because the field has no signature methodology of its own: for example, its studies of learning biases draw upon established psychological methods, and its studies of behaviour build on human behavioural ecology and animal behaviour. The formal mathematical models that established the field are themselves built on established models in epidemiology and genetics. The field is so thoroughly interdisciplinary that some cultural evolutionists have even proposed a division of labour within cultural evolutionary studies that subsumes existing disciplines [51]. We advocate instead for a mixed-methods approach, deploying methods in combinations that strategically compensate for the particular shortcomings of each method, and that are suitable for the research problem at hand. This is standard practice in some areas of social science, including the anthropological sciences, where both qualitative and quantitative data and analyses are used as needed [52].

    For example, researchers often treat mutual eye gaze between infant and caretaker as a reliable and stand-alone indicator of joint attention in the study of infant cognition. However, Akhtar & Gernsbacher [53] point out that the social role of eye gaze is variable across cultural contexts, and hence is not always a reliable indicator of joint attention. North Americans typically privilege eye contact and verbal interaction as a key part of parenting [54], but Gusii mothers in Kenya avert their eyes in response to mutual eye gaze with an excited infant, in part to keep their babies calm [55]. According to LeVine & LeVine [55], gaze avoidance by mothers is consistent with polite behaviour by Gusii adults, where excessive eye contact is considered rude and sometimes even aggressive. Gaze avoidance does not mean Gusii mothers are inattentive to their infants, but rather that they do not use mutual gaze as a means of establishing joint attention. Instead, they may use more physical types of interaction—a typical Gusii mother cosleeps with her infant, breastfeeds on demand and responds quickly to her infant's distress. Based on Lancy's review of the ethnographic literature on children and childhood [54], the Gusii approach of using more tactile contact and gestural communication may be more typical around the world than the North American approach, which emphasizes eye contact and verbal communication. An excessive focus on eye gaze as the key element in joint attention (e.g. [56]) may twist the scientific understanding of joint attention by underestimating its prevalence in societies where eye gaze is less important than in North American contexts.

    Rather than the narrowly Western-centric cue of eye gaze, vocal and postural behaviours may represent a more culturally generalizable set of cues for the study of infant social cognition [53]. In fact, gestural, postural and vocal cues may play an important role in Western contexts, but one that is de-emphasized in developmental psychology as a reflection of North American culture. However, the plurality of methodological approaches suggested by cultural evolutionary theory means there is another option besides searching for a single cue (or a single set of cues) that always indicates joint attention across sociocultural contexts. Instead, researchers should use an array of cues, designed for particular sociocultural contexts, to compare the prevalence and behavioural form of joint attention across human populations. Using identical methods based on culturally specific cues will produce only superficially comparable data, and will produce a misleading picture of the ways in which populations vary.

    For each assumption above, we offer a shift in perspective that uses cultural evolutionary theory to pry those assumptions loose from present-day developmental psychological research. For standard developmental psychology, this means seeing the culture-bound nature of the questions, methods and results, and appropriately characterizing the generalizability of the research given the limited samples. For cross-cultural developmental psychology, this means guarding against some of the assumptions that are common in psychology more generally, and employing cultural evolutionary theory to improve how cross-cultural research is designed, conducted and interpreted.

    Using this approach, researchers can take some small steps to remediate the sampling problem in developmental psychology. Researchers working at institutions in WEIRD societies can step off campus and sample populations in their own towns, and in doing so can increase the inclusivity of their samples with a moderate level of investment in community engagement. They can also collaborate with and learn from colleagues at institutions outside of North America and Western Europe, working with scholars who are both highly trained academics and regional experts in the societies in which they work and live. We do not argue that researchers should avoid studying or drawing comparisons between WEIRD populations and additional populations around the world. Instead, we argue that carefully specifying the meanings of cross-cultural studies, using cultural evolutionary theory, may open up a rich avenue for comparative research. This includes comparisons both within and between populations, to look for robust relationships between cultural variation and corresponding psychological, behavioural and developmental variation. Data of this kind will allow researchers to study just how flexible human psychological development may be, because they allow us to ask whether the same causal relationships hold for development across populations, or whether the relationships and processes themselves are flexible. In essence, this approach ties the form of developmental flexibility to the sociocultural and ecological contexts in which human psychology functions over the lifespan.

    Researchers before us have tackled the question of appropriate cross-cultural comparisons, with a similar emphasis on the need for strategic selection of field sites and research problems (see e.g. [9,16]). In addition to these existing recommendations, we caution against any approach that treats entire ‘cultures’ or nations as indivisible wholes that are culturally, psychologically or behaviourally homogeneous. Rather than comparing whole ‘cultures,’ researchers should aim to map variation both within and across populations, along measurable axes of variation. This is especially applicable to broad cross-site surveys, which often include only coarse measures of cultural variation (e.g. gross domestic product, Gini coefficient or years of education), treat single sites as representative of entire countries, and further conflate those countries with ‘cultures.’ However, it is equally applicable to studies restricted to Western populations, where researchers can both expand the inclusivity of their samples, and be more explicit about the degrees of variation included in those samples. Both these practices will lead to better science in developmental psychology. By placing cultural context—and the flexibility that it entails—at the centre of this work, researchers will gain a deeper understanding of the developmental processes that build human cultural variation.

    The overarching message from a cultural evolutionary perspective is that developmental trajectories and endpoints can vary due to the human ability to learn flexibly, to acquire information from others, and to recombine socially and individually learned information in creative ways. Using this as a springboard, developmental psychologists are well positioned to explore the developmental mechanisms and processes by which human children adapt to their local sociocultural and environmental contexts. Doing so will mean shedding light on one of the broadest human universals of all: variability.

    This article has no additional data.

    M.A.K. conceived of and drafted the manuscript. R.S. and T.B. both made intellectual contributions prior to the manuscript's first draft, and made edits and contributions to manuscript drafts. R.S. and T.B. contributed equally. All the authors approved the final version of this manuscript.

    We declare we have no competing interests.

    This research was made possible through the support of a grant from the John Templeton Foundation to the Institute of Human Origins at Arizona State University (no. 14020515). The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.

    We would like to thank Central European University's Department of Cognitive Science, for inviting the authors to a Social Mind Institute Workshop, which led to the formation of some of the early ideas for this paper.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    References

    • 1

      Pinker S. 2010 The cognitive niche: coevolution of intelligence, sociality, and language. Proc. Natl Acad. Sci. USA 107(Suppl. 2), 8993–8999. (doi:10.1073/pnas.0914630107)

    • 2

      Boyd R, Richerson PJ, Henrich J. 2011 Colloquium Paper: The cultural niche: why social learning is essential for human adaptation. Proc. Natl Acad. Sci. USA 108(Suppl. 2), 10 918–10 925. (doi:10.1073/pnas.1100290108)

    • 3

      Henrich J. 2015 The secret of our success: how culture is driving human evolution, domesticating our species, and making us smarter. Princeton, NJ: Princeton University Press.

    • 4

      Tennie C, Call J, Tomasello M. 2009 Ratcheting up the ratchet: on the evolution of cumulative culture. Phil. Trans. R. Soc. B 364, 2405–2415. (doi:10.1098/rstb.2009.0052)

    • 5

      Dean LG, Vale GL, Laland KN, Flynn E, Kendal RL. 2013 Human cumulative culture: a comparative perspective. Biol. Rev. 89, 284–301. (doi:10.1111/brv.12053)

    • 6

      Kline MA. 2015 How to learn about teaching: an evolutionary framework for the study of teaching behavior in humans and other animals. Behav. Brain Sci. 38, 1–70. (doi:10.1017/S0140525X14001071)

    • 7

      Jablonka E, Lamb MJ. 2014 Evolution in four dimensions, 2nd edn. Cambridge, MA: MIT Press.

    • 8

      Barrett HC. 2014 The shape of thought. Oxford, UK: Oxford University Press.

    • 9

      Henrich J, Heine SJ, Norenzayan A. 2010 The weirdest people in the world? Behav. Brain Sci. 33, 61–83. (doi:10.1017/S0140525X0999152X)

    • 10

      Nielsen M, Haun D, Kartner J, Legare CH. 2017 The persistent sampling bias in developmental psychology: a call to action. J. Exp. Child Psychol. 162, 31–38. (doi:10.1016/j.jecp.2017.04.017)

    • 11

      Arnett JJ. 2008 The neglected 95%. Am. Psychol. 63, 602–614. (doi:10.1037/0003-066X.63.7.602)

    • 12

      Nielsen M, Haun D. 2015 Why developmental psychology is incomplete without comparative and cross-cultural perspectives. Phil. Trans. R. Soc. B 371, 20150071. (doi:10.1098/rstb.2015.0071)

    • 13

      Meadon M, Spurret D. 2010 It's not just the subjects—there are too many WEIRD researchers. Behav. Brain Sci. 33, 104–115. (doi:10.1017/S0140525X10000208)

    • 14

      Fernald A. 2010 Getting beyond the ‘convenience sample’ in research on early cognitive development. Behav. Brain Sci. 33, 91–92. (doi:10.1017/S0140525X10000294)

    • 15

      Rai TS, Fiske A. 2010 ODD (observation- and description-deprived) psychological research. Behav. Brain Sci. 33, 106–107. (doi:10.1017/S0140525X10000221)

    • 16

      Norenzayan A, Heine SJ. 2005 Psychological universals: what are they and how can we know? Psychol. Bull. 131, 763–784. (doi:10.1037/0033-2909.131.5.763)

    • 17

      Apicella CL, Barrett HC. 2016 Cross-cultural evolutionary psychology. Curr. Opin. Psychol. 7, 92–97. (doi:10.1016/j.copsyc.2015.08.015)

    • 18

      Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.

    • 19

      Richerson PJ, Boyd R. 2005 Not by genes alone: how culture transformed human evolution. Chicago, IL: University of Chicago Press.

    • 20

      Whiten A, Horner V, Marshall-Pescini S. 2003 Cultural panthropology. Evol. Anthropol. 12, 92–105. (doi:10.1002/evan.10107)

    • 21

      Langergraber KE, Vigilant L. 2011 Genetic differences cannot be excluded from generating behavioural differences among chimpanzee groups. Proc. R. Soc. B 278, 2094–2095. (doi:10.1098/rspb.2011.0391)

    • 22

      Langergraber K, Schubert G, Rowney C, Wrangham R, Zommers Z, Vigilant L. 2011 Genetic differentiation and the evolution of cooperation in chimpanzees and humans. Proc. R. Soc. B 278, 2546–2552. (doi:10.1098/rspb.2010.2592)

    • 23

      Shweder RA. 1999 Why cultural psychology? Ethos 27(1), 62–73. (doi:10.1525/eth.1999.27.1.62)

    • 24

      Ge J, Peng G, Lyu B, Wang Y, Zhuo Y, Niu Z, Tang LH. 2015 Cross-language differences in the brain network subserving intelligible speech. Proc. Natl Acad. Sci. USA 112, 2972–2977. (doi:10.1073/pnas.1416000112)

    • 25

      Schieffelin B, Ochs E. 1986 Language socialization. Annu. Rev. Anthropol. 15, 163–191. (doi:10.1146/annurev.an.15.100186.001115)

    • 26

      Saffran JR, Aslin RN, Newport EL. 1996 Statistical learning by 8-month-old infants. Science 274, 1926–1928. (doi:10.1126/science.274.5294.1926)

    • 27

      Broesch T, Bryant GA. 2017 Fathers' infant-directed speech in a small-scale society. Child Dev. (doi:10.1111/cdev.12768)

    • 28

      Broesch TL, Bryant GA. 2015 Prosody in infant-directed speech is similar across western and traditional cultures. J. Cogn. Dev. 16, 31–43. (doi:10.1080/15248372.2013.833923)

    • 29

      LeVine RA. 2001 Ethnocentrism. In International encyclopedia of the social and behavioral sciences (eds Smelser NJ, Baltes PB), pp. 4852–4854. Oxford, UK: Oxford University Press.

    • 30

      Mace R, Holden C, Shennan S (eds). 2005 The evolution of cultural diversity: a phylogenetic approach. Walnut Creek, CA: Leftcoast Press.

    • 31

      Kline MA, Boyd R. 2010 Population size predicts technological complexity in Oceania. Proc. R. Soc. B 277, 2559–2564. (doi:10.1098/rspb.2010.0452)

    • 32

      Henrich J et al. 2016 Understanding cumulative cultural evolution. Proc. Natl Acad. Sci. USA 113, E6724–E6725. (doi:10.1073/pnas.1610005113)

    • 33

      Karasik LB, Tamis-LeMonda CS, Adolph KE, Bornstein MH. 2015 Places and postures. J. Cross Cult. Psychol. 46, 1023–1038. (doi:10.1177/0022022115593803)

    • 34

      Greenfield PM, Keller H, Fuligni A, Maynard A. 2003 Cultural pathways through universal development. Annu. Rev. Psychol. 54, 461–490. (doi:10.1146/annurev.psych.54.101601.145221)

    • 35

      Bowlby J. 1989 Attachment theory. Los Angeles, CA: Lifespan Learning Institute.

    • 36

      Keller H. 2013 Attachment and culture. J. Cross Cult. Psychol. 44, 175–194. (doi:10.1177/0022022112472253)

    • 37

      Lancy DF. 2010 Learning ‘from nobody’: the limited role of teaching in folk models of children's development. Childhood Past. 3.1, 79–106. (doi:10.1179/cip.2010.3.1.79)

    • 38

      Akhtar N, Jaswal VK. 2013 Deficit or difference? Interpreting diverse developmental paths: an introduction to the special section. Dev. Psychol. 49, 1–3. (doi:10.1037/a0029851)

    • 39

      Smith LB. 1993 A dynamic systems approach to development: applications. Cambridge, MA: The MIT Press.

    • 40

      Perreault C. 2012 The pace of cultural evolution. PLoS ONE 7, e45150. (doi:10.1371/journal.pone.0045150)

    • 41

      Paradise R, Rogoff B. 2009 Side by side: learning by observing and pitching in. Ethos 37, 102–138. (doi:10.1111/j.1548-1352.2009.01033.x)

    • 42

      Rogoff B, Matusov E, White C. 1996 Models of teaching and learning: participation in a community of learners. In The handbook of education and human development: New models of learning, teaching and schooling (eds Olson DR, Torrance N), pp. 388–414. Oxford, UK: Blackwell.

    • 43

      Kline MA. 2016 TEACH: an ethogram-based method to observe and record teaching behavior. Field Methods 29, 205–220. (doi:10.1177/1525822X16669282)

    • 44

      Heine SJ, Norenzayan A. 2006 Toward a psychological science for a cultural species. Perspect. Psychol. Sci. 1, 251–269. (doi:10.1111/j.1745-6916.2006.00015.x)

    • 45

      Pepitone A, Triandis HC. 1987 On the universality of social psychological theories. J. Cross Cult. Psychol. 18, 471–498. (doi:10.1177/0022002187018004003)

    • 46

      Poortinga YH. 1989 Equivalence of cross-cultural data: an overview of basic issues. Int. J. Psychol. 24, 737–756. (doi:10.1080/00207598908246809)

    • 47

      Rogoff B, Paradise R, Arauz R, Correa-Chávez M, Angelillo C. 2003 Firsthand learning through intent participation. Annu. Rev. Psychol. 54, 175–203. (doi:10.1146/annurev.psych.54.101601.145118)

    • 48

      Cohen D. 2007 Methods in cultural psychology. In Handbook of cultural psychology, pp. 196–236. London, UK: The Guilford Press.

    • 49

      Greenfield PM. 1997 Culture as process: empirical methods for cultural psychology. In Handbook of cross-cultural psychology: theory and method (eds Berry JW, Poortinga YH, Pandey J), pp. 301–346. Boston, MA: Allyn & Bacon.

    • 50

      Heine SJ, Lehman DR, Peng K, Greenholtz J. 2002 What's wrong with cross-cultural comparisons of subjective Likert scales?: The reference-group effect. J. Pers. Soc. Psychol. 82, 903–918. (doi:10.1037/0022-3514.82.6.903)

    • 51

      Mesoudi A, Whiten A, Laland KN. 2006 Toward a unified science of cultural evolution. Behav. Brain Sci. 29, 329–383. (doi:10.1017/S0140525X06009083)

    • 52

      Bernard HR. 2011 Research methods in anthropology: qualitative and quantitative approaches, 5th edn. New York, NY: Altamira Press.

    • 53

      Akhtar N, Gernsbacher MA. 2008 On privileging the role of gaze in infant social cognition. Child Dev. Perspect. 2, 59–65. (doi:10.1111/j.1750-8606.2008.00044.x)

    • 54

      Lancy DF. 2008 The anthropology of childhood. Cambridge, UK: Cambridge University Press.

    • 55

      LeVine RA, LeVine S. 1996 Child care and culture: lessons from Africa. Cambridge, UK: Cambridge University Press.

    • 56

      Tomasello M, Carpenter M, Call J, Behne T, Moll H. 2005 Understanding sharing intentions: the origins of cultural cognition. Behav. Brain Sci. 28, 675–735. (doi:10.1017/S0140525X05000129)



    Humans reproduce far less than their physiological capacity allows [1]. One indicator of this tendency is a cross-culturally common practice of women stopping reproduction well before they reach menopause [2,3]. Such behaviours appear puzzling from evolutionary perspectives that anticipate women making full use of their reproductive careers to maximize fitness. In this paper, we use two evolutionary frameworks—human behavioural ecology (HBE) and cultural evolutionary theory (CET)—to explore the timing of reproductive cessation among the Mosuo of Southwest China. As discussed in §4, the Mosuo have been officially constrained to a maximum of three children per woman since the implementation of the Chinese fertility policy in the late 1970s. They also consist of two distinct subpopulations—one matrilineal and duolocal and one patrilineal and patrilocal—that reside in distinct geographical areas with very different terrains that affect the spread of information among communities. We make use of these differences (i.e. the fertility policy and differences in kinship ecologies) as quasi-natural experiments [4] to perform simultaneous tests of hypotheses drawn from HBE and CET frameworks.

    Just as there are no singular evolutionary predictions, there are no singular behavioural ecological or cultural evolutionary predictions for a given phenomenon, including age at last birth (ALB). Rather, several behavioural ecological and cultural evolutionary models with diverse assumptions suggest different predictions. Human behavioural ecologists emphasize the concept of trade-offs [5] and strategic negotiations [6] in their hypothesis building. CET has traditionally focused hypothesis building on the dynamics of change and the ways that individuals acquire information from others.

    The compatibilities and disjunctures between evolutionary frameworks, including HBE and CET, have been debated [7–10], but there is increasing interest in integrating these frameworks [8,11]. The most common empirical attempts at integration (e.g. [12–14]) are arguably in line with Tinbergen's [15] call to establish comprehensive understandings of behaviour, by exploring both proximate (mechanism and ontogeny) and ultimate (evolutionary history and current adaptive value) explanations for a given trait. In this type of integration, HBE may be viewed as seeking ‘ultimate’ explanations for an adaptively plastic trait, whereas CET explanations that focus on social transmission may be viewed as ‘proximate’ mechanisms describing how traits are spread [10]. The approach is integrative in the sense that it recognizes that these answers are necessarily mutually compatible as answers at one level ‘cannot be regarded as also answering another’ [8, p. 716], while accommodating the fundamental interest in HBE of identifying the adaptive value of a behaviour in its current environment [16] versus the focus on how behaviours are spread that is central to CET [17].

    This type of integration falls short of truly synthetic approaches envisioned by recent proponents of a ‘new’, or extended, evolutionary synthesis. One of these proponents' main goals is to investigate the ways that other inheritance systems, such as cultural ones, affect evolutionary dynamics and adaptation [8,9]. Such approaches point out that processes other than natural selection can lead to non-genetic adaptations [17]. This is because cultural traits or institutions, somewhat analogously to genes, can be differentially successful at replicating if they enhance their hosts’ survival or reproduction, or if they are preferentially copied. The simplest examples of cultural adaptations are tools, such as arrowheads, that evolve to become increasingly adept at performing a function in a given environment. Institutions can be thought of in a similar way if those that provide individual or group benefits in a given environment are more likely to be adopted or persist (e.g. [18,19]). This pushes cultural processes into the domain of ultimate explanation. Complex coevolutionary models between genes and cultural traits and adaptive lags in quickly changing environments arguably dissolve the neat distinction between ultimate and proximate explanations for various phenomena [10,20].

    In our view, the synthetic value of this approach applies most clearly to understanding the dynamics of evolutionary processes and less clearly affects inferences drawn about the current adaptive value of a trait as it is measured at one place and time (see also [21]). For this project, we lack the intergenerational data necessary to explore the dynamic interactions between transmission and long-term adaptive behaviour. With cross-sectional data from women of varying ages who were subject to the changing institutional landscapes of twentieth century China, we examine (i) how reproductive behaviour changes in response to cultural shifts, (ii) some possible mechanisms of such change and (iii) whether such behaviours are consistent with specific adaptive models of decision-making. While we cannot accomplish an ‘extended’ synthesis, we recognize the importance of multiple levels of causality for the observed variation in ALB. Simultaneous consideration of HBE and CET frameworks is thus, on the one hand, pragmatic [8]; at the same time, it represents a true step forward in attempting to explore both the current utility and the transmission dynamics of ALB strategies.

    While we tried to devise comprehensive sets of hypotheses before seeing the data, our analysis is largely exploratory. We review several hypotheses drawn from HBE and CET to posit explanations for the pathways influencing spatial and temporal variation in the timing of ALB. We focus on a population of ethnic Mosuo in Southwest China where fertility has been officially restricted for nearly 40 years. This allows us to examine reproductive timing decisions in a context where they are unlikely to reflect downstream effects of decisions made with respect to total fertility. Furthermore, the context affords the opportunity to study the impacts of (i) top–down institutional policy changes such as the ‘one child policy’ on temporal variation, (ii) matri- versus patri-focal kinship institutions on spatial variation and (iii) educational variation on both spatial and temporal variation in individuals' strategies towards the end of their reproductive careers.

    Several evolutionary models used by human behavioural ecologists suggest possible adaptive motivations for earlier reproductive stopping. Such models have close ties to related phenomena, including fertility decline and the evolution of menopause and a long post-reproductive lifespan. First, earlier reproductive stopping may be used to manage quality/quantity trade-offs when intermediate levels of reproductive output maximize fitness [22,23]. The decision to terminate reproduction at an earlier age could be consistent with a quality-focused strategy if it allowed parents to invest more intensively in existing offspring, particularly while their energetic reserves are relatively high [24]. On the other hand, later ALB could also be construed as an investment in offspring quality if this were associated with increased time-sensitive parental inputs (e.g. breastfeeding) into offspring who are more widely spaced, or higher-quality parenting after accumulation of greater social or economic capital. Particularly in the former case, we would anticipate a corresponding shift to lower lifetime reproductive output and improved indicators of child quality (e.g. child health). Mattison et al. [25] have argued previously that earlier reproduction is associated with increased availability of potential allocarers among the matrilineal Mosuo. If so, we might anticipate that the presence of allocarers favours later ALB, reflecting a longer reproductive span with no trade-off in child quality.

    In the absence of strong quality–quantity trade-offs, a more potent consideration in evolutionary models concerns the effects of demography on reproductive timing. Earlier reproduction is favoured in stationary and growing populations. This idea is encapsulated by Fisher's [26] concept of reproductive value [5]: earlier-born offspring represent a greater marginal benefit to parental reproductive success (RS) than later-born offspring and are also favoured, given sufficient spacing between offspring, due to future discounting [5]. This is because earlier bouts of reproduction shorten generation times and earlier-born individuals constitute a higher relative proportion of the population gene pool [27–29]. Delayed reproduction also increases the likelihood that a woman will die before the birth or maturation of her next child [30]. Timing is thus often as important a determinant of fitness as the total reproductive output, which is only a true reflection of fitness in stationary populations [5]. Thus, in a society where total fertility is tightly regulated, but where the population was still growing, as was the case after the implementation of various fertility restriction policies in China, we anticipate earlier timing of births being beneficial with respect to fitness.

    A third class of evolutionary game theoretic models focuses on reproductive negotiations between members of a household. Sometimes termed ‘cooperative breeding models’, various accounts posit that the cost of reproduction may be alleviated by the availability of allocarers (e.g. maternal grandmothers or older daughters) who may promote continued reproduction [31–33]. There is some empirical support for the importance of older daughters' contributions to their mothers' reproductive success and prolonged reproductive careers. For example, Bereczkei & Dunbar [34] showed later ALB for Roma women who had first-born daughters as compared to those with first-born sons and interpreted this as evidence of first-born daughters acting as ‘helpers at the nest’. Similarly, Turke [35] found later ALB and higher lifetime RS for families with first-born daughters among the matrilocal Ifaluk of Micronesia.

    However, within cooperative systems, there is often conflict and competition over household resources and the recipients of allocare [3,36], which could also bear on timing of ALB. Cant & Johnstone [36] have argued that women who recently married patrilocally, and are thus genetically unrelated to their marital households, have a higher stake in the competition over who gets to reproduce than do their mothers-in-law, who by later life would be related to many household members. Thus, ALB is predicted to be earlier in patrilocal contexts because mothers-in-law cede the competition to their daughters-in-law. In matrilocal contexts, by contrast, a woman would be as related to any offspring she produced as to those produced by her parents, which would result in the conflict being resolved in favour of a woman's mother, thus favouring later ALBs. Snopkowski et al. [3] did not find evidence to support these predictions when they compared ALB and the age of menopause between matrilocal and patrilocal women in Indonesia. Stronger evidence for a conflict model of reproductive cessation was found in a study of pre-industrial Finns, where intergenerational overlap in reproduction led to decreased offspring survivorship [37]. In the Chinese context, we might anticipate such reproductive conflicts, and therefore the associations between ALB and residence strategy, to have been stronger before reductions in fertility reduced the potential for reproductive overlap.

    The above models are agnostic as to the ways that people achieve these equilibria or fitness-enhancing strategies. While individual learning, evolved psychological mechanisms and physiological responses may play roles, it is likely that cultural learning contributes to the spread of strategies about timing of ALB. This is because (i) learning what is optimal reproductive timing in a vast range of socio-ecological conditions is a difficult task that does not lend itself to trial-and-error learning and (ii) the moralization of reproductive behaviours is cross-culturally pervasive [38,39] and requires learning local norms in order to coordinate with others. Several empirical studies have found evidence for social transmission affecting reproductive behaviours. While Alvergne et al. [13] found that social transmission of contraception norms was less important than individual factors such as parity and education in explaining the uptake of contraception in Ethiopia, education itself may show an effect because it is an avenue for social learning. Colleran and co-workers [12] find that social influences were important both with respect to the uptake of contraception and in relation to the specific form of contraception that women chose to use in rural Poland. This study also reinforced the importance of community-level characteristics (e.g. mean level of education) in affecting the dynamics of social transmission. Howard & Gibson [14] also show that female genital cutting (FGC) in West Africa plausibly persists due to mechanisms of cultural transmission and that the adaptive value of FGC appears higher in contexts where the practice is more common, suggesting the importance of coordinating norms. Several models have suggested that the increased importance of educational systems with teachers who act as cultural models and have lower fertility may be implicated in fertility declines. 
Empirical evidence supports the importance of both group- [12] and individual-level [40] education in fertility decline. This pattern is consistent with accounts that stress social models (such as teachers) with lower fertility behaviours serving as vectors of new norms and with accounts that stress the new economic and reproductive trade-offs for individually educated women. To our knowledge, pathways illuminating how ALB is spread have not previously been explored. In this paper, we inspect patterns of spatial clustering across villages and temporal change in ALB to draw inferences about the role of social transmission in the timing of ALB among the Mosuo.

    Before we test more specific hypotheses derived from HBE and CET frameworks, we examine whether there is spatial (across villages) and temporal (across cohorts) variation in ALB. There are many reasons why ALB might be structured at the village level. This could be due to village-level norms surrounding reproductive strategies specifically (e.g. regarding the timing of births or the length of reproductive careers), or other norms that have downstream effects on reproduction (e.g. education is important, so reproductive careers should be pushed back). Villages could also differ in their ecologies (e.g. terrain ruggedness) in ways that would affect reproductive timing (e.g. because women engage in more strenuous labour in some villages, which prevents them from conceiving as easily at older ages). The most obvious reason for temporal variation in ALB in this context is the top–down implementation of the Chinese fertility policy in the late 1970s that restricted fertility to a maximum of three children for ethnic minorities such as the Mosuo living in rural areas [41]. The wan-xi-shao (late–long–few) messaging of the earlier part of the decade, by which the government encouraged voluntary later births, longer birth intervals and fewer children, resulted in lower fertility in China [42,43] and might be associated with shifts in ALB. Even before these national policies, in the 1940s and 1950s, the Mosuo exhibited lower fertility than their neighbours [44], suggesting that local reproductive norm differences may have already been in play. Furthermore, global trends associated with the demographic transition, including increasing importance of education, are likely to play a role in any cohort changes we might find [45].

    All of these cultural and institutional shifts changed the fitness landscapes for different reproductive strategies among the Mosuo. Possible implications of different findings are outlined below in H1–H5.

    H1. Temporal variation in ALB across the twentieth century will be partly explained by the late–long–few messages of the 1970s and by the so-called one-child policy, whose implementation began in 1979. This can produce various patterns.

    • (a) Top–down policies to reduce fertility may result in earlier ALB if earlier reproduction is favoured [25], as parents move to front-load reproduction. Furthermore, people ‘caught’ in the middle of their reproductive careers when policies were enacted might have earlier last births simply by virtue of having met the fertility quota.

    • (b) If fertility policies served as a cue that population sizes were likely to start decreasing, people may have pursued later fertility schedules, as later ALB is hypothesized to be advantageous in shrinking populations [5].

    • (c) Bottom–up shifts in fertility may have preceded policy implementation [43,46]; if so, then there may be no clear association between policy implementation and ALB.

    H2. Spatial variation in village-level ALB will be partly explained by kinship systems. If reproductive conflict between women in stable reproductive unions is a determinant of female ALB, we expect earlier ALB in patrilocal communities.

    H3. Education both changes life-history trade-offs and introduces new norms.

    • (a) If the higher embodied capital and norms associated with education mean pushing reproductive careers later, we would expect a positive association between education and ALB, controlling for fertility.

    • (b) Insofar as education accounts for the historical shifts associated with demographic transitions, we expect cohort effects to drop out once we account for education.

    H4. Differences seen in ALB might be by-products of other changes in reproductive decisions that shift in response to policies or local socio-ecology.

    • (a) If ALB changes as a consequence of total fertility, we expect the effects of cohort, education and village-level variance to attenuate once we account for fertility.

    • (b) If ALB changes as a consequence of a later shift in reproductive career, we expect AFB to show similar cohort effects to ALB.

    Finally, patterns of spatial and temporal heterogeneity can also speak to how ALB and related norms are culturally transmitted and spread. Villages in the matrilineal region are in less rugged and more accessible terrain than those in the patrilineal region. Therefore we might expect that:

    H5. Information spreads more quickly in matrilineal than patrilineal regions and in villages close to the major market town. This may be manifested as:

    • (a) greater homogeneity in ALB in matrilineal regions or

    • (b) faster changes in ALB in matrilineal regions and in villages nearer to the market town.

    The data for this study were collected over nine months in 2008 in both patrilineal and matrilineal communities of the ethnic Mosuo of Southwest China. The Mosuo are one of 55 officially recognized ethnic minorities in China [47] now numbering over 40 000 [40] and residing on the border of Sichuan and Yunnan Provinces surrounding the picturesque Lugu Lake. The matrilineal subpopulation of the Mosuo are best known to social scientists: this subpopulation resides in the flatlands of the Hengduan Mountains, practises matrilineal descent, in which lineage membership is conferred via females [48] and inheritance, in which resources are passed on from senior generations of matrilineally related individuals to all junior members of the household [49]. Under typical circumstances, only descendants of household females would be present to inherit, as husbands and wives normatively maintain separate residences (i.e. they are duolocal), practising non-committal reproductive unions known as ‘walking marriages’ [50], whereby children that result from these unions reside with their mothers. The patrilineal Mosuo reside in distinct but neighbouring geographical regions, in steeper areas of the Hengduan Mountains. Although they share much in common with their matrilineal counterparts, including language, attire, certain customs, religious beliefs and even blood relations, they differ almost entirely in their systems of inheritance, descent and marriage [51]. Descent and inheritance are patrilineally reckoned, marriage is monogamous, and postmarital residence normatively patrilocal. The terrain is also especially rugged, movement between villages correspondingly difficult, and access to the market town quite limited. Land is more circumscribed for individual households due to this rugged terrain [52]. 
The patrilineal Mosuo are, on the aggregate, poorer than the matrilineal Mosuo, whose wealth is also more variable; two Mosuo villages at the time the study was deployed were particularly wealthy due to the influence of tourism (see also [53]). Wealth differences increased beginning in the 1980s, when tourism became an important source of income, and probably affected a relatively small fraction of Mosuo families prior to that. Our test of the conflict model of timing in ALB is thus anchored in a setting where the major distinctions in subpopulations are based on kinship, and possibly ecology, as opposed to broader normative distinctions in cultural ideologies.

    By contrast, the Chinese fertility policy provides a relatively clean test of how a pinpointed change in the cultural regulation of reproduction may have affected ALB. Neither subpopulation of the Mosuo could be considered to display natural fertility [54]. Barrier contraception is used commonly to regulate the timing of childbearing, as is tubal ligation when no further children are desired. Since ca. 1979, the Mosuo have been subject to the Chinese fertility policy. As a minority population, they were allowed a maximum of three children at the time of our surveys. Although the implementation of the fertility policy was variable across China [55], it left a clear signature among the Mosuo, with declines in fertility evident beginning with cohorts born after 1950 [56,57]. The implementation of this policy thus provides a ‘natural experiment’ [4] through which to investigate the effects of rapid cultural transitions on reproductive behaviour.

    To do so, we examine the demographic records of women drawn from 12 villages in both patrilineal (N = 5 villages) and matrilineal (N = 7 villages) subpopulations of the ethnic Mosuo. These records were obtained via direct interviewing of a single respondent in each of 228 households. Interviews were administered in Mandarin Chinese or in the local dialect by a member of the research team, or translated into Naru, the Mosuo language, by a local assistant [52]. Each respondent was asked to provide information on all individuals who had been born in the household, even if they resided elsewhere at the time the survey was given. In these analyses, we focus on a subset of data pertaining to stopping reproduction.

    To study ALB, we examine a sample of reproductive women from the 2008 household surveys (n = 320 women). This sample includes all resident women who were at least 30 years old, who had had one or more children and who had complete data on the modelled variables. Variables include ALB—our dependent variable—and the following covariates: age; cohort; education; normative lineality of community of residence (matrilineal versus patrilineal) and village of residence; distance to market town; fertility (number of surviving children); and age at first birth (AFB); see table 1 for summary statistics and detailed descriptions.

    Table 1. Descriptive statistics of sample population, broken down by lineality.

    | variable name | matrilineal | patrilineal | combined | variable description |
    |---|---|---|---|---|
    | | mean (s.d.) | mean (s.d.) | mean (s.d.) | |
    | age | 48.6 (14.8) | 47.7 (13.4) | 48.4 (14.6) | approximate age at time of interview |
    | fertility | 2.8 (1.7) | 2.6 (1.1) | 2.8 (1.6) | number of live births |
    | AFB | 23.1 (4.1) | 21.0 (2.8) | 22.7 (4.0) | age minus oldest child's age |
    | ALB | 29.2 (6.4) | 26.3 (4.6) | 28.7 (6.2) | age minus youngest child's age |
    | distance | 8.1 (7.3) | 48.8 (2.7) | 15.2 (16.9) | village-level distance (km) to primary market town |
    | | N (%) | N (%) | total | |
    | lineality | 264 (82.5) | 56 (17.5) | 320 | residence in normatively patrilineal or matrilineal area |
    | cohort: <1955 | 34 (94.4) | 2 (5.6) | 36 | 5-year intervals centred on calendar year at age 16 |
    | cohort: 1955–1959 | 15 (71.4) | 6 (28.6) | 21 | |
    | cohort: 1960–1964 | 16 (76.2) | 5 (23.8) | 21 | |
    | cohort: 1965–1969 | 23 (85.2) | 4 (14.8) | 27 | |
    | cohort: 1970–1974 | 18 (81.8) | 4 (18.2) | 22 | |
    | cohort: 1975–1979 | 19 (76.0) | 6 (24.0) | 25 | |
    | cohort: 1980–1984 | 41 (83.7) | 8 (16.3) | 49 | |
    | cohort: 1985–1989 | 54 (81.8) | 12 (18.2) | 66 | |
    | cohort: 1990–1994 | 44 (83.0) | 9 (17.0) | 53 | |
    | education: none | 207 (82.1) | 45 (17.9) | 252 | level of highest school attended |
    | education: elementary | 36 (80.0) | 9 (20.0) | 45 | |
    | education: middle+ | 21 (91.3) | 2 (8.7) | 23 | |
    | village | | | | village of current residence (N = 12) |

    To examine temporal variation, we use age to construct a reproductive cohort variable based on 5-year intervals. This cohort variable is centred to show the period during which the woman was 16 years of age, corresponding to the earliest AFB in the data. Compared with linear age, the cohort variable allows us to detect nonlinear changes over time, including those that the various fertility policies of the 1970s might have precipitated.
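
    The cohort construction described above can be sketched as follows. This is an illustrative reading and not the authors' code: the function name, the ASCII-hyphen labels and the use of 2008 as a fixed reference year are assumptions, and ages in the data are themselves approximate.

```python
SURVEY_YEAR = 2008  # assumption: the household surveys were all collected in 2008


def reproductive_cohort(age_at_interview, survey_year=SURVEY_YEAR):
    """Return the 5-year cohort label for the calendar year a woman turned 16.

    Hypothetical helper: the bin edges follow table 1 (an open-ended '<1955'
    bin, then 1955-1959, 1960-1964, ...); the label format is an assumption.
    """
    year_at_16 = survey_year - age_at_interview + 16
    if year_at_16 < 1955:
        return "<1955"  # earliest cohort is open-ended
    start = 1955 + 5 * ((year_at_16 - 1955) // 5)  # lower edge of the 5-year bin
    return f"{start}-{start + 4}"
```

    Under these assumptions the youngest women in the sample (aged 30 in 2008) fall in the 1990-1994 cohort, which matches the last cohort listed in table 1.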

    To examine spatial variation, we use multilevel models with random effects for the specific village in which a woman resided. As described above, villages are located in normatively matrilineal and patrilineal subpopulations, which are spatially clustered. Lineality thus captures both the kinship system and a larger region. The distance variable gives geodesic distance in kilometres between each specific village and the main market town.
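
    For readers unfamiliar with how a village-to-town distance of this kind can be obtained from coordinates, a minimal sketch follows. It uses the haversine (great-circle) formula on a spherical Earth, which only approximates a true geodesic on the ellipsoid; the source does not say which method or software was used, and the function name and radius constant are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

    Over the tens of kilometres separating these villages from the market town, the difference between this approximation and an ellipsoidal geodesic is well under one per cent.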

    We evaluate our predictions through estimation and comparison of multilevel survival models. We define as right-censored all women between 30 and 44 who gave birth within the past 6 years, irrespective of the number of children. Given demographic trends [58], this is a conservative censoring rule: many censored women are likely to have already had their final child. All statistical analyses are conducted in R [59]. To combine survival and mixed modelling approaches, we use the coxme package to estimate Cox proportional hazards models with village as a random effect, allowing the intercept to vary [60]. We evaluate the resulting models using a model comparison approach based on information criteria [61,62]. AIC differences are used to calculate model weights among different subsets of models according to the hypothesis being evaluated. The weight of a model is a function of the difference in AIC values between models and is a measure of the relative likelihood that it is the best model, given the data, among the models being compared [63].
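
    The censoring rule can be made concrete with a short sketch. It is not the authors' R code: the function name is hypothetical, 'within the past 6 years' is read as a youngest child aged 6 or under, and the censoring time is taken to be the age at the most recent birth (age minus youngest child's age, as in table 1).

```python
from typing import Tuple


def survival_record(age: int, youngest_child_age: int) -> Tuple[int, int]:
    """Return (time, event) for a progression-to-last-birth model.

    time  : age at most recent birth (age minus youngest child's age)
    event : 1 if that birth is treated as the last birth, 0 if right-censored
    """
    age_at_most_recent_birth = age - youngest_child_age
    # Assumed reading of the rule: women aged 30-44 whose youngest child is
    # 6 or younger may still have another birth, so they are censored.
    censored = 30 <= age <= 44 and youngest_child_age <= 6
    return age_at_most_recent_birth, 0 if censored else 1
```

    A 35-year-old whose youngest child is 3 would thus enter the model as censored at 32, whereas a 50-year-old with a 4-year-old child would count as having had her last birth at 46.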

    For coherence, we report results in the order that we presented our hypotheses.

    H1. ALB exhibits a downward temporal trend that is apparent across the full sample of women (figure 1). The model with reproductive cohort receives much stronger support (w = 1) than a model without cohort, despite the jump in the number of estimated parameters (table 2; ‘cohort’). Older women in the sample—particularly those who reached 16 years old before 1960—have later ALBs than younger women, as well as greater variation in ALB; i.e. younger cohorts are progressing faster to their last births (table 1). That said, the changes in ALB clearly began prior to the late, long, few campaign of the 1970s and the one-child policy of 1979. Although ALB continues to drop for women whose prime reproductive years occur during these policies, there is no clear evidence of a time threshold—i.e. of an abrupt change in ALB that maps onto new fertility policies.

    Figure 1. Age at last birth (ALB) by cohort. ALB shows declines in mean and median age and in variability across cohorts in our sample. Note that the steepest decline occurs between the cohort of women reaching maturity prior to 1960 and the following cohorts. The first cohort includes the most women who would have finished reproduction by the enactment of the Chinese fertility policy.

    Table 2. Model estimates predicting progression to last birth. Column values are exponentiated coefficients for the variables in the given model, followed by 95% confidence intervals in brackets. In simple terms, exponentiating the coefficients gives us a description of how, for a particular woman's realized values for the independent variables, the risk of having an event (e.g. a last birth) increases or decreases relative to the comparison group. For ALB, a value greater than 1 increases the likelihood (risk) of a last birth relative to the base risk, whereas a value less than 1 decreases the likelihood of a last birth relative to the base risk [60]. AIC, Akaike information criterion.

    | | cohort | cohort + lineality | cohort + fertility | cohort + education | cohort + distance | full |
    |---|---|---|---|---|---|---|
    | cohort (reference: <1955) | | | | | | |
    | 1955–1959 | 1.46 [0.83, 2.54] | 1.35 [0.77, 2.36] | 1.49 [0.85, 2.60] | 1.47 [0.84, 2.56] | 1.35 [0.77, 2.36] | 1.40 [0.80, 2.45] |
    | 1960–1964 | 2.03 [1.15, 3.59] | 1.94 [1.11, 3.42] | 2.23 [1.26, 3.95] | 1.99 [1.12, 3.52] | 1.96 [1.11, 3.46] | 2.07 [1.17, 3.66] |
    | 1965–1969 | 3.39 [1.96, 5.87] | 3.43 [1.99, 5.93] | 2.76 [1.59, 4.78] | 3.38 [1.95, 5.85] | 3.37 [1.95, 5.83] | 2.74 [1.58, 4.75] |
    | 1970–1974 | 3.19 [1.80, 5.65] | 2.96 [1.70, 5.17] | 2.44 [1.37, 4.34] | 3.16 [1.78, 5.59] | 3.04 [1.73, 5.34] | 2.27 [1.28, 4.02] |
    | 1975–1979 | 6.39 [3.63, 11.28] | 6.12 [3.47, 10.80] | 4.42 [2.47, 7.92] | 6.69 [3.77, 11.85] | 6.19 [3.51, 10.91] | 4.42 [2.46, 7.96] |
    | 1980–1984 | 4.85 [2.97, 7.92] | 4.47 [2.77, 7.22] | 3.10 [1.84, 5.23] | 4.99 [3.05, 8.18] | 4.52 [2.79, 7.33] | 2.98 [1.77, 5.01] |
    | 1985–1989 | 4.78 [2.95, 7.72] | 4.44 [2.77, 7.12] | 2.80 [1.64, 4.78] | 5.53 [3.36, 9.11] | 4.52 [2.81, 7.26] | 3.02 [1.75, 5.20] |
    | 1990–1994 | 2.09 [1.20, 3.64] | 2.00 [1.15, 3.47] | 1.22 [0.67, 2.23] | 2.41 [1.36, 4.27] | 1.98 [1.14, 3.45] | 1.31 [0.71, 2.43] |
    | lineality (reference: matrilineal) | | | | | | |
    | patrilineal | | 1.95 [1.40, 2.73] | | | | 1.86 [1.28, 2.70] |
    | fertility | | | 0.82 [0.74, 0.91] | | | 0.82 [0.74, 0.91] |
    | education (reference: none) | | | | | | |
    | elementary | | | | 1.17 [0.82, 1.68] | | 1.28 [0.89, 1.83] |
    | middle+ | | | | 0.40 [0.21, 0.76] | | 0.43 [0.23, 0.81] |
    | distance | | | | | 1.02 [1.01, 1.02] | |
    | village-level variance | 0.092 | 0.008 | 0.097 | 0.080 | 0.021 | 0.023 |
    | negative log likelihood | −1321.25 | −1315.67 | −1314.28 | −1315.43 | −1316.28 | −1303.616 |
    | d.f. (k) | 9 | 10 | 10 | 11 | 10 | 13 |
    | AICi | 2660.5 | 2651.4 | 2648.5 | 2652.9 | 2652.6 | 2633.232 |
    | AICi − AICcohort | 0 | −9.1 | −12.0 | −7.6 | −7.9 | −27.3 |
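
    As a reading aid for the exponentiated coefficients in table 2, the following sketch converts a hazard ratio back to its log-hazard coefficient and to a percentage change in the hazard of a last birth relative to the reference group. The helper name is hypothetical; only the hazard ratios are taken from the table.

```python
import math


def describe_hazard_ratio(hr):
    """Return (log-hazard coefficient, percent change in hazard vs. reference)."""
    log_coef = math.log(hr)          # coefficient on the scale the Cox model estimates
    pct_change = (hr - 1.0) * 100.0  # > 0: faster progression to last birth
    return log_coef, pct_change


# Hazard ratios copied from table 2
patrilineal = describe_hazard_ratio(1.95)  # 'cohort + lineality' model
middle_plus = describe_hazard_ratio(0.40)  # 'cohort + education' model
```

    On this reading, women in patrilineal villages have roughly a 95% higher hazard of a last birth at any given age (hence earlier ALB), while women with middle school or more have roughly a 60% lower hazard (hence later ALB).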

    H2. We do not find evidence of structured village-level variation when village is entered as a random effect in models of ALB (table 2). That is, a model with the random effect is not an improvement over a null model with no random effect; likelihood ratio = 0.144. Despite this, there is a large effect of a village's kinship system on ALB, with women from patrilineal villages being more likely to experience an earlier ALB (table 1 and figure 1). The model that includes lineality in addition to cohort is strongly favoured (w = 0.99) over a model with cohort only (table 2; ‘cohort + lineality’).

    H3. Including education also leads to a much better model (w = 0.98) than the cohort-only model, with women with the highest levels of education being more likely to have a later ALB (table 2; ‘cohort + education’). Variation in education does not, however, account for the initial temporal change across cohorts in ALB given the relatively small number of women in the higher education category and the fact that they are limited to more recent reproductive cohorts. Nevertheless, the model suggests that the highest educational levels might indeed delay reproductive cessation, countervailing other temporal influences; this trend is nonlinear, with women at intermediate (elementary) levels of education actually progressing faster to a last birth.

    H4. Including fertility in the temporal model of ALB leads to an improved model (w = 1), but does not completely attenuate the cohort effects (table 2; ‘cohort + fertility’). This suggests that there were some historical shifts in decision-making about ALB that were independent of fertility reduction goals. Rather than include AFB directly in a model of ALB (due to collinearity for women with just one child), we modelled AFB itself using the same approach as for ALB and found no support for a sustained temporal shift in AFB. This suggests that the cohort effects on ALB are not being driven by changes in AFB (electronic supplementary material, table S2).

    H5. There is little evidence that ALB variation is structured in ways that reflect recent cultural transmission events in the region. We find that a model with distance to market town does receive some support (w = 0.35) when compared with the lineality (w = 0.64) and temporal models (w = 0.01). That said, the distance to market is confounded by the patrilineal and matrilineal regions, because the patrilineal villages are all farther from town. Moreover, the matrilineal villages show less change over time in ALB, counter to the predicted direction if norms were flowing from the market town to the nearer smaller villages. Finally, the matrilineal villages are not more homogeneous than the patrilineal ones, counter to the prediction of an easier flow of information in the less rugged matrilineal areas (electronic supplementary material, figure S1).
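
    The model weights quoted in H1–H5 can be recovered from the AIC values in table 2 using the standard Akaike-weight formula, w_i = exp(−Δ_i/2) / Σ_j exp(−Δ_j/2), where Δ_i is a model's AIC minus the lowest AIC in the set being compared. The sketch below is an illustration with a hypothetical function name, not the authors' code; only the AIC values come from the table.

```python
import math


def akaike_weights(aic_by_model):
    """Map each model name to its Akaike weight within the supplied set."""
    best = min(aic_by_model.values())
    # Relative likelihood of each model given its AIC distance from the best one
    rel_lik = {m: math.exp(-0.5 * (aic - best)) for m, aic in aic_by_model.items()}
    total = sum(rel_lik.values())
    return {m: v / total for m, v in rel_lik.items()}


# AIC values copied from table 2; the three-way comparison reported under H5
h5 = akaike_weights({
    "cohort": 2660.5,
    "cohort + lineality": 2651.4,
    "cohort + distance": 2652.6,
})
# Rounded to two decimals: lineality 0.64, distance 0.35, cohort 0.01
```

    Because weights are normalized within the set of models compared, the same model can receive different weights under different hypotheses: against the cohort-only model alone, the lineality model's weight rises to about 0.99, as reported under H2.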

    Timing of reproductive cessation among humans is highly variable. Significant effort in the evolutionary sciences has been expended to explain this variation as it arises physiologically, i.e. in terms of the timing of menopause (e.g. [3,24,64]), with explicit interest in the facultative (i.e. behavioural) adjustment of timing of ALB arising more recently (e.g. [2]). In this paper, we test several hypotheses explaining variation in the distribution of ALB drawn from HBE and CET frameworks.

    In particular, we show that ALB varies across time among the Mosuo, with women from earlier cohorts reproducing until later ages. The drops in ALB are fairly consistent through time, casting doubt on a primary causal role of 1970s Chinese fertility policies in motivating different timing decisions. Such temporal shifts in the timing of reproduction plausibly reflect increasing orientation towards front-loading of reproduction in response to third-party regulation of total fertility behaviour in growing populations. Second, we show that almost all spatial variation in ALB is accounted for by villages being in the matrilineal versus patrilineal areas. Specifically, ALB is later among women residing in matrilineal areas. Third, we show that variation in education does not account for cohort effects, while shifts in fertility and AFB only partially account for the temporal shifts in ALB. This suggests that shifts in embodied capital investments for mothers or investments in child quality are unlikely explanations for ALB variation. Finally, we find no evidence that ALB-relevant norms spread geographically—neither proximity to cities nor terrain ruggedness associated with the patrilineal areas was associated with ALB homogeneity or the rate of ALB change.

    The timescale at which we see changes to the ALB suggests that cultural (rather than genetic or strictly ecological) shifts have influenced reproductive decision-making among the Mosuo. Mean ALB decreases by 10 or more years within two decades (figure 1). Furthermore, much of this shift takes place before the Chinese fertility policy was implemented. It is difficult to interpret why this earlier decline occurred. On the one hand, this may imply that both bottom–up and top–down cultural forces influenced late-life reproductive timing if Mosuo villagers were already moving towards anti-natalist norms [44]. For example, it may suggest that the Chinese fertility policy was aligned to some degree with already changing reproductive norms [43,46]. On the other hand, the Chinese Communist Party (CCP) was well established in this region by 1953 and had begun regulating reproductive behaviour among the Mosuo by 1958, when they required Mosuo families to abandon non-committal reproductive unions in favour of monogamous marriage [44]. The Great Leap Forward (1958–1962) resulted in widespread famine and depressed fertility all over China (e.g. [65]) that may have had effects on the timing of ALB in our sample of Mosuo women. Notably, these policies regulated fertility, marriage and subsistence. We know of no specific sanctions against continued reproduction at advanced ages among the Mosuo, so the specific mechanisms of social transmission remain obscure. Nonetheless, the regional scale at which we see village-level differences in ALB suggests that cultural norms associated with matri- or patrilineal institutions may play a role in reproductive timing. While the patrilineal area is in more rugged terrain, the fact that ALB has dropped so dramatically as conditions improved in the later twentieth century suggests that people are ceasing to reproduce earlier than physiological constraints influenced by a difficult working environment would mandate. It is thus unlikely that genetic differences or simple ecological differences with physiological knock-on effects can account for the variation in reproductive timing documented here.

    The pattern of declining ALB alongside fertility restriction is consistent with behavioural ecological models that envisage benefits to earlier reproduction [5]. While front-loading strategies might be transmitted via cultural learning mechanisms, we find no evidence that ALB behaviour spread out from regional market centres. As argued previously for the Mosuo case [25], earlier reproduction should be especially valuable under circumstances where fertility is constrained but populations are growing. Indeed, top–down policies dramatically curtailing fertility were very likely to have contributed to future discounting as the context for bearing and raising children became increasingly insecure, enhancing the motivation for early reproduction. Even if ALB is not the direct target of such a strategy (i.e. because earlier onset of reproduction alongside fertility reduction could automatically result in earlier ages of last birth), the fitness effects of reducing one's age at childbearing would be present across all births [5,28]. ALB, per se, could be targeted as a strategy for investing in children while one is a younger and healthier parent, or as a mechanism of reducing quantity of offspring to invest in their quality (e.g. [24]). As argued in the introduction (§1), even under conditions of severe fertility restriction, early termination of reproduction could leave a mother in better condition (physiologically and financially) to invest in her offspring. Future work could assess this by exploring the condition of children born to mothers who terminate their reproduction early versus late.

    One of our more intriguing results provides evidence that is consistent with models of ALB emphasizing the potential for household cooperation and conflict to affect the timing of reproduction. In particular, our models show relatively early ALB among Mosuo residing in patrilineal areas compared with Mosuo residing in matrilineal areas. At face value, this may seem to support the idea that daughters-in-law win reproductive competition with their mothers-in-law in patrilocal households ([36], but see [3]). However, as this society falls on a relatively low end of the fertility spectrum, it is likely to show relatively little intergenerational overlap and therefore relatively little potential for conflict. This suggests the need for alternative explanations of the regional pattern and casts some doubt on interpretations of similar patterns in the broader literature [3,36,37,65]. The fact that educational and fertility differences do not account for the regional differences between the matrilineal and patrilineal areas' ALBs points to other reasons for the observed patterns in ALB. Furthermore, if easier transmission of market-based norms accounted for the difference, we would expect earlier ALB among the more market-integrated matrilineal villages than the patrilineal ones. One possibility is that ageing women expect reproductive conflict in patrilocal areas and moderate their reproductive timing accordingly, but this account would require a rather inflexible mechanism that cannot handle the mismatch to the modern low fertility context. Future research should collect more direct data on intergenerational reproductive overlap, household composition, economic conditions and individual household-level contributions to allow for more direct tests of how either real or perceived household conflict may impact ALB and other fertility behaviours in the context of differing kinship norms.

    An alternative explanation for the kinship structure results suggests that matrilineal kin are more cooperative than patrilineal kin, at least insofar as they provide more alloparental care. If this were the case, we might expect to find our current pattern of longer reproductive spans (6.12 versus 5.3 years, on average) and higher fertility among the matrilineal villages than the patrilineal ones (table 1). There is also lower household-level competition for subsistence resources such as land in the matrilineal region, and households are correspondingly larger. In light of this interpretation, it is a bit surprising that we see later ages at first birth for matrilineal than patrilineal women. However, this is consistent with older daughters acting as helpers at the nest for longer periods in the matrilineal villages. While the traditional ethnographic accounts of matrilineal Mosuo suggest a high prevalence of half-siblings that would disincentivize such helping at the nest [66], our own data show that mixed paternity sibsets were rare at the time these data were collected [67]. Better evidence on the actual contributions of alloparents and fuller explanations for any kin-structured differences in cooperation await further research.

    We have limited ability to speak to how mothers use ALB strategically to allocate resources between reproduction and other goals, or as a way to manage the quality versus quantity of their children. However, we do find that higher fertility is associated with a slower progression to last birth. Relatively early stopping may therefore be part of a fertility reduction strategy. Similarly, women may be shifting their reproductive careers in response to educational investments that they make in themselves. Women with middle school or higher educational levels have the slowest progressions to last births. This interpretation must remain fairly guarded given that women with intermediate educational levels in fact show the fastest progression to last births, meaning that any effects of education on ALB are not linear. Furthermore, the highest levels of education observed in these communities at the time data were collected (i.e. high school) are unlikely to directly interfere with earlier reproductive careers [25]. This means that any effects of education would have to be due to later trade-offs given different levels of embodied capital or shifting norms acquired through, or reflected in, educational institutions. Finally, the fact that historical shifts towards lower fertility and higher maternal education do not completely account for the earlier ALBs in more recent cohorts suggests that reproductive stopping is changing for other reasons. Further data on various kinds of investments in children, the long-term effects of education for adult women and the kinds of social models present in schools would help address the role of education in shifting norms and perceived reproductive trade-offs.

    It is worth reiterating that HBE and CET perspectives often do not make divergent predictions insofar as CET proposes mechanisms of transmission whereby equilibria predicted by HBE models are reached. However, in contexts of change, CET perspectives have the potential to offer additional insights regarding the dynamics of change. For example, during these periods of change, it may be possible to examine humans’ reproductive decision-making rules and the extent to which they may deviate from fitness maximization because of changing norms or institutions. This helps us make sense of the otherwise puzzling evidence that early adopters of low fertility norms suffer long-term fitness losses even as their descendants benefit in social status ([68], but see [5]). Extrapolating from the literature on cultural shifts towards fertility reduction suggests that evolved biases in the types of status cues we attend to, coupled with social learning heuristics, may play a role in other reproductive decisions.

    The results presented here should be taken with some measure of caution. First, excluding nulliparous women means that women with later ages at first birth (after age 30) are systematically under-represented in the younger age cohorts, possibly masking effects of delayed childbearing on ALB and other reproductive behaviours. Similarly, women who have not given birth in at least 6 years were classified as post-reproductive, which may also underestimate ALB if some of these women plan to continue reproducing. However, the reverse would also be true (some women with completed fertility would be censored). Furthermore, it should be noted that no women reproducing prior to 1979 were censored by these criteria. However, for earlier cohorts, the data may include survivorship biases, which could be problematic if, for example, women who lived longer also had later ALBs. This may be important when evaluating the impacts of the Chinese fertility policy on reproductive behaviours. Given that our results suggest that ALB began declining prior to implementation of the fertility policy, artefacts arising from this censorship or selection biases are likely to be small.

    Another challenge for analysis is that cross-sectional demographic data are limited in the degree to which they can characterize cultural transmission of reproductive behaviours. While cultural evolution studies typically emphasize equilibria resulting from dynamic models of social learning processes, we are limited to the use of crude proxies for transmission dynamics such as geographic proximity to market towns and kinship norms (matri-/patri-lineality). Without being able to link transmission mechanisms directly with outcomes, we are unable to interpret patterns in terms of specific social learning mechanisms or their predicted effects on the adoption and spread of reproductive norms. Social network studies might be a slight improvement, because they can identify assortative clustering and frequency-dependent behaviours (e.g. [69]), but more direct measures will be needed in order to parse out the complementary contributions of HBE and CET mechanisms to ALB behaviours and patterns in reproductive behaviours more generally. We suggest that researchers interested in reproductive decision-making directly examine people's beliefs about various trade-offs, which currencies they value, what they consider high-quality traits for their children and how such beliefs are represented in social networks and among potential social models of different statuses. Intergenerational data could also help with theorizing about changing fitness landscapes for different strategies as culture shifts. This could take the form of better long-term fitness indicators and diverse measures of child quality (e.g. health, educational and economic outcomes) that can hint at parents' motivations for lower fertility or earlier reproduction.

    The timing of reproductive events in human life history can have important consequences for individual fitness, a fact that may be increasingly salient as variation in fertility continues to decline worldwide. Given that ALB is often a stronger predictor of reproductive success than more commonly used proxies such as onset of menarche or AFB [2,70], we anticipate increased interest in this topic. ALB, like lifetime fertility, has been showing declines in many populations over time [71,72] and, indeed, may be one of the most important contributions to the reductions in fertility associated with global fertility transition [72,73]. Consistent with this, our data show clear declines in ALB, which seem to have arisen independently of the Chinese fertility policy. Our results also show that the matrilineal areas exhibit later ALB than patrilineal areas, consistent with kinship systems shaping household-level cooperation and conflict over reproductive behaviour. Given that there is limited reproductive overlap between generations in these communities, cooperation among alloparents in matrilineal areas is a more likely explanation for this pattern. Our results regarding temporal shifts are most consistent with demographic front-loading models that postulate increased marginal returns to early childbearing. Although we find limited evidence that supports any particular pathway directing the social transmission of norms regulating the timing of ALB, we caution that the data were not designed to explore social transmission and advocate for increased integration of data collection efforts and models from different streams of thought in the evolutionary social sciences. As we have demonstrated in this paper, HBE and CET offer distinct but compatible perspectives on the evolution of reproductive behaviours. That ALB appears to change rapidly and is coupled to kinship norms in these communities suggests that stopping behaviours may adapt to cultural shifts as well as socio-ecological incentives.

    Research protocols were approved by the University of Washington Institutional Review Board (UW IRB 07-4858-C 01) and as part of obtaining research permissions locally in China (via the Yunnan Academy of Social Sciences). Informed consent was obtained from all participants.

    Data and code for analysis from this paper may be accessed via the corresponding author's ResearchGate page: http://bit.ly/2v8POst.

    S.M., C.M. and M.C.T. conceived and designed the study. S.M. collected data. M.C.T. performed analysis. S.M., C.M., M.C.T. and A.R. drafted the manuscript.

    We have no competing interests.

    This research was supported by a National Science Foundation grant (BCS 0717918) and a Chester Fritz Fellowship to the corresponding author.

    We thank Oren Kolodny, Nicole Creanza and Marc Feldman for organizing the workshop that led to this collaboration. S.M. thanks all Mosuo participants and research assistants and the Yunnan Academy of Social Sciences for their support of this work. Eric Alden Smith provided useful comments on the manuscript.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3965856.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Wood JW. 1994 Dynamics of human reproduction: biology, biometry, demography. London, UK and New York, NY: Routledge.
    2. Towner MC, Nenko I, Walton SE. 2016 Why do women stop reproducing before menopause? A life-history approach to age at last birth. Phil. Trans. R. Soc. B 371, 20150147. (doi:10.1098/rstb.2015.0147)
    3. Snopkowski K, Moya C, Sear R. 2014 A test of the intergenerational conflict model in Indonesia shows no evidence of earlier menopause in female-dispersing groups. Proc. R. Soc. B 281, 20140580. (doi:10.1098/rspb.2014.0580)
    4. Garruto RM, Little MA, James GD, Brown DE. 1999 Natural experimental models: the global search for biomedical paradigms among traditional, modernizing, and modern populations. Proc. Natl Acad. Sci. USA 96, 10 536–10 543. (doi:10.1073/pnas.96.18.10536)
    5. Jones JH, Bird RB. 2014 The marginal valuation of fertility. Evol. Hum. Behav. 35, 65–71. (doi:10.1016/j.evolhumbehav.2013.10.002)
    6. Buston PM, Zink AG. 2009 Reproductive skew and the evolution of conflict resolution: a synthesis of transactional and tug-of-war models. Behav. Ecol. 20, 672–684. (doi:10.1093/beheco/arp050)
    7. Smith EA. 2000 Three styles in the evolutionary study of human behavior. In Human behavior and adaptation: an anthropological perspective (eds Cronk L, Chagnon N, Irons W), pp. 27–46. Hawthorne, NY: Aldine de Gruyter.
    8. Bateson P, Laland KN. 2013 Tinbergen's four questions: an appreciation and an update. Trends Ecol. Evol. 28, 712–718. (doi:10.1016/j.tree.2013.09.013)
    9. Laland K et al. 2014 Does evolutionary theory need a rethink? Nature 514, 161–164. (doi:10.1038/514161a)
    10. Laland KN, Sterelny K, Odling-Smee J, Hoppitt W, Uller T. 2011 Cause and effect in biology revisited: is Mayr's proximate–ultimate dichotomy still useful? Science 334, 1512–1516. (doi:10.1126/science.1210879)
    11. Colleran H. 2016 The cultural evolution of fertility decline. Phil. Trans. R. Soc. B 371, 20150152. (doi:10.1098/rstb.2015.0152)
    12. Colleran H, Jasienska G, Nenko I, Galbarczyk A, Mace R. 2014 Community-level education accelerates the cultural evolution of fertility decline. Proc. R. Soc. B 281, 20132732. (doi:10.1098/rspb.2013.2732)
    13. Alvergne A, Gibson MA, Gurmu E, Mace R. 2011 Social transmission and the spread of modern contraception in rural Ethiopia. PLoS ONE 6, e22515. (doi:10.1371/journal.pone.0022515)
    14. Howard JA, Gibson MA. 2017 Frequency-dependent female genital cutting behaviour confers evolutionary fitness benefits. Nat. Ecol. Evol. 1, 49. (doi:10.1038/s41559-016-0049)
    15. Tinbergen N. 1963 On aims and methods of ethology. Z. Tierpsychol. 20, 410–433. (doi:10.1111/j.1439-0310.1963.tb01161.x)
    16. Laland KN, Kendal JR, Brown GR. 2007 The niche construction perspective. J. Evol. Psychol. 5, 51–66. (doi:10.1556/JEP.2007.1003)
    17. Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University of Chicago Press.
    18. Henrich J, Boyd R, Richerson PJ. 2012 The puzzle of monogamous marriage. Phil. Trans. R. Soc. B 367, 657–669. (doi:10.1098/rstb.2011.0290)
    19. Jones D. 2011 The matrilocal tribe: an organization of demic expansion. Hum. Nat. 22, 177–200. (doi:10.1007/s12110-011-9108-6)
    20. Laland KN, Odling-Smee J, Hoppitt W, Uller T. 2013 More on how and why: cause and effect in biology revisited. Biol. Philos. 28, 719–745. (doi:10.1007/s10539-012-9335-1)
    21. Reeve HK, Sherman PW. 1993 Adaptation and the goals of evolutionary research. Q. Rev. Biol. 68, 1–32. (doi:10.1086/417909)
    22. Lack D. 1947 The significance of clutch-size. Ibis 89, 302–352. (doi:10.1111/j.1474-919X.1947.tb04155.x)
    23. Smith CC, Fretwell SD. 1974 The optimal balance between size and number of offspring. Am. Nat. 108, 499–506. (doi:10.1086/282929)
    24. Peccei JS. 2001 Menopause: adaptation or epiphenomenon? Evol. Anthropol. Issues News Rev. 10, 43–57. (doi:10.1002/evan.1013)
    25. Mattison SM, Scelza B, Blumenfield T. 2014 Paternal investment and the positive effects of fathers among the matrilineal Mosuo of Southwest China. Am. Anthropol. 116, 591–610. (doi:10.1111/aman.12125)
    26. Fisher RA. 1930 The genetical theory of natural selection: a complete variorum edition. Oxford, UK: Oxford University Press.
    27. Jones J. 2011 Primates and the evolution of long, slow life histories. Curr. Biol. 21, R708–R717. (doi:10.1016/j.cub.2011.08.025)
    28. Lewontin RC. 1965 Selection for colonizing ability. In The genetics of colonizing species (eds Baker HG, Stebbins GL), pp. 77–94. New York, NY: Academic.
    29. Voland E. 1998 Evolutionary ecology of human reproduction. Annu. Rev. Anthropol. 27, 347–374. (doi:10.1146/annurev.anthro.27.1.347)
    30. Quinlan RJ, Flinn MV. 2005 Kinship, sex, and fitness in a Caribbean community. Hum. Nat. 16, 32–57. (doi:10.1007/s12110-005-1006-3)
    31. Kramer KL. 2010 Cooperative breeding and its significance to the demographic success of humans. Annu. Rev. Anthropol. 39, 417–436. (doi:10.1146/annurev.anthro.012809.105054)
    32. Sear R, Coall D. 2011 How much does family matter? Cooperative breeding and the demographic transition. Popul. Dev. Rev. 37, 81–112. (doi:10.1111/j.1728-4457.2011.00379.x)
    33. Mace R. 2013 Cooperation and conflict between women in the family. Evol. Anthropol. Issues News Rev. 22, 251–258. (doi:10.1002/evan.21374)
    34. Bereczkei T, Dunbar RIM. 2002 Helping at the nest and sex-biased parental investment in a Hungarian gypsy population. Curr. Anthropol. 43, 804–809. (doi:10.1086/344374)
    35. Turke PW. 1988 Helpers at the nest: childcare networks on Ifaluk. In Human reproductive behavior: a Darwinian perspective (eds Betzig LL, Borgerhoff Mulder M, Turke P), pp. 173–188. New York, NY: Cambridge University Press.
    36. Cant MA, Johnstone RA. 2008 Reproductive conflict and the separation of reproductive generations in humans. Proc. Natl Acad. Sci. USA 105, 5332–5336. (doi:10.1073/pnas.0711911105)
    37. Lahdenperä M, Gillespie DOS, Lummaa V, Russell AF. 2012 Severe intergenerational reproductive conflict and the evolution of menopause. Ecol. Lett. 15, 1283–1290. (doi:10.1111/j.1461-0248.2012.01851.x)
    38. Horne C, Dodoo FN-A, Dodoo ND. 2013 The shadow of indebtedness: bridewealth and norms constraining female reproductive autonomy. Am. Sociol. Rev. 78, 503–520. (doi:10.1177/0003122413484923)
    39. Van Dalen PH, Henkens K. 2012 What is on a demographer's mind? A worldwide survey. Demogr. Res. 26, 363–408. (doi:10.4054/DemRes.2012.26.16)
    40. Shenk MK, Towner MC, Kress HC, Alam N. 2013 A model comparison approach shows stronger support for economic models of fertility decline. Proc. Natl Acad. Sci. USA 110, 8045–8050. (doi:10.1073/pnas.1217029110)
    41. Hardee-Cleaveland K, Banister J. 1988 Fertility policy and implementation in China, 1986–88. Popul. Dev. Rev. 14, 245–286. (doi:10.2307/1973572)
    42. Hesketh T, Lu L, Xing ZW. 2005 The effect of China's one-child family policy after 25 years. N. Engl. J. Med. 353, 1171–1176. (doi:10.1056/NEJMhpr051833)
    43. Lavely W, Freedman R. 1990 The origins of the Chinese fertility decline. Demography 27, 357–367. (doi:10.2307/2061373)
    44. Shih C, Jenike MR. 2002 A cultural–historical perspective on the depressed fertility among the matrilineal Moso in Southwest China. Hum. Ecol. 30, 21–47. (doi:10.1023/A:1014579404548)
    45. Banister J. 1991 China's changing population. Stanford, CA: Stanford University Press.
    46. Cai Y. 2010 China's below-replacement fertility: government policy or socioeconomic development? Popul. Dev. Rev. 36, 419–440. (doi:10.1111/j.1728-4457.2010.00341.x)
    47. Harrell S. 2001 Ways of being ethnic in southwest China. Seattle, WA: University of Washington Press.
    48. Mattison SM. 2016 Matrilineal and matrilocal systems. In The Wiley Blackwell encyclopedia of gender and sexuality studies (eds Naples N, Hoogland RC, Wickramasinghe M, Wong WCA), pp. 1655–1660. Hoboken, NJ: John Wiley & Sons, Ltd.
    49. Shih C-K. 2010 Quest for harmony: the Moso traditions of sexual union & family life. Stanford, CA: Stanford University Press.
    50. Shih C. 2000 Tisese and its anthropological significance: issues around the visiting sexual system among the Moso. L'homme 154–155, 697–712. (doi:10.4000/lhomme.56)
    51. Shih C. 1993 The Yongning Moso: sexual union, household organization, gender and ethnicity in a matrilineal duolocal society in Southwest China. PhD Dissertation, Stanford University, Stanford, CA.
    52. Mattison SM. 2010 Demystifying the Mosuo: The behavioral ecology of kinship and reproduction of China's ‘last matriarchal society’. PhD Dissertation, University of Washington, Seattle, WA.
    53. Walsh ER. 2001 The Mosuo — beyond the myths of matriarchy: gender transformation and economic development. PhD Dissertation, Temple University, Philadelphia, PA.
    54. Henry L. 1961 Some data on natural fertility. Biodemogr. Soc. Biol. 8, 81–91. (doi:10.1080/19485565.1961.9987465)
    55. Cai Y, Lavely W. 2007 Child sex ratios and their regional variation. In Transition and challenge: China's population at the beginning of the 21st century (eds Zhongwei Z, Fei G), pp. 108–123. Oxford, UK: Oxford University Press.
    56. Mattison SM. 2011 Evolutionary contributions to solving the ‘matrilineal puzzle’: a test of Holden, Sear, and Mace's model. Hum. Nat. 22, 64–88. (doi:10.1007/s12110-011-9107-7)
    57. Zhang K. 1990 Family marriage and fertility in a matriarchal society—social survey of the Naxi nationality in Ninglang County, Yunnan Province. Chin. J. Popul. Sci. 2, 247–256.
    58. Feeney G, Feng W. 1993 Parity progression and birth intervals in China: the influence of policy in hastening fertility decline. Popul. Dev. Rev. 19, 61–101. (doi:10.2307/2938385)
    59. R Core Development Team. 2017 R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
    60. Therneau TM. 2015 coxme: Mixed Effects Cox Models. R package version 2.2-5. https://CRAN.R-project.org/package=coxme
    61. Towner MC, Luttbeg B. 2007 Alternative statistical approaches to the use of data as evidence for hypotheses in human behavioral ecology. Evol. Anthropol. Issues News Rev. 16, 107–118. (doi:10.1002/evan.20134)
    62. Burnham KP, Anderson DR. 2003 Model selection and multimodel inference: a practical information–theoretic approach. Berlin, Germany: Springer Science & Business Media.
    63. Anderson DR. 2007 Model based inference in the life sciences: a primer on evidence. Berlin, Germany: Springer Science & Business Media.
    64. Hawkes K. 2004 Human longevity: the grandmother effect. Nature 428, 128–129. (doi:10.1038/428128a)
    65. Song S. 2012 Does famine influence sex ratio at birth? Evidence from the 1959–1961 Great Leap Forward Famine in China. Proc. R. Soc. B 279, 2883–2890. (doi:10.1098/rspb.2012.0320)
    66. Moya C, Sear R. 2014 Intergenerational conflicts may help explain parental absence effects on reproductive timing: a model of age at first birth in humans. PeerJ 2, e512. (doi:10.7717/peerj.512)
    67. Mattison SM, Beheim B, Chak B, Buston P. 2016 Offspring sex preferences among patrilineal and matrilineal Mosuo in Southwest China revealed by differences in parity progression. R. Soc. open sci. 3, 160526. (doi:10.1098/rsos.160526)
    68. Goodman A, Koupil I, Lawson DW. 2012 Low fertility increases descendant socioeconomic position but reduces long-term fitness in a modern post-industrial society. Proc. R. Soc. B 279, 4342–4351. (doi:10.1098/rspb.2012.1415)
    69. Colleran H, Mace R. 2015 Social network- and community-level influences on contraceptive use: evidence from rural Poland. Proc. R. Soc. B 282, 20150398. (doi:10.1098/rspb.2015.0398)
    70. Borgerhoff Mulder M. 1989 Menarche, menopause and reproduction in the Kipsigis of Kenya. J. Biosoc. Sci. 21, 170–192. (doi:10.1017/S0021932000017879)
    71. Desjardins B, Bideau A, Brunet G. 1994 Age of mother at last birth in two historical populations. J. Biosoc. Sci. 26, 509–516. (doi:10.1017/S0021932000021635)
    72. Knodel J. 1987 Starting, stopping, and spacing during the early stages of fertility transition: the experience of German village populations in the 18th and 19th centuries. Demography 24, 143–162. (doi:10.2307/2061627)
    73. Pascual J, García-Moro CE, Hernández M. 2005 Biological and behavioral determinants of fertility in Tierra del Fuego. Am. J. Phys. Anthropol. 127, 105–113. (doi:10.1002/ajpa.20065)



    Collard and his colleagues have shown, in a series of papers beginning with Collard et al. [1], that population size and ‘toolkit size’ (see below) in ethnographic hunter–gatherers are neither totally nor partially correlated (the actual statistical analyses used multiple regression; see also [2,3]). Based partly on this finding, Collard et al. [4] and Vaesen et al. [5] criticize the theoretical models of Henrich [6] and Powell et al. [7], which they aver do not ‘really’ predict such a correlation. A major bone of contention is whether the modes of cultural transmission (social learning) assumed by Henrich [6] and Powell et al. [7] in their models are empirically justifiable [8,9]. The modes of cultural transmission that humans are capable of and hunter–gatherers rely on are important issues in themselves [8,10], but, as I will show, irrelevant to the immediate question.

    I agree with Collard et al. [4] and Vaesen et al. [5] that the models of Henrich [6] and Powell et al. [7] are not pertinent to the question of whether population size and toolkit size are correlated. However, my reasons are entirely different—these models were not intended to address this question—and I believe their criticisms are misplaced. The first, technical and less important, reason is that these models do not possess equilibria. The second, more substantive, reason is that the variable interpreted as representing toolkit size in these models does not correspond to how toolkit size is recorded in the ethnographic literature, i.e. the observable.

    My purpose in writing this paper is limited. I will argue that when more appropriate theoretical models with more relevant variables are invoked, a correlation between population size and toolkit size is predicted regardless of the mode of cultural transmission. But how can this be reconciled with the fact that such a correlation is not observed in ethnographic hunter–gatherers [1–3,11,12]? I discuss three possible theoretical scenarios that would attenuate or eliminate the predicted correlation: saturation of toolkit size [13], population growth and decline [14], and bistability [15]. If these scenarios are accepted, then theory and observation can be reconciled.

    I will begin with a brief description of the basic model proposed by Henrich ([6], pp. 200–204) and explain why its predictions are not pertinent to the question at hand. The Powell et al. [7] model and other extensions of the basic Henrich [6] model, which share the same limitations, will not be discussed.

    The underlying variable is the ‘skill’, z, of an individual belonging to a population of constant finite size N. When attempting to fit this model to the ethnographic data on toolkit size, the most natural interpretation of this variable may be that individuals with larger values of z manufacture a greater variety of tools. Other interpretations (e.g. ‘cultural complexity’) are possible in other contexts [4,5].

    Generations are discrete with overlap only insofar as oblique social learning occurs. Each naive newborn of the offspring generation identifies the maximally skilled individual of the parental generation and attempts to imitate his/her skill, $z_{\max}$. However, social learning is noisy and biased so that the skill acquired by the imitator deviates probabilistically from the skill of the exemplar and is, on average, lower. Nevertheless, and importantly, the skill acquired by the imitator may exceed that of the exemplar. More specifically, it is assumed that the skill acquired by each imitator follows a Gumbel distribution with mode $z_{\max} - \alpha$ and dispersion parameter β (α > 0, β > 0).

    Now let $\bar{z}$ be the population mean skill of the N individuals. Rigorously speaking, it is the expected value of the population mean. Given the interpretation of z above, it may be natural to regard $\bar{z}$ as in some way representing the average number of different tools attributable to an individual. Using subscript t to denote the value of $\bar{z}$ in generation t, Henrich [6] showed that

    $$\bar{z}_{t+1} - \bar{z}_t = -\alpha + \beta(\varepsilon + \ln N), \tag{2.1}$$

    where $\varepsilon \approx 0.577$ is Euler's constant. Hence,

    $$\bar{z}_t = \bar{z}_0 + [-\alpha + \beta(\varepsilon + \ln N)]\,t, \tag{2.2}$$

    where $\bar{z}_0$ is the initial value of $\bar{z}$ (see also [16]).

    Clearly, equation (2.1) entails that $\bar{z}$ either keeps on increasing or decreasing with time, depending on whether the right-hand side, $-\alpha + \beta(\varepsilon + \ln N)$, is positive or negative. Hence, there is no equilibrium that is determined by the population size, N. More significantly, when we attempt to apply equation (2.2) to a sample of populations, the relation between $\bar{z}_t$ and N will be confounded by the effects of the two other variables $\bar{z}_0$ and t, which are unknowns that may differ among the populations. (I assume throughout this paper, except in the last paragraph of the discussion, that the model parameters, in this case α and β, are the same in all populations.) This is the technical reason noted in the introduction as to why the Henrich [6] model is not pertinent. The other, more important, reason is discussed in the next section.
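
    The sign condition and the confounding by history can be checked numerically. The following is a minimal sketch, not code from the paper: it assumes equations (2.1) and (2.2) in the form $\bar{z}_{t+1} - \bar{z}_t = -\alpha + \beta(\varepsilon + \ln N)$, with ε Euler's constant, and the function names and parameter values are illustrative assumptions.

```python
import math

EULER_CONSTANT = 0.5772156649015329  # Euler's constant (epsilon in equation (2.1))


def drift(alpha, beta, n):
    """Right-hand side of equation (2.1): change in mean skill per generation."""
    return -alpha + beta * (EULER_CONSTANT + math.log(n))


def mean_skill(z0, alpha, beta, n, t):
    """Equation (2.2): mean skill after t generations, starting from z0."""
    return z0 + drift(alpha, beta, n) * t


def critical_population_size(alpha, beta):
    """Population size at which the drift in equation (2.1) changes sign."""
    return math.exp(alpha / beta - EULER_CONSTANT)


# Illustrative (assumed) parameter values, not taken from the paper.
alpha, beta = 4.0, 1.0
print(critical_population_size(alpha, beta))
# Same N, different initial values and elapsed times: mean skill differs.
print(mean_skill(z0=10.0, alpha=alpha, beta=beta, n=100, t=5))
print(mean_skill(z0=0.0, alpha=alpha, beta=beta, n=100, t=50))
```

    The last two lines use the same N but different initial values and elapsed times, so the two populations end up with different mean skill even though their population sizes are identical.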

    A slight modification to this model produces an equilibrium. From equation (2.1) or equation (2.2), we see that parameter α measures the decay per generation in the population mean skill due to the infidelity of social learning. Mesoudi [17] assumed that errors in social learning by the offspring generation would be proportional to the population mean skill in the parental generation and replaced α by $\alpha\bar{z}_t$ in equation (2.1) to obtain

    $$\bar{z}_{t+1} - \bar{z}_t = -\alpha\bar{z}_t + \beta(\varepsilon + \ln N), \tag{2.3}$$

    with $\varepsilon \approx 0.577$ Euler's constant as in equation (2.1). Then, on setting $\bar{z}_{t+1} = \bar{z}_t = \hat{\bar{z}}$ (a super-hat here and elsewhere indicates that the variable is evaluated at equilibrium), the population mean skill at equilibrium is

    $$\hat{\bar{z}} = \frac{\beta(\varepsilon + \ln N)}{\alpha}. \tag{2.4}$$

    Clearly, equation (2.4) predicts a (nonlinear) correlation between N and $\hat{\bar{z}}$, with the caveat that, for all populations sampled, N has remained constant and $\bar{z}$ has reached, or is close to, the equilibrium for this value of N.
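
    As a quick check on the equilibrium, the recursion can be iterated directly. This is a sketch under the assumption that equation (2.3) reads $\bar{z}_{t+1} = \bar{z}_t - \alpha\bar{z}_t + \beta(\varepsilon + \ln N)$; the names and parameter values are illustrative, not from the paper.

```python
import math

EULER_CONSTANT = 0.5772156649015329  # Euler's constant (epsilon in the text)


def next_mean_skill(z, alpha, beta, n):
    """One generation of equation (2.3)."""
    return z - alpha * z + beta * (EULER_CONSTANT + math.log(n))


def equilibrium_mean_skill(alpha, beta, n):
    """Equation (2.4): equilibrium mean skill."""
    return beta * (EULER_CONSTANT + math.log(n)) / alpha


def iterate(z0, alpha, beta, n, generations):
    """Iterate equation (2.3) from z0 for the given number of generations."""
    z = z0
    for _ in range(generations):
        z = next_mean_skill(z, alpha, beta, n)
    return z


# Illustrative (assumed) values: alpha = 0.1, beta = 1.
for n in (10, 100, 1000):
    print(n, iterate(0.0, 0.1, 1.0, n, 1000), equilibrium_mean_skill(0.1, 1.0, n))
```

    In this discrete recursion the deviation from equilibrium is multiplied by 1 − α each generation, so the iteration converges only for 0 < α < 2. The equilibrium grows with ln N, which is the nonlinear dependence on N noted above.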

    The ethnographic data on toolkit size in the earlier studies [1–3,11] were taken entirely or mostly from Oswalt [18]. Toolkit refers to ‘subsistants’ (food-getting tools) and ‘technounits’ (component parts of such tools) as defined by Oswalt [18], who unfortunately does not explain how the data cited in his book were obtained. Moreover, the original references are, as far as I can tell, not available online.

    Quantification of ethnographic data appears to be based on a ‘qualitative assessment of the ethnographic literature' ([19]; also [20]). More explicitly, quantitative data on toolkit size are ‘assembled [as] a list of all technologies mentioned' ([21] supplement). Tools are apparently not identified as belonging to individual members of the population, and hence it is not known how many different tools the average individual has. Thus, I assume in what follows that the toolkit size of a target population refers to the number of different subsistants or technounits of which at least one specimen exists in that population, as reported by an informant or observed by the investigator.

    The variable that is most relevant, if this is the case, is S of Strimling et al. ([22], their equation (2.3)), ρP of Lehmann et al. ([23], their equation (2.5)), Cpop of Fogarty et al. ([13], see below) and possibly x of Ghirlanda & Enquist [24]. However, it is not $\bar{z}$ of Henrich [6], because $\bar{z}$ is most naturally interpreted as an average over the number of different tools per individual.

    This model was first proposed by Strimling et al. [22] and is an adaptation to cultural evolution of the neutral infinite sites Moran model of population genetics [25–29]. Here, I present an outline of this model as modified and generalized by Fogarty et al. [13,14] to modes of cultural transmission other than random oblique (see below), and including some results newly obtained here.

    Assume a finite population of fixed size N in which a potentially infinite number of cultural traits may occur. In the context of this paper, a cultural trait is the knowhow to manufacture a particular tool. The underlying variable is Cij, which equals 1 if the ith individual possesses the jth cultural trait, and 0 if he/she does not (i = 1,2, … ,N; j = 1,2, …). Hence, the state of an individual can be represented by a vector of 1s and 0s, and of the population by a matrix formed by aligning N such vectors; Cij is the ijth element of this matrix.

    The cultural dynamics are defined by four events occurring during one time step. For a population of fixed size N, one generation (life expectancy at birth) comprises N time steps. Let us refer to the N individuals alive at the beginning of a time step as adults. Then, the four events are: (i) innovation by all N adults, (ii) birth of one naive individual (temporarily increasing the population size to N + 1), (iii) social learning by the newborn from a subset of the N adults (the exemplars), and (iv) death of a random adult who is replaced by the newborn (bringing the population size back to N).

    Innovations have not been seen before (i.e. each results in an entirely new cultural trait) and occur at the rate μ per adult per generation, or μ/N per adult per time step (μ > 0). This assumption differs from most previous models in which only the one newborn was allowed to innovate [22,23].

    In terms of the underlying Cij, the number of cultural traits possessed by the ith individual can be written as $\sum_{j} C_{ij}$ (the upper bound of the summation may be infinite). Hence, the number of cultural traits possessed by the average individual is

    $$C_{\mathrm{ind}} = \frac{1}{N}\sum_{i=1}^{N}\sum_{j} C_{ij}. \tag{4.1}$$

    The expected value of this variable, which we write as $\bar{C}_{\mathrm{ind}}$, is a close analogue of the population mean skill, $\bar{z}$, of the Henrich [6] model.

    Next, define the indicator $I_{lj} = 1$ if $\sum_{i=1}^{N} C_{ij} = l$, and 0 otherwise, where 1 ≤ l ≤ N. That is, $I_{lj}$ counts the jth cultural trait if it is possessed by exactly l individuals in the population. Then, the number of different cultural traits possessed by exactly l individuals is

    $$P_l = \sum_{j} I_{lj}. \tag{4.2}$$

    Strimling et al. [22] refer to this variable, Pl, as the number of cultural traits of ‘popularity' l.

    In terms of the Pls, we can write the number of distinct cultural traits in the population (i.e. possessed by at least one individual) as

    $$C_{\mathrm{pop}} = \sum_{l=1}^{N} P_l. \tag{4.3}$$

    If we focus on material culture and assume that the possession of a cultural trait by an individual entails the manufacture of the corresponding subsistant or technounit, then Cpop is the variable that most closely approximates the observable. However, what is really needed is a theory in terms of artefacts rather than the individuals that make them, i.e. a theory that allows for possible production bias.

    We can also express Cind in terms of the Pl's. Note that $\sum_{i=1}^{N} C_{ij} = l$ if and only if the jth cultural trait has popularity l, and that there are Pl cultural traits of popularity l. Hence,

    $$C_{\mathrm{ind}} = \frac{1}{N}\sum_{l=1}^{N} l\,P_l. \tag{4.4}$$

    Clearly, Cind ≤ Cpop with equality if and only if Pl = 0 for 1 ≤ l ≤ N − 1.
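
    The relation between the 0,1 matrix, the popularities and the two summary variables can be made concrete with a small example. This is a minimal sketch in which the function names and the example matrix are illustrative assumptions: rows are individuals, columns are cultural traits.

```python
from collections import Counter


def popularity_spectrum(matrix):
    """Return a mapping l -> P_l: number of traits carried by exactly l individuals."""
    n_traits = len(matrix[0]) if matrix else 0
    column_sums = [sum(row[j] for row in matrix) for j in range(n_traits)]
    # Traits carried by nobody (column sum 0) are not part of the culture.
    return Counter(l for l in column_sums if l > 0)


def c_pop(matrix):
    """Number of distinct traits carried by at least one individual."""
    return sum(popularity_spectrum(matrix).values())


def c_ind(matrix):
    """Number of traits carried by the average individual."""
    n = len(matrix)
    return sum(l * p for l, p in popularity_spectrum(matrix).items()) / n


# Three individuals, four candidate traits (the last one is carried by nobody).
example = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
]
print(dict(popularity_spectrum(example)), c_pop(example), c_ind(example))
```

    In this example one trait is shared by all three individuals and two traits are each carried by a single individual, so the population holds three distinct traits while the average individual carries fewer than two.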

    For each variable, we indicate the expected value with a bar, the value at the end of a time step with a prime, and the equilibrium value with a caret.

    We make two crucial independence assumptions. First, there is no association among the cultural traits carried by an individual; for example, the possession of one cultural trait does not predict possession of another cultural trait (analogous to the assumption of ‘linkage equilibrium' in genetics). Second, the cultural traits carried by an exemplar, or exemplars, are transmitted independently of each other to the newborn (analogous to ‘free recombination'). Not all modes of cultural transmission can be accommodated within this analytical framework, in which case individual-based simulations prove useful. We also assume that there is no natural selection (see below).

    The two independence assumptions in the 0,1 vector model are admittedly very strong. A general theory to deal with associations, either positive or negative, among cultural traits has not yet been formulated. Nevertheless, for the infinite sites model of population genetics, the equilibrium spectrum, $\hat{\bar{P}}_l$, is identical whether the sites are independent or completely linked (compare equations A 4 and A 5 of the appendix with equation 9.24 of Ewens [28]), and I expect this property will carry over, at least partially, to the current 0,1 vector model. A start on this problem has been made by Strimling et al. [22], who show that with random oblique transmission (see below) the spectra for mutually exclusive traits show the same dependencies on population size as for independent traits.

    Denote the expected values of Pl by $\bar{P}_l$ ($1 \le l \le N$). Using an asterisk to distinguish the values after innovation, we have

    $$\bar{P}_1^* = \bar{P}_1 + \mu, \qquad \bar{P}_l^* = \bar{P}_l \quad (2 \le l \le N). \tag{5.1}$$

    Equation (5.1) formalizes our assumption that all innovations are novel, i.e. are cultural traits of popularity 1. For each cultural trait of expected popularity l after innovation, let bl be the (binomial) probability that the newborn acquires that cultural trait, and let dl be the (binomial) probability that death then strikes an adult possessing that cultural trait. Then, at the end of the time step, we have

    $$\bar{P}_l' = b_{l-1}(1 - d_{l-1})\,\bar{P}_{l-1}^* + [\,b_l d_l + (1 - b_l)(1 - d_l)\,]\,\bar{P}_l^* + (1 - b_{l+1})\,d_{l+1}\,\bar{P}_{l+1}^*. \tag{5.2}$$

    The derivation of equation (5.2) is explained in Fogarty et al. [13,14]. Intuitively, the meaning of the recursion equation (5.2) is clear, if we disregard the possibility that each $\bar{P}_l^*$ is not necessarily an integer. For example, the first term on the right-hand side means that, of the cultural traits of popularity l − 1 among the adults immediately after innovation, a fraction bl−1 are acquired by the newborn and a fraction $1 - d_{l-1}$ are not lost by the death of an adult. Hence, this term represents the expected number of cultural traits of popularity l − 1 just after innovation that have popularity l at the end of the time step. Note in applying equation (5.2) that b0 = 0 and $\bar{P}_{N+1}^* = 0$.
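
    The recursion can be iterated numerically to see how the expected popularities settle down. The sketch below is illustrative rather than the authors' code. It assumes the three-term form of equation (5.2) described above (gain from popularity l − 1, no change, loss from popularity l + 1) and, purely as an example, the choices b_l = βl/N and d_l = l/N; all names and parameter values are assumptions.

```python
def time_step(p, b, d, mu):
    """Apply innovation (5.1), then the birth-death recursion (5.2).

    p[l], b[l], d[l] are indexed by popularity l = 1..N; index 0 is padding,
    with b[0] = 0 as required in the text.
    """
    n = len(p) - 1
    star = list(p) + [0.0]   # star[n + 1] = 0: no popularity above N
    star[1] += mu            # innovations enter at popularity 1
    b_ext = list(b) + [0.0]  # padding so that index n + 1 exists
    d_ext = list(d) + [0.0]
    new = [0.0] * (n + 1)
    for l in range(1, n + 1):
        up = b_ext[l - 1] * (1 - d_ext[l - 1]) * star[l - 1]
        stay = (b_ext[l] * d_ext[l] + (1 - b_ext[l]) * (1 - d_ext[l])) * star[l]
        down = (1 - b_ext[l + 1]) * d_ext[l + 1] * star[l + 1]
        new[l] = up + stay + down
    return new


# Illustrative (assumed) parameter values and transmission probabilities.
n, beta, mu = 10, 0.9, 0.04
b = [beta * l / n for l in range(n + 1)]
d = [l / n for l in range(n + 1)]

p = [0.0] * (n + 1)  # start from a population with no cultural traits
for _ in range(20000):
    p = time_step(p, b, d, mu)

print(sum(p[1:]))  # expected number of distinct traits after many time steps
```

    After enough time steps a further application of `time_step` leaves the vector essentially unchanged, and the expected loss of popularity-1 traits per time step balances the input μ from innovation.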

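    At equilibrium, setting $\bar{P}_l' = \bar{P}_l = \hat{\bar{P}}_l$, we obtain

    $$\hat{\bar{P}}_1 = [\,b_1 d_1 + (1 - b_1)(1 - d_1)\,](\hat{\bar{P}}_1 + \mu) + (1 - b_2)\,d_2\,\hat{\bar{P}}_2, \tag{5.3a}$$

    $$\hat{\bar{P}}_2 = b_1(1 - d_1)(\hat{\bar{P}}_1 + \mu) + [\,b_2 d_2 + (1 - b_2)(1 - d_2)\,]\,\hat{\bar{P}}_2 + (1 - b_3)\,d_3\,\hat{\bar{P}}_3, \tag{5.3b}$$

    and

    $$\hat{\bar{P}}_l = b_{l-1}(1 - d_{l-1})\,\hat{\bar{P}}_{l-1} + [\,b_l d_l + (1 - b_l)(1 - d_l)\,]\,\hat{\bar{P}}_l + (1 - b_{l+1})\,d_{l+1}\,\hat{\bar{P}}_{l+1} \tag{5.3c}$$

    for $3 \le l \le N$.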

    Equation (5.3) can be solved (see appendix) to yield

    $$\hat{\bar{P}}_1 = \mu\,\frac{1 - (1 - b_1)\,d_1}{(1 - b_1)\,d_1} \tag{5.4a}$$

    and

    $$\hat{\bar{P}}_l = \mu\,\frac{\prod_{k=1}^{l-1} b_k(1 - d_k)}{\prod_{k=1}^{l} (1 - b_k)\,d_k} \tag{5.4b}$$

    for $2 \le l \le N$.

    Then, noting equations (4.3) and (4.4), the expected values of Cpop and Cind at equilibrium are

    $$\hat{\bar{C}}_{\mathrm{pop}} = \sum_{l=1}^{N} \hat{\bar{P}}_l \tag{5.5}$$

    and

    $$\hat{\bar{C}}_{\mathrm{ind}} = \frac{1}{N}\sum_{l=1}^{N} l\,\hat{\bar{P}}_l, \tag{5.6}$$

    respectively. Clearly, $\hat{\bar{C}}_{\mathrm{ind}} \le \hat{\bar{C}}_{\mathrm{pop}}$ with equality if and only if $\hat{\bar{P}}_l = 0$ for $1 \le l \le N - 1$ (so that only $\hat{\bar{P}}_N$ can be non-zero), which is precluded if equation (5.4) holds.
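
    The equilibrium popularities and the two summary variables can be computed for any choice of the probabilities b_l and d_l. The sketch below is an illustration, not the author's code. It assumes the balance form of the solution, in which the equilibrium number of popularity-1 traits just after innovation is μ/[(1 − b_1)d_1] and successive popularities differ by the factor b_{l−1}(1 − d_{l−1})/[(1 − b_l)d_l]. The example probabilities b_l = βl/N and d_l = l/N, the function names and the parameter values are assumptions.

```python
def equilibrium_spectrum(b, d, mu):
    """Equilibrium expected popularities for l = 1..N (equation (5.4)).

    b[l] and d[l] are indexed by popularity l = 1..N (index 0 is padding).
    Returns a list p with p[0] = 0 and p[l] the equilibrium value for l >= 1.
    """
    n = len(b) - 1
    p = [0.0] * (n + 1)
    value = mu / ((1 - b[1]) * d[1])  # popularity-1 traits just after innovation
    p[1] = value - mu                 # popularity-1 traits before innovation
    for l in range(2, n + 1):
        value *= b[l - 1] * (1 - d[l - 1]) / ((1 - b[l]) * d[l])
        p[l] = value
    return p


def c_pop(p):
    """Expected number of distinct traits in the population (equation (5.5))."""
    return sum(p[1:])


def c_ind(p):
    """Expected number of traits per individual (equation (5.6))."""
    n = len(p) - 1
    return sum(l * p[l] for l in range(1, n + 1)) / n


# Illustrative (assumed) choice of probabilities and parameter values.
n, beta, mu = 10, 0.9, 0.04
b = [beta * l / n for l in range(n + 1)]
d = [l / n for l in range(n + 1)]
p = equilibrium_spectrum(b, d, mu)
print(c_pop(p), c_ind(p))
```

    Because every equilibrium popularity is strictly positive here, the per-individual value is strictly smaller than the population value.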

    Next, we consider explicit values of bi and di. Our assumption of no selection implies di = i/N. The value of bi depends on the mode of cultural transmission. For random oblique transmission (a mode of cultural transmission in which the newborn chooses a random adult to imitate),

    $$b_i = \frac{\beta i}{N}, \tag{6.1}$$

    where β ($0 < \beta < 1$) is the fidelity of cultural transmission. (Under the assumptions of this model, vertical transmission cannot be distinguished from random oblique transmission.) Equation (5.4) reduces to

    $$\hat{\bar{P}}_1 = \mu\,\frac{N^2 - N + \beta}{N - \beta} \tag{6.2a}$$

    and

    $$\hat{\bar{P}}_l = \frac{\mu N^2 \beta^{\,l-1}}{l}\,\frac{\prod_{k=1}^{l-1}(N - k)}{\prod_{k=1}^{l}(N - \beta k)} \tag{6.2b}$$

    for $2 \le l \le N$. Equation (6.2) is identical to equation (2.2) of Strimling et al. [22], if we take into account that their model is formulated in terms of a death–birth chain rather than as in our case a birth–death chain, and that they evaluate the popularities just after innovation.

    For random oblique transmission, it can be shown by a direct argument (see [22]) that the recursion in $\bar{C}_{\mathrm{ind}}$ satisfies

    $$\bar{C}_{\mathrm{ind}}' = \left(1 - \frac{1 - \beta}{N}\right)\left(\bar{C}_{\mathrm{ind}} + \frac{\mu}{N}\right). \tag{6.3}$$

    Hence, at equilibrium,

    $$\hat{\bar{C}}_{\mathrm{ind}} = \frac{\mu}{1 - \beta} - \frac{\mu}{N}. \tag{6.4}$$

    Equation (6.4) entails that $\hat{\bar{C}}_{\mathrm{ind}} \approx \mu/(1 - \beta)$ when N is not too small, in which case $\hat{\bar{C}}_{\mathrm{ind}}$ does not depend on N (see below). This result is consistent with the statement in Collard et al. [4] that ‘under unbiased transmission, the association … fails to hold.' However, as noted above, $\hat{\bar{C}}_{\mathrm{ind}} < \hat{\bar{C}}_{\mathrm{pop}}$, and hence $\hat{\bar{C}}_{\mathrm{ind}}$ is not the appropriate variable to represent toolkit size.
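
    The contrast between the two variables under random oblique transmission can be tabulated directly. This sketch assumes b_i = βi/N and d_i = i/N, the recursion $\bar{C}_{\mathrm{ind}}' = (1 - (1-\beta)/N)(\bar{C}_{\mathrm{ind}} + \mu/N)$, and equilibrium popularities built up from the ratio of successive terms. The function names are illustrative, and the parameter values are those quoted in the caption of figure 1.

```python
def c_ind_equilibrium(n, beta, mu):
    """Equilibrium number of traits per individual (equation (6.4))."""
    return mu / (1 - beta) - mu / n


def iterate_c_ind(n, beta, mu, steps):
    """Iterate the recursion (6.3) from zero culture."""
    c = 0.0
    for _ in range(steps):
        c = (1 - (1 - beta) / n) * (c + mu / n)
    return c


def c_pop_equilibrium(n, beta, mu):
    """Sum of the equilibrium popularities for random oblique transmission."""
    value = mu * n * n / (n - beta)  # popularity-1 traits just after innovation
    total = value - mu               # popularity-1 traits before innovation
    for l in range(1, n):
        # ratio of the popularity-(l + 1) term to the popularity-l term
        value *= beta * l * (n - l) / ((l + 1) * (n - beta * (l + 1)))
        total += value
    return total


beta, mu = 0.9, 0.04
for n in (5, 10, 20, 40):
    print(n, round(c_pop_equilibrium(n, beta, mu), 3),
          round(c_ind_equilibrium(n, beta, mu), 3))
```

    The per-individual value stays close to μ/(1 − β) = 0.4 for all but the smallest N, whereas the population value keeps growing with N.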

    Aoki et al. [30] define two other modes of cultural transmission, best-of-K and conformist, for which the expected values of the popularities at equilibrium can be obtained within the current framework. In best-of-K transmission, which assumes a preference for having each cultural trait as opposed to not having it, the newborn samples K adults (the exemplars) at random without replacement, and each cultural trait is independently acquired with probability β provided at least one of these K exemplars possesses it. Hence,

    $$b_i = \beta\left[1 - \binom{N-i}{K}\bigg/\binom{N}{K}\right], \tag{7.1}$$

    where $1 \le i \le N - K$, and bi = β for i > N − K [13]. Parameter K is analogous to the number of ‘cultural parents' in Enquist et al. [31].
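
    The best-of-K acquisition probability follows from the chance that none of the K sampled exemplars carries the trait. A minimal sketch, assuming sampling without replacement as described above; the function name and the parameter values are illustrative.

```python
from math import comb


def b_best_of_k(i, n, k, beta):
    """Probability that the newborn acquires a trait of popularity i.

    comb(n - i, k) / comb(n, k) is the probability that none of the k
    exemplars, drawn without replacement from n adults, carries the trait.
    math.comb returns 0 when k > n - i, so the case i > n - k gives beta.
    """
    return beta * (1 - comb(n - i, k) / comb(n, k))


# Illustrative (assumed) values: N = 10 adults, fidelity 0.9.
for k in (1, 2, 3):
    print(k, [round(b_best_of_k(i, 10, k, 0.9), 3) for i in (1, 5, 9)])
```

    With K = 1 the expression reduces to βi/N, the random oblique case, and sampling more exemplars raises the acquisition probability of rare traits.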

    Conformist transmission can be modelled in various ways [32–36]. Here, we adopt a model in which each newborn samples K adults at random without replacement, and each cultural trait is independently acquired with a probability that depends on the fraction of these K exemplars that possesses that cultural trait [30]. Specifically, set

    $$h(j; K, N, i) = \binom{i}{j}\binom{N-i}{K-j}\bigg/\binom{N}{K}, \tag{7.2}$$

    where h(j; K, N, i) = 0 if j < 0, j > i, j > K, or j < K − (N − i). Equation (7.2) is the hypergeometric distribution giving the probability that a cultural trait of popularity i is represented exactly j times among the K exemplars. Then

    $$b_i = \beta\left[\sum_{j=0}^{K/2-1} \varepsilon j\,h(j; K, N, i) + \tfrac{1}{2}\,h(K/2; K, N, i) + \sum_{j=K/2+1}^{K} \{1 - \varepsilon(K - j)\}\,h(j; K, N, i)\right] \tag{7.3a}$$

    if K ≥ 2 is even, and

    $$b_i = \beta\left[\sum_{j=0}^{(K-1)/2} \varepsilon j\,h(j; K, N, i) + \sum_{j=(K+1)/2}^{K} \{1 - \varepsilon(K - j)\}\,h(j; K, N, i)\right] \tag{7.3b}$$

    if K ≥ 3 is odd, and where ɛ < 1/K. Since ɛ < 1/K, the newborn will be less likely than by random copying to acquire a minority (less than frequency 1/2) cultural trait and more likely to acquire a majority (greater than frequency 1/2) cultural trait. Smaller values of ɛ entail stronger frequency-dependence and hence stronger conformity.

    Similarly, anticonformist transmission can be modelled by setting 1/K < ɛ < 2/(K − 2) when K ≥ 4 is even, or 1/K < ɛ < 2/(K − 1) when K ≥ 3 is odd.
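
    The role of ɛ can be illustrated with a short calculation. The sketch below is an assumption-laden reading rather than the authors' code. It takes the acquisition probability, before multiplying by the fidelity β, to be ɛj when j of the K exemplars carry the trait and j < K/2, 1/2 when j = K/2, and 1 − ɛ(K − j) when j > K/2. This reading is consistent with the bounds on ɛ stated above, but the function names and parameter values are illustrative.

```python
from math import comb


def hypergeom(j, k, n, i):
    """Probability that exactly j of k exemplars carry a trait of popularity i."""
    if j < 0 or j > i or j > k or j < k - (n - i):
        return 0.0
    return comb(i, j) * comb(n - i, k - j) / comb(n, k)


def acquisition_given_count(j, k, eps):
    """Assumed acquisition probability when j of the k exemplars carry the trait."""
    if 2 * j < k:
        return eps * j            # trait is in the minority among the exemplars
    if 2 * j == k:
        return 0.5                # tie (only possible when k is even)
    return 1 - eps * (k - j)      # trait is in the majority among the exemplars


def b_frequency_dependent(i, n, k, beta, eps):
    """Probability that the newborn acquires a trait of popularity i."""
    return beta * sum(acquisition_given_count(j, k, eps) * hypergeom(j, k, n, i)
                      for j in range(k + 1))


# Illustrative (assumed) values: N = 20, K = 3, fidelity 0.9.
for eps in (0.1, 1 / 3, 0.4):   # conformist, random copying, anticonformist
    print(round(eps, 3), [round(b_frequency_dependent(i, 20, 3, 0.9, eps), 3)
                          for i in (5, 10, 15)])
```

    With ɛ = 1/K the acquisition probability reduces to βi/N, i.e. random copying. Smaller ɛ lowers it for minority traits and raises it for majority traits, and larger ɛ does the opposite.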

    Equation (7.1), or equation (7.3) with (7.2), can be substituted into equation (5.4) to obtain the expected popularities at equilibrium for best-of-K, or conformist/anti-conformist transmission, respectively.

    Figure 1 shows how $\hat{\bar{C}}_{\mathrm{pop}}$ increases with the population size, N, for these four modes of cultural transmission (random oblique, best-of-K, conformist, anticonformist). The scale on the vertical axis is not critical to the argument. Importantly, it is assumed that N has remained constant and that $\bar{C}_{\mathrm{pop}}$ has reached, or is close to, the equilibrium for this value of N. In particular, we see, contra Collard et al. [4] and Vaesen et al. [5], that an approximately linear monotone increasing relation is predicted even with random oblique (also vertical) and conformist transmission.

    Figure 1. Dependence of $\hat{\bar{C}}_{\mathrm{pop}}$ on N, with random oblique (blue), best-of-2 (red), conformist-of-3 (purple) and anticonformist-of-3 (yellow) transmission. In the case of best-of-2, values of $\hat{\bar{C}}_{\mathrm{pop}}$ for N > 10 are not shown. Parameter values are β = 0.9, μ = 0.04; ɛ = 0.1 for conformist and ɛ = 0.4 for anticonformist.

    Figure 2a shows for random oblique and conformist transmission that $\hat{\bar{C}}_{\mathrm{pop}}$ and $\hat{\bar{C}}_{\mathrm{ind}}$ depend on N in entirely different ways. Specifically, $\hat{\bar{C}}_{\mathrm{ind}}$ with random oblique transmission (see equation (6.4)) or with conformist transmission rapidly converges to a constant value. Clearly, the theoretical predictions differ markedly with the choice of variable. Figure 2b shows on the other hand that, with best-of-K and anticonformist transmission, both $\hat{\bar{C}}_{\mathrm{pop}}$ and $\hat{\bar{C}}_{\mathrm{ind}}$ are monotone increasing in N.

    Figure 2. Dependence of $\hat{\bar{C}}_{\mathrm{pop}}$ and $\hat{\bar{C}}_{\mathrm{ind}}$ on N (one variable is drawn with broken lines, the other with solid lines). (a) Random oblique (blue) and conformist-of-3 (purple) transmission. $\hat{\bar{C}}_{\mathrm{ind}}$ is essentially independent of N. For the parameter values, β = 0.9, μ = 0.04, assumed here, $\hat{\bar{C}}_{\mathrm{ind}}$ asymptotes to 0.4 with random oblique transmission (see equation (6.4)) and converges to 0.055 with conformist-of-3 transmission. (b) Best-of-2 (red) and anticonformist-of-3 (yellow) transmission. Both $\hat{\bar{C}}_{\mathrm{pop}}$ and $\hat{\bar{C}}_{\mathrm{ind}}$ are monotone increasing in N. Parameter values are β = 0.8, μ = 0.01; ɛ = 0.8 for anticonformist.

    Fogarty et al. [13,14] also conducted individual-based simulations of the 0,1 vector model for random oblique, best-of-2, success bias, and one-to-many (but not conformist or anticonformist) transmission. In these simulations, the possible number of cultural traits was assumed to be finite, and a less restrictive model of innovation was also considered. Success bias, as defined by these authors, entails that the newborn samples K adults and chooses one adult possessing the greatest number of cultural traits as his/her exemplar. It most closely resembles the mode of cultural transmission assumed by Henrich [6] if we set K = N (see also [16]). One-to-many transmission entails that one adult has a special status and continues to be imitated by all newborns until his/her death. In all cases examined, $\bar{C}_{\mathrm{pop}}$ showed an increase with N.

    For random oblique and best-of-2 transmission, where the simulation results could be compared to the analytical predictions (equation (6.2), and equation (5.4) with equation (7.1), respectively), it was found that the simulated values fell below the analytical predictions at large values of N, especially when β and/or μ were also large. That is, a saturation effect was observed when an upper limit, M, was set on the possible number of cultural traits (figure 3). Incidentally, Fogarty et al. [13] observed a large difference in $\bar{C}_{\mathrm{pop}}$ between best-of-K and random oblique transmission when K = 2, but a relatively small effect of increasing K further. (This property is shared by the model of Enquist et al. [31], as can be seen from their figure 5.)

    Figure 3. Dependence of $\bar{C}_{\mathrm{pop}}$ on N for best-of-2. Analytical values (red), simulation values (orange). Parameter values are β = 0.9, μ = 0.04.

    Given that Cpop is the appropriate theoretical variable that most closely approximates the observable, and a correlation is predicted between $\hat{\bar{C}}_{\mathrm{pop}}$ and population size, N, for all modes of cultural transmission examined above, why have the empirical studies on toolkit size yielded negative results? I suggest three possibilities. The first is the saturation effect observed in the individual-based simulations noted in the last section. Figure 3 shows for best-of-2 transmission how $\bar{C}_{\mathrm{pop}}$ may rapidly asymptote to the upper limit of the possible number of cultural traits (M = 500 in this example) as N increases. Clearly, for fixed N greater than about 25 in this example, a correlation is not predicted between population size and toolkit size. This saturation effect is not as pronounced for random oblique transmission [13].

    The remaining two possibilities are discussed separately in the following two sections.

    The recursions equation (5.2) in the expected popularities can be generalized to deal with population growth and decline driven by exogenous factors [14]. For a growing population, each time step comprises innovation by all adults (equation (5.1)) followed by either a birth–death event (population size does not change) or a birth-only event (population size increases by one individual). For the former event equation (5.2) applies, and for the latter,

    $$\bar{P}_l' = b_{l-1}\,\bar{P}_{l-1}^* + (1 - b_l)\,\bar{P}_l^*, \tag{10.1}$$

    because dl = 0. Fogarty et al. [14] show that if a birth-only event occurs every S time steps, then the growth rate per generation of the population is r = 1/S. For a declining population, the recursion for a death-only event,

    $$\bar{P}_l' = (1 - d_l)\,\bar{P}_l^* + d_{l+1}\,\bar{P}_{l+1}^*, \tag{10.2}$$

    replaces equation (10.1) for the birth-only event.
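
    The two one-sided events can be sketched as follows. This is an illustration under stated assumptions, not the code of Fogarty et al. [14]: the birth-only recursion is taken to be equation (5.2) with d_l = 0, the death-only recursion is taken to be equation (5.2) with b_l = 0, and the function names, the example spectrum and the probabilities b_l = βl/N, d_l = l/N are hypothetical.

```python
def birth_only_step(p_star, b):
    """Expected popularities after a birth-only event (population N -> N + 1).

    p_star[l] (l = 1..N) are the values just after innovation and b[l] is the
    probability that the newborn acquires a trait of popularity l, with
    b[0] = 0. The returned list covers l = 1..N + 1 (index 0 is padding).
    """
    n = len(p_star) - 1
    new = [0.0] * (n + 2)
    for l in range(1, n + 2):
        gained = b[l - 1] * p_star[l - 1]
        kept = (1 - b[l]) * p_star[l] if l <= n else 0.0
        new[l] = gained + kept
    return new


def death_only_step(p_star, d):
    """Expected popularities after a death-only event (population N -> N - 1).

    d[l] is the probability that the dying adult carries a trait of
    popularity l. The returned list covers l = 1..N - 1 (index 0 is padding).
    """
    n = len(p_star) - 1
    new = [0.0] * n
    for l in range(1, n):
        new[l] = (1 - d[l]) * p_star[l] + d[l + 1] * p_star[l + 1]
    return new


# Hypothetical spectrum for N = 4 adults, with assumed b_l and d_l.
n, beta = 4, 0.5
p_star = [0.0, 2.0, 1.0, 0.5, 0.25]
b = [beta * l / n for l in range(n + 1)]
d = [l / n for l in range(n + 1)]
grown = birth_only_step(p_star, b)
shrunk = death_only_step(p_star, d)
print(sum(grown), sum(shrunk))
```

    A birth-only event cannot remove a trait, so the expected number of distinct traits is unchanged. A death-only event removes, in expectation, the fraction d_1 of the traits carried by a single individual.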

    The effect of population growth and decline on the expected number of different cultural traits in the population, $\bar{C}_{\mathrm{pop}}$, can be investigated by numerical iteration of equations (5.1), (5.2), (10.1) and (10.2). Figure 4 illustrates a case where a periodic solution is observed. Clearly, two different values of $\bar{C}_{\mathrm{pop}}$ are predicted for each (intermediate) value of N, depending on whether the population is in the growth phase or decline phase. Specifically, a relatively large population in the growth phase (e.g. N = 300) may be associated with a smaller value of $\bar{C}_{\mathrm{pop}}$ (≈ 1000) than a smaller population in the decline phase (e.g. N = 200 with $\bar{C}_{\mathrm{pop}}$ ≈ 2000).

    Figure 4. Dependence of $\bar{C}_{\mathrm{pop}}$ on N during the growth (blue) and decline (green) phases of a periodic equilibrium. Best-of-2, with parameter values β = 0.7, μ = 1, S = 5.

    Ghirlanda & Enquist [24] have defined a variable, x, which they call the ‘amount of culture' and assume satisfies the continuous-time deterministic dynamic

    $$\frac{dx}{dt} = -\gamma x + \delta N, \tag{11.1}$$

    where, as before, N is the population size. Parameter γ is the rate of decay per unit time of the amount of culture due to the infidelity of social learning. Parameter δ is the rate of innovation per individual per unit time, and hence the second term on the right-hand side of equation (11.1) represents the total input of innovations per unit time.
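
    For a fixed population size the dynamic has a single equilibrium, which a short numerical integration confirms. This sketch assumes equation (11.1) in the form dx/dt = −γx + δN; the Euler scheme, the function names and the parameter values are illustrative assumptions.

```python
def simulate_culture(x0, n, gamma, delta, dt, steps):
    """Euler integration of dx/dt = -gamma * x + delta * n with n held fixed."""
    x = x0
    for _ in range(steps):
        x += dt * (-gamma * x + delta * n)
    return x


def equilibrium_culture(n, gamma, delta):
    """Amount of culture at which decay balances the input of innovations."""
    return delta * n / gamma


# Illustrative (assumed) values: decay rate 0.1, innovation rate 0.02.
for n in (50, 100, 200):
    print(n, simulate_culture(0.0, n, 0.1, 0.02, 0.1, 5000),
          equilibrium_culture(n, 0.1, 0.02))
```

    With N fixed, the equilibrium amount of culture is proportional to N, so a correlation with population size is built into this model unless N itself responds to culture.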

    In contrast with equation (2.3) for the Henrich [6] model as modified by Mesoudi [17] or equations (5.1) and (5.2) for the 0,1 vector model, equation (11.1) is not derived from mechanistic assumptions (realistic or not) on the behaviour of individuals. It is a ‘conceptual' model, which has a broad if nonspecific applicability. Clearly, variable x represents a population property; equation (11.1) is similar in form to equation (2.3), but variable x would seem to have a meaning more analogous to $\bar{C}_{\mathrm{pop}}$.

    In the preceding theoretical analyses, population size N was either fixed or assumed to change independently of culture. Here, we combine equation (11.1) with a dynamic for N that depends reciprocally on the amount of culture. Specifically, we set

    [equation (11.2); not recoverable from the source]

    where the carrying capacity is assumed to be given by the following monotone non-decreasing ‘ramp' function of the amount of culture [37],

    [equation (11.3); not recoverable from the source]

    The equilibria of the coupled system of equations (11.1), (11.2) and (11.3) are obtained on setting dx/dt = 0 and dN/dt = 0 simultaneously. In addition to the extinction equilibrium, $(\hat{x}, \hat{N}) = (0, 0)$, which is unstable, the intersection(s) of the two null clines,

    $$N = \frac{\gamma}{\delta}\,x \tag{11.4a}$$

    and

    [equation (11.4b), the null cline for dN/dt = 0; not recoverable from the source]

    yield equilibria.

    Figure 5 shows the space of the variables x and N with null clines that, in this case, intersect three times. Of the three equilibria, the two on the outside are each locally stable and the one in the middle is unstable (bistability). The state of a population—its x and N values—can be represented by a point on this space. It is likely to coincide with or lie close to one of the two locally stable equilibria, because a locally stable equilibrium acts as an attractor.

    Figure 5. Space of the variables x and N with the null clines, equations (11.4a) and (11.4b), depicted by solid and broken lines, respectively. Of the three intersections of these null clines, the two on the outside are locally stable ‘attracting' equilibria. Each green dot represents a hypothetical sampled population. (a) Sampled populations are distributed around both locally stable equilibria; correlation is observed. (b) Sampled populations are randomly distributed around just one of the locally stable equilibria; correlation is not observed.

    Consider now a sample of more than one population for which estimates of population size and toolkit size are available. We can plot the state of each sampled population as a point on figure 5 (green dots), equating these estimates with the variables N and x, respectively. Then, two possible outcomes can be envisaged. First, some of the points may be distributed around one of the locally stable equilibria and the remainder around the other (figure 5a). Second, all of the points may be distributed around just one of the locally stable equilibria (figure 5b). In the first case, we should observe a correlation between population size and toolkit size, as empirical studies have shown to be the case for food producers [12,21]. In the second case, we do not expect to observe a correlation if the points are randomly distributed around the one equilibrium, which is perhaps the situation with the ethnographic hunter–gatherer data [15].

    The 0,1 vector model is the most general model of cultural evolution currently available [13,14,22,23,30]. It has also found use in empirical studies as a convenient way of summarizing data [19,38–40]. Any attempt to articulate theoretical and empirical studies must choose the variable(s) of the former and the observable(s) of the latter so that they agree. With regard to the empirical studies on toolkit size [1–3,11,41], I have argued that the variable Cpop, not Cind, of the 0,1 vector model is most suitable. The variable $\bar{z}$ of the Henrich [6] model is analogous to $\bar{C}_{\mathrm{ind}}$, and hence Collard et al. [4] and Vaesen et al. [5] would seem to have criticized the wrong model for the wrong reasons. However, I repeat that what is really needed is a theory in terms of artefacts rather than the individuals that make them.

    Focusing on the variable Cpop, I have shown that when the population size, N, is constant, the expected value of this variable at equilibrium, $\hat{\bar{C}}_{\mathrm{pop}}$, is predicted to correlate with N for the seven modes of cultural transmission examined or mentioned in this paper: random oblique, vertical, best-of-K, conformist, anticonformist, success bias and one-to-many. I have also suggested two theoretical scenarios based on the 0,1 vector model (sections 8 and 9) and one scenario based on a model of feedback between population size and amount of culture (section 10), which can explain the empirical absence of correlation.

    It is of course possible that Cind = Cpop. I noted above that the expected equilibrium values of these variables are equal, i.e. $\hat{\bar{C}}_{\mathrm{ind}} = \hat{\bar{C}}_{\mathrm{pop}}$, if and only if $\hat{\bar{P}}_l = 0$ for $1 \le l \le N - 1$ (so that only $\hat{\bar{P}}_N$ can be non-zero). This is the situation in which each extant cultural trait is shared by all individuals in the population. I also noted that this special case is inconsistent with the positive solution of $\hat{\bar{P}}_l$ ($1 \le l \le N$) given by equation (5.4). This entails that $\hat{\bar{C}}_{\mathrm{ind}} = \hat{\bar{C}}_{\mathrm{pop}}$ cannot hold for random oblique, vertical, best-of-K, conformist, or anticonformist transmission.

    I have omitted considerations of natural selection in this paper. Individual-level selection can be incorporated into the Henrich [6] model and the 0,1 vector model. For example, we could add directional selection, in which the viability of the ith individual is assumed to depend positively on the number of cultural traits he/she carries, [math]. Preliminary individual-based simulation results for random oblique transmission indicate that the values of [math] and [math] are both displaced upward, relative to the case of no natural selection, but that for both variables the dependence on N remains qualitatively unchanged.

    The model of feedback between population size and amount of culture is different. Here, the variable x represents a population property, and equation (11.1) cannot be modified to include individual-level selection. On the other hand, it is implicit in this model that group-level selection is acting, because a larger population size, N, is associated with more culture, x (equations (11.2), (11.3)). Gilpin et al. [37] have extended this model to deal with interspecific competition.

    To return to the main argument, Collard et al. [1] have shown that ‘risk of resource failure’ is a good predictor of toolkit size among ethnographic hunter–gatherers (see also [3]). Equations (5.4) and (5.5) may be relevant in this connection. These equations show that [math] increases linearly with the innovation rate, μ. Hence, if such risk stimulates innovation (i.e. if there is any truth in the commonplace, ‘necessity is the mother of invention’), then theory and observation would seem to be in agreement.

    This article has no additional data.

    I declare I have no competing interests.

    This work was supported in part by Monbukagakusho grant 16H06412 to Joe Yuichiro Wakano.

    I thank Atsushi Nobayashi for discussion on the nature of ethnographic data; Naoyuki Takahata for information on the population genetics literature; Alex Mesoudi, Marc Feldman and the reviewers for comments on earlier drafts.

    Note first that equation (5.3c) when spelled out comprises N − 2 lines. Adding equations (5.3a), (5.3b) and all N − 2 lines of equation (5.3c), we find that the terms in [math] cancel, yielding equation (5.4a) for [math]. Next, adding equation (5.3b) and all N − 2 lines of equation (5.3c), we find that the terms in [math] cancel. Then, [math] can be expressed in terms of [math] as

        [math]    (A 1)

    Similarly, adding the [math] lines of equation (5.3c) from the lth ([math]) to the last, we obtain

        [math]    (A 2)

    and the last line of equation (5.3c) reduces to

        [math]    (A 3)

    because dN = N/N = 1. Putting equations (5.4a), A 1, A 2 and A 3 together yields equation (5.4b).

    With random oblique transmission of fidelity β = 1, the 0,1 vector model of cultural evolution reduces to the Moran model for a haploid genetic population. Direct substitution into equations (5.3a), (5.3b) and (5.3c) shows that the equilibrium expected popularities are

        [math]    (A 4)

    and

        [math]    (A 5)

    for [math]. On the other hand, we have from the last line of equation (5.3c),

        [math]    (A 6)

    Hence, after [math] has reached its equilibrium value of [math], we see that [math] will increase arithmetically at rate μ/N per time step, or μ per generation, which is the substitution rate. Equations A 4 and A 5 agree with equation 9.24 in Ewens [28] if we substitute the diploid number of genes 2N for N and take into account that [math] is evaluated before innovation.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    References

    1. Collard M, Kemery M, Banks S. 2005 Causes of toolkit variation among hunter-gatherers: a test of four competing hypotheses. Can. J. Archaeol. 29, 1–19.
    2. Read D. 2006 Tasmanian knowledge and skill: maladaptive imitation or adequate technology? Am. Antiquity 71, 164–184. (doi:10.2307/40035327)
    3. Read D. 2008 An interaction model for resource implement complexity based on risk and number of annual moves. Am. Antiquity 73, 599–625. (doi:10.1017/S0002731600047326)
    4. Collard M, Vaesen K, Cosgrove R, Roebroeks W. 2016 The empirical case against the ‘demographic turn’ in Palaeolithic archaeology. Phil. Trans. R. Soc. B 371, 20150242. (doi:10.1098/rstb.2015.0242)
    5. Vaesen K, Collard M, Cosgrove R, Roebroeks W. 2016a Population size does not explain past changes in cultural complexity. Proc. Natl Acad. Sci. USA 113, E2241–E2247. (doi:10.1073/pnas.1520288113)
    6. Henrich J. 2004 Demography and cultural evolution: how adaptive cultural processes can produce maladaptive losses—the Tasmanian case. Am. Antiquity 69, 197–214. (doi:10.2307/4128416)
    7. Powell A, Shennan S, Thomas MG. 2009 Late Pleistocene demography and the appearance of modern human behavior. Science 324, 1298–1301. (doi:10.1126/science.1170165)
    8. Henrich J, Boyd R, Derex M, Kline MA, Mesoudi A, Muthukrishna M, Powell AT, Shennan SJ, Thomas MG. 2016 Understanding cumulative cultural evolution. Proc. Natl Acad. Sci. USA 113, E6724–E6725. (doi:10.1073/pnas.1610005113)
    9. Vaesen K, Collard M, Cosgrove R, Roebroeks W. 2016b The Tasmanian effect and other red herrings. Proc. Natl Acad. Sci. USA 113, E6726–E6727. (doi:10.1073/pnas.1613074113)
    10. Aoki K, Feldman MW. 2014 Evolution of learning strategies in temporally and spatially variable environments: a review of theory. Theor. Popul. Biol. 91, 3–19. (doi:10.1016/j.tpb.2013.10.004)
    11. Collard M, Buchanan B, Morin J, Costopoulos A. 2011 What drives the evolution of hunter-gatherer subsistence technology? A reanalysis of the risk hypothesis with data from the Pacific Northwest. Phil. Trans. R. Soc. B 366, 1129–1138. (doi:10.1098/rstb.2010.0366)
    12. Collard M, Buchanan B, O'Brien MJ. 2013 Population size as an explanation for patterns in the Paleolithic archaeological record: more caution is needed. Curr. Anthropol. 54, S388–S396. (doi:10.1086/673881)
    13. Fogarty L, Wakano JY, Feldman MW, Aoki K. 2015 Factors limiting the number of independent cultural traits that can be maintained in a population. In Learning strategies and cultural evolution during the palaeolithic (eds Mesoudi A, Aoki K), pp. 9–21. Tokyo, Japan: Springer.
    14. Fogarty L, Wakano JY, Feldman MW, Aoki K. 2017 The driving forces of cultural complexity: neanderthals, modern humans, and the question of population size. Hum. Nat. 28, 39–52. (doi:10.1007/s12110-016-9275-6)
    15. Aoki K. 2015 Modeling abrupt cultural regime shifts during the Palaeolithic and Stone Age. Theor. Popul. Biol. 100, 6–12. (doi:10.1016/j.tpb.2014.11.006)
    16. Kobayashi Y, Aoki K. 2012 Innovativeness, population size and cumulative cultural evolution. Theor. Popul. Biol. 82, 38–47. (doi:10.1016/j.tpb.2012.04.001)
    17. Mesoudi A. 2011 Variable cultural acquisition costs constrain cumulative cultural evolution. PLoS ONE 6, e18239. (doi:10.1371/journal.pone.0018239)
    18. Oswalt WH. 1976 An anthropological analysis of food-getting technology. New York, NY: Wiley.
    19. Jordan P, Shennan S. 2009 Diversity in hunter-gatherer technological traditions: mapping trajectories of cultural ‘descent with modification’ in northeast California. J. Anthropol. Archaeol. 29, 342–365. (doi:10.1016/j.jaa.2009.05.004)
    20. Roscoe P. 2006 Fish, game, and the foundations of complexity in forager society: the evidence from New Guinea. Cross-Cultural Res. 40, 29–46. (doi:10.1177/1069397105282432)
    21. Kline MA, Boyd R. 2010 Population size predicts technological complexity in Oceania. Proc. R. Soc. B 277, 2559–2564. (doi:10.1098/rspb.2010.0452)
    22. Strimling P, Sjöstrand J, Enquist M, Eriksson K. 2009 Accumulation of independent cultural traits. Theor. Popul. Biol. 76, 77–83. (doi:10.1016/j.tpb.2009.04.006)
    23. Lehmann L, Aoki K, Feldman MW. 2011 On the number of independent cultural traits carried by individuals and populations. Phil. Trans. R. Soc. B 366, 424–435. (doi:10.1098/rstb.2010.0313)
    24. Ghirlanda S, Enquist M. 2007 Cumulative culture and explosive demographic transitions. Qual. Quant. 41, 591–600. (doi:10.1007/s11135-007-9070-x)
    25. Moran PAP. 1958 Random processes in genetics. Proc. Camb. Phil. Soc. 54, 60–71. (doi:10.1017/S0305004100033193)
    26. Kimura M. 1969 The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics 61, 893–903.
    27. Ewens WJ. 1974 A note on the sampling theory for infinite alleles and infinite sites models. Theor. Popul. Biol. 6, 143–148. (doi:10.1016/0040-5809(74)90020-3)
    29. Watterson GA. 1975 On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7, 256–276. (doi:10.1016/0040-5809(75)90020-9)
    30. Aoki K, Lehmann L, Feldman MW. 2011 Rates of cultural change and patterns of cultural accumulation in stochastic models of social transmission. Theor. Popul. Biol. 79, 192–202. (doi:10.1016/j.tpb.2011.02.001)
    31. Enquist M, Strimling P, Eriksson K, Laland K, Sjostrand J. 2010 One cultural parent makes no culture. Anim. Behav. 79, 1353–1362. (doi:10.1016/j.anbehav.2010.03.009)
    32. Lumsden C, Wilson EO. 1981 Genes, mind, and culture. Cambridge, MA: Harvard University Press.
    33. Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: University Chicago Press.
    34. Lachlan RF, Janik VM, Slater JB. 2004 The evolution of conformity-enforcing behavior in cultural communication systems. Anim. Behav. 68, 561–570. (doi:10.1016/j.anbehav.2003.11.015)
    35. Eriksson K, Enquist M, Ghirlanda S. 2007 Critical points in current theory of conformist social learning. J. Evol. Psych. 5, 67–87. (doi:10.1556/JEP.2007.1009)
    36. Nakahashi W. 2007 The evolution of conformist transmission in social learning when the environment changes periodically. Theor. Popul. Biol. 72, 52–66. (doi:10.1016/j.tpb.2007.03.003)
    37. Gilpin W, Feldman MW, Aoki K. 2016 An ecocultural model predicts Neanderthal extinction through competition with modern humans. Proc. Natl Acad. Sci. USA 113, 2134–2139. (doi:10.1073/pnas.1524861113)
    38. Rogers DS, Ehrlich PR. 2008 Natural selection and cultural rates of change. Proc. Natl Acad. Sci. USA 105, 3416–3420. (doi:10.1073/pnas.0711802105)
    39. Rogers DS, Feldman MW, Ehrlich PR. 2009 Inferring population histories using cultural data. Proc. R. Soc. B 276, 3835–3843. (doi:10.1098/rspb.2009.1088)
    40. Jordan P, O'Neill S. 2010 Untangling cultural inheritance: language diversity and long-house architecture on the Pacific northwest coast. Phil. Trans. R. Soc. B 365, 3875–3888. (doi:10.1098/rstb.2010.0092)
    41. Collard M, Ruttle A, Buchanan B, O'Brien MJ. 2013 Population size and cultural evolution in nonindustrial food-producing societies. PLoS ONE 8, e72628. (doi:10.1371/journal.pone.0072628)



    From the accumulation of innovations driving the emergence of complex technologies to the accumulation of knowledge paving the way to increasingly accurate scientific theories, cumulative cultural evolution has set the stage for the remarkable ecological success of our species. Thus, identifying the determinants of cumulative cultural evolution is a key issue in the interdisciplinary field of cultural evolution.

    Much interest has focused on demography as a determinant of the rate of cumulative evolution [1–15]. In general, larger populations are thought to facilitate cumulative cultural evolution because they host a larger number of innovators and are less likely to suffer from random loss or incomplete transmission of cultural traits.

    A number of empirical studies have explored the relationship between population size and cultural complexity. Results have been mixed: some studies reported an effect in line with theoretical expectations [3,4] while others found no effect [16–18]. These inconsistencies have raised concerns among some scholars about the veracity of the link between population size and cultural complexity [19–21]. However, others have stressed that theoretical models specifically predict a positive relationship between cultural complexity and the effective population size, i.e. the size of the population that shares information [2,22]. Empirical studies that used census population size, i.e. the estimated size of a particular group without taking into account contacts with other groups, should thus be interpreted with caution. When tested under controlled conditions, the positive relationship between population size and cultural complexity is well supported: a growing body of laboratory experiments shows that groups composed of a larger number of individuals produce more complex cultural traits than smaller groups [9–12].

    A recent experimental study, however, suggests that partially connected groups produce more complex cultural traits than fully connected groups of the same size when innovation depends on the recombination of existing cultural traits [13]. These results seem to be at odds with theoretical models of cultural evolution that predict that increasing the degree of connectedness, whether within or between populations, will positively affect a population's ability to accumulate cultural information. Increasing the degree of connectedness, the argument goes, gives individuals access to a larger number of social models and promotes individuals' opportunity to build upon each other's solutions [1–3,9]. Most models of cultural evolution, however, focus on the transmission of cultural traits, without considering the processes underlying the production of new traits. For instance, many models fail to capture the fact that rates of innovation are determined, in part, by the level of cultural diversity that exists in a population.

    In many fields, population structure is considered a strong driver of the amount of diversity that exists in a population. Population geneticists, most notably Sewall Wright, have emphasized how populations subdivided into small and partially isolated subgroups would explore a more diverse set of solutions than populations with unconstrained gene flow [23]. Similarly, organization scientists have shown that groups that are well connected tend to lose cultural diversity faster than less-connected groups because individuals' propensity to learn from successful cultural models causes the entire population to converge rapidly on the same solution [24–27].

    In this paper, we argue that population structure is likely to critically affect cumulative culture through its effects on both production and maintenance of innovations. First, there exists a relationship between population connectedness and the exploration of the design space [13,24,25,27]. This suggests that populations subdivided into partially isolated subgroups will produce more diverse cultural traits than fully connected populations. Second, evidence from various fields suggests that innovation rates are affected by the level of cultural diversity that exists in a population [28]. This suggests that populations divided into partially isolated subgroups will exhibit more innovative abilities than fully connected populations. Third, theoretical and experimental studies of cultural evolution show that there is a relationship between the size of a population and its ability to maintain complex cultural traits [1,2,9]. This suggests that populations divided into partially isolated subgroups will be less likely to maintain complex cultural traits than fully connected populations.

    In the following sections, we aim at incorporating these different ideas within a single cultural-evolution framework to investigate the effect of population fragmentation on cumulative culture. We begin by reviewing the literature from various fields to highlight how increasing connectedness can decrease cultural diversity and stifle populations' innovative ability. We then use a simple agent-based model to show that for a given population size, there exists an intermediate level of population fragmentation that maximizes cumulative cultural evolution. Finally, we discuss the relevance of considering population fragmentation to explain patterns of cultural change in a wide range of contexts.

    Agent-based models from organization science show that population structure affects individuals' ability to solve problems associated with rugged fitness landscapes. Rugged fitness landscapes are hard to search because they have multiple peaks and so it is easy to get stuck at a local maximum. In fully connected populations, individuals are more likely to observe, and imitate, the same set of successful models, which can cause the entire population to converge rapidly to a suboptimal peak. In less-connected populations, individuals observe only a subset of cultural models and so do not benefit from the same cultural information. This can lead to more thorough exploration of the design space because it reduces populations' probability of prematurely converging on suboptimal solutions [24,25,27].

    Models investigating search in rugged landscapes differ from models of cumulative cultural evolution because they assume landscapes with a limited set of solutions and they focus on identifying the conditions that allow a population to find the most rewarding solution. By contrast, models of cumulative culture aim to capture an open-ended process that generates increasingly complex solutions. The ultimate goal of these models is to identify the conditions that promote the production and maintenance of complex cultural traits. Nevertheless, the literature about search in rugged landscapes can inform cultural evolution theory because the production of complex cultural traits can benefit from thorough exploration of the cultural landscape.

    Economists have pointed out that among all the possible directions technological development may take, only a small portion is ever realized [29]. Furthermore, it has been acknowledged that evolutionary change exhibits path dependence because early innovations constrain the future direction of change [30–33] (the notion is similar to what evolutionary biologists call phylogenetic inertia [34]). The QWERTY keyboard, for example, was invented in order to prevent jamming of the keys in the case of mechanical typewriting. With the invention of computer keyboards, the jamming problem disappeared but the QWERTY layout persists nowadays despite more efficient solutions [35].

    In models of search in complex landscapes, some structural isolation is beneficial because it promotes the exploration of several peaks and increases the population's likelihood of finding the highest peak. In an open-ended landscape, it suggests that populations subdivided into partially isolated subgroups should explore a more diverse set of trajectories than fully connected populations do. Moreover, in a cumulative framework, the partial isolation of subgroups might result in a feedback loop between cultural diversity and innovation because occasional contacts between groups will bring a variety of cultural traits together and will promote combinatorial opportunities.

    Economists studying the evolution of technology have long stressed the importance of the horizontal transfer of knowledge and innovations between different, but complementary, technological trajectories (e.g. [29,36,37]).

    Such transfers can take place between related trajectories [38–40]. For instance, phylogenetic analyses show that the evolution of the cornet, a brass wind musical instrument, was propelled by horizontal transfers between different coexisting types of cornets, as were other musical instruments such as the Baltic psaltery, a plucked stringed instrument [38]. Horizontal transfers between related lineages have also been documented among modern technologies such as programming languages [39]. The incorporation of solutions from different lineages is actually so common in the evolution of material culture that it limits the relevance of using biological phylogenetic methodology to infer historical patterns of material culture [41].

    Recombination and horizontal transfer can also take place between unrelated branches of knowledge [42]. For instance, in order to invent the electric light, Thomas Edison combined innovations made in electricity generation, the manufacture of conducting filaments and the removing of gas molecules from sealed volumes [36]. Barely modified traits can also serve a new function in a different domain. For example, pintle and gudgeon hinges that were used to mount sternpost rudders on medieval sailing boats during the late thirteenth century were borrowed from newly developed iron hinges from large castle and cathedral doors [43].

    But the merging of knowledge extends well beyond the domain of technology. Science, for example, benefits from borrowing ideas, concepts and methods between disciplines [44]. An analysis of 17.9 million academic papers shows that papers built upon unusual combinations of prior knowledge (e.g. work from unrelated disciplines) are more likely to have a high impact [45]. Similarly, economics and social psychology studies suggest that exposure to foreign cultures fosters innovation. Skilled migrants have been shown to positively contribute to knowledge creation in host countries (measured either by the number of patents applied for or by the number of citations to published articles) [46]. Consistently, experiments showed that ethnically diverse groups can generate better ideas than homogeneous groups during brainstorming sessions [47] and that studying abroad positively affects creative thinking among Western students [48].

    These studies suggest that innovation is fuelled by the combination of unrelated bodies of skills, technology and knowledge. This, in turn, indicates that a population's ability to innovate should depend on how culturally diverse it is.

    Models of cumulative cultural evolution aim at identifying the conditions promoting the evolution of complex cultural traits. So far, we have suggested that low levels of connectedness might increase the production of new cultural traits because structural isolation promotes cultural diversity and combinatorial opportunities. However, theoretical models suggest that populations composed of partially isolated subgroups will be more likely to suffer from random loss of cultural traits than well-connected populations because the probability of inaccurate transmission is negatively related to the number of cultural learners [1,2]. In less-connected networks, information travels slower so that fewer individuals will be exposed to novel adaptive cultural traits. Experimental studies of cultural evolution demonstrated that cultural traits are more likely to become deteriorated, or even lost, when groups of cultural learners are small [9,11].

    These facts suggest that population fragmentation is a double-edged sword. On the one hand, it allows the production of more complex cultural traits, and on the other hand, it restrains populations' ability to maintain these traits. This suggests that for any given population size, there should be an intermediate level of fragmentation that maximizes cumulative cultural evolution, by balancing cultural loss and cultural diversity.

    We use an agent-based model to explore the effects of population fragmentation on cultural accumulation. To do so, we integrate into a single cultural evolution framework the ideas discussed above, which are scattered across different fields. Studies from the field of organization science have explored the effect of population structure on the diversity of observed solutions. However, these studies are usually based on finite landscapes where cultural diversity only serves the pinpointing of an optimal solution [24,25,27]. The effects of connectedness on cultural diversity are also well known in the field of cultural diffusion but these models usually condition the act of copying on cultural similarity, or rates of adoption, rather than on the success of cultural models (e.g. [49]). Moreover, neither of these fields deals with cultural traits of varying complexity. Cultural-evolution models have extensively investigated the relationship between traits' complexity and their probability of being lost [1,2,50]. However, these models often neglect the processes that underlie the production of these traits. In particular, most of the existing models of cumulative cultural evolution do not account for how cultural diversity affects the production of new knowledge (e.g. [2,3,14]).

    We model cumulative cultural evolution as a walk on a tree in order to capture the idea that past innovations shape future evolution (figure 1). Each branch on the tree denotes a different trajectory and the nodes represent different cultural items. The hierarchical level at which an item lies specifies its complexity. From the base of the tree two alternative items of a complexity level of 1 can be produced, A and B. Each of these items can then be improved in two different ways. Item A, for example, can give rise to AA or AB, which are items with a complexity level of 2. Alternatively, item B can give rise to BA or BB. The number of cultural items that can be produced at any given level of complexity C equals 2^C as each new item opens two alternative pathways.

    Figure 1. The cultural landscape is modelled as a branching tree. Each node represents a cultural item that can give rise to two new cultural items. More complex items are composed of an increasing number of sub-items. Progress along different trajectories leads to cultural divergence. The number of alternative items increases with cultural complexity according to 2^C. The coloured lines are examples of progression within the landscape. The individual in red produced A then AB then ABA. The diversity level of her cultural repertoire is 3 (she knows A, AB and ABA) and its complexity level is 3 (that is the complexity level of ABA, the most complex item in the repertoire). The individual in blue produced B. The diversity level of her cultural repertoire is 1 and its complexity level is 1. If these two individuals could learn from each other, they could end up with four items in their cultural repertoire (A, AB, ABA and B). The diversity level of their cultural repertoire would be 4 with a complexity level of 3 (as ABA is the most complex item of this repertoire). The number of cultural items that can be produced is theoretically unlimited.

    Highly complex items are composed of simpler items along the same trajectory. An increase in cultural complexity is represented as a climb up the tree, and cultural loss as a walk down it. This captures the fact that the accumulation of cultural traits leads to new branching possibilities, that is new opportunities for cultural diversity.

    In the model, cultural complexity refers to the level at which innovations lie in the tree-shaped landscape—climbing the tree means higher complexity. Cultural diversity refers to the number of nodes that are explored by a population and captures populations' ability to produce a variety of cultural items. Because groups can diverge, this model differs from cultural dissemination models where cultural diversity can only go down across time (e.g. [49]).
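The landscape and the two repertoire measures can be illustrated with a minimal sketch. It assumes items are encoded as strings over the letters A and B, so that the string length is the complexity level. The function names (`items_at_level`, `expand`, `repertoire_stats`) are hypothetical and chosen only for illustration; they are not taken from the authors' code.

```python
from itertools import product


def items_at_level(c):
    """All items of complexity c: every string of length c over 'A' and 'B'."""
    return ["".join(p) for p in product("AB", repeat=c)]


def expand(item):
    """An item implies its sub-items on the same trajectory (ABA -> A, AB, ABA)."""
    return {item[:k] for k in range(1, len(item) + 1)}


def repertoire_stats(items):
    """Return (diversity, complexity) for a collection of items.

    Diversity counts the distinct items known, sub-items included;
    complexity is the level of the most complex item.
    """
    known = set()
    for it in items:
        known |= expand(it)
    complexity = max((len(it) for it in known), default=0)
    return len(known), complexity


print(len(items_at_level(3)))            # 8, i.e. 2^3 items at level 3
print(repertoire_stats(["ABA"]))         # (3, 3): the red individual of figure 1
print(repertoire_stats(["ABA", "B"]))    # (4, 3): red and blue repertoires pooled
```

The pooled repertoire reproduces the values given in the caption of figure 1: four items with a complexity level of 3.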

    We simulated technological evolution in a population of size n and fragmented equally into f subpopulations. For instance, when n = 600 and f = 1, the population is composed of one single group of 600 individuals. Similarly, when n = 600 and f = 5, the population is composed of five subpopulations of 120 individuals each.
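A minimal sketch of the equal fragmentation described above, assuming that n is divisible by f and that individuals are identified by integer indices. The helper name `split_population` is hypothetical.

```python
def split_population(n, f):
    """Split individuals 0..n-1 into f equally sized subpopulations.

    Assumption: n must be divisible by f, as in the examples in the text.
    """
    if f < 1 or n % f != 0:
        raise ValueError("n must be a positive multiple of f")
    size = n // f
    return [list(range(g * size, (g + 1) * size)) for g in range(f)]


groups = split_population(600, 5)
print(len(groups), len(groups[0]))  # 5 120
```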

    At the start of a simulation, individuals do not possess any cultural items. At the beginning of each time step, they innovate with probability p. Individuals innovate from their most complex item. If an individual possesses more than one item with the same level of complexity, she picks one randomly. When an innovation occurs, individuals acquire one of the two alternative solutions with the same probability. An individual having discovered item B, for example, can only produce BA or BB and not AA or AB. This simulates the effect of early innovation events on future direction of change and individuals' tendency for local search (empirical evidence from many fields suggests that individuals, and firms, tend to search locally for new solutions by building upon their established technology and expertise [36,51,52]). Thus, isolated individuals progress along a single trajectory and will not explore a diverse set of branches. Individuals, however, can acquire items from various branches through social learning although they always innovate from the most complex item in their toolkit. This captures the effect of specialized knowledge on innovation.
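The innovation step might be sketched as follows, again assuming that items are strings over A and B and that a repertoire is a set of such strings. The function name `innovate` and the use of a seeded `random.Random` instance are illustrative assumptions. This sketch deliberately leaves out any dependence of innovation on cultural diversity.

```python
import random


def innovate(repertoire, p, rng):
    """One innovation attempt; returns a new repertoire (set of item strings).

    With probability p the individual extends one of her most complex items
    by one level, choosing either alternative with equal probability.
    """
    if rng.random() >= p:
        return set(repertoire)          # no innovation this time step
    if repertoire:
        top = max(len(it) for it in repertoire)
        # Ties between equally complex items are broken at random.
        base = rng.choice(sorted(it for it in repertoire if len(it) == top))
    else:
        base = ""                       # start from the base of the tree
    return set(repertoire) | {base + rng.choice("AB")}


rng = random.Random(1)
print(innovate({"B"}, 1.0, rng))        # {'B', 'BA'} or {'B', 'BB'}, never AA or AB
```

Because the new item always extends the most complex item already held, an isolated individual can only progress along a single trajectory, as the text describes.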

    After individuals have had a chance to innovate, they learn socially by copying the items of the members of their subpopulation. When individuals observe more than one technology belonging to the same technological trajectory, they adopt the most complex one. For instance, if individuals observe items BBA and B, they learn BBA (which includes B). Errors may occur during the social learning phase: individuals can fail to properly acquire an item and sometimes end up with a less complex form of that item. For example, an individual may observe item AAA, but end up with AA or even A. The extent of the error, i.e. the number of hierarchical levels that separates the item observed from the one learnt, is binomially distributed with parameter ɛ, a constant probability of error, and a sample size equal to the complexity of the trait being innovated. Thus, the more complex an item is, the harder it is to learn accurately.
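
    A learning error of this kind can be sketched as below. The sketch assumes that items are strings over the letters A and B, so that losing k hierarchical levels means dropping the last k letters, and that losing every level leaves the learner with nothing; the function name `learn_with_error` is likewise hypothetical.

```python
import random


def learn_with_error(observed: str, eps: float, rng: random.Random) -> str:
    """Return the (possibly truncated) form of `observed` that a learner acquires."""
    complexity = len(observed)
    # one Bernoulli(eps) trial per hierarchical level gives a
    # Binomial(complexity, eps) number of lost levels
    levels_lost = sum(rng.random() < eps for _ in range(complexity))
    # assumption: losing all levels returns the empty string (nothing learnt)
    return observed[: complexity - levels_lost]
```

    Under this reading, an item of complexity C is copied without error with probability (1 − ɛ)^C. With an error probability of 0.2 that is about 0.33 for C = 5, which illustrates why more complex items are harder to learn accurately.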

    After the social learning phase, individuals visit other subgroups with probability m. When individuals visit a subpopulation, they choose one population randomly and spend the next time step in that subpopulation. After the next time step is over, visiting individuals return to their primary subgroup. For simplicity, individuals carry all their items when they visit.
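
    The visiting rule can be sketched as below. Drawing the destination uniformly among the other subpopulations is an assumption of this sketch (the text says only that one is chosen randomly), and the names `choose_locations` and `home` are hypothetical.

```python
import random


def choose_locations(home: list[int], f: int, m: float, rng: random.Random) -> list[int]:
    """For each individual, return the subpopulation where the next time step is spent."""
    locations = []
    for h in home:
        if f > 1 and rng.random() < m:
            # assumption: visitors pick uniformly among the *other* subpopulations
            locations.append(rng.choice([g for g in range(f) if g != h]))
        else:
            locations.append(h)
    return locations
```

    Calling the function again from `home` at every time step means that a visit lasts one step, after which individuals are back in their primary subgroup unless they happen to visit again.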

    Our model aims to capture the idea that cultural diversity promotes opportunities to innovate. Cultural diversity can promote innovation in many ways; however, for simplicity, we suppose that there is a positive relationship between cultural diversity and the opportunity to create complex traits. This assumption is based on the fact that higher amounts of cultural diversity create new combinatorial opportunities and more complex traits are usually composed of an increasing number of sub-components.

    The parameter ρ determines the minimum number of items that an individual has to possess in order to innovate, C^ρ, where C is the complexity of the item picked by the individual for innovation (figure 2).

    Figure 2. Minimum number of unique cultural items required to innovate, as a function of the level of complexity of the trait being innovated, for ρ = 1, 1.3 and 1.6. When ρ = 1.3, an individual needs at least eight different items in order to innovate on an item with a level of complexity of 5. When ρ = 1.6, an individual needs at least 13 different items.

    When ρ = 1, cultural diversity does not affect innovation. Individuals can innovate without having any knowledge about other trajectories because any item is composed from a sufficient number of sub-items to match C^ρ. For instance, an individual who picked an item with a complexity level of 3 to innovate necessarily possesses a cultural repertoire of size 3 which is the number of items required to innovate when ρ = 1 (figure 1).

    When ρ > 1, individuals must possess items from more than one branch in order to innovate. For example, if ρ = 1.3, an individual needs at least four different items in order to innovate on an item with a level of complexity of 3 (because 3^1.3 ≈ 4). This condition cannot be met when individuals progress along a single trajectory (figure 1). When ρ increases, still more diversity has to be produced in order to innovate. For example, if ρ = 1.6, individuals need at least six different items in order to innovate on an item with a level of complexity of 3 (because 3^1.6 ≈ 6). Thus, when ρ increases, generating innovations demands increasingly high levels of cultural diversity, and a population that specializes in one or just a small set of trajectories will soon be unable to produce innovations. Figure 2 shows how the minimum number of items required to innovate varies as a function of ρ and the complexity of the trait being innovated.
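
    The threshold arithmetic above can be checked with a few lines of code. Rounding C raised to the power ρ to the nearest integer is an assumption made here; it reproduces the values quoted in the text and in the caption of figure 2, but the text does not state the rounding rule. The function name `min_items_required` is hypothetical.

```python
def min_items_required(complexity: int, rho: float) -> int:
    """Minimum repertoire size needed to innovate on an item of the given complexity."""
    # assumption: the threshold is complexity ** rho rounded to the nearest integer
    return round(complexity ** rho)


for c in (3, 5):
    print(c, [min_items_required(c, rho) for rho in (1.0, 1.3, 1.6)])
# 3 [3, 4, 6]
# 5 [5, 8, 13]
```

    An individual on a single trajectory who holds an item of complexity C holds exactly C items (the item and its sub-items), so for ρ = 1.3 or 1.6 the threshold at C = 3 and C = 5 exceeds what a single trajectory can supply.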

    The diversity of individuals' cultural repertoires also affects social learning. Individuals can acquire new items through social learning only when they possess in their cultural repertoire the minimum number of items required to innovate, C^ρ, as described above. If they do not meet this requirement, individuals acquire the most complex item on the trajectory that their cultural repertoire can support. For example, if an individual observes ABAA (complexity level of 4) but has a cultural repertoire that is not diverse enough to produce traits of complexity level of 4, he will acquire a simpler version of that trait (such as ABA).
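
    One possible reading of this constraint is sketched below. It rests on several assumptions that the text does not confirm: items are strings over the letters A and B, a learner climbs the observed trajectory one level at a time, acquiring a level-L item requires the repertoire size needed to innovate on a level L − 1 item, thresholds are rounded to the nearest integer, and items picked up along the way count towards the repertoire. Learning errors are ignored here, and the names `acquire` and `min_items_required` are hypothetical.

```python
def min_items_required(complexity: int, rho: float) -> int:
    """Assumed threshold: complexity ** rho rounded to the nearest integer."""
    return round(complexity ** rho)


def acquire(observed: str, repertoire: set[str], rho: float) -> set[str]:
    """Return the repertoire after trying to learn `observed` without copying errors."""
    updated = set(repertoire)
    for level in range(1, len(observed) + 1):
        prefix = observed[:level]
        if prefix in updated:
            continue  # this sub-item is already known
        # assumption: a level-`level` item is built on a level-(level - 1) item
        if len(updated) < min_items_required(level - 1, rho):
            break  # the repertoire cannot support anything more complex
        updated.add(prefix)
    return updated


print(sorted(acquire("ABAA", set(), 1.3)))  # ['A', 'AB', 'ABA']
```

    Under these assumptions a naive learner observing ABAA with ρ = 1.3 stops at ABA, as in the example above, whereas a learner who already holds one item from another branch (for instance B) has a repertoire large enough to acquire ABAA.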

    When innovation does not depend on cultural diversity (ρ = 1), fragmentation results in lower levels of cultural complexity (figure 3). Fragmentation affects cumulative cultural evolution in two related ways. First, it slows down the spread of innovations between individuals, affecting individuals' capacity to build upon each other's innovations and decreasing the pace of cumulative cultural evolution. Second, it affects populations' capacity to maintain complex items. When cultural items get more complex, they also get harder to learn. Because each individual has some probability of making learning errors, the probability of inaccurate transmission is negatively related to the number of cultural learners. Complex cultural traits can arise in fragmented populations but they are likely to quickly disappear because few individuals will be exposed to them, increasing the likelihood of failed transmission. More fragmented populations thus exhibit less complex cultural items when they reach their cultural complexity steady state.
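
    The link between the number of learners and the risk of losing a complex item can be illustrated with a back-of-the-envelope calculation. It is not part of the model: it assumes that each learner makes a single, independent copying attempt, that every hierarchical level is lost with probability ɛ, and the function name is hypothetical.

```python
def prob_no_accurate_copy(complexity: int, eps: float, learners: int) -> float:
    """Probability that none of `learners` independent learners copies the item perfectly."""
    p_perfect = (1.0 - eps) ** complexity  # no level is lost in a single attempt
    return (1.0 - p_perfect) ** learners


# an item of complexity 10 with eps = 0.2, observed by 10 learners versus 600 learners
print(prob_no_accurate_copy(10, 0.2, 10))
print(prob_no_accurate_copy(10, 0.2, 600))
```

    With n = 600, the two calls correspond to subpopulations of 10 individuals (f = 60) and a single group of 600 (f = 1). The first probability is substantial while the second is vanishingly small, in line with the argument that complex items are easily lost when few individuals are exposed to them.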

    Figure 3. Effect of fragmentation (f) when innovation does not depend on cultural diversity (ρ = 1). When ρ = 1, cultural complexity, defined as the most complex cultural items possessed by individuals, is negatively affected by population fragmentation. Lines show the value (±s.e.m.) found after 300 time steps averaged over 30 simulations. Other parameters: n = 600, p = 0.005, m = 0.01, ɛ = 0.2.

    Increasing the error rate ɛ does not qualitatively change these results. Higher rates of error lead to lower levels of cultural complexity in all populations because accurate learning events become uncommon at lower levels of cultural complexity (electronic supplementary material, figure S1). Changing the innovation probability, p, mainly affects the time it takes populations to reach their cultural steady state without affecting the level of cultural complexity (electronic supplementary material, figure S1).

    These results are in line with most theoretical models and experimental studies that investigated the relationship between effective population size and cumulative culture [1–5,7–15]. For simplicity, we assume a constant error rate of 0.2 and a constant probability of innovation of 0.005 in the simulations below.

    When innovation depends on cultural diversity (i.e. ρ > 1), cultural accumulation in non-fragmented populations is reduced because non-fragmented populations suffer from cultural homogenization, and this prevents them from generating the diversity of traits required to produce highly complex cultural traits. This means that cultural complexity in non-fragmented populations is not limited by what these populations are able to maintain but by what they are able to produce.

    Highly fragmented populations suffer from the opposite effect. They produce a variety of cultural traits but cannot stabilize them above a certain level of complexity. Thus, cultural complexity in highly fragmented populations is mainly limited by what these populations are able to maintain.

    In populations with intermediate levels of fragmentation, cultural loss and cultural diversity are balanced in a way that maximizes cultural complexity (figures 4 and 5). Weakly fragmented populations exhibit higher levels of cultural diversity than non-fragmented populations, which fuels the innovation process and promotes the emergence of highly complex cultural items. At the same time, low levels of fragmentation do not drastically reduce populations' ability to maintain complex cultural items because many learners will be exposed to innovations.

    Figure 4. Effect of fragmentation (f) when innovation weakly depends on cultural diversity (ρ = 1.3). Weakly fragmented populations produce the most complex cultural items when innovation weakly depends on cultural diversity. This is because weakly fragmented populations (f = 5) are able to both stabilize complex cultural items and produce enough cultural diversity to generate them. In comparison, well-connected populations (f = 1) lack the cultural diversity needed to innovate, while cultural accumulation in moderately (f = 10) and highly (f = 60) fragmented populations is limited by populations' ability to maintain complex cultural items. Lines show the value (±s.e.m.) found after 300 time steps averaged over 30 simulations. Other parameters: n = 600, p = 0.005, m = 0.01, ɛ = 0.2.

    Figure 5. Effect of fragmentation (f) when innovation strongly depends on cultural diversity (ρ = 1.6). When innovation requires large amounts of cultural diversity, moderately fragmented populations (f = 10) produce the most complex cultural items. Well-connected (f = 1) and weakly fragmented populations (f = 5) do not generate enough cultural diversity to produce complex items, while highly fragmented populations (f = 60) remain limited by their ability to stabilize complex cultural traits. Lines show the value (±s.e.m.) found after 300 time steps averaged over 30 simulations. Other parameters: n = 600, p = 0.005, m = 0.01, ɛ = 0.2.

    The optimal level of population fragmentation depends on the extent to which innovation relies on cultural diversity. The more the innovation process depends on cultural diversity, the more fragmented populations must be in order to produce the level of cultural diversity required for further innovation (figures 4 and 5). However, highly fragmented populations become unable to maintain complex cultural traits because few individuals will be exposed to innovations when they appear. The highest level of cultural complexity is thus reached at the minimal level of fragmentation that provides populations with enough cultural diversity to keep innovating.

    Fragmentation can increase cultural accumulation because it positively affects cultural diversity. However, when migration rates are high, subpopulations do not diverge because they share cultural information before alternative solutions are produced. Thus, as migration rate increases, the benefit of fragmentation decreases (figure 6).

    Figure 6. Effect of migration (m) on the performance of moderately fragmented populations (f = 10) when innovation strongly depends on cultural diversity (ρ = 1.6). As migration rate increases, subpopulations' cultural repertoires become homogenized despite fragmentation. As a result, populations produce less-complex cultural traits. Other parameters: n = 600, p = 0.005, ɛ = 0.2.

    Our model shows that populations that are partially fragmented can reach higher levels of cultural complexity than populations that are fully connected when generating complex cultural traits depends on cultural diversity.

    Well-connected populations do better than fragmented ones when the production of complex traits is independent of the level of cultural diversity because the steady-state level of cultural complexity is determined only by populations' ability to maintain innovations. In fragmented populations, a smaller number of individuals observes novel cultural traits, which makes innovation more likely to be lost. The effect of fragmentation becomes more acute as cultural traits increase in complexity because more complex traits are more difficult to learn without error. Thus, when the process of innovation does not depend on cultural diversity, less-fragmented populations exhibit more complex cultural traits. This result is consistent with most theoretical models and experimental studies that investigated the relationship between effective population size and cumulative culture [1–5,7–15].

    However, evidence suggests that innovation rates are affected by cultural diversity [28,46]. The studies reviewed above suggest that innovation often takes the form of recombination of unrelated technologies, skills and knowledge, and higher levels of cultural diversity make recombination more fruitful.

    When innovation depends on cultural diversity, cultural accumulation is driven by populations' ability both to produce new traits and to maintain them. Fragmented populations are more likely to explore different technological trajectories, and migration of individuals between subpopulations allows diverse cultural traits to be brought together (electronic supplementary material, figure S2). This increase in cultural diversity fuels innovation and allows fragmented populations to produce more complex cultural traits than non-fragmented populations.

    The optimal level of fragmentation depends on how strongly the production of complex cultural traits is dependent on cultural diversity. In theory, the populations that produce the most diverse cultural repertoires should be the most innovative. However, highly diverse cultural repertoires cannot be produced without increasing the level of fragmentation, which reduces populations' ability to maintain complex traits. When populations are too fragmented they cannot maintain complex cultural traits and cannot accumulate innovations, even if they have very diverse cultural repertoires. Thus, the most complex cultural traits are produced when populations are fragmented in a way that minimizes cultural loss but generates enough cultural diversity to keep innovating. When innovation is increasingly dependent on cultural diversity, more-fragmented populations tend to perform better while less-fragmented populations tend to perform worse (figures 3–5).

    Interestingly, we found that small rates of learning error can promote cumulative culture in non-fragmented populations when innovation weakly depends on cultural diversity. This is because errors allow individuals to shift to a new trajectory after having failed to properly acquire a cultural trait. This leads to a more thorough exploration of the space of possibilities and increases cultural diversity in the overall population. This result is consistent with previous work suggesting that learning errors can benefit cultural diversity [53]. Non-fragmented populations become limited in their ability to innovate due to low levels of cultural diversity but benefit from the diversity arising from learning errors to slowly reach higher levels of cultural complexity (figure 2). This effect is, however, limited as in many cases it does not provide non-fragmented populations with enough cultural diversity to keep innovating (figure 5; electronic supplementary material, figure S3).

    In our simulations, the most fragmented populations did not attain the highest levels of cultural diversity because they could not stabilize complex cultural traits (electronic supplementary material, figure S2). As a consequence, highly fragmented populations, despite being composed of many semi-isolated groups, exhibit relatively low levels of cultural diversity because only a few different solutions can be produced at low levels of complexity (figure 1). It could be argued that this result is an artefact of modelling technological evolution as a branching tree. Yet it is worth noting that in real life the space of possible solutions does tend to increase with cultural complexity because more-complex innovations are made of an increasing number of sub-components. Thus, in many situations, the number of directions that innovation can take increases with the number of sub-components because sub-components can be refined in many different ways. Moreover, the addition of sub-components creates new combinatorial opportunities, which further widen the range of possible innovations [54]. Note, however, that highly fragmented populations might produce more cultural diversity in landscapes with more branching possibilities [55].

    These simulations predict that the amount of cultural diversity produced by fragmented populations depends on the number of subpopulations and the level of migration between these subpopulations. When migration rate increases, cultural traits spread faster and cultural repertoires are homogenized despite population fragmentation (electronic supplementary material, figure S4). This is in line with a recent agent-based model that showed that higher migration rate can negatively affect cultural accumulation by preventing a culturally distinct toolkit from evolving [15]. As a result, higher migration rates reduce the populations' cultural complexity steady state when innovation depends on cultural diversity (figure 6). The migration rate that maximizes cultural complexity ultimately depends on innovation rate. Lower innovation rates reduce cultural diversity unless migration rates are also lower. In our model, migration events did not affect average population size, as individuals returned to their primary subgroup after social learning. As a consequence, the threshold of cultural complexity that can be maintained by fragmented populations remains determined by the size of their subpopulations. This means that in our simulations lower levels of migration will always lead to higher cultural complexity steady states, although at lower rates of accumulation. Note that a recent model that considered the joint effect of cultural contact, innovation, and modifiers of biological carrying capacity showed that intermediate rates of migration are better for cultural accumulation [15]. This suggests that low rates of migration could be detrimental to cumulative culture when feedback effects between population size and cultural complexity are more realistically considered.

    Our results are consistent with recent experimental and theoretical studies showing that population interaction can be a strong driver of cultural accumulation [13,15]. However, our model also indicates that population interaction does not necessarily increase cultural complexity. When semi-isolated populations are small, cultural complexity is primarily determined by populations' ability to maintain cultural traits. Contact between populations has little effect on cultural accumulation because populations cannot benefit from cultural exchange. When contacts occur between sizeable groups, intergroup contacts promote cumulative culture because contacts increase cultural diversity and foster the emergence of more-complex traits. This suggests that population structure can have important effects on cultural accumulation and should be taken into account when investigating the relationship between demography and cumulative culture.

    Taking into account the role of population structure on cumulative culture may help explain ancestral and historical patterns of cultural change [13]. For example, the Middle Palaeolithic (MP) in Eurasia is characterized by little evidence of change in stone tool technology as compared with later periods such as the Upper Palaeolithic (UP) in Europe or the Late Stone Age (LSA) in Africa [3,56]. The increase in cultural complexity that characterized the UP and LSA has been interpreted as resulting from an increase in effective population size because of the positive effect of demography on cultural transmission [1,3,57,58]. The present results suggest an alternative mechanism, namely that the rise of intergroup interaction that took place during the Palaeolithic could have driven up cultural complexity by bringing diverse cultural traits together, thereby promoting populations' opportunities to innovate (see also [13,15]).

    In more recent times, the effect of population structure might also have played a role during the Industrial Revolution in Western Europe in the eighteenth century. Explaining why Europe has been the scene of remarkable technological development over the last centuries has been the focus of much attention in various fields [59–61]. Among economists, one of the most common explanations is that the long political fragmentation of Europe encouraged scientific and technological innovation through competition. According to this view, unified civilizations, such as China, did not experience comparable rates of technological advancement because they had no adversaries to compete with. Emulation, as well as many other factors, certainly contributed to the Industrial Revolution. Yet it is worth noting that Europe's political and cultural fragmentation might have promoted the pursuit of different technological trajectories. In his book Guns, Germs and Steel, Jared Diamond notes that ‘Europe's geographic balkanization resulted in dozens or hundreds of independent, competing statelets and centers of innovation. If one state did not pursue some particular innovation, another did’ [60]. Diamond also stresses that although Europe's barriers were sufficient to prevent political unification, they did not halt the spread of technology and ideas between countries. According to our results, this population structure might have benefited technological progress because it promotes cultural diversity and spurs innovation (although many other factors probably contributed to that phenomenon).

    It should be noted that the exploration of alternative portions of the space of possibilities does not require subpopulations to be spatially isolated. Modern communication technologies such as the Internet guarantee access to knowledge accumulated in any discipline in any part of the world. Thus, the academic world could be considered to be a fully connected population. Nonetheless, it has been shown that the geographical and cultural fragmentation of the research community serves an adaptive role in facilitating the persistence of more diverse ideas and preventing global homogenization even within a single discipline [62]. More generally, because of division of labour and specialization, individuals carry different subsets of information and explore different trajectories within the space of possibilities [63]. Physics scholars, for example, are more likely to make discoveries about elementary particles than biologists. Yet breakthroughs in one branch of science can have a large impact on seemingly unrelated fields. Discoveries about nuclear spin in physics in the 1940s, for example, led to the development of magnetic resonance imaging techniques that led, in turn, to new discoveries in fields of medicine and biology [44]. Thus, the role of population structure on cultural accumulation is not limited to cases where populations are geographically fragmented. Division of labour and other mechanisms that prevent cultural homogenization at the population level are likely to have comparable effects to those observed in our model [55,64]. For example, an empirical study that investigated the relationship between groups' connectivity and their ability to produce innovative ideas in a firm's research and development department found that intermediate levels of connectivity were of most benefit to the production of high-quality ideas [65].

    The present paper suggests that it is important to take into account the processes that underlie the emergence of new cultural traits, i.e. the innovation process [13,15,63,66]. Much of the work in the field of cultural evolution has focused on transmission fidelity, as it is considered one of the main drivers of cumulative cultural evolution [50]. The present model indicates that when the processes that underlie the emergence of new traits are taken into account, cumulative cultural evolution is driven by both transmission fidelity and innovation production. Population structure, through its effect on cultural diversity, can be a strong driver of cultural accumulation and may help better explain ancestral and historical patterns of cultural change.

    The simulation code is available as the electronic supplementary material.

    M.D. and C.P. designed the study and performed and analysed the simulations. M.D., C.P. and R.B. wrote the manuscript.

    The authors declare no competing interests.

    This research was made possible through the support of a grant (ID: 48952) from the John Templeton Foundation to the Institute of Human Origins at Arizona State University.

    The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3965859.

    References

    • 1

      Shennan S. 2001 Demography and cultural innovation: a model and its implications for the emergence of modern human culture. Camb. Archaeol. J. 11, 5–16. (doi:10.1017/S0959774301000014)

    • 2

      Henrich J. 2004 Demography and cultural evolution: how adaptive cultural processes can produce maladaptive losses—the Tasmanian case. Am. Antiq. 69, 197–214. (doi:10.2307/4128416)

    • 3

      Powell A, Shennan S, Thomas MG. 2009 Late Pleistocene demography and the appearance of modern human behavior. Science 324, 1298–1301. (doi:10.1126/science.1170165)

    • 4

      Kline MA, Boyd R. 2010 Population size predicts technological complexity in Oceania. Proc. R. Soc. B 277, 2559–2564. (doi:10.1098/rspb.2010.0452)

    • 5

      Mesoudi A. 2011 Variable cultural acquisition costs constrain cumulative cultural evolution. PLoS ONE 6, e18239. (doi:10.1371/journal.pone.0018239)

    • 6

      Vaesen K. 2012 Cumulative cultural evolution and demography. PLoS ONE 7, e40989. (doi:10.1371/journal.pone.0040989)

    • 7

      Lehmann L, Wakano JY. 2013 The handaxe and the microscope: individual and social learning in a multidimensional model of adaptation. Evol. Hum. Behav. 34, 109–117. (doi:10.1016/j.evolhumbehav.2012.11.001)

    • 8

      Kobayashi Y, Aoki K. 2012 Innovativeness, population size and cumulative cultural evolution. Theor. Popul. Biol. 82, 38–47. (doi:10.1016/j.tpb.2012.04.001)

    • 9

      Derex M, Beugin M-P, Godelle B, Raymond M. 2013 Experimental evidence for the influence of group size on cultural complexity. Nature 503, 389–391. (doi:10.1038/nature12774)

    • 10

      Kempe M, Mesoudi A. 2014 An experimental demonstration of the effect of group size on cultural accumulation. Evol. Hum. Behav. 35, 285–290. (doi:10.1016/j.evolhumbehav.2014.02.009)

    • 11

      Muthukrishna M, Shulman BW, Vasilescu V, Henrich J. 2014 Sociality influences cultural complexity. Proc. R. Soc. B 281, 20132511. (doi:10.1098/rspb.2013.2511)

    • 12

      Derex M, Boyd R. 2015 The foundations of the human cultural niche. Nat. Commun. 6, 8398. (doi:10.1038/ncomms9398)

    • 13

      Derex M, Boyd R. 2016 Partial connectivity increases cultural accumulation within groups. Proc. Natl Acad. Sci. USA 113, 2982–2987. (doi:10.1073/pnas.1518798113)

    • 14

      Kobayashi Y, Ohtsuki H, Wakano JY. 2016 Population size vs. social connectedness — A gene-culture coevolutionary approach to cumulative cultural evolution. Theor. Popul. Biol. 111, 87–95. (doi:10.1016/j.tpb.2016.07.001)

    • 15

      Creanza N, Kolodny O, Feldman MW. 2017 Greater than the sum of its parts? Modelling population contact and interaction of cultural repertoires. J. R. Soc. Interface 14, 20170171. (doi:10.1098/rsif.2017.0171)

    • 16

      Collard M, Buchanan B, O'Brien MJ, Scholnick J. 2013 Risk, mobility or population size? Drivers of technological richness among contact-period western North American hunter–gatherers. Phil. Trans. R. Soc. B 368, 20120412. (doi:10.1098/rstb.2012.0412)

    • 17

      Buchanan B, O'Brien M, Collard M. 2015 Drivers of technological richness in prehistoric Texas: an archaeological test of the population size and environmental risk hypotheses. Archaeol. Anthropol. Sci. 8, 625–634. (doi:10.1007/s12520-015-0245-4)

    • 18

      Collard M, Buchanan B, Morin J, Costopoulos A. 2011 What drives the evolution of hunter-gatherer subsistence technology? A reanalysis of the risk hypothesis with data from the Pacific Northwest. Phil. Trans. R. Soc. B 366, 1129–1138. (doi:10.1098/rstb.2010.0366)

    • 19

      Vaesen K, Collard M, Cosgrove R, Roebroeks W. 2016 Population size does not explain past changes in cultural complexity. Proc. Natl Acad. Sci. USA 113, E2241–E2247. (doi:10.1073/pnas.1520288113)

    • 20

      Collard M, Buchanan B, O'Brien MJ. 2013 Population size as an explanation for patterns in the Paleolithic archaeological record: more caution is needed. Curr. Anthropol. 54, S388–S396. (doi:10.1086/673881)

    • 21

      Collard M, Vaesen K, Cosgrove R, Roebroeks W. 2016 The empirical case against the ‘demographic turn’ in Palaeolithic archaeology. Phil. Trans. R. Soc. B 371, 20150242. (doi:10.1098/rstb.2015.0242)

    • 22

      Henrich J, Boyd R, Derex M, Kline MA, Mesoudi A, Muthukrishna M, Powell AT, Shennan SJ, Thomas MG. 2016 Understanding cumulative cultural evolution. Proc. Natl Acad. Sci. USA 113, E6724–E6725. (doi:10.1073/pnas.1610005113)

    • 23

      Wright S. 1932 The roles of mutation, inbreeding, crossbreeding, and selection in evolution. Proc. 6th Intl. Cong. Genet. 1, 356–366.

    • 24

      Fang C, Lee J, Schilling MA. 2009 Balancing exploration and exploitation through structural design: the isolation of subgroups and organizational learning. Organ. Sci. 21, 625–642. (doi:10.1287/orsc.1090.0468)

    • 25

      Lazer D, Friedman A. 2007 The network structure of exploration and exploitation. Adm. Sci. Quart. 52, 667–694. (doi:10.2189/asqu.52.4.667)

    • 26

      Schilling MA, Phelps CC. 2007 Interfirm collaboration networks: the impact of large-scale network structure on firm innovation. Manag. Sci. 53, 1113–1126. (doi:10.1287/mnsc.1060.0624)

    • 27

      Mason WA, Jones A, Goldstone RL. 2008 Propagation of innovations in networked groups. J. Exp. Psychol. 137, 422–433. (doi:10.1037/a0012798)

    • 28

      Page S. 2007 The difference: how the power of diversity creates better groups, firms, schools, and societies. Princeton, NJ: Princeton University Press.

    • 29

      Dosi G. 1982 Technological paradigms and technological trajectories. Res. Policy 11, 147–162. (doi:10.1016/0048-7333(82)90016-6)

    • 30

      Liebowitz SJ, Margolis SE. 1995 Path dependence, lock-in, and history. J. Law Econ. Organ. 11, 205–226. (doi:10.2139/ssrn.1706450)

    • 31

      David P. 2007 Path dependence: a foundational concept for historical social science. Cliometrica 1, 91–114. (doi:10.1007/s11698-006-0005-x)

    • 32

      Martin R, Sunley P. 2010 The place of path dependence in an evolutionary perspective on the economic landscape. In Handbook of evolutionary economic geography (eds Boschma R, Martin R), pp. 62–92. Chichester, UK: Edward Elgar.

    • 33

      Nelson RR, Winter SG. 1977 In search of useful theory of innovation. Res. Policy 6, 36–76. (doi:10.1016/0048-7333(77)90029-4)

    • 34

      Shanahan T. 2011Phylogenetic inertia and Darwin's higher law. Stud. His. Phil. Sci. Part C 42, 60–68. (doi:10.1016/j.shpsc.2010.11.013) PubMed, Google Scholar

    • 35

      David PA. 1985Clio and the economics of QWERTY. Am. Econ. Rev. 75, 332–337. ISI, Google Scholar

    • 36

      Silverberg G, Verspagen B. 2005A percolation model of innovation in complex technology spaces. J. Econ. Dyn. Control 29, 225–244. (doi:10.1016/j.jedc.2003.05.005) Crossref, ISI, Google Scholar

    • 37

      Fleming L. 2001Recombinant uncertainty in technological search. Manage. Sci. 47, 117–132. (doi:10.1287/mnsc.47.1.117.10671) Crossref, ISI, Google Scholar

    • 38

      Tëmkin I, Eldredge N. 2007Phylogenetics and material cultural evolution. Curr. Anthropol. 48, 146–154. (doi:10.1086/510463) Crossref, ISI, Google Scholar

    • 39

      Solée RV, Valverde S, Casals MR, Kauffman SA, Farmer D, Eldredge N. 2013The evolutionary ecology of technological innovations. Complexity 18, 15–27. (doi:10.1002/cplx.21436) Crossref, ISI, Google Scholar

    • 40

      Wagner A, Rosen W. 2014Spaces of the possible: universal Darwinism and the wall between technological and biological innovation. J. R. Soc. Interface 11, 20131190. (doi:10.1098/rsif.2013.1190) Link, ISI, Google Scholar

    • 41

      Terrell JE, Hunt TL, Gosden C. 1997The dimensions of social life in the pacific: human diversity and the myth of the primitive isolate. Curr. Anthropol. 38, 155–195. (doi:10.1086/204604) Crossref, ISI, Google Scholar

    • 42

      Basalla G. 1988The evolution of technology. Cambridge, UK: Cambridge University Press. Google Scholar

    • 43

      Boyd R, Richerson PJ, Henrich J. 2013The cultural evolution of technology: facts and theories. In Cultural evolution: society, technology, language, and religion vol. 12 (eds Richerson PJ, Christiansen MH), pp. 119–142. Cambridge, MA: MIT Press. Crossref, Google Scholar

    • 44

      Rinia EJ, van Leeuwen TN, Bruins EEW, van Vuren HG, van Raan AFJ. 2002Measuring knowledge transfer between fields of science. Scientometrics 54, 347–362. (doi:10.1023/a:1016078331752) Crossref, ISI, Google Scholar

    • 45

      Uzzi B, Mukherjee S, Stringer M, Jones B. 2013Atypical combinations and scientific impact. Science 342, 468–472. (doi:10.1126/science.1240474) Crossref, PubMed, ISI, Google Scholar

    • 46

      Bosetti V, Cattaneo C, Verdolini E. 2012Migration, cultural diversity and innovation: a European perspective. FEEM Work. Pap. 69.2012. (doi:10.2139/ssrn.2162836). Google Scholar

    • 47

      McLeod PL, Lobel SA, Taylor H, Cox J. 1996Ethnic diversity and creativity in small groups. Small Group Res. 27, 248–264. (doi:10.1177/1046496496272003) Crossref, ISI, Google Scholar

    • 48

      Lee CS, Therriault DJ, Linderholm T. 2012On the cognitive benefits of cultural experience: exploring the relationship between studying abroad and creative thinking. Appl. Cogn. Psychol. 26, 768–778. (doi:10.1002/acp.2857) Crossref, ISI, Google Scholar

    • 49

      Axelrod R. 1997The dissemination of culture. J. Confl. Resolu. 41, 203–226. (doi:10.1177/0022002797041002001) Crossref, ISI, Google Scholar

    • 50

      Lewis HM, Laland KN. 2012Transmission fidelity is the key to the build-up of cumulative culture. Phil. Trans. R. Soc. B 367, 2171–2180. (doi:10.1098/rstb.2012.0119) Link, ISI, Google Scholar

    • 51

      Lobo J, Miller JH, Fontana W. 2004Neutrality in technological landscapes. Sante Fe Institute Working Paper. Google Scholar

    • 52

      Boschma R. 2005Proximity and innovation: a critical assessment. Reg. Stud. 39, 61–74. (doi:10.1080/0034340052000320887) Crossref, ISI, Google Scholar

    • 53

      Rendell L, Boyd R, Enquist M, Feldman MW, Fogarty L, Laland KN. 2011How copying affects the amount, evenness and persistence of cultural knowledge: insights from the social learning strategies tournament. Phil. Trans. R. Soc. B 366, 1118–1128. (doi:10.1098/rstb.2010.0376) Link, ISI, Google Scholar

    • 54

      Youn H, Strumsky D, Bettencourt LMA, Lobo J. 2015Invention as a combinatorial process: evidence from US patents. J. R. Soc. Interface 12, 20150272. (doi:10.1098/rsif.2015.0272) Link, ISI, Google Scholar

    • 55

      Enquist M, Ghirlanda S, Eriksson K. 2011Modelling the evolution and diversity of cumulative culture. Phil. Trans. R. Soc. B 366, 412–423. (doi:10.1098/rstb.2010.0132) Link, ISI, Google Scholar

    • 56

      Premo LS. 2012Local extinctions, connectedness, and cultural evolution in structured populations. Adv. Complex Syst. 15, 1150002. (doi:10.1142/s0219525911003268) Crossref, ISI, Google Scholar

    • 57

      Lycett SJ, Norton CJ. 2010A demographic model for Palaeolithic technological evolution: the case of East Asia and the Movius Line. Quat. Int. 211, 55–65. (doi:10.1016/j.quaint.2008.12.001) Crossref, ISI, Google Scholar

    • 58

      Premo LS, Kuhn SL. 2010Modeling effects of local extinctions on culture change and diversity in the Paleolithic. PLoS ONE 5, e15582. (doi:10.1371/journal.pone.0015582) Crossref, PubMed, ISI, Google Scholar

    • 60

      Diamond J.1999Guns, germs, and steel: the fates of human societies. New York, NY: WW. Norton. Google Scholar

    • 61

      Mokyr J. 2016A culture of growth: The origins of the modern economy. Princeton, NJ: Princeton University Press. Crossref, Google Scholar

    • 62

      March JG. 2005Parochialism in the evolution of a research community: the case of organization studies. Manage. Organ. Rev. 1, 5–22. (doi:10.1111/j.1740-8784.2004.00002.x) Crossref, Google Scholar

    • 63

      Kolodny O, Creanza N, Feldman MW. 2015Evolution in leaps: the punctuated accumulation and loss of cultural innovations. Proc. Natl Acad. Sci. USA 112, E6762–E6769. (doi:10.1073/pnas.1520492112) Crossref, PubMed, ISI, Google Scholar

    • 64

      Lehmann L, Aoki K, Feldman MW. 2011On the number of independent cultural traits carried by individuals and populations. Phil. Trans. R. Soc. B 366, 424–435 (doi:10.1098/rstb.2010.0313) Link, ISI, Google Scholar

    • 65

      Björk J, Magnusson M. 2009Where Do good innovation ideas come from? Exploring the influence of network connectivity on innovation idea quality. J. Prod. Innov. Manage. 26, 662–670. (doi:10.1111/j.1540-5885.2009.00691.x) Crossref, ISI, Google Scholar

    • 66

      Fogarty L, Creanza N, Feldman MW. 2015Cultural evolutionary perspectives on creativity and human innovation. Trends Ecol. Evol. 30, 736–754. (doi:10.1016/j.tree.2015.10.004) Crossref, PubMed, ISI, Google Scholar



    The theoretical underpinnings of cultural evolution have been the focus of considerable attention in recent decades. Much of this theoretical work has adapted and extended well-understood models from theoretical population genetics to study the dynamics of cultural evolution (e.g. [1–7]). One pressing question at the heart of cultural evolution is: what are the forces driving cultural complexity and cultural loss? This question has been the centre of heated controversy for over a decade (e.g. [8–13]). Mathematical models and laboratory experiments have consistently suggested that demographic factors such as population size [8,9,14–16] or connectedness [17] should drive cultural complexity. These models suggest that larger populations can be expected to have more complex toolkits and smaller populations should, on average, have less complex toolkits [8,9,17–20]. Regional and worldwide statistical analyses have shown little support for this idea in food-gathering societies, where proxy measures for environmental risk have been shown to correlate with toolkit complexity ([10,21–23]; but see [24] for a discussion of the population size measures used), although this does not appear to be the case for food-producing societies [25–27]. In one notable case, diversity in pottery design was shown to correlate inversely with population density, which may have led to an increase in the benefit of conformity and collective action [28]. It is, therefore, crucial that we understand the interaction between the environment, innovation and cultural complexity. However, despite the important role that environmental fluctuations appear to play in maintaining cultural complexity in hunter–gatherer societies in particular, as well as their putative role in determining the rate of innovation [29], mathematical models of cultural complexity rarely model environmental factors explicitly [27,30,31].
Here I present two models that describe the effects of environmental fluctuations on the rate of innovation and go on to examine the effect of innovation, environmental fluctuation and social learning on a measure of cultural diversity. The models use well-understood theory describing the interaction between the environment and mutation in genetic systems and help to build on a body of literature on innovation and environmental factors on which the debates surrounding the driving forces of cultural complexity can draw.

    The first model presented here is a neutral modifier model with the modifier acting on the rate of innovation in a cultural system. This class of models has been used to understand evolution of mutation rates, recombination rates, and rates of migration in genetic systems [32–34], epigenetic systems [35] and cultural niche construction systems [36]. Here they are used to examine the evolution of rates of innovation and the maintenance of cultural diversity in fluctuating environments.

    We first consider an infinite population of individuals living in a temporally varying environment that can switch between environment 1 (E1) and environment 2 (E2). E1 and E2 favour different cultural toolkits (or cultural repertoires), each consisting of a different set of subsistants [37] designed to extract energy from that environment. These repertoires are the two variants, R and r, of a repertoire trait labelled R, and the fitness associated with each of them in the two environments is given in table 1.

    Table 1. The fitness associated with repertoires R and r in environments E1 and E2.

    environment    R         r
    E1             1         1 + s1
    E2             1 + s2    1

    A ‘cultural modifier’ trait (which could be either genetic or cultural itself) controls the rate of innovation such that an individual with M has an innovation rate given by $\mu_M$, and an individual with m has an innovation rate given by $\mu_m$.

    Each generation, the population undergoes mating, learning, innovation and selection, after which the environmental state may change. Mating is random but learning may be either random or governed by learning rules. Here learning may be random oblique (as a baseline expectation), conformist or success biased as described by Aoki et al. [19]. Note that these modes of transmission are modelled separately but in real human populations they may, of course, be used in concert. Success bias entails taking a random sample of n individuals and choosing the most successful (or most fit in the current environment) among that sample as a cultural role model. Success bias is found in small-scale human societies in a variety of domains [38]. This formulation also reflects the likelihood of restricted neighbour–neighbour interactions, for example, between individuals in large populations. Conformist learning is defined as a disproportionate tendency to copy the most common variant in a subsample of n tutors.

    The population thus consists of individuals with phenotypes RM, Rm, rM and rm, the frequencies of which are represented as x1, x2, x3 and x4, respectively. We can describe the fitness of each phenotype and assess the optimal rate of innovation in certain environmental conditions. We can extend this model to examine the probability of a polymorphism in R, which would indicate the presence of a large repertoire suited to all environments.

    Mating is random and so the probability of parents meeting is proportional to their frequency in the population (table 2). Vertical learning happens directly after mating and production of a new generation of individuals.

    Table 2. Probabilities of random matings in a specialized population.

    mating pair    probability
    RM × RM        $x_1^2$
    RM × Rm        $x_1 x_2$
    RM × rM        $x_1 x_3$
    RM × rm        $x_1 x_4$
    Rm × Rm        $x_2^2$
    Rm × rM        $x_2 x_3$
    Rm × rm        $x_2 x_4$
    rM × rM        $x_3^2$
    rM × rm        $x_3 x_4$
    rm × rm        $x_4^2$
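
As a rough illustration of table 2, the sketch below computes mating-pair probabilities from the four phenotype frequencies. The function name `mating_probabilities` and the choice to enumerate ordered (first parent, second parent) pairs are assumptions made here so that the probabilities sum to 1; table 2 lists each unordered pair only once.

```python
from itertools import product

PHENOTYPES = ("RM", "Rm", "rM", "rm")  # frequencies x1, x2, x3, x4


def mating_probabilities(x):
    """Return {(parent_a, parent_b): probability} for ordered pairs under random mating.

    x is a sequence of four phenotype frequencies (x1..x4) that sum to 1.
    """
    if len(x) != len(PHENOTYPES):
        raise ValueError("expected four frequencies")
    if abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    freq = dict(zip(PHENOTYPES, x))
    # Random mating: each parent is drawn independently in proportion to frequency.
    return {(a, b): freq[a] * freq[b] for a, b in product(PHENOTYPES, repeat=2)}


probs = mating_probabilities([0.4, 0.3, 0.2, 0.1])
```

With these example frequencies, the RM × RM entry is 0.4 squared and the sixteen ordered-pair probabilities sum to 1.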

    After mating, newborn individuals learn a repertoire through vertical transmission first, followed by oblique transmission. This reflects the mode of learning in a number of hunter–gatherer groups including, for example, the Penan and Bedamuni who learn first from their fathers and then from uncles or unrelated adults [39]. Hewlett & Cavalli-Sforza [40] suggested that up to 80% of learning in the Aka in some domains is vertical although the value may be much lower in other groups. We define the proportion of learning that is vertical to be PV and the proportion that is oblique to be Po with

    $$P_V + P_o = 1$$
    and examine values of Po that are anthropologically relevant. Using the mating probabilities in table 2, we can describe the proportion of offspring with each phenotype after vertical learning. We will denote these proportions as
    [symbol not preserved in the source]
    with i = {1,2,3,4}. These are

    [four equations, one for each phenotype, not preserved in the source]

    After vertical transmission, oblique learning can take place. If this learning is random, then a role model is chosen in proportion to the frequency of that type in the adult population. For convenience, we assume that the modifier trait M is either genetic or vertically transmitted and so oblique transmission describes transmission of the R trait only, although in reality the mode of transmission of a trait modifying the rate of innovation is an empirical question. Thus, after random oblique transmission we have

    [four equations, one for each phenotype, not preserved in the source]

    If transmission is not random but, rather, success biased then we have

    [four equations, one for each phenotype, not preserved in the source]

    if R is favoured, or

    [four equations, one for each phenotype, not preserved in the source]

    if r is favoured. Here, $1 - (1 - (x_i + x_j))^n$ is the probability that at least one of an individual's sample of n role models has the favoured variant, whose frequency is given by $x_i + x_j$, and $(1 - (x_i + x_j))^n$ denotes the probability that all of the sample have the non-favoured variant. Finally, if transmission is conformist, we have

    [four equations, one for each phenotype, not preserved in the source]

    where

    $$\sum_{k > n/2} \binom{n}{k} (x_i + x_j)^{k} \bigl(1 - (x_i + x_j)\bigr)^{n-k}$$

    is the probability that over half of an individual's sample of n role models has a certain R trait whose frequency is given by xi + xj. By varying the parameter n, the size of the pool of role models available to an individual, it is possible to assess the effect of population connectedness on the optimal rate of innovation and the frequency of the R and r traits.
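
The two sampling probabilities described in words above can be sketched as follows. This is a minimal illustration, assuming that the n role models are sampled independently and that p stands for the frequency of the variant of interest (for example, the combined frequency of the two phenotypes carrying it). The function names are hypothetical and do not come from the source.

```python
from math import comb


def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n sampled role models carries a variant of frequency p."""
    # Complement of the event that all n role models carry the other variant.
    return 1.0 - (1.0 - p) ** n


def prob_majority(p: float, n: int) -> float:
    """Probability that more than half of n sampled role models carry a variant of frequency p."""
    first = n // 2 + 1  # smallest integer count strictly greater than n / 2
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(first, n + 1))
```

For a fixed p between 0 and 1, `prob_at_least_one` approaches 1 as n grows, which is one way to see why a larger pool of role models makes success-biased learning more effective. `prob_majority` exceeds p whenever the variant is already in the majority and n is at least 3, which matches the description of conformity as a disproportionate tendency to copy the most common variant.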

    After learning, population-wide innovation can occur. The rate of innovation depends on the frequency of the modifier traits in the population and the repertoire traits with which they are associated. We assume that individuals can innovate either repertoire R or r with equal probability. This probability is denoted by $\mu_M$ for the M trait and $\mu_m$ for the m trait. Thus the final proportions of each phenotype in the population after vertical learning, oblique learning and innovation are given by

    [four equations, one for each phenotype, not preserved in the source]

    Selection acts on offspring after learning and innovation and is based on the fitness associated with the R trait only (in other words, the modifier trait M is neutral). The proportion of surviving offspring of each type that forms the next generation is given by

    [four equations, one for each phenotype, not preserved in the source]

    in environment E1, where [definition not preserved in the source], and

    [four equations, one for each phenotype, not preserved in the source]

    in environment E2, where [definition not preserved in the source].
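
The selection step can be sketched in code under one explicit assumption: that it takes the usual population-genetic form in which each phenotype frequency is weighted by its fitness from table 1 and then divided by the mean fitness. The function name `select` and this normalisation are illustrative and are not taken from the source equations.

```python
def select(x, environment, s1, s2):
    """Apply one round of selection to frequencies x = (RM, Rm, rM, rm).

    Fitness follows table 1: r is favoured in E1 (1 + s1), R in E2 (1 + s2).
    The modifier (M or m) is neutral and does not affect fitness.
    """
    if environment == 1:
        w_R, w_r = 1.0, 1.0 + s1
    elif environment == 2:
        w_R, w_r = 1.0 + s2, 1.0
    else:
        raise ValueError("environment must be 1 or 2")
    weights = (w_R, w_R, w_r, w_r)
    weighted = [xi * wi for xi, wi in zip(x, weights)]
    mean_fitness = sum(weighted)  # assumed normalising term
    return [v / mean_fitness for v in weighted]


after = select([0.25, 0.25, 0.25, 0.25], environment=1, s1=1.0, s2=1.0)
```

Starting from equal frequencies with s1 = 1, one round of selection in E1 doubles the weight of the two r-carrying phenotypes relative to the two R-carrying ones, while the M and m variants within each repertoire remain equally frequent.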

    Following the work of Carja et al. [35] on the evolution of mutation, recombination and migration in fluctuating environments, simulations began with a population fixed on the M modifier trait. The rate of innovation associated with this trait was sampled randomly from a uniform distribution between 0 and 1. This population was allowed to evolve until a stable equilibrium was reached. Trait m was then introduced at low frequency with an associated innovation rate that was the product of the resident rate and a random number generated from an exponential distribution with mean 1. After 1000 generations the frequency of m in the population was assessed: a frequency higher than the initial frequency was taken to indicate invasion, and a lower frequency eventual extinction. If the m trait did invade, the value associated with m became the resident value and a new simulation began with that value represented by trait M. After 500 trials in which the resident innovation rate cannot be invaded, the resident is considered to be the stable rate in that simulation [35].
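
The invasion procedure described above can be outlined as a short search loop. This is a sketch only: `invades` is a hypothetical caller-supplied function standing in for the full 1000-generation simulation, `toy_invades` is an invented stand-in used purely to exercise the loop, and resetting the failure count whenever a new resident takes over is an assumption about how the 500 trials are counted.

```python
import random


def find_stable_rate(invades, rng, max_failed_invasions=500):
    """Search for an innovation rate that resists invasion.

    invades(resident_rate, mutant_rate) -> bool is a hypothetical stand-in for
    the full simulation that checks whether trait m increases in frequency.
    """
    resident = rng.uniform(0.0, 1.0)  # initial resident rate drawn from U(0, 1)
    failures = 0
    while failures < max_failed_invasions:
        # Mutant rate is the resident rate times an exponential draw with mean 1.
        # The sketch does not cap this value; how the source handles values
        # above 1 is not stated.
        mutant = resident * rng.expovariate(1.0)
        if invades(resident, mutant):
            resident = mutant  # the invader becomes the new resident
            failures = 0       # assumption: the count restarts for a new resident
        else:
            failures += 1
    return resident


def toy_invades(resident, mutant, optimum=0.3):
    """Invented rule for demonstration: a mutant invades if it is closer to a fixed optimum."""
    return abs(mutant - optimum) < abs(resident - optimum)


stable = find_stable_rate(toy_invades, random.Random(0))
```

With the toy rule, the loop settles on a rate close to the invented optimum of 0.3. In a real run, `invades` would wrap the mating, learning, innovation and selection steps of the model.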

    The environment changes every c generations where c is fixed and allows the environment to alternate between two environmental states. In other words, a periodically varying environment was implemented.
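
A periodic environment of this kind can be written as a function of the generation number. The sketch assumes that generations are counted from 0 and that the population starts in E1; neither detail is specified at this point in the text, and the name `environment_at` is illustrative.

```python
def environment_at(generation: int, c: int, start: int = 1) -> int:
    """Environment (1 or 2) at a given generation when the state flips every c generations."""
    if c <= 0:
        raise ValueError("c must be a positive integer")
    if start not in (1, 2):
        raise ValueError("start must be 1 or 2")
    flips = generation // c  # number of switches that have happened so far
    other = 2 if start == 1 else 1
    return start if flips % 2 == 0 else other


schedule = [environment_at(t, c=2) for t in range(8)]
```

With c = 2 the schedule alternates in blocks of two generations, which corresponds to the most unstable regime discussed in the results (an environmental period of 2).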

    We can assess the relative importance of the regime of environmental change and the population connectedness in determining rates of innovation by examining the parameter n, which describes the connectedness of the population with regard to learning. Where individuals subsample the population in order to choose a role model, the size of that subsample can be varied to assess the effect of this parameter on the eventual rate of innovation.

    The model described above allows the population to evolve and maintain distinct repertoires in different environmental conditions. The implication of this formulation is that the populations are in some way specialized: some individuals can use repertoire R when appropriate and others can use repertoire r. In some cases, a population may not be specialized in such a way and an individual may be capable of using a repertoire favoured in one environment or a larger repertoire favoured in all (or many) environments. To examine this possibility, the model can allow a kind of repertoire ‘heterozygosity’ to exist where the possible variants of the R trait in the population are R, Rr or r. In other words, an individual can use the R repertoire only, the r repertoire only or both. Note that in contrast to genetic models, neither the R repertoire nor the r repertoire is dominant here—Rr suggests that both R and r can be used, in other words, that the individual has a more complex repertoire suited to both environments. The frequency of these phenotypes can be represented by x1, x2 and x3, respectively, as is consistent with the previous formulation.

    It is necessary to introduce a cost to learning and maintaining both the R and r repertoires relative to learning and maintaining just one repertoire. This is represented by δc—the cost of learning two repertoires, relative to one. We can normalize the cost of learning one repertoire to 1 and assume that the cost of learning R is equal to the cost of learning r. Again, we consider the effects of random oblique learning and success-biased learning as defined above and examine the effects of these modes on the probability of maintaining a polymorphism in R under different regimes of innovation and environmental change. This formulation implies that an individual, not just a population, may have a smaller or larger repertoire and examines the population level effects of this individual variation. It is assumed that there is little overlap between R and r. The mating frequencies in this ‘heterozygosity’ model are given in table 3 and equations covering mating, learning, innovation and selection are presented below.

    Table 3. Probabilities of random matings in a population with heterozygosity.

    mating pair    probability
    R × R          $x_1^2$
    R × Rr         $x_1 x_2$
    R × r          $x_1 x_3$
    r × Rr         $x_2 x_3$
    r × r          $x_3^2$

    After vertical learning, the frequencies of the three types in the population are denoted by

    [symbol not preserved in the source]
    where i = {1,2,3}. These values are

    [three equations, one for each type, not preserved in the source]

    Following vertical learning, oblique learning occurs with a probability Po. Here we investigate the effects of random oblique and success-biased learning. These are described by the equations:

    [three equations, one for each type, not preserved in the source]

    for random oblique learning. Note that the equations make the further assumption that it is prohibitively difficult to switch between specialisms; in other words, the rate of switching directly from R to r through oblique learning after vertical learning is 0.

    For success-biased learning, following the scheme set out above for random oblique learning we get

    [three equations, one for each type, not preserved in the source]

    when R is favoured, and

    [three equations, one for each type, not preserved in the source]

    if r is favoured. We can then allow the population to innovate at a rate µ. Here, again, individuals can change their type from R to Rr, r to Rr or from Rr to any of the available types, assuming that an individual ‘forgets’ or neglects part of the larger repertoire. The frequencies of the types after innovation are given by

    [symbol not preserved in the source]
    where j = {1,2,3}.

    [three equations, one for each type, not preserved in the source]

    Finally, selection acts on the population with different selection affecting the types in different environments. Environment 1 favours the r trait and so selection proceeds as follows:

    [definition not preserved in the source]
    and

    [three equations, one for each type, not preserved in the source]

    where δc is the cost of maintaining two repertoires compared with maintaining one and s1 is the fitness benefit of having trait r in environment 1. Similar logic can be applied to environment 2, which favours R:

    [four equations not preserved in the source]
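
To make the role of the cost δc concrete, the sketch below applies selection to the three types R, Rr and r. The fitness values are assumptions rather than the source equations: the generalist Rr is taken to receive the benefit of the favoured repertoire in each environment minus δc, each specialist receives the benefit only in the environment that favours it, and frequencies are normalised by mean fitness. The parameter values s1 = s2 = 1 and δc = 0.01 are those quoted in the caption of figure 2.

```python
def select_with_generalists(x, environment, s1, s2, delta_c):
    """One round of selection on frequencies x = (R, Rr, r).

    Assumed fitnesses (not taken from the source equations):
      E1 (favours r): R -> 1,      Rr -> 1 + s1 - delta_c, r -> 1 + s1
      E2 (favours R): R -> 1 + s2, Rr -> 1 + s2 - delta_c, r -> 1
    """
    if environment == 1:
        weights = (1.0, 1.0 + s1 - delta_c, 1.0 + s1)
    elif environment == 2:
        weights = (1.0 + s2, 1.0 + s2 - delta_c, 1.0)
    else:
        raise ValueError("environment must be 1 or 2")
    weighted = [xi * wi for xi, wi in zip(x, weights)]
    mean_fitness = sum(weighted)  # assumed normalising term
    return [v / mean_fitness for v in weighted]


freqs = [1 / 3, 1 / 3, 1 / 3]
for env in (1, 2):  # one generation in each environment, selection only
    freqs = select_with_generalists(freqs, env, s1=1.0, s2=1.0, delta_c=0.01)
```

Under these assumptions, after one generation in each environment the generalist type is the most common, because it pays only the small cost δc while each specialist misses the benefit in one of the two environments. Learning and innovation are deliberately left out of this sketch.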

    The simulations begin with the environment in state 0 and allow the environment to change every c time steps. In this case, the innovation rate is varied in order to assess the effect of innovation rates on cultural complexity under different regimes of environmental change and under the influence of different social learning mechanisms. It would be an interesting and worthwhile extension of this model to combine the approaches used in the first model presented with those of the second in order to assess the optimal rate of innovation alongside the frequency of large repertoires.

    The optimal rate of innovation decreases as environmental stability increases, and is lower when oblique learning keeps useful information in the population. In general, the more social learning is relied upon, in other words, for higher values of Po, the lower the optimal rate of innovation becomes (figure 1). This optimal value decreases further when oblique learning is more effective and for both success-biased learning and conformist learning. This happens because less frequent innovations are needed to maintain useful information when success bias or conformist learning is in operation than when random oblique learning or vertical learning alone is used. The interaction between effective social learning and innovation is important. Social learning allows the spread of new traits and keeps traits in the population that might otherwise be lost at times when they are not under direct natural selection. Therefore, under different circumstances, effective social learning may be a conservative or an innovative force.

    Figure 1. The optimal (or evolutionarily stable) rate of innovation in a population for different regimes of periodic environmental fluctuation, found after five repeats of the simulation (standard error bars included). Coloured lines represent different proportions of oblique learning and different levels of population connectedness (i.e. different sizes of demonstrator pool during learning events). (a) Results for vertical learning only (black), random oblique learning (red), 10% success-biased learning with k = 5 and k = 50, and 40% success-biased learning with k = 5 (blue, orange and yellow, respectively). (b) The same for conformist learning. s1 = s2 = 1.

    An increase in the connectedness of the population, in other words, an increase in the number of role models from whom an individual can learn, makes social learning more effective and, in turn, decreases the optimal rate of innovation (figure 1). Therefore, the effect of population connectedness in this model is to decrease the optimal rate of innovation but to maintain previously existing information in the population. Depending on the regime of environmental change, an increase in population connectedness may increase or decrease cultural diversity either through its effect on conservation of traits or innovation. This points to an important nuance regarding the effects of innovation. It is not the case that high rates of innovation in response to environmental changes or risks will necessarily lead to high cultural complexity, nor will population connectedness (or population size) drive cultural accumulation in isolation. The cultural repertoire of a population is a complex product of the population's environment, innovativeness, learning biases and connectivity.

    The cultural diversity model examines a case where individuals may be specialists or generalists with the ‘general’ cultural repertoire being larger and representing a higher level of cultural diversity. Individuals with large cultural repertoires are favoured in all environmental conditions. On the other hand, the ‘specialists’ maintain just one cultural repertoire and are culturally adapted to one environment only. Note that there is an implication in economics that ‘specialists’ benefit not just themselves, but also other individuals in the population at times where their skills are needed—this is not the case in this model. The results of the model depend qualitatively on two things. First, the cultural diversity observed in the system depends on the rate of innovation in most cases. Second, the shape of this relationship depends on the rate of environmental fluctuation. For example, figure 2a shows that for vertical learning (dashed lines) when the environment is very unstable (the environmental period is 2), the frequency of the generalist repertoire in the population does not change significantly with the innovation rate. However, when the environment is more stable (environmental period is 100), the frequency of the generalist repertoire increases in the population with the innovation rate. To see why this might be, it is instructive to examine the case where there is little innovation and the system relies heavily on learning. This is the case on the far left of figure 2a where the value of µ = 0.01. For a stable environment with low innovation, the population tends towards high frequencies of a single repertoire, spread by learning, and higher rates of innovation (towards the right of the x-axis) favour diversity.

    Figure 2. The frequency of the large repertoire in the population after 15 000 generations. The environment changes every c generations with values indicated in the legend. (a) Results for Po = 0.1 (low reliance on social learning) and (b) results for Po = 0.6 (high reliance on social learning). s1 = s2 = 1 and δc = 0.01.

    Oblique learning changes this relationship. Where the population relies on some oblique learning, even where the environment is very unstable, novelty is spread rapidly, increasing diversity as rates of innovation increase. For a heavy reliance on social learning, this effect is striking (figure 2b). It is clear here that an increase in reliance on social learning leads to a decrease in diversity compared with vertical learning alone, even as innovation rate increases.

    The aim of this study was to explore the interaction between environmental fluctuations and the rate of cultural innovation within a population and to examine the relationship between rates of innovation and the probability of maintaining a complex cultural repertoire in a changing environment. The models showed that, as in a genetic system, the rate of cultural innovation in a population decreases with environmental stability and increases in unstable environments. This effect was similar for different modes of cultural transmission (success bias, conformity bias and random oblique learning). However, there were clear quantitative differences between these modes of learning. The efficiency of information spread, or the effectiveness of the social learning strategy in use by a population, seems to alter the amount of innovation that is necessary to maintain an effective cultural repertoire in the face of rapid environmental fluctuations. Previous work has shown that cultural accumulation occurs more rapidly in certain populations when the rate of environmental fluctuation is high [27]. The model presented here suggests that this may be partially the result of increased rates of innovation in unstable environments.

    As with initial work in the genetic sphere, these models focus on a non-random, periodic environmental fluctuation. There is considerable scope for extensions of this model to account for temporally random fluctuations as well as fluctuations with more than two possible environmental states. Given the results of the model, we might expect to see that a high rate of environmental change, represented in the model by the parameter c, should allow the evolution of high levels of innovation and thus generate or maintain cultural complexity. This can be gleaned from data by comparing the cultural complexity in populations with the environmental variability to which they are subject. This has been addressed to some extent by Fogarty & Creanza [27]. Analysing a number of papers that collated data on the interaction between the environment and measures of cultural complexity, this study showed that the extent to which environmental factors are correlated with cultural complexity relies heavily on subsistence strategy. However, many studies focus on hunter–gatherer populations alone (e.g. [21–23,41]).

    To assess the nature of a relationship between environmental fluctuation, innovation rate and diversity, it may be possible to use longstanding data on cultural complexity and a variety of environmental parameters from Torrence [42]. This dataset has been used to examine the ‘risk hypothesis’, which suggests that the cultural complexity of a population should be proportional to the environmental risk experienced by that population. Torrence used a population's latitude as a proxy for a direct measure of this ‘riskiness’. Subsequently, Collard et al. (e.g. [21]) suggested that above-ground productivity, among other measures, may be a better approximation of risk, and aspects of the dataset have been critiqued by Henrich [43]. However, latitude is closely related not only to the average temperature experienced by a population but also to the annual temperature range experienced by that population [44], which may be a useful proxy for environmental variability. The dataset used by Torrence and subsequently by others contains information about latitude, and an approximation of the annual temperature range can be gained from that information. Intuitively, a similar correlation exists between the measures of ‘cultural complexity’ and a measure of environmental risk thus defined as has been found between cultural complexity and latitude. However, a complete analysis would require information on cultural complexity, latitude, altitude, as well as downwind distance from the sea for all relevant populations, all of which play a role in determining the temperature range [44].

    One of the most interesting findings of this study is that innovation may increase cultural diversity under certain circumstances but have little or no effect on it under others. A similar relationship between innovation and cultural complexity has been shown in previous studies, notably by Kandler & Laland [45]. This work showed that independent innovation, that is, innovation that does not build on previous knowledge, does in general lead to an increase in cultural diversity. However, conformist learning tended to weaken this relationship. The model presented here shows that effective forms of learning such as conformist transmission may, in fact, lead to lower rates of innovation in a population. The results of the ‘heterozygosity’ model presented above are consistent with this previous work, showing that effective learning decreases the need for a pool of cultural diversity to buffer against a changing environment and that this effect changes the relationship between cultural diversity and innovation.

    Necessarily, there are a number of important aspects of innovation that have not been accounted for in the models presented above. For example, it has been shown that the ‘magnitude’ of innovation is linked to the evolution of innovative behaviour [46], that recombination can be an important aspect of innovation [29,31] and that the increasing cost of innovation as information becomes more complex is a crucial aspect of the relationship between innovation and cultural accumulation [47]. It is also important to note that learning processes are themselves likely to be under selection and that this possibility is not accounted for here. Finally, the results seen in this infinite population model may change in a more realistic finite population where more innovation may be needed to maintain important traits in a slowly changing environment. However, these models represent a step towards further understanding important but sometimes unintuitive links between the environment, innovation and cultural complexity, and may begin to offer a theoretical insight into one of the most contentious questions in cultural evolution at present: what primarily determines cultural complexity, the environment in which a population lives and learns, or the strength and number of the social ties between individuals within that population? A thorough understanding of the effects of the environment on innovation and learning will be crucial in answering that question.

    This article has no additional data.

    I declare I have no competing interests.

    I received no funding for this study.

    I thank the editors and two anonymous reviewers for their careful reading of the paper, and Gilbert Smith for comments on an earlier draft.

    Footnotes

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    References

    • 1

      Cavalli-Sforza LL, Feldman MW. 1973 Cultural versus biological inheritance: phenotypic transmission from parents to children (A theory of the effect of parental phenotypes on children's phenotypes). Am. J. Hum. Genet. 25, 618.

    • 2

      Cavalli-Sforza LL, Feldman MW. 1981 Cultural transmission and evolution. Princeton, NJ: Princeton University Press.

    • 3

      Boyd R, Richerson PJ. 1985 Culture and the evolutionary process. Chicago, IL: Chicago University Press.

    • 4

      Richerson PJ, Boyd R. 2005 Not by genes alone. Chicago, IL: University of Chicago Press.

    • 5

      Laland KN, Odling-Smee J, Feldman MW. 1996 The evolutionary consequences of niche construction: a theoretical investigation using two-locus theory. J. Evolution. 316, 293–316. (doi:10.1046/j.1420-9101.1996.9030293.x)

    • 6

      Henrich J, McElreath R. 2003 The evolution of cultural evolution. Evol. Anthropol. Issues News Rev. 12, 123–135. (doi:10.1002/evan.10110)

    • 8

      Shennan SJ. 2001 Demography and cultural innovation: a model and its implications for the emergence of modern human culture. Cambridge Archaeol. J. 11, 5–16. (doi:10.1017/S0959774301000014)

    • 9

      Henrich J. 2004 Demography and cultural evolution: how adaptive cultural processes can produce maladaptive losses: the Tasmanian case. Am. Antiquity 69, 197–214. (doi:10.2307/4128416)

    • 10

      Read D. 2006 Tasmanian knowledge and skill: maladaptive imitation or adequate technology? Am. Antiquity 71, 164–184. (doi:10.2307/40035327)

    • 11

      Collard M, Buchanan B, Morin J, Costopoulos A. 2011 What drives the evolution of hunter-gatherer subsistence technology? A reanalysis of the risk hypothesis with data from the Pacific Northwest. Phil. Trans. R. Soc. B 366, 1129–1138. (doi:10.1098/rstb.2010.0366)

    • 12

      Buchanan B, O'Brien MJ, Collard M. 2015 Drivers of technological richness in prehistoric Texas: an archaeological test of the population size and environmental risk hypotheses. Archaeol. Anthropol. Sci. See http://link.springer.com/10.1007/s12520-015-0245-4.

    • 13

      Premo LS, Kuhn SL. 2010 Modeling effects of local extinctions on culture change and diversity in the paleolithic. PLoS ONE 5, e15582. (doi:10.1371/journal.pone.0015582)

    • 14

      Derex M et al. 2013 Experimental evidence for the influence of group size on cultural complexity. Nature 503, 389–391. (doi:10.1038/nature12774)

    • 15

      Muthukrishna M et al. 2013 Sociality influences cultural complexity. Proc. R. Soc. B 281, 20132511. (doi:10.1098/rspb.2013.2511)

    • 16

      Kempe M, Mesoudi A. 2014 An experimental demonstration of the effect of group size on cultural accumulation. Evol. Hum. Behav. 35, 285–290. (doi:10.1016/j.evolhumbehav.2014.02.009)

    • 17

      Powell A, Shennan SJ, Thomas MG. 2009 Late Pleistocene demography and the appearance of modern human behavior. Science 324, 1298–1301. (doi:10.1126/science.1170165)

    • 18

      Neiman FD. 1995 Stylistic variation in evolutionary perspective - inferences from decorative diversity and interassemblage distance in Illinois woodland ceramic assemblages. Am. Antiquity 60, 7–36. (doi:10.2307/282074)

    • 19

      Aoki K, Lehmann L, Feldman MW. 2011 Rates of cultural change and patterns of cultural accumulation in stochastic models of social transmission. Theor. Popul. Biol. 79, 192–202. (doi:10.1016/j.tpb.2011.02.001)

    • 20

      Fogarty L et al. 2015 Factors limiting the number of independent cultural traits that can be maintained in a population. In Learning strategies and cultural evolution during the palaeolithic. Part of The replacement of neanderthals by modern humans series (eds Mesoudi A, Aoki K), pp. 9–21. Tokyo, Japan: Springer.

    • 21

      Collard M, Kemery M, Banks S. 2005 Causes of toolkit variation among hunter-gatherers: a test of four competing hypotheses. Can. J. Archaeol. 29, 1–19. (doi:10.1017/S0140525X06009083)

    • 22

      Collard M, Buchanan B, O'Brien MJ, Scholnick J. 2013 Risk, mobility or population size? Drivers of technological richness among contact-period western North American hunter–gatherers. Phil. Trans. R. Soc. B 368, 201220412. (doi:10.1098/rstb.2012.0412)

    • 23

      Collard M, Buchanan B, O'Brien MJ. 2013 Population size as an explanation for patterns in the paleolithic archaeological record. Curr. Anthropol. 54, S388–S396. (doi:10.1086/673881)

    • 24

      Henrich J et al. 2016 Understanding cumulative cultural evolution. Proc. Natl Acad. Sci. USA 113, E6724–E6725. (doi:10.1073/pnas.1610005113)

    • 25

      Collard M, Buchanan B, Ruttle A, O'Brien MJ. 2011 Niche construction and the toolkits of hunter–gatherers and food producers. Biol. Theory 6, 251–259. (doi:10.1007/s13752-012-0034-6)

    • 26

      Kline MA, Boyd R. 2010 Population size predicts technological complexity in Oceania. Proc. R. Soc. B 277, 2559–2564. (doi:10.1098/rspb.2010.0452)

    • 27

      Fogarty L, Creanza N. 2017 The niche construction of cultural complexity: interactions between innovations, population size and the environment. Phil. Trans. R. Soc. B 372, 20160428. (doi:10.1098/rstb.2016.0428)

    • 28

      Nelson MC et al. 2011 Resisting diversity: a long-term archaeological study. Ecol. Soc. 16, 24. (doi:10.5751/ES-03887-160125)

    • 29

      Fogarty L, Creanza N, Feldman MW. 2015 Cultural evolutionary perspectives on creativity and human innovation. Trends Ecol. Evol. 30, 736–754. (doi:10.1016/j.tree.2015.10.004)

    • 30

      Vegvari C, Foley RA. 2014 High selection pressure promotes increase in cumulative adaptive culture. PLoS ONE 9, e86406. (doi:10.1371/journal.pone.0086406)

    • 31

      Kolodny O, Creanza N, Feldman MW. 2015 Evolution in leaps: the punctuated accumulation and loss of cultural innovations. Proc. Natl Acad. Sci. USA 112, E6762–E6769. (doi:10.1073/pnas.1520492112)

    • 33

      Feldman MW. 1972 Selection for linkage modification: I. Random mating populations. Theor. Popul. Biol. 3, 324–346. (doi:10.1016/0040-5809(72)90007-X)

    • 34

      Feldman MW, Liberman U. 1986 An evolutionary reduction principle for genetic modifiers. Proc. Natl Acad. Sci. USA 83, 4824–4827. (doi:10.1073/pnas.83.13.4824)

    • 35

      Carja O, Liberman U, Feldman MW. 2014 Evolution in changing environments: modifiers of mutation, recombination, and migration. Proc. Natl Acad. Sci. USA 111, 17 935–17 940. (doi:10.1073/pnas.1417664111)

    • 36

      Creanza et al. 2014.

    • 37

      Oswalt WH. 1976 An anthropological analysis of food-getting technology. New York, NY: Wiley.

    • 38

      Henrich J, Broesch J. 2011 On the nature of cultural transmission networks: evidence from Fijian villages for adaptive learning biases. Phil. Trans. R. Soc. B 366, 1139–1148. (doi:10.1098/rstb.2010.0323)

    • 39

      MacDonald K. 2007 Cross-cultural comparison of learning in human hunting. Human Nat. 18, 386–402. (doi:10.1007/s12110-007-9019-8)

    • 40

      Hewlett BS, Cavalli-Sforza LL. 1986 Cultural transmission among Aka pygmies. Am. Anthropol. 88, 922–934. (doi:10.1525/aa.1986.88.4.02a00100)

    • 41

      Torrence R. 2001 Hunter-Gatherer Technology: Macro- and Microscale Approaches. In Hunter-gatherers: An interdisciplinary perspective (eds Rowley-Conwy P, Layton RH, Panter-Brick C), pp. 73–98. Cambridge, UK: Cambridge University Press.

    • 42

      Torrence R. 1983 Time Budgeting and Hunter Gatherer Technology. In Hunter-Gatherer economy in prehistory: A european perspective (ed. Bailey GV), pp. 11–22. Cambridge, UK: Cambridge University Press.

    • 43

      Henrich J. 2006 Understanding cultural evolutionary models: a reply to Read's critique. Am. Antiquity 71, 771–782. (doi:10.2307/40035890)

    • 45

      Kandler A, Laland KN. 2009 An investigation of the relationship between innovation and cultural diversity. Theor. Popul. Biol. 76, 59–67. (doi:10.1016/j.tpb.2009.04.004)

    • 46

      Arbilly M, Laland KN. 2017 The magnitude of innovation and its evolution in social animals. Proc. R. Soc. B 284, 20162385. (doi:10.1098/rspb.2016.2385)

    • 47

      Mesoudi A. 2011 Variable cultural acquisition costs constrain cumulative cultural evolution. PLoS ONE 6, 15–17. (doi:10.1371/journal.pone.0018239)


    The explosion of genome-wide association (GWA) studies over the past 10 years might lead one to believe that scientists understand how complex human phenotypes are determined. The wording of many GWA papers suggests that the authors are unaware of the heated debates concerning the utility of the heritability statistic that occurred between 1969 and 1982. These debates, in academic and public forums, were often focused on intelligence (and by proxy measures like IQ), personality traits, or attitudes, and the extent to which these are genetically determined.

    GWA studies have found thousands of DNA variants that are statistically associated with human phenotypes. The phenotypes studied are most frequently diseases, but many non-disease characteristics, such as height [1–3], children's educational achievement [4–6], economic and political preferences [7] and intelligence [8,9], have also been subjected to such DNA association analyses. In every case the amount of variance attributable to genetic differences in the measured trait is less, usually far less, than earlier estimates based on correlations between relatives. This difference is often called ‘missing heritability’ [3] and considerable effort has been expended in augmenting the heritability estimated from GWA studies to bring them closer to the higher values obtained from family studies. In what follows, we place these family studies in some historical context and ask why there should be a focus on ‘heritability’, missing or not.

    In the early 1970s, at Stanford University, William Shockley, a Nobel prize–winning professor of engineering, was expounding his profoundly racist eugenic views on intelligence [10]. At the same time, Arthur Jensen, a professor of educational psychology at the University of California, was promoting similarly eugenic views that he expressed in his notorious monograph [11], which begins: ‘Compensatory education has been tried and it has apparently failed.’ Jensen blames this failure on the poor genetic endowment of those who perform badly at school1 [12,13].

    Jensen chose IQ as the measure of likelihood to succeed in school, and focused on heritability as a measure of the extent of genetic determination. Heritability is denoted by $h^2$, where, according to Jensen, $h$ ‘tells us the correlation between genotypes and phenotypes in the population’ [11, pp. 42–43]. For IQ, Jensen suggested an average value for $h^2$ of 0.81. However, in his assessment, the ‘most satisfactory’ [11, p. 47] and ‘most interesting’ [11, p. 52] estimate of $h^2$ for IQ was by Burt [14], namely 0.86. Burt's studies were, however, discredited by Kamin [15] as having been based on fraudulent data, a few years after Jensen's 1969 laudation of Burt (see also Kamin [16]).

    The reductionist ferment of the early 1970s was the context in which Cavalli-Sforza & Feldman produced their first model for cultural transmission and gene-culture coevolution [17]. They showed that estimates of heritability, which had been interpreted as demonstrating that such human quantitative traits as IQ were mostly genetically determined, could be obtained under vertical cultural transmission; that is, cultural transmission of the trait from parents to offspring. The model used by Cavalli-Sforza & Feldman [17] was defined by a simple dynamic recursion system in which an offspring's phenotype was determined by its genotype and its parents' phenotypes, the latter by direct vertical cultural transmission. At equilibrium of the recursions, correlations between relatives were computed as functions of all transmission parameters, and it was shown that vertical cultural transmission had a profound effect on correlations between relatives, an effect that could be misinterpreted as being due to genetic variation. In other words, cultural transmission from parent to offspring can mimic genetic heritability, and researchers should account for this vertical cultural transmission to avoid inflated estimates of $h^2$.
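
    The point that vertical cultural transmission can mimic genetic heritability can be illustrated with a toy simulation. The sketch below is not the Cavalli-Sforza & Feldman recursion; it is a deliberately simplified, hypothetical model in which the offspring phenotype depends only on the mid-parent phenotype (transmission coefficient `b`, an assumed parameter) plus independent noise, with no genotype at all. A naive analysis that reads the offspring-on-mid-parent regression slope as heritability would nevertheless report a value close to `b`.

```python
import random


def naive_heritability_from_culture(n_families=20000, b=0.8, noise_sd=0.8, seed=1):
    """Toy model with purely cultural (vertical) transmission.

    No genotype enters the model, so the true genetic heritability is zero.
    Returns the slope of offspring phenotype regressed on mid-parent
    phenotype, which a naive analysis would read as heritability.
    All parameter names and values are illustrative assumptions.
    """
    rng = random.Random(seed)
    midparents, children = [], []
    for _ in range(n_families):
        mother = rng.gauss(0.0, 1.0)
        father = rng.gauss(0.0, 1.0)
        midparent = (mother + father) / 2.0
        midparents.append(midparent)
        # The child copies a fraction b of the mid-parent value, plus noise.
        children.append(b * midparent + rng.gauss(0.0, noise_sd))

    mean_x = sum(midparents) / n_families
    mean_y = sum(children) / n_families
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(midparents, children))
    var_x = sum((x - mean_x) ** 2 for x in midparents)
    return cov_xy / var_x


if __name__ == "__main__":
    # The estimate is close to b although nothing genetic is transmitted.
    print(round(naive_heritability_from_culture(), 2))
```

    Setting `b=0.0` removes the parental influence and drives the naive estimate towards zero, which shows that the apparent "heritability" in this sketch comes entirely from cultural transmission.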

    From the 1970s to the 1990s, path analysis was widely used in the statistical analysis of complex human phenotypes, such as IQ. Sewall Wright used path analysis for estimating familial correlations under a linear model [18]. He applied path analysis to data on IQ of biological and adopted children that had been collected by Burks [19]. The linear model underlying Wright's and most subsequent path analyses of IQ is usually called ‘ACE’, where ‘A’ refers to the additive genetic contribution to a child's phenotype, ‘C’ is the contribution from environment common to or shared by offspring reared together in the natal home, and ‘E’ represents environmental contributions unique to each offspring. The phenotype of a child (e.g. IQ) is

    $$P = A + C + E, \tag{2.1}$$

    and the data consist of values of $P$ for parents and offspring, either true or adopted. From parent-offspring and sib-sib correlations (and sometimes correlations between other relatives) the path coefficients, representing the ‘causes’ of $P$, are estimated.
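
    As a worked illustration of how a linear model of this kind partitions variance, suppose (an assumption made here for illustration only) that the three components A, C and E are mutually uncorrelated. The phenotypic variance then splits into three additive parts, and the heritability and shared-environment fractions are the corresponding shares of the total:

```latex
\begin{align}
P &= A + C + E, \\
\operatorname{Var}(P) &= \operatorname{Var}(A) + \operatorname{Var}(C) + \operatorname{Var}(E), \\
h^2 &= \frac{\operatorname{Var}(A)}{\operatorname{Var}(P)}, \qquad
c^2 = \frac{\operatorname{Var}(C)}{\operatorname{Var}(P)}, \qquad
e^2 = \frac{\operatorname{Var}(E)}{\operatorname{Var}(P)}, \qquad
h^2 + c^2 + e^2 = 1 .
\end{align}
```

    The symbol $e^2$ is introduced here only to make the bookkeeping explicit. If the components are correlated, for example when parents transmit both genes and environment, covariance terms appear and this clean partition no longer holds.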

    Newton Morton's group at the University of Hawaii [20,21] applied path analysis to a collection of familial IQ correlations that included Burks' data and those published by Jencks [22], and estimated the heritability [20] of IQ to be $h^2 = 0.75$, with a contribution from shared environment of $c^2 = 0.09$. Wright's estimate of $h^2$, for Burks' data only, was 0.50, and was described [20] as ‘in at least qualitative agreement’ with 0.75. These estimates refer to the analysis of children's variance, but for adults (i.e. parents) the estimates are very much lower [20, table 15].

    Re-analysis by Morton's group [21] of Burks' data [19] produced a slightly lower estimate of $h^2$, namely 0.67, while the estimate of $c^2$ remained close to 0.09. In all of these analyses [20,21], mating was assumed to be random; that is, there was no assorting for IQ.

    In the late 1970s, a series of papers appeared that analysed dynamic models with genetic and cultural transmission together with assortative mating [23–26], namely the choice of mates based on phenotypic similarity. These analyses included larger sets of data and were remarkable in showing that the estimated correlation between the IQs of spouses was close to 0.5, and that the estimate of genetic heritability was much lower than all previous estimates, namely 0.32 [24] and 0.30 [26]. Even more remarkable was that the estimated fraction of variance due to cultural inheritance was not trivial relative to the heritability, namely 0.27 [25] and 0.29 [26].

    There was more to come! In 1982, Morton's group made another path analysis [27] of a somewhat larger dataset of IQ correlations among American family members. This time (and without citing their earlier estimates of 0.75 and 0.67) their estimate of genetic heritability, $h^2$, was 0.31 with 0.42 for cultural heritability, $c^2$. Thirteen years later, Otto et al. [28] applied the path analysis method to sixteen familial correlations for IQ published by Bouchard & McGue [29]. The estimated heritability depended on the type of assortative mating (social homogamy or phenotypic homogamy) and whether cultural transmission was direct or indirect. The estimates of $h^2$ varied from 0.29 to 0.42, while that of $c^2$ was close to 0.27.

    One of the reasons for the low estimates of $h^2$ is that the correlations between the non-transmitted environments of dizygous and monozygous twins are not assumed to be equal. In addition, the correlation between the environments of monozygous twins reared apart, which had usually been ignored (set to zero), turns out not to be small [28]: even when reared apart, twins are likely to have similar environments, for example, two homes within the same extended family. On the basis of the heritability estimates between 0.30 and 0.32, Morton's group concluded [27, p. 197] ‘all analyses appear to rule out high heritability’.

    Twelve years after the publication in leading genetics journals of $h^2$ estimates close to 30%, Herrnstein & Murray write in The Bell Curve [30] on p. 105: ‘In fact IQ is substantially heritable … but half a century of work … permits a broad conclusion that the genetic component of IQ is unlikely to be smaller than 40% or higher than 80%. The most unambiguous direct estimates, based on identical twins reared apart, produce some of the highest estimates of heritability … we will adopt a middling estimate of 60% heritability’. Analysis of this book was public and intense: a 715-page book, The Bell Curve Debate, attests to the broad variety of responses the book engendered [31]. Would Herrnstein & Murray have written their 845-page tome [30] 25 years after Jensen's notorious document [11] if they had believed the heritability of IQ to be 30%?

    Thirty years after the debates described above, some investigators persist in regarding heritability, computed from analyses of twins, as saying something useful about the biological aetiology of human behavioural traits [32]. Turkheimer [33, p. 26] points out that ‘unless the twin studies were somehow mistaken, covariation between DNA and behavioural differences is inevitable.’ Thus almost all human complex traits show some level of heritability as inferred from correlations between relatives. Earlier, Turkheimer [34] introduced the term ‘weak genetic explanation’ to describe this statistical phenomenon, and stresses [33, p. 24] that this weak explanation does not entail that ‘complex individual differences have genetic mechanisms for scientists to discover.’

    It has been known for decades that the phenotype produced by a genotype in one environment may be radically different in another environment [35–37]. The same can be said about partitions of IQ variance in different environments. Nisbett et al. [38] review a number of studies of cognitive abilities in samples of families that differed on some SES-related measures [39–43]. They conclude that ‘the heritability of cognitive ability is attenuated among impoverished children and young adults in the United States.’ These findings may relate to the concept of norm of reaction. The norm of reaction is a ‘table of correspondence between phenotype, on the one hand, and genotype-environment combination on the other’ [44]. In some environments, phenotypic differences among genotypes may be large on the phenotype scale (high heritability), while in other environments phenotypic differences among genotypes may be small (low heritability); this is one explanation for higher heritabilities estimated from twins in higher SES environments.
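
    A small numerical sketch can make the norm-of-reaction argument concrete. The table below is entirely hypothetical: three genotypes, two environments, phenotype means in arbitrary units, and an assumed non-genetic variance that is the same in both environments. The function computes, within each environment, the share of phenotypic variance that lies among genotypes.

```python
from statistics import pvariance

# Hypothetical norm of reaction: mean phenotype (arbitrary units) of each
# genotype in each environment. None of these values come from real data.
NORM_OF_REACTION = {
    "enriched": {"G1": 90.0, "G2": 100.0, "G3": 110.0},
    "impoverished": {"G1": 84.0, "G2": 85.0, "G3": 86.0},
}

# Assumed non-genetic (residual) variance, identical in both environments.
RESIDUAL_VARIANCE = 100.0


def variance_share_among_genotypes(environment):
    """Fraction of phenotypic variance among genotypes in one environment.

    Genotypes are assumed to be equally frequent, so the variance among
    genotypes is the population variance of their mean phenotypes.
    """
    genotype_means = list(NORM_OF_REACTION[environment].values())
    among_genotypes = pvariance(genotype_means)
    return among_genotypes / (among_genotypes + RESIDUAL_VARIANCE)


if __name__ == "__main__":
    for env in NORM_OF_REACTION:
        print(env, round(variance_share_among_genotypes(env), 3))
```

    With these made-up numbers the same three genotypes yield a variance share of 0.4 in one environment and well under 0.01 in the other, so the statistic describes a genotype-environment combination and not a fixed property of the genotypes.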

    The relationship between SES and heritability of cognitive ability described above points to the likelihood of cultural and/or social factors that comprise an important component of the relevant environment, which may not fit naturally into the linear analysis of variance framework that underlies heritability estimates from correlations between relatives. Evolution of aspects of the environment may result in changes in the statistics of familial relationships in cognitive ability. Dickens & Flynn [45,46] call this the ‘social multiplier’ effect, although it can be regarded as part of the evolutionary process under cultural transmission that underpinned the model of Cavalli-Sforza & Feldman [17]. Nisbett et al. [38] review some aspects of the environment that may correlate with SES, that may affect familial statistics of cognitive ability, and that are plausibly culturally transmitted.

    Turkheimer [47] suggested that the results of decades of studies of correlations of behavioural traits between relatives to that date could be summarized by ‘Three Laws of Behaviour Genetics’:

    (1)

    All behavioural traits are heritable. (We interpret this as $h^2 > 0$.)

    (2)

    The effect of being raised in the same family is smaller than the effect of ‘genes’. (Our quotes added to indicate that ‘genes’ here refers to the A component in the linear variance ACE model rather than something from the DNA sequence.)

    (3)

    A substantial portion of the variance in complex human behavioural traits is not accounted for by the effects of genes or families.

    These laws apply equally well to traits such as IQ, cognitive level, years of schooling, body mass index and most other complex non-behavioural traits.

    A remarkable meta-analysis of twin studies by Polderman et al. [48] appeared in 2015. They evaluated variance components for almost 18 000 traits in 2748 publications including 14 558 903 twin pairs. They report $h^2 = 0.49$ and $c^2 = 0.17$ ‘across all traits’, that twin resemblance is solely due to additive genetic variation, and that ‘the data are inconsistent with substantial influences from shared environment or non-additive genetic variation’. Two ways of estimating heritability were used in this massive meta-analysis. Method 1 uses $r_{MZ}$ and $r_{DZ}$, the correlations between monozygous and dizygous twins, respectively, and calculates $2(r_{MZ} - r_{DZ})$ as the genetic heritability $\hat{h}^2$, and $2r_{DZ} - r_{MZ}$ as the common environment component $\hat{c}^2$ [49, p. 172]. These estimates, which are frequently used in twin studies, are to be compared with those from Method 2, namely $h^2$ and $c^2$, respectively, which are derived from the ACE model.
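
    The Method 1 calculation is simple enough to write out directly. The sketch below implements the standard Falconer-style estimators, twice the difference between the MZ and DZ twin correlations for heritability and twice the DZ correlation minus the MZ correlation for the common environment component. The function name and the example correlations are hypothetical and chosen only for illustration; they are not taken from Polderman et al.

```python
def falconer_estimates(r_mz, r_dz):
    """Return (h2_hat, c2_hat) computed from twin correlations.

    r_mz: phenotypic correlation between monozygous twins
    r_dz: phenotypic correlation between dizygous twins
    """
    h2_hat = 2.0 * (r_mz - r_dz)   # heritability estimate
    c2_hat = 2.0 * r_dz - r_mz     # common (shared) environment estimate
    return h2_hat, c2_hat


if __name__ == "__main__":
    # Hypothetical twin correlations, for illustration only.
    h2_hat, c2_hat = falconer_estimates(r_mz=0.70, r_dz=0.45)
    print(f"h2_hat = {h2_hat:.2f}, c2_hat = {c2_hat:.2f}")
    # prints: h2_hat = 0.50, c2_hat = 0.20
```

    By construction the two estimates sum to the MZ correlation, so anything that raises the resemblance of MZ twins relative to DZ twins, whether genetic or not, is booked as heritability by this method.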

    Polderman et al. [48] organized traits into groups (which they called ‘functional domains’) and present both heritability estimates for each group of traits. Our interest here is focused on the groups they called ‘cognitive’. For the functional domain designated ‘cognitive’, the Method 1 estimates $\hat{h}^2$ and $\hat{c}^2$ were 0.55 and 0.10, respectively, while the Method 2 estimates $h^2$ and $c^2$ were 0.47 and 0.18, respectively. These functional domains were subdivided, and one subgroup was designated as ‘higher-level cognitive function’ for which the estimates of $\hat{h}^2$ and $\hat{c}^2$ were 0.54 and 0.17, respectively. For this subgroup $h^2$ and $c^2$ were 0.55 and 0.18, respectively. The earlier lower estimates of family-based heritability, 30–40%, are not cited in this meta-analysis. Falconer & Mackay [49] point out that the use of $\hat{h}^2$ relies on the environmental components of variance in MZ and DZ twins being the same. In fact, they go on to list [49, pp. 172–173] seven ways in which a difference in these environmental components could be produced. Not included among these seven are differences in cultural transmission, either vertical or horizontal, that can affect MZ and DZ twins differently [28], for example, due to parents actively trying to treat each member of an MZ twin pair differently, or peers treating MZ twins differently from DZ twins. As has been pointed out many times [44,50,51], Polderman et al.'s estimate of 49% ‘across all traits’ does not, as they claim, tell us about ‘the causes of individual differences in human traits’ nor will it ‘guide future gene-mapping efforts.’ That is, heritability in general does not imply a genetic causal mechanism.

    In the genomic era, intelligence has been the subject of several genome-wide association studies, the largest of which is a recent meta-analysis by Sniekers et al. [52]. This study included 78 308 people from 13 cohorts. Various measures of intelligence were used in eight of the cohorts, and the other five used Spearman's g (a statistical measure computed from a factor analysis of correlations among a number of psychometric tests of intelligence, the so-called general intelligence factor). There is remarkable heterogeneity among the measures of intelligence in the thirteen cohorts. In the two largest cohorts (54 119 individuals), intelligence was assessed by the number of correct answers out of thirteen questions produced in two minutes. The full meta-analysis included more than 12 million single nucleotide polymorphisms (SNPs). The paper begins with the announcement in the abstract that the heritability of IQ is 54%, which is actually the value of $\hat{h}^2$ reported in the meta-analysis by Polderman et al. [48] and computed from the twin correlations $r_{MZ}$ and $r_{DZ}$. Using the SNPs, and a recent variance analysis method called ‘polygenic score regression’, Sniekers et al. [52] obtain an estimate of 20% for the heritability of intelligence. However, again using polygenic scores [53], the meta-analysis was only able to explain between 2% and 4.8% of the variance in four other studies, the largest of which had 9904 samples. Meta-analysis of these 9904 samples explained 2% of the phenotypic variance, while the 4.8% represented results from meta-analysis of a subset of 1558 samples. The laudatory Nature editorial [54] sums up these statistics by stating: ‘The associations … could explain up to 4.8% of the variance in intelligence across these cohorts.’ Even if one believed in the underlying linear statistical models that gave this result, 4.8% does not seem worth writing home about. Indeed, as Nisbett et al. [38, p. 135] write, ‘It may simply be that the number of genes involved in an outcome as complex as intelligence is very large, and therefore the contribution of any individual locus is just as small as the number of genes is large.’ In fact, Chabris et al. [55] augment Turkheimer's three laws of behaviour genetics with a fourth law that summarizes many GWAS studies of behavioural traits.
    (4)

    A typical human behavioural trait is associated with very many genetic variants, each of which accounts for a very small percentage of behavioural variability.

    Cognitive ability assessed through IQ tests is just one of the many complex human behavioural traits whose ‘genetics’ has been investigated using data from twins. Personality traits such as extraversion and neuroticism are among those that have received most attention. Variation between twins in aspects of personality assessed using responses to questionnaires was detailed by Eaves et al. [56]. The dimensions of personality in this analysis were psychoticism, extraversion, neuroticism, and a ‘lie’ scale ‘designed to identify subjects responding in a “socially desirable” manner’ [56, p. 74]. They concluded from these studies that there is an ‘overwhelming and consistent pattern’ of ‘a significant genetic component’ to all four personality measures with ‘no trace of a shared environmental component of twin resemblance’ [56, p. 121]. Interestingly, 24 years later one of these authors wrote ‘the structure of personality is inherent in the evolved phenotype, and is not the immediate consequence of either genetic or environmental organizing factors’ [57, p. 761].

    Martin et al. [58] used a model similar to that of Rice [26] to analyse data on social attitudes of MZ and DZ twins, supplemented by data on social attitudes of spouses. From one set of data a composite score for conservatism was derived from dichotomous answers by Australian twins to a fifty-item questionnaire. A second dataset was used to produce composite scores for radicalism and tough-mindedness derived from a forty-item questionnaire with items on a five-point scale. Inclusion of a sample of spouses allowed estimation of the degree of assortative mating for social attitudes. For the British radicalism sample, 72% of the variation in males was found to be genetic and the cultural component zero, while in females 24% was genetic and 12% was cultural. For the Australian conservatism sample, the genetic components in males and females, respectively, were 56% and 69%, while the cultural component in both was estimated to be zero.

    Martin et al. concluded [58] that their data on social attitudes were ‘largely consistent’ with a genetic model for family resemblance with ‘little evidence of vertical cultural transmission’. In reviewing this work on social attitudes, Eaves et al. [56] went further: ‘we may find that genetic differences between people are partly responsible for the distinction between godly and ungodly and between liberal and conservative in contemporary societies’ [56, p. 358]. However, a comprehensive review of such studies was made by Turkheimer et al. [59, p. 520], who document, in particular, the history of heritability estimates for personality traits. They summarize this history as follows: ‘One can identify broad dimensions of behaviour; quantify their relation to a broad spectrum of genes; and obtain consistent replicable results that fail to differentiate among behaviours and become uninteresting once they are established. Under most circumstances, both extraversion and introversion are heritable at approximately 0.4, and there is little more to be said.’ Again, it should be stressed that the existence of a genetic causal mechanism cannot be inferred from such statistics.

    The danger inherent in these studies of human behavioural, attitudinal, or personality variation resides in the meaning of ‘heritability’, whether estimated from ACE models or from statistical analysis of GWA studies. Morton [60, p. 327] makes the point succinctly: ‘one would be quite unjustified in claiming that heritability is relevant to educational strategy.’ That is, heritability estimated from familial correlations or from models designed to analyse GWA studies (although the latter began some 35 years after Morton was writing), and whether it is 5% or 95%, is not informative about the chance that environmental intervention will affect the trait under study. Despite Morton's admonition, we still see claims such as the following [61] made in 2016: ‘… soon a bit of saliva or blood from a newborn will be able to capture her full genetic potential for educational attainment …’ followed by ‘now that we have mapped the genetic architecture behind a wide range of outcomes—from height to cognitive ability—a brave new world has opened up whereby we can select our mates, and yes, even our children, by and for their genotypes’. Whereas this is plausible for relatively simple genetic traits, e.g. Mendelian diseases, it is quite implausible for height, educational attainment, cognitive ability or personality.

    How are geneticists and/or social scientists to interpret estimates of heritability made from linear statistical models for familial phenotypic relationships or the contributions of genomic polymorphisms to phenotypes? We must start from recognition that all complex human traits result from a combination of causes. If these causes interact, it is impossible to assign quantitative values to the fraction of a trait due to each, just as we cannot say how much of the area of a rectangle is due, separately, to each of its two dimensions. Thus, in the analyses of complex human phenotypes, such as those described above, we cannot actually find ‘the relative importance of genes and environment in the determination of phenotype’ [44].
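
    The rectangle analogy can be made concrete with a small numerical sketch. It is an illustration only, not an analysis taken from the cited sources: the function name `variance_shares`, the balanced toy populations and all numerical values are assumptions. The phenotype is taken to be the product of a 'genetic' factor and an 'environmental' factor (the area of a rectangle), and the variance is then apportioned in the usual analysis-of-variance way.

```python
from itertools import product
from statistics import mean, pvariance


def variance_shares(g_values, e_values):
    """Split the variance of P = G * E into main-effect and interaction shares.

    Assumes a balanced toy population in which every (g, e) combination
    is equally frequent; this is an illustrative assumption only.
    """
    cells = {(g, e): g * e for g, e in product(g_values, e_values)}
    phenotypes = list(cells.values())
    total = pvariance(phenotypes)
    grand_mean = mean(phenotypes)

    # Main effect of each factor: variance of its marginal means.
    g_means = [mean(cells[(g, e)] for e in e_values) for g in g_values]
    e_means = [mean(cells[(g, e)] for g in g_values) for e in e_values]
    var_g = mean((m - grand_mean) ** 2 for m in g_means)
    var_e = mean((m - grand_mean) ** 2 for m in e_means)

    # Whatever is left over is the interaction term.
    var_gxe = total - var_g - var_e
    return {"G": var_g / total, "E": var_e / total, "GxE": var_gxe / total}


# Same causal rule (P = G * E), two hypothetical populations.
population_1 = variance_shares(g_values=[1, 2, 3], e_values=[10, 11, 12])
population_2 = variance_shares(g_values=[10, 11, 12], e_values=[1, 2, 3])
print(population_1)
print(population_2)
```

    The causal rule is identical in both toy populations, yet the share attributed to G is above 90% in one and below 10% in the other, because the shares reflect how much each factor happens to vary rather than how the trait is produced.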

    To illustrate their sceptical view of genetic interpretation of the heritability of personality traits, Turkheimer et al. [59, p. 532] consider marital status: ‘Divorce is heritable [62], but do we really expect that twin studies of marital processes will lead us to a genetic explanation of divorce? … The point is not that they are environmental as opposed to genetic; indeed as we cannot emphasize enough, marriage, divorce and whatever may cause them are just as heritable as anything else.’ But this heritability does not mean that either is ‘a biological process awaiting genetic analysis … they do not have a specific genetic aetiology.’

    In analyses of familial correlations for human traits, the linear statistical models that give estimates of amounts of phenotypic variance due to genes and environments (and hence heritability) rarely specify what constitutes the environment. Cavalli-Sforza & Feldman [17] took the parents' phenotypes to represent the environment in which an offspring develops its own phenotype, measured on the same scale as those of its parents, even though properties of parents other than those measured on the scale of the phenotype are likely to have strong effects on that offspring's phenotype, as in the case of parental SES and children's IQ mentioned above [40]. Subsequent treatments followed Cavalli-Sforza & Feldman [17] and incorporated vertical cultural transmission (i.e. of phenotype from parent to child) into analyses of variance in IQ, attitudes and other traits [24–27,56,58].

    All models that incorporate genetic and vertical cultural transmission involve a dynamic process for the evolution of the phenotype and statistical analysis of familial correlations at the stationary state. A recent analysis by Feldman et al. [63] extended this class of models by making specific assumptions about how the different parental pairs of phenotypes contribute probabilistically to their offspring's phenotype. The simplest such case assumes a single gene, with genotypes AA, Aa, and aa, and two variants of a phenotype, labelled 1 and 2. Thus there are six phenogenotypes: AA1, AA2, Aa1, Aa2, aa1, aa2. The probability that an offspring acquires phenotype 1 depends on its own genotype but not on those of its parents; it does, however, depend on their phenotypes. The general form of such phenogenotypic transmission is shown in table 1 [63].

    Table 1. Rules of phenogenotypic transmission.

    Offspring phenogenotype probability (given offspring's genotype)

    parental phenotypes (M × F)   AA1   AA2      Aa1   Aa2      aa1   aa2
    1 × 1                         α₁    1 − α₁   α₂    1 − α₂   α₃    1 − α₃
    1 × 2                         β₁    1 − β₁   β₂    1 − β₂   β₃    1 − β₃
    2 × 1                         γ₁    1 − γ₁   γ₂    1 − γ₂   γ₃    1 − γ₃
    2 × 2                         δ₁    1 − δ₁   δ₂    1 − δ₂   δ₃    1 − δ₃

    Although the framework exhibited in table 1 is quite simple, it does involve 12 phenogenotypic transmission rates. For this reason, Feldman et al. [63] simplified the transmission rule in table 1 to a form they called ‘bilinear transmission’, shown in table 2. In table 2, β is a baseline probability that any offspring, regardless of its genotype or parents' phenotypes, acquires phenotype 1; α is a transmission component due to an offspring carrying an A allele, with σ a measure of genetic dominance of A over a; η is the contribution to the offspring's chance of carrying phenotype 1 by each parent who carries phenotype 1, with τ a measure of marital dominance in transmission of phenotype 1. Thus, if τ = 2, for example, a parental couple only one of whom has phenotype 1 transmits this phenotype at the same rate as a couple both of whom are of phenotype 1. The final parameter in this model is the rate m at which parents mate assortatively.

    Table 2. Bilinear transmission scheme.*

    Probability that offspring phenotype is 1

    mating (M × F)   AA                        Aa                        aa
    1 × 1            α₁ = 2η + 2α + β          α₂ = 2η + σα + β          α₃ = 2η + β
    1 × 2            β₁ = γ₁ = τη + 2α + β     β₂ = γ₂ = τη + σα + β     β₃ = γ₃ = τη + β
    2 × 1            β₁ = γ₁ = τη + 2α + β     β₂ = γ₂ = τη + σα + β     β₃ = γ₃ = τη + β
    2 × 2            δ₁ = 2α + β               δ₂ = σα + β               δ₃ = β
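
    The bilinear scheme of table 2 can be written as a single function of the number of parents with phenotype 1 and the number of A alleles carried by the offspring. The following is a minimal sketch under stated assumptions: the function name `prob_phenotype_1` and its argument names are hypothetical, the 2 × 1 row is taken to equal the 1 × 2 row, and the baseline β = 0.1 is an arbitrary illustrative value that does not come from the source.

```python
def prob_phenotype_1(parents_with_1, a_alleles, *, alpha, beta, eta, sigma, tau):
    """Probability that an offspring acquires phenotype 1 (bilinear scheme).

    parents_with_1: number of parents (0, 1 or 2) carrying phenotype 1.
    a_alleles: number of A alleles (0, 1 or 2) in the offspring's genotype.
    """
    # Parental contribution: 2*eta for 1 x 1, tau*eta for a mixed couple, 0 for 2 x 2.
    parental = {2: 2 * eta, 1: tau * eta, 0: 0.0}[parents_with_1]
    # Genetic contribution: 2*alpha for AA, sigma*alpha for Aa, 0 for aa.
    genetic = {2: 2 * alpha, 1: sigma * alpha, 0: 0.0}[a_alleles]
    probability = parental + genetic + beta
    if not 0.0 <= probability <= 1.0:
        raise ValueError("parameters do not give a valid probability")
    return probability


# alpha = eta = 0.2 is one of the parameter sets discussed in the text;
# beta = 0.1 is a hypothetical baseline chosen only for this example.
params = dict(alpha=0.2, beta=0.1, eta=0.2, sigma=2, tau=2)
both_parents = prob_phenotype_1(2, 1, **params)
one_parent = prob_phenotype_1(1, 1, **params)
print(both_parents, one_parent)
```

    With τ = 2 the two calls return the same probability, which is the sense in which a couple with only one parent of phenotype 1 transmits it at the same rate as a couple in which both parents carry it.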

    Since there is no selection in the model, the frequency, p, of allele A, does not change, and the evolutionary dynamics of the six phenogenotypes can be specified in terms of the frequency k of phenotype 1 and the frequency of allele A among individuals of phenotype 1. The dynamics converge to an equilibrium at which all the usual correlations between relatives, as well as additive effects of alleles A1 and A2 [49, pp. 112–115] can be computed. The latter are used to derive the actual narrow-sense heritability, $h^2$, of the phenotype, namely $V_A/V_P$, where $V_A$ is the additive genetic variance and $V_P$ is the phenotypic variance of the population at equilibrium. From the equilibrium values of the correlations between MZ and DZ twins, we can also calculate $\hat{h}^2$ and $\hat{c}^2$, for different values of the transmission parameters and m. Examples with assorting rate m = 0.5 are shown in table 3; m = 0.5 is chosen because it is very close to the value estimated for radicalism and tough-mindedness from the 562 British spousal pairs in Martin et al. [58].
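
    As a rough illustration of what a ratio of additive genetic variance to phenotypic variance involves, the sketch below applies the standard single-locus formulas of quantitative genetics to a binary phenotype. It is not the equilibrium calculation of Feldman et al. [63]: it assumes random mating (Hardy-Weinberg proportions) and purely genetic determination, so it ignores assortative mating and parental transmission and is not expected to reproduce table 3. The function name `narrow_sense_h2`, the allele frequency p = 0.5 and the genotype-specific probabilities are hypothetical values chosen for the example.

```python
def narrow_sense_h2(p, g_AA, g_Aa, g_aa):
    """Narrow-sense heritability V_A / V_P for one biallelic locus.

    p is the frequency of allele A; g_AA, g_Aa and g_aa are the probabilities
    that each genotype shows phenotype 1. Random mating is assumed.
    """
    q = 1.0 - p
    a = (g_AA - g_aa) / 2.0           # half the difference between homozygotes
    d = g_Aa - (g_AA + g_aa) / 2.0    # dominance deviation of the heterozygote
    avg_effect = a + d * (q - p)      # average effect of an allele substitution
    v_a = 2.0 * p * q * avg_effect ** 2
    # Frequency of phenotype 1 and the variance of the resulting 0/1 trait.
    k = p * p * g_AA + 2.0 * p * q * g_Aa + q * q * g_aa
    v_p = k * (1.0 - k)
    return v_a / v_p


# Hypothetical probabilities: additive case and complete dominance of A.
h2_additive = narrow_sense_h2(0.5, 0.9, 0.5, 0.1)
h2_dominant = narrow_sense_h2(0.5, 0.9, 0.9, 0.1)
print(h2_additive, h2_dominant)
```

    Even in this stripped-down setting the value depends on the allele frequency and on the degree of dominance, not only on how strongly genotype affects the phenotype.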

    Table 3. Estimates of heritability for bilinear transmission.*

    σ = τ = 1      α = 0.4, η = 0   α = 0.2, η = 0.2   α = 0, η = 0.4
    $\hat{h}^2$    0.234            0.058              0
    $\hat{c}^2$    0                0.177              0.48
    $h^2$          0.276            0.127              0

    σ = τ = 2      α = 0.4, η = 0   α = 0.2, η = 0.2   α = 0, η = 0.4
    $\hat{h}^2$    0.48             0.130              0
    $\hat{c}^2$    0                0.071              0.418
    $h^2$          0.312            0.145              0

    The parameter sets in table 3 were chosen to represent cases of table 2 in which there is only genetic determination (α = 0.4, η = 0); genetic determination and parental transmission are equally important (α = 0.2, η = 0.2); and there is only parental transmission (α = 0, η = 0.4). As in Polderman et al. [48], $\hat{h}^2$ is computed as $2(r_{MZ} - r_{DZ})$ and $\hat{c}^2$ is computed as $2r_{DZ} - r_{MZ}$, while $h^2$ is the actual narrow-sense heritability, $V_A/V_P$. As expected, when η = 0 also $\hat{c}^2 = 0$, and when α = 0, $\hat{h}^2$ and $h^2$ are also both zero. However, when α = η = 0.2, the dominance parameters σ and τ become important. With no dominance (σ = τ = 1) the environmental fraction of the phenotypic variance is about three times the genetic fraction, but with complete genetic and marital dominance (σ = τ = 2), the genetic value is almost twice the environmental contribution. In both cases, there are substantial discrepancies between $\hat{h}^2$ and $h^2$, but again the direction of these differences depends on the levels of dominance.
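
    The twin-based estimates used in this comparison are simple functions of the two twin correlations. A minimal sketch of the standard formulas follows, with heritability estimated as twice the difference between the MZ and DZ correlations and the shared-environment component as twice the DZ correlation minus the MZ correlation. The function name `falconer_estimates` and the example correlations are hypothetical and are not taken from any of the studies cited.

```python
def falconer_estimates(r_mz, r_dz):
    """Twin-based estimates from MZ and DZ twin correlations.

    Returns (h2_hat, c2_hat): the heritability estimate 2 * (r_mz - r_dz)
    and the shared-environment estimate 2 * r_dz - r_mz.
    """
    h2_hat = 2.0 * (r_mz - r_dz)
    c2_hat = 2.0 * r_dz - r_mz
    return h2_hat, c2_hat


# Hypothetical twin correlations, chosen only to show the arithmetic.
h2_hat, c2_hat = falconer_estimates(r_mz=0.6, r_dz=0.4)
print(h2_hat, c2_hat)
```

    Both quantities are linear summaries of two correlations. Nothing in their computation refers to the transmission parameters that generated the correlations, which is why they can diverge from the causal structure of a model such as that of table 2.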

    The important feature of the model in table 2 is that it is not a linear statistical model designed for analysis of variance. It is an explicitly causative model from whose dynamic equilibrium familial and population statistics can be computed. In this simple model, the commonly computed variance estimates $\hat{h}^2$ and $\hat{c}^2$ do not reflect the relative importance of genetic and environmental causation.

    The biomedical and behavioural science literature over the past ten years has seen a deluge of GWA papers attempting to find common DNA markers that might be statistically associated with complex human behavioural phenotypes. As sample sizes have increased, more such markers have been found, but in few reports has the extent of environmental contribution to disease or behavioural phenotypes been taken very seriously. Epigenetic phenomena, e.g. methylation, have in some cases been shown to be influenced by such culturally transmitted environmental variables as diet or stress, but the scientific literature's focus has consistently been on common DNA polymorphisms whose effects on the phenotype under study have almost always been small [1,64,65]. Further, the contributions of these significant polymorphisms to the phenotypic variance have been small enough relative to the variance attributed to genes in analysis of familial contributions that the term ‘missing heritability’ entered the lexicon.

    Missing compared to what? In the first part of this note, we discussed the limitations of estimates of heritability from familial correlations, in particular their reliance on linear models and irrelevance with respect to potential environmental interventions. Why, then, should such heritabilities be the standards relative to which GWA-based variance analyses are compared? By including those polymorphisms that failed to be significant in GWA studies, analyses of new linear models [2,66] have produced increased estimates of the variance fraction explained by genomic variation. However, it almost always remains below that estimated from familial analyses.

    That heritability of a trait estimated from correlations between relatives is specific to the population in which the trait was assessed has been known for decades [50,51]. Cigarette smoking in U.S. adolescents and young adults is an example where twin-based heritability differs between whites and African-Americans [67]. A recent analysis of eight phenotypes on genomic data from the 1000 Genomes Project reference panel showed that such summary statistics as polygenic risk scores or heritability, derived from a GWAS in one population (e.g. Europeans), ‘may have limited portability to other populations’ [68]. It is also the case that genetic mutations that cause a phenotype in a population in one environment may produce an entirely different phenotype in members of a diaspora of that same population because of the changed environment experienced by the latter [69].

    Recent analyses of GWAS datasets for height and schizophrenia [65] have arrived at the conclusion that the effects of SNPs that actually influence complex phenotypes are likely to be extremely small. For example, more than 100 000 SNPs ‘exert independent causal effects on height’. This extreme polygenicity, termed ‘omnigenicity’ [65], also characterizes schizophrenia, for which it is inferred that ‘broadly expressed genes contribute more to overall heritability than do brain-specific genes’. If all genes have some interaction with causal genes, it can be predicted that gene-environment interactions, even if important for causal genes, will be difficult to detect because they are likely to have small genome-wide effects whose sum may exceed the magnitude of such interactions with causal genes. Nisbett et al. [38, p. 135] are not optimistic: ‘This problem is not likely to be solved by advances in genetic technology that are foreseeable at present.’

    The language used to interpret heritability has not changed much with advances in genomics, despite occasional genuflections towards its inability to assign causes and to gene-environment/gene-culture interactions. We still find statements on heritability such as ‘the same genes affect intelligence from age to age’ [9, p. 100], ‘intelligence shares genetic causes with education and social class’ [9, p. 104], and, referring to familial and SNP-based heritability estimates, ‘the same genes influence intelligence and social epidemiologists' ‘environmental’ variables of education, social class and height’ and ‘can enlighten research in health and social inequalities’ [9, p. 106].

    Although there have been minor changes in the lexicon surrounding the calculation of heritability, due to the evolution of genomic technology, the problem of the meaning of heritability has not gone away. Heritability estimated from linear models for variance analysis still depends on the environment in which it is measured, and an increase in SNP-based heritability of cognitive performance from 10% to 30% cannot provide useful information as to whether cultural or environmental intervention is likely to have an effect. It is almost 50 years since heritability of human traits became discredited as an indicator of genetic causation. To those who were around when Jensen's monograph appeared in 1969, it must seem like déjà vu all over again.

    This article has no additional data.

    We declare we have no competing interests.

    This study was funded in part by the Stanford Center for Computational, Evolutionary and Human Genomics.

    The authors thank Professors Noah Rosenberg and Nicole Creanza for their careful reading of an early draft.

    Footnotes

    1 Jensen's article appeared in the spring 1969 issue of the Harvard Educational Review. The summer 1969 issue of the same journal contained several critical responses to Jensen's thesis.

    One contribution of 16 to a theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    • 1

      Manolio TA et al. 2009 Finding the missing heritability of complex diseases. Nature 461, 747–753. (doi:10.1038/nature08494)

    • 2

      Yang J et al. 2010 Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569. (doi:10.1038/ng.608)

    • 3

      Golan D, Lander ES, Rosset S. 2014 Measuring missing heritability: inferring the contribution of common variants. Proc. Natl Acad. Sci. USA 111, E5272–E5281. (doi:10.1073/pnas.1419064111)

    • 4

      Okbay A et al. 2016 Genome-wide association study identifies 74 loci associated with educational attainment. Nature 533, 539–542. (doi:10.1038/nature17671)

    • 5

      Hayden EC. 2016 Gene variants linked to education prove divisive. Nature 533, 154–155. (doi:10.1038/533154a)

    • 6

      Krapohl E, Plomin R. 2015 Genetic link between family socioeconomic status and children's educational achievement estimated from genome-wide SNPs. Mol. Psychiatry 21, 437–443. (doi:10.1038/mp.2015.2)

    • 7

      Benjamin DJ et al. 2012 The genetic architecture of economic and political preferences. Proc. Natl Acad. Sci. USA 109, 8026–8031. (doi:10.1073/pnas.1120666109)

    • 8

      Davies G et al. 2011 Genome-wide association studies establish that human intelligence is highly heritable and polygenic. Mol. Psychiatry 16, 996–1005. (doi:10.1038/mp.2011.85)

    • 9

      Plomin R, Deary IJ. 2015 Genetics and intelligence differences: five special findings. Mol. Psychiatry 20, 98–108. (doi:10.1038/mp.2014.105)

    • 10

      Shockley W. 1972 Dysgenics, geneticity, raceology: a challenge to the intellectual responsibility of educators. Phi Delta Kappan 53, 297–307.

    • 11

      Jensen AR. 1969 How much can we boost IQ and scholastic achievement? Harv. Educ. Rev. 39, 1–123. (doi:10.17763/haer.39.1.l3u15956627424k7)

    • 12

      Bodmer WF, Cavalli-Sforza LL. 1970 Intelligence and race. Sci. Am. 223, 19–29. (doi:10.1038/scientificamerican1070-19)

    • 13

      Kevles DJ. 1995 In the name of eugenics. Cambridge, MA: Harvard University Press.

    • 14

      Burt C. 1966 The genetic determination of difference in intelligence: a study of monozygotic twins reared together and apart. Br. J. Psychol. 57, 137–153. (doi:10.1111/j.2044-8295.1966.tb01014.x)

    • 15

      Kamin LJ. 1973 Heredity, Intelligence, Politics and Psychology. Unpublished. Eastern Psychological Association convention May 5, 1973.

    • 16

      Kamin LJ. 1974 The science and politics of I.Q. Potomac, MD: Lawrence Erlbaum Associates, Publishers.

    • 17

      Cavalli-Sforza LL, Feldman MW. 1973 Cultural versus biological inheritance: phenotypic transmission from parents to children (A theory of the effect of parental phenotypes on children's phenotypes). Am. J. Hum. Genet. 25, 618–637.

    • 18

      Wright S. 1931 Statistical methods in biology. J. Am. Stat. Assoc. 26, 155–163.

    • 19

      Burks BS. 1928 The relative influence of nature and nurture upon mental development: a comparative study of foster parent–foster child resemblance and true parent–true child resemblance. In 27th Yearbook of the National Society for the Study of Education, Part 1, pp. 219–316. Bloomington, IN: Public School Publishing Co.

    • 20

      Rao DC, Morton NE, Yee S. 1974 Analysis of family resemblance. II. A linear model for familial correlation. Am. J. Hum. Genet. 26, 331–359.

    • 21

      Rao DC, Morton NE, Yee S. 1976 Resolution of cultural and biological inheritance by path analysis. Am. J. Hum. Genet. 28, 228–242.

    • 22

      Jencks C. 1972 Inequality: a reassessment of the effect of family and schooling in America. New York, NY: Basic Books.

    • 23

      Cavalli-Sforza LL, Feldman MW. 1978 The evolution of continuous variation. III. Joint transmission of genotype, phenotype and environment. Genetics 90, 391–425.

    • 24

      Cloninger CR, Rice J, Reich T. 1979 Multifactorial inheritance with cultural transmission and assortative mating. II. A general model of combined polygenic and cultural inheritance. Am. J. Hum. Genet. 31, 176–198.

    • 25

      Cloninger CR, Rice J, Reich T. 1979 Multifactorial inheritance with cultural transmission and assortative mating. III. Family structure and the analysis of separation experiments. Am. J. Hum. Genet. 31, 366–388.

    • 26

      Rice J, Cloninger CR, Reich T. 1978 Multifactorial inheritance with cultural transmission and assortative mating. I. Description and basic properties of the unitary models. Am. J. Hum. Genet. 30, 618–643.

    • 27

      Rao DC, Morton NE, Lalouel JM, Lew R. 1982 Path analysis under generalized assortative mating. II. American IQ. Genet. Res. Camb. 39, 187–198. (doi:10.1017/S0016672300020875)

    • 28

      Otto SP, Christiansen FB, Feldman MW. 1995 Genetic and cultural inheritance of continuous traits. Morrison Institute for Population and Resource Studies Working Paper No. 64. See http://hsblogs.stanford.edu/morrison/morrison-institute-working-papers-pdf/.

    • 29

      Bouchard T, McGue M. 1981 Familial studies of intelligence: a review. Science 212, 1055–1059. (doi:10.1126/science.7195071)

    • 30

      Herrnstein RJ, Murray C. 1994 The bell curve: intelligence and class structure in American life. New York, NY: Free Press.

    • 31

      Jacoby R, Glauberman N (eds). 1995 The bell curve debate. New York, NY: Times Books.

    • 32

      Plomin R, DeFries JC, Knopik VS, Neiderhiser JM. 2016 Top 10 replicated findings from behavioral genetics. Perspect. Psychol. Sci. 11, 3–23. (doi:10.1177/1745691615617439)

    • 33

      Turkheimer E. 2016 Weak genetic explanation 20 years later: reply to Plomin et al. (2016). Perspect. Psychol. Sci. 11, 24–28. (doi:10.1177/1745691615617442)

    • 34

      Turkheimer E. 1998 Heritability and biological explanation. Psychol. Rev. 105, 782–791. (doi:10.1037/0033-295X.105.4.782-791)

    • 35

      Clausen J, Keck DD, Hiesey WM. 1940 Experimental studies on the nature of species. I. Effects of varied environments on western North American plants. Washington, DC: Carnegie Institution of Washington.

    • 36

      Dobzhansky T, Spassky B. 1944 Genetics of natural populations. XI. Manifestation of genetic variants in Drosophila pseudoobscura in different environments. Genetics 29, 270–290.

    • 37

      Kouchi M. 1996 Secular change and socioeconomic differences in height in Japan. Anthropol. Sci. 104, 325–340. (doi:10.1537/ase.104.325)

    • 38

      Nisbett RE, Aronson J, Blair C, Dickens W, Flynn J, Halpern DF, Turkheimer E. 2012 Intelligence: new findings and theoretical developments. Am. Psychol. 67, 130–159. (doi:10.1037/a0026699)

    • 39

      Rowe DC, Jacobson KC, Van den Oord EJCG. 1999 Genetic and environmental influences on vocabulary IQ. Child Dev. 70, 1151–1162. (doi:10.1111/1467-8624.00084)

    • 40

      Turkheimer E, Haley A, Waldron M, D'Onofrio B, Gottesman II. 2003 Socioeconomic status modifies heritability of IQ in young children. Psychol. Sci. 14, 623–628. (doi:10.1046/j.0956-7976.2003.psci_1475.x)

    • 41

      Harden KP, Turkheimer E, Loehlin JC. 2007 Genotype by environment interaction in adolescents' cognitive aptitude. Behav. Genet. 37, 273–283. (doi:10.1007/s10519-006-9113-4)

    • 42

      Tucker-Drob EM, Rhemtulla M, Harden KP, Turkheimer E, Fask D. 2011 Emergence of a gene × socioeconomic status interaction on infant mental ability between 10 months and 2 years. Psychol. Sci. 22, 125–133. (doi:10.1177/0956797610392926)

    • 43

      Hanscombe KB, Trzaskowski M, Haworth CMA, Davis OSP, Dale PS, Plomin R. 2012 Socioeconomic status (SES) and children's intelligence (IQ): in a UK-representative sample SES moderates the environmental, not genetic, effect on IQ. PLoS ONE 7, e30320. (doi:10.1371/journal.pone.0030320)

    • 44

      Lewontin RC. 1974 The analysis of variance and the analysis of causes. Am. J. Hum. Genet. 26, 400–411.

    • 45

      Dickens WT, Flynn JR. 2001 Great leap forward: a new theory of intelligence. New Sci. 21, 44–47.

    • 46

      Dickens WT, Flynn JR. 2001 Heritability estimates versus large environmental effects: the IQ paradox resolved. Psychol. Rev. 108, 346–369. (doi:10.1037/0033-295X.108.2.346)

    • 47

      Turkheimer E. 2000 Three laws of behavior genetics and what they mean. Curr. Dir. Psychol. Sci. 9, 160–164. (doi:10.1111/1467-8721.00084)

    • 48

      Polderman TJC, Benyamin B, de Leeuw CA, Sullivan PF, van Bochoven A, Visscher PM, Posthuma D. 2015 Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat. Genet. 47, 702–709. (doi:10.1038/ng.3285)

    • 49

      Falconer DS, Mackay TFC. 1996 Introduction to quantitative genetics, 4th edn. Essex, UK: Longman.

    • 50

      Lewontin RC. 1970 Race and intelligence. Bull. At. Sci. 26, 2–8. (doi:10.1080/00963402.1970.11457774)

    • 51

      Feldman MW, Lewontin RC. 1975 The heritability hang-up. Science 190, 1163–1168. (doi:10.1126/science.1198102)

    • 52

      Sniekers S et al. 2017 Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nat. Genet. 49, 1107–1112. (doi:10.1038/ng.3869)

    • 53

      Vilhjálmsson BJ et al. 2015 Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 97, 576–592. (doi:10.1016/j.ajhg.2015.09.001)

    • 54

      Nature editorial. 2017 Intelligence test: modern genetics can rescue the study of intelligence from a history marred by racism. Nature 545, 385–386.

    • 55

      Chabris CF, Lee JJ, Cesarini D, Benjamin DJ, Laibson DI. 2015 The fourth law of behavior genetics. Curr. Dir. Psychol. Sci. 24, 304–312. (doi:10.1177/0963721415580430)

    • 56

      Eaves LJ, Eysenck HJ, Martin NG. 1989 Genes, culture and personality: an empirical approach. San Diego, CA: Academic Press.

    • 57

      Loehlin JC, Martin NG. 2013 General and supplementary factors of personality in genetic and environmental correlation matrices. Pers. Individ. Differ. 54, 761–766. (doi:10.1016/j.paid.2012.12.014)

    • 58

      Martin NG, Eaves LJ, Heath AC, Jardine R, Feingold LM, Eysenck HJ. 1986 Transmission of social attitudes. Proc. Natl Acad. Sci. USA 83, 4364–4368. (doi:10.1073/pnas.83.12.4364)

    • 59

      Turkheimer E, Pettersson E, Horn EE. 2014 A phenotypic null hypothesis for the genetics of personality. Annu. Rev. Psychol. 65, 515–540. (doi:10.1146/annurev-psych-113011-143752)

    • 61
    • 62

      Johnson W, McGue M, Krueger RF, Bouchard TJ Jr. 2004 Marriage and personality: a genetic analysis. J. Pers. Soc. Psychol. 86, 285–294.

    • 63

      Feldman MW, Christiansen FB, Otto SP. 2013 Gene-culture co-evolution: teaching, learning, and correlations between relatives. Israel J. Ecol. Evol. 59, 72–91. (doi:10.1080/15659801.2013.853435)

    • 64

      Gibson G. 2012 Race and common variants: twenty arguments. Nat. Rev. Genet. 13, 135–145. (doi:10.1038/nrg3118)

    • 65

      Boyle EA, Li YI, Pritchard JK. 2017 An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186. (doi:10.1016/j.cell.2017.05.038)

    • 66

      Yang J, Lee SH, Goddard ME, Visscher PM. 2011 GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82. (doi:10.1016/j.ajhg.2010.11.011)

    • 67

      Bares CB, Kendler KS, Maes HHM. 2016 Racial differences in heritability of cigarette smoking in adolescents and young adults. Drug Alcohol Depend. 166, 75–84. (doi:10.1016/j.drugalcdep.2016.06.028)

    • 68

      Martin AR, Gignoux CR, Walters RK, Wojcik GL, Neale BM, Gravel S, Daly MJ, Bustamante CD, Kenny EE. 2017 Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 100, 635–649. (doi:10.1016/j.ajhg.2017.03.004)

    • 69

      McClellan JM, Lehner T, King M-C. 2017 Gene discovery for complex traits: lessons from Africa. Cell 171, 261–264. (doi:10.1016/j.cell.2017.09.037)

