Ambivalence, Avoidance, and Appeal: Alliterative Aspects of Anglo Anthroponyms

In several countries, one of the most pronounced trends in contemporary baby naming is the selection of a comparatively uncommon name. Although this phenomenon is well documented, studies of uncommon name use are often limited to forenames. This study analyses approximately 22 million full names from England and 1 million from Wales, given between 1838 and 2014. It addresses the hypothesis that, consistent with the contemporary desire to choose an uncommon name, alliterative names – uncommon by definition – would become increasingly popular. More broadly, this study charts the long-term trends in alliterative naming. In both England and Wales, the use of alliterative names is consistent with a random expectation for much of the 19th century but declines significantly throughout the 20th century to its lowest use in the 1970s. This trend reverses towards the end of the 20th century, with alliterative naming becoming more common in contemporary records. These three aspects of alliterative name use are thematically referred to as ‘ambivalence’, ‘avoidance’ and ‘appeal’, and may reflect changing attitudes towards alliterative naming. The renewed relative appeal of alliterative names towards the end of the 20th century complements previous research on the preponderance of uncommon names and the contemporary ‘need for uniqueness’ in naming.


Introduction
In several countries, one of the most pronounced trends in contemporary baby naming is to choose a comparatively uncommon name. This has previously been [ … ] The dataset was compiled primarily from the 'local BMD' (births, marriages and deaths) registers of England and Wales, and is detailed in previous publications (Bush 2019; Bush, Powell-Smith, and Freeman 2018). Three patterns of alliterative name use are identified. Thematically, they are referred to as 'ambivalence', 'avoidance' and 'appeal'. These uses reflect possible attitudes towards alliteration characteristic of particular time periods. From 1838 until the mid-20th century, the proportion of alliterative birth records is largely consistent with a random expectation. At its simplest, for any given English-language surname, there is a 1 in 26 chance that the first name will begin with the same letter, and so a random (but naïve) expectation for the proportion of alliterative birth records in a given year is 3.8%. This baseline percentage could be interpreted as suggesting a degree of ambivalence towards alliteration: that there is neither a particular preference for, nor aversion to, alliterative naming in general. Deviations from this baseline may be interpreted as relative aversion or relative appeal, and can reflect changes either in the associations perceived of an alliterative name or (more likely) simply in the pool of names from which it is possible to draw. When adjusting for the latter, the observed proportion of alliterative birth records per year does not significantly differ from the expected proportion until the mid-20th century (suggesting 'ambivalence' towards alliteration), whereupon the observed proportion falls below expectations ('avoidance') but, from the 1980s to the present day, rises again to approach, and occasionally exceed, the expected proportion ('appeal').
The latter observation complements previous research on the preponderance of uncommon names and is discussed in the context of a contemporary 'need for uniqueness' in personal naming.

Source of Name Data
In England and Wales, birth, marriage, and death (BMD) registration began in July 1837. BMD records were obtained from the 'UK local BMD' project (http://www.ukbmd.org.uk/local), a volunteer-led effort to transcribe the local indices of the UK BMD registers for digital preservation. These local indices were originally sent to a central body, the General Register Office in London, for compilation into a national catalogue; this catalogue is not publicly available. Birth records spanning the complete years 1838-2014 were downloaded in September 2016 as part of a previous study describing the application of network methods to onomastic data (Bush, Powell-Smith, and Freeman 2018). These records were then updated in January 2018 for a study describing the re-use of birth records in response to child bereavement (Bush 2019). Employing the data used for the latter, 23,468,892 birth records were parsed for the present study. These birth records were collated from ten different locales, nine from England and one from Wales, each transcribed by different groups of volunteers, ostensibly from the same area (the city of Bath; the counties of Berkshire, Cheshire, Cumbria, Lancashire, Staffordshire, Wiltshire and Yorkshire and the West Midlands; and the more broadly defined region of North Wales). Accordingly, the data is non-uniform both in terms of the number of records per region and the depth of coverage over time. The records are assumed to be unbiased, being transcribed from birth registers that were filled on an ad hoc basis, and of sufficient breadth as to represent English and Welsh naming trends over time. The number of records per region and the years and regions covered are detailed in Supplementary material Table 1, with the methods of data cleaning and curation described in Bush, Powell-Smith, and Freeman (2018).
Data cleaning comprised the correction of typographical errors, removal of uninformative entries, and expansion of abbreviations, such as Wm for William. The available fields for each birth record were the first name, middle name(s) and surname, mother's pre-marriage surname (if applicable), year of birth, sub-district of the region in which the birth was registered, and identification number. The dataset contains approximately 130,000 to 230,000 records per year from 1838 to 1950; 25,000 to 100,000 records per year from 1951 to 2000; and 5000 to 15,000 records per year from 2001 to 2014.
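As an illustration of the kind of cleaning described above, the following Python sketch normalises a single forename field. It is not the pipeline used for this dataset (that is described in the cited publication, and the analyses were run in R). The function name clean_forename, the list of uninformative tokens, and every abbreviation other than Wm are assumptions made for the example.

```python
import re

# Hypothetical abbreviation table: only "Wm" -> "William" comes from the text;
# the other entries are illustrative assumptions.
ABBREVIATIONS = {"wm": "William", "thos": "Thomas", "jas": "James"}

# Hypothetical placeholders treated as uninformative entries.
UNINFORMATIVE = {"unknown", "unnamed", "male", "female"}


def clean_forename(raw):
    """Return a normalised forename, or None if the entry is uninformative."""
    # Keep letters, apostrophes and hyphens; drop stray punctuation and digits.
    token = re.sub(r"[^A-Za-z'-]", "", raw.strip())
    if not token or token.lower() in UNINFORMATIVE:
        return None
    expanded = ABBREVIATIONS.get(token.lower())
    # title() capitalises each hyphenated part, e.g. "mary-ann" -> "Mary-Ann".
    return expanded if expanded else token.title()
```

A real transcription project would need a far larger abbreviation table and manual review of ambiguous entries; the sketch only shows the shape of the three steps named in the text.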

Observed and Expected Alliterative Naming
For each alliterative two-letter combination (the first name and surname both beginning with A, i.e., AA, then BB, CC, and so on), observed/expected ratios were calculated to account for the fact that the pool of names from which it is possible to draw in a given year is non-uniform. For each year, the expected number of alliterative records in, for example, the letter A equals the percentage of records with a first name beginning with A × the total number of records with a surname beginning with A. The null hypothesis that the observed and expected numbers of alliterative birth records are drawn from the same continuous distribution was assessed using a two-sample Kolmogorov-Smirnov test. The significance of any difference between groups (observed and expected records within certain time periods) was assessed using a chi-squared test. All statistical analyses were conducted in R v3.6.1 (R Core Team 2019).
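The expected-count calculation can be sketched as follows, in Python rather than the R used for the study. The function name observed_expected, the (forename, surname) input format, and the toy records are hypothetical and serve only to make the arithmetic concrete: for each letter, the expected count is the share of forenames beginning with that letter multiplied by the number of surnames beginning with it.

```python
from collections import Counter


def observed_expected(records):
    """Compute per-letter observed and expected alliterative counts for one year.

    records: iterable of (forename, surname) pairs.
    Returns {letter: (observed, expected, observed / expected)}.
    """
    records = [(f, s) for f, s in records if f and s]
    total = len(records)
    first = Counter(f[0].upper() for f, _ in records)
    last = Counter(s[0].upper() for _, s in records)
    observed = Counter(
        f[0].upper() for f, s in records if f[0].upper() == s[0].upper()
    )
    result = {}
    for letter, n_surnames in last.items():
        # Share of forenames with this initial x number of surnames with it.
        expected = (first[letter] / total) * n_surnames
        if expected > 0:
            result[letter] = (observed[letter], expected, observed[letter] / expected)
    return result


# Invented toy year: half the forenames and half the surnames begin with J.
toy_year = [("John", "Jones"), ("Jane", "Smith"), ("Sarah", "Jones"), ("Mary", "Smith")]
ratios = observed_expected(toy_year)
# ratios["J"] is (1, 1.0, 1.0): one John Jones observed, one expected.
# ratios["S"] is (0, 0.5, 0.0): no alliterative S record, half a record expected.
```

Unlike the naïve 1-in-26 baseline, this expectation moves with the name pool: a year dominated by forenames beginning with J raises the expected number of J-J records accordingly.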

Results
Trends in Alliterative Naming: Ambivalence, Avoidance, and Appeal
Trends in alliterative naming in England and Wales were examined using a dataset of 23,468,892 birth records spanning 177 years, from 1838 to 2014, of which 1,170,606 (5%) were alliterative, having the first and family names starting with the same character (irrespective of any middle name). Only 10,798 records, 0.05% of the total, were 'triply alliterative', with the first, middle, and family names starting with the same character. The proportion of alliterative birth records per year is shown in Figure 1 and summarised in Supplementary material Table 2. The corpus of alliterative names is given in Supplementary material Table 3. Broadly speaking, alliterative naming declines steadily throughout the 19th and 20th centuries, reaching its lowest point, approximately 3.8%, in the mid-1970s. From the 1980s onwards, however, this trend reverses with an increase in alliterative naming apparent, rising to approximately 5% of records per year from the year 2000.
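The two categories used above can be expressed as a short classification rule. The Python sketch below is illustrative only: the function names and return labels are hypothetical, and the handling of records with several middle names (here, all of them must share the initial) is an assumption, since the text does not specify it.

```python
def initial(name):
    """Upper-cased first character of a name, or '' if the name is empty."""
    return name.strip()[:1].upper()


def classify(first, middle, surname):
    """Label a record as 'none', 'alliterative' or 'triple'.

    middle: list of middle names, possibly empty.
    """
    f, s = initial(first), initial(surname)
    if not f or f != s:
        return "none"
    # Assumption: every middle name must share the initial to count as triple.
    if middle and all(initial(m) == f for m in middle):
        return "triple"
    # Middle names are otherwise ignored, as in the definition above.
    return "alliterative"
```

Under this rule John James Jones is 'triple', Sarah Smith is 'alliterative', and a record such as John Henry Jones remains 'alliterative' because the middle name is disregarded.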
While the initial decline in alliterative name use (see Figure 1) is pronounced, it is likely coincidental, reflecting the expansion of the name pool rather than (for instance) a negative perception of alliterative naming. This might be because, as the 20th century progressed, a far greater number of names would have entered the cultural milieu, diversifying the pool of forenames from which to draw. With an expanding number of forenames to choose from (or create), and assuming free choice, the proportion of alliterative names in a random sample of English-language names is expected to be 3.8% (i.e., 1 in 26). A greater percentage of alliterative names is observed throughout much of the 19th and 20th centuries (see Figure 1) likely because, historically, there was both a smaller number of forenames to choose from and relatively reduced social freedom to follow fashion instead of tradition. As such, a minority of names accounted for the majority of birth records. This concentration has declined over time and no longer exists in the present day. The proportion of births registered with one of the top 10 most common female or male names has decreased from > 70% in 1850 to < 20% by 2010 (Figure 2).
What this means in practice is that common surnames that begin with the same initial as a particularly popular forename would be disproportionately represented in the historical records of this dataset. Towards the present day, this would become less pronounced as the popularity of any given forename declines. In this respect, the initial downward trend in Figure 1 can be interpreted as a consequence of the trend shown in Figure 2, an argument developed in more detail below. The top 20 surnames in the dataset collectively represent 2,860,435 birth records, 12% of the total. These surnames are, in order, Jones, Smith, Williams, Taylor, Roberts, Davies, Hughes, Evans, Brown, Johnson, Jackson, Robinson, Wilson, Wood, Walker, Harrison, Edwards, Thompson, and Wright. If we assume an approximately equal number of births per year with each surname, then popular first names with the same letter as a common surname would account for a disproportionate number of alliterative birth records. Consistent with this, the most frequently occurring alliterative names in the dataset were John Jones (27,404 total records) and William Williams (11,631 records). John and William are the two most common male names overall, and were historically especially prevalent (Bush, Powell-Smith, and Freeman 2018). The most common alliterative female names were Jane Jones (8738 records) and Sarah Smith (6276 records), in both cases supported by relatively fewer records. This is likely because the pool of female names is generally larger than that of male names. [4] For the same reason, the most common 'triply alliterative' female names were Alice Ann Ashworth and Elizabeth Ellen Evans, albeit with only 91 and 54 records, respectively. The most common triply alliterative male name was John James Jones, at 130 records, followed by John James Jackson, John James Johnson, and John Joseph Jones at 87, 60, and 57 records, respectively. To account for the distorting effect popular first names would have upon the total number of alliterative records, we calculated observed/expected ratios for each alliterative two-letter combination (see Materials and Methods).
By plotting the distribution of these ratios over time, a clearer overview was obtained of trends in alliterative naming. These trends are illustrated in Figure 3 and summarised in Supplementary material Table 4. In Figure 3, the boxes represent the interquartile range of the set of ratios, with midlines representing the median. Upper and lower whiskers extend, respectively, to the largest and smallest values no further than 1.5 × the interquartile range. Data beyond the ends of each whisker are outliers and plotted individually. The lower black line denotes y = 1, where the observed number of alliterative names equals the expected. Note that only letter combinations observed in > 100 birth records are included in this figure. As fewer records are available for the years 2001-2014 (see Materials and Methods), this threshold excludes much of the data within this period. Of the 364 two-letter combinations within this period (26 combinations × 14 years), observed/expected ratios were only available for eight, i.e., 2% of the total. Raw data for this figure is available as Supplementary material Table 4.

The steady decline in the percentage of alliterative birth records from the 1840s to the 1920s, as illustrated in Figure 1, can now be seen as consistent with the changing composition of the name pool. Figure 3 shows that across this time period, the observed/expected ratio approximates 1. The null hypothesis that the observed and expected numbers of alliterative birth records were drawn from the same continuous distribution was assessed using a two-sample Kolmogorov-Smirnov test, and was not rejected (p = 0.504). The Kolmogorov-Smirnov test compared the distributions of observed and expected counts from the years 1838 to 1920, each distribution containing 1234 data points.
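The box-and-whisker convention described for Figure 3 can be made concrete with a small Python sketch. The function name box_stats and the example ratios are hypothetical, and the quartile method is an assumption: plotting libraries differ in how they interpolate quartiles, so the values below need not match those underlying the published figure.

```python
import statistics


def box_stats(ratios):
    """Summarise observed/expected ratios using the 1.5 x IQR whisker rule."""
    # Assumption: linear interpolation between data points ("inclusive" method).
    q1, median, q3 = statistics.quantiles(ratios, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [r for r in ratios if low_fence <= r <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the most extreme values still inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Anything beyond the fences is plotted individually as an outlier.
        "outliers": sorted(r for r in ratios if r < low_fence or r > high_fence),
    }


# Invented ratios for one year; 2.0 lies beyond the upper fence.
stats = box_stats([0.8, 0.9, 1.0, 1.1, 1.2, 2.0])
```

With these invented values the upper whisker stops at 1.2 and the ratio of 2.0 is flagged as an outlier, which is the sense in which individual letters stand out in particular years.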
There were six prominent 19th century outliers in Figure 3 (the years 1843, 1851, 1854, 1856, 1857, and 1867, where observed/expected ratios > 1.4), each of which could be associated with the letter D (see Supplementary material Table 4). We can only speculate as to why this is the case. Forenames that are especially common would increase the number of expected alliterative records for that letter, so increasing the number of observed alliterative records required to constitute an outlying observed/expected ratio. As Figure 2 illustrates, the ten most common male and female names between 1850 and 1870 account for a per-year average of 65% of the registered births, although none begins with the letter D (the most popular names within this period are, in order, John, William, Thomas, James, George, Joseph, Henry, Charles, Robert, and Samuel for males, and Mary, Elizabeth, Sarah, Ann, Margaret, Jane, Alice, Ellen, Hannah, and Emma for females). Names not recorded in the top 10 are not necessarily uncommon. It is plausible that names in the top 50, for example, are being paired with a sufficient number of surnames so as to produce an outlier. For instance, in 1851, the fourteenth most common male name was David. An alternative explanation is that names beginning with D are disproportionately found in particular ethnic or regional subgroups within the wider dataset. In this case, the most plausible explanation is the inclusion of data from Wales, in which Welsh names such as Dafydd are, unsurprisingly, disproportionately represented. If the analysis is repeated with records from Wales omitted, these 19th century outliers disappear (see supplementary material). The potential confounding effects of ethnic and regional grouping are discussed in more detail below.
The observed/expected ratio steadily declines throughout the 20th century to the extent that by the 1970s, alliterative names appear actively avoided, with ratios falling below 0.8. When comparing the distributions of observed and expected counts (n = 1029) for records made between 1900 and 1970, the difference is statistically significant (Kolmogorov-Smirnov p = 2.4 × 10⁻⁵). There was a significantly lower number of records observed than expected (within the time period of 1900-1970, the median number of observed and expected records per year was 877 and 1066, respectively), with a corresponding chi-squared test p < 2.2 × 10⁻¹⁶ (see Supplementary material Table 5).
From the 1970s to the last year represented in the figure (2006), the observed/expected ratios rise again to approach 1. This suggests that towards the end of the 20th century the trend to avoid alliterative naming was reversed and that, relative to previous years, alliterative names had greater appeal (for data recorded from 1970 onwards, n = 152, Kolmogorov-Smirnov p = 3.6 × 10⁻³). It is important to note that compared to historical records there are relatively few contemporary records in this dataset (see Supplementary material Table 4). As such, it is unclear whether the observed number of alliterative records will consistently exceed the expected number (which Figure 3 suggests but does not show) or whether early 21st century alliterative naming patterns will resemble those observed in the 19th century. Finally, the division of this dataset into three broad time periods is essentially arbitrary. It is therefore important to note that the difference in observed and expected distributions, when considering the entire dataset, remains statistically significant (n = 2087, Kolmogorov-Smirnov p = 7.8 × 10⁻⁴). In absolute terms, and consistent with the mid-20th century aversion to alliterative naming, there was also a significantly lower number of observed than expected alliterative records per year across the entire dataset (median records 6804 and 6933, respectively; chi-squared test p < 2.2 × 10⁻¹⁶).
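For readers unfamiliar with the two-sample Kolmogorov-Smirnov test reported above, the Python sketch below computes its test statistic D, the largest vertical gap between the two empirical cumulative distribution functions. It is a minimal illustration under stated assumptions: the function name ks_statistic is hypothetical, the p-value calculation is omitted, and the study itself ran its tests in R.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return D, the maximum gap between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

Applied to the observed and expected counts, a D close to 0 corresponds to the 'ambivalence' period, in which the two distributions are indistinguishable, while a larger D accompanies the small p-values reported for the later periods.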

Discussion
This study provides an overview of three patterns of alliterative name use observed in England from the 19th century to the present day: 'ambivalence' (from the 19th to early 20th centuries, where the observed numbers of alliterative names do not significantly differ from the expected), 'avoidance' (the early to mid-20th century, where the observed number is lower than expected) and 'appeal' (the mid-20th century to the present day, where the 'avoidance' trend is reversed [5]). A more nuanced exploration of these patterns would necessitate addressing several limitations of this dataset and the analysis.
Firstly, in order to parse the large volume of data comprising this study, the simplifying assumption was made that the majority of records use only one language, English. Alliteration was also defined pragmatically, as the repetition of initial characters in both the forename and surname. This definition was limited to the 26 characters of the English alphabet and so the analysis neither accommodates diacritics or other characters of non-English origin nor, given the focus on initial characters, digraphs such as the Welsh ff or ll. A clear limitation of this study is that in reality the dataset comprises names from an unknown number of languages. As such, an unknown number of records can be expected to use other alphabets found throughout the British Isles and beyond, including but not limited to Welsh (with its 29-letter alphabet), Manx (24 letters), and Scots Gaelic (18 letters). However, as languages are abundant within certain geographical regions, but not necessarily exclusive to them, the data cannot easily be partitioned into language-specific subsets. The Anglocentric definition of alliteration also assumes that identical characters correspond to identical sounds, which simultaneously discards names that are alliterative only when spoken, such as George Jones, and includes names that are only alliterative in print, such as George Grey. An assumption of this study is that this does not meaningfully alter the observed trends.
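The consequences of a character-based definition can be illustrated with a short Python sketch. It is not part of the analysis: the function names are hypothetical, and the optional digraph list shows only one way a Welsh-aware variant might be written, assuming the standard Welsh digraphs.

```python
# Digraphs of the Welsh alphabet, each counted as a single letter.
WELSH_DIGRAPHS = ("ch", "dd", "ff", "ng", "ll", "ph", "rh", "th")


def initial_letter(name, digraphs=()):
    """Return the initial letter, treating any listed digraph as one letter."""
    lowered = name.strip().lower()
    for digraph in digraphs:
        if lowered.startswith(digraph):
            return digraph
    return lowered[:1]


def is_alliterative(first, surname, digraphs=()):
    """True if forename and surname share the same (orthographic) initial."""
    a = initial_letter(first, digraphs)
    return bool(a) and a == initial_letter(surname, digraphs)
```

With the default character-based rule, George Grey counts as alliterative and George Jones does not, exactly as noted above. Passing WELSH_DIGRAPHS changes the verdict for a pair such as Llewelyn Lewis, but would also misread English names beginning with Th or Ch, which is why the rule cannot simply be switched on for a dataset that cannot be partitioned by language.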
Secondly, when calculating the expected numbers of alliterative records per year, the assumption was made that any given surname could, in principle, be paired with any given forename, although in a diverse population this would overestimate the expected numbers. In reality, forenames and surnames co-occur on the basis of ethnocultural origin. Predicting ethnicity from name data is a non-trivial problem (Kandt and Longley 2018; Fiscella and Fremont 2006; Sood and Laohaprapanon 2018) that was not attempted here. A consequence for this study is that the observed/expected ratios will be underestimated. To control for this, we could repeat the analysis after restricting data to names most commonly associated with a self-reported 'white British' ethnicity, considered to be the ethnic majority within the dataset. However, this is complicated by the fact that forenames and surnames with a 'white British' association are not easy to ascertain. We could assume instead that the top 20 surnames, which are registered in every year of the dataset, are reasonably strong correlates of ethnic majority individuals. This is the same approach taken by Jones et al. (2004) who, in a study of US American names, controlled for ethnic confounds by restricting analysis to the surnames Smith, Johnson, Williams, Jones, and Brown, considering these to be "common European American," and ethnically white, names. This approach is pragmatic but flawed: Simonsohn (2011) notes that in the US census of 2000, only 62% of individuals with these surnames reported their ethnicity as white. We can assume the UK data will be similarly confounded and that attempts to control for this using surnames will, at best, be crude.
Other approaches to reducing heterogeneity in the data are to repeat the analysis having restricted data only to England, to Wales, or to one English county, assuming the regional distributions of surnames may also exert an influence. Of the total number of records, 45% are sourced from the county of Lancashire, in north-west England, with the most spatially restricted subset of records (2% of the total) limited to one south-western city, Bath (Supplementary material Table 1). In all, 6% of the records are sourced from Wales. Repeating the analysis using all five subsets (data only from a subset of surnames, only from England, from Wales, from Lancashire, or from Bath) results in no observable difference to the trends illustrated in Figure 3 (see Supplementary material Tables 6 through 11). Note that with the Wales, surname-subset, and Lancashire data, the 'ambivalence' and 'avoidance' patterns could be observed, but not the 'appeal'. This was because data was only available up to the years 1950, 1971, and 1982, respectively. The Bath subset, being the smallest, could only be visualised when requiring no minimum number of records for each letter combination. While the data is noisier as a consequence, the ambivalence/avoidance/appeal pattern can still be seen (Supplementary material Table 9).
Thirdly, there is a relative paucity of contemporary records in this dataset. The avoidance of alliterative naming throughout the 20th century is supported by > 100,000 records per year until the 1950s (see Supplementary material Table 2). However, as there are fewer records in subsequent years, it is less clear to what extent alliterative naming has become more prevalent since. In absolute terms, a small number of records exacerbates the difficulty of studying alliterative names as, by their nature, only a small proportion of records (approximately 4-5%) contain them. We are further limited by the need to have both forename and surname data for the same individual; for privacy reasons, this information may not be easily accessible.
Two related questions are suggested by the results of this study: what underlies the decline in popularity of alliterative names throughout at least the first half of the 20th century, and what underlies their relative contemporary appeal? With regard to the former, an explanation may be suggested on the basis of a study of Swedish personal names in which alliteration was found to be actively avoided. Hagåsen (2011) collated the first and last elements of the 3000 commonest Swedish dithematic surnames: those comprising two differentially stressed lexemes, such as Haglund. The permillages were then calculated: the number of occurrences, per thousand, of the last element in relation to the first. It was found that alliteration between the two elements was substantially underrepresented in the set of extant names, with the tendency to avoid it strongest when assonance was also present. These findings were related to an "aspiration to dignity in name formation" which alliteration (and rhyme) could undermine:

"If the purpose of alliteration and rhyme is to provide a feast for the ear in poetry recitation and give the impression of playfulness and verve in speech, such effects have apparently not been very desirable in the Swedish name categories discussed here [ … ] the rejection of rhyming elements should certainly be ascribed to people's anxiety about forming names that might make a conspicuous and even ridiculous impression" (Hagåsen 2011, 106).
We can speculate that alliterative naming makes a similar impression in English, albeit only on the basis that Swedish and English share commonalities, both being Germanic languages. This conspicuous impression would not necessarily apply to other languages represented in the dataset. The Swedish study focussed on surnames, however, which have greater generational stability than forenames. The long-term avoidance of alliterative surnames suggests that negative perceptions of alliteration are consistent. This was not the case with our data, as the avoidance of alliterative naming appeared to be a uniquely 20th century phenomenon. The reason for this is unclear. We can speculate that a change in the prevalence of alliterative names in the cultural milieu, and the context in which they were used, shaped perceptions at that time. Given the period in which alliterative names declined in popularity, we can speculate that this was concomitant with the spread of new forms of media, and their dissemination of new types of name. A fuller exploration of this hypothesis is beyond the scope of this study. However, it is worth noting that early to mid-20th century Anglophone sources have documented an abundance of alliterative names in a variety of contemporary contexts, including comic strip characters (Tysell 1934, 1935), advertising copy (Coard 1959), and the ring names of boxers (McCartney 1938).
Might the conspicuity of alliteration, and the impression this makes, also relate to the contemporary appeal of alliterative naming? This would be consistent with the contemporary preference for 'standing out' through the choice of an uncommon name. This point has been discussed by numerous authors (e.g., Twenge, Abebe, and Campbell 2010; Twenge, Dawson, and Campbell 2016). Alternatively, any negative associations with alliterative names may simply have been normalised towards the end of the 20th century, and no longer hold by the present day. For much of the time period recorded in this study, the dominant attitude towards alliterative naming appears to be 'ambivalence'. Future research could focus on parental attitudes towards alliteration and whether any distinctiveness perceived of an alliterative name influences the choice. The effect an alliterative name has on the individual is similarly unknown, although it would presumably be limited to those circumstances where the full name is used and the alliteration is actually apparent.
There is a wide body of anthroponomastic attitudinal research documenting the perceptions associated with names, particularly those considered (un)common, (un)familiar, or 'other' (e.g., Carpusor and Loges 2006; Hilliar and Kemp 2008), although comparatively little research on the impression made by an alliterative name. Within an Anglocentric context, relatively uncommon forenames have been positively associated with perceptions of academic ability (Erwin 1999), professional competence (Sadowski, Wheeler, and Cash 1983) and artistic creativity (Lebuda and Karwowski 2013), and so, in some contexts, may have a positive psychological effect (Zweigenhaft 1983), a possible explanation for their contemporary popularity. Other studies, however, have reported more negative connotations, across a range of attributes, for uncommon (or perceptibly less familiar) forenames (Levine and Willis 1994; Karlin and Bell 1995; Joubert 1993) and surnames (Colman, Sluckin, and Hargreaves 1981). There are fewer associations with alliterative names, although alliteration has previously been identified as a factor in mate selection: individuals whose first names (Kopelman and Lang 1985) or nicknames (Brandwein et al. 2018) began with the same letter were found more likely to marry than random expectation (although these correlations may be spurious; Simonsohn 2011). It would be of particular interest to explore the (possibly attractive) impression made by alliterative names in other circumstances, such as job applications, where the use of both first and family names is hard to avoid.
In summary, this study identifies three distinct patterns of alliterative baby naming from 19th century England to the present day. The data suggests that towards the mid-20th century, there was a change in perceptions of the impression made by an alliterative name. This is an underexplored phenomenon that may offer new insight into a period in which cultural norms around naming were in flux.

4. [ … ] Harper 2000/2005). The preferential maintenance of a smaller number of traditionally male-typed names (for instance, when naming children for ancestors) would also be consistent with patriarchal tradition, and contribute to why the number of male names is smaller than female names.
5. It is worth clarifying that this use of the word 'appeal' refers to a trend, not an observed/expected ratio: the appeal of alliterative names (towards the end of the 20th century) is relative to a period in which alliterative names were preferentially avoided (the mid-20th century). As shown in Figure 3, this does not mean that the observed number of alliterative records is necessarily significantly greater than expected in a given year.