Mapping Digital Discourses of the Capital Region of Finland: Combining Onomastics, CADS, and GIS

This article discusses the three Finnish city names Helsinki, Espoo, and Vantaa, and the urban discourses that surround them. The study reveals patterns of socio-spatial differentiation by examining what meanings people attach to these capital region cities and investigating how these meanings are expressed in online discourses. Using the methodological approach of corpus-assisted onomastics (CAO), this study incorporates onomastics, geographical information systems (GIS), and corpus linguistics. This interdisciplinary research also examines how corpus-assisted discourse studies (CADS) and GIS can be combined to reveal and visualize the contextual information and discursive patterns of toponyms. For this investigation, a data set of 2.7 billion words was collected from the Suomi24 Corpus, which is compiled from one of the biggest discussion fora in Finland. The use of a social media corpus as a data source increases the authenticity of this research, as the data was not collected specifically for this study. The analysis reveals that the most frequent digital discourses about the cities refer to places and directions, housing, and mobility. The occurrences differ quantitatively and qualitatively between the cities. This paper paves the way for future onomastic studies to research actual name usage using this new methodology. Knowledge gained from such research may not only enrich the field of onomastics, but also facilitate more socially sustainable urban planning.


Introduction
This article brings together onomastics, geographical information systems (GIS), and corpus linguistics. In their research on discourses related to the toponyms Hesa and Stadi (slang terms for the city of Helsinki), Jantunen (2019, 2018) introduce the concept and method of corpus-assisted onomastics (CAO). They define CAO as a field in onomastic research, where electronic databases, i.e., corpora, are employed as data, where analysis is based on corpus research methods (such as wordlist analysis, concordancing and keyword analysis) and where the subject of the research is the prevalence of names, their usage in textual contexts (e.g., in collocations and other phraseological relations, and genres), as well as regional and local variation (2019, 58; Ainiala & Jantunen's translation [2022]).
Corpus linguistics and onomastics have also been used together by other researchers, most recently by Motschenbacher (2020a, 2020b). In 2020a, Motschenbacher presents a theoretical discussion of the subject and introduces corpus methods that can be used in onomastic studies, such as the collocation analysis used by Ainiala and Jantunen (2019), and keyword analysis used in the present study. Ainiala and Jantunen (2019) point out that big data opens pathways to research that cannot be as easily explored through qualitative methods. Together with Motschenbacher (2020a), they also claim that corpus linguistic methods have great potential to enrich name studies research. Corpora compiled from authentic discourse can, for example, provide important insights into real language use, while statistical analyses can be effectively applied to large amounts of quantitative data to make the findings more reliable.
The method used in the present study is corpus-assisted discourse studies (CADS). This approach combines qualitative and corpus methodologies to examine language patterns such as collocations and keyword frequencies to reveal discourse types (Partington et al. 2013). Despite its potential utility, CADS has rarely been used in onomastics, with the exception of Ainiala and Jantunen (2019) and Poole (2018). Another relatively recent approach that combines onomastics and electronic data is "Visual GISting" (Gregory & Hardie 2011) or "Geoparsing". This method exploits corpora and spatial data provided by geographical information systems (GIS) to provide information about language use. In this study, we combine CADS and Visual GISting by first identifying discourses in Suomi24, a social media corpus of authentic Finnish discourse. These stretches of discourse are then analyzed and visualized on the basis of the associated spatial information.
Our study focuses on discourses that relate to the "Finnish Capital Region", a functional urban area that consists of four municipalities in Southeastern Finland: Helsinki, Espoo, Vantaa, and Kauniainen. Helsinki is the capital of Finland and is the biggest city in the nation-state with a population of 0.6 million. Espoo and [...]
1. What are the online digital discourses that relate to the names of these three capital region cities?
2. How do the discourses differ from one another?
3. What do these discourses reveal about the patterns of socio-spatial differentiation in Finland's capital region?
4. How does combining CADS and GIS work to reveal and visualize the contextual information and discoursal patterns of toponyms?
As the above research questions suggest, it is hoped that the multidisciplinary approach used in this onomastic study can help shed new light on not only the meanings attached to the toponyms but also the various stances taken towards them (see, e.g., Ainiala & Östman 2017). This methodological approach is also relevant to geography and other social sciences, as it may allow us to identify and analyze the discursive processes of place-framing (Martin 2003). In doing so, it may be possible to gain new knowledge about "urban imaginaries" or "interpretive grids through which we think about, experience, evaluate, and decide to act in the places, spaces, and communities in which we live" (Soja 2000, 324). Knowledge about these processes is important, because it can help us understand how the construct of place forms the basis of both collective identities and political action. At the same time, it may help reveal how the discursive framing of such places connects with the material reality of those localities.

Background: Onomastics, Corpora, GIS, Discourses, and Geography

Corpus Data in Onomastics
Comprehensive text corpora have seldom been used as a data source in onomastics. Instead, the digital data most often used in this field has taken the form of name lists varying in size and quality (see Motschenbacher 2020b). The subjects that have been addressed in these studies include the morphophonological and lexical structure of personal names (e.g., MacAulay et al. 2019; Lappe 2002; Pagan 1998); combinations of descriptive appositives and proper names (Bjorge 2003); the topographical words included in place names (Nurminen 2012); surnaming patterns (Laversuch 2011); and most recently, US toponyms that mark the history of Native American peoples (Nick 2017).
Increasingly, the internet has also been used as a source of digital data. The increasing popularity of this choice for data collection is understandable, since the internet may provide a huge amount of onomastic material. Ohlander and Bergh (2004), for example, examined the use of the name Taliban in US and British newspapers; Tanaka (2016) studied Japanese surnames and their varying written forms in social media; and Hämäläinen (2019) looked at internet usernames. Meanwhile, in literary onomastics, researchers have increasingly compiled their material from digitalized versions of literary works. These studies have considered, for instance, the uses and functions of proper names in translated novels (see, e.g., Dalen-Oskam 2013; Tuñón 2013).
There are also examples of text corpora being used for onomastic research. Sinclair (2004), for example, used the Bank of English to analyze the choice of preposition in the titles of organizations (e.g., the European Society of Phraseology). Likewise, Pierini (2008) used the British National Corpus (BNC) to examine the occurrence of personal and place names in various phraseological units (e.g., idioms, binomials, formulae) across different linguistic registers, while Tse (2004) relied on the BNC to explore the use of definite articles in personal name classes. Most recently, the Corpus of Contemporary American English was used by Motschenbacher (2020b) to analyze the use of definite articles in the names of countries.
Text corpora featuring languages other than English have also been analyzed for onomastic research. A large Norwegian newspaper corpus of 150 million words, for instance, was used by Halverson and Engene (2010) to study the metonymic use of Maastricht and Schengen, while collocations featuring America in a Lithuanian newspaper corpus were examined by Vaičenonienė (2001); and in a similar study, Arab and Saudi Arabia were investigated by Kamandulytė (2006) with regard to semantic preferences and collocations. Finally, a large Finnish corpus of online texts, Suomi24, was used by Ainiala and Jantunen (2019) to explore the discourse prosodies of the slang toponyms Hesa and Stadi for the Finnish capital of Helsinki.

Corpus Data and Geographic Information Systems (GIS)
Paterson and Gregory (2018, 37) state that "traditionally, corpus linguists have not explored the geographies within texts that they have analysed". However, visual GISting, which involves identifying place names within textual data and linking them to spatial coordinates, has been used in studies dating back to as early as 2010. 1 In the spatial humanities, Murrieta-Flores et al. (2014), for example, automatically extracted place names in texts, assigned them geographic coordinates, and produced visualizations of the toponymic data. Although this approach can help to determine which places are mentioned in the data, it disregards what is said about them (Gregory & Hardie 2011, 305). In other words, this strategy fails to take into account the discourses linked to the places identified.
Discourse-oriented studies seeking to answer the "what" have frequently taken advantage of CADS research methods, though this practice is not always explicitly highlighted. For example, Gregory and Hardie (2011) relied on the automatic extraction of semantic tags to identify the discursive association between the construct of MONEY and the place name Dunkirk, while Donaldson et al. (2017) used a similar method for their geographical collocation analysis of the association between Lake District names and language users' associations with the constructs of BEAUTY and PICTURESQUE. Combining corpus linguistics and GIS techniques in the spatial humanities is also addressed by Poole (2018) in the investigation of environmental discourses related to the "Rosemont Copper Mine Debate" in Arizona. In this work, Poole concluded that "social media provides a vast geo-referenced data source and could be used to further display the importance of place and its discursive production in environmental communication" (538). Although Poole is specifically referring to environmental communication, the assertion is generalizable to other discussions as well.

Discourses and Geography
Human geographers have theorized about the connection between discourse and space since the "textual" or "discursive" turn of the 1980s, when their attention turned towards the importance of language beyond everyday meaning-making. At the time, scientific attention was also directed toward the ways in which power manifests itself in texts, images, and other cultural products (e.g., Barnes & Duncan 1992).
It is now widely acknowledged that people's ideas about spaces and places are both discursively produced and connected to material reality. Shared ideas "become the frame of mind for social agents as well as being the outcome of the historical and contextual conditions under which they are articulated" (Richardson & Jensen 2003, 15). One reason why discourses matter is that they are inextricably bound to the social processes through which spatial entities such as cities and neighborhoods are (re)produced. Discourses also play an important role in the everyday (re)production of urban space, so that some conceptions of the city become hegemonic by virtue of their already being mainstream, while others are marginalized. As several studies have shown, the naming of places, together with different place-making perspectives and representations, can create sociospatial distinctions, borders, and place identities based on a collective understanding of the urban space (e.g., Scott & Sohn 2019; de Koning 2015; Martin 2003). Scott and Sohn (2019), for instance, demonstrated the ways in which the sharing and naming of geographical representations (through, for example, maps, blogs, visitor information) have contributed to the emergence of new neighborhoods such as Kreuzkölln in Berlin and Nyócker in Budapest.
The use of big data from digital sources has facilitated the convergence of geography and humanities (Gregory & Hardie 2011). This convergence has manifested itself as the "spatial turn" in digital humanities (Presner & Shepard 2015, 247) or the emergence of the "digital geohumanities" (Crang 2015, 351). We follow the tradition of these studies, but also acknowledge the criticism that empiricist approaches may mistakenly presume that data can speak for itself, and is somehow "free of human bias or framing" (Kitchin 2013, 265). Therefore, to help to draw more reliable conclusions from the data, we utilize our "situated knowledge" (Haraway 1996), as Finns who are familiar with the sociocultural context of place-based meaning-making related to Finland's metropolitan region.

Data
Our data was extracted from messages posted on the Suomi24 discussion forum. One of the most popular Finnish websites, this platform had approximately 2.2 million visits per month as of 2019 (FIAM). Suomi24 has 24 sub-fora and covers thematic areas such as traveling, health, and family. The Suomi24 Corpus was made available through the Language Bank of Finland (META-SHARE). The material collated for this investigation represents postings made between 2001 and 2016, and consists of 2.7 billion words.
For our study, we extracted data for the Vantaa and Espoo sub-corpora that included the names (or lemmas) Vantaa and Espoo. For the Helsinki sub-corpus, we compiled data from postings with the lemmas Helsinki, Hesa, and Stadi. The sizes of the three sub-corpora were 1.9 million tokens for Vantaa, 2.8 million for Espoo, and 5.1 million for Helsinki. The reference corpus for the keyword analysis was compiled using systematic sampling in which messages posted in the discussion forum were extracted at one minute past each hour around the clock. In this way, temporal thematic bias in the reference data was largely avoided.
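The sampling rule for the reference corpus can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used to build the corpus: the `Message` record, its field names, and the toy timestamps are hypothetical, and the rule is read here as keeping every message whose timestamp falls in the minute that begins at one minute past an hour.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Message:
    # Hypothetical record; the real Suomi24 Corpus has its own schema.
    posted: datetime
    text: str


def sample_reference(messages):
    """Keep messages posted at one minute past any hour (hh:01)."""
    return [m for m in messages if m.posted.minute == 1]


# Toy data for illustration only.
messages = [
    Message(datetime(2010, 5, 1, 9, 1, 12), "a"),
    Message(datetime(2010, 5, 1, 9, 30, 0), "b"),
    Message(datetime(2010, 5, 1, 23, 1, 59), "c"),
]
reference = sample_reference(messages)
```

Because the rule depends only on the clock, it draws messages evenly across all hours and dates, which is what limits the temporal thematic bias mentioned above.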

Methodology: from Keywords to Discourses
For this methodology, keywords first needed to be determined. According to Scott and Tribble (2006), "keywords" are those words that occur in the data significantly more often than one would expect them to otherwise occur by chance. In the present study, the keywords were used to determine the semantic context forum users employed to talk about the three capital region cities. The keyword analysis was done using the Keyword List program within the AntConc corpus toolkit (Anthony 2017). The statistical measure chosen for the analysis was the Log-Likelihood test which is suitable and valid for frequency profiling (Rayson & Garside 2000). The first 300 statistically significant keywords for each of the three cities were then analyzed. A frequency of occurrence threshold was set at 50 to eliminate the noise which inevitably results from the repetition of identical postings. This precaution also helped us focus on established usage and meanings.
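The keyness calculation can be illustrated with a short Python sketch of the two-corpus log-likelihood statistic described by Rayson and Garside (2000). The sketch is not AntConc's implementation; the function names, the toy word counts, and the significance cut-off of 3.84 (the chi-square critical value for p < 0.05 with one degree of freedom) are assumptions made for illustration. The frequency threshold of 50 and the limit of 300 keywords follow the text.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-corpus log-likelihood keyness score (Rayson & Garside 2000)."""
    total = freq_study + freq_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


def top_keywords(study_counts, ref_counts, size_study, size_ref,
                 min_freq=50, min_ll=3.84, limit=300):
    """Rank words overused in the study corpus by log-likelihood."""
    scored = []
    for word, freq in study_counts.items():
        if freq < min_freq:
            continue  # frequency threshold, as in the text
        ref_freq = ref_counts.get(word, 0)
        # Keep positive keywords only: relatively more frequent in the study corpus.
        if freq / size_study <= ref_freq / size_ref:
            continue
        score = log_likelihood(freq, size_study, ref_freq, size_ref)
        if score >= min_ll:  # assumed cut-off, not stated in the article
            scored.append((word, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


# Toy counts for illustration only; both corpora are set to 100,000 tokens.
study = {"metro": 120, "ja": 5000, "lento": 40}
reference = {"metro": 10, "ja": 5000}
keywords = top_keywords(study, reference, 100_000, 100_000)
```

In this toy run only "metro" survives: "lento" falls below the frequency threshold, and "ja" is equally frequent in both corpora.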
To uncover the subject of the discussions that made mention of the cities, we analyzed the "discourse prosodies": the repeatedly occurring associations made between the city names and their sets of semantically related items (see, e.g., Jantunen 2018; Baker et al. 2008; Stubbs 2001). This was done by grouping the keywords into data-driven semantic categories which can then be grouped into either patterns of discourse (Baker 2006) or a coherent discourse function (Hunston 2007). The multiple meanings and referents of the keywords were determined by analyzing concordance lines in the data. Two of this article's authors undertook the analysis. In cases where the meaning was not immediately clear, the entire research team discussed the possible options and came to a consensus. On the rare occasion when no consensus could be reached, the keyword was classified as "other". The classification process resulted in 17 discourse prosody groups (see Figure 1).
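A concordance line shows each occurrence of a keyword together with a window of surrounding text. The following Python sketch shows the general idea under simplifying assumptions: it matches surface forms with a regular expression, whereas the study worked with Finnish lemmas in a corpus tool, and the function name `kwic` and the 30-character window are illustrative choices. The sample text is the authors' English translation of excerpt 1.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context triples: left context, match, right context."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines


# English translation of excerpt 1, used here only as sample text.
sample = ("Leppävaara is the only decent-sized urban area in Espoo, "
          "the only part of our city that's a real urban area.")
for left, hit, right in kwic(sample, "Espoo"):
    print(f"{left:>30} | {hit} | {right}")
```

Reading the aligned contexts side by side is what allows an analyst to decide which meaning or referent a keyword has in each occurrence.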
After this classification was completed, following the example of previous GIS studies (e.g., Donaldson et al. 2017), we mapped and visualized the discourse prosodies based on geographical information contained in the social media. Finally, after a close reading, we examined the three most wide-ranging discourse prosodies. In this discourse analysis, we thereby complemented the statistical analysis with an interpretation of the possible deeper cultural meanings of the utterances derived from the data.

An Overview of the Discourse Prosodies Associated with the Cities in the Capital Region
The categories and distributions, which emerged from the data, are illustrated in Figure 1. The largest discourse prosody is, by far, the category "places and directions" with 288 keywords (32.00%), followed by "mobility" (107, 11.89%) as the second largest. The difference is statistically significant (p < 0.001 2 , z = 10.31) and confirms Ainiala and Jantunen's (2019) finding that place names are typically associated with not only toponyms of cities, towns, neighborhoods and villages, but also citizens, locations (specific and non-specific), spatial adverbs, and points on the compass (see also Table 1). The category "places and directions" was found to be similarly large for all three cities: 32.00% for Helsinki and Espoo, and 31.00% for Vantaa. After "mobility", the third largest category was "housing", and the fourth, "politics and organizations". The percentage shares were less similar in these categories: 14.33% "mobility" for Helsinki, 12.00% for Vantaa, and 9.33% for Espoo; 7.00% "housing" for Helsinki, 10.00% for both Vantaa and Espoo; and 9.00% "politics and organizations" for Helsinki, 6.67% for Vantaa, and 10.33% for Espoo (see Figure 1). However, the differences were not statistically significant.
In contrast, "crime and justice" (e.g., "poliisi" 'the police') was found to be a major discourse prosody for Vantaa and Espoo, with statistically significant differences compared to Helsinki (Vantaa-Helsinki: p < 0.001, z = 4.31; and Espoo-Helsinki: p < 0.01, z = 2.54). In fact, "crime and justice" was an almost non-existent discourse prosody in the Helsinki data examined (Figure 1). Almost as low in the Helsinki sub-corpus was "commerce", which was also significantly different to both Vantaa (p < 0.01, z = 2.80) and Espoo (p < 0.01, z = 2.60). Conversely, "people and individuals" was a discourse category significantly associated with Helsinki, when compared to Vantaa (p < 0.001, z = 5.31) or Espoo (p < 0.001, z = 5.05). The other categories proved less common overall. Consequently, there was less difference between cities. The following section will present the findings obtained for the three largest categories of keywords: "places and directions", "mobility", and "housing".
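The article does not name the significance test behind the reported z values. As a hedged illustration, the Python sketch below assumes a pooled two-proportion z-test, chosen because it reproduces the value reported for "places and directions" against "mobility". The function name is hypothetical, and the denominator of 900 keywords is inferred from the reported share (288 keywords = 32.00%) rather than stated in the text.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# 288 "places and directions" vs. 107 "mobility" keywords.
# The denominator 900 is inferred from the reported 32.00% share (288 / 0.32).
z, p = two_proportion_z(288, 900, 107, 900)
print(round(z, 2))  # 10.31
```

Under these assumptions the result matches the reported z = 10.31, with a p-value far below 0.001.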

Mapping the Three Most Significant Discourse Prosodies

Places and Directions
In the discourse prosody "places and directions", several differences were found between the cities in some of the sub-categories and in the keywords they shared. Examples of these sub-categories and keywords are presented below in Table 1. The number of keywords in each sub-category is provided in Table 2. For each sub-category in Table 1 (as in Tables 3 and 5), illustrative examples of keywords are provided. The last column contains keywords that were shared by all three cities, while the other columns show keyword examples that were typical for only one or two of the cities. As shown in Table 2 (examples in Table 1), for all three cities most keywords fall into the sub-category of Cities, Towns, and Municipalities, and there are no statistically significant differences. The differences in the second largest sub-category, Neighborhoods, on the other hand, are statistically significant, Espoo having more mentions of neighborhoods than Helsinki (p < 0.05, z = 2.25). Espoo (n = 23) and Vantaa (n = 20) both appeared more often with local neighborhood names than Helsinki (n = 9) (see Figure 2). The difference between Vantaa and Helsinki was significant (p < 0.05, z = 2.31), and the difference between Espoo and Helsinki was very significant (p < 0.01, z = 2.71). The neighborhoods mentioned were clearly within the boundaries of each respective city, as illustrated below in Figure 2. This finding suggests that, in comparison to Helsinki, Espoo and Vantaa were more closely linked in our investigation to toponyms that are meaningful to local communities, but less well known outside them. This may be due to the strong neighborhood identities that are reputedly connected to the polycentric urban structure of both Espoo and Vantaa.

Figure 2: Place Names in the Capital Region and Surrounding Municipalities
In excerpt 1 displayed below, the Suomi24 user describes the Espoo neighborhood of Leppävaara as the only real urban area in a city that they clearly see as their own ("meidän kaupunki" 'our city'). This excerpt illustrates how some neighborhoods were only discussed in relation to the city where they were located. Other neighborhoods, however, were discussed in connection to all three cities. In excerpt 2 below, the neighborhood of Vuosaari, located in the east of Helsinki, is compared to areas in Espoo. This passage demonstrates how place-framing can draw on and perpetuate socio-spatial differences by drawing parallels and distinctions between places. Vuosaari and the residential areas adjacent to the Länsiväylä Road are similar to each other in that they are both suburban with good traffic connections to the center of Helsinki. However, they differ demographically. For example, in Vuosaari, the proportion of residents whose mother tongue is not one of Finland's official languages is substantially higher (26%) than in Lauttasaari (7%) on the Länsiväylä Road (Ulkomaalaistaustaiset Helsingissä 2021).
[Leppävaara is the only decent-sized urban area in Espoo, the only part of our city that's a real urban area.] 3
[… and there are no areas that are as dodgy as Vuosaari in Espoo, and the traffic's better on the Länsiväylä Road.]
The toponyms were also prevalent when broader geographical units were discussed. City, town, and municipality toponyms occurred in the data 28 times for Helsinki, 37 for Vantaa, and 36 for Espoo, including those mentioned in connection with at least two of the three cities. The Finnish provinces of Savo 'Savonia', Pohjanmaa 'Ostrobothnia', and Lappi 'Lapland' were also mentioned, but they were only discussed in connection with Helsinki. Such discursive pairings may point to a deep-rooted geographical dichotomy between the capital city and the 'rest of the country'. The toponyms mentioned at the municipal and regional scale are shown in the maps in Figure 3.

[...] Tallinn (Tallinna) are only mentioned in conjunction with Helsinki. In excerpt 4, for instance, London and Helsinki are both described as being good for shopping.
[I buy my clothes at Harrod's in London and occasionally from Stockmann's in Helsinki.]
As well as referring to geographical units in terms of their relative size, the category of "places and directions" also includes utterances that depict the people living in particular areas as being different or special. For instance, one of the most striking differences found between Helsinki and its neighboring cities was that the Helsinki inhabitants were often juxtaposed with the rest of the country, whereas no such contrast was drawn with the residents of Espoo or Vantaa. This finding might mirror not only the views of many people living in Helsinki, but also the wider national geographical imagination which is continuously reinforced by Finland's mainstream media. Excerpt 5 also highlights this perceived contrast between Helsinki and elsewhere in Finland. This passage was taken from a forum discussion about the difficulties people face when they move to the capital.
[I don't know how this problem can be solved, but maybe the best thing would be that people from the country start by learning to take other people into consideration when they come to Helsinki.]
The above suggestion may point to norms of formal politeness commonly associated with dense urban environments-the impersonal, transitory, superficial social interactions thought typical of big cities (Wirth 1938). Taken together, the findings of this study revealed that Helsinki was discussed in relation to the rest of the nation and other world capitals, whereas discussions about Espoo and Vantaa focused more on their inner provincial characteristics. This difference may highlight the special role of the capital city in the spatial order of the nation-state (Jokela & Linkola 2013). Helsinki is a node that links different geographical spaces to each other and, in some cases, acts as a metonym for the entire capital region, including Espoo and Vantaa. With its prominent architecture, state institutions, and urban culture, Helsinki is also, however, connected to some negative western myths of the 'big city'. In stark contrast with the seemingly inferior rural settings of the provinces, Helsinki appears as superior, with political power and aloof urban sophistication (Jokela & Linkola 2013).

Housing
Within the discourse prosody "housing", the most important differences between the three cities relate to housing type. This difference was mainly evident in the qualitative analysis (Table 3), but was also to some extent present in the quantitative analysis (Table 4). Studios, called "yksiö" in Finnish, and two-room apartments or "kaksio" appeared in discourses compiled for all three cities, whereas three-room apartments, "kolmio", and certain types of buildings (e.g., apartment block "kerrostalo", detached house "omakotitalo", and row house "rivitalo") were found only in the sub-corpora for Espoo and Vantaa. These differences might reflect generally held ideas about housing patterns (Clapham 2005) in which young people leave home, start their adult life in the city, and then raise a family in bigger homes further away from the city centre. In Finland's capital region, Kallio and Punavuori, two traditional working-class districts located next to Helsinki's historical center, are popular among young adults and students, both of whom favor small flats. This trend might also partly explain the prevalence [...]

Mobility
[...] qualitative properties for each city seemed to differ. In Table 5, the qualitative differences between the cities are highlighted with the help of examples of the keywords under various sub-categories. As shown in Table 5, the sub-categories Public transport and Roads are common to all three cities, whereas the other categories exist only for two of the cities. Helsinki is often discussed in terms of air travel ("lento" 'flight', "lentää" 'to fly'). Because Helsinki's airport is geographically located in Vantaa, the keyword "airport" also features heavily in the context of Vantaa.
Furthermore, in our investigation, Vantaa often appeared together with Helsinki in the toponyms Helsinki-Vantaan lentokenttä or Helsinki-Vantaan lentoasema ('Helsinki-Vantaa airport'), the latter being the official Finnish name for the Helsinki airport. In the discussions, the role of Helsinki as the place for departing and arriving flights was often emphasized (see excerpt 8), even though the airport itself is not located within the municipal boundaries of Helsinki.
In addition to air travel, car travel was a common theme in the discussions. 'Roads' was a prominent sub-category for all three cities (Figure 4), which may point to the important role of private motoring and the polycentric urban structure of the cities we examined. Interestingly, the only make of car mentioned here in the corpus was Audi, a brand that is a high-status symbol in Finland (as was also illustrated in excerpt 7). The mention of this brand name probably corresponds to the relatively high socio-economic status of Espoo residents as compared to those in Vantaa and Helsinki.
The semantic classes within the discourse prosody "Mobility", and the total number of keywords referring to each per city, are presented in Figure 4. The size of each city's pie chart on the map is proportional to the total number of "Mobility" keywords for each city. In this way, the pie charts also show the relative proportions of each semantic class based on the number of keywords in each. The figures around the charts correspond to the total number of keywords per color-coded semantic class.
Discussions on different modes of transport were accompanied by references to movement in general. Verbs expressing visiting ("käydä"), leaving ("lähteä"), and arriving ("saapua") appeared in both the Helsinki (4 verbs) and Vantaa (3 verbs) sub-corpora, but no such instances were detected in the Espoo sub-corpus. One explanation might be that Espoo was seen less as a point of departure or arrival than either Helsinki or Vantaa.
One detail here, as in many of the other discourse prosodies, related to the occurrence of slang words associated with Helsinki (e.g., "dösä" for 'bus', "spora" for 'tram', "kilsa" for 'kilometer'). Helsinki slang is a unique variety that developed at the turn of the 20th century among the working classes. From the 1950s onwards, this urban speech became a common street language of young people throughout the Helsinki region (Ainiala & Lappalainen 2017). Even though slang words and names are used in Espoo and Vantaa as well, in our study, slang was most prominently used to convey users' 'Helsinkiness'. The connection between Helsinki and slang was further evidenced by the fact that Helsinki was the only one of our three cities to have slang names-Hesa and Stadi (see also Ainiala & Jantunen 2019; Ainiala & Lappalainen 2017).

Conclusions
This article investigates the digital urban discourses surrounding the names used for three cities in Finland's Capital Region: Helsinki, Espoo, and Vantaa. We used onomastics, geographical information systems (GIS), and corpus-assisted discourse studies (CADS) to detect the potential differences in online discourses that made reference to these cities. Our analysis revealed stark patterns of socio-spatial differentiation and thus confirmed the perceived special role of capitals in the spatial order of nation-states. The knowledge gained from research such as ours may serve urban planners and branding experts who strive to understand the ways people attach meanings to particular cities and neighborhoods; to plan meaningful and efficient enquiries or participatory processes; and to anticipate potential conflicts in socially sustainable ways.
However, by concentrating only on the three most frequent discourse prosodies, we have barely scratched the surface. Moreover, because a frequency-based method for data analysis was used, less frequent phenomena that may have been more discernible with traditional methods of discourse analysis were probably overlooked. Nevertheless, analyzing keywords by grouping them into discourse prosodies helped to reveal what people in the discussion forum were saying about Helsinki, Espoo, and Vantaa. One of the benefits of this methodological approach is that data was not collected expressly for this study, but as part of a larger research objective. As a result, potential biases that may be introduced when researchers collect and observe data with a particular study in mind may have been reduced. But since the demographic groups using social media platforms vary greatly (Sadah et al. 2015), and because there are many Finns who do not use Suomi24, we must bear in mind that this forum is only partially representative. Furthermore, the corpus did not provide information about the relation of users to the area (e.g., whether they live in the area or not). For a more complete picture, other data would need to be analyzed. Another limitation of this study involves the ethical concerns of using social media to gather data without obtaining user consent. It is fair to say that many users in this study were not aware that their personal posts and information were being used for research purposes (Williams et al. 2017).
Despite these limitations, as Motschenbacher (2020b) notes, corpus data and corpus linguistic methods are a powerful way to study names in actual use. By using this approach, the study answers the call for onomastics to reveal more about actual name usage in authentic spoken and textual contexts (e.g., Ainiala & Östman 2017; De Stefani 2016). In this way, our research provides a stronger foundation for other new and innovative onomastic investigations. By using corpus-assisted mixed-method approaches such as the one presented here, future big data studies can contribute to the increasingly multidisciplinary study of names.

Endnotes
1 For more on the spatial humanities and Visual GISting, see, for example, Gregory et al. (2015) and Murrieta-Flores et al. (2011).
2 Statistical significances: p < 0.001 = extremely significant, p < 0.01 = very significant, and p < 0.05 = significant.
3 All excerpts are translated by the authors.