Koffie’s vs koffies: how to find evidence of Dutch usage problems?

Marten van der Meulen is the next student in my MA course Testing Prescriptivism to write a blog post:

While recently investigating a piece on the greengrocer’s apostrophe, I read an earlier piece on the Bridging the Unbridgeable blog about an occurrence of koffie’s. The writer of the piece tries (successfully, in my opinion) to identify the reason for this usage. Whether variation between plural koffies and koffie’s is perceived as an actual Dutch usage problem is hard to say, because of the apparent lack of Dutch usage guides where we might be able to check this. I am, however, interested to see whether this is a completely accidental case, or whether it is part of a larger usage phenomenon. The problem, though, is that finding structural data for Dutch usage is somewhat more complicated than for English. Nevertheless, based on some simple searches I can say that the use of plural koffie’s is far from incidental.

The main reason the search is somewhat difficult is that there are no readily available Dutch corpora online. English has the BNC and COCA, but Dutch corpora such as the Corpus Gesproken Nederlands (Spoken Dutch Corpus) and various others have limited access. Furthermore, there is no Dutch option available in the Google Ngram Viewer, which is a shame, since the program is very well suited to a comparative search such as koffie’s/koffies.

However, there are two simple ways in which searches for koffie’s/koffies can be done: on Twitter and on Google. All you have to know is a simple restriction, namely lang:nl, as in <lang:nl “koffie’s”>. This restricts a query to tweets or pages written in Dutch. And for the question I’m interested in here, this yields some nice results: while a Twitter search does not provide us with a numerical frequency, it does at least show that the usage does occur. A similar Google search does report frequencies: with lang:nl, I got 13,500 hits for koffie’s, versus 40,900 for koffies.
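To put the two Google counts in proportion, here is a minimal Python sketch of the arithmetic. The variable names are illustrative, and the hit counts are simply the two figures quoted above; search-engine hit counts are estimates, so the result is a rough indication rather than a measurement.

```python
# Hit counts as reported in the text above (uncleaned Google results).
hits_with_apostrophe = 13_500     # koffie's
hits_without_apostrophe = 40_900  # koffies

total = hits_with_apostrophe + hits_without_apostrophe

# Share of all hits that use the apostrophe spelling.
share_with_apostrophe = hits_with_apostrophe / total

# How many times more frequent the spelling without apostrophe is.
ratio = hits_without_apostrophe / hits_with_apostrophe

print(f"koffie's share of hits: {share_with_apostrophe:.1%}")
print(f"koffies per koffie's:   {ratio:.1f}")
```

On these raw figures, roughly one hit in four has the apostrophe, before any manual clean-up.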

The problem with these searches is that they do not disambiguate between possessive and plural, and that they don’t observe word boundaries, as in the first hit: the results have to be cleaned up manually. But even a quick look at the top results shows that the query for koffie’s produced mostly plurals. The anecdotal evidence is solidified! There does seem to be structural variation in Dutch between koffie’s and koffies. The next step (on which more later) will be to extend this finding to include more Dutch words ending in -ie. Let me know if you have come across any such words, so I can investigate them.
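Part of the manual clean-up mentioned above, the word-boundary problem, could be automated once the tweets or page snippets have been saved as plain text. The following is a minimal Python sketch under that assumption; the function name count_forms and the sample sentences are hypothetical, and the code does not attempt to separate possessives from plurals, which still has to be done by hand.

```python
import re

# Both the typewriter apostrophe (') and the typographic one (’) are accepted.
APOSTROPHE = "['\u2019]"

# Lookarounds that act as word boundaries and also treat apostrophes as
# part of a word, so that "ijskoffie's" or "koffiesmaak" are not counted.
NOT_WORD_BEFORE = r"(?<![\w'\u2019])"
NOT_WORD_AFTER = r"(?![\w'\u2019])"

WITH_APOSTROPHE = re.compile(
    NOT_WORD_BEFORE + "koffie" + APOSTROPHE + "s" + NOT_WORD_AFTER,
    re.IGNORECASE,
)
WITHOUT_APOSTROPHE = re.compile(
    NOT_WORD_BEFORE + "koffies" + NOT_WORD_AFTER,
    re.IGNORECASE,
)


def count_forms(texts):
    """Count whole-word occurrences of both spellings in a list of strings."""
    counts = {"koffie's": 0, "koffies": 0}
    for text in texts:
        counts["koffie's"] += len(WITH_APOSTROPHE.findall(text))
        counts["koffies"] += len(WITHOUT_APOSTROPHE.findall(text))
    return counts


# Made-up sample sentences, for illustration only.
samples = [
    "Twee koffie’s graag!",
    "De koffies zijn op.",
    "Lekkere ijskoffie's hier",
    "Die koffiesmaak is top",
]
print(count_forms(samples))
```

Only the first two sample sentences are counted here; the compound and the longer word are skipped because the match is required to be a whole word.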

