Alex Reuneker's website about language, running, cycling and travel

Language & Literature

Posts about language and literature

The Evenings, day 3: 24 December 1946/2025

It's already the third day of reading The Evenings, and I thoroughly enjoyed yesterday. Baldness is a recurring theme throughout the book – an early sign of mortality, and I found it interesting to see how Frits expressed his gnarly remarks on his brother 'getting extremely bald'. Yes, in English too, it sounds both astonishingly direct and funny.

Today, I'm reading chapter 3, and I thought I would extract keywords from that chapter using the tool available at https://www.reuneker.nl/files/keyword. Now, to index keywords, you need a reference corpus, because otherwise, you'd just get the most frequent words, which, in almost all cases, are 'just' grammatical words like the, a, he et cetera. Indeed, looking at the top 3 below, you immediately see what I mean.

Most frequent words in chapter 3 of The Evenings

In the keyword tool mentioned, you can select the British National Corpus (BNC) as a reference. That isn't perfect, because it matches neither the genre (newspapers vs literature) nor the period, but since, again, it's just for fun, let's not get too picky about that. Pasting the third chapter into the tool and pressing 'extract keywords' gives the following results.

Keywords in chapter 3 of The Evenings

So yes, Dutch names are significantly more frequent in The Evenings than in the BNC. Not that surprising, of course, but if we look past the names, we see words like said, which makes sense, because Frits talks a lot, and not only to himself. I was surprised to also see wortel (carrot) in the list, but this too is a name, introduced in proto-Bond fashion: 'My name is Wortel. Arend Wortel.'
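
For readers who want to try this kind of comparison themselves, here is a minimal sketch of keyword extraction against a reference corpus. It assumes the log-likelihood keyness statistic, which is a common choice in corpus linguistics; the tool linked above may well compute its scores differently. The function names and the toy word lists are made up for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score.

    a = frequency in the target text, b = frequency in the reference,
    c = size of the target text, d = size of the reference (in tokens).
    """
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_target)
    if b > 0:
        score += b * math.log(b / expected_reference)
    return 2 * score


def keywords(target_tokens, reference_tokens, top=10):
    """Rank words that are relatively more frequent in the target text."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target, n_reference = len(target_tokens), len(reference_tokens)
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        # Keep only words that are overused compared to the reference
        if a / n_target > b / n_reference:
            scored.append((word, log_likelihood(a, b, n_target, n_reference)))
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top]


# Toy data, not taken from the book or from the BNC
chapter = "frits said the frits the said".split()
corpus = "the a the he said the a the".split()
for word, score in keywords(chapter, corpus):
    print(word, round(score, 2))
```

A grammatical word like the is frequent in both lists, so it drops out, while a name that hardly occurs in the reference rises to the top. That is the same effect as in the results above.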

Again, these little tests are just for fun. One chapter is a bit short for a keyword analysis, the reference corpus isn't perfect, and there are technical details concerning apostrophes and the like to deal with, but a brief quantitative look at a chapter gives a nice, different little insight into a text many know so well.

Have fun reading today!

The Evenings, day 2: 23 December 1946/2025

Today, we're reading chapter 2 of The Evenings. I thought it would be fun to look at sentence lengths in that chapter and compare them between the original Dutch text and the English translation. There's no reason to expect a real difference – it's simply for fun.

First, the Dutch text. Pasting the chapter into the sentence-length calculator at https://www.reuneker.nl/files/senlen tells us that there are 650 sentences, with a total of 6720 words and an average of 10.34 words per sentence (sd = 6.28). The English translation has 3 sentences more (653), with a total of 7135 words and an average of 10.93 words per sentence (sd = 6.81). We already see that these numbers are very similar, but let's test for a difference anyway to see if there's something interesting to be learned.
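
As an illustration of what such a calculator does, here is a minimal sketch that splits a text into sentences and counts the words in each. The splitting rule (a period, exclamation mark or question mark followed by whitespace) is an assumption made for this example; the calculator linked above may segment sentences differently, so the counts need not match exactly.

```python
import re
import statistics


def sentence_lengths(text):
    """Return the number of words in each sentence of the text."""
    # Split after sentence-final punctuation that is followed by whitespace
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(sentence.split()) for sentence in sentences]


def summarize(lengths):
    """Summary statistics; the sample sd needs at least two sentences."""
    return {
        "sentences": len(lengths),
        "words": sum(lengths),
        "mean": statistics.mean(lengths),
        "sd": statistics.stdev(lengths),
    }


# Made-up example text, not a quotation from the book
example = "It was dark. Frits got out of bed! Late?"
print(summarize(sentence_lengths(example)))
```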

Using the t-test calculator at https://www.reuneker.nl/files/t we get the following results. For De avonden (n = 650), the sum total of words is 6720, the minimum number of words in a sentence is 2, the maximum is 40, the mean is 10.34, the median is 9 words, and the standard deviation is 6.28 words. For The Evenings (n = 653), the sum total of words is 7135, the minimum number of words in a sentence is 2, the maximum is 44, the mean is 10.93, the median is 9 words, and the standard deviation is 6.81 words. The difference in sentence lengths between The Evenings and De avonden is not significant (t(1301) = 1.62; p > 0.05). The effect is negligible (Cohen's d = 0.09; Cohen, 1988). In a boxplot, that looks like this: indeed, nearly identical.

Sentence lengths in De avonden and The Evenings
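
The test statistic and effect size reported above can be rechecked from the summary statistics alone. The sketch below assumes a Student's t-test with pooled variance, which is what the reported 1301 degrees of freedom (650 + 653 - 2) suggest; the calculator itself may differ in its details. The p-value here uses a normal approximation, which is only reasonable because the degrees of freedom are so large.

```python
import math


def pooled_t_test(n1, total1, sd1, n2, total2, sd2):
    """Student's t-test (pooled variance) from summary statistics."""
    mean1, mean2 = total1 / n1, total2 / n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    pooled_sd = math.sqrt(pooled_var)
    standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    t = (mean2 - mean1) / standard_error
    cohens_d = (mean2 - mean1) / pooled_sd
    # Two-sided p-value via the normal approximation (fine for large df)
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, df, cohens_d, p


# Sentence counts, word totals and standard deviations as reported above
t, df, d, p = pooled_t_test(650, 6720, 6.28, 653, 7135, 6.81)
print(f"t({df}) = {t:.2f}, Cohen's d = {d:.2f}, approximate p = {p:.2f}")
```

With the figures from the text, this gives t(1301) = 1.62 and d = 0.09, with an approximate p-value well above 0.05.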

So, which sentences are those very long ones, then? And are they the same sentences across both editions? The answer is yes, as you can see below.

Dutch: Op de trap, bij de kaartencontrole, troffen ze elkaar weer en liepen langs een spandoek met het opschrift Berends gymnasium, 1926-1946 naar boven, waar ze in een hal, kleiner dan die beneden, voor de ingang van een zaal kwamen.

English: At the foot of the stairs, where the ticket takers stood, they met up again and climbed past a banner reading Berends Gymnasium, 1926-1946 and then, in a hall even smaller than the one below, found themselves before the entrance to an auditorium.

The Evenings, day 1: 22 December 1946/2025

Today, December 22nd of 2025, is the first day of reading The Evenings, the translation of De avonden by Gerard Reve. Reading this classic from 22 to 31 December is a tradition in Holland, although I have no clue how many people actually do it. This year, however, my wife is joining in, so that makes at least two. If you read it from 22 December until New Year's Eve, a chapter a day, each chapter matches the day on which you read it.

It was kind of a weird experience reading the first chapter in English today. I know the book so well from all the annual readings that I remember some passages word for word. And now, of course, the actual words have changed. My first impression is that the translation makes the scenes at home with Frits, his mother and father less 'stingy', a bit nicer and more comfortable than they appear in the Dutch text, especially where Frits is concerned, because he really isn't a very nice person. That comes across less in this English translation, I think. As I've only read the first chapter so far, let's not draw any real conclusions yet, though.

As I'm not only a nerd with respect to literature, but also to numbers, I conducted some quick lexical tests on the first chapter. I found it quite funny to see that one of the strongest n-grams in the chapter, and probably throughout the book, turns out to be the fourgram 'he said to himself'.

Fourgrams in the first chapter of The Evenings

Yes, Frits talks to himself quite a lot, which is part of the appeal of the book, I think: you get to know Frits on a really personal, intimate level, as you not only are invited into his inner musings, but you can also frequently witness the differences between what Frits actually thinks, and what he eventually says.
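
Counting fourgrams like the one mentioned above takes only a few lines. The sketch below is a simplified illustration, not the tool used for the figure: the tokenization (lowercased letters and apostrophes) is an assumption, and n-grams are allowed to run across sentence boundaries.

```python
import re
from collections import Counter


def ngrams(text, n=4):
    """Count all sequences of n consecutive words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Made-up example text, not a quotation from the book
example = "He said to himself, again. Then he said to himself: no."
print(ngrams(example).most_common(1))
```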

Looking forward to tomorrow!

Which 'De Avonden' this year?

For years, I have been reading De avonden by Gerard Reve from 22 December through New Year's Eve. I love a tradition like that, not least because you know you are not alone. You are in the company of Frits van Egters and his daily worries, of course, but what I mainly mean is that there are more people who read a chapter of this literary classic on exactly these days. On that, see also the comments at https://neerlandistiek.nl/2025/12/zo-ver-weg-van-och-jeetje-achguttoch, and be sure to read Marc van Oostendorp's fine review of Eric de Rooij's Uit tallozen, jij there as well.

Still, a bit of variation now and then does no harm. Last year, for a change, I read the comic-strip adaptation by Dick Matena; the year before, if I remember correctly, the second edition; and the year before that, the first. One year I also listened with great pleasure to the audiobook on CD (by now, of course, terribly old-fashioned again), on which an already elderly Reve reads the book aloud. Wonderful. But what does that leave for this year?

De Avonden in English translation

Not long after it was published, the English translation, The Evenings, also ended up on the shelf here, but I still haven't read that edition. So that's the one for this year. Kicking the habit of mother's 'hoei boei'? Perhaps, because in Sam Garrett's translation that has become goodness gracious, which I know mainly from Jerry Lee Lewis's Great Balls of Fire.

Hirsch-Popescu Point added to Lexical Diversity Calculator

The Hirsch-Popescu Point (Popescu & Altmann, 2006) is an interesting metric for assessing repetition in a text. It is determined by first calculating the frequency distribution of all words in the text. Then, words are ranked from most to least frequent. The H-P Point is then defined as 'the point in which the ranking of a word in the distribution matches its frequency, just like the h-index in academia' (see Nunes, Ordanini, Valsesia, 2017, p. 20; Hirsch, 2005). The h-index is, of course, a well-known measure of the productivity and citation impact of publications. The smaller the H-P Point, the less repetition a text contains; the greater the H-P Point, the more repetition it contains.

If we apply the calculation to an example text, it becomes clear how it works. The text used here is Sylvia Plath’s poem 'Lady Lazarus' (1965), and the resulting frequency table (distribution of words) can be seen below.

Word distribution in Sylvia Plath’s poem ‘Lady Lazarus’ (1965), https://www.poetryfoundation.org/poems/49000/lady-lazarus

In this table, we see that the word of occurs eight times in the poem, and that it is also ranked at position eight in the order from most to least frequent words. Therefore, 8 is the H-P Point. Of course, building this frequency table, sorting it and determining the actual point at which frequency and rank coincide are all done for you by the Lexical Diversity Calculator, as can be seen below.

The Hirsch-Popescu Point as calculated by the Lexical Diversity Calculator
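
To make the procedure concrete, here is a minimal sketch of the calculation. It follows the h-index analogy from the definition above and returns the highest rank whose frequency is at least as large as the rank itself. This is a simplification and not necessarily how the Lexical Diversity Calculator implements it; in particular, the case in which no rank matches its frequency exactly is handled here in the h-index way, without any interpolation. The function name and the toy word list are made up for illustration.

```python
from collections import Counter


def hp_point(tokens):
    """Highest rank r at which the r-th most frequent word occurs >= r times."""
    frequencies = sorted(Counter(tokens).values(), reverse=True)
    point = 0
    for rank, frequency in enumerate(frequencies, start=1):
        if frequency >= rank:
            point = rank
        else:
            break
    return point


# Toy data: three words occurring three times each, one word occurring once
tokens = "a b c a b c a b c d".split()
print(hp_point(tokens))  # prints 3: the third-ranked word occurs three times
```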

As always, if you find it useful, have fun!
