Alex Reuneker's website about language, running, cycling and travel

Sorted frequency table for Hirsch-Popescu Point added to Lexical Diversity Calculator

While I was working on a very brief piece on the Hirsch-Popescu Point (HPP) in one of the chapters of Reve's The Evenings, it occurred to me that the Lexical Diversity Calculator calculates the HPP, but did not yet offer a way to look at the sorted frequency table used to determine it. As that table can be very informative, I have now added it to the Lexical Diversity Calculator. The actual HPP, i.e. the word whose position in the sorted frequency list matches its frequency, is marked in bold and red for easy identification.

HPP marked in bold and red

If you'd like to use it, just head over to https://www.reuneker.nl/files/ld.

Hirsch-Popescu Point added to Lexical Diversity Calculator

The Hirsch-Popescu Point (Popescu & Altmann, 2006) is an interesting metric for assessing repetition in a text. It is determined by first calculating the frequency distribution of all words in the text. The words are then ranked from most frequent to least frequent. The H-P Point is defined as 'the point in which the ranking of a word in the distribution matches its frequency, just like the h-index in academia' (see Nunes, Ordanini, & Valsesia, 2017, p. 20; Hirsch, 2005). Indeed, the h-index is a well-known measure of the productivity and citation impact of publications. The smaller the H-P Point, the less repetition a text contains; conversely, the greater the H-P Point, the more repetition a text contains.
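The steps described above (count, rank, find the position where rank equals frequency) can be sketched in a few lines of Python. This is only an illustration and not the code behind the Lexical Diversity Calculator: the function names, the simple regular-expression tokenizer and the alphabetical tie-breaking are all assumptions of mine. The sketch also simply returns `None` when no rank matches its frequency exactly, which is a simplification.

```python
import re
from collections import Counter


def ranked_frequencies(text):
    """Return (word, frequency) pairs sorted from most to least frequent."""
    # Assumption: lowercase the text and treat runs of word characters
    # (optionally joined by straight apostrophes) as words.
    words = re.findall(r"\w+(?:'\w+)*", text.lower())
    counts = Counter(words)
    # Ties in frequency are broken alphabetically to keep the order stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def hirsch_popescu_point(text):
    """Return (rank, word) where rank equals frequency, or None if no exact match."""
    for rank, (word, freq) in enumerate(ranked_frequencies(text), start=1):
        if rank == freq:
            return rank, word
        if freq < rank:
            # Frequencies only decrease while ranks increase,
            # so no later rank can match its frequency.
            break
    return None
```

For a toy text such as `"a a a b b c"`, the ranking is a (3), b (2), c (1), so the function returns rank 2 with the word `b`.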

If we apply the calculation to an example text, it is easy to see exactly how it works. The text used here is Sylvia Plath’s poem 'Lady Lazarus' (1965), and the resulting frequency table (distribution of words) can be seen below.

Word distribution in Sylvia Plath’s poem ‘Lady Lazarus’ (1965) (https://www.poetryfoundation.org/poems/49000/lady-lazarus)

In this table, we see that the word 'of' occurs eight times in the poem and is also ranked at position eight when the words are ordered from most to least frequent. Therefore, 8 is the H-P Point. Of course, building the frequency table, sorting it and determining the point at which frequency and rank coincide are all done for you by the Lexical Diversity Calculator, as can be seen below.
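If you would like to build a comparable table by hand, a rough Python sketch could look like the one below. It is not how the Lexical Diversity Calculator works internally; the function name `marked_frequency_table`, the whitespace tokenizer and the text marker (standing in for the bold and red highlighting) are assumptions, and the toy sentence is invented for illustration rather than taken from Plath's poem.

```python
from collections import Counter


def marked_frequency_table(text):
    """Return table rows as strings; the row where rank equals frequency is flagged."""
    # Assumption: lowercase the text and split on whitespace only.
    counts = Counter(text.lower().split())
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows = []
    for rank, (word, freq) in enumerate(ranked, start=1):
        marker = "  <-- H-P Point" if rank == freq else ""
        rows.append(f"{rank:>4}  {word:<12} {freq:>4}{marker}")
    return rows


# Invented toy text: 'the' occurs 3 times, 'and' 2 times, the rest once.
toy = "the cat and the dog and the bird"
print("\n".join(marked_frequency_table(toy)))
```

In this toy text, 'and' occurs twice and sits at rank two, so the second row is the one that gets flagged.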

The Hirsch-Popescu Point as calculated by the Lexical Diversity Calculator

As always, if you find it useful, have fun!