De fine versus – a Renaissance Version

Neven Jovanović, University of Zagreb, Croatia

Address of this page: croala.ffzg.unizg.hr/croatica-tyrolensia/collections/defineversus/
BibSonomy links: croalaDeFineVersus

Summary

Using two collections of Renaissance Latin poetry available in digital form (the Poeti d'Italia in lingua latina, ed. Mastandrea et al., first edition 1999, c. 2.5 million words; the Croatiae auctores Latini / CroALa, ed. Jovanović et al., first edition 2009, c. 5 million words), the paper demonstrates the importance of a standardized format (TEI XML) and a Creative Commons license (which permits extensive computational manipulation). Isolating a subset of all poems written between 1400 and 1650 (c. 483,000 verses), I discover and analyse the reuse of clausulae, a long-recognized locus of poetic memory in Latin literature. I thus intend to ask of Italian and Croatian Renaissance poems what Mastandrea and Tessarolo asked of ancient Roman poetry in 1993: to what extent do texts, authors, periods, and genres use similar clausulae, and what could that mean?

Figure: "Cantonnement de Croates dans la Casa Picozzi près de Mestre, pendant le siège de Venise" (Croatian troops quartered in the Casa Picozzi near Mestre during the siege of Venice)

Sources / corpora

Croatiae auctores Latini:
246,658 Latin verses
1,575,988 words (tokens)*
CroALa / Renaissance (all Latin verses written before 1700):
97,162 verses (39% of all CroALa verses)
602,406 words / tokens (38% of all words in CroALa verses)
Poeti d'Italia in lingua latina (excluding texts also present in CroALa):
438,484 verses (1.78 times the size of CroALa, 4.5 times the size of CroALa / R)
2,706,946 words / tokens (1.7 times the size of CroALa, 4.5 times the size of CroALa / R)
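
The shares and ratios in this list follow directly from the verse and token counts. As a quick check, here is a minimal Python sketch (Python is used only for illustration; the project's own scripts are XQuery). The names `CORPORA`, `percent` and `times` are invented for this example.

```python
# Verse and token counts copied from the list above.
CORPORA = {
    "CroALa": {"verses": 246_658, "tokens": 1_575_988},
    "CroALa/R": {"verses": 97_162, "tokens": 602_406},
    "PdILL": {"verses": 438_484, "tokens": 2_706_946},
}


def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


def times(larger, smaller):
    """Return how many times `smaller` fits into `larger`."""
    return larger / smaller


for unit in ("verses", "tokens"):
    all_croala = CORPORA["CroALa"][unit]
    renaissance = CORPORA["CroALa/R"][unit]
    pdill = CORPORA["PdILL"][unit]
    print(
        unit,
        f"CroALa/R share of CroALa: {percent(renaissance, all_croala):.1f}%",
        f"PdILL vs CroALa: {times(pdill, all_croala):.2f}x",
        f"PdILL vs CroALa/R: {times(pdill, renaissance):.2f}x",
        sep=" | ",
    )
```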

Three-word endings

Repeated three-word clausulae in CroALa:
6,132 clausulae
comprising 14,695 verses
6% of the corpus
Unique three-word clausulae in CroALa / Renaissance (before 1700):
93,507 verses
Repeated three-word clausulae in CroALa / Renaissance:
1,604 clausulae
3,539 verses
3.6% of the Renaissance corpus
Unique three-word clausulae in PdILL:
396,041 verses
Repeated three-word clausulae in PdILL:
17,685 clausulae
42,272 verses
9.6% of the corpus

Two-word endings

Repeated two-word clausulae in CroALa:
21,186 clausulae
72,546 verses
29.4% of corpus
Repeated two-word clausulae in CroALa/R:
6,489 clausulae
18,807 verses
19% of corpus
Repeated two-word clausulae in PdILL:
44,148 clausulae
153,428 verses
35% of corpus
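
The "% of corpus" figures for three- and two-word endings can be read as the number of verses sharing a repeated clausula, divided by all verses in the collection. The following Python sketch recomputes them under that assumption, and also gives the mean number of verses per repeated clausula. The table layout and function names are illustrative only and are not taken from the project's XQuery scripts.

```python
# Total verses per collection, from the "Sources / corpora" section.
TOTAL_VERSES = {"CroALa": 246_658, "CroALa/R": 97_162, "PdILL": 438_484}

# (words in clausula, collection) -> (repeated clausulae, verses involved),
# copied from the three-word and two-word lists above.
REPEATED = {
    (3, "CroALa"): (6_132, 14_695),
    (3, "CroALa/R"): (1_604, 3_539),
    (3, "PdILL"): (17_685, 42_272),
    (2, "CroALa"): (21_186, 72_546),
    (2, "CroALa/R"): (6_489, 18_807),
    (2, "PdILL"): (44_148, 153_428),
}


def coverage(n_words, corpus):
    """Percentage of the collection's verses that end in a repeated clausula."""
    _, verses = REPEATED[(n_words, corpus)]
    return 100 * verses / TOTAL_VERSES[corpus]


def mean_repetitions(n_words, corpus):
    """Average number of verses sharing one repeated clausula."""
    clausulae, verses = REPEATED[(n_words, corpus)]
    return verses / clausulae


for (n_words, corpus) in REPEATED:
    print(
        f"{n_words}-word, {corpus}: "
        f"{coverage(n_words, corpus):.1f}% of verses, "
        f"{mean_repetitions(n_words, corpus):.2f} verses per clausula"
    )
```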

One-word endings

Repeated single-word clausulae in CroALa:
16,629 clausulae
232,729 verses
94% of corpus
Repeated single-word clausulae in CroALa/R:
9,628 clausulae
88,920 verses
91% of corpus
Repeated single-word clausulae in PdILL:
22,853 clausulae
416,906 verses
95% of corpus

Repeated clausulae in the corpora

The most often repeated clausulae in each collection, as well as the longest repeated clausulae, are reported on two web pages: top 10 and longest.

Findings

Three-word clausulae repeated in both sets

Ordered by repetition counts (descending).

CroALa -- PdILL

10 most frequent in CroALa -- 8 matches in PdILL
100 most frequent in CroALa -- 64 matches in PdILL
1,000 most frequent in CroALa -- 309 matches in PdILL
1,184 most frequent in CroALa (repetition count > 2) -- 351 matches in PdILL: report
repeated twice in CroALa -- 391 matches in PdILL: report
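
Counts of the form "k most frequent in A, m matches in B" can be produced by ranking the clausulae of one collection by repetition count and testing each against the other collection. Below is a minimal Python sketch of that idea, not the XQuery actually used. The function name and the toy clausulae are invented for illustration and do not come from either corpus.

```python
from collections import Counter


def top_k_matches(counts_a, clausulae_b, k):
    """Count how many of the k most repeated clausulae in A also occur in B."""
    top = [clausula for clausula, _ in counts_a.most_common(k)]
    return sum(1 for clausula in top if clausula in clausulae_b)


# Toy data (hypothetical): repetition counts in collection A ...
counts_a = Counter({
    "primum secundum tertium": 9,
    "quartum quintum sextum": 7,
    "septimum octavum nonum": 4,
    "decimum undecimum duodecimum": 2,
})
# ... and the set of clausulae attested in collection B.
clausulae_b = {"primum secundum tertium", "septimum octavum nonum"}

for k in (1, 2, 4):
    print(f"{k} most frequent in A -- {top_k_matches(counts_a, clausulae_b, k)} matches in B")
```

Ranking by a threshold (for example "repeated more than 5 times") instead of a fixed k only requires filtering `counts_a` before the membership test.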

CroALa/R -- PdILL

Three-word clausulae:

14 most frequent in CroALa/R (repeated more than 5 times) -- 8 matches in PdILL
73 most frequent in PdILL (repeated more than 10 times) -- 31 matches in CroALa/R
further 191 repeated more than twice in CroALa/R -- 69 matches in PdILL

Two-word clausulae:

Of the 144 most frequently repeated two-word clausulae in CroALa/R (n > 9), 142 have matches in PdILL
The 45 longest two-word clausulae recurring in both CroALa/R and PdILL

Single-word clausulae:

The 15 longest single-word clausulae recurring in both CroALa/R and PdILL

Unique clausulae in the Poeti d'Italia which recur in CroALa/R

See the handout (PDF), Section 1.

Recurring clausulae in the Poeti d'Italia which occur only once in CroALa/R

See the handout (PDF), Section 2.

Unique clausulae in the Poeti d'Italia which occur only once in CroALa/R

See the handout (PDF), Section 3.

Reflection and further questions

There are identical clausulae in the Poeti d'Italia in lingua latina and in the Croatiae auctores Latini. Among them are clausulae which have no parallels in ancient Latin literature (as represented by the Musisque Deoque collection). Some of the identical clausulae are identical on purpose. It is also significant that parallels in CroALa are mostly later than their counterparts in the Poeti d'Italia. Should we explain this by imitation, or by influence? I think this is somewhat risky, because such explanations can be understood too simplistically (I remember how my students think about imitation). The correspondences and recurrences are there; we should research them further.

Two more thoughts. Computational reading is often accused of being a kind of "reading without reading": merely sifting through strings and hiding behind the ostensible authority of numbers. On the other hand -- somewhat unrelated to digital humanities, but very pertinent to Neo-Latin literature -- it could be shown that, as a rule, we do not really study Neo-Latin internationally. We in Croatia will be the first to confess that we do not; we study national Neo-Latin literature. But the literature is essentially cosmopolitan!

The digital corpora, such as I have been able to explore here, could offer an answer to both problems: to the one of reading without reading, which troubles the digital humanities, and to the one of the national and the international, which causes anxiety in Neo-Latin studies. Comparing such corpora, we are invited, again and again, to move between the microscopic -- the individual clausulae and individual contexts -- and the macroscopic. I feel that it is in this to-and-fro, this up and down, that we begin to discover what writing in Latin actually meant for our authors.

Further questions

what does it mean?
which works and authors have the most hits in each collection?
are top repetitions the same in both collections?
what do we not know about the authors and texts? why does a similarity make them more interesting?
what about texts and authors which never repeat clausulae from the other collection?
can random searching be seen as a game (of serendipity)?
can we exclude accidental correspondences, chance? would two random sets of words yield similar results? (this can be checked by comparing with prose, or with words in the middle of verses)
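
The last question suggests a control: measure how often the two collections share word sequences taken from the middle of verses, and compare that with the rate for verse endings. Here is a hedged Python sketch of such a comparison. The helper names are invented, and the verses are placeholder tokens (single letters), not real Latin.

```python
def ending(words, n):
    """Last n words of a verse, or None if the verse is too short."""
    return tuple(words[-n:]) if len(words) >= n else None


def interior(words, n):
    """An n-word window near the middle that never includes the final word."""
    if len(words) < n + 1:
        return None
    start = (len(words) - n) // 2
    return tuple(words[start:start + n])


def shared_share(verses_a, verses_b, pick, n):
    """Fraction of distinct sequences from A (chosen by `pick`) also found in B."""
    set_a = {pick(v.split(), n) for v in verses_a} - {None}
    set_b = {pick(v.split(), n) for v in verses_b} - {None}
    return len(set_a & set_b) / len(set_a) if set_a else 0.0


# Placeholder verses: each letter stands for one word.
verses_a = ["a b c d e f", "g h i j k l"]
verses_b = ["m n o d e f", "p q r s t u"]

end_rate = shared_share(verses_a, verses_b, ending, 3)
mid_rate = shared_share(verses_a, verses_b, interior, 3)
print(f"shared endings: {end_rate:.2f}, shared interior windows: {mid_rate:.2f}")
```

If endings were shared no more often than interior windows on real data, the correspondences would be harder to distinguish from chance. Metrical constraints on verse endings would still need separate consideration.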

Procedures

Build databases of clausulae

1. Turn a collection of XML files into a database.
2. Extract the verse lines into another database, recording the id numbers of the l nodes in the text collection database.
3. Post-process the verses database: remove all tags, notes, critical apparatus, abbreviations etc.
4. Extract the three words at the end of each line into a database, keeping the original node id of the verse. Repeat for the last two words and for the last word of each line (creating a new database each time).
5. Find clausulae which are repeated within the clausulae corpus (database) itself; note how many times each clausula is repeated; separate the repetitions from the unique endings.
6. Repeat steps 1-5 for another collection of texts.
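
The steps above can be sketched in a few lines. The project itself works with XQuery over an XML database. The Python below is only an illustrative stand-in using the standard library, and it makes several assumptions: verses are TEI `l` elements in the TEI namespace, `note`, `app` and `abbr` children are discarded, words are lowercased, and verse ids come from `xml:id` (with a positional fallback). The sample XML uses placeholder words.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
SKIP = {"note", "app", "abbr"}  # assumption: elements dropped in post-processing


def local(tag):
    """Strip the namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def verse_text(elem):
    """Text of a verse, skipping SKIP children but keeping the text after them."""
    parts = [elem.text or ""]
    for child in elem:
        if local(child.tag) not in SKIP:
            parts.append(verse_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


def clausulae(xml_string, n):
    """Map each n-word verse ending to the ids of the verses that end with it."""
    index = defaultdict(list)
    root = ET.fromstring(xml_string)
    for i, line in enumerate(root.iter(TEI_NS + "l")):
        words = re.findall(r"\w+", verse_text(line).lower())
        if len(words) >= n:
            index[" ".join(words[-n:])].append(line.get(XML_ID, f"l{i}"))
    return index


def split_repeated(index):
    """Separate clausulae attested more than once from unique endings."""
    repeated = {c: ids for c, ids in index.items() if len(ids) > 1}
    unique = {c: ids for c, ids in index.items() if len(ids) == 1}
    return repeated, unique


SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><lg>
<l xml:id="v1">alpha beta gamma delta<note>gloss</note></l>
<l xml:id="v2">epsilon Beta gamma delta.</l>
<l xml:id="v3">zeta eta theta iota</l>
</lg></body></text></TEI>"""

repeated, unique = split_repeated(clausulae(SAMPLE, 3))
print(repeated)
print(unique)
```

Calling `clausulae` with `n` set to 2 or 1 yields the two-word and single-word tables. Running it on a second collection corresponds to the final step.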

Query one set of clausulae with strings from the other set

Using sets of clausulae (repeated and unique) from one collection, search for matches in sets of clausulae from the other collection
Report matches, compare counts of repetitions
Study contexts of matching clausulae
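
Once each collection has an index from clausula to verse ids, the cross-collection query reduces to intersecting the keys and reporting the counts and ids side by side. The following Python sketch shows this under those assumptions. The index structure, function name, clausulae and ids are all hypothetical, and this is not the project's XQuery code.

```python
def match_clausulae(index_a, index_b):
    """Return (clausula, count in A, count in B, ids in A, ids in B) for shared clausulae.

    Rows are ordered by repetition count in A (descending), then alphabetically.
    """
    shared = index_a.keys() & index_b.keys()
    rows = [
        (c, len(index_a[c]), len(index_b[c]), index_a[c], index_b[c])
        for c in shared
    ]
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Hypothetical indexes: clausula -> ids of the verses ending with it.
index_a = {
    "primum secundum tertium": ["a.12", "a.98", "a.301"],
    "quartum quintum sextum": ["a.7"],
    "septimum octavum nonum": ["a.44", "a.45"],
}
index_b = {
    "primum secundum tertium": ["b.5"],
    "septimum octavum nonum": ["b.19", "b.20", "b.88"],
    "decimum undecimum duodecimum": ["b.2"],
}

matches = match_clausulae(index_a, index_b)
for clausula, n_a, n_b, ids_a, ids_b in matches:
    print(f"{clausula}: {n_a} in A {ids_a}, {n_b} in B {ids_b}")
```

The verse ids in each row are what make the third step possible: they point back to the passages whose contexts need to be read.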

Further research

Repeat the procedure for one- and two-word clausulae
Repeat for verse beginnings, for strings of words inside the verse line

Results

Links to queries and reports

Clausulae repetitae in CroALa. A (first) list of repeated clausulae in the Croatiae auctores Latini collection of neo-Latin texts.

Sources and materials

CroALa XML files on Bitbucket (CC-BY)
XQueries and other scripts on Bitbucket (CC-BY)

Footnotes

1: Mastandrea, Paolo and Tessarolo, Luigi. De fine versus: repertorio di clausole ricorrenti nella poesia dattilica latina dalle origini a Sidonio Apollinare. Hildesheim; Zürich [etc.]: Olms-Weidmann, 1993.

2: XQuery to count words: count(for $e in //*:l//text() return ft:tokenize($e))