De fine versus – a Renaissance Version
Neven Jovanović, University of Zagreb, Croatia
- Address of this page: croala.ffzg.unizg.hr/croatica-tyrolensia/collections/defineversus/
- BibSonomy links: croalaDeFineVersus
Summary
Using two collections of Renaissance Latin poetry available in digital form (the Poeti d'Italia in lingua latina, ed. Mastandrea et al., first edition 1999, c. 2.5 million words; the Croatiae auctores Latini / CroALa, ed. Jovanović et al., first edition 2009, c. 5 million words), the paper demonstrates the importance of a standardized format (TEI XML) and a Creative Commons license (which permits extensive computational manipulation). Isolating a subset of all poems written between 1400 and 1650 (c. 483,000 verses), I discover and analyse the reuse of clausulae, a long-recognized locus of poetic memory in Latin literature. Thus I intend to ask of Italian and Croatian Renaissance poems what Mastandrea and Tessarolo asked of ancient Roman poetry in 1993: to what extent do texts, authors, periods, and genres use similar clausulae, and what could that mean?
Sources / corpora
- Croatiae auctores Latini:
- 246,658 Latin verses
- 1,575,988 words (tokens)*
- CroALa / Renaissance (all Latin verses written before 1700):
- 97,162 verses (39% of all CroALa verses)
- 602,406 words / tokens (38% of all words in CroALa verses)
- Poeti d'Italia in lingua latina / PdILL (excluding texts present in CroALa as well):
- 438,484 verses (1.78 times the size of CroALa, 4.5 times the size of CroALa / R)
- 2,706,946 words / tokens (1.7 times the size of CroALa, 4.5 times the size of CroALa / R)
Three-word endings
- Repeated three-word clausulae in CroALa:
- 6,132 clausulae
- comprising 14,695 verses
- 6% of the corpus
- Unique three-word clausulae in CroALa / Renaissance (before 1700):
- 93,507 verses
- Repeated three-word clausulae in CroALa / Renaissance:
- 1,604 clausulae
- 3,539 verses
- 3.6% of the Renaissance corpus
- Unique three-word clausulae in PdILL:
- 396,041 verses
- Repeated three-word clausulae in PdILL:
- 17,685 clausulae
- 42,272 verses
- 9.6% of the corpus
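The percentages in this section can be recomputed from the verse totals given under "Sources / corpora". The following is a minimal arithmetic check in Python; the variable and function names are mine and are not taken from the author's scripts.

```python
# Verse totals from "Sources / corpora" and repeated-verse counts from
# "Three-word endings"; all figures are copied from the lists above.
corpus_verses = {"CroALa": 246_658, "CroALa/R": 97_162, "PdILL": 438_484}
repeated_verses = {"CroALa": 14_695, "CroALa/R": 3_539, "PdILL": 42_272}


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    name: share(repeated_verses[name], corpus_verses[name])
    for name in corpus_verses
}

for name, pct in shares.items():
    print(f"{name}: {pct}% of verses end in a repeated three-word clausula")
```

The same function applies to the two-word and one-word figures if the corresponding verse counts are substituted.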
Two-word endings
- Repeated two-word clausulae in CroALa:
- 21,186 clausulae
- 72,546 verses
- 29.4% of corpus
- Repeated two-word clausulae in CroALa/R:
- 6,489 clausulae
- 18,807 verses
- 19% of corpus
- Repeated two-word clausulae in PdILL:
- 44,148 clausulae
- 153,428 verses
- 35% of corpus
One-word endings
- Repeated single-word clausulae in CroALa:
- 16,629 clausulae
- 232,729 verses
- 94% of corpus
- Repeated single-word clausulae in CroALa/R:
- 9,628 clausulae
- 88,920 verses
- 91% of corpus
- Repeated single-word clausulae in PdILL:
- 22,853 clausulae
- 416,906 verses
- 95% of corpus
Repeated clausulae in the corpora
The most often repeated clausulae in each collection, as well as the longest repeated clausulae, are reported on two web pages: top 10 and longest.
Findings
Three-word clausulae repeated in both sets
Ordered by repetition counts (descending).
CroALa -- PdILL
- 10 most frequent in CroALa - 8 matches in PdILL
- 100 most frequent in CroALa - 64 matches in PdILL
- 1000 most frequent in CroALa - 309 matches in PdILL
- 1184 most frequent in CroALa (repetition count > 2) - 351 matches in PdILL: report
- repeated twice in CroALa - 391 matches in PdILL: report
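A comparison of this kind ("the k most frequent clausulae in one collection, how many recur in the other") can be sketched as follows. This is an illustration in Python, not the author's XQuery; the function name `top_k_matches` and the toy data are hypothetical and do not come from either corpus.

```python
from collections import Counter


def top_k_matches(counts_a: Counter, clausulae_b: set[str], k: int) -> list[str]:
    """Return those of the k most frequent clausulae in A that also occur in B."""
    top = [clausula for clausula, _ in counts_a.most_common(k)]
    return [clausula for clausula in top if clausula in clausulae_b]


# Toy data, invented for illustration only.
croala_counts = Counter({"a b c": 5, "d e f": 4, "g h i": 3, "j k l": 2})
pdill_clausulae = {"a b c", "g h i", "x y z"}

for k in (1, 2, 3):
    hits = top_k_matches(croala_counts, pdill_clausulae, k)
    print(f"{k} most frequent in A: {len(hits)} matches in B")
```

When several clausulae share the same repetition count, a cut-off at exactly k is somewhat arbitrary; a threshold on the count itself (as in "repetition count > 2" above) avoids that problem.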
CroALa/R -- PdILL
Three-word clausulae:
- 14 most frequent in CroALa/R (repeated more than 5 times) -- 8 matches in PdILL
- 73 most frequent in PdILL (repeated more than 10 times) -- 31 matches in CroALa/R
- further 191 repeated more than twice in CroALa/R -- 69 matches in PdILL
Two-word clausulae:
- 144 most frequently repeated two-word clausulae in CroALa/R (n > 9) -- 142 matches in PdILL
- First 45 longest two-word clausulae recurring in CroALa/R and in PdILL
Single word clausulae:
Unique clausulae in the Poeti d'Italia which recur in CroALa/R
See the handout (PDF), Section 1.
Recurring clausulae in the Poeti d'Italia which occur only once in CroALa/R
See the handout (PDF), Section 2.
Unique clausulae in the Poeti d'Italia which occur only once in CroALa/R
See the handout (PDF), Section 3.
Reflection and further questions
There are identical clausulae in the Poeti d'Italia in lingua latina and in the Croatiae auctores Latini. Among them are clausulae which have no parallels in ancient Latin literature (as represented by the Musisque Deoque collection). Some of the identical clausulae are identical on purpose. It is also significant that the parallels in CroALa are mostly later than their counterparts in the Poeti d'Italia. Should we explain this by imitation, or by influence? I think this is a bit risky, because such explanations can be understood too simplistically (I remember how my students think about imitation). The correspondences and recurrences are there; we should research them further.
Two more thoughts. The accusation is often put forward that computational reading is a kind of "reading without reading", just sifting through strings and hiding behind the ostentative authority of numbers. On the other hand -- somewhat unrelated to digital humanities, but very pertinent to Neo-Latin literature -- it could be shown that, as a rule, we don't really do international study of Neo-Latin. We in Croatia will be the first to confess that we don't; we study national Neo-Latin literature. But the literature is essentially cosmopolitan!
The digital corpora, such as I have been able to explore here, could offer an answer to both problems: to the one on reading without reading, which troubles the digital humanities, and to the one on national and international, which causes anxiety in Neo-Latin studies. Comparing such corpora, we are invited, again and again, to move between the microscopic -- the individual clausulae and individual contexts -- and the macroscopic. I feel that it is in this to-and-fro, this up and down, that we begin to discover what writing in Latin actually meant for our authors.
Further questions
- what does it mean?
- which works and authors have the most hits in each collection?
- are top repetitions the same in both collections?
- what don't we know about authors, texts? why does a similarity make it more interesting?
- what about texts and authors which never repeat clausulae from the other collection?
- can random searching be seen as a game (of serendipity)?
- can we exclude accidental correspondences, chance? would two random sets of words yield similar results? (this can be checked by comparing with prose, or with words in the middle of verses)
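The last question suggests a baseline: compare how often verse endings are shared between two collections with how often word groups from the middle of the verse are shared. The following Python sketch shows one possible way to set this up. The helper names, the choice of a centred window, and the toy verses are assumptions of mine; the sketch ignores metre, which constrains verse endings more than verse interiors.

```python
from typing import Optional


def ngram_at(verse: str, n: int, where: str) -> Optional[str]:
    """Return n consecutive words from the end or from the middle of a verse."""
    words = verse.lower().split()
    if where == "end":
        if len(words) < n:
            return None
        start = len(words) - n
    elif where == "middle":
        # Require at least one extra word so the window never reaches the line end.
        if len(words) <= n:
            return None
        start = (len(words) - n) // 2
    else:
        raise ValueError("where must be 'end' or 'middle'")
    return " ".join(words[start:start + n])


def ngram_set(verses: list[str], n: int, where: str) -> set[str]:
    """Collect the distinct n-grams found at the given position."""
    grams = (ngram_at(verse, n, where) for verse in verses)
    return {gram for gram in grams if gram is not None}


def shared_fraction(verses_a: list[str], verses_b: list[str], n: int, where: str) -> float:
    """Fraction of distinct n-grams in A (at the given position) that also occur in B."""
    grams_a = ngram_set(verses_a, n, where)
    grams_b = ngram_set(verses_b, n, where)
    return len(grams_a & grams_b) / len(grams_a) if grams_a else 0.0


# Toy verses made of placeholder tokens, invented for illustration only.
corpus_a = ["a b c d e", "f g h d e", "i j k l m"]
corpus_b = ["n o p d e", "q r s t u", "v j k w x"]

end_rate = shared_fraction(corpus_a, corpus_b, 2, "end")
middle_rate = shared_fraction(corpus_a, corpus_b, 2, "middle")
print(f"shared at verse end: {end_rate:.2f}, shared mid-verse: {middle_rate:.2f}")
```

If, on real data, the rate for verse endings is not clearly higher than the rate for verse interiors, shared clausulae alone would be weak evidence of poetic memory.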
Procedures
Build databases of clausulae
- Turn a collection of XML files into a database
- Extract lines with verses into another database, noting also the id numbers of l nodes in the text collection database
- Do some post-processing on the verses database: remove all tags, notes, critical apparatus, abbreviations etc.
- Extract three words at the end of the line into a database, keeping the original node id of the verse. Repeat for two words at the end of the line, and for the last word in the line (create a new database each time)
- Find clausulae which are repeated in the clausulae corpus (database) itself; note how many times a clausula is repeated; separate the repetitions from the unique endings
- Repeat steps 1-5 for another collection of texts
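The steps above can be sketched in a few lines. The author's own pipeline uses XML databases and XQuery (see "Sources and materials"); what follows is a simplified in-memory illustration in Python, not that implementation. Several details are assumptions of mine: the list of elements to drop (`note`, `rdg`, `abbr`), the tokenization by `\w+` on lower-cased text, the running verse index that stands in for the database node id, and the sample TEI fragment with placeholder words.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumption: elements whose content is not part of the verse text proper.
DROP = {"note", "rdg", "abbr"}


def local_name(el: ET.Element) -> str:
    """Strip the namespace from a tag, so that any-namespace l elements match."""
    return el.tag.rsplit("}", 1)[-1]


def visible_text(el: ET.Element) -> str:
    """Concatenate text, skipping dropped elements but keeping the text after them."""
    parts = [el.text or ""]
    for child in el:
        if local_name(child) not in DROP:
            parts.append(visible_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


def verses(xml_string: str):
    """Yield (index, words) for every l element; the index stands in for a node id."""
    root = ET.fromstring(xml_string)
    lines = (el for el in root.iter() if local_name(el) == "l")
    for idx, el in enumerate(lines, start=1):
        yield idx, re.findall(r"\w+", visible_text(el).lower())


def clausulae(xml_string: str, n: int) -> dict[str, list[int]]:
    """Map each n-word verse ending to the indices of the verses it closes."""
    index = defaultdict(list)
    for idx, words in verses(xml_string):
        if len(words) >= n:
            index[" ".join(words[-n:])].append(idx)
    return dict(index)


# Invented sample with placeholder words, not taken from either corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><lg>
<l>alpha beta gamma delta,</l>
<l>epsilon <note>editorial remark</note>beta gamma delta.</l>
<l>zeta eta theta iota</l>
</lg></body></text></TEI>"""

index = clausulae(SAMPLE, 3)
repeated = {c: ids for c, ids in index.items() if len(ids) > 1}
unique = {c: ids for c, ids in index.items() if len(ids) == 1}
print("repeated:", repeated)
print("unique:", unique)
```

Calling `clausulae(SAMPLE, 2)` or `clausulae(SAMPLE, 1)` gives the two-word and one-word endings. Keeping the verse indices alongside each clausula is what makes it possible to return to the context later.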
Query one set of clausulae with strings from the other set
- Using sets of clausulae (repeated and unique) from one collection, search for matches in sets of clausulae from the other collection
- Report matches, compare counts of repetitions
- Study contexts of matching clausulae
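The matching and reporting steps might look like the following Python sketch. It assumes that each collection has already been reduced to a table of clausulae with repetition counts. The function names and the toy counts are hypothetical, and the labels mirror the categories used in the Findings section (unique or recurring in each collection).

```python
from collections import Counter


def cross_report(counts_a: Counter, counts_b: Counter) -> list[tuple[str, int, int]]:
    """List clausulae present in both collections, with their counts in A and in B."""
    shared = counts_a.keys() & counts_b.keys()
    rows = [(clausula, counts_a[clausula], counts_b[clausula]) for clausula in shared]
    # Most repeated in A first, then most repeated in B, then alphabetically.
    return sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))


def classify(count_a: int, count_b: int) -> str:
    """Label a shared clausula as unique or recurring in each collection."""
    label_a = "recurring" if count_a > 1 else "unique"
    label_b = "recurring" if count_b > 1 else "unique"
    return f"{label_a} in A, {label_b} in B"


# Toy counts, invented for illustration only.
counts_a = Counter({"x y z": 3, "p q r": 1, "m n o": 1})
counts_b = Counter({"x y z": 7, "p q r": 2, "m n o": 1, "s t u": 4})

report = cross_report(counts_a, counts_b)
for clausula, in_a, in_b in report:
    print(f"{clausula}: {in_a} in A, {in_b} in B ({classify(in_a, in_b)})")
```

The third step, studying contexts, is not automated here; it depends on following the stored verse ids back into the texts.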
Further research
- Repeat the procedure for one- and two-word clausulae
- Repeat for verse beginnings, for strings of words inside the verse line
Results
Links to queries and reports
- Clausulae repetitae in CroALa. A (first) list of repeated clausulae in the Croatiae auctores Latini collection of neo-Latin texts.
Sources and materials
- CroALa XML files on Bitbucket (CC-BY)
- XQueries and other scripts on Bitbucket (CC-BY)
Footnotes
1: Mastandrea, Paolo and Tessarolo, Luigi. De fine versus: repertorio di clausole ricorrenti nella poesia dattilica latina dalle origini a Sidonio Apollinare. Hildesheim; Zürich [etc.]: Olms-Weidmann, 1993.
2: XQuery to count words: count(for $e in //*:l//text() return ft:tokenize($e))