CroALa

approaching an unknown collection

Neven Jovanović / neven.jovanovic@ffzg.hr
University of Zagreb
Venice, 8 September 2016

Address of this page:
croala.ffzg.unizg.hr/2016-croala-aiucd/2016-croala-aiucd.html

The plan

What?

How? (At random, precisely, topically, comparatively)

To what end?

What?

Croatiae auctores Latini

Languages of culture used by Croatians in the past:
Latin, Italian, German.

Bibliographic research: 1263 authors, 4871 works, 976-1984

Currently in CroALa: 5.7 million words, 467 documents, bibliographic data.

First edition 2009
ISBN: 978-953-175-356-2

croala.ffzg.unizg.hr (PhiloLogic 3)
PhiloLogic 4
BaseX
GitHub

License: CC-BY.

A tale of two sets

How?

A "black box"?

Our experience of the text is constrained
by print (in a book),
by a concordancer (PhiloLogic),
by a markup model (TEI XML).

Each framework is at the same time a help and an impediment.

Flight from literature?

croala.ffzg.unizg.hr/basex/quaelibet

Documentation (Bitbucket)

How to point to a segment of text?

CITE/CTS

CITE Architecture and Canonical Text Service

Christopher Blackwell, Neel Smith (2010)

A specification for technology independent, machine-actionable citation of scholarly resources.

eXist / XQuery, AngularJS / jQuery, RDF, MySQL
Homer Multitext, Perseus Digital Library.

A CTS URN

From CroALa

urn:cts:croala:tubero.commentarii.croala-loci:body1.div9.div3.p2.s8.placeName26
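
The parts of such a URN can be separated mechanically. What follows is a minimal sketch in Python, not CroALa's own tooling: it assumes the general CTS layout (`urn:cts:` namespace, then a dot-separated work hierarchy, then an optional dot-separated passage), and it ignores passage ranges and subreferences. The names `CtsUrn` and `parse_cts_urn` are invented for this illustration.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    """Components of a CTS URN (illustrative names, not an official API)."""
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]
    passage: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into namespace, work hierarchy and passage.

    Assumption: ranges ("-") and subreferences ("@") in the passage
    component are not handled by this sketch.
    """
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace = parts[2]
    # Work hierarchy: textgroup.work.version
    work_parts = parts[3].split(".")
    textgroup = work_parts[0]
    work = work_parts[1] if len(work_parts) > 1 else None
    version = work_parts[2] if len(work_parts) > 2 else None
    # The passage component is optional
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    return CtsUrn(namespace, textgroup, work, version, passage)


urn = ("urn:cts:croala:tubero.commentarii.croala-loci:"
       "body1.div9.div3.p2.s8.placeName26")
parsed = parse_cts_urn(urn)
print(parsed.textgroup, parsed.work, parsed.version)
print(parsed.passage.split("."))
```

Read this way, the URN names a textgroup (`tubero`), a work (`commentarii`), a version (`croala-loci`), and a path through the citation hierarchy down to a single place name.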

Insights from the CITE/CTS model

Multiple editions of the same text, each for a specific, individual purpose (text segmented into paragraphs, sentences, words and punctuation, metrical units; text marked for morphology, topics, named entities), each with easily reachable locations.

The system is reproducible; it is tractable by machines; it is open to others.

Insights from the CITE/CTS model

Multiple notes point at locations:
this location refers to a PLACE during a PERIOD.

Notes define relationships of points in different editions:
this NOUN in ACCUSATIVE is a PLACE NAME referring to a FICTIONAL PLACE.

The system is reproducible; it is tractable by machines ("how many locations refer to places? in which parts of the text?"); it may be opened to others.
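
A question such as "how many locations refer to places, and in which parts of the text?" can be answered by counting over a list of URNs. The Python sketch below is only an illustration under two assumptions: that a place reference is recognizable by a final passage level beginning with `placeName` (as in the CroALa example URN), and that the first two passage levels identify a "part" of the text. Apart from the first URN, the sample data are hypothetical, and `count_place_refs` is a name invented here.

```python
from collections import Counter


def count_place_refs(urns, depth=2):
    """Count place references per text section.

    Assumptions: the last passage level of a place reference starts with
    "placeName"; the first `depth` passage levels name the section.
    """
    counts = Counter()
    for urn in urns:
        passage = urn.rsplit(":", 1)[-1]   # text after the last colon
        levels = passage.split(".")
        if levels[-1].startswith("placeName"):
            counts[".".join(levels[:depth])] += 1
    return counts


base = "urn:cts:croala:tubero.commentarii.croala-loci:"
sample = [
    base + "body1.div9.div3.p2.s8.placeName26",   # from the slides
    base + "body1.div9.div4.p1.s2.placeName27",   # hypothetical
    base + "body1.div10.div1.p1.s1.placeName28",  # hypothetical
    base + "body1.div10.div1.p1.s1.persName3",    # hypothetical, not a place
]

place_counts = count_place_refs(sample)
for section, n in place_counts.most_common():
    print(section, n)
```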

Cf. a test set of CroALa CTS URNs

Cur lingua rerum index sit

(Marko Marulić, Repertorium, Problemata Aristotelis)

Approaching CroALa through indices

A list of persons in Croatian Latin school drama

A list of place references in CroALa as a CITE/CTS system
(an ongoing collaboration with Pelagios)

Ricorditi, lettor, se mai...

Approaching CroALa through comparison

Some interesting clausulae (verse endings) from the Poeti d'Italia in lingua latina that also occur in CroALa: Clausulae trium verborum

Counting Latin authors in Croatia and in Tyrol:
Croatica et Tyrolensia
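
The comparison of clausulae mentioned above can be sketched as a set intersection. This Python sketch is not the procedure actually used for Clausulae trium verborum. It assumes that a clausula is simply the last three words of a verse, lowercased and stripped of punctuation, with no lemmatization or orthographic normalization. The function names are invented for this illustration, and the second sample line is a made-up placeholder, not a verse from either collection.

```python
import re


def clausula(verse, n=3):
    """Return the last n words of a verse as a lowercase tuple, or None."""
    words = re.findall(r"[^\W\d_]+", verse.lower())  # runs of letters only
    return tuple(words[-n:]) if len(words) >= n else None


def shared_clausulae(verses_a, verses_b, n=3):
    """Verse endings of n words that occur in both collections."""
    ends_a = {clausula(v, n) for v in verses_a}
    ends_b = {clausula(v, n) for v in verses_b}
    return (ends_a & ends_b) - {None}


# Toy input: Aeneid 1.1 and an invented placeholder line.
collection_a = ["Arma virumque cano, Troiae qui primus ab oris"]
collection_b = ["Qui venit ad fines nostros primus ab oris"]

common = shared_clausulae(collection_a, collection_b)
print(common)
```

Exact string matching of this kind will miss variant spellings, so a real comparison would probably need some normalization first.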

To what end?

(Conclusions)

In the struggles with material agency [our] plans and goals too are at stake and liable to revision. And thus the intentional character of human agency has a further aspect of temporal emergence, being reconfigured itself in the real time of practice, as well as a further aspect of intertwining with material agency, being reciprocally redefined with the contours of material agency in tuning.

Andrew Pickering, The Mangle of Practice: Time, Agency, and Science (1995)

What had started as an ad hoc collection of TEI-encoded texts grew over time into a forest; the forest needed a map, and there was a thing that could be developed into a map; we needed specific parts of the forest, and there were things that could be used to extract parts of it; we realized that similar plants were growing in other forests, and there were things that could be used to compare plants.

What had started as a collection of texts turned into something that has to be "manipulated" (accessed) at the atomic and molecular level. These levels, and the processes of accessing them, turn out to be intriguing in themselves, beyond engineering (when there is a reference to the urbes Italiae, it seems to refer to...).

To be continued