openstemmata / database

An open database of stemmata

License: Creative Commons Attribution Share Alike 4.0 International

R 0.88% HTML 92.85% Python 6.27%

database's Issues

Epic Poetry Stemmata Collections

Old French:
- Chansons de geste (JB)
- chivalric romances (arthurian, antique)?
Old Occitan
- chansons de geste (JB)
- ?
Medieval Latin
Old Norse
- Riddarasögur?
- native epics (edda, etc.)?
Middle High German
- Epik
- Roman
Old and Middle English
Middle Welsh
- translations of Old French
- Mabinogi etc.
Medieval Italian
- cantari
- franco-italian chansons de

List of stemmata already in the database

We need a way to produce a list of the stemmata in the database, so that the same text is not entered several times. This raises a question: do we accept different stemmata for a single text?
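A minimal sketch of how such a list could be produced, assuming the `data/<lang>/<Editor>_<Year>_<Work>` folder layout seen in paths quoted elsewhere in these issues (e.g. `data/gmh/Holz_1897_Laurin`). The function names are hypothetical, and the grouping only works if the same work is always spelled the same way in folder names.

```python
from collections import defaultdict
from pathlib import Path


def list_stemmata(data_dir):
    """Group stemma folders by (language, work).

    Assumes folders are named <Editor>_<Year>_<Work> under <data_dir>/<lang>/.
    """
    by_work = defaultdict(list)
    for folder in sorted(Path(data_dir).glob("*/*")):
        if not folder.is_dir():
            continue
        parts = folder.name.split("_", 2)
        if len(parts) != 3:
            continue  # name does not follow the assumed pattern
        _editor, _year, work = parts
        by_work[(folder.parent.name, work)].append(folder.name)
    return dict(by_work)


def works_with_several_stemmata(by_work):
    """Keep only the works for which more than one stemma is present."""
    return {key: names for key, names in by_work.items() if len(names) > 1}
```

Calling `works_with_several_stemmata(list_stemmata("data"))` would then surface the cases where the question of multiple stemmata for one text actually arises.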

Complete/partial stemma

I think we might be missing one metadata field:

complete/partial stemma

With perhaps the need to distinguish between voluntarily partial (stemma of redaction A, or stemma without codices descripti) and involuntarily partial (3 mss found after the date of the stemma), but that would be hard.

Or do we also need to add a field like current/deprecated stemma? (But I can imagine the discussions.)

Add schematron rules to check metadata values (`keywords` and `terms`)

To be done at some point, to avoid typos and incorrect values.

German

Old and Middle High German work in progress

Stemma found and now encoding:

Checking if stemma exists in:

Eneide https://katalog.ub.uni-heidelberg.de/titel/1203753
Der große Alexander https://katalog.ub.uni-heidelberg.de/titel/1206385
Ogier von Dänemark https://katalog.ub.uni-heidelberg.de/titel/65534157
Wilhelm von Österreich https://katalog.ub.uni-heidelberg.de/titel/3939340
Seifrits Alexander https://katalog.ub.uni-heidelberg.de/titel/2621922
Lancelot https://katalog.ub.uni-heidelberg.de/titel/3935449
Karl der Große und die schottischen Heiligen https://katalog.ub.uni-heidelberg.de/titel/3333067
Karl und Galie https://katalog.ub.uni-heidelberg.de/titel/3466841

Workflow error ?

We may have a problem with the workflow, @GusRiva , see:

https://github.com/OpenStemmata/database/runs/3746467477

Protected branch issue ? Something to change in the settings perhaps ?

Validity checks and tests

At some point, it would be nice to test the validity of the files with every pull request.

Unmarked contamination in not straightforward traditions

Regarding metadata, I was wondering whether we should indicate the presence of textual contamination (when declared by editors), even if it is not represented in the pedigrees. See, for instance, the case of Cligés (1993) by Gregory-Luttrell.

Trovato, Everything you Always New…

Also a lot of stemmata to add from there. I'll do this one.

Non-oriented stemma

@Benedetta-Salvati and I recently merged an unoriented stemma (#138). To encode the absence of orientation, I added dir=none to Benedetta's encoding. Is that enough? Should we add yet another metadata field (I'm not enthusiastic about that), refuse unoriented trees (in a way, they are not stemmata), or do something else?

Handling uncertainty in the data model

Should we decide on a way to encode uncertainty in the DOT format, for the various expressions of it that we encounter from time to time in stemmata? Or should we discard this information?

Automatic Conversion TEI

Hi @Jean-Baptiste-Camps @gabays !

I created a new branch "transform" to test the automatic conversion to TEI. This branch should not be merged; it's just for testing. We can create a new branch and pull request when we actually want to merge all this into main.

What you can find:

- Folder Transform with a Python file and an XML template that I use to create the basic structure of each new TEI file.
- TEI and GraphML versions of each stemma already in the database. I realized that I could also create GraphML versions with literally one line of code, so I also generate those automatically. I know we considered just providing XSLT transformations, but I think this is so much easier, and people then have files ready for any software that uses that format, like Gephi, without having to do any transformations.
- I set up an action to create/update the TEI and GraphML files changed in each new commit. It's not working right now, but I have already tested this in my development environment and it worked well, so I will probably figure out soon why it is not working here.

Probably the TEI are still missing things or some things are wrong. If you have time at some point to take a look at the results, let me know what I missed or did wrong 😄

Also, I still need to move the metadata.txt into the teiHeader!

Questions about the file names:

- I quite like ".tei.xml" for the file extension and not just ".xml". Any preferences?
- Should we use the folder name as the name for the files? For example: Segre_1971_Roland.tei.xml. That is how @Jean-Baptiste-Camps did it in the Alexis example. I think that is a good idea. For simplicity I have not done it yet, but could happily implement it. The same goes for GraphML: Segre_1971_Roland.graphml
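If the folder name is used as the base name, the output names can be derived mechanically. The helper below is a hypothetical sketch of that convention, not the code used in the transform script.

```python
from pathlib import Path


def derived_names(stemma_dir):
    """Build the proposed TEI and GraphML file names from a stemma folder."""
    base = Path(stemma_dir).name  # e.g. "Segre_1971_Roland"
    return {"tei": f"{base}.tei.xml", "graphml": f"{base}.graphml"}
```

For the folder `examples/Segre_1971_Roland`, this gives the two names proposed above.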

TEMPLATE for STEMMA SUBMISSION

If you don't know how to make a PR but still want to submit, you can do it by following this template.

image: append your image file here
metadata: append your metadata file here

And copy / paste the dot content here:

Complete / partial stemma

To facilitate extraction, we should probably add a field to state whether the stemma is complete (all the known tradition) or partial. For partial stemmata we could differentiate: sub-branch of the full tradition, descripti removed, derivatives omitted, or source text omitted (in the case of translations, prosifications, etc.).

(following the discussion at our session at EADH2021).

VIAF

VIAF IDs do not work for two reasons:

- For people, there can be multiple IDs in the VIAF (only ISNI provides unique IDs).
- For works, there is no ID in the VIAF, only for editions. There is no point in adding an ID to an edition; what we would need is an ID for a work.

Folder structure

Do I understand correctly that we want to organise the data into this folder structure?

database
    └─ data
        ├── fro
        └── gmh

Conversion GraphML

I have the function to convert from .gv to .graphml in Python. The resulting .graphml can be opened in Gephi and Cytoscape (I had to do some minor tweaks for it to work in Gephi)

How should we do these conversions? Should I create all the files locally and push them, and then update every once in a while? Or should we set up an automatic trigger in GitHub? I don't have a lot of experience with that...
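For readers who want to see what such a conversion involves, here is a standard-library-only sketch. It is not the function mentioned above. It assumes one edge statement per line, unquoted or simply quoted node names, and no comments. A real conversion would more likely rely on a proper DOT parser. The function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def parse_edges(dot_text):
    """Extract (source, target) pairs from simple `a -> b` statements."""
    edges = []
    for statement in re.split(r"[;\n]", dot_text):
        statement = re.sub(r"\[.*?\]", "", statement)  # drop attribute lists
        if "->" not in statement:
            continue
        nodes = [n.strip().strip('"') for n in statement.split("->")]
        edges.extend(zip(nodes, nodes[1:]))  # handles chains: a -> b -> c
    return edges


def dot_to_graphml(dot_text):
    """Serialise the edges of a simple DOT digraph as a GraphML string."""
    edges = parse_edges(dot_text)
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(root, "graph", id="stemma", edgedefault="directed")
    seen = []
    for edge in edges:
        for node in edge:
            if node not in seen:
                seen.append(node)
                ET.SubElement(graph, "node", id=node)
    for source, target in edges:
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")
```

Note that this sketch drops all attributes (colours, dashed style) and any node that appears in no edge, so it only illustrates the shape of the output.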

Isokrates

This is just a reminder for me to check the relationship between ms. Π and N in Isokrates

Book Chapter

In Geschichte der Textüberlieferung der antiken und mittelalterlichen Literatur, each chapter has a different author. I would tend to consider each individual chapter as the publication for the purposes of the metadata (publicationType, publicationTitle) and maybe the whole book as publicationSeries, but I'm not sure. Any thoughts on this?

Here is one of the metadata files:
https://github.com/OpenStemmata/database/blob/erbse/data/lat/Erbse_1961_Lykophron-Alexandra/metadata.txt

Wolfram von Eschenbach ?

I just added the two stemmata for Aliscans, and was thinking that it would be nice to have the stemma for the Willehalm (and other works of Wolfram) !

Post merge workflow committing ?

Are we functional on that aspect or not yet? It doesn't look so for now (but perhaps that is normal),
e.g. https://github.com/OpenStemmata/database/tree/main/data/gmh/Holz_1897_Laurin

Edition number problem

It is impossible to give a Roman numeral for the edition page…
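One possible way to accept both kinds of page number is sketched below. It assumes the page is stored as a plain string. The function name is hypothetical and does not correspond to any existing check in the repository.

```python
import re

# Standard Roman numerals, lower or upper case, at least one character.
ROMAN = re.compile(
    r"(?=[ivxlcdm])m*(?:cm|cd|d?c{0,3})(?:xc|xl|l?x{0,3})(?:ix|iv|v?i{0,3})",
    re.IGNORECASE,
)


def is_valid_page(value):
    """Accept an Arabic page number ("247") or a Roman one ("xviii")."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        return True
    return ROMAN.fullmatch(value) is not None
```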

Pindar

Check the relationship between mss. E, L and F in Pindar, Epinikia.

Producing guidelines

We have to produce clear guidelines, with examples, explanation and stuff.

Do you agree?
Do you want me to work on a draft? HTML or Markdown?

[edit]: I've seen there is something on the dev branch. I think we could, and should, do better, with pictures, etc. The better the guidelines are, the fewer questions you get… and it should be in HTML, with the website.

Work place of origin/Date

Should we recommend good practices regarding the work place of origin/date fields of the metadata? If we want to use them, it is better to tidy this up. Is Burgundy/Dijon in France or in Burgundy?

Chaotic locations

@GusRiva Is it my fault if the Commedia (here) or the Suma (here) are pushed outside the correct folders? They should be in data/ita, no? I see your name in the commits.

Validate XML files

We should probably perform schema-based validation of all TEI files as part of the tests.

Testing and canonical data model

Hello both,
I'm in the process of writing tests for the database, and checking the validity of the input format. I have a basic question:

- Should I bother to write code that tests Graphviz files and our metadata format?
- Or should I instead write code to convert it to a canonical format such as XML/TEI, and then use schema validation for that?

What do you think ?
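As an illustration of the first option, a direct check of the metadata format could start from something like the sketch below. It assumes that every non-empty line of `metadata.txt` has the form `key : "value"`, as in the `workOrigPlace : "Normandie?"` line quoted in another issue. The function name is hypothetical, and no list of required fields is assumed.

```python
import re

# One field per line: an identifier, a colon, and a double-quoted value.
FIELD_LINE = re.compile(r'^\s*(\w+)\s*:\s*"(.*)"\s*$')


def parse_metadata(text):
    """Return (fields, errors); repeated keys are collected in a list."""
    fields, errors = {}, []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        match = FIELD_LINE.match(line)
        if match is None:
            errors.append(f"line {number}: cannot parse {line!r}")
            continue
        key, value = match.groups()
        fields.setdefault(key, []).append(value)
    return fields, errors
```

A test could then simply assert that `errors` is empty for every metadata file, before any conversion to TEI is attempted.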

Old Occitan

This issue documents the work in progress on new Old Occitan stemmata.

Right now we are focusing on epic and arthurian poetry.

Nicolodi_2003_Romani-Turco and Trovato_2004_Sannazaro-Arcadia

In #50, we have two stemmata by Trovato that include dashed arrows going back in time (?). What does this mean?
See:

data/ita/Nicolodi_2003_Romani-Turco/stemma.gv
data/ita/Trovato_2004_Sannazaro-Arcadia

What does it mean and how to encode it ?

Old French Epics

Stemmata to encode

Online

Zum Handschriftenverhältnis der chanson de Renaut de Montauban / Erich Korte, 1914, Greifswald : H. Adler, 1914,
Aliscans. Kritischer text von Erich Wienbeck, Wilhelm Hartnacke, Paul Rasch, Halle a. S., Niemeyer, 1903, xlviii + 544 p., https://archive.org/details/aliscanskritisch00alisuoft/page/xviii/mode/2up?view=theater
(-> several trees from different authors)

In library

Le siège de Barbastre, canzone di gesta del XIII secolo. Edizione critica con saggio introduttivo, note al testo e glossario, a cura di Emilia Muratori, Bologna, Pàtron (Biblioteca di filologia romanza della Facoltà di lettere e filosofia dell'Università di Bologna, 9), 1996, 586 p.
Tyssens, Madeleine, La geste de Guillaume d'Orange dans les manuscrits cycliques, Paris, Les Belles Lettres, 1967, p. 247-264.

Systematic review

Alphabetic review done up to: Aliscans.

To check

Aye d'Avignon, chanson de geste anonyme. Édition critique par S. J. Borg, Genève, Droz (Textes littéraires français, 134), 1967, 378 p.
Aymeri de Narbonne, édité par Hélène Gallé, Paris, Champion (Les classiques français du Moyen Âge, 155), 2007, 784 p.
Compte rendu: Andrea Ghidoni, dans Revue critique de philologie romane, 11, 2010, p. 3-7.
Aliscans, publié par Claude Régnier, Paris, Champion (Les classiques français du Moyen Âge, 110-111), 1990, 2 t.
Renaut de Montauban. Édition critique du manuscrit Douce par Jacques Thomas, Genève, Droz (Textes littéraires français, 371), 1989, 807 p.

For exhaustivity

L'épisode ardennais de Renaut de Montauban. Édition synoptique des versions rimées, éd. J. Thomas, Bruges, De Tempel, 1962, 3 t.
Verelst, Philippe, Renaut de Montauban. Édition critique du ms. de Paris, B.N., fr. 764 ("R"), thèse de doctorat, Rijksuniversiteit te Gent, 1985, 1480 p.
Renaut de Montauban, deuxième fragment rimé du manuscrit de Londres, British Library, Royal 16 G II ("B"). Édition critique par Philippe Verelst, Gand, Romanica Gandensia, 1988, 101 p.
Renaut de Montauban. Édition critique du manuscrit Douce par Jacques Thomas, Genève, Droz (Textes littéraires français, 371), 1989, 807 p.
Cordella, Paola, Studio sulla tradizione del "Renaut de Montauban" con edizione critica parziale del manoscritto BNF, fr. 766 (vv. 1-6000), thèse de doctorat, Università degli studi di Sassari, Sassari; École pratique des hautes études, Paris, 2006, 322 + [55] p. [theses.fr]

One line as multiple intermediary steps

In this stemma one line is crossed with two perpendicular dashes that mean there are multiple intermediary steps.

https://github.com/OpenStemmata/database/blob/a62aef7682c24b0beee71e5e969ed290fad10b22/data/grc/Erbse_1961_Xenophon-Opuscula/stemma.gv

Is this meaningful information? I feel we can just omit this, but I would love to hear what you think about it.

Language?

Shouldn't we choose the English name for regions, etc. when they exist?

database/examples/Segre_1971_Roland/metadata.txt

Line 11 in 7e989fc

workOrigPlace : "Normandie?"

Personal database + PhD

I should finish the conversion of my homemade database, and also add the few stemmata shown in my PhD (mostly a reminder to myself).

Hierarchy

I'm currently encoding the stemma of a classical text (The History of...). The convention for such stemmata is to take chronology into account in the hierarchy: the newest mss are at the bottom, the oldest at the top. Manuscripts in between are organised by layers, and each layer represents a transmission phase. It is impossible to have two mss of different ages on the same layer. I would like to know whether I can include this information. I find it very important.

Geschichte der Textüberlieferung (1961-1964)

Several volumes and many stemmata. Would be good to batch add them as well !

small details about the DOT language

There might be problems:

- encoding things like question marks or parentheses (L, on the upper right; red circle)
- do we correct the stemma? (blue circle)

How do we encode subtleties like the situation of β2 and β3?

TODO: add to Guidelines

favour spirit over letter, when obvious.

Ruling and example in #20

Yes, I think we don't want to transcribe the image printed, but to translate the stemma that was meant to be expressed by that image; if this makes sense.

Normalise regions fields
Normalise date fields

See #21

I think we should normalise the date (and include that in the Guidelines). I'd be in favor of either a single date (if dated)
1201
or a range
1201-1300 (for "13th c.").
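A sketch of what checking this convention could look like, assuming dates are stored as strings and that only CE years are needed (ancient texts would require something more). The function names are hypothetical.

```python
import re

# Either a single year ("1201") or a range of years ("1201-1300").
DATE = re.compile(r"(\d{3,4})(?:-(\d{3,4}))?")


def parse_date(value):
    """Return (start, end) as integers; raise ValueError for other forms."""
    match = DATE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a year or a year range: {value!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    if end < start:
        raise ValueError(f"range ends before it starts: {value!r}")
    return start, end


def century_range(century):
    """Turn a century number into the proposed range, e.g. 13 -> '1201-1300'."""
    return f"{(century - 1) * 100 + 1}-{century * 100}"
```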

Image format

Should we limit to a single image format ? And if so, which one ?

I'd say "yes" and "png", but I'm open to discussion.

Frappier 1936, Mort Artu

La Mort le roi Artu: roman du XIIIe siècle / éd. par Jean Frappier, 1936, [BEC 8M342, 1Fb].

Lagomarsini_2018_GuironCourtois2: V1(+Fi)

In #40 , we have a stemma from Lagomarsini_2018_GuironCourtois2 that includes the curious label V1(+Fi).

@GusRiva :

I wonder if this is what is meant in the stemma by "V1(+Fi)", because normally two witnesses that have a common ancestor are drawn in full. Maybe there is something else happening here?

@Jean-Baptiste-Camps

Good question. I do not remember anything in the main text regarding this…
Ok, F(i) is an Ashburnham, and V1 is cut. Could Fi be a ripped off part of V1 ?
I found a more complete version of both stemmata (but at a prior date) in there.
Both mss come from Western Italy, at the end of the 13th century.
If I read the table well, they both have only a short portion of the text.
I suggest opening an issue, and leaving as is for the moment.

typo

Star and not start

database/examples/Paris_1872_Alexis/Paris_1872_Alexis.xml

Line 194 in da7e4a4

Article as a source

If the source of the stemma is an article, how can I fill in the "Edition place", etc.? It seems these fields are not really about the edition but about the secondary source (edition, article, monograph…).

I have a stemma of the Milione (Polo's Devisement) and the source for the stemma is an article: Burgio, Eugenio ; Eusebi, Mario; "Per una nuova edizione del Milione"…

Simon

Lost manuscripts

Should we create an attribute for lost manuscripts?
Considering them to be hypothetical nodes would be very misleading, I think. Would it be useful to add a special attribute for them? Or should we just consider them like any extant manuscript?

contamination not included if "" missing

If you compare the output of fro/Zufferey_2007_Alexis and Zufferey_2010_Alexis you'll notice that if we have

 a -> i[style="dashed"];

conversion includes the 'contamination' info

but not if

 a -> i[style=dashed];

We should either handle this type of case or check for it in the tests!
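A check that treats both spellings alike could look like the sketch below. It is only an illustration of the idea, not the logic of the actual conversion script. The pattern assumes unquoted node names and an attribute list on the same line as the edge.

```python
import re

# Matches `a -> i[style="dashed"]` as well as `a -> i[style=dashed]`.
DASHED_EDGE = re.compile(
    r'(\w+)\s*->\s*(\w+)\s*\[[^\]]*\bstyle\s*=\s*"?dashed"?[^\]]*\]'
)


def contamination_edges(dot_text):
    """Return the (source, target) pairs of all dashed edges."""
    return DASHED_EDGE.findall(dot_text)
```

A test could compare the number of edges found here with the number of contamination links in the generated TEI, for every stemma in the database.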

Handbook of Stemmatology

There are a lot of stemmata and references to stemmata compiled in https://www.degruyter.com/document/doi/10.1515/9783110684384/html.

Would be good to batch add them.

Handle witnesses in metadata

We may want to handle an optional list of witnesses, with metadata such as signature, date and place of origin, etc., in the metadata format and the associated form.

Repeated fields and multiple authors

Question asked by @Lena2001 :

Should we always use & if a text has been written by two authors?

A simple way of dealing with multiple authors would be to use & or another keyword (like AND) systematically. The question is potentially a bit larger, since there are other kinds of fields with potentially multiple values. Should we repeat them instead?
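If a separator is chosen, reading the field back is straightforward. The sketch below assumes `&` or an upper-case `AND` as separators, which is only one of the options under discussion. The function name is hypothetical.

```python
import re

# Split on "&" or on the upper-case keyword "AND" used as a whole word.
SEPARATOR = re.compile(r"\s*(?:&|\bAND\b)\s*")


def split_authors(value):
    """Return the list of author names contained in one field value."""
    return [name.strip() for name in SEPARATOR.split(value) if name.strip()]
```

Repeating the field instead would avoid parsing altogether, at the cost of a metadata format that allows duplicate keys.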

Greek letters

Do we write Greek letters with the Latin alphabet, or use label="α"?

database/examples/Segre_1971_Roland/stemma.gv

Line 30 in 7e989fc

omega[color="grey"]
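Both options can coexist: the node identifier stays in the Latin alphabet and a label carries the Greek letter. The sketch below shows how such labels could be generated, assuming identifiers are spelled like the Unicode letter names (`alpha`, `beta`, `omega`, ...). The function names are hypothetical. A siglum that happens to coincide with a letter name (for instance `pi`) would also receive a label, so this is not safe to apply blindly.

```python
import unicodedata


def greek_label(name):
    """Return the Greek small letter named by `name`, or None if there is none."""
    try:
        return unicodedata.lookup(f"GREEK SMALL LETTER {name.upper()}")
    except KeyError:
        return None


def node_statement(node_id):
    """Build a DOT node statement, adding a Greek label when one applies."""
    label = greek_label(node_id)
    if label is None:
        return node_id
    return f'{node_id}[label="{label}"]'
```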

Improve Workflow for Tests and Transformation

We need to update the scripts as described here:

#98 (comment)

- I will update the transformation script so that it is triggered on pushes to the main branch and can also be triggered manually, and so that it just converts everything.
- We should unify the test scripts run on pull requests into a single action. First the action should create the TEI files (without committing them). Then the tests are run.