openstemmata / database
An open database of stemmata
License: Creative Commons Attribution Share Alike 4.0 International
Old French:
Chansons de geste (JB)
chivalric romances (Arthurian, antique)?
Old Occitan
chansons de geste (JB)
?
Medieval Latin
Old Norse
Riddarasögur?
native epics (edda, etc.)?
Middle High German
Epik
Roman
Old and Middle English
Middle Welsh
translations of Old French
Mabinogi etc.
Medieval Italian
cantari
franco-italian chansons de
We need a way to produce a list of the stemmata in the database, so that the same text is not added several times. This raises a question: do we accept different stemmata for a single text?
I think we might be missing one metadata field:
We may also need to distinguish between deliberately partial (stemma of redaction A, or stemma without codices descripti) and involuntarily partial (3 mss found after the date), but that would be hard.
Or do we also need to add a field like current/deprecated stemma? (But I can imagine the discussions.)
To be done at some point, to avoid typos and incorrect values.
Old and Middle High German work in progress
Stemma found and now encoding:
Checking if stemma exists in:
We may have a problem with the workflow, @GusRiva, see:
https://github.com/OpenStemmata/database/runs/3746467477
Is it a protected branch issue? Perhaps something to change in the settings?
At some point, it would be nice to test the validity of the files with every pull request.
As for metadata, I was wondering whether it is not necessary to indicate the presence of textual contamination (when declared by editors), even if it is not represented in pedigrees. For instance, the case of Cligés (1993) by Gregory-Luttrell.
Also a lot of stemmata to add from there. I'll do this one.
@Benedetta-Salvati and I recently merged an unoriented stemma (#138). To encode the absence of orientation, I added dir=none to Benedetta's encoding. Is that enough? Should we add yet another metadata field (I'm not enthusiastic about that), refuse unoriented trees (in a way, they are not stemmata), or do something else?
Should we decide on a way to encode uncertainty in the DOT format, for the various expressions of it that we encounter from time to time in stemmata? Or should we discard this information?
Hi @Jean-Baptiste-Camps @gabays !
I created a new branch "transform" to test the automatic conversion to TEI. This branch should not be merged; it's just for testing. We can create a new branch and pull request when we actually want to merge all this into main.
What you can find:
The TEI files are probably still missing things, or some things are wrong. If you have time at some point to take a look at the results, let me know what I missed or did wrong 😄
Also, I still need to move the metadata.txt into the teiHeader!
Questions about the file names:
If you don't know how to make a PR but still want to submit, you can do it by following this template.
And copy / paste the dot content here:
To facilitate extraction, we should probably add a field stating whether the stemma is complete (all known tradition) or partial (we could differentiate: partial - sub-branch of the full tradition; partial - descripti removed; partial - derivatives omitted; or source text omitted, for the case of translations, prosifications, etc.).
(following the discussion at our session at EADH2021).
VIAF IDs do not work for two reasons:
Do I understand correctly that we want to organise the data into this folder structure?
database
└── data
    ├── fro
    └── gmh
I have a function to convert from .gv to .graphml in Python. The resulting .graphml can be opened in Gephi and Cytoscape (I had to make some minor tweaks for it to work in Gephi).
How should we do these conversions? Should I create all the files locally, push them, and then update them every once in a while? Or should we program an automatic trigger in GitHub? I don't have a lot of experience with that...
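For discussion, here is a minimal sketch of what such a .gv to .graphml conversion could look like using only the Python standard library. It is not the function mentioned above: the name gv_to_graphml is hypothetical, and it assumes the DOT file only contains simple edge statements of the form a -> b; with optional attribute lists (attributes and node declarations are ignored).

```python
import re
import xml.etree.ElementTree as ET

# Assumption: one simple edge statement per line, e.g. `a -> b;` or `a -- b [dir=none];`
EDGE_RE = re.compile(r'^\s*"?([^"\s\[;]+)"?\s*(?:->|--)\s*"?([^"\s\[;]+)"?')


def gv_to_graphml(dot_text: str) -> str:
    """Convert the edge statements of a simple DOT graph to a GraphML string."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", id="stemma", edgedefault="directed")

    nodes = []  # node ids in order of first appearance
    edges = []  # (source, target) pairs
    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if match is None:
            continue  # skip `digraph {`, `}`, node declarations, comments
        source, target = match.groups()
        for node in (source, target):
            if node not in nodes:
                nodes.append(node)
        edges.append((source, target))

    for node in nodes:
        ET.SubElement(graph, "node", id=node)
    for index, (source, target) in enumerate(edges):
        ET.SubElement(graph, "edge", id=f"e{index}", source=source, target=target)

    return ET.tostring(root, encoding="unicode")
```

A script like this could be run locally or from an automatic trigger; which of the two is preferable is exactly the open question above.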
This is just a reminder for me to check the relationship between ms. Π and N in Isokrates
In Geschichte der Textüberlieferung der antiken und mittelalterlichen Literatur, each chapter has a different author. I would tend to consider each individual chapter as the publication for the purposes of the metadata (publicationType, publicationTitle) and maybe the whole book as publicationSeries, but I'm not sure. Any thoughts on this?
Here is one of the metadata files:
https://github.com/OpenStemmata/database/blob/erbse/data/lat/Erbse_1961_Lykophron-Alexandra/metadata.txt
I just added the two stemmata for Aliscans, and was thinking that it would be nice to have the stemma for the Willehalm (and other works of Wolfram) !
Are we functional on that aspect, or not yet? It doesn't look so for now (but perhaps that is normal),
e.g. https://github.com/OpenStemmata/database/tree/main/data/gmh/Holz_1897_Laurin
Check the relationship between mss. E, L and F in Pindar, Epinikia.
We have to produce clear guidelines, with examples, explanation and stuff.
[edit]: I've seen there is something on the dev branch. I think we could, and should, do better, with pictures, etc. The better the guidelines are, the fewer questions you get… and they should be in HTML, with the website.
Should we recommend good practices regarding the Work place of origin/Date fields of the metadata? If we want to use them, it is better to tidy this up. Is Burgundy/Dijon in France or in Burgundy?
We should probably perform schema-based validation of all TEI files as part of the tests.
Hello both,
I'm in the process of writing tests for the database, and checking the validity of the input format. I have a basic question:
What do you think ?
In #50, we have two stemmata by Trovato that include dashed arrows going back in time (?). What does this mean?
See,
data/ita/Nicolodi_2003_Romani-Turco/stemma.gv
data/ita/Trovato_2004_Sannazaro-Arcadia
What does it mean, and how should we encode it?
Alphabetic review done up to: Aliscans.
In this stemma one line is crossed with two perpendicular dashes that mean there are multiple intermediary steps.
Is this meaningful information? I feel we can just omit this, but I would love to hear what you think about it.
Shouldn't we choose the English name for regions, etc. when they exist?
I should finish the conversion of my homemade database, and also add the few stemmata shown in my PhD (mostly a reminder to myself).
I'm currently encoding the stemma of a classical text (The History of...). The convention for such stemmata is to take chronology into account in the hierarchy: the newest mss are at the bottom, the oldest at the top. Manuscripts in between are organised in layers, each layer representing a transmission phase. It is impossible to have two mss of different ages on the same layer. I would like to know whether I can include this information, which I find very important.
Several volumes and many stemmata. It would be good to batch add them as well!
Ruling and example in #20
Yes, I think we don't want to transcribe the image printed, but to translate the stemma that was meant to be expressed by that image; if this makes sense.
Normalise regions fields
Normalise date fields
See #21
I think we should normalise the date (and include that in Guidelines). I'd be in favor of either a single date (if dated)
1201
or a range
1201-1300 (for "13th c.").
Should we limit ourselves to a single image format? And if so, which one?
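As an illustration of how the tests could check this convention, here is a small sketch. The function name parse_date_field is hypothetical, and it assumes years are written with three or four digits and that a range uses a plain hyphen.

```python
import re

# Assumption: a date field is either `YYYY` or `YYYY-YYYY` (3 or 4 digits per year).
DATE_RE = re.compile(r"(\d{3,4})(?:-(\d{3,4}))?")


def parse_date_field(value: str) -> tuple[int, int]:
    """Return (start, end) years; a single year yields start == end."""
    match = DATE_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a normalised date: {value!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    if end < start:
        raise ValueError(f"range ends before it starts: {value!r}")
    return start, end
```

With this, 1201 and 1201-1300 are accepted, while a free-text value such as "13th c." is rejected and would have to be normalised by hand.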
I'd say "yes" and "png", but I'm open to discussion.
La Mort le roi Artu : roman du XIIIe siècle / éd. par Jean Frappier, 1936, [BEC 8M342, 1Fb].
In #40, we have a stemma from Lagomarsini_2018_GuironCourtois2 that includes the curious label V1(+Fi).
@GusRiva :
I wonder if this is what is meant in the stemma by "V1(+Fi)", because normally two witnesses that have a common ancestor are drawn in full. Maybe there is something else happening here?
Good question. I do not remember anything in the main text regarding this…
Ok, F(i) is an Ashburnham, and V1 is cut. Could Fi be a ripped-off part of V1?
I found a more complete version of both stemmata (but at a prior date) in there.
Both mss come from Western Italy, at the end of the 13th century.
If I read the table well, they both have only a short portion of the text.
I suggest opening an issue, and leaving as is for the moment.
Star
and not start
If the source of the stemma is not an edition, how can I fill in the "Edition place", etc.? It seems you're not talking about the edition but about the secondary source (edition, article, monograph…)
I have a stemma of the Milione (Polo's Devisement) and the source for the stemma is an article: Burgio, Eugenio ; Eusebi, Mario; "Per una nuova edizione del Milione"…
Simon
Should we create an attribute for lost manuscripts?
Considering them to be hypothetical nodes would be very misleading, I think. Would it be useful to add a special attribute for them? Or should we just consider them like any extant manuscript?
If you compare the output of fro/Zufferey_2007_Alexis and Zufferey_2010_Alexis, you'll notice that if we have
a -> i[style="dashed"];
the conversion includes the 'contamination' info, but not if we have
a -> i[style=dashed];
We should either handle this type of case or check for it in the tests!
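One way to handle both spellings is to make the quotes optional when looking for the attribute. The sketch below is not the project's conversion script; the function name contamination_edges is hypothetical, and it assumes one edge statement per line with a single attribute list.

```python
import re

# An edge statement with an optional attribute list, e.g. `a -> i[style=dashed];`
EDGE_RE = re.compile(
    r'"?([^"\s\[;]+)"?\s*->\s*"?([^"\s\[;]+)"?\s*(?:\[([^\]]*)\])?'
)
# Accept both style="dashed" and style=dashed.
DASHED_RE = re.compile(r'\bstyle\s*=\s*"?dashed"?')


def contamination_edges(dot_text: str) -> list[tuple[str, str]]:
    """Return the (source, target) pairs of all dashed edges."""
    found = []
    for line in dot_text.splitlines():
        match = EDGE_RE.search(line)
        if match is None:
            continue
        source, target, attributes = match.groups()
        if DASHED_RE.search(attributes or ""):
            found.append((source, target))
    return found
```

The same pattern could be reused in the tests to flag files where the two forms are mixed.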
There are a lot of stemmata and references to stemmata compiled in https://www.degruyter.com/document/doi/10.1515/9783110684384/html.
Would be good to batch add them.
We may want to handle an optional list of witnesses, with metadata such as signature, date of origin, place, etc., in the metadata format and the associated form.
Question asked by @Lena2001 :
Should we always use & if a text has been written by two authors?
A simple way of dealing with multiple authors would be to use & or another keyword (like AND) systematically. The question is potentially a bit larger, since there are other kinds of fields with potentially multiple values. Should we repeat them instead?
Do we write Greek letters in the Latin alphabet, or use label="α"?
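If & were adopted, reading a multi-valued field would be straightforward. A minimal sketch, assuming & never occurs inside a name (the function name split_authors is hypothetical):

```python
def split_authors(value: str, separator: str = "&") -> list[str]:
    """Split a multi-author field on the separator and trim whitespace."""
    return [part.strip() for part in value.split(separator) if part.strip()]
```

For example, split_authors("Burgio & Eusebi") gives two names, and a field with a single author comes back as a one-element list, so single and multiple values can be treated the same way.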
We need to update the scripts as described here: