perseusdl / catalog_data
MODS and MADS data for the Perseus Catalog
We already have opp-lat1, then there is also a catalog_pending record.
Are the works in these two records actually the same?
In response to a request from Greg and Monica Berti, I'm putting Epidoc versions of Aristophanes into Perseids, and we need new CTS URNs for the Epidoc editions and translations.
The Epidoc versions will be based upon the following:
Birds tlg0019.tlg006.perseus-grc1 & tlg0019.tlg006.perseus-eng1
Clouds tlg0019.tlg003.perseus-grc1 & tlg0019.tlg003.perseus-eng1
Ecclesiazusae tlg0019.tlg010.perseus-grc1 & tlg0019.tlg010.perseus-eng1
Lysistrata tlg0019.tlg007.perseus-grc1 & tlg0019.tlg007.perseus-eng1
@balmas and @AlisonBabeu The type attribute of the CTS URN identifier appears both ways. Which is preferable? The new records have "cts-urn" and the older ones have "ctsurn".
When we pick one I can run a method to standardize all the records.
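A standardizing pass over the records could look roughly like the sketch below. It is not the method referred to above, only an illustration: the function name is hypothetical, and "ctsurn" is assumed as the preferred spelling purely for the example (pass "cts-urn" instead if that is what gets picked).

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
# Assumption for illustration only: "ctsurn" wins. Swap if "cts-urn" is chosen.
PREFERRED = "ctsurn"
VARIANTS = {"cts-urn", "ctsurn"}


def standardize_identifier_types(xml_text, preferred=PREFERRED):
    """Return (rewritten_xml, number_of_identifiers_changed) for one MODS record."""
    # Keep the "mods:" prefix when the tree is serialized again.
    ET.register_namespace("mods", MODS_NS)
    root = ET.fromstring(xml_text)
    changed = 0
    for ident in root.iter("{%s}identifier" % MODS_NS):
        current = ident.get("type")
        # Only touch the two known spellings; other identifier types are left alone.
        if current in VARIANTS and current != preferred:
            ident.set("type", preferred)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

Returning the change count makes it easy to log which files were actually modified before committing anything.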
A search in the catalog for Naevius, Gnaeus reveals two authors: http://catalog.perseus.org/?utf8=✓&utf8=✓&search_field=author&q=Naevius. I had thought this was due to the existence of a duplicate authority record (https://github.com/PerseusDL/catalog_data/blob/master/mads/PrimaryAuthors/G/Gnaeus%20Naevius/n85-356413.xml.mads.xml), but it seems something stranger is going on. This author has six individual works attributed to him that for some reason have been split between the two author records in the catalog: this record (http://catalog.perseus.org/catalog/Mlccnn85356413Naevi) lists four works, while this record (http://catalog.perseus.org/catalog/Mstoa0206Naevi) lists two.
Bibliographic information for the phi edition of Calpurnius Flaccus' Declamations:
http://latin.packhum.org/author/1100
Author: Calpurnius Flaccus
Year: 1978
Title: Calpurnii Flacci declamationum excerpta / edidit Lennart Håkanson
Publisher: Stuttgart, Teubner
Series: Bibliotheca scriptorum Graecorum et Romanorum Teubneriana
ISBN: 3519011301
Thanks!
http://catalog.perseus.org/?utf8=%E2%9C%93&id=Atlg2403Persa&search_field=author&q=Persaeus
One of the authority records will need to be deleted and its TLG identifier added to the other.
There are two "records" for Persaeus: this Stoic author was also the attributed author of a fragmentary history with a separate TLG identifier, which needs to be added to the main authority record.
I'm creating an epidoc edition of urn:cts:phi1020.phi003 (based on urn:cts:phi1020.phi003.perseus-grc1) for editing in Perseids. We need a new MODS file for this.
For some reason this version https://github.com/PerseusDL/catalog_data/tree/master/mods/latinLit/phi0959/phi002/perseus-eng1 is in the Google Multiversions table and in the repository but did not make it into the catalog.
We should add the SAWS texts, which already have CTS urns defined in the greekLit namespace and a CTS TextInventory, to the Perseus Catalog.
http://www.ancientwisdoms.ac.uk/citations/api?request=GetCapabilities&inv=Inventory.xml
We need a new edition-level MODS file for the Epidoc version of the Athenaeus Kaibel edition. This digital edition is derived from the current P4 version http://data.perseus.org/texts/urn:cts:greekLit:tlg0008.tlg001.perseus-grc2 (it seems that maybe we're missing a catalog record for this too though...).
There is an issue with the way the individual Dialogi of Seneca are displaying in the catalog. The system has "correctly" aggregated these works by their individual STOA numbers; for example, see the work record for the De Brevitate Vitae (http://catalog.perseus.org/catalog/urn:cts:latinLit:stoa0255.stoa004, data at https://github.com/PerseusDL/catalog_data/tree/master/mods/latinLit/stoa0255/stoa004). But it has also created a top-level work record for the Dialogi under phi 1017.12 (catalog view: http://catalog.perseus.org/catalog/urn:cts:latinLit:phi1017.phi012, files at https://github.com/PerseusDL/catalog_data/tree/master/mods/latinLit/phi1017/phi012). This has led to the creation of duplicate edition entries for the individual Dialogi under the PHI identifier. Additionally, this means that the full list of Dialogi does not appear as separate works under the authority record for Seneca: http://catalog.perseus.org/catalog/urn:cts:latinLit:phi1017.
Title of work: De Rebus Bellicis
Author: Anonymi Auctoris De Rebus Bellicis
Edition: unknown text edition
Electronic resource: University of Oxford Text Archive - http://ota.ahds.ac.uk/desc/0309
The OTA page has a download link to a ZIP file with a TEI header document, but the text body is in a plain-text format. It is licensed as Creative Commons Attribution Non-commercial 3.0 and will be the version of the text used for conversion to CTS TEI XML, with as much of the OTA header intact as possible. I will attach the OTA header in a separate comment, as well as WorldCat entries in another comment on the issue.
@srdee has made (or will make) Epidoc versions of the Perseus Thucydides translations (tlg0003.tlg001.perseus-eng1, tlg0003.tlg001.perseus-eng2 and tlg0003.tlg001.perseus-eng3). We need new MODS files for these versions so they can be assigned new URNs.
For some reason none of the three works for the author Censorinus are displaying in the catalog. See http://catalog.perseus.org/catalog/urn:cts:latinLit:stoa0084.stoa001, http://catalog.perseus.org/catalog/urn:cts:latinLit:stoa0084.stoa002, and http://catalog.perseus.org/catalog/urn:cts:latinLit:stoa0084.stoa003. All of the metadata seems correct and the information is in the CITE Tables, so I'm not sure what is going on. These records also do not contain an empty series element, which I know had caused this display issue in the past.
We need a new edition-level MODS file for the Epidoc version of the Athenaeus Yonge translation. This digital edition is derived from the current P4 version http://data.perseus.org/texts/urn:cts:greekLit:tlg0008.tlg001.perseus-eng1
Can we merge the records and correct the CITE tables for phi2806.phi002 and stoa0168.stoa001b? They are the same work but the records are slightly different.
I've already updated Justinian's author row in the CITE table to include the phi id as an alternate id (didn't have it before).
I've created an Epidoc version of tlg0007.tlg0012.perseus-grc1, named provisionally tlg0007.tlg0012.perseus-grc1, for inclusion in Perseids. We need it officially added to the catalog.
Spaces before parentheses are missing from this page: http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg2200
GitHub source-link: https://github.com/PerseusDL/catalog_data/tree/master/mods/greekLit/tlg2200
I assigned the wrong TLG (1864 instead of 0697) to Alexander Polyhistor, so I will need to update the authority record https://github.com/PerseusDL/catalog_data/blob/master/mads/PrimaryAuthors/A/Alexander%20Polyhistor/nr90028764.madsxml.xml as well as fix the corresponding MODS record and redirect the CTS-URN https://github.com/PerseusDL/catalog_data/tree/master/mods/greekLit/tlg1864/tlg001.
The list of works under Aristotle (http://catalog.perseus.org/catalog/urn:cite:perseus:author.204) includes many works that for some reason display only the Greek title and support search only by the Greek title. This seems strange because the MODS records have the Latin titles in them, and the Latin titles are present in the Atom feeds (http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg0086.tlg002).
The CITE Table records for these works do not display the Latin titles.
We need a new edition-level MODS file for the Epidoc version of the Athenaeus GULICK edition. This digital edition will be derived from the current P4 version http://data.perseus.org/texts/urn:cts:greekLit:tlg0008.tlg001.perseus-grc1
I'm wondering about possibly tweaking how the translations are created automatically, largely due to the discovery of a large number of incorrectly created translations.
For example, a recently ingested MODS record for Apuleius (https://github.com/PerseusDL/catalog_pending/blob/e59fdd0200b186c167fc3afd7117ff23f40184a4/mods/Apuleius/apuleius.%28LesBellesLettres-Beaujeu%281973%29.OpusculesPhilosophiques.mods.xml) has objectPart="text" language values of both "lat" and "grc" for the host text, since one work in the volume is in Greek. Even though all of the other constituent records for this volume included only a Latin language element, Greek translations were also automatically assigned to these works, presumably because of the inclusion of Greek in the host volume information (http://catalog.perseus.org/?f[exp_language][]=grc&q=%22Beaujeu%2C+Jean%22&search_field=editor&utf8=%E2%9C%93).
I'm wondering if we can have a default where, if there isn't a language encoded within a constituent record, the system flags it as an error so that I have to look at the file, which would hopefully eliminate the creation of false translations.
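As a rough sketch of what such a check might look like (this is not how the ingest currently works), the function below lists constituent records that carry no language term of their own. It assumes constituents are encoded as mods:relatedItem type="constituent" with a mods:language/mods:languageTerm child; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"mods": "http://www.loc.gov/mods/v3"}


def constituents_missing_language(xml_text):
    """Return a label for every constituent relatedItem with no language term."""
    root = ET.fromstring(xml_text)
    flagged = []
    constituents = root.iterfind(".//mods:relatedItem[@type='constituent']", NS)
    for index, item in enumerate(constituents, start=1):
        # Only look inside the constituent itself, never at the host volume,
        # so host-level languages cannot leak into the constituent.
        terms = [
            term.text.strip()
            for term in item.iterfind(".//mods:language/mods:languageTerm", NS)
            if term.text and term.text.strip()
        ]
        if not terms:
            title = item.findtext("mods:titleInfo/mods:title", default="", namespaces=NS)
            flagged.append(title.strip() or "constituent #%d" % index)
    return flagged
```

An ingest step could refuse to create any translation for a record where this list is non-empty and report the labels for manual review instead.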
There's a typo in this page: http://catalog.perseus.org/catalog/urn:cts:latinLit:stoa0203
It should be OCTAVIUS.
While the abbreviated epigram titles are now showing up for catalog records, I'm wondering if the edition level rather than the work level is the appropriate place for this information to display, because not all of the information is displaying. In the catalog display, the abbreviated epigram titles from just the first edition record are used as those for the entire work record.
For example, for the epigrams of Simonides, http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg0261.tlg003 shows abbreviated titles from the first edition record but none of the others. Even when you click on the individual editions (http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg0261.tlg003.opp-grc5), where there are a number of abbreviated titles in the ATOM feed, such as
<mods:titleInfo type="abbreviated"><mods:title>AG 6.2, AG 6.50, AG 6.52, AG 6.197, AG 6.212-217</mods:title></mods:titleInfo>
<mods:titleInfo type="abbreviated"><mods:title>AP 6.2, AP 6.50, AP 6.52, AP 6.197, AP 6.212-217</mods:title></mods:titleInfo>
they are not showing up in the edition record.
We currently have 3 separate versions for the Perseus edition of this work (Diodorus Siculus, Historical Library), tlg0060.tlg001.perseus-grc1 through tlg0060.tlg001.perseus-grc3, but I think they are really only one version, right?
@srdee in pull request PerseusDL/canonical#137 created an Epidoc version of urn:cts:greekLit:tlg0013.tlg002.perseus-grc1 as urn:cts:greekLit:tlg0013.tlg002.perseus-grc2. We need to add this version to the catalog via a MODS record.
https://github.com/PerseusDL/catalog_data/tree/master/mads/PrimaryAuthors/B/Bacchius%20Senex
The authority record listed above is not appearing in the catalog although the author does appear in the GoogleFusion authors table. In addition, for some reason the associated work record https://github.com/PerseusDL/catalog_data/blob/master/mods/greekLit/tlg2136/tlg001/opp-grc1/tlg2136.tlg001.opp-grc1.mods1.xml has been assigned to Simonides. (http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg2136.tlg001)
The authority record for Simonides has some very strange issues going on (http://catalog.perseus.org/catalog/Mtlg0261Simon): three incorrect identifiers have been assigned to this record (stoa0027b, tlg2136b, tlg4150), although they don't show up in the GitHub authority record for Simonides (https://github.com/PerseusDL/catalog_data/blob/master/mads/PrimaryAuthors/S/Simonides/n85-298996.xml.mads.xml).
The short question for this is: Would it be possible to go ahead and add the tlg id associated with the work attributed to this author as a cts urn in this MADS https://github.com/PerseusDL/catalog_data/blob/master/mads/PrimaryAuthors/C/Ctesicles%20Historicus/viaf34843526.mads.xml?
The longer explanation is: I read the note that there is dispute about there potentially being two authors, or if they are the same person, but as it stands, we already have Ctesicles labeled as the author in the MODS record. This creates a semi-issue where we have two textgroups for Ctesicles, tlg2171 and VIAF34843526 where the VIAF one is tied to a MADS and the tlg is not. That is all perfectly kosher, but the only thing is that the MADS file won't appear in the atom feed for tlg2171 because of the different id so things are slightly less nice and connected. I also have the atom builder look by name to double check these sorts of situations, but that doesn't work here since the authority names are slightly different, "Ctesicles Historicus 3./4. Jh" and "Ctesicles Historicus 3./4. Jh. n. Chr."
Per conversation with Anna, this issue is being created to list several areas where the data in the catalog records (ATOM) is not being displayed or indexed, when such data might be useful.
Add support for the display of pages that were encoded as lists. For example, in record
http://data.perseus.org/catalog/urn:cts:greekLit:tlg0126.tlg001.opp-grc2/atom, there were epigrams on multiple pages, so the extent was encoded as
<mods:extent unit="pages">
<mods:list>156-157, 174-175</mods:list>
</mods:extent>
Add the abbreviated titles into the expression-level record and possibly make them searchable/indexable. This would be particularly useful for the epigrammatists and fragmentary authors. For example, in the above record, the following information was included:
<mods:titleInfo type="abbreviated">
<mods:title>AG 5.58, AG 5.59, AG 5.98</mods:title>
</mods:titleInfo>
<mods:titleInfo type="abbreviated">
<mods:title>AP 5.58, AP 5.59, AP 5.98</mods:title>
</mods:titleInfo>
And for the fragmentary historians, further information was also included in the displayLabel attribute,
for example in record http://data.perseus.org/catalog/urn:cts:greekLit:tlg0116.tlg002/atom:
<mods:titleInfo type="abbreviated" displayLabel="FHG">
<mods:title>FHG 4: 279-284</mods:title>
</mods:titleInfo>
<mods:titleInfo type="abbreviated" displayLabel="FGrH">
<mods:title>FGrH 685</mods:title>
</mods:titleInfo>
As a side note, some fragmentary historians had these titles included in the work title, it seems this data may have been drawn in from the Authors-Abbreviations-Editions spreadsheet, but is uneven.
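To make the indexing request concrete, here is a minimal sketch of pulling the abbreviated titles out of a record so each citation could be indexed separately. The function name is hypothetical, and splitting on commas is an assumption based on the examples above, not a documented encoding rule.

```python
import xml.etree.ElementTree as ET

NS = {"mods": "http://www.loc.gov/mods/v3"}


def abbreviated_titles(xml_text):
    """Return (displayLabel, citation) pairs for every abbreviated titleInfo."""
    root = ET.fromstring(xml_text)
    pairs = []
    for info in root.iterfind(".//mods:titleInfo[@type='abbreviated']", NS):
        label = info.get("displayLabel")  # None when the attribute is absent
        text = info.findtext("mods:title", default="", namespaces=NS)
        # Assumption: several citations in one title are comma-separated.
        for citation in text.split(","):
            citation = citation.strip()
            if citation:
                pairs.append((label, citation))
    return pairs
```

Run per edition record, this would keep each edition's citations separate rather than reusing those of the first edition for the whole work.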
There are a number of MODS records that, although they successfully made it from catalog_pending into catalog_data, are for some reason not being displayed in the online catalog. This seems to apply in particular to Perseus Epidoc editions (see #24) and editions of the CSEL, such as the editions of Tertullian, where I had included a CTS-URN in the record in catalog_pending. The new versions nonetheless made it into the CITE Tables.
For example, the MODS records referenced above, found here (https://github.com/PerseusDL/catalog_data/blob/master/mods/latinLit/phi1035/phi001/perseus-lat2/phi1035.phi001.perseus-lat2.mods1.xml) and here (https://github.com/PerseusDL/catalog_data/blob/e0dc08189a36e493f23f0c9044c5bea864ec9aad/mods/latinLit/stoa0275/stoa030/opp-lat2/stoa0275.stoa030.opp-lat2.mods1.xml), have two CTS-URNs. In the first example,
<mods:identifier type="ctsurn">urn:cts:latinLit:phi1035.phi001.perseus-lat2</mods:identifier>
<mods:identifier type="ctsurn">urn:cts:latinLit:phi1035.phi001.perseus-lat2</mods:identifier>
and in the second
<mods:identifier type="ctsurn">urn:cts:latinLit:stoa0275.stoa030</mods:identifier>
<mods:identifier type="ctsurn">urn:cts:latinLit:stoa0275.stoa030.opp-lat2</mods:identifier>
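One way to spot records like these in bulk is a small audit that reports repeated URNs and URNs that are not at the version level. This is only a sketch: the function names are hypothetical, and whether either pattern is what actually blocks display in the catalog is an open question, not something this code establishes.

```python
def cts_urn_level(urn):
    """Classify a CTS URN by the depth of its work component."""
    parts = urn.split(":")
    # Expected shape: urn:cts:<namespace>:<textgroup>[.<work>[.<version>]]
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError("not a namespaced CTS URN: %r" % urn)
    depth = len(parts[3].split("."))
    return {1: "textgroup", 2: "work", 3: "version"}.get(depth, "exemplar")


def audit_identifiers(urns):
    """Report repeated URNs and URNs that are not version-level."""
    seen = set()
    duplicates = []
    non_version = []
    for urn in urns:
        if urn in seen:
            duplicates.append(urn)
        seen.add(urn)
        if cts_urn_level(urn) != "version":
            non_version.append(urn)
    return {"duplicates": duplicates, "non_version": non_version}
```

For the two records above, the first would be reported under "duplicates" and the second under "non_version".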
We need a new edition-level MODS file for the Epidoc version of Thucydides Histories, which is based on urn:cts:greekLit:tlg0003.tlg001.perseus-grc1
(Like issue #9 for Athenaeus)
Duplicate file needs to be deleted.
mods/latinLit/phi1515/phi001/opp-lat3/phi1515.phi001.opp-lat3.mods1.xml
It appears that tlg2200.tlg005 has had all of its constituent records broken out into individual records. Can we get rid of these three files if that is the case as they are redundant?
We need a new catalog entry for the Epidoc edition of urn:cts:latinLit:phi1345.phi001.perseus-lat1
(see related request perseids-project/perseids_docs#186 )
(please move if this is not a catalog data issue)
http://catalog.perseus.org/catalog/urn:cts:latinLit:phi1351
http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg7000
-------- Original Message --------
Subject: GetCapabilities XML errors
Resent-From: [email protected]
Date: Sun, 30 Nov 2014 19:46:34 -0500
To: Perseus DL Webmaster [email protected]
I have found the following errors in XML markup returned by GetCapabilities:
Textgroup latinLit:phi1351 groupname is coded as “C. Suetonius Tranquillus” but is actually Tacitus;
Textgroup greekLit:tlg7000 groupname is coded as “greekLit:tlg7000” but is actually Greek Anthology.
The facet search http://catalog.perseus.org/?f[year_facet][]=0 revealed an interesting issue as to how the system seems to be pulling "Edition or Translation Year Published". 218 records have the year 0. In some cases, such as with the new Arabic records I think this has been caused by the presence of this in the MODS record:
In some cases the Arabic record has: <mods:dateIssued>-</mods:dateIssued>
Nonetheless, for some classical editions, I am wondering if it is failing to list a date due to different date encoding structures in the XML:
For example, http://data.perseus.org/catalog/urn:cts:latinLit:phi0134.phi004.opp-eng1/atom has the following date encoding:
<mods:dateModified>1900</mods:dateModified>
<mods:dateCreated>1894</mods:dateCreated>
and http://data.perseus.org/catalog/urn:cts:latinLit:phi0914.phi001.opp-eng4/atom has the following date encoding:
<mods:copyrightDate>1959</mods:copyrightDate>
<mods:dateModified>1967</mods:dateModified>
There don't seem to be any issues with records that use dateIssued for the date encoding:
<mods:dateIssued>1878</mods:dateIssued>
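If the year really is taken from dateIssued alone, a fallback across the other date elements might avoid the year-0 records. The sketch below illustrates that idea; the function name is hypothetical, and the preference order (dateIssued, then dateCreated, copyrightDate, dateModified) is an assumption for illustration, not the catalog's actual rule.

```python
import re
import xml.etree.ElementTree as ET

NS = {"mods": "http://www.loc.gov/mods/v3"}
# Assumed preference order; adjust to whatever the catalog should treat as canonical.
DATE_ELEMENTS = ("dateIssued", "dateCreated", "copyrightDate", "dateModified")
YEAR = re.compile(r"\d{4}")


def publication_year(xml_text):
    """Return the first four-digit year found in the preferred date elements, or None."""
    root = ET.fromstring(xml_text)
    for name in DATE_ELEMENTS:
        for element in root.iterfind(".//mods:%s" % name, NS):
            match = YEAR.search(element.text or "")
            if match:
                return int(match.group())
    # No usable date: return None rather than defaulting to year 0.
    return None
```

Returning None for a placeholder such as "-" lets the display layer show "unknown" instead of filing the record under the year 0 facet.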
In PerseusDL/canonical#112, Stella is adding Greek and Arabic versions of Aristotle Ars Poetica to the Perseus canonical repo in order to make them available for editing in Perseids.
In order to facilitate this, she assigned the following version identifiers to them:
urn:cts:tlg0086.tlg034.digicorpus-ara1
urn:cts:tlg0086.tlg034.digicorpus-grc1
Interestingly, these texts come to us by way of us, from the joint Harvard/Tufts GrecoArabic corpus, which is now apparently published at http://digicorpus.net4media-typo3.de/
The source files for these are also here https://bitbucket.org/grecoarabiccorpus/opensourcetexts
I think if we are adding them to Perseids and the Perseus canonical repo, we should also have a catalog entry for them. Not sure our process yet supports predefined version identifiers though.
The work page for this (http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg4082.tlg001) for some reason does not display the metadata for the work. The atom feed does contain the relevant metadata.
The author Agatharchides (http://catalog.perseus.org/catalog/Atlg0667Agath) had the wrong identifier for two works in the A-A-E spreadsheet: 0667.001 instead of 0067.001, and 0667.004 instead of 0067.004. This will need to be fixed and the corresponding URNs redirected.
The author with TLG 0667 (Marcellinus) has no works and thus no authority record in the catalog as yet, but this will still affect one catalog record http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg0667.tlg004 (which will need a new CTS-URN of urn:cts:greekLit:tlg0067.tlg004 ).
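The redirect targets for a corrected textgroup can be computed mechanically. The following is a minimal sketch with hypothetical function names. It assumes URNs of the form urn:cts:namespace:textgroup.work with no passage component, and it only builds the old-to-new mapping; it does not touch the catalog or the CITE tables.

```python
def retarget_textgroup(urn, old_group, new_group):
    """Return the URN with its textgroup swapped, or unchanged if it does not match."""
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError("not a namespaced CTS URN: %r" % urn)
    components = parts[3].split(".")
    if components[0] != old_group:
        return urn
    components[0] = new_group
    parts[3] = ".".join(components)
    return ":".join(parts)


def build_redirects(urns, old_group, new_group):
    """Map each affected URN to its corrected form, skipping unaffected ones."""
    redirects = {}
    for urn in urns:
        target = retarget_textgroup(urn, old_group, new_group)
        if target != urn:
            redirects[urn] = target
    return redirects
```

For the record mentioned above, retarget_textgroup("urn:cts:greekLit:tlg0667.tlg004", "tlg0667", "tlg0067") gives urn:cts:greekLit:tlg0067.tlg004.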
Our correspondent (via paper mail) asks that his translation be cataloged:
http://books.google.com/books/about/De_Libro_Medicinali_Quinti_Sereni_Sammon.html?id=LDDAGwAACAAJ
Should be a version of the work created in #6
See also PerseusDL/canonical#49
The CTS-URN urn:cts:greekLit:tlg0019.tlg001.opp-grc1 is invalid and needs to be redirected once the corrected file is published under a new URN for Bacchylides, likely urn:cts:greekLit:tlg0199.tlg001.opp-grc7 from looking at the CITE Collection tables.
We need a new MODS file for the Epidoc edition of Prometheus Bound, which is based on urn:cts:greekLit:tlg0085.tlg003.perseus-grc1. Simona Stoyanova did the epidoc conversion.
We need a new MODS file for the Epidoc version of Homer's Iliad, based on urn:cts:greekLit:tlg0012.tlg001.perseus-grc1
See related item PerseusDL/canonical#68
We need a new catalog entry for the Epidoc edition of urn:cts:latinLit:phi1035.phi001.perseus-lat1
(see related request perseids-project/perseids_docs#186 )
The author Priscian has incorrectly been assigned the TLG textgroup 0592 http://catalog.perseus.org/catalog/urn:cts:greekLit:tlg0592. The TLG 0592 is for the author Hermogenes, and the reason it was assigned to Priscian was because one of our works for Priscian is a translation of a work by Hermogenes, and it had the following in the MODS record: 0592.001. This Textgroup will need to be unassigned and the work directed to the correct authority record at http://catalog.perseus.org/catalog/Mstoa0234aPrisc
We need a new catalog entry for the Epidoc edition of urn:cts:latinLit:phi1020.phi001.perseus-lat1
(see related request perseids-project/perseids_docs#186 )
There appears to be something odd with Aulus Licinius Archias. He seems to have two MADS records: one under PrimaryAuthors/A/Archias/n84-234520.xml.mads.xml and another under PrimaryAuthors/A/Aulus Licinius Archias, 5th Cent B.C/AulusLiciniusArchias.mads.xml.
The alt names in the second record (AulusLiciniusArchias.mads.xml) are for Zeuxis, Heracleensis, 5. Jh. v. Chr., who seems to have been a painter?
Is this an odd duplicate case?