recogito-tei's People

Contributors

arojascastro, gusriva, hdlabconicet, susannalles

recogito-tei's Issues

head

It would also be useful to add the <head> element for titles, with a @type attribute to distinguish their level (subtitle, etc.).

Agenda 15 August

Please fill in with more tasks or topics:

Review of TEI encoding

  • xml:id used in relation: #8
  • listPlace and listPerson: #10
  • @ana isn't a pointer: #13
  • <p> elements in <body>: #16
  • import and export TEI files: #18
  • something else?

Integration with publishing solutions

  • Set up export for Ed/Markdown: #14
  • What do we need to know about CETEIcean and how can we contribute?

Outreach

  • Proposal for Writing Sprint tutorial SP-EN
  • Workshop in Buenos Aires?

Pelagios Network

  • Latest news

Why are TEI elements removed in the output file?

The input file is https://github.com/hdcaicyt/Recogito-TEI/blob/master/Ruy_Diaz-La_Argentina_Manuscrita-proposal.tei.xml and the output file from Recogito is https://github.com/hdcaicyt/Recogito-TEI/blob/master/Ruy_Diaz-La_Argentina_Manuscrita-sample.tei.xml. Some structural TEI elements that are present in the input are missing from the output, and I am wondering why.

For instance:

input

<head>La Argentina</head>
      <div>
<p>Dedicatoria</p>

output

 <body><div><p>La Argentina


Dedicatoria

A don 

I think the output should contain exactly the same elements as the input, plus the TEI elements added in Recogito (persName, placeName, etc.) and the list of relationships.

If this is not possible (for any reason), then it should be stated which TEI elements are supported on input. Encoding a text is sometimes time-consuming, so no one should lose their work.
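A quick way to spot this kind of loss is to compare how often each element name occurs in the input and in the export. The following is a minimal sketch, assuming both files are well-formed XML; the function names are made up for illustration, and the two sample strings only imitate the fragments quoted above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def element_counts(xml_text):
    """Count how often each element name occurs in an XML string."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())

def lost_elements(input_xml, output_xml):
    """Return {name: number_lost} for elements that occur less often in the output."""
    before = element_counts(input_xml)
    after = element_counts(output_xml)
    # Counter subtraction keeps only positive differences.
    return dict(before - after)

# Simplified stand-ins for the input and output fragments.
source = "<body><head>La Argentina</head><div><p>Dedicatoria</p></div></body>"
export = "<body><div><p>La Argentina Dedicatoria</p></div></body>"
print(lost_elements(source, export))  # {'head': 1}
```

This only compares counts, so it would not notice an element that was moved or whose content changed; it is meant as a first check before and after a round trip through Recogito.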

Minutes 17 December 2019

Hello, here are some notes:

  • Rainer agrees to export files and report on the current state of Recogito from an interoperability point of view (if I understood correctly).
  • Pelagios Network suggests extending the Working Group until March. They have not contacted me formally yet. We decided to postpone the third report until the end of the WG.
  • Hugh Cayless says that he will explore a TEI stand-off export from Recogito that will look like JSON-LD. It should be easier and better. Deadline: February-March. If it is not implemented by March, this WG can draft a proposal for Rainer.
  • Current state of Markdown: if I understood correctly, this output is only available when exporting from TXT (not from TEI). Afterthought: any chance of reusing the OxGarage pipeline? https://oxgarage.tei-c.org/#
  • Rainer suggests using CETEIcean for minimal digital editions rather than Markdown. Susanna agreed with Rainer. Gimena also wants to collaborate on this development. Rainer suggests that this could be the content of our Third Report.
  • Antonio asks about the grant money and who is interested in using it to organize an event. No decision yet; pending.

Feel free to add actions or correct me.

listPlace and listPerson

Maybe it would be a good idea to create <listPlace> and <listPerson> elements? That way we wouldn't need to repeat the information in the attributes each time the same entity appears in the text. We could use just the @ref attribute to point to the xml:id of the element in the list. This would also help solve issue #8, by using the elements in the list instead of the elements in the text.
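To make the proposal concrete, here is a rough sketch of how a <listPlace> could be derived from the <placeName> elements already in a text. It assumes un-namespaced elements, external URIs in @ref, and an invented id scheme (place1, place2, ...); the example.org URIs are placeholders, and none of this reflects what Recogito actually exports.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def build_list_place(root):
    """Create one <place> per distinct @ref and repoint each <placeName> at it."""
    list_place = ET.Element("listPlace")
    ids_by_ref = {}
    for name in root.iter("placeName"):
        uri = name.get("ref")
        if uri is None:
            continue  # unresolved place names are left untouched
        if uri not in ids_by_ref:
            place_id = f"place{len(ids_by_ref) + 1}"  # hypothetical id scheme
            ids_by_ref[uri] = place_id
            place = ET.SubElement(list_place, "place", {XML_ID: place_id})
            ET.SubElement(place, "placeName").text = name.text
            ET.SubElement(place, "idno", {"type": "URI"}).text = uri
        # The in-text element now points at the list entry, not the external URI.
        name.set("ref", "#" + ids_by_ref[uri])
    return list_place

body = ET.fromstring(
    '<body><p><placeName ref="https://example.org/place/1">Asunción</placeName> and '
    '<placeName ref="https://example.org/place/1">Asunción</placeName></p></body>'
)
places = build_list_place(body)
print(ET.tostring(places, encoding="unicode"))
```

The same approach would work for <listPerson> with <persName>. Keeping the external URI in an <idno> inside each list entry means the link to the gazetteer is stored once instead of on every occurrence.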

Import and Export TEI files

I would like to open this issue to discuss an alternative scenario. I understand that so far we have focused on the TEI problems that arise when a user imports a TXT file. However, it is also possible that a person already has a TEI XML file with basic structural information and would like to enrich the text by annotating places and people. For this reason, I uploaded several of my TEI files to Recogito and found three things:

  • metadata from TEI is not recovered in Recogito Settings:

[Screenshot, 2019-08-15 at 11:13:21]

On the other hand, I do not know to what extent this is important, because the metadata is still in the TEI file and it is displayed in the author/editor mode.

  • users can select which NER software they want to use depending on the language (that is good, it is always great to support multilingualism)

  • at the moment, the TEI export is not working, so I cannot see how the entities are represented, but I guess it follows the same method as discussed for TXT files:

[Screenshot, 2019-08-15 at 11:15:52]

There are other export options that are currently working (CSV).

Proper display of TEI header in GUI?

A question just brought up by @eltonteb:

Should Recogito display the TEI header information in the annotation view? Or should everything inside teiHeader remain hidden? (Or even be made collapsible etc.?)

Minutes 05/16

Present: Rainer Simon, Hugh Cayless, Antonio Rojas, Gustavo F. Riva, Susanna Alles, Gimena del Rio

Absent: Romina DeLeón, Nidia Hernández, Alex Gil

General goals

  • Customization of annotation display with Recogito
  • Working and publishing environment
  • Integration (CETEIcean and Ed/Jekyll)
  • CSS and JavaScript in the same template in Ed
  • Hangouts and Github repository for sharing and communication purposes

First Tasks (TDB by May 30)

  • @ALL: revise the TEI export that you can download from Recogito
  • @Gimena will open the repository and upload LAM latest version
  • @Gimena will add a sample of La Argentina Manuscrita (LAM) as example
  • @rsimon will add a sample of supported TEI elements
  • @arojascastro will ask Rebeca about Pelagios Representative
  • @arojascastro will send a Doodle for our next meeting in two weeks
  • @arojascastro and @Gimena will write blog entry for Pelagios network

Other activities

  • Proposal for TEI Graz (sent)
  • Proposal for Writing Sprint tutorial SP-EN (accepted)
  • Maybe: workshop in Buenos Aires/Workshop in Germany related to Gimena’s/Antonio’s projects

Definition of TEI sublanguage supported by Recogito

Problem:

  • (Some) external TEI files can be loaded but not annotated in Recogito
  • They are XML-valid. There is no warning, just no popup/editor window.

Examples:

Environment:

Other observations:

Request/Question:

  • Is there any documentation of the TEI (sub-)language supported by Recogito? Keep in mind that TEI is an "extensible standard" (a devil's advocate may say, not a standard at all, but just a collection of violable conventions), so that users are actually encouraged to provide their own customizations, and whether or not any such extensions are supported must be made explicit. (Found nothing on Recogito website.)
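Until such documentation exists, one workaround is to list the element names a file actually uses and compare them against whatever set turns out to be supported. The sketch below assumes a well-formed TEI file; the SUPPORTED set is purely hypothetical and is not taken from any Recogito documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder: replace with the real list once it is documented.
SUPPORTED = {"TEI", "teiHeader", "text", "body", "div", "p"}

def unsupported_elements(xml_text, supported=SUPPORTED):
    """Return the sorted element names used in the file but missing from 'supported'."""
    root = ET.fromstring(xml_text)
    # Drop the '{namespace}' prefix so names compare regardless of namespace.
    used = {el.tag.rsplit("}", 1)[-1] for el in root.iter()}
    return sorted(used - set(supported))

sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>'
    "<head>La Argentina</head><p>Dedicatoria</p>"
    "</body></text></TEI>"
)
print(unsupported_elements(sample))  # ['head']
```

A report like this could be attached to a bug report for any file that loads but cannot be annotated.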

<p> elements

It would be very useful to automatically divide the <body> into <p> elements. The boundary between two of these should be a \n in the original file.
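A minimal sketch of the idea, assuming the source is plain text and that blank lines should not produce empty paragraphs (that second point is my assumption, not part of the request):

```python
import xml.etree.ElementTree as ET

def body_from_text(text):
    """Wrap each non-empty line of plain text in its own <p> inside a <body>."""
    body = ET.Element("body")
    for line in text.split("\n"):
        if line.strip():  # skip blank lines instead of emitting empty <p/>
            ET.SubElement(body, "p").text = line.strip()
    return body

body = body_from_text("La Argentina\n\nDedicatoria\n")
print(ET.tostring(body, encoding="unicode"))
# <body><p>La Argentina</p><p>Dedicatoria</p></body>
```

Splitting on every \n treats each line as a paragraph; texts with hard-wrapped lines would need a different boundary, such as a blank line.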

xml:id used in relation

Looking at the examples from La Argentina Manuscrita:
Right now each persName and placeName has an automatically generated xml:id. Place names are linked to an external reference with @ref. So the only way to identify two place names as the same place is through this attribute.

Why are the relations in the element expressed in terms of an xml:id? An xml:id is unique to one element, but what is meant is actually the referenced entity, not the element itself. The relations should use the values in @ref.
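The difference is easy to see by grouping the in-text elements by their @ref: several xml:ids collapse into one entity. This is only an illustrative sketch with invented ids and placeholder example.org URIs, not the structure of the actual export.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def ids_by_ref(root, tag="placeName"):
    """Map each @ref value to the xml:ids of the elements that carry it."""
    groups = defaultdict(list)
    for el in root.iter(tag):
        ref = el.get("ref")
        if ref is not None:
            groups[ref].append(el.get(XML_ID))
    return dict(groups)

sample = ET.fromstring(
    "<p>"
    '<placeName xml:id="a1" ref="https://example.org/place/1">Asunción</placeName> '
    '<placeName xml:id="a2" ref="https://example.org/place/2">Buenos Aires</placeName> '
    '<placeName xml:id="a3" ref="https://example.org/place/1">Asunción</placeName>'
    "</p>"
)
print(ids_by_ref(sample))
```

A relation recorded against a1 says nothing about a3, although both mentions name the same place; a relation recorded against the shared @ref value covers both.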

Element note target

<note> elements are outside the elements they refer to, usually as following-sibling. However, it might be useful to add a @target attribute pointing to the corresponding xml:id to make it easier to identify the reference.
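As a sketch of how this could be automated, the function below gives each <note> a @target that points at the nearest preceding sibling that is not itself a note. It assumes un-namespaced elements and that the sibling already has an xml:id; notes that already carry a @target are left alone.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def add_note_targets(root):
    """Point each <note> at the xml:id of its nearest preceding non-note sibling."""
    for parent in root.iter():
        previous = None  # last non-note sibling seen under this parent
        for child in parent:
            if child.tag != "note":
                previous = child
                continue
            if previous is None or child.get("target") is not None:
                continue
            target_id = previous.get(XML_ID)
            if target_id:
                child.set("target", "#" + target_id)
    return root

sample = ET.fromstring(
    '<p><persName xml:id="p1">Ruy Díaz</persName><note>Author of the text.</note></p>'
)
add_note_targets(sample)
print(sample.find("note").get("target"))  # #p1
```

Two consecutive notes would both point at the same preceding element, which seems to match the following-sibling convention described above.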

Imprint metadata

We don't record this information in Recogito. I'd be slightly reluctant to add more metadata fields to Recogito. It's deliberately not supposed to be a metadata management tool. IMO more specific data such as this could be added afterwards, outside of Recogito. Alternatively, when uploading the source file as TEI, this data would be retained from the original anyway.

https://github.com/hdcaicyt/Recogito-TEI/blob/b3f8ef18a615b79ac164ce10cb4b9292fe16a318/Ruy_Diaz-La_Argentina_Manuscrita-proposal.tei.xml#L25

From Recogito to CETEIcean

What do we need to know about making a minimal edition using Recogito to mark up texts and creating a TEI/XML file?
