ebu / ebu-tt-live-toolkit

Toolkit for supporting the EBU-TT Live specification

Home Page: http://ebu.github.io/ebu-tt-live-toolkit/

License: BSD 3-Clause "New" or "Revised" License

Makefile 0.14% Python 63.72% Batchfile 0.09% HTML 1.84% CSS 0.20% Gherkin 17.12% JavaScript 16.89%
broadcast captioning captions ebu-tt live python subtitles subtitling video

ebu-tt-live-toolkit's People

Contributors

eyallavi, frans-ebu, kozmaz87, malikbeytrison, nigelmegitt, skhameed86

ebu-tt-live-toolkit's Issues

COMPONENT: Archiver

As a broadcaster/access services provider, I want to archive EBU-TT Live documents so that I can reuse and distribute them after live transmission as a single Part 1 document.

Input: a sequence of EBU-TT-Live documents.
Output: a single EBU-TT part 1 document.

COMPONENT: Delay node

As a playout service provider, I want to add a delay to the sequence so that I can synchronise subtitles with audio.

Input: a sequence of part 3 documents.
Output: a delayed sequence of part 3 documents.

  • implement a fixed delay node #231
  • implement a variable delay node #232
  • write tests for delay node #233
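A fixed delay could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Document` class and its `availability_time` field are hypothetical stand-ins, not the toolkit's actual document model, and a real delay node may instead retime values inside each document.

```python
from dataclasses import dataclass, replace
from datetime import timedelta
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Document:
    """Hypothetical stand-in for a part 3 document."""
    sequence_number: int
    availability_time: timedelta  # assumed: when the document is released downstream


def fixed_delay(documents: Iterable[Document], delay: timedelta) -> Iterator[Document]:
    """Yield copies of the documents, each released `delay` later."""
    for doc in documents:
        yield replace(doc, availability_time=doc.availability_time + delay)


docs = [Document(1, timedelta(seconds=0)), Document(2, timedelta(seconds=2))]
delayed = list(fixed_delay(docs, timedelta(seconds=5)))
```

The order and sequence numbers of the documents are left untouched; only the release time moves.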

Timebase semantic testing

Test semantic validation of the timebase rules:

Start point of a temporal interval associated with a tt:body element.
If the timebase is "smpte" the type shall be ebuttdt:smpteTimingType.
If the timebase is "media" the type shall be ebuttdt:mediaTimingType.
If the timebase is "media" the time expression should be the offset from a syncbase of "00:00:00.0".
If the timebase is "clock" the type shall be ebuttdt:clockTimingType.
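The rules above amount to checking a time expression against the type that its timebase requires. A minimal sketch of such a check follows. The regular expressions are simplified assumptions made for illustration; the normative patterns are the ones in the EBU-TT XSD datatypes, and `timing_matches_timebase` is a hypothetical helper, not part of the toolkit.

```python
import re

# Simplified, assumed patterns. They only approximate the ebuttdt timing types.
_CLOCK_OR_COUNT = r"\d{2,}:[0-5]\d:[0-5]\d(\.\d+)?|\d+(\.\d+)?(h|ms|m|s)"
TIMING_PATTERNS = {
    "smpte": re.compile(r"\d{2}:[0-5]\d:[0-5]\d:\d{2}"),  # hh:mm:ss:ff
    "media": re.compile(_CLOCK_OR_COUNT),
    "clock": re.compile(_CLOCK_OR_COUNT),
}


def timing_matches_timebase(timebase: str, value: str) -> bool:
    """Return True if `value` has the shape expected for `timebase`."""
    pattern = TIMING_PATTERNS.get(timebase)
    if pattern is None:
        raise ValueError(f"unknown timebase: {timebase!r}")
    # fullmatch requires the whole string to match, not just a prefix.
    return pattern.fullmatch(value) is not None
```

A test suite built on this idea would feed each timebase both accepted and rejected values, for example an SMPTE frame-based value under the "clock" timebase.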

COMPONENT: Simple Part 3 consumer (Semantic validation)

As a developer/playout service provider, I want to validate an EBU-TT Live sequence against the specification so that I know whether it conforms to the specification.

Input: Single part 3 doc or sequence of part 3 documents
Output: display of text with begin and end times and validation message(s)

It requires the following steps to be completed:

  • validation framework for semantic rules #72
  • live semantic validation #73
  • timing calculation (consumer)
    • copy across timing constraints from BBC/kozma/consumerlogic branch #88
    • calculate document computed begin and end times #89
    • calculate resolved activation begin and end times #109
    • document begin and end times should be in that order on the timeline according to R16
    • write tests for correct resolution of activation time calculation
  • produce output
    • generate output requirements - see #87
    • write output

(this list is not yet complete)

COMPONENT: Handover node

As a subtitle provider/broadcaster, I want to combine documents from alternating respeakers/stenographers into a single EBU-TT Live sequence.

Input: 2 or more part 3 sequences; handover options.
Output: a single part 3 sequence.

Subtasks:

  • Implement core functionality #363
  • Create configurator #368
  • Document handover node #364
  • Unittest handover node #373
  • BDD testing for handover node #397 (duplicate of #311)
  • UIP modifications to support handover use-case #374

COMPONENT: Switcher node

As a playout service provider, I want to switch between multiple sequences so that I can choose which sequence is output.

Input: multiple part 3 sequences; switching options.
Output: a single part 3 sequence.

COMPONENT: Reference clock

As a developer/tester, I want an external reference clock so that I can ensure documents are processed correctly.
Input: none.
Output: UTC date-time value.

Timeformats in documents

Time formats in documents are a bit confusing:

<tt:body tt:begin="63016289ms" tt:dur="00:00:01">

Can we choose more precisely the format used when converting timedeltas? For example, here it would be more logical to have begin in hh:mm:ss.ms format and dur in xxxxxms format.

Raised from discussion during 13/07/2016 call.
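To make the two candidate formats concrete, here is a small sketch that renders a `timedelta` either as a clock-style string or as a millisecond count. The function names are hypothetical and the sketch assumes non-negative values; it does not reflect how the bindings currently choose a format.

```python
from datetime import timedelta


def to_clock_time(value: timedelta) -> str:
    """Render a non-negative timedelta as hh:mm:ss.mmm."""
    total_ms = value // timedelta(milliseconds=1)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def to_millisecond_count(value: timedelta) -> str:
    """Render a non-negative timedelta as a whole number of milliseconds."""
    return f"{value // timedelta(milliseconds=1)}ms"


begin = to_clock_time(timedelta(milliseconds=63016289))  # "17:30:16.289"
dur = to_millisecond_count(timedelta(seconds=1))         # "1000ms"
```

With a per-attribute choice like this, the example element would carry begin="17:30:16.289" and dur="1000ms".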

graphviz dependency not documented

I think graphviz needs to be installed in order to see the figures in the documentation correctly, but this is not clear from the README.md.

(Building sphinx gives a warning it cannot find the dot executable if the user has not installed it).

I see two options:

  • install graphviz automatically (not easy to do platform independently?)
  • document in README.md that the user needs to install graphviz and add the location of the dot executable to the PATH

COMPONENT: User input producer

As a developer/tester, I want a sequence of part 3 documents generated from text input so that I can develop/test an implementation against it.

Input: Text file provided by user.
Output: Sequence of EBU-TT Live documents.

Slack notifications granularity

Check if we all agree with changing the notifications to only show:

  • Failures
  • Changes (includes the first build after fails)

Responses:

  • Frans OK
  • Zoltan OK
  • Eyal
  • Nigel
  • Gil

COMPONENT: EBU-TT-D encoder

As a playout service provider, I want to convert EBU-TT-Live documents to EBU-TT-D documents so that I can distribute them to the end device

Input: sequence of part 3 documents
Output: sequence of EBU-TT-D documents

Subtasks:

  • Add EBU-TT-D XSD 1.1 to repository and update it if necessary #177
  • Create bindings (move EBU-TT-3 current bindings to a sensible place) #176
  • Conversion initiation logic and validation #174
  • Create EBU-TT-D conversion classes #170

COMPONENT: Simple Part 3 consumer (XML)

As a developer/tester I want a simple view of the text and times in a sequence of EBU-TT Live documents so that I can verify that they will be processed correctly.

Input: Sequence of part 3 documents; 'validate only' option.
Output: display of text with begin and end times and validation message(s)

COMPONENT: RTP carrier

As a playout service provider, I want to carry EBU-TT Live documents using the RTP protocol so that I can use EBU-TT Live over IP networks.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents for interfacing with RFC 3550.

COMPONENT: Fixed input producer

As a developer/tester, I want a sequence of EBU-TT Live documents generated automatically so that I can develop/test an implementation against it without the need to input text.

Input: none.
Output: a sequence of Part 3 documents.

Badge-per-branch for Travis CI?

Currently the Travis CI badge referred to in the readme.md points to the Master branch.

This means the badge may show incorrect status on other branches (unless the readme.md is modified).

Some people have provided a pre-built hook to get around this, but it seems to be a bit fragile: http://stackoverflow.com/questions/18673694/referencing-current-branch-in-github-readme-md

Also check out: http://stackoverflow.com/questions/19810386/showing-travis-build-status-in-github-repo

tests for smpte timebase semantic checks

Test that when the timebase is set to smpte, documents are validated or rejected depending on the format of the time. Also test that the presence and correctness of all required parameters are checked.

COMPONENT: XSD ttm:agent element defined as string, should be a complexType

The ttm:agent element is defined in metadata.xsd as xs:string, but it should be as in TTML1 §12.1.5:

<ttm:agent
  type = (person|character|group|organization|other)
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: ttm:name*, ttm:actor?
</ttm:agent>

where

<ttm:name
  type = (full|family|given|alias|other)
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: #PCDATA
</ttm:name>

and

<ttm:actor
  agent = IDREF
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: EMPTY
</ttm:actor>

This issue is copied from the BBC repo: bbc#8

CI e-mail notifications

Add CI build notifications to team (members).

Note that it seems committers get info on their commits already anyhow:

By default, email notifications are sent to the committer and the commit author, if they are members of the repository (that is, they have push or admin permissions for public repositories, or if they have pull, push or admin permissions for private repositories).

Options for wider notifications include:

COMPONENT: Complex noise introducer

As a developer/tester, I want to control the level of 'noise' and complexity in a stream of part 3 documents so that I can test my implementation against different scenarios.

Input: a sequence of part 3 documents; options for introducing complexity into the stream.
Output: a sequence of modified part 3 documents.

Unit-Test documents comparison (ComparableMixin)

Documents use the mixin ComparableMixin (defined in the project root in file utils.py). This mixin allows for correct and easy comparison of documents:

  • Two documents with the same sequenceIdentifier are compared using their sequenceNumber:
document1 < document2 if document1.sequenceNumber < document2.sequenceNumber
  • If the documents do not have the same sequenceIdentifier, no comparison is possible and an error is raised.

With this issue I want to address the fact that this was not tested yet, so I implemented tests to ensure that document comparison works as intended.
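The comparison rules described above can be illustrated with a self-contained sketch. It does not reproduce the real ComparableMixin from utils.py; the class name, attribute names and the choice of `ValueError` are assumptions made for the example.

```python
from functools import total_ordering


@total_ordering
class SketchDocument:
    """Hypothetical document that follows the comparison rules described above."""

    def __init__(self, sequence_identifier: str, sequence_number: int):
        self.sequence_identifier = sequence_identifier
        self.sequence_number = sequence_number

    def _check_comparable(self, other: "SketchDocument") -> None:
        # Documents from different sequences cannot be ordered.
        if self.sequence_identifier != other.sequence_identifier:
            raise ValueError("documents belong to different sequences")

    def __eq__(self, other):
        if not isinstance(other, SketchDocument):
            return NotImplemented
        self._check_comparable(other)
        return self.sequence_number == other.sequence_number

    def __lt__(self, other):
        if not isinstance(other, SketchDocument):
            return NotImplemented
        self._check_comparable(other)
        return self.sequence_number < other.sequence_number


first = SketchDocument("seq1", 1)
second = SketchDocument("seq1", 2)
```

Tests for the real mixin would cover the same two cases: ordering within one sequence, and the error raised across sequences.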

COMPONENT: WebSocket carrier

As a playout service provider, I want to carry EBU-TT Live documents using the WebSocket protocol so that I can use EBU-TT Live over TCP.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents for RFC 6455.

COMPONENT: XSD

As a developer/playout service provider, I want to validate EBU-TT Live documents so that I know if they are valid XML.

Input: a single part 3 document.
Output: validation message.

COMPONENT: SDI carrier

As a playout service provider, I want to carry EBU-TT Live documents over HD SDI so that I can pass EBU-TT Live around with video in a broadcast environment.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents in HD SDI

COMPONENT: Downstream validator

As a developer/playout service provider, I want to check that an EBU-TT Live document conforms to downstream requirements so that I know it will be successfully processed.

Input: a single part 3 document; metadata instructions.
Output: validation message.

COMPONENT: Part 1 cued

As a playout service provider, I want to consume a cued EBU-TT Live sequence from an EBU-TT Part 1 document.

Input: a single EBU-TT Part 1 document; cueing options.
Output: a sequence of Part 3 documents released according to the cueing options.

Basic testing setup

  • Find a way to handle xml files (with templates for example)
  • Setup a basic test infrastructure for bdd
  • Write some tests

TimecountTimingType regex is wrong

Spotted this while working on my understanding of the bindings and how to add smpte <-> timedelta conversion.

Python tries regex alternatives from left to right and, without an end anchor, stops at the first one that matches. So if you have for example [0-9]+(h|m|s|ms):

  • 9h is parsed normally
  • 9m also
  • 9s also
  • However, 9ms is parsed as 9m and the s is forgotten

This is really problematic because it also accepts 9mh, for example, and extracts 9m from it, even though 9mh is not permitted in this case.

The solution is to add a $ at the end of the regex, so for timecountTimingType:

  • [0-9]+(\.[0-9]+)?(h|m|s|ms)$ in the xsd
  • (?P<numerator>[0-9]+(?:\\.[0-9]+)?)(?P<unit>h|m|s|ms)$ in bindings
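The behaviour described above can be reproduced with the standard `re` module. The snippet below is a sketch using the binding pattern with and without the trailing `$`; the variable names are illustrative only.

```python
import re

# Same pattern with and without the end anchor.
UNANCHORED = re.compile(r"(?P<numerator>[0-9]+(?:\.[0-9]+)?)(?P<unit>h|m|s|ms)")
ANCHORED = re.compile(r"(?P<numerator>[0-9]+(?:\.[0-9]+)?)(?P<unit>h|m|s|ms)$")

# Without the anchor, "m" wins before "ms" is ever tried.
unit_without_anchor = UNANCHORED.match("9ms").group("unit")

# Without the anchor, the invalid "9mh" is accepted as "9m".
invalid_accepted = UNANCHORED.match("9mh") is not None

# With the anchor, the engine backtracks to "ms", and "9mh" is rejected.
unit_with_anchor = ANCHORED.match("9ms").group("unit")
invalid_rejected = ANCHORED.match("9mh") is None
```

Note that in Python `$` also matches just before a trailing newline, so `pattern.fullmatch(value)` is a stricter alternative where a whole-string match is intended.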

COMPONENT: Teletext-to-part 3

As a broadcaster, I want to convert subtitles in teletext format to EBU-TT Live documents so that I can use legacy systems.

Input: Teletext (data feed/VBI/VANC)
Output: Sequence of EBU-TT Live documents

Document manifest format

We need to document the manifest format, e.g.:

  • in the code
  • in the wiki
  • in the header of the manifest file?

related to: #59

COMPONENT: Simple noise introducer

As a developer/tester, I want to consume 'noisy' sequences of part 3 documents so that I can test my implementation against a known set of scenarios.

Input: a sequence of part 3 documents; noise options.
Output: a sequence of modified part 3 documents.

Investigate CI plug-ins

As @kozmaz87 suggested, two Jenkins plug-ins would be useful:

  • The junit results plugin, which makes navigation of the test suite results easy and adds retrospective of the last builds, so you can see how the test metrics changed in the last X builds
  • the cobertura coverage plugin which does the same just with coverage

Differentiate empty and missing template variable

For example, if a missing variable is represented by test_var = "?None?" and we have the template:

{% if test_var != "?None?" %}
    <a_tag>{{ test_var }}</a_tag>
{% endif %}

then a missing variable omits the tag, whereas test_var = '' keeps the tag with an empty string as its content.
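The same sentinel idea can be sketched in plain Python, without a template engine. `MISSING` and `render_tag` are hypothetical names used only for this illustration.

```python
MISSING = "?None?"  # sentinel value taken from the example above


def render_tag(test_var: str = MISSING) -> str:
    """Omit the tag when the variable is missing; keep it when it is merely empty."""
    if test_var == MISSING:
        return ""
    return f"<a_tag>{test_var}</a_tag>"


missing_output = render_tag()    # variable never supplied
empty_output = render_tag("")    # variable supplied but empty
```

The sentinel has to be a value that can never occur as real content, which is why an unusual string is used rather than an empty one.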

Timedelta <-> SMPTE conversion

Conversion between XML time formats and timedelta values is done in this file. This setup allows us to do the conversions during the pyxb binding loop.

Conversion for SMPTE is a bit more complicated than conversion for clock and media times. Indeed for SMPTE we need to access values of some attributes of the <tt> element, which is not easily done through pyxb at the stage of the binding where the conversion happens.
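To show why extra attributes are needed, here is a minimal sketch of an SMPTE-to-timedelta conversion. It assumes an integer frame rate passed in explicitly and ignores drop-frame modes and frame rate multipliers; `smpte_to_timedelta` is a hypothetical helper, not the toolkit's binding code.

```python
import re
from datetime import timedelta

_SMPTE = re.compile(r"(\d{2}):([0-5]\d):([0-5]\d):(\d{2})")  # hh:mm:ss:ff


def smpte_to_timedelta(timecode: str, frame_rate: int) -> timedelta:
    """Convert a non-drop-frame hh:mm:ss:ff value using the given frame rate."""
    match = _SMPTE.fullmatch(timecode)
    if match is None:
        raise ValueError(f"not an SMPTE time expression: {timecode!r}")
    hours, minutes, seconds, frames = (int(part) for part in match.groups())
    if frames >= frame_rate:
        raise ValueError("frame number exceeds the frame rate")
    whole_seconds = (hours * 60 + minutes) * 60 + seconds
    # The frame part cannot be interpreted without the frame rate.
    return timedelta(seconds=whole_seconds) + timedelta(seconds=frames) / frame_rate


example = smpte_to_timedelta("00:00:01:12", 25)
```

The frame rate argument is exactly the kind of value that has to come from the <tt> element, which is what makes this conversion harder to perform inside the binding step.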

COMPONENT: Distributor node

As a playout service provider, I want to distribute a sequence so that it can be processed by multiple consumers.

Input: a sequence of part 3 documents
Output: a sequence of part 3 documents available to multiple destinations
