ebu / ebu-tt-live-toolkit

Toolkit for supporting the EBU-TT Live specification

Home Page: http://ebu.github.io/ebu-tt-live-toolkit/

License: BSD 3-Clause "New" or "Revised" License

Makefile 0.14% Python 63.72% Batchfile 0.09% HTML 1.84% CSS 0.20% Gherkin 17.12% JavaScript 16.89%

broadcast captioning captions ebu-tt live python subtitles subtitling video

ebu-tt-live-toolkit's People

Contributors

bbc prernaburadkar malikbeytrison adamstrawson savard02 ccma-enginyeria

ebu-tt-live-toolkit's Issues

COMPONENT: Archiver

As a broadcaster/access services provider, I want to archive EBU-TT Live documents so that I can reuse and distribute them after live transmission as a single Part 1 document.

Input: a sequence of EBU-TT-Live documents.
Output: a single EBU-TT part 1 document.

COMPONENT: Delay node

As a playout service provider, I want to add a delay to the sequence so that I can synchronise subtitles with audio.

Input: a sequence of part 3 documents.
Output: a delayed sequence of part 3 documents.

implement a fixed delay node #231
implement a variable delay node #232
write tests for delay node #233

Timebase semantic testing

Test semantic validation on timebase rules:

Start point of a temporal interval associated with a tt:body element.
If the timebase is "smpte" the type shall be ebuttdt:smpteTimingType.
If the timebase is "media" the type shall be ebuttdt:mediaTimingType.
If the timebase is "media" the time expression should be the offset from a syncbase of "00:00:00.0".
If the timebase is "clock" the type shall be ebuttdt:clockTimingType.
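
The type rules above can be read as a lookup from timebase to required datatype. The following is a minimal sketch only: `REQUIRED_TIMING_TYPE` and `check_timing_type` are hypothetical names for illustration, not part of the toolkit's validation code.

```python
# Hypothetical lookup of the datatype required for each ttp:timeBase value,
# taken directly from the rules listed above.
REQUIRED_TIMING_TYPE = {
    "smpte": "ebuttdt:smpteTimingType",
    "media": "ebuttdt:mediaTimingType",
    "clock": "ebuttdt:clockTimingType",
}


def check_timing_type(time_base, actual_type):
    """Return True if actual_type is the datatype required for time_base."""
    if time_base not in REQUIRED_TIMING_TYPE:
        raise ValueError("unknown timeBase: %r" % (time_base,))
    return actual_type == REQUIRED_TIMING_TYPE[time_base]
```

A test could then assert that each timebase accepts its own type and rejects the other two.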

Create time metric tests (for smpte)

See #67 and #68: issue is to create tests that verify that the correct time units are used and the code does not get ms and m confused with each other.

COMPONENT: Simple Part 3 consumer (Semantic validation)

As a developer/playout service provider, I want to validate an EBU-TT Live sequence against the specification so that I know whether it conforms to the specification.

Input: Single part 3 doc or sequence of part 3 documents
Output: display of text with begin and end times and validation message(s)

It requires the following steps to be completed:

(this list is not yet complete)

Consider best way to track spec coverage

For example:

as comments in the code
as comments in the spec (Word, pdf, ...)
in a separate list/dbase

As @kozmaz87 suggested, probably best to annotate the spec with the test code locations.

COMPONENT: Handover node

As a subtitle provider/broadcaster, I want to combine documents from alternating respeakers/stenographers into a single EBU-TT Live sequence.

Input: 2 or more part 3 sequences; handover options.
Output: a single part 3 sequence.

Subtasks:

Implement core functionality #363
Create configurator #368
Document handover node #364
Unittest handover node #373
BDD testing for handover node #397 (duplicate of #311)
UIP modifications to support handover use-case #374

COMPONENT: Switcher node

As a playout service provider, I want to switch between multiple sequences so that I can choose which sequence is output.

Input: multiple part 3 sequences; switching options.
Output: a single part 3 sequence.

Create text equivalent list of normative spec requirements

Generate a text format list of the normative spec requirements against which we can write tests.

COMPONENT: Reference clock

As a developer/tester, I want an external reference clock so that I can ensure documents are processed correctly.
Input: none.
Output: UTC date-time value.

py.test is not callable

py.test is not callable, but CONTRIBUTING.md says it is

markerMode continuous?

From what I understand from https://git.ebu.io/ebutt/ebutt-part-3/issues/115 , markerMode="continuous" is permitted with timeBase="smpte"; however, this is not allowed by the spec and the schema definitions. Should it be added to the xsd files so that we have at least semantically correct smpte cases? (Here, for example: https://github.com/ebu/ebu-tt-live-toolkit/blob/a7a545d189e4a12d6c8632929fd019252de2e4cb/testing/bdd/templates/referenceClockIdentifier.xml )

Timeformats in documents

Time formats in documents are a bit confusing:

<tt:body tt:begin="63016289ms" tt:dur="00:00:01">

Can we choose the format used to convert timedeltas more precisely? For example, here it would be more logical to have begin in hh:mm:ss.ms format and dur in xxxxxms format.

Raised from discussion during 13/07/2016 call.
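
To illustrate the two formats being discussed, here is a small sketch of formatting a timedelta either as a clock-style string or as a millisecond timecount. The function names are hypothetical and this is not the toolkit's conversion code.

```python
from datetime import timedelta


def to_clock_string(delta):
    """Format a timedelta as hh:mm:ss.mmm (hours are not wrapped at 24)."""
    total_ms = delta // timedelta(milliseconds=1)
    hours, rest = divmod(total_ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    seconds, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d.%03d" % (hours, minutes, seconds, millis)


def to_millisecond_string(delta):
    """Format a timedelta as a timecount in milliseconds, e.g. '1000ms'."""
    return "%dms" % (delta // timedelta(milliseconds=1))


begin = timedelta(milliseconds=63016289)
dur = timedelta(seconds=1)
print(to_clock_string(begin))      # 17:30:16.289
print(to_millisecond_string(dur))  # 1000ms
```

With the values from the example above, begin would read 17:30:16.289 and dur would read 1000ms.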

graphviz dependency not documented

I think graphviz needs to be installed in order to see the figures in the documentation correctly, but this is not clear from the README.md.

(Building sphinx gives a warning it cannot find the dot executable if the user has not installed it).

I see two options:

install graphviz automatically (not easy to do platform independently?)
document in README.md that the user needs to install graphviz and add the location of the dot executable to the PATH

COMPONENT: User input producer

As a developer/tester, I want a sequence of part 3 documents generated from text input so that I can develop/test an implementation against it.

Input: Text file provided by user.
Output: Sequence of EBU-TT Live documents.

No dur attribute if ttp:markerMode="discontinuous"

tt:dur is not allowed when a document has both ttp:timeBase="smpte" and ttp:markerMode="discontinuous"
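
As a sketch of how this rule might be expressed in a semantic check (the function name and error type are assumptions made for illustration):

```python
def check_dur_allowed(time_base, marker_mode, has_dur):
    """Raise ValueError if tt:dur is present where the rule above forbids it."""
    forbidden = time_base == "smpte" and marker_mode == "discontinuous"
    if forbidden and has_dur:
        raise ValueError(
            'tt:dur is not allowed with ttp:timeBase="smpte" '
            'and ttp:markerMode="discontinuous"'
        )
    return True
```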

Slack notifications granularity

Check if we all agree with changing the notifications to only show:

Failures
Changes (includes the first build after fails)

Responses:

COMPONENT: EBU-TT-D encoder

As a playout service provider, I want to convert EBU-TT-Live documents to EBU-TT-D documents so that I can distribute them to the end device

Input: sequence of part 3 documents
Output: sequence of EBU-TT-D documents

Subtasks:

Add EBU-TT-D XSD 1.1 to repository and update it if necessary #177
Create bindings (move EBU-TT-3 current bindings to a sensible place) #176
Conversion initiation logic and validation #174
Create EBU-TT-D conversion classes #170

COMPONENT: Simple Part 3 consumer (XML)

As a developer/tester I want a simple view of the text and times in a sequence of EBU-TT Live documents so that I can verify that they will be processed correctly.

Input: Sequence of part 3 documents; 'validate only' option.
Output: display of text with begin and end times and validation message(s)

COMPONENT: RTP carrier

As a playout service provider, I want to carry EBU-TT Live documents using the RTP protocol so that I can use EBU-TT Live over IP networks.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents for interfacing with RFC 3550.

Test referenceClockIdentifier validation

Allows the reference clock source to be identified. Permitted only when
ttp:timeBase="clock" AND ttp:clockMode="local" OR when
ttp:timeBase="smpte".
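
The condition above can be written as a single boolean expression. This is a sketch with a hypothetical function name, reading the rule as (clock AND local) OR smpte:

```python
def reference_clock_identifier_permitted(time_base, clock_mode=None):
    """Return True when ttp:referenceClockIdentifier may be present."""
    return (time_base == "clock" and clock_mode == "local") or time_base == "smpte"
```

A test would cover at least the clock/local and smpte cases as permitted, and clock/utc and media as not permitted.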

COMPONENT: Fixed input producer

As a developer/tester, I want a sequence of EBU-TT Live documents generated automatically so that I can develop/test an implementation against it without the need to input text.

Input: none.
Output: a sequence of Part 3 documents.

Badge-per-branch for Travis CI?

Currently the Travis CI badge referred to in the readme.md points to the Master branch.

This means the badge may show incorrect status on other branches (unless the readme.md is modified).

Some people have provided a pre-built hook to get around this, but it seems to be a bit fragile: http://stackoverflow.com/questions/18673694/referencing-current-branch-in-github-readme-md

Also check out: http://stackoverflow.com/questions/19810386/showing-travis-build-status-in-github-repo

tests for smpte timebase semantic checks

Test that when the timebase is set to smpte, documents are validated or rejected depending on the format of the time. Also test that the presence and correctness of all needed parameters are checked.

COMPONENT: XSD ttm:agent element defined as string, should be a complexType

The ttm:agent element is defined in metadata.xsd as xs:string, but it should be as in TTML1 §12.1.5:

<ttm:agent
  type = (person|character|group|organization|other)
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: ttm:name*, ttm:actor?
</ttm:agent>

where

<ttm:name
  type = (full|family|given|alias|other)
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: #PCDATA
</ttm:name>

and

<ttm:actor
  agent = IDREF
  xml:id = ID
  xml:lang = string
  xml:space = (default|preserve)
  {any attribute not in default or any TT namespace}>
  Content: EMPTY
</ttm:actor>

This issue is copied from the BBC repo: bbc#8

CI e-mail notifications

Add CI build notifications to team (members).

Note that it seems committers get info on their commits already anyhow:

By default, email notifications are sent to the committer and the commit author, if they are members of the repository (that is, they have push or admin permissions for public repositories, or if they have pull, push or admin permissions for private repositories).

Options for wider notifications include:

Using the Slack Travis CI integration to channel the build messages to the CI channel.
Adding additional e-mail addresses to the Travis set up

COMPONENT: Complex noise introducer

As a developer/tester, I want to control the level of 'noise' and complexity in a stream of part 3 documents so that I can test my implementation against different scenarios.

Input: a sequence of part 3 documents; options for introducing complexity into the stream.
Output: a sequence of modified part 3 documents.

Unit-Test documents comparison (ComparableMixin)

Documents use the mixin ComparableMixin (defined in the project root, in utils.py). This mixin allows for correct and easy comparison of documents:

Two documents with the same sequenceIdentifier are compared using their sequenceNumber:

document1 < document2 if document1.sequenceNumber < document2.sequenceNumber

If the documents do not have the same sequenceIdentifier, there is no comparison possible and an error is raised.

With this issue I want to address the fact that this was not tested yet, so I implemented tests to ensure that document comparison works as intended.
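
The comparison behaviour described above could be sketched as follows. This is an illustrative stand-in, not the toolkit's ComparableMixin: the class name, the attribute names and the choice of ValueError are all assumptions.

```python
import functools


@functools.total_ordering
class ComparableDocument(object):
    """Hypothetical document that orders itself by sequence number."""

    def __init__(self, sequence_identifier, sequence_number):
        self.sequence_identifier = sequence_identifier
        self.sequence_number = sequence_number

    def _check_comparable(self, other):
        # Documents from different sequences have no defined ordering.
        if self.sequence_identifier != other.sequence_identifier:
            raise ValueError("documents belong to different sequences")

    def __eq__(self, other):
        self._check_comparable(other)
        return self.sequence_number == other.sequence_number

    def __lt__(self, other):
        self._check_comparable(other)
        return self.sequence_number < other.sequence_number
```

A unit test would then check both the ordering within one sequence and the error raised across sequences.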

Decide on which python versions we support

check existing code/libraries
decide
add to CONTRIBUTING.md
add to .travis.yml

skip individual scenario example lines

It would be good to be able to skip individual example lines in scenarios without commenting them out, so that those scenarios are counted better.

COMPONENT: WebSocket carrier

As a playout service provider, I want to carry EBU-TT Live documents using the WebSocket protocol so that I can use EBU-TT Live over TCP.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents for RFC 6455.

COMPONENT: XSD

As a developer/playout service provider, I want to validate EBU-TT Live documents so that I know if they are valid XML.

Input: a single part 3 document.
Output: validation message.

Set up Travis CI correctly

make sure tests run
trigger on pull requests only or any commit? (maybe both for now)

COMPONENT: SDI carrier

As a playout service provider, I want to carry EBU-TT Live documents over HD SDI so that I can pass EBU-TT Live around with video in a broadcast environment.

Input: a sequence of part 3 documents.
Output: a sequence of part 3 documents in HD SDI

COMPONENT: Downstream validator

As a developer/playout service provider, I want to check that an EBU-TT Live document conforms to downstream requirements so that I know it will be successfully processed.

Input: a single part 3 document; metadata instructions.
Output: validation message.

COMPONENT: Part 1 cued

As a playout service provider, I want to consume a cued EBU-TT Live sequence from an EBU-TT Part 1 document.

Input: a single EBU-TT Part 1 document; cueing options.
Output: a sequence of Part 3 documents released according to the cueing options.

Basic testing setup

Find a way to handle xml files (with templates for example)
Setup a basic test infrastructure for bdd
Write some tests

File system carriage mechanism

Refers to BBC's fork issue 18

This issue asks for the implementation of a functionality that allows the simple producer to write its output to the file system along with a manifest file with availability times.

TimecountTimingType regex is wrong

Spotted this while working on my understanding of the bindings and how to add smpte <-> timedelta conversion.

Python's regex alternation tries the alternatives in order and, without an end anchor, accepts the first one that matches. So with [0-9]+(h|m|s|ms), for example:

9h is parsed normally
9m also
9s also
However, 9ms is parsed as 9m and the s is dropped

This is really problematic because it also accepts 9mh, for example, and extracts 9m from it, even though 9mh is not permitted.

To solve the problem, add a $ at the end of the regex, so for timecountTimingtype:

[0-9]+(\.[0-9]+)?(h|m|s|ms)$ in the xsd
(?P<numerator>[0-9]+(?:\\.[0-9]+)?)(?P<unit>h|m|s|ms)$ in bindings
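
The behaviour described above can be reproduced with Python's re module. This sketch uses the binding pattern with and without the trailing $; the helper name `parse_unit` is made up for the example.

```python
import re

# Pattern as described in the issue, without and with the end anchor.
UNANCHORED = re.compile(r"(?P<numerator>[0-9]+(?:\.[0-9]+)?)(?P<unit>h|m|s|ms)")
ANCHORED = re.compile(r"(?P<numerator>[0-9]+(?:\.[0-9]+)?)(?P<unit>h|m|s|ms)$")


def parse_unit(pattern, text):
    """Return the matched unit, or None if the text does not match."""
    match = pattern.match(text)
    return match.group("unit") if match else None
```

Without the anchor, `parse_unit(UNANCHORED, "9ms")` returns "m" and "9mh" is accepted as "m". With the anchor, "9ms" yields "ms" and "9mh" is rejected. Note that in Python `$` also matches just before a trailing newline, so `\Z` or `re.fullmatch` would be stricter alternatives.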

COMPONENT: Teletext-to-part 3

As a broadcaster, I want to convert subtitles in teletext format to EBU-TT Live documents so that I can use legacy systems.

Input: Teletext (data feed/VBI/VANC)
Output: Sequence of EBU-TT Live documents

Document manifest format

We need to document the manifest format. E.g.

in the code
in the wiki
in the header of the manifest file?

related to: #59

COMPONENT: Simple noise introducer

As a developer/tester, I want to consume 'noisy' sequences of part 3 documents so that I can test my implementation against a known set of scenarios.

Input: a sequence of part 3 documents; noise options.
Output: a sequence of modified part 3 documents.

Investigate CI plug-ins

As @kozmaz87 suggested, two Jenkins plug-ins would be useful:

The junit results plugin, which makes navigation of the test suite results easy and adds retrospective of the last builds, so you can see how the test metrics changed in the last X builds
the cobertura coverage plugin which does the same just with coverage

Differentiate empty and missing template variable

For example, if test_var = "?None?" and we have the template:

{% if test_var != "?None?" %}
    <a_tag>{{ test_var }}</a_tag>
{% endif %}

So if test_var = '' the tag will be present but will contain an empty string.

Set up automatic documentation building

http://blog.gockelhut.com/2014/09/automatic-documentation-publishing-with.html

Timedelta <-> SMPTE conversion

Conversion between XML time formats and timedelta values is done in this file. This setup allows us to do the conversions during the pyxb binding loop.

Conversion for SMPTE is a bit more complicated than conversion for clock and media times. Indeed for SMPTE we need to access values of some attributes of the <tt> element, which is not easily done through pyxb at the stage of the binding where the conversion happens.
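
To show why SMPTE needs document-level context, here is a minimal sketch of the conversion in both directions. It assumes a non-drop-frame timecode in hh:mm:ss:ff form and an integer frame rate passed in explicitly (in a real document this would presumably come from attributes on the <tt> element). The function names are hypothetical and this is not the toolkit's implementation.

```python
import re
from datetime import timedelta

_SMPTE = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):(\d{2})$")


def smpte_to_timedelta(timecode, frame_rate):
    """Convert a non-drop-frame hh:mm:ss:ff timecode to a timedelta."""
    match = _SMPTE.match(timecode)
    if match is None:
        raise ValueError("not an hh:mm:ss:ff timecode: %r" % (timecode,))
    hours, minutes, seconds, frames = (int(g) for g in match.groups())
    if frames >= frame_rate:
        raise ValueError("frame count %d exceeds frame rate" % frames)
    whole = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    # The frame part cannot be interpreted without the frame rate.
    return whole + timedelta(seconds=1) * frames / frame_rate


def timedelta_to_smpte(delta, frame_rate):
    """Convert a timedelta to hh:mm:ss:ff, rounding to the nearest frame."""
    total_frames = int(round(delta.total_seconds() * frame_rate))
    total_seconds, frames = divmod(total_frames, frame_rate)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return "%02d:%02d:%02d:%02d" % (hours, minutes, seconds, frames)
```

At 25 frames per second, "00:00:01:12" corresponds to 1.48 seconds. Drop-frame timecode and fractional frame rates are deliberately left out of this sketch.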