Comments (6)

andreastai commented on July 30, 2024

@jayotirana This issue needs a longer answer, so please bear with me. I will come back to you soon and explain the context. It has to do with how the conversion steps depend on each other.

from scf.

jayotirana commented on July 30, 2024

I understand. More scenarios come into the picture during testing, and those can be covered before a new tag is released. I already provided the sample file in my previous issue; you can use it to see the output file. This is a good service for converting files from one format to another.
Thanks
@TairT

andreastai commented on July 30, 2024

@jayotirana Thanks for your comment. I can understand the interest in having a generic conversion from an EBU-TT Part 1 file to EBU STL.

It gives me the opportunity to explain one important transformation strategy we adopted in the SCF.

Your assumption may have been that the scope of the EBU-TT2STLXML module is the conversion of every possible EBU-TT document to STLXML. You may have overlooked the hint in the README of the module that says:

Also note that this module is mainly designed to allow the process of round-tripping from EBU STL to EBU-TT and back using SCF. This means that there is no guarantee that EBU-TT Part 1 files that were not converted to EBU-TT by using the SCF can be (correctly) converted back to EBU STL - though the source file might comply to the respective subtitling standards.

The scope of the module therefore does not cover the complete EBU-TT Part 1 spec.

The starting point for the SCF is and was the conversion from EBU STL to EBU-TT Part 1. You can use different strategies to convert EBU STL to EBU-TT Part 1, but we wanted to apply the well-documented mapping in EBU Tech 3360. From this document, we derived requirements that are all covered by tests. The requirements are documented (as for any other module of the SCF) in a requirements document that is published together with the source code.

From this initial conversion, we built out a transformation chain. Although each step in the chain can be executed independently, the kind of input that a step accepts is defined by the steps that have been applied before it. For example, an EBU-TT Part 1 to EBU-TT-D conversion only covers as input the structure of an EBU-TT Part 1 document that can be produced by the STLXML to EBU-TT step. The same holds for the EBU-TT to STLXML conversion: an EBU-TT document structure that falls outside the mapping strategy of the STLXML to EBU-TT conversion is not in the scope of the module and will therefore produce unpredictable results.

The input you are trying to transform contains, for example, forced line breaks with <br> elements inside <span> elements, e.g.:

<span>Hello<br/>world</span>

The mapping strategy of EBU Tech 3360 only documents examples where a forced line break is applied between <span> elements, not inside them. The STLXML2EBU-TT conversion therefore only produces line breaks like

<span>Hello</span><br/><span>world</span>

This is therefore also the only supported pattern for the EBU-TT to STLXML conversion.

Limiting support to this pattern does not imply that it is the only or the best way to place line breaks. My personal opinion is that it closely reflects how Teletext rows are built, but you may well find other reasons why a line break inside span elements should be used.
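To check whether a given document stays within the supported pattern, a small script can list the <span> elements that contain a <br> directly. The following is only a minimal sketch in Python, not part of the SCF: the function names are hypothetical, and elements are matched by local name so that it works with or without the TTML namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def spans_with_inner_breaks(xml_text):
    """Return all span elements that have a br element as a direct child."""
    root = ET.fromstring(xml_text)
    return [
        el
        for el in root.iter()
        if local_name(el.tag) == "span"
        and any(local_name(child.tag) == "br" for child in el)
    ]


supported = "<p><span>Hello</span><br/><span>world</span></p>"
unsupported = "<p><span>Hello<br/>world</span></p>"

print(len(spans_with_inner_breaks(supported)))    # 0
print(len(spans_with_inner_breaks(unsupported)))  # 1
```

An empty result only says that this one pattern is absent; it does not guarantee that the rest of the document is within the scope of the module.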

The main reason for us to limit the scope of each module is to spend the available development resources on the most needed features.

So how could you proceed with the type of input document you provided? One option would be to extend the conversion scope of the module to every possible EBU-TT Part 1 document. From my perspective, this would probably require too many changes, and it also raises the question of whether other modules should be refactored in the same way. Another option would be to write a separate XSLT that transforms your input document type (which can possibly be limited by the patterns of the BBC Subtitle Guidelines) into what the EBU-TT2STLXML module expects. This seems more realistic. You could try to go down this road and open-source your results, so that we could support you if questions come up. If this were to be integrated into the SCF later, it would also need documented requirements and test files for all mappings.
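The pre-processing idea can be illustrated for the line-break case. The sketch below uses Python instead of XSLT purely for illustration; it is not SCF code, the function names are hypothetical, and it assumes that only <br> elements that are direct children of a <span> need to be moved. Elements are matched by local name, so the TTML namespace is not required.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def has_inner_break(span):
    return any(local_name(child.tag) == "br" for child in span)


def split_span(span):
    """Split one span at its direct br children into span, br, span, ..."""
    pieces = []
    current = ET.Element(span.tag, dict(span.attrib))
    current.text = span.text
    for child in list(span):
        if local_name(child.tag) == "br":
            pieces.append(current)
            pieces.append(ET.Element(child.tag, dict(child.attrib)))
            # Text after the br becomes the text of the next span.
            current = ET.Element(span.tag, dict(span.attrib))
            current.text = child.tail
        else:
            current.append(child)
    pieces.append(current)
    # Drop spans that ended up with no text and no children.
    pieces = [
        p for p in pieces
        if local_name(p.tag) != "span" or p.text or len(p)
    ]
    pieces[-1].tail = span.tail
    return pieces


def move_breaks_out_of_spans(xml_text):
    """Rewrite <span>a<br/>b</span> as <span>a</span><br/><span>b</span>."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        new_children = []
        changed = False
        for child in list(parent):
            if local_name(child.tag) == "span" and has_inner_break(child):
                new_children.extend(split_span(child))
                changed = True
            else:
                new_children.append(child)
        if changed:
            parent[:] = new_children
    return ET.tostring(root, encoding="unicode")


print(move_breaks_out_of_spans("<p><span>Hello<br/>world</span></p>"))
# <p><span>Hello</span><br /><span>world</span></p>
```

The sketch copies all attributes of the original span onto every piece, so a span carrying an xml:id would end up with duplicate IDs, and nested spans are not resolved. Both cases would need their own documented rule.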

nigelmegitt commented on July 30, 2024

Thanks for the mention of the BBC Subtitle Guidelines @TairT ! I want to add that general conversion of EBU-TT Part 1 to STL necessarily requires heuristics and rules for working around cases that STL cannot support. There are numerous examples of this, including:

  • Unicode is normally used in EBU-TT but not supported in STL. So you need a rule for what to do if, say, you encounter a symbol you cannot convert, like maybe €.
  • More characters can be placed on a line in EBU-TT than can be represented in Teletext. So you need a rule for how to handle this, like truncating text, reflowing onto additional lines, etc.
  • EBU-TT can specify font sizes much larger (or smaller) than can be supported in STL. Again, you need a rule for how to handle those if they are encountered.
  • EBU-TT can specify more fine-grained positioning than STL can. You need a rule for how to map the positions in the EBU-TT file into Teletext-style positions in STL.
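To make the idea of "a rule" concrete, here is a rough Python sketch of three such rules: character fallback, line reflow, and vertical position mapping. Everything in it is an assumption for illustration: the function names are hypothetical, the supported character set is a printable-ASCII stand-in rather than a real STL character code table, and the 40-column width and the row range 1 to 23 are assumed Teletext limits that a real converter would take from its target profile.

```python
import textwrap

# Assumed limits, for illustration only.
MAX_COLS = 40
FIRST_ROW, LAST_ROW = 1, 23

# Stand-in for the characters the target can encode (printable ASCII here).
SUPPORTED = {chr(code) for code in range(0x20, 0x7F)}


def replace_unsupported(text, fallback="?"):
    """Rule 1: swap every character outside SUPPORTED for a fallback."""
    return "".join(ch if ch in SUPPORTED else fallback for ch in text)


def reflow(text, max_cols=MAX_COLS):
    """Rule 2: reflow an overlong line onto as many rows as it needs."""
    return textwrap.wrap(text, width=max_cols)


def to_row(origin_y_percent):
    """Rule 3: map a vertical position in percent onto a row number."""
    pct = min(max(origin_y_percent, 0.0), 100.0)
    return FIRST_ROW + round(pct / 100.0 * (LAST_ROW - FIRST_ROW))


print(replace_unsupported("Price: 5 €"))  # Price: 5 ?
print(reflow("word " * 10))               # two rows, each at most 40 characters
print(to_row(0.0), to_row(100.0))         # 1 23
```

Each of these choices is lossy in its own way (a replaced symbol, an extra row that may not fit, a position snapped to a grid), which is why the rules have to be decided and documented per use case.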

There are many many more such examples of constraints in STL that are absent in EBU-TT. It is not an easy task!

andreastai commented on July 30, 2024

@nigelmegitt You are absolutely right. That is why we only support a very well-defined use case. In operation, the module is currently used mostly in a round-tripping mode: EBU STL files are translated to XML, manipulated, and then translated back to EBU STL.

andreastai commented on July 30, 2024

@jayotirana I will close this issue for now, because the issue you mention is currently out of scope. But feel free to comment if you need further assistance to adjust your transformations. I would also be interested to hear more about your general use case.
