
dotify.api's People

Contributors

bertfrees, dkager, joeha480

dotify.api's Issues

Export "metadata" about the conversion

It would be useful to have some metadata about the conversion, for example:

  • The start and end page number of each section.
  • The start and end values of each marker class in each section. (The idea of this is to be able to compute the print page number range of each volume.)
  • ...

I'm not sure how the metadata would be exported. Some metadata would probably be suitable for storing in the PEF itself, but for other metadata this is not the case. So maybe just an additional output document, or possibly some callback function in Java?

This is a request from Dedicon, who would like to store some metadata in a file, including the braille and print page number ranges.
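One shape the "callback function in Java" option could take is sketched below. This is only an illustration of the idea: the names ConversionMetadataListener, SectionPageRange and CollectingMetadataListener are hypothetical and do not exist in dotify.api.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value object: the page range covered by one section. */
final class SectionPageRange {
    final int sectionIndex;
    final int firstPage;
    final int lastPage;

    SectionPageRange(int sectionIndex, int firstPage, int lastPage) {
        if (firstPage > lastPage) {
            throw new IllegalArgumentException("firstPage must not exceed lastPage");
        }
        this.sectionIndex = sectionIndex;
        this.firstPage = firstPage;
        this.lastPage = lastPage;
    }
}

/** Hypothetical callback that a converter could invoke once per finished section. */
interface ConversionMetadataListener {
    void sectionCompleted(SectionPageRange range);
}

/** Example listener that only collects the reported ranges, e.g. to write them to a file later. */
final class CollectingMetadataListener implements ConversionMetadataListener {
    private final List<SectionPageRange> ranges = new ArrayList<>();

    @Override
    public void sectionCompleted(SectionPageRange range) {
        ranges.add(range);
    }

    /** Returns a copy so that callers cannot modify the collected state. */
    List<SectionPageRange> getRanges() {
        return new ArrayList<>(ranges);
    }
}
```

A caller such as Dedicon's pipeline could then serialize the collected ranges to whatever file format it prefers, without the PEF itself having to carry that information.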

MediaType is not a proper interface

MediaType is not a proper interface. It would be more useful if it were an enum with some additional information attached to it, such as file extensions for each entry.
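A minimal sketch of what such an enum might look like follows. The name MediaTypeSketch, the entries and the forExtension lookup are assumptions made for illustration, not the existing dotify.api MediaType.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical enum; the entries and their metadata are illustrative only. */
enum MediaTypeSketch {
    PEF("application/x-pef+xml", "pef"),
    PLAIN_TEXT("text/plain", "txt"),
    HTML("text/html", "html", "htm");

    private final String identifier;
    private final List<String> extensions;

    MediaTypeSketch(String identifier, String... extensions) {
        this.identifier = identifier;
        this.extensions = Collections.unmodifiableList(Arrays.asList(extensions));
    }

    String getIdentifier() {
        return identifier;
    }

    List<String> getExtensions() {
        return extensions;
    }

    /** Finds the first entry that lists the given file extension (case-insensitive). */
    static Optional<MediaTypeSketch> forExtension(String extension) {
        String wanted = extension.toLowerCase(Locale.ROOT);
        return Arrays.stream(values())
                .filter(type -> type.extensions.contains(wanted))
                .findFirst();
    }
}
```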

Provide way to pass "context" to BrailleTranslationResult functions

where "context" = info about the preceding and following segment.

This is needed if you want the line breaking of multiple consecutive segments to be correct. If line breaking is performed on each segment independently, a line may be broken exactly between two segments even though a break is not always allowed there.

The context could look something like

PrecedingSegmentInfo preceding, FollowingSegmentInfo following

where

interface PrecedingSegmentInfo {
    public boolean endsWithBreakOpportunity();
    public boolean endsWithCollapsibleSpace();
}

interface FollowingSegmentInfo {
    public boolean startsWithBreakOpportunity();
    public boolean startsWithCollapsibleSpace();
}

and BrailleTranslationResult would implement PrecedingSegmentInfo, FollowingSegmentInfo

Another use case (mainly for illustration, because the current solution kind of works with some workarounds) could be to leave the white space normalisation entirely up to the braille translator. For example, right now the Dotify formatter strips leading white space from blocks because this can't be left up to the translator. When passing the context this would become possible.
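To illustrate how a formatter might use the two interfaces proposed above, here is a rough sketch. The rules in SegmentBoundary and the toy TextSegment class are assumptions for the sake of the example; the actual rules would be up to the translator.

```java
interface PrecedingSegmentInfo {
    boolean endsWithBreakOpportunity();
    boolean endsWithCollapsibleSpace();
}

interface FollowingSegmentInfo {
    boolean startsWithBreakOpportunity();
    boolean startsWithCollapsibleSpace();
}

/** Hypothetical helper holding the boundary decisions. */
final class SegmentBoundary {
    private SegmentBoundary() {
    }

    /** Assumed rule: a break is allowed if either side offers a break opportunity. */
    static boolean canBreakBetween(PrecedingSegmentInfo preceding, FollowingSegmentInfo following) {
        return preceding.endsWithBreakOpportunity() || following.startsWithBreakOpportunity();
    }

    /** Assumed rule: if both sides have collapsible space at the boundary, one of them is redundant. */
    static boolean shouldCollapseLeadingSpace(PrecedingSegmentInfo preceding, FollowingSegmentInfo following) {
        return preceding.endsWithCollapsibleSpace() && following.startsWithCollapsibleSpace();
    }
}

/** Toy segment that treats an ASCII space as both a break opportunity and collapsible space. */
final class TextSegment implements PrecedingSegmentInfo, FollowingSegmentInfo {
    private final String text;

    TextSegment(String text) {
        this.text = text;
    }

    @Override
    public boolean endsWithBreakOpportunity() {
        return text.endsWith(" ");
    }

    @Override
    public boolean endsWithCollapsibleSpace() {
        return text.endsWith(" ");
    }

    @Override
    public boolean startsWithBreakOpportunity() {
        return text.startsWith(" ");
    }

    @Override
    public boolean startsWithCollapsibleSpace() {
        return text.startsWith(" ");
    }
}
```

With this sketch, the segments "inter" and "national" would not be broken apart at their boundary, while "hello " and "world" could be.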

Don't require embossers to publish internal tables

Embossers are currently required to publish their internal tables before those tables can be used with them.
Instead, an embosser could provide its list of internal tables directly.
A table from that list could then be selected by identifier.

The current mechanism, setting a table using a Table instance, should only be used if the embosser allows other tables.

There's a potential problem with solving this. Since all tables currently published are available to any implementation, some users exploit this by using the GenericEmbosser (which accepts any table) instead of the make-specific implementation to which a table belongs.

How to number footnotes?

How can footnotes be numbered automatically in Dotify, and how can the numbering be reset on each braille page?

Example 1:

Input:
https://raw.githubusercontent.com/snaekobbi/functional-testing/e3db2c61bd8d64fbca57b1787aa3f3d4f1bf4e3c/src/resources/100.1/source.xml
Output:
https://raw.githubusercontent.com/snaekobbi/functional-testing/e3db2c61bd8d64fbca57b1787aa3f3d4f1bf4e3c/src/resources/100.1/result.pef

Example 2:

Input:
https://raw.githubusercontent.com/snaekobbi/functional-testing/e3db2c61bd8d64fbca57b1787aa3f3d4f1bf4e3c/src/resources/100.2/source.xml
Output:
https://raw.githubusercontent.com/snaekobbi/functional-testing/e3db2c61bd8d64fbca57b1787aa3f3d4f1bf4e3c/src/resources/100.2/result.pef

Original issue reported on code.google.com by bertfrees on 7 Jan 2015 at 1:38

VolumeContentBuilder allows invalid sequences of operation

The VolumeContentBuilder currently allows invalid sequences of operations, such as
newSequence(null);
newOnTocStart();

Doing so could throw an IllegalStateException, but whether it does is up to the implementation. This causes unnecessary uncertainty, which can be avoided by moving the state-dependent methods to separate interfaces. For example, newSequence(null) could return a FormatterCore for that sequence.

Similarly, newTocSequence(null) could return another interface containing the methods needed, e.g. newOnTocStart() (which in turn could return a FormatterCore).
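The split into state-specific interfaces could be sketched roughly as follows. All type names ending in "Sketch", as well as TocSequenceBuilder and RecordingVolumeContentBuilder, are hypothetical stand-ins, and the Object parameter merely takes the place of whatever properties type the real methods accept.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the real FormatterCore, reduced to what this sketch needs. */
interface FormatterCoreSketch {
    void addChars(String text);
}

/** Hypothetical: methods that are only valid inside a TOC sequence. */
interface TocSequenceBuilder {
    FormatterCoreSketch newOnTocStart();
}

/** Hypothetical top-level builder that no longer exposes state-dependent methods. */
interface VolumeContentBuilderSketch {
    FormatterCoreSketch newSequence(Object properties);

    TocSequenceBuilder newTocSequence(Object properties);
}

/** Toy implementation that records the order of calls. */
final class RecordingVolumeContentBuilder implements VolumeContentBuilderSketch {
    final List<String> events = new ArrayList<>();

    private FormatterCoreSketch core(String label) {
        return text -> events.add(label + ":" + text);
    }

    @Override
    public FormatterCoreSketch newSequence(Object properties) {
        events.add("sequence");
        return core("sequence");
    }

    @Override
    public TocSequenceBuilder newTocSequence(Object properties) {
        events.add("toc-sequence");
        return () -> {
            events.add("on-toc-start");
            return core("on-toc-start");
        };
    }
}
```

With this shape, calling newOnTocStart() directly after newSequence(null) no longer compiles, because the object returned by newSequence has no such method. The invalid sequence is ruled out by the type system instead of by an optional runtime check.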

Original issue reported on code.google.com by [email protected] on 21 May 2014 at 7:07

Add support for an editable source DOM to the formatter API

The formatter API is currently a push interface which has to be accessed in a specific order (similar to the order of an OBFL source). In order to fully support interactive editing, the API should be redesigned to support unordered read/write operations on all formatter interfaces pertaining to the DOM.

Add support for "fail" option as fallback method

In EightDotFallbackMethod, the supported fallback methods are: mask, replace, remove.

Add support for "fail" option as fallback method. If this method is chosen, an exception should be raised if a character in the range U+2840 to U+28FF is encountered.
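The check behind a "fail" option could be as simple as the sketch below. The class FailFallbackSketch and the exception type EightDotCharacterException are hypothetical names; how this would be wired into EightDotFallbackMethod is left open.

```java
/** Hypothetical exception type; not part of dotify.api. */
final class EightDotCharacterException extends RuntimeException {
    EightDotCharacterException(String message) {
        super(message);
    }
}

final class FailFallbackSketch {
    private FailFallbackSketch() {
    }

    /**
     * Returns the input unchanged, or throws if it contains a character
     * in the range U+2840 to U+28FF (patterns that use dot 7 or dot 8).
     */
    static String check(String braille) {
        for (int i = 0; i < braille.length(); i++) {
            char c = braille.charAt(i);
            if (c >= '\u2840' && c <= '\u28FF') {
                throw new EightDotCharacterException(
                        String.format("Eight-dot pattern U+%04X at index %d", (int) c, i));
            }
        }
        return braille;
    }
}
```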

Remove API uses of get/setFeature in embosser package

The get/setFeature methods are overused in the embosser package. In fact, they are relied on for normal operation, most notably for getting/setting a Table. This is confusing for new users of the API. The Embosser interface implies that some features can be set (via various "supports"-methods), but the relationship between this interface and EmbosserFeatures is completely implicit.

The get/setFeature should remain, but only for non-API use, meaning features not defined by the API.

BrailleTranslator should support unresolved text

There should be a way to provide text that is "unresolved", in other words text that may change before that particular line has been rendered with nextTranslatedRow in BrailleTranslatorResult. This will also require dynamically sized TextAttributes. Doing this would also remove the need for MarkerProcessor which could be deprecated from external use. This will simplify the addition of new translators.

Use custom objects instead of passing e.g. strings

Create special objects instead of passing around strings. This will clarify intent and show which strings accept the same values and which probably don't.
Examples of new objects:

  • TranslatorMode
  • TextStyle
  • RowSpacing
  • InternetMediaType
  • FileExtension

Note that it is not necessary to limit accepted values to a predefined set. The purpose is simply to clarify the type of string expected. However, it might be useful to also provide an enum with common values, e.g.

enum TranslatorModeCatalog {
  UNCONTRACTED(new TranslatorMode("uncontracted"));

  private TranslatorModeCatalog(TranslatorMode mode) {
    ....
  }
}
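A fuller version of this idea might look like the sketch below. TranslatorMode does not exist yet, so its shape here (a thin immutable wrapper around a string) is an assumption, as are the getIdentifier() and getMode() accessors.

```java
import java.util.Objects;

/** Hypothetical wrapper that marks a string as a translator mode. Any value is accepted. */
final class TranslatorMode {
    private final String identifier;

    TranslatorMode(String identifier) {
        this.identifier = Objects.requireNonNull(identifier, "identifier");
    }

    String getIdentifier() {
        return identifier;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof TranslatorMode
                && identifier.equals(((TranslatorMode) other).identifier);
    }

    @Override
    public int hashCode() {
        return identifier.hashCode();
    }

    @Override
    public String toString() {
        return identifier;
    }
}

/** Catalog of common values; callers remain free to create other modes. */
enum TranslatorModeCatalog {
    UNCONTRACTED(new TranslatorMode("uncontracted"));

    private final TranslatorMode mode;

    TranslatorModeCatalog(TranslatorMode mode) {
        this.mode = mode;
    }

    TranslatorMode getMode() {
        return mode;
    }
}
```

Because TranslatorMode implements equals and hashCode, a mode created ad hoc from the string "uncontracted" compares equal to the catalog entry, so the catalog stays a convenience and does not act as a restriction.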

Improve TextBorderFactory API

The TextBorderFactory API is very difficult to use, due to lack of any specified way to create a border. This should be fixed.

Support at least:

  • newTextBorderStyle(Border border)

A factory should be created with newFactory(String mode).

(Have a look at the other factories, and try to make it more similar to them)

Improve MetaDataItem

  • It's odd to have a builder when there are only three arguments and very little complexity.
  • The naming of the methods is strange (getKey()->getElementName(), getValue()->getTextContents()).

Deprecate engine package

The engine package could be deprecated, as it is a less general version of streamline-api's tasks. Verify that streamline-api can be used as a substitute, and then deprecate the engine package.

Stricter BrailleTranslator contract

This is primarily just a note to myself.

I expected a BrailleTranslator to be something that can quite accurately predict what kind of input it will get, given that it has knowledge about the OBFL that is fed into the formatter. But it turns out that besides the text nodes of the OBFL and numbers, other input is possible, such as spaces and question marks. As this is not explicitly in the contract, it makes the input in fact completely unpredictable. Joel says this is by design. The BrailleTranslator contract is that it should be able to translate any input. (Note: because there is "braille" in the name it would probably be good to add that the output should be braille. Maybe create a new interface that does not have this restriction and which can be used for PagedMediaWriters other than PEFWriter.)

I say that if the contract were more specific, you could make more assumptions on the input which would allow you to make simpler implementations that don't do more than strictly needed. IMO this would be a tiny bit nicer from the point of SoC.

Of course this whole issue comes from my opinion that it should be possible to have a pre-translated OBFL so that the BrailleTranslator only has to worry about formatting numbers which is a relatively easy task compared to a full-blown braille translator.

However this is not a huge issue for me because I can easily provide a fallback braille translator that can be used when unexpected input is received.

Improve DefaultTextAttribute

  • DefaultTextAttribute.Builder could have a build() method with an implicit length, in other words:
    if builder.attributes.size()>0 then the length is equal to the combined length of all builder.attributes
  • Also, the attributes are not copied, meaning that the builder could change the already built instance.
  • Attributes with no children don't really need a builder, and could be created more directly.
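The three points above can be sketched together as follows. TextAttributeSketch is a hypothetical stand-in for DefaultTextAttribute, and the assumption that a parent's width is the sum of its children's widths is taken from the first bullet.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for DefaultTextAttribute. */
final class TextAttributeSketch {
    private final int width;
    private final String name;
    private final List<TextAttributeSketch> children;

    private TextAttributeSketch(int width, String name, List<TextAttributeSketch> children) {
        this.width = width;
        this.name = name;
        this.children = children;
    }

    /** Attributes without children are created directly, without a builder. */
    static TextAttributeSketch leaf(int width, String name) {
        return new TextAttributeSketch(width, name, Collections.emptyList());
    }

    int getWidth() {
        return width;
    }

    String getName() {
        return name;
    }

    List<TextAttributeSketch> getChildren() {
        return children;
    }

    static final class Builder {
        private final String name;
        private final List<TextAttributeSketch> attributes = new ArrayList<>();

        Builder(String name) {
            this.name = name;
        }

        Builder add(TextAttributeSketch child) {
            attributes.add(child);
            return this;
        }

        /** Builds with an implicit width: the sum of the children's widths. */
        TextAttributeSketch build() {
            if (attributes.isEmpty()) {
                throw new IllegalStateException("No children: use leaf(width, name) instead");
            }
            int total = 0;
            for (TextAttributeSketch child : attributes) {
                total += child.getWidth();
            }
            // Copy the list so that later changes to the builder do not leak into this instance.
            return new TextAttributeSketch(
                    total, name, Collections.unmodifiableList(new ArrayList<>(attributes)));
        }
    }
}
```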
