
iconvg's Introduction

IconVG

IconVG is a compact, binary format for simple vector graphics: icons, logos, glyphs and emoji.

WARNING: THIS FORMAT IS EXPERIMENTAL AND SUBJECT TO INCOMPATIBLE CHANGES.

It is similar in concept to SVG (Scalable Vector Graphics) but much simpler. Compared to SVG Tiny, which isn't actually tiny, it does not have features for text, multimedia, interactivity, linking, scripting, animation, XSLT, DOM, combination with raster graphics such as JPEG formatted textures, etc.

It is a format for efficient presentation, not an authoring format. For example, it does not provide grouping individual paths into higher level objects. Instead, the anticipated workflow is that artists use other tools and authoring formats like Inkscape and SVG, or commercial equivalents, and export IconVG versions of their assets, the same way that they would produce PNG versions of their vector art. It is not a goal to be able to recover the original SVG from a derived IconVG.

It is not a pixel-exact format. Different implementations may produce slightly different renderings, due to implementation-specific rounding errors in the mathematical computations when rasterizing vector paths to pixels. Artifacts may appear when scaling up to extreme sizes, say 1 million by 1 million pixels. Nonetheless, at typical scales, e.g. up to 4096 × 4096, such differences are not expected to be perceptible to the naked eye.

Example

Cowbell image

  • cowbell.png is 18555 bytes (256 × 256 pixels)
  • cowbell.svg is 4506 bytes
  • cowbell.iconvg is 1012 bytes (see also its disassembly)

The test/data directory holds these files and other examples.

File Format

  • IconVG Specification
  • Magic number: 0x8A 0x49 0x56 0x47, which is "\x8aIVG".
  • Suggested file extension: .iconvg
  • Suggested MIME type: image/x-iconvg
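As a quick illustration of the magic number listed above, here is a minimal sketch of a file sniffer. The function name `looks_like_iconvg` is made up for this example and is not part of any IconVG library; it only checks the four header bytes and says nothing about whether the rest of the file is valid.

```python
# The four magic bytes from the list above: 0x8A 0x49 0x56 0x47 ("\x8aIVG").
ICONVG_MAGIC = b"\x8aIVG"


def looks_like_iconvg(data: bytes) -> bool:
    """Return True if data starts with the IconVG magic number.

    Hypothetical helper for illustration only; it does not validate
    anything beyond the first four bytes.
    """
    return data[:4] == ICONVG_MAGIC
```

A PNG file starts with 0x89, so the first byte alone already separates the two formats.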

Implementations

This repository contains:

The original Go IconVG package also implements a decoder and encoder, albeit for an older (obsolete) version of the file format.

Disclaimer

This is not an official Google product, it is just code that happens to be owned by Google.


Updated on January 2022.

iconvg's People

Contributors

dependabot[bot], hixie, nigeltao


iconvg's Issues

Proposal: File Format Versions 1, 2 and Beyond

Summary

I propose to:

  • Retroactively name the current (as of May 2021) IconVG wire format as FFV 0 (File Format Version 0).
  • Consider FFV 0 a deprecated experiment and no longer supported.
  • Introduce FFV 1 as a subset (in terms of features) of FFV 0 but incompatible (in terms of wire format), in order to remove features that are hard to combine with animation, and to clean up some design warts.
  • FFV 2 (if it happens, in the future) will be a superset of FFV 1 in terms of both features and wire format (other than a one-byte-change version-number bump). The intended headline feature for FFV 2 is animation (issue #2). FFVs 1 and 2 can equivalently be considered Small and Large profiles of the overall IconVG file format. For example, a future font format could embed FFV 1 (static) graphics but not FFV 2 (static and animated).

Background

Since its inception in 2016, IconVG has always carried the caveat that "WARNING: THIS FORMAT IS EXPERIMENTAL AND SUBJECT TO INCOMPATIBLE CHANGES".

Issue #2 in this repository is about adding animation to IconVG graphics. Tweening would almost certainly involve transformations (in the "affine transformation" sense) and interpolation.

The original IconVG design took the entirety of the SVG path model, including elliptical arc segments. Unlike line_to, quad_to and cube_to, arc_to's parameterization is unique, not being a sequence of (x, y) coordinate pairs, and a boolean argument like large-arc-flag is impossible to interpolate smoothly.

Rasterization backends like Cairo and Skia also don't provide arc_to as a primitive, or if they do, not in the way that SVG parameterizes it. We usually approximate arcs as cubic splines.

Also recall that IconVG is a presentation format, not an authoring format, and it already isn't able to represent groups, strokes, text, etc 'natively'. Authoring tools like Illustrator or Inkscape, if they could export to IconVG, are expected to 'lower' e.g. stroked paths to more primitive operations (filled paths), the same way that they would 'flatten' layers if exporting to PNG. I'd expect such tools could also 'lower' arcs to cubic Béziers during export.

Thus, I'm considering removing arcs from the file format. This new version (File Format Version 1) would not be a superset of FFV 0 per se, but FFV 0 files could be converted in a straightforward way and the rasterizations would be equivalent. In essence, 'lowering' arcs becomes the responsibility of the authoring tools (which get more complicated) instead of the presentation tools (which get simpler).
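To give a feel for what 'lowering' an arc involves, here is a rough sketch of the textbook construction for approximating a circular arc with a single cubic Bézier. It is not the algorithm used by this repository's C implementation (I have not copied that here), it only handles arcs of the unit circle, and the function names are invented for this example. A real SVG elliptical arc would additionally need the endpoint-to-center conversion and an affine transform.

```python
import math


def arc_to_cubic(a0, a1):
    """Approximate the unit-circle arc from angle a0 to a1 (radians)
    with one cubic Bezier. Works best for sweeps of at most 90 degrees;
    longer arcs should be split first.
    """
    # Standard control-point distance for a circular arc.
    k = (4.0 / 3.0) * math.tan((a1 - a0) / 4.0)
    p0 = (math.cos(a0), math.sin(a0))
    p3 = (math.cos(a1), math.sin(a1))
    # Control points lie along the tangents at the two endpoints.
    p1 = (p0[0] - k * math.sin(a0), p0[1] + k * math.cos(a0))
    p2 = (p3[0] + k * math.sin(a1), p3[1] - k * math.cos(a1))
    return p0, p1, p2, p3


def cubic_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier at parameter t (0 <= t <= 1)."""
    u = 1.0 - t
    return tuple(
        u**3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t**3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )
```

For a quarter circle the curve stays within roughly 0.03% of the true radius, which is why the approximation is invisible at icon sizes.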

Separately, the original Go implementation (the golang.org/x/exp/shiny/iconvg package in a separate repository) was released as an interim milestone of the unfinished 'Shiny' Go GUI project. IconVG hasn't had much adoption so far, as the only implementation was in Go and so not usable from e.g. C++, Dart or Python GUI programs. In recent weeks, this repository has gained a brand new C implementation, but we still don't yet have a vast back-catalogue of existing IconVG files to constrain us.

Bringing all of the above together, if I were ever to make an IconVG FFV 1, especially one that isn't a superset of FFV 0 (because arcs), then now is the time to do it.

This issue is a place to discuss that process and what other features to add or warts to remove as part of FFV 1.

File Format Changes

See the spec for context.

The major change is:

  • Remove the A and a arc-related drawing opcodes.

Minor clean-up changes are:

  • Change the first byte of the magic header from 0x89 to 0x8A, so that we can distinguish IconVG from PNG (from JPEG from WebP etc) just from the first byte of the file. https://en.wikipedia.org/wiki/List_of_file_signatures doesn't show any previous claims on 0x8A.
  • Add an explicit FFV number in the wire format. Specifically, change the fourth byte of the magic header from 0x47 (ASCII 'G') to 0x31 (ASCII '1') for FFV 1, 0x32 (ASCII '2') for FFV 2, etc.
  • MID numbers must use the shortest possible encoding.
  • Re-number the ViewBox and Suggested Palette MIDs (Metadata IDs) from 0 and 1 to 8 and 16 (which are represented on the wire as 0x10 and 0x20). Since metadata is presented in increasing MID order, the gaps allow future extensions to insert (optional) metadata chunks before these existing ones.
  • Prohibit encoded real numbers being NaNs. In the end state, the spec should no longer mention "undefined behavior".
  • Tighten restrictions on gradients: there must be at least two stops and the offsets must span from 0 to 1 inclusive.
  • Maybe some other small things I've forgotten.

Implementations

  • This repository's C and Go libraries (the latter also to be called the 'new' Go library) will speak FFVs 1+ but not FFV 0.
  • The original written-in-2016 Go library (the 'old' Go library), at golang.org/x/exp/shiny/iconvg, will speak FFVs 0 and 1+, delegating the latter to the 'new' Go library.
  • The 'old' Go library will also gain tools to upgrade FFV 0 files to FFV 1.

Notably, any existing Go code (using the 'old' Go library) displaying existing (FFV 0) files will continue to work.

Timeline

FFV 1 should be finalized 'soon'. FFV 2 is more open ended and will require extensive prototyping.

C implementation allows `T` drawing command to reflect off `C` control point

If I'm reading the code correctly, in the C implementation, a 0xAF opcode (C command) followed by a 0x4F opcode (T command) will result in the T command using the last control point of the first opcode to compute the first control point of the second opcode, but the SVG spec says "if the previous command was not a Q, q, T or t, assume the control point is coincident with the current point".
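The SVG rule quoted above can be sketched as follows. This is my paraphrase of the SVG behavior, not code from the C implementation, and `smooth_quad_control` is a name invented for the example.

```python
def smooth_quad_control(prev_cmd, prev_ctrl, current):
    """Return the implied control point for a T/t (smooth quadratic) command.

    prev_cmd  -- the previous SVG path command letter, e.g. "Q" or "C"
    prev_ctrl -- the previous command's last control point, or None
    current   -- the current point (x, y)
    """
    if prev_cmd in ("Q", "q", "T", "t") and prev_ctrl is not None:
        # Reflect the previous quadratic control point through the current point.
        cx, cy = prev_ctrl
        x, y = current
        return (2 * x - cx, 2 * y - cy)
    # Any other previous command: the control point coincides with the current point.
    return current
```

Under this rule a T following a C must not reflect the cubic's second control point, which is the behavior reported above.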

Can LOD be set to NaN?

The specification states that:

It is valid for the 4 byte encoding to represent infinities and NaNs, but if not loaded into LOD0 or LOD1, the subsequent rendering is undefined.

This suggests that LOD0/1 can be set to NaN, but the spec nowhere describes the behavior in that case.

Q: Recommended in-browser IconVG decoder?

Hi! Do you know of any usable work, or have any recommendations, for IconVG decoding in the browser?

I'm thinking of something that'd be able to replace inline elements, like say <img src='data: image/x-iconvg;base64,iklWRwMLEVFRsbE1gVkzWYGBqTWFlTR9lX19NYV1NH11fW2I' />, with the rendered image.

I suppose compiling @Hixie's contributed Dart decoder to JS might be a fruitful approach – though of course supporting the current IconVG format would be preferred.

Thanks!

Animation

The original (2016) specification provides static (non-animated) vector graphics.

It might be nice to extend the file format (in a backwards compatible way) to support animated vector graphics. I don't have a specific design proposal at this stage, but this issue is the place to discuss what that might look like, for the file format and for the API.

Examples for gradient parsing

It would be useful if the spec gave some canonical examples of gradient decoding, to make sure the implementor interpreted the logic correctly. (This is already done in a number of other places, e.g. decoding numbers.)

Guidance on how to create IconVG file

Thanks for your efforts.
Really looking forward to official Flutter support!

Just out of curiosity how would those files be created?
Are all files at this point handcrafted or is there some library already to generate those from SVGs?

IconVG virtual machine also has x and y registers

One of the important but implicit details about an IconVG virtual machine is that in addition to all the described registers, it also holds x and y registers (float values of some sort; I guess the precision isn't strictly observable), which must be updated as drawing opcodes are run. Currently this is only implied by the deference to the SVG spec and its definition of "current point".

Drawing opcodes section could be clarified

The drawing opcodes section is a bit confusing. First, the deference to SVG makes everything have an additional level of (very ambiguous) indirection, which is really confusing. SVG is defining all this in the context of a text attribute, not an abstract data model, so a great deal of squinting and hand-waving is required to strictly map the IconVG definition to the SVG spec text.

There are also some more specific issues in this text. For example, it's not 100% clear exactly how to interpret the parameters:

The opcode is followed by (2 * RC) coordinates, RC sets of (x, y).

...presumably is intended to mean that you read two coordinate numbers and apply the "lineto" command for those two numbers, RC times overall? There's a lot of vagueness here that is only really disambiguated by guessing at the most likely intention based on knowing how SVG path data works, which is not ideal for a spec. :-)

Fill type unspecified?

It's not clear to me what the expected fill type is for paths in IconVG. Is it even-odd? non-zero? something else?

Clarity on LOD Jump decoding

The spec mentions that an LOD Jump is followed by two floating point numbers

The JumpCount is followed by two floating point numbers, LOD0 and LOD1

However, in tests/data/lod-polygon.iconvg.disassembly, they appear to be decoded as coordinate numbers.

3a            #0003 Jump Level-of-Detail
07                  Target: #0007 (PC+3)
81                  +0
02 d0               +80

The spec mentions early on that floating point numbers always take four bytes.

Floating point numbers are always encoded in 4 bytes: the little-endian encoding of a 32-bit IEEE 754 float32 number.

Following this leads to incorrectly decoding lod-polygon.

I suggest the verbiage be updated to say that LOD0 and LOD1 are coordinate numbers.
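For reference, here is a sketch of a decoder that reproduces the two values in the disassembly above. The tag-bit layout (low bit 1 means one byte, low two bits 10 mean two bytes) and the bias and scale constants are inferred from those two byte sequences alone, not quoted from the spec, so treat them as assumptions. The function name is made up for the example.

```python
def decode_coordinate(buf, pos=0):
    """Decode one coordinate number starting at buf[pos].

    Returns (value, next_pos). The layout below is an ASSUMPTION inferred
    from the lod-polygon disassembly, not taken from the spec text.
    """
    b0 = buf[pos]
    if (b0 & 0x01) == 0x01:
        # Assumed 1-byte form: 7-bit payload, biased by 64.
        return float((b0 >> 1) - 64), pos + 1
    if (b0 & 0x03) == 0x02:
        # Assumed 2-byte form: little-endian, 14-bit payload,
        # scaled by 1/64 and biased by 128.
        u = b0 | (buf[pos + 1] << 8)
        return (u >> 2) / 64.0 - 128.0, pos + 2
    # Longer encodings are deliberately left out of this sketch.
    raise NotImplementedError("only the 1- and 2-byte forms are sketched")
```

With this reading, the byte 81 decodes to +0 and the bytes 02 d0 decode to +80, matching the disassembly, whereas treating them as the start of a 4-byte float32 would consume bytes that belong to the next instruction.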

Specifications

  • Version: FFV1
  • Platform: N/A

"IVG" has strong meaning in French

For French speakers, the suggested extension .ivg will remind them of "IVG", which is the acronym of "Interruption volontaire de grossesse", meaning "Abortion".

I just wanted to share for awareness.

cowbell.svg includes a lot of unnecessary stuff

This is an interesting project, but to make a better case against the size of SVG, your cowbell.svg example should probably be more representative of how it should be used for final-form delivery.

For example, it has the following style attribute on a path element:

color-rendering:auto;text-decoration-color:#000000;color:#000000;isolation:auto;mix-blend-mode:normal;shape-rendering:auto;solid-color:#000000;block-progression:tb;text-decoration-line:none;image-rendering:auto;white-space:normal;text-indent:0;text-transform:none;text-decoration-style:solid

I'm not even sure what tool would put those there - text properties on a path?? But also I think each of them is the default value anyway.

Many of the elements have an unnecessary id attribute.

The file also defines the rdf namespace, the creative commons namespace, and the dc namespace - none of which are used in the content.

Maybe it should be run through some optimisation tool before being used as an example?

(Side note: the actual transfer size of the SVG is about 2.7Kb)

Specify color space (sRGB?)

I didn't see a mention of either "color space" or "srgb" in https://github.com/google/iconvg/blob/main/spec/iconvg-spec.md, nor did I find references in the original IVG spec. I believe the (implicit) color space should be specified, and because colors are limited to 32 bits, I suggest the common sRGB color space. As a corollary, conformant rasterizers must blend and compute gradients in linear RGB space.
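To make the corollary concrete, here is a small sketch of blending in linear RGB using the standard sRGB transfer functions. It illustrates the suggestion in this issue, not anything the IconVG spec currently mandates; channel values are floats in the range 0 to 1 and the function names are mine.

```python
def srgb_to_linear(c):
    """Convert one sRGB-encoded channel (0..1) to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Convert one linear-light channel (0..1) back to sRGB encoding."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def blend_in_linear(c0, c1, t):
    """Interpolate two sRGB-encoded channel values in linear space."""
    lin = (1.0 - t) * srgb_to_linear(c0) + t * srgb_to_linear(c1)
    return linear_to_srgb(lin)
```

The midpoint between black and white comes out at about 0.735 in sRGB encoding rather than 0.5, so a rasterizer that lerps the encoded bytes directly would render visibly darker gradients.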

How to initialize H in an unconstrained environment

It's not clear what H value to use when the rendering is going to be set to whatever the viewBox height is, since the viewBox height hasn't yet been parsed at the moment where the IconVG virtual machine is being configured.

Linear gradient matrix math

The appendix says that the linear gradient matrix can have d=0 and e=0. However, that means that the matrix in question is not invertible, so it's not clear to me what the expected effect actually is.

For example, suppose your matrix is:

| a=-0.06, b=-0.14, c= 7.04 |
| d= 0.00, e= 0.00, f= 0.00 |
|    0.00,    0.00,    1.00 |

How do you go from that to the x1,y1, x2,y2 pair that describes the linear matrix?
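One possible reading, offered as a sketch rather than an answer from the spec: if a linear gradient's offset at (x, y) is taken to be t = a*x + b*y + c (only the first matrix row), then the second row never matters and the matrix does not need to be invertible. Under that assumption, an (x1, y1), (x2, y2) pair can be recovered as the points on the t = 0 and t = 1 lines that lie along the gradient direction (a, b). The function name is invented for the example.

```python
def linear_gradient_endpoints(a, b, c):
    """Return ((x1, y1), (x2, y2)) such that a*x + b*y + c is 0 at the
    first point and 1 at the second, with both points on the line
    through the origin in direction (a, b).

    ASSUMPTION: the gradient offset is t = a*x + b*y + c; this is my
    interpretation, not confirmed by the spec.
    """
    n2 = a * a + b * b
    if n2 == 0.0:
        raise ValueError("a and b are both zero: the offset is constant")
    p0 = (-c * a / n2, -c * b / n2)
    p1 = ((1.0 - c) * a / n2, (1.0 - c) * b / n2)
    return p0, p1
```

For the matrix above this gives roughly (18.21, 42.48) and (15.62, 36.45). Any other pair obtained by sliding both points along the constant-t lines would describe the same gradient.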

Call opcodes and updated encoder

Hello.. love this project. I was sort of obsessed today with trying to create a function to output text.. which I think might not be possible, but it seems like at least I can make a function for outputting a character at a time.
But looking at the code, seems that call is not implemented.
Also would you consider updating the encoder for the new version?
I understand if you have other obligations. Just wanted to mention my interest in the continued work.

Gradients through transparent colors

The spec specifies that colors use premultiplied alpha. This causes problems for gradients that go through fully transparent (0x00000000) colors, since there is no way to know what the lerped color value should be. For example, a gradient from (RGBA) 0x00000000 to 0xFFFF00FF could be going from transparent black to opaque yellow, or transparent yellow to opaque yellow, or transparent green to opaque yellow, each of which has rather different effects.

The spec does not specify how to handle this.

Some of the golden images suggest that 0x00000000 should be treated as fully-transparent whatever-happens-to-be-the-next-stop?
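The ambiguity can be shown numerically. In this sketch colors are (r, g, b, a) tuples of floats in 0 to 1; the helper names are mine and nothing here is taken from an IconVG implementation.

```python
def premultiply(rgba):
    """Convert straight (non-premultiplied) RGBA to premultiplied RGBA."""
    r, g, b, a = rgba
    return (r * a, g * a, b * a, a)


def lerp(c0, c1, t):
    """Component-wise linear interpolation between two color tuples."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(c0, c1))


opaque_yellow = (1.0, 1.0, 0.0, 1.0)
transparent_black = (0.0, 0.0, 0.0, 0.0)
transparent_yellow = (1.0, 1.0, 0.0, 0.0)

# Interpolate in straight alpha, then premultiply for compositing.
mid_from_black = premultiply(lerp(transparent_black, opaque_yellow, 0.5))
mid_from_yellow = premultiply(lerp(transparent_yellow, opaque_yellow, 0.5))

# Interpolate the premultiplied values directly.
mid_premultiplied = lerp(
    premultiply(transparent_black), premultiply(opaque_yellow), 0.5
)
```

Both transparent starting colors premultiply to the same all-zero value, yet the straight-alpha midpoints differ: (0.25, 0.25, 0, 0.5) from transparent black versus (0.5, 0.5, 0, 0.5) from transparent yellow. Interpolating the premultiplied values directly gives (0.5, 0.5, 0, 0.5), which matches the "whatever the next stop is" reading.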

"The suggested palette is encoded in at least one byte." unclear

The MID 1 section of the spec says:

The suggested palette is encoded in at least one byte. The low 6 bits of that byte form a number N. [...] The chunk then contains N+1 explicit colors, in that 1, 2, 3 or 4 byte encoding.

It's not clear what the "at least one byte" part of this means. Since N cannot be negative, and there's always N+1 colors, wouldn't a more accurate statement be "at least two bytes"? It may be clearer to just say "The low 6 bits of the first byte of the metadata block after the MID form a number N...".

See also #11.

Design: Do we need color registers at all?

So well, I've tried to implement IconVG as an experiment and noticed a lot of things that haven't yet been mentioned elsewhere. I found the discussion in #4 helpful (and I agree with @Hixie that this can and should be made much simpler with an explicit set of goals in mind) but too broad in my humble opinion. So I will try to give a series of bite-sized pieces of feedback (others to come) that should be actionable at your discretion.

It is understandable that IconVG is fundamentally a series of commands given its goals, but the use of registers is unusual. It has a large number (5+) of different encodings for colors, which include a pseudo-operation (blend) and a clever partial encoding of gradients that refer to other registers. I presume this design is a result of these considerations:

  • All 256⁴ 8-bit RGBA colors (or to be exact, 1,082,146,816 premultiplied RGBA colors) should be representable.
  • Some colors appear a lot, so they should be reusable.
  • Some set of palette colors are (linearly) related so it would be great to derive some palette colors from others.
  • Some colors would be animatable in the future.
  • Some (in fact, probably most) colors do not need a greater precision and are better quantized.
  • No other fill textures than solid colors and two kinds of gradients are expected in the future.
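The premultiplied-color count in the first bullet can be checked with a few lines. This is just arithmetic: for each alpha value a, each of r, g and b can take any of the a + 1 values from 0 to a.

```python
def count_premultiplied_rgba(levels=256):
    """Count RGBA tuples with r, g, b each no greater than a."""
    return sum((a + 1) ** 3 for a in range(levels))


total = count_premultiplied_rgba()
```

The sum of the first 256 cubes equals (256 * 257 / 2) squared, which is 1,082,146,816, about a quarter of the 256⁴ = 4,294,967,296 unconstrained RGBA values.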

Still, the resulting design feels simultaneously too complex and yet unsatisfactory to me.

  • Two different quantizations (5³ and 16⁴) with somewhat overlapping ranges.
  • A blend (3-byte indirect encoding) of two concrete opaque colors is redundant. It is not as redundant if one color is fully opaque and another is fully transparent (acting as set-alpha operation), but that's pretty much the only use.
  • A blend operation is commutative, so one of two possible encodings is redundant.
  • 2-byte encoding (16⁴ quantization) can encode gradients with 17n stops. To be fair this is noted in the spec, but still is surprising and practically useless.
  • Register opcodes can access 7 different registers without changing selectors, but this is only useful for setting a large number of colors or numbers at once, i.e. gradients.
  • Gradients are practically limited to 58 stops, since matrix and stop positions share the same registers.
  • Suggested palettes can only use a single encoding.

If the compactness is a goal, redundant encodings should be avoided. If the simplicity is a goal, the whole gradient and blend business is absurd. The current design is a hodge-podge of two somewhat conflicting goals.

Concrete (Overlong) Shower Thought

It occurred to me that:

  • It is always possible to keep register usage to the minimum, it's just a matter of exploiting redundancy for the compression.
  • An LRU cache is useful for the compression in general.
  • With an exception of gradients, there is no reason to put some color or number to a particular register.
  • As an approximation to an LRU cache, a stack with implicit queuing can result in more compact encoding.
  • Stack machines rulez.

In my current proposal, styling opcodes are repurposed as follows:

0x00 .. 0x3f    c = CREG[opcode]; CREG[CSEL++] = c
0x40            CREG[CSEL-1] = blend CREG[CSEL-2] and CREG[CSEL-1] with subsequent byte
0x41            LOD0 = NREG[NSEL-2]; LOD1 = NREG[NSEL-1]; NSEL -= 2
0x42            CREG[CSEL++] = make gradient reference out of next three bytes:
                (NSTOPS - 2) + gradient shape * 128, CBASE + spread * 64, NBASE
0x43            start drawing with CREG[--CSEL]

0x80 .. 0x8f    CREG[CSEL++] = 3-byte color (32⁴ quantization)
0x90 .. 0xf3    CREG[CSEL++] = 4-byte color, 4x2 most significant bits encoded in opcode
0xf4            CREG[subsequent byte (should be < 64)] = CREG[--CSEL]

0xfb            NREG[subsequent byte] = NREG[--NSEL]
0xfc            NREG[NSEL++] = 1-byte natural number - 256
0xfd            NREG[NSEL++] = 1-byte natural number
0xfe            NREG[NSEL++] = 2-byte natural number / 128 - 256
0xff            NREG[NSEL++] = 4-byte IEEE 754 binary32

There are still 64 color "registers" and 256 number "registers" (up from 64), but they are mostly used as stack elements. Since they are defined in terms of registers pushing more than 64/256 elements would overwrite the bottom of stacks; this is intentional.
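Here is a toy model of the wrap-around behavior described above, for the color side only. It is a sketch of my proposal, not of any existing implementation, and it assumes CSEL is kept modulo 64; the class and method names are invented.

```python
class ColorStack:
    """Toy model of CREG used as a 64-entry stack with wrap-around."""

    SIZE = 64

    def __init__(self):
        self.creg = [0] * self.SIZE
        self.csel = 0  # ASSUMPTION: the selector wraps modulo SIZE

    def push(self, color):
        # CREG[CSEL++] = color
        self.creg[self.csel] = color
        self.csel = (self.csel + 1) % self.SIZE

    def pop(self):
        # color = CREG[--CSEL]
        self.csel = (self.csel - 1) % self.SIZE
        return self.creg[self.csel]

    def store(self, index):
        # Models opcode 0xf4: CREG[index] = CREG[--CSEL]
        self.creg[index] = self.pop()
```

Pushing a 65th color overwrites CREG[0], the bottom of the stack, and a subsequent pop returns that 65th color.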

Stack references are absolute, counting from the bottom. This means that referencing the same color over and over always uses the same index. This does assume that each command statically "knows" current selectors; the addition of functions will require some thought (e.g. selectors can "rotate" on the function call).

Since it is possible to refer to elements beyond CREG[CSEL] and NREG[NSEL], they can be filled with good defaults. The suggested palette of size N can go to CREG[64-N] .. CREG[63] for example. Palette opcodes are removed for this reason.

The encoding of 4-byte color plus 1-byte opcode amazingly fits to 4 bytes. This is possible because we use premultiplied colors; there are only 1³ + 2³ + 3³ + 4³ = 100 possible combinations of 4 sets of most significant two bits. So we can pack remaining 4x6 bits to 3 bytes and then make use of remaining 156 combinations for other things. There is also a simpler alternative encoding relying on the MSB of alpha: 32 bits long if MSB is set (store alpha as opcode), 28 bits long if MSB is not set (assign 16 opcodes and pack to opcode + 3 bytes).
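The count of 100 can be reproduced by enumeration. If r is no greater than a, then the top two bits of r are no greater than the top two bits of a, and likewise for g and b. The mapping from each combination to an opcode in 0x90 .. 0xf3 below uses an enumeration order that I picked for illustration; the proposal does not fix one.

```python
from itertools import product

# All (R, G, B, A) tuples of top-two-bit values where R, G, B <= A.
combos = [
    (r, g, b, a)
    for a in range(4)
    for r, g, b in product(range(a + 1), repeat=3)
]

# ASSUMPTION: opcodes are assigned in enumeration order starting at 0x90.
opcode_for = {combo: 0x90 + i for i, combo in enumerate(combos)}
```

There are exactly 100 combinations, so the last one lands on opcode 0xf3 and 156 of the 256 byte values stay free for other opcodes. The remaining 4 × 6 low bits fill the three bytes that follow.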

I decided to remove one-byte (5³) quantization and expand two-byte (16⁴) quantization to make it 20 bits long (32⁴). It is still possible to add more quantizations, but the bulk of IconVG data consists of coordinates and not colors, so I don't think it's worth it. Since the shortest color opcode was already 2 bytes long, this doesn't make much difference anyway.

Blend is now a separate opcode, consuming one of two operands (c0 c1 -- c0 blend(c0,c1)). I think it is likely that one operand is fixed and another is changing, so making the operation asymmetric makes sense.

The switch to the drawing pops the topmost color. Since we can draw multiple paths in one sitting, the color is likely not reused. In the rare case that the color has to be reused (e.g. LOD changes) it takes just one more byte.

I kept the gradient reference to allow some stops of the gradient to be changed without making a new reference. (I once considered to make three different opcodes, but that will make functions less useful.) The actual encoding (like, alpha=0 and blue>=128) however is opaque to the content author. The opcode argument mostly follows the original 3-byte "invalid" color, but the gradient shape has to be moved because NBASE is now 8 bits long.

I frankly feel additional number opcodes hardly matter since we are not using numbers that much. For now I've specifically tuned to the matrix usage (0xfc .. 0xfe for c and f; 0xff for others). Maybe though we should remove all number operands except for binary32 and add them back according to the observed operand distribution.

That's all. Any thoughts would be appreciated.

Canonical unit tests

It would be great if the spec had a bunch of canonical test cases in some unambiguous format (something not prone to variations in antialiasing etc). This could be done for some aspects of the decoding, e.g. given an input custom palette, what does CREG contain at the start of execution, or given an IconVG file consisting only of styling opcodes, what are the values in the registers, viewBox, etc, at the end of execution.

Potential error in the specification of encoding types

The spec mentions:

0x03 means a 1-byte encoding.

However, tests/data/elliptical.iconvg.disassembly seems to use a 4-byte encoding if the lower two bits equal 0x03.

Line 18 of tests/data/elliptical.iconvg.disassembly

ab aa 2a 3d         +0.041666668

I'm unsure where the error is. Looking at the other disassembled files, it seems like the problem lies in elliptical.iconvg.disassembly.

Specifications

  • Version: FFV1 (commit ccf9563)
  • Platform: N/A

Do you use IconVG? Tell us!

This is not an issue so much as a lightweight way of gathering information on who is using IconVG. This is mostly to satisfy our curiosity, but might also help us decide how to evolve the project.

So, if you use IconVG for something, please chime in here and tell us more!

Allow only simple, absolute segments

SVG path specification is extremely overcomplicated and error-prone (afaik, the current iconvg C implementation is also incorrect #23). Maybe it would be easier to stick just to M, L, C, Z segments (absolute only)?

SVG supports relative commands to simplify authoring and reduce size on large coordinates. And I'm not sure it affects iconvg in any way.

Meaning of viewBox should be clearer

It's not clear what exactly the viewBox means.

  • Must an implementation clip drawing operations outside the viewBox? (Hopefully not, clips are expensive.)
  • May an implementation drop drawing operations outside the viewBox? (Different implementations would have different renderings.)
  • May a drawing draw outside the viewBox?
  • Should an implementation honour the viewBox aspect ratio when rendering the image?

Metadata block error conditions underdefined

It's not clear if it's legal for data in a metadata block to extend past the end of itself.

For example, consider a situation where the metadata section has a block declared with length 0. The byte after the length gives the MID. Suppose (for simplicity) that it's not recognized. That byte also then gives the first opcode of the file. Is that valid?

I can imagine (absurd) situations where an IconVG serializer finds ways to overlap data in this way to save a few bytes...

Similarly, is it valid for a metadata block to be too long? e.g. what if a MID=0 block has length 1024? Are the extra bytes just ignored, or is that an error?

Design: Coordinates Encoding

As per #29, it seems that we limit ourselves to lines, quadratic Bezier and cubic Bezier curves. This would change the coordinates distribution and thus gives an opportunity to simplify and optimize the coordinates encoding.

Concrete Analysis

To get a gist of the typical usage, I've analyzed the Material Design Icons as of Templarian/MaterialDesign-SVG@63c5cb3 (which also includes the Google's icon set). A quick and dirty Python script used is available here.

               0       1       2       3       4       5       6
        --------------------------------------------------------
    M |    22073       2       0       0       0       0       0 |    22075
    L |    16774    3547   10394    3325    1668     154      18 |    35880
    C |    17939    4987    5525     873     221      46       1 |    29592
    Q |        9      66      13      28      19       0       0 |      135
    Z |     5614       0       0       0       0       0       0 |     5614
        ---------------------------------------------------------+---------
 line |    18816    3221    8243    1572     701      55       9 |    32617
        ---------------------------------------------------------+---------
           62409    8602   15932    4226    1908     200      19 |    93296

This table shows the number of operations and (run length - 1) in bits. So 0 corresponds to a run of 1, 1 to a run of 2, 2 to runs of 3--4, 3 to runs of 5--8 and so on. All shapes have been normalized to one of M, L, C, Q and Z; arcs in particular have been converted to cubic Bezier curves with the same algorithm as the current C version of IconVG.
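The column index used in the table can be computed as below. This is a reconstruction of the bucketing rule as described in the paragraph above, not an excerpt from the analysis script.

```python
def run_length_bucket(run):
    """Return the number of bits needed to store (run - 1).

    Bucket 0 is a run of 1, bucket 1 a run of 2, bucket 2 runs of 3-4,
    bucket 3 runs of 5-8, and so on.
    """
    if run < 1:
        raise ValueError("a run contains at least one operation")
    return (run - 1).bit_length()
```

A maximum run length of 16 corresponds to buckets 0 through 4, so only the last two columns of the table would need to be split into shorter runs.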

It seems that there are significant gaps between 2--3 bits and 4--5 bits. This is of course a characteristic of this particular icon set, but it seems that we can safely limit the max run length to 16 with a minimal size increase (<0.3% in this case, theoretically 6.25% max).

The "line" row is a subset of L where only axis-aligned (horizontal or vertical) lines are counted. This was to see if having separate V or H opcodes with a run is desirable; it seems not, given that a vast majority is a run of just one. It however shows that most lines are axis-aligned, suggesting a per-coordinate flag.

              rel=0   rel=2   rel=3   rel=4   rel=5   rel=6  no rel
           --------------------------------------------------------
   abs=0 |      477       2      51      32      14      10     167 |      753
   abs=1 |     2068     164     283     174      90      68     737 |     3584
   abs=2 |    17969     516    3415    1192    1035     933    6386 |    31446
   abs=3 |    26993    1290    6440    5666    4693    1025   13364 |    59471
   abs=4 |    53465    3088   15459   12100    7933       4   29965 |   122014
   abs=5 |    55161    2272   10986    5645    7161    4601   23959 |   109785
   abs=6 |        0       0       0       0       0       1       0 |        1
  no abs |    91903    2418    2467    1134     740      39  174819 |   273520
           ---------------------------------------------------------+---------
             248036    9750   39101   25943   21666    6681  249397 |   600574

This table shows the frequencies of coordinates that can be safely represented in one-byte absolute or relative encoding. I've made tons of assumptions:

  • No further quantization has been performed. If the number is not close enough (±0.01) to the nearest integer it was assumed to not fit in one byte at all.

  • The icon set has a fixed view box of (0,0)--(24,24). This means that absolute integral coordinates almost always fit in 5 bits. To be fair I think this small view box does model the post-quantization distribution. I've also tested with every coordinate multiplied by 2--10, 16 and 100; the entire table is too large to include here though.

  • There is no relative encoding with a single bit, since it will entirely consist of a sign bit. For more bits 2's complement representation was assumed.

  • The relative encoding always refers to the last coordinates. This means that, say, if there are 2 coordinates per operation then the second coordinates are encoded relative to the first coordinates. This is different from SVG but seems reasonable, especially given that we no longer have arcs (whose center points deviate significantly from the path). Just to be sure, I've also tested with SVG-like reference coordinates and found them to be much worse.

  • The initial coordinates used for the relative encoding are (0,0). There might be a better default or even a general way to maximally use available one-byte space, but such improvements cannot harm this baseline encoding.

  • We don't care about the exact encoding (would it be 2 bytes quantized or 4 bytes IEEE 754?) if the number doesn't fit to one byte in any encoding. I do think it should be quantized to 2 bytes in that case.
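The classification behind the abs=N and rel=N buckets can be sketched roughly as follows. This is a reconstruction from the assumptions listed above rather than the original script: the function names are invented, and two details are guesses on my part, namely that the ±0.01 test is applied to the delta for the relative case and that a delta of exactly 0 counts as rel=0.

```python
def nearest_int(v, tol=0.01):
    """Return the nearest integer if v is within tol of it, else None."""
    n = round(v)
    return n if abs(v - n) <= tol else None


def abs_bits(v):
    """Bits needed for an unsigned absolute encoding, or None ("no abs")."""
    n = nearest_int(v)
    if n is None or n < 0:
        return None
    return n.bit_length()


def rel_bits(v, prev):
    """Bits needed for a two's complement relative encoding, or None ("no rel")."""
    n = nearest_int(v - prev)
    if n is None:
        return None
    if n == 0:
        return 0  # ASSUMPTION: a zero delta is the rel=0 column
    # Two's complement width: magnitude bits plus one sign bit.
    width = (n if n >= 0 else ~n).bit_length() + 1
    # There is no 1-bit relative encoding, so the minimum is 2 bits.
    return max(2, width)
```

For example, a coordinate of 24 (the far edge of the view box) needs 5 absolute bits, and a step from 12 to 14 needs 3 relative bits.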

There are two kinds of encodings imaginable with this table.

  1. Use a separate flag, allocate A bits to absolute and B bits to relative. This would use 1+max(A,B) bits; the asymmetric A != B cases can be used to stuff other encodings.
  2. Allocate A bits to absolute, but switch to relative if the encoded number is less than 2^B. This would use A bits; both the absolute-only encoding (B = 0) and the relative-only encoding (B = A) are included.
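
To make the difference between the two kinds concrete, here is a small Python sketch of the bit budget and of decoding the second kind. It is only an illustration: the function names are hypothetical, scaling is ignored, absolute codes are assumed to be unsigned, and B = 0 is special-cased as absolute-only to match the description above.

```python
def bits_used(kind, a_bits, b_bits):
    """Bits needed per coordinate: kind 1 adds a flag bit, kind 2 does not."""
    if kind == 1:
        return 1 + max(a_bits, b_bits)
    return a_bits


def decode_reinterpreted(code, a_bits, b_bits, previous):
    """Decode an A-bit code of the second kind into a coordinate.

    Codes below 2**B are read as a B-bit two's-complement delta from the
    previous coordinate; all other codes are absolute values.
    """
    if not 0 <= code < (1 << a_bits):
        raise ValueError("code does not fit in a_bits")
    if b_bits > 0 and code < (1 << b_bits):
        sign_bit = 1 << (b_bits - 1)
        delta = code - (1 << b_bits) if code & sign_bit else code
        return previous + delta
    return code
```

Note that in the second kind the small absolute values below 2^B are no longer directly representable, which is the price paid for not spending a flag bit.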

In addition to this, it is possible that coordinates are multiplied by some scaling factor, possibly differently for absolute and relative encodings. The shared scaling factor models the optimization performed by authoring tools, so it doesn't need to be a part of the encoding itself. The ratio between absolute and relative encodings would have to be fixed or somehow encoded however.

I've simulated both encodings with all possible parameters and collected the best and the runners-up within 10% of the best (plus a few selected others marked with *). The first number is the number of required multi-byte encodings, so lower is better.

4 BITS:
236518   4 bits 1x absolute
246139   4 bits 1x absolute, 2 bits reinterpreted as 1x relative
253685   4 bits 1x absolute, 3 bits reinterpreted as 1x relative
268101*  1 bit flag, 3 bits 1x absolute or 3 bits 1x relative

5 BITS:
181894   5 bits 1x absolute
193787   5 bits 1x absolute, 2 bits reinterpreted as 1x relative
211320*  1 bit flag, 4 bits 1x absolute or 4 bits 1x relative

6 BITS:
161368   6 bits 2x absolute, 2 bits reinterpreted as 1x relative
161855   6 bits 2x absolute
169863   6 bits 2x absolute, 3 bits reinterpreted as 1x relative
174859   1 bit flag, 5 bits 1x absolute or 5 bits 1x relative
175599   1 bit flag, 5 bits 1x absolute or 4 bits 1x relative
176733   1 bit flag, 5 bits 1x absolute or 3 bits 1x relative
181893*  6 bits 1x absolute

7 BITS:
156161   7 bits 4x absolute, 3 bits reinterpreted as 1x relative
156467   7 bits 4x absolute, 2 bits reinterpreted as 1x relative
157162   1 bit flag, 6 bits 2x absolute or 6 bits 1x relative
157174   1 bit flag, 6 bits 2x absolute or 5 bits 1x relative
157506   1 bit flag, 6 bits 2x absolute or 4 bits 1x relative
158129   1 bit flag, 6 bits 2x absolute or 3 bits 1x relative
158312   7 bits 4x absolute
159624   1 bit flag, 6 bits 2x absolute or 2 bits 1x relative
161367   7 bits 2x absolute, 2 bits reinterpreted as 1x relative
161579   1 bit flag, 6 bits 2x absolute or 0 bits 1x relative
161854   7 bits 2x absolute
164619   7 bits 4x absolute, 4 bits reinterpreted as 1x relative
169862   7 bits 2x absolute, 3 bits reinterpreted as 1x relative
174819*  1 bit flag, 5 bits 1x absolute or 6 bits 1x relative
181893*  7 bits 1x absolute

8 BITS:
129256   8 bits 10x absolute, 3 bits reinterpreted as 1x relative
129903   8 bits 10x absolute, 4 bits reinterpreted as 1x relative
130264   8 bits 10x absolute, 2 bits reinterpreted as 1x relative
131482   8 bits 10x absolute
138738   8 bits 10x absolute, 5 bits reinterpreted as 1x relative
152645*  1 bit flag, 7 bits 4x absolute or 7 bits 100x relative
174819*  1 bit flag, 5 bits 1x absolute or 7 bits 1x relative
181893*  8 bits 1x absolute

At least for this particular icon set, both encodings seem to perform relatively well, especially with scaling. The first kind of encoding does leave additional opcode space though, so it is preferred. It also seems that 2x and 10x scaling factors are common, where the latter is perhaps an artifact of the decimal encoding of SVG.

It should also be noted that there are disproportionately many coordinates that are exactly equal to the previous one (i.e. rel=0). This is, of course, the original rationale for the H/V opcodes. Any efficient encoding should take care of that.

Concrete Proposal

There would be five more sets of opcodes in the combined styling-drawing mode:

  • 16 L opcodes, corresponding to 1--16 line segments.
  • 16 C opcodes, corresponding to 1--16 cubic Bezier segments (thus 3x coordinates).
  • 16 Q opcodes, corresponding to 1--16 quadratic Bezier segments (thus 2x coordinates).
  • one Z opcode, closing the current path. If there is no current path this is a nop.
  • one ZM opcode, closing the current path and moving to the following coordinates.
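
As a quick sanity check on the opcode budget, the list above can be tabulated with a short Python sketch. The names are hypothetical and no concrete opcode values are implied; only the counts come from the list.

```python
# Coordinate pairs consumed by one segment of each drawing opcode family.
PAIRS_PER_SEGMENT = {"L": 1, "Q": 2, "C": 3}


def coordinate_pairs(kind, segments=1):
    """Number of (x, y) pairs that follow an opcode of the given kind."""
    if kind == "Z":
        return 0  # closes the path, no operands
    if kind == "ZM":
        return 1  # closes the path, then moves to one coordinate pair
    if not 1 <= segments <= 16:
        raise ValueError("each family covers 1 to 16 segments")
    return PAIRS_PER_SEGMENT[kind] * segments


# 16 opcodes each for L, C and Q, plus Z and ZM.
OPCODE_COUNT = 16 * len(PAIRS_PER_SEGMENT) + 2
```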

Alternatively, if the opcode space is sufficient the following opcodes might be possible as well:

  • 16 L, C and Q opcodes as above.
  • 16 ML, MC and MQ opcodes. They have an additional pair of coordinates to move before other coordinates.
  • one Z opcode.

In any case coordinates always come in pairs, in the order of x then y. Each coordinate is encoded as follows.

<- LSB     MSB ->
+-+-+-----------+
|1|Z|   number  | one-byte absolute
+-+-+-----------+

+-+-+-+---------+
|0|1|Z|  delta  | one-byte relative
+-+-+-+---------+

+-+-+-+-+-------+---------------+
|0|0|1|Z|         number        | two-byte absolute
+-+-+-+-+-------+---------------+

+-+-+-+---------+---------------+-------------+-+-------------+-+
|0|0|0|                mantissa               |    exponent   |S| four-byte absolute
+-+-+-+---------+---------------+-------------+-+-------------+-+

One-byte absolute encodes an absolute integer from -32 to 31. The actual number encoded is (integer + 32).

One-byte relative encodes an integral difference from -16 to 15. The actual number encoded is (difference + 16). The reference coordinate for x is always the previously decoded x, even if it came from a previous operation or from an earlier pair in an operation with multiple coordinate pairs (unlike SVG paths); the same goes for y. The reference coordinates for the first coordinates are (0, 0).
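
The two one-byte forms can be sketched in Python as below. This is a rough illustration that reads the diagram with the leftmost bit as the least significant bit; the function names are hypothetical and any scaling factor is ignored.

```python
def encode_absolute(value, z=0):
    """One-byte absolute: bit 0 is 1, bit 1 is Z, bits 2..7 hold value + 32."""
    if not -32 <= value <= 31:
        raise ValueError("value out of one-byte absolute range")
    return ((value + 32) << 2) | (z << 1) | 0b1


def encode_relative(delta, z=0):
    """One-byte relative: bits 0..1 are 0 then 1, bit 2 is Z, bits 3..7 hold delta + 16."""
    if not -16 <= delta <= 15:
        raise ValueError("delta out of one-byte relative range")
    return ((delta + 16) << 3) | (z << 2) | 0b10


def decode_one_byte(byte, previous):
    """Return (value, z), or None if the byte starts a longer encoding."""
    if byte & 0b1:
        return (byte >> 2) - 32, (byte >> 1) & 1
    if byte & 0b10:
        return previous + (byte >> 3) - 16, (byte >> 2) & 1
    return None
```

Here `previous` is the last decoded value of the same component (x or y), starting from 0.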

Note: This particular encoding corresponds to "1 bit flag, 6 bits 2x absolute or 4 bits 1x relative" above. (Due to the scaling difference, the specified encoding of "1 bit flag, 6 bits 2x absolute or 5 bits 2x relative" is no worse than it.)

Note: I've also experimented with an opportunistic absolute encoding, where if the absolute range and the relative range completely overlap, the relative range is reinterpreted as a distinct absolute range. This sadly had no effect for this particular icon set. Moreover, reference coordinates have to be exact integers for this to work, and it is very hard to ensure that this is repeatable. It is possible, but the resulting complexity didn't justify itself.

Two-byte absolute encodes an absolute number from -64 to (64 - 1/64) in little endian. The actual number encoded is (number + 64) * 64.

Four-byte absolute encodes an IEEE 754 binary32 number in little endian. The encoded number has only 20 bits of coded mantissa (compared to 23 in the original); the lowest 3 bits are fixed to zero, allowing a direct conversion from the wire format to the number (#33). I personally don't want this at all, but the format's flexible view box demands this unfortunate addition.
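
The "direct conversion" property can be illustrated with a short Python sketch. The function names are hypothetical, and truncating (rather than rounding) the three lowest mantissa bits is an assumption made here for simplicity.

```python
import struct


def encode_four_byte(value):
    """Pack value as little-endian binary32 with the three lowest bits cleared."""
    (word,) = struct.unpack("<I", struct.pack("<f", value))
    return struct.pack("<I", word & ~0b111)


def decode_four_byte(data):
    """Read the four bytes directly as binary32; the 000 tag is part of the number."""
    (word,) = struct.unpack("<I", data)
    if word & 0b111:
        raise ValueError("not a four-byte absolute encoding")
    return struct.unpack("<f", data)[0]
```

Because the tag bits are all zero, the decoder needs no masking or shifting before reinterpreting the bytes as a float.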

A flag bit Z is in the lowest position of the decoded integer and, if set, indicates that the next coordinates use a compressed relative zero and only one of the two coordinates is encoded. Z is assumed to be unset in the four-byte absolute encoding. (As mentioned above, I expect the four-byte encoding to be very uncommon anyway.)

In the initial state, Z in the x coordinate (hereafter ZX) and Z in the y coordinate (hereafter ZY) are interpreted as follows:

  • ZX=0, ZY=0: Both x and y are encoded for next coordinates.
  • ZX=1, ZY=0: Only y is encoded for next coordinates. The x coordinate is assumed to be the same as the previous (reference) x coordinate.
  • ZX=0, ZY=1: Only x is encoded for next coordinates. The y coordinate is assumed to be the same as the previous (reference) y coordinate.
  • ZX=1, ZY=1: Reserved. I currently don't have any good idea for this case.

In the state where only a single coordinate is encoded, Z is interpreted as follows:

  • Z=0: Both x and y are encoded for next coordinates.
  • Z=1: The other coordinate is encoded for next coordinates. In other words, if x was encoded this time, Z=1 indicates that only y would be encoded next time, and vice versa. It is highly unlikely that the same component is repeatedly zero (only possible with some Bezier curves), so this encoding optimizes for the much more common case.
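
The two lists above amount to a small state machine, sketched here in Python. The state names ("xy", "x", "y", naming which components are encoded for the current coordinates) and the function name are hypothetical.

```python
def next_state(state, z_flags):
    """Return which components are encoded for the next coordinates.

    state is "xy" (both encoded), "x" (only x) or "y" (only y);
    z_flags holds the Z bits read from the components encoded this time.
    """
    if state == "xy":
        zx, zy = z_flags
        if zx and zy:
            raise ValueError("ZX=1, ZY=1 is reserved")
        if zx:
            return "y"  # x repeats the reference, only y is encoded next
        if zy:
            return "x"  # y repeats the reference, only x is encoded next
        return "xy"
    (z,) = z_flags
    if not z:
        return "xy"
    # Z=1 in a single-coordinate state switches to the other component.
    return "y" if state == "x" else "x"
```

For example, a staircase of alternating horizontal and vertical lines would stay in the single-coordinate states, alternating between "x" and "y" with Z=1 each time.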

It is invalid for the very last coordinates in a run to have any Z bit set. (This could be changed to affect the next run as an optimization, but I'm not sure it's worth it.)

It would be equally valid to define this in the other direction, with Z=1 indicating that the previous coordinates had a zero. This however was more difficult to specify and performed slightly worse than the current proposal (70.5% vs. 70.0% of the original).

Make binary32 numbers interpretable in place

Currently all number encodings except for natural numbers use IEEE 754 binary32 for the longest (4-byte) case. Since there are 30 bits of data, the lowest two bits of the mantissa are assumed to be zeros, but the actual encoding has ones there, so the value can't be read as is. This is particularly annoying when the implementation language doesn't natively support the binary32-uint32 reinterpretation. Since the choice of those two bits otherwise doesn't matter, I propose to flip them.
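
A minimal Python sketch of the difference, assuming the current layout keeps the two lowest bits as ones as described above (the function names are hypothetical):

```python
import struct


def bits_to_float(word):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


def decode_current(word):
    """Current layout: the two low bits are ones and must be cleared first."""
    return bits_to_float(word & ~0b11)


def decode_proposed(word):
    """Proposed layout: the two low bits are already zero, so read in place."""
    return bits_to_float(word)
```

With the proposed layout the masking step disappears, so the four bytes can be handed to whatever float-reading primitive the language provides.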
