
Comments (8)

pjanx commented on May 3, 2024

Complete means tEXt, zTXt and iTXt. The former two use Latin-1, the last one UTF-8. There's also a minor complication in that these chunks can appear at both ends of the file; the end user probably doesn't want to keep that distinction.

libpng simply accumulates them (though you need to read the start and end chunks both explicitly, unless using the constrained API) and then you can read an array out of the info structure. spng is rather similar.

What I'm doing is requesting values for particular keys, and I'm not interested in “translated keywords”. There can be multiple values for a given key, and the simplest viable high-level API is func(keyword string)[]string. ISO 8859-1 is trivially converted to UTF-8, so that defines the encoding. A low-level API would call back with:

struct {
    keyword, text string
    languageCode, translatedKeyword *string
}
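As a rough illustration of the two API levels described above, here is a minimal C sketch. It assumes the text chunks have already been decoded into an array of records. The names text_entry and find_values are hypothetical and belong neither to Wuffs, libpng nor spng. The optional fields are NULL for tEXt and zTXt entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// One decoded text chunk. language_code and translated_keyword are only
// meaningful for iTXt and are NULL otherwise. (Hypothetical type.)
typedef struct {
  const char *keyword;
  const char *text;
  const char *language_code;
  const char *translated_keyword;
} text_entry;

// Stores pointers to the values for keyword in out, up to out_cap of them,
// and returns the total number of matches (which may exceed out_cap).
static size_t find_values(const text_entry *entries, size_t n,
                          const char *keyword,
                          const char **out, size_t out_cap) {
  assert(keyword != NULL);
  size_t found = 0;
  for (size_t i = 0; i < n; i++) {
    if (strcmp(entries[i].keyword, keyword) == 0) {
      if (found < out_cap) {
        out[found] = entries[i].text;
      }
      found++;
    }
  }
  return found;
}
```

Returning the total match count lets the caller detect that out was too small and retry with a larger array, since a keyword may occur more than once.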

Have a look at ~/.cache/thumbnails in a GNOME/KDE system. Specification.

from wuffs.

nigeltao commented on May 3, 2024

Thanks for the comment. Just a quick reply to the idea of string, *string and []string in the APIs...

Wuffs' higher-level auxiliary C++ API can do this, but one memory-safety constraint on the lower-level C API is that it cannot allocate memory. Specifically, the callee cannot create or hold arbitrarily long strings. C also just doesn't have a good string type. I'm not sure which level you had in mind but, if your program is C and not C++, then you're restricted to the C API.

In terms of API design, I'll also copy/paste from what I just wrote at #39 (comment)

Wuffs' image, image metadata and color correction APIs span many file formats... so there's some abstraction that might look weird at first glance.

In particular, languageCode and translatedKeyword may be part of PNG iTXt chunks, but IIRC they're not part of GIF, JPEG, etc. comments.


pjanx commented on May 3, 2024

The content encoding can be simply tagged, that would also be viable. I haven't had the necessity to learn about GIF or JPEG.

zTXt is compressed, not sure how you want to handle the decompression there, then.


nigeltao commented on May 3, 2024

iCCP payload is also zlib-compressed and we already handle that.


pjanx commented on May 3, 2024

Where/by whom is the decompression buffer allocated?


nigeltao commented on May 3, 2024

Throughout Wuffs' C API, it's always the caller (not the callee) who allocates variable-length buffers. Pass that caller-allocated buffer as the wuffs_base__io_buffer* a_dst argument to wuffs_base__image_decoder__tell_me_more or wuffs_png__decoder__tell_me_more.

If the buffer is too short, that call will return wuffs_base__suspension__short_write and it's up to the caller to re-allocate and call again, or otherwise abort (because it cannot or will not allocate more memory).
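The caller-allocates pattern with a "short write" retry can be sketched in plain C as follows. This is not the Wuffs API: producer_t, produce and read_all are hypothetical stand-ins for the decoder, its tell_me_more-style call and the caller's loop. The doubling strategy and the max_len cap are likewise assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { STATUS_OK, STATUS_SHORT_WRITE } status_t;

// Hypothetical callee state: a payload and how much of it was delivered.
typedef struct {
  const unsigned char *src;
  size_t src_len;
  size_t pos;
} producer_t;

// The callee never allocates. It appends as much as fits after *dst_len
// and reports STATUS_SHORT_WRITE if payload bytes remain.
static status_t produce(producer_t *p, unsigned char *dst, size_t dst_cap,
                        size_t *dst_len) {
  assert(*dst_len <= dst_cap);
  size_t room = dst_cap - *dst_len;
  size_t left = p->src_len - p->pos;
  size_t n = (left < room) ? left : room;
  memcpy(dst + *dst_len, p->src + p->pos, n);
  *dst_len += n;
  p->pos += n;
  return (p->pos == p->src_len) ? STATUS_OK : STATUS_SHORT_WRITE;
}

// The caller owns the buffer: it grows on short writes and gives up
// (returning NULL) once max_len bytes are not enough.
static unsigned char *read_all(producer_t *p, size_t max_len,
                               size_t *out_len) {
  size_t cap = (max_len < 16) ? max_len : 16;
  size_t len = 0;
  if (cap == 0) return NULL;
  unsigned char *buf = malloc(cap);
  if (!buf) return NULL;
  for (;;) {
    if (produce(p, buf, cap, &len) == STATUS_OK) {
      *out_len = len;
      return buf;  // The caller must free() this.
    }
    if (cap >= max_len) {  // Refuse to allocate more memory.
      free(buf);
      return NULL;
    }
    size_t new_cap = (cap > max_len / 2) ? max_len : cap * 2;
    unsigned char *grown = realloc(buf, new_cap);
    if (!grown) {
      free(buf);
      return NULL;
    }
    buf = grown;
    cap = new_cap;
  }
}
```

The cap matters for untrusted input: a compressed chunk can expand to far more than its size on disk, so the caller decides how much memory it is willing to spend.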

The C++ API (e.g. used by example/imageviewer) manages the memory for you, so that the C++ callback (e.g. imageviewer's MyDecodeImageCallbacks::HandleMetadata) gets a contiguous (ptr, len) pair: the raw arg. If you want to dig into the C++ code, start with this line:

sync_io::DynIOBuffer raw(max_incl_metadata_length);


pjanx commented on May 3, 2024

That sounds like ISO Latin 1 could still be just another decoder. Sadly, with zlib compression, it might be two levels of transformations to do.

Code for the encoding conversion is a trivial exercise, having it done automatically would just make the API less awkward to use.
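For reference, a minimal sketch of that conversion in C, written in the same caller-allocates style. The function names are hypothetical. Every Latin-1 byte maps to the Unicode code point of the same value, so bytes below 0x80 are copied and the rest become two-byte UTF-8 sequences.

```c
#include <assert.h>
#include <stddef.h>

// Number of UTF-8 bytes needed to encode n Latin-1 bytes.
static size_t latin1_utf8_len(const unsigned char *src, size_t n) {
  size_t len = n;
  for (size_t i = 0; i < n; i++) {
    if (src[i] >= 0x80) len++;  // Non-ASCII bytes take two UTF-8 bytes.
  }
  return len;
}

// Converts src to UTF-8 in dst. Always sets *needed to the required size.
// Returns 1 on success, or 0 (writing nothing) if dst_cap is too small.
static int latin1_to_utf8(const unsigned char *src, size_t n,
                          unsigned char *dst, size_t dst_cap,
                          size_t *needed) {
  assert(needed != NULL);
  *needed = latin1_utf8_len(src, n);
  if (*needed > dst_cap) return 0;
  size_t j = 0;
  for (size_t i = 0; i < n; i++) {
    unsigned char c = src[i];
    if (c < 0x80) {
      dst[j++] = c;
    } else {
      dst[j++] = (unsigned char)(0xC0 | (c >> 6));
      dst[j++] = (unsigned char)(0x80 | (c & 0x3F));
    }
  }
  return 1;
}
```

The output is not NUL-terminated; a caller that wants a C string needs to reserve one extra byte and append the terminator itself.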

Speaking of awkward, it looks like both libpng and spng make you assume the encoding based on the ‘type’ of the chunk, which is exposed in their structures. Luckily for me, everything above ASCII happens to be percent-encoded in thumbnails…


nigeltao commented on May 3, 2024

Wuffs' PNG decoder should now be able to give you text chunks (whether iTXt, tEXt or zTXt). If you're using the C++ API, the following patch to example/imageviewer shows how to get the data:

diff --git a/example/imageviewer/imageviewer.cc b/example/imageviewer/imageviewer.cc
index 96938453..b0327414 100644
--- a/example/imageviewer/imageviewer.cc
+++ b/example/imageviewer/imageviewer.cc
@@ -175,6 +175,23 @@ class MyDecodeImageCallbacks : public wuffs_aux::DecodeImageCallbacks {
               1e5 / (g_flags.screen_gamma * minfo.metadata_parsed__gama());
           break;
       }
+    } else {
+      const char* name = nullptr;
+      switch (minfo.metadata__fourcc()) {
+        case WUFFS_BASE__FOURCC__KVPK:
+          name = "Key";
+          break;
+        case WUFFS_BASE__FOURCC__KVPV:
+          name = "Val";
+          break;
+      }
+      static char buf[4096];
+      if (name && (raw.len < 4096)) {
+        // Convert raw (a wuffs_base__slice_u8) to a NUL-terminated C string.
+        memcpy(buf, raw.ptr, raw.len);
+        buf[raw.len] = 0x00;
+        printf("    %s %s\n", name, buf);
+      }
     }
     return wuffs_aux::DecodeImageCallbacks::HandleMetadata(minfo, raw);
   }
@@ -244,6 +261,7 @@ load_image(const char* filename) {
   uint64_t dia_flags = 0;
   if (g_flags.screen_gamma > 0) {
     dia_flags |= wuffs_aux::DecodeImageArgFlags::REPORT_METADATA_GAMA;
+    dia_flags |= wuffs_aux::DecodeImageArgFlags::REPORT_METADATA_KVP;
   }
 
   MyDecodeImageCallbacks callbacks;

For example, running ./build-example.sh example/imageviewer && gen/bin/example-imageviewer ~/.cache/thumbnails/large/etc.png prints:

    Key Thumb::URI
    Val file:///usr/share/images/desktop-base/desktop-grub.png
    Key Thumb::MTime
    Val 1559986823

If you're using the C API, test_wuffs_png_decode_metadata_kvp has some code you can study (link below). To keep the test simple, it assumes that the entire PNG input can fit into memory at once, as will each key/value pair within. If you're streaming, then you'll need to take care of Wuffs' usual "short read/write" suspensions.

test_wuffs_png_decode_metadata_kvp() {


The callee, not the caller, translates from Latin-1 to UTF-8 when necessary.

