go-xmp's Introduction

go-xmp

go-xmp is a native Go SDK for the Extensible Metadata Platform (XMP) as defined by the Adobe XMP Specification Part 1, Part 2 and Part 3, a.k.a. ISO 16684-1:2011(E).

Features

Included metadata models

  • XMP DublinCore (dc)
  • XMP Media Management (xmpMM)
  • XMP Dynamic Media (xmpDM)
  • XMP Rights (xmpRights)
  • XMP Jobs (xmpBJ)
  • XMP Paged Text (xmpTPg)
  • EXIF v2.3.1 (exif, exifEX)
  • Adobe Camera Raw (crs)
  • Creative Commons (cc)
  • DJI Drones (dji)
  • ID3 v2.2, v2.3, v2.4 (id3)
  • iXML audio recorder (ixml)
  • iTunes/MP4 (itunes)
  • ISO/MP4 (mp4)
  • Quicktime (qt)
  • PhotoMechanic (pm)
  • Tiff (tiff)
  • Riff (riff)
  • Photoshop (ps)
  • PDF (pdf)

Metadata models available under commercial license

  • ACES Image Metadata
  • AEScart
  • ARRI Camera Metadata
  • ASC CDL
  • EBU Broadcast WAV
  • Getty Images
  • IPTC Core 1.2, IPTC Extension 1.3, IPTC Video Metadata 1.0
  • Plus Licensing Metadata
  • SMPTE DPX Image Metadata
  • SMPTE MXF Metadata
  • OpenEXR Image Header Metadata
  • XMP Media Production SDK (Universal Metadata Container)

Documentation

Installation

Install go-xmp using the "go get" command:

go get github.com/trimmer-io/go-xmp

The Go distribution is go-xmp's only dependency.

Examples

Benchmarks

go test ./test/ -bench=. -benchmem

goos: darwin
goarch: amd64
pkg: trimmer.io/go-xmp/test
BenchmarkUnmarshalXMP_5kB-8       5000      321524 ns/op     58071 B/op     1056 allocs/op
BenchmarkMarshalXMP_5kB-8         5000      270981 ns/op     61384 B/op      758 allocs/op
BenchmarkMarshalJSON_5kB-8        5000      338354 ns/op     91855 B/op     1023 allocs/op
BenchmarkUnmarshalJSON_5kB-8      5000      382196 ns/op     60387 B/op     1022 allocs/op
BenchmarkUnmarshalXMP_85kB-8       300     5152080 ns/op    902794 B/op    17779 allocs/op
BenchmarkMarshalXMP_85kB-8         300     4292143 ns/op    966356 B/op    12209 allocs/op
BenchmarkMarshalJSON_85kB-8        300     5378268 ns/op   1453004 B/op    16535 allocs/op
BenchmarkUnmarshalJSON_85kB-8      200     5512114 ns/op    880161 B/op    14497 allocs/op

XMP Marshal Benchmark using premiere-cc.xmp, a rather large xmpDM file with history, xmpMM:Pantry etc.

  Compression Results 417        mean               min                  max
  -----------------------------------------------------------------------------
        Original sizes       4013 (100.0)        918 (100.0)      86940 (100.0)
             XMP sizes       3545 ( 90.4)        723 ( 52.3)      78325 (147.0)
        XMP Gzip sizes       1177 ( 32.0)        369 (  8.4)       8086 ( 61.3)
      XMP Snappy sizes       1195 ( 32.5)        387 (  8.4)       8104 ( 62.9)
            JSON sizes       2147 ( 52.8)        389 ( 31.6)      65127 ( 91.7)
       JSON Gzip sizes        889 ( 23.7)        209 (  6.9)       7714 ( 50.5)
     JSON Snappy sizes        907 ( 24.3)        227 (  7.0)       7732 ( 50.6)
  -----------------------------------------------------------------------------
        XML->XMP times          371.674µs           74.091µs         4.816883ms
       XMP->JSON times          234.785µs            36.91µs         4.517591ms
        XMP->XML times          259.084µs           20.004µs         4.505254ms
        XMP Gzip times          214.342µs          105.685µs         1.036886ms
      XMP Gunzip times           59.948µs           19.924µs          285.113µs
      XMP Snappy times           28.325µs            7.975µs          234.161µs
    XMP Unsnappy times           25.841µs            7.899µs          195.265µs
       JSON Gzip times          197.268µs            93.86µs          968.985µs
     JSON Gunzip times           53.622µs           17.655µs         2.913516ms
     JSON Snappy times           23.581µs            7.864µs          221.674µs
   JSON Unsnappy times           37.215µs            7.856µs          398.361µs

Size matters when storing XMP in a database or sending documents over a network. The table above compares the common compression methods gzip and snappy by runtime and output size for the documents in the samples/ directory. It also compares the uncompressed documents in XMP/XML and XMP/JSON format. "Original" means the initial XMP document as stored in .xmp sidecar files. To be fair, some originals use padding, so their mean size is larger than what go-xmp generated here, because padding was turned off during write.
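As a rough illustration of how such a size comparison can be made, here is a minimal sketch that measures the gzip-compressed size of a byte slice using only the standard library. The helper name gzipSize and the sample input are assumptions made for this example; this is not the benchmark code used to produce the table, and snappy is left out because it is not part of the standard library.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// gzipSize returns the number of bytes data occupies after gzip
// compression at the default compression level.
// (Hypothetical helper, not part of go-xmp.)
func gzipSize(data []byte) (int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	// Close flushes pending data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	// Made-up stand-in for a serialized XMP packet.
	doc := []byte(strings.Repeat("<rdf:li>Mark.Brown</rdf:li>", 100))
	n, err := gzipSize(doc)
	if err != nil {
		panic(err)
	}
	fmt.Printf("original=%d gzip=%d (%.1f%%)\n",
		len(doc), n, 100*float64(n)/float64(len(doc)))
}
```

Real XMP packets are far less repetitive than this sample, so the ratio printed here says nothing about the figures in the table.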

Contributing

See CONTRIBUTING.md.

License

go-xmp is available under the Apache License, Version 2.0.

go-xmp's People

Contributors

echa, martinhoefling

go-xmp's Issues

XMP Sequences are not parsed

I am extracting XMP data from PDF files that includes the dc:creator field, which looks like this:

<dc:creator>
    <rdf:Seq>
        <rdf:li>Mark.Brown</rdf:li>
    </rdf:Seq>
</dc:creator>

When I create a document and decode it, the creator field isn't present because go-xmp doesn't seem to know about sequences. Neither Dump() nor ListPaths() shows the creator field.
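For reference, the structure in question can be read with the standard library alone. The following is a minimal sketch, assuming the dc and rdf prefixes are bound to their usual namespace URIs; the type seqProperty and the function parseSeq are hypothetical names for this example and do not reflect how go-xmp decodes arrays internally.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// seqProperty mirrors an XMP ordered array: a property element that
// wraps rdf:Seq with one rdf:li per item. The struct tag carries no
// namespace, so encoding/xml matches on local names only.
type seqProperty struct {
	Items []string `xml:"Seq>li"`
}

// parseSeq extracts the rdf:li values from a single array property.
func parseSeq(data []byte) ([]string, error) {
	var p seqProperty
	if err := xml.Unmarshal(data, &p); err != nil {
		return nil, err
	}
	return p.Items, nil
}

func main() {
	packet := []byte(`<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Seq>
        <rdf:li>Mark.Brown</rdf:li>
    </rdf:Seq>
</dc:creator>`)
	items, err := parseSeq(packet)
	if err != nil {
		panic(err)
	}
	fmt.Println(items) // [Mark.Brown]
}
```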

Digikam xmp model

Are you interested in a digikam model? If so, I can prepare a pull request.

doesn't correctly handle dates without time zones

According to the XMP Specification, Part I, section 8.2.1.2, "the time zone designator need not be present in XMP. When not present, the time zone is unknown, and an XMP processor should not assume anything about the missing time zone."

The go-xmp library parses time fields (e.g. xmp:CreateDate) using the Go time.Parse function, which assumes UTC if it doesn't see a time zone. More fundamentally, the library stores time fields in a Go time.Time structure, which has no way to represent a time without a time zone. There's no way for a library user to tell that the time zone is unknown. As a result, a photo viewer could be showing a "local" time that is many hours off from the correct one.

Since XMP times cannot be faithfully represented in a Go time.Time, it would be best to leave them in string form, and provide the conversion routines as separate functions.
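One possible shape for such a conversion routine is sketched below. It keeps a flag next to the parsed value so a caller can tell whether the source string carried a time zone. The type XMPDate, the function ParseXMPDate, and the layout lists are assumptions made for this example, not part of go-xmp, and the layouts may not cover every date form the specification allows.

```go
package main

import (
	"fmt"
	"time"
)

// XMPDate pairs a parsed time with a flag recording whether the
// source string carried a time zone designator. (Hypothetical type.)
type XMPDate struct {
	Time    time.Time
	HasZone bool
}

// Layouts with a zone are tried first, then the same shapes without one.
var zonedLayouts = []string{
	"2006-01-02T15:04:05Z07:00",
	"2006-01-02T15:04Z07:00",
}

var unzonedLayouts = []string{
	"2006-01-02T15:04:05",
	"2006-01-02T15:04",
	"2006-01-02",
	"2006-01",
	"2006",
}

// ParseXMPDate parses s and reports whether a zone was present.
// When HasZone is false, Time is expressed in UTC only as a
// placeholder; the real zone is unknown.
func ParseXMPDate(s string) (XMPDate, error) {
	for _, layout := range zonedLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return XMPDate{Time: t, HasZone: true}, nil
		}
	}
	for _, layout := range unzonedLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return XMPDate{Time: t, HasZone: false}, nil
		}
	}
	return XMPDate{}, fmt.Errorf("unrecognized XMP date %q", s)
}

func main() {
	for _, s := range []string{"2018-06-05T18:24:03+01:00", "2018-06-05T18:24:03"} {
		d, err := ParseXMPDate(s)
		if err != nil {
			panic(err)
		}
		fmt.Println(s, "has zone:", d.HasZone)
	}
}
```

Keeping the original string alongside the parsed value would be another option when the exact source text has to survive a round trip.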

Certificate has expired

When I run go get trimmer.io/go-xmp I get the following error:

go: trimmer.io/[email protected]: unrecognized import path "trimmer.io/go-xmp": https fetch: Get "https://trimmer.io/go-xmp?go-get=1": x509: certificate has expired or is not yet valid: current time 2021-01-11T13:05:37+01:00 is after 2021-01-08T09:52:26Z

Your certificate expired 3 days ago...

Data race

These statistic variables at the package scope are not atomic:

npAllocs, npFrees, npHits, npReturns int64

So I'm getting data races when processing multiple separate files concurrently, each in their own goroutine:

WARNING: DATA RACE
Read at 0x000008332ea0 by goroutine 1243:
  github.com/trimmer-io/go-xmp/xmp.NewNode()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/node.go:83 +0xcb
  github.com/trimmer-io/go-xmp/xmp.copyNode()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/node.go:121 +0xa8
  github.com/trimmer-io/go-xmp/xmp.copyNodes()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/node.go:133 +0x169
  github.com/trimmer-io/go-xmp/xmp.copyNode()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/node.go:126 +0x42c
  github.com/trimmer-io/go-xmp/xmp.(*Decoder).decodeNode()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/unmarshal.go:175 +0x693
  github.com/trimmer-io/go-xmp/xmp.(*Decoder).Decode()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/unmarshal.go:133 +0x1a84
  github.com/trimmer-io/go-xmp/xmp.Unmarshal()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/unmarshal.go:60 +0xce
...

Previous write at 0x000008332ea0 by goroutine 2555:
  github.com/trimmer-io/go-xmp/xmp.NewNode()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/node.go:83 +0xe8
  github.com/trimmer-io/go-xmp/xmp.(*Node).UnmarshalXML()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/node.go:379 +0x3f4
  github.com/trimmer-io/go-xmp/xmp.(*Node).UnmarshalXML()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/node.go:380 +0x4a9
  encoding/xml.(*Decoder).unmarshalInterface()
      /usr/local/go/src/encoding/xml/read.go:211 +0x1ac
  encoding/xml.(*Decoder).unmarshal()
      /usr/local/go/src/encoding/xml/read.go:363 +0xc6d
  encoding/xml.(*Decoder).DecodeElement()
      /usr/local/go/src/encoding/xml/read.go:156 +0x2a4
  encoding/xml.(*Decoder).Decode()
      /usr/local/go/src/encoding/xml/read.go:140 +0x84
  github.com/trimmer-io/go-xmp/xmp.(*Decoder).Decode()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/unmarshal.go:81 +0x204
  github.com/trimmer-io/go-xmp/xmp.Unmarshal()
      /home/matt/go/pkg/mod/github.com/trimmer-io/[email protected]/xmp/unmarshal.go:60 +0xce

Fortunately, these are int64, so the easy solution is to use the sync/atomic package. For example, to add 1 to a variable, you simply call atomic.AddInt64(&npAllocs, 1).
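A minimal sketch of that approach follows. Only the counter name npAllocs comes from the report above; the helpers recordAlloc, allocCount and countConcurrently are hypothetical and exist just to make the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// npAllocs is a package-level statistics counter, named after the one
// quoted in the report. Every access goes through sync/atomic.
var npAllocs int64

// recordAlloc increments the counter without a data race.
func recordAlloc() { atomic.AddInt64(&npAllocs, 1) }

// allocCount reads the counter atomically.
func allocCount() int64 { return atomic.LoadInt64(&npAllocs) }

// countConcurrently calls recordAlloc perWorker times from each of
// workers goroutines and waits for all of them to finish.
func countConcurrently(workers, perWorker int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				recordAlloc()
			}
		}()
	}
	wg.Wait()
}

func main() {
	countConcurrently(8, 1000)
	fmt.Println(allocCount()) // 8000
}
```

Running this with go run -race should report no races, since reads and writes of the counter are all atomic.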

How can I write to a string array tag?

I am trying to create a model for digikam.

This is what my example looks like. It creates an empty document and writes it as XMP to stdout:

package main

import (
	"fmt"
	"os"
	"trimmer.io/go-xmp/xmp"
)

var (
	NsDk = xmp.NewNamespace("digikam", "http://www.digikam.org/ns/1.0/", NewModel)
)

func init() {
	xmp.Register(NsDk, xmp.XmpMetadata)
}

func NewModel(name string) xmp.Model {
	return &Digikam{}
}

func MakeModel(d *xmp.Document) (*Digikam, error) {
	m, err := d.MakeModel(NsDk)
	if err != nil {
		return nil, err
	}
	x, _ := m.(*Digikam)
	return x, nil
}

func FindModel(d *xmp.Document) *Digikam {
	if m := d.FindModel(NsDk); m != nil {
		return m.(*Digikam)
	}
	return nil
}

type Digikam struct {
	TagsList xmp.StringArray `xmp:"digiKam:TagsList"`
}

func (x Digikam) Can(nsName string) bool {
	return NsDk.GetName() == nsName
}

func (x Digikam) Namespaces() xmp.NamespaceList {
	return xmp.NamespaceList{NsDk}
}

func (x *Digikam) SyncModel(d *xmp.Document) error {
	return nil
}

func (x *Digikam) SyncFromXMP(d *xmp.Document) error {
	return nil
}

func (x Digikam) SyncToXMP(d *xmp.Document) error {
	return nil
}

func (x *Digikam) CanTag(tag string) bool {
	_, err := xmp.GetNativeField(x, tag)
	return err == nil
}

func (x *Digikam) GetTag(tag string) (string, error) {
	if v, err := xmp.GetNativeField(x, tag); err != nil {
		return "", fmt.Errorf("%s: %v", NsDk.GetName(), err)
	} else {
		return v, nil
	}
}

func (x *Digikam) SetTag(tag, value string) error {
	if err := xmp.SetNativeField(x, tag, value); err != nil {
		return fmt.Errorf("%s: %v", NsDk.GetName(), err)
	}
	return nil
}


func main() {
	d := xmp.NewDocument()
	m := NewModel("somestring")
	if _, err := d.AddModel(m); err != nil {
		panic(err)
	}
	if err := m.SetTag("TagsList", "holla\n"); err != nil {
		panic(err)
	}
	buf, _ := xmp.Marshal(d)
	os.Stdout.Write(buf)
}

A digikam xmp would contain something like:

   <digiKam:TagsList>
    <rdf:Seq>
     <rdf:li>People/Herbert</rdf:li>
     <rdf:li>Mountains</rdf:li>
    </rdf:Seq>
   </digiKam:TagsList>

Any ideas what I got wrong?

Imports broken due to expired TLS certificate

Websites prove their identity via certificates, which are valid for a set time period. The certificate for trimmer.io expired on 8/17/2020.

Error code: SEC_ERROR_EXPIRED_CERTIFICATE

Please don't use vanity imports unless there's a really good reason.

xmp: parsing xml failed: XML syntax error on line 1: invalid UTF-8

Turns out nearly every photo from my phone has XMP data with "invalid UTF-8" in the makernotes.

Here's a snippet as produced by fmt.Printf("%s", packet):

fHy6To/e/1t1j���BCwxNl3v65eE2aZb08J5YkZUvsmV7C7loOkw2EJaL7VTYXHMXd3Rb0lCUy1Auhoe8gbS9ZafzrjIOOCVjpmk3H2WeRAiUwdRFieMdt+S1VvDOobXxCg6Nk8f16tykfSdpwFUi0XVaeVTiLYF00p4FBUPJET6THkH8HM5ol

This input results in xmp: parsing xml failed: XML syntax error on line 1: invalid UTF-8 within d.Decode() when it calls the stdlib xml.Decoder.Decode().

Here are the actual file bytes starting at the same offset as the snippet above. But you'll notice that after the ??? things seem to diverge. I'm not sure what is wrong here, I don't really know enough about encodings:

66  48  79  36  54  6F  2F  65  2F  31  74  31  6A  6A  6A  47  46  6A  77  76  72  70  6F  2B  35
73  4E  64  6E  67  42  75  6C  4D  33  48  2B  65  77  5A  4F  64  61  32  46  77  35  51  73  4A
39  6F  75  2F  4D  59  48  4D  31  43  79  57  6D  7A  47  5A  77  56  78  72  4D  68  4F  36  2B
70  51  7A  76  7A  4C  48  6C  53  36  FF  E1  0E  75  68  74  74  70  3A  2F  2F  6E  73  2E  61
64  6F  62  65  2E  63  6F  6D  2F  78  6D  70  2F  65  78  74  65  6E  73  69  6F  6E  2F  00  33
31  43  41  32  34  33  45  36  31  39  34  33  41  39  36  39  34  31  39  41  35  35  34  46  39
39  43  42  46  45  43  00  01  0D  DA  00  00  FF  B2  42  43  77  78  4E  6C  33  76  36  35  65
45  32  61  5A  62  30  38  4A  35  59  6B  5A  55  76  73  6D  56  37  43  37

The "invalid UTF-8" is apparently at the sequence of 6A 6A 6A on the first line there. Here's a screenshot of the file in my hex viewer:

[image: hex viewer screenshot]

In the hex viewer, it shows "j j j G F" for those bytes.

ExifTool and other XMP parsers handle things fine, so I'm sure this is technically valid. It's just that the Go XML parser doesn't like it. Do you know any way to fix it?
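To narrow down where the decoder gives up, it can help to locate the first byte that is not valid UTF-8. The sketch below does that with the standard library; firstInvalidUTF8 is a hypothetical helper, and the sample bytes are copied from the fourth row of the hex dump above. Note that 6A 6A 6A is plain ASCII ("jjj") and valid UTF-8 on its own, whereas the byte FF further down can never occur in UTF-8. In that dump FF E1 is followed by the ASCII text "http://ns.adobe.com/xmp/extension/", which may mean the packet runs across a JPEG segment boundary, but that is only a guess from the bytes shown.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalidUTF8 returns the offset of the first byte that does not
// begin a valid UTF-8 sequence, or -1 if b is valid throughout.
func firstInvalidUTF8(b []byte) int {
	for i := 0; i < len(b); {
		r, size := utf8.DecodeRune(b[i:])
		// DecodeRune signals an invalid encoding as (RuneError, 1).
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	// First 14 bytes of the fourth row of the hex dump.
	row := []byte{0x70, 0x51, 0x7A, 0x76, 0x7A, 0x4C, 0x48,
		0x6C, 0x53, 0x36, 0xFF, 0xE1, 0x0E, 0x75}

	fmt.Println(firstInvalidUTF8([]byte("jjjGF"))) // -1
	fmt.Println(firstInvalidUTF8(row))             // 10
}
```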

xmpMM: ResourceEvent fields tags.

I'm having a problem validating PDF MediaManagement metadata. When I write a document with the Version filled in, the PDF/A verifier (https://demo.verapdf.org/) marks it as an invalid XMP composition. If all the other fields besides the version are set, there is no problem, so it looks like a problem with the serialization of Versions or ResourceEvent.

When I compared the content of the XMP metadata, I found that the structure differs from an example posted on the Adobe forum.

This is a bag of versions created by go-xmp:

<rdf:Bag>
    <rdf:li stVer:comments="Optimize document to PDF/A standard" stVer:modifyDate="2021-12-08T13:50:51+01:00" stVer:version="1" rdf:parseType="Resource">
        <stVer:event stEvt:action="formatted" stEvt:instanceID="7a023287-49a7-4bf9-b809-4f99c10e0b67" stEvt:parameters="Optimized to PDF/A" stEvt:softwareAgent="Custom Software agent" stEvt:when="2021-12-08T13:50:51+01:00"></stVer:event>
    </rdf:li>
</rdf:Bag>

Compare this with an example from the Adobe forum (https://community.adobe.com/t5/photoshop-ecosystem-discussions/exif-history/td-p/9931177):

 <rdf:Seq>
   <rdf:li rdf:parseType="Resource">
        <stEvt:action>saved</stEvt:action>
        <stEvt:instanceID>xmp.iid:D8D91F3EE568E811852FBF74E8439D24</stEvt:instanceID>
        <stEvt:when>2018-06-05T18:24:03+01:00</stEvt:when>
        <stEvt:softwareAgent>Adobe Photoshop CS5 Windows</stEvt:softwareAgent>
        <stEvt:changed>/</stEvt:changed>
    </rdf:li>
</rdf:Seq>

It looks like Adobe writes the ResourceEvent fields as separate elements rather than as attributes.

Do you know how this could be solved?
If needed, I can help with a contribution.
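To make the difference between the two forms concrete, here is a small sketch that writes the same two event fields once as attributes and once as child elements using encoding/xml. The types eventAttrs and eventElems and the two marshal helpers are hypothetical. This is not how go-xmp serializes its models, and the sketch makes no claim about which form veraPDF accepts.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// eventAttrs writes its fields as XML attributes on rdf:li.
type eventAttrs struct {
	XMLName xml.Name `xml:"rdf:li"`
	Action  string   `xml:"stEvt:action,attr"`
	When    string   `xml:"stEvt:when,attr"`
}

// eventElems writes the same fields as child elements of rdf:li,
// matching the layout of the forum example.
type eventElems struct {
	XMLName   xml.Name `xml:"rdf:li"`
	ParseType string   `xml:"rdf:parseType,attr"`
	Action    string   `xml:"stEvt:action"`
	When      string   `xml:"stEvt:when"`
}

// marshalAttrs returns the attribute form of an event.
func marshalAttrs(action, when string) (string, error) {
	b, err := xml.Marshal(eventAttrs{Action: action, When: when})
	return string(b), err
}

// marshalElems returns the child-element form of an event.
func marshalElems(action, when string) (string, error) {
	b, err := xml.Marshal(eventElems{ParseType: "Resource", Action: action, When: when})
	return string(b), err
}

func main() {
	a, err := marshalAttrs("saved", "2018-06-05T18:24:03+01:00")
	if err != nil {
		panic(err)
	}
	e, err := marshalElems("saved", "2018-06-05T18:24:03+01:00")
	if err != nil {
		panic(err)
	}
	fmt.Println(a)
	fmt.Println(e)
}
```

The prefixes are written literally into the struct tags here, which is a shortcut for marshaling only; the namespace declarations would still have to be emitted on an enclosing element.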
