zredshift / mimemagic

Powerful and versatile MIME sniffing package using pre-compiled glob patterns, magic number signatures, XML document namespaces, and tree magic for mounted volumes, generated from the XDG shared-mime-info database.

License: GNU General Public License v2.0

Go 100.00%
mime mime-types mime-database filetype detection sniffing magic-numbers extension golang go

mimemagic's Introduction

mimemagic

License

The generated code in magicsigs.go, globs.go, treemagicsigs.go, namespaces.go and mediatypes.go makes this package a derivative work of shared-mime-info, so the package falls under the GPL-2.0-or-later license. See the discussion.

For an MIT-licensed branch with the generated code removed, please see this. To redistribute a compiled executable under that license, however, you would still need to supply a (hypothetical) permissively licensed freedesktop.org.xml file for parsing.

Features

  • All in native Go, with no outside dependencies or C library bindings
  • 1003 MIME types, with a description, an acronym (where available), common aliases, extensions, icons, and subclasses
  • 493 magic signature tests (comprising 1147 individual patterns), featuring range searches and bit masks, as per the XDG specification
  • 1099 glob patterns, for filename-based matching
  • 11 Tree Magic signatures and 28 XML namespace/local name pairs, offered for completeness' sake
  • Included is the XML file parser to generate your own MIME definitions
  • Also included is a CLI based on this library that is fully featured and blazing-fast, beating the native 'file' and KDE's 'kmimetypefinder' in performance
  • Cross-platform support
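
To make the "range searches and bit masks" item concrete, here is a minimal sketch of how a single magic pattern of that kind can be tested against a buffer. The `magicPattern` type and its fields are hypothetical names chosen for this illustration; they are not mimemagic's internal types, and the real generated signatures are more elaborate (nested tests, word sizes, priorities).

```go
package main

import (
	"bytes"
	"fmt"
)

// magicPattern is a hypothetical type for this sketch, not mimemagic's own.
type magicPattern struct {
	value       []byte // bytes to look for
	mask        []byte // optional; assumed to be the same length as value
	start       int    // first offset to try
	rangeLength int    // number of consecutive offsets to try (at least 1)
}

// matchAt reports whether the pattern matches data at exactly offset off.
func (p magicPattern) matchAt(data []byte, off int) bool {
	if off < 0 || off+len(p.value) > len(data) {
		return false
	}
	window := data[off : off+len(p.value)]
	if len(p.mask) == 0 {
		return bytes.Equal(window, p.value)
	}
	// With a mask, only the bits set in the mask take part in the comparison.
	for i := range p.value {
		if window[i]&p.mask[i] != p.value[i]&p.mask[i] {
			return false
		}
	}
	return true
}

// match tries every offset in [start, start+rangeLength).
func (p magicPattern) match(data []byte) bool {
	for off := p.start; off < p.start+p.rangeLength; off++ {
		if p.matchAt(data, off) {
			return true
		}
	}
	return false
}

func main() {
	// Fixed offset: the gzip magic number 0x1f 0x8b at offset 0.
	gzip := magicPattern{value: []byte{0x1f, 0x8b}, start: 0, rangeLength: 1}
	fmt.Println(gzip.match([]byte{0x1f, 0x8b, 0x08, 0x00})) // true

	// Range search: look for "<svg" anywhere in the first 64 offsets.
	svg := magicPattern{value: []byte("<svg"), start: 0, rangeLength: 64}
	fmt.Println(svg.match([]byte("<?xml version=\"1.0\"?>\n<svg>"))) // true

	// Bit mask: 0xdf clears the ASCII case bit, so "riff" also matches "RIFF".
	riff := magicPattern{
		value:       []byte("riff"),
		mask:        []byte{0xdf, 0xdf, 0xdf, 0xdf},
		start:       0,
		rangeLength: 1,
	}
	fmt.Println(riff.match([]byte("RIFF...."))) // true
}
```

The three example patterns are illustrative only; the offsets and masks used by the actual database entries may differ.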

Installation

The library:

go get github.com/zRedShift/mimemagic/v2

The CLI:

go install github.com/zRedShift/mimemagic/v2/cmd/mimemagic@latest

API

See the Godoc reference, and cmd/mimemagic for an example implementation.

Usage

The library:

package main

import (
	"fmt"
	"github.com/zRedShift/mimemagic/v2"
	"strings"
)

func main() {
	// Ignoring Read errors that might arise
	mimeType, _ := mimemagic.MatchFilePath("sample.svgz", -1)

	// image/svg+xml-compressed
	fmt.Println(mimeType.MediaType())

	// compressed SVG image
	fmt.Println(mimeType.Comment)

	// SVG (Scalable Vector Graphics)
	fmt.Printf("%s (%s)\n", mimeType.Acronym, mimeType.ExpandedAcronym)

	// application/gzip
	fmt.Println(strings.Join(mimeType.SubClassOf, ", "))

	// .svgz
	fmt.Println(strings.Join(mimeType.Extensions, ", "))

	// This is an image.
	switch mimeType.Media {
	case "image":
		fmt.Println("This is an image.")
	case "video":
		fmt.Println("This is a video file.")
	case "audio":
		fmt.Println("This is an audio file.")
	case "application":
		fmt.Println("This is an application.")
	default:
		fmt.Printf("This is a(n) %s.\n", mimeType.Media)
	}

	// true
	fmt.Println(mimeType.IsExtension(".svgz"))
}

The CLI:

Usage: mimemagic [options] <file> ...
Determines the MIME type of the given file(s).

Options:
  -c    Determine the MIME type of the file(s) using only its content.
  -f    Determine the MIME type of the file(s) using only the file name. Does
        not check for the file's existence. The -c flag takes precedence.
  -i    Output the MIME type in a human readable format.
  -l int
        The number of bytes from the beginning of the file mimemagic will
        examine. Reads the entire file if set to a negative value. By default
        mimemagic will only read the first 512 bytes from stdin, however setting this
        flag to a non-default negative value will override this. (default -1)
  -t    Determine the MIME type of the directory/mounted volume using tree
        magic. Can't be used in conjunction with -c, -f or -x.
  -x    Determine the MIME type of the xml file(s) using the local names and
        namespaces within. Can't be used in conjunction with -c, -f or -t.

Arguments:
  file
        The file(s) to test. '-' to read from stdin. If '-' is set, all other
        inputs will be ignored.

Examples:
  $ mimemagic -c sample.svgz
    	application/gzip
  $ mimemagic *.svg*
    	Olympic_rings_with_transparent_rims.svg: image/svg+xml
    	Piano.svg.png: image/png
    	RAID_5.svg: image/svg+xml
    	sample.svgz: image/svg+xml-compressed
  $ cat /dev/urandom | mimemagic -
    	application/octet-stream
  $ ls software; mimemagic -i -t software/
    	autorun
    	UNIX software

Benchmarks

See Benchmarks. For Match(), the average across more than 400 completely different files (each representing a unique MIME type) is 13 ± 7 μs/op. For MatchGlob() it's 900 ± 200 ns/op, and for MatchMagic() it's 12 ± 7 μs/op.

mimemagic's People

Contributors

dependabot[bot], fho, zredshift

mimemagic's Issues

How to return "inode/symlink" mime-type for symlink file?

Is there a way to avoid following a symlink to its target file and return "inode/symlink" directly?

I wrote a simple example here. As you can see, the MIME type of the symlink bar is the same as that of its source file foo.

It seems that mediatypes.go already supports the symlink media type. How can I use the mimemagic APIs to achieve this?

Thanks for your help!

CLI install instructions don't work

Instructions

mimemagic/README.md

Lines 42 to 45 in 5028b72

The CLI:
```bash
go get github.com/zRedShift/mimemagic/v2/cmd/mimemagic
```

Result

/tmp $ go get github.com/zRedShift/mimemagic/v2/cmd/mimemagic
go: go.mod file not found in current directory or any parent directory.
	'go get' is no longer supported outside a module.
	To build and install a command, use 'go install' with a version,
	like 'go install example.com/cmd@latest'
	For more information, see https://golang.org/doc/go-get-install-deprecation
	or run 'go help get' or 'go help install'.
/tmp [1]$

I'm on go version go1.21.1 darwin/arm64.

freedesktop.org.xml file license

I've historically been the maintainer of shared-mime-info for around 15 years, and cmd/parser/freedesktop.org.xml looks like a copy of the database shipped with shared-mime-info, which is released under the GPL, with the work of shared-mime-info's translators merged in and the GPL header removed.

The license that you're shipping mimemagic under (MIT) isn't compatible with shared-mime-info's.

There are a number of possibilities to fix this problem:

  • change the mimemagic license to be GPL compatible
  • parse the XML file that shared-mime-info ships at runtime, and don't ship it in a codebase with an incompatible license

Using a GPL file as a source makes your whole codebase a derived work, making it all GPL, so I think it's pretty important that this problem gets corrected before somebody uses it in a pure MIT codebase, or a closed-source application.

You will also need to re-add the GPL header to the shared-mime-info XML file as a matter of urgency.
