tinyexif's Introduction

TinyEXIF: Tiny ISO-compliant C++ EXIF and XMP parsing library for JPEG

Introduction

TinyEXIF is a tiny, lightweight C++ library for parsing the metadata embedded in JPEG files. No third-party dependencies are needed to parse EXIF data; however, accessing XMP data requires the TinyXML2 library. TinyEXIF is easy to use: simply copy the two source files into your project and pass the JPEG data to the EXIFInfo class. Currently, common information such as the camera make/model, original resolution, timestamp, focal length, lens info, F-stop/exposure time, and GPS information embedded in the EXIF/XMP metadata is fetched. It is easy, though, to extend the library with any missing or new EXIF/XMP fields.

Usage example

#include "TinyEXIF.h"
#include <iostream> // std::cout
#include <fstream>  // std::ifstream
#include <vector>   // std::vector

int main(int argc, const char** argv) {
	if (argc != 2) {
		std::cout << "Usage: TinyEXIF <image_file>" << std::endl;
		return -1;
	}

	// open a stream to read just the necessary parts of the image file
	std::ifstream istream(argv[1], std::ifstream::binary);

	// parse image EXIF and XMP metadata
	TinyEXIF::EXIFInfo imageEXIF(istream);
	if (imageEXIF.Fields)
		std::cout
			<< "Image Description " << imageEXIF.ImageDescription << "\n"
			<< "Image Resolution " << imageEXIF.ImageWidth << "x" << imageEXIF.ImageHeight << " pixels\n"
			<< "Camera Model " << imageEXIF.Make << " - " << imageEXIF.Model << "\n"
			<< "Focal Length " << imageEXIF.FocalLength << " mm" << std::endl;
	return 0;
}

See main.cpp for more details.

Copyright

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FREEBSD PROJECT OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

tinyexif's People

Contributors

cdcseacave, reunanen, sdicke, shinji-yoshida, simfeo

tinyexif's Issues

Integer overflow in TinyEXIF::EntryParser::parseString

Hey

Inside parseString, the arguments data and num_components are entirely user controlled. They are added together with base, and the sum is compared against len to check whether the string can be read safely. The problem is that both data and num_components are unsigned int, so if the user provides a big enough value, the sum wraps around and the check passes when it should not:

	static std::string parseString(const uint8_t* buf,
		unsigned num_components,
		unsigned data,
		unsigned base,
		unsigned len,
		bool intel)
	{
		std::string value;
		if (num_components <= 4) {
                // ...
		} else
                // [1] Calculation here, data = 0xffffffff, num_components < len
		if (base+data+num_components <= len) {
			const char* const sz((const char*)buf+base+data);
			unsigned num(0);
                        // [2] segfault when we check `sz[num] != '\0'`
			while (num < num_components && sz[num] != '\0')
				++num;
			while (num && sz[num-1] == ' ')
				--num;
			// Copy `num` chars from `sz` into `value`.
			value.assign(sz, num);
		}
		return value;
	}
};

If, for example, we set data to 0xffffffff and num_components to a small value, the calculation will overflow and the result can be less than len. Once this check passes, we add data to our buf pointer. In this case, sz will point to unmapped memory owing to the addition of 0xffffffff, and dereferencing it to check for null bytes will cause a segfault.

To mitigate this, perhaps you could add a small check before [1] to ensure it doesn't overflow.

Note that this doesn't have to be a segfault. With smaller values, this will allow us to read out of bounds on the heap, reading contents from other chunks and copying them into value.
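
One possible shape for such a check, as a minimal sketch: instead of summing the three values, compare each one against the space that remains, so no intermediate result can wrap. The helper name rangeFits is hypothetical and not part of TinyEXIF; it takes the same base, data, num_components and len values as parseString.

```cpp
#include <cassert>

// Hypothetical helper, not part of TinyEXIF: returns true only when the byte
// range [base+data, base+data+num_components) lies inside a buffer of len bytes.
// Every comparison is against the space still remaining, so nothing can wrap.
static bool rangeFits(unsigned base, unsigned data, unsigned num_components, unsigned len) {
	if (base > len) return false;               // base itself is out of range
	if (data > len - base) return false;        // offset exceeds what is left after base
	return num_components <= len - base - data; // string must fit in the remainder
}

// Quick checks, including the wrapping input described in the report.
static void checkRangeFits() {
	assert(rangeFits(6, 10, 8, 100));           // ordinary in-bounds entry
	assert(!rangeFits(6, 0xffffffffu, 8, 100)); // the naive sum would wrap and pass
}
```

With a helper like this, the condition at [1] could read `if (rangeFits(base, data, num_components, len))`; the rest of the function would stay as it is.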

Thanks for your time.

Error in strrnstr() when compiling with _UNICODE

In the line:

				0 == _tcsncmp(haystack, needle, needle_len))

You are using the TCHAR-ized name of the strncmp() function on char strings, which only works if the project is always compiled with multi-byte strings. This results in a compile error when the project is compiled with UNICODE support.

That's not the only place where this happens, and frankly I don't see the point of using those names if the rest of the code uses std::string, which is char *, not wchar_t *.
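
As a rough illustration of the char-only alternative, here is a sketch of a "find last occurrence" search that calls std::strncmp directly. The name findLast and its signature are assumptions made for this example; they are not copied from TinyEXIF's strrnstr().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a reverse substring search over char data.
// Returns a pointer to the last occurrence of needle within the first
// haystack_len bytes of haystack, or nullptr when there is none.
static const char* findLast(const char* haystack, const char* needle, std::size_t haystack_len) {
	const std::size_t needle_len = std::strlen(needle);
	if (needle_len == 0 || needle_len > haystack_len)
		return nullptr;
	// Walk candidate start positions from the last possible one down to 0.
	for (std::size_t i = haystack_len - needle_len + 1; i-- > 0; ) {
		if (haystack[i] == needle[0] &&
			0 == std::strncmp(haystack + i, needle, needle_len))
			return haystack + i;
	}
	return nullptr;
}

// Quick check: the last "bc" in "abcabc" starts at index 4.
static void checkFindLast() {
	const char* s = "abcabc";
	assert(findLast(s, "bc", 6) == s + 4);
}
```

Because only char and the plain C string functions are involved, a sketch like this compiles the same way whether or not _UNICODE is defined.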

Add list of fields found

It would be great to have a vector or set containing the names of the fields that were found when decoding the EXIF data.
Not having it is a problem since the fields have default values that could be valid and (I guess that) it is not possible to detect otherwise.

Another option is to use std::optional on the fields, which may be better, but may break old implementations if/when they update TinyEXIF. Also, it requires C++17.

An example where this problem may arise is when 0 is both the default for an unknown value and a possible input value: developers will not be able to distinguish between a genuine 0 and a missing field.

I just added a branch to my fork of this repository that applies the change above, but I'm unsure if I should create a pull request for it.
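
To make the first option concrete, here is a minimal sketch of a struct that records which fields were actually seen. ParsedFields, setFocalLength and has are hypothetical names used only for illustration; this is neither TinyEXIF's EXIFInfo nor the code on the branch mentioned above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical illustration: values keep their plain defaults, and a set
// remembers the names of the fields that were present in the metadata.
struct ParsedFields {
	double FocalLength = 0;      // the default alone cannot signal "missing"
	std::set<std::string> found; // names of fields that were actually decoded

	// A parser would call this when it decodes the corresponding tag.
	void setFocalLength(double v) {
		FocalLength = v;
		found.insert("FocalLength");
	}

	bool has(const std::string& name) const {
		return found.count(name) != 0;
	}
};

// Quick check: a stored 0 is distinguishable from a field that never appeared.
static void checkParsedFields() {
	ParsedFields info;
	assert(!info.has("FocalLength"));
	info.setFocalLength(0.0);
	assert(info.has("FocalLength"));
}
```

This keeps existing member types unchanged and needs only C++11, at the cost of string lookups; the std::optional route avoids the extra container but changes the field types.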

DLL support seems broken

I compile a DLL by setting the TINYEXIF_EXPORT macro, and upon linking to it with TINYEXIF_IMPORT in another project I get a lot of C4251 warnings of the following form:

'TinyEXIF::EXIFInfo::ImageDescription': class 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>' needs to have dll-interface to be used by clients of class 'TinyEXIF::EXIFInfo' (compiling source file ImageCodecJPEG.cpp)
2>C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\xstring(4440): message : see declaration of 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>' (compiling source file ImageCodecJPEG.cpp)

There seem to be two approaches to solving the problem:

  1. Exporting STL Components Inside & Outside of a Class
  2. Using dynamic allocation/deallocation of STL structures:
class EXPORTED ExportedClass
{
private:
    std::vector<int> *_integers;
public:
    ExportedClass()
    {
        _integers = new std::vector<int>();
    }
    ~ExportedClass()
    {
        delete _integers;
    }
};

Example impractical for large images

The example usage suggests reading the entire file into memory:

	std::ifstream file(argv[1], std::ifstream::in|std::ifstream::binary);
	file.seekg(0,std::ios::end);
	std::streampos length = file.tellg();
	file.seekg(0,std::ios::beg);
	std::vector<uint8_t> data(length);
	file.read((char*)data.data(), length);

This is impractical for large files, and many of us deal with very large images (360 panoramas easily exceed 20MB). I have just spent ~2hrs studying the JPEG format to assure myself that the APP1 segment can only live before the SOS and, with far less confidence, that APP0 and APP1 must immediately follow SOI (which yields a max offset of 120kB instead of the whole image). Studying JPEG is precisely what tinyexif should relieve users from needing to do. It would be a massive favour if the authors, certainly knowledgeable in JPEG innards, fixed the example usage with this subtlety in mind.
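
For illustration only, the idea of stopping before the image data can be sketched as a loop that copies marker segments until it meets SOS. The helper readJpegHeader is hypothetical and does not call TinyEXIF; it assumes every segment before SOS carries a two-byte big-endian length and that there are no fill bytes between segments, which holds for typical files but is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream> // std::istringstream, handy for in-memory checks
#include <string>
#include <vector>

// Hypothetical helper: copies SOI and every following marker segment up to,
// but not including, SOS (0xFFDA) or EOI (0xFFD9). Returns an empty vector
// when the stream does not start with a JPEG SOI marker.
static std::vector<uint8_t> readJpegHeader(std::istream& in) {
	std::vector<uint8_t> out;
	char soi[2];
	if (!in.read(soi, 2) || uint8_t(soi[0]) != 0xFF || uint8_t(soi[1]) != 0xD8)
		return out;
	out.assign(soi, soi + 2);
	for (;;) {
		char hdr[4]; // 0xFF, marker id, length high byte, length low byte
		if (!in.read(hdr, 4))
			break;
		const uint8_t marker = uint8_t(hdr[1]);
		if (uint8_t(hdr[0]) != 0xFF || marker == 0xDA || marker == 0xD9)
			break;
		// The stored length counts its own two bytes but not the marker.
		const std::size_t segLen = (std::size_t(uint8_t(hdr[2])) << 8) | uint8_t(hdr[3]);
		if (segLen < 2)
			break;
		const std::size_t pos = out.size();
		out.insert(out.end(), hdr, hdr + 4);
		out.resize(pos + 2 + segLen);
		if (!in.read(reinterpret_cast<char*>(out.data() + pos + 4), segLen - 2)) {
			out.resize(pos); // drop a truncated segment
			break;
		}
	}
	return out;
}
```

On a stream opened as in the snippet above, this reads only the segments that precede the compressed image data, so memory use depends on the metadata size rather than the file size. Whether the resulting buffer is accepted as-is by a given parser is a separate question that this sketch does not settle.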

Third party dependencies required

The Readme leaves the impression that this library has no "third party dependencies", but it includes tinyxml2. It would be nice if that could be pointed out. Or you could just supply tinyxml2.cpp and tinyxml2.h with the source.
