Video-Encoder

Documentation

Problem Statement

The project involves the design and implementation of an efficient encoder that packs a set of images into a single file, which can later be decoded and played as a video. The task is to combine the information from all the images and compress it into a file that is easy to store. In most cases compression is achieved by dropping a significant amount of information, which can reduce quality when the frames are later reconstructed. The challenge is to reduce the file size without losing much information, in other words without losing much video quality.

Description

A video is a sequence of images displayed at a certain rate. Instead of storing every image (or frame) in full, a video file can store information derived from all the frames and reconstruct them when the file is opened. This project does the same, but in an efficient way.

Much of the information in a frame is more or less similar to that in its nearby frames, and the video encoder developed here takes advantage of this. It implements the Ramer-Douglas-Peucker algorithm to drop redundant information and to store the significant information in such a way that the dropped values can be regenerated when required.

Encoding

Ramer-Douglas-Peucker Algorithm

For a given segment (a sequence of frames), the algorithm accepts an array of pixel values for one spatial coordinate. Some of the values in this array may differ significantly from their nearby values. Initially the first and last values in the array are marked and a straight line is assumed between the two. The algorithm sequentially checks all the values between the two points and marks the first value that is farthest from this line, provided its distance exceeds a parameter epsilon, which can be varied. It then recursively calls itself on the two subarrays generated by dividing the initial array at the marked value.

The algorithm returns the set of marked values, each with its frame number; these are the values that vary significantly from the values in nearby frames.

For each spatial coordinate, the Douglas-Peucker algorithm returns a set of frame numbers and corresponding pixel values, which is then stored.
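
The recursion described above can be sketched as follows. This is a minimal illustration and not the project's runDPAlgorithm(): the class name RdpSketch and the method simplify are hypothetical, and the distance from the line is assumed here to be the vertical difference in pixel value at a given frame, whereas the textbook algorithm uses perpendicular distance.

```java
import java.util.TreeMap;

/** Hypothetical sketch: Ramer-Douglas-Peucker over one pixel's values across frames. */
final class RdpSketch {

    /**
     * Returns the marked frames for one spatial coordinate, keyed by frame number.
     * values[i] is the pixel value at frame i; epsilon is the tolerance.
     */
    static TreeMap<Integer, Integer> simplify(int[] values, double epsilon) {
        TreeMap<Integer, Integer> kept = new TreeMap<>();
        if (values.length == 0) {
            return kept;
        }
        // The first and last values are always marked.
        kept.put(0, values[0]);
        kept.put(values.length - 1, values[values.length - 1]);
        simplify(values, 0, values.length - 1, epsilon, kept);
        return kept;
    }

    private static void simplify(int[] values, int first, int last,
                                 double epsilon, TreeMap<Integer, Integer> kept) {
        if (last - first < 2) {
            return; // no interior frames between the two marked points
        }
        // Straight line through (first, values[first]) and (last, values[last]).
        double slope = (double) (values[last] - values[first]) / (last - first);
        int farthest = -1;
        double maxDistance = 0.0;
        for (int i = first + 1; i < last; i++) {
            double predicted = values[first] + slope * (i - first);
            // Assumption: vertical distance, not perpendicular distance.
            double distance = Math.abs(values[i] - predicted);
            if (distance > maxDistance) { // strict comparison keeps the first of equal maxima
                maxDistance = distance;
                farthest = i;
            }
        }
        if (farthest != -1 && maxDistance > epsilon) {
            kept.put(farthest, values[farthest]);
            simplify(values, first, farthest, epsilon, kept);
            simplify(values, farthest, last, epsilon, kept);
        }
    }
}
```

For example, RdpSketch.simplify(new int[] {0, 10, 20, 30, 40}, 1.0) keeps only frames 0 and 4, because every interior value lies exactly on the line between them. A larger epsilon marks fewer frames and gives a smaller file at the cost of reconstruction accuracy.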

Decoding

All the frames are generated from the encoded video file. Since the Douglas-Peucker algorithm dropped every value between two marked points that lay less than epsilon from the line joining them, the dropped values are regenerated by assuming a straight line between the stored values. Applying this approach to every spatial coordinate regenerates all the frames, which can then be played at the required rate.
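
A minimal sketch of this reconstruction for a single spatial coordinate is shown below. The class name LinearDecodeSketch and the method reconstruct are hypothetical, and rounding interpolated values to the nearest integer is an assumption; the project's decompressSegment() may differ in both respects.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: rebuild one pixel's values for every frame from the stored frames. */
final class LinearDecodeSketch {

    /**
     * kept maps frame number to pixel value and must contain frame 0 and frame frameCount - 1.
     * Returns the pixel value for every frame, interpolating linearly between stored frames.
     */
    static int[] reconstruct(TreeMap<Integer, Integer> kept, int frameCount) {
        int[] values = new int[frameCount];
        for (int frame = 0; frame < frameCount; frame++) {
            // Nearest stored frames at or before, and at or after, this frame.
            Map.Entry<Integer, Integer> before = kept.floorEntry(frame);
            Map.Entry<Integer, Integer> after = kept.ceilingEntry(frame);
            if (before == null || after == null) {
                throw new IllegalArgumentException("frame " + frame + " is outside the stored range");
            }
            if (before.getKey().equals(after.getKey())) {
                values[frame] = before.getValue(); // this frame was stored, use it as is
            } else {
                double t = (double) (frame - before.getKey()) / (after.getKey() - before.getKey());
                // Assumption: round the interpolated value to the nearest integer.
                values[frame] = (int) Math.round(
                        before.getValue() + t * (after.getValue() - before.getValue()));
            }
        }
        return values;
    }
}
```

The sorted keys of a TreeMap make the two neighbouring stored frames available through floorEntry and ceilingEntry, which is one reason a sorted map suits this layout.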

Features

  • Able to encode up to 100 grayscale images (JPEG, PNG) of the same dimensions into one encoded file in under 5 minutes.

    The images are read and each frame is stored as a BufferedImage object, which gives the pixel data for every spatial coordinate. For each spatial coordinate, the Ramer-Douglas-Peucker algorithm extracts the frame numbers, with their pixel values, at which the pixel value changes significantly. For each spatial coordinate, a TreeMap stores each frame number and its corresponding pixel value, since the map is sorted according to the natural ordering of its keys. TreeMap provides guaranteed log(n) time for the get, put and remove operations. An ArrayList holding the TreeMap for each spatial coordinate is written out using BufferedOutputStream.

  • Able to play back the grayscale images from the encoded file at a rate of at least 10 frames per second.

    The pixel values are retrieved from the encoded file and all the frames are generated as an array of BufferedImage. For each spatial coordinate, the pixel values that were dropped during encoding are regenerated: the values between two stored values are generated by assuming a straight line between them.

  • A command line user interface for encoding and viewing.

    To create an encoded file, the format of the command line input is:

        encode [file1][file2]...[filen] --output [outputfile]

    To play an encoded file, the format of the command line input is:

        view [outputfile]
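
One way the two command forms above could be recognised is sketched below. The class CliSketch, its fields, and the error message are hypothetical and are not taken from VideoEncoderMain; the sketch only assumes the argument order shown above, with --output as the second-to-last argument of encode.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: parse "encode ... --output file" and "view file". */
final class CliSketch {
    final String command;      // "encode" or "view"
    final List<String> inputs; // image files to encode; empty for "view"
    final String encodedFile;  // file to write (encode) or to play (view)

    private CliSketch(String command, List<String> inputs, String encodedFile) {
        this.command = command;
        this.inputs = inputs;
        this.encodedFile = encodedFile;
    }

    static CliSketch parse(String[] args) {
        if (args.length == 2 && args[0].equals("view")) {
            return new CliSketch("view", new ArrayList<String>(), args[1]);
        }
        // encode needs at least one image, then --output, then the output file.
        if (args.length >= 4 && args[0].equals("encode")
                && args[args.length - 2].equals("--output")) {
            List<String> inputs = new ArrayList<>();
            for (int i = 1; i < args.length - 2; i++) {
                inputs.add(args[i]);
            }
            return new CliSketch("encode", inputs, args[args.length - 1]);
        }
        throw new IllegalArgumentException(
                "expected: encode [file1]...[filen] --output [outputfile] | view [outputfile]");
    }
}
```

Under these assumptions, CliSketch.parse(new String[] {"encode", "a.png", "b.png", "--output", "out.bin"}) yields the command "encode", the inputs a.png and b.png, and the encoded file out.bin.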

References

Code

The project contains four classes:

  1. VideoEncoderMain: The main class, which accepts the command line arguments and performs the required task (encoding or playing).
  2. VFrame: The constructor of this class accepts a BufferedImage and a frame number, and stores the pixel data as a byte array.
  3. VSegment: This class is used both for encoding and for playback. Its methods are:
     3.1 addFrames(): This method accepts an array of VFrame objects and stores them in another array of VFrame objects for encoding.
     3.2 compressSegment(): This method accepts the output file name and a parameter epsilon, which trades memory space for quality. It calls runDPAlgorithm() (described below), which runs the Douglas-Peucker algorithm on each spatial coordinate. compressSegment() creates a TreeMap storing the significant frame numbers and corresponding pixel values, and stores it in an ArrayList covering all spatial coordinates.
     3.3 decompressSegment(): This method is called during playback of the encoded file. It accepts the output file and generates the same number of frames as were encoded. It regenerates the pixel values that were dropped by the Douglas-Peucker algorithm and returns an array of BufferedImage to be played.
     3.4 runDPAlgorithm(): This method accepts the spatial coordinate, the length of the segment (the number of frames given to Douglas-Peucker) and the parameter epsilon. It returns a TreeMap of significant frame numbers and corresponding pixel values for that spatial coordinate.
  4. VideoPlayer: The constructor of this class accepts the array of BufferedImage returned by decompressSegment() of class VSegment and the frames-per-second rate at which the video is to be played.
     4.1 playVideo(): This method creates a JFrame object, to which the VideoPlayer object is passed, and plays the frames at the required rate.
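
As an illustration of the byte-array storage described for VFrame, the sketch below copies the gray level of every pixel out of a BufferedImage. The class FramePixelsSketch and the method grayBytes are hypothetical, and the sketch assumes the image is already grayscale so that band 0 of its raster holds the gray level; the actual VFrame constructor may read the pixels differently.

```java
import java.awt.image.BufferedImage;
import java.awt.image.Raster;

/** Hypothetical sketch: copy a grayscale frame's pixels into a byte array. */
final class FramePixelsSketch {

    /**
     * Returns one byte per pixel in row-major order (index = y * width + x).
     * Assumption: the image is grayscale, so band 0 holds the gray level (0 to 255).
     */
    static byte[] grayBytes(BufferedImage image) {
        int width = image.getWidth();
        int height = image.getHeight();
        Raster raster = image.getRaster();
        byte[] pixels = new byte[width * height];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Java bytes are signed; the cast keeps the low 8 bits of the sample.
                pixels[y * width + x] = (byte) raster.getSample(x, y, 0);
            }
        }
        return pixels;
    }
}
```

Because Java bytes are signed, a stored value has to be read back as pixels[i] & 0xFF to recover the gray level in the range 0 to 255.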

Sample Data

100 sample images are contained in the folder named Images in the src folder of the root directory of the repository.

video-encoder's People

Contributors

sanyakalra
