Git Product home page Git Product logo

gyakobo / sparse_matrix Goto Github PK

View Code? Open in Web Editor NEW
1.0 1.0 0.0 168 KB

This project aims utilize a sparse matrix as form of matrix or image value compression by basically implementing a special kind of data structure where it omits one continuously recurring value ultimately saving space only for "important" variables.

License: MIT License

C 96.50% Shell 3.50%
c compression-algorithm sparse-matrix malloc c-struct image-compression weissman-score njit

sparse_matrix's Introduction

Using a Sparse Matrix for lossless data compression

image image image image

Author: Andrew Gyakobo

This project mainly aims to utilize a sparse matrix as form of matrix or image value compression by basically implementing a special kind of data structure where it omits one continuously recurring value ultimately saving space only for the "important" variables.

Note

Just a small personal note and a small gag per say, this program is written in clean C and has no errors or warnings. The program was run and precompiled into an .exe with the following bash command:

$ sudo gcc -ansi -Wpedantic -Wextra -Wall main.c -o exe

Introduction

Giving a sample matrix of numbers:

0 0 0 8 0 0 0 3 0 0 0
0 0 0 0 0 3 0 0 0 0 8
2 0 0 0 0 0 0 0 0 0 0
0 0 7 0 0 0 0 5 0 0 0
0 0 0 8 0 0 0 0 0 8 0

We immediately notice the overwhelming issue with this matrix that it has too many 0's which don't necessarilly need to use up space hence they could just be left out.

The Sparse Matrix's compression rate fully depends on the rate of the recurrent value is being omitted.

Review of the Sparse Matrix Data Structure

  1. As the program scans the txt file row by row each non-omitted number is subsequently stored in a sample "node" in a struct called Element. Each essential value is then saved in the value integer in struct Element. The code snippet is as follows:
struct Element {
    int value;
    int row;
    int column;
    struct Element * rowElement; // Pointer to next row Element
    struct Element * colElement; // Pointer to next column Element
};
  1. Afterwards, after the struct Element object is created, alongside it two more concomitant objects are created as well. Here this is where we'd have to utilize pointers and a little bit of our imagination. Using void* malloc we allocate space for (basically create) two struct Header structs, each of which would be pointing to both the aforementioned created element and the next struct Header element. If you haven't noticed already we're creating a table of sorts with specific column and row objects pointing to said element objects. The struct Header objects would also connect to each other using once again pointers struct Header * header;
struct Header {
    int index;
    struct Header * header;
    struct Element * element;
};
  1. The entire structure can then be saved by just having two pointers to the column and row headers:
struct Header * headRowHeader;
struct Header * headColHeader;

The headers in their place would contain the said element structure. We could also add other elements to the struct Matrix like number of rows, columns, and the arbitrary recurrent value that would be omitted.

struct Matrix {
    struct Header * headRowHeader;
    struct Header * headColHeader;

    int rows;
    int cols;
    int fillValue;
};

Diagram of a sparse matrix

To hearken back to the matrix back in our introduction:

0 0 0 8 0 0 0 3 0 0 0
0 0 0 0 0 3 0 0 0 0 8
2 0 0 0 0 0 0 0 0 0 0
0 0 7 0 0 0 0 5 0 0 0
0 0 0 8 0 0 0 0 0 8 0

This is a visual representation of how our data structure would ultimately look like:

Note

Assuming the image per say gets bigger or is expanded in its size, then hypothetically the size of the sparse matrix should remain unscaved as only the gaps(numbers equal to zero) between the relevant values(in our case any number not equal to zero) would be prone to change. In simpler terms, if the image is shrunk or expanded the overall information weight remains the same.

Compression Rates and Analysis

When it comes to compression rates let's test the program out on the following data:

1 1
1 1

main.c would then output the following snippet:

index: 1
val: 1
val: 1

index: 2
val: 1
val: 1

==========================

index: 1
val: 1
val: 1

index: 2
val: 1
val: 1


Size of sparse mtrx: 96
Size of simpler mtrx: 8

Important

The first two indices above the division line indicates the first two columns, whilst the second two indices - the first two rows.

As we can see from the displayed sizes of the sparse matrix and just a normal array the array is uses 16 bytes and the sparse matrix utilizes 96 bytes. You might immediately mention that the sparse matrix data structure is not efficient with this example and you'd be definitely right. Let's however take another example into consideration. Below is practically the same data table but skewed to the right:

0 0 1 1
0 0 1 1

In this case we get the following sizes. As you can see the sparse matrix remains unscaved in its size 96 bytes, however, the array only increments with every new number. Current array size: 32 bytes.

Size of sparse mtrx: 96
Size of simpler mtrx: 32

Such behaviour would be encountered repeatedly until the size of the said examined data structure would surpass the array itself.

Here are a list of examples proving the previous point:

0 0 0 0 1 1
0 0 0 0 1 1
Size of sparse mtrx: 96
Size of simpler mtrx: 48
0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 1
Size of sparse mtrx: 96
Size of simpler mtrx: 64

License

MIT

sparse_matrix's People

Contributors

gyakobo avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.