This project implements Bit-Vector Encoding and Dictionary Encoding for database compression. Beware: one of the implementations contains a memory-corrupting bug that has not been located yet.
The purpose of the programming tasks is to deepen your knowledge of selected aspects of the lecture. This year, we decided to focus on compression techniques in column-oriented database management systems. We chose C++ as the programming language because, apart from C, it is the language most frequently used for database management systems.
The task is to implement compression techniques in our framework. We provide a set of classes as a starting point, to which you have to add an implementation that conforms to a given interface. You can download the sources here. A set of unit tests will help you identify errors during development. The same unit tests will be used at the end of the term to validate your solution. A working implementation is a prerequisite for participating in the exam!
You may choose among the following compression techniques (you may also suggest other compression techniques):
- Run Length Encoding
- Delta Coding
- Bit-Vector Encoding
- Dictionary Encoding
- Frequency Partitioning
All compression techniques are explained in the lecture.
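To make two of the listed techniques concrete, here is a minimal sketch of Dictionary Encoding and Bit-Vector Encoding on a column of strings. It does not use the framework's interface: the names `DictionaryColumn`, `dictionary_encode`, `dictionary_decode`, `BitVectorColumn`, and `bitvector_encode` are hypothetical and chosen only for illustration, and the sorted dictionary is one possible design choice, not necessarily the one presented in the lecture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical container for a dictionary-encoded column (not part of the framework).
struct DictionaryColumn {
    std::vector<std::string> dictionary;  // distinct values in sorted order
    std::vector<std::uint32_t> codes;     // one dictionary index per row
};

DictionaryColumn dictionary_encode(const std::vector<std::string>& column) {
    // First pass: collect the distinct values; std::map keeps them sorted.
    std::map<std::string, std::uint32_t> index;
    for (const auto& value : column) index.emplace(value, 0);

    // Assign consecutive codes in sorted order and build the dictionary.
    DictionaryColumn out;
    out.dictionary.reserve(index.size());
    for (auto& entry : index) {
        entry.second = static_cast<std::uint32_t>(out.dictionary.size());
        out.dictionary.push_back(entry.first);
    }

    // Second pass: replace every value by its code.
    out.codes.reserve(column.size());
    for (const auto& value : column) out.codes.push_back(index.at(value));
    return out;
}

std::vector<std::string> dictionary_decode(const DictionaryColumn& encoded) {
    std::vector<std::string> column;
    column.reserve(encoded.codes.size());
    for (std::uint32_t code : encoded.codes) {
        // An out-of-range code would read past the dictionary, so check it.
        assert(code < encoded.dictionary.size());
        column.push_back(encoded.dictionary[code]);
    }
    return column;
}

// Bit-vector encoding: one bit vector per distinct value;
// bit i is set if and only if row i holds that value.
using BitVectorColumn = std::map<std::string, std::vector<bool>>;

BitVectorColumn bitvector_encode(const std::vector<std::string>& column) {
    BitVectorColumn out;
    for (std::size_t row = 0; row < column.size(); ++row) {
        auto it = out.find(column[row]);
        if (it == out.end()) {
            // First occurrence of this value: start with an all-zero vector.
            it = out.emplace(column[row],
                             std::vector<bool>(column.size(), false)).first;
        }
        it->second[row] = true;
    }
    return out;
}
```

For the column `{"b", "a", "b", "c", "a"}`, this sketch yields the dictionary `{"a", "b", "c"}` with the codes `{1, 0, 1, 2, 0}`, and the bit vector for `"a"` is `01001`. The bounds check in `dictionary_decode` guards against the kind of out-of-range access that typically causes memory corruption in decoders.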