Git Product home page Git Product logo

ssp's Introduction

Subset Sum problem

Each problem requires a solution. And the subset sum problem is not an exception. Here, you can find an algorithm that effectively solves this problem in natural numbers for density > 1.

Where density is:

$$d = n / log2(max(S))$$

A formal definition of the subset sum problem

  • A set of positive integers S is given

  • Also given a target X

  • Find a subset s โˆˆ S such as:

$$\sum_{i=0}^n s_n = X$$

Usage

Achtung! Windows support was not properly implemented and tested!

To get started:

git clone [email protected]:kirilenkobm/ssp.git
cd ssp
./_test.sh

To generate datasets:

make rnd
./bin/generate_input [num_num] [max_num] [min_num] [to_ans] [output_file]

To repeat dataset I used for evaluations:

./create_test_set.sh

SSP.py usage:

usage: SSP.py [-h] [--subset_size SUBSET_SIZE] [--get_density] [--deep]
              [--verbose] [--ext_out]
              input requested_sum

positional arguments:
  input                 Text file containing input numbers, or stdin stream,
                        just write stdin for that
  requested_sum         Sum requested

optional arguments:
  -h, --help            show this help message and exit
  --subset_size SUBSET_SIZE, -s SUBSET_SIZE
                        Specify particular size of subset, look only for this
  --get_density, --gd   Compute dataset density
  --deep, -d            Include deep target search, drastically increases the
                        runtime
  --verbose, -v         Show verbose messages.
  --ext_out, -e

Implementation details and limitations

The main part of the software is implemented in both Python and C. C code is designed to be compiled as a shared library. In turn, Python script "SSP.py" is wrapped around the shared library, taking care of argument parsing and verification of input. Actually, the compiled library might be simply used individually, apart from the python script. The minimal input for the Python wrapper is a text file with an integer number at each line (might be stdin as well) and an integer, representing the target.

The limitations are:

  • The C-written path actively uses uint64_t data type, maximal capacity of which is 18446744073709551615, the majority of limitations are closely related to this number.

  • All input numbers must be positive integers, zeros are allowed but don't make any sense in the context of problem.

  • Number of input elements should not exceed the uint64_t maximal capacity.

  • Each input number also should not exceed the uint64_t capacity.

  • The most regrettable restriction - input array sum also should not exceed the uint64_t capacity due to the algorithm design. Shifting of this limit to __uint128_t capacity is planned.

  • Requested sum cannot be smaller than the smallest element of array.

  • Also, the target should not be bigger than he overall sum of input array.

Master script SSP.py and C code don't require any external libraries.

Performance measurements

Algorithm complexity in the worst case:

$$O(N^3)$$

Performance tests vere performed with variable array lengths and forming answer subset sizes. The program was tested on 6300 independent datasets. Max number was fixed to 10M. Subset sums that form answer were 25, 50 and 75% of overall dataset size. Density varied from 6.32 to 632.38. In 100% cases the program found a subset with requested sum.

Overall picture:

alt text

Shows that in general program works very fast, and in some minor cases the runtime is almost awful.

Zoomed in:

alt text

Shows that runtime grows very slow in average.

For this plot only the cases that took > 0.2 sec vere selected:

alt text

Shows that even in the worst case the runtime grows not faster than N^3 / C.

ssp's People

Contributors

kirilenkobm avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

ssp's Issues

sigkill

happens in minority of cases (1 per 1000), but happens, need to be fixed somehow
Probably, it would be better to implement proper verbosity first

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.