dice's Introduction

DiceDB

DiceDB is a drop-in replacement for Redis with SQL-based real-time reactivity baked in.

Note: DiceDB is still in development and it supports a subset of Redis commands. So, please do not use it in production. But, feel free to go through the open issues and contribute to help us speed up the development.

How is it different from Redis?

  1. DiceDB is multi-threaded and follows shared-nothing architecture.
  2. DiceDB supports a new command called QWATCH that lets clients listen to a SQL query and get notified in real-time whenever something changes.

Get started

Using Docker

The easiest way to get started with DiceDB is using Docker by running the following command.

$ docker run dicedb/dice-server

The above command starts the DiceDB server locally on port 7379. You can connect to it using the DiceDB CLI and SDKs, or even Redis CLIs and SDKs.

Note: Given it is a drop-in replacement of Redis, you can also use any Redis CLI and SDK to connect to DiceDB.

Setting up

To run DiceDB for local development or from source, you will need

  1. Golang
  2. Any of the following supported platforms:
    1. A Linux-based environment
    2. An OSX (Darwin)-based environment

$ git clone https://github.com/dicedb/dice
$ cd dice
$ go run main.go

Setting up CLI

The best way to connect to DiceDB is using DiceDB CLI and you can install it by running the following command.

$ pip install dicedb-cli

Because DiceDB speaks the Redis dialect, you can also connect to it with any Redis client or SDK. But if you are planning to use the QWATCH feature, you need to use the DiceDB CLI.

Running Tests

Unit tests and integration tests are essential for ensuring correctness and in the case of DiceDB, both types of tests are available to validate its functionality.

For unit testing, you can execute individual unit tests by specifying the name of the test function using the TEST_FUNC environment variable and running the make unittest-one command. Alternatively, running make unittest will execute all unit tests.

Executing one unit test

$ TEST_FUNC=<name of the test function> make unittest-one
$ TEST_FUNC=TestByteList make unittest-one

Running all unit tests

$ make unittest

Integration tests, on the other hand, involve starting up the DiceDB server and running a series of commands to verify the expected end state and output. To execute a single integration test, you can set the TEST_FUNC environment variable to the name of the test function and run make test-one. Running make test will execute all integration tests.

Executing a single integration test

$ TEST_FUNC=<name of the test function> make test-one
$ TEST_FUNC=TestSet make test-one

Running all integration tests

$ make test

Work to add more tests to DiceDB is in progress, and we will soon port the Redis test suite to this codebase to ensure full compatibility.

Running Benchmark

$ go test -test.bench <pattern>
$ go test -test.bench BenchmarkListRedis

Getting Started

To get started with building and contributing to DiceDB, please refer to the issues created in this repository.

The story

DiceDB started as a re-implementation of Redis in Golang, and the idea was to build a DB from scratch and understand the micro-nuances that come with its implementation. The database does not aim to replace Redis; instead, it will fit in and optimize itself for multi-core computations running on a single-threaded event loop.

How to contribute

The Code Contribution Guidelines are published at CONTRIBUTING.md; please read them before you start making any changes. This would allow us to have a consistent standard of coding practices and developer experience.

Contributors can join the Discord Server for quick collaboration.

Troubleshoot

Forcefully killing the process

$ sudo netstat -atlpn | grep :7379
$ sudo kill -9 <process_id>

dice's Issues

Define a charter for the database

Without a charter, the database will be built in random directions. We need to define a charter that keeps us aligned. Every feature we build should align with the charter.

Efficient deep comparison of complex types for test suite

Is your feature request related to a problem? Please describe.
RESP encoding and decoding need to be tested with complex types, and there is no single good way to compare two objects for equality. We need a way to do it.

Describe the solution you'd like
go-cmp library can be used as a reference.
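
As a minimal sketch of the kind of comparison the test suite needs, the standard library's reflect.DeepEqual already handles nested slices and maps. The helper name equalDecoded and the use of []interface{} for decoded RESP arrays are assumptions made for illustration, not the project's actual types.

```go
package main

import (
	"fmt"
	"reflect"
)

// equalDecoded reports whether two decoded values are deeply equal.
// The name and the []interface{} representation are hypothetical.
func equalDecoded(a, b interface{}) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	want := []interface{}{"user:1", int64(42), []interface{}{"a", "b"}}
	got := []interface{}{"user:1", int64(42), []interface{}{"a", "b"}}
	fmt.Println(equalDecoded(want, got)) // true

	// DeepEqual is strict about types: int64(42) and int(42) differ.
	fmt.Println(equalDecoded(int64(42), 42)) // false
}
```

reflect.DeepEqual only answers yes or no. A library such as go-cmp additionally reports where two values differ, which is usually more helpful in a failing test.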

Reinforce RESP handler

Describe the bug
RESP handler often breaks, need to fix it.

To Reproduce
Steps to reproduce the behavior:
DO THIS-

(printf '+PING\r\n+PING\r\n';) | nc localhost 7379
(printf '+PING\r\n+PING\r\n';) | nc localhost 7379
(printf '+OING\r\n+PING\r\n';) | nc localhost 7379

Expected behavior
This should log "Possible security attack" on the server side and disconnect the client issuing the command.

Additional context
Try Fuzzers to break the current RESP implementation to explore more steps to reproduce the larger issue with RESP handler.

Support to try out in mac / arm64

Is your feature request related to a problem? Please describe.
No

Describe the solution you'd like
go run main.go should work on Mac, or make run should provide Docker-like functionality behind the scenes.

Describe alternatives you've considered
Alternative was buying a linux laptop

Additional context
None

Something like below

Makefile

OS := $(if $(GOOS),$(GOOS),$(shell go env GOOS))
ARCH := $(if $(GOARCH),$(GOARCH),$(shell go env GOARCH))
BUILD_IMAGE ?= golang:1.19.2-buster

run: 
	docker run                                                  \
	    -i                                                      \
	    --rm                                                    \
	    -u $$(id -u):$$(id -g)                                  \
	    -v $$(pwd):/src                                         \
	    -w /src                                                 \
	    -v $$(pwd)/.go/bin/$(OS)_$(ARCH):/go/bin                \
	    -v $$(pwd)/.go/bin/$(OS)_$(ARCH):/go/bin/$(OS)_$(ARCH)  \
	    -v $$(pwd)/.go/cache:/.cache                            \
	    -v $$(pwd)/.go/pkg:/go/pkg                              \
	    $(BUILD_IMAGE)                                          \
	    /bin/sh -c "                                            \
	        ARCH=$(ARCH)                                        \
	        OS=$(OS)                                            \
	        go run main.go                                      \
	    "

Or maybe build it for Mac and provide an image for Mac.

Shell

make run

Support for DEBUG POPULATE & INFO command

Is your feature request related to a problem? Please describe.

Since DiceDB is not TELNET compatible, sending a command requires serializing it as per the RESP spec and then sending it over the wire. This makes mass insertion or deletion a laborious task for a mere performance or diagnostics test. A workaround is to use any Redis-compatible language driver to perform the operation. However, that solution is external to the DB and introduces side effects and non-conforming delays, even on the loopback interface. Another problem relates to monitoring: since we are already based on event loops, i.e. a kernel-space operation, attaching debug probes or breakpoints to userspace code doesn't really help in understanding the behavior of the DB with respect to various memory footprints such as network, spikes during insertions, etc.
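
To illustrate the serialization step described above, here is a minimal sketch of encoding a command as a RESP array of bulk strings, which is the request form used by Redis-compatible clients. The function name encodeCommand is an assumption chosen for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeCommand serializes a command and its arguments as a RESP array
// of bulk strings: "*<count>\r\n" followed by "$<len>\r\n<arg>\r\n" per argument.
// The function name is hypothetical.
func encodeCommand(args ...string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "*%d\r\n", len(args))
	for _, a := range args {
		fmt.Fprintf(&b, "$%d\r\n%s\r\n", len(a), a)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", encodeCommand("SET", "k", "v"))
	// "*3\r\n$3\r\nSET\r\n$1\r\nk\r\n$1\r\nv\r\n"
}
```

Having to produce this framing by hand for every command is what makes bulk diagnostics tedious without a driver.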

Describe the solution you'd like
For the first problem, the solution I propose is to introduce a DEBUG POPULATE command, which doesn't have to cross any network boundaries to generate keys for a diagnostics test.

This should work as follows-

DEBUG POPULATE <number-of-keys-to-generate> <key-size-in-bytes>

For instance, the command below should generate 3 million string or integer keys of 2 KB each.

DEBUG POPULATE 3000000 2000

For the second problem, I propose INFO command.
Below should provide info on all buffers related to Networking

INFO network

The same way, below should provide statistics on keys, how many are dirty, expired, and peak memory, committed (AOF), etc.

INFO store

Describe alternatives you've considered
For key generations - Redis-compatible drivers.
For INFO - debug probes and profilers

Additional context
Why all this hard work, can't we use alternatives? Because printf() is the best debugger on the planet!!

Add support for Bloom Filters

Is your feature request related to a problem? Please describe.
Bloom filters are essential for checking existence, and supporting them is essential for real-world use cases.

Describe the solution you'd like
Add commands BINIT, BADD, and BEXIST to initialize, add, and check existence within the bloom filter. BINIT can take in arguments around tolerance and initial allocation.

Some points to consider

  • the size of the bloom filter
  • seamless resize

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=bloom+filter&btnG=
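
A minimal sketch of a fixed-size Bloom filter follows, assuming FNV-1a with double hashing to derive the probe positions. The type and method names (bloom, newBloom, add, exists) are hypothetical, and this version deliberately does not address the resize question raised above.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a fixed-size Bloom filter; all names here are hypothetical.
type bloom struct {
	bits []uint64
	m    uint64 // number of bits, must be > 0
	k    uint64 // number of probes per item, must be > 0
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes splits one 64-bit FNV-1a digest into two base hashes.
// The second is forced odd so that it is never zero.
func hashes(item string) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(item))
	sum := h.Sum64()
	return sum & 0xffffffff, sum>>32 | 1
}

// add sets the k bits selected by double hashing (h1 + i*h2).
func (b *bloom) add(item string) {
	h1, h2 := hashes(item)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

// exists returns false if the item was definitely never added,
// and true if it was probably added (false positives are possible).
func (b *bloom) exists(item string) bool {
	h1, h2 := hashes(item)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := newBloom(1024, 3)
	f.add("user:1")
	fmt.Println(f.exists("user:1")) // true
}
```

The tolerance argument proposed for BINIT would translate into a choice of m and k. Seamless resizing is harder because the original items cannot be recovered from the bit array.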

Support ingestion and processing of large inputs

Is your feature request related to a problem? Please describe.
Currently, the buffer we use to take input can hold only 512 bytes. This prevents us from firing a query that is longer than that. The same limitation applies when reading the response on the client side.

Describe the solution you'd like
Re-write the socket code to support large input bytes sent from the client and encode them into native objects.

If the input is partial wait until the complete input is received.
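
One way to wait for the complete input is to read the declared length first and then block until that many bytes have arrived. The sketch below does this for a single RESP bulk string using bufio and io.ReadFull. The function names are hypothetical, and it is not the project's socket code.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// readBulkString reads one RESP bulk string ("$<len>\r\n<data>\r\n") of any
// length. io.ReadFull keeps reading until the declared number of bytes has
// arrived, so a payload split across several reads is handled.
func readBulkString(r *bufio.Reader) (string, error) {
	header, err := r.ReadString('\n')
	if err != nil {
		return "", err
	}
	if len(header) < 4 || header[0] != '$' || !strings.HasSuffix(header, "\r\n") {
		return "", fmt.Errorf("malformed bulk string header %q", header)
	}
	n, err := strconv.Atoi(header[1 : len(header)-2])
	if err != nil || n < 0 {
		return "", fmt.Errorf("invalid bulk string length in %q", header)
	}
	buf := make([]byte, n+2) // payload plus trailing CRLF
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	if buf[n] != '\r' || buf[n+1] != '\n' {
		return "", fmt.Errorf("bulk string is not terminated by CRLF")
	}
	return string(buf[:n]), nil
}

// decodeBulkString is a convenience wrapper for in-memory input.
func decodeBulkString(s string) (string, error) {
	return readBulkString(bufio.NewReader(strings.NewReader(s)))
}

func main() {
	payload := strings.Repeat("x", 4096) // larger than a 512-byte buffer
	got, err := decodeBulkString("$4096\r\n" + payload + "\r\n")
	fmt.Println(len(got), err) // 4096 <nil>
}
```

A real server would also cap the declared length before allocating, since a client could otherwise request an arbitrarily large buffer.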

Pull Request Template

Is your feature request related to a problem? Please describe.
PRs from different developers come in different formats, which brings non-uniformity across the project.

Describe the solution you'd like
Create a PR template that brings uniformity across PRs and thus across the project as a whole.

Additional context
Refs: https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository

Add support for replicas

We can persist data on disk/SSD and PUT key-value pairs to replicas in parallel (using goroutines).

Add support for KQueue

Is your feature request related to a problem? Please describe.
Dice DB does not run on OSX.

Describe the solution you'd like
Dice DB currently does not run on OSX because it does not support EPOLL, hence we need to add support for KQueue and do a conditional build.

Wrong/Un-implemented commands return "PONG"

Describe the bug
Wrong or unimplemented commands, when fired, return "PONG".
This can create confusion about the behaviour altogether.

To Reproduce

  1. run some
  2. enter gibberish as a command

Expected behavior
Return a correct response stating either command not implemented or wrong command fired.

@arpitbbhayani to share the correct response format.

Add support for in-memory key/value based ETL

Is your feature request related to a problem? Please describe.
What if I want to find MAX/SUM/%ile of filtered keys/values? Doing this today is a big pain as it requires a massive compute power.

Given that Dice DB is built on Golang and has super-efficient goroutines, what if we build the capability to do in-memory ETL through simple piped commands?

Describe the solution you'd like
Build the ability to represent and accept ETL commands, execute them on multiple cores over mutually exclusive subsets of the data, and quickly compute the response.

This way, Dice DB is single threaded for execution, but each command execution can be multi-threaded leveraging all the cores of the underlying hardware.

Describe alternatives you've considered
There is no way to do value-based operations, and key-based filters are expensive. No matter what, we are not leveraging all the underlying cores.

Additional context
This would open up the whole world of multi-core data structures and algorithms to be added to Dice.

Add support for command `FILTERKEYS`

Is your feature request related to a problem? Please describe.
We need a way to filter out keys on the basis of wildcard patterns; so that the KV tuples can be piped through other commands for in-mem aggregations.

Describe the solution you'd like
Create a command FILTERKEYS that accepts wildcards as an argument and returns the list of KV pairs as a response.

> FILTERKEYS "*user:*"
0) 0) user:1
    1) value:1
1) 0) user:2
    1) value:2

a JSON equivalent of the response is

[
    ["user:1", "value:1"],
    ["user:2", "value:2"],
    ...
]
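
The proposed behavior can be sketched with the standard library's path.Match for the wildcard test. The function name filterKeys and the map[string]string store are assumptions for illustration only, and path.Match's "*" does not match "/", so a real implementation may need a different glob matcher.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// filterKeys returns the [key, value] pairs whose key matches the glob
// pattern, sorted by key so the output is deterministic.
// The name and the store type are hypothetical.
func filterKeys(store map[string]string, pattern string) ([][2]string, error) {
	var out [][2]string
	for k, v := range store {
		ok, err := path.Match(pattern, k)
		if err != nil {
			return nil, err // malformed pattern
		}
		if ok {
			out = append(out, [2]string{k, v})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i][0] < out[j][0] })
	return out, nil
}

func main() {
	store := map[string]string{
		"user:1":  "value:1",
		"user:2":  "value:2",
		"order:9": "value:9",
	}
	pairs, err := filterKeys(store, "*user:*")
	fmt.Println(pairs, err) // [[user:1 value:1] [user:2 value:2]] <nil>
}
```

This version scans every key, so its cost grows linearly with the size of the store.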

Add support for command `FILTERVALS`

Is your feature request related to a problem? Please describe.
We need a way to filter out values on the basis of some condition on the value.

Describe the solution you'd like
Create a command FILTERVALS that accepts wildcards as an argument and returns the list of KV pairs as a response.

> FILTERVALS "*value:*"
0) 0) user:1
    1) value:1
1) 0) user:2
    1) value:2

a JSON equivalent of the response is

[
    ["user:1", "value:1"],
    ["user:2", "value:2"],
    ...
]

use dice unicode "⚄" instead of dice emoji

Describe the bug
We render a Dice emoji when the server starts and not all terminals hold the capability to render an emoji. Hence switch from an emoji to a Unicode "⚄".

To Reproduce

  1. start the Dice DB server on the 8-bit terminal

Expected behavior
The dice emoji should render but it shows a broken Unicode. Hence replace it with ⚄

Tests for command `GET`

Currently, there are almost no unit tests for the GET command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

  1. covers the correctness of single command
  2. covers the correctness when multiple commands are fired
  3. no commands are dropped abruptly
  4. ensure we check for small, large, and massive inputs (if applicable)

Dockerfile for this project

Is your feature request related to a problem? Please describe.
This is a relatively new project and I guess it would be nice if we can make it easier for people to play-around. Docker image will let people do that.

Describe the solution you'd like
Creating a Dockerfile with linux image as base.

I would love to take this up, won't be a huge task.

Complexity with GC, mutex & storage engine layout

Is your feature request related to a problem? Please describe.
Atomics and Mutexes are common solutions for locking. Behind the scenes, they are tuned for exclusive access and not for quick turn-around.

A common remedy to this problem is lock-free programming, but not everything can be turned into a lock-free construct. Turnaround time should be minimal if the primary use case is a key-value store. Moreover, if more data types and eviction policies are introduced in the future, this turnaround time would be significant with the current storage engine based on map[string]*object.

For instance, in LFU, while inserting a key-value pair I have to keep track of the access frequency, and if the store is found to be near its capacity, it should evict the least frequently used value. This is a classic example of a chain where every single operation is blocked by a mutex, and the release of those mutexes itself depends on the response time of the storage layout.

Describe the solution you'd like
Not just in Golang, but in C++ also, we have been using constructs like singleton, monostate, etc., which are designed around exclusivity. To address those problems we use "signals and slots".

So the solution I propose is to use "Buffered channels" with a slot size of 1. The above problem of LFU policy can easily be modeled using slots, a buffered channel essentially operates on a slot being filled or empty and has the potential to make a quick turnaround.
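
As a rough sketch of the idea, a buffered channel with capacity 1 can guard a critical section: filling the slot acquires it and emptying it releases it. The names slot, acquire, release and countWithSlot are hypothetical. The example shows only the mechanism and makes no claim that it is faster than sync.Mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// slot is a buffered channel of capacity 1 used as a lock:
// filling the slot acquires it, emptying the slot releases it.
type slot chan struct{}

func newSlot() slot {
	return make(slot, 1)
}

func (s slot) acquire() {
	s <- struct{}{} // blocks while the slot is already filled
}

func (s slot) release() {
	<-s
}

// countWithSlot increments a shared counter from n goroutines,
// guarding each increment with the slot.
func countWithSlot(n int) int {
	s := newSlot()
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.acquire()
			counter++
			s.release()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWithSlot(1000)) // 1000
}
```

Whether this gives a quicker turnaround than a mutex for an LFU update path would need to be measured with benchmarks.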

As discussed earlier in #31 (comment), a storage engine based on slices is friendly enough for GC, and we should only use maps/hashtables for the DB index (db 0, db 1, db 2, ...) because, on top of weird GC issues with maps, the computation of the hash function also causes a significant delay on an "ms" scale.

Describe alternatives you've considered
N/A

Additional context
N/A

EC2/AWS: can't run

Describe the bug
Can't run on a Linux EC2 instance.

To Reproduce
launch an ubuntu ec2
clone
go run main.go

fails

Expected behavior
Not able to start in ec2 linux

Additional context

ubuntu at ec2 in ~/dice on master
$ go run main.go
# sort
/usr/local/go/src/sort/zsortfunc.go:10:6: insertionSort_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:10:6: other declaration of insertionSort_func
/usr/local/go/src/sort/zsortfunc.go:20:6: siftDown_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:19:6: other declaration of siftDown_func
/usr/local/go/src/sort/zsortfunc.go:38:6: heapSort_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:38:6: other declaration of heapSort_func
/usr/local/go/src/sort/zsortfunc.go:329:6: swapRange_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:65:6: other declaration of swapRange_func
/usr/local/go/src/sort/zsortfunc.go:335:6: stable_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:163:6: other declaration of stable_func
/usr/local/go/src/sort/zsortfunc.go:378:6: symMerge_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:187:6: other declaration of symMerge_func
/usr/local/go/src/sort/zsortfunc.go:464:6: rotate_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:252:6: other declaration of rotate_func

ubuntu at ec2 in ~/dice on master
$ uname -a
Linux ec2 5.15.0-1020-aws #24-Ubuntu SMP Thu Sep 1 16:04:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Add support for IOCP

Is your feature request related to a problem? Please describe.
Dice DB does not run on Windows natively.

Describe the solution you'd like
Dice DB currently does not run on Windows because it does not support EPOLL, hence we need to add support for IOCP and do a conditional build.

BGREWRITEAOF is not failure safe

Describe the bug
BGREWRITEAOF starts overwriting the existing file; hence in case of a failure, while creating the AOF file, the old flush is lost while the new one is partially updated.

To Reproduce

  1. add random sleep in BGREWRITEAOF after each command is flushed on the disk
  2. crash the process while BGREWRITEAOF is happening
  3. see the AOF file content

Expected behavior
Until the new AOF file is ready, we should not alter the old file.
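
A common way to get this behavior is to write the new content to a temporary file in the same directory and rename it over the old file only once it is complete. The sketch below assumes a POSIX-style filesystem where rename within one directory is atomic. The names replaceFileAtomically, demo and dice.aof are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// replaceFileAtomically writes data to a temporary file next to path and
// renames it over path only after the write has been synced to disk.
// A crash before the rename leaves the old file untouched.
func replaceFileAtomically(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo replaces an existing file in a scratch directory and returns
// the content found afterwards. The file name is hypothetical.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "aof-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "dice.aof")
	if err := os.WriteFile(path, []byte("old"), 0o644); err != nil {
		return "", err
	}
	if err := replaceFileAtomically(path, []byte("new")); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	fmt.Println(demo()) // new <nil>
}
```

For stronger durability guarantees, the containing directory can also be synced after the rename.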

Add a test suite that can test commands on a temp server

Is your feature request related to a problem? Please describe.
Unit tests are fine but we need to fire the command and test the changes.

Describe the solution you'd like
Provision a test suite that allows unit testing and an ability to test commands over a temporary Dice server

Redundant conditional in evalGET

Describe the bug
Not a bug but more of something that caught my eye when reading the codebase. It's my first time looking at or attempting anything open source. Lmk if this is the wrong category to file this. I wanted to open this issue so I could make a corresponding PR and fix this.

Package: core
File: eval.go
Method: evalGET

Get(key) will delete any expired key already, so the conditional listed below is redundant in code.

if hasExpired(obj) {
	return RESP_NIL
}

Expected behavior
Remove redundant condition since Get(key) takes care of the case.

Benchmarking utility

Is your feature request related to a problem? Please describe.
This might be too early, but I am curious about how software benchmarking is handled. I learned that Redis has a redis-benchmark utility. Can we have a benchmarking utility for Dice as well? This feature might not be urgent, but I would like to look into it.


Tests for command `PING`

Currently, there are almost no unit tests for the PING command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

  1. covers the correctness of single command
  2. covers the correctness when multiple commands are fired
  3. no commands are dropped abruptly
  4. ensure we check for small, large, and massive inputs (if applicable)

Tests for command `TTL`

Currently, there are almost no unit tests for the TTL command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

  1. covers the correctness of single command
  2. covers the correctness when multiple commands are fired
  3. no commands are dropped abruptly
  4. ensure we check for small, large, and massive inputs (if applicable)

Communication medium like slack/discord

Is your feature request related to a problem? Please describe.
Is there any medium for communication like slack or discord?


Add support for command `SUMVAL`

Is your feature request related to a problem? Please describe.
There are no KV databases that support values-based aggregations. Doing a summation of integer values in the KV store would help us address a bunch of real-world use-cases

  • all expenses dumped as KV, summation will give net balance quickly

Describe the solution you'd like
Introduce a command SUMVAL that does a summation of all the values and returns the result.
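
A minimal sketch of the summation, assuming values are stored as strings and that non-integer values are skipped rather than treated as an error. Both the function name sumVal and the skipping policy are assumptions; the issue does not specify them.

```go
package main

import (
	"fmt"
	"strconv"
)

// sumVal adds up every value that parses as a base-10 integer and
// skips the rest; it also reports how many values were skipped.
// Skipping non-integers is an assumption, not a specified behavior.
func sumVal(store map[string]string) (sum int64, skipped int) {
	for _, v := range store {
		n, err := strconv.ParseInt(v, 10, 64)
		if err != nil {
			skipped++
			continue
		}
		sum += n
	}
	return sum, skipped
}

func main() {
	store := map[string]string{
		"salary": "3000",
		"rent":   "-1200",
		"note":   "hello",
	}
	fmt.Println(sumVal(store)) // 1800 1
}
```

This sketch does not guard against int64 overflow, which a real command would need to handle.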

issue: `exDurationSec` code bug in SET command

Bug Desc
In the evalSET function, the following line has an issue: the command may take multiple arguments in the future, and the EX argument can be in any position.
exDurationSec, err := strconv.ParseInt(args[3], 10, 64)

it should be
exDurationSec, err := strconv.ParseInt(args[i], 10, 64)
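
The position-independent parsing suggested above could look roughly like this. The sketch assumes args holds the key, the value, and then the options, and the function name parseExpiry and the error messages are hypothetical rather than taken from the codebase.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseExpiry scans the options that follow the key and value of a SET
// command and returns the EX duration in seconds, or -1 when EX is absent.
// Assumes args = [key, value, options...]; names and messages are hypothetical.
func parseExpiry(args []string) (int64, error) {
	exDurationSec := int64(-1)
	for i := 2; i < len(args); i++ {
		if strings.ToUpper(args[i]) != "EX" {
			continue // other options are ignored in this sketch
		}
		i++ // the duration is the argument right after EX
		if i == len(args) {
			return 0, errors.New("syntax error: EX needs a value")
		}
		sec, err := strconv.ParseInt(args[i], 10, 64)
		if err != nil || sec <= 0 {
			return 0, errors.New("invalid expire time")
		}
		exDurationSec = sec
	}
	return exDurationSec, nil
}

func main() {
	fmt.Println(parseExpiry([]string{"k", "v", "EX", "10"}))      // 10 <nil>
	fmt.Println(parseExpiry([]string{"k", "v", "NX", "ex", "5"})) // 5 <nil>
	fmt.Println(parseExpiry([]string{"k", "v"}))                  // -1 <nil>
}
```

Starting the scan at index 2 keeps a key or value that happens to be the string "EX" from being mistaken for the option.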

DiceDB doesn't build on linux/arm64

Describe the bug
DiceDB doesn't build on linux/arm64. It gives the following error:

~:/dice# go build
# github.com/dicedb/dice/core
core/eval.go:178:44: undefined: syscall.SYS_FORK

As mentioned, it is because of the use of syscall.SYS_FORK in eval.go

func evalBGREWRITEAOF(args []string) []byte {
	newChild, _, _ := syscall.Syscall(syscall.SYS_FORK, 0, 0, 0)
	...
	...
}

To Reproduce
Run go build on a linux/arm64 machine

Expected behavior
The build should happen successfully without any error messages and generate a binary

Additional context
Upon simply googling about this, came across this issue in golang repo: golang/go#11981
Also, found a related stackoverflow answer: https://stackoverflow.com/questions/28370646/how-do-i-fork-a-go-process/28371586#28371586

TL;DR: Calling fork() using Syscall is unsafe, recommended API is ForkExec instead

dockerize project

Dockerize the project so that it is easier for contributors and users to quickly run the project with minimal effort on any machine.

Tests for command `SET`

Currently, there are almost no unit tests for the SET command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

  1. covers the correctness of single command
  2. covers the correctness when multiple commands are fired
  3. no commands are dropped abruptly
  4. ensure we check for small, large, and massive inputs (if applicable)

Run expiration in a separate periodic goroutine

Is your feature request related to a problem? Please describe.
The expiration runs in the event loop unnecessarily starving the core execution.

Describe the solution you'd like
Let there be a goroutine that periodically wakes up and deletes the expired keys. We need to take care of a close-edge case though. What if a key is about to

Naive solution: lock and execute
Con: unnecessarily the entire hashtable is locked, giving us potentially no benefit.

Possible solution: park the expired value in a different hashmap (trash can) for some time, if it is accessed within n seconds, well and good. otherwise, delete it. This two-stage deletion will ensure we are minimizing (not eradicating) the edge case. In the trash can, each element has a separate ticker.

Advantage: no need to lock the hash table (key store)
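
The two-stage deletion can be sketched as one sweep cycle that a periodic goroutine would call on every tick. All type and field names below are hypothetical, times are plain Unix seconds to keep the example deterministic, and the sketch is single-goroutine, so it does not show how the key store and the sweeper would be synchronized.

```go
package main

import "fmt"

// entry is a stored value with an optional expiry (0 means no expiry).
type entry struct {
	value     string
	expiresAt int64 // unix seconds
}

// parked is an expired entry waiting in the trash can.
type parked struct {
	e        entry
	parkedAt int64
}

// store holds live keys and the trash can; names are hypothetical.
type store struct {
	live  map[string]entry
	trash map[string]parked
	grace int64 // seconds an entry stays in the trash can
}

// sweep runs one expiration cycle at time now. Stage two drops entries
// parked for at least grace seconds; stage one parks newly expired keys.
func (s *store) sweep(now int64) {
	for k, p := range s.trash {
		if now-p.parkedAt >= s.grace {
			delete(s.trash, k)
		}
	}
	for k, e := range s.live {
		if e.expiresAt != 0 && e.expiresAt <= now {
			delete(s.live, k)
			s.trash[k] = parked{e: e, parkedAt: now}
		}
	}
}

func main() {
	s := &store{
		live:  map[string]entry{"a": {"1", 10}, "b": {"2", 0}},
		trash: map[string]parked{},
		grace: 5,
	}
	s.sweep(10)
	fmt.Println(len(s.live), len(s.trash)) // 1 1
	s.sweep(15)
	fmt.Println(len(s.live), len(s.trash)) // 1 0
}
```

In a running server, sweep would be driven by something like a time.Ticker inside the dedicated goroutine, and a read that finds its key in the trash can within the grace period could decide how to treat it.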

Additional context
This approach is inspired by Garbage Collectors.

Add comments in the code above each command telling what it does

Following functions need better comments

  • all the commands implemented in eval.go file. Write exhaustive documentation suggesting what it does.

Here's why I think it is difficult to understand the code

Code may not be very difficult to understand, but adding the documentation will make it easier for the new contributors to get started quickly.

Adding support for command `SHUTDOWN`

Is your feature request related to a problem? Please describe.
There is no way for a client to initiate a shutdown of the server. We need this to shut down the server gracefully after running tests.

Describe the solution you'd like
We need a command SHUTDOWN that shuts down the server, gracefully.
