dicedb / dice Goto Github PK



A drop-in replacement of Redis with SQL-based realtime reactivity.

Home Page: https://dicedb-docs.netlify.app/

License: Other

Go 99.37% Dockerfile 0.15% Makefile 0.30% Shell 0.18%

database golang


dice's Issues

need inline documentation on why BGREWRITEAOF forked a process

Following functions need better comments

Here is a list of files along with the lines highlighted that need better comments making it easier to understand what's happening

https://github.com/DiceDB/dice/blob/master/core/eval.go#L170-L176

Here's why I think it is difficult to understand the code

It would help to explain how we are forking (with a reference to the Go docs for the call) and why we fork at all.

Add support for command `FILTERKEYS`

Is your feature request related to a problem? Please describe.
We need a way to filter out keys on the basis of wildcard patterns; so that the KV tuples can be piped through other commands for in-mem aggregations.

Describe the solution you'd like
Create a command FILTERKEYS that accepts wildcards as an argument and returns the list of KV pairs as a response.

> FILTERKEYS "*user:*"
0) 0) user:1
    1) value:1
1) 0) user:2
    1) value:2

a JSON equivalent of the response is

[
    ["user:1", "value:1"],
    ["user:2", "value:2"],
    ...
]
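
A minimal sketch of the matching step described above, assuming the key store can be viewed as a plain map[string]string (the real DiceDB store type may differ) and using path.Match from the Go standard library for the wildcard. The function name filterKeys is hypothetical. Note that path.Match does not let * cross a "/" character, so keys containing slashes would need a different matcher.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// filterKeys returns [key, value] pairs whose key matches the glob pattern.
// The map-based store is an assumption made for this sketch.
func filterKeys(store map[string]string, pattern string) ([][2]string, error) {
	keys := make([]string, 0, len(store))
	for k := range store {
		ok, err := path.Match(pattern, k)
		if err != nil {
			return nil, err // malformed pattern
		}
		if ok {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // map iteration is random; sort for a stable response

	pairs := make([][2]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, [2]string{k, store[k]})
	}
	return pairs, nil
}

func main() {
	store := map[string]string{
		"user:1":  "value:1",
		"user:2":  "value:2",
		"order:9": "pending",
	}
	pairs, err := filterKeys(store, "*user:*")
	if err != nil {
		panic(err)
	}
	fmt.Println(pairs)
}
```

Scanning every key is O(n) per call; whether that is acceptable inside the event loop is a separate design question.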

BGREWRITEAOF is not failure safe

Describe the bug
BGREWRITEAOF starts overwriting the existing file; hence in case of a failure, while creating the AOF file, the old flush is lost while the new one is partially updated.

To Reproduce

Add a random sleep in BGREWRITEAOF after each command is flushed to disk
Crash the process while BGREWRITEAOF is running
Inspect the AOF file content

Expected behavior
Until the new AOF file is ready, we should not alter the old file.

Introduce string data type and optimize for length computation

Support ingestion and processing of large inputs

Is your feature request related to a problem? Please describe.
Currently, the buffer we use to read input can hold only 512 bytes. This prevents us from firing a query longer than that. The same limitation applies when reading the response on the client side.

Describe the solution you'd like
Re-write the socket code to support large input bytes sent from the client and encode them into native objects.

If the input is partial wait until the complete input is received.

Support to try out in mac / arm64

Is your feature request related to a problem? Please describe.
No

Describe the solution you'd like
Either go run main.go should work on a Mac, or make run should start Dice behind something like Docker.

Describe alternatives you've considered
Alternative was buying a linux laptop

Additional context
None

Something like below

Makefile

OS := $(if $(GOOS),$(GOOS),$(shell go env GOOS))
ARCH := $(if $(GOARCH),$(GOARCH),$(shell go env GOARCH))
BUILD_IMAGE ?= golang:1.19.2-buster

run: 
	docker run                                                  \
	    -i                                                      \
	    --rm                                                    \
	    -u $$(id -u):$$(id -g)                                  \
	    -v $$(pwd):/src                                         \
	    -w /src                                                 \
	    -v $$(pwd)/.go/bin/$(OS)_$(ARCH):/go/bin                \
	    -v $$(pwd)/.go/bin/$(OS)_$(ARCH):/go/bin/$(OS)_$(ARCH)  \
	    -v $$(pwd)/.go/cache:/.cache                            \
	    -v $$(pwd)/.go/pkg:/go/pkg                              \
	    $(BUILD_IMAGE)                                          \
	    /bin/sh -c "                                            \
	        ARCH=$(ARCH)                                        \
	        OS=$(OS)                                            \
	        go run main.go                                      \
	    "

Alternatively, build it for Mac and provide an image for Mac.

Shell

make run

use dice unicode "⚄" instead of dice emoji

Describe the bug
We render a Dice emoji when the server starts and not all terminals hold the capability to render an emoji. Hence switch from an emoji to a Unicode "⚄".

To Reproduce

start the Dice DB server on the 8-bit terminal

Expected behavior
The startup banner should render on every terminal. Currently the dice emoji shows up as a broken character on terminals without emoji support, hence replace it with ⚄.

Tests for command `GET`

Currently, there are almost no unit tests for the GET command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

covers the correctness of single command
covers the correctness when multiple commands are fired
no commands are dropped abruptly
ensure we check for small, large, and massive inputs (if applicable)

Benchmarking utility

Is your feature request related to a problem? Please describe.
This might be too early, but I am curious on how software benchmarking is handled. I got to know redis has a redis-benchmark utility. Can we have benchmarking utility for dice as well. This feature might not be urgent but I am curious to look into this.


Store `int` value by reference and not serialized in `str`

Make raw string as `[]byte` instead of `string` for efficiency

Converting a string to []byte is not a free typecast: it allocates a new slice and copies the bytes, hence it becomes expensive.

Setup GitHub actions to auto-build Dice for platforms

We need a way to build Dice DB for all platforms upon creating a new release tag. We can use GoReleaser to do this.

Need standardized errors

We need a way to standardize the errors

the language
format
conventions
global declarations
error codes

https://github.com/uber-go/guide/blob/master/style.md#errors

Complexity with GC, mutex & storage engine layout

Is your feature request related to a problem? Please describe.
Atomics and Mutexes are common solutions for locking. Behind the scenes, they are tuned for exclusive access and not for quick turn-around.

A common remedy to this problem is lock-free programming, but not everything can be turned into a lock-free construct. Turnaround time should be minimal if the primary use case is a key-value store. Moreover, if more data types and eviction policies are introduced in the future, this turnaround time would become significant with the current storage engine based on map[string]*object.

For instance, with LFU, every insert has to track the access frequency, and when the store is near its capacity it has to evict the least frequently used value. This is a classic chain in which every single operation is blocked by a mutex, and how quickly those mutexes are released depends on the response time of the storage layout.

Describe the solution you'd like
Not just in Golang, but in C++ too, we have been using constructs like singleton, monostate, etc., which are designed around exclusivity. To address those problems we use "signals and slots".

So the solution I propose is to use "Buffered channels" with a slot size of 1. The above problem of LFU policy can easily be modeled using slots, a buffered channel essentially operates on a slot being filled or empty and has the potential to make a quick turnaround.
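
To make the proposal concrete, here is a small sketch of a buffered channel with capacity 1 used as a slot: a full slot means "held", an empty slot means "free". The type and method names are hypothetical, and the sketch only shows that the construct gives mutual exclusion; it makes no claim about being faster than sync.Mutex, which would need benchmarking.

```go
package main

import (
	"fmt"
	"sync"
)

// slot is a buffered channel of capacity 1: full means held, empty means free.
type slot chan struct{}

func newSlot() slot { return make(slot, 1) }

func (s slot) acquire() { s <- struct{}{} } // blocks while the slot is full
func (s slot) release() { <-s }

// tryAcquire returns immediately instead of blocking when the slot is held.
func (s slot) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// countWithSlot increments a shared counter from several goroutines,
// using the slot to serialize access.
func countWithSlot(workers, perWorker int) int {
	s := newSlot()
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				s.acquire()
				counter++
				s.release()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWithSlot(8, 1000))
}
```

The non-blocking tryAcquire is the part a mutex-based design gets less directly: an eviction pass could skip a busy slot instead of waiting on it.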

As discussed earlier in #31 (comment) , a storage engine based on slices is friendly enough for GC and we should only use maps/hashtables for DB index ( db 0, db 1, db 2....) as on top of weird GC issues with maps, the computation for hash function also causes a significant delay on an "ms" scale.

Describe alternatives you've considered
N/A

Additional context
N/A

Redundant conditional in evalGET

Describe the bug
Not a bug but more of something that caught my eye when reading the codebase. It's my first time looking at or attempting anything open source. Lmk if this is the wrong category to file this. I wanted to open this issue so I could make a corresponding PR and fix this.

Package: core
File: eval.go
Method: evalGET

Get(key) will delete any expired key already, so the conditional listed below is redundant in code.

if hasExpired(obj) {
	return RESP_NIL
}

Expected behavior
Remove redundant condition since Get(key) takes care of the case.

Add a test suite that can test commands on a temp server

Is your feature request related to a problem? Please describe.
Unit tests are fine but we need to fire the command and test the changes.

Describe the solution you'd like
Provision a test suite that allows unit testing and an ability to test commands over a temporary Dice server

DiceDB doesn't build on linux/arm64

Describe the bug
DiceDB doesn't build on linux/arm64. It gives the following error:

~:/dice# go build
# github.com/dicedb/dice/core
core/eval.go:178:44: undefined: syscall.SYS_FORK

As mentioned, it is because of the use of syscall.SYS_FORK in eval.go

func evalBGREWRITEAOF(args []string) []byte {
	newChild, _, _ := syscall.Syscall(syscall.SYS_FORK, 0, 0, 0)
	...
	...
}

To Reproduce
Run go build on a linux/arm64 machine

Expected behavior
The build should happen successfully without any error messages and generate a binary

Additional context
Upon simply googling about this, came across this issue in golang repo: golang/go#11981
Also, found a related stackoverflow answer: https://stackoverflow.com/questions/28370646/how-do-i-fork-a-go-process/28371586#28371586

TL;DR: Calling fork() using Syscall is unsafe, recommended API is ForkExec instead

dockerize project

Dockerize the project so that contributors and users can quickly run it on any machine with minimal effort.

Reinforce RESP handler

Describe the bug
RESP handler often breaks, need to fix it.

To Reproduce
Steps to reproduce the behavior:
DO THIS-

(printf '+PING\r\n+PING\r\n';) | nc localhost 7379
(printf '+PING\r\n+PING\r\n';) | nc localhost 7379
(printf '+OING\r\n+PING\r\n';) | nc localhost 7379

Expected behavior
This should log "Possible security attack" on the server side and disconnect the client issuing the command.

Additional context
Try Fuzzers to break the current RESP implementation to explore more steps to reproduce the larger issue with RESP handler.

Add support for Hyperloglog

Run expiration in a separate periodic goroutine

Is your feature request related to a problem? Please describe.
The expiration runs in the event loop unnecessarily starving the core execution.

Describe the solution you'd like
Let there be a goroutine that periodically wakes up and deletes the expired keys. We need to take care of a close-edge case though. What if a key is about to

Naive solution: lock and execute
Con: unnecessarily the entire hashtable is locked, giving us potentially no benefit.

Possible solution: park the expired value in a different hashmap (trash can) for some time, if it is accessed within n seconds, well and good. otherwise, delete it. This two-stage deletion will ensure we are minimizing (not eradicating) the edge case. In the trash can, each element has a separate ticker.

Advantage: no need to lock the hash table (key store)

Additional context
This approach is inspired by Garbage Collectors.

Add support for KQueue

Is your feature request related to a problem? Please describe.
Dice DB does not run on OSX.

Describe the solution you'd like
Dice DB currently does not run on OSX because it does not support EPOLL, hence we need to add support for KQueue and do a conditional build.

EC2/aws cant run

Describe the bug
cant run in a linux ec2

To Reproduce
launch an ubuntu ec2
clone
go run main.go

fails

Expected behavior
Not able to start in ec2 linux

Additional context

ubuntu at ec2 in ~/dice on master
$ go run main.go
# sort
/usr/local/go/src/sort/zsortfunc.go:10:6: insertionSort_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:10:6: other declaration of insertionSort_func
/usr/local/go/src/sort/zsortfunc.go:20:6: siftDown_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:19:6: other declaration of siftDown_func
/usr/local/go/src/sort/zsortfunc.go:38:6: heapSort_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:38:6: other declaration of heapSort_func
/usr/local/go/src/sort/zsortfunc.go:329:6: swapRange_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:65:6: other declaration of swapRange_func
/usr/local/go/src/sort/zsortfunc.go:335:6: stable_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:163:6: other declaration of stable_func
/usr/local/go/src/sort/zsortfunc.go:378:6: symMerge_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:187:6: other declaration of symMerge_func
/usr/local/go/src/sort/zsortfunc.go:464:6: rotate_func redeclared in this block
	/usr/local/go/src/sort/zfuncversion.go:252:6: other declaration of rotate_func

ubuntu at ec2 in ~/dice on master
$ uname -a
Linux ec2 5.15.0-1020-aws #24-Ubuntu SMP Thu Sep 1 16:04:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Adding support for command `SHUTDOWN`

Is your feature request related to a problem? Please describe.
There is no way for a client to initiate a shutdown of the server. We need this to shut down the server gracefully after running tests.

Describe the solution you'd like
We need a command SHUTDOWN that shuts down the server, gracefully.

Add support for lists implemented using `ziplist`

the list can have a cap of entries (configuration) we can put in there
commands to add, and remove elements from the list given an index

Add support for LFU eviction using Approximate Counting

Tests for command `SET`

Currently, there are almost no unit tests for the SET command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

covers the correctness of single command
covers the correctness when multiple commands are fired
no commands are dropped abruptly
ensure we check for small, large, and massive inputs (if applicable)

Tests for command `TTL`

Currently, there are almost no unit tests for the TTL command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

covers the correctness of single command
covers the correctness when multiple commands are fired
no commands are dropped abruptly
ensure we check for small, large, and massive inputs (if applicable)

Pull Request Template

Is your feature request related to a problem? Please describe.
PRs from different devs would be in different format which brings non-uniformity across the project.

Describe the solution you'd like
Create a PR template that keeps PRs uniform and, through them, the project as a whole.

Additional context
Refs: https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository

Make `BGREWRITEAOF` happen through a background thread

Machine freezes after hitting GET curl request

I cloned your repo
ran go run main.go
and, just for curiosity, did curl HTTP://0.0.0.0:7379
after that machine freezes ( htop stats showing 80% RAM consumed by go-build)

Add support for GeoSpatial queries using Geohash

Add support for string sets

Define a charter for the database

Without a charter, the database will be built in random directions. We need to define a charter that keeps us aligned. Every feature we build should align with the charter.

Add comments in the code above each command telling what it does

Following functions need better comments

all the commands implemented in eval.go file. Write exhaustive documentation suggesting what it does.

Here's why I think it is difficult to understand the code

Code may not be very difficult to understand, but adding the documentation will make it easier for the new contributors to get started quickly.

Add support for replicas

We can persist data on disk/SSD and PUT key-values to the replicas in parallel (using goroutines).

Add support for IOCP

Is your feature request related to a problem? Please describe.
Dice DB does not run on Windows natively.

Describe the solution you'd like
Dice DB currently does not run on Windows because it does not support EPOLL, hence we need to add support for IOCP and do a conditional build.

Tests for command `PING`

Currently, there are almost no unit tests for the PING command. We need a comprehensive test suite for this command that also checks the correctness and completeness.

covers the correctness of single command
covers the correctness when multiple commands are fired
no commands are dropped abruptly
ensure we check for small, large, and massive inputs (if applicable)

Support for DEBUG POPULATE & INFO command

Is your feature request related to a problem? Please describe.

Since DiceDB is not TELNET compatible, in order to send the command, one has to serialize the commands as per RESP spec and then send it over the wire. Thus, sending the commands for mass insertion or deletion becomes a high labor task for a mere performance or diagnostics test. A workaround is to use any redis-compatible language driver and perform the operation. However, the latter solution is external to DB and introduces the side effects and non-conforming delays even on loopback interface. Another problem is related to the monitoring since we are already based on event loops i.e. a kernel space operation, hence attaching debug probes/breakpoints in code that is in userspace doesn't really help to know the behavior of DB with respect to various memory footprints such as network, spikes during insertions, etc.

Describe the solution you'd like
For the first problem. the solution I propose is to introduce a DEBUG POPULATE command, which doesn't have to cross any network boundaries to generate keys for any diagnostics test.

This should work as follows-

DEBUG POPULATE <number-of-keys-to-generate> <key-size-in-bytes>

for instance, the below command should generate 3 million keys with 2kb string or integer keys

DEBUG POPULATE 3000000 2000
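
A rough sketch of what the server-side handler could do, assuming the store is a plain map[string]string and reading the second argument as the size of each generated value (the issue text leaves open whether it is the key or the value size). The function name populate and the "key:<n>" naming scheme are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// populate inserts n generated keys into store, each holding a value of
// `size` bytes, without crossing any network boundary.
func populate(store map[string]string, n, size int) {
	for i := 0; i < n; i++ {
		key := fmt.Sprintf("key:%d", i)
		store[key] = strings.Repeat("x", size) // a fresh string per key
	}
}

func main() {
	store := make(map[string]string)
	populate(store, 1000, 2000)
	fmt.Println(len(store), len(store["key:0"]))
}
```

Each value is allocated separately on purpose, so the memory footprint resembles that of real inserts.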

For the second problem, I propose INFO command.
Below should provide info on all buffers related to Networking

INFO network

The same way, below should provide statistics on keys, how many are dirty, expired, and peak memory, committed (AOF), etc.

INFO store

Describe alternatives you've considered
For key generations - Redis-compatible drivers.
For INFO - debug probes and profilers

Additional context
Why all this hard work, can't we use alternatives? Because printf() is the best debugger on the planet!!

Wrong/Un-implemented commands returns "PONG"

Describe the bug
Wrong/un-implemented commands when fired returns "PONG".
This can create confusion on the behaviour altogether

To Reproduce

run some
enter gibberish as a command

Expected behavior
Return a correct response stating either command not implemented or wrong command fired.

@arpitbbhayani to share the correct response format.

Add support for command `SUMVAL`

Is your feature request related to a problem? Please describe.
There are no KV databases that support values-based aggregations. Doing a summation of integer values in the KV store would help us address a bunch of real-world use-cases

all expenses dumped as KV, summation will give net balance quickly

Describe the solution you'd like
Introduce a command SUMVAL that does a summation of all the values and returns the result.
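
A minimal sketch of the summation, assuming values are stored as strings in a map[string]string and that non-integer values are simply skipped (the issue does not say how they should be handled). The function name sumVal is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// sumVal adds up every value that parses as a base-10 integer and
// silently skips the rest. Overflow is not checked in this sketch.
func sumVal(store map[string]string) int64 {
	var total int64
	for _, v := range store {
		n, err := strconv.ParseInt(v, 10, 64)
		if err != nil {
			continue // not an integer value
		}
		total += n
	}
	return total
}

func main() {
	expenses := map[string]string{
		"rent":   "-1200",
		"salary": "3000",
		"note":   "paid in cash",
	}
	fmt.Println(sumVal(expenses))
}
```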

Add support for integer sets

Add support for in-memory key/value based ETL

Is your feature request related to a problem? Please describe.
What if I want to find MAX/SUM/%ile of filtered keys/values? Doing this today is a big pain as it requires a massive compute power.

Given that Dice DB is built on Golang and it has super-efficient goroutines, what if we build the capability of doing in-memory ETL through simple piped commands?

Describe the solution you'd like
Build the ability to represent and accept ETL commands, execute them on multiple cores over mutually exclusive subsets of the data, and quickly compute the response.

This way, Dice DB is single threaded for execution, but each command execution can be multi-threaded leveraging all the cores of the underlying hardware.
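
The "mutually exclusive subset per core" idea can be sketched as follows: one command (here a SUM) splits its input into disjoint chunks, each goroutine reduces its own chunk, and the partial results are combined at the end. The function name parallelSum and the slice-based input are assumptions for the sketch, not part of Dice.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelSum splits values into disjoint chunks, sums each chunk in its
// own goroutine, and combines the partial sums.
func parallelSum(values []int64, workers int) int64 {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int64, workers) // one cell per worker, no sharing
	chunk := (len(values) + workers - 1) / workers

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		start := w * chunk
		if start >= len(values) {
			break
		}
		end := start + chunk
		if end > len(values) {
			end = len(values)
		}
		wg.Add(1)
		go func(w int, part []int64) {
			defer wg.Done()
			for _, v := range part {
				partial[w] += v
			}
		}(w, values[start:end])
	}
	wg.Wait()

	var total int64
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	values := make([]int64, 1_000_000)
	for i := range values {
		values[i] = int64(i % 10)
	}
	fmt.Println(parallelSum(values, runtime.NumCPU()))
}
```

Since every worker writes only to its own cell and reads only its own chunk, no locking is needed while the command runs.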

Describe alternatives you've considered
There is no way to do value based operations and key based filters are expensive. No matter what we are not leveraging all the underlying cores

Additional context
This would open up the whole world of multi-core data structures and algorithms to be added to Dice.

Add support for Bloom Filters

Is your feature request related to a problem? Please describe.
Bloom filters are essential to check for existence, and having support for that is essential for real-world usecases.

Describe the solution you'd like
Add commands BINIT, BADD, and BEXIST to initialize, add, and check existence within the bloom filter. BINIT can take in arguments around tolerance and initial allocation.

Some points to consider

the size of the bloom filter
seamless resize

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=bloom+filter&btnG=
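
As a starting point for discussion, here is a small fixed-size Bloom filter sketch with functions named after the proposed commands. It uses two FNV hashes from the standard library combined by double hashing; the parameters (m bits, k probes) stand in for the "tolerance and initial allocation" arguments, and seamless resize is not covered. All names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of probes per item
}

// binit allocates a filter with m bits and k probes per item.
func binit(m, k uint64) *bloom {
	if m == 0 {
		m = 64
	}
	if k == 0 {
		k = 1
	}
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes; h2 is forced odd so it is never zero.
func (b *bloom) hashes(item string) (uint64, uint64) {
	h1 := fnv.New64a()
	h1.Write([]byte(item))
	h2 := fnv.New64()
	h2.Write([]byte(item))
	return h1.Sum64(), h2.Sum64() | 1
}

// badd sets the k bit positions for item.
func (b *bloom) badd(item string) {
	h1, h2 := b.hashes(item)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

// bexist reports false when item was definitely never added, and true
// when it was probably added (false positives are possible).
func (b *bloom) bexist(item string) bool {
	h1, h2 := b.hashes(item)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	b := binit(1024, 3)
	b.badd("user:1")
	fmt.Println(b.bexist("user:1"), b.bexist("user:2"))
}
```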

Efficient deep comparison of complex types for test suite

Is your feature request related to a problem? Please describe.
RESP encoding and decoding need to be tested with complex types, and there is no one good way to check two objects for equality. We need a way to do it.

Describe the solution you'd like
go-cmp library can be used as a reference.
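
Until a decision on go-cmp is made, the standard library's reflect.DeepEqual can serve as a baseline for comparing decoded RESP values. The sketch below uses it as a stand-in (it is not go-cmp) on nested []interface{} values, which is only an assumed shape for decoded arrays. It also shows a known sharp edge: values with different dynamic types, such as int and int64, are not considered equal.

```go
package main

import (
	"fmt"
	"reflect"
)

// sameDecoded reports whether two decoded values are deeply equal.
// It relies on reflect.DeepEqual, so dynamic types must match exactly.
func sameDecoded(got, want interface{}) bool {
	return reflect.DeepEqual(got, want)
}

func main() {
	got := []interface{}{
		[]interface{}{"user:1", "value:1"},
		[]interface{}{"user:2", int64(2)},
	}
	want := []interface{}{
		[]interface{}{"user:1", "value:1"},
		[]interface{}{"user:2", int64(2)},
	}
	fmt.Println(sameDecoded(got, want))

	// int64(1) and int(1) have different types, so this comparison fails.
	fmt.Println(sameDecoded([]interface{}{int64(1)}, []interface{}{1}))
}
```

A library such as go-cmp would additionally report where two values differ, which DeepEqual does not.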

Add support for `QUEUEINT`

To power Milestone 1: Pokemon Premier League #46, we need to support the queue data structure. We start with an integer queue that needs to be highly optimized for space and time. Also, add tests to ensure the correctness of the system.

Commands Specification: https://github.com/DiceDB/dice/wiki/Queue-of-Integers

Add support for command `FILTERVALS`

Is your feature request related to a problem? Please describe.
We need a way to filter out values on the basis of some condition on the value.

Describe the solution you'd like
Create a command FILTERVALS that accepts wildcards as an argument and returns the list of KV pairs as a response.

> FILTERVALS "*value:*"
0) 0) user:1
    1) value:1
1) 0) user:2
    1) value:2

a JSON equivalent of the response is

[
    ["user:1", "value:1"],
    ["user:2", "value:2"],
    ...
]

Communication medium like slack/discord

Is your feature request related to a problem? Please describe.
Is there any medium for communication like slack or discord?


issue: `exDurationSec` code bug in SET command

Bug Desc
In the evalSET function, the following line has an issue: the command may take multiple arguments in the future, and the EX argument can then appear at any position.
exDurationSec, err := strconv.ParseInt(args[3], 10, 64)

it should be
exDurationSec, err := strconv.ParseInt(args[i], 10, 64)
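
To show what indexing by i could look like in full, here is a sketch of an option-scanning loop. It assumes args holds everything after the command name (key, value, then options), which matches the original use of args[3] for the duration. The function name parseExpiry and the error strings are assumptions, not the existing evalSET code.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseExpiry scans the options after "key value" and returns the EX
// duration in seconds, or -1 when no EX option is present.
func parseExpiry(args []string) (int64, error) {
	exDurationSec := int64(-1)
	for i := 2; i < len(args); i++ {
		switch strings.ToUpper(args[i]) {
		case "EX":
			i++ // the duration is the token right after EX
			if i >= len(args) {
				return 0, errors.New("ERR syntax error")
			}
			sec, err := strconv.ParseInt(args[i], 10, 64)
			if err != nil || sec <= 0 {
				return 0, errors.New("ERR value is not an integer or out of range")
			}
			exDurationSec = sec
		default:
			// Future options would get their own case here.
			return 0, errors.New("ERR syntax error")
		}
	}
	return exDurationSec, nil
}

func main() {
	sec, err := parseExpiry([]string{"k", "v", "EX", "10"})
	fmt.Println(sec, err)
}
```

With this structure, adding another option later means adding a case rather than shifting hard-coded indexes.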

Dockerfile for this project

Is your feature request related to a problem? Please describe.
This is a relatively new project and I guess it would be nice if we can make it easier for people to play-around. Docker image will let people do that.

Describe the solution you'd like
Creating a Dockerfile with linux image as base.

I would love to take this up, won't be a huge task.
