Comments (26)

pleia2 avatar pleia2 commented on May 24, 2024 6

In case it helps as you evaluate this, my team at IBM runs an incredibly active, free, s390x virtual machine program for open source projects. If the Valkey project would like VMs, please reach out to us: https://community.ibm.com/zsystems/form/l1cc-oss-vm-request/

from valkey.

madolson avatar madolson commented on May 24, 2024 5

there's probably a couple other dozen things too, but that's my long-term gripe list I'm really surprised most people haven't been motivated to see the problems and fix them yet for long term project reliability and continuity.

Honestly, your comment of "culture conservatism" resonated strongly with me. I spent a bunch of time trying to move to Python tests, and the resistance from the former Redis guys (Oran and Yossi) was insurmountable. I'm perhaps a bit more conservative in the fact that I think we should be deeply fearful of breaking changes for Redis, given its place in the stack, but definitely believe we need to be moving faster.

wenerme avatar wenerme commented on May 24, 2024 4

If we are living in 2024, can we have some HTTP-based protocol support (WebSocket, HTTP/1, HTTP/2, HTTP/3) built in?

mattsta avatar mattsta commented on May 24, 2024 3

I still have drafts of a redis binary network protocol from 2014-2015 based on http/2 I was planning (including things like multiplexing using per-command client command ids matched to reply ids for non-head-of-line blocking concurrency, etc).

I think it's interesting to note the difference between the network protocol and the data protocol though. Currently redis operates a hybrid network+data protocol, but we can easily split the network protocol into a more streamlined binary format with multiple addressable streams allowing different data output formats too.

For data output, it turns out of all formats possible, JSON is the most efficient format if your data isn't majority binary blobs.

What does JSON solve?

  • you don't need extra bytes for length prefixes on strings since all strings are contained within two quotes as the start/end string signal: "string"
  • you don't need to pre-specify the length or depth or type of all your data structures either since JSON data structures are also start/end delimited
    • though, JSON isn't an amazing format for streaming without workarounds like ndjson (and it doesn't have a set type or other extensible types without creating custom tagged union wrappers)
  • numbers written in ascii decimal are optimal because shorter numbers require fewer bytes, and shorter numbers are more common
  • implicit type lengths defined by {} and "" and [] and split by , are more efficient than requiring up-front lengths and type markers delimited everywhere like $5\r\nHELLO\r\n instead of just "hello" for user data
    • *2\r\n$5\r\nhello\r\n$5\r\nworld\r\n is just ["hello","world"] instead, etc
  • escaping JSON strings can be checked for bad values and converted quickly using SIMD operations on intel and ARM in 16 byte chunks (or potentially even larger)
  • PLUS, you don't have to write yet another conversion layer from some rando server protocol to JSON, you can just yeet your command results directly back at users (so your app servers just act as literal data proxies instead of structured transformation nodes if your data is scoped properly with no other intermediate transformation steps).
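To make the framing comparison above concrete, here is a rough sketch that counts the bytes needed for an array of strings in both encodings. The function names are hypothetical, and the JSON count assumes no character needs escaping, so it says nothing about binary-heavy payloads:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Bytes needed to encode n strings as a RESP array of bulk strings:
 * "*<n>\r\n" followed by "$<len>\r\n<bytes>\r\n" for each element. */
size_t resp_array_size(const char *const *items, size_t n) {
    /* snprintf with a NULL buffer only reports the formatted length. */
    size_t total = (size_t)snprintf(NULL, 0, "*%zu\r\n", n);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(items[i]);
        total += (size_t)snprintf(NULL, 0, "$%zu\r\n", len) + len + 2;
    }
    return total;
}

/* Bytes needed for the same data as a JSON array of strings.
 * Assumption: no character in any item needs escaping. */
size_t json_array_size(const char *const *items, size_t n) {
    size_t total = 2;                  /* '[' and ']' */
    for (size_t i = 0; i < n; i++) {
        total += strlen(items[i]) + 2; /* the item plus two quotes */
        if (i + 1 < n) total += 1;     /* ',' between items */
    }
    return total;
}
```

For the two-element example in the list, this gives 26 bytes for the RESP form and 17 bytes for ["hello","world"].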

stockholmux avatar stockholmux commented on May 24, 2024 3

finally remove all master/slave terminology

+1, this is the right time to start this.

bitnom avatar bitnom commented on May 24, 2024 3

Things I want:

  • JSON/REJSON
  • Graph
  • Search
  • Timeseries
  • Vectors
  • Websockets (And possibly gRPC, additionally but not instead of)
  • Clustering
  • A fully compatible (And clusterable) WASM build.

We should probably do these via polls in the Discussion section though.

zuiderkwast avatar zuiderkwast commented on May 24, 2024 2

Matt, your perspective and experience are very valuable. I'm glad you came to this fork. Like Madelyn, I think we need to be both radical and conservative.

We'll need to create issues for each of these points to discuss them one by one, and have some kind of categorization and decision making process.... We'll come to that.

kadler avatar kadler commented on May 24, 2024 2

Every time I have to slow down and guard something with if (isBigEndian()) { thing = byteswap(thing) } it just feels like wasted effort

Well yeah, that's not a sensible solution to the problem. Endianness pretty much only matters in serialization: to the network and/or to disk (and sometimes this doesn't even matter). The better solution is to pick an endianness for your data (little being most appropriate nowadays, despite what network byte order would dictate) then write functions or macros that read from or write to that endianness, byteswapping as appropriate. You do the endian checks in one place (preferably at compile time) and then just always use those functions and never have to think about endianness again.
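A minimal sketch of that approach, assuming a little-endian serialized format. The names store_le64 and load_le64 are hypothetical, not existing Valkey functions. This variant assembles the bytes with shifts instead of a compile-time endianness check plus byteswap, so the same code gives the same result on any host:

```c
#include <stdint.h>

/* Write v into dst as 8 bytes, least significant byte first.
 * The shifts operate on the value, not on memory, so the host's
 * own byte order never enters the picture. */
static inline void store_le64(unsigned char *dst, uint64_t v) {
    for (int i = 0; i < 8; i++) {
        dst[i] = (unsigned char)(v >> (8 * i));
    }
}

/* Read 8 little-endian bytes from src back into a host integer. */
static inline uint64_t load_le64(const unsigned char *src) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v |= (uint64_t)src[i] << (8 * i);
    }
    return v;
}
```

Serialization code would then call only these two helpers and never test endianness at the call site.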

madolson avatar madolson commented on May 24, 2024 2

wasm and wasi support would be amazing

Say more? Are you interested in having Valkey be compiled into WASM or support for it in modules/scripting?

PingXie avatar PingXie commented on May 24, 2024 1

+1 on all, @mattsta!

These are all great points and suggestions! These changes would have little impact on the existing users but would go a long way to support future innovations. I would love to work with you to make these happen.

mattsta avatar mattsta commented on May 24, 2024 1

I think we should be deeply fearful of breaking changes for Redis, given its place in the stack, but definitely believe we need to be moving faster.

Exactly. There is a balance between continuity of existing systems while also not remaining stuck with 15 year old designs for the next 15+ years. I raised a lot of these standard project maintenance issues 10 years ago and they never got better, so now it looks even more archaic in a lot of places.

Honestly, your comment of "culture conservatism" resonated strongly with me. I spent a bunch of time trying to move to Python tests, and the resistance from the former Redis guys (Oran and Yossi) was insurmountable.

I'm sorry I couldn't have prevented the current situation. I tried to advocate for both not giving the project away to a corrupt company and also for more consistent full time project management and architecture improvements (which was misunderstood as trying to "take over the project" then everything blew up), but I failed to generate the change I wanted to see in the world. Remember this one? ah, memories: http://antirez.com/news/87

But as for moving tests, it would obviously be great to move them to python. I bet with some careful work we could paste tcl files into Claude and ask for pytest formatted results. It would also be nice to stand up more concurrent testing if we can isolate tests to not step on each other.

zuiderkwast avatar zuiderkwast commented on May 24, 2024 1

Yeah, if we drop the idea of vendoring all dependencies (see #15), we can definitely have optional support for those. I'd like to have RESP over QUIC (multiplexed streams of commands over one connection) and optional compilation with liburing (io_uring).

madolson avatar madolson commented on May 24, 2024 1

@bitnom I would prefer to have issues instead of discussions; it's easier for us to keep all of the features there than trying to review them in two places.

vmorris avatar vmorris commented on May 24, 2024 1

nobody cares about big endian

Speak for yourself, and maybe do some research or provide evidence before you make such a claim?

iapicca avatar iapicca commented on May 24, 2024 1

wasm and wasi support would be amazing

Say more? Are you interested in having Valkey be compiled into WASM or support for it in modules/scripting?

I have 2 use cases in mind

  • wasi support for fermyon (in both nomad and kubernetes)
  • wasm support for flutter web (and mobile and desktop)

wenerme avatar wenerme commented on May 24, 2024 1

If Valkey can be compiled into WASM with WebSocket support, so that I can set up a simple replica, does that mean I have an offline-first KV DB in the browser? 🤔

zuiderkwast avatar zuiderkwast commented on May 24, 2024

We can add opt-in JSON, e.g. as HELLO json. It has everything that RESP2 has and parts of what RESP3 has (maps) but lacks some features: no difference between arrays and sets (not a big deal), push messages (can be done at protocol level if we use JSON over HTTP/2 though).

We don't have a JSON dependency right now though. Can we split this out into a separate issue please?

[Edit] One huge disadvantage of JSON is that it can't store binary data. Strings must be valid Unicode. To store binary data in JSON strings, people use tricks like Base64.
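As an illustration of the Base64 workaround and its cost, here is a small standalone encoder sketch using the standard alphabet with '=' padding. The function names are hypothetical; the point is that every 3 input bytes become 4 output characters, so binary values grow by roughly a third before they can sit inside a JSON string:

```c
#include <stddef.h>
#include <stdint.h>

/* Length of the padded Base64 text for n input bytes (no terminator). */
size_t base64_len(size_t n) {
    return 4 * ((n + 2) / 3);
}

/* Encode n bytes from src into out as a NUL-terminated string.
 * out must have room for base64_len(n) + 1 bytes. */
void base64_encode(const unsigned char *src, size_t n, char *out) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i = 0, o = 0;
    /* Full 3-byte groups map to exactly 4 characters. */
    while (i + 2 < n) {
        uint32_t v = (uint32_t)src[i] << 16 | (uint32_t)src[i + 1] << 8 | src[i + 2];
        out[o++] = tbl[v >> 18];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
        i += 3;
    }
    /* A trailing 1 or 2 bytes is padded with '=' to a 4-character group. */
    if (n - i == 1) {
        uint32_t v = (uint32_t)src[i] << 16;
        out[o++] = tbl[v >> 18];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = '=';
        out[o++] = '=';
    } else if (n - i == 2) {
        uint32_t v = (uint32_t)src[i] << 16 | (uint32_t)src[i + 1] << 8;
        out[o++] = tbl[v >> 18];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = '=';
    }
    out[o] = '\0';
}
```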

wenerme avatar wenerme commented on May 24, 2024

Speaking from my personal experience, my preference for NATS over Redis stems from NATS's support for WebSocket-based transport. This choice does not imply that I will use NATS directly in a browser environment. However, it allows me to leverage the existing infrastructure while bypassing the constraints associated with port requirements.

mattsta avatar mattsta commented on May 24, 2024

Just a note looking through more recent issues people are adding:

These are all literally things I brought up as major design flaws 10 years ago (with plans to fix!). I'm really surprised there's been no progress on so many of these basic architecture and design issues.

Even simple things like "the script isn't replicated everywhere so you get random failures" was happening 10 years ago y'all and nobody decided to improve the system in the meantime? I'm curious what's missing. Initiative? Permission? Ability? Project management prioritization? Lack of curiosity? Only following "profitable" improvements instead of overall stability? (another interpretation may be the project has intentionally preferred to remain less extensible and flexible to retain lock-in so people don't "grow the project" in "unapproved" directions where profit can't be captured)

Cluster topology is still causing client problems? AOF and RDB formats and differences are still causing problems and require a full version revision for all changes instead of having extensible metadata built-in? It's almost like the project has been afraid of addressing any of the original architecture and design inconsistencies?

It's worth remembering a lot of these architecture and design decisions weren't made by some expert committee having a combined 100 years of experience in distributed systems and distributed consistency protocols and flexible persistence and reliability and storage formats... it's mostly just "the ideas of some guy having fun building a personal database on a macbook air in 2010." Somehow, "because redis has always done it this way" became codified as a reason to never change or improve any of these core faults of the design? The core design of redis has always been treated as almost sacred and unquestionable and unchangeable even though high visibility deficiencies are scattered throughout. It's all just software and not some immutable laws of the universe.

wut.

bitnom avatar bitnom commented on May 24, 2024

I added some under: https://github.com/orgs/valkey-io/discussions

mattsta avatar mattsta commented on May 24, 2024

Speak for yourself,

such is the default state of speaking

and maybe do some research or provide evidence before you make such a claim?

??? if you have more information feel free to share. details are always useful.

wikipedia contributes:

The IBM System/360 uses big-endian byte order, as do its successors System/370, ESA/390, and z/Architecture. The PDP-10 uses big-endian addressing for byte-oriented instructions. The IBM Series/1 minicomputer uses big-endian byte order. The Motorola 6800 / 6801, the 6809 and the 68000 series of processors use the big-endian format. Solely big-endian architectures include the IBM z/Architecture and OpenRISC.

None of those really matter in modern hosting environments, and if they matter to individual companies for unique low-demand use cases, well, why are you using anonymous free software for mission critical services? If "big endian" is officially supported, it also means the entire CI cycle needs to duplicate itself for big endian VMs running every update as well. It's technically "supported" now, but never gets tested unless somebody complains. I seem to recall some parts didn't convert endianness in the save file properly for 10 years and nobody complained because nobody uses it:

This comment was added in 2018 after the file format had been broken for over 5 years:

/* This function loads a time from the RDB file. It gets the version of the
 * RDB because, unfortunately, before Redis 5 (RDB version 9), the function
 * failed to convert data to/from little endian, so RDB files with keys having
 * expires could not be shared between big endian and little endian systems
 * (because the expire time will be totally wrong). The fix for this is just
 * to call memrev64ifbe(), however if we fix this for all the RDB versions,
 * this call will introduce an incompatibility for big endian systems:
 * after upgrading to Redis version 5 they will no longer be able to load their
 * own old RDB files. Because of that, we instead fix the function only for new
 * RDB versions, and load older RDB versions as we used to do in the past,
 * allowing big endian systems to load their own old RDB files.
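The version gate that comment describes could look roughly like the sketch below. This is not the actual Redis/Valkey function: load_ms_time and le64_to_host are hypothetical names, the input is a plain 8-byte buffer rather than an RDB stream, and a shift-based helper stands in for memrev64ifbe():

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: interpret 8 bytes as little endian on any host. */
static uint64_t le64_to_host(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Hypothetical stand-in for the loader the comment documents. */
int64_t load_ms_time(const unsigned char raw[8], int rdbver) {
    if (rdbver >= 9) {
        /* New files: the field is defined as little endian. */
        return (int64_t)le64_to_host(raw);
    }
    /* Old files: bytes were dumped in the writer's native order, so keep
     * reading them natively and a host can still load its own old files. */
    int64_t t;
    memcpy(&t, raw, sizeof t);
    return t;
}
```

On a little-endian host both branches return the same value, which is one reason a bug like this can go unnoticed without big-endian test coverage.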

It's also worth noting the new key value cabal is almost all from "hyperscaler" hosting providers on modern x64 or arm.

These days I care more about matching development effort to sustainable forward-looking developer experience.

low level C developers are literally dying out, so the way forward is to simplify and remove as many traps as possible. Every time I have to slow down and guard something with if (isBigEndian()) { thing = byteswap(thing) } it just feels like wasted effort (plus, we don't even know if it's correct anymore since no CI is running big endian VMs anyway and the only reason for maintaining "big endian" support is for hybrid architecture deployments, so the CI would actually need to run: little endian VM, big endian VM, save dump file from each, load into opposite architecture, replicate between architectures and confirm, cluster between architectures and confirm... fun matrix combinatorial complexity for the benefit of... ???).

4321 > 1234

mattsta avatar mattsta commented on May 24, 2024

that's not a sensible solution to the problem.

true, but the code still has to exist somewhere (like the 5+ year bug example above... it was in the serialization code and it just didn't have the conversion check around it (combined with no integration tests verifying the goals of dual-architecture save/restore consistency)).

You do the endian checks in one place (preferably at compile time) and then just always use those functions and never have to think about endianness again.

technically true, but the design of redis isn't necessarily that coherent. it's mostly "a cruise ship built up from assorted scrap found on a beach over 15 years."

the problem isn't actual conversions, but rather the hand-evolved custom byte-by-byte writers and readers not respecting any formal definitions, so every refactor/improvement is a gamble as to whether we're breaking things. 🤷

iapicca avatar iapicca commented on May 24, 2024

wasm and wasi support would be amazing

aruanruan avatar aruanruan commented on May 24, 2024

System commands/operations (such as migration or sync commands) could perhaps be isolated from user data access commands by a different network channel or protocol. User access commands should stay atomic, but system commands may run for a long time or carry out a complex task spanning multiple commands.

iapicca avatar iapicca commented on May 24, 2024

I have 2 use cases in mind

  • wasi support for fermyon (in both nomad and kubernetes)
  • wasm support for flutter web (and mobile and desktop)

If Valkey can be compiled into WASM with WebSocket support, so that I can set up a simple replica, does that mean I have an offline-first KV DB in the browser? 🤔

@wenerme
that's exactly the use case I have in mind for Flutter; it's already doing that with SQLite (not KV, of course)

aruanruan avatar aruanruan commented on May 24, 2024

We could use YAML as the config file format.
