Comments (9)
@zuiderkwast Yes, a traffic model.
However, it is difficult to get real-time access to user data (it involves privacy and company assets), and the data may also differ between companies. So I think this model can be updated continuously, and the operational standards can be implemented first.
We need to control the scope of the discussion. We can continue it at https://github.com/orgs/valkey-io/discussions/398
The current issue requires a performance benchmark standard. We cannot expect this method to detect all performance issues. It can do:
- Command performance testing: For example (SET, LPUSH/POP, SADD, HSET, XADD, ZADD)
- Performance testing with fixed parameters, such as data size and number of clients
It cannot perform dynamic validation, such as expiration and eviction strategies; that needs to be designed separately.
Just a small idea: at least have a fixed performance report first.
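A minimal sketch of what one fixed-parameter run could look like, assuming `valkey-benchmark` keeps the `-h`, `-p`, `-t`, `-c`, `-d`, `-n` and `--csv` flags it inherited from `redis-benchmark`. The helper name `fixed_benchmark_argv` and the concrete numbers are hypothetical. XADD is left out because this sketch does not assume the tool has a built-in test for it. The command line is only built, not executed.

```python
# Commands named in the list above (XADD omitted, see the note before the block).
TESTS = ["set", "lpush", "lpop", "sadd", "hset", "zadd"]


def fixed_benchmark_argv(tests, clients, data_size, requests,
                         host="127.0.0.1", port=6379):
    """Return the argv list for one fixed-parameter benchmark run."""
    return [
        "valkey-benchmark",
        "-h", host,
        "-p", str(port),
        "-t", ",".join(tests),   # which command tests to run
        "-c", str(clients),      # number of parallel client connections
        "-d", str(data_size),    # value size in bytes
        "-n", str(requests),     # total number of requests
        "--csv",                 # machine-readable output for a report
    ]


# Hypothetical fixed parameters; a real standard would have to agree on them.
argv = fixed_benchmark_argv(TESTS, clients=50, data_size=64, requests=100_000)
print(" ".join(argv))
```

Keeping the parameters in one place like this makes it easier to produce the same report on every run and compare results over time.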
from valkey.
@valkey-io/core-team Would like thoughts on the above proposal, and implicitly would appreciate a vote.
from valkey.
re: hardware
I guess one precursor question: What hardware/architectures is Valkey planning on targeting?
from valkey.
I wasn't aware that there were performance benchmarking tools but I love this idea so adding my vote explicitly here
from valkey.
> I guess one precursor question: What hardware/architectures is Valkey planning on targeting?
Ideally one pair of arm hosts and one pair of x86 hosts. Something like an m7i and an m7g is probably "broadly sufficient". If GCP would like to donate some hardware, we could run it on their infra as well. :)
from valkey.
A few fixed jobs are good to have, but what I've felt the need for when doing certain optimizations is specific runs that show the performance improvement for specific scenarios/workloads. For example, I had a PR to avoid looking up the expire dict for keys that don't have a TTL. This is only slow if there are many keys in the expire dict and also many accesses to keys that don't have a TTL. I had convincing (to myself) results on my laptop by running several times with similar results, but that automated Redis benchmark could see it.
When we test a few fixed workloads, we will always miss other workloads and scenarios.
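The expire-dict scenario described above could be expressed as a small scenario-specific workload, sketched here under assumptions: the generator name `expire_dict_workload`, the key prefixes and the sizes are all hypothetical, and the commands are only generated as tuples, not sent to a server.

```python
import random


def expire_dict_workload(ttl_keys, plain_keys, reads, ttl_seconds=3600, seed=0):
    """Yield commands that fill the expire dict, then read only keys without a TTL."""
    rng = random.Random(seed)
    # Phase 1: many keys with a TTL, so the expire dict is large.
    for i in range(ttl_keys):
        yield ("SET", f"ttl:{i}", "x", "EX", str(ttl_seconds))
    # Phase 2: keys without a TTL.
    for i in range(plain_keys):
        yield ("SET", f"plain:{i}", "x")
    # Phase 3: many accesses to the keys that have no TTL.
    for _ in range(reads):
        yield ("GET", f"plain:{rng.randrange(plain_keys)}")


commands = list(expire_dict_workload(ttl_keys=1000, plain_keys=100, reads=500))
print(len(commands), commands[0], commands[-1][0])
```

A fixed job suite would need a way to plug in generators like this one per optimization, which is the gap this comment points at.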
from valkey.
I totally agree with this. Before designing the test, I would like to raise several concerns.
- How to decide the number of client connections and the number of threads, as in memtier-benchmark
- How to decide the data size (aka workload) for each data type, for example whether a value is 10 bytes, 10k, or 10MB
- If maxmemory is set, whether we should consider all key-eviction policies
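To get a feel for how quickly these choices multiply, here is a rough sketch that enumerates the combinations. The client and thread counts are hypothetical placeholders, the value sizes are the three from the list above, and the policy names are the standard `maxmemory-policy` values.

```python
from itertools import product

CLIENTS = [1, 50, 200]                           # hypothetical connection counts
THREADS = [1, 4]                                 # hypothetical client thread counts
VALUE_SIZES = [10, 10 * 1024, 10 * 1024 * 1024]  # 10 bytes, 10k, 10MB
EVICTION_POLICIES = [
    "noeviction", "allkeys-lru", "allkeys-lfu", "allkeys-random",
    "volatile-lru", "volatile-lfu", "volatile-random", "volatile-ttl",
]


def benchmark_matrix():
    """Return one dict per combination of the four dimensions."""
    return [
        {"clients": c, "threads": t, "value_size": s, "policy": p}
        for c, t, s, p in product(CLIENTS, THREADS, VALUE_SIZES, EVICTION_POLICIES)
    ]


matrix = benchmark_matrix()
print(len(matrix))  # 3 * 2 * 3 * 8 = 144 combinations
```

Even with this small set of values, a single data type already needs 144 runs, so the standard probably has to pick a subset rather than cover the full product.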
from valkey.
I have an idea about performance.
We can refer to the process of the TPC (Transaction Processing Performance Council) and design the server configuration to suit the various workloads of a NoSQL database.
For example (a system similar to Quora):
- Real-time application data caching: number of likes, user information cache
- Real-time session stores: user sessions
- Real-time leaderboards: Quora article ranking
The data will be generated proportionally and discretely to cover scenarios of different sizes.
At the same time, it can also be added to the workload of key observation policies.
This work will only involve managing workloads that are in line with actual production (including both the client and server).
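As a rough illustration of generating the three scenarios proportionally, the sketch below picks a scenario by weight and emits one command for it. The weights, key names and the choice of command per scenario are all hypothetical, not taken from any real traffic.

```python
import random
from collections import Counter

# Hypothetical traffic shares for the three scenarios; not measured values.
SCENARIO_WEIGHTS = {"cache": 0.6, "session": 0.3, "leaderboard": 0.1}


def next_command(rng, user_count=1000):
    """Pick a scenario by weight and return (scenario, illustrative command tuple)."""
    names = list(SCENARIO_WEIGHTS)
    weights = [SCENARIO_WEIGHTS[n] for n in names]
    scenario = rng.choices(names, weights=weights)[0]
    ident = rng.randrange(user_count)
    if scenario == "cache":
        return scenario, ("INCR", f"likes:{ident}")
    if scenario == "session":
        return scenario, ("SET", f"session:{ident}", "token", "EX", "1800")
    return scenario, ("ZINCRBY", "article:ranking", "1", f"article:{ident}")


rng = random.Random(42)  # fixed seed so a run is repeatable
generated = [next_command(rng) for _ in range(10_000)]
mix = Counter(scenario for scenario, _ in generated)
print(mix)
```

A fixed seed keeps the generated stream identical between runs, which matters if results are to be compared across versions.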
from valkey.
@artikell Do you mean we should have an advanced "traffic model" where we can define how many of each kind of command there are, and the size of the data, with probabilities?
I have heard about such benchmarks (for some commercial products) where statistics are collected from users and then used to run benchmark tests with the user's traffic model. It can be very powerful. Maybe the first thing we need is a way to collect these statistics from a running node. (They should contain only statistics, no actual key names or value content.)
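A minimal sketch of the statistics idea, assuming the observed traffic is available as tuples of `(command, key, *args)`. The function names are hypothetical, and character counts stand in for byte sizes. Only the command name and a size bucket are kept, so no key names or value content end up in the model.

```python
from collections import Counter


def size_bucket(n):
    """Round a payload size up to the next power of two (0 stays 0)."""
    return 0 if n == 0 else 1 << (n - 1).bit_length()


def collect_traffic_stats(commands):
    """Count (command name, payload size bucket) pairs; keys and values are dropped."""
    stats = Counter()
    for name, _key, *args in commands:
        payload = sum(len(str(a)) for a in args)
        stats[(name.upper(), size_bucket(payload))] += 1
    return stats


def to_probabilities(stats):
    """Turn raw counts into the probabilities of a traffic model."""
    total = sum(stats.values())
    return {bucket: count / total for bucket, count in stats.items()}


# Made-up sample traffic, for illustration only.
observed = [
    ("SET", "user:1", "a" * 100),
    ("GET", "user:1"),
    ("SET", "user:2", "b" * 120),
    ("GET", "user:3"),
]
model = to_probabilities(collect_traffic_stats(observed))
print(model)
```

The resulting dictionary could then drive a weighted command generator, so the benchmark replays the shape of the traffic without carrying any user data.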
from valkey.