lru's People

Contributors

izturn, phuslu

lru's Issues

API & features feedback

I plan to release v1.0.0; before that we need to:

  • Finalize the API signature
  • Add features only if really needed

Feedback is welcome, thanks!

Please add the feature: sliding caching

Thanks for sharing this nice repo. I had just added the Loader to my code when the new version made it outdated ;-). The last feature I'm missing is sliding expiration: right now the cache uses an absolute TTL. Shall we add it?
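For clarity, here is a minimal sketch of what sliding expiration could look like as a wrapper over an absolute-TTL cache. The TTLCache interface and method names below are assumptions for illustration, not phuslu/lru's actual API:

package cache

import "time"

// TTLCache is a hypothetical stand-in for an absolute-TTL cache such as
// phuslu/lru; the real method names may differ.
type TTLCache[K comparable, V any] interface {
    Get(key K) (V, bool)
    SetWithTTL(key K, value V, ttl time.Duration)
}

// SlidingCache re-arms the TTL on every successful Get, so an entry
// expires only after ttl of inactivity instead of ttl after insertion.
type SlidingCache[K comparable, V any] struct {
    inner TTLCache[K, V]
    ttl   time.Duration
}

func (c *SlidingCache[K, V]) Get(key K) (V, bool) {
    v, ok := c.inner.Get(key)
    if ok {
        c.inner.SetWithTTL(key, v, c.ttl) // refresh the deadline on access
    }
    return v, ok
}

func (c *SlidingCache[K, V]) Set(key K, value V) {
    c.inner.SetWithTTL(key, value, c.ttl)
}

Built into the cache itself this would be cheaper, since an external wrapper pays for a second shard lookup and an extra eviction-list update on every hit.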

A few questions about your implementation

Hi @phuslu, very interesting work, but some things in your repository raise questions for me.

Throughput benchmarks

  1. There is no need to run benchmarks on a very busy GitHub runner: its cores are forced to context-switch frequently.
  2. b.SetParallelism(32) is also unnecessary: it makes RunParallel start 32×GOMAXPROCS goroutines, so you run far more goroutines than available cores and the scheduler spends a lot of CPU time switching between them.
  3. Ristretto is configured incorrectly, which greatly affects its performance. You specify MaxCost: 2 << 30 in its parameters while giving each item a cost of 1. Because of this it will store up to 2 << 30 items, which is far more than one million. The correct configuration is as follows:
cache, _ := ristretto.NewCache(&ristretto.Config{
    NumCounters: 10 * cachesize, // number of keys to track frequency of (10M).
    MaxCost:     cachesize,      // maximum cost of cache (1M).
    BufferItems: 64,             // number of keys per Get buffer.
})
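With this configuration each item should then be inserted with a cost of 1, so MaxCost bounds the item count at cachesize rather than at 2 << 30. A small usage sketch (key and value are placeholders):

cache.Set(key, value, 1) // cost 1 per item, so MaxCost caps the item count
cache.Wait()             // Sets are buffered and applied asynchronously; Wait drains them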

Otter is also configured incorrectly, but that's not critical for now. It will be soon, though! 🙂

And now, if you run the original benchmarks without a runtime limit on 8 free performance cores with the command go test -run='^$' -cpu=8 -bench . -timeout=0 -benchmem:

goos: darwin
goarch: arm64
pkg: github.com/maypok86/otter/bench
BenchmarkCloudflareGetSet-8     19329662                59.74 ns/op           16 B/op          1 allocs/op
BenchmarkEcacheGetSet-8         30167113                42.04 ns/op            2 B/op          0 allocs/op
BenchmarkLxzanGetSet-8          19655550                64.18 ns/op            0 B/op          0 allocs/op
BenchmarkFreelruGetSet-8        16393722                76.60 ns/op            0 B/op          0 allocs/op
BenchmarkRistrettoGetSet-8      13675732                91.52 ns/op           29 B/op          1 allocs/op
BenchmarkTheineGetSet-8          7859980               154.2 ns/op             0 B/op          0 allocs/op
BenchmarkOtterGetSet-8          19729608                70.02 ns/op            8 B/op          0 allocs/op
BenchmarkPhusluGetSet-8         23397532                52.11 ns/op            0 B/op          0 allocs/op
PASS
ok      github.com/maypok86/otter/bench 17.342s

After the other fixes we get the following:

goos: darwin
goarch: arm64
pkg: github.com/maypok86/otter/bench
BenchmarkCloudflareGetSet-8     18841335                61.23 ns/op           16 B/op          1 allocs/op
BenchmarkEcacheGetSet-8         28833619                39.66 ns/op            2 B/op          0 allocs/op
BenchmarkLxzanGetSet-8          24512806                50.89 ns/op            0 B/op          0 allocs/op
BenchmarkFreelruGetSet-8        14905558                79.83 ns/op            0 B/op          0 allocs/op
BenchmarkRistrettoGetSet-8      22861460                52.04 ns/op           28 B/op          1 allocs/op
BenchmarkTheineGetSet-8          7006892               171.7 ns/op             0 B/op          0 allocs/op
BenchmarkOtterGetSet-8          30895054                47.92 ns/op            7 B/op          0 allocs/op
BenchmarkPhusluGetSet-8         26106796                47.16 ns/op            0 B/op          0 allocs/op
PASS
ok      github.com/maypok86/otter/bench 17.371s

For a fair comparison you should also make sure that all implementations use the same number of shards, but I'm too lazy to go that far.

  1. In the benchmarks you use the construct i := int(fastrandn(cachesize)); i <= threshold. Do you really want to always write the same items and always read the same items? Because that is exactly what this does: keys below the threshold are only ever written, and keys above it are only ever read.
  2. Next are general questions about the key distribution used in the benchmarks. The main one: why is a uniform key distribution used? It is completely unrealistic and spreads contention evenly across shards, which hides implementation problems. And why is the cache only half full? Do you really want to measure the speed of cache misses, which are worthless compared to cache hits? You can read more here.
    And if you run more CPU-cache-aware benchmarks (a sketch of one follows the results below), the use of sync.Mutex starts to have an impact:
goos: darwin
goarch: arm64
pkg: github.com/maypok86/otter/perf
BenchmarkCache/Zipf_Theine_reads=100%,writes=0%-8               11608960               107.7 ns/op         9283558 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Ristretto_reads=100%,writes=0%-8            48676160                25.32 ns/op       39493565 ops/s              16 B/op          1 allocs/op
BenchmarkCache/Zipf_Phuslu_reads=100%,writes=0%-8               32948250                34.86 ns/op       28687550 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Otter_reads=100%,writes=0%-8                287808117                4.196 ns/op     238324807 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Theine_reads=75%,writes=25%-8               10141712               106.3 ns/op         9405869 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Ristretto_reads=75%,writes=25%-8            27029410                46.52 ns/op       21498335 ops/s              27 B/op          1 allocs/op
BenchmarkCache/Zipf_Phuslu_reads=75%,writes=25%-8               33284792                33.89 ns/op       29508730 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Otter_reads=75%,writes=25%-8                190370718                6.527 ns/op     153221398 ops/s               2 B/op          0 allocs/op
BenchmarkCache/Zipf_Theine_reads=50%,writes=50%-8               13183764                88.23 ns/op       11334143 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Ristretto_reads=50%,writes=50%-8            22789605                52.02 ns/op       19224501 ops/s              34 B/op          1 allocs/op
BenchmarkCache/Zipf_Phuslu_reads=50%,writes=50%-8               32335388                34.77 ns/op       28761246 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Otter_reads=50%,writes=50%-8                106489442               12.22 ns/op       81857267 ops/s               6 B/op          0 allocs/op
BenchmarkCache/Zipf_Theine_reads=25%,writes=75%-8               13120154                92.32 ns/op       10832209 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Ristretto_reads=25%,writes=75%-8            14231132                70.28 ns/op       14228276 ops/s              60 B/op          1 allocs/op
BenchmarkCache/Zipf_Phuslu_reads=25%,writes=75%-8               32800737                33.55 ns/op       29807980 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Otter_reads=25%,writes=75%-8                51792225                23.23 ns/op       43054563 ops/s              12 B/op          0 allocs/op
BenchmarkCache/Zipf_Theine_reads=0%,writes=100%-8               31050308                41.97 ns/op       23827702 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Ristretto_reads=0%,writes=100%-8            13311984               128.6 ns/op         7775102 ops/s             122 B/op          3 allocs/op
BenchmarkCache/Zipf_Phuslu_reads=0%,writes=100%-8               33116732                34.05 ns/op       29370837 ops/s               0 B/op          0 allocs/op
BenchmarkCache/Zipf_Otter_reads=0%,writes=100%-8                10602829               120.2 ns/op         8316114 ops/s              64 B/op          1 allocs/op
PASS
ok      github.com/maypok86/otter/perf  30.320s
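For reference, here is a sketch of a Zipf-distributed benchmark along the lines of the results above. The Cache interface and the constants are illustrative; the real benchmarks would plug in phuslu/lru, ristretto, otter, and so on:

package bench

import (
    "math/rand"
    "strconv"
    "testing"
)

// Cache is a minimal stand-in for the caches under test.
type Cache interface {
    Get(key string) (int, bool)
    Set(key string, value int)
}

// benchmarkZipf drives a cache with a Zipf-distributed key stream: a few
// hot keys absorb most operations, so contention concentrates on a few
// shards instead of being spread uniformly across all of them.
func benchmarkZipf(b *testing.B, c Cache, readRatio float64) {
    const keyspace = 1 << 20 // illustrative keyspace size
    keys := make([]string, keyspace)
    for i := range keys {
        keys[i] = strconv.Itoa(i)
        c.Set(keys[i], i)
    }
    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        // Each goroutine needs its own source: rand.Rand is not
        // safe for concurrent use.
        r := rand.New(rand.NewSource(rand.Int63()))
        zipf := rand.NewZipf(r, 1.01, 1, keyspace-1)
        for pb.Next() {
            k := keys[zipf.Uint64()]
            if r.Float64() < readRatio {
                c.Get(k)
            } else {
                c.Set(k, 0)
            }
        }
    })
}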

Memory usage benchmarks

  1. Ristretto is misconfigured again.
  2. Apparently you're not storing the key twice, unlike the other implementations, so using values that take up less space than the keys doesn't look very fair (it saves about 15 MB).
  3. Also, I didn't find any data structure for deleting expired items (see the sketch after this list). For these benchmarks that saves about 16 bytes per item, or roughly 15 MB in total, but in real life it will lead to much higher memory usage than the other implementations. Oh, and not being able to disable the clock goroutine looks sad.
  4. Comparing memory usage with ristretto, theine, and otter doesn't look particularly fair either, because their support for cost-based eviction makes the hash table grow many times larger.
  5. Many implementations spend a few extra bytes per element to increase the hit ratio. For example, theine and otter should need about 8–16 extra bytes per element, though unfortunately it ends up being 16–24 bytes due to alignment issues 😔.
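On point 3, here is a minimal sketch of the kind of expiry index the other implementations carry, a min-heap ordered by deadline; one time.Time plus a heap slot per item is roughly where the ~16 extra bytes go. All names are illustrative, not phuslu/lru internals:

package cache

import (
    "container/heap"
    "time"
)

// expiryEntry pairs a key with its absolute expiration deadline.
type expiryEntry struct {
    key      string
    deadline time.Time
}

// expiryHeap is a min-heap ordered by deadline (container/heap interface).
type expiryHeap []expiryEntry

func (h expiryHeap) Len() int           { return len(h) }
func (h expiryHeap) Less(i, j int) bool { return h[i].deadline.Before(h[j].deadline) }
func (h expiryHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *expiryHeap) Push(x any)        { *h = append(*h, x.(expiryEntry)) }
func (h *expiryHeap) Pop() any {
    old := *h
    n := len(old)
    e := old[n-1]
    *h = old[:n-1]
    return e
}

// evictExpired pops every entry whose deadline has passed and removes it
// from the backing map. Without an index like this, expired entries linger
// until they happen to be looked up or pushed out by the LRU policy.
func evictExpired(h *expiryHeap, items map[string]int, now time.Time) {
    for h.Len() > 0 && (*h)[0].deadline.Before(now) {
        e := heap.Pop(h).(expiryEntry)
        delete(items, e.key)
    }
}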

So your implementation does show good memory utilization, but after the implementation and benchmarks are fixed, the gap will not be so big.

General questions

  1. Do you understand why your implementation is faster than ristretto? This is important, because cutting the eviction policy is a very dangerous operation that will end badly in a large number of cases.
  2. I tried to simulate the hit ratio of this implementation and found some unpleasant surprises. This one looks very questionable, because in real life you will store many more items than you need. Also, here are the results of the hit ratio simulation: almost everywhere too many items are used, but the most confusing thing is the behavior on the S3 trace. Are there definitely no bugs?
