maypok86 / otter
A high performance lockless cache for Go.
Home Page: https://maypok86.github.io/otter/
License: Apache License 2.0
I am trying to use bytes as my cost function, but the policy silently casts the capacity from an int to a uint32. This effectively makes the max cache size 4 GB; values over that overflow and get an essentially random size.
Set the capacity to math.MaxUint32 + 1 and every entry will be rejected.
Expected behavior: normal caching (or at least an error).
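For illustration, a minimal sketch of the overflow being described (plain Go, nothing otter-specific; the numbers are just an example):

package main

import (
	"fmt"
	"math"
)

func main() {
	// A byte-based capacity just past 4 GiB...
	var capacity int64 = math.MaxUint32 + 1 // 4294967296

	// ...silently wraps around when narrowed to uint32, which is what I
	// suspect the policy does internally with the configured capacity.
	fmt.Println(uint32(capacity)) // prints 0
}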
I want to add some methods to otter to make it more convenient to use.
I'm thinking of adding three methods for now.
otter/internal/hashtable/map.go
Line 148 in c66ac1f
atomic.StorePointer(&m.table, unsafe.Pointer(&t))
?
This project uses the maphash module.
It uses the Go map's internal struct to generate the hash value, like this:
a := any(make(map[K]struct{}))
i := (*mapiface)(unsafe.Pointer(&a))
h = i.typ.hasher
And there are some comments like:
//go:build go1.18 || go1.19
// +build go1.18 go1.19
If the map's internal struct changes in the future, that conversion will fail.
How about just using the map's address as the hash value, like this:
m := any(make(map[int]struct{}))
p := uintptr(unsafe.Pointer(&m))
fmt.Println(p)
Or extract the hasher implementation from Go's runtime (/src/runtime/alg.go):
func typehash(t *_type, p unsafe.Pointer, h uintptr) uintptr
I want to unmarshal []byte from a file into a *otter.Cache[string, any], but I get nothing in the *otter.Cache[string, any].
Does otter support JSON unmarshalling, or how can I unmarshal correctly?
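Otter's entries live in internal structures, so (as far as I can tell) a cache cannot be decoded into directly. A minimal sketch of a workaround, assuming the values are themselves JSON-marshalable: dump the entries into a plain map with Range, and re-insert them with Set after decoding. TTLs are not preserved here, and the helper flow is mine, not otter's API:

package main

import (
	"encoding/json"
	"fmt"

	"github.com/maypok86/otter"
)

func main() {
	cache, err := otter.MustBuilder[string, string](10_000).Build()
	if err != nil {
		panic(err)
	}
	cache.Set("hello", "world")

	// Dump: copy live entries into a plain map and marshal that.
	entries := make(map[string]string)
	cache.Range(func(key, value string) bool {
		entries[key] = value
		return true
	})
	data, _ := json.Marshal(entries)

	// Restore: unmarshal into a plain map and re-insert into a fresh cache.
	restored, err := otter.MustBuilder[string, string](10_000).Build()
	if err != nil {
		panic(err)
	}
	decoded := make(map[string]string)
	if err := json.Unmarshal(data, &decoded); err != nil {
		panic(err)
	}
	for k, v := range decoded {
		restored.Set(k, v)
	}
	fmt.Println(restored.Get("hello"))
}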
Hello, I'm interested in testing the performance of your cache in my application. Currently I use theine. The only thing I can see missing from the API is the ability to save and restore the cache from a file, so warm restarts are possible (very important for my app).
It is important that the file object is e.g. a generic io.ReadCloser, as we use afs (abstract file system) so we can use any cloud object store.
Thanks!
We need a specific hash table that ignores hash collisions and supports node swapping
For the moment we will probably use a modification of the hash table from this repository, but in the future we should implement ConcurrentHashMap from the standard Java library or some other implementation.
Added a hash table implementation with the desired characteristics, covered by tests.
I am the author of Bucket-Based Expiration Algorithm: Improving Eviction Efficiency for In-Memory Key-Value Database and today I find this project :) Really nice implementation in Go!
Out of curiosity, I wonder how you found this paper. Do you have a benchmark for expiration? And is there any area where I could contribute to this project? (I have >5 years of experience with Go.)
Hi, is there any way to get an array of all keys?
And if not, do you have any intention to add this feature?
I need to get the list of keys that exist in the cache at a certain point in my program where I don't have the keys.
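As far as I can tell there is no Keys() method today, but Range can be used to collect a snapshot of the keys; this is my workaround, not an official API:

package cacheutil

import "github.com/maypok86/otter"

// keysOf returns a point-in-time snapshot of the keys currently in the cache.
// Entries may be added or evicted concurrently, so the result is only a snapshot.
func keysOf[K comparable, V any](c otter.Cache[K, V]) []K {
	keys := make([]K, 0)
	c.Range(func(key K, value V) bool {
		keys = append(keys, key)
		return true
	})
	return keys
}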
Curious about the lock contention and performance on multicore systems.
The original S3-FIFO paper measured up to 16 cores only.
Just random thoughts about new contracts...
e.g. .HitCount() returns the number of times the key has been hit, etc. Would it be possible to make it return both the value and the hit count at the same time?
e.g. value, hitcount, err := cache.GetWithHitCount("key")
This would be extremely useful for checking whether the corresponding data should be saved in a disk cache, etc.
Currently, Otter uses ring buffer based queues, but this type of queue does not allow easy deletion of an arbitrary element, and we need it because of the expiration policy and arbitrary calls to the Delete function by the user.
Queues based on a doubly linked list can solve this problem, but they require 16 extra bytes for each key-value pair (prev and next pointers) and put additional pressure on the GC. So we would like to come up with a workaround.
The suggested workaround is pretty simple: let's store in each key-value pair an index to the location where it is stored in the buffer. For example, like this:
type Node[K comparable, V any] struct {
	...
	policyIndex uint32
}
And when deleting a node, just put nil in the right cell (a sketch of this follows below). Such a solution additionally costs 12 bytes (a pointer to the node plus the index). The only problem is that we need to maintain the indexes in the nodes when expanding the buffer. Also, after deleting a node we still occupy space for its cell, and when adding new nodes the buffer will have to be expanded to twice the size, even though half of those cells go unused.
That's why we want to understand which type of queue suits us better and implement it.
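For illustration, a minimal sketch of the index-based workaround described above, ignoring buffer growth and index maintenance; the types and names are mine, not otter's:

// entry is a stand-in for otter's node; it remembers where it sits in the queue.
type entry struct {
	key         string
	policyIndex int
}

// fifo is a slice-backed queue that deletes arbitrary entries in O(1)
// by nil-ing out their cell instead of shifting elements.
type fifo struct {
	buf  []*entry
	head int // index of the oldest cell that has not been popped yet
}

func (q *fifo) push(e *entry) {
	e.policyIndex = len(q.buf)
	q.buf = append(q.buf, e)
}

// remove clears the entry's cell; the cell itself stays occupied,
// which is exactly the space trade-off discussed above.
func (q *fifo) remove(e *entry) {
	q.buf[e.policyIndex] = nil
}

// pop returns the oldest live entry, skipping cells cleared by remove.
func (q *fifo) pop() *entry {
	for q.head < len(q.buf) {
		e := q.buf[q.head]
		q.head++
		if e != nil {
			return e
		}
	}
	return nil
}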
Right now otter uses a static node configuration, which wastes memory in most cases, so it is worth trying code generation to reduce memory consumption. It would also allow using a plain queue for the expiration policy with a constant TTL and a timing wheel for variable TTL.
I need stats for
These stats would give me a chance to tune the cache configuration and to monitor that the cache is operating fine.
Unfortunately, at the moment otter's memory consumption at small cache sizes is quite large compared to other caches. This is caused by the buffers that otter uses: it creates them too large without knowing the characteristics of the workload in advance. Therefore, it is proposed to implement a growable write buffer that can grow from a minimum capacity to a maximum capacity.
Right now otter uses quite a few extra bytes per node (at least 48).
I have an idea how to easily reduce this to 40 bytes.
Right now the expiry time is stored as the number of seconds since January 1, 1970 UTC in a uint64 and takes 8 bytes. But we could store only the time difference between the cache start and the node's expiry time, which fits in a uint32. I doubt a program using otter will ever run continuously for 4294967295 seconds, or 136+ years. 🙂
Also, otter stores a mutex per node, which takes 8 bytes, but it seems we only need a simple spinlock, which takes 4 bytes.
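A rough sketch of the uint32 expiry idea; this is my illustration, and the names are hypothetical rather than otter's internals:

package nodetime

import "time"

// cacheStart is captured once when the cache is created.
var cacheStart = time.Now()

// expiresAt converts a TTL into a uint32 offset in seconds from cacheStart:
// 4 bytes instead of the 8 needed for an absolute unix timestamp.
func expiresAt(ttl time.Duration) uint32 {
	offset := time.Since(cacheStart) + ttl
	return uint32(offset / time.Second) // fits in uint32 for ~136 years of uptime
}

// isExpired compares the stored offset against the current offset.
func isExpired(expiry uint32) bool {
	return uint32(time.Since(cacheStart)/time.Second) >= expiry
}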
Add SetWithTTL that automatically uses a permanent TTL when the TTL is 0. People usually like to use a TTL of 0 or -1 to represent a permanent TTL; for example, in Ristretto a TTL of 0 is permanent, and in Redis a TTL of -1 is permanent.
If the ttl passed to the SetWithTTL function is less than or equal to 0, it should automatically be treated as a permanent TTL. This is also what Ristretto and Redis do.
I have initiated a merge request #38
bigcache and fastcache, a comparison with which does not look fair. Now we need to select those implementations.
If my understanding is correct, the current implementation has a bug that may evict more objects than needed. I believe it should return at this line instead of continuing to evict, because an object has already been evicted from the main queue: https://github.com/maypok86/otter/blob/main/internal/s3fifo/small.go#L62
As you have mentioned (Yiling-J/theine-go#29), it is better to store a fingerprint instead of the full object in the ghost. Storing data in the ghost defeats the purpose of using a ghost queue...
The size of the ghost queue does not need to be accurate, so you can estimate it as the total cache size divided by the mean object size. VMware found that the ghost queue size has little impact between 40% and 200% of the main queue size (in terms of the number of objects).
There are also multiple if statements that seem unnecessary, e.g., https://github.com/maypok86/otter/blob/main/internal/s3fifo/small.go#L55
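To make the fingerprint suggestion concrete, a minimal sketch of a ghost that stores only 8-byte hashes of keys (xxhash is used purely as an example; this is not otter's actual ghost implementation):

package s3fifo

import "github.com/cespare/xxhash/v2"

// ghost keeps a fixed-size FIFO history of key fingerprints, enough to answer
// "was this key evicted recently?" without retaining the evicted objects.
type ghost struct {
	maxSize int
	queue   []uint64            // fingerprints in insertion order
	set     map[uint64]struct{} // fast membership check
}

func newGhost(maxSize int) *ghost {
	return &ghost{maxSize: maxSize, set: make(map[uint64]struct{}, maxSize)}
}

func (g *ghost) add(key string) {
	fp := xxhash.Sum64String(key)
	if _, ok := g.set[fp]; ok {
		return // already tracked; keep the sketch simple
	}
	if len(g.queue) >= g.maxSize { // drop the oldest fingerprint first
		oldest := g.queue[0]
		g.queue = g.queue[1:]
		delete(g.set, oldest)
	}
	g.queue = append(g.queue, fp)
	g.set[fp] = struct{}{}
}

func (g *ghost) contains(key string) bool {
	_, ok := g.set[xxhash.Sum64String(key)]
	return ok
}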
Every node has a cost value. When adding a node, otter compares the node's cost with policy.MaxAvailableCost(); if the cost is larger than the latter, otter rejects the node.
I don't fully understand how the cost affects the S3-FIFO policy. As far as I know, in S3-FIFO the length of the queues is limited by cost, and a node with a large cost occupies more queue space, but the cost doesn't affect the eviction order.
So what does the cost do for otter? Will it improve performance or hit rate in some cases?
A cache system needs statistics to optimize its performance and improve the efficiency of data retrieval. Statistics provide valuable information about the access patterns and usage of data, which can be used to make informed decisions regarding caching strategies and resource allocation. Here are a few reasons why statistics are important for a cache system:
Access patterns: Statistics help identify the most frequently accessed data items or resources. By analyzing access patterns, the cache system can prioritize caching those items that are accessed more frequently, reducing the cache miss rate and improving overall performance.
Cache hit rate: statistics allow the cache system to measure the cache hit rate, which is the percentage of requests that are successfully served from the cache.
Added a fast statistics package covered by tests.
Is there any possibility of a disk-backed version? Async or otherwise, with an in-memory cache or without?
# Clarification and motivation
Set and SetWithTTL should return false if setting the key exceeds the maximum capacity of NewCache and therefore fails.
# Acceptance criteria
For example, if I use NewCache to set the maximum capacity to 3 and then set 4 keys, the first three keys should return true because they do not exceed the maximum capacity limit, and the fourth key should return false because it exceeds the maximum capacity limit.
The necessary expiration algorithm for a cache determines when and how items in the cache should be evicted or removed. The goal is to remove items that are no longer needed or have expired in order to make space for new items.
An expiry algorithm for variable TTL is implemented.
Now otter uses an API very similar to other Go cache libraries, but unfortunately it is not particularly beautiful or extensible.
For example, right now we have Set(key, value) and SetWithTTL(key, value, ttl) functions, but if we want to add a SetIfAbsent(key, value) function, we also need a SetIfAbsentWithTTL(key, value, ttl) function, which no longer looks nice. Especially since, when we create the cache in the builder, we already know its further behavior and whether it will need a TTL or not.
We would like to create a cache with a clean API, containing only the functions it needs, using parameters in the builder. This would allow providing an API such as Set(key, value, ttl) and SetIfAbsent(key, value, ttl). Also, since in most cases the TTL is the same for all entries in the cache, we don't need to pass it every time at all, further simplifying the API. This will also allow optimizations in the expiration policy, since with a single TTL we only need to maintain a simple FIFO queue of items and check their expiration one by one (see the usage sketch below).
We need to implement a basic version of the cache with ttl support in the api
Basic version of the cache is implemented
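A usage sketch of how the builder-driven API could look from the caller's side; it reuses the builder calls that appear elsewhere in these issues, but the exact method set is an assumption:

package main

import (
	"time"

	"github.com/maypok86/otter"
)

func main() {
	// Constant TTL: configured once in the builder, so Set stays two-argument.
	constCache, err := otter.MustBuilder[string, string](10_000).
		WithTTL(time.Hour).
		Build()
	if err != nil {
		panic(err)
	}
	constCache.Set("k", "v") // expires one hour after insertion

	// Variable TTL: opted into explicitly, and only then does Set take a ttl.
	varCache, err := otter.MustBuilder[string, string](10_000).
		WithVariableTTL().
		Build()
	if err != nil {
		panic(err)
	}
	varCache.Set("k", "v", 5*time.Minute)
}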
My go-to cache library for Go so far has been bluele/gcache. It is probably much less performant than many other alternatives, but it has a very handy helper function that coordinates loads to avoid a thundering herd.
https://pkg.go.dev/github.com/bluele/gcache#readme-loading-cache
The Builder could have a new WithLoaderFunc(func(K) V).
Calling Get(key_not_loaded_yet) from multiple goroutines concurrently would call the optional LoaderFunc once and then return to all goroutines at once.
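Until something like WithLoaderFunc exists in the builder, a thin wrapper over golang.org/x/sync/singleflight gives the same thundering-herd protection. A sketch of that workaround; the type and names are mine:

package loader

import (
	"golang.org/x/sync/singleflight"

	"github.com/maypok86/otter"
)

// LoadingCache coalesces concurrent loads of the same missing key so the
// loader runs once, mirroring the gcache LoaderFunc behaviour described above.
type LoadingCache[V any] struct {
	cache  otter.Cache[string, V]
	group  singleflight.Group
	loader func(key string) (V, error)
}

func NewLoadingCache[V any](cache otter.Cache[string, V], loader func(string) (V, error)) *LoadingCache[V] {
	return &LoadingCache[V]{cache: cache, loader: loader}
}

func (lc *LoadingCache[V]) Get(key string) (V, error) {
	if v, ok := lc.cache.Get(key); ok {
		return v, nil
	}
	// All goroutines missing on the same key wait for a single loader call.
	v, err, _ := lc.group.Do(key, func() (any, error) {
		loaded, err := lc.loader(key)
		if err != nil {
			return nil, err
		}
		lc.cache.Set(key, loaded)
		return loaded, nil
	})
	if err != nil {
		var zero V
		return zero, err
	}
	return v.(V), nil
}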
otter/internal/hashtable/map.go
Line 5 in 9d61c67
Here a growThreshold is defined: if the total number of nodes in the map table is above this value, the table needs to be resized.
growThreshold is defined as growThreshold := float64(tableLen) * bucketSize * loadFactor, where tableLen (len(t.buckets)) is the number of bucket chains (t.buckets[i] is a chain), not the total number of nodes, so we can't compare the chain count with the node count like this:
if t.sumSize() > int64(growThreshold)
Am I right?
Is this an LRU cache? Can you make it LRU without the TTL feature?
I tried to use Otter in FrankenPHP and it made our test suite crash when using the race detector with the following error:
runtime: marked free object in span 0x7f1032cb5358, elemsize=48 freeindex=4 (bad use of unsafe.Pointer? try -d=checkptr)
It looks like it crashes when calling internal/core.NewCache().
Here is the PR: dunglas/frankenphp#540
And the specific patch to switch to Otter (that creates the crash): dunglas/frankenphp@3dfc3e2 (#540)
No crash :)
FrankenPHP relies a lot on cgo (but not where otter is used).
runtime: marked free object in span 0x7f1032cb5358, elemsize=48 freeindex=4 (bad use of unsafe.Pointer? try -d=checkptr)
0xc000210000 alloc marked
0xc000210030 alloc marked
0xc000210060 alloc unmarked
0xc000210090 alloc marked
0xc0002100c0 alloc marked
0xc0002100f0 alloc unmarked
0xc000210120 alloc marked
0xc000210150 free unmarked
0xc000210180 free unmarked
0xc0002101b0 free unmarked
0xc0002101e0 free unmarked
0xc000210210 alloc unmarked
0xc000210240 free unmarked
0xc000210270 free unmarked
0xc0002102a0 free marked zombie
0x000000c0002102a0: 0x746163696c707061 0x77772d782f6e6f69
0x000000c0002102b0: 0x752d6d726f662d77 0x65646f636e656c72
0x000000c0002102c0: 0x0000000000000064 0x0000000000000000
0xc0002102d0 alloc marked
0xc000210300 alloc unmarked
0xc000210330 alloc unmarked
0xc000210360 free unmarked
0xc000210390 alloc unmarked
0xc0002103c0 alloc marked
0xc0002103f0 free unmarked
0xc000210420 alloc unmarked
0xc000210450 alloc marked
0xc000210480 alloc marked
github.com/dunglas/frankenphp_test.runTest.func1({0xba4770, 0xc0002a4d40}, 0xc00067b200)
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:72 +0x127 fp=0xc000249d18 sp=0xc000249c88 pc=0x9c2767
github.com/dunglas/frankenphp_test.TestPostSuperGlobals_worker.testPostSuperGlobals.func1(0xc0004ca3e0, 0x0?, 0x63)
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:281 +0x44e fp=0xc000249f70 sp=0xc000249d18 pc=0x9c8eee
github.com/dunglas/frankenphp_test.runTest.func2(0x63)
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:86 +0x6e fp=0xc000249fb8 sp=0xc000249f70 pc=0x9c25ae
github.com/dunglas/frankenphp_test.runTest.gowrap2()
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:88 +0x42 fp=0xc000249fe0 sp=0xc000249fb8 pc=0x9c2502
runtime.goexit({})
/opt/hostedtoolcache/go/1.22.0/x64/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000249fe8 sp=0xc000249fe0 pc=0x483441
created by github.com/dunglas/frankenphp_test.runTest in goroutine 793
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:85 +0x9fb
goroutine 1716 gp=0xc000203340 m=nil [select]:
runtime.gopark(0xb0c940, 0x0, 0x9, 0x3, 0x1)
/opt/hostedtoolcache/go/1.22.0/x64/src/runtime/proc.go:402 +0x136 fp=0xc00022d9d0 sp=0xc00022d9a0 pc=0x447776
runtime.selectgo(0xc00022dc30, 0xc00022dbb0, 0xc00022dc50?, 0x1, 0x12?, 0x1)
/opt/hostedtoolcache/go/1.22.0/x64/src/runtime/select.go:327 +0xac6 fp=0xc00022db80 sp=0xc00022d9d0 pc=0x45b5c6
github.com/dunglas/frankenphp.ServeHTTP({0xba4770, 0xc0002a43c0}, 0xc00067a120)
/home/runner/work/frankenphp/frankenphp/frankenphp.go:462 +0x325 fp=0xc00022dc88 sp=0xc00022db80 pc=0x926de5
github.com/dunglas/frankenphp_test.runTest.func1({0xba4770, 0xc0002a43c0}, 0xc00067a000)
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:72 +0x127 fp=0xc00022dd18 sp=0xc00022dc88 pc=0x9c2767
github.com/dunglas/frankenphp_test.TestPostSuperGlobals_worker.testPostSuperGlobals.func1(0xc0004ca3e0, 0x4d45545f52454e4e?, 0x5b)
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:281 +0x44e fp=0xc00022df70 sp=0xc00022dd18 pc=0x9c8eee
github.com/dunglas/frankenphp_test.runTest.func2(0x5b)
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:86 +0x6e fp=0xc00022dfb8 sp=0xc00022df70 pc=0x9c25ae
github.com/dunglas/frankenphp_test.runTest.gowrap2()
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:88 +0x42 fp=0xc00022dfe0 sp=0xc00022dfb8 pc=0x9c2502
runtime.goexit({})
/opt/hostedtoolcache/go/1.22.0/x64/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00022dfe8 sp=0xc00022dfe0 pc=0x483441
created by github.com/dunglas/frankenphp_test.runTest in goroutine 793
/home/runner/work/frankenphp/frankenphp/frankenphp_test.go:85 +0x9fb
Thank you for this library @maypok86
Creating a cache leads to one core being utilized at 100%. Creating multiple caches bumps this by one core at a time.
Steps to reproduce the behavior:
package main

import "github.com/maypok86/otter"

// We assign this so that we're sure it doesn't get garbage collected
var Cache *otter.Cache[string, string]

func main() {
	b, err := otter.NewBuilder[string, string](1_000)
	if err != nil {
		panic(err)
	}
	c, err := b.Build()
	if err != nil {
		panic(err)
	}
	Cache = c

	// Block forever
	select {}
}
CPU usage is negligible.
I believe the issue lies in the goroutine spawned here https://github.com/maypok86/otter/blob/main/cache.go#L74
I have a script that updates some variables, and users get them from the cache. But I want to get the expiration in order to show the last update time.
Get the expiration time with the Get function, or via a new function that returns the expiration time.
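If there is no accessor for an entry's expiration, a workaround is to store the update time alongside the value; a small sketch with names of my own:

package main

import (
	"fmt"
	"time"

	"github.com/maypok86/otter"
)

// timestamped wraps the real value with the moment it was written, so readers
// can show a "last updated" time without any new cache API.
type timestamped[V any] struct {
	Value     V
	UpdatedAt time.Time
}

func main() {
	cache, err := otter.MustBuilder[string, timestamped[string]](1_000).Build()
	if err != nil {
		panic(err)
	}
	cache.Set("config", timestamped[string]{Value: "v1", UpdatedAt: time.Now()})

	if entry, ok := cache.Get("config"); ok {
		fmt.Println(entry.Value, "last updated at", entry.UpdatedAt)
	}
}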
Hi @maypok86
Please update the link "BP-Wrapper: A Framework Making Any Replacement Algorithms (Almost) Lock Contention Free" (http://web.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-09-1.pdf).
I found another link from your reply Yiling-J/theine-go#29 (comment).
And perhaps you can add the following links about s3fifo:
https://blog.jasony.me/system/cache/2023/08/01/s3fifo
https://blog.jasony.me/system/cache/2023/12/28/fifo
Thanks for the development of this project, I believe there will be a stable version for production use soon ;)
otter completes at 26ns while phuslu completes at 29ns
./cachemain otter 10000000
otter 10000000 1216 MB 1600 MB 1431 MB
./cachemain nottl 10000000
nottl 10000000 355 MB 433 MB 438 MB
// memusage.go
package main
import (
"fmt"
"os"
"runtime"
"time"
"strconv"
theine "github.com/Yiling-J/theine-go"
"github.com/cespare/xxhash/v2"
cloudflare "github.com/cloudflare/golibs/lrucache"
ristretto "github.com/dgraph-io/ristretto"
freelru "github.com/elastic/go-freelru"
hashicorp "github.com/hashicorp/golang-lru/v2/expirable"
ccache "github.com/karlseguin/ccache/v3"
lxzan "github.com/lxzan/memorycache"
otter "github.com/maypok86/otter"
ecache "github.com/orca-zhang/ecache"
phuslu "github.com/phuslu/lru"
)
const keysize = 16
var keys []string
func main() {
name := os.Args[1]
cachesize, _ := strconv.Atoi(os.Args[2])
keys = make([]string, cachesize)
for i := 0; i < cachesize; i++ {
keys[i] = fmt.Sprintf(fmt.Sprintf("%%0%dd", keysize), i)
}
var o runtime.MemStats
runtime.ReadMemStats(&o)
map[string]func(int){
"nottl": SetupNottl,
//"phuslu": SetupPhuslu,
"freelru": SetupFreelru,
"ristretto": SetupRistretto,
"otter": SetupOtter,
"lxzan": SetupLxzan,
"ecache": SetupEcache,
"cloudflare": SetupCloudflare,
"ccache": SetupCcache,
"hashicorp": SetupHashicorp,
"theine": SetupTheine,
}[name](cachesize)
var m runtime.MemStats
runtime.ReadMemStats(&m)
fmt.Printf("%s\t%d\t%v MB\t%v MB\t%v MB\n",
name,
cachesize,
(m.Alloc-o.Alloc)/1048576,
(m.TotalAlloc-o.TotalAlloc)/1048576,
(m.Sys-o.Sys)/1048576,
)
}
func SetupNottl(cachesize int) {
cache := phuslu.New[string, int](cachesize)
for i := 0; i < cachesize; i++ {
cache.Set(keys[i], i)
}
}
/*
func SetupPhuslu(cachesize int) {
cache := phuslu.NewTTLCache[string, int](cachesize)
for i := 0; i < cachesize; i++ {
cache.Set(keys[i], i, time.Hour)
}
}
*/
func SetupFreelru(cachesize int) {
cache, _ := freelru.NewSharded[string, int](uint32(cachesize), func(s string) uint32 { return uint32(xxhash.Sum64String(s)) })
for i := 0; i < cachesize; i++ {
cache.AddWithLifetime(keys[i], i, time.Hour)
}
}
func SetupOtter(cachesize int) {
//cache, _ := otter.MustBuilder[string, int](cachesize).WithVariableTTL().Build()
cache, _ := otter.MustBuilder[string, int](cachesize).Build()
for i := 0; i < cachesize; i++ {
cache.Set(keys[i], i)
}
}
func SetupEcache(cachesize int) {
cache := ecache.NewLRUCache(1024, uint16(cachesize/1024), time.Hour)
for i := 0; i < cachesize; i++ {
cache.Put(keys[i], i)
}
}
func SetupRistretto(cachesize int) {
cache, _ := ristretto.NewCache(&ristretto.Config{
NumCounters: int64(10 * cachesize), // number of keys to track frequency of (10M).
MaxCost: int64(cachesize), // maximum cost of cache (1M).
BufferItems: 64, // number of keys per Get buffer.
})
for i := 0; i < cachesize; i++ {
cache.SetWithTTL(keys[i], i, 1, time.Hour)
}
}
func SetupLxzan(cachesize int) {
cache := lxzan.New[string, int](
lxzan.WithBucketNum(128),
lxzan.WithBucketSize(cachesize/128, cachesize/128),
lxzan.WithInterval(time.Hour, time.Hour),
)
for i := 0; i < cachesize; i++ {
cache.Set(keys[i], i, time.Hour)
}
}
func SetupTheine(cachesize int) {
cache, _ := theine.NewBuilder[string, int](int64(cachesize)).Build()
for i := 0; i < cachesize; i++ {
cache.SetWithTTL(keys[i], i, 1, time.Hour)
}
}
func SetupCloudflare(cachesize int) {
cache := cloudflare.NewMultiLRUCache(1024, uint(cachesize/1024))
for i := 0; i < cachesize; i++ {
cache.Set(keys[i], i, time.Now().Add(time.Hour))
}
}
func SetupCcache(cachesize int) {
cache := ccache.New(ccache.Configure[int]().MaxSize(int64(cachesize)).ItemsToPrune(100))
for i := 0; i < cachesize; i++ {
cache.Set(keys[i], i, time.Hour)
}
}
func SetupHashicorp(cachesize int) {
cache := hashicorp.NewLRU[string, int](cachesize, nil, time.Hour)
for i := 0; i < cachesize; i++ {
cache.Add(keys[i], i)
}
}
Hi @maypok86,
First of all, thank you for making this amazing library. I didn't know about S3-FIFO until now. I learned a lot from your work.
This issue is not exactly a feature request, because I guess setting the capacity in bytes is almost impossible given my understanding of S3-FIFO, which requires the maximum entry count to be set beforehand.
However, setting the maximum entry count up front is hard when caching data of different sizes, which I believe covers most cases. Therefore I think it would be better for users to be able to set the capacity in bytes.
I just want to know if you have any idea about this topic or alternatives. Thanks!
Since otter supports cost-based eviction, we cannot pre-initialize internal structures, but in reality this feature is rarely used, and users may not be happy that otter allocates more memory than necessary. So we would like to add an InitialCapacity function with which users could preallocate otter's internal data structures.
The usage is envisioned like this
c, err := otter.MustBuilder[int, int](size).
InitialCapacity(size).
WithTTL(time.Second).
Build()
The InitialCapacity function is added to the builder.
All implementations of caches in Go that try to be contention-free (ristretto, theine, otter, ccache) currently have problems maintaining correct operation in the eventual consistency mode.
Delete and Add operations sent to the eviction policy will be applied in the wrong order, and as a result the eviction policy will be corrupted by a zombie entry. This problem exists in all the mentioned caches. We need to get rid of this problem in otter. Most likely, the best solution is to use a finite state machine with two states, live and dead.
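A minimal sketch of the kind of two-state machine meant here (my illustration, not otter's actual node layout): the node carries an atomic state, Delete transitions it to dead, and the policy simply ignores dead nodes instead of resurrecting them.

package node

import "sync/atomic"

const (
	alive uint32 = iota
	dead
)

// Node carries an atomic life-cycle state so that a Delete that races with an
// Add cannot leave a "zombie" entry inside the eviction policy.
type Node struct {
	state uint32
}

// Die marks the node dead; it returns false if the node was already dead.
func (n *Node) Die() bool {
	return atomic.CompareAndSwapUint32(&n.state, alive, dead)
}

// IsAlive is checked by the eviction policy before (re)linking the node,
// so an out-of-order Add after Delete is simply dropped.
func (n *Node) IsAlive() bool {
	return atomic.LoadUint32(&n.state) == alive
}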
The eviction policy for a cache determines which items should be evicted (removed) from the cache when it becomes full and a new item needs to be added. The goal of an eviction policy is to maximize cache hit rate and minimize cache misses by making informed decisions about which items to remove.
We will use an eviction policy called S3-FIFO, based on the paper described at https://blog.jasony.me/system/cache/2023/08/01/s3fifo
I'll add a basic implementation, but without tests and possibly with bugs for now.
I really like otter, but what I do miss (in comparison to ristretto, for example) is something similar to a tag option. I sometimes need to invalidate a part of the cache, and an easy option (in my opinion at least) to implement this would be to add a function that deletes a part of the cache based on a predicate function.
It would go something like this:
// DeleteByFunc removes the association for this key from the cache if the selectFunc returns true.
func (bs baseCache[K, V]) DeleteByFunc(selectFunc func(key K, value V) bool) {
	// aggregate all keys to delete
	keysToDelete := make([]K, 0)
	bs.Range(func(key K, value V) bool {
		if selectFunc(key, value) {
			keysToDelete = append(keysToDelete, key)
		}
		return true
	})
	for _, key := range keysToDelete {
		bs.Delete(key)
	}
}
You probably need a mutex, but the general idea should be clear.
Any additional ideas are appreciated.
Set the capacity to 10, but the cache can store more than 10 elements.
Steps to reproduce the behavior:
cache, err := otter.MustBuilder[int, int](10). // set capacity to 10
	Cost(func(key int, value int) uint32 { return 1 }).
	WithTTL(time.Second).
	Build()
if err != nil {
	panic(err)
}

// put 20 (>10) items into the cache
for i := 0; i < 20; i++ {
	cache.Set(i, i)
}

cntCacheMiss := 0
for i := 0; i < 20; i++ {
	value, ok := cache.Get(i)
	fmt.Println(value)
	if !ok {
		cntCacheMiss++
	}
}
assert.Equal(t, 10, cntCacheMiss) // fails
The test should pass.
github.com/maypok86/otter v1.2.1
Hi @maypok86 :)
What do you think about sieve ?)
https://junchengyang.com/publication/nsdi24-SIEVE.pdf
I found a repo by scalalang2.
He released a SIEVE implementation and a comparison with otter in a benchmark repo:
https://github.com/scalalang2/go-cache-benchmark (strange SIEVE behavior with concurrency > 1; needs retesting with the current version of otter)
And there is an implementation in https://github.com/1a1a11a/libCacheSim
We need benchmarks to compare the performance of our cache with the other caches available in Go.
Implemented basic benchmarks for this project
Before adding a node to Variable, it needs to call findBucket to find a location for the node.
func (v *Variable[K, V]) findBucket(expiration uint32) node.Node[K, V] {
	duration := expiration - v.time
	length := len(v.wheel) - 1
	for i := 0; i < length; i++ {
		if duration < spans[i+1] {
			ticks := expiration >> shift[i]
			index := ticks & (buckets[i] - 1)
			return v.wheel[i][index]
		}
	}
	return v.wheel[length][0]
}
My question is: why use ticks := expiration >> shift[i] (and not ticks := duration >> shift[i]) to calculate the location in v.wheel[i]?
I think using ticks := duration >> shift[i] is more reasonable, because it represents the distance (or time difference) from the available data location (v.time >> shift[i]). expiration is a random value (unixtime.Now() + ttl), so it can be anywhere in v.wheel[i], which conflicts with RemoveExpired.
This is an awesome lib. But a callback function for evicted items, or a RemovalListener, would be very useful for maintaining or freeing the memory of item values.
Thanks.
Currently otter uses spinlocks to synchronize nodes while getting values or updating information in nodes. This solution blocks reads of values from the same node and slows down reads a lot. Unfortunately, we can't use sync.RWMutex because of the xadd operations inside it; and since the amount of work under the locks is very small, we would get the same speed but waste more memory than with a spinlock. There are several solutions to this problem:
Use atomic.Value to store values. This solution leads to fantastic speed of reads/updates, but requires an additional 16 bytes per node to store an empty interface and an unnecessary allocation for each value. [...] the -race flag was not passed, but it is still not very nice.
func (m *Map[K, V]) Delete(key K) node.Node[K, V] and func (m *Map[K, V]) Set(n node.Node[K, V]) node.Node[K, V] can run concurrently: when Delete runs into this line and Set runs into this loop, there is a chance that Set puts the node into an invalid (shrunk) table.
I think this may be acceptable in this situation.
And here is another question: in the hash map's set there is a check, if m.newerTableExists(t). Is it used to lower the probability of setting into an invalid table? Because when multiple sets run concurrently, a new table may still be created even after this if m.newerTableExists(t) check.
Thanks!
In internal/lossy/buffer.go, I think func (b *Buffer[K, V]) Add(n node.Node[K, V]) *PolicyBuffers[K, V] and func (b *Buffer[K, V]) Free() can be called at the same time, so when Add runs (pb.Returned = append(pb.Returned, b.nodeManager.FromPointer(v))) and Free() is called, they may concurrently access the same slice, because there is no lock or atomic.CompareAndSwapPointer.