Comments (10)

bartelink commented:

Thanks!

Have not felt the need to generalize in this direction but can appreciate your desire to have only one set of cache stuff to reason about.

One thing to bear in mind is that a distributed cache is questionable: SRCMC holds object refs and does not require serialization. A distributed cache in front of GES is pretty pointless, as GES already has good caching built in, and a distributed cache in front of Cosmos will likely cost as much in management overhead as it saves in RU cost.

Assuming it doesn't cause too much mess and achieves something worthwhile, I would take a PR to make the wiring-in of the cache provider pluggable via an optional arg on the resolver etc. (I'd prefer not to have two concrete impls in the actual EventStore.fs and Cosmos.fs files though; if anything, it'd be nice to define an interface in a shared place and then have your impl and the stock one). At present there are small variances in the key strategy between Cosmos and ES; the code has enough tests to make doing a spike feasible, but I won't suggest it's going to be trivial to do.
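
To make that concrete, here is a rough sketch of what such pluggable wiring could look like; the ICacheProvider/Resolver shapes below are purely illustrative, not the actual Equinox surface:

open System.Collections.Concurrent

// Illustrative shared contract; per the above, the real one would live in a shared place
type ICacheProvider =
    abstract member TryGet: key: string -> obj option
    abstract member UpdateIfNewer: key: string * entry: obj -> unit

// Stand-in for the stock in-memory implementation
type DictionaryCache() =
    let entries = ConcurrentDictionary<string, obj>()
    interface ICacheProvider with
        member __.TryGet key = match entries.TryGetValue key with | true, v -> Some v | _ -> None
        member __.UpdateIfNewer(key, entry) = entries.[key] <- entry

// The resolver accepts the provider as an optional arg, defaulting to the stock impl
type Resolver(?cache: ICacheProvider) =
    let cache = defaultArg cache (DictionaryCache() :> ICacheProvider)
    member __.Cache = cache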

Can you explain a little about how your MemoryCache operates - does it hold a ref to an object and self-prune like SRCMC?

DSilence commented:

I do agree that the benefits of distributed caching are very questionable (and you have likely investigated it way more than I have). That said, I believe building a common abstraction may be handy as more providers are added.

As for the MemoryCache, overall it works in a similar manner to SRCMC:

bartelink commented:

I wouldn't say we've investigated it as such; the penalty of an extra roundtrip in terms of latency etc., and having to care for and feed a server and/or its state, are the main reasons why it's not been done.
(edited to add: an interesting paper on distributed caches: "Fast key-value stores: An idea whose time has come and gone")

The primary reason of all, of course, is that doing it in memory means that blue/green deploys are not relevant (the schema does not require versioning), and there is no burden of the fold state requiring a codec.

A common abstraction (likely an interface) would clarify the interconnect between the Resolver and the Store, so it would likely be useful regardless.

While the perf counters in SRCMC are nice, they're definitely not a decider given the importance of netcore. Similarly, while there are some funky bits in the API, that's not a major concern (I'm not aware of any use cases where making adjustments at runtime and/or varying much beyond the max allocation would make sense).

The largest stumbling block, however, seems to be computing the "cost": the GC-derived heuristic in SRCMC fits the need perfectly, in that we're dealing with a third-party object 'tree'. Being able to rely on an upper limit to the memory consumption (not risking OOM) is pretty key in typical runtime environments.
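
For reference, that upper limit is what SRCMC's cacheMemoryLimitMegabytes setting provides; a minimal configuration sketch (the 50 MB figure is arbitrary):

open System.Collections.Specialized
open System.Runtime.Caching

// SRCMC enforces an upper bound on memory consumption, trimming entries
// based on its GC-derived size estimate as the ceiling is approached
let config = NameValueCollection()
config.Add("cacheMemoryLimitMegabytes", "50")
let cache = new MemoryCache("folded-state", config)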

bartelink commented:

@DSilence Did you get any time to investigate or think further on this? If there are to be any minor changes to the contracts (I can't picture any but...), it'd be nice to get them into the 2.0 final release...

DSilence commented:

Hey @bartelink. I did some initial prototyping by introducing the interface with the following signature:

type CacheItemOptions =
    | AbsoluteExpiration of DateTimeOffset
    | RelativeExpiration of TimeSpan

type ICache =
    abstract member UpdateIfNewer<'entry> : policy: CacheItemOptions -> key: string -> entry: 'entry -> Async<unit>
    abstract member TryGet<'entry> : key: string -> Async<'entry option>

The breaking change here is the use of Async, to enable scenarios beyond an in-memory implementation (a sketch of a possible in-memory implementation follows the questions below).

  1. What is your opinion on this?
  2. I didn't have a lot of time to properly evaluate and test this, and it does seem like a breaking change, so if you agree with the approach I'd keep it out of the 2.0 final release.
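
For illustration, a minimal in-memory implementation of the interface above over System.Runtime.Caching.MemoryCache might look like the following; note that the version-comparison aspect of UpdateIfNewer is elided (this sketch just stores unconditionally):

open System
open System.Runtime.Caching

type MemoryCacheAdapter(inner: MemoryCache) =
    let toPolicy = function
        | AbsoluteExpiration dto -> CacheItemPolicy(AbsoluteExpiration = dto)
        | RelativeExpiration ts -> CacheItemPolicy(SlidingExpiration = ts)
    interface ICache with
        member __.UpdateIfNewer<'entry> policy key (entry: 'entry) = async {
            // NB a real impl would compare versions rather than overwrite blindly
            inner.Set(key, box entry, toPolicy policy) }
        member __.TryGet<'entry> key = async {
            return match inner.Get key with null -> None | v -> Some (unbox<'entry> v) }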

bartelink commented:

Assuming the actual interface works and doesn't make a mess (it looks fine), I won't lose sleep over the cost of the Async (or the builder could internally wrap it in a DU, e.g. type Cache = AsyncCache of ICache | NotAsync of Cache) - it may well be possible to address it by adding a builder overload etc.
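
Spelling that out as a sketch (SyncCache below is a hypothetical stand-in for the existing synchronous cache type):

open System.Collections.Concurrent

// Stand-in for the existing synchronous cache
type SyncCache() =
    let entries = ConcurrentDictionary<string, obj>()
    member __.TryGet key = match entries.TryGetValue key with | true, v -> Some v | _ -> None

type Cache =
    | AsyncCache of ICache
    | NotAsync of SyncCache

// Store internals can pattern-match and special-case the sync provider
// (shown uniformly Async here for brevity; a builder overload could expose a sync path)
let tryGet<'entry> cache key : Async<'entry option> =
    match cache with
    | AsyncCache c -> c.TryGet<'entry> key
    | NotAsync c -> async { return c.TryGet key |> Option.map unbox<'entry> }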

Have you considered how you're going to compute a weight for an arbitrary folded-state value given the interface you described? Will you instead do a max item count or something else? The reality of how that's addressed will have big effects on the viability on your side. And an idea from left field: what are the chances of migrating your other usages to MemoryCache instead - do you e.g. use the Distributed Cache?

I'm happy to keep the issue open and let you drill in at a point that works for you. I've tagged some final naming-tweak items as 2.0, but there is no actual date set, so who knows: you might have reached a point before then.

DSilence commented:

Regarding the perf impact - I actually did the numbers. This is for System.Runtime.Caching.MemoryCache.
Code:

open System
open System.Collections.Specialized
open System.Runtime.Caching
open BenchmarkDotNet.Attributes
open FSharp.Control.Tasks // TaskBuilder.fs; the namespace varies by version

[<MemoryDiagnoser>]
type BenchmarkAsync () =
    let mutable cache : MemoryCache option = None

    [<GlobalSetup>]
    member self.InitCache () =
        let config = NameValueCollection(1)
        config.Add("cacheMemoryLimitMegabytes", "50")
        cache <- Some(new MemoryCache("Test", config))
        cache.Value.Add("test", new Object(), new DateTimeOffset(2020, 12, 12, 10, 20, 30, TimeSpan.Zero)) |> ignore

    [<GlobalCleanup>]
    member self.CleanupCache () =
        cache.Value.Dispose()

    [<Benchmark>]
    member self.GetWithAsyncReturn () =
        async {
            return cache.Value.Get "test"
        } |> Async.RunSynchronously

    [<Benchmark>]
    member self.GetWithTaskBuilder () =
        let task = ContextInsensitive.task {
            return cache.Value.Get "test"
        }
        task.Result

    [<Benchmark(Baseline = true)>]
    member self.GetDirectly () =
        cache.Value.Get "test"

Results:

BenchmarkDotNet=v0.11.5, OS=Windows 10.0.18362
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=2.2.301
  [Host]     : .NET Core 2.2.6 (CoreCLR 4.6.27817.03, CoreFX 4.6.27818.02), 64bit RyuJIT DEBUG
  DefaultJob : .NET Core 2.2.6 (CoreCLR 4.6.27817.03, CoreFX 4.6.27818.02), 64bit RyuJIT

| Method             | Mean     | Error     | StdDev    | Median   | Ratio | RatioSD | Gen 0  | Gen 1 | Gen 2 | Allocated |
|------------------- |---------:|----------:|----------:|---------:|------:|--------:|-------:|------:|------:|----------:|
| GetWithAsyncReturn | 606.7 ns | 18.563 ns | 52.961 ns | 578.3 ns |  3.73 |    0.36 | 0.2537 |     - |     - |     800 B |
| GetWithTaskBuilder | 224.5 ns |  4.677 ns |  9.658 ns | 221.3 ns |  1.42 |    0.06 | 0.0482 |     - |     - |     152 B |
| GetDirectly        | 163.4 ns |  1.653 ns |  1.546 ns | 163.2 ns |  1.00 |    0.00 | 0.0100 |     - |     - |      32 B |

Regarding your questions:

  • At the moment, I've used System.Runtime.Caching.MemoryCache as the primary implementation. The implementation should port quite easily to Microsoft.Extensions.Caching, using something like GC.GetTotalMemory as a weight, but I haven't given it too much thought yet (see the sketch after this list).

  • About using a DU for distinguishing between sync/async: I thought about that, but after a brief look through the codebase decided against it, since the async would still have to go all the way through.

  • We've used a distributed cache for CosmosDb, as Azure Redis Cache has been much cheaper and saved a ton of RU for us. This was for a module which was NOT event-sourced and didn't use the change feed feature at all. Right now we're using EventStore for all event-sourced parts, with no additional caching on top - atm the performance is enough for our workloads, and any perf issues encountered were related to serialization perf and were fixed by using https://github.com/neuecc/Utf8Json/.
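
On the first point, a sketch of what a size-bounded port to Microsoft.Extensions.Caching.Memory could look like; unlike SRCMC there is no GC-derived heuristic, so the weight question resurfaces as an explicit per-entry Size (using 1 unit per entry, as below, makes the limit behave as a max item count):

open System
open Microsoft.Extensions.Caching.Memory

// Total budget of 1024 'units'; what a unit means is up to the caller
let cache = new MemoryCache(MemoryCacheOptions(SizeLimit = Nullable 1024L))

let set (key: string) (entry: obj) =
    // Every entry must declare a Size once SizeLimit is set
    let options = MemoryCacheEntryOptions(Size = Nullable 1L)
    cache.Set(key, entry, options) |> ignore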

My free time has been kinda sporadic lately but I will try to get back to you with some prototype done.

bartelink commented:

Really appreciate the detailed response; feels like we should be able to get something sorted quickly when the time comes.

Your benchmarks suggest that just making the interface async, keeping the source minimal and idiomatic, is the right choice.

Given the nature of EventStore, yes, caching is rarely going to make a significant difference to overall throughput unless you have lots of nodes in the cluster etc., as it internally caches recently hit streams.

Would love to know the perf cost of a GC.GetTotalMemory on a well-loaded process, but that does seem to answer the question.

Regarding Utf8Json, @Szer did a spike which we ultimately didn't merge (a copy is retained in https://github.com/jet/equinox/tree/Szer-utf8json). If I or someone else gets around to doing System.Text.Json support as discussed in https://github.com/jet/equinox/issues/79, and/or you're interested in using it, it may make sense to add an Equinox.Codec.Utf8Json.

bartelink commented:

Some minor diffs coming through in rc3 - I have no other plans for big changes before baking v2.

And an aside re caching: PR #151 is slightly relevant to the above - it provides a way to maintain a rolling state with caching and etag optimizations to e.g. avoid null writes and the associated cache invalidation (we'll be using it to store some denormalized state that we currently index in ElasticSearch).

Also, if you do anything on this, it might be nice if we could avoid clients needing to take a reference to System.Runtime.Caching as part of the work (the integration tests require a <Reference> atm, same for apps, and avoiding that would be a nice touch).

bartelink commented:

This is addressed by opening up an extensibility point in #161 by @DSilence 🙏
