Comments (10)
Thanks!
Have not felt the need to generalize in this direction, but can appreciate your desire to have only one set of cache stuff to reason about.
One thing to bear in mind is that a distributed cache is questionable: the System.Runtime.Caching.MemoryCache (SRCMC) holds object refs and does not require serialization. A distributed cache in front of GES is pretty pointless, as GES already has good caching built in, and a distributed cache in front of Cosmos will likely cost as much in management overhead as it saves in RU cost.
Assuming it doesn't cause too much mess and achieves something worthwhile, I would take a PR to make the wiring-in of the cache provider pluggable via an optional arg on the resolver etc. (I would prefer not to have two concrete impls in the actual EventStore.fs and Cosmos.fs files though - if anything it'd be nice to define an interface in a shared place and then have your impl and the stock one). At present there are small variances in the key strategy between Cosmos and ES; the code has enough tests to make doing a spike feasible, but I won't suggest it's going to be trivial to do.
Can you explain a little about how your MemoryCache operates - does it hold a ref to an object and self-prune like SRCMC?
from equinox.
I do agree that the benefits of distributed caching are very questionable (and you have likely investigated it way more than I have). That said, I believe that building a common abstraction may be handy as more providers are added.
As for the MemoryCache, overall it works in a similar manner to SRCMC:
- Its self-prune mechanism is based on the "cost" of the object, which has to be specified manually: it is not the "actual" size of the object, but an abstract value supplied during the insert (https://docs.microsoft.com/en-us/aspnet/core/performance/caching/memory?view=aspnetcore-2.2#use-setsize-size-and-sizelimit-to-limit-cache-size). By default, if nothing is specified, it effectively functions as an object-count limit
- It doesn't support Windows performance counters
- It does hold a ref to the original object
- Overall the API is somewhat nicer and there is no need to use ConfigurationManager to adjust its behavior
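For illustration, the size-limit semantics described in the first bullet can be modeled with a toy cache. This is a hedged sketch in plain F# (not the Microsoft.Extensions.Caching.Memory API itself): each entry carries a caller-supplied abstract size, and inserts evict older entries once the budget would be exceeded.

```fsharp
// Toy model of a size-limited cache; illustrative only, NOT the real
// Microsoft.Extensions.Caching.Memory implementation. Each entry carries a
// caller-supplied abstract "size"; when the budget would be exceeded, the
// oldest entries are evicted until the new entry fits.
open System.Collections.Generic

type SizeLimitedCache<'k, 'v when 'k : equality>(sizeLimit: int64) =
    // Preserves insertion order so we can evict oldest-first (a simplification;
    // the real cache also considers access recency and priority)
    let entries = List<'k * 'v * int64>()
    let mutable used = 0L
    member _.Set(key: 'k, value: 'v, size: int64) =
        // Evict oldest entries until the new one fits within the budget
        while used + size > sizeLimit && entries.Count > 0 do
            let (_, _, s) = entries.[0]
            entries.RemoveAt 0
            used <- used - s
        entries.Add((key, value, size))
        used <- used + size
    member _.TryGet(key: 'k) =
        entries |> Seq.tryPick (fun (k, v, _) -> if k = key then Some v else None)
    member _.Count = entries.Count

// With a size of 1 per entry, the limit degenerates into an entry-count cap,
// matching the "object count limit" default mentioned above
let cache = SizeLimitedCache<string, int>(2L)
cache.Set("a", 1, 1L)
cache.Set("b", 2, 1L)
cache.Set("c", 3, 1L) // exceeds the budget, so "a" is evicted
printfn "%A %A" (cache.TryGet "a") (cache.TryGet "c")
```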
I wouldn't say we've investigated it as such; the penalty of an extra roundtrip in terms of latency etc., and having to care for and feed a server and/or its state, are the main reasons why it's not been done.
(eta: interesting paper on distributed caches: Fast key-value stores: An idea whose time has come and gone)
The primary reason of all, of course, is that doing it in memory means that blue/green deploys are not relevant (the schema does not require versioning), and there is no burden on the fold state to require a codec.
A common abstraction (likely an interface) would clarify the interconnect between the Resolver and the Store, so it would likely be useful regardless.
While the perf counters in SRCMC are nice, they're definitely not a decider given the importance of netcore. Similarly, while there are some funky bits in the API, that's not a major concern (I'm not aware of any use cases where making adjustments at runtime and/or varying much beyond the max allocation would make sense?)
The largest stumbling block, however, seems to be computing the "cost": the GC-derived heuristic in SRCMC fits the need perfectly in that we're dealing with a third-party object 'tree'. Being able to rely on an upper limit to the memory consumption (not risking OOM) is pretty key in typical runtime environments.
@DSilence Did you get any time to investigate or think further on this? If there are to be any minor changes to the contracts (I can't picture any but...), it'd be nice to get them into the 2.0 final release...
Hey @bartelink. I did some initial prototyping by introducing the interface with the following signature:
type CacheItemOptions =
    | AbsoluteExpiration of DateTimeOffset
    | RelativeExpiration of TimeSpan

type ICache =
    abstract member UpdateIfNewer: policy: CacheItemOptions -> key: string -> Async<'entry>
    abstract member TryGet: key: string -> Async<'entry option>
The breaking change here is the usage of async, to enable more scenarios than an in-memory implementation.
- What is your opinion on this?
- I didn't have a lot of time to properly evaluate and test this, and it does seem like a breaking change, so if you agree with the approach, I'd keep it out of the 2.0 final release.
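To make the proposed shape concrete, here is a minimal in-memory sketch. Two assumptions beyond the posted signature: the interface is made generic in `'entry`, and an `entry` parameter is added to `UpdateIfNewer` (the posted signature has no way to pass the value being cached); expiration handling is elided.

```fsharp
// Minimal in-memory sketch of the proposed ICache; assumptions as noted above,
// not the actual Equinox contract.
open System
open System.Collections.Concurrent

type CacheItemOptions =
    | AbsoluteExpiration of DateTimeOffset
    | RelativeExpiration of TimeSpan

type ICache<'entry> =
    abstract member UpdateIfNewer: policy: CacheItemOptions -> key: string -> entry: 'entry -> Async<unit>
    abstract member TryGet: key: string -> Async<'entry option>

type InMemoryCache<'entry>() =
    let store = ConcurrentDictionary<string, 'entry>()
    interface ICache<'entry> with
        member _.UpdateIfNewer _policy key entry = async {
            // "if newer" resolution and expiration are store-specific;
            // this sketch simply overwrites
            store.[key] <- entry }
        member _.TryGet key = async {
            return
                match store.TryGetValue key with
                | true, v -> Some v
                | _ -> None }

let cache = InMemoryCache<int>() :> ICache<int>
async {
    do! cache.UpdateIfNewer (RelativeExpiration (TimeSpan.FromMinutes 5.)) "stream-1" 42
    let! hit = cache.TryGet "stream-1"
    let! miss = cache.TryGet "stream-2"
    printfn "%A %A" hit miss
} |> Async.RunSynchronously
```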
Assuming the actual interface works and doesn't make a mess (it looks fine), I won't lose sleep over the cost of the Async (or the builder could internally wrap it in a DU, e.g. type Cache = AsyncCache of ICache | NotAsync of Cache) - it may well be possible to address it by adding a builder overload etc.
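The DU wrapper floated above could look something like the following (a hypothetical sketch with assumed names, not Equinox API): a plain synchronous cache is only wrapped in Async at the dispatch point, so genuinely async providers pay for the machinery while in-process lookups stay cheap until wrapped.

```fsharp
// Hypothetical sketch of the sync/async DU idea; all names are assumptions.
type IAsyncCache<'entry> =
    abstract TryGet: key: string -> Async<'entry option>

type ISyncCache<'entry> =
    abstract TryGet: key: string -> 'entry option

type Cache<'entry> =
    | AsyncCache of IAsyncCache<'entry>
    | NotAsync of ISyncCache<'entry>

// Reads dispatch on the case; the sync path only allocates an Async at the edge
let tryGet (key: string) = function
    | AsyncCache c -> c.TryGet key
    | NotAsync c -> async { return c.TryGet key }

// Usage: wrap a trivial synchronous cache via an object expression
let sync = NotAsync { new ISyncCache<int> with
                        member _.TryGet k = if k = "x" then Some 1 else None }
let result = tryGet "x" sync |> Async.RunSynchronously
printfn "%A" result
```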
Have you considered how you're going to compute a weight for an arbitrary folded-state value given the interface you described? Will you instead do a max item count, or something else? The reality of how that's addressed will have big effects on the viability on your side. And an idea from left field: what are the chances of migrating your other usages to MemoryCache instead - do you e.g. use the Distributed Cache?
I'm happy to keep the issue open and let you drill in at a point that works for you. I've tagged some final naming-tweak items as 2.0, but there is no actual date set, so who knows: you might have reached a point before then.
Regarding the perf impact - I actually did the numbers. This is for System.Runtime.Caching.MemoryCache.
Code:
[<MemoryDiagnoser>]
type BenchmarkAsync () =
    let mutable cache : MemoryCache option = None

    [<GlobalSetup>]
    member self.InitCache () =
        let config = NameValueCollection(1)
        config.Add("cacheMemoryLimitMegabytes", "50")
        cache <- Some(new MemoryCache("Test", config))
        cache.Value.Add("test", new Object(), new DateTimeOffset(2020, 12, 12, 10, 20, 30, TimeSpan.Zero)) |> ignore

    [<GlobalCleanup>]
    member self.CleanupCache () =
        cache.Value.Dispose()

    [<Benchmark>]
    member self.GetWithAsyncReturn () =
        async {
            return cache.Value.Get "test"
        } |> Async.RunSynchronously

    [<Benchmark>]
    member self.GetWithTaskBuilder () =
        let task = ContextInsensitive.task {
            return cache.Value.Get "test"
        }
        task.Result

    [<Benchmark(Baseline = true)>]
    member self.GetDirectly () =
        cache.Value.Get "test"
Results:
BenchmarkDotNet=v0.11.5, OS=Windows 10.0.18362
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=2.2.301
[Host] : .NET Core 2.2.6 (CoreCLR 4.6.27817.03, CoreFX 4.6.27818.02), 64bit RyuJIT DEBUG
DefaultJob : .NET Core 2.2.6 (CoreCLR 4.6.27817.03, CoreFX 4.6.27818.02), 64bit RyuJIT
| Method | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| GetWithAsyncReturn | 606.7 ns | 18.563 ns | 52.961 ns | 578.3 ns | 3.73 | 0.36 | 0.2537 | - | - | 800 B |
| GetWithTaskBuilder | 224.5 ns | 4.677 ns | 9.658 ns | 221.3 ns | 1.42 | 0.06 | 0.0482 | - | - | 152 B |
| GetDirectly | 163.4 ns | 1.653 ns | 1.546 ns | 163.2 ns | 1.00 | 0.00 | 0.0100 | - | - | 32 B |
Regarding your questions:
- At the moment, I've used System.Runtime.Caching.MemoryCache as the primary implementation. The implementation should port quite easily to Microsoft.Extensions.Caching, using something like GC.GetTotalMemory as a weight, but I haven't given it too much thought yet.
- About using a DU for distinguishing between sync/async: I thought about that, but after a brief look through the codebase decided against it, since async still had to go all the way.
- We've used a distributed cache for CosmosDb, as Azure Redis Cache has been much cheaper and saved a ton of RUs for us. This was for a module which was NOT event-sourced and didn't use the change feed feature at all. Right now we're using EventStore for all event-sourced parts, with no additional caching on top - atm the performance is enough for our workloads, and any perf issues encountered were related to serialization perf and were fixed by using https://github.com/neuecc/Utf8Json/.
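As an illustrative sketch of the GC.GetTotalMemory weighting idea mentioned above (an assumption about one possible approach, not how Equinox or SRCMC measure size): take a memory reading before and after materializing the folded state, and use the delta as a rough entry cost. The delta is noisy since allocations on other threads pollute it, which is part of why computing a reliable "cost" is non-trivial.

```fsharp
// Rough heuristic for estimating an entry's "cost" via GC.GetTotalMemory.
// Illustrative only; the measured delta is approximate and noisy.
open System

let measureApproxCost (materialize: unit -> 'a) : 'a * int64 =
    let before = GC.GetTotalMemory true    // force a collection for a stabler baseline
    let value = materialize ()
    let after = GC.GetTotalMemory false    // don't collect, or we'd free what we just built
    value, max 0L (after - before)

let state, cost = measureApproxCost (fun () -> Array.zeroCreate<byte> 100_000)
printfn "array of %d bytes, measured cost ~%d bytes" state.Length cost
```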
My free time has been kinda sporadic lately but I will try to get back to you with some prototype done.
Really appreciate the detailed response; feels like we should be able to get something sorted quickly when the time comes.
Your benchmarks suggest that just making the interface async, keeping the source minimal and idiomatic, is the right choice.
Given the nature of EventStore, yes, caching is rarely going to make a significant difference to overall throughput unless you have lots of nodes in the cluster etc., as it internally caches recently hit streams.
Would love to know the perf cost of a GC.GetTotalMemory call on a well-loaded process, but that does seem to answer the question.
Regarding Utf8Json, @Szer did a spike which we ultimately didn't merge (copy retained in https://github.com/jet/equinox/tree/Szer-utf8json). If I or someone get around to doing System.Text.Json support as discussed in https://github.com/jet/equinox/issues/79, and/or you're interested in using it, it may make sense to add an Equinox.Codec.Utf8Json.
Some minor diffs coming through in rc3 - have no other plans to do big changes before baking v2.
And an aside re caching - PR #151 is slightly relevant re the above - it provides a way to maintain a rolling-state with caching and etag-optimizations to e.g. avoid null writes and associated cache invalidation (we'll be using it to store some denormalized state that we currently index in ElasticSearch).
Also, if you do anything on this, it might be nice if we could avoid clients needing to take a reference to System.Runtime.Caching as part of the work (the integration tests require a <Reference atm, same for apps, and avoiding that would be a nice touch).
This is addressed by opening up an extensibility point in #161 by @DSilence 🙏