
Comments (9)

guardrex commented on August 27, 2024

Maybe a pretty over-time graph at each change point?

... and along the same lines, as a starting point, you probably heard me ask the almost-stupid question today on the Windows Surface Pro 4 Community Standup 😄 about seeing Helios-IIS vs. Kestrel-IIS/HttpPlatformHandler Hello World results. "Stupid" because Helios is dead. "Almost" because it would be kind'a cool to see just where Kestrel-IIS/HttpPlatformHandler starts off vis-à-vis Helios.

@benaadams' notion of a time series would be great. I 👍 that!


DamianEdwards commented on August 27, 2024

Yep, want to do all these things 😄

I'm going to try to get something set up to track the results over time this week.


DamianEdwards commented on August 27, 2024

There's a spreadsheet now with these details in /results


guardrex commented on August 27, 2024

@DamianEdwards Are the SD's on the ASP.NET 5 results really %'s?


DamianEdwards commented on August 27, 2024

I'm just recording what wrk reports. I believe that value indicates how many requests fell within a standard deviation of the avg latency.


guardrex commented on August 27, 2024

If that's what the number is, I don't think it's all that helpful in judging when the averages are probably different (note below). It wouldn't be a % then; it would be a count. Yeah, those figures wouldn't make sense as SD's expressed as %'s. They would be huge SD's if that were the case, so large that the averages wouldn't be distinguishable in most cases.

It would be nice if wrk would report real SD's. Better yet, since we're looking at averages, it would be nice to have the SE (Standard Error) reported, but I understand that might not be something wrk reports. I was just trying to get an understanding of the variation in these runs.

"probably different" ... I mean that I wasn't hoping to make actual statistically-valid judgements about differences in averages. You'd have to get into simultaneous inferences and Sequential Bonferroni. However, with SE's you could draw thumbnail "probably different" conclusions for pairwise comparisons.

Anywho ... if you get SE's (or at least real SD's), I hope you'll add those to the spreadsheet. If not, no big deal.


guardrex commented on August 27, 2024

Oh ... and I guess I could add that if we had real SD's and you showed the sample sizes, we could calc the SE on our own, since it's just SE = SD / sqrt(n).
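
That calc, as a minimal Python sketch (the numbers here are made up purely for illustration):

    import math

    def standard_error(sd: float, n: int) -> float:
        """Standard error of the mean: SE = SD / sqrt(n)."""
        return sd / math.sqrt(n)

    # Hypothetical numbers: an SD of 8,000 RPS over 20 million requests.
    print(standard_error(8_000, 20_000_000))  # ~1.79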


guardrex commented on August 27, 2024

I just checked the wrk repo at https://github.com/wg/wrk/blob/master/README and found this sample output ...

    Running 30s test @ http://127.0.0.1:8080/index.html
      12 threads and 400 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   635.91us    0.89ms  12.92ms   93.69%
        Req/Sec    56.20k     8.07k   62.00k    86.54%
      22464657 requests in 30.00s, 17.76GB read
    Requests/sec: 748868.53
    Transfer/sec:    606.33MB

The SD for the 56.20k req/sec is shown as 8.07k, which looks like a real SD.

As for the 86.54% number ... that makes no sense to me 😕 ... plus/minus SD as a %? They lost me there. I'll ask them for clarification.

They do show the total requests, so I calc an SE of 8070 / sqrt(22464657) ≈ 1.7, which is tiny owing to the large sample size and gives you a ton of confidence that the population average is very close to this sample average (i.e., it makes you feel good about comparing this average to other averages derived with the same test rig under the same test conditions).

May I suggest a few changes to your spreadsheet (a rough sketch of the resulting columns follows below):

  • Drop (or ignore until we know what it means) the %Stdev that they report.
  • Add the actual Stdev from their data, and include the total requests count.
  • Add a column to calc the SE from the Stdev and sample size.
  • Report these values in columns next to each other labelled "Total Requests", "RPS Avg", "RPS SD" (or "RPS Stdev"), and "RPS SE" (or "RPS SEM" or "RPS Stderr").
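
A minimal sketch of those columns computed from the sample output above (the parsing is ad hoc and written against the default wrk output format shown in its README, so treat it as illustrative only):

    import math
    import re

    # Abbreviated sample from the wrk README output above.
    wrk_output = """\
    Thread Stats   Avg      Stdev     Max   +/- Stdev
      Req/Sec    56.20k     8.07k   62.00k    86.54%
    22464657 requests in 30.00s, 17.76GB read
    """

    def to_number(s: str) -> float:
        """Convert wrk's abbreviated figures (e.g. '8.07k') to plain floats."""
        scale = {"k": 1e3, "M": 1e6}
        return float(s[:-1]) * scale[s[-1]] if s[-1] in scale else float(s)

    m = re.search(r"Req/Sec\s+(\S+)\s+(\S+)", wrk_output)
    total = int(re.search(r"(\d+) requests in", wrk_output).group(1))
    rps_avg, rps_sd = to_number(m.group(1)), to_number(m.group(2))
    rps_se = rps_sd / math.sqrt(total)

    print(f"Total Requests: {total}")        # 22464657
    print(f"RPS Avg: {rps_avg:.0f}")         # 56200 (wrk reports this per thread)
    print(f"RPS SD:  {rps_sd:.0f}")          # 8070
    print(f"RPS SE:  {rps_se:.2f}")          # ~1.70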


guardrex commented on August 27, 2024

@wg said you are correct ...

+/- Stdev is the percentage of requests that fall within 1 standard deviation of the average.

... interesting, but not helpful for comparing RPS averages.
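
For what it's worth, the stat @wg describes is easy to reproduce; here's a minimal sketch over a hypothetical latency sample:

    import statistics

    # Hypothetical latency sample in ms; wrk computes this over per-request latencies.
    latencies = [0.4, 0.5, 0.6, 0.6, 0.7, 0.8, 1.2, 5.0, 9.0, 12.9]

    avg = statistics.mean(latencies)
    sd = statistics.stdev(latencies)

    # "+/- Stdev": the percentage of observations within 1 SD of the average.
    within = sum(1 for x in latencies if abs(x - avg) <= sd)
    print(f"{100 * within / len(latencies):.2f}%")  # 80.00% for this sample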

