Read method without locals (disruptor-net issue, closed)

disruptor-net commented on June 18, 2024

Read method without locals

from disruptor-net.

Comments (15)

buybackoff commented on June 18, 2024

dotnet/runtime#65793


ltrzesniewski commented on June 18, 2024

I'm surprised this doesn't throw an InvalidProgramException. This changes a & type on the stack to an O type, and arithmetic on those is not supported:


buybackoff commented on June 18, 2024

It's weird - the distinction between object reference and managed pointer is of little value. Object reference is a managed pointer to a method table. The math should work, only the verifier could complain.

But .NET itself uses RawArrayData, and supposedly it's as fast as it could be, and safe. They can do that because it's .NET Core only: they define the layout, so they do not need to account for different ones.

Calculating the array data offset and storing it in a static readonly field proves to be difficult: the JIT magic of treating it as a constant does not happen, at least not reliably. It requires tiered compilation, the value must be initialized in tier 0 to be treated as a constant in tier 1, and any long-running loops must be recompiled in tier 1.

However, we could change ArrayDataOffset to calculate the offset not from the method table, but from the first data byte. Like this:

public static unsafe int ArrayDataOffset2
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    get => sizeof(IntPtr) == 4
        ? RuntimeHelpers.OffsetToStringData == 8 ? 4 : 12
        : RuntimeHelpers.OffsetToStringData == 12 ? 8 : 24;
}

private class RawData<T>
{
    public T Data = default!;
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static T Read1<T>(object array, int index)
    where T : class
{
    return Unsafe.AddByteOffset(ref Unsafe.As<RawData<T>>(array).Data, (nint)(uint)(ArrayDataOffset2 + index * Unsafe.SizeOf<T>()));
}

On my current noisy machine, with lots of stuff running, this gives a very significant throughput improvement over current master in the OneToOneSequencedThroughputTest_ThreadAffinity bench, both on the same core and on different cores.

Need to check ArrayDataOffset2 values for cases other than x64 .NET Core.

I will send a PR for that.


buybackoff commented on June 18, 2024

Brr, these throughput numbers mean very little with different batching on a noisy machine. Need more precise measurement and some extra work 😄


ltrzesniewski commented on June 18, 2024

It's weird - the distinction between object reference and managed pointer is of little value. Object reference is a managed pointer to a method table. The math should work, only the verifier could complain.

Yes, I know about that, but the spec is pretty explicit about it:

Managed pointers are not interchangeable with object references.

Though the current code is storing an O value in a & local, and I couldn't find a mention in ECMA-335 which would allow that in the first place...


ltrzesniewski commented on June 18, 2024

However, we could change ArrayDataOffset to calculate the offset not from the method table, but from the first data byte.

You should calculate this using an array instead of a regular object (T[] instead of RawData<T>) as you should not assume the CLR uses the same layout for objects and for arrays.


buybackoff commented on June 18, 2024

@ltrzesniewski Unsafe.As<RawData<T>>(array).Data points right after the method table. What we do now is pointing to the method table. In array case, the .Data points to its Length slot. Do you know about differences in the method table size? Or other stuff that could be placed before .Data on different implementations?


ltrzesniewski commented on June 18, 2024

In array case, the .Data points to its Length slot.

Exactly. Don't we want the offset between the first element and the method table, thus skipping the length slot?


buybackoff commented on June 18, 2024

Don't we want the offset between the first element and the method table, thus skipping the length slot?

We can calculate it, but we cannot make it a JIT constant in an easy way.

So now we have on x64: FirstOffset = MT_Ptr + ArrayDataOffset = MT_Ptr + 8 (MT_PtrSize) [.Data is here] + 4 (uint Length) + 4 (Padding). What I propose is to just point past MT_Ptr and use existing knowledge about different stuff after .Data on different implementations.
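
The layout arithmetic above can be checked at run time with a minimal sketch like the one below. It is only an illustration, not the library's implementation: ArrayLayoutSketch, RawByte and MeasureDataOffset are hypothetical names, and it assumes a runtime where the first instance field of a plain object sits immediately after the method table pointer.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ArrayLayoutSketch
{
    // Hypothetical stand-in for RawData<T>: its only field is assumed to sit
    // right after the method table pointer.
    private sealed class RawByte
    {
        public byte Data;
    }

    // Measures the byte distance from the first field slot to the first array element.
    public static nint MeasureDataOffset()
    {
        var array = new object[1];

        // Reinterpret the array as RawByte; Unsafe.As performs no type check.
        ref byte firstField = ref Unsafe.As<RawByte>(array).Data;

        // Reference to element 0, viewed as a byte reference.
        ref byte firstElement = ref Unsafe.As<object, byte>(ref array[0]);

        // Byte offset from the first field slot to the first element.
        return Unsafe.ByteOffset(ref firstField, ref firstElement);
    }
}
```

If the x64 layout described above holds, the measured value should equal the Length slot plus padding. Since it is computed at run time, it does not by itself give the JIT a constant; it is mainly useful for validating hard-coded values such as those in ArrayDataOffset2 on a given runtime.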


ltrzesniewski commented on June 18, 2024

we cannot make it a JIT constant in an easy way.

Oh, ok, I see 👍

use existing knowledge about different stuff after .Data on different implementations.

But that's exactly what ArrayDataOffset does... how would you like to change that more precisely?


buybackoff commented on June 18, 2024

But that's exactly what ArrayDataOffset does... how would you like to change that more precisely?

By using Unsafe and not Ldind.Ref and still avoiding locals.


ltrzesniewski commented on June 18, 2024

Oh, ok, sorry, I misunderstood what you were saying earlier 👍


buybackoff commented on June 18, 2024

@ltrzesniewski

Also this comment about managed pointers to zero: dotnet/coreclr#20386

So I'm confused.


buybackoff commented on June 18, 2024

The current implementation is optimal for x-plat.

For .NET Core it works even with a simple Ldarg(nameof(array)) + offset, and I think it should work like that; the O and & separation is artificial both conceptually and implementation-wise. But for this kind of thing there is the linked discussion.


ltrzesniewski commented on June 18, 2024

I suppose the reason for having both O and & types is performance: GC scans should be faster for O types, as the GC can assume the value is a pointer to a method table. This gets more complicated for & values, which can point to anywhere inside an object.

But I'm very interested in the answer to your linked question. 🙂
