netfabric / netfabric.hyperlinq Goto Github PK

High performance LINQ implementation with minimal heap allocations. Supports enumerables, async enumerables, arrays and Span<T>.

License: MIT License

C# 100.00%

linq enumeration csharp dotnet dotnet-standard dotnet-core performance nuget-package csharp-library async-enumerable

netfabric.hyperlinq's Introduction

NetFabric

Defines classes and structs common to other NetFabric projects.

Throw

Defines a set of static methods to throw exceptions.

Methods that throw exceptions cannot be inlined by the JIT compiler. Only if thrown within a called method. Using the methods in this class allows exceptions to be thrown and still let the method be a candidate to be inlined.

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ReadOnlyListWrapper<T> AsReadOnlyList<T>(T[] source)
    => source switch
    {
        null => Throw.ArgumentNullException<ReadOnlyListWrapper<T>>(nameof(source)),
        _ => new ReadOnlyListWrapper<T>(source)
    };

Credits

The following open-source projects are used to build and test this project:

License

This project is licensed under the MIT license. See the LICENSE file for more info.

netfabric.hyperlinq's People

Contributors

Stargazers

Watchers

netfabric.hyperlinq's Issues

Why not make method signatures identical to linq?

So when i replaced linq with hyperlinq, it wont compile right away i had to change all linq parts
for example
.Where(a=>a!=null) to .Where((a,i)=>a!=null)

Why not make method signatures identical to linq? is it even possible? i couldn't think of anyhting without closure allocations.

Blazor/Razor Views No Longer Compatible with Hyperlinq Enumerables

Sorry for the chatter. I knew upgrading to the latest version would be a chore and I am passing on the pain. 😅😭

I am now seeing this error in Blazor/Razor views making use of Hyperlinq:

Any guidance would be appreciated. I tried to make use of the AsEnumerable() call but that did not seem to affect the error.

Sum on ValueEnumerable.Range gives wrong sum

(NOTE: All of this is with version 3.0.0-beta48 from NuGet.)

If I do the following:

ValueEnumerable.Range(0, 10).Sum()

I get back 50 for the result, when the value should be 45.

If I change the above to:

ValueEnumerable.Range(0, 10).Select(x => x).Sum() or ValueEnumerable.Range(0, 10).Sum(x => x)

I get back the proper result of 45. It seems that adding almost any operation before the Sum (or a selector delegate within Sum) allows it to return the proper result, but just a straight parameterless Sum does not. For instance, even:

ValueEnumerable.Range(0, 10).Where(x => true).Sum()

I get back the proper result of 45. But doing:

ValueEnumerable.Range(0, 10).Skip(0).Sum()

I get back the wrong result of 50. (Take also doesn't change the result to the correct one.)

Rethink return types

LINQ has always been given as the example of functional programming in C#. It was introduced when C# was mostly an imperative language. This has radically changed in the latest versions of C# but LINQ has stayed unchanged...

I'm no expert in functional programming but it seems to me that methods like First(), Single(), Last(), Min(), Max() and Average() have some issues. They all throw exceptions when the source is empty. Single() also throws when the source has more than one item. ElementAt() throws an exception when the index parameter is out of bounds.

Throwing exceptions causes side effects...

There are some alternative like FirstOrDefault(), SingleOrDefault(), LastOrDefault(), DefaultIfEmpty() and ElementAtOrDefault() that return the default value of the return type when the source is empty or the index parameter is out of bounds. DefaultIfEmpty() has a second overload that allows the return of a predefined value. This is still an issue for value-types because the default value or even the predefined one may be possible values of the source. SingleOrDefault() still throws when the source has more than one item.

This project is not constrained by breaking changes avoidance. I want it to be a "better" version of LINQ. Keeping a familiar API but taking advantage of newer language features to improve performance and usability.

There are several possible alternatives...

Try pattern

This pattern can be found in many methods. It consist on a method that returns bool and has an output parameter. The parameter outputs a valid value if the method return true.

Here's a possible implementation of First() using this pattern:

public static bool TryFirst<T>(this IEnumerable<T> source, out T value)
{
  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
  {
    value = enumerator.Current;
    return true;
  }
        
  value = default;
  return false;
}

It can be used this way:

if (source.Where(item => item > 2).TryFirst(out var value))
{
  Console.WriteLine(value);
}

One limitation is that output parameters cannot output value-types by reference. Hyperlinq currently returns read-only references in the cases where the source is an array, Span<T> or equivalent.

`Nullable<T>`

The method can return a Nullable<T>:

public static T? First<T>(this IEnumerable<T> source) where T : struct
{
  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
    return enumerator.Current;
	
  return default;
}

The type T has to be constrained to value-types. We would need an overload for reference-types but, because generic constraints are not part of the method signature, the methods must have different names. The alternative is to add a second generic parameter but it would be very confusing.

This alternative has the same issue as FirstOrDefault() and SingleOrDefault(). null may be a possible value for reference-types, so it would be not possible to differentiate between an empty collections and one that contains one or more null values.

`ValueTuple<bool, T>`

The method can return a tuple:

public static (bool HasValue, T Value) First<T>(this IEnumerable<T> source)
{
  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
    return (true, enumerator.Current);
	
  return (false, default);
}

But, it's not good idea to use tuples in public APIs.

`Option<T>`

The method can return Option<T>:

public static Option<T> First<T>(this IEnumerable<T> source)
{
  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
    return Option.Some(enumerator.Current);
	
  return Option.None;
}

Possible uses:

// returns first; otherwise throws
var first0 = array
  .Where(item => item > 2)
  .First()
  .Match(
    item => item, 
    () => throw new InvalidOperationException("Sequence contains no elements"));

// returns first; otherwise returns -1
var first1 = array
  .Where(item => item > 2)
  .First()
  .Match(
    item => item, 
    () => -1);

// writes first; otherwise writes <empty>
array
  .Where(item => item > 2)
  .First()
  .Match(
    item => Console.WriteLine(item), 
    () => Console.WriteLine("<empty>"));

It's also possible to add a Desconstruct()method to Option<T>:

public void Deconstruct(out bool hasValue, out T value)
{
  hasValue = _hasValue;
  value = _value;
}

It can then be used the following way:

var (hasValue, value) = array.Where(item => item == 4).First();
if (hasValue)
  Console.WriteLine(value);

The C# language does not support fields to be a reference so it would not be possible to return a readonly reference from arrays or Span<T>. This issue may be worked-around by using ByReference<T>.

`Result<TOk, TError>`

What about Single()? It throws an exception in two different cases. It's the same exception type but with different messages.

It can either return Option<T> but then there is no way to differentiate the errors.

It can return Result<T, string>:

public static Result<T, string> Single<T>(this IEnumerable<T> source)
{
  using var enumerator = source.GetEnumerator();
  if (!enumerator.MoveNext())
    return Result.Error("Sequence contains no elements");
  var value = enumerator.Current;
  if (enumerator.MoveNext())
    return Result.Error("Sequence contains more than one element");
	
  return Result.Ok(value);
}

But then, just like the exception, we can only differentiate by comparing strings.

It's possible to differentiate the errors with an enum type:

public enum SingleError
{
  Empty,
  Multiple
}

public static Result<T, SingleError> Single<T>(this IEnumerable<T> source)
{
  using var enumerator = source.GetEnumerator();
  if (!enumerator.MoveNext())
    return Result.Error(SingleError.Empty);
  var value = enumerator.Current;
  if (enumerator.MoveNext())
    return Result.Error(SingleError.Multiple);
	
  return Result.Ok(value);
}

Match pattern

An alternative is to use the match pattern without having to call one more method:

public static TOut First<T, TOut>(this IEnumerable<T> source, Func<T, TOut> some, Func<TOut> none)
{
  if (some is null) throw new ArgumentNullException(nameof(some));
  if (none is null) throw new ArgumentNullException(nameof(none));

  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
    return some(enumerator.Current);
	
  return none();
}

public static void First<T>(this IEnumerable<T> source, Action<T> some, Action none)
{
  if (some is null) throw new ArgumentNullException(nameof(some));
  if (none is null) throw new ArgumentNullException(nameof(none));

  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
    some(enumerator.Current);
  else
    none();
}

I started this journey to avoid throwing exceptions but lets see if it's possible to have the current behavior of First() and at the same time the match pattern.

public static T First<T>(this IEnumerable<T> source, Func<T> none = null)
{
  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
    return enumerator.Current;
	
  if (none is null)
    throw new InvalidOperationException("Sequence contains no elements");
	
  return none();
}
	
public static TOut First<T, TOut>(this IEnumerable<T> source, Func<T, TOut> some, Func<TOut> none = null)
{
  if (some is null) throw new ArgumentNullException(nameof(some));

  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
    return some(enumerator.Current);
	
  if (none is null)
    throw new InvalidOperationException("Sequence contains no elements");
	
  return none();
}

public static void First<T>(this IEnumerable<T> source, Action<T> some, Action none = null)
{
  if (some is null) throw new ArgumentNullException(nameof(some));

  using var enumerator = source.GetEnumerator();
  if (enumerator.MoveNext())
  {
    some(enumerator.Current);
  }
  else
  {
    if (none is null)
      throw new InvalidOperationException("Sequence contains no elements");
	
    none();
  }
}

Possible uses:

// returns first; otherwise throws
var first0 = array
  .Where(item => item > 2)
  .First();
		
// returns first; otherwise returns -1
var first1 = array
  .Where(item => item > 2)
  .First(() => -1);

// writes first value; otherwise writes <empty>
array
  .Where(item => item > 2)
  .First(
    item => Console.WriteLine(item), 
    () => Console.WriteLine("<empty>"));

I ended up going full circle. This last solution is compatible with LINQ but, at the same time, supports more functional patterns.

But, can it support the return of references in the case of Span<T>?

public static ref readonly T First<T>(this ReadOnlySpan<T> source)
{
  if (source.Length == 0)
    throw new InvalidOperationException("Sequence contains no elements");

  return ref source[0];
}

public static T First<T>(this ReadOnlySpan<T> source, Func<T> none)
{
  if (none is null) throw new ArgumentNullException(nameof(none));

  if (source.Length == 0)
    return none();
	
  return source[0];
}
	
public static TOut First<T, TOut>(this ReadOnlySpan<T> source, Func<T, TOut> some, Func<TOut> none = null)
{
  if (some is null) throw new ArgumentNullException(nameof(some));

  if (source.Length == 0)
  {
    if (none is null)
	  throw new InvalidOperationException("Sequence contains no elements");
	
    return none();
  }
	
  return some(source[0]);
}

public static void First<T>(this ReadOnlySpan<T> source, Action<T> some, Action none = null)
{
  if (some is null) throw new ArgumentNullException(nameof(some));

  if (source.Length == 0)
  {
    if (none is null)
	  throw new InvalidOperationException("Sequence contains no elements");
	
    none();
  }
  else
  {
    some(source[0]);
  }
}

It requires one more overload and only this one returns a reference. Lambdas still do not support passing by reference...

There is no need for IValueEnumerator

The following is from a comment on Medium by @qwertie:

There is no need for IValueEnumerator — it is the same as IEnumerator except that it is IDisposable. Does it really need to be disposable?

The important part is IValueEnumerable. But I would call it simply IEnumerable and define it like this:

public interface IEnumerable<T, TEnumerator>
 where TEnumerator : IEnumerator<T>, IDisposable {
   TEnumerator GetEnumerator();
}

But you will have a problem in case someone implements a class X that implements both IEnumerable<T> and IEnumerable<T, TEnumerator>. The compiler will say that an extension method like x.Select(e=>e) is ambiguous between System.Linq.Enumerable.Select and NetFabric.Hyperlinq.Enumerable.Select. To avoid this problem, it needs to be derived from IEnumerable<T>:

public interface IEnumerable<T, TEnumerator> : IEnumerable<T>
  where TEnumerator : IEnumerator<T>, IDisposable {
    TEnumerator GetEnumerator();
}

Integration with StructLinq

It would be nice to have one library with LINQ optimizations instead of two: https://github.com/reegeek/StructLinq

performance optimization for simple objects

Hi again i noticed that list.first() slower than linq version

Consider Making `ArrayPoolExtensions` Public

Consider the following class:

https://github.com/DragonSpark/Framework/blob/e43394bb88b223a2c8726ce08da02745b5532631/DragonSpark/Model/Sequences/Memory/Lease.cs

It makes use of a generic IMemoryOwner implementation, and I have been using it with an older version of Hyperlinq.

As of 3.0.0-beta45 however, it appears that ValueMemoryOwner is used, and now sending it to the constructor above as designed will end up in a boxed allocation.

So, I am looking for a way to instantiate the ValueMemoryOwner. Unfortunately, its constructor is internal. Additionally, the only way of instantiating one appears to be via ArrayPoolExtensions and this is also internal.

Would it be worth exploring the possibility of exposing these points of concern as public? Is there another approach I should consider?

Thank you for any assistance provided.

Missing dll in NetFabric.Hyperlinq 3.0.0-beta12

On NuGet, NetFabric.Hyperlinq does not have a .dll in the lib folder. Is this correct?

unity engine throws error, can be fody or too high .net version not sure

Unity engine throws error

"Unloading broken assembly Assets/Plugins/NetFabric.Hyperlinq.dll, this assembly can cause crashes in the runtime"

Not sure, what the reason. it can be fody or too high .net version.

Old hyperlinq version dll did work with Unity3d without errors, not sure which version it was but i can upload it
NetFabric.Hyperlinq-master.zip

Distinct on a Value Type doesn't seem to work in all cases

I am using the FatValueType from NetFabric's own LINQ benchmarks project (https://github.com/NetFabric/LinqBenchmarks), except I've made it a readonly record struct instead of just a struct, so it basically becomes the following:

public readonly record struct FatValueType
{
	public readonly int Value0 { get; }
	public readonly long Value1 { get; }
	public readonly long Value2 { get; }
	public readonly long Value3 { get; }
	public readonly long Value4 { get; }
	public readonly long Value5 { get; }
	public readonly long Value6 { get; }
	public readonly long Value7 { get; }

	public FatValueType(int value)
	{
		this.Value0 = value;
		this.Value1 = value;
		this.Value2 = value;
		this.Value3 = value;
		this.Value4 = value;
		this.Value5 = value;
		this.Value6 = value;
		this.Value7 = value;
	}

	public readonly bool IsEven() => (this.Value0 & 0x01) == 0;

	public static FatValueType operator +(in FatValueType left, in FatValueType right) => new(left.Value0 + right.Value0);

	public static FatValueType operator *(in FatValueType left, int right) => new(left.Value0 * right);
}

I'm using this in my own set of LINQ benchmarks, and I found that despite that EqualityComparer<FatValueType>.Default.GetHashCode() returns the same value for two identical instances of this value type, in the array that contains 4 distinct copies of each value, instead of Hyperlinq returning 100 values, it returns 162. The first 100 values are the first 100 from the original source, but the following 62 are the ones where Value0 is between 1 and 63, except for 32.

From tracing the code in a debugger, I find that it seems like Hyperlinq's Set<T> implementation might be at fault. I am not sure why it is failing in this case, but it seems that after it has added the first 100 items, the 101st item (which is when Value0 is 0) is correctly found as being in the set, but the 102nd item (which is when Value0 is 1) is not correctly found as being in the set.

Probably the simplest way I found to duplicate the problem, without knowing how to fix it, is with the following:

Enumerable.Range(0, 20).Select(i => new FatValueType(i)).Concat(Enumerable.Range(1, 10).Select(i => new FatValueType(i))).AsValueEnumerable().Distinct()

This should return a set of 20 values (0 through 19), but it instead returns all 30 values of the original enumerable.

This problem does not seem to plague primitive types such as int, as if the Select statements are removed from the above, it only returns 20 values. I believe it also seems to affect reference types too, such as the FatReferenceType that is also in NetFabric's LINQ benchmarks project, since when I make an IEqualityComparer<T> class for FatReferenceType to compare its field1 value, despite that it returns true for the 1st and 21st values in the above (when FatValueType is replaced by FatReferenceType and an instance of the comparer is passed to Distinct), the above also returns 30 values instead of 20.

If I had to venture a guess as to why it is failing, it could be because of how Set<T> in Hyperlinq handles resizing itself, but I can't say for sure.

`MemoryValueEnumerable` Uses `SpanEnumerator` When `MemoryEnumerator` is Expected

I'm not trying to pile on here, I swear! 😅

I am running into this issue:

Error	CS8344	foreach statement cannot operate on enumerators of type 'SpanEnumerator<Notification>' in async or iterator methods because 'SpanEnumerator<Notification>' is a ref struct.

I expect this message for Span<T>-based components, but I am using a Memory<T> component which is based on a MemoryValueEnumerable:

I wanted to check in here to make sure this is expected behavior and/or if there are maybe plans to change it. 😁😇

Api should match Linq (Single, First => SingleOption, FirstOption)

In this issue you seem to have introduced an option type for some of the linq functions,
#133

However Linq original First and Hyperlinq first now have a different signature. Wouldn't it feel better to specify a new signature such as FirstOpt, FirstOption ? Actually this is what I've done for my own set of Linq extension methods.

It also prevents Hyperlinq to be a drag&drop solution as it will cause regression.

ToArray Result Element Type is Different Than ToList

Greetings. I have been enjoying getting Hyperlinq integrated into my code and the performance benefits that result of such.

I did find a possible issue, however, and wanted to get some eyes on it to verify if possible.

Using this file and line here:

https://github.com/DragonSpark/Framework/blob/04415dea2427ed6a0040fd580360544591a6e5c5/DragonSpark.Presentation/Delegates.cs#L42

And attempting to invoke ToArray instead, you can see that the returning element type is different from that which is returned by either ToList or the ToDictionary value type. Screenshot of this here:

Am I misunderstanding something fundamental here, by chance? It would seem to me that they all should return the same element value type.

Thank you for any context/insight you can provide, and thanks for your efforts into this library!

Select then ToList() doesn't work ?

Hi, I executed this simple test in Unity and the ToList() doesn't work well... ?!

public List<InboxItem> _firstList = null;

 public class InboxItem
 {
     public int int1;
     public float float1;
     public int int2;
 }

 //I just initialize the list with 10.000 items
 private void Start()
 {
     _firstList = new List<InboxItem>(100000);
     for (int i = 0; i < 100000; i++)
     {
         _firstList.Add(new InboxItem()
         {
             int1 = i,
             float1 = i,
             int2 = 2 * i
         });
     }
 }

private void TestSelectThenToListHyperLinq()
 {
     Debug.Log("TestSelectThenToListHyperLinq");
     var select = _firstList.AsValueEnumerable().Select(x => x.int1);
     var list = select.ToList();
     var array = select.ToArray();
     Debug.Log("list count = " + list.Count);  //return 0 !!!!
     Debug.Log("array Length = " + array.Length); //return n
 }

 private void TestSelectThenToListLinq()
 {
     Debug.Log("TestSelectThenToListLinq");
     var select = _firstList.Select(x => x.int1);
     var list = select.ToList();
     var array = select.ToArray();
     Debug.Log("list count = " + list.Count); //return n
     Debug.Log("array Length = " + array.Length); //return n
 }

Experiencing Slow ArrayPool Performance with Enumerables

Please see the SLN found here:

https://github.com/Mike-E-angelo/Stash/tree/master/Hyperlinq.Memory

When running the tests, I see the following:

|              Method |      Mean |    Error |   StdDev |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------- |----------:|---------:|---------:|-------:|------:|------:|----------:|
|      LinqEnumerable |  28.76 ns | 0.242 ns | 0.226 ns | 0.0038 |     - |     - |      32 B |
| HyperlinqEnumerable | 164.11 ns | 3.246 ns | 4.656 ns | 0.0038 |     - |     - |      32 B |
|     BasicEnumerable |  64.24 ns | 1.303 ns | 1.219 ns | 0.0038 |     - |     - |      32 B |
|           LinqArray |  30.62 ns | 0.479 ns | 0.400 ns | 0.0038 |     - |     - |      32 B |
|      HyperlinqArray |  44.31 ns | 0.663 ns | 0.620 ns |      - |     - |     - |         - |
|          BasicArray |  66.51 ns | 0.699 ns | 0.620 ns | 0.0038 |     - |     - |      32 B |

You can see that HyperlinqEnumerable seems abnormally slow. It is a copy/paste of HyperlinqArray but uses an IEnumerable<int> instead of an int[].

Wanted to bring this up in case it's not a known issue. Or if I have something fundamentally misunderstood. :)

make EmptyEnumerable a singleton

no need to create an instance every time the method is called

for reference:
https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/Enumerable.SpeedOpt.cs#L11

and

https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/Partition.SpeedOpt.cs#L25

or am I missing something? - https://stackoverflow.com/questions/14131168/singletons-using-structs-in-c-sharp

Issue with `AsAsyncValueEnumerable().SingleAsync`

Please see the tests here:
https://github.com/Mike-E-angelo/Stash/blob/master/Hyperlinq.SingleAsync/Hyperlinq.SingleAsync/SingleTests.cs#L13-L32

In it, I have Hyperlinq and System.Linq.Async polling an IAsyncEnumerable created from an EF.CompileQueryAsync call with their respective SingleAsync method.

System.Linq.Async passes while Hyperlinq does not, even though Hyperlinq reports that Option<T>.IsSome is true.

This seems wrong to me. 😁

netfabric / netfabric.hyperlinq Goto Github PK

netfabric.hyperlinq's Introduction

NetFabric

Throw

Credits

License

netfabric.hyperlinq's People

Contributors

Stargazers

Watchers

Forkers

netfabric.hyperlinq's Issues

Try pattern

Nullable<T>

ValueTuple<bool, T>

Option<T>

Result<TOk, TError>

Match pattern

Recommend Projects

Recommend Topics

Recommend Org

`Nullable<T>`

`ValueTuple<bool, T>`

`Option<T>`

`Result<TOk, TError>`