Git Product home page Git Product logo

Comments (10)

treeowl avatar treeowl commented on July 26, 2024 1

Oh, it looks like I misread the timings. Impressive! Could you open a PR?

from containers.

meooow25 avatar meooow25 commented on July 26, 2024

I just realized it is safe to use bin instead of link in mergeInto, which decreases the time further.

  fromDistinctAscList:  OK (0.22s)
    26.5 μs ± 2.3 μs, 264 KB allocated, 5.3 KB copied, 7.0 MB peak memory
  fromDistinctAscList2: OK (0.22s)
    26.4 μs ± 1.6 μs, 328 KB allocated, 8.7 KB copied, 7.0 MB peak memory

But I suspect this can be adapted to the current definition too, so I would not consider this in favor of the new definition.

from containers.

treeowl avatar treeowl commented on July 26, 2024

Your description reminds me of the way we used to do it for IntMap. IIRC, we switched from that because the simpler way performed at least as well (in that context) and it was easier for my poor brain to understand. That said, you're making a good case here. However, I can't understand your implementation. Could you use scoped type variables to give the local functions types, and add comments to explain what they do? Also, have you considered phased rewrite rules to turn the direct way into the stack-based way and then turn it back if things don't fuse?

from containers.

meooow25 avatar meooow25 commented on July 26, 2024

The IntSet/IntMap change does look similar (#658)! I'll check that out.


How my implementation above works is very similar to counting up in binary:

[]      + 1 -> [1]
[1]     + 1 -> [1,1] -> [2]
[2]     + 1 -> [1,2]
[1,2]   + 1 -> [1,1,2] -> [2,2] -> [4]
[4]     + 1 -> [1,4]
[1,4]   + 1 -> [1,1,4] -> [2,4]
[2,4]   + 1 -> [1,2,4]
[1,2,4] + 1 -> [1,1,2,4] -> [2,2,4] -> [4,4] -> [8]
...

But a little different because perfect binary trees have size $2^n-1$ instead of $2^n$.

Also note that my implementation constructs exactly the same tree as the current algorithm, linking all the same trees, it's just done in a different, and I would say slightly simpler, way. Here's a cleaned up version:

data SetPart a
    = PartL  !(Set a)     -- (PartL l) invariant: l is perfect
    | PartLM !(Set a) !a  -- (PartLM l x) invariant: l is a perfect and maximum l < x

fromDistinctAscList :: forall a. [a] -> Set a
fromDistinctAscList = List.foldl' mergePart Tip . List.foldl' next []
  where
    next :: [SetPart a] -> a -> [SetPart a]
    next (PartL l : parts) !x = PartLM l x : parts
    next parts0 x0 = mergeInto (Bin 1 x0 Tip Tip) parts0
      where
        mergeInto !r (PartLM l x : parts)
            | sz r == sz l = mergeInto (bin x l r) parts
        mergeInto l parts = PartL l : parts

    mergePart :: Set a -> SetPart a -> Set a
    mergePart r (PartL l)    = merge l r
    mergePart r (PartLM l x) = link x l r

    sz :: Set a -> Int
    sz (Bin s _ _ _) = s
    sz Tip = error "impossible"
  fromDistinctAscList:  OK (0.21s)
    25.7 μs ± 1.6 μs, 240 KB allocated, 4.8 KB copied, 7.0 MB peak memory
  fromDistinctAscList2: OK (0.21s)
    25.7 μs ± 2.1 μs, 304 KB allocated, 8.1 KB copied, 7.0 MB peak memory

There is a property that PartL will only occur at the head of the stack for every other element. So the code could be alternately written as

data SetPart a = PartLM !(Set a) !a
data SetBuildState a
    = StateEven [SetPart a]
    | StateOdd [SetPart a] !(Set a)

fromDistinctAscList :: [a] -> Set a
fromDistinctAscList = mergeParts . List.foldl' next (StateEven [])
  where
    next (StateOdd parts l) !x = StateEven (PartLM l x : parts)
    next (StateEven parts0) x0 = mergeInto (Bin 1 x0 Tip Tip) parts0
      where
        mergeInto !r (PartLM l x : parts)
            | sz r == sz l = mergeInto (bin x l r) parts
        mergeInto l parts = StateOdd parts l
    
    mergeParts (StateOdd parts r) = List.foldl' mergePart r parts
    mergeParts (StateEven parts)  = List.foldl' mergePart Tip parts

    mergePart r (PartLM l x) = link x l r

    sz (Bin s _ _ _) = s
    sz Tip = error "impossible"

But this performs slightly worse.

  fromDistinctAscList:  OK (0.11s)
    28.6 μs ± 2.7 μs, 256 KB allocated, 4.9 KB copied, 7.0 MB peak memory
  fromDistinctAscList2: OK (0.12s)
    28.4 μs ± 2.8 μs, 320 KB allocated, 8.3 KB copied, 7.0 MB peak memory

About rewrite rules, I'll test it out but I'm not sure it's worth the complexity. It would also be atypical because I've only seen rewrite back rules for good consumer + producers, but here we're only working with a good consumer.

from containers.

treeowl avatar treeowl commented on July 26, 2024

Don't worry about "atypical". Such rules are justified whenever another implementation is faster, thriftier, or smaller when fusion doesn't occur.

from containers.

meooow25 avatar meooow25 commented on July 26, 2024

I was able to get the rewrite rules working as

"Set.fromDistinctAscList" [~1]
    forall xs. fromDistinctAscList xs = List.foldr foo bar xs baz
"Set.fromDistinctAscList back" [1]
    forall xs. List.foldr foo bar xs baz = fromDistinctAscList xs

I suspect this can be adapted to the current definition too, so I would not consider this in favor of the new definition.

But I was wrong about this and I can't find a way to make the current definition as fast as the new one.

So I say let's go with only the new definition.

from containers.

treeowl avatar treeowl commented on July 26, 2024

Other things being roughly equal, a substantial increase in allocation is bad, particularly for concurrent programs. So I would prefer to get the version with rewrite rules.

from containers.

meooow25 avatar meooow25 commented on July 26, 2024

I wouldn't say it's roughly equal.

Current:

  fromDistinctAscList: OK (0.15s)
    37.9 μs ± 3.3 μs, 159 KB allocated, 3.1 KB copied, 7.0 MB peak memory

Current, that I tried to optimize to use bin but it barely had any effect:

  fromDistinctAscList: OK (0.15s)
    35.1 μs ± 3.5 μs, 159 KB allocated, 3.1 KB copied, 7.0 MB peak memory

The new implementation (in my comment above):

  fromDistinctAscList:  OK (0.21s)
    25.7 μs ± 1.6 μs, 240 KB allocated, 4.8 KB copied, 7.0 MB peak memory

Another version of the new implementation where I tried to adopt the Stack from the old IntSet to reduce allocations:

  fromDistinctAscList: OK (0.21s)
    26.6 μs ± 2.0 μs, 224 KB allocated, 4.4 KB copied, 7.0 MB peak memory

The new version takes at least 25% less time compared to current.
We might still deliberately choose to trade off time for space, but the time difference is not negligible.

from containers.

meooow25 avatar meooow25 commented on July 26, 2024

I see that fromAscList, fromAscListWith, fromAscListWithKey all delegate to fromDistinctAscList after some combining function, which is quite convenient.
Once we have this change, it seems like a good idea to make them participate in fusion too, which can be done by just making the combining function a good consumer and producer.

from containers.

meooow25 avatar meooow25 commented on July 26, 2024

Thanks for the merge 🎉
As mentioned fromAscList and friends can also be improved, but it might take a couple of weeks before I can start working on it.

from containers.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.