Git Product home page Git Product logo

Comments (11)

dermesser avatar dermesser commented on August 15, 2024

Can you elaborate on the goal of this?

Do you have quantitative evidence of varint code needing a fast path?

from integer-encoding-rs.

JAicewizard avatar JAicewizard commented on August 15, 2024

If n<128 no actual operation or calculation is required. Adding a single condition to be able to manage all relatively small numbers seems to be trivial.
For encoding there already is a fast-path for n==0, but this should be extendable to n<128.
Both showed up in my benchmarks, and adding the fast-path for both improved performance by ~5%

from integer-encoding-rs.

dermesser avatar dermesser commented on August 15, 2024

Interesting! Would you mind sending a PR?

from integer-encoding-rs.

dermesser avatar dermesser commented on August 15, 2024

if you have useful benchmarks, feel free to add them too.

from integer-encoding-rs.

JAicewizard avatar JAicewizard commented on August 15, 2024

hmm, I tried to benchmark my program with/without special cases but they seem to be pretty much the same at the moment, I think that is because all numbers I encode are now constant and inlined so it can optimize it away anyways. In benchmarks where I initially added fast-paths I could get up to 20% improvement.

Right now I am measuring 12.3 ns for 10 fast-path encodes and 14.8 ns for unoptimized encode for ints under 64 where everything falls under the encoding fast-path. so the fast-path can improve performance by 17% where it applies.
I am also getting better performance for cases that fall outside the fast-path (13.5ns vs 14.8) which I found interesting.

On another note, I found that for all numbers this implementation is off the golang version, written by google. for numbers specified in this document it also doesn't match even though it is linked in the readme. Is this known?

from integer-encoding-rs.

dermesser avatar dermesser commented on August 15, 2024

On another note, I found that for all numbers this implementation is off the golang version, written by google. for numbers specified in this document it also doesn't match even though it is linked in the readme. Is this known?

where do you see the mismatch? I added a simple interactive encoder and it gives me the same result as the google document:

1
fixed: 1 encoded (unsigned): 00000001  encoded (signed): 00000010 
300
fixed: 100101100 encoded (unsigned): 10101100 00000010  encoded (signed): 11011000 00000100

from integer-encoding-rs.

JAicewizard avatar JAicewizard commented on August 15, 2024

oops!!! I am so sorry if this took a lot of your time, I tried to make it recreatable in the rust playground, and while transforming the method into a separate function with explicit type u64 I realized that rust decided that 150 was a signed integer instead of unsigned.

so here are the reran numbers on my machine for encoding. All functions are within the same crate and module. (again 10 runs per "iteration" to make per iteration overhead irrelevant.)

no forced inlining

between 0x0 and 0x80:
original: 14.8
fast-path: 12.3
port of golang: 12.4

between 0x80 and 0xff:
original: 18.4
fast-path: 18.5
port of golang: 14.8

between 0x0 and 0xff:
original: 15.8
fast-path: 14.6
port of golang: 13.1

with inlining force enabled:

between 0x0 and 0x80:
original: 14.9
fast-path: 12.3
port of golang: 12.4

between 0x80 and 0xff:
original: 18.9
fast-path: 18.8
port of golang: 14.8

between 0x0 and 0xff:
original: 16.8
fast-path: 15.3
port of golang: 13.4

as you can see the golang port is not faster within 0x0 and 0x80, as long as the fast-path is applied, but for numbers between 0x80 and 0xff it is faster. Another thing you can see is that forcefully inlining doesn't make a difference. But cargo doesnt do inlining cross-crate even for very small functions. Calling the encode function from your crate creates worse performance because of inlining:

between 0x0 and 0x80: 17.7
between 0x80 and 0xff: 22.4
between 0x0 and 0xff: 20.4

So even adding an inline attribute would improve performance

Also some benchmarks for the size function (again per 10):

cross-crate: 13.3
in-crate: 7.5
modified*: 6.6

as you can see adding inline to that function could also drastically improve performance.

I havnt done any benchmarks for decoding but if you are interested I can do those too, currently off from school since the global health crisis...

*modified version, basically the go encoding but without the writing data.

go port of varint encoding:

let mut i = 0;
while u >= 0x80 {
	dst[i] = u as u8 | 0x80;
	u >>= 7;
	i += 1;
}
dst[i] = u as u8;
i + 1

from integer-encoding-rs.

dermesser avatar dermesser commented on August 15, 2024

Thank you for the code sample, it makes it clear what you had in mind. I pushed commit
c742e40, let me know if it's correct and achieves the desired benchmark performance. The tests pass, and you can pull branch varint-optimize to check it out.

from integer-encoding-rs.

JAicewizard avatar JAicewizard commented on August 15, 2024

I would also recommend adding inline attributes to allow inlining(not force it) and remove the outor if-statement since doesn't do anything. now it checks first if its equal to zero and then less then to 0x80 and they both result in (almost) the same code running. It doesn't improve performance and makes the code look unclear.

I would recomend just this:

let mut i = 0;
while u >= 0x80 {
	dst[i] = u as u8 | 0x80;
	u >>= 7;
	i += 1;
}
dst[i] = u as u8;
i + 1

Also you have constants at the top of the file, note that 0x80 == 0b10000000 aka MSB so if you prefer to use that do that.

from integer-encoding-rs.

dermesser avatar dermesser commented on August 15, 2024

Thanks, I overlooked this. I've also used your proposed algorithm for signed integers now.

from integer-encoding-rs.

JAicewizard avatar JAicewizard commented on August 15, 2024

That all looks very good to me, I guess I should close the issue?

I did some benchmarking for decoding but didn't find any improvements changing the decoding function.

from integer-encoding-rs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.