Git Product home page Git Product logo

Comments (6)

argosphil avatar argosphil commented on July 20, 2024

os.cpus() is a bad API because it's based on a circa 2000 concept of a CPU: a CPU is a physical chip which allows a single thread of execution and runs at a fixed clock frequency, which determines its speed of execution. Normal users have only one of them, but some expensive system have a small fixed number of identical CPUs.

Today, a CPU is (often) virtual, may offer more than one thread of execution (which interfere in nontrivial ways), runs at a variable clock frequency, becomes available or unavailable while running as virtualization or firmware steals it from the operating system, becomes slower or faster due to power management even when it is running at the same frequency, and there are several different CPU cores on the same system at the same time, designed for efficiency, performance, or latency. The number of CPUs can change at any time. Furthermore, how many of these CPUs should be used by a specific program isn't exposed by os.cpus() at all: maybe my process is limited to a single CPU core by cgroups, maybe it doesn't benefit from SMT and I shouldn't use it, maybe there are many available CPUs which I could use but I shouldn't for some other reason.

On Linux, the current consensus is that the only information you should ever use is the number returned by nproc, and that system administrators should replace that program by one that prints the actually desired number of execution threads. It is nothing to do with the actual number of physical CPUs on any physical machine, and on recent systems is much larger than the number of threads it is useful to spawn (on this laptop, nproc currently prints 20).

Sorry this got a bit long, but in summary, we can't fix bad JavaScript code using a bad API to make a bad guess at the number they actually care for, which is "how many workers should I spawn". That number is almost totally unrelated to the name "hardware concurrency", but there is an API to access it, and it's regrettably called navigator.hardwareConcurrency.

We should define os.cpus = Bun._Os().cpus.

from bun.

Electroid avatar Electroid commented on July 20, 2024

Is the only reason you're saying it's broken is because the console.log output is non-helpful? If so, we could fix that.

The reason we do this optimization is because it's used in a lot of places and indeed does improve startup time. We also haven't seen any reported issues where it being lazy has caused issues.

from bun.

argosphil avatar argosphil commented on July 20, 2024

Is the only reason you're saying it's broken is because the console.log output is non-helpful? If so, we could fix that.

No. Right now, it's severely broken: if you modify the array returned by os.cpus(), things will break badly.

That can be fixed. What cannot be fixed is that delazification means the data won't be generated at the right time. If I do:

let oldCpus = os.cpus();
<expensive code here>
let newCpus = os.cpus();
return newCpus[0].times.user - oldCpus[0].times.user;

that will actually return a NEGATIVE number, because newCpus is delazified before oldCpus is. (I see no other way cpus()[i].times could be used, tbh. You take two snapshots and calculate the difference.)

The reason we do this optimization is because it's used in a lot of places and indeed does improve startup time. We also haven't seen any reported issues where it being lazy has caused issues.

I understand that. It's an invalid optimization that happens to work in 90% of cases. A similar example that was just asked about on Discord is why "a".toString() isn't simplified to "a". The answer is that it doesn't work in all cases.

Note that I can't reproduce the alleged Linux issue. As the "current clock speed" of a CPU is useless information, we could always return 0 for it, as we do on my aarch64 VM, or return the information from /proc/cpuinfo without digging into /sys. Both would avoid the performance problem, if it even still exists.

I believe the "overkill" solution is actually worth looking into: if a zig function is called to generate an object which is immediately destructured/derefed so only some of its properties are relevant, we can optimize away the generation of other properties (assuming they happen without side effects). For example, if Array.map() is called and we immediately discard the result, we can fall back to Array.forEach().

Other languages do this, of course, but I haven't been able to find out how it would work with JSC.

from bun.

Jarred-Sumner avatar Jarred-Sumner commented on July 20, 2024

Can you point to third-party library code which modifies the array returned by os.cpus()? The data in os.cpus is supposed to mostly never change between runs.

from bun.

argosphil avatar argosphil commented on July 20, 2024

Can you point to third-party library code which modifies the array returned by os.cpus()?

Apart from the Discord OP's code, there appear to be 7 instances that are easily searchable on GitHub:

https://github.com/search?q=cpus().shift()&type=code
https://github.com/search?q=cpus().pop()&type=code

As for the "times" property being used in a way that would break, here's a random example:

https://github.com/diefarbe/node-lib/blob/d17d118320ec16e1f2a9983dd7ec4864d4e9b31b/POC/ts/examples/cpu-meter.ts#L51

That code stores the return value of os.cpus() at one point, then compares it to another call of os.cpus() which happened later. Since both are delazified at the same time, there will be no difference.

I'm not sure how often either of those things happen.

The data in os.cpus is supposed to mostly never change between runs.

The only part that doesn't change for me is the "model" property.

Note that this isn't about making future calls to os.cpus() return different values, just about modifying what's documented to be an array of objects to retrieve the individual objects.

The usage of os.cpus().length to determine how many threads to spawn is explicitly advised against in the node documentation, FWIW.

from bun.

argosphil avatar argosphil commented on July 20, 2024

Maybe a compromise here would be to make cpus()[i].speed a getter and eagerly populate the rest of the objects? That way, the information would be only slightly inaccurate (clock speed measured at the wrong time), but it wouldn't be as slow as digging through nprocs files in /sys. On recent Linux systems the clock speed is also exported via /proc/cpuinfo so we wouldn't have to use a getter there at all.

from bun.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.