What version of Bun is running? 1.0.29+a146856d1 <h3 dir="auto

os.cpus() is broken about bun HOT 6 OPEN

argosphil commented on July 20, 2024

os.cpus() is broken

from bun.

Comments (6)

argosphil commented on July 20, 2024

os.cpus() is a bad API because it's based on a circa 2000 concept of a CPU: a CPU is a physical chip which allows a single thread of execution and runs at a fixed clock frequency, which determines its speed of execution. Normal users have only one of them, but some expensive system have a small fixed number of identical CPUs.

Today, a CPU is (often) virtual, may offer more than one thread of execution (which interfere in nontrivial ways), runs at a variable clock frequency, becomes available or unavailable while running as virtualization or firmware steals it from the operating system, becomes slower or faster due to power management even when it is running at the same frequency, and there are several different CPU cores on the same system at the same time, designed for efficiency, performance, or latency. The number of CPUs can change at any time. Furthermore, how many of these CPUs should be used by a specific program isn't exposed by os.cpus() at all: maybe my process is limited to a single CPU core by cgroups, maybe it doesn't benefit from SMT and I shouldn't use it, maybe there are many available CPUs which I could use but I shouldn't for some other reason.

On Linux, the current consensus is that the only information you should ever use is the number returned by nproc, and that system administrators should replace that program by one that prints the actually desired number of execution threads. It is nothing to do with the actual number of physical CPUs on any physical machine, and on recent systems is much larger than the number of threads it is useful to spawn (on this laptop, nproc currently prints 20).

Sorry this got a bit long, but in summary, we can't fix bad JavaScript code using a bad API to make a bad guess at the number they actually care for, which is "how many workers should I spawn". That number is almost totally unrelated to the name "hardware concurrency", but there is an API to access it, and it's regrettably called navigator.hardwareConcurrency.

We should define os.cpus = Bun._Os().cpus.

from bun.

Electroid commented on July 20, 2024

Is the only reason you're saying it's broken is because the console.log output is non-helpful? If so, we could fix that.

The reason we do this optimization is because it's used in a lot of places and indeed does improve startup time. We also haven't seen any reported issues where it being lazy has caused issues.

from bun.

argosphil commented on July 20, 2024

Is the only reason you're saying it's broken is because the console.log output is non-helpful? If so, we could fix that.

No. Right now, it's severely broken: if you modify the array returned by os.cpus(), things will break badly.

That can be fixed. What cannot be fixed is that delazification means the data won't be generated at the right time. If I do:

let oldCpus = os.cpus();
<expensive code here>
let newCpus = os.cpus();
return newCpus[0].times.user - oldCpus[0].times.user;

that will actually return a NEGATIVE number, because newCpus is delazified before oldCpus is. (I see no other way cpus()[i].times could be used, tbh. You take two snapshots and calculate the difference.)

The reason we do this optimization is because it's used in a lot of places and indeed does improve startup time. We also haven't seen any reported issues where it being lazy has caused issues.

I understand that. It's an invalid optimization that happens to work in 90% of cases. A similar example that was just asked about on Discord is why "a".toString() isn't simplified to "a". The answer is that it doesn't work in all cases.

Note that I can't reproduce the alleged Linux issue. As the "current clock speed" of a CPU is useless information, we could always return 0 for it, as we do on my aarch64 VM, or return the information from /proc/cpuinfo without digging into /sys. Both would avoid the performance problem, if it even still exists.

I believe the "overkill" solution is actually worth looking into: if a zig function is called to generate an object which is immediately destructured/derefed so only some of its properties are relevant, we can optimize away the generation of other properties (assuming they happen without side effects). For example, if Array.map() is called and we immediately discard the result, we can fall back to Array.forEach().

Other languages do this, of course, but I haven't been able to find out how it would work with JSC.

from bun.

Jarred-Sumner commented on July 20, 2024

Can you point to third-party library code which modifies the array returned by os.cpus()? The data in os.cpus is supposed to mostly never change between runs.

from bun.

argosphil commented on July 20, 2024

Can you point to third-party library code which modifies the array returned by os.cpus()?

Apart from the Discord OP's code, there appear to be 7 instances that are easily searchable on GitHub:

https://github.com/search?q=cpus().shift()&type=code
https://github.com/search?q=cpus().pop()&type=code

As for the "times" property being used in a way that would break, here's a random example:

https://github.com/diefarbe/node-lib/blob/d17d118320ec16e1f2a9983dd7ec4864d4e9b31b/POC/ts/examples/cpu-meter.ts#L51

That code stores the return value of os.cpus() at one point, then compares it to another call of os.cpus() which happened later. Since both are delazified at the same time, there will be no difference.

I'm not sure how often either of those things happen.

The data in os.cpus is supposed to mostly never change between runs.

The only part that doesn't change for me is the "model" property.

Note that this isn't about making future calls to os.cpus() return different values, just about modifying what's documented to be an array of objects to retrieve the individual objects.

The usage of os.cpus().length to determine how many threads to spawn is explicitly advised against in the node documentation, FWIW.

from bun.

argosphil commented on July 20, 2024

Maybe a compromise here would be to make cpus()[i].speed a getter and eagerly populate the rest of the objects? That way, the information would be only slightly inaccurate (clock speed measured at the wrong time), but it wouldn't be as slow as digging through nprocs files in /sys. On recent Linux systems the clock speed is also exported via /proc/cpuinfo so we wouldn't have to use a getter there at all.

from bun.

os.cpus() is broken about bun HOT 6 OPEN

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent