Comments (6)
os.cpus()
is a bad API because it's based on a circa 2000 concept of a CPU: a CPU is a physical chip which allows a single thread of execution and runs at a fixed clock frequency, which determines its speed of execution. Normal users have only one of them, but some expensive system have a small fixed number of identical CPUs.
Today, a CPU is (often) virtual, may offer more than one thread of execution (which interfere in nontrivial ways), runs at a variable clock frequency, becomes available or unavailable while running as virtualization or firmware steals it from the operating system, becomes slower or faster due to power management even when it is running at the same frequency, and there are several different CPU cores on the same system at the same time, designed for efficiency, performance, or latency. The number of CPUs can change at any time. Furthermore, how many of these CPUs should be used by a specific program isn't exposed by os.cpus()
at all: maybe my process is limited to a single CPU core by cgroups, maybe it doesn't benefit from SMT and I shouldn't use it, maybe there are many available CPUs which I could use but I shouldn't for some other reason.
On Linux, the current consensus is that the only information you should ever use is the number returned by nproc
, and that system administrators should replace that program by one that prints the actually desired number of execution threads. It is nothing to do with the actual number of physical CPUs on any physical machine, and on recent systems is much larger than the number of threads it is useful to spawn (on this laptop, nproc
currently prints 20
).
Sorry this got a bit long, but in summary, we can't fix bad JavaScript code using a bad API to make a bad guess at the number they actually care for, which is "how many workers should I spawn". That number is almost totally unrelated to the name "hardware concurrency", but there is an API to access it, and it's regrettably called navigator.hardwareConcurrency
.
We should define os.cpus = Bun._Os().cpus
.
from bun.
Is the only reason you're saying it's broken is because the console.log
output is non-helpful? If so, we could fix that.
The reason we do this optimization is because it's used in a lot of places and indeed does improve startup time. We also haven't seen any reported issues where it being lazy has caused issues.
from bun.
Is the only reason you're saying it's broken is because the
console.log
output is non-helpful? If so, we could fix that.
No. Right now, it's severely broken: if you modify the array returned by os.cpus(), things will break badly.
That can be fixed. What cannot be fixed is that delazification means the data won't be generated at the right time. If I do:
let oldCpus = os.cpus();
<expensive code here>
let newCpus = os.cpus();
return newCpus[0].times.user - oldCpus[0].times.user;
that will actually return a NEGATIVE number, because newCpus is delazified before oldCpus is. (I see no other way cpus()[i].times could be used, tbh. You take two snapshots and calculate the difference.)
The reason we do this optimization is because it's used in a lot of places and indeed does improve startup time. We also haven't seen any reported issues where it being lazy has caused issues.
I understand that. It's an invalid optimization that happens to work in 90% of cases. A similar example that was just asked about on Discord is why "a".toString() isn't simplified to "a". The answer is that it doesn't work in all cases.
Note that I can't reproduce the alleged Linux issue. As the "current clock speed" of a CPU is useless information, we could always return 0 for it, as we do on my aarch64 VM, or return the information from /proc/cpuinfo without digging into /sys. Both would avoid the performance problem, if it even still exists.
I believe the "overkill" solution is actually worth looking into: if a zig function is called to generate an object which is immediately destructured/derefed so only some of its properties are relevant, we can optimize away the generation of other properties (assuming they happen without side effects). For example, if Array.map() is called and we immediately discard the result, we can fall back to Array.forEach().
Other languages do this, of course, but I haven't been able to find out how it would work with JSC.
from bun.
Can you point to third-party library code which modifies the array returned by os.cpus()? The data in os.cpus is supposed to mostly never change between runs.
from bun.
Can you point to third-party library code which modifies the array returned by os.cpus()?
Apart from the Discord OP's code, there appear to be 7 instances that are easily searchable on GitHub:
https://github.com/search?q=cpus().shift()&type=code
https://github.com/search?q=cpus().pop()&type=code
As for the "times" property being used in a way that would break, here's a random example:
That code stores the return value of os.cpus()
at one point, then compares it to another call of os.cpus()
which happened later. Since both are delazified at the same time, there will be no difference.
I'm not sure how often either of those things happen.
The data in os.cpus is supposed to mostly never change between runs.
The only part that doesn't change for me is the "model" property.
Note that this isn't about making future calls to os.cpus() return different values, just about modifying what's documented to be an array of objects to retrieve the individual objects.
The usage of os.cpus().length
to determine how many threads to spawn is explicitly advised against in the node documentation, FWIW.
from bun.
Maybe a compromise here would be to make cpus()[i].speed
a getter and eagerly populate the rest of the objects? That way, the information would be only slightly inaccurate (clock speed measured at the wrong time), but it wouldn't be as slow as digging through nprocs
files in /sys
. On recent Linux systems the clock speed is also exported via /proc/cpuinfo
so we wouldn't have to use a getter there at all.
from bun.
Related Issues (20)
- Auto-install doesn't work for sqlite3 library HOT 2
- Util functions for Bun Shell HOT 2
- Bun interactive shell `bunx bunx`
- Wildcard sideEffects warning using bun run HOT 3
- dynamic-import bug (core dumped) HOT 2
- bun init not accepting Ctrl-C or Enter on MacOS 13.6.3 for bun -v 1.0.30 HOT 2
- fs/promises methods throw EFAULT instead of ENOENT for missing files/directories
- `bun shell` command for quick, cross platform `package.json` scripts HOT 2
- hot reload (docker) changes inside the container HOT 1
- Improve error codes on sigsegv fault HOT 4
- Source maps and error locations are broken
- DOMParser API HOT 1
- package.json scripts `argv` not handling quotes well HOT 2
- import sqlite from "better-sqlite3"; new sqlite("sqlite.db"); // process killed HOT 1
- `ReadableStream` occur error on type `direct` HOT 2
- Ability to not clear screen in watch mode HOT 2
- bun build linked sourcemap option HOT 1
- Mocked module isn't applied
- bun run scripts InvalidExe
- bun shell `rm -r` does not delete all files and can also hit Invalid state HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from bun.