Comments (18)
After making the above change (adding .emit_rerun_if_changed(false)
and adding a manual emit) things are looking MUCH better:
$ time cargo check --workspace --manifest-path ./Cargo.toml
1.66s user 0.33s system 74% cpu 2.666 total
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
That's more like it
from cargo.
@bradzacher Thanks for that context! It looks like only one specific prepare_target
call is taking up all of that time. If you re-run with cargo check && CARGO_LOG_PROFILE=true CARGO_LOG_PROFILE_CAPTURE_ARGS=true cargo check
then the traces will include the Package ID of the package that is causing this. From that we can try to understand what is different about this package that it is hitting this case in cargo. In particular, whether there is a build script and what directives it emits, whether there are symlinks inside its directory, whether it sits at an odd location on disk, etc.
@Xuanwo I don't even know what file walk this is to evaluate an answer like that. One theory is a build.rs
that gives a bad path for rerun-if-changed, or we have some bug with how we are walking the directory structure for one of the rebuild-detection checks we do.
from cargo.
How about making cargo respect .gitignore, or alternatively, introducing .cargoignore when managing workspaces?
@Xuanwo it already does. The presence of exclude and include fields would affect whether .gitignore is respected.
from cargo.
And it only just clicked that OFC the fact that it's dumping cargo:rerun-if-changed=../../
is what's causing cargo to watch the entire repo and crawl everything.
from cargo.
I wonder if we should track how long rerun-if-changed took and produce a warning if it's above a certain threshold. Unsure whether something like this would come up often enough to be worth it.
EDIT: Something cheaper than a warning is to just log the file walk we do for rerun-if-changed, so that someone using the log profiling can find it.
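To make the "time the walk, warn above a threshold" idea concrete, here is a minimal sketch. It is not cargo's implementation: `count_entries`, `timed_walk`, and the 500 ms threshold are hypothetical names and values chosen for illustration, and the walk only counts entries rather than comparing mtimes.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant};

/// Recursively counts the entries under `root` (hypothetical helper).
fn count_entries(root: &Path) -> io::Result<u64> {
    let mut count = 0;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        count += 1;
        // `DirEntry::file_type` does not follow symlinks, so a symlinked
        // directory is counted once and never descended into.
        if entry.file_type()?.is_dir() {
            count += count_entries(&entry.path())?;
        }
    }
    Ok(count)
}

/// Walks `root` and returns the entry count, plus a warning message
/// if the walk took longer than `threshold`.
fn timed_walk(root: &Path, threshold: Duration) -> io::Result<(u64, Option<String>)> {
    let start = Instant::now();
    let entries = count_entries(root)?;
    let elapsed = start.elapsed();
    let warning = (elapsed > threshold).then(|| {
        format!(
            "rerun-if-changed walk of {} visited {} entries in {:?}",
            root.display(),
            entries,
            elapsed
        )
    });
    Ok((entries, warning))
}

fn main() -> io::Result<()> {
    // Assumed threshold; a real value would need tuning.
    let (entries, warning) = timed_walk(Path::new("."), Duration::from_millis(500))?;
    println!("visited {entries} entries");
    if let Some(message) = warning {
        eprintln!("warning: {message}");
    }
    Ok(())
}
```

Returning the message instead of printing it inside `timed_walk` keeps the choice between a warning and a log line with the caller, which matches the cheaper logging-only variant from the EDIT.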
from cargo.
So that rerun-if-changed came from tonic.
TL;DR cos we build with both cargo and bazel - we need to make the two work in sync.
Bazel runs everything from the repo root.
So when we generate code from proto files using tonic we need to tell it to resolve the proto paths relative to the repo root (or else tonic would generate different code when run via bazel or via cargo).
So the tonic config includes this config:
.compile(
/* protos: */ &["../../tools/crate3/some/file.foo"],
/* includes: */ &["../../"],
)?
Tonic automatically emits cargo:rerun-if-changed directives - hence it emitted cargo:rerun-if-changed=../../.
Thankfully tonic's builder includes emit_rerun_if_changed(bool) - so we can disable its default and manually emit a sane one instead (cargo:rerun-if-changed=tools/crate3/some/file.foo)
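A minimal sketch of the "manually emit" half of that fix, using only the standard library so it runs on its own. The `PROTOS` constant and the `rerun_directives` helper are hypothetical names. The tonic call is left as a comment because the sketch does not depend on tonic. The path is the proto path quoted in this thread, as seen from the crate's own directory.

```rust
// Proto files handed to tonic, relative to the crate directory.
// Hypothetical constant: the same list feeds the manual directives below.
const PROTOS: &[&str] = &["../../tools/crate3/some/file.foo"];

/// Formats one `cargo:rerun-if-changed` directive per input path.
fn rerun_directives(paths: &[&str]) -> Vec<String> {
    paths
        .iter()
        .map(|path| format!("cargo:rerun-if-changed={path}"))
        .collect()
}

fn main() {
    // In the real build.rs, the tonic builder would run here with
    // `.emit_rerun_if_changed(false)` set before `.compile(...)`, so that
    // tonic does not emit a directive for the `../../` include directory.

    // Watch only the proto files themselves, never the include root.
    for line in rerun_directives(PROTOS) {
        println!("{line}");
    }
}
```

Driving both the compile call and the directives from one list means a newly added proto file cannot be compiled without also being watched.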
from cargo.
Huh, not too sure what this would be. cargo check --manifest-path Cargo.toml vs cargo check shouldn't be too different. When looking for Cargo.toml, we only walk up the directories, not back down again.
Could you give us an idea of
- Your general directory hierarchy (where node_modules, workspace Cargo.toml, package Cargo.toml, bazel build directory, etc live relative to each other)
- What your workspace.members looks like
Even better if you could also run cargo check && CARGO_LOG_PROFILE=true cargo check
and open the traces in chrome://tracing
and let us know what the hot section is (docs). If possible, this would be best done with 1.79 as we tweak the traces for better information with each release.
from cargo.
Your general directory hierarchy
it's all over the place, really - the repo is a mess that's grown organically over time.
here's a rough idea of the mess
- /
  - Cargo.toml (workspace)
  - (bazel folders symlinked in the repo root)
  - bazel_out/
  - bazel_bin/
  - bazel_canva/
  - web/ (the frontend codebase)
    - node_modules/
    - crate1/ [[rust]] (some WASM rust)
      - Cargo.toml
  - tools/ (a lot of internal tooling)
    - crate2/ [[rust]]
      - Cargo.toml
    - crate3/ [[rust]]
      - Cargo.toml
    - ...
    - js1/
      - node_modules/
    - js2/
      - node_modules/
    - ...
  - backend_service1/ (a backend microservice)
  - backend_service_rust1/ [[rust]] (a backend microservice, written in rust)
    - Cargo.toml
What your workspace.members looks like
it's just a list of folders (i.e. example based on the above)
members = [
    "web/crate1",
    "tools/crate2",
    "tools/crate3",
    "backend_service_rust1",
    ...
    "path/to/crate12", # there are only 12 crates in our monorepo thus far
]
from cargo.
Trace (with 1.78):
trace-1720750556589970.json
I'm not sure how to interpret the trace so here it is in its raw form.
The bulk of the time is spent in "Name prepare_target, Category cargo::core::compiler::fingerprint"
from cargo.
Trace (with 1.79):
trace-1720751045123202.json
Looks to be much the same
from cargo.
How about making cargo respect .gitignore, or alternatively, introducing .cargoignore when managing workspaces?
from cargo.
Some more questions:
- Is there any Cargo.toml containing odd package.include that refer to relative parent directory like ..?
- Is there any symlink under a member package pointing to directories outside the package root, for example to the workspace root?
- Any builds script rerun-if-changed has relative paths?
from cargo.
Is there any Cargo.toml containing odd package.include that refer to relative parent directory like ..?
No. There is no package.include defined in any Cargo.toml.
Is there any symlink under a member package pointing to directories outside the package root, for example to the workspace root?
Two crates contain a node_modules folder which has symlinks (as pnpm does symlink installs).
Each symlink goes from the crate's node_modules back to the root node_modules.
I.e. from tools/crate2/node_modules/typescript
to ../../../node_modules/.pnpm/[email protected]/node_modules/typescript
(note ../../.. resolves to the repo root)
Other than that - no links at all
Any builds script rerun-if-changed has relative paths?
There are two:
one outputs cargo:rerun-if-changed=tools/crate2/folder
the other outputs this (where ../../
resolves to the repo root)
cargo:rerun-if-changed=../../tools/crate3/some/file.foo
cargo:rerun-if-changed=../../
from cargo.
If you re-run with cargo check && CARGO_LOG_PROFILE=true CARGO_LOG_PROFILE_CAPTURE_ARGS=true cargo check then the traces will include the Package ID of the package that is causing this
I reran it and the profile says that tools/crate3 (the one with the relative rerun-if-changed path) is the one that's responsible for that massive block.
from cargo.
Nice!
The follow-up question then will be: Should paths from rerun-if-changed be normalized?
I believe that could also resolve this issue because .gitignore is already respected. Note that it is discouraged for a rerun-if-changed to refer to paths outside package root, as it breaks the self-contained property of a Cargo package.
A normalization would happen here if we decide to go that route.
from cargo.
Following on from that - it would be good to have an error when you break encapsulation like this.
I would suspect that in the vast majority of cases breaking encapsulation is not the intended result.
Forcing someone to emit like cargo::rerun-if-changed-allow-outside-package-folder=true
or something to silence the error would help people catch accidental cases like this and provide huge perf wins!
Alternately - being able to lint for this (eg via clippy) would be good - but likely hard-to-impossible unless clippy can scan the target/debug/foo/output
file for problems.
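As a rough sketch of what such a check could look like, the snippet below scans build script output for rerun-if-changed directives and flags paths that normalize to somewhere outside the package root. It is not cargo's or clippy's implementation: `rerun_paths`, `normalize`, `escapes_package_root`, and the `/repo/tools/crate3` root are hypothetical. The normalization is purely lexical (symlinks are not resolved), and it assumes an absolute package root.

```rust
use std::path::{Component, Path, PathBuf};

/// Extracts the path from every rerun-if-changed directive in `output`,
/// accepting both the `cargo:` and `cargo::` prefixes.
fn rerun_paths(output: &str) -> Vec<&str> {
    output
        .lines()
        .filter_map(|line| {
            line.strip_prefix("cargo::rerun-if-changed=")
                .or_else(|| line.strip_prefix("cargo:rerun-if-changed="))
        })
        .collect()
}

/// Lexically resolves `.` and `..` components without touching the filesystem.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Returns true if `path`, resolved against the absolute `package_root`,
/// ends up outside of it.
fn escapes_package_root(package_root: &Path, path: &Path) -> bool {
    // `join` replaces the base entirely when `path` is absolute.
    let resolved = normalize(&package_root.join(path));
    !resolved.starts_with(normalize(package_root))
}

fn main() {
    // The directives reported earlier in this thread, plus one in-package path.
    let output = "cargo:rerun-if-changed=../../tools/crate3/some/file.foo\n\
                  cargo:rerun-if-changed=../../\n\
                  cargo:rerun-if-changed=src/lib.rs\n";
    let package_root = Path::new("/repo/tools/crate3");
    for path in rerun_paths(output) {
        if escapes_package_root(package_root, Path::new(path)) {
            println!("outside package root: {path}");
        }
    }
}
```

Normalizing before comparing matters here: `../../tools/crate3/some/file.foo` leaves the package and comes straight back in, so only the bare `../../` is reported.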
from cargo.
We can't error because of compatibility. We are hesitant about doing things related to build scripts on Edition boundaries because of cases like tonic, where you delegate your build script written in one edition to another package which could be written in a different edition.
As for a warning, we avoid those right now unless there is a user-actionable way to disable it without changing their behavior. We'll soon have lint control, which would be a way to do so.
from cargo.
$ cargo check --workspace --manifest-path Cargo.toml
$ time cargo check --workspace --manifest-path Cargo.toml
before:
3.89s user 42.38s system 58% cpu 1:18.62 total
after:
0.15s user 0.12s system 84% cpu 0.319 total
from cargo.