Comments (6)
You can use bf-cat to repeatedly download that blob to see whether something in bazel itself is causing this problem, or whether it implicates the cluster conditions for reporting bad content. Running
bf-cat ... File 2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47/18563764 | sha256sum
on repeat should always yield 2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47.
from bazel-buildfarm.
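The repeated download check described above could be scripted roughly as follows. This is a minimal sketch, not part of buildfarm: `tally_hashes` is a hypothetical helper name, and it assumes `sha256sum`, `sort`, and `uniq` are available on the machine running it. It runs any command a fixed number of times and prints how often each distinct SHA-256 of its output was seen, so a healthy blob shows a single line.

```shell
# tally_hashes N CMD [ARGS...]
# Hypothetical helper: run CMD N times, hash its stdout each time,
# and print a count per distinct SHA-256. stderr (e.g. SLF4J notices)
# is discarded so it does not clutter the tally.
tally_hashes() {
  th_runs=$1
  shift
  th_i=0
  while [ "$th_i" -lt "$th_runs" ]; do
    "$@" 2>/dev/null | sha256sum | cut -d' ' -f1
    th_i=$((th_i + 1))
  done | sort | uniq -c
}

# Example invocation against the blob from this thread (not executed here):
# tally_hashes 20 bazel-bin/src/main/java/build/buildfarm/tools/bf-cat \
#   localhost:8080 "" SHA256 File \
#   2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47/18563764
```

More than one line of output means the endpoint is serving different content for the same digest across requests.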
On repeat it yields differing results between:
2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47/18563764
74739361f7c558e5718aa8ce7cda96063cc17cb75d3b8a1bedd1b4118e948050/357
Does this suggest that both are stored, and that the result depends on which worker the blob is coming from?
Local run responses:
cjohnstoniv@Desktop:~/helm-farm/bazel-buildfarm$ bazel-bin/src/main/java/build/buildfarm/tools/bf-cat localhost:8080 "" SHA256 File 2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47/18563764 | sha256sum
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
74739361f7c558e5718aa8ce7cda96063cc17cb75d3b8a1bedd1b4118e948050 -
cjohnstoniv@Desktop:~/helm-farm/bazel-buildfarm$ bazel-bin/src/main/java/build/buildfarm/tools/bf-cat localhost:8080 "" SHA256 File 2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47/18563764 | sha256sum
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
74739361f7c558e5718aa8ce7cda96063cc17cb75d3b8a1bedd1b4118e948050 -
cjohnstoniv@Desktop:~/helm-farm/bazel-buildfarm$ bazel-bin/src/main/java/build/buildfarm/tools/bf-cat localhost:8080 "" SHA256 File 2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47/18563764 | sha256sum
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
2b0c7b8839bcb322410449916521fab0fd21d1e51558ffa38fb60f2baaf83e47 -
from bazel-buildfarm.
Possibly, though all of the writes go through checksum validation, so the only way to break through that is for an action to change the file contents. Are you running with as-nobody or equivalent protection? Checking the actual file contents on the worker will bear this out.
from bazel-buildfarm.
Running with just the plain defaults of Docker Desktop on Windows with k8s and Helm. In the sample above there are no overrides or configs outside the defaults.
Is there anything different about coverage runs? Particularly with regard to the need for --experimental_split_coverage_postprocessing --experimental_fetch_all_coverage_outputs?
Is there anything about running in k8s/containers (i.e. something installed on the worker OS)? On prem we run enterprise Linux, but in the container it is just an Ubuntu image. So maybe there is some missing dependency on the worker OS. It is just odd that this is specific to coverage builds and to k8s.
from bazel-buildfarm.
I'm not predisposed to suspect k8s - I run a k8s buildfarm myself with no obvious behavior like this for any other actions. It seems very specific to some action definition that is munging your inputs:
This has the potential to happen for any action which modifies an input file directly. The thing to check will be any operation which is executed during the build after an output_file/path of bazel-out/k8-fastbuild/testlogs/test-unit/_coverage/coverage-runtime_merged_instr.jar that then uses it as an input (these will have to be discovered through --remote_grpc_log, remote_client, and bf-cat). You'll need to remove the offending truncated file (after checking the contents below), restart that worker, and execute in sequence to observe this.
Two questions about the contents of the truncated file. First, have you located it on the filesystem of the worker and confirmed the size difference and violation? Second, do the contents make sense relative to the input? Are they the first 357 bytes, or maybe an empty jar file consistent with deleting the contents of the heavyweight?
from bazel-buildfarm.
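One way to answer the "first 357 bytes" question, once both files have been located, is a byte-for-byte prefix comparison. The sketch below is an illustration only: `is_prefix` is a hypothetical helper, the file paths in the example are placeholders rather than real worker paths, and it assumes a `head` that supports `-c` (GNU and BSD both do).

```shell
# is_prefix SMALL BIG
# Hypothetical helper: succeeds (exit 0) when the bytes of SMALL are
# exactly the leading bytes of BIG, and fails otherwise.
is_prefix() {
  ip_small=$1
  ip_big=$2
  # Arithmetic expansion strips any padding some wc implementations emit.
  ip_size=$(( $(wc -c < "$ip_small") ))
  head -c "$ip_size" "$ip_big" | cmp -s - "$ip_small"
}

# Example (placeholder paths, not executed here):
# if is_prefix truncated.jar full.jar; then
#   echo "truncated copy is a prefix of the full jar"
# else
#   echo "contents differ; check whether it is a freshly written empty jar"
# fi
```

A prefix match would point toward an interrupted or truncating write, while a mismatch is more consistent with the file having been rewritten in place by an action.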
This issue is resolved in Bazel 7.0.0, likely due to the switch to --remote_download_outputs=toplevel as the default, which no longer downloads the _runtime_merged_instr.jar files to the host machine.
I will close this for now, as it is likely not a BF issue but Bazel-specific.
from bazel-buildfarm.