Comments (6)
Supplement to Question 1: Why doesn't the GetActionResult interface verify that the output's blob files exist?
Perhaps it could randomly select a worker from the CAS worker list for each blob and check whether that worker actually has the blob file.
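The check proposed above can be sketched roughly as follows. This is only an illustration of the idea, not Buildfarm's actual code: `find_missing_outputs`, `get_action_result`, `workers_for_digest` (standing in for the Redis-side index of which workers hold a blob) and `worker_has_blob` (standing in for a disk lookup on the worker) are all hypothetical names.

```python
import random
from typing import Callable, Dict, Iterable, Optional, Set


def find_missing_outputs(
    output_digests: Iterable[str],
    workers_for_digest: Dict[str, Set[str]],
    worker_has_blob: Callable[[str, str], bool],
    rng: Optional[random.Random] = None,
) -> Set[str]:
    """Return the digests whose blob could not be confirmed on a sampled worker."""
    rng = rng or random.Random()
    missing = set()
    for digest in output_digests:
        # Workers that the (hypothetical) index claims hold this blob.
        candidates = sorted(workers_for_digest.get(digest, ()))
        if not candidates:
            missing.add(digest)
            continue
        # Sample a single worker instead of asking all of them.
        worker = rng.choice(candidates)
        if not worker_has_blob(worker, digest):
            missing.add(digest)
    return missing


def get_action_result(
    cached_result: Optional[dict],
    workers_for_digest: Dict[str, Set[str]],
    worker_has_blob: Callable[[str, str], bool],
) -> Optional[dict]:
    """Treat a cached result as a miss when any output blob cannot be confirmed."""
    if cached_result is None:
        return None
    missing = find_missing_outputs(
        cached_result["outputs"], workers_for_digest, worker_has_blob
    )
    # Returning None models a cache miss, so the client re-executes the action.
    return None if missing else cached_result
```

Sampling one worker keeps the added cost to one lookup per output, at the price of occasionally reporting a miss when another worker still holds the blob.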
from bazel-buildfarm.
I've experienced this issue with autoscaling - worker would scale down and bazel client would get stuck waiting forever.
Hi, 80degreeswest.
Do your execute worker and CAS worker run together? (That would indicate that the blob file was lost while the Redis data was not synchronized.)
Which version of Bazel do you use?
I previously tested this scenario with Bazel 6.1 and it completed successfully. That said, the Android RBE setup we used did get stuck.
I'd also like to ask how you deal with this problem.
Hi, @80degreeswest
I noticed that #976 could solve my problem, but it has not been merged.
I measured the run time before and after adding the disk storage check. It doesn't seem to add much time at the moment.
Yes, I use cas+execute workers. I see this problem when my workers scale down. We use Bazel 5.3.1. To work around it, you can enable graceful shutdown, which is available in v2.6.1. This config will wait x seconds for any executions in progress to finish before shutting down the worker. Obviously this is not going to help if your worker is already broken, but it will solve the issue in the case of a normal shutdown: https://github.com/bazelbuild/bazel-buildfarm/blob/main/examples/config.yml#L128
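The behavior described here, waiting up to x seconds for in-progress executions before shutting down, can be sketched in general terms as below. This is a generic illustration and not Buildfarm's implementation; `wait_for_executions` and `active_count` are hypothetical names, and the real setting lives in the linked config file.

```python
import time
from typing import Callable


def wait_for_executions(
    active_count: Callable[[], int],
    timeout_seconds: float,
    poll_interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Block until no executions remain or the timeout elapses.

    Returns True if the worker drained in time, False if the deadline passed
    while executions were still running.
    """
    deadline = clock() + timeout_seconds
    while active_count() > 0:
        if clock() >= deadline:
            return False
        sleep(poll_interval)
    return True
```

A shutdown hook would call this before stopping the worker process, for example `wait_for_executions(lambda: len(running), 30)` where `running` is whatever collection tracks in-flight actions.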
I'm not sure what the state of that PR is. @luxe may be able to provide more detail on whether it would make sense to revisit it. @shirchen, do you have that change deployed in your cluster?