Comments (4)
Sorry for the delayed response. Here is an explanation of the state machine:
- Task
  - Terminated: A task becomes ‘Terminated’ when all of its instances are done
  - Waiting: The task is not initialized yet
  - Failed: The task has failed
  - Running: The task is being processed
- Instance
  - Terminated: The instance is done
  - Waiting: The instance cannot run because some of its dependencies have not finished
  - Running: The instance is ‘Running’ on a worker
  - Failed: The instance has failed
  - Interrupted: A feature we introduced for backup instances; the instance has stopped for some reason
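For readers who want to work with these states programmatically, here is a minimal Python sketch of the two state sets listed above. It is only an illustration: the names `TaskStatus`, `InstanceStatus`, and `all_instances_done` are hypothetical, the string values are assumed to match the state names as written in this comment, and treating an empty instance list as "not done" is an assumption rather than something the trace documentation states.

```python
from enum import Enum


class TaskStatus(Enum):
    """Task states as described in the comment above."""
    WAITING = "Waiting"        # task is not initialized yet
    RUNNING = "Running"        # task is being processed
    TERMINATED = "Terminated"  # all instances of the task are done
    FAILED = "Failed"          # task has failed


class InstanceStatus(Enum):
    """Instance states as described in the comment above."""
    WAITING = "Waiting"          # blocked on unfinished dependencies
    RUNNING = "Running"          # running on a worker
    TERMINATED = "Terminated"    # instance is done
    FAILED = "Failed"            # instance has failed
    INTERRUPTED = "Interrupted"  # backup-instance feature: stopped for some reason


def all_instances_done(instance_statuses):
    """Hypothetical helper: True when every instance is Terminated.

    Assumption: "done" means the Terminated state, and a task with
    no instances is not considered done.
    """
    statuses = [InstanceStatus(s) for s in instance_statuses]
    return bool(statuses) and all(s is InstanceStatus.TERMINATED for s in statuses)
```

Under these assumptions, a task whose instances are all ‘Terminated’ would itself be expected to show ‘Terminated’, while any instance still ‘Waiting’, ‘Running’, ‘Failed’, or ‘Interrupted’ makes the helper return `False`.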
Hope this helps.
from clusterdata.
OK, we will include the description in the next minor release of the cluster data, hopefully soon.
@HaiyangDING
Thanks a lot and Happy New Year!
The description of the batch workload state machine has been added to the documentation. Closing the issue.