If a run or a job is restarted on the same runner, the runner tries to apply the Git patch (the repo diff) a second time and fails with a conflict, because the folder it targets already contains the patched files.
ERRO[2022-05-25T11:21:58Z] diff applier error ae=ApplyError{Fragment: 1, FragmentLine: 3, Line: 3} run_name=odd-rabbit-1 job_id=e7fa162e70b1 workflow=train-mnist filename=.dstack/variables.yaml err=conflict: fragment line does not match src line
ERRO[2022-05-25T11:21:58Z] run job is finished with error job_id=e7fa162e70b1 err=conflict: fragment line does not match src line workflow=train-mnist run_name=odd-rabbit-1
INFO[2022-05-25T11:24:57Z] New job submitted job_id=e7fa162e70b1 workflow=train-mnist run_name=odd-rabbit-1
WARN[2022-05-25T11:24:57Z] count of log arguments must be odd job_id=e7fa162e70b1 workflow=train-mnist run_name=odd-rabbit-1 count=1
INFO[2022-05-25T11:24:58Z] git checkout path=/root/.dstack/tmp/runs/odd-rabbit-1/e7fa162e70b1 workflow=train-mnist run_name=odd-rabbit-1 url=https://github.com/dstackai/dstack-examples.git branch=main hash=f219066b2379c69263f281f65167c8f6046874a2 job_id=e7fa162e70b1 auth=*http.BasicAuth
WARN[2022-05-25T11:24:58Z] git clone ref==nil branch=main hash=f219066b2379c69263f281f65167c8f6046874a2 job_id=e7fa162e70b1 path=/root/.dstack/tmp/runs/odd-rabbit-1/e7fa162e70b1 workflow=train-mnist run_name=odd-rabbit-1 url=https://github.com/dstackai/dstack-examples.git
INFO[2022-05-25T11:24:58Z] apply diff start run_name=odd-rabbit-1 dir=/root/.dstack/tmp/runs/odd-rabbit-1/e7fa162e70b1 job_id=e7fa162e70b1 workflow=train-mnist
ERRO[2022-05-25T11:24:58Z] diff applier error job_id=e7fa162e70b1 workflow=train-mnist run_name=odd-rabbit-1 filename=.dstack/variables.yaml err=conflict: fragment line does not match src line ae=ApplyError{Fragment: 1, FragmentLine: 3, Line: 3}
ERRO[2022-05-25T11:24:58Z] run job is finished with error run_name=odd-rabbit-1 job_id=e7fa162e70b1 err=conflict: fragment line does not match src line workflow=train-mnist