Git Product home page Git Product logo

Comments (3)

uranusjr avatar uranusjr commented on June 15, 2024

DEFERRED is the easy one, it is basically the same as RUNNING from the scheduler’s perspective, except it is waiting for the trigger to continue. For a ti to entered the DEFERRED, it must first be in the RUNNING state (because deferral happens in the worker), so it should have set start_date, but not end_date.

UP_FOR_RESCHEDULE is similar, except it’s done by a (synchronous) sensor in reschedule mode. It is also basically the same as RUNNING except waiting for the scheduler to pick it up again. Since signalling a reschedule is done in the worker, the ti should have set start_date. The situation with end_state is… complicated (this uncertainly is a part of the reason why sensors are disfavoured). Personally I would prefer if it does not set end_date (nor re-set start_date when the scheduler restarts the ti), but it is what it is and we can’t change that without breaking compatibility (and when we can break compatibility in Airflow 3, we probably want to remove synchronous sensors altogether anyway). I think UP_FOR_RESCHEDULE should be considered a special state that is also finished and pre-scheduling and keep its current behaviour.

RESTARTING is also similar (it is actually relative new) in that it’s set after the ti have started, so start_date should have been set. It should keep its start_date until the scheduler picks the ti back up, and end_date should remain null. The same applies for UP_FOR_RETRY. (I think your reference on UP_FOR_RETRY’s end_date is wrong? The code it links does not handle UP_FOR_RETRY, from what I can tell.)

from airflow.

kzosabe avatar kzosabe commented on June 15, 2024

@uranusjr Thanks for the explanation!

Just to confirm, does a DEFFERED task really transition directly to RUNNING?
I may be missing something, but I could not find a code path that directly transitions from DEFFERED to RUNNING.
As far as I have investigated, it seems that DEFFERED tasks are first rolled back to SCHEDULED, as in the following test code.

def test_submit_event(session, create_task_instance):
"""
Tests that events submitted to a trigger re-wake their dependent
task instances.
"""
# Make a trigger
trigger = Trigger(classpath="airflow.triggers.testing.SuccessTrigger", kwargs={})
trigger.id = 1
session.add(trigger)
session.commit()
# Make a TaskInstance that's deferred and waiting on it
task_instance = create_task_instance(
session=session, execution_date=timezone.utcnow(), state=State.DEFERRED
)
task_instance.trigger_id = trigger.id
task_instance.next_kwargs = {"cheesecake": True}
session.commit()
# Call submit_event
Trigger.submit_event(trigger.id, TriggerEvent(42), session=session)
# commit changes made by submit event and expire all cache to read from db.
session.flush()
session.expunge_all()
# Check that the task instance is now scheduled
updated_task_instance = session.query(TaskInstance).one()
assert updated_task_instance.state == State.SCHEDULED
assert updated_task_instance.next_kwargs == {"event": 42, "cheesecake": True}

However, whether there is a direct transition or not, your explanation seems to be correct.

As I was checking your comment, I noticed that, at least the presence or absence of start_date is not be determined by the value of ti.state. So the form of matrix I wrote above may be nonsense.

From reading your comment, in my interpretation, the current desired behavior of start_date is:

  • The datetime when the task is first touched by worker is stored.
  • And is normally retained no matter how the state changes thereafter (including changes to pre-running state).
  • If there is a action that can be regarded as restart , it will be overwritten with the datetime of it.
    • Perhaps we don't want to consider a run from DEFERRED as a restart, but we want to consider a run from UP_FOR_RESCHEDULE as a restart?

Is this understanding correct?

If this is correct, then at least the changes I've made in #38631 should be reasonable in terms of conforming to this interpretation.
Because in the current implementation, the change from QUEUED to SCHEDULED also sets the end_date, which is also clearly contrary to the above.

However, it seems that changes such as putting None in start_date when reverting to a pre-running state, as described in #38631 (comment) , are wrong and should not be done.

I think your reference on UP_FOR_RETRY’s end_date is wrong?

According to my referenced code, ti.set_state(TaskInstanceState.UP_FOR_RETRY) actually sets current time as ti.end_date (because of or ti.state == TaskInstanceState.UP_FOR_RETRY).
However, from reading your commentary, this may also be the wrong behavior.

from airflow.

uranusjr avatar uranusjr commented on June 15, 2024

Just to confirm, does a DEFFERED task really transition directly to RUNNING?

I didn’t say that; I only said a task transitions from RUNNING to DEFERRED 😛 A deferred task is indeed set to SCHEDULED by the triggerer, when the trigger is fulfilled, to be handed back to the scheduler; in some cases the handed back ti isn’t even transitioned back to RUNNING at all, but to a finished state directly (when things time out etc.)

the current desired behavior of start_date is [snip]

Yeah I think that’s about right. UP_FOR_RESCHEDULE is technically a restart because a ti set to it will completely go through the worker cycle again. But again it is special (in a bad way) and let’s try to just not change it.

According to my referenced code, ti.set_state(TaskInstanceState.UP_FOR_RETRY) actually sets current time as ti.end_date (because of or ti.state == TaskInstanceState.UP_FOR_RETRY). However, from reading your commentary, this may also be the wrong behavior.

Ah no, my comment on this was wrong, actually. I was confusing UP_FOR_RETRY and UP_FOR_RESCHEDULE. And although we can technically argue UP_FOR_RETRY maybe should not set end_date, it needs to do so because retry in Airflow depends on a task instance to have an end_date.

Also note that there’s currently a discussion on keeping task instance history across tries, so instead of overwriting those dates on retry, we will have a separate row in the database, each having its start and end dates, thus changing the situation a bit.


By the way, I want to specifically point out the work you put in this. It is a mess where Airflow sets ti states in various places, and it’s evident you must have traced a lot of code to reach the understanding on the issue. Thanks for putting in the effort.

from airflow.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.