dagworks-inc / burr

Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, persist, and execute on your own infrastructure.

Home Page: https://burr.dagworks.io

License: BSD 3-Clause Clear License

Python 64.12% JavaScript 0.25% HTML 0.33% Shell 0.05% CSS 0.17% TypeScript 35.07%
burr dags graphs llmops llms mlops persistent-data-structure state-machine state-management visibility

burr's People

Contributors

elijahbenizzy, jombooth, jordi-adame, skrawcz, vertis

burr's Issues

Inputs API

We need a good way to pass the inputs that a step requires.

problem

I want to provide inputs for a step -- but I don't want to manipulate state outside of actions (power user mode, I want simple).

proposed API

@action(
    reads=[],
    writes=["question"],
    inputs=["question"]
)
def human_converse_placeholder(state: State, question: str) -> Tuple[dict, State]:
    return {"question": question}, state.update(question=question)

(it being a function means I can do extra stuff to manipulate inputs, etc, and it's all unit testable).

then in the control flow something like:

inputs = None
while True:
    current_action, prior_result, current_state = app.step(inputs=inputs)
    inputs = None
    if current_action.name == "human_converse":
        user_question = input("What is your next question: ")
        inputs = {"question": user_question}
    if current_action.name == "terminal":
        break

Then it's clear:

  1. what is required for an action to run.
  2. the framework will error if "question" is not provided
  3. we can draw a graph saying "question" is input to be provided.
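
Points 1 and 2 could be enforced with a check along these lines. This is a minimal sketch, not Burr's implementation: `validate_inputs` and its arguments are hypothetical names, and `required` stands in for the `inputs=[...]` list declared on the action.

```python
from typing import Any, Dict, List, Optional


def validate_inputs(
    action_name: str, required: List[str], provided: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return the provided inputs, or raise if a declared input is missing.

    Hypothetical helper: `required` mirrors the `inputs=[...]` declaration.
    """
    provided = provided or {}
    missing = sorted(set(required) - set(provided))
    if missing:
        raise ValueError(f"Action '{action_name}' is missing required inputs: {missing}")
    return provided


# Succeeds: the declared input is present.
ok = validate_inputs("human_converse", ["question"], {"question": "What is Burr?"})

# Fails: the step was run without the declared input.
try:
    validate_inputs("human_converse", ["question"], None)
    error_message = None
except ValueError as e:
    error_message = str(e)
```

Because the required names are plain data, the same list could also feed the graph rendering mentioned in point 3.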

relationship with bind

bind can be used to bind an action's parameters to something fixed. So that still works. We're just talking about a free set of parameters that are required for a particular step to run.

step()

We need to make sure step() is returning the current action, but without executing it.

Umbrella Phase 1 Requirements

For V1 release. Goal is March 1. In addition to: #29, we have:

  • Example apps with tracking server
    • Chatbot
  • Add more complex example (simple chatbot) in getting started in docs
  • Get video intro for those who want to watch
  • Roadmap in README
  • (stretch) Static server hosted online
  • Streamed output #19
  • (stretch, with feedback) -- handling exceptions (#30)
  • State persistence in a clean way
  • (directly post-release) Logging metadata -- #46
  • Telemetry/tracking on posthog (with opt-out, same as hamilton)(#58)
  • Figure out what slack to link to
  • Add comparison matrix in README #76
    • Langgraph
    • Temporal
    • langchain
    • hamilton
    • superagent.sh (burr is not agent specific, host/run wherever python runs)
  • Change README gif to be of actual chatbot API

state.append doesn't work unless read is specified

This works when image_location_history is added to reads=

@action(reads=["current_image_location", "image_location_history"],
        writes=["current_image_caption", "image_location_history"])
def image_caption(state: State) -> Tuple[dict, State]:
    current_image = state["current_image_location"]
    result = caption_image_driver.execute([
        "generated_caption"
    ], inputs={"image_url": current_image})
    updates = {
        "current_image_caption": result["generated_caption"],
    }
    return result, state.update(**updates).append(image_location_history=current_image)

The append wipes the values in image_location_history if image_location_history is not in reads=. I.e. this does not do what one expects:

@action(reads=["current_image_location"],
        writes=["current_image_caption", "image_location_history"])
def image_caption(state: State) -> Tuple[dict, State]:
    current_image = state["current_image_location"]
    result = caption_image_driver.execute([
        "generated_caption"
    ], inputs={"image_url": current_image})
    updates = {
        "current_image_caption": result["generated_caption"],
    }
    return result, state.update(**updates).append(image_location_history=current_image)

Umbrella Phase 2 requirements

Follow up on #31. Next set of steps. This is in priority order:

  • Improve UI layout #100
  • Examples for how to generate test cases
    • Use persister
  • Simplify examples
    • Ensure everyone has README with the same structure
      • What does this teach you
      • What should you know before
      • How to run
      • How to adapt to your case
      • Add notebook
    • Separate Hamilton from others
  • Add control-flow-based documentation #56
  • Add web-server based example documentation #22
  • Annotations, tagging, querying #46
    • Add programmatic API
    • Add UI annotation capability
    • Add querying capability programmatically
  • Add artifact logging

Document how to hand-off to a human

If someone is building an agent, an agent can go do stuff, but then might need help.

Frame that example using run(halt_before=["human_step"])...

Expand/contract all in UI state/result view

We should be able to click expand/contract to recursively expand/contract subfield views in the non-json view. Should be a faint - or + (or, perhaps, double-chevron) on the right side of the header.

Add Simulation Example

We have room for an example + have it in the docs. We need to think through what we want to suggest. Some ideas:

  1. Time-series forecasting -- do a few steps that forecast each week and move into the future
  2. Trading simulation -- similar to (1) -- steps are data preparation, stock forecasting, portfolio construction . We store positions in state and update them -- the update is tricky
  3. Something with multiple agents talking to one another?

Please comment if you have preferences (or want to contribute!).

Wiping state does not appear to work

When calling:

state.wipe(keep=...)

as part of an action, it doesn't apply the difference. This is because we use a merge operation on a subset of the state, which is added back in. Thus, if a key is missing from the subset, the merge simply keeps the old value. Two options:

  1. (short-term) Check the missing keys and delete them from the final state (this is a bit of a hack but it should work). We have to do this by diffing the state before/after update and remembering the missing ones.
  2. (medium/long-term) Instead of diffing the state at each step, we should just apply deltas. Then we do a commit operation afterwards, which gathers all of them. The fork/merge op will just gather the deltas, then we apply them to the final state in the order we got them in.

(1) will be good for v0
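
Option (1) might look roughly like the following. It is a sketch over plain dicts rather than Burr's State class, and `merge_with_deletions` is a hypothetical name: it diffs the subset handed to the action against what came back and drops the keys that disappeared.

```python
from typing import Any, Dict, Set


def merge_with_deletions(
    full_state: Dict[str, Any],
    subset_before: Dict[str, Any],
    subset_after: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge an action's view of state back in, honouring deleted keys."""
    # Keys the action could see but that are gone afterwards were wiped.
    deleted: Set[str] = set(subset_before) - set(subset_after)
    merged = {**full_state, **subset_after}  # plain merge: old values survive
    return {k: v for k, v in merged.items() if k not in deleted}


full = {"a": 1, "b": 2, "c": 3}
before = {"a": 1, "b": 2}  # the subset handed to the action
after = {"a": 10}          # the action kept "a" and wiped "b"

naive = {**full, **after}                          # "b" wrongly survives
fixed = merge_with_deletions(full, before, after)  # "b" is removed
```

Keys outside the subset (here "c") are untouched either way, since the action never saw them.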

Decide on chat app for community

We should decide on a chat app for the community.

Top contenders:

  • discord
  • slack

Please vote on your preference!

🎉 for discord
🚀 for slack

Parallel map-style actions

Overview

We should be able to launch a whole bunch of actions in parallel. Walking the state machine in parallel is tricky, so this proposes the following:

  1. A MultiAction that performs a map operation over some state field
  2. This then spawns and delegates subactions
  3. These sub actions each have access to a portion of that state field + everything else they need in state
  4. These then write back to state
  5. The user specifies how to reduce the result
  6. This is done via a configurable executor -- asyncio, threading, etc... With default of multithreading (for sync) and asyncio (for async)
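
To make the shape of steps 1 to 6 concrete, here is a small sketch using a thread pool from the standard library. It is not a proposed API: `map_reduce_over_field`, `score_comment`, and the dict-based state are all assumptions made for illustration, and only join-all mode is shown.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def map_reduce_over_field(
    state: Dict[str, Any],
    field: str,
    sub_action: Callable[[Any, Dict[str, Any]], Any],
    reduce_fn: Callable[[List[Any]], Any],
    executor: Executor,
) -> Any:
    """Run sub_action once per item of state[field], then reduce the results."""
    # Each sub-action gets its own item plus the rest of the state.
    shared = {k: v for k, v in state.items() if k != field}
    futures = [executor.submit(sub_action, item, shared) for item in state[field]]
    # Join-all mode: wait for every sub-action, preserving input order.
    return reduce_fn([f.result() for f in futures])


def score_comment(comment: str, shared: Dict[str, Any]) -> int:
    # Stand-in for real work, e.g. one LLM call per PR comment.
    return len(comment) * shared["weight"]


state = {"comments": ["fix typo", "add tests", "lgtm"], "weight": 2}
with ThreadPoolExecutor(max_workers=4) as pool:
    total = map_reduce_over_field(state, "comments", score_comment, sum, pool)
new_state = {**state, "total_score": total}
```

Passing the executor in as an argument is one way to keep it configurable and to reuse the same executor for nested sub-actions.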

Use-cases

  1. PR-bot -- want to determine how to respond for each comment, also for each file
  2. Firing off requests to multiple LLMs at once to determine the best/quickest
  3. Optimizing over multiple ML algorithms at once.
  4. Web scraper -- send a bunch of requests out

Requirements:

  1. Recursive, all the way down. Should use the same executor for sub-actions, sub-sub-actions, etc...
  2. Each individual update (from an action completing) is treated as a state update
  3. Idempotency is encouraged, if not built in (for instance, it'll decide the tasks on the input fields minus the output fields)
  4. Configurable parallelism (max rate, etc...)
  5. Hooks, everything respected as expected
  6. Quick-exit mode (get the first then cancel the others), join-all mode (join them all)

Design Qs:

  1. Should the actions that can be launched from an action all be the same? Or should there be a possibility to launch two of them?
  2. How to handle action failures? Should we allow some? Do retries? End up having the actions themselves handle failure cases and not deal with them at all?
  3. API -- class-based? Function-based?

State typing

Currently state is untyped. Ideally this should be able to leverage a pydantic model, and possibly a typed dict/whatnot. We'll need the proper abstractions however.

Some requirements:

  1. IDE-friendly -- we should be able to use the typing system to get the state objects
  2. Subsetting -- we should be able to define state on a per-action basis (which can come instead of writes/reads potentially). Then we have different actions that can be compiled together. We should also be able to do this centrally.
  3. Validation -- we should be able to use this information to "validate" the graph
  4. Backwards-compatible/optional -- Burr does not support types currently. This should be backwards compatible, meaning that if no types are included we do not validate (or perhaps treat everything as Any, which is bidirectionally compatible with typing).
  5. Transactional updates -- this could be really hard if we rely on pydantic models for more than just the spec...
  6. Allows optionals (maybe?)

Ideas:

pydantic

class MyState(TypedState): # extends pydanticModel
    foo: int
    bar: str

@action(reads=["foo"], writes=["bar"])
def my_action(state: MyState) -> Tuple[dict, MyState]:
    return {...}, state.update(bar=str(state["foo"]))

@action(reads=["foo"], writes=["bar"])
def my_action(state: MyState) -> Tuple[dict, MyState]:
    return {...}, state.update(bar=["foo"]) # fails validation

- hard to do transactional updates
+ IDE integration is easy
+ easy integration with web-services/fastAPI
~ subsetting is a bit of work, but we can bypass that by using the whole state

in the decorator/class

@action(reads={"foo": int}, writes={"bar": str})
def my_action(state: State) -> Tuple[dict, State]:
    return {...}, state.update(bar=str(state["foo"]))

- No free IDE integration (without a plugin)
+ simple, loosely coupled, easy to inspect
~ duplicated between readers and writers (can't decide if this is bad?)
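
For the decorator-based idea, validation could be as light as checking each update against a key-to-type mapping such as {"bar": str}. The sketch below assumes exactly that mapping shape; `validate_writes` is a hypothetical helper and handles only plain classes, not generics or optionals.

```python
from typing import Any, Dict


def validate_writes(declared: Dict[str, type], updates: Dict[str, Any]) -> None:
    """Raise if an update is undeclared or does not match its declared type."""
    for key, value in updates.items():
        if key not in declared:
            raise KeyError(f"'{key}' is not declared in writes")
        expected = declared[key]
        if not isinstance(value, expected):
            raise TypeError(
                f"'{key}' expected {expected.__name__}, got {type(value).__name__}"
            )


writes = {"bar": str}
validate_writes(writes, {"bar": "1"})  # passes silently

try:
    validate_writes(writes, {"bar": ["foo"]})  # a list where a str is declared
    failure = None
except TypeError as e:
    failure = str(e)
```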

Handle large state objects in the UI

Problem

Large state objects will (a) kill the UI, and (b) make looking at data difficult.

Context

I have a large list of objects 1K+, that I am processing and iterating through. The UI needs to handle this case.

Possible Solutions

  1. limit what is returned from the backend. Allow people to dig in, and open up a specific modal if the object is of a certain type or size.
  2. do some smart caching on the UI? or only store "deltas" between steps?
  3. do something else?

Persister Umbrella Issue

  • Add Mongodb support
  • Add snowflake support (this will likely be restrictive)
  • Add deltalake support (this will likely be restrictive)
  • Think through schema versioning patterns and provide guidance for upgrades & updates.

Umbrella V0 requirements

For V0 release. Goal is next monday. User story:

  1. Download burr from pypi
  2. Run server
  3. Check out demos
  4. Copy/paste app
  5. Add something/modify
  6. Track through server

Requirements:

  • Fix this issue: #28 -see #34
  • Get demo of Burr runs as a CLI burr-demo chatbot
    • Demo data for each one
    • Chatbot sample runs
    • Counter example
    • RAG
  • Merge all other examples
  • Test out installing/running
    • Mac
    • Windows
  • Fix run API (halt_before/halt_after)
  • Release 0.3.0 (with server)
  • Release 0.4.0 (with all the other changes)
  • Add Roadmap to README (#38)
  • Fix serialization for tracking (#23)
  • Add guards around installation
  • Remove error case from chatbot example
  • Add temporary "getting started with the UI" -- load up streamlit, open UI, see it work...
  • #36 (solved by #37)
  • Loom for explanation
  • Update docs to be public-facing #41

Tracing inside actions

We want the ability to have insights. Some requirements:

  1. Opentel compatible
  2. Log arbitrary metadata
  3. Recursive
  4. Reactive on the hooks side
  5. Sync + async friendly

API:

import asyncio
from typing import Tuple
import time

from burr.core import ApplicationBuilder, State, action, default
from burr.visibility import Tracer


@action(reads=["number"], writes=["number"])
async def test_async_action(state: State, __tracer: Tracer) -> Tuple[dict, State]:
    with __tracer("compute_number"):
        number = state["number"] + 1

    async with __tracer("sleep_async") as tracer:
        tracer.log_artifact("number", number)
        tracer.log_artifact("some_key", {"hello": "world"})
        await asyncio.sleep(1)

    with __tracer("sleep_sync"):
        with __tracer("nested_sleep"):
            await asyncio.sleep(1)
        await asyncio.sleep(1)

    return {"number": number}, state.update(number=number)


@action(reads=["number"], writes=["number"])
def test_action(state: State, __tracer: Tracer) -> Tuple[dict, State]:
    with __tracer("compute_number"):
        number = state["number"] + 1

    with __tracer("sleep"):
        time.sleep(1)
    return {"number": number}, state.update(number=number)


async def main():
    application = (
        ApplicationBuilder()
        .with_tracker("test:tracer").with_actions(
            test_action=test_action,
            test_async_action=test_async_action,
        )
        .with_transitions(
            ("test_action", "test_async_action", default),
            ("test_async_action", "test_action", default),
        ).with_state(number=0)
        .with_entrypoint("test_action")
        .build()
    )
    await application.arun(halt_after=["test_async_action"])


if __name__ == "__main__":
    asyncio.run(main())

Hamilton action improvements for taking state in/out

Problems with Hamilton integration:

  • We can't modify inputs from state
  • We can't modify outputs to state

Idea:

h = Hamilton(
    inputs={"most_recent_item": from_state("all_items", process=lambda x: x[-1])},  # get the last one
    outputs={"count": update_state("count", process=lambda count: count + 1)},  # increments it
)

Also:

h = Hamilton(
    inputs=...,
    outputs=...,
    extra_updates=lambda result, state: ...
)

Streaming Async

This is tricky as async generators have no return capability. The way this will likely work is that the last yield is the final result:

@streaming_action(reads=["prompt"], writes=["prompt"])
def streaming_chat_call(state: State, **run_kwargs) -> Generator[dict, None, Tuple[dict, State]]:
    client = openai.Client()
    response = client.chat.completions.create(
        model='gpt-3.5-turbo',
        messages=[{
            'role': 'user',
            'content': state["prompt"]
        }],
        temperature=0,
        stream=True,
    )
    buffer = []
    for chunk in response:
        delta = chunk.choices[0].delta.content
        buffer.append(delta)
        yield {'response': delta}
    full_response = ''.join(buffer)
    yield {'response': full_response}, state.append(response=full_response)

We may want to consider adding this support for the synchronous one as well, as it could keep things consistent.
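
The "last yield is the final result" convention can be tried out with plain asyncio. This is a sketch only: `stream_tokens` and `consume` are made-up names, the streaming API is faked with `asyncio.sleep(0)`, and state is a plain dict.

```python
import asyncio
from typing import Any, AsyncGenerator, List, Tuple


async def stream_tokens(prompt: str) -> AsyncGenerator[Any, None]:
    """Yield intermediate chunks, then one final (result, new_state) tuple."""
    buffer: List[str] = []
    for token in prompt.split():
        await asyncio.sleep(0)  # stand-in for waiting on a streaming API
        buffer.append(token)
        yield {"response": token}
    full = " ".join(buffer)
    # Async generators cannot `return` a value, so the last yield carries it.
    yield {"response": full}, {"response": full}


async def consume(prompt: str) -> Tuple[List[dict], Any]:
    """Collect streamed chunks; treat the last yielded item as the final result."""
    items = [item async for item in stream_tokens(prompt)]
    return items[:-1], items[-1]


chunks, final = asyncio.run(consume("hello streaming world"))
```

Collecting everything into a list keeps the sketch short; a real consumer would forward each chunk as it arrives and only hold on to the most recent item.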

Improve notebooks in examples

All notebooks should:

  1. Explain what you'll learn
  2. Display/link to the code they pull in
  3. Explain what they're doing
  4. Explain why they use specific APIs
  5. Link to docs

Continuation of #114

Display inputs in UI

We should know the inputs here. To do this, we need to:

  • Store the inputs as part of action (should be easy), in the tracker
    • Add to action model
    • Wire through here
    • Probably make it optional for backwards compatibility
  • Regenerate the API client (not yet documented)
    • Command: npx openapi-typescript-codegen --input http://localhost:7241/openapi.json --output ./src/api, must be running server
  • Wire through to the GraphView component -- actively in flux, links might be out of date
    • change the convertApplicationToGraph to output a type: "input" action
    • Add a custom node type
    • Determine how it should make it clear that its an input (icon? Shape? TBD...)

The below image should have prompt as an input -- it should be distinct and have one edge to prompt (the action).

image

Allow `~` operation on `Condition` object

Should be as simple as flipping an inversion flag and changing the name to have a ~

  @action(reads=["n"], writes=["n", "n_history"])
  def even(state: State) -> Tuple[dict, State]:
      result = {"n": state["n"] // 2}
      return result, state.update(**result).append(n_history=result["n"])

  @action(reads=["n"], writes=["n", "n_history"])
  def odd(state: State) -> Tuple[dict, State]:
      result = {"n": 3 * state["n"] + 1}
      return result, state.update(**result).append(n_history=result["n"])
  is_zero = expr("n == 0")
  is_even = expr("n % 2 == 0")
  application = (
      ApplicationBuilder()
      .with_state(n_history=[])
      .with_actions(
          original=Input("n"),
          even=even,
          odd=odd,
          result=Result("n_history"),
      ).with_transitions(
          (["original", "even", "odd"], "result", is_zero),
          (["original", "even", "odd"], "even", is_even),
          (["original", "even", "odd"], "odd", ~is_even),
      ).with_entrypoint("original")
      .build()
  )
  state, [result] = application.run(until=["result"])
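
The inversion-flag approach could be sketched as follows. The `Condition` class here is a simplified stand-in written for this example, not Burr's actual class, so its constructor and `run` method are assumptions.

```python
from typing import Any, Callable, Dict


class Condition:
    """Hypothetical stand-in for a transition condition over state."""

    def __init__(
        self,
        name: str,
        resolver: Callable[[Dict[str, Any]], bool],
        inverted: bool = False,
    ):
        self._base_name = name
        self._resolver = resolver
        self._inverted = inverted

    @property
    def name(self) -> str:
        # The name gains a leading ~ while the condition is inverted.
        return f"~{self._base_name}" if self._inverted else self._base_name

    def run(self, state: Dict[str, Any]) -> bool:
        result = bool(self._resolver(state))
        return not result if self._inverted else result

    def __invert__(self) -> "Condition":
        # Flip the flag on a copy so the original condition is unchanged.
        return Condition(self._base_name, self._resolver, not self._inverted)


is_even = Condition("n % 2 == 0", lambda state: state["n"] % 2 == 0)
is_odd = ~is_even
```

Returning a new object from `__invert__` means double inversion round-trips to the original name and behaviour.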

Streamed output

in the case of an LLM app, people will want to stream a result back.

We need to think through the API a bit.

UI edge case to handle

Handle multiple entries for the same sequence ID.

Use case -- tracker logged an error. I then restart from that state and fix the bug, but I want to continue to trace to the same place. This will result in something like this:

{"type":"begin_entry","start_time":"2024-03-14T23:46:33.550378","action":"counter","inputs":{},"sequence_id":2}
{"type":"end_entry","end_time":"2024-03-14T23:46:33.551172","action":"counter","result":null,"exception":"Traceback (most recent call last):\n  File \"/Users/stefankrawczyk/dagworks/burr/burr/core/application.py\", line 320, in _step\n    raise e\n  File \"/Users/stefankrawczyk/dagworks/burr/burr/core/application.py\", line 310, in _step\n    result, new_state = _run_single_step_action(next_action, self._state, inputs)\n  File \"/Users/stefankrawczyk/dagworks/burr/burr/core/application.py\", line 186, in _run_single_step_action\n    result, new_state = action.run_and_update(state, **inputs)\n  File \"/Users/stefankrawczyk/dagworks/burr/burr/core/action.py\", line 442, in run_and_update\n    return self._fn(state, **self._bound_params, **run_kwargs)\n  File \"/Users/stefankrawczyk/dagworks/burr/examples/counter/application.py\", line 18, in counter\n    raise ValueError(\"random error\")\nValueError: random error\n","state":{"counter":1,"__SEQUENCE_ID":2,"__PRIOR_STEP":"counter"},"sequence_id":2}
{"type":"begin_entry","start_time":"2024-03-14T23:55:09.384311","action":"counter","inputs":{},"sequence_id":2}
{"type":"end_entry","end_time":"2024-03-14T23:55:09.385772","action":"counter","result":{"counter":2},"exception":null,"state":{"counter":2,"__SEQUENCE_ID":2,"__PRIOR_STEP":"counter"},"sequence_id":2}

So there will be two entries. The UI just shows the last one. Should really show both...

Alternatively, we should never write to the same place and deal with this problem differently.
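
One way to show both entries is to group begin/end pairs into attempts per sequence ID. The sketch below uses abbreviated copies of the log lines above (tracebacks and timestamps trimmed); `group_attempts` is a hypothetical helper, not part of the tracker.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Abbreviated versions of the log entries (traceback and timestamps trimmed).
LOG_LINES = [
    '{"type": "begin_entry", "action": "counter", "sequence_id": 2}',
    '{"type": "end_entry", "action": "counter", "exception": "ValueError: random error", "sequence_id": 2}',
    '{"type": "begin_entry", "action": "counter", "sequence_id": 2}',
    '{"type": "end_entry", "action": "counter", "exception": null, "sequence_id": 2}',
]


def group_attempts(lines: List[str]) -> Dict[int, List[dict]]:
    """Group begin/end pairs into attempts, keyed by sequence_id."""
    attempts: Dict[int, List[dict]] = defaultdict(list)
    for line in lines:
        entry = json.loads(line)
        seq = entry["sequence_id"]
        if entry["type"] == "begin_entry":
            attempts[seq].append({"begin": entry, "end": None})
        elif entry["type"] == "end_entry" and attempts[seq]:
            # Attach the end entry to the most recent open attempt.
            attempts[seq][-1]["end"] = entry
    return dict(attempts)


attempts = group_attempts(LOG_LINES)
failed = [a["end"]["exception"] is not None for a in attempts[2]]
```

With the attempts kept in order, a view could show the failed run and the successful retry side by side instead of only the last one.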

Add graph validation/compilation

We should be able to validate the DAG. Specific things we can check for:

  1. For all nodes that read state item X, given an initial state, does there exist a path that does not provide the required state?
  2. For all produced keys, do types match up? (when we have typing)?
  3. Is there a path that will not end up in a valid state (E.G. nowhere to go)?

Then on execution, we should be able to say:

  1. If there is a halt_after, is there a path that will reach it?
  2. If there is a halt_before, is there a path that will reach it?

Idea for (1):

app = ....build(validate=True) # on build, we can validate
app = ....build(validate_types=True, validate_paths=True) # if you want to do different things

Idea for (2)

app = ....build()
app.run(halt_after=..., validate=True)

Note this is not friendly towards state with default values as we don't expose that in the framework... But I think types + defaults should be easy enough to expose.
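
The halt_after/halt_before check reduces to graph reachability. Below is a rough sketch under two assumptions: transitions are given as (from, to) name pairs, and every edge is treated as possibly taken, so conditions are ignored. `reachable_from` and `validate_halt_targets` are hypothetical names.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def reachable_from(entrypoint: str, transitions: Iterable[Tuple[str, str]]) -> Set[str]:
    """Breadth-first search over (from_action, to_action) edges."""
    edges: Dict[str, List[str]] = {}
    for src, dst in transitions:
        edges.setdefault(src, []).append(dst)
    seen, queue = {entrypoint}, deque([entrypoint])
    while queue:
        for nxt in edges.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def validate_halt_targets(
    entrypoint: str, transitions: Iterable[Tuple[str, str]], halt_targets: List[str]
) -> List[str]:
    """Return the halt targets that no path from the entrypoint can reach."""
    reachable = reachable_from(entrypoint, transitions)
    return [t for t in halt_targets if t not in reachable]


transitions = [("prompt", "respond"), ("respond", "prompt"), ("orphan", "done")]
unreachable = validate_halt_targets("prompt", transitions, ["respond", "done"])
```

The same `reachable_from` set can also flag dead ends: any reachable action with no outgoing edge has nowhere to go.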

Spec'ing state saving and loading

Use cases:

  1. User controls how the app is started via with_state() and with_entrypoint().
  2. User delegates to a persister & burr to reload state for a given partition key, app_id -- via with_db() and with_identifier(). Sequence id is either 0 or loaded from state, or passed in via with_identifier().
  3. user can add a persister via with_persistence -- need to deduplicate hooks potentially

Notes:

  • tracker & persister can be same object
  • can have separate persister and tracker

Requirements:

  • app knows about partition key
  • app knows about app_id or generates one
  • app sets sequence id to 0 unless it loads one from prior state
  • hooks get passed partition key, app_id, sequence no, position

UI Perf improvement: don't try to render list[float] as individual fields

It won't be uncommon to have embeddings in state.

Current behavior

Currently we render each dimension as its own field. This slows down the UI when things get large.

Screenshots

Screen Shot 2024-04-08 at 3 06 46 PM

Steps to replicate behavior

Use telephone example.

  1. save a list of floats
  2. see it in state

Library & System Information

latest.

Expected behavior

Not each dimension as its own field. A single array view. Or something special for embeddings.
The UI is snappy.
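
One possible shape for the "single array view", sketched on the data side in Python rather than in the UI code: long all-numeric lists are collapsed into one summary object before rendering. The threshold and the summary fields are arbitrary assumptions, and `summarize_for_display` is a hypothetical name.

```python
from typing import Any

MAX_INLINE_ITEMS = 8  # assumed threshold, chosen arbitrarily for this sketch


def summarize_for_display(value: Any) -> Any:
    """Collapse long all-numeric lists into one summary; recurse into containers."""
    if isinstance(value, dict):
        return {k: summarize_for_display(v) for k, v in value.items()}
    if isinstance(value, list):
        is_numeric = all(
            isinstance(x, (int, float)) and not isinstance(x, bool) for x in value
        )
        if is_numeric and len(value) > MAX_INLINE_ITEMS:
            # One field instead of one field per dimension.
            return {"type": "numeric_array", "length": len(value), "preview": value[:3]}
        return [summarize_for_display(v) for v in value]
    return value


state = {"caption": "a cat", "embedding": [0.1 * i for i in range(1536)], "tags": ["a", "b"]}
display_state = summarize_for_display(state)
```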

Additional context

Telephone example saves embeddings.

Expose inputs and track in hooks

Currently inputs are not exposed through the pre_run_step hook. We should:

  • Expose them as part of the hook
  • Add them to the tracker
  • Test that end to end

Add metadata capture

You want to +1 or -1 a response for later evaluation. We need to expose a way to do that.

idea

# annotated current returned state 
app.annotate({"value": 1, }) # implicit app_id, sequence_id

# annotate some thing afterwards
Tracker().annotate(app_id, sequence_id, {"value": 1})
Tracker().annotate(app_id, sequence_id, {"value": 1, "target": SOME_STATE_KEY})

You'd have to provide more specifics on what the metadata is yourself, I think, e.g. what it targets in the state.

This should work for arbitrary metadata annotation.

We'd then want some export functionality.

Documentation Updates

README

  • have the docs link over the word documentation (not docs) for easy to spot blue underlined link

Docs

  • Landing page is clear and straight to the point

Getting started

https://studious-spork-n8kznlw.pages.github.io/getting_started/why-burr.html

  • first line is great
  • typo: in the list, put all verbs in the same tense (accepts*, does, decides)
  • should the IF condition be in a single step (5 and 6), I think it matters for clarity
  • After figure, make some space to highlight statement "Now let's try to get this to production"

Installing

  • detail: since the zsh issue is a minority, place it lower, after the general case instructions are done

Simple Example

  • intro is dynamic "this is simple, but feel free to jump ahead"
  • the counter is a much clearer example than the cowsay I tried, it's minimal and showcases the state machine

Next Steps

  • the page is clear; I like the links

Applications

https://studious-spork-n8kznlw.pages.github.io/concepts/state-machine.html

  • under the Running header, the Python code block is improperly rendered because of a missing newline
  • the 3 execution methods are very clear

State

https://studious-spork-n8kznlw.pages.github.io/concepts/state.html

  • comment in code block says "return a dictionary of all the state", may be clearer with "return a dictionary of the full state / every key of the state"

Actions

https://studious-spork-n8kznlw.pages.github.io/concepts/actions.html

  • function and class API exist, but explicitly state they are fully equivalent and you get full features; maybe mention why prefer one over the other (inheritance?)
  • "We call (1) a Function and (2) a Reducer (similar to Redux)" what is Redux? add a link?
  • this should appear at the end of its paragraph because it's not useful for those unfamiliar with the terms: "We call (1) a Function and (2) a Reducer (similar to Redux)"

Transitions

https://studious-spork-n8kznlw.pages.github.io/concepts/transitions.html

  • the first explanation "transitions move between actions" is unclear. Maybe say "Transitions define explicitly how actions are connected and which action is available next at a given step"
  • rst url formatting error at the bottom of the page

hooks

https://studious-spork-n8kznlw.pages.github.io/concepts/hooks.html

  • error formatting Hamilton link

Planned capabilities

  • for the state management immutability, highlight features 1, 2, 3 by putting them in a proper list

Allow actions to not specify intermediate result

Take the following:

@action(reads=['input_var'], writes=['output_var'])
def simple_action(state: State) -> tuple[dict, State]:
    output_var = _compute(state["input_var"])
    result = {"output_var" : output_var}
    return result, state.update(**result)
    # or 
    return result, state.append(**result)

This is just a lot of boiler plate -- you have to:

  1. Compute the result
  2. Store it in a variable
  3. return it
  4. Apply a state update using it

Note most people will probably write it like this:

@action(reads=['input_var'], writes=['output_var', 'output_var_list'])
def simple_action(state: State) -> tuple[dict, State]:
    output_var = _compute(state["input_var"])
    return {"output_var" : output_var}, state.update(output_var=output_var).append(output_var_list=output_var)

Which is correct, but kind of strange. This is due to the oddity of the "single-step" action. That said, if we want these as simple python functions, then we don't really need the result. Just inspecting the state delta will do it... In fact, when we have the layering, we can think of the state delta as the result.

So, what if we just didn't have intermediate results for certain simple state updates:

@action(reads=['input_var'], writes=['output_var', 'output_var_list'])
def simple_action(state: State) -> State:
    output_var = _compute(state['input_var'])
    return state.update(output_var=output_var).append(output_var_list=output_var)

We would determine which we use either based on the annotation or the return type, and the user would do this knowing that they wouldn't have access to intermediate results in the UI...
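
Treating the state delta as the result could be as simple as diffing the state before and after the action. A minimal sketch over plain dicts follows; `state_delta` is a hypothetical helper, and it reports only added or changed keys, not deletions.

```python
from typing import Any, Dict


def state_delta(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Any]:
    """Return the keys whose values were added or changed between two states."""
    return {k: v for k, v in after.items() if k not in before or before[k] != v}


before = {"input_var": 2, "output_var_list": [1]}
after = {"input_var": 2, "output_var": 4, "output_var_list": [1, 4]}
result = state_delta(before, after)
```

Note that an appended list shows up as the whole new list, so the delta alone does not say which operation (update or append) produced it.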

Alternatively, we could have them only use the result, and we auto-update the state...

@action(
    reads=['input_var'], 
    writes=[
        update(output_var='output_var'), 
        append(output_var_list='output_var')
    ]
) 
def simple_action(state: State) -> dict:
    return {"output_var" : _compute(state['input_var'])}

Having update as a separate function is just one approach, could have increment and append, or have the default be update

More ideas (in this case I'm adding output_var_list to illustrate append):

@action(reads=['input_var'], writes=[update('output_var'), append('output_var_list')])
@action(reads=['input_var'], writes=['output_var', append('output_var_list')])
@action(reads=['input_var'], updates=['output_var'], appends=['output_var_list'])
@action(
    reads=['input_var'], 
    writes=['output_var', 'output_var_list'], 
    update=lambda result, state: state.update(output_var=result['output_var']).append(output_var_list=result['output_var'])
)

Basic How-tos in examples

These should form the "how-to" for diataxis.

  1. Saving/loading state (manually)
  2. Control flow patterns (inputs, etc...), broken up by use-cases (terminal chatbot, etc...)
  3. Async in a web-server

Add `increment()` option in state

Common state operation -- we'll want to increment the value.

state.increment(foo=1) # increment foo by 1 in state
state.increment(foo=10) # increment foo by 10 in state
state.increment("foo") # increment foo by 1 in state

Steps:

  1. Add an IncrementField operation in State
  2. Add an .increment(*increment_by_1, **increment) method that corresponds to the operation and applies it
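
The proposed signature can be exercised on a toy immutable state. `MiniState` below is a hypothetical stand-in for State, and treating a missing key as 0 is an assumption of this sketch, not something the proposal specifies.

```python
from typing import Any, Dict


class MiniState:
    """Hypothetical immutable state; every operation returns a new object."""

    def __init__(self, data: Dict[str, Any]):
        self._data = dict(data)

    def __getitem__(self, key: str) -> Any:
        return self._data[key]

    def increment(self, *increment_by_1: str, **increment: int) -> "MiniState":
        # Positional keys are incremented by 1, keyword keys by the given amount.
        amounts = {key: 1 for key in increment_by_1}
        amounts.update(increment)
        new_data = dict(self._data)
        for key, amount in amounts.items():
            new_data[key] = new_data.get(key, 0) + amount  # missing key counts as 0
        return MiniState(new_data)


state = MiniState({"foo": 0})
after_one = state.increment("foo")       # increment foo by 1
after_ten = after_one.increment(foo=10)  # increment foo by 10
```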

Usability/debugging (umbrella)

From user-feedback:

  • Result has to be a dictionary, this is confusing. Options:
    • Error out if the result is not a dictionary -- this is the best for now
    • Wrap it in the dictionary
    • Relax the restriction?
  • Error out if a step does not produce the items in state you are expecting it to
    • Anything in write must exist
  • People think state is mutable.

Validate `with_entrypoint` and `with_transitions` to check if the types are strings

This should fail for two reasons and mention that the type should be a string, not an action. It should also explain why (because they don't have names until they're added in):

from burr.core import action, State, ApplicationBuilder, default


@action(reads=[], writes=["list"])
def append(state: State) -> tuple[dict, State]:
    return state.append(list=1)


app = (
    ApplicationBuilder()
    .with_state(list=[])
    .with_actions(append=append)
    .with_entrypoint(append)
    .with_transitions(
        (append, append, default),
    )
    .build()
)

old_state = app.state
action, result, new_state = app.step()

print(old_state, new_state)
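
The check itself could be a small guard that runs when the builder methods are called. This sketch is standalone and does not touch ApplicationBuilder; `require_action_name` and `validate_transitions` are hypothetical helpers, and the error wording is only a suggestion.

```python
from typing import Any, Iterable, Tuple


def require_action_name(value: Any, where: str) -> str:
    """Return value if it is a string, otherwise raise a descriptive error."""
    if not isinstance(value, str):
        raise TypeError(
            f"{where} expects the action's name as a string, got "
            f"{type(value).__name__}. Actions have no name until they are "
            "registered with with_actions(name=action), so refer to them by that name."
        )
    return value


def validate_transitions(transitions: Iterable[Tuple[Any, Any, Any]]) -> None:
    """Check that both ends of every transition are action names."""
    for from_, to, _condition in transitions:
        # The "from" side may be one name or a list of names.
        sources = from_ if isinstance(from_, (list, tuple)) else [from_]
        for source in sources:
            require_action_name(source, "with_transitions")
        require_action_name(to, "with_transitions")


def append_action():  # stand-in for a decorated action function
    pass


validate_transitions([("append", "append", None)])  # passes

try:
    require_action_name(append_action, "with_entrypoint")
    message = None
except TypeError as e:
    message = str(e)
```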

Add exception transitions

Currently, an exception breaks the control flow of an action and stops the program early, even when the exception is expected. We will be adding the ability to conditionally transition based on exceptions, which will allow you to transition to an error-handling (or retry) action that does not need the full outputs of the prior action.

Here is what it would look like in the current API:

@action(reads=["attempts"], writes=["output", "attempts"])
def some_flaky_action(state: State, max_retries: int=3) -> Tuple[dict, State]:
    result = {"output": None, "attempts": state["attempts"] + 1}
    try:
        result["output"] = call_some_api(...)
    except APIException as e:
        if state["attempts"] >= max_retries:
            raise e
    return result, state.update(**result)

One could imagine adding it as a condition (a few possibilities)

@action(reads=[], writes=["output"])
def some_flaky_action(state: State) -> Tuple[dict, State]:
    result = {"output": call_some_api(...)}
    return result, state.update(**result)

builder.with_actions(
   some_flaky_action=some_flaky_action
).with_transitions(
   (
      "some_flaky_action",
      "some_flaky_action",
      error(APIException),  # infinite retries
      # or: error(APIException, max=3)  # 3 visits to this edge then it gets reset if this is not chosen
      # That's stored in state
   )
)
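
To make the proposal concrete, here is a toy, Burr-free sketch of how an exception-conditioned edge might behave. Everything in it is an assumption for illustration: `ErrorCondition` stands in for the proposed `error(...)`, `max_visits` stands in for `max` (renamed to avoid shadowing the builtin), the visit counter lives on the condition rather than in state, and the "reset if this is not chosen" behaviour is left out.

```python
from typing import Callable, Dict, List, Optional, Tuple, Type


class APIException(Exception):
    """Stand-in for the APIException named in the issue."""


class ErrorCondition:
    """Hypothetical edge condition: matches when the action raised exc_type."""

    def __init__(self, exc_type: Type[BaseException], max_visits: Optional[int] = None):
        self.exc_type = exc_type
        self.max_visits = max_visits  # None means unlimited retries
        self.visits = 0  # the issue proposes keeping this in state instead

    def matches(self, exc: BaseException) -> bool:
        if not isinstance(exc, self.exc_type):
            return False
        if self.max_visits is not None and self.visits >= self.max_visits:
            return False
        self.visits += 1
        return True


Transition = Tuple[str, str, ErrorCondition]


def run(actions: Dict[str, Callable[[dict], dict]], transitions: List[Transition],
        entrypoint: str, state: dict) -> dict:
    """Run from entrypoint, following error edges whenever an action raises."""
    current = entrypoint
    while True:
        try:
            return actions[current](state)  # a successful action ends this toy run
        except Exception as exc:
            for source, target, condition in transitions:
                if source == current and condition.matches(exc):
                    current = target
                    break
            else:
                raise  # no error edge matched: same behaviour as today


calls = {"n": 0}


def some_flaky_action(state: dict) -> dict:
    """Fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise APIException("temporary failure")
    return {**state, "output": "ok"}


final_state = run(
    {"some_flaky_action": some_flaky_action},
    [("some_flaky_action", "some_flaky_action", ErrorCondition(APIException, max_visits=3))],
    "some_flaky_action",
    {},
)
```

The action itself stays free of retry logic; the retry policy sits on the edge, which is the main appeal of the condition-based variant.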

Improve UI Layout

Problem

The current layout leaves a lot of wasted real estate and also means I cannot see the data and the graph at the same time.

Proposed Solution

  • move the graph under the sequence list
  • have the data pane on the right take the whole side. We can then subdivide the data pane more easily (e.g. to handle large state objects).


ML Training Example

We need an ML training example, and ideally want to demonstrate the following:

  • Human-in-the-loop (ml training with human input)
  • Epoch-based training for tracking/decision making

Would love contributions! See the docs from this PR for more details.

Defunct Issue about async generators -- keeping as a bit of documentation

When I built this out I was confused about async generators -- while python does not currently allow for type annotations on the return type, it does allow it to return something. This specifically suggests:

Counter app in Burr demo

Got lazy and decided not to implement. Should be pretty straightforward, follow the GPT-like example.

Burr UI

Design

See doc here: https://docs.google.com/document/d/1dMjZSj2j3nl0JWF17q1oAcxXcBa4uKH5mxm9hsMG9xM/edit#heading=h.wkfjshpmk28

Tasks:

Client

  • Implement client
  • Implement async client
  • Decide on schema + pydantic models

Server

FastUI + websockets for streaming updates. All async.

  • Implement endpoints
    • GET /projects
    • GET /project/{project_id}/runs
    • GET /project/{project_id}/{run_id}
  • Implement web sockets or server-sent events to allow for refreshing/streaming (should be straightforward)
  • Add some simple abstraction to allow for new backends
  • Add some very basic tests

UI

React + tailwind components. See license for allowance of OS.

  • Get started
    • CRA to get a react app going
    • Tie it together to serve a static route through fastAPI
    • Investigate whether we can develop efficiently with auto-updates
  • Header
  • Title
  • Project select page
  • Project runs page
  • Trace runs page
    • Viz using reactflow
    • Action + action delta
    • Replay button (optional)
    • (auto) Update toggle/refresh

demo (optional for now)

  • Docker image with a bunch of examples loaded
  • Deploy at demo.burr.dagworks.io

Transactional/commit-based state

The plan here is from the docs. Problem is that we're eagerly evaluating state at every point which is not particularly efficient (TBD, however). This also helps us solve this in a clean way: #28.

We plan the ability to manage state in a few ways:

  1. commit -- an internal tool to commit/compile a series of changes so that we have the latest state evaluated
  2. persist -- a user-facing API to persist state to a database. This will be pluggable by the user, and we will have a few built-in options (e.g. a simple in-memory store, a file store, a database store, etc...)
  3. hydrate -- a static method to hydrate state from a database. This will be pluggable by the user, and we will have a few built-in options that mirror those in persist options.
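
As a rough sketch of how the persist/hydrate pair could fit together with a pluggable backend: the names `StatePersister`, `InMemoryPersister`, and `SketchState` below are hypothetical and are not Burr's API; only the idea of a user-facing `persist` and a static `hydrate` comes from the plan above.

```python
from typing import Dict, Optional, Protocol


class StatePersister(Protocol):
    """Hypothetical plug-in interface a storage backend would implement."""

    def save(self, key: str, data: dict) -> None: ...

    def load(self, key: str) -> Optional[dict]: ...


class InMemoryPersister:
    """Simplest possible backend: a dict held in memory."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    def save(self, key: str, data: dict) -> None:
        self._store[key] = dict(data)  # copy so later changes do not leak in

    def load(self, key: str) -> Optional[dict]:
        saved = self._store.get(key)
        return None if saved is None else dict(saved)


class SketchState:
    """Toy state object exposing persist (instance) and hydrate (static)."""

    def __init__(self, data: dict) -> None:
        self._data = dict(data)

    def get_all(self) -> dict:
        return dict(self._data)

    def persist(self, persister: StatePersister, key: str) -> None:
        persister.save(key, self._data)

    @staticmethod
    def hydrate(persister: StatePersister, key: str) -> "SketchState":
        data = persister.load(key)
        if data is None:
            raise KeyError(f"no persisted state under key {key!r}")
        return SketchState(data)


store = InMemoryPersister()
SketchState({"count": 2}).persist(store, "run-1")
restored = SketchState.hydrate(store, "run-1")
```

A file or database backend would only need to provide the same two methods, which is what makes the built-in options and user-supplied ones interchangeable.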

Currently state is immutable, but it utilizes an inefficient copy mechanism. This is out of expedience -- we don't anticipate this will be painful for the time being, but plan to build a more efficient functional paradigm. We will likely have:

  1. Each state object be a node in a linked list, with a pointer to the previous state. It carries a diff of the changes from the previous state.
  2. An ability to checkpoint (allowing for state garbage collection), and store state in memory/kill out the pointers.
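
The two points above can be sketched in plain Python as follows. This is an illustration of the idea, not the planned implementation: the class name `DiffState` and its methods are hypothetical, and the sketch handles only key updates (no deletions).

```python
from typing import Any, Dict, Optional


class DiffState:
    """Immutable state node: holds only its own changes plus a parent pointer."""

    def __init__(self, diff: Dict[str, Any], parent: Optional["DiffState"] = None):
        self._diff = dict(diff)
        self._parent = parent

    def update(self, **changes: Any) -> "DiffState":
        """Return a new node; nothing is copied from earlier nodes."""
        return DiffState(changes, parent=self)

    def __getitem__(self, key: str) -> Any:
        # Walk back through the chain until some node has a value for key.
        node: Optional[DiffState] = self
        while node is not None:
            if key in node._diff:
                return node._diff[key]
            node = node._parent
        raise KeyError(key)

    def materialize(self) -> Dict[str, Any]:
        """Fold the chain into one dict, with the newest value for each key winning."""
        result: Dict[str, Any] = {}
        node: Optional[DiffState] = self
        while node is not None:
            for key, value in node._diff.items():
                result.setdefault(key, value)
            node = node._parent
        return result

    def checkpoint(self) -> "DiffState":
        """Return an equivalent node with no parent, so older nodes can be collected."""
        return DiffState(self.materialize(), parent=None)


s0 = DiffState({"count": 0, "name": "demo"})
s1 = s0.update(count=1)
s2 = s1.update(count=2)
flat = s2.checkpoint()
```

Updates become cheap because each node stores only its diff, at the cost of lookups that walk the chain; checkpointing bounds that cost by collapsing the history into a single node.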

We will also consider having the ability to have a state solely backed by redis (and not memory), but we are still thinking through the API.
