ca-timed-node-traversal's Introduction

Workflow Runner with Timed Node Traversal

About

This is a workflow runner that accepts the specification of a workflow in the form of a DAG represented in JSON where letters are assigned to the vertices and numbers are assigned to the edges. One node needs to be designated as the start vertex.

As the runner traverses the graph, it prints the letter of each vertex it visits. Then, for each edge going out of that vertex, it waits the specified number of seconds before traveling to the connected vertex.

Note: The runner processes edges in parallel so that it starts the “timer” for each edge going out of a vertex at the same time after printing the vertex letter. For example, consider the graph where A is the start vertex:

[Figure: Example Graph, with edges A→B (5) and A→C (7)]

The runner should start by immediately printing A, then after 5 seconds print B, and then 2 seconds later (7 seconds after A) print C, because both timers start at the same moment. This graph, represented as JSON, would look something like:

{
  "A": {"start": true, "edges": {"B": 5, "C": 7}},
  "B": {"edges": {}},
  "C": {"edges": {}}
}
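The parallel-timer behavior described above can be sketched with asyncio. This is a minimal illustration under stated assumptions, not the code in main.py: the names find_start, visit, and run, the on_visit callback, and the time_scale parameter are all hypothetical and were introduced only for this example.

```python
import asyncio
from typing import Callable, Dict

# Hypothetical alias: vertex letter -> {"start": bool, "edges": {target: seconds}}
Spec = Dict[str, dict]


def find_start(spec: Spec) -> str:
    """Return the single vertex flagged with "start": true."""
    starts = [name for name, node in spec.items() if node.get("start")]
    if len(starts) != 1:
        raise ValueError(f"expected exactly one start vertex, found {len(starts)}")
    return starts[0]


async def visit(
    spec: Spec,
    vertex: str,
    on_visit: Callable[[str], None],
    time_scale: float = 1.0,
) -> None:
    """Report the vertex, then start one timer per outgoing edge at the same time."""
    on_visit(vertex)

    async def follow(target: str, delay: float) -> None:
        # Each edge waits independently, so a 5 s edge and a 7 s edge
        # finish 2 s apart rather than 12 s in total.
        await asyncio.sleep(delay * time_scale)
        await visit(spec, target, on_visit, time_scale)

    edges = spec[vertex].get("edges", {})
    await asyncio.gather(*(follow(t, d) for t, d in edges.items()))


def run(
    spec: Spec,
    on_visit: Callable[[str], None] = print,
    time_scale: float = 1.0,
) -> None:
    """Traverse the workflow from its start vertex until all timers have fired."""
    asyncio.run(visit(spec, find_start(spec), on_visit, time_scale))
```

Calling run() with the parsed JSON above would report A, B, and C in that order. In this sketch a vertex reachable along several paths is reported once per path; whether the actual runner behaves the same way is not stated here.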

Running

The "workflow engine"

This was tested on an OS X system running Python 3.10.2. To get a sense of how to run it, you can get the help by running python main.py -h at the command line. It will print something like this:

workflow runner

positional arguments:
  workflow_spec         workflow specification path represented in JSON

options:
  -h, --help            show this help message and exit
  --with-timestamps, --no-with-timestamps
                        with_timestamps will print timestamps that the node
                        was visited (default: False)

If you have a workflow specification defined in "my_workflow.json" you could process the workflow with:

python main.py my_workflow.json
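The help output shown earlier is consistent with an argparse parser along the following lines. This is a guess at the shape, not the contents of main.py: build_parser is a hypothetical name, and the use of argparse.BooleanOptionalAction (available from Python 3.9) is an assumption based on the paired --with-timestamps / --no-with-timestamps flags.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one positional path and one on/off flag."""
    parser = argparse.ArgumentParser(description="workflow runner")
    parser.add_argument(
        "workflow_spec",
        help="workflow specification path represented in JSON",
    )
    # BooleanOptionalAction registers both --with-timestamps and
    # --no-with-timestamps for a single destination.
    parser.add_argument(
        "--with-timestamps",
        action=argparse.BooleanOptionalAction,
        default=False,
        help="with_timestamps will print timestamps that the node was visited",
    )
    return parser
```

With this sketch, build_parser().parse_args(["my_workflow.json"]) yields workflow_spec set to the path and with_timestamps set to False.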

The tests

The tests can be run with python -m unittest discover. The tests are not as quick as I would like them to be because of the nature of the program (i.e. needing to wait for nodes to be processed). I could improve this by reducing the edge delays in the fixtures.
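One way to shorten fixture delays without editing each JSON file by hand is to scale them when the fixture is loaded. The helper below is a hypothetical sketch (scale_delays is not part of this project) and assumes the spec has already been parsed into a dict of the shape shown in the About section.

```python
from typing import Dict


def scale_delays(spec: Dict[str, dict], factor: float) -> Dict[str, dict]:
    """Return a copy of the spec with every edge delay multiplied by factor."""
    return {
        name: {
            **node,
            "edges": {
                target: delay * factor
                for target, delay in node.get("edges", {}).items()
            },
        }
        for name, node in spec.items()
    }
```

Because every delay is multiplied by the same factor, the relative order of visits is preserved, so assertions about ordering remain meaningful while the wall-clock time drops.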

Developing

This was developed using all standard Python libraries. There is a dev-requirements.txt that contains the Python package "black", but it's actually not necessary for development. It's a code formatter that is run before code is committed.

This was developed using the TDD methodology. It was started with tests:RunnerTest.test_simple_json_file_runs_correctly before any other code was written. The idea was to make it as black box as possible. That being said, I decided to take a bit of a shortcut and add a --with-timestamps optional argument to the script to make it easier to validate correct functionality during "end to end" testing.

A note on code documentation. My personal preference is to avoid it whenever possible. I've found that it's often out of date. It's far better to name variables, functions, etc. descriptively and refactor hard-to-understand code into smaller functions with clear names.

I'm not a huge fan of object-oriented programming, and avoid it where reasonable. I just find it easier to test and reason about when there's no strangely mutating state that I need to keep track of. For passing around data, I used named tuples here, but I also really like using Data Classes.
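To make the named tuple versus data class comparison concrete, here is the same immutable record written both ways. The Edge and EdgeData names and their fields are hypothetical and are not taken from this project's code.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Edge(NamedTuple):
    """Immutable record as a named tuple; it also unpacks like a plain tuple."""
    target: str
    delay_seconds: float


@dataclass(frozen=True)
class EdgeData:
    """The same record as a frozen data class; assigning to a field raises."""
    target: str
    delay_seconds: float


edge = Edge("B", 5)
target, delay = edge  # tuple unpacking works for named tuples
edge_data = EdgeData("B", 5)
```

Both forms avoid mutable state. The named tuple additionally supports indexing and unpacking, while the frozen data class does not compare equal to plain tuples.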

I've type-hinted most of the code here, though I've probably missed a couple of spots. It's super helpful for PyCharm's validator. I didn't run mypy on the code.

Continuous Integration

I had hoped to have enough time to write a quick GitHub Actions workflow to run the tests and check code formatting. As it is, I spent longer than I expected on this "homework", so I dropped that idea.

ca-timed-node-traversal's People

Contributors

jashugan
