ddmal / rodan

:dragon_face: A web-based workflow engine.

Shell 0.64% Python 23.37% HTML 7.41% Dockerfile 0.35% Makefile 0.18% JavaScript 17.49% SCSS 0.33% CSS 50.22% Batchfile 0.01%

rodan's Introduction

Rodan


This repository contains Docker images that can be used to set up Rodan locally for development. With slight modifications, these images can also be used in the future for deployment to a swarm production environment. Please see the Rodan Wiki for more information about deploying Rodan.

Objectives

Simplify the installation process of Rodan on all platforms.
Maintain clear installation documentation.

Quick Start

If you are working on Rodan or Rodan Jobs

Make sure you have the Rodan submodule cloned in ${repository_root}/rodan/code and that it is up to date with the branch you wish to work with. The branch should be either develop or the name of the feature branch you would like to merge into develop. The master branch is only for version releases and is meant to be a guaranteed working version.
Follow the instructions here: https://github.com/DDMAL/Rodan/wiki/Working-on-Rodan
- Note the BRANCHES environment variable in the installation scripts. You can set it locally by running the following command: export BRANCHES="develop".

If you are working on Rodan-Client

Make sure you have Rodan-Client cloned in ${repository_root}/rodan-client/code and it is up to date with the branch you wish to work with.
Follow the instructions here: https://github.com/DDMAL/Rodan/wiki/Working-on-Rodan-Client

Tips for Interacting with Running Containers

The following commands may seem familiar to you if you have worked with POSIX systems, or bash shells in general. Many of those commands also exist for Docker; just add the prefix docker.

If you would like to see a list of all running containers on your machine, execute: docker ps
To copy files between the container and the host, in the same way you would use scp between different computers, execute: docker cp
Other commands like docker top are also available to monitor resources from outside of the containers.

Using exec is similar to using SSH to connect to another computer: we use exec to connect to a specific container. It is much simpler to use docker compose exec than docker exec, because Docker Compose looks up the configuration in docker-compose.yml to know which service is being referenced. The command has this format:

docker compose exec <service_name> <command>
The command could be anything, e.g.: /opt/some_directory/my_shell_script.sh
Commands you will use frequently for investigating problems are docker compose exec rodan bash and docker compose exec celery bash. You should not use these commands to edit files; use Docker volumes and your IDE outside of the container instead.

Consult the documentation of the Docker command line for additional information.

Automated Build

The images are rebuilt and pushed automatically every night at 2am by a cron job. Point the cron job to the nightly script on one of the staging virtual machines. Any account will do and no authentication is required; add this line to the crontab (note that the 1-5 in the day-of-week field means it runs Monday through Friday only). Docker Hub will send a Slack notification when an image has been built. We should expect 5 new images on each day it runs, or more if there was a new tagged release of any of them.

0 2 * * 1-5 /srv/webapps/rodan-docker/scripts/nightly

You may also force Docker Cloud to rebuild new images when new commits are pushed to a Git repository. Unfortunately, we had problems connecting the rodan-docker GitHub repository to Docker Cloud due to authentication issues, so we set up a private repository on Bitbucket instead.

Additional Information

For more information about volumes in Docker, see Use volumes in the Docker documentation. See also the docs for the volumes section of the docker-compose.yml file.


rodan's Issues

2d grid of pages + outputs

For an at-a-glance preview of how the whole workflow is going. Pages running horizontally, output of each job running vertically. Maybe per-workflow?

Crop doesn't work

scaling seems to be wrong - crop job crops too much.

Make projects sidebar collapsible

Remove poly button on Segmentation front-end submits form

Gamera classifier frontend

This seems to be the next job on our to-do list (other than DJVU binarisation, possibly). Filing it here for reference.

Convert RGB files

If we're going to be using the RGB salzinnes images then we need to either:

Convert RGB to greyscale before threshold binarise
Do DJVU binarise with defaults and no frontend
Do DJVU with frontend

so that the workflow can run

Login screen error

Entering just an email, with no username/password, causes an HTTP 500.

Store image transformations

E.g., if we crop, we need the old/new offset so that we can show the old image along with the coordinates generated by the new image.

Future additions might be to allow 2 different cropped images: 1 small for recognition, 1 large but without colour bars for showing in diva.
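A minimal sketch of what storing the crop offset could look like, assuming the transformation is a pure translation (no scaling or rotation). The CropTransform class and its field names are hypothetical and are not part of Rodan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropTransform:
    """Position of the cropped image's top-left corner inside the original image."""
    offset_x: int
    offset_y: int

    def to_original(self, x, y):
        # Coordinates found on the cropped image, expressed on the original image.
        return (x + self.offset_x, y + self.offset_y)

    def to_cropped(self, x, y):
        # Coordinates on the original image, expressed on the cropped image.
        return (x - self.offset_x, y - self.offset_y)


# Hypothetical crop that removed 120 px from the left and 80 px from the top.
crop = CropTransform(offset_x=120, offset_y=80)
point_on_original = crop.to_original(10, 15)
```

With the offset stored next to the crop result, coordinates generated from the cropped image can be drawn on top of the uncropped one.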

More fine-grained controls for segmentation

It would be nice to have more fine-grained controls for the segmentation frontend. For example, if we could delete points with the delete key, and click+drag to highlight multiple points, etc.

Other things:

When you drag one corner of a box, it should drag the two adjacent corners as well (to preserve perpendicularity) - maybe only when holding control.
Make the canvas slightly larger than the image so that you have more of a "buffer" when moving points. Right now, if I move a point too close to the edge of the image, I lose control of it, which is undesirable.
What else?

Despeckle after staff removal

Make this work really easily by making a new despeckle job that fits in there.

Segmentation: Make bounding boxes a bit bigger

Because segmentation is based on staff detection, the boxes hug the staves too tightly. We should at least extend them a little above and below. Will need to investigate to get optimal values.

Decorators for views

Could handle things like

Passing in the relevant model instances (Project, Page, Job, etc) from the slug/id
Rendering the template (with the context dictionary being the return value of the function)
Automatically generating breadcrumbs (when relevant)

To be considered. Low priority (maybe after the July 1st deadline).
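One possible shape for such a decorator, sketched in plain Python so it runs without Django. The PROJECTS dict, the render function, and the project_view name are all hypothetical stand-ins; a real version would query the Project model and return an HTTP response.

```python
import functools

# Hypothetical stand-in for the Project model's table.
PROJECTS = {1: {"name": "Example project"}}


def render(template_name, context):
    # Stand-in for template rendering: returns what would be rendered.
    return (template_name, context)


def project_view(template_name):
    """Look up the project by id, call the view, and render the context it returns."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(request, project_id, *args, **kwargs):
            # A real view would respond with a 404 when the id is unknown.
            project = PROJECTS[project_id]
            context = view(request, project, *args, **kwargs)
            return render(template_name, context)
        return wrapper
    return decorator


@project_view("projects/dashboard.html")
def dashboard(request, project):
    # The view only builds the context dictionary; the decorator does the rest.
    return {"project": project, "title": project["name"]}


response = dashboard(None, 1)
```

The view body shrinks to building a context dictionary, which is the main appeal of the idea. Breadcrumb generation is left out of this sketch.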

More automated running

As suggested by Ich (just storing here for future reference if needed):

During workflow creation, inspect the result of a process on a whole bunch of pages (e.g. see that binarise value x is good for all these pages)
When it runs, don't bring up the binarise interface, just automatically run it
Way to go back and tweak value for a job on a single page, then run from there again. (Save results from the last run?)

Optimise staff removal

The staff removal task takes quite a long time (around 80-100 seconds is what I see on susato). It would be great if this could be profiled and then optimised, if possible.

Redirect to image upload page after creating a project

Show indicator that images are uploading

so that people aren't inclined to click the submit button

Clean up all filler text

Filler text

Create viewport images

These would be JPEG images of the same size as the 400px-wide thumbnail (or another one) but not scaled down (so only a small section would be visible).

Useful for despeckle and maybe others.

What part of the image should be shown?

Editing parameters

It needs to be possible to edit parameters and also decide if the step should be automatic or not. This will require some minor changes to the model infrastructure, and to the way job views are handled to cut down on code duplication.

It should also be possible to edit parameters and restart a job that has already completed.

Simple Binarise front-end not working

from a quick look at the console, it seems that there is something undefined - no image is being loaded

Job to add images to diva

Pyramid tiff
Make sure divaserve works
Install diva, hook up to solr

Segmentation: Remove all intermediate points

If you need to change a box and it's got lots of points then it's very hard. Make a button that just chooses the left/right/top/bottom most corners and makes a rectangle for easy resizing.
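What the button would compute can be sketched as follows, assuming a box is stored as a list of (x, y) points in image coordinates with y growing downwards. The function name is hypothetical.

```python
def bounding_rectangle(points):
    """Replace a polygon with the four corners of its axis-aligned bounding box."""
    if not points:
        raise ValueError("a box needs at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    # Corners in clockwise order, starting from the top-left.
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


# A box with several intermediate points collapses to four corners.
polygon = [(5, 2), (9, 4), (8, 7), (7, 10), (1, 6), (3, 3)]
rectangle = bounding_rectangle(polygon)
```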

Remove a step from the middle of a workflow

e.g.

crop
rotate
binarise
recognise

Be able to remove rotate without removing recognise and binarise first, but only if the output of crop can be used as input to binarise.
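The compatibility check could be sketched like this, assuming each job declares a single input type and a single output type. The Job tuple and the type names ("rgb", "onebit", "mei") are hypothetical and only serve the example.

```python
from collections import namedtuple

# Hypothetical job description: one input type and one output type per job.
Job = namedtuple("Job", ["name", "input_type", "output_type"])


def can_remove_step(workflow, index):
    """Return True if the step at index can be removed without breaking the chain."""
    if not 0 <= index < len(workflow):
        raise IndexError("no step at position %d" % index)
    if index == len(workflow) - 1:
        return True  # nothing downstream depends on the last step
    if index == 0:
        # The next job would receive whatever the removed job used to receive.
        incoming_type = workflow[0].input_type
    else:
        incoming_type = workflow[index - 1].output_type
    return incoming_type == workflow[index + 1].input_type


workflow = [
    Job("crop", "rgb", "rgb"),
    Job("rotate", "rgb", "rgb"),
    Job("binarise", "rgb", "onebit"),
    Job("recognise", "onebit", "mei"),
]
rotate_removable = can_remove_step(workflow, 1)
binarise_removable = can_remove_step(workflow, 2)
```

Here rotate can go because crop's output feeds binarise directly, while binarise cannot because recognise needs a one-bit image.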

Store image size in staff-finding JSON

For @brianstern and @antonkhelou

Instead of storing the scaling factor in the database (when creating thumbnails), we can just send the original image size in the JSON that's passed to the frontend.

Make the json look something like this:

{
  "width": 4000,
  "height": 5000,
  "points": [list,of,points]
}

then you can calculate the scale factor in the frontend based on the size of the thumbnail and these sizes.
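The calculation is shown here in Python for illustration only, since the real frontend code would be JavaScript. It assumes the points are [x, y] pairs in original-image coordinates and that the thumbnail keeps the aspect ratio; the function names are hypothetical.

```python
import json


def scale_factor(staff_json, thumbnail_width):
    """Ratio between the thumbnail width and the original image width."""
    data = json.loads(staff_json)
    return float(thumbnail_width) / data["width"]


def scale_points(points, factor):
    # Convert original-image coordinates to thumbnail pixel coordinates.
    return [(int(round(x * factor)), int(round(y * factor))) for x, y in points]


# Example payload in the proposed shape, with two made-up points.
payload = '{"width": 4000, "height": 5000, "points": [[1000, 2000], [3000, 4500]]}'
factor = scale_factor(payload, thumbnail_width=400)
thumbnail_points = scale_points(json.loads(payload)["points"], factor)
```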

Integrate the Twitter Bootstrap

Or at least parts of it. Neon uses it so why not. At least the buttons will be prettier.

DJVU binarisation frontend

What's going on with this? Does a frontend exist?

Flash message + auto-refresh after task completion

Project creation

If the first job is an automatic one, it must be kicked off as soon as the workflow has been created.

Also, account for multiple workflows for one page (does everything take that into account? e.g. when getting results)

Redirect on 404 job

e.g. if you go to /project/1/crop and there are no crop jobs available, it gives a 404. Maybe it should redirect to /projects/1 with a flash message that you can't do that job?

Rotate job being processed even if angle is 0

takes about 15s to compute results for nothing
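One way to avoid the wasted work is to check the angle before queueing the job. This is a sketch under the assumption that the angle is given in degrees; the function name and the tolerance value are hypothetical.

```python
def needs_rotation(angle_degrees, tolerance=1e-6):
    """Return False when the angle is a no-op, i.e. a multiple of 360 degrees."""
    normalised = angle_degrees % 360
    # Distance to the nearest full turn, so that 359.9999999 also counts as zero.
    distance = min(normalised, 360 - normalised)
    return distance > tolerance


skip_for_zero = not needs_rotation(0)
rotate_for_small_angle = needs_rotation(1.5)
```

When needs_rotation returns False, the input image could be passed through unchanged as the job's result.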

Create image thumbnails on background job thread

So that the server is responsive after you upload something. Also useful for if you give a directory and ingest all files.
Need to make sure that things that require pages don't show these "uploaded" pages until thumbs are ready. Can't make workflow, etc.

Metadata server & smaller images for neon

Neon needs images that are ~1200px wide to show as a background. It also needs a webservice call to ask for the original size of that image for scaling purposes.

Per-project statistics

To show on the project page.

How many pages are done
Total percentage complete
How many things are waiting for people (queue size per job, or just 1 global queue?)
How fast jobs are being done?
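The first three figures could be computed along these lines, assuming each page can report the job it is currently waiting on (or nothing once it is finished). The data shape and the function name are hypothetical; job speed is not covered.

```python
from collections import Counter


def project_statistics(page_states):
    """page_states maps a page id to the job it is waiting on, or None when done."""
    total = len(page_states)
    done = sum(1 for waiting_on in page_states.values() if waiting_on is None)
    # One queue per job: how many pages are waiting for each kind of job.
    queue_sizes = Counter(job for job in page_states.values() if job is not None)
    percent = 100.0 * done / total if total else 0.0
    return {
        "pages_done": done,
        "percent_complete": percent,
        "queue_sizes": dict(queue_sizes),
    }


stats = project_statistics({1: None, 2: "crop", 3: "crop", 4: "binarise"})
```

Summing the values of queue_sizes gives the single global queue size, if that turns out to be the more useful number.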

Automatic job integration

Should be started right after the previous job finishes (in the @rodan_task decorator).

Also, should take the workflow-specific defaults into account (param). The job-specific defaults are the backup.
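The fallback order can be sketched as a dictionary merge, assuming both sets of defaults are plain dicts keyed by parameter name. The function name and the parameter names are hypothetical.

```python
def resolve_parameters(job_defaults, workflow_defaults=None):
    """Workflow-specific defaults win; job-specific defaults are the backup."""
    parameters = dict(job_defaults)  # copy so the job's defaults are not mutated
    parameters.update(workflow_defaults or {})
    return parameters


job_defaults = {"threshold": 128, "despeckle_size": 3}
workflow_defaults = {"threshold": 100}
parameters = resolve_parameters(job_defaults, workflow_defaults)
```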

Job interfaces

@brianstern, can you put up your work on the interfaces? You don't have to put them in the repository or worry about making them Django templates, even just putting it up as a gist and then linking it would be fine. I just need to see how you've done it so I can cleanly integrate it with what I have already.

Clone a workflow

... and add to a different set of pages

Improve task modularity using decorators

I'm working on this now. All the result and result file stuff should be handled outside of the task-specific code. Here's a sneak preview:

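import gamera.core

# task is Celery's task decorator; do_stuff is the proposed decorator (not shown
# here) that handles the result and result-file bookkeeping.
@task
@do_stuff(input_filetype='tiff')
def simple_binarise(image_filepath, **kwargs):
    # Load the input image and binarise it at the requested threshold.
    input_img = gamera.core.load_image(image_filepath)
    output_img = input_img.threshold(kwargs['threshold'])

    # Map each output file type to the image that should be saved for it.
    return {'tiff': output_img}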

Move or resize rotation control

The control for rotating the image in the rotation task could be resized or floated next to the canvas. Brought up by Andrew this morning. Filed here for reference.

Integrate Border Removal Plugin

The border removal plugin (github:DDMAL/gamera-lyric-extraction) works now, and it would be great if we could integrate it into the workflow system.

Using it is pretty straightforward:

from gamera.core import init_gamera, load_image
from gamera.toolkits import lyric_extraction

init_gamera()

# border_removal requires an image of type GREYSCALE
im = load_image("blah.tiff")
im = im.to_greyscale()

# border_removal returns a mask; calling it without arguments uses the defaults
mask = im.border_removal()

result = im.mask(mask)

The result should be a greyscale file with the borders (incl. colour bars) removed.

The technique behind it is pretty tricky, so not sure if we can do this interactively with JS, but the default settings seem to work fine, so this can probably be a non-interactive task.

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/execute/trace.py", line 181, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/wliu/rodan_source/rodan/jobs/utils.py", line 55, in real_inner
    outputs = f(*input_paths, **kwargs)
  File "/home/wliu/rodan_source/rodan/jobs/pitch_finding.py", line 25, in pitch_find
    recognized_glyphs = aomr_obj.run(glyphs, poly_list)
  File "/home/wliu/rodan_source/rodan/jobs/aomr_resources/AomrObject.py", line 104, in run
    self.find_staves(poly_list)
  File "/home/wliu/rodan_source/rodan/jobs/aomr_resources/AomrObject.py", line 242, in find_staves
    diff_lo = avg_lines[3]-avg_lines[2]
IndexError: list index out of range

Found 1 staff. The JSON produced by staff-finding looks like this:

[[[[1084, 215], [1086, 215], [1164, 215], [1170, 215]]], [[[701, 1833], [776, 1833]]]