rodan's Introduction

Rodan

This repository contains Docker images that can be used to set up Rodan locally for development. With slight modifications, these images can also be used in the future for deployment to a swarm production environment. See the Rodan Wiki for more information about deploying Rodan.

Objectives

  • Simplify the installation process of Rodan on all platforms.
  • Maintain clear installation documentation.

Quick Start

If you are working on Rodan or Rodan Jobs

  • Make sure the Rodan submodule is cloned in ${repository_root}/rodan/code and is up to date with the branch you wish to work with. The branch should be either develop or the feature branch you would like to merge into develop. The master branch is only for version releases and is supposed to be a guaranteed working version.
  • Follow the instructions here: https://github.com/DDMAL/Rodan/wiki/Working-on-Rodan
    • Note the BRANCHES environment variable in the installation scripts. You can set it locally by running the following command: export BRANCHES="develop"

If you are working on Rodan-Client

Tips for Interacting with Running Containers

The following commands may seem familiar if you have worked with POSIX systems or bash shells in general. Many of the same commands exist for Docker; just add the prefix docker.

  • If you would like to see a list of all running containers on your machine, execute: docker ps
  • To copy files between a container and the host, execute: docker cp. It works the same way as scp between different computers.
  • Other commands, like docker top, are also available for monitoring containers from the outside.

Using exec is similar in concept to using SSH to connect to another computer: we use exec to connect to a specific container. It is much simpler to use docker compose exec than docker exec, because Docker Compose searches the configuration inside docker-compose.yml to know which service is being referenced. The command has this format:

  • docker compose exec <service_name> <command>
  • The command can be anything, e.g. /opt/some_directory/my_shell_script.sh
  • Commands you will use frequently for investigating problems are docker compose exec rodan bash and docker compose exec celery bash. Do not use them to edit files; use Docker volumes and your IDE outside of the container instead.
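
When these commands are scripted, the same command shape can be assembled programmatically. The following is a minimal Python sketch: the helper name compose_exec_argv is hypothetical, and it only builds the argument list; it does not run Docker.

```python
import shlex


def compose_exec_argv(service, command):
    """Build the argument list for `docker compose exec <service> <command>`.

    `compose_exec_argv` is a hypothetical helper used for illustration only.
    """
    if not service:
        raise ValueError("service name must not be empty")
    # shlex.split turns a shell-style string into separate arguments.
    return ["docker", "compose", "exec", service] + shlex.split(command)


argv = compose_exec_argv("rodan", "bash")
```

The resulting list could be passed to subprocess.run from the directory that contains docker-compose.yml.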

Consult the documentation of the Docker command line for additional information.

Automated Build

The images are rebuilt and pushed automatically at 2 a.m. by a cron job; note that the schedule below runs Monday to Friday only. You must point the cron job to the nightly script on one of the staging virtual machines. Any account will do and no authentication is required; add the line below to the crontab. Docker Hub will send a Slack notification when an image has been built. We should expect 5 new images per run, or more if there was a new tagged release of any of them.

0 2 * * 1-5 /srv/webapps/rodan-docker/scripts/nightly

You may also force Docker Cloud to rebuild new images when new commits are pushed to a Git repository. Unfortunately, we had problems connecting the rodan-docker GitHub repository to Docker Cloud due to authentication issues, so we set up a private repository on Bitbucket instead.

Additional Information

For more information about volumes in Docker, see Use volumes in the Docker documentation. See also the docs for the volumes section of the docker-compose.yml file.

rodan's People

Contributors

ahankinson, alastair, antonkhelou, breakend, brianstern, cadagong, deepanjanroy, deepio, dellsystem, emzedi, gabbyhalpin, jackyyzhang03, jregimbal, kemalkongar, khoitiennguyen, kqct, lbaribeau, malajvan, mrbannon, mrmondrian, negarehmir, p42ul, raviraina, rberkow, rivukhoda, sabrina0822, softcat477, timothydereuse, vigliensoni, yihongluo

rodan's Issues

2d grid of pages + outputs

For an at-a-glance preview of how the whole workflow is going. Pages running horizontally, output of each job running vertically. Maybe per-workflow?

Gamera classifier frontend

This seems to be the next job on our to-do list (other than DJVU binarisation, possibly). Filing it here for reference.

Convert RGB files

If we're going to be using the RGB salzinnes images then we need to either:

  • Convert RGB to greyscale before threshold binarise
  • Do DJVU binarise with defaults and no frontend
  • Do DJVU with frontend

so that the workflow can run

Store image transformations

E.g., if we crop, we need the old/new offset so that we can show the old image along with the coordinates generated by the new image.
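
As a rough illustration of why the offset is needed, here is a minimal Python sketch. It assumes the stored offset is the top-left corner of the crop in original-image pixel coordinates; the function name is hypothetical.

```python
def to_original_coords(point, crop_offset):
    """Translate an (x, y) point from the cropped image back to the original.

    Assumes `crop_offset` is the (x, y) of the crop's top-left corner
    measured in the original image; this is an illustrative assumption.
    """
    x, y = point
    offset_x, offset_y = crop_offset
    return (x + offset_x, y + offset_y)


original_point = to_original_coords((10, 20), (100, 50))
```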

Future additions might be to allow 2 different cropped images: 1 small for recognition, 1 large but without colour bars for showing in diva.

More fine-grained controls for segmentation

It would be nice to have more fine-grained controls for the segmentation frontend. For example, if we could delete points with the delete key, and click+drag to highlight multiple points, etc.

Other things:

  • When you drag one corner of a box, it should drag the two adjacent corners as well (to preserve perpendicularity) - maybe only when holding control.
  • Make the canvas slightly larger than the image so that you have more of a "buffer" when moving points. Right now, if I move a point too close to the edge of the image, I lose control of it, which is undesirable.
  • What else?

Decorators for views

Could handle things like

  • Passing in the relevant model instances (Project, Page, Job, etc) from the slug/id
  • Rendering the template (with the context dictionary being the return value of the function)
  • Automatically generating breadcrumbs (when relevant)

To be considered. Low priority (maybe after the July 1st deadline).

More automated running

As suggested by Ich (just storing here for future reference if needed):

  1. During workflow creation, inspect the result of a process on a whole bunch of pages (e.g. see that binarise value x is good for all these pages)
  2. When it runs, don't bring up the binarise interface, just automatically run it
  3. Way to go back and tweak value for a job on a single page, then run from there again. (Save results from the last run?)

Optimise staff removal

The staff removal task takes quite a long time (around 80-100 seconds is what I see on susato). It would be great if this could be profiled and then optimised, if possible.

Create viewport images

These would be JPEG images of the same size as the 400px-wide thumbnail (or another one) but not scaled down (so only a small section would be visible).

Useful for despeckle and maybe others.

What part of the image should be shown?

Editing parameters

It needs to be possible to edit parameters and also decide if the step should be automatic or not. This will require some minor changes to the model infrastructure, and to the way job views are handled to cut down on code duplication.

It should also be possible to edit parameters and restart a job that has already completed.

Segmentation: Remove all intermediate points

If you need to change a box and it's got lots of points then it's very hard. Make a button that just chooses the left/right/top/bottom most corners and makes a rectangle for easy resizing.
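
The button's behaviour could be sketched as follows in Python. This is only an illustration of the idea, assuming a box is a list of (x, y) points; the function name is hypothetical and the real frontend would do the equivalent in JavaScript.

```python
def bounding_rectangle(points):
    """Replace a many-point box with the four corners of its bounding rectangle.

    Corners are returned clockwise starting from the top-left, with y growing
    downwards as in image coordinates.
    """
    if not points:
        raise ValueError("a box needs at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


corners = bounding_rectangle([(1, 5), (4, 2), (3, 9), (0, 4)])
```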

Store image size in staff-finding JSON

For @brianstern and @antonkhelou

Instead of storing the scaling factor in the database (when creating thumbnails), we can just send the original image size in the JSON that's passed to the frontend.

Make the json look something like this:

{
  "width": 4000,
  "height": 5000,
  "points": [list,of,points]
}

Then you can calculate the scale factor in the frontend from the size of the thumbnail and these dimensions.
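
The arithmetic is sketched below in Python for illustration (the frontend itself would do this in JavaScript). The function names, the thumbnail size, and the single sample point are assumptions made up for the example.

```python
import json


def scale_factors(staff_json, thumb_width, thumb_height):
    """Return (x_scale, y_scale) mapping original coordinates to thumbnail ones."""
    data = json.loads(staff_json)
    return (thumb_width / float(data["width"]),
            thumb_height / float(data["height"]))


def scale_point(point, factors):
    """Scale one (x, y) point from original-image space to thumbnail space."""
    return (point[0] * factors[0], point[1] * factors[1])


# Made-up payload in the shape proposed above; the single point is illustrative.
payload = '{"width": 4000, "height": 5000, "points": [[1000, 2000]]}'
factors = scale_factors(payload, 400, 500)
thumb_points = [scale_point(p, factors) for p in json.loads(payload)["points"]]
```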

Project creation

If the first job is an automatic one, it must be kicked off as soon as the workflow has been created.

Also, account for multiple workflows for one page (does everything take that into account? e.g. when getting results)

Redirect on 404 job

E.g. if you go to /project/1/crop and there are no crop jobs available, it gives a 404. Maybe it should redirect to /projects/1 with a flash message saying that you can't do that job?

Create image thumbnails on background job thread

So that the server is responsive after you upload something. Also useful if you give a directory and ingest all files.
Need to make sure that things that require pages don't show these "uploaded" pages until the thumbnails are ready (e.g. you can't make a workflow with them).

Metadata server & smaller images for neon

Neon needs images that are ~1200px wide to show as a background. It also needs a webservice call to ask for the original size of that image for scaling purposes.

Per-project statistics

To show on the project page.

  • How many pages are done
  • Total %age complete
  • How many things are waiting for people (queue size per job, or just 1 global queue?)
  • How fast jobs are being done?

Automatic job integration

Should be started right after the previous job finishes (in the @rodan_task decorator).

Also, should take the workflow-specific defaults into account (param). The job-specific defaults are the backup.
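
The intended precedence could look like the following Python sketch. It assumes both sets of defaults are plain dictionaries; the function and parameter names are hypothetical.

```python
def resolve_params(job_defaults, workflow_defaults):
    """Merge parameters so workflow-specific values override job-specific ones."""
    params = dict(job_defaults)       # start from the backup values
    params.update(workflow_defaults)  # workflow-specific values win
    return params


params = resolve_params({"threshold": 128, "despeckle": 5}, {"threshold": 100})
```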

Job interfaces

@brianstern, can you put up your work on the interfaces? You don't have to put them in the repository or worry about making them Django templates, even just putting it up as a gist and then linking it would be fine. I just need to see how you've done it so I can cleanly integrate it with what I have already.

Improve task modularity using decorators

I'm working on this now. All the result and result file stuff should be handled outside of the task-specific code. Here's a sneak preview:

@task
@do_stuff(input_filetype='tiff')
def simple_binarise(image_filepath, **kwargs):
    output_img = gamera.core.load_image(image_filepath)
    output_img = output_img.threshold(kwargs['threshold'])

    return {'tiff': output_img}
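
To illustrate the general idea of keeping result handling out of task code, here is a self-contained Python sketch. It is not the actual do_stuff decorator, whose implementation is not shown here; the names collects_outputs and save are hypothetical.

```python
import functools


def collects_outputs(save):
    """Hypothetical decorator factory: `save(extension, obj)` stores one output."""
    def decorator(task_func):
        @functools.wraps(task_func)
        def wrapper(input_path, **kwargs):
            # The task only returns {extension: object}; persisting happens here.
            outputs = task_func(input_path, **kwargs)
            return {ext: save(ext, obj) for ext, obj in outputs.items()}
        return wrapper
    return decorator
```

A task wrapped this way stays free of any knowledge about where or how its results are stored.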

Move or resize rotation control

The control for rotating the image in the rotation task could be resized or floated next to the canvas. Brought up by Andrew this morning. Filed here for reference.

Integrate Border Removal Plugin

The border removal plugin (github:DDMAL/gamera-lyric-extraction) works now, and it would be great if we could integrate it into the workflow system.

Using it is pretty straightforward:

from gamera.core import init_gamera, load_image

init_gamera()
from gamera.toolkits import lyric_extraction

# border_removal requires an image of type GREYSCALE
im = load_image("blah.tiff")
im = im.to_greyscale()

# border_removal returns a mask; here it is called with the default settings
mask = im.border_removal()

result = im.mask(mask)

The result should be a greyscale file with the borders (incl. colour bars) removed.

The technique behind it is pretty tricky, so not sure if we can do this interactively with JS, but the default settings seem to work fine, so this can probably be a non-interactive task.

Staff Removal for Split Staves

When a segmented staff is split into multiple parts (generally by an ornate letter), only the leftmost staff is removed.

Demo video

For Ich's demo, we need a script to run through on the prod site, as well as a video showing the same thing (<5 minutes)

Restart crashed tasks

Sometimes a Celery task will crash. We need to make sure it recovers and restarts if possible, or show this information somewhere so that someone can go in and fix it.

Pitch finding dies on unexpected json inputs

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/execute/trace.py", line 181, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/wliu/rodan_source/rodan/jobs/utils.py", line 55, in real_inner
    outputs = f(*input_paths, **kwargs)
  File "/home/wliu/rodan_source/rodan/jobs/pitch_finding.py", line 25, in pitch_find
    recognized_glyphs = aomr_obj.run(glyphs, poly_list)
  File "/home/wliu/rodan_source/rodan/jobs/aomr_resources/AomrObject.py", line 104, in run
    self.find_staves(poly_list)
  File "/home/wliu/rodan_source/rodan/jobs/aomr_resources/AomrObject.py", line 242, in find_staves
    diff_lo = avg_lines[3]-avg_lines[2]
IndexError: list index out of range

Found 1 staff. The JSON produced by staff-finding looks like this:

[[[[1084, 215], [1086, 215], [1164, 215], [1170, 215]]], [[[701, 1833], [776, 1833]]]
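
One way to fail more gracefully is to check the number of lines before indexing. The Python sketch below only mirrors the failing expression from the traceback; the function name is hypothetical, and treating avg_lines as a list of numeric line positions is an assumption, since the real find_staves code is not shown here.

```python
def upper_line_gap(avg_lines):
    """Return avg_lines[3] - avg_lines[2], or raise a descriptive error.

    Replaces a bare IndexError with a message saying what was wrong with
    the input.
    """
    if len(avg_lines) < 4:
        raise ValueError(
            "expected at least 4 staff lines, got %d" % len(avg_lines)
        )
    return avg_lines[3] - avg_lines[2]
```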
