
doit's People

Contributors

ankostis, boidolr, bollwyvl, davidzyx, facundofc, felixfontein, hinidu, jstaph, kwpolska, madig, magmax, mlell, moltob, multimeric, nsegata, okin, onnodb, rbdixon, rbeagrie, rolegic, saimn, samuelsinayoko, schettino72, schwager-hsph, sdahdah, skadge, smutch, sobolevn, takluyver, vincent-ferotin


doit's Issues

Add a command to recompute dependencies state.

When changing check_file_uptodate, the states of dependencies are dropped, so all the tasks are considered out-of-date. This is an issue when the tasks take a long time to run and the file deps / targets already exist.
So I would like to add a command which recomputes the states of dependencies without executing the tasks. I think I will model it on the forget command: find all tasks/subtasks and call self.dep_manager.save_success for each task. Does that seem reasonable?
Also, suggestions for the cmd name are welcome ;-), something like rebuilddb?
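The proposal above can be sketched roughly as follows. `dep_manager.save_success` is the call named in the issue text; the task collection and the stand-in dependency manager here are hypothetical scaffolding, not doit's actual command machinery:

```python
# Hedged sketch of the proposed command: like `forget` in reverse, walk every
# selected task and record its current dependency state without executing it.
def recompute_state(tasks, dep_manager):
    for task in tasks:
        dep_manager.save_success(task)

class RecordingDepManager:
    """Stand-in for doit's dependency manager, for illustration only."""
    def __init__(self):
        self.saved = []

    def save_success(self, task):
        self.saved.append(task)
```

A real implementation would reuse the task-selection logic of the forget command and the live dep_manager instead of this recording stub.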

is it possible to run an action regardless of errors in a previous one?

import os
from subprocess import check_call

def enable_bullseye():
    global PATH
    PATH = os.environ['PATH']
    os.environ['PATH'] = '%(BULLSEYEDIR)s:%(PATH)s' % env()
    os.environ['COVFILE'] = COVFILE
    check_call(['cov01', '-1'])

def disable_bullseye():
    check_call(['cov01', '-0'])
    os.environ['PATH'] = PATH  # restore saved PATH value

def compile_cmds():
    compile_cmd = 'cd %(BUILDDIR)s && make' % env()
    if FLOW == 'eng':
        return [compile_cmd]
    elif FLOW == 'be':
        return [
            (enable_bullseye,),
            compile_cmd,
            (disable_bullseye,),  # <-- I want this to run no matter what happens above in compile_cmd
        ]
    elif FLOW == 'kw':
        return ['kw-inject %(compile_cmd)s' % env()]
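One workaround, pending a built-in answer: collapse the sequence into a single Python action so the cleanup runs in a `finally` block regardless of failure. This is a minimal sketch with placeholder callables (`enable_cov`, `compile_step`, `disable_cov`), not the asker's actual build steps:

```python
# Record which steps ran, so the behavior is observable in this sketch.
ran = []

def enable_cov():
    ran.append("enable")

def compile_step():
    ran.append("compile")
    raise RuntimeError("compile failed")  # simulate a build failure

def disable_cov():
    ran.append("disable")

def guarded_compile():
    """Single doit action: cleanup always runs, failure is still reported."""
    enable_cov()
    try:
        compile_step()
    except Exception:
        return False  # tell doit the action failed
    finally:
        disable_cov()  # runs no matter what happened above
    return True
```

A task would then use `{'actions': [guarded_compile]}` instead of three separate actions, trading doit's per-action reporting for guaranteed cleanup.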

Utilize python-logging for all messages

Would it be an acceptable modification to utilize Python's logging standard library for all messages?

That would facilitate integration of doit in larger projects, besides making possible various configurations for the output messages (e.g. adding timestamps, host, etc).
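To illustrate what this would enable: with standard logging, a host project could attach its own handler and format. The logger name, handler, and format below are purely illustrative assumptions, not doit's actual API:

```python
import io
import logging

# Capture output in a buffer just to make the example self-contained;
# a real project would use a file or console handler instead.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s: %(message)s"))

log = logging.getLogger("doit.reporter")  # hypothetical logger name
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("task 'compile' started")
```

Adding timestamps would then be a one-line format change (`%(asctime)s`) in the consuming project, with no changes to doit itself.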


Delayed task is never executed if basename is set

Description

If I set the "basename" option of a task that has been created via doit.create_after, this task is never executed.

Minimal example to reproduce

Here is a minimal dodo.py file to reproduce the error. You can see from the output to sys.stderr that this is not merely a case of not reporting the task's run; it actually doesn't run at all.
Note that removing either the "basename" option or the create_after decorator produces the "correct" behaviour; it is only when used together that they cause trouble.

dodo.py:

from doit import create_after
import sys

def say_hello(your_name):
  sys.stderr.write("Hello from {}!\n".format(your_name))

def task_a():
  return {
    "actions": [ (say_hello, [ "a" ] ) ]
  }

@create_after("a")
def task_b():
  return {
    "actions": [ (say_hello, [ "b" ] ) ],
    "basename": "B saying hello"
  }

Expected behaviour

Terminal log:

$ doit
.  a
Hello from a!
.  B saying hello
Hello from b!
$ 

Actual behaviour

Terminal log:

$ doit
.  a
Hello from a!
$ 

System information

  • doit version: 0.27.0
  • Python version: 3.4.2

Executing tasks in parallel fails on Windows

First of all, thank you for this tool, which seems to be very useful for automating scientific workflows. I am using doit to automate a training-evaluation workflow, which contains many embarrassingly parallel tasks, so the -n <NUM_JOB> option is exactly what I want. Unfortunately there are some pickling issues when doing so on Windows.

My software stack:

configparser              3.3.0.post2
doit                      0.28.0
pip                       6.1.1
python                    2.7.9
setuptools                15.1
six                       1.9.0

dodo.py (instead of the echo commands, some python scripts are started in my application, but the parallel execution behavior is the same):

# -*- coding: utf-8 -*-
import os
import os.path as osp

from doit import tools

FEATURES = ["lbp_small", "lbp_medium", "lbp_72angles", "hog_normalised",
            "hog_default", "daisy_default", "hog_single_cell"]
OUT = "out"

paths = {}
paths["OUT_FEATURES"] = osp.join(OUT, "features")
paths["OUT_EVALUATION"] = osp.join(OUT, "evaluation")
paths["OUT_FIGURES"] = osp.join(OUT, "figures")


def task_feature_extraction():       
    for feat in FEATURES:
        feat_spec = "feat_{}.json".format(feat)
        feat_file = osp.join(paths["OUT_FEATURES"], "feat_{}.hdf5".format(feat))
        yield {"name": feat_file,
               "actions": ["echo extract %s > %s" % (feat_spec, feat_file)],
               "targets": [feat_file],
               "clean": True,
               # force doit to always mark the task
               # as up-to-date (unless target removed)
               'uptodate': [True]}

Command: doit -n 4 causes traceback:

Traceback (most recent call last):
  File "C:\Anaconda\envs\surface-classification\lib\site-packages\doit\doit_cmd.py", line 165, in run
    return command.parse_execute(args)
  File "C:\Anaconda\envs\surface-classification\lib\site-packages\doit\cmd_base.py", line 122, in parse_execute
    return self.execute(params, args)
  File "C:\Anaconda\envs\surface-classification\lib\site-packages\doit\cmd_base.py", line 405, in execute
    return self._execute(**exec_params)
  File "C:\Anaconda\envs\surface-classification\lib\site-packages\doit\cmd_run.py", line 239, in _execute
    return runner.run_all(self.control.task_dispatcher())
  File "C:\Anaconda\envs\surface-classification\lib\site-packages\doit\runner.py", line 238, in run_all
    self.run_tasks(task_dispatcher)
  File "C:\Anaconda\envs\surface-classification\lib\site-packages\doit\runner.py", line 417, in run_tasks
    proc_list = self._run_start_processes(job_q, result_q)
  File "C:\Anaconda\envs\surface-classification\lib\site-packages\doit\runner.py", line 390, in _run_start_processes
    process.start()
  File "C:\Anaconda\envs\surface-classification\lib\multiprocessing\process.py", line 130, in start
    self._popen = Popen(self)
  File "C:\Anaconda\envs\surface-classification\lib\multiprocessing\forking.py", line 277, in __init__
    dump(process_obj, to_child, HIGHEST_PROTOCOL)
  File "C:\Anaconda\envs\surface-classification\lib\multiprocessing\forking.py", line 199, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 224, in dump
    self.save(obj)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 419, in save_reduce
    save(state)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 681, in _batch_setitems
    save(v)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\multiprocessing\forking.py", line 67, in dispatcher
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 401, in save_reduce
    save(args)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 548, in save_tuple
    save(element)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 419, in save_reduce
    save(state)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 681, in _batch_setitems
    save(v)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 419, in save_reduce
    save(state)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 681, in _batch_setitems
    save(v)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\multiprocessing\forking.py", line 67, in dispatcher
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 401, in save_reduce
    save(args)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 548, in save_tuple
    save(element)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 419, in save_reduce
    save(state)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 681, in _batch_setitems
    save(v)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 419, in save_reduce
    save(state)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 681, in _batch_setitems
    save(v)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 396, in save_reduce
    save(cls)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 748, in save_global
    (obj, module, name))
PicklingError: Can't pickle <type 'DB'>: it's not found as __builtin__.DB
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Anaconda\envs\surface-classification\lib\multiprocessing\forking.py", line 381, in main
    self = load(from_parent)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\Anaconda\envs\surface-classification\lib\pickle.py", line 880, in load_eof
    raise EOFError
EOFError
Exception AttributeError: "'_DBWithCursor' object has no attribute 'dbc'" in  ignored

If you need further details, please contact me.

Action command which accept execution failure

In make, one can prepend "-" to mark a command which is allowed to fail and will not break the whole task execution.

E.g. the following (modified) snippet from a Makefile for building PDF documentation runs pdflatex a few times (all runs must succeed), then runs makeindex, which is allowed to fail without breaking the whole task execution, and finally runs pdflatex a few more times.

%.pdf: %.tex
    pdflatex '$<'
    pdflatex '$<'
    pdflatex '$<'
    -makeindex -s python.ist '$(basename $<).idx'
    pdflatex '$<'
    pdflatex '$<'

Question: is such an option available in doit?
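One workaround in the meantime: wrap the fallible command in a Python action that swallows a non-zero exit code. `tolerant` is a hypothetical helper name, not part of doit's API, and the file names are taken from the make example above:

```python
import subprocess

def tolerant(cmd):
    """Return a doit action that never fails, mimicking make's '-' prefix."""
    def action():
        subprocess.call(cmd, shell=True)
        return True  # report success to doit regardless of the exit code
    return action

def task_pdf():
    return {'actions': [
        "pdflatex doc.tex",
        tolerant("makeindex -s python.ist doc.idx"),  # allowed to fail
        "pdflatex doc.tex",
    ]}
```

On POSIX shells, appending `|| true` to a plain cmd-action string achieves the same effect without a wrapper.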

"sqlite3.OperationalError: database is locked" in parallel execution

Hi,

Sometimes when executing doit with parallel execution (e.g. -n 10), I get the following error:

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
OperationalError: database is locked
Error in sys.exitfunc:
sqlite3.OperationalError: database is locked

This is with python-sqlite2 version 2.3.5.

I am not sure how to debug this, because it seems to appear randomly. It does however affect running tasks; I think it closes their stdin/stdout or sends them a signal.

`auto` does not work on Mac OS X

Using the auto command does not work on Mac OS X (10.10.3). The task is run once, but then nothing happens after that. I have played around with the source code and, from what I can tell, the process that doit auto creates is not exited when the file-system change event is received.

Here is a simple dodo.py file that shows this problem:

def task_test(): 
    return { 
        'actions': ['echo "test" > test.txt'], 
        'file_dep': ['watch.txt']             
        }

If you run doit auto and modify watch.txt on Mac OS X, nothing happens.

WIP at #101



Add support for PosixPath in file dependencies

In the FAQ, pathlib is suggested for handling folders and paths:

file_dep does NOT support folders. If you want to specify all files from a folder you can use a third party library like pathlib (pathlib was added to Python 3.4's stdlib).

But the current version of doit doesn't support PosixPath:

Traceback (most recent call last):
  File "/home/bartosz/.pyenv/versions/anaconda3-2.3.0/lib/python3.4/site-packages/doit/runner.py", line 125, in select_task
    node.run_status = self.dep_manager.get_status(task, tasks_dict)
  File "/home/bartosz/.pyenv/versions/anaconda3-2.3.0/lib/python3.4/site-packages/doit/dependency.py", line 600, in get_status
    if not os.path.exists(targ):
  File "/home/bartosz/.pyenv/versions/anaconda3-2.3.0/lib/python3.4/genericpath.py", line 19, in exists
    os.stat(path)
TypeError: argument should be string, bytes or integer, not PosixPath

A possible solution is converting to a string with str(path), but handling PosixPath directly would be far more elegant.
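The str() workaround mentioned above looks like this in practice. The file names are made up for illustration; the point is only that every entry handed to `file_dep` is a plain string:

```python
from pathlib import Path

# Hedged workaround until PosixPath is supported natively: convert each
# Path object explicitly before building the task dict.
paths = [Path("data") / "a.txt", Path("data") / "b.txt"]
file_dep = [str(p) for p in paths]

def task_process():
    return {'actions': ["echo processing"], 'file_dep': file_dep}
```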

unstable test on travis

There are some tests that sometimes fail... They work consistently on my machine but fail on travis.
There seems to be no way to debug on travis.

Failing on callable tasks with keyword only args

Hello! Thank you for the great tool!
I found this problem when trying to use shutil.copyfile:

from shutil import copyfile

def task_copyfile():
    return {'actions': [(copyfile, ['foo', 'bar'])]}

It fails with the following error message:

Traceback (most recent call last):
  File "C:\Python34\lib\site-packages\doit\doit_cmd.py", line 165, in run
    return command.parse_execute(args)
  File "C:\Python34\lib\site-packages\doit\cmd_base.py", line 122, in parse_execute
    return self.execute(params, args)
  File "C:\Python34\lib\site-packages\doit\cmd_base.py", line 405, in execute
    return self._execute(**exec_params)
  File "C:\Python34\lib\site-packages\doit\cmd_run.py", line 239, in _execute
    return runner.run_all(self.control.task_dispatcher())
  File "C:\Python34\lib\site-packages\doit\runner.py", line 238, in run_all
    self.run_tasks(task_dispatcher)
  File "C:\Python34\lib\site-packages\doit\runner.py", line 204, in run_tasks
    catched_excp = self.execute_task(node.task)
  File "C:\Python34\lib\site-packages\doit\runner.py", line 166, in execute_task
    return task.execute(sys.stdout, sys.stderr, self.verbosity)
  File "C:\Python34\lib\site-packages\doit\task.py", line 376, in execute
    action_return = action.execute(task_stdout, task_stderr)
  File "C:\Python34\lib\site-packages\doit\action.py", line 368, in execute
    kwargs = self._prepare_kwargs()
  File "C:\Python34\lib\site-packages\doit\action.py", line 337, in _prepare_kwargs
    self.args, self.kwargs)
  File "C:\Python34\lib\site-packages\doit\action.py", line 44, in _prepare_kwargs
    argspec = inspect.getargspec(func)
  File "C:\Python34\lib\inspect.py", line 936, in getargspec
    raise ValueError("Function has keyword-only arguments or annotations"
ValueError: Function has keyword-only arguments or annotations, use getfullargspec() API which can support them

I'm using Python 3.4; according to the documentation, the argument follow_symlinks was added to shutil.copyfile in Python 3.3. With Python 2.7 the code above works without any problem.
Of course this particular problem can be solved on my side, with a lambda for example, but I thought it could be useful information for you.
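The lambda idea amounts to hiding the keyword-only argument behind a plain positional signature that inspect.getargspec() can handle. A named wrapper works the same way and keeps the task readable; `copy_action` is a hypothetical helper name:

```python
from shutil import copyfile

def copy_action(src, dst):
    """Plain positional signature, safe for inspect.getargspec()."""
    copyfile(src, dst)
    return True  # report success to doit

def task_copyfile():
    return {'actions': [(copy_action, ['foo', 'bar'])]}
```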

UnicodeDecodeError when running command

I get the following when I run a command from a dodo.py file:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/utils/python/2.7.6/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/home/utils/python/2.7.6/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/utils/python/2.7.6/lib/python2.7/site-packages/doit/action.py", line 142, in print_process_output
    line = input.readline().decode('utf-8')
  File "/home/utils/python/2.7.6/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa9 in position 10: invalid start byte

It seems that doit cannot handle non-UTF-8-encoded output. In my case, the tool I am running outputs the copyright symbol to stdout.
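One plausible fix (an assumption about how the output thread could behave, not a confirmed doit change) is to decode tolerantly, replacing undecodable bytes instead of raising:

```python
# The byte string mimics a latin-1 copyright symbol (0xa9) inside an
# otherwise ASCII line, as described in the report above.
raw = b"Copyright \xa9 2015 Example Corp"

# errors="replace" substitutes U+FFFD for bad bytes instead of raising
# UnicodeDecodeError, keeping the output thread alive.
line = raw.decode("utf-8", errors="replace")
```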

I attached a sample of the failing setup. Please change the extension to *.tgz and unpack it. I could not figure out how to attach a non-image file to the issue report.


Task with a target but no file dependencies logic

Why would a task with a target not be up-to-date unless that target is a file dependency of another task?

https://github.com/pydoit/doit/blob/master/doit/dependency.py#L517-519

Example:

def task_init():
    return {
        'actions': ["echo init > package.json"],
        'targets': ["package.json"],
    }

is always run until

def task_install():
    return {
        'actions': ["echo installing modules"],
        'file_dep': ["package.json"],
    }

I'm not understanding why that dependency is required to check for the existence of package.json. I didn't think doit was working correctly until I tracked the problem to these lines, but my intuition about targets may be off.

Thanks in advance!

speed up sqlite3 DB

It is unusably slow...

Change its implementation to save the DB only once, at the end of the execution. That means it will still be usable by multiple instances, but multiple instances must not execute the same tasks...

task with targets always runs

The documentation says "If a target doesn't exist the task will be executed", but the following code runs every time doit is called, no matter whether package.json exists or not.

def task_init():
    return {
        'actions': ["echo init > package.json"],
        'targets': ["package.json"],
    }

Output is as follows:

$ ls
dodo.py
$ doit
.  init
$ ls
dodo.py         package.json
$ doit
.  init

Admittedly, I might be missing something important, so if it can be explained why this is the case, that would be appreciated. Thanks!
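One commonly suggested workaround for this situation (an assumption about usage, not an official resolution of this issue) is to give the task an `uptodate` callable that checks whether its targets already exist, since a task with only targets and no file_dep has nothing else for doit to compare:

```python
import os

def targets_exist(task, values):
    """uptodate callable: up-to-date when every target is already on disk."""
    return all(os.path.exists(t) for t in task.targets)

def task_init():
    return {
        'actions': ["echo init > package.json"],
        'targets': ["package.json"],
        'uptodate': [targets_exist],
    }
```

Note that a later issue in this list reports complications with uptodate callables on a task's first run, so treat this as a sketch rather than a guaranteed fix.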

doit throws unhelpful exception when subtask does not exist

First of all thanks for this awesome framework - it helped me a lot!

While this is no functional problem it makes it hard to use for users.
When I request a target where the basename exists but not the subtask, doit fails with an exception unrelated to the real problem.

Minimal example:

def task_example():
    for sub in ['a','b']:
        yield { 'actions' : ["echo %s" % sub],
                'name'    : sub,
                'verbosity' : 2,
                'uptodate': [False]}

When I now run doit example:c I get the following traceback:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/doit/doit_cmd.py", line 165, in run
    return command.parse_execute(args)
  File "/usr/lib/python2.7/site-packages/doit/cmd_base.py", line 122, in parse_execute
    return self.execute(params, args)
  File "/usr/lib/python2.7/site-packages/doit/cmd_base.py", line 405, in execute
    return self._execute(**exec_params)
  File "/usr/lib/python2.7/site-packages/doit/cmd_run.py", line 176, in _execute
    self.control.process(self.sel_tasks)
  File "/usr/lib/python2.7/site-packages/doit/control.py", line 203, in process
    self.selected_tasks = self._filter_tasks(task_selection)
  File "/usr/lib/python2.7/site-packages/doit/control.py", line 185, in _filter_tasks
    loader.basename = basename
AttributeError: 'NoneType' object has no attribute 'basename'

fix initial pickling on windows

from #84

The problem is that I did not define a `__getstate__` method for Task, because sometimes I want to really pass the full Task, including the attributes that might contain functions, and sometimes I know I can leave these attributes out.

So the fix would be to define a `Task.__getstate__` that removes unsafe attributes by default, and figure out a way to create a pickle from an object ignoring its `__getstate__`! I am not sure how to do that. Maybe using cloudpickle will just solve the problem in an easier way.

I will take some time to experiment on that... meanwhile you can remove this change in UptodateCalculator (and let the test fail). This change will actually break the code if you have a getargs attribute in a DelayedTask (the tests don't cover that). So I will merge the improvement on not using wheels and fix this issue later. Of course you are welcome to try to fix it yourself :)

Ability to select tasks using a shortcut

Usually tasks are identified by long names like "do_something_on_this" or "build_documentation". Thankfully we have tab-completion, which makes building the command line less annoying...

I think it would be nice to have another, quicker, mnemonic way to select a task once you know its name: what about being able to say doit dsot or doit bd to run the hypothetical tasks mentioned above?

An initial, proof-of-concept implementation is lelit/doit@a540533
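The idea can be sketched as matching an abbreviation against the initials of each underscore-separated task name. This is an illustrative guess at the semantics, not the linked proof-of-concept implementation; ambiguous or unmatched abbreviations return nothing:

```python
def match_shortcut(abbrev, task_names):
    """Return the unique task whose word initials spell `abbrev`, else None."""
    matches = [name for name in task_names
               if "".join(word[0] for word in name.split("_")) == abbrev]
    return matches[0] if len(matches) == 1 else None
```

So "bd" would select "build_documentation", while an abbreviation shared by two tasks would need to fall back to normal name resolution.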

plugin system

Add a plugin system. It should support:

  • commands
  • reporters
  • db-backend
  • runner
  • task configuration?

These plugins could be distributed as separate Python packages on PyPI, or just included in the root folder of a project together with dodo.py.

keep-going after failure

make has --keep-going, which continues executing the dependency tree even when other branches have failed. It would be nice to have something similar for doit.
I am unsure about the technical implementation. I think the threaded execution (-n) already runs until each thread fails or succeeds?

delay creation of tasks until after another task is executed

This is required when, at the time dodo.py is loaded, you don't have enough information to create some of your tasks' metadata. See: getnikola/nikola#1562

WIP branch: https://github.com/pydoit/doit/tree/delayed-task-creation

Sample testing code: https://gist.github.com/schettino72/9868c27526a6c5ea554c

TODO:

  • support delayed task creation for run command
  • make sure commands list, clean, etc create all tasks
  • fix multi-processing
  • make sure implicit dependencies (file_dep -> another_task.target) are respected
  • tests

specify target of a DelayedTask on command line

Since not all tasks are created before execution starts, some special handling is required for the names of targets that have not been created yet.

See discussion on getnikola/nikola#1562 (comment)

The general idea is that, if a target is not found, before raising an error to the user doit should try to load DelayedTasks (as done by the list command) to look for the given target name.

Some considerations:

  1. As of now a DelayedTask creates an implicit task_dep for the task given in its executed param. But this task_dep is not preserved when the DelayedTask is re-created. It should not only be preserved, but all created tasks should include this task_dep, because the guarantee that the dependent task was already executed won't exist anymore!

  2. If the selected tasks for execution include known tasks and targets, they should be executed before any DelayedTask is loaded to look for an unknown target. This will ensure that the same command line will work nicely even on its first execution.

Uptodate state issue with result_dep

Hi,
I have a task for which I use result_dep on a list of subtasks, and this task is not considered up-to-date after running it, even if the subtasks are up-to-date.
Here is an example to reproduce this (the file_dep makes it possible for the subtasks to be up-to-date, but the result is the same without it):

from doit.tools import result_dep

def task_version():
    def version(name):
        print(name)
        return name

    for name in ('foo', 'bar'):
        yield {
            'name': name,
            'file_dep': ['dodo_resultdep.py'],
            'actions': [(version, [name])]
        }

def task_send_email():
    return {'actions': ['echo "TODO: send an email"'],
            'uptodate': [result_dep('version:foo'), result_dep('version:bar')]}

and the output:

(doit)~/l/doit git:master ❯❯❯ doit -v 2 -f dodo_resultdep.py
.  version:foo
foo
.  version:bar
bar
.  send_email
TODO: send an email
(doit)~/l/doit git:master ❯❯❯ doit -v 2 -f dodo_resultdep.py
-- version:foo
-- version:bar
.  send_email
TODO: send an email
(doit)~/l/doit git:master ❯❯❯ doit -v 2 -f dodo_resultdep.py
-- version:foo
-- version:bar
-- send_email

The send_email task is up-to-date only after the second execution, which is not acceptable for real tasks ;-).

Doit does not store task.values when an uptodate callback returns True

Hello @schettino72, I was trying to execute an expensive and slow task just once, without using run_once, because the task could already be done before doit existed in our project. The problem is that checking whether it is already done is slow too, so I would like to check it only once.

Then I tried to implement my own uptodate callback to detect whether my_expensive_task_is_done was already executed before, but doit does not store new task.values if uptodate returns True, so I am running in circles here... I know: use a file as target and write it only if the task is done (which is actually what I did), but it isn't the best solution. I would like to use doit's cache; I think it would be more elegant.

cmd-action substitution keys should include (almost) all task properties

The most important task properties are the "dependency fields", and it is useful to have them in the substitution dictionary for debugging purposes.
Currently even a simple echo %(name)s cmd-action does not work.

As a side note, it would help the user if, at action.py line 248, a KeyError were caught and the original cmd's format-text causing the problem were printed.
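The request amounts to the following: expose task properties in the %-style substitution dictionary that cmd-actions already use. The property names below are examples of what could be exposed, not doit's confirmed set:

```python
# Hypothetical substitution dictionary built from task properties.
task_props = {
    "name": "compile",
    "targets": "a.out",
    "dependencies": "main.c util.c",
}

# A cmd-action string would then resolve via ordinary %-formatting.
cmd = "echo building %(name)s from %(dependencies)s" % task_props
```

A missing key raises KeyError here too, which is exactly the spot where printing the offending format string would help, as suggested above.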

errors with python2.6 on cent-os

Maybe this should not qualify as an issue, as the docs do not list python 2.6 as a version that is tested against, but it seems that doit will not work when installed from pip, easy_install or source.

Here is the error message I get when trying to run doit:

[j@hbx ~]$ doit
Traceback (most recent call last):
  File "/usr/bin/doit", line 9, in <module>
    load_entry_point('doit==0.29.dev0', 'console_scripts', 'doit')()
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 299, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2229, in load_entry_point
    return ep.load()
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 1948, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
  File "/usr/lib/python2.6/site-packages/doit-0.29.dev0-py2.6.egg/doit/__init__.py", line 33, in <module>
    from doit.doit_cmd import get_var
  File "/usr/lib/python2.6/site-packages/doit-0.29.dev0-py2.6.egg/doit/doit_cmd.py", line 11, in <module>
    from .plugin import PluginDict
  File "/usr/lib/python2.6/site-packages/doit-0.29.dev0-py2.6.egg/doit/plugin.py", line 86
    return {k: self.get_plugin(k) for k in self.keys()}
                                    ^

Sorry in advance if this is not an appropriate thing to file an issue on.

Thanks, -JE

show reason task is not up-to-date

Show the reason why a task is (or would be) executed, e.g. "Missing target: x.y.z" or "Dependency updated: a.b.c".

This should be implemented as part of doit info command.

task is run although uptodate returns True

Hi,

I have a program that produces an output file. I already ran that program outside of doit. I wrote a dodo file for that task (which is computationally costly).
I want doit not to re-run that task, so I added the output file to targets and wrote this function for uptodate::

  def exists(task, values):
      # True when every declared target already exists on disk
      return all(os.path.exists(path) for path in task.targets)

If I add a print I can see that the function returns True, but doit runs the task nevertheless. Is this because the task is not in the database yet? How can I avoid execution? Do I have to modify the database?

Cheers,
Johannes
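For reference, a minimal dodo.py sketch of the setup described above (the command and file names are placeholders, not the reporter's actual ones):

```python
import os

def targets_exist(task, values):
    # uptodate callable from the report: True when all targets already exist
    return all(os.path.exists(path) for path in task.targets)

def task_costly():
    return {
        'actions': ['expensive_program --out result.dat'],  # placeholder command
        'targets': ['result.dat'],
        'uptodate': [targets_exist],
    }
```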

issues on handling multiple matches delayed target regex

Refers to tasks missing from PR #58

  • need docs
  • when --auto-delayed-regex is used, if a target that doesn't exist (or a typo) is passed on the command line, it is silently ignored. It should raise an error saying the target/task/command doesn't exist

`doit list` has issues with Unicode strings

I created a Nikola post with the slug ą.

$ nikola list --all
…
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/doit/doit_cmd.py", line 165, in run
    return command.parse_execute(args)
  File "/usr/lib/python2.7/site-packages/doit/cmd_base.py", line 122, in parse_execute
    return self.execute(params, args)
  File "/usr/lib/python2.7/site-packages/doit/cmd_base.py", line 405, in execute
    return self._execute(**exec_params)
  File "/usr/lib/python2.7/site-packages/doit/cmd_list.py", line 149, in _execute
    self._print_task(template, task, status, list_deps, tasks)
  File "/usr/lib/python2.7/site-packages/doit/cmd_list.py", line 83, in _print_task
    self.outstream.write(template.format(**line_data))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0105' in position 26: ordinal not in range(128)

Expected output: render_pages:output/posts/ą.html
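The failure is the classic Python 2 implicit-ASCII encode when writing unicode to a byte stream; an illustrative sketch of the workaround (encode explicitly before writing, rather than letting the default codec raise):

```python
import io

# Writing the unicode task name through an explicit UTF-8 encode avoids the
# default ASCII codec that raises UnicodeEncodeError under Python 2.
buf = io.BytesIO()  # stands in for a byte-oriented outstream
line = u'render_pages:output/posts/\u0105.html\n'
buf.write(line.encode('utf-8'))
```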

Option to not use md5 for file dependencies.

Hi,
First, thanks for doit, it is a great tool !
I'm using it currently to process a lot of huge files, and the bottleneck is computing the md5sums for the file dependencies. From what I have seen in the code, it is currently not possible to turn this off. Even if I use task_dep and uptodate with check_timestamp_unchanged instead of file_dep, I still need to specify targets, so md5sums are still computed on the target files.
So, unless I missed another way to achieve this, what I would like is to keep using file_dep and targets (which work well and are very practical with subtasks), but with only a timestamp check.
Maybe adding an option in DOIT_CONFIG to deactivate the use of md5sums could be a solution?
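For reference, later doit releases grew exactly this knob: a `check_file_uptodate` entry in DOIT_CONFIG that selects the checker (md5 or timestamp). A sketch, assuming a doit version that supports it, with a placeholder command:

```python
# Use timestamps instead of md5 for file_dep/targets checking
# (requires a doit release that supports the check_file_uptodate setting).
DOIT_CONFIG = {'check_file_uptodate': 'timestamp'}

def task_process():
    return {
        'actions': ['process_huge_file in.dat out.dat'],  # placeholder command
        'file_dep': ['in.dat'],
        'targets': ['out.dat'],
    }
```

Note the caveat from this tracker: switching checkers drops the stored dependency state, so every task looks out-of-date after the change.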

Potential bug in `list -s --all`

When running doit with the arguments list -s --all, I am getting the following exception:

U cconv                           
Traceback (most recent call last):
  File "/home/jgosmann/.local/lib/python2.7/site-packages/doit/doit_cmd.py", line 121, in run
    return command.parse_execute(args)
  File "/home/jgosmann/.local/lib/python2.7/site-packages/doit/cmd_base.py", line 84, in parse_execute
    return self.execute(params, args)
  File "/home/jgosmann/.local/lib/python2.7/site-packages/doit/cmd_base.py", line 278, in execute
    return self._execute(**exec_params)
  File "/home/jgosmann/.local/lib/python2.7/site-packages/doit/cmd_list.py", line 157, in _execute
    self._print_task(template, task, status, list_deps)
  File "/home/jgosmann/.local/lib/python2.7/site-packages/doit/cmd_list.py", line 88, in _print_task
    task_status = self.dep_manager.get_status(task, None)
  File "/home/jgosmann/.local/lib/python2.7/site-packages/doit/dependency.py", line 501, in get_status
    uptodate_result = utd(*args, **utd_kwargs)
  File "/home/jgosmann/.local/lib/python2.7/site-packages/doit/task.py", line 537, in __call__
    dep_task = self.tasks_dict[self.dep_name]
TypeError: 'NoneType' object has no attribute '__getitem__'

This might be my own fault, as I am using my own task loader and maybe I screwed up the construction of the tasks (this also makes it non-trivial to provide an actual minimal example). However, everything seems to work fine when executing the tasks. Also, what makes me believe that this might be a bug in doit is that tasks_dict is explicitly set to None in cmd_list.py on line 88.

auto command in conjunction with task arguments

The 'auto' command interprets task arguments as names of tasks. As a result, execution of the command fails. The same syntax works fine with the 'run' command. Example:

# this works fine:
$ doit run mytask -p arg1 -q arg2

# this fails:
$ doit auto mytask -p arg1 -q arg2

Automatic tab-completion inspired from autojump

Hi,

I just discovered this project. It looks very interesting, congrats. Going through the documentation, I found the tab-completion functionality kind of complicated.

I remember the autojump project doing something very interesting:

There is a --complete command line flag that does nothing except print the completions for the other given arguments. Then there is a generic bash/zsh completion script, installable with the software, which calls the same command with the --complete flag to generate the completions. This removes the need for a custom completion function and its associated sourcing!

I'm not involved in autojump, so excuse any misunderstanding of how it really works 😄


An example to illustrate (inspired from autojump):

ZSH completion:

#compdef doit
cur=${words[2, -1]}

doit --complete ${=cur[*]} | while read i; do
    compadd -U "$i";
done

Then when using doit:

$ doit som
$ doit som[TAB]

# This call the `_doit` zsh completion

# The zsh completion above will internally call this (on line 4):
# `doit --complete som`

# The doit program run with these argument will look for
# targets/tasks/whatever beginning with `som` and print them to stdout

# The zsh completion will use what is printed to generate the list of
# possible completion, and eventually complete the line.

$ doit something
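On the Python side, the --complete handler could be sketched roughly like this (KNOWN_TASKS and the function name are purely illustrative; a real implementation would pull task names from the loaded dodo file):

```python
import sys

KNOWN_TASKS = ['something', 'somewhere', 'build']  # stand-in for real task names

def complete(partial_words):
    # Match known task names against the last (partial) word typed so far
    prefix = partial_words[-1] if partial_words else ''
    return [t for t in KNOWN_TASKS if t.startswith(prefix)]

if __name__ == '__main__':
    # e.g. `doit --complete som` would print matching names, one per line
    for name in complete(sys.argv[1:]):
        print(name)
```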

Advantages:

  • No need for custom per-user completion functions: the generic script can be installed with the program.
  • Completion can be even smarter: you can return completions that depend on the context (e.g. not showing up-to-date targets). Python's limits are the only limits ✌️
  • It's easy to extend completion to other shells: completion is no longer shell-dependent (autojump also has a completion for fish, for instance).

I don't know yet if I will be willing/motivated to implement this, but I'm throwing this enhancement idea out there just in case someone wants to go ahead 👍

Kind Regards,
++ Fab

Priority for the verbosity config

Hi,
I have a DOIT_CONFIG with verbosity set to 1 (the default value, iirc), and I would like to set verbosity to 2 for one of my tasks, but setting verbosity to 2 in that task's attributes doesn't work; the value from DOIT_CONFIG seems to have the highest priority.
I would have thought that the priority order for this kind of setting would be: CLI options > task attribute > global DOIT_CONFIG value, but it seems that is not the case. Looking at the code, I did not find where the verbosity from the task attributes is used. It seems that something is missing here https://github.com/pydoit/doit/blob/master/doit/cmd_run.py#L195
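To make the report concrete, a minimal dodo.py sketch of the expected precedence (the task attribute asking to override the global default):

```python
# Global default verbosity is 1; the task below asks for 2, which the
# reporter expects to win over DOIT_CONFIG (but currently does not).
DOIT_CONFIG = {'verbosity': 1}

def task_noisy():
    return {
        'actions': ['echo hello'],
        'verbosity': 2,
    }
```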
