Comments (23)
Parallel processing seems to work fine with threads; it only fails with processes. As far as I understand, the problem is that the updated self.tasks from the main process isn't available in the child processes, which results in a KeyError when trying to do task = self.tasks[recv_task.name].
(This can result in another problem: if task was never assigned before, the exception handler in execute_task_subprocess tries to access task.name and bails out. This can be solved by assigning task = None at the beginning of the function, checking task for None in the exception handler, and, if it is None, emitting a dict which doesn't contain name.)
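The guard described above could look roughly like the following. This is only a minimal sketch, not doit's actual code: the function signature, the dict-based tasks, and the keys of the returned dicts are assumptions made for illustration.

```python
def execute_task_subprocess(tasks, recv_name):
    """Hypothetical stand-in for the worker function discussed above."""
    task = None  # assigned up front so the handler can always inspect it
    try:
        task = tasks[recv_name]  # raises KeyError if the child never saw the task
        return {'name': task['name'], 'result': task['action']()}
    except Exception as exc:
        # Only report the name when the lookup itself succeeded.
        failure = {'failure': repr(exc)}
        if task is not None:
            failure['name'] = task['name']
        return failure
```

With this shape, a failed lookup yields a failure dict without name, while a failure inside the action still reports which task broke.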
from doit.
I guess using shared memory to store self.tasks could be useful (maybe via https://docs.python.org/library/multiprocessing.html#module-multiprocessing.sharedctypes).
(Maybe it would also be better to refactor MRunner and MThreadRunner so that such implementation details result in less clutter?)
from doit.
I updated my branch. The only problem left is multiprocessing... I just figured out what to do :)
Actually the main process sends the task to the executing process, but it sends an incomplete task without actions or any other attribute that contains a callable. This was done because it is not possible to pickle closures. I don't think sharedctypes can handle closures either.
So I will change the code to send the full task if it is a late-created task, and to give an error if it contains a closure. So closures will be forbidden only in this specific case...
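The pickling limitation mentioned above can be reproduced with the standard library alone. This is a small sketch; make_action and can_pickle are hypothetical helper names, not part of doit.

```python
import pickle


def make_action(message):
    # The nested function captures `message`, which makes it a closure.
    def action():
        return message
    return action


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Plain data and importable functions are pickled by value or by reference.
plain_task = {'name': 'copy', 'targets': ['dest/1.a']}
importable_callable = len

# A closure has no importable name, so pickle refuses it.
closure_task = {'name': 'copy', 'actions': [make_action('hello')]}
```

Here can_pickle(plain_task) and can_pickle(importable_callable) succeed, while can_pickle(closure_task) fails, which is why a task carrying such a callable cannot simply be sent to another process.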
from doit.
Aren't most tasks in Nikola using closures (such as in actions), or did I understand something incorrectly?
from doit.
Please only forbid closures when using processes and not threads. After all, there they work fine :)
from doit.
@schettino72: is it possible to create another delayed task during delayed task generation? Delayed task creation wants me to return dicts, and it seems I cannot return a Task(..., loader=DelayedLoader(...)) object.
from doit.
@felixfontein OK. I will add support for that... please check 4e32bd3
Note that the semantics of creating a task directly are not the same as passing dicts (mainly regarding task_dep and group_task).
When I have time I will create a proper patch and merge this.
from doit.
It still doesn't work. I think the reason is this line:
5cca99c#diff-4804aaab12636cb475f8a46e607c1ce7R398
It sets the loader of newly created tasks to DelayedLoaded (which equals False). Replacing that with
    if nt.name == this_task.name:
        nt.loader = DelayedLoaded
seems to work better.
from doit.
With that change, I still have a problem: I get an exception in line 103 of runner.py, when doit tries to access node.task (at that point, node equals "hold on"). This is reproduced by the newest version of my test program: https://github.com/getnikola/nikola/blob/earlytask_experiments/test.py
from doit.
I don't have time to debug this further right now, I'll try again tomorrow...
from doit.
Ok, I think I found the source -- though I don't yet know how to fix it. The problem seems to be that the node for the task creating more tasks is destroyed, while it contains the information about which nodes are waiting for it. This information does not seem to be recovered anywhere and gets lost, so the nodes waiting for the task generation wait forever.
from doit.
@felixfontein I will take a look at your example...
from doit.
Are you using the doit master branch? You should be using that. I ran your example there and there was no hanging or exceptions...
What is the problem with it? Is there any missing task? I still didn't follow the logic of your example...
from doit.
Yes, if a task uses the loader attribute, all other arguments (except the name) will be discarded when the "real" task is created. But your code seems OK in this respect.
from doit.
And you should revert your change that sets the loader to DelayedLoaded (False) only for the task that matches the name. This probably breaks some stuff.
from doit.
I was using the create-task-instance branch. And without the change, only some tasks are executed since half of them aren't generated.
The proper output should be:
GENERATE 1
. copy_1:dest/1.a
. copy_1:dest/2.a
. copy_1:dest/2.b
. level_1_wait
GENERATE 2
. copy_2:dest/1.b
. copy_2:dest/2.e
. level_2_wait
GENERATE 3
. copy_3:dest/1.c
. copy_3:dest/2.c
. level_3_wait
GENERATE 4
. copy_4:dest/1.d
. copy_4:dest/2.d
. copy_4:dest/2.f
. level_4_wait
. level_4_done
. level_3_done
. level_2_done
. level_1_done
(I never got further than level_4_done; it throws an exception with the change.)
from doit.
Thanks. I am taking a look...
from doit.
I got a fix :) your example works for me now...
223ff87
from doit.
Cool! Works great now :) I also added code (remotely) similar to loader.load_tasks() so that list and clean work again.
from doit.
The only thing which isn't working at the moment is specifying default tasks. If I add
DOIT_CONFIG['default_tasks'] = ['level_1_done', 'level_2_done', 'level_3_done', 'level_4_done']
(or similar) when cmd.execute_tasks is True (Nikola uses ['render_site', 'post_render']), doit complains that it doesn't know level_3_done (and level_4_done, if it got that far; neither is a surprise, since they aren't generated yet, not even as an empty hull). Any idea how to get around that?
from doit.
@felixfontein please give me details on your problems with list and clean and point to your code changes... Never mind, I saw your changes already.
> doit complains that it doesn't know level_3_done (and level_4_done if it would come so far; both aren't a surprise since they aren't generated yet, not even as an empty hull). Any idea how to get around that?
When I first implemented this feature I didn't think people would use it to create completely dynamic tasks; now they can... but I would recommend that you always use an "empty hull" for tasks. I don't know how to solve your problem other than by creating an empty hull.
from doit.
All changes are now merged into master. Please use master instead of create-task-instance.
I am ready to make a new release but I will wait for further feedback from you...
So let me know when you think it is good enough for a release (no hurry).
from doit.
Thanks a lot for implementing that feature and merging it into master!
I'll try to find out what precisely is needed in Nikola and how I can implement that. That'll take a bit of time (in particular because I don't have too much spare time at the moment), but if I find something on the way I'll tell you :)
from doit.