Comments (10)
Hi @idomic,
Could you please share posthob.ipynb or tell me where it is located?
I can't seem to find it.
from ploomber-engine.
I ran two sample notebooks (sample-notebooks.zip) to understand the issue a bit more. Some comments:
printed output
I was unable to reproduce the issue: all print statements are displayed when passing --log-output. ploomber-engine only displays whatever is sent to stdout, so my guess is that papermill is also displaying stderr and/or the text results from each cell. We should run a more detailed analysis and then ensure that both produce the same output. Another thing we can add is the cell delimiter (papermill prints: Executing Cell X -----).
ploomber-engine print.ipynb /dev/null --log-output
dual progress bar
I could not reproduce this by creating a notebook that displays a progress bar using tqdm and executing it with --log-output, so we need to investigate more:
ploomber-engine progress.ipynb /dev/null --log-output
from ploomber-engine.
Yeah, the delimiter can be a good option; right now it prints everything together.
To recreate, you can run posthob.ipynb.
from ploomber-engine.
Hi,
So, I am running this code:
print(1+2)
print(3+4)
print(1+7)
from tqdm.auto import tqdm
import time
my_list = list(range(100))
with tqdm(total=len(my_list)) as pbar:
    for x in my_list:
        time.sleep(0.01)
        pbar.update(1)
        if x%20==0:
            print(x)
print(1)
Running it with ploomber-engine on the CLI gives me this output:
Observations:
- The progress bar of cell 5 is not displayed
- As you can see, ploomber-engine skips \n characters when executing
- Also, ploomber-engine's execution time (around 5-8 sec) is not consistent, and it is slower than papermill (3-4 sec)
- Also, I can't find documentation for the ploomber-engine CLI command
Commands used
ploomber-engine rough.ipynb output.ipynb --log-output
papermill rough.ipynb output.ipynb --log-output
PS: I am not able to run the notebook @idomic mentioned
Edit: Updated images and fixed spelling
from ploomber-engine.
@mehtamohit013 A few thoughts:
Also, I can't find documentation for the ploomber-engine CLI command
I opened an issue about it last week, I think.
The progress bar of cell 5 is not displayed
I think if --log-output is there, we need to research why; it sounds like a bug.
Also, ploomber-engine's execution time (around 5-8 sec) is not consistent, and it is slower than papermill (3-4 sec)
It runs in a different process, which explains the difference, but try profiling it to see what's causing the delay.
Let's connect on the notebook; I'll help you run it!
from ploomber-engine.
I think the missing output might be that the tqdm progress bar is printed to standard error and we're just displaying standard output. If that's the case, we should ensure we also display standard error in the console.
You can check this with:
import sys
print("printing to stderr", file=sys.stderr)
and see if ploomber-engine displays it
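To make the hypothesis concrete, here is a minimal sketch using only the standard library. It is not ploomber-engine's actual capture code, and run_cell is a hypothetical stand-in for a notebook cell. It shows that a runner which captures only stdout never sees text written to stderr, which is where tqdm writes its bar by default.

```python
import contextlib
import io
import sys


def run_cell():
    """Hypothetical stand-in for a notebook cell that writes to both streams."""
    print("printing to stdout")
    print("printing to stderr", file=sys.stderr)


# Capture each stream separately, the way a runner could.
out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    run_cell()

stdout_text = out.getvalue()
stderr_text = err.getvalue()

# A runner that only forwards stdout_text would silently drop stderr_text.
print("captured stdout:", repr(stdout_text))
print("captured stderr:", repr(stderr_text))
```

If the progress bar shows up only in the stderr capture, forwarding both streams to the console should bring it back.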
from ploomber-engine.
Some clarification regarding performance
- I extracted 15 samples from profiling, with a mean of 1.02846 sec and a std dev of 0.0047 sec
- I used zsh's time command and got times in the range 1.15-1.22 sec, while papermill is in the range 1.65-1.70 sec
- The 5-8 sec that I mentioned above is the time the zsh shell takes to give me a new prompt. So it probably includes the delay in displaying stdout in the shell.
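For anyone who wants to repeat this kind of measurement, the sketch below collects wall-clock samples for a command and reports mean and standard deviation. It is only an illustration: time_command is a hypothetical helper, and the command being timed is a placeholder that starts the Python interpreter, so swap in the ploomber-engine or papermill invocation from the earlier comment.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd several times and return the wall-clock duration of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so terminal rendering does not affect the timing.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples


# Placeholder command (assumption): replace with e.g.
# ["ploomber-engine", "rough.ipynb", "output.ipynb", "--log-output"]
samples = time_command([sys.executable, "-c", "pass"], runs=5)
mean = statistics.mean(samples)
stdev = statistics.stdev(samples)
print(f"mean={mean:.4f}s stdev={stdev:.4f}s over {len(samples)} runs")
```

Discarding the output separates execution time from the time spent displaying text, which is the distinction the last bullet is getting at.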
Just a minor observation: We cannot pass the file name to which data should be saved in --save-profiling-data. It creates output-profiling-data.csv by default.
from ploomber-engine.
Just a minor observation: We cannot pass the file name to which data should be saved in --save-profiling-data. It creates output-profiling-data.csv by default
Please open an issue about it, I think there should be an option to pass an argument.
from ploomber-engine.
The 5-8 sec that I mentioned above is the time the zsh shell takes to give me a new prompt. So it probably includes the delay in displaying stdout in the shell.
Seems like it's faster than papermill but the output is displayed more slowly; we still need to figure out why and how to fix it.
from ploomber-engine.
I think the missing output might be that the tqdm progress bar is printed to standard error and we're just displaying standard output. If that's the case, we should ensure we also display standard error in the console.
Hi @edublancas,
Currently, ploomber-engine prints stdout output only once the cell has finished executing. This is not ideal: output should be printed to the console as soon as the notebook writes it to stdout.
I have mentioned more details in PR #66
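As a rough illustration of the difference, and not the code in the PR, here is a standard-library sketch of a stream that forwards each write to the console immediately while still recording it for the notebook. TeeStream and its attribute names are hypothetical, and an io.StringIO stands in for the real terminal so the behaviour can be checked.

```python
import contextlib
import io


class TeeStream:
    """Hypothetical stream: records text and forwards it to the console right away."""

    def __init__(self, console):
        self.console = console
        self.recorded = io.StringIO()

    def write(self, text):
        self.recorded.write(text)  # kept for the notebook output
        self.console.write(text)   # shown immediately, not at cell end
        self.console.flush()
        return len(text)

    def flush(self):
        self.console.flush()


console = io.StringIO()  # stands in for the real terminal in this demo
tee = TeeStream(console)

with contextlib.redirect_stdout(tee):
    print("step 1")
    # The console already has the first line before the "cell" finishes.
    seen_midway = console.getvalue()
    print("step 2")

recorded = tee.recorded.getvalue()
```

With a buffer-then-print approach, seen_midway would be empty at that point; here the first line has already reached the console.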
from ploomber-engine.