Comments (8)
Setting the Console out and err at the start of each cell execution leads to all sorts of problems when using anonymous functions with executors. The history in #31 should show many of the things I tried without success: marking everything transient, wrapping the stream setting in an object, marking the object properties transient, letting the object be serialized but erroring quietly on executors, ...
Taking a step back, the correct way to solve this problem is still what I noted above: setting the console stdout/stderr before / around creation of the SparkContext so that all threads inherit the streams. The only reason I avoided that approach at the start is that it's going to be difficult (though hopefully not impossible) to get the initialization order right using spylon, pyspark, and py4j.
What we currently do:
- Initialize pyspark.SparkContext which creates a py4j gateway for us
- Initialize the Scala interpreter using the gateway
- Set the Console / System streams to what we want (which doesn't work for all threads)
What we need to do:
- Launch the py4j gateway ourselves using pyspark.java_gateway.launch_gateway
- Initialize the Scala interpreter using the gateway
- Set the Console / System streams to what we want
- Initialize pyspark.SparkContext with the gateway instance
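A rough sketch of the "what we need to do" ordering, not the actual spylon-kernel code. It assumes a pyspark version whose launch_gateway accepts a conf argument. The two callbacks, init_interpreter and redirect_streams, are hypothetical placeholders for the interpreter setup and the Console / System stream setting, since those depend on spylon internals.

```python
def start_in_proposed_order(init_interpreter, redirect_streams, conf=None):
    """Run the four proposed steps in order and return (gateway, sc).

    init_interpreter and redirect_streams are hypothetical callbacks
    (not spylon-kernel APIs); each one receives the py4j gateway.
    """
    # Imported inside the function so nothing touches the JVM before step 1.
    from pyspark import SparkConf, SparkContext
    from pyspark.java_gateway import launch_gateway

    conf = conf if conf is not None else SparkConf()

    # 1. Launch the py4j gateway ourselves instead of letting SparkContext do it.
    gateway = launch_gateway(conf)
    # 2. Initialize the Scala interpreter using the gateway.
    init_interpreter(gateway)
    # 3. Set the Console / System streams before the SparkContext exists.
    redirect_streams(gateway)
    # 4. Only now create the SparkContext, handing it the existing gateway.
    sc = SparkContext(conf=conf, gateway=gateway)
    return gateway, sc
```

The point of the ordering is that step 3 runs before step 4, so the streams are already in place when the SparkContext is created.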
from spylon-kernel.
Saw this yesterday in a different case. It turned out to be a Spark config setting that disabled output for explain. In your case, if you run the cell again, does it work the second time? Or does that particular call always return nothing?
No, in my case it refuses to return no matter the number of retries. It seems to be more about the command refusing to return output than the cell. That is, if I change the command within the same cell it will return output, but if I try the refused command in a new cell, it still refuses.
That's what I saw yesterday with explain, but it was due to how Spark was configured.
After talking with @patrick-nicholson, his case is likely another form of #21. Leaving this open for the time being.
Got a report of at least one instance where the output started going into the notebook log instead of the notebook, then went back to the notebook a few cells / calls later.
I can reproduce the above behavior by running cells in quick succession. Not sure yet whether it's coming from the Scala or the Python process when it happens.
Problem is in Scala. Console.setOut/setErr is thread specific, allows mutation (it's a setter!), and, for these reasons, deprecated in Scala 2.12. Tried using System.set but to no avail. Fix that works is to wrap all code to be interpreted in a Console.withOut/withErr(fileHandle) { } block. Will work this change in soon and see how it behaves with some interesting cases after some other code cleanup.
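A minimal sketch of the wrapping idea on the Python side, not the change that will actually land. It assumes the kernel has already bound two streams in the Scala interpreter under the hypothetical names kernelOut and kernelErr, and it only builds the Scala source text to hand to the interpreter.

```python
def wrap_scala_cell(code, out_name="kernelOut", err_name="kernelErr"):
    """Return Scala source that runs `code` inside Console.withOut/withErr.

    out_name and err_name are hypothetical identifiers assumed to be bound
    to output streams in the interpreter before the cell runs.
    """
    # The cell text is inserted unchanged (no re-indentation) so that
    # multi-line string literals in the cell are not altered.
    return (
        "Console.withOut(%s) {\n"
        "Console.withErr(%s) {\n"
        "%s\n"
        "}\n"
        "}" % (out_name, err_name, code)
    )


print(wrap_scala_cell('println("hi")'))
```

One thing to watch with this shape: anything defined inside the block is local to it, so vals and defs from one cell would not be visible in later cells unless the wrapping is done some other way.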