Comments (2)
There are a couple ways to achieve this:
- add a tf.identity(x) to your model to echo whatever inputs you desire.
- use a Spark UDF to invoke the model using df.withColumn(udf(x)).
Unfortunately, I don't have any ready examples of these...
Alternatively, if you want to create a PR with your modifications to pipeline.py, I can take a look at incorporating it (assuming that we have a way to preserve backwards-compatibility for folks who expect the prior behavior).
from tensorflowonspark.
Thank you, @leewyang, for your advice! I will try it next time in similar cases.
Regarding the PR, I don't want to contaminate your beautiful codebase.
What I did in pipeline.py was:
- _transform()
  - Use a list to keep track of the original columns other than the input_cols.
  - dataset.select(all of the columns instead of just the input features).
  - Append the original columns' names in spark.createDataFrame() in addition to output_cols.
- yield_batch()
  - Add a list to contain the additional original columns, like tensors[].
  - In the for item in iterable: loop, append the original columns to that list; yield tensors, original_columns_list.
  - So, I also need to change _run_model_tf1 and _run_model_tf2.
- _run_model_tf1 and _run_model_tf2
  - Get the original columns list from yield_batch() along with tensors, and put it in the result.extend(zip(*(here + python_outputs))).
Personally, I was looking for better ways of handling this. Thank you very much!
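For anyone reading along, the batching change described above could look roughly like the sketch below. This is not the actual pipeline.py code. The names yield_batch_with_passthrough, run_model_sketch, and predict are hypothetical, and the sketch assumes each row holds the model inputs first, followed by the original columns to carry through. The predict callable stands in for the real TensorFlow call and is assumed to return a list of output columns.

```python
def yield_batch_with_passthrough(iterable, batch_size, num_tensors=1):
    """Hypothetical batching generator (not the library's yield_batch).

    Assumes each row is a sequence whose first `num_tensors` entries are
    model inputs and whose remaining entries are original columns to keep.
    Yields (tensors, originals), both as lists of columns.
    """
    tensors = [[] for _ in range(num_tensors)]
    originals = None  # sized when the first row arrives
    for item in iterable:
        if item is None:
            break
        if originals is None:
            originals = [[] for _ in range(len(item) - num_tensors)]
        for i in range(num_tensors):
            tensors[i].append(item[i])
        for j in range(len(originals)):
            originals[j].append(item[num_tensors + j])
        if len(tensors[0]) >= batch_size:
            yield tensors, originals
            tensors = [[] for _ in range(num_tensors)]
            originals = [[] for _ in range(len(originals))]
    if tensors[0]:
        # Flush the final, possibly smaller, batch.
        yield tensors, originals


def run_model_sketch(rows, batch_size, num_tensors, predict):
    """Hypothetical stand-in for the _run_model_* step.

    `predict` is an assumed callable that takes the input columns and
    returns a list of output columns, one value per row in the batch.
    """
    result = []
    batches = yield_batch_with_passthrough(rows, batch_size, num_tensors)
    for tensors, originals in batches:
        outputs = predict(tensors)
        # Original columns come first, then the model outputs, row by row.
        result.extend(zip(*(originals + outputs)))
    return result


# Usage: two numeric inputs per row plus one original "id" column.
rows = [(1.0, 2.0, "a"), (3.0, 4.0, "b"), (5.0, 6.0, "c")]
add_inputs = lambda cols: [[x + y for x, y in zip(cols[0], cols[1])]]
joined = run_model_sketch(rows, batch_size=2, num_tensors=2, predict=add_inputs)
```

Each element of joined is one output row with the passthrough columns followed by the predictions, so the column names passed to spark.createDataFrame() would need to be listed in that same order.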