tensorflow / java
Java bindings for TensorFlow
License: Apache License 2.0
Right now, the TensorFlow Java client from the repository does not directly expose the protobufs that are part of the contract of the C API (though in this thread, it was mentioned that protobufs would eventually be removed from the API, I have a feeling this won't happen anytime soon).
Instead, it compiles and distributes the protobuf Java bindings as a separate artifact. The client itself remains agnostic of the content of the messages and simply exposes unmapped byte arrays, like here.
We need to decide how we want to handle those protobufs in the new distribution. Possible choices are:
Note that if we decide to compile the protobufs from the new TF Java client, it will bring its load of additional dependencies, such as grpc.
CC: @sjamesr
Many of the Java TF Ops use parameterized types bounded by TType, TNumber, or both. Sometimes an Op uses <T extends TType> and sometimes another Op uses <T extends TNumber>. When writing a method that uses two different Ops that declare <T> differently, the compiler complains that T cannot be converted to the other type. It is interesting that TNumber is a subtype of TType. I have searched "Professor Google", but have not found an answer to this kind of problem.
TType to TNumber conversion is very common, especially if you are creating a base class with a common method signature across many similar objects. Sometimes the subclass calls for a TType, sometimes a TNumber. The real problem arises when you have a common method such as public <T extends TType> Operand<T> call(Operand<T> input).
As a workaround, let's say that you cast a TType to a TNumber (where <U extends TNumber>) as in:
@SuppressWarnings("unchecked")
Operand<U> uInput = (Operand<U>)input;
Now when you call something like tf.math.greater(uInput, otherValue); the compiler complains: no instance of type variables(s) exists so that T conforms to TNumber. That is because tf.math.greater uses <T extends TNumber> while other ops, like tf.nn.relu, define <T extends TType>.
Another way around this is to force erasure, as in (Operand)value.
At a minimum, it would be nice if there were a convention like <T extends TType> and <U extends TNumber> used consistently, but this may not solve all issues of this kind, as I have also seen <U extends TType, T extends TType> and <V extends TType, T extends TType, U extends TType>.
The main issue that contributes to this problem is that the Ops require a mixture of types, so a higher-level user ends up artificially juggling the situation by casting as above, or by forcing an erasure of the type.
IMO this situation is going to be confusing to the API user. I still haven't figured out a clean way to get around the issue when two method signatures use the same generic parameter in different ways.
Perhaps there is a better way. My gut feeling is that this is going to become a larger headache down the line.
The specific example where I am running into this problem at this time is:
@Override
public Operand<T> call(Operand<T> input) {
@SuppressWarnings("unchecked")
Operand<U> uInput = (Operand<U>)input;
....
Operand<U> greater = tf.dtypes.cast(
tf.math.greater(uInput,
tf.dtypes.cast(tf.constant(threshold),
input.asTensor().dataType())), input.asTensor().dataType());
uInput = tf.math.mul(uInput, greater);
input = (Operand<T>)uInput;
...
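The bound mismatch described in this issue can be reproduced without TensorFlow. The following is a minimal sketch using hypothetical stand-in types: TType, TNumber, TFloat32 and Operand are simplified placeholders defined in the example itself, not the real TensorFlow Java classes, and relu and threshold are illustrative methods rather than library ops. Under those assumptions, it shows that declaring the caller's type parameter with the tighter bound lets both kinds of ops be called without an unchecked cast, at the cost of restricting the caller to numeric types.

```java
public class BoundedGenericsSketch {
    // Hypothetical stand-ins for the type hierarchy discussed above (not the real classes).
    interface TType {}
    interface TNumber extends TType {}
    static final class TFloat32 implements TNumber {}

    // A toy operand that just carries one value.
    static final class Operand<T extends TType> {
        final double value;
        Operand(double value) { this.value = value; }
    }

    // Accepts any tensor type, in the spirit of an op declared with <T extends TType>.
    static <T extends TType> Operand<T> relu(Operand<T> x) {
        return new Operand<>(Math.max(0.0, x.value));
    }

    // Restricted to numeric types, in the spirit of an op declared with <T extends TNumber>.
    static <T extends TNumber> Operand<T> threshold(Operand<T> x, double t) {
        return new Operand<>(x.value > t ? x.value : 0.0);
    }

    // The caller declares the tightest bound it needs, so no cast is required:
    // a T that extends TNumber also satisfies relu's <T extends TType> bound.
    static <T extends TNumber> Operand<T> call(Operand<T> input, double t) {
        return relu(threshold(input, t));
    }

    public static void main(String[] args) {
        Operand<TFloat32> out = call(new Operand<TFloat32>(3.0), 1.0);
        System.out.println(out.value);
    }
}
```

If call were declared with <T extends TType> instead, the call to threshold would fail to compile, which is the situation that leads to the unchecked cast or the raw-type workaround.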
Just creating this issue to record my intent to do this work, since I'm not going to do it immediately.
This issue continues a discussion that started in #91. Here's a brief excerpt of key points made so far.
@karllessard wrote:
The current Python API is made up of two layers because it is historically the product of a merge between two different projects: the original TF API and the Keras project. I personally think it brings more confusion to the users than benefits and we don't need to follow this schema if we think we can do better in Java since we start from scratch.
I'm slowly leaning now to the idea of having a single API that supports both "beginner" and "advanced" modes, whether we call it Keras or not.
@JimClarke5 wrote:
IMHO, the beauty of Keras is in the simple, straight forward, Model and Layers. Most of the Layers have defaults for constructs like Metrics, Optimizers, Activations, etc. Also, they allow simple strings in their parameters that instruct the underlying layers to construct elements, like new Dense(24, "relu"), so the way these elements are constructed can be hidden from a Keras user.
@Craigacp wrote:
My preference is to have both a low and high level framework, which is how TF python currently is. You don't need to use Keras if you don't want to, but many people do.
One reason to advocate for both frameworks is that it might actually take less development effort. Building out Keras to have full coverage requires a lot of consistent effort, but supporting ops that are added to TF's C API in a lower level API is essentially free for us.
The high level framework is for people who use Keras in TF Python, and want an API that guides them better. I think that we should have stronger typing information than exists in Python, as it's what would be expected from idiomatic Java and it helps IDEs & discoverability.
(note: this is an extract from a previous pull request, which is now made as a separate issue)
In the actual TF Java artifacts (i.e. those issued from the main TF repository), the native artifacts (e.g. libtensorflow_jni.jar) contain a LICENSE file (and a new THIRD_PARTY_TF_JNI_LICENSES file since 1.15, I think). These licenses list copyright notices from the different libraries used by the TF core runtime.
When we migrated to our Java repo, I took a snapshot of the LICENSE file and copied it under tensorflow-core-api/src/main/resources. But that was not very useful, since the licenses must be attached to the same artifact/folder as the native libraries we distribute (i.e. the jars with the os-arch classifier), and it was instead ending up in our Java jar (without classifier).
So I moved it to tensorflow-core-api directly for now, and I guess that we need to add some Maven rules to include it in our classified/native jars. Nonetheless, some details remain obscure to me:
Where do the LICENSE file (and the new THIRD_PARTY_TF_JNI_LICENSES) found in the TF Java 1.15 artifacts come from?
How do we stay in sync with changes in those files?
Are the licenses the same for all supported OS?
CC: @sjamesr
Recently, I wanted to deploy my model with Java, but something went wrong: TensorFlow dumps core in libtensorflow-1.15.0.so.
So I want to switch my Java library to tensorflow/java, but I cannot find any documentation to help me load my model quickly. The only thing that has helped me so far is the Issues.
When I test my code with tensorflow-core-api, it raises:
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jnitensorflow in java.library.path
Where can I find these libraries?
Thanks for the help!
System information
Describe the feature and the current behavior/state.
Currently, we can run our code on GPU only by adding the GPU dependencies to the classpath.
But the basic Python API provides the ability to set the preferred device (GPU or CPU, via the device name).
This basic option is also available in the low-level builder here.
Will this change the current api? How?
Let's add the function tf.withDevice("/GPU:0") to the Scope class.
Who will benefit with this feature?
Anyone who trains neural network in distributed mode on different GPU/CPU devices.
Any Other info.
Hi, I recently benchmarked inference on GPU with an AWS EC2 p3.2xlarge instance, using a pretrained ResNet50 model. The CPU benchmarks are pretty close to Python; however, there is a regression on GPU:
TF Java 0.2.0: P50 4.76 ms, P90 6.47 ms
Python (TF 2.3.1): P50 3.24 ms, P90 4.59 ms
I am not sure why the CPU results are very close but the GPU results are this far apart (20% diff).
System information
You can get the Keras pretrained ResNet50 model and save it in the SavedModel format.
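import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
// Package names below are assumed to match TF Java 0.2.0.
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.ndarray.Shape;
import org.tensorflow.types.TFloat32;

public class Example {
    public static void main(String[] args) {
        int ITERATION = 1000;
        String dir = "model_path";
        SavedModelBundle.Loader loader =
            SavedModelBundle.loader(dir).withTags("serve");
        SavedModelBundle bundle = loader.load();
        Session session = bundle.session();
        // Per-iteration latencies in nanoseconds.
        List<Long> timeCollector = new ArrayList<>();
        for (int i = 0; i < ITERATION; i++) {
            long start = System.nanoTime();
            forward(session);
            timeCollector.add(System.nanoTime() - start);
        }
        Collections.sort(timeCollector);
        System.out.println("P50: " + percentile(timeCollector, 50) + "ms");
        System.out.println("P90: " + percentile(timeCollector, 90) + "ms");
        System.out.println("P99: " + percentile(timeCollector, 99) + "ms");
    }

    // Expects a sorted list of nanosecond timings; returns milliseconds.
    public static double percentile(List<Long> times, int percentile) {
        int index = times.size() * percentile / 100;
        return times.get(index) / 1_000_000f;
    }

    // Runs one inference on a zero-initialized 1x224x224x3 input.
    public static void forward(Session session) {
        Session.Runner runner = session.runner();
        try (Tensor<?> tensor = Tensor.of(TFloat32.DTYPE, Shape.of(1, 224, 224, 3))) {
            runner.feed("serving_default_input_1:0", tensor);
            runner.fetch("StatefulPartitionedCall:0");
            List<Tensor<?>> result = runner.run();
            // Release the native memory held by the outputs.
            for (Tensor<?> t : result) {
                t.close();
            }
        }
    }
}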
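When comparing latency percentiles between two runtimes, two details of the Java harness can skew the numbers. The first iterations may include JIT compilation and one-time GPU initialization, and the percentile helper indexes the list with size * percentile / 100, which would run past the end for percentile 100. The following is a minimal sketch, not the reporter's code: LatencyStats, measure and percentileMs are hypothetical names, the workload is a placeholder Runnable, and the warm-up count is an arbitrary assumption. It discards warm-up runs and uses a nearest-rank percentile.

```java
import java.util.Arrays;

public class LatencyStats {
    // Nearest-rank percentile over an ascending-sorted array of nanosecond timings,
    // returned in milliseconds. p must satisfy 0 < p <= 100.
    static double percentileMs(long[] sortedNanos, double p) {
        if (sortedNanos.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        int rank = (int) Math.ceil(p / 100.0 * sortedNanos.length); // 1-based rank
        return sortedNanos[rank - 1] / 1_000_000.0;
    }

    // Runs the task `warmup` times untimed, then times `iterations` runs.
    // Returns the timings in nanoseconds, sorted ascending.
    static long[] measure(Runnable task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] nanos = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real benchmark would call the model's forward pass here.
        long[] nanos = measure(() -> Math.sqrt(12345.678), 100, 1000);
        System.out.printf("P50: %.6f ms%n", percentileMs(nanos, 50));
        System.out.printf("P90: %.6f ms%n", percentileMs(nanos, 90));
        System.out.printf("P99: %.6f ms%n", percentileMs(nanos, 99));
    }
}
```

Whether warm-up explains any part of the gap reported here is not established; the sketch only removes one possible source of noise from the comparison.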
import sys
import time

import numpy as np
import tensorflow as tf

if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("usage: python3 benchmark.py <model_name> <model_dir> <num_iterations>")
        sys.exit(1)
    model_name = sys.argv[1]
    model_path = sys.argv[2]
    iterations = int(sys.argv[3])
    print("#############################################")
    print("start testing Model: " + model_name)
    begin = time.time()
    # load model
    model = tf.saved_model.load(model_path)
    latencies = []
    for _ in range(iterations):
        inputs = tf.zeros((1, 224, 224, 3))
        start = time.time()
        result = model(inputs)
        # convert seconds to milliseconds
        latencies.append((time.time() - start) * 1000)
        result.numpy()
    elapsed = (time.time() - begin) * 1000
    throughput = iterations / elapsed * 1000
    p50 = np.percentile(latencies, 50)
    p90 = np.percentile(latencies, 90)
    p99 = np.percentile(latencies, 99)
    print("Model: {}".format(model_name))
    print("Iterations: {:d}".format(iterations))
    print("Throughput: {:.2f}".format(throughput))
    print("Elapsed: {:.3f} ms.".format(elapsed))
    print("P50: {:.3f} ms".format(p50))
    print("P90: {:.3f} ms".format(p90))
    print("P99: {:.3f} ms".format(p99))
Thanks for helping me build tensorflow-java per #94!
My first tests involve NdArray and Graph, but I can't find the org/bytedeco/javacpp/Pointer.class that I need to put in my.jar to avoid the following exception:
$ java -jar my.jar tf --verbose
DEBUG 1596849621481: Started ndarray TensorflowJavaTest.
DEBUG 1596849621705: matrix3d rank 3
DEBUG 1596849621706: Finished ndarray TensorflowJavaTest.
DEBUG 1596849621706: Started graph TensorflowJavaTest.
Exception in thread "main" java.lang.NoClassDefFoundError: org/bytedeco/javacpp/Pointer
at vis.TensorflowJavaTest.run(tf.scala:76)
at vis.TensorflowJavaTest$.runArgs(tf.scala:56)
at vis.VIS$.main(vis.scala:171)
at vis.VIS.main(vis.scala)
Caused by: java.lang.ClassNotFoundException: org.bytedeco.javacpp.Pointer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 4 more
The code for my simple test is in Scala, which gets packed into my.jar with other code:
debug("Started ndarray TensorflowJavaTest.")
// run simple data buffers test as in https://github.com/tensorflow/java/tree/master/ndarray
val matrix3d = org.tensorflow.ndarray.NdArrays.ofInts( org.tensorflow.ndarray.Shape.of(2, 3, 2) )
debug("matrix3d rank " + matrix3d.rank)
debug("Finished ndarray TensorflowJavaTest.")
debug("Started graph TensorflowJavaTest.")
// run simple graph and session tests per https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/test/java/org/tensorflow/SessionTest.java
val graph = new org.tensorflow.Graph() // defined in https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Graph.java
val session = new org.tensorflow.Session(graph)
val tf = org.tensorflow.op.Ops.create(graph)
debug("Finished graph TensorflowJavaTest.")
Where might the bytedeco classes be? I only see these jars from my build:
(base) tensorflow-java$ find . -name '*jar'|grep -v surefire
./tensorflow-framework/target/tensorflow-framework-0.2.0-SNAPSHOT.jar
./ndarray/target/ndarray-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-generator/target/tensorflow-core-generator-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-generator/target/tensorflow-core-generator-0.2.0-SNAPSHOT-sources.jar
./tensorflow-core/tensorflow-core-api/target/tensorflow-core-api-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-api/target/tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar
./tensorflow-core/tensorflow-core-api/target/tensorflow-core-api-0.2.0-SNAPSHOT-sources.jar
./tensorflow-core/tensorflow-core-platform/target/tensorflow-core-platform-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-platform/target/tensorflow-core-platform-0.2.0-SNAPSHOT-sources.jar
./tensorflow-core/tensorflow-core-platform/target/tensorflow-core-platform-0.2.0-SNAPSHOT-javadoc.jar
No bytedeco jars in the .ivy2 cache:
$ find ~/.ivy2/cache -name '*bytedeco*'
In the docker container where I built tensorflow-java, there is a .javacpp directory with bytedeco artifacts, but no .class or .jar files.
root@4f6083770318:~/.javacpp/cache# find|sort
.
./.lock
./javacpp-1.5.3-linux-x86_64.jar
./javacpp-1.5.3-linux-x86_64.jar/org
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco/javacpp
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco/javacpp/linux-x86_64
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco/javacpp/linux-x86_64/libjnijavacpp.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libjnitensorflow.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow.so.2
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow_framework.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow_framework.so.2
Would a colleague kindly tell me where the bytedeco Pointer class file is, so I may continue with my tests? Many thanks in advance! Sorry for the basic question. Surely I'm missing something, because I can see bytedeco in various pom.xml files, like this one:
(base) tensorflow-java/tensorflow-core/tensorflow-core-api$ grep -2 bytedeco pom.xml
<dependencies>
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>javacpp</artifactId>
<version>${javacpp.version}</version>
</dependency>
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>javacpp</artifactId>
<version>${javacpp.version}</version>
--
</plugin>
<plugin>
<groupId>org.bytedeco</groupId>
<artifactId>javacpp</artifactId>
<version>${javacpp.version}</version>
--
<quiet>true</quiet>
<links>
<link>http://bytedeco.org/javacpp/apidocs</link>
</links>
</configuration>
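The stack trace in this report says that org.bytedeco.javacpp.Pointer is not visible to the class loader at runtime. The ~/.javacpp/cache listing only shows the platform-classified javacpp artifact, which contains libjnijavacpp.so, and not the plain javacpp jar that holds the Java classes. Since the build is driven by Maven, that jar would typically be resolved into the local Maven repository (~/.m2/repository by default) rather than ~/.ivy2. As a hedged diagnostic sketch, the check below can be run with the same classpath as the failing program to see which classes are actually reachable. ClasspathCheck and isOnClasspath are hypothetical names introduced only for this example.

```java
public class ClasspathCheck {
    // Returns true if the named class can be located by this class's class loader.
    // initialize=false avoids running static initializers such as native library loading.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace and test code in this report.
        String[] names = {"org.bytedeco.javacpp.Pointer", "org.tensorflow.Graph"};
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If Pointer is reported as missing, the javacpp jar needs to be added to the runtime classpath or packed into my.jar alongside the tensorflow-core-api jars.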
TensorFlow released both 2.2.1 and 2.3.1 to address security vulnerabilities.
See: https://github.com/tensorflow/tensorflow/releases?linkId=100676843
We should upgrade to 2.3.1 as well.
Hi,
tensorflow-core-api:0.2.0-SNAPSHOT is failing on AWS EC2 instances with Tesla V100 GPUs.
All the core-api tests fail with the error:
CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid
I have root-caused the issue: CUDA compute capability 7.0 is not enabled during compilation. Running the following command and building from source again fixed the issue.
export TF_CUDA_COMPUTE_CAPABILITIES=7.0
Somehow, in the release build of tensorflow-core-api, compute capability 7.0 is not enabled, although compute capabilities 3.5 and 7.0 should be the defaults for TF 2.3 according to here. The Python packages and the main repo built from source work fine without any modification.
References:
tensorflow/tensorflow#41132
tensorflow/tensorflow@cf1b6b3
System information
Describe the feature and the current behavior/state.
Currently in Java, we have access to the core tf.io ops such as tf.parseExample, tf.parseSingleExample, tf.decodeRaw, etc. In order to serialize TF Record datasets and read in datasets from the tensorflow_datasets buckets, for example, we need to be able to use these ops easily.
In Python, the relevant abstractions built on top of tf.io are defined in parsing_config.py. Specifically, it would be very helpful to have abstractions such as:
FixedLenFeature, SparseFeature, FixedLenSequenceFeature, etc.
A _ParseOpParams class which wraps the parameters to tf.parseExample
See these examples, which relate to using the parse-example ops and reading TFRecord files.
Will this change the current api? How?
This will add APIs for serializing / parsing examples to / from TF Record files
Who will benefit with this feature?
Anyone using datasets stored as TFRecord files from TensorFlow Java (for example, to load datasets from the tensorflow_datasets GCP bucket)
Any Other info.
Feel free to get in touch with me anytime to discuss! Happy to help.
System information
Describe the feature and the current behavior/state.
libtensorflow for x86(e.g. i386/i486/i586/i686)
Will this change the current api? How?
No
Who will benefit with this feature?
32 bit processor users
Any Other info.
Tensorflow 2.1 has been released in the main repository. So, any plan for releasing a new version (2.x) of the java package?
System information
- GCC/Compiler version (if compiling from source):
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.3 (clang-1103.0.32.59)
Target: x86_64-apple-darwin19.4.0
Thread model: posix
- CUDA/cuDNN version: None
- GPU model and memory: NA
- tensorflow-java : Version HEAD
Describe the current behavior
When calling tf.zeros() or tf.fill(), the returned data does not match the Shape passed in.
If I have a shape of (2,2), I would expect 4 values to be returned, but I only get 2.
However, when I pass new long[(int)shape.size()] as the shape argument, it works as expected and 4 values are returned. It seems that the fill op is using the size of the input array rather than the values contained within the array.
Also, I noticed that ZerosTest.java does not check the length of the returned array; it only checks that each element returned is zero.
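One possible reading of this behavior, sketched in plain Java without TensorFlow: the reproduction code in this report wraps the dimension vector in tf.shape(...), and the shape of a 1-D vector holding [2, 2] is [2], not [2, 2]. Using that result as the target shape gives 2 elements, while a vector of length 4 has shape [4] and gives 4 elements, which matches the reported output. ShapeOfShape, shapeOfVector and numElements are hypothetical helpers written for this illustration and are not TensorFlow APIs.

```java
import java.util.Arrays;

public class ShapeOfShape {
    // Shape of a 1-D vector: a single axis whose size is the vector's length.
    static long[] shapeOfVector(long[] vector) {
        return new long[] { vector.length };
    }

    // Number of elements in a tensor with the given dimensions.
    static long numElements(long[] dims) {
        long n = 1;
        for (long d : dims) {
            n *= d;
        }
        return n;
    }

    public static void main(String[] args) {
        long[] dims = {2, 2};
        // Using the shape OF the dims vector as the target shape.
        System.out.println(Arrays.toString(shapeOfVector(dims)) + " -> "
            + numElements(shapeOfVector(dims)) + " elements");
        // Using the dims vector itself as the target shape.
        System.out.println(Arrays.toString(dims) + " -> " + numElements(dims) + " elements");
        // The variant that appeared to work: a vector of length 4.
        long[] four = new long[4];
        System.out.println(Arrays.toString(shapeOfVector(four)) + " -> "
            + numElements(shapeOfVector(four)) + " elements");
    }
}
```

Under this reading, the third case produces 4 values only because the vector happens to have 4 entries; the result would be a rank-1 tensor rather than a 2x2 one.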
Describe the expected behavior
The python version works as expected:
import tensorflow as tf;
print(tf.__version__)
op = tf.fill([2,2], 1.0)
print(op.numpy())
op = tf.zeros([2,2])
print(op.numpy())
With output:
2.2.0-rc3
[[1. 1.]
[1. 1.]]
[[0. 0.]
[0. 0.]]
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
The following java code exhibits the issue:
float[] actualF = { 1.f, 1.f, 1.f, 1.f };
try (EagerSession session = EagerSession.create()) {
Ops tf = Ops.create(session);
Shape shape = Shape.of(2,2);
// this only returns 2 of the 4 zeros
Operand<TFloat32> zeroOp = tf.zeros(
tf.shape(tf.dtypes.cast(tf.constant(shape.asArray() ), TFloat32.DTYPE)),
TFloat32.DTYPE);
zeroOp.asTensor().data().read(DataBuffers.of(actualF));
System.out.print("tf.zeros: ");
zeroOp.asTensor().data().scalars().forEach(s -> System.out.print(s.getFloat() + ", "));
System.out.println();
System.out.println("actual Array: " + Arrays.toString(actualF));
// this only returns 2 of the 4 ones
Arrays.fill(actualF, 0.F);
Operand<TFloat32> fillOp = tf.fill(
tf.shape(tf.dtypes.cast(tf.constant(shape.asArray() ), TFloat32.DTYPE)),
tf.dtypes.cast(tf.constant(1.0), TFloat32.DTYPE));
fillOp.asTensor().data().read(DataBuffers.of(actualF));
System.out.print("tf.fill: ");
fillOp.asTensor().data().scalars().forEach(s -> System.out.print(s.getFloat() + ", "));
System.out.println();
System.out.println("actual Array: " + Arrays.toString(actualF));
// this works as expected:
Arrays.fill(actualF, 0.F);
Operand<TFloat32> fillOp1 = tf.fill(
tf.shape(tf.dtypes.cast(tf.constant(new long[(int)shape.size()] ), TFloat32.DTYPE)),
tf.dtypes.cast(tf.constant(1.0), TFloat32.DTYPE));
fillOp1.asTensor().data().read(DataBuffers.of(actualF));
System.out.print("tf.fill: ");
fillOp1.asTensor().data().scalars().forEach(s -> System.out.print(s.getFloat() + ", "));
System.out.println();
System.out.println("actual Array: " + Arrays.toString(actualF));
}
With output:
tf.zeros: 0.0, 0.0,
expected Array: [0.0, 0.0, 0.0, 0.0]
actual Array: [0.0, 0.0, 1.0, 1.0]
tf.fill: 1.0, 1.0,
expected Array: [1.0, 1.0, 1.0, 1.0]
actual Array: [1.0, 1.0, 0.0, 0.0]
tf.fill: 1.0, 1.0, 1.0, 1.0,
expected Array: [1.0, 1.0, 1.0, 1.0]
actual Array: [1.0, 1.0, 1.0, 1.0]
Other info / logs
NA
System information
Describe the feature and the current behavior/state.
The problem is described in the mailing thread.
I have the following problem: I am trying to reproduce one of the modern CNN architectures with the Java API. Most of them use BatchNormalization as a popular layer, with the tf.nn.batchNormalization() op.
I tried to use older ops like BatchNormWithGlobalNormalization, but got:
Exception in thread "main" org.tensorflow.exceptions.TFUnimplementedException: Op BatchNormWithGlobalNormalization is not available in GraphDef version 175. It has been removed in version 9. Use tf.nn.batch_normalization().
at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:99)
This op was deprecated years ago, but we still have it in the 1.15 and 2.x APIs.
The possible solutions:
Does the TensorFlow Java API support the .tflite file format? I am not able to find any documentation, so any leads in this direction would be helpful.
The method conv2d(Operand, Operand, List, String, Conv2d.Options...) in the type NnOps is not applicable for the arguments (Constant, Tensor, List).
But how can I generate the Operand type?
With your snapshot libtensorflow-1.0.1-20170323.012702-1.jar (downloaded from https://oss.sonatype.org/content/repositories/snapshots/org/tensorflow/libtensorflow/1.0.1-SNAPSHOT/), I get errors like the following.
Command:
javac -cp ./libtensorflow-1.0.1-20170323.012702-1.jar:. HelloTensorFlowSnapshot.java
HelloTensorFlowSnapshot.java:4: error: package org.tensorflow.exceptions does not exist
import org.tensorflow.exceptions.TensorFlowException;
^
HelloTensorFlowSnapshot.java:9: error: cannot find symbol
import org.tensorflow.GraphOperation;
^
symbol: class GraphOperation
location: package org.tensorflow
HelloTensorFlowSnapshot.java:10: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.SignatureDef;
^
HelloTensorFlowSnapshot.java:14: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.MetaGraphDef;
^
HelloTensorFlowSnapshot.java:16: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.TensorInfo;
^
HelloTensorFlowSnapshot.java:17: error: package org.tensorflow.types does not exist
import org.tensorflow.types.TFloat32;
^
HelloTensorFlowSnapshot.java:18: error: package org.tensorflow.tools does not exist
import org.tensorflow.tools.Shape;
^
HelloTensorFlowSnapshot.java:20: error: package org.tensorflow.tools.buffer does not exist
import org.tensorflow.tools.buffer.DataBuffers;
^
HelloTensorFlowSnapshot.java:21: error: package org.tensorflow.tools.ndarray does not exist
import org.tensorflow.tools.ndarray.FloatNdArray;
^
HelloTensorFlowSnapshot.java:22: error: package org.tensorflow.tools.ndarray does not exist
import org.tensorflow.tools.ndarray.StdArrays;
^
HelloTensorFlowSnapshot.java:23: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.TensorInfo;
I am trying to run the working code from https://stackoverflow.com/questions/61228372/tensorflow-2-0-java-api.
Thanks in advance,
Milan
Thanks for making this Java API to TensorFlow! Would you kindly help me understand this error from mvn install:
[INFO] Reactor Summary for TensorFlow Java Parent 0.2.0-SNAPSHOT:
[INFO]
[INFO] TensorFlow Java Parent ............................. SUCCESS [ 1.774 s]
[INFO] TensorFlow NdArray Library ......................... SUCCESS [02:47 min]
[INFO] TensorFlow Core Parent ............................. SUCCESS [ 0.187 s]
[INFO] TensorFlow Core Annotation Processor ............... SUCCESS [ 3.157 s]
[INFO] TensorFlow Core API Library ........................ FAILURE [05:15 min]
[INFO] TensorFlow Core API Library Platform ............... SKIPPED
[INFO] TensorFlow Framework Library ....................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 08:09 min
[INFO] Finished at: 2020-08-01T04:56:29Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.bytedeco:javacpp:1.5.3:build (javacpp-build) on project tensorflow-core-api: Execution javacpp-build of goal org.bytedeco:javacpp:1.5.3:build failed: Process exited with an error: 132 -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.bytedeco:javacpp:1.5.3:build (javacpp-build) on project tensorflow-core-api: Execution javacpp-build of goal org.bytedeco:javacpp:1.5.3:build failed: Process exited with an error: 132
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:566)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: org.apache.maven.plugin.PluginExecutionException: Execution javacpp-build of goal org.bytedeco:javacpp:1.5.3:build failed: Process exited with an error: 132
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:148)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:566)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: java.lang.RuntimeException: Process exited with an error: 132
at org.bytedeco.javacpp.tools.Builder.build (Builder.java:1026)
at org.bytedeco.javacpp.tools.BuildMojo.execute (BuildMojo.java:411)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:566)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[ERROR]
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
My 'uname -a' is 'Linux 4.9.0-7-amd64 #1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13) x86_64 GNU/Linux'. I'm running in Docker, with Java 11. Happy to use Java 8 instead if that helps.
I run CPU-only, on an old CPU without AVX. The CPU flags (from /proc/cpuinfo) are:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch epb kaiser tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms dtherm ida arat
I have a custom .whl file for my Tensorflow build from source, which excludes AVX instructions. However, tensorflow-java's 'mvn install' seems to build its own Tensorflow, and I'm not sure AVX is excluded. Moreover, my /var/log/messages shows an invalid opcode error during 'mvn install', so I suspect an AVX issue:
Aug 1 00:56:28 kernel: [6829864.802276] traps: java_op_generat[5990] trap invalid opcode ip:7fe3188f1a40 sp:7ffe8f916cb8 error:0
Aug 1 00:56:28 kernel: [6829864.802289] in libtensorflow_framework.so.2.2.0[7fe317356000+1ab9000]
I'd prefer to tell tensorflow-java to use the Tensorflow without AVX that I built and installed, rather than the Tensorflow that tensorflow-java's 'mvn install' builds itself. Is this possible? Please let me know whether this is a reasonable fix, or what I should do instead; I'm not sure how to interpret the error messages. Thanks for your time and insights!
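As a quick sanity check before blaming AVX, the flags line from /proc/cpuinfo can be tested programmatically. The following is only a minimal sketch: the class and method names (CpuFlags, parseFlags, supports) are made up for illustration and are not part of tensorflow-java, and the sample line in main is a shortened version of the flags shown above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CpuFlags {
    /** Parses a "flags : a b c" line (as found in /proc/cpuinfo) into a set of flag names. */
    public static Set<String> parseFlags(String flagsLine) {
        int colon = flagsLine.indexOf(':');
        // Drop the "flags :" prefix if present, then split on whitespace.
        String list = colon >= 0 ? flagsLine.substring(colon + 1) : flagsLine;
        return new HashSet<>(Arrays.asList(list.trim().split("\\s+")));
    }

    /** Returns true if the given flag (e.g. "avx") appears in the flags line. */
    public static boolean supports(String flagsLine, String flag) {
        return parseFlags(flagsLine).contains(flag);
    }

    public static void main(String[] args) {
        // Shortened sample; on a real machine, read the line from /proc/cpuinfo instead.
        String line = "flags : fpu vme sse sse2 ssse3 sse4_1 sse4_2 popcnt aes";
        System.out.println("avx: " + supports(line, "avx"));
        System.out.println("sse4_2: " + supports(line, "sse4_2"));
    }
}
```

If "avx" is absent from the flags while the native library was compiled with AVX enabled, an invalid opcode trap like the one in the kernel log would be a plausible outcome.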
Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template
System information
You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
Here is the result of the capture script:
tf_env.txt
Describe the current behavior
We tested the new Tensorflow Java API (not the legacy one), the brand new version released in October 2020. We tested it on several machines, including Azure Databricks NC6_v3 and Azure Virtual Machines (the capture log is from the virtual machine). I noticed that when no GPU is available the library falls back to CPU, and this is fine. However, we also measured the processing time for an example (a few vector operations), and we see no significant difference between processing time on GPU and on CPU. It looks as if the GPU is not being used even when it is present (we tried two graphics cards: a Tesla K80 with compute capability 3.7 and a Tesla V100 with compute capability 7.0). In both cases we do not see any difference in processing time.
Describe the expected behavior
The expected behaviour is much better execution times when the program runs on a machine with a GPU present.
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
We used the following java program:
HelloTensorFlow_java.txt
pom_xml.txt
The source was compiled to class file and it was run via the following command:
java -classpath protobuf-java-3.8.0.jar:ndarray-0.2.0.jar:javacpp-1.5.4.jar:javacpp-1.5.4-linux-x86_64.jar:tensorflow-core-api-0.2.0.jar:tensorflow-core-api-0.2.0-linux-x86_64-gpu.jar:tensorflow-core-platform-gpu-0.2.0.jar:. HelloTensorFlow
The listed libraries were downloaded from https://oss.sonatype.org/.
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
The enclosed program issues the following log:
log.txt
From the log you may see that the GPU was present and recognized.
However the execution time did not differ, when we started it with GPU and without.
System information
Problem:
I have been using TF 1.15 from the original Java TF repository:
<dependency>
<groupId>org.tensorflow</groupId>
<artifactId>tensorflow</artifactId>
<version>1.15.0</version>
</dependency>
which gave me this output:
I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2494460000 Hz
I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f77250299d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
So I found this repo, set up an environment able to build TF from source, and ran the mvn install command which, I assume, compiled TF for my specific platform. Using these dependencies in my project:
<dependency>
<groupId>org.tensorflow</groupId>
<artifactId>tensorflow-core-api</artifactId>
<version>0.2.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.tensorflow</groupId>
<artifactId>tensorflow-core-api</artifactId>
<version>0.2.0-SNAPSHOT</version>
<classifier>linux-x86_64</classifier>
</dependency>
getting output:
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Everything somehow runs, but throughput is about the same as with the generic 1.15 version, and latency is about 2 times worse than with the previous version, using the same TF model with V1 behavior enabled. I'm not sure how to enable the AVX2 and FMA instructions when TF clearly finds them. I suppose it has something to do with the missing jnijavacpp library. Could anyone help me, please?
Thanks
Describe the current behavior
A tensor created using TFloat16.tensorOf does not have the correct values. In the example below, the float array has values [0, 1], but TFloat16 produces [0, 0], whereas TFloat32 gives the correct values [0, 1].
Describe the expected behavior
Code to reproduce the issue
public class Test {
    public static void main(String[] args) {
        float[][] f1 = {{0}, {1}};
        System.out.println(StdArrays.ndCopyOf(f1).getFloat(0,0));
        System.out.println(StdArrays.ndCopyOf(f1).getFloat(1,0));
        System.out.println("FLOAT16");
        Tensor<TFloat16> tf_float1 = TFloat16.tensorOf(StdArrays.ndCopyOf(f1));
        System.out.println(tf_float1.data().getFloat(0,0));
        System.out.println(tf_float1.data().getFloat(1,0));
        System.out.println("FLOAT32");
        Tensor<TFloat32> tf_float2 = TFloat32.tensorOf(StdArrays.ndCopyOf(f1));
        System.out.println(tf_float2.data().getFloat(0,0));
        System.out.println(tf_float2.data().getFloat(1,0));
    }
}
OUTPUT
0.0
1.0
FLOAT16
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
0.0
0.0
FLOAT32
0.0
1.0
Other info / logs
Hi, DJL's TensorFlow engine depends on the tensorflow-core-api SNAPSHOT package. Our dependencies are here: https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-native-auto/build.gradle#L19
We found that the tensorflow-core-api SNAPSHOT was updated on 04/28, but the corresponding linux-gpu-mkl.jar is missing, and the same goes for Windows.
Did the upload fail?
https://oss.sonatype.org/#nexus-search;quick~tensorflow-core-api
We get a 404 when trying to download the jar; both the Gradle build and manually opening the following link fail.
https://oss.sonatype.org/service/local/artifact/maven/redirect?r=snapshots&g=org.tensorflow&a=tensorflow-core-api&v=0.1.0-SNAPSHOT&e=jar&c=windows-x86_64
The mac-os-mkl.jar is there, but the extra libraries libjnimkldnn.dylib and libiomp5.dylib are missing. Is this intended? How can I find them? We rely on this task to download native dependencies automatically for users based on their platform.
Please help take a look, thank you so much!
When we migrated to tensorflow/java and to Maven, we dropped a few rules that were embedded in the Bazel configuration of the previous build. One of them is the set of link checks listed in this file: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/java/build_defs.bzl
This issue is a placeholder to remember that those should probably be added back to the new Maven build.
System information
Describe the current behavior
When using the current Java bindings for TensorFlow, the log gets filled with output from every operation when a model is evaluated. I was unable to find a way to set the log level via the Java API. Is this at all possible? If not, could you please consider adding this functionality in future versions?
Describe the documentation issue
Before the Java Keras development was moved to this repo, it had a great getting started guide. It would be great if that guide were easily discoverable in this repo.
(A getting started guide for importing models from python would be great too. I have a feeling that's a pretty common use case for using tensorflow in java)
In Python TensorFlow, some OPs are defined in the Python layer and some in the C API layer. I have been tasked to see how Java TensorFlow might want to handle this.
I have run some experiments with creating a FrameworkOperatorProcessor class in tensorflow-flow-generator, and a couple of architectures present themselves. This class is basically a copy of OperatorProcessor with some tweaks.
The approaches seem to dictate generating a new class in tensorflow-framework, which I named FOps for now.
The first approach is to have FOps subclass org.tensorflow.op.Ops, which is generated in tensorflow-core-api.
However, this leads to potential name clashes with the methods and groups already in Ops. A prime example is the NN classes we added for Nn and NnRaw (SoftmaxCrossEntropyWithLogits<T> softmaxCrossEntropyWithLogits() has the same signature in both generated classes). This option requires changing Ops from a final class to a non-final one so that it can be inherited.
A second approach is to use the delegate pattern, and have FOps hold an internal reference to Ops, and you could call methods on each as required. For example,
FOps ftf = FOps.create(graph);
ftf.math.tensordot(); // framework op
ftf.getOps().math.mul(); // raw op
This keeps both totally separate from each other and may allow reuse of the existing OperatorProcessor. It may be more cumbersome for the programmer using it.
Another option, that I haven't thought of yet.
I welcome thoughts on this.
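To make the second (delegate) approach concrete, here is a minimal sketch. RawOps and FrameworkOps are hypothetical stand-ins for the generated Ops and the proposed FOps; they operate on strings instead of real operands so the example runs without TensorFlow, and the dot method is an invented example of a framework op composed from raw ops.

```java
public class DelegateSketch {
    /** Hypothetical, simplified stand-in for the generated, final Ops class. */
    public static final class RawOps {
        public String mul(String a, String b) { return "Mul(" + a + ", " + b + ")"; }
        public String sum(String a) { return "Sum(" + a + ")"; }
    }

    /** Hypothetical stand-in for FOps: it holds a RawOps reference instead of extending it. */
    public static final class FrameworkOps {
        private final RawOps ops;

        private FrameworkOps(RawOps ops) { this.ops = ops; }

        public static FrameworkOps create(RawOps ops) { return new FrameworkOps(ops); }

        /** Gives callers access to the raw ops, as in ftf.getOps() above. */
        public RawOps getOps() { return ops; }

        /** A "framework op" built by composing raw ops through the delegate. */
        public String dot(String a, String b) { return ops.sum(ops.mul(a, b)); }
    }

    public static void main(String[] args) {
        FrameworkOps ftf = FrameworkOps.create(new RawOps());
        System.out.println(ftf.dot("x", "y"));          // framework op
        System.out.println(ftf.getOps().mul("x", "y")); // raw op
    }
}
```

Because the wrapper never extends the raw class, the raw class can stay final and a method name can exist in both without clashing; the cost is the extra getOps() hop noted above.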
I'd like to know if a new feature can be added so that an Index knows not only about its own Dimension, but also about the other dimensions.
I already shared this with @karllessard, but I'd like to see if we can do something about it.
Let me explain: I have a float[rows*columns] table with the actual data and an int[rows*columns] table with permutations to sort each column of the data table separately.
Data table:
data = [ 1.34 0.87 2.45 ]
       [ 0.45 1.56 1.66 ]
       [ 1.02 0.98 0.34 ]
permutations = [ 2 0 2 ]
               [ 0 2 1 ]
               [ 1 1 0 ]
This is a way to have both a sorted and an unsorted version of the data table by using permutations in the indexing when the sorted version is needed.
I was thinking of putting the float table in a FloatDataBuffer, then wrapping it in a FloatNdArray and creating an Index that uses the permutations table. But the fact that we need a separate Index instance for each dimension makes things complicated (impossible?). Is the only option now to implement my own FloatDataBuffer/FloatNdArray?
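For reference, the lookup being asked for can be sketched in plain Java, independent of the NdArray API. This assumes each permutations entry gives the sorted position (rank) of the corresponding data element within its column, which is consistent with the example tables above; the class and method names are hypothetical.

```java
public class PermutedColumns {
    /**
     * Inverts the per-column ranks: order[i * cols + c] is the original row
     * that holds the i-th smallest value of column c.
     */
    public static int[] invert(int[] perm, int rows, int cols) {
        int[] order = new int[perm.length];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                order[perm[r * cols + c] * cols + c] = r;
            }
        }
        return order;
    }

    /** Reads the sorted view without copying or reordering the data table. */
    public static float sortedGet(float[] data, int[] order, int cols, int sortedRow, int col) {
        return data[order[sortedRow * cols + col] * cols + col];
    }

    public static void main(String[] args) {
        int rows = 3, cols = 3;
        // Row-major copies of the data and permutations tables shown above.
        float[] data = {1.34f, 0.87f, 2.45f, 0.45f, 1.56f, 1.66f, 1.02f, 0.98f, 0.34f};
        int[] perm = {2, 0, 2, 0, 2, 1, 1, 1, 0};
        int[] order = invert(perm, rows, cols);
        for (int i = 0; i < rows; i++) {
            StringBuilder line = new StringBuilder();
            for (int c = 0; c < cols; c++) {
                line.append(sortedGet(data, order, cols, i, c)).append(' ');
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

The row lookup depends on both the sorted row and the column, which is why a single-dimension Index cannot express it on its own.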
Many thanks for TF-Java! mvn install -Pdev -Djavacpp.platform.extension=-gpu -e on the master branch appears to fail a test, so I thought I'd share it, in case anyone else encounters this:
# mvn install -Pdev -Djavacpp.platform.extension=-gpu -e
...
Downloading from ossrh-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/tensorflow/tensorflow-core-api/0.3.0-SNAPSHOT/tensorflow-core-api-0.3.0-20201008.134402-33-linux-x86_64-gpu.jar
...
tensorflow framework build error
[INFO] --- maven-surefire-plugin:2.22.2:test (default-test) @ tensorflow-framework ---
[INFO] Surefire report directory: /tmp/docker-share/tensorflow-java/tensorflow-framework/target/surefire-reports
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.tensorflow.framework.optimizers.AdaDeltaTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.573 s - in org.tensorflow.framework.optimizers.AdaDeltaTest
[INFO] Running org.tensorflow.framework.optimizers.NadamTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.113 s - in org.tensorflow.framework.optimizers.NadamTest
[INFO] Running org.tensorflow.framework.optimizers.AdamTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.134 s - in org.tensorflow.framework.optimizers.AdamTest
[INFO] Running org.tensorflow.framework.optimizers.AdaGradTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.083 s - in org.tensorflow.framework.optimizers.AdaGradTest
[INFO] Running org.tensorflow.framework.optimizers.RMSPropTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.367 s - in org.tensorflow.framework.optimizers.RMSPropTest
[INFO] Running org.tensorflow.framework.optimizers.AdamaxTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.118 s - in org.tensorflow.framework.optimizers.AdamaxTest
[INFO] Running org.tensorflow.framework.optimizers.FtrlTest
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.273 s - in org.tensorflow.framework.optimizers.FtrlTest
[INFO] Running org.tensorflow.framework.optimizers.MomentumTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.206 s - in org.tensorflow.framework.optimizers.MomentumTest
[INFO] Running org.tensorflow.framework.optimizers.OptimizersTest
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.054 s - in org.tensorflow.framework.optimizers.OptimizersTest
[INFO] Running org.tensorflow.framework.optimizers.GradientDescentTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.14 s - in org.tensorflow.framework.optimizers.GradientDescentTest
[INFO] Running org.tensorflow.framework.optimizers.AdaGradDATest
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.055 s <<< FAILURE! - in org.tensorflow.framework.optimizers.AdaGradDATest
[ERROR] testBasic Time elapsed: 2.044 s <<< ERROR!
org.tensorflow.exceptions.TFInvalidArgumentException:
Cannot assign a device for operation adagrad-da_1: Could not satisfy explicit device specification '' because the node {{colocation_node adagrad-da_1}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0, /job:localhost/replica:0/task:0/device:GPU:1, /job:localhost/replica:0/task:0/device:GPU:2, /job:localhost/replica:0/task:0/device:GPU:3].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
ApplyAdagradDA: CPU
VariableV2: GPU CPU
Assign: GPU CPU
Colocation members, user-requested devices, and framework assigned devices, if any:
var0 (VariableV2) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
adagrad-da_1 (Assign)
var0-gradient_accumulator (VariableV2)
adagrad-da_10 (Assign)
var0-gradient_squared_accumulator (VariableV2)
adagrad-da_15 (Assign)
adagrad-da_36 (ApplyAdagradDA)
[[{{node adagrad-da_1}}]]
at org.tensorflow.framework.optimizers.AdaGradDATest.testBasic(AdaGradDATest.java:90)
[INFO] Running org.tensorflow.framework.data.SkipDatasetTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.029 s - in org.tensorflow.framework.data.SkipDatasetTest
[INFO] Running org.tensorflow.framework.data.BatchDatasetTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.021 s - in org.tensorflow.framework.data.BatchDatasetTest
...
[INFO] Running org.tensorflow.framework.initializers.OrthogonalTest
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.333 s - in org.tensorflow.framework.initializers.OrthogonalTest
[INFO] Running org.tensorflow.framework.initializers.HeTest
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.451 s - in org.tensorflow.framework.initializers.HeTest
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] AdaGradDATest.testBasic:90 ? TFInvalidArgument Cannot assign a device for oper...
[INFO]
[ERROR] Tests run: 112, Failures: 0, Errors: 1, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for TensorFlow Java Parent 0.3.0-SNAPSHOT:
[INFO]
[INFO] TensorFlow Java Parent ............................. SUCCESS [ 3.164 s]
[INFO] TensorFlow NdArray Library ......................... SUCCESS [ 30.072 s]
[INFO] TensorFlow Core Parent ............................. SUCCESS [ 0.006 s]
[INFO] TensorFlow Core Annotation Processor ............... SUCCESS [ 0.190 s]
[INFO] TensorFlow Core API Library ........................ SUCCESS [ 34.833 s]
[INFO] TensorFlow Core API Library Platform GPU ........... SUCCESS [ 0.021 s]
[INFO] TensorFlow Framework Library ....................... FAILURE [01:29 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:37 min
[INFO] Finished at: 2020-10-14T01:44:35Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on project tensorflow-framework: There are test failures.
[ERROR]
[ERROR] Please refer to /tmp/docker-share/tensorflow-java/tensorflow-framework/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on project tensorflow-framework: There are test failures.
Please refer to /tmp/docker-share/tensorflow-java/tensorflow-framework/target/surefire-reports for the individual test results.
Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
...
This new quad-GPU machine presents new challenges, compared to our previous small test systems in #100 etc. Really appreciate the fast -Pdev option for building from artifacts, thanks!
This is painful for machines which don't have MKL installed. The MKL version bound against is the one using Intel's OpenMP, which is even less likely to be available on a machine (plus it's MKL-ML, not regular MKL nor MKL DNN - now DNNL).
We should make the build system a little more configurable, or at least set the default build to be something that works everywhere (even if the performance is worse).
I keep running into this when testing things on different machines here, none of which have MKL, and I have to get approvals for new inbound libraries.
System information
This is a follow-up to #97 (comment) and #97 (comment). I'm trying to build a tensorflow-java with SavedModelBundle exporter(...) but encounter a build error.
Specifically, I git checkout the save_model branch of https://github.com/karllessard/tensorflow-java and run mvn install, which fails. I would raise an issue in https://github.com/karllessard/tensorflow-java, but https://github.com/karllessard/tensorflow-java/issues redirects to https://github.com/karllessard/tensorflow-java/pulls, so I cannot raise an issue there.
Would a colleague kindly advise how to fix the following build error:
root@478b80e86d3b:tensorflow-java-karl# mvn install -e
...
[INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ tensorflow-core-api ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 1678 source files to /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/target/classes
[INFO] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Tensor.java: Some input files use unchecked or unsafe operations.
[INFO] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Tensor.java: Recompile with -Xlint:unchecked for details.
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[129,55] cannot find symbol
symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[135,55] cannot find symbol
symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[479,57] cannot find symbol
symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[485,57] cannot find symbol
symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[536,57] cannot find symbol
symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[82,65] cannot find symbol
symbol: variable internal_static_tensorflow_SaveableObject_descriptor
location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[88,65] cannot find symbol
symbol: variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[290,67] cannot find symbol
symbol: variable internal_static_tensorflow_SaveableObject_descriptor
location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[296,67] cannot find symbol
symbol: variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[329,67] cannot find symbol
symbol: variable internal_static_tensorflow_SaveableObject_descriptor
location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[INFO] 10 errors
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for TensorFlow Java Parent 0.1.0-SNAPSHOT:
[INFO]
[INFO] TensorFlow Java Parent ............................. SUCCESS [ 2.152 s]
[INFO] TensorFlow Tools Library ........................... SUCCESS [02:29 min]
[INFO] TensorFlow Core Parent ............................. SUCCESS [ 0.136 s]
[INFO] TensorFlow Core Annotation Processor ............... SUCCESS [ 4.371 s]
[INFO] TensorFlow Core API Library ........................ FAILURE [08:09 min]
[INFO] TensorFlow Core API Library Platform ............... SKIPPED
[INFO] TensorFlow Framework Library ....................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 10:47 min
[INFO] Finished at: 2020-08-14T17:41:27Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project tensorflow-core-api: Compilation failure: Compilation failure:
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[129,55] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
[ERROR] location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[135,55] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
[ERROR] location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[479,57] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
[ERROR] location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[485,57] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
[ERROR] location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[536,57] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
[ERROR] location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[82,65] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_SaveableObject_descriptor
[ERROR] location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[88,65] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
[ERROR] location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[290,67] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_SaveableObject_descriptor
[ERROR] location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[296,67] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
[ERROR] location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[329,67] cannot find symbol
[ERROR] symbol: variable internal_static_tensorflow_SaveableObject_descriptor
[ERROR] location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] -> [Help 1]
Rather than debug https://github.com/karllessard/tensorflow-java, I would be happy if there were a save_model branch in https://github.com/tensorflow/java. However, I'm just trying to build something to test model save/export and load -- it does not matter to me how this is achieved.
Currently in tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/DataType.java, we use an uncomfortable pattern: DataType is omniscient about TType and dispatches on a string TType.NAME. For example:
/** Returns true if this data type represents an integer type */
public boolean isInteger() {
    switch (this.name()) {
        case TInt32.NAME:
        case TInt64.NAME:
        case TUint8.NAME:
            return true;
        default:
            return false;
    }
}
The author of this pattern, @JimClarke5, mentioned in our Google Group that he regarded it as temporary:
My present code does a switch on DataType.name(), but IMO, this isn't the most elegant way to do this.
@karllessard suggested a direction, although with some open questions:
Each data type in Java inherits from a "type family" as in here, which can be used to set bounds on a given datatype when used as a generic parameter (e.g. Tensor<? extends TNumber> to only accept tensors that are numeric). But if it doesn't do in your case and you really want to check the data type family at runtime, then we need to add new methods, like dataType.isNumber(). I think ideally it should be in line with the same data types classes defined in the core library; the new methods could even be added to the C API, in this file.
Let's decide on a direction! This is moderately pervasive in our code, but also a pretty simple change, so I'd advocate we choose a direction soon and I'm tempted to volunteer to make the change.
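To make the "type family" direction concrete, here is a minimal sketch of a runtime family check that relies on interface inheritance instead of a switch on name strings. Every type below is a hypothetical stand-in declared inside the example; in particular, TIntegral and TFloating are assumptions and are not taken from the library, and the real TInt32, TFloat32 and TString may be shaped differently.

```java
// Sketch only: all nested types are hypothetical stand-ins, not the real TensorFlow Java classes.
final class TypeFamilyDemo {

  // Hypothetical type families, modelled as marker interfaces.
  interface TType {}
  interface TNumber extends TType {}
  interface TIntegral extends TNumber {}  // assumed family, not in the original discussion
  interface TFloating extends TNumber {}  // assumed family, not in the original discussion

  // Hypothetical concrete data types.
  static final class TInt32 implements TIntegral {}
  static final class TFloat32 implements TFloating {}
  static final class TString implements TType {}

  /** Returns true if the given type belongs to the integral family. */
  static boolean isInteger(Class<? extends TType> type) {
    return TIntegral.class.isAssignableFrom(type);
  }

  /** Returns true if the given type belongs to the numeric family. */
  static boolean isNumber(Class<? extends TType> type) {
    return TNumber.class.isAssignableFrom(type);
  }

  public static void main(String[] args) {
    System.out.println(isInteger(TInt32.class));   // true
    System.out.println(isInteger(TFloat32.class)); // false
    System.out.println(isNumber(TFloat32.class));  // true
    System.out.println(isNumber(TString.class));   // false
  }
}
```

With this shape, adding a new integer type only requires implementing the family interface; no central switch has to be updated. Whether the real DataType should expose such checks, or delegate to the C API as suggested above, is still the open question.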
SignatureDef sig = model.metaGraphDef().getSignatureDefMap().get("serving_default");
does not compile (it did a week ago).
It fails with: Error: java: cannot access com.google.protobuf.MessageOrBuilder class file for com.google.protobuf.MessageOrBuilder not found
This error showed up in the past week when using the TF 2.0 snapshot (0.1.0-SNAPSHOT).
System information
Describe the current behavior
Does not compile
Describe the expected behavior
Compiles
Code to reproduce the issue
SavedModelBundle model = SavedModelBundle.load(pathToBundle, "serve");
SignatureDef sig = model.metaGraphDef().getSignatureDefMap().get("serving_default");
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
I'm using this code to get the input and output names of my model. See this Stack Overflow answer and this Stack Overflow answer.
In the TF Java documentation it looks as if you can import a graph definition from a byte array. In their examples they also read from a byte array.
The code here, however, requires a GraphDef.
What's the best way to overcome this? Or am I not understanding what a GraphDef is?
Thanks!
System information
Java API does not support factory method for creating Tensor
Tensors.create() - https://www.tensorflow.org/api_docs/java/reference/org/tensorflow/Tensors
This is not a new topic but I want to start a thread on it so we can get closer to a complete solution.
We have always had trouble building our Windows platforms with the extensions mkl, gpu or mkl-gpu on GitHub Actions because the operation takes too long (i.e. beyond the 6-hour limit).
Adding this new option to the Visual Studio compiler drastically reduces the compilation time of MKL functions, as I've tested locally. Still, none of the platforms with GPU support complete in time, as you can see in this workflow.
Strangely, we can observe that preparing the environment for these builds took 20 minutes, i.e. twice the time of the non-GPU builds, even though we install the same software regardless of whether we are building for GPU or not. Can we investigate the cause of that delay?
Also, I have been told by SIG Build that disabling Eigen inlining helps reduce the compilation time even more, but this time at the price of some performance loss. Still, should we give it a try?
Thanks for helping me build a working .jar file containing TensorflowJava per #96!
Would a colleague kindly help me with the following runtime error, which occurs when I run my tests within a Docker container:
root@4f6083770318:tensorflow-java# java -jar my.jar tf --verbose
DEBUG 1596940815346: Started ndarray TensorflowJavaTest.
DEBUG 1596940815727: matrix3d rank 3
DEBUG 1596940815727: Finished ndarray TensorflowJavaTest.
DEBUG 1596940815728: Started graph TensorflowJavaTest.
DEBUG 1596940819085: fetch_test fetched.data.getInt(0)=3 is 3? true
DEBUG 1596940819087: fetch_test fetched.data.getInt(1)=4 is 4? true
Exception in thread "main" java.lang.IncompatibleClassChangeError: Method 'org.tensorflow.Tensor org.tensorflow.types.TInt32.vectorOf(int[])' must be InterfaceMethodref constant
at vis.TensorflowJavaTest.feed_test(tf.scala:80)
at vis.TensorflowJavaTest.run(tf.scala:106)
at vis.TensorflowJavaTest$.runArgs(tf.scala:58)
at vis.VIS$.main(vis.scala:171)
at vis.VIS.main(vis.scala)
My tests are based on runUsingColonSeparatedNames() in SessionTest.java, and I re-wrote some in Scala.
The test body is:
debug("Started ndarray TensorflowJavaTest.")
// run simple data buffers test as in https://github.com/tensorflow/java/tree/master/ndarray
val matrix3d = org.tensorflow.ndarray.NdArrays.ofInts( org.tensorflow.ndarray.Shape.of(2, 3, 2) )
debug("matrix3d rank " + matrix3d.rank)
debug("Finished ndarray TensorflowJavaTest.")
debug("Started graph TensorflowJavaTest.")
// run simple graph and session tests per https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/test/java/org/tensorflow/SessionTest.java
val graph = new org.tensorflow.Graph() // defined in https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Graph.java
val session = new org.tensorflow.Session(graph)
val tf = org.tensorflow.op.Ops.create(graph)
fetch_test(session, tf)
feed_test(session, tf)
debug("Finished graph TensorflowJavaTest.")
The fetch test is:
private def fetch_test(session: org.tensorflow.Session, tf: org.tensorflow.op.Ops): Unit = {
val split = tf.split( tf.constant(0), tf.array(1, 2, 3, 4), 2l)
tf.math.add(split.output().get(0), split.output().get(1))
// Fetch using colon separated names.
val fetched = session.runner().fetch("Split:1").run().get(0).expect(org.tensorflow.types.TInt32.DTYPE)
val fetch0 = fetched.data().getInt(0)
debug("fetch_test fetched.data.getInt(0)=%d is 3? %b".format(fetch0, fetch0 == 3)) // expected to be 3
val fetch1 = fetched.data().getInt(1)
debug("fetch_test fetched.data.getInt(1)=%d is 4? %b".format(fetch1, fetch1 == 4)) // expected to be 4
}
The feed test is:
import scala.collection.JavaConversions._ // convert java iterator to scala iterator with foreach defined. Allows TensorflowJava's NdArraySequence to be treated as a Java Iterable, to be treated as a Scala Iterable, so mkString() will work
...
private def feed_test(session: org.tensorflow.Session, tf: org.tensorflow.op.Ops): Unit = {
// Feed using colon separated names.
val fed = org.tensorflow.types.TInt32.vectorOf(4, 3, 2, 1) // <-- line 80, throws IncompatibleClassChangeError at runtime
val fetched = session.runner()
.feed("Split:0", fed)
.feed("Split:1", fed)
.fetch("Add")
.run()
.get(0)
.expect(org.tensorflow.types.TInt32.DTYPE)
val data = fetched.data()
debug("feed_test fetched.data=%s".format(data.scalars.mkString(",")))
}
Kindly advise what is wrong with the line val fed = org.tensorflow.types.TInt32.vectorOf(4, 3, 2, 1), which throws the IncompatibleClassChangeError at runtime?
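For context, the message names TInt32.vectorOf(int[]) and says the call site must use an InterfaceMethodref constant. On the JVM, that message is raised when a static method declared on an interface is invoked through a plain class method reference, which can happen when the calling code was compiled by a toolchain that does not emit Java 8 style static interface calls. Whether that is the cause here is an assumption; the thread does not confirm it, nor does it say which Scala version is in use. The sketch below only shows how to check, from Java, that a factory method is a static method living on an interface. IntVectorType and isStaticInterfaceMethod are hypothetical names used for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch only: IntVectorType is a hypothetical stand-in, not the real TInt32.
final class StaticInterfaceMethodDemo {

  /** Hypothetical interface with a static factory, a feature introduced in Java 8. */
  interface IntVectorType {
    static int[] vectorOf(int... values) {
      return values.clone();
    }
  }

  /** Returns true if owner is an interface that declares a static method with this signature. */
  static boolean isStaticInterfaceMethod(Class<?> owner, String name, Class<?>... params) {
    try {
      Method method = owner.getDeclaredMethod(name, params);
      return owner.isInterface() && Modifier.isStatic(method.getModifiers());
    } catch (NoSuchMethodException e) {
      return false; // no such method declared on this type
    }
  }

  public static void main(String[] args) {
    int[] vector = IntVectorType.vectorOf(4, 3, 2, 1);
    System.out.println(vector.length); // 4
    System.out.println(isStaticInterfaceMethod(IntVectorType.class, "vectorOf", int[].class)); // true
  }
}
```

If the same check returns true for the real type, comparing the bytecode target of the Scala compiler with the Java version used to build the library would be a reasonable next step.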
System information
Describe the current behavior
It crashes with the following output:
2020-07-01 20:14:58.469557: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: tokenizer
2020-07-01 20:14:58.474996: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2020-07-01 20:14:58.475083: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:295] Reading SavedModel debug info (if present) from: tokenizer
2020-07-01 20:14:58.475670: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-01 20:14:58.478426: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-07-01 20:14:58.478482: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-07-01 20:14:58.478517: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (1470715d11a2): /proc/driver/nvidia/version does not exist
2020-07-01 20:14:58.478585: I external/org_tensorflow/tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2020-07-01 20:14:58.491011: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:234] Restoring SavedModel bundle.
2020-07-01 20:14:58.522974: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:183] Running initialization op on SavedModel bundle at path: tokenizer
2020-07-01 20:14:58.537814: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:364] SavedModel load for tags { serve }; Status: success: OK. Took 68286 microseconds.
2020-07-01 20:14:59.321091: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at wordpiece_kernel.cc:204 : Invalid argument: Trying to access resource using the wrong type. Expected N10tensorflow6lookup15LookupInterfaceE got N10tensorflow6lookup15LookupInterfaceE
[WARNING]
org.tensorflow.exceptions.TFInvalidArgumentException: Trying to access resource using the wrong type. Expected N10tensorflow6lookup15LookupInterfaceE got N10tensorflow6lookup15LookupInterfaceE
[[{{node WordpieceTokenizeWithOffsets/WordpieceTokenizeWithOffsets/WordpieceTokenizeWithOffsets}}]]
at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK (AbstractTF_Status.java:87)
at org.tensorflow.Session.run (Session.java:595)
at org.tensorflow.Session.access$100 (Session.java:70)
at org.tensorflow.Session$Runner.runHelper (Session.java:335)
at org.tensorflow.Session$Runner.run (Session.java:285)
at Main.main (Main.java:25)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:254)
at java.lang.Thread.run (Thread.java:834)
Describe the expected behavior
Successful execution of the model
Code to reproduce the issue
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;
import org.tensorflow.tools.ndarray.NdArrays;
import org.tensorflow.types.TString;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class Main {
  public static void main(String[] args) {
    TensorFlow.version();
    String libDir = "/usr/local/lib/python3.7/dist-packages/tensorflow_text/python/ops/";
    TensorFlow.loadLibrary(libDir + "_wordpiece_tokenizer.so");
    TensorFlow.loadLibrary(libDir + "_normalize_ops.so");
    TensorFlow.loadLibrary(libDir + "_regex_split_ops.so");
    SavedModelBundle savedModelBundle = SavedModelBundle.load("tokenizer", "serve");
    Session.Runner runner = savedModelBundle.session().runner();
    runner.feed("serving_default_text:0", TString.tensorOfBytes(NdArrays.vectorOfObjects("a b c d e".getBytes(StandardCharsets.UTF_8))));
    runner.fetch("StatefulPartitionedCall_1:0");
    List<Tensor<?>> outputs = runner.run();
    System.out.println(outputs);
  }
}
The following script can be used to generate a minimal saved model triggering the problem:
from tensorflow.python.framework import dtypes
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import lookup_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import string_ops
from tensorflow_text.python.ops import bert_tokenizer
import tensorflow as tf

vocab = [
    b'a', b'b', b'c', b'd'
]

def _create_table(vocab, num_oov=1):
    init = lookup_ops.KeyValueTensorInitializer(
        vocab,
        math_ops.range(
            array_ops.size(vocab, out_type=dtypes.int64), dtype=dtypes.int64),
        key_dtype=dtypes.string,
        value_dtype=dtypes.int64)
    return lookup_ops.StaticVocabularyTableV1(
        init, num_oov, lookup_key_dtype=dtypes.string)

table = _create_table(vocab)

class Module(tf.Module):
    def __init__(self, table):
        self.table = table
        self.tokenizer = bert_tokenizer.BertTokenizer(
            self.table,
            token_out_type=dtypes.string,
            lower_case=True,
            preserve_unused_token=False)

    @tf.function(input_signature=[tf.TensorSpec(1, dtype=tf.dtypes.string)])
    def serve(self, text):
        return self.tokenizer.tokenize(text)

module = Module(table)
tf.saved_model.save(module, 'tokenizer')
model = tf.saved_model.load('tokenizer')
print(model.serve(['a a b c d e']))
Other info / logs
There is an issue in tensorflow-text (tensorflow/text#272) where the same thing happens on macOS (while this is on Linux). However, this model works on Linux when using Python to load and execute it, so the root cause is most likely different. Looking at the fix for the macOS issue (tensorflow/tensorflow@1823f87#diff-991a6b786e16708ba1e6f5c9926cf151) makes me suspect that this may be caused by type ids being generated differently, because tensorflow-java builds the native TensorFlow libraries separately and in a slightly different way than the Python libraries.
System information
Describe the feature and the current behavior/state.
It's so sad: I have been told that "There is no libtensorflow support for TensorFlow 2 yet"
https://github.com/tensorflow/tensorflow/issues/36950#issuecomment-592238340
TF2 changes a lot, and we cannot use new models built with TF2 because no new version of libtensorflow is available.
Will this change the current api? How?
Building the whole of TF2 for Java is a huge project. Is it possible to build libtensorflow first?
Who will benefit with this feature?
All developers using the TF Java API
Confirmation: lets model the API with functions while still loading/saving session-centric graphs
I take it that we go with the current branch (after addressing open comments) and adding unit tests.
For unit tests: Here is a proposal
Originally posted by @Shajan in #89 (comment)
Is it possible to provide a migration guide from TF1 to TF2 for java, or at least list a comprehensive list of breaking changes?
For Python, there is an update script and a compatibility API. I know that would be too much, but a guide would be great to have.
Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template
I'm trying to use the USE LITE model from tensorflow hub in Java. I can download and load the model into Java just fine but the problem arises when I try to run inference. I get errors saying that there are no operations with the name of the input variables when I want to feed them into my graph despite specifying the input names when downloading and saving the model (in python)
System information
You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
Describe the current behavior
When I try to run inference with the model in java it gives me the error java.lang.IllegalArgumentException: No Operation named [values] in the Graph
Describe the expected behavior
The model should be able to run inference on the sample inputs to produce a single output
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
Attached are two files: the first is a Python script I used to download and save the TensorFlow model in an appropriate format. The second is a Java file I used to actually load the model. Note that the path in the Java file needs to be changed to wherever the Python script saved the USE lite model.
HelloTensorFlow.java.zip
download_tensorflow_model.py.zip
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
OS Platform and Distribution : macOS Catalina 10.15.3
TensorFlow installed from : binary
TensorFlow version : 2.1.0
Python version: 3.7.3
Java - 1.8
I am using the TF 1.15.0 Java binding to run inference from a SavedModel where the features are ported as part of the SavedModel. The code below works:
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;

public class TF1_b {
  public static void main(String[] args) {
    String testSen = "1";
    byte[] inputBytes = new byte[1];
    try {
      inputBytes = testSen.getBytes("UTF-8");
    } catch (Exception e) {
      e.printStackTrace();
    }
    System.out.println(TensorFlow.version());
    String basePath = "<path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/";
    System.load(basePath + "_wordpiece_tokenizer.dylib");
    try (SavedModelBundle b = SavedModelBundle.load("<saved_model_path>", "serve")) {
      Session sess = b.session();
      Tensor x = Tensor.create(inputBytes, String.class);
      float[][] y = sess.runner().feed("text:0", x)
          .fetch("linear/linear_model/linear_model/linear_model/weighted_sum:0").run().get(0)
          .copyTo(new float[1][1]);
      System.out.println(y[0][0]);
    } catch (Exception e) {
      System.out.println(e);
    }
  }
}
In pom.xml, the dependency I am using for the above is:
<dependencies>
  <dependency>
    <groupId>org.tensorflow</groupId>
    <artifactId>tensorflow</artifactId>
    <version>1.15.0</version>
  </dependency>
</dependencies>
The above code works perfectly, and I am able to load System.load(basePath + "_wordpiece_tokenizer.dylib"); with no problems.
Now when I try to run the same model with
<repositories>
  <repository>
    <id>tensorflow-snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots/</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
</repositories>
<dependencies>
  <!-- Example of dependency, see README.md for more options -->
  <dependency>
    <groupId>org.tensorflow</groupId>
    <artifactId>tensorflow-core-platform</artifactId>
    <version>0.1.0-SNAPSHOT</version>
  </dependency>
</dependencies>
as described in https://stackoverflow.com/questions/61373396/tensorflow-2-x-java-bindings
with changes to only two lines:
Tensor x = TString.scalarOf(new String("1"));
Tensor<TFloat32> y = sess.runner().feed("text:0", x) .fetch("linear/linear_model/linear_model/linear_model/weighted_sum:0").run().get(0).expect(TFloat32.DTYPE);
I get error
Exception in thread "main" java.lang.UnsatisfiedLinkError: <path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib: dlopen(<path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib, 1): Library not loaded: @rpath/libtensorflow_framework.1.dylib
Referenced from: <path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib
Reason: image not found
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
The path <path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib exists, and TF 1.15.x Java inference works fine with it.
The Java code I am using for TF 2.x is below:
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;
import org.tensorflow.tools.ndarray.NdArrays;
import org.tensorflow.types.TFloat32;
import org.tensorflow.types.TString;

public class TF_2 {
  public static void main(String[] args) {
    String testSen = "1";
    byte[] inputBytes = new byte[1];
    try {
      inputBytes = testSen.getBytes("UTF-8");
    } catch (Exception e) {
      e.printStackTrace();
    }
    System.out.println(TensorFlow.version());
    String basePath = "<path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/";
    System.load(basePath + "_wordpiece_tokenizer.dylib");
    try (SavedModelBundle b = SavedModelBundle.load("<saved_model_path>", "serve")) {
      Session sess = b.session();
      Tensor x = TString.scalarOf(new String("1"));
      Tensor<TFloat32> y = sess.runner().feed("text:0", x)
          .fetch("linear/linear_model/linear_model/linear_model/weighted_sum:0").run().get(0)
          .expect(TFloat32.DTYPE);
      System.out.println(y.data().toString());
    } catch (Exception e) {
      System.out.println(e);
    }
  }
}
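Reading the trace above, dlopen did find _wordpiece_tokenizer.dylib; what it reports as "Library not loaded" is the dependency @rpath/libtensorflow_framework.1.dylib referenced from it. When debugging this kind of failure it can help to catch the UnsatisfiedLinkError and print its full message rather than letting the program abort. The following is a minimal sketch under that assumption; NativeLoadCheck and tryLoad are hypothetical helper names, and the default path is a placeholder that is expected not to exist.

```java
import java.util.Optional;

// Sketch only: a hypothetical helper that reports why a native library failed to load.
final class NativeLoadCheck {

  /** Tries to load the library; returns the linker's message on failure, empty on success. */
  static Optional<String> tryLoad(String absolutePath) {
    try {
      System.load(absolutePath);
      return Optional.empty();
    } catch (UnsatisfiedLinkError e) {
      // The message usually names the file or dependency that could not be resolved.
      return Optional.of(String.valueOf(e.getMessage()));
    }
  }

  public static void main(String[] args) {
    // Placeholder path; pass the real library path as the first argument.
    String path = args.length > 0 ? args[0] : "/nonexistent/_wordpiece_tokenizer.dylib";
    Optional<String> failure = tryLoad(path);
    if (failure.isPresent()) {
      System.out.println("Could not load " + path + ": " + failure.get());
    } else {
      System.out.println("Loaded " + path);
    }
  }
}
```

This does not fix the missing dependency; it only makes the failing library name visible so that the version of libtensorflow_framework expected by the tokenizer can be compared with the one shipped by the Java artifact.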
Describe the problem
I'm sorry if this is a bad question, but I'm unable to find the maven coordinates for this project. You have a sample pom.xml but I can't find them on maven repo and when I try to add them I get errors that they can't be found. What am I missing?
Hello. I just started learning speech recognition. I want to program speech recognition using java based on PC. Is there a demo for speech recognition using tensorflow in Java ?
Please make sure that this is a feature request. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:feature_template
System information
Describe the feature and the current behavior/state.
Will this change the current api? How?
Who will benefit with this feature?
Any Other info.
Please make sure that this is a documentation issue. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:doc_template
System information
Describe the documentation issue
It's a bit confusing to me: I just heard the news about the alpha release at version 0.2.0, but the installation docs at https://www.tensorflow.org/install/lang_java point to Maven artifact version 2.3.0, even though version 0.2.0 is also available on Maven Central: https://search.maven.org/artifact/org.tensorflow/tensorflow-core-api/0.2.0/jar
Please update the docs so people who want to play with the alpha release can find it.
We welcome contributions by users. Will you be able to update submit a PR (use the doc style guide) to fix the doc Issue? No
Hi, it seems the Java API can't create an op with more than 8 outputs. This is a big limitation, especially for split operators.
Here is code to reproduce it. I'm using the latest SNAPSHOT version of tensorflow-core-api.
package ai.djl.tensorflow.engine;

import org.tensorflow.EagerSession;
import org.tensorflow.op.Ops;
import org.tensorflow.types.TInt64;

public class Tftest {
  public static void main(String[] args) {
    EagerSession eagerSession = EagerSession.options().async(true).build();
    Ops tf = Ops.create(eagerSession);
    // creates a (20, 20) tensor with zeros and split to 10 tensors equally on axis 0,
    // should return 10 tensors each with shape(2, 20).
    tf.splitV(
        tf.zeros(tf.constant(new long[]{20,20}), TInt64.DTYPE),
        tf.constant(10),
        tf.constant(0),
        10L
    );
  }
}
error message:
Exception in thread "main" java.lang.IllegalArgumentException: Expecting 10 outputs, but *num_retvals is 8
at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:72)
at org.tensorflow.EagerOperationBuilder.execute(EagerOperationBuilder.java:302)
at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:68)
at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:57)
at org.tensorflow.op.core.SplitV.create(SplitV.java:64)
at org.tensorflow.op.Ops.splitV(Ops.java:5715)
at ai.djl.tensorflow.engine.Tftest.main(Tftest.java:11)
Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template
System information
You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
Describe the current behavior
I am trying to invoke a TFHUB model (Universal Sentence Encoder v4) using the new java API (using Scala). However, I am getting stuck at the error below.
An exception or error caused a run to abort: Malformed TF_STRING tensor; too short to hold number of elements
org.tensorflow.exceptions.TFInvalidArgumentException: Malformed TF_STRING tensor; too short to hold number of elements
at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:87)
at org.tensorflow.Session.run(Session.java:595)
at org.tensorflow.Session.access$100(Session.java:70)
at org.tensorflow.Session$Runner.runHelper(Session.java:335)
at org.tensorflow.Session$Runner.run(Session.java:285)
at org.samik.EmbeddingModelServer.USEEmbeddingServerTest.<init>(USEEmbeddingServerTest.scala:85)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
Describe the expected behavior
The code should compile and execute.
Code to reproduce the issue
import org.tensorflow.proto.framework.{MetaGraphDef, SignatureDef, TensorInfo}
import org.tensorflow.{SavedModelBundle, Tensor}
import scala.collection.JavaConverters._
import scala.collection.mutable
import org.tensorflow.ndarray.Shape
import org.tensorflow.ndarray.buffer.DataBuffers
import org.tensorflow.types.{TString, TUint8}
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
class TestApp extends App
{
val useModel = SavedModelBundle.load("/local/path/to/tfhub/use_4", "serve")
val metaData = useModel.metaGraphDef()
val signatureDef = metaData.getSignatureDefMap().get("serving_default")
val firstInput = getInputToShape(metaData).keys.head
val firstOutput = getOutputToShape(metaData).keys.head
val input = "Hello"
val dataBuffer = DataBuffers.of(ByteBuffer.wrap(input.getBytes(StandardCharsets.UTF_8)))
val tensor = Tensor.of(TString.DTYPE, Shape.of(1L), dataBuffer)
println(s"Tensor: $tensor")
val sessionRunner = useModel.session().runner()
val result = sessionRunner
.feed(firstInput, tensor)
//****** The below line (fetch(..)) seems to be generating the error *********//
.fetch(firstOutput)
.run()
.asScala
println(result)
private def getOutputToShape(metadata: MetaGraphDef): mutable.Map[String, Shape] =
mapToShape(signatureDef.getOutputsMap.asScala)
private def getInputToShape(metadata: MetaGraphDef): mutable.Map[String, Shape] =
mapToShape(signatureDef.getInputsMap.asScala)
private def mapToShape(map: mutable.Map[String, TensorInfo]): mutable.Map[String, Shape] =
{
map.foldLeft(mutable.HashMap[String, Shape]())
{ case(accum, (_, tensorInfo)) =>
val dimList = tensorInfo.getTensorShape.getDimList.asScala.map(_.getSize)
val shape = if(dimList.length == 0) Shape.unknown() else Shape.of(dimList: _*)
accum += (tensorInfo.getName -> shape)
}
}
}
However, pretty much the same code, with the same helper functions, works with the published jar (1.15.0). Here is the corresponding snippet.
val metaData = MetaGraphDef.parseFrom(useModel.metaGraphDef())
val firstInput = getInputToShape(metaData).keys.head
val firstOutput = getOutputToShape(metaData).keys.head
val input = "Hello there!"
val inputTensor: Tensor[String] = Tensors.create(Array(input.getBytes()))
val sessionRunner = useModel.session().runner()
val results = sessionRunner.feed(firstInput, inputTensor).fetch(firstOutput).run().asScala
results.foreach(tensor => {
val array = Array.ofDim[Float](tensor.shape()(0).toInt, tensor.shape()(1).toInt)
tensor.copyTo(array)
println(s"[${array(0).mkString(", ")}]")
})
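One difference between the two snippets is that the new code wraps the raw UTF-8 bytes of "Hello" directly as the tensor's data buffer. As far as the TensorFlow C API headers of that era describe it, a TF_STRING tensor was not stored as raw bytes: it started with a table of one 64-bit offset per element, followed by each string as a varint length plus its bytes. Five raw bytes are shorter than a single 8-byte offset entry, which would be consistent with "too short to hold number of elements". That this is the actual cause is an assumption, not something confirmed in this thread. The sketch below illustrates that legacy layout in plain Java; LegacyStringTensorEncoding is a hypothetical name and is not part of the library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Sketch only: illustrates the assumed legacy TF_STRING layout, not a library API.
final class LegacyStringTensorEncoding {

  /** Encodes elements as [uint64 offset per element][varint length + bytes per element]. */
  static byte[] encode(byte[][] elements) {
    ByteArrayOutputStream data = new ByteArrayOutputStream();
    ByteBuffer offsets = ByteBuffer.allocate(8 * elements.length).order(ByteOrder.nativeOrder());
    for (byte[] element : elements) {
      offsets.putLong(data.size()); // offset of this element within the data section
      writeVarint(data, element.length);
      data.write(element, 0, element.length);
    }
    byte[] encoded = new byte[offsets.capacity() + data.size()];
    System.arraycopy(offsets.array(), 0, encoded, 0, offsets.capacity());
    System.arraycopy(data.toByteArray(), 0, encoded, offsets.capacity(), data.size());
    return encoded;
  }

  /** Writes a non-negative int as a base-128 varint, least significant group first. */
  private static void writeVarint(ByteArrayOutputStream out, int value) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
  }

  public static void main(String[] args) {
    byte[] raw = "Hello".getBytes(StandardCharsets.UTF_8);
    byte[] encoded = encode(new byte[][] {raw});
    System.out.println(raw.length);     // 5
    System.out.println(encoded.length); // 14 = 8 (offset) + 1 (varint length) + 5 (bytes)
  }
}
```

In practice, a typed factory is likely a safer route than building this buffer by hand; TString.scalarOf, used in another report above, is one such factory.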
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.