
java's Issues

Protobuf support in TF Java

Right now, the TensorFlow Java client from the repository does not directly expose the protobufs that are part of the contract of the C API (though it was mentioned in this thread that protobufs would eventually be removed from the API, I have a feeling this won't happen anytime soon).

Instead, it compiles and distributes the protobuf Java bindings as a separate artifact, and the client itself remains agnostic of the content of the messages, simply exposing unmapped byte arrays, like here.

We need to decide how we want to handle those protobufs in the new distribution. Possible choices are:

  • We keep distributing the compiled protobufs as an external artifact like now from the core repository build
  • We keep distributing the compiled protobufs as an external artifact but from our new repository build
  • We compile and distribute the Java protobuf bindings within the new TF Java client artifact so they get exposed directly by our API
  • We compile the Java protobuf bindings with the new TF Java client but wrap them so they are not exposed directly by our API (in case they are removed in TF core, see previous thread link)
  • Any other idea?

Note that if we decide to compile the protobufs from the new TF Java client, it will bring its own load of additional dependencies, such as gRPC.

CC: @sjamesr

Issues when translating between TType and TNumber with parameterized types within methods

Many of the Java TF ops use parameterized types bounded by TType, TNumber, or both. Sometimes an op uses <T extends TType> and sometimes another op uses <T extends TNumber>. When writing a method that uses two different ops that declare <T> differently, the compiler complains that T cannot be converted to the other type. Interestingly, TNumber is a subclass of TType. I have searched "Professor Google", but have not found an answer to this kind of problem.

TType-to-TNumber conversion is very common, especially if you are creating a base class with a common method signature across many similar objects. Sometimes the subclass calls for a TType, sometimes a TNumber. The real problem is when you have a common method such as public <T extends TType> Operand<T> call(Operand<T> input).

As a workaround, let's say that you cast a TType to a TNumber (where <U extends TNumber>) as in:

@SuppressWarnings("unchecked")
Operand<U> uInput = (Operand<U>)input;

Now when you call something like tf.math.greater(uInput, otherValue);, the compiler complains:
no instance of type variable(s) exists so that T conforms to TNumber. That is because tf.math.greater uses <T extends TNumber>, while other ops, like tf.nn.relu, declare <T extends TType>.

Another way around this is to force erasure as in (Operand)value.

At a minimum, it would be nice if there were a consistent convention like <T extends TType> and <U extends TNumber>, but even this may not solve all issues of this kind, as I have seen <U extends TType, T extends TType> and <V extends TType, T extends TType, U extends TType>.

The main issue that contributes to this problem is that the ops require a mixture of types, so a higher-level user ends up artificially juggling the situation by casting as above, or by forcing an erasure of the type.
IMO this situation is going to be confusing to the API user. I still haven't figured out a clean way around the issue when two method signatures use the same generic parameter in different ways.

Perhaps there is a better way. My gut feel is this is going to become a larger headache down the line.

The specific example where I am running into this problem right now is:

@Override
public Operand<T> call(Operand<T> input) {
    @SuppressWarnings("unchecked")
    Operand<U> uInput = (Operand<U>) input;
    ....
    Operand<U> greater = tf.dtypes.cast(
            tf.math.greater(uInput,
                    tf.dtypes.cast(tf.constant(threshold),
                            input.asTensor().dataType())),
            input.asTensor().dataType());
    uInput = tf.math.mul(uInput, greater);
    input = (Operand<T>) uInput;
...
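
For comparison, a minimal sketch (the class and method names here are hypothetical) of the alternative of bounding the method's own type parameter by TNumber, which removes the unchecked cast at the cost of no longer accepting every TType:

import org.tensorflow.Operand;
import org.tensorflow.op.Ops;
import org.tensorflow.types.family.TNumber;

public final class ThresholdSketch {

    // One type variable bounded by TNumber satisfies both tf.math.greater
    // (which requires TNumber) and tf.math.mul (which requires TType).
    public static <T extends TNumber> Operand<T> callNumeric(
            Ops tf, Operand<T> input, Operand<T> threshold) {
        Operand<T> mask = tf.dtypes.cast(
                tf.math.greater(input, threshold), input.asTensor().dataType());
        return tf.math.mul(input, mask);
    }
}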

Instead of a separate Keras layer, should we just have a single Keras-like framework API?

This issue continues a discussion that started in #91. Here's a brief excerpt of key points made so far.

@karllessard wrote:

The current Python API is made up of two layers because it is historically the product of a merge between two different projects: the original TF API and the Keras project. I personally think it brings more confusion to users than benefits, and we don't need to follow this scheme if we think we can do better in Java, since we are starting from scratch.

I'm slowly leaning now to the idea of having a single API that supports both "beginner" and "advanced" modes, whether we call it Keras or not.

@JimClarke5 wrote:

IMHO, the beauty of Keras is in the simple, straightforward Model and Layers. Most of the Layers have defaults for constructs like Metrics, Optimizers, Activations, etc. Also, they allow simple strings in their parameters that instruct the underlying layers to construct elements, like new Dense(24, "relu"), so the way these elements are constructed can be hidden from a Keras user.

@Craigacp wrote:

My preference is to have both a low and a high level framework, which is how TF Python is currently structured. You don't need to use Keras if you don't want to, but many people do.

One reason to advocate for both frameworks is that it might actually take less development effort. Building out Keras to have full coverage requires a lot of consistent effort, but supporting ops that are added to TF's C API in a lower level API is essentially free for us.

The high level framework is for people who use Keras in TF Python, and want an API that guides them better. I think that we should have stronger typing information than exists in Python, as it's what would be expected from idiomatic Java and it helps IDEs & discoverability.
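
To make the string-parameter style @JimClarke5 describes concrete, here is a hypothetical Java sketch (the Dense class shown is an assumption for illustration, not existing API):

// Hypothetical sketch only: a layer that hides element construction behind a
// simple string parameter, mirroring Keras's new Dense(24, "relu").
public final class Dense {
    private final int units;
    private final String activation; // e.g. "relu", resolved by name at build time

    public Dense(int units, String activation) {
        this.units = units;
        this.activation = activation;
    }

    public int units() { return units; }
    public String activation() { return activation; }
}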

Native libraries licensing

(Note: this is an extract from a previous pull request, now filed as a separate issue.)

In the current TF Java artifacts (i.e. those issued from the main TF repository), the native artifacts (e.g. libtensorflow_jni.jar) contain a LICENSE file (and a new THIRD_PARTY_TF_JNI_LICENSES since 1.15, I think). These licenses list copyright notices from the different libraries used by the TF core runtime.

When we migrated to our Java repo, I took a snapshot of the LICENSE file and copied it under tensorflow-core-api/src/main/resources. But that was not very useful, since the licenses must be attached to the same artifact/folder as the native libraries we distribute (i.e. the jars with the os-arch classifier), and instead it was ending up in our Java jar (the one without a classifier).

So I moved it to tensorflow-core-api directly for now, and I guess we need to add some Maven rules to include it in our classified/native jars. Nonetheless, some details remain obscure to me:

  • Where do the LICENSE file (and the new THIRD_PARTY_TF_JNI_LICENSES) found in the TF Java 1.15 artifacts come from?
  • How do we stay in sync with changes in those files?
  • Are the licenses the same for all supported OSes?

CC: @sjamesr

Where can I find the documentation for TensorFlow Java?

Recently, I wanted to use Java to deploy my model, but something went wrong: TensorFlow core dumps in libtensorflow-1.15.0.so.
So I want to switch my Java library to tensorflow/java, but I cannot find any documentation to help me quickly load my model. The only thing that has helped me so far is the Issues.

And when I test my code with tensorflow-core-api, it raises:

Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jnitensorflow in java.library.path

Where can I find that?

Thanks for the help!

Add option to run TensorFlow job on the preferred device

System information

  • TensorFlow version (you are using): 1.15 or 2.x
  • Are you willing to contribute it (Yes/No): Yes

Describe the feature and the current behavior/state.
Currently, we can run our code on a GPU only by adding the GPU dependencies to the classpath.
But the basic Python API provides the ability to set the preferred device (GPU or CPU, via a device name).

A basic option is also available in the low-level builder here.

Will this change the current api? How?

Let's add a function tf.withDevice("/GPU:0") to the Scope class.
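
A hypothetical usage sketch of the proposed method (withDevice does not exist yet; everything else is the existing API):

import org.tensorflow.Graph;
import org.tensorflow.op.Ops;

public class DeviceExample {
    public static void main(String[] args) {
        try (Graph g = new Graph()) {
            Ops tf = Ops.create(g);
            // Proposed, not yet implemented: a child scope whose ops are pinned
            // to the named device, mirroring Python's tf.device("/GPU:0").
            Ops gpu = tf.withDevice("/GPU:0");
            gpu.math.add(gpu.constant(1.0f), gpu.constant(2.0f));
        }
    }
}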

Who will benefit from this feature?
Anyone who trains neural networks in distributed mode on different GPU/CPU devices.

Any Other info.

Performance regression on GPU

Description

Hi, recently I benchmarked inference on GPU on an AWS EC2 P3.2xlarge instance with a pretrained ResNet50 model. CPU benchmarks are pretty close to Python; however, there is a regression on GPU:

0.2.0 TF Java

P50: 4.76 ms
P90: 6.47 ms

Python (TF 2.3.1)

P50: 3.24 ms
P90: 4.59 ms

I am not sure why CPU is very close while GPU is quite far off (~20% difference).

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): AWS DL AMI (Ubuntu 18.04 based)
  • CUDA/cuDNN version: CUDA 10.1
  • GPU model and memory: Tesla V100 16GB

Step to reproduce

You can get the pretrained Keras ResNet50 model and save it in SavedModel format.

Java

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.ndarray.Shape;
import org.tensorflow.types.TFloat32;

public class Example {
  public static void main(String[] args) {

    int ITERATION = 1000;
    String dir = "model_path";

    SavedModelBundle.Loader loader =
            SavedModelBundle.loader(dir).withTags("serve");

    SavedModelBundle bundle = loader.load();
    Session session = bundle.session();
    List<Long> timeCollector = new ArrayList<>();
    for (int i = 0; i < ITERATION; i++) {
      long start = System.nanoTime();
      forward(session);
      timeCollector.add(System.nanoTime() - start);
    }
    Collections.sort(timeCollector);
    System.out.println("P50: " + percentile(timeCollector, 50) + "ms");
    System.out.println("P90: " + percentile(timeCollector, 90) + "ms");
    System.out.println("P99: " + percentile(timeCollector, 99) + "ms");
  }

  public static double percentile(List<Long> times, int percentile) {
    int index = times.size() * percentile / 100;
    return times.get(index) / 1_000_000f; // nanoseconds to milliseconds
  }

  public static void forward(Session session) {
    Session.Runner runner = session.runner();
    // Feed an all-zeros 1x224x224x3 image and fetch the model output.
    try (Tensor<?> tensor = Tensor.of(TFloat32.DTYPE, Shape.of(1, 224, 224, 3))) {
      runner.feed("serving_default_input_1:0", tensor);
      runner.fetch("StatefulPartitionedCall:0");
      List<Tensor<?>> result = runner.run();
      result.forEach(Tensor::close); // release output tensors between iterations
    }
  }
}

Python

import sys
import time

import numpy as np
import tensorflow as tf

if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("usage: python3 benchmark.py <model_name> <model_dir> <num_iterations>")
        exit(1)

    model_name = sys.argv[1]
    model_path = sys.argv[2]
    iterations = int(sys.argv[3])

    print("#############################################")
    print("start testing Model: " + model_name)
    begin = time.time()

    # load model
    model = tf.saved_model.load(model_path)
    latencies = []
    for _ in range(iterations):
        inputs = tf.zeros((1, 224, 224, 3))

        start = time.time()
        result = model(inputs)
        # convert seconds to milliseconds
        latencies.append((time.time() - start) * 1000)
        result.numpy()

    elapsed = (time.time() - begin) * 1000
    throughput = iterations / elapsed * 1000
    p50 = np.percentile(latencies, 50)
    p90 = np.percentile(latencies, 90)
    p99 = np.percentile(latencies, 99)

    print("Model: {}".format(model_name))
    print("Iterations: {:d}".format(iterations))
    print("Throughput: {:.2f}".format(throughput))
    print("Elapsed: {:.3f} ms.".format(elapsed))
    print("P50: {:.3f} ms".format(p50))
    print("P90: {:.3f} ms".format(p90))
    print("P99: {:.3f} ms".format(p99))

Exception in thread "main" java.lang.NoClassDefFoundError: org/bytedeco/javacpp/Pointer

Thanks for helping me build tensorflow-java per #94!

My first tests involve NdArray and Graph, but I naively can't find org/bytedeco/javacpp/Pointer.class to put in my.jar, leading to the following exception:

$ java -jar my.jar tf --verbose
DEBUG 1596849621481: Started ndarray TensorflowJavaTest.
DEBUG 1596849621705: matrix3d rank 3
DEBUG 1596849621706: Finished ndarray TensorflowJavaTest.
DEBUG 1596849621706: Started graph TensorflowJavaTest.
Exception in thread "main" java.lang.NoClassDefFoundError: org/bytedeco/javacpp/Pointer
	at vis.TensorflowJavaTest.run(tf.scala:76)
	at vis.TensorflowJavaTest$.runArgs(tf.scala:56)
	at vis.VIS$.main(vis.scala:171)
	at vis.VIS.main(vis.scala)
Caused by: java.lang.ClassNotFoundException: org.bytedeco.javacpp.Pointer
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 4 more

The code for my simple test is in Scala, which gets packed into my.jar with other code:

      debug("Started ndarray TensorflowJavaTest.")
      // run simple data buffers test as in https://github.com/tensorflow/java/tree/master/ndarray
      val matrix3d = org.tensorflow.ndarray.NdArrays.ofInts( org.tensorflow.ndarray.Shape.of(2, 3, 2) )
      debug("matrix3d rank " + matrix3d.rank)
      debug("Finished ndarray TensorflowJavaTest.")

      debug("Started graph TensorflowJavaTest.")
      // run simple graph and session tests per https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/test/java/org/tensorflow/SessionTest.java
      val graph = new org.tensorflow.Graph() // defined in https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Graph.java
      val session = new org.tensorflow.Session(graph)
      val tf = org.tensorflow.op.Ops.create(graph)
      debug("Finished graph TensorflowJavaTest.")

Where might the bytedeco classes be? I only see these jars from my build:

(base) tensorflow-java$ find . -name '*jar'|grep -v surefire
./tensorflow-framework/target/tensorflow-framework-0.2.0-SNAPSHOT.jar
./ndarray/target/ndarray-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-generator/target/tensorflow-core-generator-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-generator/target/tensorflow-core-generator-0.2.0-SNAPSHOT-sources.jar
./tensorflow-core/tensorflow-core-api/target/tensorflow-core-api-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-api/target/tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar
./tensorflow-core/tensorflow-core-api/target/tensorflow-core-api-0.2.0-SNAPSHOT-sources.jar
./tensorflow-core/tensorflow-core-platform/target/tensorflow-core-platform-0.2.0-SNAPSHOT.jar
./tensorflow-core/tensorflow-core-platform/target/tensorflow-core-platform-0.2.0-SNAPSHOT-sources.jar
./tensorflow-core/tensorflow-core-platform/target/tensorflow-core-platform-0.2.0-SNAPSHOT-javadoc.jar

No bytedeco jars in the Ivy cache either:

$ find ~/.ivy2/cache -name '*bytedeco*'

In the docker container where I built tensorflow-java, there is a .javacpp cache directory with extracted bytedeco artifacts, but no Pointer .class or standalone .jar files:

root@4f6083770318:~/.javacpp/cache# find|sort
.
./.lock
./javacpp-1.5.3-linux-x86_64.jar
./javacpp-1.5.3-linux-x86_64.jar/org
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco/javacpp
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco/javacpp/linux-x86_64
./javacpp-1.5.3-linux-x86_64.jar/org/bytedeco/javacpp/linux-x86_64/libjnijavacpp.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libjnitensorflow.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow.so.2
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow_framework.so
./tensorflow-core-api-0.2.0-SNAPSHOT-linux-x86_64.jar/org/tensorflow/internal/c_api/linux-x86_64/libtensorflow_framework.so.2

Would a colleague kindly tell me where the bytedeco Pointer class file is, so I may continue with my tests? Many thanks in advance! Sorry for the basic question; surely I'm missing something, because I can see bytedeco in various pom.xml files, like this one:

(base) tensorflow-java/tensorflow-core/tensorflow-core-api$ grep -2 bytedeco pom.xml
  <dependencies>
    <dependency>
      <groupId>org.bytedeco</groupId>
      <artifactId>javacpp</artifactId>
      <version>${javacpp.version}</version>
    </dependency>
    <dependency>
      <groupId>org.bytedeco</groupId>
      <artifactId>javacpp</artifactId>
      <version>${javacpp.version}</version>
--
      </plugin>
      <plugin>
        <groupId>org.bytedeco</groupId>
        <artifactId>javacpp</artifactId>
        <version>${javacpp.version}</version>
--
              <quiet>true</quiet>
              <links>
                <link>http://bytedeco.org/javacpp/apidocs</link>
              </links>
            </configuration>

TF Java 2.3 fails on Tesla V100 GPU

Hi,

tensorflow-core-api:0.2.0-SNAPSHOT is failing on AWS EC2 instances with Tesla V100 GPUs.
All the core-api tests fail with the error:

CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid

I have root-caused the issue: CUDA compute capability 7.0 is not enabled during compilation. Running the following command and building from source again fixed the issue:

export TF_CUDA_COMPUTE_CAPABILITIES=7.0

Somehow, in the release build of tensorflow-core-api, compute 7.0 is not enabled, even though compute 3.5 and 7.0 should be the default capabilities of TF 2.3 according to here. The Python packages and the main repo built from source work fine without any modification.

References:
tensorflow/tensorflow#41132
tensorflow/tensorflow@cf1b6b3

Add abstractions for parsing TFRecord Files using `tf.Example` and `tf.io` ops

System information

  • TensorFlow version (you are using): Latest master of TensorFlow Java
  • Are you willing to contribute it (Yes/No): No (working on other things at the moment)

Describe the feature and the current behavior/state.
Currently in Java, we have access to the core tf.io ops such as tf.parseExample, tf.parseSingleExample, tf.decodeRaw, etc. In order to serialize TFRecord datasets and read in datasets from the tensorflow_datasets buckets, for example, we need to be able to use these ops easily.

In Python, the relevant abstractions built on top of tf.io are defined in parsing_config.py. Specifically, it would be very helpful to have abstractions such as:

  • The various feature types: FixedLenFeature, SparseFeature, FixedLenSequenceFeature, etc. (see the sketch after this list)
  • The _ParseOpParams class, which wraps the parameters to tf.parseExample
  • A standardized flow for defining the features in a TFRecord file
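
For instance, a hypothetical sketch of what a Java FixedLenFeature could look like (nothing here exists in TF Java yet; the fields simply mirror Python's tf.io.FixedLenFeature):

import org.tensorflow.DataType;
import org.tensorflow.Operand;
import org.tensorflow.ndarray.Shape;
import org.tensorflow.types.family.TType;

// Hypothetical feature spec describing one fixed-length feature in a tf.Example.
public final class FixedLenFeature<T extends TType> {
    private final Shape shape;
    private final DataType<T> dtype;
    private final Operand<T> defaultValue; // substituted when the key is absent

    public FixedLenFeature(Shape shape, DataType<T> dtype, Operand<T> defaultValue) {
        this.shape = shape;
        this.dtype = dtype;
        this.defaultValue = defaultValue;
    }

    public Shape shape() { return shape; }
    public DataType<T> dtype() { return dtype; }
    public Operand<T> defaultValue() { return defaultValue; }
}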

See these examples, which relate to using the parse-example ops and reading TFRecord files.

Will this change the current api? How?

This will add APIs for serializing/parsing examples to/from TFRecord files.

Who will benefit from this feature?

Anyone using datasets stored as TFRecord files from TensorFlow Java (for example, to load datasets from the tensorflow_datasets GCP bucket).

Any Other info.

Feel free to get in touch with me anytime to discuss! Happy to help.

x86 support

System information

  • TensorFlow version (you are using): 2.3.0
  • Are you willing to contribute it (Yes/No): No

Describe the feature and the current behavior/state.
libtensorflow for x86 (e.g. i386/i486/i586/i686)

Will this change the current api? How?
No

Who will benefit from this feature?
32-bit processor users

Any Other info.

Issue with tf.fill() and tf.zeros() not returning enough values for shape

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): YES
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): OSX 10.15.4
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: NA
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version (use command below): 2.2.0-rc3
  • Python version: 3.7
  • Bazel version (if compiling from source): bazel 2.0.0
  • GCC/Compiler version (if compiling from source):

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.3 (clang-1103.0.32.59)
Target: x86_64-apple-darwin19.4.0
Thread model: posix

  • CUDA/cuDNN version: None
  • GPU model and memory: NA
  • tensorflow-java: Version HEAD

Describe the current behavior
When calling tf.zeros() or tf.fill(), the returned data does not match the Shape passed in.
If I have a shape of (2,2), I would expect 4 values to be returned, but I only get 2.
However, when I pass new long[(int)shape.size()] as the shape argument, it works as expected and 4 values are returned. It seems that the fill op is using the size of the input array rather than the values contained within the array.

Also, I noticed that the ZerosTest.java, does not check for the length of the returned array, it merely just checks that each element returned is zero.

Describe the expected behavior
The python version works as expected:

import tensorflow as tf
print(tf.__version__)

op = tf.fill([2,2], 1.0)
print(op.numpy())

op = tf.zeros([2,2])
print(op.numpy())

With output:

2.2.0-rc3
[[1. 1.]
 [1. 1.]]
[[0. 0.]
 [0. 0.]]

Code to reproduce the issue
The following Java code exhibits the issue:

float[] actualF = {1.f, 1.f, 1.f, 1.f};
try (EagerSession session = EagerSession.create()) {
    Ops tf = Ops.create(session);
    Shape shape = Shape.of(2, 2);

    // this only returns 2 of the 4 zeros
    Operand<TFloat32> zeroOp = tf.zeros(
            tf.shape(tf.dtypes.cast(tf.constant(shape.asArray()), TFloat32.DTYPE)),
            TFloat32.DTYPE);
    zeroOp.asTensor().data().read(DataBuffers.of(actualF));
    System.out.print("tf.zeros: ");
    zeroOp.asTensor().data().scalars().forEach(s -> System.out.print(s.getFloat() + ", "));
    System.out.println();
    System.out.println("actual Array: " + Arrays.toString(actualF));

    // this only returns 2 of the 4 ones
    Arrays.fill(actualF, 0.f);
    Operand<TFloat32> fillOp = tf.fill(
            tf.shape(tf.dtypes.cast(tf.constant(shape.asArray()), TFloat32.DTYPE)),
            tf.dtypes.cast(tf.constant(1.0), TFloat32.DTYPE));
    fillOp.asTensor().data().read(DataBuffers.of(actualF));
    System.out.print("tf.fill: ");
    fillOp.asTensor().data().scalars().forEach(s -> System.out.print(s.getFloat() + ", "));
    System.out.println();
    System.out.println("actual Array: " + Arrays.toString(actualF));

    // this works as expected:
    Arrays.fill(actualF, 0.f);
    Operand<TFloat32> fillOp1 = tf.fill(
            tf.shape(tf.dtypes.cast(tf.constant(new long[(int) shape.size()]), TFloat32.DTYPE)),
            tf.dtypes.cast(tf.constant(1.0), TFloat32.DTYPE));
    fillOp1.asTensor().data().read(DataBuffers.of(actualF));
    System.out.print("tf.fill: ");
    fillOp1.asTensor().data().scalars().forEach(s -> System.out.print(s.getFloat() + ", "));
    System.out.println();
    System.out.println("actual Array: " + Arrays.toString(actualF));
}

With output:

tf.zeros: 0.0, 0.0, 
expected Array:  [0.0, 0.0, 0.0, 0.0]
actual Array: [0.0, 0.0, 1.0, 1.0]

tf.fill: 1.0, 1.0, 
expected Array:  [1.0, 1.0, 1.0, 1.0]
actual Array: [1.0, 1.0, 0.0, 0.0]

tf.fill: 1.0, 1.0, 1.0, 1.0, 
expected Array:  [1.0, 1.0, 1.0, 1.0]
actual Array: [1.0, 1.0, 1.0, 1.0]

Other info / logs
NA

Add BatchNorm support

System information

  • TensorFlow version (you are using): 1.15 and 2.3
  • Are you willing to contribute it (Yes/No): Yes

Describe the feature and the current behavior/state.
The problem is described in the mailing thread.

I have the following problem: I am trying to reproduce one of the modern CNN architectures with the Java API. Most of them use BatchNormalization, a popular layer built on the tf.nn.batchNormalization() op.

I tried to use old ops like BatchNormWithGlobalNormalization, but got:

Exception in thread "main" org.tensorflow.exceptions.TFUnimplementedException: Op BatchNormWithGlobalNormalization is not available in GraphDef version 175. It has been removed in version 9. Use tf.nn.batch_normalization().
at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:99)

This op was deprecated years ago, but we still have it in the 1.15 and 2.x APIs.

The possible solutions:

  1. Implement all the required stuff in Java ops (a stopgap sketch follows this list)
  2. Add the gradients and other required pieces to TF core, if possible
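
In the meantime, a minimal sketch of composing batch normalization from existing raw ops (the class and method names are mine, for illustration only, and the caller is assumed to supply precomputed moments):

import org.tensorflow.Operand;
import org.tensorflow.op.Ops;
import org.tensorflow.types.family.TNumber;

public final class BatchNormSketch {

    // (x - mean) / sqrt(variance + epsilon) * gamma + beta, from raw ops only.
    public static <T extends TNumber> Operand<T> batchNorm(
            Ops tf, Operand<T> x, Operand<T> mean, Operand<T> variance,
            Operand<T> gamma, Operand<T> beta, Operand<T> epsilon) {
        Operand<T> normalized =
                tf.math.div(tf.math.sub(x, mean), tf.math.sqrt(tf.math.add(variance, epsilon)));
        return tf.math.add(tf.math.mul(normalized, gamma), beta);
    }
}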

TFLite support for TensorFlow Java

Does the TensorFlow Java API support the .tflite file format? I am not able to find any documentation, so any leads in this direction would be helpful.

Operand problem

The method conv2d(Operand, Operand, List, String, Conv2d.Options...) in the type NnOps is not applicable for the arguments (Constant, Tensor, List).
But how can I generate an Operand of the required type?

Snapshot libtensorflow-1.0.1-20170323.012702-1.jar cannot find packages

With your snapshot libtensorflow-1.0.1-20170323.012702-1.jar (downloaded from https://oss.sonatype.org/content/repositories/snapshots/org/tensorflow/libtensorflow/1.0.1-SNAPSHOT/), I get errors like:

(command:
javac -cp ./libtensorflow-1.0.1-20170323.012702-1.jar:. HelloTensorFlowSnapshot.java
)

HelloTensorFlowSnapshot.java:4: error: package org.tensorflow.exceptions does not exist
import org.tensorflow.exceptions.TensorFlowException;
                                ^
HelloTensorFlowSnapshot.java:9: error: cannot find symbol
import org.tensorflow.GraphOperation;
                     ^
  symbol:   class GraphOperation
  location: package org.tensorflow
HelloTensorFlowSnapshot.java:10: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.SignatureDef;
                                     ^
HelloTensorFlowSnapshot.java:14: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.MetaGraphDef;
                                     ^
HelloTensorFlowSnapshot.java:16: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.TensorInfo;
                                     ^
HelloTensorFlowSnapshot.java:17: error: package org.tensorflow.types does not exist
import org.tensorflow.types.TFloat32;
                           ^
HelloTensorFlowSnapshot.java:18: error: package org.tensorflow.tools does not exist
import org.tensorflow.tools.Shape;
                           ^
HelloTensorFlowSnapshot.java:20: error: package org.tensorflow.tools.buffer does not exist
import org.tensorflow.tools.buffer.DataBuffers;
                                  ^
HelloTensorFlowSnapshot.java:21: error: package org.tensorflow.tools.ndarray does not exist
import org.tensorflow.tools.ndarray.FloatNdArray;
                                   ^
HelloTensorFlowSnapshot.java:22: error: package org.tensorflow.tools.ndarray does not exist
import org.tensorflow.tools.ndarray.StdArrays;
                                   ^
HelloTensorFlowSnapshot.java:23: error: package org.tensorflow.proto.framework does not exist
import org.tensorflow.proto.framework.TensorInfo;

I am trying to run the working code from https://stackoverflow.com/questions/61228372/tensorflow-2-0-java-api.

Thanks in advance,
Milan

mvn install: Failed to execute goal org.bytedeco:javacpp:1.5.3:build

Thanks for making this Java API for TensorFlow! Would you kindly help me understand this error from mvn install:

[INFO] Reactor Summary for TensorFlow Java Parent 0.2.0-SNAPSHOT:
[INFO] 
[INFO] TensorFlow Java Parent ............................. SUCCESS [  1.774 s]
[INFO] TensorFlow NdArray Library ......................... SUCCESS [02:47 min]
[INFO] TensorFlow Core Parent ............................. SUCCESS [  0.187 s]
[INFO] TensorFlow Core Annotation Processor ............... SUCCESS [  3.157 s]
[INFO] TensorFlow Core API Library ........................ FAILURE [05:15 min]
[INFO] TensorFlow Core API Library Platform ............... SKIPPED
[INFO] TensorFlow Framework Library ....................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  08:09 min
[INFO] Finished at: 2020-08-01T04:56:29Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.bytedeco:javacpp:1.5.3:build (javacpp-build) on project tensorflow-core-api: Execution javacpp-build of goal org.bytedeco:javacpp:1.5.3:build failed: Process exited with an error: 132 -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.bytedeco:javacpp:1.5.3:build (javacpp-build) on project tensorflow-core-api: Execution javacpp-build of goal org.bytedeco:javacpp:1.5.3:build failed: Process exited with an error: 132
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: org.apache.maven.plugin.PluginExecutionException: Execution javacpp-build of goal org.bytedeco:javacpp:1.5.3:build failed: Process exited with an error: 132
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:148)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: java.lang.RuntimeException: Process exited with an error: 132
    at org.bytedeco.javacpp.tools.Builder.build (Builder.java:1026)
    at org.bytedeco.javacpp.tools.BuildMojo.execute (BuildMojo.java:411)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[ERROR] 
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]

My 'uname -a' is 'Linux 4.9.0-7-amd64 #1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13) x86_64 GNU/Linux'. I'm running in Docker, w/ Java 11. Happy to use Java 8 instead if that helps.

I run CPU-only. Old CPU. No AVX. CPU flags are (from /proc/cpuinfo):

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch epb kaiser tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms dtherm ida arat

I have a custom .whl file for my TensorFlow build from source, which excludes AVX instructions. However, tensorflow-java's 'mvn install' seems to build its own TensorFlow, and I'm not sure AVX is excluded. Moreover, my /var/log/messages shows an invalid opcode error during 'mvn install' (exit code 132 is 128 + signal 4, i.e. SIGILL, an illegal instruction), so I suspect AVX issues:

Aug  1 00:56:28 kernel: [6829864.802276] traps: java_op_generat[5990] trap invalid opcode ip:7fe3188f1a40 sp:7ffe8f916cb8 error:0
Aug  1 00:56:28 kernel: [6829864.802289]  in libtensorflow_framework.so.2.2.0[7fe317356000+1ab9000]

I'd prefer to tell tensorflow-java to use the TensorFlow without AVX that I built and installed, rather than the TensorFlow that its own 'mvn install' builds. Is this possible? Please let me know if this is a reasonable fix, or what I should do; I'm not sure how to interpret the error messages. Thanks for your time and insights!

The Java Tensorflow library does not seem to be using GPU

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): YES
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: NO
  • TensorFlow installed from (source or binary): from https://oss.sonatype.org/
  • TensorFlow version (use command below): 2.3.1
  • Python version: 3.7.7
  • Bazel version (if compiling from source): NO
  • GCC/Compiler version (if compiling from source): NO
  • CUDA/cuDNN version: 10.1
  • GPU model and memory: Tesla K80, compute capability 3.7 (but we also tested this on Tesla V100 7.0 compute capability)

Here is the result of the capture script:
tf_env.txt

Describe the current behavior
We tested the new TensorFlow Java API (not the legacy one), the brand new version released in October 2020. We tested it on several machines, including Azure Databricks NC6_v3 and Azure Virtual Machines (the capture log is from the virtual machine). I noticed that when no GPU is available, the library falls back to CPU, and this is fine. However, we also measured the time for some example processing (a few vector operations), and we see no significant difference between processing time on GPU and on CPU. It looks as if it is not using the GPU, even when one is present (we tried two graphics cards: a Tesla K80 with compute capability 3.7 and a Tesla V100 with compute capability 7.0). In both cases we do not see any difference in processing time.

Describe the expected behavior
The expected behaviour is much better execution times when the program is executed on a machine with a GPU present.

Code to reproduce the issue

We used the following Java program:

HelloTensorFlow_java.txt
pom_xml.txt

The source was compiled to a class file and run via the following command:
java -classpath protobuf-java-3.8.0.jar:ndarray-0.2.0.jar:javacpp-1.5.4.jar:javacpp-1.5.4-linux-x86_64.jar:tensorflow-core-api-0.2.0.jar:tensorflow-core-api-0.2.0-linux-x86_64-gpu.jar:tensorflow-core-platform-gpu-0.2.0.jar:. HelloTensorFlow

The listed libraries were downloaded from https://oss.sonatype.org/.

Other info / logs

The enclosed program issues the following log:
log.txt

From the log you can see that the GPU was present and recognized.
However, the execution time did not differ between runs with and without the GPU.

Not utilising AVX2 instructions after compilation from sources

System information

  • OS Platform and Distribution: Linux Ubuntu 20.04
  • TensorFlow installed from (source or binary): built via "mvn install"
  • TensorFlow version: 2.3 (using 0.2.0-SNAPSHOT)
  • Python version: 3.8.2
  • Bazel version (if compiling from source): 3.4.1
  • GCC/Compiler version (if compiling from source): 9.3.0

Problem:

I have been using TF 1.15 from the original Java TF repository

<dependency>
    <groupId>org.tensorflow</groupId>
    <artifactId>tensorflow</artifactId>
    <version>1.15.0</version>
</dependency>

which gave me this output:

I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2494460000 Hz
I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f77250299d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version

so I found this repo, set up an environment to build TF from source, and ran the mvn install command, which, I would assume, compiled TF for my specific platform. I am using these dependencies in my project:

<dependency>
    <groupId>org.tensorflow</groupId>
    <artifactId>tensorflow-core-api</artifactId>
    <version>0.2.0-SNAPSHOT</version>
</dependency>
<dependency>
    <groupId>org.tensorflow</groupId>
    <artifactId>tensorflow-core-api</artifactId>
    <version>0.2.0-SNAPSHOT</version>
    <classifier>linux-x86_64</classifier>
</dependency>

getting output:

Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Warning: Could not load PointerPointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]

Everything somehow runs, but throughput is about the same as the generic 1.15 version, and latency is about 2 times worse than the previous version using the same TF model with V1 behavior enabled. I am not sure how to enable the AVX2 FMA instructions when TF clearly finds them. I suppose it has something to do with the missing jnijavacpp library. Could anyone help me, please?

Thanks

Tensors created using TFloat16.tensorOf do not have correct output

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.2.0
  • Python version: 3.6
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

Describe the current behavior
Tensors created using TFloat16.tensorOf do not have correct values. In the example below, the float array has the values [0, 1], but TFloat16 yields [0, 0], whereas TFloat32 gives the correct values [0, 1].
Describe the expected behavior

Code to reproduce the issue
import org.tensorflow.Tensor;
import org.tensorflow.ndarray.StdArrays;
import org.tensorflow.types.TFloat16;
import org.tensorflow.types.TFloat32;

public class Test {

    public static void main(String[] args) {
        float[][] f1 = {{0}, {1}};

        System.out.println(StdArrays.ndCopyOf(f1).getFloat(0, 0));
        System.out.println(StdArrays.ndCopyOf(f1).getFloat(1, 0));

        System.out.println("FLOAT16");
        Tensor<TFloat16> tf_float1 = TFloat16.tensorOf(StdArrays.ndCopyOf(f1));
        System.out.println(tf_float1.data().getFloat(0, 0));
        System.out.println(tf_float1.data().getFloat(1, 0));

        System.out.println("FLOAT32");
        Tensor<TFloat32> tf_float2 = TFloat32.tensorOf(StdArrays.ndCopyOf(f1));
        System.out.println(tf_float2.data().getFloat(0, 0));
        System.out.println(tf_float2.data().getFloat(1, 0));
    }
}

OUTPUT
0.0
1.0
FLOAT16
Warning: Could not load Loader: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load Pointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
Warning: Could not load BytePointer: java.lang.UnsatisfiedLinkError: no jnijavacpp in java.library.path
0.0
0.0
FLOAT32
0.0
1.0
Other info / logs

Failed to fetch latest snapshot for tensorflow-core-api linux gpu mkl

Hi, DJL's TensorFlow engine depends on tensorflow-core-api's SNAPSHOT package. Our dependencies are here: https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-native-auto/build.gradle#L19
We found out there was an update to the tensorflow-core-api SNAPSHOT on 04/28, but the corresponding linux-gpu-mkl.jar is missing; same for Windows.

Did the upload fail?
https://oss.sonatype.org/#nexus-search;quick~tensorflow-core-api

We get a 404 when trying to download the jar; both the Gradle build and manually trying the following link fail:
https://oss.sonatype.org/service/local/artifact/maven/redirect?r=snapshots&g=org.tensorflow&a=tensorflow-core-api&v=0.1.0-SNAPSHOT&e=jar&c=windows-x86_64

The mac-os-mkl.jar is there, but the libjnimkldnn.dylib and libiomp5.dylib extra libraries are missing. Is this intended? How can I find them? We rely on this task to download native dependencies automatically for users based on their platform.
Please help take a look, thank you so much!


Missing Lint checks in new build for TensorFlow

When we migrated to tensorflow/java and to Maven, we dropped a few rules that were embedded in the Bazel configuration of the previous build. One of them is all the lint checks that are listed in [this file](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/java/build_defs.bzl).

This issue is a placeholder to remember that those should probably be added back to the new Maven build.

Set log level

System information

  • TensorFlow version (use command below): 1.15

Describe the current behavior

When using the current Java bindings for TensorFlow, the log gets filled with output from every operation when a model is evaluated. I was unable to find a way to set the log level via the Java API. Is this at all possible? If not, could you please consider adding this functionality in future versions?

Getting Started Tutorial

Describe the documentation issue

Before the Java Keras development was moved to this repo, it had a great getting started guide. It would be great for that to be easily discoverable in this repo.

(A getting started guide for importing models from Python would be great too. I have a feeling that's a pretty common use case for using TensorFlow in Java.)

Framework Ops vs Raw Ops

In Python TensorFlow, there are some ops defined in the Python layer and some defined in the C API layer. I have been tasked to see how Java TensorFlow might want to handle this.

I have run some experiments with creating a FrameworkOperatorProcessor class in tensorflow-core-generator, and a couple of architectures present themselves. This class is basically a copy of OperatorProcessor with some tweaks.

The approaches seem to dictate generating a new class in tensorflow-framework, which I have named FOps for now.

  1. The first approach is to have FOps subclass org.tensorflow.op.Ops, which is generated in tensorflow-core-api. However, this leads to potential problems with name clashes with the methods and groups already in Ops. A prime example is the NN classes we added for Nn and NnRaw (SoftmaxCrossEntropyWithLogits<T> softmaxCrossEntropyWithLogits() has the same signature in both generated classes). This option requires changing Ops from a final class to non-final so that it can be inherited.

  2. A second approach is to use the delegate pattern and have FOps hold an internal reference to Ops, so you can call methods on each as required (see the sketch after this list). For example:

FOps ftf = FOps.create(graph);
ftf.math.tensordot(); // framework op
ftf.getOps().math.mul();  // raw op
  3. Keep both totally separate from each other. This may potentially allow reuse of the existing OperatorProcessor, but may be more cumbersome for the programmer.

  4. Another option that I haven't thought of yet.
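
Here is a hypothetical sketch of option 2, the delegate pattern (FOps and its math group are assumptions; only org.tensorflow.op.Ops is real):

import org.tensorflow.Graph;
import org.tensorflow.op.Ops;

public final class FOps {

    // Placeholder for a generated group of framework-level math ops (e.g. tensordot).
    public static final class FMathOps {
        private final Ops ops;
        FMathOps(Ops ops) { this.ops = ops; }
    }

    public final FMathOps math;
    private final Ops ops; // delegate holding the generated raw ops

    private FOps(Ops ops) {
        this.ops = ops;
        this.math = new FMathOps(ops);
    }

    public static FOps create(Graph graph) {
        return new FOps(Ops.create(graph));
    }

    // Escape hatch to the raw generated ops: ftf.getOps().math.mul(...)
    public Ops getOps() {
        return ops;
    }
}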

I welcome thoughts on this.

How to create a custom Index that is a function of multiple coordinates/dimensions

I'd like to know if a new feature can be added so that an Index knows about not only its own dimension, but also the other dimensions.

I already shared this with @karllessard, but I'd like to see if we can do something about it.
Let me explain: I have a float[rows*columns] table with the actual data and an int[rows*columns] table with permutations to sort each column of the data table separately.

Data table:

data = [ 1.34 0.87 2.45 ]
       [ 0.45 1.56 1.66 ]
       [ 1.02 0.98 0.34 ]

permutations = [ 2 0 2 ]
               [ 0 2 1 ]
               [ 1 1 0 ]

This is a way to have both a sorted and an unsorted version of the data table by using permutations in the indexing when the sorted version is needed.
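
To make the scheme concrete, a minimal sketch of the plain-array lookup (assuming row-major float[rows*columns] layout and that permutations[r*cols + c] stores the source row of the r-th element of column c in sorted order):

public final class PermutedView {

    // Element (r, c) of the per-column-sorted view, read through the permutation table.
    public static float sortedAt(float[] data, int[] permutations, int cols, int r, int c) {
        return data[permutations[r * cols + c] * cols + c];
    }
}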

I was thinking of putting the float table in a FloatDataBuffer, then wrapping it in a FloatNdArray and creating an Index that uses the permutations table. But the fact that we need a separate Index instance for each dimension makes things complicated (impossible?). Is the only option now to implement my own FloatDataBuffer/FloatNdArray?

mvn install -Pdev -Djavacpp.platform.extension=-gpu -e --> FAILURE! - in org.tensorflow.framework.optimizers.AdaGradDATest

Many thanks for TF-Java! mvn install -Pdev -Djavacpp.platform.extension=-gpu -e on the master branch appears to fail a test, so I thought I'd share it in case anyone else encounters this:

# mvn install -Pdev -Djavacpp.platform.extension=-gpu -e
...
Downloading from ossrh-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/tensorflow/tensorflow-core-api/0.3.0-SNAPSHOT/tensorflow-core-api-0.3.0-20201008.134402-33-linux-x86_64-gpu.jar
...

tensorflow framework build error
[INFO] --- maven-surefire-plugin:2.22.2:test (default-test) @ tensorflow-framework ---
[INFO] Surefire report directory: /tmp/docker-share/tensorflow-java/tensorflow-framework/target/surefire-reports
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.tensorflow.framework.optimizers.AdaDeltaTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.573 s - in org.tensorflow.framework.optimizers.AdaDeltaTest
[INFO] Running org.tensorflow.framework.optimizers.NadamTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.113 s - in org.tensorflow.framework.optimizers.NadamTest
[INFO] Running org.tensorflow.framework.optimizers.AdamTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.134 s - in org.tensorflow.framework.optimizers.AdamTest
[INFO] Running org.tensorflow.framework.optimizers.AdaGradTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.083 s - in org.tensorflow.framework.optimizers.AdaGradTest
[INFO] Running org.tensorflow.framework.optimizers.RMSPropTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.367 s - in org.tensorflow.framework.optimizers.RMSPropTest
[INFO] Running org.tensorflow.framework.optimizers.AdamaxTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.118 s - in org.tensorflow.framework.optimizers.AdamaxTest
[INFO] Running org.tensorflow.framework.optimizers.FtrlTest
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.273 s - in org.tensorflow.framework.optimizers.FtrlTest
[INFO] Running org.tensorflow.framework.optimizers.MomentumTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.206 s - in org.tensorflow.framework.optimizers.MomentumTest
[INFO] Running org.tensorflow.framework.optimizers.OptimizersTest
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.054 s - in org.tensorflow.framework.optimizers.OptimizersTest
[INFO] Running org.tensorflow.framework.optimizers.GradientDescentTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.14 s - in org.tensorflow.framework.optimizers.GradientDescentTest
[INFO] Running org.tensorflow.framework.optimizers.AdaGradDATest
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.055 s <<< FAILURE! - in org.tensorflow.framework.optimizers.AdaGradDATest
[ERROR] testBasic  Time elapsed: 2.044 s  <<< ERROR!
org.tensorflow.exceptions.TFInvalidArgumentException: 
Cannot assign a device for operation adagrad-da_1: Could not satisfy explicit device specification '' because the node {{colocation_node adagrad-da_1}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0, /job:localhost/replica:0/task:0/device:GPU:1, /job:localhost/replica:0/task:0/device:GPU:2, /job:localhost/replica:0/task:0/device:GPU:3]. 
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
ApplyAdagradDA: CPU 
VariableV2: GPU CPU 
Assign: GPU CPU 
Colocation members, user-requested devices, and framework assigned devices, if any:
  var0 (VariableV2)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  adagrad-da_1 (Assign) 
  var0-gradient_accumulator (VariableV2) 
  adagrad-da_10 (Assign) 
  var0-gradient_squared_accumulator (VariableV2) 
  adagrad-da_15 (Assign) 
  adagrad-da_36 (ApplyAdagradDA) 

         [[{{node adagrad-da_1}}]]
        at org.tensorflow.framework.optimizers.AdaGradDATest.testBasic(AdaGradDATest.java:90)

[INFO] Running org.tensorflow.framework.data.SkipDatasetTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.029 s - in org.tensorflow.framework.data.SkipDatasetTest
[INFO] Running org.tensorflow.framework.data.BatchDatasetTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.021 s - in org.tensorflow.framework.data.BatchDatasetTest
...
[INFO] Running org.tensorflow.framework.initializers.OrthogonalTest
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.333 s - in org.tensorflow.framework.initializers.OrthogonalTest
[INFO] Running org.tensorflow.framework.initializers.HeTest
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.451 s - in org.tensorflow.framework.initializers.HeTest
[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR]   AdaGradDATest.testBasic:90 ? TFInvalidArgument Cannot assign a device for oper...
[INFO] 
[ERROR] Tests run: 112, Failures: 0, Errors: 1, Skipped: 0
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for TensorFlow Java Parent 0.3.0-SNAPSHOT:
[INFO] 
[INFO] TensorFlow Java Parent ............................. SUCCESS [  3.164 s]
[INFO] TensorFlow NdArray Library ......................... SUCCESS [ 30.072 s]
[INFO] TensorFlow Core Parent ............................. SUCCESS [  0.006 s]
[INFO] TensorFlow Core Annotation Processor ............... SUCCESS [  0.190 s]
[INFO] TensorFlow Core API Library ........................ SUCCESS [ 34.833 s]
[INFO] TensorFlow Core API Library Platform GPU ........... SUCCESS [  0.021 s]
[INFO] TensorFlow Framework Library ....................... FAILURE [01:29 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  02:37 min
[INFO] Finished at: 2020-10-14T01:44:35Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on project tensorflow-framework: There are test failures.
[ERROR] 
[ERROR] Please refer to /tmp/docker-share/tensorflow-java/tensorflow-framework/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on project tensorflow-framework: There are test failures.

Please refer to /tmp/docker-share/tensorflow-java/tensorflow-framework/target/surefire-reports for the individual test results.
Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
...

This new quad-GPU machine presents new challenges, compared to our previous small test systems in #100 etc. Really appreciate the fast -Pdev option for building from artifacts, thanks!
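
One way to sidestep this class of colocation failure, since ApplyAdagradDA only has a CPU kernel here, is to allow soft placement so the runtime can fall back to CPU for ops whose constraints cannot be satisfied on GPU. A minimal hedged sketch, untested on this machine and shown only to illustrate the option (not the framework's fix):

import org.tensorflow.Graph;
import org.tensorflow.Session;
import org.tensorflow.proto.framework.ConfigProto;

public class SoftPlacementSketch {
  public static void main(String[] args) {
    try (Graph graph = new Graph()) {
      // Let the runtime place ops without a GPU kernel (e.g. ApplyAdagradDA)
      // on CPU instead of failing the colocation constraint.
      ConfigProto config = ConfigProto.newBuilder()
          .setAllowSoftPlacement(true)
          .build();
      try (Session session = new Session(graph, config)) {
        // ... build and run the AdaGrad-DA training ops as in the failing test
      }
    }
  }
}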

Current default build is MKL

This is painful for machines that don't have MKL installed. The MKL version we bind against is the one using Intel's OpenMP, which is even less likely to be available on a machine (plus it's MKL-ML, not regular MKL nor MKL-DNN, now DNNL).

We should make the build system a little more configurable, or at least set the default build to be something that works everywhere (even if the performance is worse).

I keep running into this when testing things on different machines here, none of which have MKL, and I have to get approvals for new inbound libraries.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux
  • TensorFlow installed from (source or binary): source
  • TensorFlow version: 2.0

SavedModelBundle exporter(...) build: mvn install [ERROR] tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[129,55] cannot find symbol symbol: variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor location: class org.tensorflow.proto.framework.StructProtos

This is a follow-up to #97 (comment) and #97 (comment). I'm trying to build tensorflow-java with the SavedModelBundle exporter(...) but encounter a build error.

Specifically, I checked out the save_model branch of https://github.com/karllessard/tensorflow-java and ran mvn install, but encountered a build error. I would raise an issue in https://github.com/karllessard/tensorflow-java, but https://github.com/karllessard/tensorflow-java/issues redirects to https://github.com/karllessard/tensorflow-java/pulls, so I cannot raise an issue there.

Would a colleague kindly advise how to fix the following build error:

root@478b80e86d3b:tensorflow-java-karl# mvn install -e
...
[INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ tensorflow-core-api ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 1678 source files to /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/target/classes
[INFO] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Tensor.java: Some input files use unchecked or unsafe operations.
[INFO] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Tensor.java: Recompile with -Xlint:unchecked for details.
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR : 
[INFO] -------------------------------------------------------------
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[129,55] cannot find symbol
  symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
  location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[135,55] cannot find symbol
  symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
  location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[479,57] cannot find symbol
  symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
  location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[485,57] cannot find symbol
  symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
  location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[536,57] cannot find symbol
  symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
  location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[82,65] cannot find symbol
  symbol:   variable internal_static_tensorflow_SaveableObject_descriptor
  location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[88,65] cannot find symbol
  symbol:   variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
  location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[290,67] cannot find symbol
  symbol:   variable internal_static_tensorflow_SaveableObject_descriptor
  location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[296,67] cannot find symbol
  symbol:   variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
  location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[329,67] cannot find symbol
  symbol:   variable internal_static_tensorflow_SaveableObject_descriptor
  location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[INFO] 10 errors 
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for TensorFlow Java Parent 0.1.0-SNAPSHOT:
[INFO] 
[INFO] TensorFlow Java Parent ............................. SUCCESS [  2.152 s]
[INFO] TensorFlow Tools Library ........................... SUCCESS [02:29 min]
[INFO] TensorFlow Core Parent ............................. SUCCESS [  0.136 s]
[INFO] TensorFlow Core Annotation Processor ............... SUCCESS [  4.371 s]
[INFO] TensorFlow Core API Library ........................ FAILURE [08:09 min]
[INFO] TensorFlow Core API Library Platform ............... SKIPPED
[INFO] TensorFlow Framework Library ....................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  10:47 min
[INFO] Finished at: 2020-08-14T17:41:27Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project tensorflow-core-api: Compilation failure: Compilation failure: 
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[129,55] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
[ERROR]   location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[135,55] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
[ERROR]   location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[479,57] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
[ERROR]   location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[485,57] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_fieldAccessorTable
[ERROR]   location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/BoundedTensorSpecProto.java:[536,57] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_BoundedTensorSpecProto_descriptor
[ERROR]   location: class org.tensorflow.proto.framework.StructProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[82,65] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_SaveableObject_descriptor
[ERROR]   location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[88,65] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
[ERROR]   location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[290,67] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_SaveableObject_descriptor
[ERROR]   location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[296,67] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_SaveableObject_fieldAccessorTable
[ERROR]   location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] /tmp/docker-share/tensorflow-java-karl/tensorflow-core/tensorflow-core-api/src/gen/java/org/tensorflow/proto/framework/SaveableObject.java:[329,67] cannot find symbol
[ERROR]   symbol:   variable internal_static_tensorflow_SaveableObject_descriptor
[ERROR]   location: class org.tensorflow.proto.framework.SavedObjectGraphProtos
[ERROR] -> [Help 1]

Rather than debug https://github.com/karllessard/tensorflow-java, I would be happy if there were a save_model branch in https://github.com/tensorflow/java. However, I'm just trying to build something to test model save/export and load -- it does not matter to me how this is achieved.

how to implement DataType families?

Currently in tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/DataType.java, we use an uncomfortable pattern: DataType is omniscient about TType and dispatches on a string TType.NAME. For example:

  /** Returns true if this data type represents an integer type */
  public boolean isInteger() {
    switch (this.name()) {
      case TInt32.NAME:
      case TInt64.NAME:
      case TUint8.NAME:
        return true;
      default:
        return false;
    }
  }

The author of this pattern, @JimClarke5, mentioned in our Google Group that he regarded it as temporary:

My present code does a switch on DataType.name(), but IMO, this isn't the most elegant way to do this.

@karllessard suggested a direction, although with some open questions:

Each data type in Java inherits from a "type family", as in here, which can be used to set bounds on a given data type when used as a generic parameter (e.g. Tensor<? extends TNumber> to only accept tensors that are numeric). But if that doesn't work in your case and you really want to check the data type family at runtime, then we need to add new methods, like dataType.isNumber(). I think ideally it should be in line with the same data type classes defined in the core library; the new methods could even be added to the C API, in this file.

Let's decide on a direction! This is moderately pervasive in our code, but also a pretty simple change, so I'd advocate we choose a direction soon and I'm tempted to volunteer to make the change.
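
To make the discussion concrete, here is one hypothetical shape the "type family" direction could take. This is a sketch only, not the actual tensorflow-java API: the DataType carries its tensor type's Class token, so family checks become assignability tests instead of string switches.

// Hypothetical sketch; the names mirror the real API but this is not it.
interface TType {}
interface TNumber extends TType {}
final class TInt32 implements TNumber {}
final class TString implements TType {}

final class DataType<T extends TType> {
  private final String name;
  private final Class<T> typeClass;

  DataType(String name, Class<T> typeClass) {
    this.name = name;
    this.typeClass = typeClass;
  }

  /** True if this data type belongs to the given family, e.g. TNumber.class. */
  boolean isSubtypeOf(Class<? extends TType> family) {
    return family.isAssignableFrom(typeClass);
  }
}

class FamilyDemo {
  public static void main(String[] args) {
    DataType<TInt32> int32 = new DataType<>("INT32", TInt32.class);
    DataType<TString> string = new DataType<>("STRING", TString.class);
    System.out.println(int32.isSubtypeOf(TNumber.class));  // true
    System.out.println(string.isSubtypeOf(TNumber.class)); // false
  }
}

With something along these lines, the isInteger()-style switches collapse to single assignability checks, while the compile-time story stays Tensor<? extends TNumber>.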

Cannot Get Serving Default Signature

SignatureDef sig = model.metaGraphDef().getSignatureDefMap().get("serving_default"); does not compile (it did a week ago).

It has the error Error: java: cannot access com.google.protobuf.MessageOrBuilder class file for com.google.protobuf.MessageOrBuilder not found

This error showed up in the past week using the TF 2.0 (0.1.0-SNAPSHOT)

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: no
  • TensorFlow installed from (source or binary): maven
  • TensorFlow version (use command below): 2.x June 21 Snapshot
  • Python version:
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

Describe the current behavior
Does not compile

Describe the expected behavior
Compiles

Code to reproduce the issue

SavedModelBundle model = SavedModelBundle.load(pathToBundle, "serve");
SignatureDef sig = model.metaGraphDef().getSignatureDefMap().get("serving_default");

Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

I'm using this code to get the input and output names of my model. See this Stack Overflow answer and this Stack Overflow answer

Graph importGraphDef doesn't match documentation

In the TF Java documentation it looks as if you can import a graph definition from a byte array. In their examples they also read from a byte array.

The code here, however, requires a GraphDef.

What's the best way to overcome this? Or am I not understanding what a GraphDef is?

Thanks!

System information

  • TensorFlow version: 0.1.0-SNAPSHOT (2.2.0)
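
For what it's worth, the two can be bridged: a GraphDef is the serialized protobuf message describing the graph, so a byte[] (e.g. read from a frozen-graph file) can be parsed into one before importing. A hedged sketch; the file name is a placeholder:

import java.nio.file.Files;
import java.nio.file.Paths;
import org.tensorflow.Graph;
import org.tensorflow.proto.framework.GraphDef;

public class ImportGraphDefSketch {
  public static void main(String[] args) throws Exception {
    // Raw bytes, e.g. a graph serialized by an older TF version.
    byte[] bytes = Files.readAllBytes(Paths.get("frozen_graph.pb"));
    // Deserialize the bytes into the GraphDef protobuf message.
    GraphDef graphDef = GraphDef.parseFrom(bytes);
    try (Graph graph = new Graph()) {
      graph.importGraphDef(graphDef);
      System.out.println("Imported " + graphDef.getNodeCount() + " nodes");
    }
  }
}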

Windows build fails on GitHub Actions

This is not a new topic but I want to start a thread on it so we can get closer to a complete solution.

We always had trouble building our Windows platforms with extensions mkl, gpu or mkl-gpu on GitHub Actions because the operation takes too long (i.e. beyond the 6 hours limit).

Adding this new option to the Visual Studio compiler drastically reduces the compilation time of the MKL functions, as I've tested locally. Still, none of the platforms with GPU support complete in time, as you can see in this workflow.

Strangely, we can observe that preparing the environment for these builds took 20 minutes, i.e. twice the time of the non-GPU builds. But we install the same software regardless of whether we are building for GPU or not. Can we investigate the cause of that delay?

Also, I have been told by SIG Build that disabling Eigen inlining helps reduce the compilation time even more, but this time at the price of some performance loss. Still, should we give it a try?

Exception in thread "main" java.lang.IncompatibleClassChangeError: Method 'org.tensorflow.Tensor org.tensorflow.types.TInt32.vectorOf(int[])' must be InterfaceMethodref constant

Thanks for helping me build a working .jar file containing TensorflowJava per #96!

Would a colleague kindly help me with the following runtime error, when I run my tests within a Docker container:

root@4f6083770318:tensorflow-java# java -jar my.jar tf --verbose
DEBUG 1596940815346: Started ndarray TensorflowJavaTest.
DEBUG 1596940815727: matrix3d rank 3
DEBUG 1596940815727: Finished ndarray TensorflowJavaTest.
DEBUG 1596940815728: Started graph TensorflowJavaTest.
DEBUG 1596940819085: fetch_test fetched.data.getInt(0)=3 is 3? true
DEBUG 1596940819087: fetch_test fetched.data.getInt(1)=4 is 4? true
Exception in thread "main" java.lang.IncompatibleClassChangeError: Method 'org.tensorflow.Tensor org.tensorflow.types.TInt32.vectorOf(int[])' must be InterfaceMethodref constant
        at vis.TensorflowJavaTest.feed_test(tf.scala:80)
        at vis.TensorflowJavaTest.run(tf.scala:106)
        at vis.TensorflowJavaTest$.runArgs(tf.scala:58)
        at vis.VIS$.main(vis.scala:171)
        at vis.VIS.main(vis.scala)

My tests are based on runUsingColonSeparatedNames() in SessionTest.java, and I re-wrote some in Scala.

The test body is:

      debug("Started ndarray TensorflowJavaTest.")
      // run simple data buffers test as in https://github.com/tensorflow/java/tree/master/ndarray
      val matrix3d = org.tensorflow.ndarray.NdArrays.ofInts( org.tensorflow.ndarray.Shape.of(2, 3, 2) )
      debug("matrix3d rank " + matrix3d.rank)
      debug("Finished ndarray TensorflowJavaTest.")

      debug("Started graph TensorflowJavaTest.")
      // run simple graph and session tests per https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/test/java/org/tensorflow/SessionTest.java
      val graph = new org.tensorflow.Graph() // defined in https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Graph.java
      val session = new org.tensorflow.Session(graph)
      val tf = org.tensorflow.op.Ops.create(graph)

      fetch_test(session, tf)
      feed_test(session, tf)

      debug("Finished graph TensorflowJavaTest.")

The fetch test is:

    private def fetch_test(session: org.tensorflow.Session, tf: org.tensorflow.op.Ops): Unit = {
      val split = tf.split( tf.constant(0), tf.array(1, 2, 3, 4), 2l)
      tf.math.add(split.output().get(0), split.output().get(1))
      // Fetch using colon separated names.
      val fetched = session.runner().fetch("Split:1").run().get(0).expect(org.tensorflow.types.TInt32.DTYPE)
      val fetch0 = fetched.data().getInt(0)
      debug("fetch_test fetched.data.getInt(0)=%d is 3? %b".format(fetch0, fetch0 == 3)) // expected to be 3
      val fetch1 = fetched.data().getInt(1)
      debug("fetch_test fetched.data.getInt(1)=%d is 4? %b".format(fetch1, fetch1 == 4)) // expected to be 4
    }

The feed test is:

      import scala.collection.JavaConversions._ // convert java iterator to scala iterator with foreach defined.  Allows TensorflowJava's NdArraySequence to be treated as a Java Iterable, to be treated as a Scala Iterable, so mkString() will work 
...
    private def feed_test(session: org.tensorflow.Session, tf: org.tensorflow.op.Ops): Unit = {
      // Feed using colon separated names.
      val fed = org.tensorflow.types.TInt32.vectorOf(4, 3, 2, 1) // <-- line 80, throws IncompatibleClassChangeError at runtime
      val fetched = session.runner()
                           .feed("Split:0", fed)
                           .feed("Split:1", fed)
                           .fetch("Add")
                           .run()
                           .get(0)
                           .expect(org.tensorflow.types.TInt32.DTYPE)
      val data = fetched.data()
      debug("feed_test fetched.data=%s".format(data.scalars.mkString(",")))
    }

Kindly advise what is wrong with the line val fed = org.tensorflow.types.TInt32.vectorOf(4, 3, 2, 1), which throws the IncompatibleClassChangeError at runtime?

Cannot run model using tensorflow-text ops

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Debian Buster
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: No
  • TensorFlow installed from (source or binary): Maven snapshots
  • TensorFlow version (use command below): 0.1.0-SNAPSHOT (tensorflow_text 2.2.1)
  • Python version: 3.7 and 3.6
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

Describe the current behavior
It crashes with the following output:

2020-07-01 20:14:58.469557: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: tokenizer
2020-07-01 20:14:58.474996: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2020-07-01 20:14:58.475083: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:295] Reading SavedModel debug info (if present) from: tokenizer
2020-07-01 20:14:58.475670: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-01 20:14:58.478426: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-07-01 20:14:58.478482: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-07-01 20:14:58.478517: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (1470715d11a2): /proc/driver/nvidia/version does not exist
2020-07-01 20:14:58.478585: I external/org_tensorflow/tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2020-07-01 20:14:58.491011: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:234] Restoring SavedModel bundle.
2020-07-01 20:14:58.522974: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:183] Running initialization op on SavedModel bundle at path: tokenizer
2020-07-01 20:14:58.537814: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:364] SavedModel load for tags { serve }; Status: success: OK. Took 68286 microseconds.
2020-07-01 20:14:59.321091: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at wordpiece_kernel.cc:204 : Invalid argument: Trying to access resource using the wrong type. Expected N10tensorflow6lookup15LookupInterfaceE got N10tensorflow6lookup15LookupInterfaceE
[WARNING]
org.tensorflow.exceptions.TFInvalidArgumentException: Trying to access resource using the wrong type. Expected N10tensorflow6lookup15LookupInterfaceE got N10tensorflow6lookup15LookupInterfaceE
         [[{{node WordpieceTokenizeWithOffsets/WordpieceTokenizeWithOffsets/WordpieceTokenizeWithOffsets}}]]
    at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK (AbstractTF_Status.java:87)
    at org.tensorflow.Session.run (Session.java:595)
    at org.tensorflow.Session.access$100 (Session.java:70)
    at org.tensorflow.Session$Runner.runHelper (Session.java:335)
    at org.tensorflow.Session$Runner.run (Session.java:285)
    at Main.main (Main.java:25)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:254)
    at java.lang.Thread.run (Thread.java:834)

Describe the expected behavior
Successful execution of the model

Code to reproduce the issue

import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;
import org.tensorflow.tools.ndarray.NdArrays;
import org.tensorflow.types.TString;

import java.nio.charset.StandardCharsets;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        TensorFlow.version();

        String libDir = "/usr/local/lib/python3.7/dist-packages/tensorflow_text/python/ops/";
        TensorFlow.loadLibrary(libDir + "_wordpiece_tokenizer.so");
        TensorFlow.loadLibrary(libDir + "_normalize_ops.so");
        TensorFlow.loadLibrary(libDir + "_regex_split_ops.so");

        SavedModelBundle savedModelBundle = SavedModelBundle.load("tokenizer", "serve");
        Session.Runner runner = savedModelBundle.session().runner();

        runner.feed("serving_default_text:0", TString.tensorOfBytes(NdArrays.vectorOfObjects("a b c d e".getBytes(StandardCharsets.UTF_8))));
        runner.fetch("StatefulPartitionedCall_1:0");
        List<Tensor<?>> outputs = runner.run();
        System.out.println(outputs);
    }
}

The following script can be used to generate a minimal saved model triggering the problem:

from tensorflow.python.framework import dtypes
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import lookup_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import string_ops
from tensorflow_text.python.ops import bert_tokenizer
import tensorflow as tf

vocab = [
   b'a', b'b', b'c', b'd'
]

def _create_table(vocab, num_oov=1):
  init = lookup_ops.KeyValueTensorInitializer(
      vocab,
      math_ops.range(
          array_ops.size(vocab, out_type=dtypes.int64), dtype=dtypes.int64),
      key_dtype=dtypes.string,
      value_dtype=dtypes.int64)
  return lookup_ops.StaticVocabularyTableV1(
      init, num_oov, lookup_key_dtype=dtypes.string)


table = _create_table(vocab)

class Module(tf.Module):
    def __init__(self, table):
        self.table = table
        self.tokenizer = bert_tokenizer.BertTokenizer(
                             self.table,
                             token_out_type=dtypes.string,
                             lower_case=True,
                             preserve_unused_token=False)

    @tf.function(input_signature=[tf.TensorSpec(1, dtype=tf.dtypes.string)])
    def serve(self, text):
        return self.tokenizer.tokenize(text)

module = Module(table)
tf.saved_model.save(module, 'tokenizer')

model = tf.saved_model.load('tokenizer')
print(model.serve(['a a b c d e']))

Other info / logs
There is an issue in tensorflow-text (tensorflow/text#272) where the same thing happens on macos (while this is on linux). This model works on linux using python to load and execute the model however, so the root cause is most likely different. Looking at the fix for the macos issue (tensorflow/tensorflow@1823f87#diff-991a6b786e16708ba1e6f5c9926cf151) makes me suspect that this may be caused by type ids being generated differently due to tensorflow-java building native tensorflow libs separately in a slightly different way than the python libraries.

Is it possible to build the libtensorflow first?

System information

  • TensorFlow version (you are using): tf2.0+
  • Are you willing to contribute it (Yes/No): yes.. but my code is bad

Describe the feature and the current behavior/state.

It's so sad; I have been told that "There is no libtensorflow support for TensorFlow 2 yet":
https://github.com/tensorflow/tensorflow/issues/36950#issuecomment-592238340

TF2 changes a lot; we cannot use new models built on TF2 because no new version of libtensorflow is available.

Will this change the current api? How?

Building the whole of TF2 for Java is a huge project; is it possible to build libtensorflow first?

Who will benefit with this feature?

All developers using the TF Java API.

Saving and loading models in Java with a functional API

Confirmation: let's model the API with functions while still loading/saving session-centric graphs.

I take it that we go with the current branch (after addressing the open comments) and add unit tests.

For unit tests: Here is a proposal

  • Check in a SavedModel (created using Python) as a resource
  • [optionally] Add the Python model + instructions (but do not have the test run Python code); a rough test sketch follows below

Originally posted by @Shajan in #89 (comment)
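
A rough illustration of what such a test could look like. This is a sketch under the proposal above; the resource path, tag, and use of JUnit 5 are placeholder assumptions rather than an agreed layout:

import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.junit.jupiter.api.Test;
import org.tensorflow.SavedModelBundle;

public class SavedModelLoadTest {
  @Test
  public void loadsCheckedInSavedModel() {
    // SavedModel created with Python and checked in under src/test/resources.
    String path = getClass().getResource("/saved_model").getPath();
    try (SavedModelBundle bundle = SavedModelBundle.load(path, "serve")) {
      assertNotNull(bundle.session());
      assertNotNull(bundle.metaGraphDef());
    }
  }
}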

[Request] Migration Guide/Breaking changes between major versions

Is it possible to provide a migration guide from TF1 to TF2 for Java, or at least a comprehensive list of breaking changes?
For Python, there is an upgrade script and a compatibility API. I know that would be too much, but a guide would be great to have.

Can't use USE-LITE from tensorflow hub in Java

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template

I'm trying to use the USE Lite model from TensorFlow Hub in Java. I can download and load the model into Java just fine, but the problem arises when I try to run inference: I get errors saying that there are no operations with the names of the input variables when I feed them into my graph, despite specifying the input names when downloading and saving the model (in Python).

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac OS 10.13.6
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary):
  • TensorFlow version (use command below): 2.2.0
  • Python version: 3.7.4
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:
  • Java version: 11.0.1
  • org tensorflow installation: I installed using Maven (specifically I have the version 1.12.0)

You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the current behavior
When I try to run inference with the model in Java, it gives me the error java.lang.IllegalArgumentException: No Operation named [values] in the Graph

Describe the expected behavior
The model should be able to run inference on the sample inputs and produce a single output

Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
Attached are two files: The first is a Python script I used to download and save the Tensorflow model in an appropriate format. The second is a java file I used to actually load the model. Note that the path in the java script would need to be changed to where the USE lite model is based on the python script
HelloTensorFlow.java.zip
download_tensorflow_model.py.zip

Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
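
When this error comes up, the operation names baked into the saved graph usually differ from the Python-side variable names. A hedged debugging sketch against the 1.x API the reporter is using (the model path is a placeholder) that prints every operation so the real feed/fetch names can be found:

import java.util.Iterator;
import org.tensorflow.Graph;
import org.tensorflow.Operation;
import org.tensorflow.SavedModelBundle;

public class ListGraphOps {
  public static void main(String[] args) {
    try (SavedModelBundle bundle = SavedModelBundle.load("/path/to/use_lite", "serve")) {
      Graph graph = bundle.graph();
      Iterator<Operation> ops = graph.operations();
      while (ops.hasNext()) {
        Operation op = ops.next();
        // Feed/fetch names take the form "<name>:<output index>", e.g. "values:0".
        System.out.println(op.name() + " (" + op.type() + ")");
      }
    }
  }
}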

TF 2.x Java Binding @rpath/libtensorflow_framework.1.dylib

OS Platform and Distribution : macOS Catalina 10.15.3

TensorFlow installed from : binary

TensorFlow version : 2.1.0

Python version: 3.7.3

Java - 1.8

So basically I am using the TF 1.15.0 Java binding to run inference from a SavedModel where the feature processing is ported as part of the SavedModel; the code below is working:

import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;

public class TF1_b {

	public static void main(String[] args) {

		String testSen = "1";

		byte[] inputBytes = new byte[1];
		try {
			inputBytes = testSen.getBytes("UTF-8");
		} catch (Exception e) {
			e.printStackTrace();
		}

		System.out.println(TensorFlow.version());


		String basePath = "<path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/";
		System.load(basePath + "_wordpiece_tokenizer.dylib");


		try (SavedModelBundle b = SavedModelBundle.load("<saved_model_path>", "serve")) {


			Session sess = b.session();


			Tensor x = Tensor.create(inputBytes, String.class);

			float[][] y = sess.runner().feed("text:0", x)
							.fetch("linear/linear_model/linear_model/linear_model/weighted_sum:0").run().get(0)
							.copyTo(new float[1][1]);

			System.out.println(y[0][0]);



		} catch (Exception e) {
			System.out.println(e);

		}
	}

}

In pom.xml the dependency I am using for above is

<dependencies>
		<dependency>
			<groupId>org.tensorflow</groupId>
			<artifactId>tensorflow</artifactId>
			<version>1.15.0</version>
		</dependency>
	</dependencies>

The above code works perfectly and I am able to load System.load(basePath + "_wordpiece_tokenizer.dylib"); with no problems.

Now when I try to run the same model with


<repositories>
    <repository>
        <id>tensorflow-snapshots</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots/</url>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>
<dependencies>
    <!-- Example of dependency, see README.md for more options -->
    <dependency>
        <groupId>org.tensorflow</groupId>
        <artifactId>tensorflow-core-platform</artifactId>
        <version>0.1.0-SNAPSHOT</version>
    </dependency>
</dependencies>

as described https://stackoverflow.com/questions/61373396/tensorflow-2-x-java-bindings

with changes to only two lines:

Tensor x = TString.scalarOf(new String("1"));
Tensor<TFloat32> y = sess.runner().feed("text:0", x) .fetch("linear/linear_model/linear_model/linear_model/weighted_sum:0").run().get(0).expect(TFloat32.DTYPE);

I get error

Exception in thread "main" java.lang.UnsatisfiedLinkError: <path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib: dlopen(<path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib, 1): Library not loaded: @rpath/libtensorflow_framework.1.dylib
  Referenced from: <path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib
  Reason: image not found
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
	at java.lang.Runtime.load0(Runtime.java:809)
	at java.lang.System.load(System.java:1086)

The path <path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/_wordpiece_tokenizer.dylib exists, so TF 1.15.x Java inference works fine.

The Java code for TF 2.x I am using is below:


import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;
import org.tensorflow.tools.ndarray.NdArrays;
import org.tensorflow.types.TFloat32;
import org.tensorflow.types.TString;

public class TF_2 {



	public static void main(String[] args) {



		String testSen = "1";

		byte[] inputBytes = new byte[1];
		try {
			inputBytes = testSen.getBytes("UTF-8");
		} catch (Exception e) {
			e.printStackTrace();
		}

		System.out.println(TensorFlow.version());

		String basePath = "<path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/";

		System.load(basePath + "_wordpiece_tokenizer.dylib");



		try (SavedModelBundle b = SavedModelBundle.load("<saved_model_path>", "serve")) {


			Session sess = b.session();


			Tensor x = TString.scalarOf(new String("1"));

			Tensor<TFloat32> y = sess.runner().feed("text:0", x)
							.fetch("linear/linear_model/linear_model/linear_model/weighted_sum:0").run().get(0)
							.expect(TFloat32.DTYPE);


			System.out.println(y.data().toString());



		} catch (Exception e) {
			System.out.println(e);
		}
	}


}
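
One workaround sometimes used for this class of dyld error, offered here only as a heavily hedged sketch and not a confirmed fix: pre-load a framework library whose install name matches the @rpath dependency before loading the custom op, so dyld can satisfy the reference from an already-loaded image. Both paths are placeholders, and the framework version must match the one tensorflow_text was built against.

// Placeholders: adjust to the actual site-packages layout on the machine.
String textOpsDir = "<path>/anaconda3/lib/python3.7/site-packages/tensorflow_text/python/ops/";
String tfCoreDir = "<path>/anaconda3/lib/python3.7/site-packages/tensorflow_core/";

// Hedged assumption: a libtensorflow_framework.1.dylib compatible with the
// op exists in the Python install; load it before the op library.
System.load(tfCoreDir + "libtensorflow_framework.1.dylib");
System.load(textOpsDir + "_wordpiece_tokenizer.dylib");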

Maven coordinates not found

Describe the problem
I'm sorry if this is a bad question, but I'm unable to find the Maven coordinates for this project. You have a sample pom.xml, but I can't find the artifacts in the Maven repository, and when I try to add them I get errors that they can't be found. What am I missing?

2.1?

Please make sure that this is a feature request. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:feature_template

System information

  • TensorFlow version (you are using):
  • Are you willing to contribute it (Yes/No):

Describe the feature and the current behavior/state.

Will this change the current api? How?

Who will benefit with this feature?

Any Other info.

Version number confusion

Please make sure that this is a documentation issue. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:doc_template

System information

Describe the documentation issue

It's a bit confusing to me: I just heard the news about the alpha release at version 0.2.0, but the installation docs here https://www.tensorflow.org/install/lang_java point to Maven artifact version 2.3.0, though the 0.2.0 version is also available on Maven Central: https://search.maven.org/artifact/org.tensorflow/tensorflow-core-api/0.2.0/jar

Please update the docs so people who want to play with the alpha release can find it.

We welcome contributions by users. Will you be able to submit a PR (use the doc style guide) to fix the doc issue? No

Operator with more than 8 outputs

Hi, it seems the Java API can't create an operation with more than 8 outputs. This is a big limitation, especially for split operators.
Here is reproducible code; I'm using the latest tensorflow-core-api SNAPSHOT version.

package ai.djl.tensorflow.engine;

import org.tensorflow.EagerSession;
import org.tensorflow.op.Ops;
import org.tensorflow.types.TInt64;

public class Tftest {
    public static void main(String[] args) {
        EagerSession eagerSession = EagerSession.options().async(true).build();
        Ops tf = Ops.create(eagerSession);
        // creates a (20, 20) tensor with zeros and split to 10 tensors equally on axis 0,
        // should return 10 tensors each with shape(2, 20).
        tf.splitV(
                tf.zeros(tf.constant(new long[]{20,20}), TInt64.DTYPE),
                tf.constant(10),
                tf.constant(0),
                10L
        );
    }
}

error message:

Exception in thread "main" java.lang.IllegalArgumentException: Expecting 10 outputs, but *num_retvals is 8
	at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:72)
	at org.tensorflow.EagerOperationBuilder.execute(EagerOperationBuilder.java:302)
	at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:68)
	at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:57)
	at org.tensorflow.op.core.SplitV.create(SplitV.java:64)
	at org.tensorflow.op.Ops.splitV(Ops.java:5715)
	at ai.djl.tensorflow.engine.Tftest.main(Tftest.java:11)
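
The failure comes from the eager executor's expected-output count, so one untested, hedged workaround sketch (not a confirmed fix) is to record the same op into a Graph and fetch the outputs from a Session, which does not go through the eager num_retvals path:

import org.tensorflow.Graph;
import org.tensorflow.Session;
import org.tensorflow.op.Ops;
import org.tensorflow.types.TInt64;

public class TftestGraph {
    public static void main(String[] args) {
        try (Graph graph = new Graph()) {
            Ops tf = Ops.create(graph);
            // Same splitV as above, but added to a graph instead of being
            // executed eagerly.
            tf.splitV(
                    tf.zeros(tf.constant(new long[]{20, 20}), TInt64.DTYPE),
                    tf.constant(10),
                    tf.constant(0),
                    10L);
            try (Session session = new Session(graph)) {
                // Assuming the default generated op name "SplitV", the outputs
                // are addressable as "SplitV:0" ... "SplitV:9".
                System.out.println(session.runner().fetch("SplitV", 3).run().get(0));
            }
        }
    }
}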

Unable to execute tfhub model: getting TFInvalidArgumentException

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian Buster
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: No
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 0.2.0-SNAPSHOT (from sonatype)
  • Python version: -
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the current behavior

I am trying to invoke a TF Hub model (Universal Sentence Encoder v4) using the new Java API (from Scala). However, I am getting stuck on the error below.

An exception or error caused a run to abort: Malformed TF_STRING tensor; too short to hold number of elements 
org.tensorflow.exceptions.TFInvalidArgumentException: Malformed TF_STRING tensor; too short to hold number of elements
	at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:87)
	at org.tensorflow.Session.run(Session.java:595)
	at org.tensorflow.Session.access$100(Session.java:70)
	at org.tensorflow.Session$Runner.runHelper(Session.java:335)
	at org.tensorflow.Session$Runner.run(Session.java:285)
	at org.samik.EmbeddingModelServer.USEEmbeddingServerTest.<init>(USEEmbeddingServerTest.scala:85)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)

Describe the expected behavior

The code should compile and execute.

Code to reproduce the issue

import org.tensorflow.proto.framework.{MetaGraphDef, SignatureDef, TensorInfo}
import org.tensorflow.{SavedModelBundle, Tensor}

import scala.collection.JavaConverters._
import scala.collection.mutable

import org.tensorflow.ndarray.Shape
import org.tensorflow.ndarray.buffer.DataBuffers
import org.tensorflow.types.{TString, TUint8}
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

class TestApp extends App
{
	val useModel = SavedModelBundle.load("/local/path/to/tfhub/use_4", "serve")

	val metaData = useModel.metaGraphDef()
	val signatureDef = metaData.getSignatureDefMap().get("serving_default")
	val firstInput = getInputToShape(metaData).keys.head
	val firstOutput = getOutputToShape(metaData).keys.head


	val input = "Hello"
	val dataBuffer = DataBuffers.of(ByteBuffer.wrap(input.getBytes(StandardCharsets.UTF_8)))
	val tensor = Tensor.of(TString.DTYPE, Shape.of(1L), dataBuffer)
	println(s"Tensor: $tensor")
	val sessionRunner = useModel.session().runner()
	val result = sessionRunner
			.feed(firstInput, tensor)
                        //****** The below line (fetch(..)) seems to be generating the error *********//
			.fetch(firstOutput)
			.run()
			.asScala
	println(result)

	private def getOutputToShape(metadata: MetaGraphDef): mutable.Map[String, Shape] =
		mapToShape(signatureDef.getOutputsMap.asScala)

	private def getInputToShape(metadata: MetaGraphDef): mutable.Map[String, Shape] =
		mapToShape(signatureDef.getInputsMap.asScala)

	private def mapToShape(map: mutable.Map[String, TensorInfo]): mutable.Map[String, Shape] =
	{
		map.foldLeft(mutable.HashMap[String, Shape]())
		{ case(accum, (_, tensorInfo)) =>
			val dimList = tensorInfo.getTensorShape.getDimList.asScala.map(_.getSize)
			val shape = if(dimList.length == 0) Shape.unknown() else Shape.of(dimList: _*)
			accum += (tensorInfo.getName -> shape)
		}
	}
}

However, pretty much the same code, with the same helper functions, works with the published jar (1.15.0). Here is the corresponding snippet.

    val metaData = MetaGraphDef.parseFrom(useModel.metaGraphDef())
    val firstInput = getInputToShape(metaData).keys.head
    val firstOutput = getOutputToShape(metaData).keys.head

    val input = "Hello there!"
    val inputTensor: Tensor[String] = Tensors.create(Array(input.getBytes()))

    val sessionRunner = useModel.session().runner()
    val results = sessionRunner.feed(firstInput, inputTensor).fetch(firstOutput).run().asScala
    results.foreach(tensor => {
        val array = Array.ofDim[Float](tensor.shape()(0).toInt, tensor.shape()(1).toInt)
        tensor.copyTo(array)
        println(s"[${array(0).mkString(", ")}]")
    })

Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
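
The likely culprit is the tensor construction rather than the fetch: TF_STRING tensors are not a flat byte buffer; each element is stored with its own length header, so wrapping raw UTF-8 bytes via Tensor.of(TString.DTYPE, ...) produces a malformed tensor. A hedged sketch of the usual fix (shown in Java; the Scala translation is direct), letting the TString factory handle the element layout:

import org.tensorflow.Tensor;
import org.tensorflow.types.TString;

public class BuildStringTensor {
  public static void main(String[] args) {
    // Rank-1 string tensor of shape (1): the factory encodes the per-element
    // layout that the hand-built DataBuffer version was missing.
    try (Tensor<TString> tensor = TString.vectorOf("Hello")) {
      System.out.println(tensor);
    }
  }
}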
