
Comments (17)

deansher commented on May 18, 2024

Although we don't have complete consensus, perhaps we have sufficient consensus to

  • drop the empty keras package for now,
  • change our README accordingly,
  • document provisional goals for our framework API in our README,
  • and revisit this issue when someone eventually raises a new issue that proposes a specific, immediate split into a Keras layer?

from java.

karllessard commented on May 18, 2024

Just to add that I'm fine with this last proposal from @deansher.


deansher commented on May 18, 2024

Here's an argument for just a single framework layer.

Providing two API layers would add code, tests, and other complexity to this repo. We'd have two ongoing API design efforts.

If a Keras layer completely covered lower layers, then we'd have the added cost and complexity of percolating every new bit of lower-level functionality up through the Keras layer. We probably still wouldn't make it totally airtight, so we'd have our very own leaky abstraction from the outset. If a Keras layer didn't attempt to completely cover lower layers, then we would be asking every user to understand our layering idea from the outset.

Perhaps a better starting point would be to attempt a single framework API that addresses the full spectrum of user needs? We would feel the tension between beginner/implicit/best-practice-defaults and expert/explicit, but that's a normal tension in API design and is addressed (to varying degrees of success) by normal API design techniques.

If we find ourselves forced into some sort of layering, we will understand the motivations much more concretely at that point, so we can make better decisions.


karllessard commented on May 18, 2024

I share the same concerns as @deansher about the complexity of maintaining two distinct APIs. Another thing worth mentioning is that every time someone tries to add new logic in the Keras layer, chances are that we will ask them to move that logic to tensorflow-framework first, expose it with an API very similar to what Keras is doing, and then wrap it with the original Keras interface in tensorflow-keras (exactly what I did with @JimClarke5 in his optimizer PR). I think we all agree that we want to avoid duplicating the same logic, and only what is added to the framework can be shared across libraries.
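To make the "framework first, Keras wrapper second" flow concrete, here is a minimal sketch of that delegation. The class names FrameworkSgd and KerasStyleSgd are hypothetical and plain double[] arrays stand in for tensors; this is not the actual tensorflow-framework or tensorflow-keras code, only an illustration of keeping the logic in one place.

```java
import java.util.Arrays;

// Hypothetical framework-level optimizer: the only place the update logic lives.
final class FrameworkSgd {
    private final double learningRate;

    FrameworkSgd(double learningRate) {
        this.learningRate = learningRate;
    }

    // Plain gradient descent step, w - lr * g, returned as a new array.
    double[] applyGradients(double[] weights, double[] gradients) {
        if (weights.length != gradients.length) {
            throw new IllegalArgumentException("weights and gradients must have the same length");
        }
        double[] updated = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            updated[i] = weights[i] - learningRate * gradients[i];
        }
        return updated;
    }
}

// Hypothetical Keras-style facade: Keras-like entry point, no logic of its own.
final class KerasStyleSgd {
    private final FrameworkSgd delegate;

    KerasStyleSgd(double learningRate) {
        this.delegate = new FrameworkSgd(learningRate);
    }

    double[] applyGradients(double[] weights, double[] gradients) {
        // Everything is forwarded to the framework implementation.
        return delegate.applyGradients(weights, gradients);
    }
}

final class FacadeDemo {
    public static void main(String[] args) {
        double[] weights = {1.0, 2.0};
        double[] gradients = {0.5, -0.5};
        double[] updated = new KerasStyleSgd(0.1).applyGradients(weights, gradients);
        System.out.println(Arrays.toString(updated));
    }
}
```

With this shape, any other library can reuse FrameworkSgd directly, and the wrapper costs only a thin layer of forwarding code.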

If we opt for the single API though, we probably don't want to call it "Keras", or we will feel forced to mirror the Python Keras library as much as possible, even if it is not as flexible as we want it to be in some cases. On the other hand, just taking what we like from the Keras API and adding it to our framework kind of dissolves the visibility we would gain from having a pure-Keras implementation in Java, which could attract more ML developers to switch to the JVM. Still, that is probably also what would allow us to build the best Java API for TensorFlow users.


karllessard commented on May 18, 2024

Before we can merge new PRs from @JimClarke5, I think we really need to reach an agreement on this point.

In addition to my previous comment, I think we can address the question from another angle, by asking ourselves for whom we are building the Keras API for Java:
a) For Python users already familiar with it?
b) For any users, presuming that if Keras was that successful in Python, it should then be the right API for Java?

If the answer is a), then we probably want to stay very close to what the Python Keras API offers, and the facade pattern as proposed initially, sitting on top of the framework, is probably the right choice. If the answer is b), then I feel that we are freer to move away from the original API, bringing only the important pieces and enhancing them with what we think is missing for a more complete solution that can satisfy both beginners and more advanced users. In this case, having a single framework should be enough.


JimClarke5 commented on May 18, 2024

The current PRs, "Optimizer Learning Rate Change" and "Initialization", are focused only on framework, and comprise elements that can be independent of any Keras implementation and can be used on their own. I would think that concepts like initialization, loss (cost) functions, activation functions, regularization, etc. would transcend all ML implementations. To that end, I support adding these stand-alone elements to framework, independently of the decision on Keras. We may revisit some of the method signatures with a view to broader use, but I think the basic functionality will still be needed in many higher-level implementations.


Craigacp commented on May 18, 2024

I think there is a third group, which is Java developers that have to port into Java from Python whatever their data science team came up with. There they might appreciate something similar to the Python API as it would be easier to see how to transform the Python source into Java source. In my experience this process of porting something from Python into Java for deployment tends to be pretty common, though that might be because I mainly talk to people who work at massive companies which can afford to have this disconnect between data science and deployment.

I think that we're still quite far away from having any kind of higher level API, and much of the work towards it is building out lower level blocks, so we should proceed on a little ways before trying to make a final decision. Much of the current code is getting out of the C API and wrapping it so that the constructs are usable in Java, as many of the C API pieces seem to be incomplete. We'd need to build these components anyway, and so surfacing a public API on top of them at some level is still further down the line.


deansher commented on May 18, 2024

Great point from @Craigacp : "Java developers that have to port into Java from Python whatever their data science team came up with. There they might appreciate something similar to the Python API as it would be easier to see how to transform the Python source into Java source." Another example along these lines that will be very common is developers that are studying an existing model in a paper or in open source and reimplementing it in Java.

While also agreeing with @Craigacp that "we should proceed on a little ways before trying to make a final decision", perhaps it would be worth documenting provisional goals in our README? Here's a shot at abstracting the above discussion into provisional goals for our framework API:

  • If either you know how to implement a model in the Python Keras API, or you are reimplementing an existing Python Keras model in Java, you should be able to cleanly and naturally follow the same high-level structure in the framework API.

  • Also, given some familiarity with patterns followed throughout the framework API, you should be able to easily translate every detail of a Python Keras implementation into the framework API.

  • However, the framework API is not intended to literally mimic the Python Keras API. Rather, it should expose the same capabilities in an API that feels natural and idiomatic to a Java programmer who does not know Keras. If we ever find ourselves unable to reconcile this goal with easy translation from Python Keras, we may split out a Keras layer.

  • Also, the framework API should support fine control over all aspects of modeling, training, and inference. Unlike with Python Keras, we want this to feel like staying in the same API rather than diving into a separate layer. But here again, if we are ever unable to reconcile this goal with easy translation from Python Keras, we may split the framework API into two layers.

Thoughts?


Craigacp commented on May 18, 2024

I guess the fundamental difference between Keras and not Keras is the model.compile and model.fit functions. These restrict what can be done with a Keras model in fairly fundamental ways (e.g. they make it hard to do multi-task learning across multiple datasets and losses), but they make it substantially easier to use by having a model object and sensible entry points that show users how to build supervised learning models. If we made the Keras Java implementation more idiomatically Java, then the Model object would own the layers and they wouldn't be mutable outside it, which is in conflict with the Python Keras as it doesn't care (which makes it less safe).

Stepping outside of supervised learning (e.g. to RL, or to multi-task supervised learning) means you have to leave behind bits of the Keras interface; for example, the Keras RL examples (https://keras.io/examples/rl/deep_q_network_breakout/) don't really use compile or fit. My main concern is that we don't force people into something as restrictive as Keras without the appropriate escape hatches, and I think that having those hatches essentially dictates that we have two high-ish level interfaces: one Keras and one non-Keras. But the non-Keras interface is pretty much what we have in framework at the moment, which is just a prettied-up version of the C API with the missing bits patched over. Plus we'd need to get the gradient tape, but that's a discussion for another time.


karllessard commented on May 18, 2024

"I guess the fundamental difference between Keras and not Keras is the model.compile and model.fit functions."

train_on_batch is the Keras endpoint giving more flexibility to the developers in their training loop, I don't know if that can also apply to the specific use cases you had in mind @Craigacp ?

"If we made the Keras Java implementation more idiomatically Java, then the Model object would own the layers and they wouldn't be mutable outside it, which is in conflict with the Python Keras as it doesn't care (which makes it less safe)."

Maybe this can be handled by renaming Model to ModelTemplate, which is then concretized as a Model on model.compile, following pretty much the basic builder pattern in Java.
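A minimal sketch of that idea, assuming a mutable ModelTemplate that is frozen into an immutable Model by compile. All names here (LayerSpec, ModelTemplate, Model, the lossName parameter) are hypothetical placeholders, not existing TF Java types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical layer description; a real layer would carry weights and ops.
final class LayerSpec {
    final String name;
    final int units;

    LayerSpec(String name, int units) {
        this.name = name;
        this.units = units;
    }
}

// Mutable template: layers can be added freely until compile() is called.
final class ModelTemplate {
    private final List<LayerSpec> layers = new ArrayList<>();

    ModelTemplate add(LayerSpec layer) {
        layers.add(layer);
        return this;
    }

    // Concretizes the template. The Model gets its own copy of the layer list,
    // so later changes to the template do not leak into it.
    Model compile(String lossName) {
        return new Model(new ArrayList<>(layers), lossName);
    }
}

// Immutable model: owns its layers and exposes only a read-only view of them.
final class Model {
    private final List<LayerSpec> layers;
    private final String lossName;

    Model(List<LayerSpec> layers, String lossName) {
        this.layers = Collections.unmodifiableList(layers);
        this.lossName = lossName;
    }

    List<LayerSpec> layers() {
        return layers;
    }

    String lossName() {
        return lossName;
    }
}

final class ModelTemplateDemo {
    public static void main(String[] args) {
        ModelTemplate template = new ModelTemplate()
                .add(new LayerSpec("dense_1", 64))
                .add(new LayerSpec("dense_2", 10));
        Model model = template.compile("categorical_crossentropy");
        template.add(new LayerSpec("added_later", 1)); // does not affect the compiled model
        System.out.println(model.layers().size());
    }
}
```

In this sketch the ownership rule is enforced by the defensive copy plus the unmodifiable view, which is how the builder pattern usually separates a configurable stage from a fixed one.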

@deansher, I agree with your list of goals. It seems that the general consensus is that we should first build up a complete framework that is both user-friendly and flexible enough to support more complex or advanced tasks, and then reevaluate the need for a second API that mirrors Python Keras as closely as possible.


Craigacp commented on May 18, 2024

"I guess the fundamental difference between Keras and not Keras is the model.compile and model.fit functions."

"train_on_batch is the Keras endpoint giving more flexibility to the developers in their training loop, I don't know if that can also apply to the specific use cases you had in mind @Craigacp ?"

train_on_batch works fine if the loss function is the same for each batch, but that's not true for some of the NLP use cases I'm working on (e.g. we train a model on a masked language model loss for some datasets, and similarity losses for others). Though I guess there could be multiple models which share layers, but I don't know how that would work if we did impose ownership of layers.
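As a rough illustration of the escape hatch this use case needs, the sketch below shows a training loop that picks a loss function per batch according to the task the batch came from. The Batch and MultiTaskLoop types are hypothetical, and mean squared error and mean absolute error are only stand-ins for the masked language model and similarity losses mentioned above; no gradient step is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

// Hypothetical batch: tagged with the task (dataset) it was drawn from.
final class Batch {
    final String task;
    final double[] predictions;
    final double[] targets;

    Batch(String task, double[] predictions, double[] targets) {
        this.task = task;
        this.predictions = predictions;
        this.targets = targets;
    }
}

// Hypothetical loop that looks up the loss for each batch instead of fixing one loss up front.
final class MultiTaskLoop {
    private final Map<String, ToDoubleBiFunction<double[], double[]>> lossByTask = new HashMap<>();

    void register(String task, ToDoubleBiFunction<double[], double[]> loss) {
        lossByTask.put(task, loss);
    }

    // Returns the loss value of every batch, computed with that batch's own loss function.
    List<Double> run(List<Batch> batches) {
        List<Double> losses = new ArrayList<>();
        for (Batch batch : batches) {
            ToDoubleBiFunction<double[], double[]> loss = lossByTask.get(batch.task);
            if (loss == null) {
                throw new IllegalStateException("no loss registered for task " + batch.task);
            }
            losses.add(loss.applyAsDouble(batch.predictions, batch.targets));
        }
        return losses;
    }

    static double meanSquaredError(double[] predictions, double[] targets) {
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            double diff = predictions[i] - targets[i];
            sum += diff * diff;
        }
        return sum / predictions.length;
    }

    static double meanAbsoluteError(double[] predictions, double[] targets) {
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            sum += Math.abs(predictions[i] - targets[i]);
        }
        return sum / predictions.length;
    }
}

final class MultiTaskDemo {
    public static void main(String[] args) {
        MultiTaskLoop loop = new MultiTaskLoop();
        loop.register("taskA", MultiTaskLoop::meanSquaredError);
        loop.register("taskB", MultiTaskLoop::meanAbsoluteError);
        List<Batch> batches = Arrays.asList(
                new Batch("taskA", new double[] {1.0, 3.0}, new double[] {0.0, 1.0}),
                new Batch("taskB", new double[] {1.0, 3.0}, new double[] {0.0, 1.0}));
        System.out.println(loop.run(batches));
    }
}
```

Whether shared layers could live behind several such per-task models is exactly the open ownership question raised above; the sketch deliberately sidesteps it.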


zaleslaw commented on May 18, 2024

Adding my 5 cents here: I suggest keeping the low-level API as graph + optimizers + loading/saving variables, and keeping it separate from the Keras package. Maybe initializers, losses and metrics should be added too. But activations, the training cycle and layers could be developed in the Keras package. Of course, we should not have two implementations of the HeNormal initializer, for example, only one.


KartikChugh commented on May 18, 2024

Agree with losses and metrics; but initializers should be with activations, no?


deansher commented on May 18, 2024

One interesting question: If we discover that something is implemented wrongly or unfortunately in Python, how will we decide whether to fix it in Java or be carefully bug-for-bug compatible? I'm thinking about the goal I proposed above, based on our discussion, "if either you know how to implement a model in the Python Keras API, or you are reimplementing an existing Python Keras model in Java, you should be able to cleanly and naturally follow the same high-level structure in the framework API."

This is probably an easy decision if it's just plain broken in Python. But perhaps a very difficult decision if it falls into a gray area, where the Python implementation seems plausible but quite unfortunate.


SidneyLann commented on May 18, 2024

I have 20 years of Java experience and will never use Python. I just want to use a Java DL framework that has the strongest capabilities, without having to refer to Python. In China, Java is the number one development language, and many developers will not use Python because most business systems are developed in Java.


JimClarke5 commented on May 18, 2024

If we find errors or better ways to implement algorithms in Java, I am all for changing TF Java. We have just found a Keras limitation on 1D Softmax inputs, and we removed that restriction in the Java implementation. We are also leveraging Java's strong typing, which leads to, IMO, cleaner implementations.
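For reference, softmax is well defined on a 1D input, which is why such a restriction can be dropped. The sketch below is a plain-Java, numerically stable softmax over a double[] vector; the class name Softmax1D is hypothetical and this is not the TF Java implementation, which operates on tensors.

```java
import java.util.Arrays;

final class Softmax1D {
    // Numerically stable softmax over a 1D vector: subtract the max before exponentiating.
    static double[] softmax(double[] logits) {
        if (logits.length == 0) {
            throw new IllegalArgumentException("logits must not be empty");
        }
        double max = Double.NEGATIVE_INFINITY;
        for (double value : logits) {
            max = Math.max(max, value);
        }
        double[] out = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            out[i] = Math.exp(logits[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(softmax(new double[] {1.0, 2.0, 3.0})));
    }
}
```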


deansher commented on May 18, 2024

We have agreed on a path forward and documented it in READMEs.

