Comments (12)

Ayushk4 commented on July 21, 2024

I would like to work on this issue. I have read and understood the paper and related concepts.

I have done a crude implementation of the paper for solving elliptic PDEs with Dirichlet boundary conditions, using the network hyperparameters described in the paper (section 4.1.2) for dimension d = 5. It can be found here. I trained it for 500 iterations (of the 20,000 stated) and the network seems to be training reasonably well.
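For context, the weak-form objective being trained can be sketched roughly as below. All names here are hypothetical, and gradients are taken by finite differences purely for illustration (a real implementation would use AD through the networks):

```julia
# Central finite differences for the gradient of a scalar function at x
function fd_grad(f, x; h=1e-5)
    g = similar(x)
    for i in eachindex(x)
        e = zeros(length(x)); e[i] = h
        g[i] = (f(x .+ e) - f(x .- e)) / (2h)
    end
    g
end

# Monte Carlo estimate of the weak residual for -Δu = f on [0,1]^d:
# A(u, φ) = ∫ ∇u⋅∇φ - f φ dx, normalized by ||φ||² as in a weak
# adversarial formulation. `primal` and `adversary` stand in for the
# two networks; here they are plain callables so the snippet runs alone.
function weak_residual(primal, adversary, f, d; n=1024)
    acc = 0.0; norm2 = 0.0
    for _ in 1:n
        x = rand(d)                       # interior sample in [0,1]^d
        gu = fd_grad(primal, x)
        gv = fd_grad(adversary, x)
        vx = adversary(x)
        acc += sum(gu .* gv) - f(x) * vx
        norm2 += vx^2
    end
    (acc / n)^2 / (norm2 / n + eps())     # normalized squared residual
end

# Dirichlet penalty: mean squared mismatch u - g on the faces of [0,1]^d
function boundary_loss(primal, g, d; n=256)
    s = 0.0
    for _ in 1:n
        x = rand(d); x[rand(1:d)] = rand(Bool) ? 0.0 : 1.0  # project to a face
        s += (primal(x) - g(x))^2
    end
    s / n
end
```

The primal network would minimize `weak_residual + λ * boundary_loss` while the adversary maximizes `weak_residual`, alternating updates each iteration.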

Could you guide me on how I should proceed?

  • Should I train the currently implemented example until it converges?
  • Should I try one more example, on a parabolic equation involving time as well?
  • Or should I proceed directly to writing the API for this PDE solver?

from neuralpde.jl.

ChrisRackauckas commented on July 21, 2024

Should I train the currently implemented example until it converges?

Yes, let's make sure it converges "all of the way" first. It still looks like it's missing some of the corner behavior.

Should I try one more example, on a parabolic equation involving time as well?

Try to get it working where, instead of writing down the discretization, it just calls DifferentialEquations.jl. This can use DiffEqFlux.jl's diffeq_adjoint to make sure the adjoint used for training is the fast one.
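For reference, a training call along those lines would look roughly like this with the DiffEqFlux API of that era (a sketch from memory; exact names and keyword arguments may differ across versions, and the loss here is a toy target, not the paper's):

```julia
using DifferentialEquations, DiffEqFlux, Flux

# A standard ODE with trainable parameters p
function lotka!(du, u, p, t)
    du[1] =  p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[3]*u[2] + p[4]*u[1]*u[2]
end

u0 = [1.0, 1.0]
p  = param([1.5, 1.0, 3.0, 1.0])        # Tracker-style trainable parameters
prob = ODEProblem(lotka!, u0, (0.0, 10.0))

# diffeq_adjoint differentiates through solve via the adjoint method,
# which is the fast path for training
predict() = diffeq_adjoint(p, prob, Tsit5(); u0 = u0, saveat = 0.1)
loss() = sum(abs2, predict() .- 1.0)     # toy objective: pull toward 1

opt = ADAM(0.01)
Flux.train!(loss, Flux.params(p), Iterators.repeated((), 100), opt)
```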

Or should I proceed directly to writing the API for this PDE solver?

Let's wait on that until we know it's all working, then slap an API on it and "package it up".

ChrisRackauckas commented on July 21, 2024

I ran into a couple of GPU-specific problems while training: very slow backpropagation through sinc, tanh not working as an activation, and problems with the AdaGrad optimiser. I got it running after some workarounds. I believe most of these are Tracker-related and perhaps version-related. After I am done with this PR, I will retest on the latest versions of Julia and Flux/Zygote and raise issues if they still persist.

Some of these may just be slow GPU kernels. It could be worth isolating them to get an MWE for the GPU developers.
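A minimal isolation along those lines might look like the following (hypothetical, using the CuArrays/Tracker stack of that era): time just the pullback of a single activation broadcast, CPU versus GPU, so the slow kernel is pinned down for the GPU developers.

```julia
using Flux, CuArrays

x_cpu = rand(Float32, 1000, 1000)
x_gpu = cu(x_cpu)

for (name, x) in (("cpu", x_cpu), ("gpu", x_gpu))
    W = param(x)                  # tracked array so backprop runs
    y = sum(sinc.(W))             # the broadcast under suspicion
    @time Flux.back!(y)           # time only the backward pass of sinc
    println(name, " done")
end
```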

ChrisRackauckas commented on July 21, 2024

Please let me know if you have something else in mind.

Sounds great!

ChrisRackauckas commented on July 21, 2024

Hmm, I was thinking of Algorithm 2. That can use any time discretization, so you would time-step it with DiffEq. But now I see you can't, because you don't end up with a differential equation for the evolution: you need to optimize at each step. So what I was thinking isn't possible; ignore that :). Instead, stick to the paper here.

ChrisRackauckas commented on July 21, 2024

We can generalize it to use arbitrary tableaus, though that can come later. Method 3 might be better anyway.

ChrisRackauckas commented on July 21, 2024

Yes, this sounds like a good direction to go down.

Ayushk4 commented on July 21, 2024

Yes, let's make sure it converges "all of the way" first. It still looks like it's missing some of the corner behavior.

I trained it until convergence, and it seems to fit well for dims = 20. I have uploaded the plots and notebooks for this.
For dims = 5, however, the loss went NaN after about 6,400 iterations because the adversarial network's output went to Inf. Even so, the fit looked quite good. I believe some hyperparameter tuning should fix this case.
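One generic safeguard against that kind of blow-up (a standard trick sketched here, not something from the paper) is to clip the adversarial network's gradient norm before each update, so no single step can push its weights toward Inf:

```julia
# Rescale a gradient array in place so its 2-norm is at most `maxnorm`.
function clip_grad!(g::AbstractArray, maxnorm)
    n = sqrt(sum(abs2, g))
    n > maxnorm && (g .*= maxnorm / n)
    g
end

g = [3.0, 4.0]          # norm 5
clip_grad!(g, 1.0)      # rescaled in place to norm 1
```

This would be applied to each parameter's gradient just before the optimiser step in the adversary's update loop.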

I moved the models to the GPU and trained them there, since training was slow on the CPU.

I ran into a couple of GPU-specific problems while training: very slow backpropagation through sinc, tanh not working as an activation, and problems with the AdaGrad optimiser. I got it running after some workarounds. I believe most of these are Tracker-related and perhaps version-related. After I am done with this PR, I will retest on the latest versions of Julia and Flux/Zygote and raise issues if they still persist.

Try to get it working where, instead of writing down the discretization, it should just call DifferentialEquations.jl. This can use DiffEqFlux.jl's diffeq_adjoint to make sure the adjoint for training is the fast one.

I am now working on getting this to call DifferentialEquations.jl.

Please let me know if you have something else in mind.

Ayushk4 commented on July 21, 2024

instead of writing down the discretization, it just calls DifferentialEquations.jl.

I have a question here. For PDEs involving time, the paper describes two methods. The first outputs N (the number of time segments) different sets of parameter values for the primal network, one per timestep. The second modifies the weak formulation so that a single set of parameter values is produced. Which of these two methods should I try this with? Could you also suggest how to proceed, by pointing me to some resources?

Ayushk4 commented on July 21, 2024

ping @ChrisRackauckas ^

Ayushk4 commented on July 21, 2024

Sure. I am proceeding with Algorithm 3 then.

Ayushk4 commented on July 21, 2024

I have implemented Algorithm 3 as in the paper (link). I have uploaded plots (and separate images) for it. It seems to have trained well, but training took fairly long: about 10 hours on a GPU.

I was thinking about equation (12) from method 3, which has to be minimized. Instead of numerical integration for the outer (time) integral, maybe we can use DifferentialEquations.jl: the integral could be written as solving dy/dt = f(y, t) up to T = 1, with the initial value at t = 0 given. Is it worth a shot?
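Concretely, the idea amounts to something like this, where `inner_loss` is a hypothetical stand-in for the spatial (Monte Carlo) estimate at time t: the outer integral ∫₀¹ inner_loss(t) dt is the value at T = 1 of the ODE dy/dt = inner_loss(t) with y(0) = 0.

```julia
using DifferentialEquations

inner_loss(t) = t^2                        # placeholder for the spatial estimate
prob = ODEProblem((y, p, t) -> inner_loss(t), 0.0, (0.0, 1.0))
sol  = solve(prob, Tsit5(); abstol = 1e-8, reltol = 1e-8)
sol.u[end]                                 # the quadrature: ∫₀¹ t² dt = 1/3
```

The upside would be adaptive error control on the time integral for free; the open question is whether the adjoint through the solve stays cheap when `inner_loss` itself involves the networks.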

Since the method-3 example works fine, I could first write the API for it, then try the above and adjust the API accordingly (or keep both methods). Or I could go the other way round.

What do you suggest?
