Comments (12)

Ayushk4 commented on July 21, 2024

I would like to work on this issue. I have read and understood the paper and related concepts.

I have done a crude implementation of the paper for solving elliptic PDEs with Dirichlet boundary conditions, using the network hyperparameters described in the paper (section 4.1.2) for dimension d = 5. It can be found here. I trained it for 500 iterations (of the 20,000 stated) and the network seems to be training reasonably well.
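For context, the weak-form objective being trained can be sketched roughly as below. All names here are hypothetical, and gradients are taken by finite differences purely for illustration (a real implementation would use AD through the networks):

```julia
# Central finite differences for the gradient of a scalar function at x
function fd_grad(f, x; h=1e-5)
    g = similar(x)
    for i in eachindex(x)
        e = zeros(length(x)); e[i] = h
        g[i] = (f(x .+ e) - f(x .- e)) / (2h)
    end
    g
end

# Monte Carlo estimate of the weak residual for -Δu = f on [0,1]^d:
# A(u, φ) = ∫ ∇u⋅∇φ - f φ dx, normalized by ||φ||² as in a weak
# adversarial formulation. `primal` and `adversary` stand in for the
# two networks; here they are plain callables so the snippet runs alone.
function weak_residual(primal, adversary, f, d; n=1024)
    acc = 0.0; norm2 = 0.0
    for _ in 1:n
        x = rand(d)                       # interior sample in [0,1]^d
        gu = fd_grad(primal, x)
        gv = fd_grad(adversary, x)
        vx = adversary(x)
        acc += sum(gu .* gv) - f(x) * vx
        norm2 += vx^2
    end
    (acc / n)^2 / (norm2 / n + eps())     # normalized squared residual
end

# Dirichlet penalty: mean squared mismatch u - g on the faces of [0,1]^d
function boundary_loss(primal, g, d; n=256)
    s = 0.0
    for _ in 1:n
        x = rand(d); x[rand(1:d)] = rand(Bool) ? 0.0 : 1.0  # project to a face
        s += (primal(x) - g(x))^2
    end
    s / n
end
```

The primal network would minimize `weak_residual + λ * boundary_loss` while the adversary maximizes `weak_residual`, alternating updates each iteration.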

Could you guide me on how I should proceed?

  • Should I train the currently implemented example until it converges?
  • Should I try one more example, on a parabolic equation involving time as well?
  • Or should I proceed directly to writing the API for this PDE solver?

from neuralpde.jl.

ChrisRackauckas commented on July 21, 2024

Should I train the currently implemented example until it converges?

Yes, let's make sure it converges "all of the way" first. It still looks like it's missing some of the corner behavior.

Should I try one more example, on a parabolic equation involving time as well?

Try to get it working where, instead of writing down the discretization, it just calls DifferentialEquations.jl. This can use DiffEqFlux.jl's diffeq_adjoint to make sure the adjoint used for training is the fast one.
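For reference, a training call along those lines would look roughly like this with the DiffEqFlux API of that era (a sketch from memory; exact names and keyword arguments may differ across versions, and the loss here is a toy target, not the paper's):

```julia
using DifferentialEquations, DiffEqFlux, Flux

# A standard ODE with trainable parameters p
function lotka!(du, u, p, t)
    du[1] =  p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[3]*u[2] + p[4]*u[1]*u[2]
end

u0 = [1.0, 1.0]
p  = param([1.5, 1.0, 3.0, 1.0])        # Tracker-style trainable parameters
prob = ODEProblem(lotka!, u0, (0.0, 10.0))

# diffeq_adjoint differentiates through solve via the adjoint method,
# which is the fast path for training
predict() = diffeq_adjoint(p, prob, Tsit5(); u0 = u0, saveat = 0.1)
loss() = sum(abs2, predict() .- 1.0)     # toy objective: pull toward 1

opt = ADAM(0.01)
Flux.train!(loss, Flux.params(p), Iterators.repeated((), 100), opt)
```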

Or should I proceed directly to writing the API for this PDE solver?

Let's wait on that until we know it's all working, then slap an API on it and "package it up".

ChrisRackauckas commented on July 21, 2024

I ran into a couple of GPU-specific problems while training: very slow backpropagation through sinc, tanh not working as an activation, and problems with the AdaGrad optimiser. I got it running after some workarounds. I believe most of these are Tracker-related and perhaps version-related. After I am done with this PR, I will retest on the latest versions of Julia and Flux/Zygote and raise issues if they still persist.

Some of these may just be slow GPU kernels. It could be worth isolating them to get an MWE for the GPU developers.
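A minimal isolation along those lines might look like the following (hypothetical, using the CuArrays/Tracker stack of that era): time just the pullback of a single activation broadcast, CPU versus GPU, so the slow kernel is pinned down for the GPU developers.

```julia
using Flux, CuArrays

x_cpu = rand(Float32, 1000, 1000)
x_gpu = cu(x_cpu)

for (name, x) in (("cpu", x_cpu), ("gpu", x_gpu))
    W = param(x)                  # tracked array so backprop runs
    y = sum(sinc.(W))             # the broadcast under suspicion
    @time Flux.back!(y)           # time only the backward pass of sinc
    println(name, " done")
end
```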

ChrisRackauckas commented on July 21, 2024

Please let me know if you have something else in mind.

Sounds great!

ChrisRackauckas commented on July 21, 2024

Hmm, I was thinking of Algorithm 2. That can use any time discretization, so you would time-step it with DiffEq. But now I see you can't, because you don't end up with a differential equation for the evolution: you need to optimize at each step. So what I was thinking isn't possible; ignore that :). Instead, stick to the paper here.

ChrisRackauckas commented on July 21, 2024

We can generalize it to use arbitrary tableaus, though that can come later. Method 3 might be better anyway.

ChrisRackauckas commented on July 21, 2024

Yes, this sounds like a good direction to go down.

Ayushk4 commented on July 21, 2024

Yes, let's make sure it converges "all of the way" first. It still looks like it's missing some of the corner behavior.

I trained it until convergence, and it seems to fit well for dims = 20. I have uploaded the plots and notebooks for this.
For dims = 5, however, the loss went NaN after about 6,400 iterations because the adversarial network's output went to Inf. Even so, the fit looked quite good. I believe some hyperparameter tuning should fix this case.
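One generic safeguard against that kind of blow-up (a standard trick sketched here, not something from the paper) is to clip the adversarial network's gradient norm before each update, so no single step can push its weights toward Inf:

```julia
# Rescale a gradient array in place so its 2-norm is at most `maxnorm`.
function clip_grad!(g::AbstractArray, maxnorm)
    n = sqrt(sum(abs2, g))
    n > maxnorm && (g .*= maxnorm / n)
    g
end

g = [3.0, 4.0]          # norm 5
clip_grad!(g, 1.0)      # rescaled in place to norm 1
```

This would be applied to each parameter's gradient just before the optimiser step in the adversary's update loop.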

I moved the models to the GPU and trained them there, since training was slow on the CPU.

I ran into a couple of GPU-specific problems while training: very slow backpropagation through sinc, tanh not working as an activation, and problems with the AdaGrad optimiser. I got it running after some workarounds. I believe most of these are Tracker-related and perhaps version-related. After I am done with this PR, I will retest on the latest versions of Julia and Flux/Zygote and raise issues if they still persist.

Try to get it working where, instead of writing down the discretization, it should just call DifferentialEquations.jl. This can use DiffEqFlux.jl's diffeq_adjoint to make sure the adjoint for training is the fast one.

I am now working on getting this to call DifferentialEquations.jl.

Please let me know if you have something else in mind.

Ayushk4 commented on July 21, 2024

instead of writing down the discretization, it just calls DifferentialEquations.jl.

I have a question here. For PDEs involving time, the paper describes two methods. The first outputs N (the number of time segments) different sets of parameter values for the primal network, one per timestep. The second modifies the weak formulation so that a single set of parameter values is produced. Which of these two methods should I try this with? Could you also suggest how to proceed, by pointing me to some resources?

Ayushk4 commented on July 21, 2024

ping @ChrisRackauckas ^

Ayushk4 commented on July 21, 2024

Sure. I am proceeding with Algorithm 3 then.

Ayushk4 commented on July 21, 2024

I have implemented Algorithm 3 as in the paper (link). I have uploaded plots (and separate images) for it. It seems to have trained well, but training took fairly long: about 10 hours on a GPU.

I was thinking about equation (12) from method 3, which has to be minimized. Instead of numerical integration for the outer (time) integral, maybe we can use DifferentialEquations.jl: the integral could be written as solving dy/dt = f(y, t) up to T = 1, with the initial value at t = 0 given. Is it worth a shot?
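Concretely, the idea amounts to something like this, where `inner_loss` is a hypothetical stand-in for the spatial (Monte Carlo) estimate at time t: the outer integral ∫₀¹ inner_loss(t) dt is the value at T = 1 of the ODE dy/dt = inner_loss(t) with y(0) = 0.

```julia
using DifferentialEquations

inner_loss(t) = t^2                        # placeholder for the spatial estimate
prob = ODEProblem((y, p, t) -> inner_loss(t), 0.0, (0.0, 1.0))
sol  = solve(prob, Tsit5(); abstol = 1e-8, reltol = 1e-8)
sol.u[end]                                 # the quadrature: ∫₀¹ t² dt = 1/3
```

The upside would be adaptive error control on the time integral for free; the open question is whether the adjoint through the solve stays cheap when `inner_loss` itself involves the networks.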

Since the method-3 example works fine, I could first write the API for it, then try the above and adjust the API accordingly (or keep both methods). Or I could go the other way round.

What do you suggest?
