Comments (3)
Yes, it is known that these do not support the flag, and the documentation for each omits maximize for that reason.
It is totally easily actionable to add maximize for RAdam and NAdam + tests, perhaps easier than adding the docs lol. I may just add this for funsies later this week but if someone wants to do this, go for it!
For LBFGS I've gotta check. Off the top of my head it should work as well, but the implementation isn't immediately obvious to me.
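For anyone following along, here is a minimal sketch of what a `maximize` flag generally amounts to: the gradient sign is flipped before the usual update, so the step ascends instead of descends. This is plain Python with a toy SGD-style rule, not the RAdam/NAdam code; `sgd_step` and `objective_grad` are hypothetical names made up for illustration.

```python
def sgd_step(params, grads, lr=0.1, maximize=False):
    """Toy update rule (hypothetical helper, not a PyTorch API)."""
    new_params = []
    for p, g in zip(params, grads):
        # With maximize=True, flip the gradient sign so the step ascends.
        if maximize:
            g = -g
        new_params.append(p - lr * g)
    return new_params


def objective_grad(x):
    # Gradient of f(x) = -(x - 3)**2, which has its maximum at x = 3.
    return -2.0 * (x - 3.0)


x = [0.0]
for _ in range(100):
    x = sgd_step(x, [objective_grad(x[0])], lr=0.1, maximize=True)
```

Until an optimizer supports the flag, negating the objective before computing gradients should give the same effect, since that flips the gradient sign in the same way.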
from pytorch.
@janeyx99 May I help to add this? I have made a PR for RAdam. If #126765 is OK, I will add another PR for NAdam.
Reopening to track LBFGS