Comments (15)
Would you consider providing (or pointing me to) a simple but complete example of using ReduceLROnPlateau? Thanks, Lars
Also, at this moment https://github.com/williamFalcon/pytorch-lightning/blob/master/pytorch_lightning/trainer/trainer.py#L958 calls lr_scheduler.step() at the end of the epoch, while some schedulers (e.g. https://pytorch.org/docs/master/optim.html#torch.optim.lr_scheduler.OneCycleLR) should be stepped at the end of each batch.
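For context, here is a minimal plain-PyTorch sketch (not Lightning-specific; the dummy model and random data are just for illustration) of why OneCycleLR wants a per-batch step rather than a per-epoch one:

import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.1, epochs=10, steps_per_epoch=100)

for epoch in range(10):
    for batch in range(100):
        optimizer.zero_grad()
        loss = model(torch.randn(8, 10)).mean()
        loss.backward()
        optimizer.step()
        scheduler.step()  # stepped every batch, unlike epoch-based schedulers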
@terkelbo good suggestion. can you propose a clean way of doing this? maybe it can be merged with #29
I think we can directly adjust the lr in the optimizer, as Keras does, which means we wouldn't need ReduceLROnPlateau to take a metric. Specifically, perhaps we can consider adding a hook like optimizer_step in pytorch_lightning/root_module/root_module.py, so that this function can be exposed to callbacks?
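To illustrate the idea, a small sketch of adjusting the learning rate directly on the optimizer, Keras-style; the reduce_lr helper name and its defaults are hypothetical, not an existing Lightning API:

def reduce_lr(optimizer, factor=0.1, min_lr=0.0):
    # hypothetical helper: shrink the lr of every param group in place
    for param_group in optimizer.param_groups:
        param_group['lr'] = max(param_group['lr'] * factor, min_lr)

# it could then be called from a hook such as optimizer_step (or a callback)
# whenever the monitored metric stops improving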
@Ir1d good point. you can just adjust the LR in the current callback also (optimizer_step), or in any of the other callbacks
closing because this should be implicitly supported since we can pass a ReduceLROnPlateau object... nothing we need to do, this is standard PyTorch functionality
When using the PyTorch object, it requires passing the metric as a param, for example lr_scheduler.step(val_loss). How do you override the original schedulers in this way?
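For reference, a minimal sketch of how ReduceLROnPlateau is driven in plain PyTorch; the model, the random data, and stepping on the training loss (rather than a real validation loss) are placeholders just to keep the example self-contained:

import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=10)

for epoch in range(100):
    optimizer.zero_grad()
    loss = F.mse_loss(model(torch.randn(8, 10)), torch.randn(8, 1))
    loss.backward()
    optimizer.step()
    # ReduceLROnPlateau needs the monitored metric (normally the validation loss)
    scheduler.step(loss.item())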
Sorry, I can't follow. Why is the issue closed? Do you suggest using hooks like optimizer_step and implementing reduce-on-plateau scheduling by hand?
I'm not sure either how to pass the ReduceLROnPlateau object, since it needs the metric argument, as pointed out by @Ir1d. @williamFalcon, would it be possible for you to give an example of how to use this scheduler with PyTorch Lightning?
It feels rather dirty to me, but you can save the loss in your pl.LightningModule's training_step() method and then use it in the optimizer_step method. I can't verify whether it works as expected right now, though.
def training_step(self, batch, batch_nb):
    # REQUIRED
    x, y = batch
    y_hat = self.forward(x)
    # keep a reference to the loss so optimizer_step can see it
    self.loss = F.mse_loss(y_hat, y)
    return {'loss': self.loss}

def configure_optimizers(self):
    # REQUIRED
    # can return multiple optimizers and learning-rate schedulers
    self.optimizer = torch.optim.Adam(self.parameters(), lr=self.lrinit)
    self.scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        self.optimizer, mode='min', factor=0.1, patience=10, verbose=False,
        threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)
    return self.optimizer

def optimizer_step(self, epoch_nb, batch_nb, optimizer, optimizer_i):
    """
    Do something instead of the standard optimizer behavior.
    :param epoch_nb:
    :param batch_nb:
    :param optimizer:
    :param optimizer_i:
    :return:
    """
    # Sometimes (if closure is not None), this step method should return the loss.
    # Here it's None.
    loss = optimizer.step()
    # So let's use self.loss, set in training_step, for LR scheduling
    self.scheduler.step(self.loss)
    # clear gradients
    optimizer.zero_grad()
Edit: This will not work as intended, since the optimizer step is called after every batch.
we could modify the scheduler step to take in the loss when it needs it? i have to look at this more carefully though
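One possible shape for that idea, as a sketch rather than the actual Lightning implementation: dispatch on the scheduler type and only pass the metric to schedulers that need it.

from torch.optim.lr_scheduler import ReduceLROnPlateau

def step_scheduler(scheduler, monitored_loss=None):
    # sketch: ReduceLROnPlateau wants the monitored metric, other schedulers don't
    if isinstance(scheduler, ReduceLROnPlateau):
        scheduler.step(monitored_loss)
    else:
        scheduler.step()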
That would be the best solution indeed, but then you'd need a way to figure out which schedulers need the loss when calling step. I'm not sure how to do that at this moment.
For now, I solved it as follows, which may help people until there is a better solution (a rough sketch follows below):
- Use the on_epoch_start callback on your LightningModule to initialize an empty list
- In every training_step call, append the loss to the list
- In the on_epoch_end callback, process the list to get the average loss, call scheduler.step(mean_loss), and clear the list
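A rough sketch of that workaround, assuming F is torch.nn.functional and self.scheduler was created in configure_optimizers, as in the earlier snippet:

def on_epoch_start(self):
    # collect the per-batch training losses for this epoch
    self.epoch_losses = []

def training_step(self, batch, batch_nb):
    x, y = batch
    loss = F.mse_loss(self.forward(x), y)
    self.epoch_losses.append(loss.item())
    return {'loss': loss}

def on_epoch_end(self):
    # step ReduceLROnPlateau on the mean training loss, then reset the list
    mean_loss = sum(self.epoch_losses) / len(self.epoch_losses)
    self.scheduler.step(mean_loss)
    self.epoch_losses = []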
From the PyTorch docs (https://pytorch.org/docs/stable/optim.html):
Prior to PyTorch 1.1.0, the learning rate scheduler was expected to be called before the optimizer’s update; 1.1.0 changed this behavior in a BC-breaking way. If you use the learning rate scheduler (calling scheduler.step()) before the optimizer’s update (calling optimizer.step()), this will skip the first value of the learning rate schedule. If you are unable to reproduce results after upgrading to PyTorch 1.1.0, please check if you are calling scheduler.step() at the wrong time.
Either way, @kvhooreb I added the .step(epoch) fix. Let me know if this works for you.
Yes, a simple example would be great please.