I get an error while running the script `pylaia-htr-train-ctc`. Here is the log:
[2022-02-09 06:55:50,368 INFO laia] Arguments: {'syms': '/content/kzh/syms_ctc.txt', 'img_dirs': ['/content/kzh/imgs/PYLAIA_PREPARED'], 'tr_txt_table': '/content/kzh/tr.txt', 'va_txt_table': '/content/kzh/va.txt', 'common': CommonArgs(seed=74565, train_path='', model_filename='model_h128', experiment_dirname='experiment', monitor=<Monitor.va_cer: 'va_cer'>, checkpoint=None), 'data': DataArgs(batch_size=10, color_mode=<ColorMode.L: 'L'>), 'train': TrainArgs(delimiters=['<space>'], checkpoint_k=3, resume=False, early_stopping_patience=20, gpu_stats=False, augment_training=False), 'optimizer': OptimizerArgs(name=<Name.RMSProp: 'RMSProp'>, learning_rate=0.0003, momentum=0.0, weight_l2_penalty=0.0, nesterov=False), 'scheduler': SchedulerArgs(active=False, monitor=<Monitor.va_cer: 'va_cer'>, patience=10, factor=0.1), 'trainer': TrainerArgs(gradient_clip_val=0.0, process_position=0, num_nodes=1, num_processes=1, gpus=1, auto_select_gpus=False, tpu_cores=None, progress_bar_refresh_rate=1, overfit_batches=0.0, track_grad_norm=-1, check_val_every_n_epoch=1, fast_dev_run=False, accumulate_grad_batches=1, max_epochs=1000, min_epochs=1, max_steps=None, min_steps=None, limit_train_batches=1.0, limit_val_batches=1.0, limit_test_batches=1.0, val_check_interval=1.0, flush_logs_every_n_steps=100, log_every_n_steps=50, accelerator=None, sync_batchnorm=False, precision=32, weights_summary='full', weights_save_path=None, num_sanity_val_steps=2, truncated_bptt_steps=None, profiler=None, benchmark=False, deterministic=False, reload_dataloaders_every_epoch=False, replace_sampler_ddp=True, terminate_on_nan=False, prepare_data_per_node=True, plugins=None, amp_backend='native', amp_level='O2', distributed_backend=None, automatic_optimization=None, move_metrics_to_cpu=False, enable_pl_optimizer=True)}
[2022-02-09 06:55:50,918 INFO laia] Installed:
[2022-02-09 06:55:51,001 INFO laia.common.loader] Loaded model model_h128
[2022-02-09 06:55:51,002 INFO laia.engine.data_module] Training data transforms:
ToImageTensor(
vision.Convert(mode=L),
vision.Invert(),
ToTensor()
)
[2022-02-09 06:55:51,004 WARNING py.warnings] UserWarning: Checkpoint directory experiment exists and is not empty. With save_top_k=3, all files in this directory will be deleted when a checkpoint is saved!
[2022-02-09 06:55:51,006 WARNING py.warnings] UserWarning: You have set progress_bar_refresh_rate < 20 on Google Colab. This may crash. Consider using progress_bar_refresh_rate >= 20 in Trainer.
[2022-02-09 06:55:51,061 INFO lightning] GPU available: True, used: True
[2022-02-09 06:55:51,061 INFO lightning] TPU available: False, using: 0 TPU cores
[2022-02-09 06:55:51,061 INFO lightning] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2022-02-09 06:55:55,327 WARNING py.warnings] UserWarning: Experiment logs directory experiment exists and is not empty. Previous log files in this directory will be deleted when the new ones are saved!
[2022-02-09 06:55:55,341 INFO lightning]
| Name | Type | Params
-------------------------------------------------------------------
0 | model | LaiaCRNN | 9.6 M
1 | model.conv | Sequential | 92.5 K
2 | model.conv.0 | ConvBlock | 160
3 | model.conv.0.conv | Conv2d | 160
4 | model.conv.0.activation | LeakyReLU | 0
5 | model.conv.0.pool | MaxPool2d | 0
6 | model.conv.1 | ConvBlock | 4.6 K
7 | model.conv.1.conv | Conv2d | 4.6 K
8 | model.conv.1.activation | LeakyReLU | 0
9 | model.conv.1.pool | MaxPool2d | 0
10 | model.conv.2 | ConvBlock | 13.9 K
11 | model.conv.2.conv | Conv2d | 13.9 K
12 | model.conv.2.activation | LeakyReLU | 0
13 | model.conv.2.pool | MaxPool2d | 0
14 | model.conv.3 | ConvBlock | 27.7 K
15 | model.conv.3.conv | Conv2d | 27.7 K
16 | model.conv.3.activation | LeakyReLU | 0
17 | model.conv.4 | ConvBlock | 46.2 K
18 | model.conv.4.conv | Conv2d | 46.2 K
19 | model.conv.4.activation | LeakyReLU | 0
20 | model.sequencer | ImagePoolingSequencer | 0
21 | model.rnn | LSTM | 9.5 M
22 | model.linear | Linear | 33.3 K
23 | criterion | CTCLoss | 0
-------------------------------------------------------------------
9.6 M Trainable params
0 Non-trainable params
9.6 M Total params
[2022-02-09 06:55:55,721 CRITICAL laia] Uncaught exception:
Traceback (most recent call last):
File "/content/PyLaia/laia/engine/engine_exception.py", line 27, in exception_catcher
yield
File "/content/PyLaia/laia/engine/engine_module.py", line 148, in validation_step
batch_y_hat = self.model(batch_x)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/content/PyLaia/laia/models/htr/laia_crnn.py", line 118, in forward
x = self.sequencer(x)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/content/PyLaia/laia/nn/image_pooling_sequencer.py", line 53, in forward
"Input images must have a fixed "
ValueError: Input images must have a fixed height of 16 pixels, found [15, 16]
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/pylaia-htr-train-ctc", line 33, in <module>
sys.exit(load_entry_point('laia', 'console_scripts', 'pylaia-htr-train-ctc')())
File "/content/PyLaia/laia/scripts/htr/train_ctc.py", line 200, in main
run(**args)
File "/content/PyLaia/laia/scripts/htr/train_ctc.py", line 128, in run
trainer.fit(engine_module, datamodule=data_module)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 468, in fit
results = self.accelerator_backend.train()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 66, in train
results = self.train_or_test()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/accelerator.py", line 66, in train_or_test
results = self.trainer.train()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 490, in train
self.run_sanity_check(self.get_model())
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 697, in run_sanity_check
_, eval_results = self.run_evaluation(test_mode=False, max_batches=self.num_sanity_val_batches)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 613, in run_evaluation
output = self.evaluation_loop.evaluation_step(test_mode, batch, batch_idx, dataloader_idx)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/evaluation_loop.py", line 178, in evaluation_step
output = self.trainer.accelerator_backend.validation_step(args)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 90, in validation_step
output = self.__validation_step(args)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 98, in __validation_step
output = self.trainer.model.validation_step(*args)
File "/content/PyLaia/laia/engine/htr_engine_module.py", line 72, in validation_step
result = super().validation_step(batch, *args, **kwargs)
File "/content/PyLaia/laia/engine/engine_module.py", line 148, in validation_step
batch_y_hat = self.model(batch_x)
File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
self.gen.throw(type, value, traceback)
File "/content/PyLaia/laia/engine/engine_exception.py", line 34, in exception_catcher
) from e
laia.engine.engine_exception.EngineException: Exception "ValueError('Input images must have a fixed height of 16 pixels, found [15, 16]',)" raised during epoch=0, global_step=0 with batch=['7_44_825', '10_35_229', '2_51_124', '13_17_158', '10_26_124', '7_0_439', '12_45_126', '10_4_131', '10_5_399', '13_16_155']