Hi there. Inference works fine, but I ran into an error while testing. I am not sure what caused it or how to fix it.
My GPU is a Tesla V100-SXM2-32GB.
Here is the error output:
(sifu) lby@ubuntu:~/code/SIFU$ python -m apps.train -cfg ./configs/train/sifu.yaml -test
ICON:
w/ Global Image Encoder: True
Image Features used by MLP: ['normal_F', 'normal_B']
Geometry Features used by MLP: ['sdf', 'cmap', 'norm', 'vis', 'sample_id']
Dim of Image Features (local): 6
Dim of Geometry Features (ICON): 7
Dim of MLP's first layer: 78
GPU available: True, used: True
TPU available: None, using: 0 TPU cores
Resume MLP weights from ./data/ckpt/sifu.ckpt
Resume normal model from ./data/ckpt/normal.ckpt
load from ./data/cape/test.txt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Testing: 0it [00:00, ?it/s]../aten/src/ATen/native/cuda/MultinomialKernel.cu:109: binarySearchForMultinomial: block: [7,0,0], thread: [96,0,0] Assertion `cumdist[size - 1] > static_cast<scalar_t>(0)` failed.
../aten/src/ATen/native/cuda/MultinomialKernel.cu:109: binarySearchForMultinomial: block: [7,0,0], thread: [97,0,0] Assertion `cumdist[size - 1] > static_cast<scalar_t>(0)` failed.
... (repeated assertion lines omitted)
../aten/src/ATen/native/cuda/MultinomialKernel.cu:109: binarySearchForMultinomial: block: [2,0,0], thread: [95,0,0] Assertion `cumdist[size - 1] > static_cast<scalar_t>(0)` failed.
Traceback (most recent call last):
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/lby/code/SIFU/apps/train.py", line 157, in <module>
trainer.test(model=model, datamodule=datamodule)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 915, in test
results = self.__test_given_model(model, test_dataloaders)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 973, in __test_given_model
results = self.fit(model)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 499, in fit
self.dispatch()
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 540, in dispatch
self.accelerator.start_testing(self)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 76, in start_testing
self.training_type_plugin.start_testing(trainer)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 118, in start_testing
self._results = trainer.run_test()
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 786, in run_test
eval_loop_results, _ = self.run_evaluation()
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 725, in run_evaluation
output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 160, in evaluation_step
output = self.trainer.accelerator.test_step(args)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 195, in test_step
return self.training_type_plugin.test_step(*args)
File "/home/lby/miniconda3/envs/sifu/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 134, in test_step
return self.lightning_module.test_step(*args, **kwargs)
File "/home/lby/code/SIFU/apps/ICON.py", line 686, in test_step
chamfer, p2s = self.evaluator.calculate_chamfer_p2s(num_samples=1000)
File "/home/lby/code/SIFU/lib/dataset/Evaluator.py", line 167, in calculate_chamfer_p2s
sample_points_from_meshes(self.tgt_mesh, num_samples))
File "/home/lby/code/pytorch3d/pytorch3d/ops/sample_points_from_meshes.py", line 100, in sample_points_from_meshes
sample_face_idxs += mesh_to_face[meshes.valid].view(num_valid_meshes, 1)
RuntimeError: numel: integer multiplication overflow
Testing: 0%| | 0/450 [00:06<?, ?it/s]
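
One thing that may be worth checking: as far as I can tell, the `cumdist[size - 1] > 0` assertion fires when the weights passed to `multinomial` do not sum to a positive number, and `sample_points_from_meshes` uses the face areas as those weights. So the target mesh may be empty, degenerate, or contain NaN vertices. The `numel` overflow reported at line 100 is possibly just where the earlier asynchronous CUDA failure surfaced; running with `CUDA_LAUNCH_BLOCKING=1` should give a more accurate traceback. Below is a minimal sketch of a pre-sampling check, assuming the mesh is available as plain lists (for a PyTorch3D `Meshes` object, something like `mesh.verts_packed().tolist()` and `mesh.faces_packed().tolist()`). The helper names `total_face_area` and `can_sample` are hypothetical and not part of SIFU or PyTorch3D.

```python
import math


def total_face_area(verts, faces):
    """Sum of triangle areas for a mesh given as plain Python lists.

    verts: list of [x, y, z] coordinates
    faces: list of [i, j, k] vertex indices
    """
    total = 0.0
    for i, j, k in faces:
        a, b, c = verts[i], verts[j], verts[k]
        # Two edge vectors of the triangle.
        u = [b[n] - a[n] for n in range(3)]
        v = [c[n] - a[n] for n in range(3)]
        # Cross product; half its norm is the triangle area.
        cx = u[1] * v[2] - u[2] * v[1]
        cy = u[2] * v[0] - u[0] * v[2]
        cz = u[0] * v[1] - u[1] * v[0]
        total += 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)
    return total


def can_sample(verts, faces):
    """True if area-weighted sampling has a valid (positive, finite) total weight."""
    area = total_face_area(verts, faces)
    return math.isfinite(area) and area > 0.0
```

If `can_sample` returns `False` for the first test subject, the problem is probably in the ground-truth mesh (wrong path, empty file, or a failed load) rather than in the GPU or the sampling code.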