Comments (11)
You can also update the dict manually, like this:
    from collections import OrderedDict

    state_dict = checkpoint['state_dict']
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        if 'module' not in k:
            # key was saved without the DataParallel prefix: add it
            k = 'module.' + k
        else:
            # move a misplaced 'module.' to the front of the key
            k = k.replace('features.module.', 'module.features.')
        new_state_dict[k] = v
    model.load_state_dict(new_state_dict)
from pytorch-classification.
change:
model.load_state_dict(torch.load(path + '/pytorch_model.pt'))
to
model.load_state_dict(torch.load(path + '/pytorch_model.pt'), strict=False)
from pytorch-classification.
Hi,
The problem is that the model was saved with DataParallel active and you are trying to load it without DataParallel. That's why there is an extra 'module.' at the beginning of each key!
Refer to this link for more information:
https://discuss.pytorch.org/t/missing-keys-unexpected-keys-in-state-dict-when-loading-self-trained-model/22379
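For the opposite direction to the snippet earlier in this thread (checkpoint keys carry the 'module.' prefix, the model is not wrapped), a minimal sketch of stripping the prefix could look like this. The helper name strip_module_prefix and the toy keys are made up for illustration, and plain integers stand in for the tensors; wrapping the model in torch.nn.DataParallel before loading is the other common way out.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            # Drop only a leading 'module.', not one in the middle of a key.
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned


# Stand-in for checkpoint['state_dict']; real values would be tensors.
saved = OrderedDict([("module.conv1.weight", 1), ("module.fc.bias", 2)])
print(list(strip_module_prefix(saved)))
```

With a real checkpoint you would pass the result to model.load_state_dict(...) on the unwrapped model.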
from pytorch-classification.
change:
model.load_state_dict(torch.load(path + '/pytorch_model.pt'))
to
model.load_state_dict(torch.load(path + '/pytorch_model.pt'), strict=False)
Although it will make the RuntimeError go away, don't do this unless you know what you are doing. It will leave any parameters it can't find in the checkpoint with random values. That's not what you want if the issue is caused by a mix-up of parameter names, as was the case for the issue reporter.
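Before reaching for strict=False, it can help to look at which keys actually differ. The sketch below does that comparison on plain lists of key names; diff_keys and the example keys are hypothetical, and with real objects the inputs would be model.state_dict().keys() and the keys of the loaded checkpoint dict.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, as seen from the model's side."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model wants, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)   # checkpoint has, model lacks
    return missing, unexpected


# Toy example: the model is wrapped in DataParallel, the checkpoint is not.
model_keys = ["module.conv1.weight", "module.fc.bias"]
checkpoint_keys = ["conv1.weight", "fc.bias"]
missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If every missing key has an unexpected twin that differs only by a prefix, the names are mixed up and renaming the keys is the right fix; strict=False would silently skip all of them.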
from pytorch-classification.
Multi-GPU usage in PyTorch is a little difficult.
In TF, you just set os.environ["CUDA_VISIBLE_DEVICES"]='0,1,2,3'
from pytorch-classification.
I am getting an error similar to this one.
This is what I am running:
python cifar.py -a preresnet --depth 110 --epochs 3 --schedule 81 122 --gamma 0.1 --wd 1e-4 --checkpoint checkpoints/cifar10/preresnet-110 --resume 'checkpoint.pth.tar'
('checkpoint.pth.tar' is from the OneDrive folder)
RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.bn.weight", "module.bn.bias", "module.bn.running_mean", "module.bn.running_var".
Unexpected key(s) in state_dict: "module.bn1.weight", "module.bn1.bias", "module.bn1.running_mean", "module.bn1.running_var", "module.layer1.0.conv3.weight", "module.layer1.0.bn3.weight", "module.layer1.0.bn3.bias", "module.layer1.0.bn3.running_mean", "module.layer1.0.bn3.running_var", "module.layer1.0.downsample.0.weight", "module.layer1.0.downsample.1.weight", "module.layer1.0.downsample.1.bias", "module.layer1.0.downsample.1.running_mean", "module.layer1.0.downsample.1.running_var", "module.layer1.1.conv3.weight", "module.layer1.1.bn3.weight", "module.layer1.1.bn3.bias", "module.layer1.1.bn3.running_mean", "module.layer1.1.bn3.running_var", "module.layer1.2.conv3.weight", "module.layer1.2.bn3.weight", "module.layer1.2.bn3.bias", "module.layer1.2.bn3.running_mean", "module.layer1.2.bn3.running_var", "module.layer1.3.conv3.weight", "module.layer1.3.bn3.weight", "module.layer1.3.bn3.bias", "module.layer1.3.bn3.running_mean", "module.layer1.3.bn3.running_var", "module.layer1.4.conv3.weight", "module.layer1.4.bn3.weight", "module.layer1.4.bn3.bias", "module.layer1.4.bn3.running_mean", "module.layer1.4.bn3.running_var", "module.layer1.5.conv3.weight", "module.layer1.5.bn3.weight", "module.layer1.5.bn3.bias", "module.layer1.5.bn3.running_mean", "module.layer1.5.bn3.running_var", "module.layer1.6.conv3.weight", "module.layer1.6.bn3.weight", "module.layer1.6.bn3.bias", "module.layer1.6.bn3.running_mean", "module.layer1.6.bn3.running_var", "module.layer1.7.conv3.weight", "module.layer1.7.bn3.weight", "module.layer1.7.bn3.bias", "module.layer1.7.bn3.running_mean", "module.layer1.7.bn3.running_var", "module.layer1.8.conv3.weight", "module.layer1.8.bn3.weight", "module.layer1.8.bn3.bias", "module.layer1.8.bn3.running_mean", "module.layer1.8.bn3.running_var", "module.layer1.9.conv3.weight", "module.layer1.9.bn3.weight", "module.layer1.9.bn3.bias", "module.layer1.9.bn3.running_mean", "module.layer1.9.bn3.running_var", "module.layer1.10.conv3.weight", 
"module.layer1.10.bn3.weight", "module.layer1.10.bn3.bias", "module.layer1.10.bn3.running_mean", "module.layer1.10.bn3.running_var", "module.layer1.11.conv3.weight", "module.layer1.11.bn3.weight", "module.layer1.11.bn3.bias", "module.layer1.11.bn3.running_mean", "module.layer1.11.bn3.running_var", "module.layer1.12.conv3.weight", "module.layer1.12.bn3.weight", "module.layer1.12.bn3.bias", "module.layer1.12.bn3.running_mean", "module.layer1.12.bn3.running_var", "module.layer1.13.conv3.weight", "module.layer1.13.bn3.weight", "module.layer1.13.bn3.bias", "module.layer1.13.bn3.running_mean", "module.layer1.13.bn3.running_var", "module.layer1.14.conv3.weight", "module.layer1.14.bn3.weight", "module.layer1.14.bn3.bias", "module.layer1.14.bn3.running_mean", "module.layer1.14.bn3.running_var", "module.layer1.15.conv3.weight", "module.layer1.15.bn3.weight", "module.layer1.15.bn3.bias", "module.layer1.15.bn3.running_mean", "module.layer1.15.bn3.running_var", "module.layer1.16.conv3.weight", "module.layer1.16.bn3.weight", "module.layer1.16.bn3.bias", "module.layer1.16.bn3.running_mean", "module.layer1.16.bn3.running_var", "module.layer1.17.conv3.weight", "module.layer1.17.bn3.weight", "module.layer1.17.bn3.bias", "module.layer1.17.bn3.running_mean", "module.layer1.17.bn3.running_var", "module.layer2.0.conv3.weight", "module.layer2.0.bn3.weight", "module.layer2.0.bn3.bias", "module.layer2.0.bn3.running_mean", "module.layer2.0.bn3.running_var", "module.layer2.0.downsample.1.weight", "module.layer2.0.downsample.1.bias", "module.layer2.0.downsample.1.running_mean", "module.layer2.0.downsample.1.running_var", "module.layer2.1.conv3.weight", "module.layer2.1.bn3.weight", "module.layer2.1.bn3.bias", "module.layer2.1.bn3.running_mean", "module.layer2.1.bn3.running_var", "module.layer2.2.conv3.weight", "module.layer2.2.bn3.weight", "module.layer2.2.bn3.bias", "module.layer2.2.bn3.running_mean", "module.layer2.2.bn3.running_var", "module.layer2.3.conv3.weight", 
"module.layer2.3.bn3.weight", "module.layer2.3.bn3.bias", "module.layer2.3.bn3.running_mean", "module.layer2.3.bn3.running_var", "module.layer2.4.conv3.weight", "module.layer2.4.bn3.weight", "module.layer2.4.bn3.bias", "module.layer2.4.bn3.running_mean", "module.layer2.4.bn3.running_var", "module.layer2.5.conv3.weight", "module.layer2.5.bn3.weight", "module.layer2.5.bn3.bias", "module.layer2.5.bn3.running_mean", "module.layer2.5.bn3.running_var", "module.layer2.6.conv3.weight", "module.layer2.6.bn3.weight", "module.layer2.6.bn3.bias", "module.layer2.6.bn3.running_mean", "module.layer2.6.bn3.running_var", "module.layer2.7.conv3.weight", "module.layer2.7.bn3.weight", "module.layer2.7.bn3.bias", "module.layer2.7.bn3.running_mean", "module.layer2.7.bn3.running_var", "module.layer2.8.conv3.weight", "module.layer2.8.bn3.weight", "module.layer2.8.bn3.bias", "module.layer2.8.bn3.running_mean", "module.layer2.8.bn3.running_var", "module.layer2.9.conv3.weight", "module.layer2.9.bn3.weight", "module.layer2.9.bn3.bias", "module.layer2.9.bn3.running_mean", "module.layer2.9.bn3.running_var", "module.layer2.10.conv3.weight", "module.layer2.10.bn3.weight", "module.layer2.10.bn3.bias", "module.layer2.10.bn3.running_mean", "module.layer2.10.bn3.running_var", "module.layer2.11.conv3.weight", "module.layer2.11.bn3.weight", "module.layer2.11.bn3.bias", "module.layer2.11.bn3.running_mean", "module.layer2.11.bn3.running_var", "module.layer2.12.conv3.weight", "module.layer2.12.bn3.weight", "module.layer2.12.bn3.bias", "module.layer2.12.bn3.running_mean", "module.layer2.12.bn3.running_var", "module.layer2.13.conv3.weight", "module.layer2.13.bn3.weight", "module.layer2.13.bn3.bias", "module.layer2.13.bn3.running_mean", "module.layer2.13.bn3.running_var", "module.layer2.14.conv3.weight", "module.layer2.14.bn3.weight", "module.layer2.14.bn3.bias", "module.layer2.14.bn3.running_mean", "module.layer2.14.bn3.running_var", "module.layer2.15.conv3.weight", "module.layer2.15.bn3.weight", 
"module.layer2.15.bn3.bias", "module.layer2.15.bn3.running_mean", "module.layer2.15.bn3.running_var", "module.layer2.16.conv3.weight", "module.layer2.16.bn3.weight", "module.layer2.16.bn3.bias", "module.layer2.16.bn3.running_mean", "module.layer2.16.bn3.running_var", "module.layer2.17.conv3.weight", "module.layer2.17.bn3.weight", "module.layer2.17.bn3.bias", "module.layer2.17.bn3.running_mean", "module.layer2.17.bn3.running_var", "module.layer3.0.conv3.weight", "module.layer3.0.bn3.weight", "module.layer3.0.bn3.bias", "module.layer3.0.bn3.running_mean", "module.layer3.0.bn3.running_var", "module.layer3.0.downsample.1.weight", "module.layer3.0.downsample.1.bias", "module.layer3.0.downsample.1.running_mean", "module.layer3.0.downsample.1.running_var", "module.layer3.1.conv3.weight", "module.layer3.1.bn3.weight", "module.layer3.1.bn3.bias", "module.layer3.1.bn3.running_mean", "module.layer3.1.bn3.running_var", "module.layer3.2.conv3.weight", "module.layer3.2.bn3.weight", "module.layer3.2.bn3.bias", "module.layer3.2.bn3.running_mean", "module.layer3.2.bn3.running_var", "module.layer3.3.conv3.weight", "module.layer3.3.bn3.weight", "module.layer3.3.bn3.bias", "module.layer3.3.bn3.running_mean", "module.layer3.3.bn3.running_var", "module.layer3.4.conv3.weight", "module.layer3.4.bn3.weight", "module.layer3.4.bn3.bias", "module.layer3.4.bn3.running_mean", "module.layer3.4.bn3.running_var", "module.layer3.5.conv3.weight", "module.layer3.5.bn3.weight", "module.layer3.5.bn3.bias", "module.layer3.5.bn3.running_mean", "module.layer3.5.bn3.running_var", "module.layer3.6.conv3.weight", "module.layer3.6.bn3.weight", "module.layer3.6.bn3.bias", "module.layer3.6.bn3.running_mean", "module.layer3.6.bn3.running_var", "module.layer3.7.conv3.weight", "module.layer3.7.bn3.weight", "module.layer3.7.bn3.bias", "module.layer3.7.bn3.running_mean", "module.layer3.7.bn3.running_var", "module.layer3.8.conv3.weight", "module.layer3.8.bn3.weight", "module.layer3.8.bn3.bias", 
"module.layer3.8.bn3.running_mean", "module.layer3.8.bn3.running_var", "module.layer3.9.conv3.weight", "module.layer3.9.bn3.weight", "module.layer3.9.bn3.bias", "module.layer3.9.bn3.running_mean", "module.layer3.9.bn3.running_var", "module.layer3.10.conv3.weight", "module.layer3.10.bn3.weight", "module.layer3.10.bn3.bias", "module.layer3.10.bn3.running_mean", "module.layer3.10.bn3.running_var", "module.layer3.11.conv3.weight", "module.layer3.11.bn3.weight", "module.layer3.11.bn3.bias", "module.layer3.11.bn3.running_mean", "module.layer3.11.bn3.running_var", "module.layer3.12.conv3.weight", "module.layer3.12.bn3.weight", "module.layer3.12.bn3.bias", "module.layer3.12.bn3.running_mean", "module.layer3.12.bn3.running_var", "module.layer3.13.conv3.weight", "module.layer3.13.bn3.weight", "module.layer3.13.bn3.bias", "module.layer3.13.bn3.running_mean", "module.layer3.13.bn3.running_var", "module.layer3.14.conv3.weight", "module.layer3.14.bn3.weight", "module.layer3.14.bn3.bias", "module.layer3.14.bn3.running_mean", "module.layer3.14.bn3.running_var", "module.layer3.15.conv3.weight", "module.layer3.15.bn3.weight", "module.layer3.15.bn3.bias", "module.layer3.15.bn3.running_mean", "module.layer3.15.bn3.running_var", "module.layer3.16.conv3.weight", "module.layer3.16.bn3.weight", "module.layer3.16.bn3.bias", "module.layer3.16.bn3.running_mean", "module.layer3.16.bn3.running_var", "module.layer3.17.conv3.weight", "module.layer3.17.bn3.weight", "module.layer3.17.bn3.bias", "module.layer3.17.bn3.running_mean", "module.layer3.17.bn3.running_var".
size mismatch for module.layer1.0.conv1.weight: copying a param with shape torch.Size([16, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.1.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.2.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.3.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.4.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.5.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.6.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.7.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.8.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.9.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.10.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.11.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.12.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.13.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.14.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.15.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.16.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.17.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer2.0.bn1.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.conv1.weight: copying a param with shape torch.Size([32, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 3, 3]).
size mismatch for module.layer2.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).
size mismatch for module.layer2.1.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.2.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.3.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.4.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.5.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.6.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.7.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.8.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.9.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.10.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.11.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.12.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.13.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.14.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.15.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.16.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.17.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer3.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.conv1.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
size mismatch for module.layer3.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]).
size mismatch for module.layer3.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.3.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.4.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.5.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.6.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.7.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.8.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.9.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.10.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.11.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.12.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.13.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.14.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.15.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.16.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.17.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.fc.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([10, 64]).
size mismatch for module.fc.bias: copying a param with shape torch.Size([100]) from checkpoint, the shape in current model is torch.Size([10]).
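Note that this log looks different from a pure 'module.' prefix problem: both sides already have the prefix, the checkpoint contains conv3/bn3 entries the model lacks, and fc goes from 100 outputs in the checkpoint to 10 in the model. That reading comes only from the log above and is not confirmed in the thread. A small sketch for listing such shape differences, with shapes written as plain tuples and a hypothetical helper name:

```python
def shape_mismatches(model_shapes, checkpoint_shapes):
    """List (key, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    return [
        (key, checkpoint_shapes[key], shape)
        for key, shape in model_shapes.items()
        if key in checkpoint_shapes and checkpoint_shapes[key] != shape
    ]


# Shapes copied from the last two lines of the error message above.
model_shapes = {"module.fc.weight": (10, 64), "module.fc.bias": (10,)}
checkpoint_shapes = {"module.fc.weight": (100, 256), "module.fc.bias": (100,)}

for key, ckpt_shape, model_shape in shape_mismatches(model_shapes, checkpoint_shapes):
    print(f"{key}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With real objects the two dicts could be built as {k: tuple(v.shape) for k, v in state_dict.items()}. Shape differences like these cannot be fixed by renaming keys; the model has to be constructed with the same architecture and number of classes as the checkpoint.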
from pytorch-classification.
You can also update the dict manually, like this:
    state_dict = checkpoint['state_dict']
    from collections import OrderedDict
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        if 'module' not in k:
            k = 'module.' + k
        else:
            k = k.replace('features.module.', 'module.features.')
        new_state_dict[k] = v
    model.load_state_dict(new_state_dict)
I suppose this issue can be closed, as the referenced post explains the cause of the error and offers a solution.
from pytorch-classification.
You can also update the dict manually, like this:
    state_dict = checkpoint['state_dict']
    from collections import OrderedDict
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        if 'module' not in k:
            k = 'module.' + k
        else:
            k = k.replace('features.module.', 'module.features.')
        new_state_dict[k] = v
    model.load_state_dict(new_state_dict)
life saver!
from pytorch-classification.
size mismatch for module.layer1.3.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.4.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.5.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.6.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.7.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.8.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.9.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.10.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.11.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.12.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.13.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.14.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.15.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.16.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer1.17.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).
size mismatch for module.layer2.0.bn1.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.bn1.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for module.layer2.0.conv1.weight: copying a param with shape torch.Size([32, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 3, 3]).
size mismatch for module.layer2.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).
size mismatch for module.layer2.1.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.2.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.3.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.4.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.5.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.6.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.7.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.8.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.9.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.10.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.11.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.12.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.13.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.14.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.15.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.16.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer2.17.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
size mismatch for module.layer3.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for module.layer3.0.conv1.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]).
size mismatch for module.layer3.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]).
size mismatch for module.layer3.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.3.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.4.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.5.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.6.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.7.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.8.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.9.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.10.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.11.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.12.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.13.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.14.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.15.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.16.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.layer3.17.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for module.fc.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([10, 64]).
size mismatch for module.fc.bias: copying a param with shape torch.Size([100]) from checkpoint, the shape in current model is torch.Size([10]).
Did you solve it, please?
from pytorch-classification.
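One way to read a log like the one above is to diff parameter names and shapes between the checkpoint and the freshly built model before calling load_state_dict. The sketch below is only an illustration: diff_shapes is a hypothetical helper, and the example shapes are copied from the size-mismatch lines in the log. With PyTorch, each mapping could be built as {k: tuple(v.shape) for k, v in some_state_dict.items()}.

```python
def diff_shapes(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings.

    Hypothetical helper, not part of PyTorch or of this repository.
    """
    ckpt_keys = set(checkpoint_shapes)
    model_keys = set(model_shapes)
    return {
        # Keys the model expects but the checkpoint does not contain.
        "missing": sorted(model_keys - ckpt_keys),
        # Keys the checkpoint contains but the model does not define.
        "unexpected": sorted(ckpt_keys - model_keys),
        # Keys present on both sides whose shapes disagree.
        "mismatched": {
            k: (checkpoint_shapes[k], model_shapes[k])
            for k in sorted(ckpt_keys & model_keys)
            if checkpoint_shapes[k] != model_shapes[k]
        },
    }


# Shapes taken from the error log above (checkpoint vs. current model).
checkpoint_shapes = {
    "module.fc.weight": (100, 256),
    "module.fc.bias": (100,),
    "module.layer1.0.conv1.weight": (16, 16, 1, 1),
}
model_shapes = {
    "module.fc.weight": (10, 64),
    "module.fc.bias": (10,),
    "module.layer1.0.conv1.weight": (16, 16, 3, 3),
}

report = diff_shapes(checkpoint_shapes, model_shapes)
for name, (in_ckpt, in_model) in report["mismatched"].items():
    print(f"{name}: checkpoint {in_ckpt} vs model {in_model}")
```

In this log the fc layer has 100 outputs and 256 inputs in the checkpoint but 10 and 64 in the model, and the checkpoint's conv1 kernels are 1x1 where the model's are 3x3. That pattern suggests the checkpoint was saved from a different architecture and class count than the one being built, which renaming keys cannot fix.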
from collections import OrderedDict
state_dict = checkpoint['state_dict']
new_state_dict = OrderedDict()
for k, v in state_dict.items():
    if 'module' not in k:
        k = 'module.' + k
    else:
        k = k.replace('features.module.', 'module.features.')
    new_state_dict[k] = v
model.load_state_dict(new_state_dict)
This is the solution!!!! Thanks!!!!!
from pytorch-classification.
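The renaming loop quoted above only adds the prefix. A slightly more general sketch, assuming the only difference between checkpoint and model is a leading "module." on each key, can add or strip it depending on the target model. match_prefix is a hypothetical helper name, and it does not handle the 'features.module.' case from the quoted snippet.

```python
from collections import OrderedDict

PREFIX = "module."


def match_prefix(state_dict, want_prefix):
    """Return a copy of state_dict whose keys all have (or all lack) 'module.'.

    Hypothetical helper: pass want_prefix=True when the target model is wrapped
    in DataParallel, and False when it is a plain module.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        has_prefix = key.startswith(PREFIX)
        if want_prefix and not has_prefix:
            key = PREFIX + key
        elif not want_prefix and has_prefix:
            key = key[len(PREFIX):]
        renamed[key] = value
    return renamed


# Plain strings stand in for tensors so the example runs without PyTorch.
saved_with_dp = OrderedDict([("module.fc.weight", "w"), ("module.fc.bias", "b")])
saved_plain = OrderedDict([("fc.weight", "w"), ("fc.bias", "b")])

print(list(match_prefix(saved_with_dp, want_prefix=False)))
print(list(match_prefix(saved_plain, want_prefix=True)))
```

With a real model the call would look like model.load_state_dict(match_prefix(checkpoint['state_dict'], isinstance(model, torch.nn.DataParallel))). This only helps when the names differ by the prefix alone; it will not resolve size mismatches.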
When saving a checkpoint in DataParallel (DP) mode, use model.module.state_dict() instead of model.state_dict(), so the saved keys do not carry the "module." prefix.
from pytorch-classification.
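A minimal sketch of that advice, assuming a toy nn.Linear layer in place of the real network and an in-memory buffer in place of a checkpoint file. unwrapped_state_dict is a hypothetical helper name.

```python
import io

import torch
import torch.nn as nn


def unwrapped_state_dict(model):
    """Return the state dict of the underlying module, without 'module.' keys.

    Hypothetical helper: DataParallel keeps the wrapped network in .module.
    """
    if isinstance(model, nn.DataParallel):
        model = model.module
    return model.state_dict()


net = nn.Linear(4, 2)            # toy stand-in for the real network
dp_model = nn.DataParallel(net)  # wrapping also works on a CPU-only machine

wrapped_keys = list(dp_model.state_dict())          # keys start with 'module.'
plain_keys = list(unwrapped_state_dict(dp_model))   # keys without the prefix
print(wrapped_keys, plain_keys)

# Save the unwrapped weights, then load them into a plain (non-DP) model.
buffer = io.BytesIO()
torch.save(unwrapped_state_dict(dp_model), buffer)
buffer.seek(0)

fresh = nn.Linear(4, 2)
fresh.load_state_dict(torch.load(buffer))
```

A checkpoint saved this way loads into an unwrapped model directly. To load it into a DataParallel model, call load_state_dict on dp_model.module rather than on dp_model.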