Hey,
thank you for the nice implementation!
I was mostly interested in the EarlyConvViT. While comparing it against the paper for my experiment, I noticed that when the prepended conv block is prepared like this:
self.conv_layers = nn.Sequential(
    *[nn.Sequential(
        nn.Conv2d(in_channels=n_filter_list[i],
                  out_channels=n_filter_list[i + 1],
                  kernel_size=3,  # hardcoding for now because that's what the paper used
                  stride=2,  # hardcoding for now because that's what the paper used
                  padding=1),  # hardcoding for now because that's what the paper used
    )
        for i in range(len(n_filter_list)-1)
    ])
there should be a batch norm and a ReLU added after each convolution, like this:
self.conv_layers = nn.Sequential(
    *[nn.Sequential(
        nn.Conv2d(in_channels=n_filter_list[i],
                  out_channels=n_filter_list[i + 1],
                  kernel_size=3,  # hardcoding for now because that's what the paper used
                  stride=2,  # hardcoding for now because that's what the paper used
                  padding=1),  # hardcoding for now because that's what the paper used
        nn.BatchNorm2d(n_filter_list[i + 1]),
        nn.ReLU(inplace=True)
    )
        for i in range(len(n_filter_list)-1)
    ])
right?
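
For reference, here is a small self-contained sketch of the proposed block that can be run on its own to check the output shape. The helper name `build_conv_stem`, the channel list `(3, 48, 96, 192, 384)`, and the 224x224 dummy input are placeholders I picked for this example; they are not taken from the repository.

```python
import torch
from torch import nn


def build_conv_stem(n_filter_list):
    """Stack Conv2d -> BatchNorm2d -> ReLU, one block per consecutive channel pair.

    Hypothetical helper for illustration; it mirrors the snippet proposed above.
    """
    return nn.Sequential(
        *[
            nn.Sequential(
                nn.Conv2d(
                    in_channels=n_filter_list[i],
                    out_channels=n_filter_list[i + 1],
                    kernel_size=3,
                    stride=2,
                    padding=1,
                ),
                nn.BatchNorm2d(n_filter_list[i + 1]),
                nn.ReLU(inplace=True),
            )
            for i in range(len(n_filter_list) - 1)
        ]
    )


# Assumed channel progression, chosen only for this example.
n_filter_list = (3, 48, 96, 192, 384)
conv_layers = build_conv_stem(n_filter_list)

# Dummy batch of two RGB images; each stride-2 conv halves the resolution.
x = torch.randn(2, 3, 224, 224)
out = conv_layers(x)
print(out.shape)  # torch.Size([2, 384, 14, 14])
```

With kernel size 3, stride 2 and padding 1, each block halves the spatial size, so the four blocks here take 224 down to 14, and the final ReLU leaves only non-negative activations.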