adambielski / siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch
License: BSD 3-Clause "New" or "Revised" License
Great work on comparing softmax, siamese and triplet networks. What kind of metric do you think could be used to compare the different methods fairly, instead of relying only on qualitative results? I saw it's on the TODO list; do you have any ideas about it?
Hi~
If I don't know the test_data_labels, how can I choose the test triplets?
img1 = self.test_data[self.test_triplets[index][0]]
img2 = self.test_data[self.test_triplets[index][1]]
img3 = self.test_data[self.test_triplets[index][2]]
Does this triplet selection rely on test_labels, which we must already have?
I don't know whether my question is clear enough 😁
Nice code! I learned a lot from your code. However, I still have a question about the class 'EmbeddingNet' in networks.py.
I noticed that the output size of 'EmbeddingNet' is 2, so I worry that the embedding of an input sample is too small, which may cause problems when this embedding is used to calculate the triplet loss.
Thanks for sharing great work
Is the embedding normalized so that ||x|| = 1?
It looks like EmbeddingNetL2 normalizes the embedding, but it is not used. According to the README, EmbeddingNet is the network that is used, and it does not normalize embeddings.
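For readers unsure what the normalization in question does, here is a minimal plain-Python sketch of L2 normalization of a single embedding. The helper name `l2_normalize` and the `eps` guard are illustrative assumptions, not code from the repository.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector so that its Euclidean norm is 1 (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    # eps only guards against division by zero for an all-zero vector.
    return [x / max(norm, eps) for x in vec]


embedding = [3.0, 4.0]                 # norm is 5
unit_embedding = l2_normalize(embedding)
unit_norm = math.sqrt(sum(x * x for x in unit_embedding))
```

After normalization all embeddings lie on the unit sphere, so any two of them are at most 2 apart; a margin chosen for unnormalized embeddings may need to be revisited.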
The reported validation result is just the accuracy of the last batch of val_loader rather than of the whole set. Will you fix this so that the accuracy covers the whole validation set?
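One way to report accuracy over the whole validation set is to accumulate counts across batches and divide only at the end. The sketch below is plain Python; the class name `AccumulatedAccuracy` and its methods are hypothetical and are not claimed to match the repository's metric classes.

```python
class AccumulatedAccuracy:
    """Running accuracy over every batch seen so far (hypothetical helper)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predictions, targets):
        # Add this batch's matches to the running totals.
        self.correct += sum(int(p == t) for p, t in zip(predictions, targets))
        self.total += len(targets)

    def value(self):
        # Percentage over all batches, not just the most recent one.
        return 100.0 * self.correct / self.total if self.total else 0.0


metric = AccumulatedAccuracy()
metric.update([1, 0, 2, 2], [1, 0, 2, 1])   # 3 of 4 correct
metric.update([0, 0], [1, 1])               # 0 of 2 correct
whole_set_accuracy = metric.value()         # 3 correct out of 6 samples
```

Reporting only the last batch here would give 0%, while the accumulated value reflects all six samples.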
Hi, I noticed that you used parametric ReLU for these experiments, and I tried replacing it with vanilla ReLU. It turns out that even for the simple MNIST classification network, training cannot progress at all. It is true that ReLU imposes the additional constraint that only the positive quadrant can be used, but it still surprises me that the training loss stays at 2.3 (about ln 10, the chance level for 10 classes), which means nothing is being learned.
Did you adopt PReLU as a conscious choice, and what was the rationale? Thank you so much.
might be interesting to compare with the alternative triplet loss described in: https://arxiv.org/abs/1412.6622
I think that your ContrastiveLoss implementation could have a problem.
In the paper the contrastive loss is defined by:
But the contrastive loss function that is implemented is:
Hi, nice work on the repo, thank you for these great examples.
I'm trying to increase the size and am getting tensor mismatch errors; I cannot really see a way around it.
Is there a way to avoid hard-coding the FC layer input size and get it from the convnet somehow?
RuntimeError: size mismatch, m1: [256 x 179776], m2: [1024 x 256] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:268
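A small sketch of how the flattened feature size follows from the input size, in plain Python. It assumes the convnet consists of two 5x5 convolutions (no padding), each followed by 2x2 max pooling with stride 2, and ends with 64 channels; the 224x224 input is a guess that happens to reproduce the 179776 in the error message, not something stated in the report. The function names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size, channels=64):
    # Assumed layer sequence:
    # Conv2d(k=5) -> MaxPool2d(2, stride=2) -> Conv2d(k=5) -> MaxPool2d(2, stride=2)
    size = conv_out(input_size, kernel=5)
    size = conv_out(size, kernel=2, stride=2)
    size = conv_out(size, kernel=5)
    size = conv_out(size, kernel=2, stride=2)
    return channels * size * size


mnist_features = flattened_features(28)     # 64 * 4 * 4
large_features = flattened_features(224)    # 64 * 53 * 53
```

The first `nn.Linear` would then take `flattened_features(input_size)` inputs instead of a hard-coded `64 * 4 * 4`. Another common option is to push one dummy input through the convnet once and read the flattened size from the result.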
Thanks for your great work; the code is clean and works well.
However, I have a question about your implementation of the contrastive loss. Could you explain it in more detail? It is quite different from the description in the original paper.
In the meantime, I tried a contrastive loss implementation by someone else, but it does not work. Could you explain the difference?
# Custom Contrastive Loss
import torch
import torch.nn.functional as F


class ContrastiveLoss(torch.nn.Module):
    """
    Contrastive loss function.
    Based on: http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
    """

    def __init__(self, margin=2.0):
        super(ContrastiveLoss, self).__init__()
        self.margin = margin

    def forward(self, output1, output2, label):
        # label == 0 selects the squared-distance term, label == 1 the margin term
        euclidean_distance = F.pairwise_distance(output1, output2)
        loss_contrastive = torch.mean((1 - label) * torch.pow(euclidean_distance, 2) +
                                      (label) * torch.pow(torch.clamp(self.margin - euclidean_distance, min=0.0), 2))
        return loss_contrastive
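One thing worth checking when two contrastive loss implementations disagree is the label convention. The snippet above follows Hadsell et al., where label 0 marks a similar pair and label 1 a dissimilar pair. The scalar sketch below (plain Python, hypothetical function name, one pair at a time) shows what happens if the data instead marks same-class pairs with 1; whether that is the cause here is an assumption, not something the report confirms.

```python
def contrastive_loss(distance, label, margin=2.0, dissimilar_label=1):
    """Contrastive loss for a single pair, given its Euclidean distance.

    dissimilar_label says which label value marks a different-class pair.
    """
    if label == dissimilar_label:
        # Push different-class pairs apart until they are `margin` away.
        return max(margin - distance, 0.0) ** 2
    # Pull same-class pairs together.
    return distance ** 2


# A same-class pair that is already close (distance 0.5):
paper_convention = contrastive_loss(0.5, label=0)                        # similar = 0
flipped_convention = contrastive_loss(0.5, label=1, dissimilar_label=0)  # similar = 1
mismatched = contrastive_loss(0.5, label=1)  # similar = 1 fed to a "similar = 0" loss
```

With matching conventions the close same-class pair gets a small loss (0.25); with mismatched conventions the same pair is pushed apart (2.25), which would keep training from converging.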
Hello, after we train the siamese and triplet network embeddings, how should we use them to classify?
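One simple option, sketched here in plain Python, is a nearest-centroid classifier on top of the embeddings: average the training embeddings of each class, then assign a new embedding to the closest class mean. This is only one possible approach (k-nearest neighbours on the embeddings is another); the function names and toy data are illustrative assumptions.

```python
import math


def class_centroids(embeddings, labels):
    """Mean embedding per class (hypothetical helper)."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(embedding, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(embedding, centroids[label]))


# Toy 2-D embeddings standing in for the output of a trained embedding network.
train_embeddings = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
train_labels = [0, 0, 1, 1]
centroids = class_centroids(train_embeddings, train_labels)
prediction_a = classify([1.0, 1.0], centroids)
prediction_b = classify([9.0, 9.0], centroids)
```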
Would you please explain this code in more detail?
class EmbeddingNet(nn.Module):
    def __init__(self):
        super(EmbeddingNet, self).__init__()
        self.convnet = nn.Sequential(nn.Conv2d(1, 32, 5), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2),
                                     nn.Conv2d(32, 64, 5), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2))
        self.fc = nn.Sequential(nn.Linear(64 * 4 * 4, 256),
                                nn.PReLU(),
                                nn.Linear(256, 256),
                                nn.PReLU(),
                                **nn.Linear(256, 2)**
                                )
I mean the bold line.
We have 10 classes, so it seems it should be 10 instead of 2!
Any help would be appreciated.
In your case, 2-dimensional embeddings are learned and visualized. If >=3 dimensional embeddings were learned and then dimension-reduced to 2D using UMAP or t-SNE like algorithms, will the visualization look different than current behavior?
In your definition, there is no max(0,x) used (https://github.com/adambielski/siamese-triplet#triplet-network). But in your code, you use F.relu() to get a similar effect to max(0,x). Are there crucial differences between these two definitions, with or without F.relu()? What about their characteristics and performance?
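Applying ReLU to a value is the same as taking max(0, x), so the question comes down to whether the hinge is present at all. The scalar sketch below (plain Python, hypothetical function name, made-up distances) compares the two variants for one easy and one hard triplet.

```python
def triplet_loss(d_ap, d_an, margin=1.0, hinge=True):
    """Triplet loss for one triplet from anchor-positive and anchor-negative distances."""
    raw = d_ap - d_an + margin
    # max(0.0, raw) is what a ReLU computes for a single value.
    return max(0.0, raw) if hinge else raw


easy_with_hinge = triplet_loss(0.5, 3.0)                 # negative already far away
easy_without_hinge = triplet_loss(0.5, 3.0, hinge=False)
hard_with_hinge = triplet_loss(2.0, 1.0)                 # negative closer than positive
hard_without_hinge = triplet_loss(2.0, 1.0, hinge=False)
```

For the hard triplet both variants agree. For the easy triplet the hinged loss is zero, so it stops contributing, while the version without the hinge keeps rewarding larger and larger anchor-negative distances and has no lower bound.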
Hi~
If I don't know the test_dataloader_label, how can I choose the test triplets?
img1 = self.test_data[self.test_triplets[index][0]]
img2 = self.test_data[self.test_triplets[index][1]]
img3 = self.test_data[self.test_triplets[index][2]]
I want to build test triplets in order to do classification. The code in the data loader seems to select the test_data through test_labels?
I don't know whether my question is clear enough? 😁
How to train on multiple GPUs? How to implement it?
Great work!
For Fashion-MNIST, I am surprised that the contrastive/triplet embeddings do not look better than softmax, at least on the plot. I am even more surprised that online sampling with negative mining does not seem to perform better than random sampling.
I understand you're using a 2D embedding space, but still. Do you have any quantitative numbers, maybe using a higher-dimensional space?
It could be interesting to plot in 3D as well.
Is there a way to train the network with my own data?
Hey,
Is there any plan to include a more complex dataset, such as one with 10K~100K classes?
Intuitively, MNIST is easy because it has only 10 classes to cluster.
Thanks for sharing the great work, BTW :)
Any plans on releasing this under some open source license?
Dear all,
I ran into this error when I was trying to evaluate the code with SiameseMNIST. It seems that MNIST from PyTorch indeed does not have attributes train_labels or train_data. How do you get these attributes? Could you kindly give some advice? Thank you!
Excuse me if I'm wrong, but I believe there is a minor mistake in your semi-hard negative selector:
def semihard_negative(loss_values, margin):
    semihard_negatives = np.where(np.logical_and(loss_values < margin, loss_values > 0))[0]
    return np.random.choice(semihard_negatives) if len(semihard_negatives) > 0 else None
In the FaceNet paper it says:
"We call these negative exemplars semi-hard, as they are further away from the anchor than the positive exemplar, but still hard because the squared distance is close to the anchor-positive distance. Those negatives lie inside the margin α."
Considering that loss_values is computed as ap_distance - distance_matrix[torch.LongTensor(np.array([anchor_positive[0]])), torch.LongTensor(negative_indices)] + self.margin, you want |f(x_a) - f(x_p)| - |f(x_a) - f(x_n)| to lie on the segment [α, 2α], instead of lying on the interval [-α, ∞].
Briefly put: I believe that the function should be defined this way:
def semihard_negative(loss_values, margin):
    semihard_negatives = np.where(loss_values > 0)[0]
    return np.random.choice(semihard_negatives) if len(semihard_negatives) > 0 else None
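To make the arithmetic concrete, the sketch below works through three negatives in plain Python. It assumes loss = d_ap - d_an + margin, as quoted above, and reads the FaceNet definition as "farther from the anchor than the positive, but still inside the margin", i.e. d_ap < d_an < d_ap + margin. The function name and the distances are made up for illustration.

```python
def negative_kind(d_ap, d_an, margin):
    """Label a negative as easy, semi-hard or hard from its triplet loss value."""
    loss = d_ap - d_an + margin
    if loss <= 0:
        return "easy"        # d_an >= d_ap + margin: the triplet is already satisfied
    if loss < margin:
        return "semi-hard"   # d_ap < d_an < d_ap + margin
    return "hard"            # d_an <= d_ap: negative at least as close as the positive


d_ap, margin = 1.0, 0.5
far_negative = negative_kind(d_ap, 2.0, margin)      # loss = -0.5
inside_margin = negative_kind(d_ap, 1.2, margin)     # loss is about 0.3
closer_than_pos = negative_kind(d_ap, 0.8, margin)   # loss is about 0.7
```

Under that reading, the two-sided condition `0 < loss_values < margin` picks exactly the semi-hard negatives, while `loss_values > 0` on its own would also admit hard negatives such as the last example.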
Hello,
Does it make any sense that my training performance improves (on data other than MNIST) while the performance on the validation set doesn't change at all? I checked: both sets are balanced in labels and include the same types of labels.
I initially load the data on the CPU and then pass it to the GPU, but it gives the following error:
Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same
I am attaching the code below (I am using my own custom data):
import cv2
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset


class Data_Load(Dataset):
    def __init__(self, train_path, test_path, train):
        # Transforms
        # self.transform = transform
        # Read the csv file
        self.train = train
        if self.train:
            self.train_data_info = pd.read_csv(train_path, header=None)
            self.train_data = []
            print("printing train data length CUHK")
            print(len(self.train_data_info.index))
            for (i, j) in enumerate(np.asarray(self.train_data_info.iloc[:, 0])):
                try:
                    # Note: dividing the uint8 image by the float 255. yields a float64 array
                    self.train_data.append(np.moveaxis(cv2.imread(j)[:, :, :3] / 255., -1, 0))
                except:
                    print(j)
            self.train_data = np.stack(self.train_data)
            self.train_labels = np.asarray(self.train_data_info.iloc[:, 1])
            self.train_labels = torch.from_numpy(self.train_labels)
            self.train_data_len = len(self.train_data_info.index)
        else:
            self.test_data_info = pd.read_csv(test_path, header=None)
            self.test_data = []
            print("printing test data length CUHK")
            print(len(self.test_data_info.index))
            for (i, j) in enumerate(np.asarray(self.test_data_info.iloc[:, 0])):
                try:
                    self.test_data.append(np.moveaxis(cv2.imread(j)[:, :, :3] / 255., -1, 0))
                except:
                    print(j)
            self.test_data = np.stack(self.test_data)
            self.test_labels = np.asarray(self.test_data_info.iloc[:, 1])
            self.test_labels = torch.from_numpy(self.test_labels)
            self.test_data_len = len(self.test_data_info.index)

    def __getitem__(self, index):
        if self.train:
            img, target = self.train_data[index], self.train_labels[index]
        else:
            img, target = self.test_data[index], self.test_labels[index]
        return (img, target)

    def __len__(self):
        if self.train:
            return self.train_data_len
        else:
            return self.test_data_len
And in trainer.py:
......
for batch_idx, (data, target) in enumerate(train_loader):
    target = target if len(target) > 0 else None
    if not type(data) in (tuple, list):
        data = (data,)
    if cuda:
        data = tuple(d.float().cuda() for d in data)
        if target is not None:
            target = target.float().cuda()
    optimizer.zero_grad()
    outputs = model(*data)
.....
Traceback:
RuntimeError Traceback (most recent call last)
in ()
----> 1 fit(online_train_loader, online_test_loader, model, loss_fn, optimizer, scheduler, n_epochs, cuda, log_interval)

8 frames

/content/siamese-triplet/trainer.py in fit(train_loader, val_loader, model, loss_fn, optimizer, scheduler, n_epochs, cuda, log_interval, metrics, start_epoch)
     21
     22         # Train stage
---> 23         train_loss, metrics = train_epoch(train_loader, model, loss_fn, optimizer, cuda, log_interval, metrics)
     24
     25         message = 'Epoch: {}/{}. Train set: Average loss: {:.4f}'.format(epoch + 1, n_epochs, train_loss)

/content/siamese-triplet/trainer.py in train_epoch(train_loader, model, loss_fn, optimizer, cuda, log_interval, metrics)
     57
     58         optimizer.zero_grad()
---> 59         outputs = model(*data)
     60
     61         if type(outputs) not in (tuple, list):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

/content/siamese-triplet/networks.py in forward(self, x)
     43
     44     def forward(self, x):
---> 45         output = self.convnet(x)
     46         output = output.view(output.size()[0], -1)
     47         output = self.fc(output)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    343
    344     def forward(self, input):
--> 345         return self.conv2d_forward(input, self.weight)
    346
    347 class Conv3d(_ConvNd):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in conv2d_forward(self, input, weight)
    340                             _pair(0), self.dilation, self.groups)
    341         return F.conv2d(input, weight, self.bias, self.stride,
--> 342                         self.padding, self.dilation, self.groups)
    343
    344     def forward(self, input):

RuntimeError: Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same
Can anyone help me sort this out?
Thank you.
Maybe in class HardNegativePairSelector, the hard negative pairs should be the pairs with the closest distances, since the objective is to separate the embeddings in a negative pair.
If this is not right, just ignore this issue.
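The selection rule proposed here can be sketched in a few lines of plain Python: among pairs whose labels differ, keep the k pairs with the smallest distance. The function name, the pair list and the distances are made up for illustration and are not taken from the repository.

```python
import heapq


def hardest_negative_pairs(pairs, distances, labels, k):
    """Return the k different-class pairs with the smallest distance."""
    negative = [
        (dist, pair)
        for pair, dist in zip(pairs, distances)
        if labels[pair[0]] != labels[pair[1]]
    ]
    # The closest negative pairs are the ones the loss most needs to separate.
    return [pair for dist, pair in heapq.nsmallest(k, negative)]


labels = [0, 0, 1, 1]
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
distances = [0.1, 0.9, 0.4, 0.2, 0.8, 0.05]
selected = hardest_negative_pairs(pairs, distances, labels, k=2)
```

The same-class pairs (0, 1) and (2, 3) are ignored even though they are the closest overall; of the four negative pairs, the two nearest are kept.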
In fact, I want to know the siamese network's performance on CIFAR-10. Have you tried it?
Hi there, I'm using this repo (awesome job, by the way) to pre-train a model following the online triplet approach.
Since I want to obtain embeddings of 512 dimensions instead of 2D embeddings, should I modify the pdist function?
def pdist(vectors):
    distance_matrix = -2 * vectors.mm(torch.t(vectors)) + vectors.pow(2).sum(dim=1).view(1, -1) + vectors.pow(2).sum(
        dim=1).view(-1, 1)
    return distance_matrix
Sometimes my distance matrix has no 0's on the diagonal when I change the dimension to 512. I wonder whether a change is needed, maybe to something like this:
def pdist(vectors):
    distance_matrix = -vectors.size()[1] * vectors.mm(torch.t(vectors)) + vectors.pow(2).sum(dim=1).view(1, -1) + vectors.pow(2).sum(
        dim=1).view(-1, 1)
    return distance_matrix
Or should the online triplet loss function work with any embedding dimension the way it is?
Thanks in advance.
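The factor -2 comes from expanding the squared distance, ||a - b||^2 = ||a||^2 - 2 a·b + ||b||^2, which holds in any dimension, so it should not be replaced by the embedding size. The plain-Python sketch below checks the expansion against a direct computation for 512-dimensional vectors; the helper names and the random test data are illustrative assumptions.

```python
import random


def pdist(vectors):
    """Squared Euclidean distances via ||a||^2 - 2 a.b + ||b||^2."""
    sq_norms = [sum(x * x for x in v) for v in vectors]
    n = len(vectors)
    return [
        [
            sq_norms[i] + sq_norms[j]
            - 2 * sum(a * b for a, b in zip(vectors[i], vectors[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


def pdist_direct(vectors):
    """Reference: sum of squared coordinate differences."""
    return [
        [sum((a - b) ** 2 for a, b in zip(u, v)) for v in vectors]
        for u in vectors
    ]


rng = random.Random(0)
vectors = [[rng.uniform(-1.0, 1.0) for _ in range(512)] for _ in range(4)]
expanded = pdist(vectors)
direct = pdist_direct(vectors)
max_gap = max(abs(expanded[i][j] - direct[i][j]) for i in range(4) for j in range(4))
max_diagonal = max(abs(expanded[i][i]) for i in range(4))
```

In double precision the two results agree to rounding error. Small nonzero diagonal entries in a float32 tensor computation are most likely accumulated rounding error from the same expansion; clamping the matrix at zero is a common remedy, though that is a suggestion rather than something the repository is known to do.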
https://twitter.com/alfcnz/status/1133372277876068352
Unfortunately that triplet loss is flawed. The most offending negative sample has zero gradient. That power of 2 should be a power of ½. I feel bad so many people still use it. 😕 https://t.co/M3daSGzlMK
— Alfredo Canziani (@alfcnz) May 28, 2019
There's some discussion going on in the replies as well, but if there is an issue it should be addressed here.
Hi, I'm relatively new to the concept of triplet loss and I was just curious about the "Average Nonzero Triplets" metric. What exactly is this measuring? And is this something that should be going down/up/neither during training?
Thanks in advance for the help!
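One plausible reading of the metric, sketched in plain Python below, is the number of triplets in a batch whose hinge loss is still positive, i.e. triplets that violate the margin and therefore contribute a gradient. This interpretation and the function name are assumptions; the distances are made-up examples of an early and a late training batch.

```python
def count_nonzero_triplets(d_ap, d_an, margin=1.0):
    """Count triplets with d_ap - d_an + margin > 0 (still violating the margin)."""
    return sum(1 for ap, an in zip(d_ap, d_an) if ap - an + margin > 0)


# Early in training: positives and negatives are at similar distances.
early = count_nonzero_triplets([1.0, 0.8, 1.2], [0.9, 1.0, 1.1])
# Later: most negatives are pushed beyond the margin.
late = count_nonzero_triplets([0.2, 0.3, 0.9], [2.0, 1.5, 1.0])
```

Under this reading, the average would be expected to go down during training, since fewer and fewer triplets violate the margin.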
I am training the Triplet MNIST model, but I couldn't find the path where the model is saved.
The MNIST set consists of arrays (not images).
When I modify this to take images and return arrays, I can't get a good data loader with this repo:
class TripletMNIST(Dataset):
When I'm ready for 'fit', the cell just shows a [*] forever.
I tried passing the data loaders as triplet_train_loader.dataset, and I tried to get the image array to exactly the same size and shape as the MNIST images, but I get mismatch errors. I get errors like 1x0x0 "too small". I spent more than a day on the 'too small' problem, changing the dims.
BTW, great repo. Beautiful code sewn up very nicely.
First, thanks so much for your amazing work. Have you tried the soft margin proposed in the person re-id paper? How does it work?
I was wondering if we could add adaptive thresholds and quadruplet networks, which excel at these tasks, for better results. I would be more than happy to open pull requests for them.
Hi @adambielski,
I am trying to test on CIFAR-100, but the accuracy is very low. Can you help me?
I changed the network structure as follows.
class EmbeddingNet(nn.Module):
    def __init__(self):
        super(EmbeddingNet, self).__init__()
        self.convnet = nn.Sequential(nn.Conv2d(3, 32, 1), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2),
                                     nn.Conv2d(32, 64, 1), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2))
        self.linear1 = nn.Linear(1024 * 2 * 2, 256)
        self.prelu1 = nn.PReLU()
        self.linear2 = nn.Linear(256, 256)
        self.prelu2 = nn.Sigmoid()
        self.linear3 = nn.Linear(256, 2)
Hey, thanks for your great work, but I have run into a problem.
I noticed that with the Fashion-MNIST dataset and online triplet selection, the validation set average loss does not decrease during training; on the contrary, it even increases.
I tried my own dataset with a different network (ResNet18) and faced the same problem.
Do you have any idea why this happens?
Thank you very much.
Hi @adambielski,
I need to do some testing with the triplet model. My experiments consist of extracting features for each image in the test dataset and using these features for image retrieval. I'm saving the trained model, but I'm not able to load it to extract the features of each image, because the model is trained on triplet-format input. Any ideas?
It seems to me that the fixed test pair set is not guaranteed to be excluded from the random training pairs generated by the SiameseMNIST class.
If this is correct, it will cause leakage, no? With high probability for small sample sizes.
Dear author, I want to use my own dataset to train the model, but I don't think the way MNIST is loaded suits my data. Can you tell me how to change the loading for my dataset? Thanks!
Great work! Thanks for sharing such elegant code.
I have a small question about the TripletMNIST(Dataset) class. Your comment about the test set is "Test: Creates fixed triplets for testing".
However, in the code it is
triplets = [[i,
             random_state.choice(self.label_to_indices[self.test_labels[i].item()]),
             random_state.choice(self.label_to_indices[
                                     np.random.choice(
                                         list(self.labels_set - set([self.test_labels[i].item()]))
                                     )
                                 ])
             ]
I think it should maybe be the following:
triplets = [[i,
             random_state.choice(self.label_to_indices[self.test_labels[i].item()]),
             random_state.choice(self.label_to_indices[
                                     random_state.choice(
                                         list(self.labels_set - set([self.test_labels[i].item()]))
                                     )
                                 ])
             ]
I mean using random_state.choice instead of the np.random.choice in the original code, in order to create a fixed test set.
I am not sure if I am right; can we discuss it?
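The point about reproducibility can be illustrated with a plain-Python sketch in which every random draw, including the choice of the negative class, comes from one seeded generator. Python's `random.Random` stands in for `np.random.RandomState` here, and the function name, labels and seed are illustrative assumptions.

```python
import random


def make_fixed_triplets(labels, seed):
    """Build [anchor, positive, negative] index triplets from one seeded generator."""
    rng = random.Random(seed)
    label_to_indices = {}
    for index, label in enumerate(labels):
        label_to_indices.setdefault(label, []).append(index)
    all_labels = sorted(label_to_indices)

    triplets = []
    for anchor, label in enumerate(labels):
        positive = rng.choice(label_to_indices[label])
        # The negative class is drawn from the same seeded generator, so the
        # whole list depends only on `labels` and `seed`.
        negative_label = rng.choice([other for other in all_labels if other != label])
        negative = rng.choice(label_to_indices[negative_label])
        triplets.append([anchor, positive, negative])
    return triplets


labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
first = make_fixed_triplets(labels, seed=29)
second = make_fixed_triplets(labels, seed=29)
```

If any one of the draws used an unseeded global generator instead, the two lists would no longer be guaranteed to match between runs.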
Hi, when I was using the siamese loss as my loss function, I ran into a situation where the derivatives suddenly became nan in some iteration. It turns out that this happens because some of the distances equal 0 (which is very unlikely), and consequently the derivative of sqrt(d) with respect to d is nan. I added a small smoothing constant to solve this problem; perhaps you should consider adding a smoothing parameter too.
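A scalar sketch of why this happens, in plain Python. The derivative of sqrt(x) is 1 / (2 sqrt(x)), which is infinite at x = 0; by the chain rule it is multiplied by the derivative of the squared distance, which is 0 when the two embeddings coincide, and infinity times zero is nan. The helper name and the value of `eps` are illustrative assumptions.

```python
import math


def sqrt_grad(squared_distance, eps=0.0):
    """Derivative of sqrt(x + eps) with respect to x."""
    root = math.sqrt(squared_distance + eps)
    return math.inf if root == 0.0 else 1.0 / (2.0 * root)


# For identical embeddings the inner derivative 2 * (a - b) is 0.
inner_grad = 0.0
unsafe_grad = sqrt_grad(0.0) * inner_grad            # inf * 0 -> nan
safe_grad = sqrt_grad(0.0, eps=1e-9) * inner_grad    # finite * 0 -> 0.0
```

With the small constant inside the square root the gradient stays finite, so the zero-distance pair simply contributes no gradient instead of poisoning the update with nan.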
I think
loss_values = ap_distance - distance_matrix[torch.LongTensor(np.array([anchor_positive[0]])), torch.LongTensor(negative_indices)] + self.margin
should be
loss_values = ap_distance - distance_matrix[torch.LongTensor(np.array([anchor_positive[0]])), torch.LongTensor(negative_indices)]
and in semihard_negative
loss_values < margin, loss_values > 0
should be
loss_values < 0, loss_values > -margin
Hello!
I have the following problem using the triplet network: during 2-class training, I can't see clusters being formed; rather, the 1st class is surrounded by embeddings of the 2nd class. What could be the issue in my case?
Regards
Thanks very much for your implementation. May I ask whether there is any plan to implement the accuracy metric for the siamese net? Thanks a lot.
Line 37 in 0c719f9
Why are you using the ReLU activation here?
Thank you.
Hi,
I am currently working on my master's thesis and plan to use this code as part of my project. Is there a way to obtain permission for this? I couldn't find anything in the documentation.
What format should I use for citing?
Best,
Davide