adambielski / siamese-triplet

Siamese and triplet networks with online pair/triplet mining in PyTorch

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

machine-learning deep-learning siamese-network triplet-loss contrastive-loss pytorch embedding triplet-network learning-embeddings

siamese-triplet's Issues

a question about tripletloss

    Nice code! I learned a lot from your code. However, I still have a question about the class 'EmbeddingNet' in networks.py.
    I noticed that the output size of 'EmbeddingNet' is 2, so I worry that the embedding of an input sample is too small, and that this may cause problems when the embedding is used to calculate the triplet loss.

Some additions needed

I was wondering if we could add adaptive thresholds and Quadruplet Networks, which excel on these tasks, for better results. I would be more than happy to open pull requests for these.

ContrastiveLoss implementation

I think that your ContrastiveLoss implementation could have a problem.
In the paper the contrastive loss is defined by:

But the contrastive loss function implemented is:

training set score improving while validation set score without change at all

Hello

Does it make any sense that my training performance improves (on different data than MNIST) while the performance on the validation set doesn't change at all? I checked: both sets are balanced in labels and include the same types of labels.

Why vanilla ReLU cannot train at all.

Hi, I noticed that you used parametric ReLU for these experiments, and I tried replacing it with vanilla ReLU. It turns out that even for the simple MNIST classification network, training does not progress at all. It is true that ReLU imposes the additional constraint that only the positive quadrant can be used, but it still surprises me that the training loss stays at 2.3 (about ln 10, the chance level for 10 classes), which means nothing is being learned.

Did you adopt PReLU as a conscious choice, and what was the rationale? Thank you so much.

Tests on CIFAR100

Hi @adambielski,

I am trying to test on CIFAR-100, but the accuracy I get is very low. Can you help me?
I changed the network structure as follows.

class EmbeddingNet(nn.Module):
    def __init__(self):
        super(EmbeddingNet, self).__init__()
        self.convnet = nn.Sequential(nn.Conv2d(3, 32, 1), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2),
                                     nn.Conv2d(32, 64, 1), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2))

        self.linear1 = nn.Linear(1024 * 2 * 2, 256)
        self.prelu1 =  nn.PReLU()
        self.linear2 = nn.Linear(256, 256)
        self.prelu2 = nn.Sigmoid()
        self.linear3 =  nn.Linear(256, 2)

result of validation set

The validation result is just the accuracy of the last batch of val_loader instead of the whole set. I wonder whether you will fix this accuracy computation?

In networks.py, why is the last layer set to (256, 2)?

Would you please describe this code in more detail?

class EmbeddingNet(nn.Module):
    def __init__(self):
        super(EmbeddingNet, self).__init__()
        self.convnet = nn.Sequential(nn.Conv2d(1, 32, 5), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2),
                                     nn.Conv2d(32, 64, 5), nn.PReLU(),
                                     nn.MaxPool2d(2, stride=2))

        self.fc = nn.Sequential(nn.Linear(64 * 4 * 4, 256),
                                nn.PReLU(),
                                nn.Linear(256, 256),
                                nn.PReLU(),
                                nn.Linear(256, 2)  # <-- this line
                                )

I mean the line marked above.
We have 10 classes, so it seems it should be 10 instead of 2.
Any help would be appreciated.

In "trainer", why do you run training and validation in the same epoch loop?

Questions about online triplet loss

Hi there, I'm using this repo (awesome job, by the way) to pre-train a model following the online triplet approach.
Since I want to obtain embeddings of 512 dimensions instead of 2-D embeddings, should I modify the pdist function?

def pdist(vectors):
    distance_matrix = -2 * vectors.mm(torch.t(vectors)) + vectors.pow(2).sum(dim=1).view(1, -1) + vectors.pow(2).sum(
        dim=1).view(-1, 1)
    return distance_matrix

Sometimes my distance matrix has no zeros on the diagonal when I change the dimension to 512. I wonder whether a change is needed, maybe something like this:

def pdist(vectors):
    distance_matrix = -vectors.size()[1]* vectors.mm(torch.t(vectors)) + vectors.pow(2).sum(dim=1).view(1, -1) + vectors.pow(2).sum(
        dim=1).view(-1, 1)
    return distance_matrix

Or should the online triplet loss function work with any embedding dimension as it is?
thanks in advance
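The expansion ‖a − b‖² = ‖a‖² − 2·a·b + ‖b‖² holds for any embedding size, so the factor −2 does not depend on the dimension. A diagonal that is not exactly zero at 512 dimensions is most likely floating-point rounding. The sketch below keeps the original formula and adds a clamp at zero (the clamp is an assumption, not part of the quoted code), then compares the result with `torch.cdist`:

```python
import torch

def pdist(vectors):
    # Squared Euclidean distances via ||a||^2 - 2 a.b + ||b||^2 (valid for any embedding size).
    sq_norms = vectors.pow(2).sum(dim=1)
    distance_matrix = sq_norms.view(-1, 1) - 2 * vectors.mm(vectors.t()) + sq_norms.view(1, -1)
    # Assumption: clamp small negative values caused by rounding; not in the original function.
    return distance_matrix.clamp(min=0)

torch.manual_seed(0)
x = torch.randn(8, 512)            # toy batch of 512-dimensional embeddings
d = pdist(x)
reference = torch.cdist(x, x).pow(2)  # direct computation for comparison
```

The diagonal of `d` stays close to zero relative to the off-diagonal distances, which are on the order of a thousand for this toy batch.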

question about FunctionNegativeTripletSelector

I think
loss_values = ap_distance - distance_matrix[torch.LongTensor(np.array([anchor_positive[0]])), torch.LongTensor(negative_indices)] + self.margin
should be
loss_values = ap_distance - distance_matrix[torch.LongTensor(np.array([anchor_positive[0]])), torch.LongTensor(negative_indices)]

and in semihard_negative
loss_values < margin, loss_values > 0
should be
loss_values < 0, loss_values > -margin

Different device error

I am initially loading data into CPU and after that passing it to GPU, but it gives the following error,

Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same

The code is attached below (I am using my own custom data):

class Data_Load(Dataset):
    def __init__(self,train_path,test_path,train):
        # Transforms
        
        # self.transform = transform
        # Read the csv file
        self.train=train
        if self.train:
            self.train_data_info = pd.read_csv(train_path,header=None)
            self.train_data =[] 
            
            print("printing train data length CUHK")
            print(len(self.train_data_info.index))

            for (i,j) in enumerate(np.asarray(self.train_data_info.iloc[:, 0])):
                try:
                    self.train_data.append(np.moveaxis(cv2.imread(j)[:,:,:3]/ 255., -1, 0))
                except:
                    print(j)

            self.train_data = np.stack(self.train_data)
            self.train_labels = np.asarray(self.train_data_info.iloc[:, 1])
            self.train_labels = torch.from_numpy(self.train_labels)

            self.train_data_len = len(self.train_data_info.index)

        else :
            self.test_data_info = pd.read_csv(test_path,header=None)
            self.test_data =[] 

            print("printing test data length CUHK")
            print(len(self.test_data_info.index))

            for (i,j) in enumerate(np.asarray(self.test_data_info.iloc[:, 0])):
                try : 
                    self.test_data.append(np.moveaxis(cv2.imread(j)[:,:,:3] / 255., -1, 0))
                except : 
                    print(j)  

            self.test_data = np.stack(self.test_data)
            self.test_labels = np.asarray(self.test_data_info.iloc[:, 1])
            self.test_labels = torch.from_numpy(self.test_labels)
            
            self.test_data_len = len(self.test_data_info.index)
            

    def __getitem__(self, index):
        if self.train:
            img, target = self.train_data[index], self.train_labels[index]
        else:
            img, target = self.test_data[index], self.test_labels[index]

        return (img,target)

    def __len__(self):
        if self.train :
            return self.train_data_len
        else :
            return self.test_data_len

And in the trainer.py

......
for batch_idx, (data, target) in enumerate(train_loader):
        target = target if len(target) > 0 else None
        if not type(data) in (tuple, list):
            data = (data,)
        
        if cuda:
            data = tuple(d.float().cuda() for d in data)
            if target is not None:
                target = target.float().cuda()

        optimizer.zero_grad()
        outputs = model(*data)
.....

Traceback :

RuntimeError Traceback (most recent call last)
in ()
----> 1 fit(online_train_loader, online_test_loader, model, loss_fn, optimizer, scheduler, n_epochs, cuda, log_interval)

8 frames
/content/siamese-triplet/trainer.py in fit(train_loader, val_loader, model, loss_fn, optimizer, scheduler, n_epochs, cuda, log_interval, metrics, start_epoch)
21
22 # Train stage
---> 23 train_loss, metrics = train_epoch(train_loader, model, loss_fn, optimizer, cuda, log_interval, metrics)
24
25 message = 'Epoch: {}/{}. Train set: Average loss: {:.4f}'.format(epoch + 1, n_epochs, train_loss)

/content/siamese-triplet/trainer.py in train_epoch(train_loader, model, loss_fn, optimizer, cuda, log_interval, metrics)
57
58 optimizer.zero_grad()
---> 59 outputs = model(*data)
60
61 if type(outputs) not in (tuple, list):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/siamese-triplet/networks.py in forward(self, x)
43
44 def forward(self, x):
---> 45 output = self.convnet(x)
46 output = output.view(output.size()[0], -1)
47 output = self.fc(output)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
98 def forward(self, input):
99 for module in self:
--> 100 input = module(input)
101 return input
102

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
343
344 def forward(self, input):
--> 345 return self.conv2d_forward(input, self.weight)
346
347 class Conv3d(_ConvNd):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in conv2d_forward(self, input, weight)
340 _pair(0), self.dilation, self.groups)
341 return F.conv2d(input, weight, self.bias, self.stride,
--> 342 self.padding, self.dilation, self.groups)
343
344 def forward(self, input):

RuntimeError: Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same

Can anyone help me sort this out?

Thank You.
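The error message says the input is float64 (Double) while the weights are float32 (Float). Dividing a uint8 image by `255.` in NumPy yields float64, and `torch.from_numpy` keeps that dtype. One possible fix, sketched here with a stand-in array instead of `cv2.imread` and not tested against the dataset class above, is to cast to float32 before the array becomes a tensor:

```python
import numpy as np

# Stand-in for cv2.imread(path): an 8-bit HxWx3 image (hypothetical data).
image = np.zeros((4, 4, 3), dtype=np.uint8)

# Same preprocessing as in the dataset above: scale to [0, 1], move channels first.
as_loaded = np.moveaxis(image[:, :, :3] / 255., -1, 0)   # dtype is float64

# Cast once so the resulting tensor matches float32 model weights.
fixed = as_loaded.astype(np.float32)
```

Calling `.float()` on the tensor just before the forward pass has the same effect; casting in the dataset also halves the memory used by the stacked array.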

Accuracy Metric in SiameseNet

Thanks very much for your implementation. May I know whether there is any plan to implement the accuracy metric for the Siamese net? Thanks a lot.

how to train the net

Semi-hard triplet loss implementation

Excuse me if I'm wrong, but I believe there is a minor mistake in your semi-hard negative selector:

def semihard_negative(loss_values, margin):
    semihard_negatives = np.where(np.logical_and(loss_values < margin, loss_values > 0))[0]
    return np.random.choice(semihard_negatives) if len(semihard_negatives) > 0 else None

In the FaceNet paper it says that

We call these negative exemplars semi-hard, as they are further away from the anchor than the positive exemplar, but still hard because the squared distance is close to the anchor-positive distance. Those negatives lie inside the margin α.

Considering that loss_values is computed as ap_distance - distance_matrix[torch.LongTensor(np.array([anchor_positive[0]])), torch.LongTensor(negative_indices)] + self.margin, you want |f(x_a) - f(x_p)| - |f(x_a) - f(x_n)| to lie on the segment [α, 2α], instead of lying on the interval [-α, ∞).

Briefly put: I believe that the function should be defined this way:

def semihard_negative(loss_values, margin):
    semihard_negatives = np.where(loss_values > 0)[0]
    return np.random.choice(semihard_negatives) if len(semihard_negatives) > 0 else None
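One way to check the two proposals is to compare a condition on `loss_values` with the FaceNet definition written directly on distances, d(a, p) < d(a, n) < d(a, p) + margin. With loss_values = d(a, p) − d(a, n) + margin, that definition is algebraically the same as 0 < loss_values < margin. The numbers below are made up for illustration:

```python
import numpy as np

margin = 1.0
ap_distance = 2.0                                   # anchor-positive distance (toy value)
an_distances = np.array([0.5, 1.9, 2.3, 2.9, 3.5])  # anchor-negative distances (toy values)

# Same quantity as loss_values in the selector discussed above.
loss_values = ap_distance - an_distances + margin

# Condition expressed on loss values.
by_loss = np.logical_and(loss_values < margin, loss_values > 0)

# FaceNet-style condition expressed on distances.
by_distance = np.logical_and(an_distances > ap_distance,
                             an_distances < ap_distance + margin)
```

Both masks select the negatives at distances 2.3 and 2.9. Dropping the `loss_values < margin` bound would also admit the negatives at 0.5 and 1.9, which are closer to the anchor than the positive.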

Extract features from images - Triplet Model

Hi @adambielski,

I need to do some testing with the triplet model. My experiments consist of extracting features for each image in the test dataset and using these features for image retrieval. I'm saving the trained model, but I'm not able to load it to extract the features of each image, because the model is trained on triplet-format input. Any ideas?
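If the triplet model is a thin wrapper around one shared embedding network, a single image batch can be passed through that network alone. The sketch below defines its own minimal `TripletNet` with a `get_embedding` method and a toy embedding network; both are stand-ins for illustration, and the real class in the repository may differ:

```python
import torch
import torch.nn as nn

class TripletNet(nn.Module):
    # Hypothetical wrapper: one shared embedding network applied to all three inputs.
    def __init__(self, embedding_net):
        super().__init__()
        self.embedding_net = embedding_net

    def forward(self, x1, x2, x3):
        return self.embedding_net(x1), self.embedding_net(x2), self.embedding_net(x3)

    def get_embedding(self, x):
        # Single-input path used for feature extraction.
        return self.embedding_net(x)

# Toy embedding network standing in for the trained one.
embedding_net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 2))
model = TripletNet(embedding_net)

model.eval()
with torch.no_grad():
    images = torch.randn(5, 1, 28, 28)       # a batch of single images, not triplets
    features = model.get_embedding(images)   # one feature vector per image
```

The resulting `features` tensor can then be indexed for retrieval, for example by nearest-neighbour search.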

How to visualize high dimensional feature distribution with this Method?

Siamese loss problem

Hi, when I was using the Siamese loss as my loss function, I ran into a situation where the derivatives suddenly became nan in some iteration. It turns out that this is because some of the distances equal 0 (very unlikely to happen), and consequently the derivative of sqrt(d) with respect to d is nan. I added a small smoothing constant to solve this problem; perhaps you should consider adding a smoothing parameter too.
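A minimal sketch of the effect described above: the derivative of sqrt(d) at d = 0 is infinite, and it turns into nan once the chain rule multiplies it by zero. The constant `eps` is an assumed value chosen for illustration:

```python
import torch

eps = 1e-9  # assumed smoothing constant

# Plain square root at zero distance: the gradient is infinite.
d = torch.zeros(1, requires_grad=True)
d.sqrt().sum().backward()
grad_plain = d.grad.clone()

# Smoothed square root: the gradient is large but finite.
d_smooth = torch.zeros(1, requires_grad=True)
(d_smooth + eps).sqrt().sum().backward()
grad_smoothed = d_smooth.grad.clone()
```

A larger `eps` gives a smaller gradient at zero distance at the cost of a slightly biased distance.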

Twitter Issue: "triplet loss is flawed"

https://twitter.com/alfcnz/status/1133372277876068352

Unfortunately that triplet loss is flawed. The most offending negative sample has zero gradient. That power of 2 should be a power of ½.
I feel bad so many people still use it. 😕 https://t.co/M3daSGzlMK
— Alfredo Canziani (@alfcnz) May 28, 2019

There's some discussion going on in the replies as well, but if there is an issue it should be addressed here.

Save Model Path?

I am training the Triplet MNIST model and can't find the path where the model is saved.

How to train on multiple GPUs?

How to train on multiple GPUs? How to implement it?

Fixed test pair exclusion from train pairs

It seems to me that the fixed test pair set is not guaranteed to be excluded in the random training pairs generated from the SiameseMNIST class.

If this is correct, it will cause leakage, no? With high probability for small sample sizes.

How to load my own dataset for training rather than only the MNIST dataset?

Dear author, I want to use my own dataset to train the model, but I think the way MNIST is loaded doesn't suit my data. Can you tell me how to change the loading code for my dataset? Thanks!

Evaluation

It's great work on the comparison among softmax, Siamese and triplet networks. I am wondering what kind of metric you think could be used to compare the different methods fairly, instead of only qualitative results. I saw it's on the TODO list; do you have any ideas about it?

AttributeError: 'MNIST' object has no attribute 'train_labels'

Dear all,
I ran into this error when I was trying to evaluate the code with SiameseMNIST. It seems that MNIST from PyTorch indeed does not have attributes train_labels or train_data. How do you get these attributes? Could you kindly give some advice? Thank you!
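Recent torchvision releases expose the MNIST images and labels as `data` and `targets`; the `train_data` and `train_labels` names come from older releases. A small compatibility helper can bridge both. The name `get_labels` and the dummy classes below are hypothetical, used only to show the lookup order:

```python
def get_labels(dataset):
    """Return the label container of a dataset, whichever attribute name it uses."""
    for name in ("targets", "train_labels", "test_labels"):
        if hasattr(dataset, name):
            return getattr(dataset, name)
    raise AttributeError("dataset exposes none of: targets, train_labels, test_labels")

# Dummy objects standing in for new-style and old-style dataset instances.
class NewStyleDataset:
    targets = [1, 2, 3]

class OldStyleDataset:
    train_labels = [4, 5]

new_labels = get_labels(NewStyleDataset())
old_labels = get_labels(OldStyleDataset())
```

The same pattern works for images with `data`, `train_data` and `test_data`.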

Question about softmax embedding visualization

In your case, 2-dimensional embeddings are learned and visualized. If >=3 dimensional embeddings were learned and then dimension-reduced to 2D using UMAP or t-SNE like algorithms, will the visualization look different than current behavior?

Testset in TripletMNIST and SiameseMNIST are not fixed

Great work! Thanks for sharing such elegant code.

I have a small question about the TripletMNIST(Dataset) class. Your comment about the test set is "Test: Creates fixed triplets for testing".

However, in the code it is
        triplets = [[i,
                     random_state.choice(self.label_to_indices[self.test_labels[i].item()]),
                     random_state.choice(self.label_to_indices[
                                            np.random.choice(
                                                 list(self.labels_set - set([self.test_labels[i].item()]))
                                             )
                                         ])
                     ]

I think it should perhaps be the following:

        triplets = [[i,
                     random_state.choice(self.label_to_indices[self.test_labels[i].item()]),
                     random_state.choice(self.label_to_indices[
                                            random_state.choice(
                                                 list(self.labels_set - set([self.test_labels[i].item()]))
                                             )
                                         ])
                     ]

I mean using random_state.choice instead of the np.random.choice in the original code, in order to create a fixed test set.

I am not sure whether I am right. Can we discuss it?
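The point can be checked in isolation: draws from a seeded `np.random.RandomState` repeat across runs, while draws from the global `np.random` do not unless the global seed is also fixed. The helper name and the seed value below are assumptions for illustration:

```python
import numpy as np

def make_negative_labels(labels, labels_set, seed=29):
    # All draws go through one seeded generator, so the result is reproducible.
    random_state = np.random.RandomState(seed)
    return [int(random_state.choice(sorted(labels_set - {label}))) for label in labels]

labels = [0, 1, 2, 1, 0]
labels_set = {0, 1, 2}

first_run = make_negative_labels(labels, labels_set)
second_run = make_negative_labels(labels, labels_set)
```

Both runs return the same negative labels, and no negative label equals the anchor label at the same position.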

About choosing hard negative pairs

Maybe in class HardNegativePairSelector, the hard negative pairs should be the pairs with closest distances, since the objective is to separate the embeddings in a negative pair.

If this is not right, just ignore this issue.

What would you change to feed images into this?

The MNIST set is arrays (not images).

When I modify this to take images and return arrays, I can't get a good data loader with this repo.

class TripletMNIST(Dataset):

When I'm ready for 'fit', the cell just shows a [*] forever.
I tried passing the data loaders as triplet_train_loader.dataset, and I tried to get the image array to exactly the same size and shape as the MNIST images, but I get mismatch errors. I get errors like 1x0x0 "too small". I spent more than a day on the 'too small' problem, changing the dims.

BTW, great repo. Beautiful code sewn up very nicely.

alternative triplet loss

might be interesting to compare with the alternative triplet loss described in: https://arxiv.org/abs/1412.6622

Importance of F.relu()

In your definition, there is no max(0,x) used: https://github.com/adambielski/siamese-triplet#triplet-network

But in your code, you use F.relu() to get an effect similar to max(0,x). Are there crucial differences between the two definitions, with or without F.relu()? What about their characteristics and performance?

Relu in Triplet loss

siamese-triplet/losses.py

Line 37 in 0c719f9

losses = F.relu(distance_positive - distance_negative + self.margin)

Why do you use the ReLU activation here?
Thank you.
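In the quoted line, `F.relu` acts as the hinge max(0, x) rather than as a network activation: triplets whose negative is already farther than the positive by at least the margin contribute zero loss. A small sketch with made-up distances:

```python
import torch
import torch.nn.functional as F

margin = 1.0
distance_positive = torch.tensor([0.2, 1.5, 0.9])  # toy anchor-positive distances
distance_negative = torch.tensor([2.0, 1.0, 1.4])  # toy anchor-negative distances

# Without the hinge, an already-satisfied triplet yields a negative "loss".
raw = distance_positive - distance_negative + margin

# With the hinge, satisfied triplets are clipped to zero.
losses = F.relu(raw)
hinge = torch.clamp(raw, min=0)  # the same operation written as max(0, x)
```

Without the clipping, the optimizer could keep lowering the loss by pushing easy triplets further apart instead of fixing the violated ones.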

Citing Work

Hi,

I am currently working on my master's thesis and plan to use this code as part of my project, is there a way to obtain permission for this? I couldn't find anything on the documentation.

What format should I use for citing?

Best,
Davide

Have you ever tried the soft margin from the paper "In Defense of the Triplet Loss for Person Re-Identification"?

First, thanks so much for your amazing work. Have you tried the soft margin proposed in that person re-ID paper? How does it work?

choose test triplet

hi~
If I don't know the test labels (test_dataloader_label), how could I choose the test triplets?
img1 = self.test_data[self.test_triplets[index][0]]
img2 = self.test_data[self.test_triplets[index][1]]
img3 = self.test_data[self.test_triplets[index][2]]
I want to get test triplets to do classification. The code in the dataloader seems to select the test data through test_labels?
I don't know whether my question is clear enough. 😁

Implementation of Contrastive Loss

Thanks for your great work; the code is clean and works well.
However, I have some trouble with your implementation of the contrastive loss. Could you explain it in more detail? It is quite different from the description in the original paper.

In the meantime, I tried a contrastive loss implementation by others, but it does not work. Could you explain the difference?

# Custom Contrastive Loss
class ContrastiveLoss(torch.nn.Module):
    """
    Contrastive loss function.
    Based on: http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
    """

    def __init__(self, margin=2.0):
        super(ContrastiveLoss, self).__init__()
        self.margin = margin

    def forward(self, output1, output2, label):
        euclidean_distance = F.pairwise_distance(output1, output2)
        loss_contrastive = torch.mean((1-label) * torch.pow(euclidean_distance, 2) + 
                                      (label) * torch.pow(torch.clamp(self.margin - euclidean_distance, min=0.0), 2))     
 

        return loss_contrastive
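One detail worth checking when swapping implementations is the label convention. In the class above, the squared-distance term is multiplied by (1 - label), so label = 1 marks a dissimilar pair. If the data pipeline marks similar pairs with 1 (an assumption about the pipeline in use), the labels need to be flipped before calling it. The function below restates the same formula with an explicit argument name; the inputs are toy values:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(output1, output2, is_dissimilar, margin=2.0):
    # Same formula as the class above; is_dissimilar = 1 means "different classes".
    distance = F.pairwise_distance(output1, output2)
    similar_term = (1 - is_dissimilar) * distance.pow(2)
    dissimilar_term = is_dissimilar * torch.clamp(margin - distance, min=0.0).pow(2)
    return torch.mean(similar_term + dissimilar_term)

a = torch.tensor([[0.0, 0.0]])
b = torch.tensor([[3.0, 4.0]])   # Euclidean distance to a is about 5

# Treated as a similar pair: penalised by the squared distance (about 25).
loss_as_similar = contrastive_loss(a, b, torch.tensor([0.0]))

# Treated as a dissimilar pair: already beyond the margin, so no penalty.
loss_as_dissimilar = contrastive_loss(a, b, torch.tensor([1.0]))

# Flipping a "1 = similar" target into this convention.
same_class_target = torch.tensor([1.0])
loss_flipped = contrastive_loss(a, b, 1 - same_class_target)
```

With the convention reversed, far-apart pairs of different classes are pulled together and training typically fails to separate the classes.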

how to classify

Hello, after we train the Siamese and triplet network embeddings, how should we use them to classify?
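One common option, sketched here as a suggestion rather than the repository's method, is to classify in embedding space: embed the labelled training samples, then assign each query to the class with the nearest centroid (k-nearest neighbours works similarly). The function name and the toy embeddings are made up for illustration:

```python
import numpy as np

def nearest_centroid_predict(train_embeddings, train_labels, query_embeddings):
    # One centroid per class, computed from the labelled training embeddings.
    classes = np.unique(train_labels)
    centroids = np.stack([train_embeddings[train_labels == c].mean(axis=0) for c in classes])
    # Euclidean distance from every query to every centroid.
    distances = np.linalg.norm(query_embeddings[:, None, :] - centroids[None, :, :], axis=2)
    return classes[distances.argmin(axis=1)]

train_embeddings = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
train_labels = np.array([0, 0, 1, 1])
queries = np.array([[0.2, 0.4], [5.5, 4.8]])

predictions = nearest_centroid_predict(train_embeddings, train_labels, queries)
```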

Avg Nonzero Triplets

Hi, I'm relatively new to the concept of triplet loss and I was just curious about the "Average Nonzero Triplets" metric. What exactly is this measuring? And is this something that should be going down/up/neither during training?

Thanks in advance for the help!
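Going by its name, the metric most plausibly tracks how many triplets in a batch still violate the margin, that is, have a loss greater than zero. That reading is an assumption and is not confirmed in this thread. Under it, the count would be expected to fall as training progresses, since satisfied triplets contribute zero loss. A sketch with made-up distances:

```python
import numpy as np

margin = 1.0
distance_positive = np.array([0.2, 1.5, 0.9, 0.1])  # toy anchor-positive distances
distance_negative = np.array([2.0, 1.0, 1.4, 3.0])  # toy anchor-negative distances

# Hinge triplet loss per triplet.
losses = np.maximum(distance_positive - distance_negative + margin, 0.0)

# Number of triplets that still produce a gradient in this batch.
nonzero_triplets = int(np.count_nonzero(losses))
```

Here two of the four triplets are still active.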

Performance of Triplet on huge number of classes ?

Hey,
Is there any plan to include a more complicated dataset, such as one with 10K~100K classes?
Intuitively, MNIST is easy because it has only 10 classes for clustering.

Thanks for sharing the great work BTW :)

Performance on other datasets

In fact, I want to know the Siamese network's performance on CIFAR-10. Did you try it?

OnlineTripletLoss

Hello, I am training on custom data using OnlineTripletLoss, But during loss calculation, I got the following error.

TypeError: forward() takes 3 positional arguments but 4 were given

And all feature Functions are as below,

Can anyone help with it?
Thank You.

Own data

Is there a way to train the network with my own data?

license

Any plans on releasing this under some open source license?

Can't form the clusters

Hello!
I have the following problem using the Triplet Network: during 2-class training, I can't see the clusters being formed; rather, the 1st class is being surrounded by embeddings of the 2nd class. What could be the issue in my case?
With regards

In Online triplet selection,Validation set average loss increased

Hey, thanks for your great work, but I have faced a problem.

I have noticed that with the FashionMNIST dataset, in online triplet selection, the validation set average loss does not decrease during training; on the contrary, it even increases.

I have tried my own dataset with a different network (ResNet18), but I face the same problem.

Do you have any idea about why this situation happens?
Thank you very much

increase input image size

Hi, nice work on the repo; thank you for these great examples.

I'm trying to increase the input image size and am getting tensor mismatch errors, and I cannot really see a way around it.
Is there a way to avoid hard-coding the FC layer input size and get it from the convnet somehow?

RuntimeError: size mismatch, m1: [256 x 179776], m2: [1024 x 256] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:268
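The numbers in the error can be reproduced by hand. Assuming the network is the EmbeddingNet quoted elsewhere on this page (two 5x5 convolutions without padding, each followed by 2x2 max pooling, 64 output channels) and a square input, the flattened size follows from the usual output-size formula. The function names are hypothetical:

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    # Standard output-size formula for convolution and pooling layers.
    return (size + 2 * padding - kernel_size) // stride + 1

def flattened_features(input_size, channels=64):
    size = conv_output_size(input_size, kernel_size=5)       # Conv2d(_, 32, 5)
    size = conv_output_size(size, kernel_size=2, stride=2)   # MaxPool2d(2, stride=2)
    size = conv_output_size(size, kernel_size=5)             # Conv2d(32, 64, 5)
    size = conv_output_size(size, kernel_size=2, stride=2)   # MaxPool2d(2, stride=2)
    return channels * size * size

mnist_features = flattened_features(28)    # 64 * 4 * 4
large_features = flattened_features(224)   # 64 * 53 * 53
```

A 28x28 input gives 1024 features, matching the hard-coded `64 * 4 * 4`, while a 224x224 input gives 179776, the first dimension in the error above. Passing the computed value to the first `nn.Linear`, or running one dummy batch through the convnet and reading the flattened shape, avoids the hard-coded constant.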

Hi, can you help me please? I want to use this code for my own image dataset, but when I execute the function fit(train_loader, test_loader, model, loss_fn, optimizer, scheduler, n_epochs, cuda, log_interval, metrics=[AccumulatedAccuracyMetric()]), I get the error: Expected 4-dimensional input for 4-dimensional weight 64 1 10 10, but got 2-dimensional input of size [100, 16384] instead.

Whether embedding is normalized ||x||=1 ?

Thanks for sharing great work

Whether embedding is normalized ||x||=1 ?
It looks EmbeddingNetL2 adopted normalization of embedding, but it is not used. From ReadMe it is EmbeddingNet that is adopted but it does not normalize embeddings.
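For reference, L2 normalization of embeddings, so that ||x|| = 1, can be sketched with `torch.nn.functional.normalize`. Whether EmbeddingNetL2 does exactly this is not confirmed here, and the tensor values are made up:

```python
import torch
import torch.nn.functional as F

embeddings = torch.tensor([[3.0, 4.0], [0.5, 0.0]])  # toy embeddings

# Divide each row by its Euclidean norm so every embedding lies on the unit sphere.
normalized = F.normalize(embeddings, p=2, dim=1)
norms = normalized.norm(p=2, dim=1)
```

With unit-norm embeddings, squared Euclidean distances are bounded by 4, which affects what margin values are meaningful.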

choose test triplet

hi~
If I don't know the test_data_labels, how could I choose the test triplets?
img1 = self.test_data[self.test_triplets[index][0]]
img2 = self.test_data[self.test_triplets[index][1]]
img3 = self.test_data[self.test_triplets[index][2]]
Does this triplet selection rely on test_labels that we must already have?
I don't know whether my question is clear enough. 😁

mnist-fashion results

Great work!

For mnist-fashion, I am surprised that contrastive/triplet does not look better than softmax, at least on the plot. I am even more surprised that online sampling with negative mining does not seem to perform better than random sampling.
I understand you're using a 2d embedding space, but still. Do you have any quantitative numbers using maybe a higher dimensional space?
Could be interesting to plot in 3D as well.