kekmodel / FixMatch-pytorch
Unofficial PyTorch implementation of "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence"
License: MIT License
Wonderful job!
I want to know what the de_interleave function does. Does performance drop when it is removed?
logits = de_interleave(logits, 2*args.mu+1)
I am using FixMatch to train on my own dataset with LR = 0.03 and batch size = 64. As the unlabeled loss gradually increases, the model gets worse and worse: the training loss decreases, but the validation loss increases and the accuracy drops. Why does this happen?
Hi, kekmodel!
How can I train an object detection model (mobilenetv3_ssd or another) with FixMatch? Are there any examples?
Thank you!
Thanks for your reimplementation, but the best top-1 accuracy I get on CIFAR-10 with 250 labels is 89.60 (the paper reports 94.93 ± 0.65).
Hi --
Are you able to explain the motivation behind this line:
https://github.com/kekmodel/FixMatch-pytorch/blob/master/train.py#L494
E.g., when you use EMA, are you also shrinking the weights of the base model?
Thanks!
I am following the repo instructions with the original code unchanged, and I run the command
python train.py --dataset cifar10 --num-labeled 40 --arch wideresnet --batch-size 64 --lr 0.03 --expand-labels --seed 5 --out results/[email protected]
But I cannot get the reported accuracy. From the accuracy curve, the model should reach 90.49% accuracy at the 100th epoch, but I only get around 76%.
Can you help me figure it out?
In the function x_u_split(...) in dataset/cifar.py the labeled images are generated without a seed. If training runs consist of multiple start and stops then it is possible that total number of labeled images that the model sees exceeds the set value. For instance, training on 40 labels with 2 stops will lead to 120 unique labeled images over the entire course of training even though the model only sees 40 labeled images at a time. I think this can explain the much higher accuracy obtained by this implementation, especially for the low label tasks.
A quick fix would be to add the snippet below before the random label generation in the x_u_split(...) function:
np.random.seed(args.seed)
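A sketch of what a seeded split could look like, using a local NumPy Generator instead of seeding the global state (the function and argument names here are illustrative, not the repo's exact API):

```python
import numpy as np

def seeded_label_split(labels, num_labeled, num_classes, seed):
    # Illustrative sketch: pick a per-class labeled subset reproducibly.
    # A local Generator avoids touching the global np.random state, so a
    # restarted run with the same seed sees the exact same labeled images.
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    per_class = num_labeled // num_classes
    labeled_idx = []
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        labeled_idx.extend(rng.choice(idx, per_class, replace=False))
    return np.array(labeled_idx)
```

Two runs with the same seed then select identical labeled indices, so stop-and-restart training no longer leaks extra labeled images.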
Hi,
It seems the last line in the get_cosine_schedule_with_warmup function should be:
return max(0., (math.cos(math.pi * num_cycles * no_progress) + 1) * 0.5)
But I am not sure about this; please correct me if I am wrong. Thanks!
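For reference, here is a pure-Python sketch of the per-step multiplier the repo's LambdaLR-based scheduler computes. Note that with the repo's num_cycles = 7/16, cos(pi * num_cycles * progress) stays positive over the whole run (ending near cos(7pi/16) ≈ 0.195), so the max(0., ...) clamp never actually triggers; the proposed (cos(x) + 1) * 0.5 variant would rescale the curve rather than fix a clipping bug:

```python
import math

def cosine_lr_lambda(step, warmup_steps, total_steps, num_cycles=7.0 / 16.0):
    # Linear warmup, then a cosine decay over the remaining steps.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, math.cos(math.pi * num_cycles * progress))
```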
01/14/2021 19:40:10 - INFO - models.wideresnet - Model: WideResNet 28x2
01/14/2021 19:40:10 - INFO - __main__ - Total params: 1.47M
01/14/2021 19:40:19 - INFO - __main__ - ***** Running training *****
01/14/2021 19:40:19 - INFO - __main__ - Task = cifar10@4000
01/14/2021 19:40:19 - INFO - __main__ - Num Epochs = 1024
01/14/2021 19:40:19 - INFO - __main__ - Batch size per GPU = 64
01/14/2021 19:40:19 - INFO - __main__ - Total train batch size = 64
01/14/2021 19:40:19 - INFO - __main__ - Total optimization steps = 1048576
Train Epoch: 1/1024. Iter: 1024/1024. LR: 0.0300. Data: 0.253s. Batch: 0.569s. Loss: 1.2500. Loss_x: 1.2062. Loss_u: 0.0437. Mask: 0.08. : 100% 1024/1024 [09:42<00:00, 1.76it/s]
0% 0/157 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 475, in <module>
main()
File "train.py", line 291, in main
model, optimizer, ema_model, scheduler, writer)
File "train.py", line 393, in train
test_loss, test_acc = test(args, test_loader, test_model, epoch)
File "train.py", line 450, in test
prec1, prec5 = accuracy(outputs, targets, topk=(1, 5))
File "/content/FixMatch-pytorch/utils/misc.py", line 41, in accuracy
correct_k = correct[:k].view(-1).float().sum(0)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
0% 0/157 [00:01<?, ?it/s]
Any help is appreciated!
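A sketch of the fix the error message itself suggests: after the transpose, `correct` is no longer contiguous, so `.view(-1)` fails, while `.reshape(-1)` (or `.contiguous().view(-1)`) handles non-contiguous tensors. This follows the shape of the accuracy helper in utils/misc.py:

```python
import torch

def accuracy(output, target, topk=(1,)):
    # Top-k accuracy; the transpose makes `correct` non-contiguous, so we
    # flatten with .reshape(-1) instead of .view(-1).
    maxk = max(topk)
    batch_size = target.size(0)
    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))
    res = []
    for k in topk:
        correct_k = correct[:k].reshape(-1).float().sum(0)
        res.append(correct_k.mul_(100.0 / batch_size))
    return res
```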
I want to reproduce the result with 40 labeled examples.
After executing the following command:
python train.py --dataset cifar10 --num-labeled 40 --arch wideresnet --batch-size 256 --lr 0.01 --expand-labels --seed 5 --out results/[email protected] --gpu 1
Accuracy hovered around 77.6 and barely improved.
Are my hyperparameters set wrong?
What command did you use?
Train Epoch: 148/1024. Iter: 1024/1024. LR: 0.0294. Data: 0.036s. Batch: 0.507s. Loss: 0.1614. Loss_x: 0.0009. Loss_u: 0.1605. Mask: 0.83. : 100%|█| 1024/
Test Iter: 79/ 79. Data: 0.005s. Batch: 0.017s. Loss: 2.5019. top1: 77.20. top5: 96.61. : 100%|████████████████████████| 79/79 [00:01<00:00, 56.13it/s]
12/07/2020 18:27:21 - INFO - main - top-1 acc: 77.20
12/07/2020 18:27:21 - INFO - main - top-5 acc: 96.61
12/07/2020 18:27:21 - INFO - main - Best top-1 acc: 77.68
12/07/2020 18:27:21 - INFO - main - Mean top-1 acc: 77.20
I believe the batch size of the unlabeled train loader should be args.batch_size * args.mu.
As confirmed by a collaborator on the official implementation, the RandAugment operations should be applied every time, not 50 percent of the time.
The only 50% chance in the original paper refers to the flips etc. of the weak augmentation, and definitely not to the strong augmentation methods.
This could explain why this repository does better than the original paper, but it isn't what the original paper did.
For those who want the true performance of the original FixMatch paper, you need to delete
if random.random() < 0.5:
from the https://github.com/kekmodel/FixMatch-pytorch/blob/master/dataset/randaugment.py file.
I found that this is because model.train() is not called again after evaluation ends.
Solution: just move model.train() into the epoch loop:
for epoch in range(args.start_epoch, args.epochs):
    model.train()
Do you have plans to make this repository compatible with a custom dataset, and if not, which files would need to be modified to do so?
When FixMatch is applied to my dataset, which is unbalanced, why does the training set perform well while the validation loss first decreases and then keeps rising, with accuracy dropping?
Why do the authors include the labeled examples in the unlabeled dataset?
Shouldn't the AverageMeters in train.py be reset after each epoch?
The original ImageNet example does this.
Just wanted to know the intuition behind the interleave and deinterleave operations. How does this help?
Thanks for sharing this excellent work. I just wonder if there is any way to apply this algorithm to multi-label classification. Could I simply replace the softmax with a sigmoid to implement it?
In your implementation, WRN28-10 is used which has about 36M parameters.
Your model definition:
Lines 165 to 169 in 9044f2e
I used the following code to get the number of parameters:
wrn = build_wideresnet(depth=28, widen_factor=10, dropout=0, num_classes=100)
print(f"# params: {sum(p.numel() for p in wrn.parameters()):,}")
which confirms the ~36M figure.
In the official TensorFlow implementation, a WRN with about 23M parameters is used for CIFAR-100.
The CLI args for the official code can be found in this issue.
Hi :)
In the function train() in train.py, I think batch normalization layers will have a biased estimation if you feed the concatenated inputs to the model. It should rather be like:
inputs = torch.cat((inputs_x, inputs_u_s)).to(args.device)
targets_x = targets_x.to(args.device)
logits = model(inputs)
logits_x, logits_u_s = logits.chunk(2)
model.eval()
logits_u_w = model(inputs_u_w)
model.train()
del logits
Besides, it is mentioned in the paper that they also apply the unlabeled loss to the labeled data:
In practice, we include all labeled examples as part of unlabeled data
without using their labels when constructing U.
For the 40 labels on CIFAR-10 the accuracy reaches 93.38, when I run it. I used the same hyperparameters and seed 5. Is the reported acc for seed 5?
Hi, nice work! When reading your code, I found a function named interleave in train.py:
def interleave(x, size):
s = list(x.shape)
return x.reshape([-1, size] + s[1:]).transpose(0, 1).reshape([-1] + s[1:])
Could you make some explanations about this function? I do not understand why we use this operation. Thank you!
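Not the author, but the reshape/transpose pair weaves samples from the 2*mu+1 sub-batches (one labeled plus 2*mu unlabeled) together along the batch axis, so that when the batch is later split across devices, each chunk contains a mix of labeled and unlabeled samples and BatchNorm statistics are not computed on a labeled-only chunk. A NumPy illustration of the same index shuffle (swapaxes plays the role of torch's transpose(0, 1)):

```python
import numpy as np

def interleave(x, size):
    # Weave `size`-strided groups together along the batch axis.
    s = list(x.shape)
    return x.reshape([-1, size] + s[1:]).swapaxes(0, 1).reshape([-1] + s[1:])

def de_interleave(x, size):
    # Exact inverse of interleave.
    s = list(x.shape)
    return x.reshape([size, -1] + s[1:]).swapaxes(0, 1).reshape([-1] + s[1:])
```

For example, interleave(np.arange(6), 3) turns [0 1 2 3 4 5] into [0 3 1 4 2 5], and de_interleave undoes it.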
FixMatch-pytorch/dataset/cifar.py
Lines 105 to 106 in fe19cc7
I tried to reproduce this source code on CIFAR10 (4000 labeled data), but I only got ~93% Top-1 accuracy.
Has anyone else met the same problem?
I am trying to apply FixMatch to one-class data.
For the unsupervised loss part, I modified the code as follows:
logits_u_w, logits_u_s = logits[batch_size:].chunk(2)
pseudo_label = torch.sigmoid(logits_u_w.detach_())
mask = pseudo_label.ge(args.threshold).float()
Lu = (F.binary_cross_entropy(logits_u_s, mask, reduction='none') * mask).mean()
Is that correct?
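Two things look off to me in the snippet above (hedged, since I may be misreading the intent): F.binary_cross_entropy expects probabilities, while logits_u_s here are raw logits, and using mask as the target conflates the pseudo-label with the confidence cutoff. A sketch of one way it could look, with binary_cross_entropy_with_logits and hard 0/1 pseudo-labels (names follow the snippet above, not the repo):

```python
import torch
import torch.nn.functional as F

def one_class_unlabeled_loss(logits_u_w, logits_u_s, threshold=0.95):
    # Sigmoid pseudo-labeling sketch for a single-logit head.
    pseudo_prob = torch.sigmoid(logits_u_w.detach())
    pseudo_label = (pseudo_prob >= 0.5).float()            # hard 0/1 target
    # Keep predictions confident on either side of 0.5, mirroring
    # FixMatch's max-probability cutoff.
    mask = (torch.max(pseudo_prob, 1.0 - pseudo_prob) >= threshold).float()
    loss = F.binary_cross_entropy_with_logits(
        logits_u_s, pseudo_label, reduction='none')
    return (loss * mask).mean()
```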
Hi --
Are you able to share a screenshot of the accuracy curves (test accuracy vs epoch)? I'm trying to reproduce your results, but it'd be helpful to make sure I'm on the right track since the models take so long to train.
Thanks!
Just wondering, is it necessary to use 4 GPUs to reproduce the reported results? I have been struggling to get the same results with 1 GPU. Has anyone successfully reproduced the result with a single GPU?
As mentioned in this comment (1c6fe04#r49610997), the syntax of train.py is invalid, and therefore train.py does not work.
Hello, I am very curious about this part of the RandAugment implementation in the RandAugmentMC class:
if random.random() < 0.5:
img = op(img, v=v, max_v=max_v, bias=bias)
If I am understanding this right, for RandAugmentMC with n=2 there is a 25% chance of no RandAugment operator, a 50% chance of one operator, and a 25% chance of two operators being applied to a given image.
Is this what you found in the official FixMatch implementation? It isn't what I understood the paper to do.
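The 25/50/25 split follows directly from treating each of the n = 2 sampled ops as an independent coin flip; a quick check of the arithmetic:

```python
from math import comb

def op_count_probs(n, p=0.5):
    # P(exactly k of the n sampled ops actually fire), when each op is
    # applied independently with probability p (the random.random() < 0.5 gate).
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
```

op_count_probs(2) gives [0.25, 0.5, 0.25]: no op, one op, two ops.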
Hi, thanks for your impressive work!
It seems we can get quite a lot better results with your implementation. Could you let me know what the reason for the difference is?
(Lb=40): 6.4% (yours) | 13.81 ± 3.37% (FixMatch, reported)
(Lb=20): 9% (yours) | N/A (FixMatch, reported)
When using DistributedDataParallel with N labeled training images and K GPUs, should we set num_labeled = N / K instead of N, since np.random.shuffle(idx) generates different indices in different processes?
What is the point of this code in RandAugmentPC?
prob = np.random.uniform(0.2, 0.8)
if random.random() + prob >= 1:
Hi Kim,
I find that in FixMatch-pytorch/dataset/randaugment.py, on line 53 (commit f549460), v isn't divided by 2, but on line 51 of the same file it is. Is this a mistake, or is there a reason for it? Also, are you going to add CTAugment?
I've run this code with 40 labels for CIFAR-10, but I could not reproduce the reported results. I only changed the number 4000 to 40 in the command given in the USAGE part of the main page. (I got about 90% accuracy, a little lower than reported.) Is there anything else that needs to be modified?
Hi, for both supervised and semi-supervised methods, it is generally advised to use a separate validation set. From the code, it looks like the best test-set accuracy is reported.
Is there a specific reason that a separate validation set is not required?
Hello,
Thanks for your good work! I really enjoyed your implementation. While trying to reproduce your results, I ran into some questions about PyTorch distributed training:
Thanks for your help in advance!
If the model is trained with multiple GPUs, the total batch size becomes larger (batch_size_total = batch_size * num_gpus), but the number of eval_steps in one epoch stays the same. As a result, the overall number of training iterations is increased by a factor of the number of GPUs. In the original TensorFlow implementation, the overall number of iterations is independent of the number of GPUs, and the batch is divided across them.
I'm not 100% sure about this, but if it's right, the number of eval_steps in one epoch should be reduced, or the batch should be divided across the GPUs, so that the overall number of iterations stays constant when using multiple GPUs.
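If that diagnosis is right, one minimal fix would be scaling the per-epoch step count by the number of GPUs (a sketch; eval_step here stands in for the repo's per-epoch iteration count):

```python
def adjusted_eval_steps(eval_step, num_gpus):
    # Divide the per-epoch iteration count across GPUs so that the total
    # number of samples seen per epoch (steps * total batch size) stays
    # constant regardless of how many GPUs contribute to the batch.
    return max(1, eval_step // num_gpus)
```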
Hi, in the paper the authors say that they include all labeled examples as part of the unlabeled data, without using their labels, when constructing U. However, in your code, line 106 of cifar.py is unlabeled_idx.extend(idx[label_per_class:]). Is this a mistake? Thanks.
e.g. for cifar10 40 labels, 4 for each class; cifar10 250 labels, 25 for each class?
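For what it's worth, the paper's construction can be sketched like this: the labeled subset is sampled per class, while U simply contains every training index with labels discarded (function and argument names are illustrative, not the repo's API):

```python
import numpy as np

def paper_style_split(labels, num_labeled, num_classes, seed=5):
    # Labeled set: num_labeled // num_classes examples per class.
    # Unlabeled set U: *all* training examples, labeled ones included,
    # as the paper describes.
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    per_class = num_labeled // num_classes
    labeled_idx = []
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        labeled_idx.extend(rng.choice(idx, per_class, replace=False))
    unlabeled_idx = np.arange(len(labels))
    return np.array(labeled_idx), unlabeled_idx
```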
Hi, it seems you have not corrected this problem yet, right?
What is the training time of FixMatch without distributed training and with --amp on an A100 GPU?
Hello @kekmodel, thanks for the amazing implementation.
According to the documentation (https://pytorch.org/docs/stable/data.html), if DistributedSampler is used, we need to call its set_epoch method before each epoch. However, I did not find set_epoch in train.py. Should it be added?
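A minimal sketch of what the docs describe (num_replicas and rank are passed explicitly here so no process group is needed for illustration):

```python
import torch
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

dataset = TensorDataset(torch.arange(8))
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True)
loader = DataLoader(dataset, batch_size=2, sampler=sampler)

for epoch in range(3):
    sampler.set_epoch(epoch)  # reseeds the shuffle; without this call,
                              # every epoch replays the same ordering
    for batch in loader:
        pass  # training step would go here
```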
Is it too small?