kekmodel / FixMatch-pytorch
Unofficial PyTorch implementation of "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence"
License: MIT License
Wonderful job!
I want to know what the de_interleave function does. Does performance drop when it is removed?
logits = de_interleave(logits, 2*args.mu+1)
I am using FixMatch to train on my own dataset with LR = 0.03 and batch size = 64. As the unlabeled loss gradually increases, the model gets worse and worse: the training loss decreases, but the validation loss increases and the accuracy drops. Why does this happen?
Hi, kekmodel!
How can I train an object detection model (mobilenetv3_ssd or another) with FixMatch? Are there any examples?
Thank you!
Thanks for your reimplementation, but the best top-1 accuracy I get on CIFAR-10 with 250 labels is 89.60 (the paper reports 94.93 ± 0.65).
Hi --
Are you able to explain the motivation behind this line:
https://github.com/kekmodel/FixMatch-pytorch/blob/master/train.py#L494
E.g., when you use EMA, are you also shrinking the weights of the base model?
Thanks!
I am following the repo instructions with the original code unchanged, and I run the command
python train.py --dataset cifar10 --num-labeled 40 --arch wideresnet --batch-size 64 --lr 0.03 --expand-labels --seed 5 --out results/[email protected]
But I cannot get the reported accuracy. From the accuracy curve, the model should reach 90.49% accuracy at the 100th epoch, but I only get around 76%.
Can you help me figure it out?
In the function x_u_split(...) in dataset/cifar.py the labeled images are generated without a seed. If training runs consist of multiple start and stops then it is possible that total number of labeled images that the model sees exceeds the set value. For instance, training on 40 labels with 2 stops will lead to 120 unique labeled images over the entire course of training even though the model only sees 40 labeled images at a time. I think this can explain the much higher accuracy obtained by this implementation, especially for the low label tasks.
A quick fix would be to add the snippet below before the random label generation in the x_u_split(...) function:
np.random.seed(args.seed)
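A sketch of what a seeded split could look like, using a local NumPy Generator instead of seeding the global state (the function and argument names here are illustrative, not the repo's exact API):

```python
import numpy as np

def seeded_label_split(labels, num_labeled, num_classes, seed):
    # Illustrative sketch: pick a per-class labeled subset reproducibly.
    # A local Generator avoids touching the global np.random state, so a
    # restarted run with the same seed sees the exact same labeled images.
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    per_class = num_labeled // num_classes
    labeled_idx = []
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        labeled_idx.extend(rng.choice(idx, per_class, replace=False))
    return np.array(labeled_idx)
```

Two runs with the same seed then select identical labeled indices, so stop-and-restart training no longer leaks extra labeled images.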
Hi,
It seems the last line in the get_cosine_schedule_with_warmup function should be:
return max(0., (math.cos(math.pi * num_cycles * no_progress) + 1) * 0.5)
But I am not sure about this; please correct me if I am wrong. Thanks!
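For reference, here is a pure-Python sketch of the per-step multiplier the repo's LambdaLR-based scheduler computes. Note that with the repo's num_cycles = 7/16, cos(pi * num_cycles * progress) stays positive over the whole run (ending near cos(7pi/16) ≈ 0.195), so the max(0., ...) clamp never actually triggers; the proposed (cos(x) + 1) * 0.5 variant would rescale the curve rather than fix a clipping bug:

```python
import math

def cosine_lr_lambda(step, warmup_steps, total_steps, num_cycles=7.0 / 16.0):
    # Linear warmup, then a cosine decay over the remaining steps.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, math.cos(math.pi * num_cycles * progress))
```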
01/14/2021 19:40:10 - INFO - models.wideresnet - Model: WideResNet 28x2
01/14/2021 19:40:10 - INFO - __main__ - Total params: 1.47M
01/14/2021 19:40:19 - INFO - __main__ - ***** Running training *****
01/14/2021 19:40:19 - INFO - __main__ - Task = cifar10@4000
01/14/2021 19:40:19 - INFO - __main__ - Num Epochs = 1024
01/14/2021 19:40:19 - INFO - __main__ - Batch size per GPU = 64
01/14/2021 19:40:19 - INFO - __main__ - Total train batch size = 64
01/14/2021 19:40:19 - INFO - __main__ - Total optimization steps = 1048576
Train Epoch: 1/1024. Iter: 1024/1024. LR: 0.0300. Data: 0.253s. Batch: 0.569s. Loss: 1.2500. Loss_x: 1.2062. Loss_u: 0.0437. Mask: 0.08. : 100% 1024/1024 [09:42<00:00, 1.76it/s]
0% 0/157 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 475, in <module>
main()
File "train.py", line 291, in main
model, optimizer, ema_model, scheduler, writer)
File "train.py", line 393, in train
test_loss, test_acc = test(args, test_loader, test_model, epoch)
File "train.py", line 450, in test
prec1, prec5 = accuracy(outputs, targets, topk=(1, 5))
File "/content/FixMatch-pytorch/utils/misc.py", line 41, in accuracy
correct_k = correct[:k].view(-1).float().sum(0)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
0% 0/157 [00:01<?, ?it/s]
Any help is appreciated!
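A sketch of the fix the error message itself suggests: after the transpose, `correct` is no longer contiguous, so `.view(-1)` fails, while `.reshape(-1)` (or `.contiguous().view(-1)`) handles non-contiguous tensors. This follows the shape of the accuracy helper in utils/misc.py:

```python
import torch

def accuracy(output, target, topk=(1,)):
    # Top-k accuracy; the transpose makes `correct` non-contiguous, so we
    # flatten with .reshape(-1) instead of .view(-1).
    maxk = max(topk)
    batch_size = target.size(0)
    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))
    res = []
    for k in topk:
        correct_k = correct[:k].reshape(-1).float().sum(0)
        res.append(correct_k.mul_(100.0 / batch_size))
    return res
```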
I want to reproduce the result with 40 labeled examples.
After executing the following command:
python train.py --dataset cifar10 --num-labeled 40 --arch wideresnet --batch-size 256 --lr 0.01 --expand-labels --seed 5 --out results/[email protected] --gpu 1
Accuracy hovered around 77.6 and barely improved.
Are my hyperparameters set wrong?
What command did you use?
Train Epoch: 148/1024. Iter: 1024/1024. LR: 0.0294. Data: 0.036s. Batch: 0.507s. Loss: 0.1614. Loss_x: 0.0009. Loss_u: 0.1605. Mask: 0.83. : 100%|█| 1024/
Test Iter: 79/ 79. Data: 0.005s. Batch: 0.017s. Loss: 2.5019. top1: 77.20. top5: 96.61. : 100%|████████████████████████| 79/79 [00:01<00:00, 56.13it/s]
12/07/2020 18:27:21 - INFO - main - top-1 acc: 77.20
12/07/2020 18:27:21 - INFO - main - top-5 acc: 96.61
12/07/2020 18:27:21 - INFO - main - Best top-1 acc: 77.68
12/07/2020 18:27:21 - INFO - main - Mean top-1 acc: 77.20
I believe the batch size of the unlabeled train loader should be args.batch_size * args.mu.
As confirmed by a collaborator on the official implementation, the RandAugment operations should be applied every time, not 50 percent of the time.
The only 50% chance in the original paper refers to the flips etc. of the weak augmentation, and definitely not to the strong augmentation methods.
This could explain why this repository does better than the original paper, but it isn't what the original paper did.
For those who want the true performance of the original FixMatch paper, you need to delete
if random.random() < 0.5:
from the https://github.com/kekmodel/FixMatch-pytorch/blob/master/dataset/randaugment.py file.
I found that this is because model.train() is not called again after evaluation ends.
Solution: just move model.train() into the epoch loop:
for epoch in range(args.start_epoch, args.epochs):
    model.train()
Do you have plans to make this repository compatible with a custom dataset, and if not, which files would need to be modified to do so?
When FixMatch is applied to my dataset, which is unbalanced, why does the training set perform well while the validation loss first decreases and then keeps rising, with accuracy dropping?
Why do the authors include the labeled examples in the unlabeled dataset?
Shouldn't the AverageMeters in train.py be reset after each epoch?
The original ImageNet example does this.
Just wanted to know the intuition behind the interleave and deinterleave operations. How does this help?
Thanks for sharing this excellent work. I just wonder if there is any way to apply this algorithm to multi-label classification. Could I simply replace the softmax with a sigmoid to implement it?
In your implementation, WRN28-10 is used which has about 36M parameters.
Your model definition:
Lines 165 to 169 in 9044f2e
I used the following code to get the number of parameters:
wrn = build_wideresnet(depth=28, widen_factor=10, dropout=0, num_classes=100)
print(f"# params: {sum(p.numel() for p in wrn.parameters()):,}")
which confirms the ~36M figure.
In the official TensorFlow implementation, a WRN with about 23M parameters is used for CIFAR-100.
The CLI args for the official code can be found in this issue.
Hi :)
In the function train() in train.py, I think batch normalization layers will have a biased estimation if you feed the concatenated inputs to the model. It should rather be like:
inputs = torch.cat((inputs_x, inputs_u_s)).to(args.device)
targets_x = targets_x.to(args.device)
logits = model(inputs)
logits_x, logits_u_s = logits.chunk(2)
model.eval()
logits_u_w = model(inputs_u_w)
model.train()
del logits
Besides, it is mentioned in the paper that they also apply the unlabeled loss to the labeled data:
In practice, we include all labeled examples as part of unlabeled data
without using their labels when constructing U.
For the 40 labels on CIFAR-10 the accuracy reaches 93.38, when I run it. I used the same hyperparameters and seed 5. Is the reported acc for seed 5?
Hi, nice work! When reading your code, I found a function named interleave in train.py:
def interleave(x, size):
s = list(x.shape)
return x.reshape([-1, size] + s[1:]).transpose(0, 1).reshape([-1] + s[1:])
Could you make some explanations about this function? I do not understand why we use this operation. Thank you!
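Not the author, but the reshape/transpose pair weaves samples from the 2*mu+1 sub-batches (one labeled plus 2*mu unlabeled) together along the batch axis, so that when the batch is later split across devices, each chunk contains a mix of labeled and unlabeled samples and BatchNorm statistics are not computed on a labeled-only chunk. A NumPy illustration of the same index shuffle (swapaxes plays the role of torch's transpose(0, 1)):

```python
import numpy as np

def interleave(x, size):
    # Weave `size`-strided groups together along the batch axis.
    s = list(x.shape)
    return x.reshape([-1, size] + s[1:]).swapaxes(0, 1).reshape([-1] + s[1:])

def de_interleave(x, size):
    # Exact inverse of interleave.
    s = list(x.shape)
    return x.reshape([size, -1] + s[1:]).swapaxes(0, 1).reshape([-1] + s[1:])
```

For example, interleave(np.arange(6), 3) turns [0 1 2 3 4 5] into [0 3 1 4 2 5], and de_interleave undoes it.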
FixMatch-pytorch/dataset/cifar.py
Lines 105 to 106 in fe19cc7
I tried to reproduce this source code on CIFAR10 (4000 labeled data), but I only got ~93% Top-1 accuracy.
Has anyone else met the same problem?
I am trying to apply FixMatch to one-class data.
For the unsupervised loss part, I modified the code as follows:
logits_u_w, logits_u_s = logits[batch_size:].chunk(2)
pseudo_label = torch.sigmoid(logits_u_w.detach_())
mask = pseudo_label.ge(args.threshold).float()
Lu = (F.binary_cross_entropy(logits_u_s, mask, reduction='none') * mask).mean()
Is that correct?
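Two things look off to me in the snippet above (hedged, since I may be misreading the intent): F.binary_cross_entropy expects probabilities, while logits_u_s here are raw logits, and using mask as the target conflates the pseudo-label with the confidence cutoff. A sketch of one way it could look, with binary_cross_entropy_with_logits and hard 0/1 pseudo-labels (names follow the snippet above, not the repo):

```python
import torch
import torch.nn.functional as F

def one_class_unlabeled_loss(logits_u_w, logits_u_s, threshold=0.95):
    # Sigmoid pseudo-labeling sketch for a single-logit head.
    pseudo_prob = torch.sigmoid(logits_u_w.detach())
    pseudo_label = (pseudo_prob >= 0.5).float()            # hard 0/1 target
    # Keep predictions confident on either side of 0.5, mirroring
    # FixMatch's max-probability cutoff.
    mask = (torch.max(pseudo_prob, 1.0 - pseudo_prob) >= threshold).float()
    loss = F.binary_cross_entropy_with_logits(
        logits_u_s, pseudo_label, reduction='none')
    return (loss * mask).mean()
```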
Hi --
Are you able to share a screenshot of the accuracy curves (test accuracy vs epoch)? I'm trying to reproduce your results, but it'd be helpful to make sure I'm on the right track since the models take so long to train.
Thanks!
Just wondering, is it necessary to use 4 GPUs to reproduce the reported results? I have been struggling to get the same results with 1 GPU. Has anyone successfully reproduced the result with a single GPU?
As mentioned in this comment (1c6fe04#r49610997), the syntax of train.py is invalid, and therefore train.py does not work.
Hello, I am very curious about this part of the RandAugment implementation in the RandAugmentMC class:
if random.random() < 0.5:
img = op(img, v=v, max_v=max_v, bias=bias)
If I am understanding this right, for RandAugmentMC with n=2 there is a 25% chance of no RandAugment operator, a 50% chance of one operator, and a 25% chance of two operators being applied to a given image.
Is this what you found in the official FixMatch implementation? It isn't what I understood the paper to do.
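The 25/50/25 split follows directly from treating each of the n = 2 sampled ops as an independent coin flip; a quick check of the arithmetic:

```python
from math import comb

def op_count_probs(n, p=0.5):
    # P(exactly k of the n sampled ops actually fire), when each op is
    # applied independently with probability p (the random.random() < 0.5 gate).
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
```

op_count_probs(2) gives [0.25, 0.5, 0.25]: no op, one op, two ops.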
Hi, thanks for your impressive work!
It seems we can get quite a lot better results with your implementation. Could you let me know what the reason for the difference is?
(Lb=40): 6.4% (yours) | 13.81 ± 3.37% (FixMatch, reported)
(Lb=20): 9% (yours) | N/A (FixMatch, reported)
When using DistributedDataParallel with N labeled training images and K GPUs, should we set num_labeled = N / K instead of N, since np.random.shuffle(idx) generates different indices in different processes?
What is the point of this code in RandAugmentPC?
prob = np.random.uniform(0.2, 0.8)
if random.random() + prob >= 1:
Hi Kim,
I find that in FixMatch-pytorch/dataset/randaugment.py, on line 53 (commit f549460), v isn't divided by 2, but on line 51 of the same file it is. Is this a mistake, or is there a reason for it? Also, are you going to add CTAugment?
I've run this code with 40 labels for CIFAR-10, but I could not reproduce the reported results. I only changed the number 4000 to 40 in the command given in the USAGE part of the main page. (I got about 90% accuracy, a little lower than reported.) Is there anything else that needs to be modified?
Hi, for both supervised and semi-supervised methods, it is generally advised to use a separate validation set. From the code, it looks like the best test-set accuracy is reported.
Is there a specific reason that a separate validation set is not required?
Hello,
Thanks for your good work! I really enjoyed your implementation. While trying to reproduce your results, I ran into some questions about PyTorch distributed training:
Thanks for your help in advance!
If the model is trained with multiple GPUs, the total batch size becomes larger (batch_size_total = batch_size * num_gpus), but the number of eval_steps in one epoch stays the same. As a result, the overall number of training iterations is increased by a factor of the number of GPUs. In the original TensorFlow implementation, the overall number of iterations is independent of the number of GPUs, and the batch is divided across them.
I'm not 100% sure about this, but if it's right, the number of eval_steps in one epoch should be reduced, or the batch should be divided across the GPUs, so that the overall number of iterations stays constant when using multiple GPUs.
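If that diagnosis is right, one minimal fix would be scaling the per-epoch step count by the number of GPUs (a sketch; eval_step here stands in for the repo's per-epoch iteration count):

```python
def adjusted_eval_steps(eval_step, num_gpus):
    # Divide the per-epoch iteration count across GPUs so that the total
    # number of samples seen per epoch (steps * total batch size) stays
    # constant regardless of how many GPUs contribute to the batch.
    return max(1, eval_step // num_gpus)
```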
Hi, in the paper the authors say that they include all labeled examples as part of the unlabeled data, without using their labels, when constructing U. However, in your code, line 106 of cifar.py is unlabeled_idx.extend(idx[label_per_class:]). Is this a mistake? Thanks.
e.g. for cifar10 40 labels, 4 for each class; cifar10 250 labels, 25 for each class?
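For what it's worth, the paper's construction can be sketched like this: the labeled subset is sampled per class, while U simply contains every training index with labels discarded (function and argument names are illustrative, not the repo's API):

```python
import numpy as np

def paper_style_split(labels, num_labeled, num_classes, seed=5):
    # Labeled set: num_labeled // num_classes examples per class.
    # Unlabeled set U: *all* training examples, labeled ones included,
    # as the paper describes.
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    per_class = num_labeled // num_classes
    labeled_idx = []
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        labeled_idx.extend(rng.choice(idx, per_class, replace=False))
    unlabeled_idx = np.arange(len(labels))
    return np.array(labeled_idx), unlabeled_idx
```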
Hi, it seems you have not corrected this problem yet, right?
What is the training time of FixMatch without distributed training and with --amp on an A100 GPU?
Hello @kekmodel, thanks for the amazing implementation.
According to the documentation (https://pytorch.org/docs/stable/data.html), if DistributedSampler is used, we need to call its set_epoch method before each epoch. However, I did not find set_epoch in train.py. Should it be added?
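A minimal sketch of what the docs describe (num_replicas and rank are passed explicitly here so no process group is needed for illustration):

```python
import torch
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

dataset = TensorDataset(torch.arange(8))
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True)
loader = DataLoader(dataset, batch_size=2, sampler=sampler)

for epoch in range(3):
    sampler.set_epoch(epoch)  # reseeds the shuffle; without this call,
                              # every epoch replays the same ordering
    for batch in loader:
        pass  # training step would go here
```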
Is it too small?