
learning-loss-for-active-learning's Introduction

In-Young Cho (조인영) 👋

I'm a Deep Learning Researcher at KRAFTON Inc., passionate about leveraging AI for 3D modeling, rendering, and reconstruction. With a solid foundation in computer science and mathematical science, my work focuses on the intersection of AI and interactive media.

🛠 Expertise

  • Neural Rendering & 3D Reconstruction: Transforming 2D images and videos into 3D models.
  • Deep Learning Applications: Enhancing gaming and spatial planning through AI.
  • Research & Development: Published at SIGGRAPH 2021, exploring deep representation learning for Monte Carlo image reconstruction.

🎓 Education

  • M.S. in Computer Science, KAIST (2019 - 2021). GPA: 4.25/4.30.
  • B.S. in Computer Science & Mathematical Science, KAIST (2015 - 2019). GPA: 4.00/4.30, Summa Cum Laude.

💼 Career

  • Deep Learning Researcher, KRAFTON Inc. (2022 - Present).
  • Machine Learning Engineer, Spacewalk (2021 - 2022).

📫 Contact

in-young — The image was created using ChatGPT-4 to illustrate a scene that matches the meaning of my name: benevolence (仁) and prosperity (榮).

learning-loss-for-active-learning's People

Contributors

mephisto405


learning-loss-for-active-learning's Issues

Question about experiment on CIFAR-100

Hello, I read your code and it was really great.

Testing on CIFAR-10 was successful, but when I tried to train the model on CIFAR-100, it gets stuck at this point (main.py):

        # Model
        resnet18    = resnet.ResNet18(num_classes=100).cuda() # On this point
        loss_module = lossnet.LossNet().cuda()
        models      = {'backbone': resnet18, 'module': loss_module}
        torch.backends.cudnn.benchmark = False

The runtime error says:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-10-e3617788a34b> in <module>()
     21 
     22         # Model
---> 23         resnet18    = resnet.ResNet18(num_classes=100).cuda()
     24         loss_module = lossnet.LossNet().cuda()
     25         models      = {'backbone': resnet18, 'module': loss_module}

3 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in cuda(self, device)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492 
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    385     def _apply(self, fn):
    386         for module in self.children():
--> 387             module._apply(fn)
    388 
    389         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    407                 # `with torch.no_grad():`
    408                 with torch.no_grad():
--> 409                     param_applied = fn(param)
    410                 should_use_set_data = compute_should_use_set_data(param, param_applied)
    411                 if should_use_set_data:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in <lambda>(t)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492 
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

RuntimeError: CUDA error: device-side assert triggered

How can I fix this RuntimeError?

I'm running this code on Google Colab.
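Not part of the original thread, but for context: a CUDA "device-side assert" is reported asynchronously, so the `.cuda()` call in the traceback is often not the real culprit. A hypothetical debugging sketch (the helper name is mine, not from the repo) — make launches synchronous so the traceback points at the failing kernel, and verify that every label is a valid class index, since an out-of-range index in the loss is a classic cause of this assert:

```python
import os

# Make CUDA kernel launches synchronous so errors surface at the real
# operation (set this BEFORE torch touches the GPU).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

def out_of_range_labels(labels, num_classes):
    """Return the labels that would trip CUDA's index assert."""
    return [y for y in labels if not 0 <= y < num_classes]

# Example: with num_classes=100, labels 100 and -1 are invalid.
print(out_of_range_labels([3, 7, 99, 100, -1], num_classes=100))  # -> [100, -1]
```

Note that on Colab the CUDA context stays poisoned after an assert, so restart the runtime before re-running.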

Question about the loss function

Hello,
I read your code and have some questions.
Is the loss function unique to this implementation?
Is your loss function the same as the original authors'?
I'm not sure about this, and the paper doesn't state it clearly.

uncertainty argsort wrong order

Shouldn't we sort the uncertainty in descending order (based on the paper)?
The most uncertain samples need to be sent to the oracle.
However, in the code, np.argsort sorts the samples in ascending order.
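For what it's worth, `np.argsort` does sort ascending, but an ascending sort followed by taking the *last* K indices selects the largest (most uncertain) scores, which is equivalent to a descending sort. A minimal sketch (the array values are made up):

```python
import numpy as np

uncertainty = np.array([0.2, 0.9, 0.1, 0.7])
K = 2

asc = np.argsort(uncertainty)   # ascending order of indices: [2, 0, 3, 1]
top_k = asc[-K:]                # last K -> indices of the largest scores: [3, 1]

# Equivalent formulation: sort descending and take the first K.
desc_top_k = np.argsort(-uncertainty)[:K]
assert set(top_k.tolist()) == set(desc_top_k.tolist())
```

So whether the selection is correct depends on whether the code takes the first or the last K indices after the ascending sort.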

Which performance is better between confidence only and learning loss

Hi Mephisto,
I have a concern I'd like to discuss with you.

As the title says, I don't know which is better under different settings.

If we simply added the weakest images, ranked by the confidence of the plain network, to the training set in every new cycle, maybe we could match or even beat the paper's experimental results.

Have you tried this experimental setting?

Best regards,
William

Why you randomly sample 10000 unlabeled data points first?

Thanks for your implementation. I attempted to run your code and noticed that in main.py you first shuffle the unlabeled set and select 10000 unlabeled data points rather than using the whole unlabeled pool.

# main.py
# Randomly sample 10000 unlabeled data points
random.shuffle(unlabeled_set)
subset = unlabeled_set[:SUBSET]

My understanding is that the sampling should happen over the whole unlabeled pool rather than over a part of it. Am I correct? Do you do this just for faster training, or is there something else behind this operation? Have you tried selecting samples from the whole pool? Does that give similar performance? Thanks.
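For reference, the subset step the question refers to can be sketched as a stand-alone function (a simplified version, not the repo's exact code; per the README's Sec. 4.1 quote, the trick is attributed to [4]):

```python
import random

def sample_scoring_subset(unlabeled_set, subset_size, seed=None):
    """Pick a random subset of the unlabeled pool to score this cycle.

    Scoring only a subset trades a little selection quality for a large
    speedup, since the loss-prediction module no longer has to run over
    the entire unlabeled pool every cycle.
    """
    rng = random.Random(seed)
    pool = list(unlabeled_set)
    rng.shuffle(pool)
    return pool[:subset_size]

subset = sample_scoring_subset(range(50000), subset_size=10000, seed=0)
print(len(subset))  # -> 10000
```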

Random sampling of unlabeled samples

Hi, I'm so impressed with your study.

By the way, I have one question about your paper.

In 4.1 Image Classification, Dataset you said, 'To address this, [4] obtains a random subset ...'

But I could not find the sentence in 'The power of ensembles for active learning in image classification.'

I think there is something I missed.

Could you let me know where the sentence is?

Thanks!

The test accuracy does not match your figure.

Hi again.

I ran your code and it runs fine, but my test accuracy looks like the following:

Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
Train a Model.
/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py:82: UserWarning: Detected call of 'lr_scheduler.step()' before 'optimizer.step()'. In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before 'lr_scheduler.step()'. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule.See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
Finished.
Trial 1/3 || Cycle 1/10 || Label set size 1000: Test acc 47.1
Train a Model.
Finished.
Trial 1/3 || Cycle 2/10 || Label set size 2000: Test acc 60.14
Train a Model.
Finished.
Trial 1/3 || Cycle 3/10 || Label set size 3000: Test acc 69.98
Train a Model.
Finished.
Trial 1/3 || Cycle 4/10 || Label set size 4000: Test acc 79.45
Train a Model.
Finished.
Trial 1/3 || Cycle 5/10 || Label set size 5000: Test acc 81.04
Train a Model.
74%|████████████████████████████████ | 35/47 [00:04<00:01, 8.66it/s]

I know this is in the middle of a run, but the accuracy doesn't seem to match your figure. I ran your code three times, and all the results seem off. How should I interpret this?

Thanks in advance.

What is the "ground truth loss" in your reproduced image?

Thanks for your reproduction.

In your reproduced figure, there are four labels: Reference, Learn loss, Ground truth loss, and Random. I guess Reference is from the paper, Learn loss is what you reproduced, and Random is random sampling. What is "Ground truth loss"?

Thanks in advance.

How to reproduce the results?

Hello, Mephisto.

Thank you for the great paper and code.

I'm trying to reproduce the results of the paper.

I've seen that the results of other methods are overwhelmed by LL4AL.

However, I cannot reproduce a result really close to 90% at 10K labeled data.

Also, my random baseline averages 85% accuracy.

Is there anything I'm missing?

Your figure from the paper shows that random achieves more than 85% accuracy, but your reproduced result shows much less than 85%.

Is there any difference or explanation, such as a change of torch version (c.f. I'm using torch 1.5 and 1.9)?

Thanks.

Best,
DaeHo Lee

Question about your reproduced results

Hi, I have a question about how the four curves are set up:
Random: is it just random sampling from the training set, then training?
Reference: is it not the learned loss?
Ground truth loss: how is it different from the random curve?
Learn loss: is this what the paper uses?

Why is the uncertainty negative?

When I calculate the uncertainty of the data points (CIFAR-10), all the values I get are negative. This means the predicted loss for those points is negative. How does that make sense if the actual target loss is always positive?
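One likely explanation (my reading, not the maintainer's answer): the loss prediction module is trained with a margin-based ranking loss, so its output is only meaningful up to ordering — the absolute scale can drift negative without changing which samples are selected. A tiny illustration:

```python
import numpy as np

# Ranking-based selection uses only the ORDER of the scores, so any
# constant shift (including one that makes every score negative)
# selects exactly the same samples.
predicted_loss = np.array([-3.2, -0.5, -7.1, -1.0])  # made-up scores
shifted = predicted_loss + 100.0                     # all positive now

assert np.argsort(predicted_loss).tolist() == np.argsort(shifted).tolist()
print(np.argsort(predicted_loss)[-2:])  # two "most uncertain" -> [3 1]
```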

question about the backbone model

Is your LossNet module suitable for high-resolution images (such as ImageNet)? Can I just replace the backbone network with ResNet-101 and adjust feature_size and num_channels?
Thanks!
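As a back-of-the-envelope aid (my own sketch, not from the repo): a standard ResNet downsamples 224×224 ImageNet inputs by 4 in the stem and by 2 at each later stage, which gives the spatial sizes the LossNet arguments would need to match; ResNet-101's bottleneck stages output [256, 512, 1024, 2048] channels.

```python
def resnet_stage_sizes(input_size=224):
    """Spatial size of the four stage outputs of a standard ResNet.

    The stem (7x7 conv, stride 2, plus 3x3 max-pool, stride 2) divides
    by 4; stages 2-4 each divide by 2 again.
    """
    s = input_size // 4
    return [s, s // 2, s // 4, s // 8]

print(resnet_stage_sizes(224))  # -> [56, 28, 14, 7]

# Hypothetical values one might pass when swapping in ResNet-101:
feature_size = resnet_stage_sizes(224)   # [56, 28, 14, 7]
num_channels = [256, 512, 1024, 2048]    # bottleneck-block output widths
```

For CIFAR-style 32×32 inputs (no stem downsampling, as in the repo's ResNet-18), the corresponding sizes would be [32, 16, 8, 4] instead.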

Why not re-instantiate the network model in the active learning cycles? Won't keeping it make the model aware of the test set in advance?

Dear programmer,
Why don't you re-instantiate the network model in the active learning cycles? A lot of the active learning code I've seen re-instantiates the model every cycle. For example:

# Active learning cycles
for cycle in range(CYCLES):
    # Model (target model + loss prediction module)
    resnet18    = resnet.ResNet18(num_classes=10).cuda()
    loss_module = lossnet.LossNet().cuda()
    models      = {'backbone': resnet18, 'module': loss_module}
    torch.backends.cudnn.benchmark = True

    # Criterion, optimizer, and scheduler (re)initialization
    criterion = nn.CrossEntropyLoss(reduction='none')  # cross-entropy loss
    optim_backbone = optim.SGD(models['backbone'].parameters(), lr=LR,
                               momentum=MOMENTUM, weight_decay=WDECAY)
    optim_module = optim.SGD(models['module'].parameters(), lr=LR, ...)
In your code, you write it like this:

resnet18 = resnet.ResNet18(num_classes=10).cuda()

I'm wondering whether writing it this way makes the model aware of the test set in advance.

About the performance of active learning methods

I tried several methods, including random, entropy, core-set, VAAL, and yours, with ResNet-18 on CIFAR-100 using fixed initial indices, but I cannot get an ideal result: random, entropy, VAAL, and LL4AL all reach almost the same accuracy. Is this possible? I have checked my settings many times.

How do I get the reproduced result graph?

Hi,
I tried to reproduce your code and want to show the result graph.
How can I get a graph like "Figure 1. Reported results from the paper (left), reproduced results (right)"?
Could you give me a guide to making a graph like yours?
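One possible route (not an official script; the log format is assumed): parse the "Test acc" lines that main.py prints, average the per-cycle accuracies across trials, then plot the mean against labeled-set size with matplotlib.

```python
def mean_curve(trials):
    """Average per-cycle accuracies across trials.

    trials: one list of per-cycle accuracies per trial, e.g. parsed
    from the 'Test acc' lines printed during training.
    """
    n_cycles = len(trials[0])
    return [sum(t[c] for t in trials) / len(trials) for c in range(n_cycles)]

# Three made-up trials of the first two cycles:
print(mean_curve([[47.1, 60.1], [46.8, 59.9], [47.4, 60.3]]))
# -> approximately [47.1, 60.1]
```

Feeding the result to `plt.plot(sizes, mean_curve(trials), label=...)` with `sizes = [1000, 2000, ...]`, once per method, would give curves like the README's figure.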
