ej0cl6 / deep-active-learning

Deep Active Learning

License: MIT License

Python 100.00%

active-learning deep-active-learning

deep-active-learning's Issues

Is there a mistake in Deepfool implementation?

I believe there is a mistake in this block of adversarial_deepfool.py:
if value_i < value_l:
    ri = value_i/np.linalg.norm(wi.numpy().flatten()) * wi
    value_l = value_i # <--- this line should be added here; otherwise value_l is never updated, so ri always ends up coming from the last value_i in the for loop?
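The running-minimum pattern the issue asks for can be illustrated in isolation. This is a minimal pure-Python sketch, not the repository's code: the function name `smallest_perturbation` and the inputs `diffs` (output differences per candidate class) and `grads` (gradient differences as plain lists) are hypothetical stand-ins for the tensors used in adversarial_deepfool.py.

```python
import math

def smallest_perturbation(diffs, grads):
    """Return (value_l, ri) for the candidate class with the smallest
    distance |f_i| / ||w_i||. All names here are illustrative."""
    value_l = math.inf
    ri = None
    for fi, wi in zip(diffs, grads):
        norm = math.sqrt(sum(w * w for w in wi))
        value_i = abs(fi) / norm
        if value_i < value_l:
            ri = [value_i / norm * w for w in wi]
            value_l = value_i  # without this update, every later class overwrites ri
    return value_l, ri

value_l, ri = smallest_perturbation([1.0, 4.0], [[2.0, 0.0], [1.0, 0.0]])
print(value_l, ri)
```

With the update in place, the second class (distance 4.0) no longer replaces the first (distance 0.5).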

Can this pipeline be applied to any deep learning model?

Hey there,

I'm looking to use deep learning with active learning (semi-automated labeling) for a video classification project. I currently use a bidirectional LSTM and may also explore transformers, gradient boosted decision trees and SVMs. For LSTMs and transformers, is it possible to use my own model with the current framework?

Tony

Mapping between paper names to python class

Hey, thanks for this awesome repository.
A quick question: is there a mapping between the paper names and the Python classes?
Specifically, I'm interested in "Active Learning for Convolutional Neural Networks: A Core-Set Approach", which I suppose is one of these:
"RandomSampling", "LeastConfidence", "MarginSampling", "EntropySampling", "LeastConfidenceDropout", "MarginSamplingDropout", "EntropySamplingDropout", "KMeansSampling", "KCenterGreedy", "BALDDropout", "AdversarialBIM", "AdversarialDeepFool"
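Judging only by its name, KCenterGreedy looks like the closest match, since the core-set paper builds on greedy k-center selection. That mapping is an assumption and is not confirmed anywhere in this thread. For orientation, here is a minimal pure-Python sketch of greedy k-center selection; the function name, arguments and toy data are hypothetical and do not reflect the repository's implementation.

```python
import math

def k_center_greedy(points, labeled, n):
    """Pick n unlabeled indices, each time taking the point farthest
    from everything already labeled or chosen. `labeled` must be non-empty."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    unlabeled = [i for i in range(len(points)) if i not in labeled]
    # Distance from each unlabeled point to its nearest center so far.
    min_dist = {i: min(dist(points[i], points[j]) for j in labeled)
                for i in unlabeled}
    chosen = []
    for _ in range(min(n, len(unlabeled))):
        far = max(min_dist, key=min_dist.get)
        chosen.append(far)
        del min_dist[far]
        # The new center may now be the nearest one for some points.
        for i in min_dist:
            min_dist[i] = min(min_dist[i], dist(points[i], points[far]))
    return chosen

picked = k_center_greedy([(0, 0), (1, 0), (10, 0), (5, 0)], {0}, 2)
print(picked)
```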

CoreSet AL algorithm

Hi, thanks for your excellent code. I have a question regarding the CoreSet AL algorithm. It was included in a previous version, but now it is gone. May I know the reason? Thanks a lot in advance!

TensorFlow Support

Hi there,
thanks for your great work.
I'm wondering whether you know of any implementation that supports TensorFlow, or whether you plan to implement one yourself?

Thanks.

How can I add my own data set for active learning?

Hello, thank you for implementing a code base that integrates multiple active learning methods. Great job! How can I add my own data sets and use this code to do active learning on them?

Error in random_sampling.py

The query function uses 'return np.random.choice(np.where(self.idxs_lb==0)[0], n)' to generate new indexes, but without 'replace=False' it samples with replacement, so it cannot be guaranteed to produce exactly 1000 distinct indexes each iteration. To compare fairly with other query strategies at the same number of labeled samples, it should be 'return np.random.choice(np.where(self.idxs_lb==0)[0], n, replace=False)'.
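The difference can be checked directly with NumPy. The sketch below assumes a pool of 40000 samples of which the first 1000 are already labeled; those sizes and the variable names are illustrative, and only `np.random.choice` with its `replace` argument comes from the issue.

```python
import numpy as np

np.random.seed(0)

# Illustrative pool: 40000 samples, the first 1000 already labeled.
idxs_lb = np.zeros(40000, dtype=bool)
idxs_lb[:1000] = True
unlabeled = np.where(idxs_lb == 0)[0]

n = 1000
# Default behaviour: sampling with replacement, duplicates are possible.
with_replacement = np.random.choice(unlabeled, n)
# Proposed fix: sampling without replacement, all n indexes are distinct.
without_replacement = np.random.choice(unlabeled, n, replace=False)

print(len(np.unique(with_replacement)), len(np.unique(without_replacement)))
```

The second count always equals n, while the first can fall short of it whenever an index is drawn more than once.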

Couldn't find coreset implementation

Hi,

I couldn't find the coreset implementation in your code. How can I find it?

Thanks!

TypeError: missing arguments with ALBL strategy

There is a bug in deep-active-learning/query_strategies/active_learning_by_learning.py line 9.

self.strategy_list.append(RandomSampling(X, Y, idxs_lb, args))

I got the error: TypeError: __init__() missing 2 required positional arguments: 'handler' and 'args'

The fix should be:

self.strategy_list.append(RandomSampling(X, Y, idxs_lb, net, handler, args))
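The error message is consistent with a constructor that takes six positional arguments, so that the four-argument call binds `args` to `net` and leaves `handler` and `args` unfilled. The sketch below reproduces this with a hypothetical `Strategy` base class whose signature is inferred from the proposed fix, not copied from the repository.

```python
class Strategy:
    # Signature inferred from the fix proposed in the issue (an assumption).
    def __init__(self, X, Y, idxs_lb, net, handler, args):
        self.X, self.Y, self.idxs_lb = X, Y, idxs_lb
        self.net, self.handler, self.args = net, handler, args

class RandomSampling(Strategy):
    pass

def try_build(*ctor_args):
    """Return 'ok' if construction succeeds, else the TypeError message."""
    try:
        RandomSampling(*ctor_args)
        return "ok"
    except TypeError as err:
        return str(err)

X, Y, idxs_lb, net, handler, args = [], [], [], object, object, {}
short_call = try_build(X, Y, idxs_lb, args)               # four arguments
full_call = try_build(X, Y, idxs_lb, net, handler, args)  # six arguments
print(short_call)
print(full_call)
```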

Gurobi License

Thanks for your incredible code. Everything works until I run model.optimize() in full_solver_gurobi.py, which raises: 'GurobiError: Model too large for size-limited license; visit https://www.gurobi.com/free-trial for a full license'.
Gurobi does not seem to be free for large models (I ran it on 35000 vectors of size 512). I just want to quickly reproduce the results reported in the paper, not use it for any commercial purpose.
Is there any other way to reproduce the results of the Core-Set [3] paper, or to get a free Gurobi license?

AL principles

In the train function within the Net class in nets.py, a new model instance (self.net()) is created and trained each time new labeled data becomes available. However, in active learning (AL), it’s generally preferable to continue training the same model on the newly labeled data rather than starting from scratch with a new model instance. This approach allows the model to incrementally learn from the expanding dataset, leveraging previously learned information to make better predictions as more data is labeled and added to the training set.

Is it a bug in the code?

Confusion about network constructor

It seems that each time I call Net.train(self, data), a new network with freshly initialized parameters is constructed (see https://github.com/ej0cl6/deep-active-learning/blob/563723356421bc7d82e3496700265992cf7fcb06/nets.py#L17). As far as I know, in active learning the network's parameters keep being trained after each query, instead of a new network being constructed and trained from scratch. So the Net class might look like:

class Net:
    def __init__(self, net, params, device):
        self.net = net
        self.params = params
        self.device = device
        self.clf = self.net().to(self.device)    # add
        
    def train(self, data):
        n_epoch = self.params['n_epoch']
        # self.clf = self.net().to(self.device)    # remove
        self.clf.train()
        optimizer = optim.SGD(self.clf.parameters(), **self.params['optimizer_args'])

I'm a beginner in deep active learning, so the above may just be my misunderstanding. Looking forward to your reply. Thank you.
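Both behaviours discussed in this issue and the previous one can sit behind a single switch. The sketch below is a hedged illustration in plain Python: `TinyModel` is a hypothetical stand-in for a torch.nn.Module, and the `warm_start` flag is not part of the repository. Retraining from scratch each round is a common choice in active-learning experiments, and the thread does not state whether the repository intends it.

```python
class TinyModel:
    """Hypothetical stand-in for a torch.nn.Module; counts update steps."""
    def __init__(self):
        self.steps = 0

class Net:
    def __init__(self, net, warm_start=False):
        self.net = net                # model factory (a class or callable)
        self.warm_start = warm_start  # illustrative flag, not in the repo
        self.clf = None

    def train(self, data):
        # Re-initialise unless we are warm-starting from the previous round.
        if self.clf is None or not self.warm_start:
            self.clf = self.net()
        for _ in data:
            self.clf.steps += 1       # placeholder for one optimisation step
        return self.clf

cold = Net(TinyModel)
cold.train([1, 2])
cold.train([1, 2, 3])

warm = Net(TinyModel, warm_start=True)
warm.train([1, 2])
warm.train([1, 2, 3])

print(cold.clf.steps, warm.clf.steps)
```

The cold model only reflects the last round of training, while the warm model accumulates steps across rounds.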

Asking the probability formula if its a binary problem

Suppose it's a binary problem: is it correct to simply sort the predicted probabilities (or uncertainty scores) directly?
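For two classes, least confidence, margin and entropy all rank samples by how close the positive-class probability is to 0.5, so they give the same ordering. Sorting the raw probability itself gives a different ordering. The sketch below checks this on made-up probabilities; the function and variable names are illustrative and not taken from the repository.

```python
import math

def entropy(p):
    # Binary entropy, defined as 0 at the endpoints.
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def rankings(p_pos):
    """Return index orderings, most uncertain first, for three criteria."""
    idx = range(len(p_pos))
    least_conf = sorted(idx, key=lambda i: max(p_pos[i], 1 - p_pos[i]))
    margin = sorted(idx, key=lambda i: abs(2 * p_pos[i] - 1))
    ent = sorted(idx, key=lambda i: -entropy(p_pos[i]))
    return least_conf, margin, ent

p_pos = [0.9, 0.55, 0.2, 0.48]   # made-up positive-class probabilities
least_conf, margin, ent = rankings(p_pos)
raw_sort = sorted(range(len(p_pos)), key=lambda i: p_pos[i])
print(least_conf, margin, ent, raw_sort)
```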

Custom dataset for image classification

First, I would like to thank you for the library.
Is there a way to use this framework with custom image datasets for image classification with active learning?

Error with coreset strategy

I tried the CoreSet strategy with various values of NUM_INIT_LB and NUM_QUERY. All of them produce the error shown below. It looks like 'sols{}.pkl' is not being generated.

(pytorch_p36) [ec2-user@ip-172-31-38-51 deep-active-learning]$ python run.py
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
number of labeled pool: 1000
number of unlabeled pool: 39000
number of testing pool: 10000
MNIST
SEED 1
CoreSet
Round 0
testing accuracy 0.7458
Round 1
calculate distance matrix
/home/ec2-user/src/deep-active-learning/query_strategies/core_set.py:24: RuntimeWarning: invalid value encountered in sqrt
dist_mat = np.sqrt(dist_mat)
0:00:22.086067
calculate greedy solution
greedy solution 0/50
greedy solution 10/50
greedy solution 20/50
greedy solution 30/50
greedy solution 40/50
0:00:13.353899

/home/ec2-user/src/deep-active-learning/query_strategies/core_set.py(64)query()
63
---> 64 sols = pickle.load(open('sols{}.pkl'.format(SEED), 'rb'))
65

ipdb>

Exiting Debugger.
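Separately from the missing 'sols{}.pkl' file, the "invalid value encountered in sqrt" warning in the log means np.sqrt received a negative entry. One plausible cause, assuming the squared distances are computed with the expansion ||a||^2 + ||b||^2 - 2 a.b (an assumption about core_set.py, not confirmed here), is floating-point round-off pushing near-zero entries slightly below zero. The pure-Python sketch below clamps such entries at zero before taking the root; the function name is hypothetical, and math.sqrt raises ValueError where np.sqrt would return nan.

```python
import math

def pairwise_distances(points):
    """Euclidean distance matrix via the squared-norm expansion,
    clamping round-off negatives to zero before the square root."""
    sq = [sum(x * x for x in p) for p in points]
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(points[i], points[j]))
            d2 = sq[i] + sq[j] - 2.0 * dot
            dist[i][j] = math.sqrt(max(d2, 0.0))  # clamp guards against tiny negatives
    return dist

d = pairwise_distances([(0.0, 0.0), (3.0, 4.0)])
print(d)
```

This would address only the warning, not the failure to load the pickle file.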