mmal-net's Issues



Another N_list config [to speed up training]

Thank you very much for your great work.
Could you please suggest another N_list configuration (for the CAR dataset) that would speed up training?
(Would window_side and ratios also need to change?)



About the loss function

Hi, I would like to discuss the loss function with you :) . Do you think using triplet loss or center loss instead of CE loss for FGVC could improve the model's performance?


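Hello, after adding the AOLM module to my model, the following error occurs, and it still occurs after switching from two GPUs to a single GPU. Strangely, I once completed a full training run with the same parameters, but when I train again it raises ValueError: max() arg is an empty sequence.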
Traceback (most recent call last):
File "", line 33, in
File "/root/autodl-tmp/PART-master/processor/", line 102, in do_train
cls_g, cls_1 = model(img, target,mode='train') #0515
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/", line 168, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/", line 178, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/", line 86, in parallel_apply
File "/root/miniconda3/lib/python3.8/site-packages/torch/", line 425, in reraise
raise self.exc_type(msg)
ValueError: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/", line 61, in _worker
output = module(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/root/autodl-tmp/PART-master/model/", line 321, in forward
return self.forward_multi(inputs, label)
File "/root/autodl-tmp/PART-master/model/", line 406, in forward_multi
coordinates = torch.tensor(AOLM(out.detach()))
File "/root/autodl-tmp/PART-master/model/", line 33, in AOLM
max_idx = areas.index(max(areas))
ValueError: max() arg is an empty sequence

About ensemble and training details.

Hi, thanks for your simple and efficient method. I have some questions about your network.
1. Your classification results are based on the output of the second branch. Have you tried ensembling the three branches? Does it improve the results?
2. In the part branch, the input size is 224 * 224 rather than 448 * 448. Is this a trick or not?
3. Is the localization IoU metric computed over all images or over the test set only?




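Hello, with two GPUs, is it enough to change CUDA_VISIBLE_DEVICES = '0' to CUDA_VISIBLE_DEVICES = '0,1' where it is set?
Also, regarding the three values of N_list: with 16 GB of GPU memory, what values would be appropriate for the CUB dataset?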



I found that the test code first loads the pretrained model at model = MainNet() in test.py (pretrain_path = './models/pretrained/resnet50-19c8e357.pth' in config.py).

Then epoch = auto_load_resume(model, pth_path, status='test') in test.py loads your trained model again.





"conda create --name DCL file conda_list.txt" should be "conda create --name DCL --file conda_list.txt"

Stanford cars

Could you provide a checkpoint model for the Stanford Cars dataset?


properties = measure.regionprops(component_labels)
areas = []
for prop in properties:
max_idx = areas.index(max(areas))
ValueError: max() arg is an empty sequence
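
The error above is raised when `areas` is empty, which happens if `measure.regionprops` returns no regions (for instance when the mask has no positive pixels). A minimal sketch of a guard follows, assuming a fallback box covering the whole feature map is acceptable; `pick_largest_region` and `fallback_bbox` are hypothetical names, not part of the repository.

```python
def pick_largest_region(regions, fallback_bbox):
    """Return the bbox of the largest region, or fallback_bbox if there is none.

    regions: list of (area, bbox) pairs, e.g. built from regionprops results.
    fallback_bbox: box to use when no region was found (assumption: whole map).
    """
    if not regions:
        # max() on an empty list would raise ValueError, so bail out early.
        return fallback_bbox
    _, bbox = max(regions, key=lambda region: region[0])
    return bbox


# No connected component found: fall back instead of crashing.
print(pick_largest_region([], (0, 0, 13, 13)))
# Two components: the one with the larger area wins.
print(pick_largest_region([(4, (0, 0, 1, 1)), (9, (2, 2, 5, 5))], (0, 0, 13, 13)))
```

Whether a whole-map fallback is the right behaviour depends on why the mask is empty; an all-empty mask late in training may point to an upstream problem such as NaN activations.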






UserWarning: Possibly corrupt EXIF data. Expecting to read 200 bytes but only got 0. Skipping tag 0
UserWarning: Possibly corrupt EXIF data. Expecting to read 143 bytes but only got 0. Skipping tag 0
UserWarning: Possibly corrupt EXIF data. Expecting to read 393216 bytes but only got 0. Skipping tag 0


About N_list and ratios in the code


Thanks for your great work and code.

I am not familiar with the detection task. Could you please tell me what N_list and ratios stand for and how to choose them for my own dataset?

Something went wrong when changing default config

I ran your code and trained on the "Aircraft" dataset. It worked normally until I replaced your CE loss with ArcFace loss and your SGD optimizer with Adam. The code still runs, but the log shows "there is one img no intersection" and the accuracy is very low (approximately 1%). What happened?

Here is my ArcFace loss:

class ArcFaceLoss(nn.Module):
    def __init__(self, s=30.0, m=0.50, is_cuda=True, base_loss = 'CrossEntropyLoss'):
        super(ArcFaceLoss, self).__init__()
        self.s = s
        self.m = m
        self.criterion = nn.CrossEntropyLoss()
        self.criterion = self.criterion.cuda()

    def forward(self, input, label):
        theta = torch.acos(torch.clamp(input, -1.0 + 1e-7, 1.0 - 1e-7))
        target_logits = torch.cos(theta + self.m) 
        one_hot = torch.zeros_like(input)
        one_hot.scatter_(1, label.view(-1, 1).long(), 1)
        output = input * (1 - one_hot) + target_logits * one_hot
        output = output * self.s
        return self.criterion(output, label)
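
One thing that may be worth checking, though nothing in the issue confirms it as the cause: the clamp/acos steps in `forward` only make sense when `input` holds cosine similarities in [-1, 1]. If `input` is the raw output of a linear classifier, most values fall outside that range and are collapsed by the clamp. The sketch below mirrors those steps in plain Python for single values; `arcface_target_logit` and `cosine` are hypothetical helper names.

```python
import math


def arcface_target_logit(value, margin=0.5, eps=1e-7):
    """Apply the clamp -> acos -> cos(theta + m) steps to one scalar."""
    clamped = max(-1.0 + eps, min(1.0 - eps, value))
    theta = math.acos(clamped)
    return math.cos(theta + margin)


def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Raw (unnormalized) logits above 1 all collapse to the same value.
raw_logits = [5.2, 12.0, 40.0]
collapsed = [arcface_target_logit(z) for z in raw_logits]
print(len(set(collapsed)))  # number of distinct values left after the clamp

# A true cosine similarity keeps its angular meaning.
print(arcface_target_logit(cosine([1.0, 0.0], [1.0, 1.0])))
```

In the usual ArcFace setup both the features and the class weights are L2-normalized before the margin is applied, so the classifier output is already a cosine.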




Error in evaluation using

I trained TBMSL-NET on my custom data set. During the evaluation using the, I am getting the following error:

0%| | 0/15 [00:00<?, ?it/s]
Traceback (most recent call last):
File "", line 1, in
Traceback (most recent call last):
File "", line 58, in
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 105, in spawn_main
for i, data in enumerate(tqdm(testloader)):
File "E:\Anaconda3\envs\tbmsl_net\lib\site-packages\tqdm\", line 1104, in iter
exitcode = _main(fd)
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 114, in _main
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 225, in prepare
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 277, in _fixup_main_from_path
File "E:\Anaconda3\envs\tbmsl_net\lib\", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "E:\Anaconda3\envs\tbmsl_net\lib\", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "E:\Anaconda3\envs\tbmsl_net\lib\", line 85, in _run_code
exec(code, run_globals)
File "F:\Codes\TBMSL-Net\", line 58, in
for i, data in enumerate(tqdm(testloader)):
File "E:\Anaconda3\envs\tbmsl_net\lib\site-packages\tqdm\", line 1104, in iter
for obj in iterable:
for obj in iterable:
File "E:\Anaconda3\envs\tbmsl_net\lib\site-packages\torch\utils\data\", line 278, in iter
File "E:\Anaconda3\envs\tbmsl_net\lib\site-packages\torch\utils\data\", line 278, in iter
return _MultiProcessingDataLoaderIter(self)
File "E:\Anaconda3\envs\tbmsl_net\lib\site-packages\torch\utils\data\", line 682, in init
return _MultiProcessingDataLoaderIter(self)
File "E:\Anaconda3\envs\tbmsl_net\lib\site-packages\torch\utils\data\", line 682, in init
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 112, in start
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 112, in start
self._popen = self._Popen(self)
self._popen = self._Popen(self)
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 223, in _Popen
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 322, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 322, in _Popen
return Popen(process_obj)
return Popen(process_obj)
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 89, in init
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 46, in init
reduction.dump(process_obj, to_child)
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 60, in dump
prep_data = spawn.get_preparation_data(process_obj._name)
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 143, in get_preparation_data
File "E:\Anaconda3\envs\tbmsl_net\lib\multiprocessing\", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
ForkingPickler(file, protocol).dump(obj)

BrokenPipeError: [Errno 32] Broken pipe

I also tried it on FGVC-Aircraft data set but got the same error. What can cause this?
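
The message in the traceback points to the `if __name__ == '__main__':` idiom: on Windows, worker processes are started by re-importing the main script, so any code that starts workers at module level runs again in each child. The following is a minimal sketch of that idiom using only the standard library; `score_batch`, `evaluate` and `main` are hypothetical stand-ins for the evaluation loop, not the repository's code.

```python
import multiprocessing
import sys


def score_batch(batch):
    """Stand-in for the per-batch evaluation work (hypothetical)."""
    return sum(batch)


def evaluate(batches, num_workers=0):
    """Score batches in-process (num_workers=0) or in a pool of workers."""
    if num_workers == 0:
        return [score_batch(batch) for batch in batches]
    with multiprocessing.Pool(num_workers) as pool:
        return pool.map(score_batch, batches)


def main(num_workers):
    batches = [[1, 2, 3], [4, 5], [6]]
    print(evaluate(batches, num_workers))


if __name__ == "__main__":
    # Only the parent process runs this block; spawned workers re-import
    # the module under a different __name__ and skip it.
    has_arg = len(sys.argv) > 1 and sys.argv[1].isdigit()
    main(int(sys.argv[1]) if has_arg else 0)
```

Applied to the evaluation script, this would mean moving the loop over `testloader` into a function called from such a guard. Setting the DataLoader's `num_workers` to 0 is another way to avoid starting worker processes at all, at the cost of slower data loading.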



There is something wrong

Hi! First of all, congratulations on your work. I want to use your model as a pretrained model for my own work. On a dataset of bird images, I would like to use it to remove the background or filter out images without birds. When I run it on my dataset, the accuracy is very low (1%) and the following message appears: "there is one img no intersection". What is the main problem?

I also want to ask: in your paper you say that the model extracts the object location (bounding box) and discriminative parts. But when I read your code, it seems to use bounding box information, and I cannot tell whether it is used directly or indirectly. I can't understand how your code works. How can I apply your model to my custom dataset for the purpose described above?

Thanks and stay healthy, Harun Alperen

Gradient of maximum score with respect to input image is 'None'

I am trying to compute saliency maps for MMAL-Net. I am following this blog. I have trained MMAL-Net on my custom data. To compute the saliency map, we first forward an image through the network and compute the scores. Then we need the gradient of the maximum score with respect to each pixel of the input image. This is done by calling backward() (torch.autograd). In my case, when I call backward() on the maximum score, the gradients are None.

The method in the blog works for other models available in torch, but for MMAL-Net the gradients are None.

Are there any suggestions on how to fix this, or am I missing something?

Config for custom datasets

Hi, thank you for your great work.
I have a question about training on other datasets: how should I change the window_size and ratio for different kinds of data?

Thank you.


When resuming training from the previous checkpoint, an error occurs: RuntimeError:in loading state_dict for MainNet:Missing key in state_dict:...





Questions about how part images are generated

# windows info for CUB
N_list = [2, 3, 2]
proposalN = sum(N_list) # proposal window num
window_side = [128, 192, 256]
iou_threshs = [0.25, 0.25, 0.25]
ratios = [[4, 4], [3, 5], [5, 3],
[6, 6], [5, 7], [7, 5],
[8, 8], [6, 10], [10, 6], [7, 9], [9, 7], [7, 10], [10, 7]]
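[Question 1] In the paper you mention that part images are obtained with a sliding window. In the code, the final part images are resized to 224*224 with bilinear interpolation, is that right? What exactly do the parameters above mean? I don't quite understand the setting N_list = [2, 3, 2]: since the iou_threshs used for non-maximum suppression are all 0.25, why split it into three? (I'm not sure I understand this correctly.) Is ratios the size ratio of the sliding windows? And what is window_side?

[Question 2] I also can't quite follow the APPM source code. Probably because I don't understand the exact meaning of the parameters above, I ran into many problems reading the code, and I still can't work out what ratios represents here. Could you please annotate the following code for me? Thank you very much!!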
import numpy as np
import torch
import torch.nn as nn

# ratios, coordinates_cat and nms are defined elsewhere in the repository.


class APPM(nn.Module):
    def __init__(self):
        super(APPM, self).__init__()
        # One stride-1 average pool per window shape listed in ratios
        self.avgpools = [nn.AvgPool2d(ratios[i], 1) for i in range(len(ratios))]

    def forward(self, proposalN, x, ratios, window_nums_sum, N_list, iou_threshs, DEVICE='cuda'):
        batch, channels, _, _ = x.size()
        avgs = [self.avgpools[i](x) for i in range(len(ratios))]

        # feature map sum: add up the channels to get one score per window position
        fm_sum = [torch.sum(avgs[i], dim=1) for i in range(len(ratios))]

        # torch.cat concatenates the per-ratio scores into one tensor
        all_scores = torch.cat([fm_sum[i].view(batch, -1, 1) for i in range(len(ratios))], dim=1)
        # .cpu() moves the data to the CPU without changing its type (still a Tensor);
        # .numpy() converts the Tensor to an ndarray
        windows_scores_np = all_scores.data.cpu().numpy()
        # torch.from_numpy() turns the array into a tensor that shares memory with it
        # (the later .to(DEVICE) makes a copy when DEVICE is a GPU)
        window_scores = torch.from_numpy(windows_scores_np).to(DEVICE).reshape(batch, -1)

        # nms: run once per group of windows, keeping N_list[j] windows from group j
        proposalN_indices = []
        for i, scores in enumerate(windows_scores_np):
            indices_results = []
            for j in range(len(window_nums_sum)-1):
                indices_results.append(nms(scores[sum(window_nums_sum[:j+1]):sum(window_nums_sum[:j+2])], proposalN=N_list[j], iou_threshs=iou_threshs[j],
                                           coordinates=coordinates_cat[sum(window_nums_sum[:j+1]):sum(window_nums_sum[:j+2])]) + sum(window_nums_sum[:j+1]))
            # indices_results.reverse()
            proposalN_indices.append(np.concatenate(indices_results, 1))   # reverse

        proposalN_indices = np.array(proposalN_indices).reshape(batch, proposalN)
        proposalN_indices = torch.from_numpy(proposalN_indices).to(DEVICE)
        # Gather the scores of the selected windows for every image in the batch
        proposalN_windows_scores = torch.cat(
            [torch.index_select(all_score, dim=0, index=proposalN_indices[i]) for i, all_score in enumerate(all_scores)], 0).reshape(
            batch, proposalN)

        return proposalN_indices, proposalN_windows_scores, window_scores
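
One way to read the window parameters quoted in this issue, offered as a sketch rather than the authors' explanation: each entry of `ratios` is used as the kernel size of a stride-1 `nn.AvgPool2d`, so it is a window shape in feature-map cells, and each `N_list[j]` is passed to `nms` as the number of windows to keep from group `j`. The code below counts the candidate windows per group. It assumes a 448x448 input and a backbone stride of 32 (a 14x14 feature map), and it assumes the three rows of `ratios` are the three groups matching `window_side`; none of these assumptions is stated in the issue.

```python
# Window parameters quoted in the issue (CUB settings).
N_list = [2, 3, 2]
window_side = [128, 192, 256]
ratios = [[4, 4], [3, 5], [5, 3],
          [6, 6], [5, 7], [7, 5],
          [8, 8], [6, 10], [10, 6], [7, 9], [9, 7], [7, 10], [10, 7]]

# Assumptions: 448 / 32 = 14 cells per side, ratios grouped row by row.
FEATURE_MAP_SIDE = 14
GROUP_SIZES = [3, 3, 7]


def windows_per_ratio(ratio, fm_side=FEATURE_MAP_SIDE):
    """Number of positions of an h x w pooling window moved with stride 1."""
    h, w = ratio
    return (fm_side - h + 1) * (fm_side - w + 1)


def summarize():
    """Return (window_side, windows kept, candidate windows) per group."""
    rows = []
    start = 0
    for side, keep, size in zip(window_side, N_list, GROUP_SIZES):
        group = ratios[start:start + size]
        candidates = sum(windows_per_ratio(r) for r in group)
        rows.append((side, keep, candidates))
        start += size
    return rows


for side, keep, candidates in summarize():
    print(f"window_side={side}: {candidates} candidate windows, keep {keep} after NMS")
```

Under these assumptions a 4x4 window covers 4 * 32 = 128 input pixels per side, which matches the first `window_side` value, and the other shapes in a group are near-equal-area variants with different aspect ratios.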



2. The pretrained model that is loaded is resnet50. Does this initially loaded (original) resnet50 need to be fine-tuned on CUB-200?
