
biformer's People

Contributors

rayleizhu

biformer's Issues

Region-to-region routing with directed graph.

The directed graph for establishing Bi-level routing is not described in detail in the paper. It is a key step in BRA. How is it done? Can you provide an intuitive schematic diagram?
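
For readers looking for an intuition, region-to-region routing can be pictured as building a directed graph over the regions: each region scores every region with a region-level query/key dot product and keeps edges only to its top-k highest-scoring regions. The sketch below is a minimal pure-Python illustration under that assumption; the function name `route_regions` and the toy descriptors are hypothetical and are not taken from the repository.

```python
def route_regions(region_q, region_k, topk):
    """Return, for each region i, the indices of the top-k regions it routes to.

    region_q, region_k: lists of region-level descriptors (one vector per region).
    The result is the adjacency list of a directed graph: an edge i -> j means
    that queries in region i will attend to keys/values in region j.
    """
    adjacency = []
    for q in region_q:
        # Region-to-region affinity: dot product of region-level query and key.
        scores = [sum(a * b for a, b in zip(q, k)) for k in region_k]
        # Keep the k highest-scoring regions (ties broken by lower index).
        ranked = sorted(range(len(scores)), key=lambda j: (-scores[j], j))
        adjacency.append(ranked[:topk])
    return adjacency


# Toy example: 4 regions with 2-dimensional descriptors (made-up numbers).
region_q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
region_k = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0], [-2.0, 0.5]]
graph = route_regions(region_q, region_k, topk=2)
for i, targets in enumerate(graph):
    print(f"region {i} -> regions {targets}")
```

The graph is directed because the top-k selection is made per source region: region 3 may route to region 1 without region 1 routing back to region 3.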

Question about reproducing results of biformer-tiny on ImageNet1K classification

Thank you for your work, I'm really interested in your model.

I've tried to reproduce your results, especially for biformer-tiny, but I was not able to reach the accuracy reported in the paper.

Since I don't have a slurm cluster, I trained on my local GPU machine with the following script:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
--data-path ./data/in1k \
--model 'biformer_tiny' \
--output_dir './outputs' \
--input-size 224 \
--batch-size 128 \
--drop-path 0.1 \
--lr 5e-4 \
--dist-eval

I have run several experiments, such as changing lr from 5e-4 to 1e-3 and, as mentioned in your paper, using a total batch size of 1024 with different numbers of GPUs, e.g. 256 per GPU on 4 GPUs or 128 per GPU on 8 GPUs.

However, the result I got was only 81.26%, which falls short of the 81.4% in your paper (probably 81.37% according to your log, is that right?).

Could you please share the script used to train the biformer-tiny, small and base models? It doesn't matter whether the script is based on hydra_main or slurm.

Thank you.

Problems with using the biformer attention mechanism

RuntimeError: scatter_add_cuda_kernel does not have a deterministic implementation, but you set 'torch.use_deterministic_algorithms(True)'. You can turn off determinism just for this operation, or you can use the 'warn_only=True' option, if that's accepta...

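Single-GPU training of the segmentation model

Hello author, thank you for your work. While reproducing the semantic segmentation code in BiFormer, I followed your suggestion on GitHub and modified slurm_train.sh so that I could run it on a single machine with a single GPU. After running sfpn.biformer_small.py, I only obtained an mIoU of 42.7, which is far from your mIoU of 48.9. Do you have any suggestions for adjusting the parameters when training the segmentation model on a single machine with a single GPU? Looking forward to your reply.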

About the results

Thank you for your work!
I have run the script in the segmentation part, but I cannot reproduce the results in the paper.
This is my script:

MODEL=upernet.biformer_small
OUTPUT_DIR=/home/h3c/workspace/results/biformer/seg

CONFIG_DIR=configs/ade20k
CONFIG=${CONFIG_DIR}/${MODEL}.py

NOW=$(date '+%m-%d-%H:%M:%S')
WORK_DIR=${OUTPUT_DIR}/${MODEL}/${NOW}
CKPT=/home/h3c/workspace/codes/BiFormer/biformer_small_best.pth

python -m torch.distributed.launch --nproc_per_node=2 --master_port=25643 train.py ${CONFIG} \
            --launcher="pytorch" \
            --work-dir=${WORK_DIR} \
            --options model.pretrained=${CKPT} \

I simply used the checkpoint you released in the classification part and followed all the parameters in /BiFormer/semantic_segmentation/configs/ade20k/upernet.biformer_small.py.

After training for 160K iterations, I got an mIoU of 45.32%:

{"mode": "val", "epoch": 32, "iter": 1000, "lr": 0.0, "aAcc": 0.8157, "mIoU": 0.4532, "mAcc": 0.6129, "IoU.wall": 0.7498, "IoU.building": 0.8148, "IoU.sky": 0.9405, "IoU.floor": 0.8119, "IoU.tree": 0.7525, "IoU.ceiling": 0.8209, "IoU.road": 0.8311, "IoU.bed ": 0.8721, "IoU.windowpane": 0.6067, "IoU.grass": 0.689, "IoU.cabinet": 0.572, "IoU.sidewalk": 0.6193, "IoU.person": 0.775, "IoU.earth": 0.3497, "IoU.door": 0.4682, "IoU.table": 0.5372, "IoU.mountain": 0.5628, "IoU.plant": 0.5036, "IoU.curtain": 0.7305, "IoU.chair": 0.5011, "IoU.car": 0.8192, "IoU.water": 0.5351, "IoU.painting": 0.6741, "IoU.sofa": 0.5771, "IoU.shelf": 0.3926, "IoU.house": 0.4642, "IoU.sea": 0.6092, "IoU.mirror": 0.6144, "IoU.rug": 0.6618, "IoU.field": 0.3732, "IoU.armchair": 0.34, "IoU.seat": 0.6288, "IoU.fence": 0.4776, "IoU.desk": 0.4634, "IoU.rock": 0.4338, "IoU.wardrobe": 0.4775, "IoU.lamp": 0.5535, "IoU.bathtub": 0.6855, "IoU.railing": 0.3617, "IoU.cushion": 0.5055, "IoU.base": 0.3135, "IoU.box": 0.1889, "IoU.column": 0.4427, "IoU.signboard": 0.3714, "IoU.chest of drawers": 0.3435, "IoU.counter": 0.3965, "IoU.sand": 0.4167, "IoU.sink": 0.6428, "IoU.skyscraper": 0.4852, "IoU.fireplace": 0.6056, "IoU.refrigerator": 0.6387, "IoU.grandstand": 0.382, "IoU.path": 0.2026, "IoU.stairs": 0.3024, "IoU.runway": 0.6852, "IoU.case": 0.6331, "IoU.pool table": 0.9035, "IoU.pillow": 0.5455, "IoU.screen door": 0.5381, "IoU.stairway": 0.3905, "IoU.river": 0.1623, "IoU.bridge": 0.5883, "IoU.bookcase": 0.3064, "IoU.blind": 0.4062, "IoU.coffee table": 0.4865, "IoU.toilet": 0.73, "IoU.flower": 0.362, "IoU.book": 0.4247, "IoU.hill": 0.1346, "IoU.bench": 0.3971, "IoU.countertop": 0.4909, "IoU.stove": 0.6121, "IoU.palm": 0.4961, "IoU.kitchen island": 0.3359, "IoU.computer": 0.5765, "IoU.swivel chair": 0.4285, "IoU.boat": 0.3019, "IoU.bar": 0.4825, "IoU.arcade machine": 0.6025, "IoU.hovel": 0.298, "IoU.bus": 0.799, "IoU.towel": 0.6117, "IoU.light": 0.4729, "IoU.truck": 0.2551, "IoU.tower": 0.2966, "IoU.chandelier": 
0.6189, "IoU.awning": 0.3284, "IoU.streetlight": 0.2013, "IoU.booth": 0.3303, "IoU.television receiver": 0.6494, "IoU.airplane": 0.5304, "IoU.dirt track": 0.112, "IoU.apparel": 0.3022, "IoU.pole": 0.1561, "IoU.land": 0.0165, "IoU.bannister": 0.0477, "IoU.escalator": 0.4338, "IoU.ottoman": 0.4951, "IoU.bottle": 0.1214, "IoU.buffet": 0.4683, "IoU.poster": 0.137, "IoU.stage": 0.1453, "IoU.van": 0.4185, "IoU.ship": 0.5525, "IoU.fountain": 0.243, "IoU.conveyer belt": 0.535, "IoU.canopy": 0.1664, "IoU.washer": 0.6119, "IoU.plaything": 0.2774, "IoU.swimming pool": 0.5712, "IoU.stool": 0.3033, "IoU.barrel": 0.379, "IoU.basket": 0.3221, "IoU.waterfall": 0.7458, "IoU.tent": 0.7565, "IoU.bag": 0.0769, "IoU.minibike": 0.6268, "IoU.cradle": 0.5708, "IoU.oven": 0.3523, "IoU.ball": 0.4085, "IoU.food": 0.4589, "IoU.step": 0.0194, "IoU.tank": 0.494, "IoU.trade name": 0.2347, "IoU.microwave": 0.7802, "IoU.pot": 0.3467, "IoU.animal": 0.5789, "IoU.bicycle": 0.5363, "IoU.lake": 0.5452, "IoU.dishwasher": 0.4162, "IoU.screen": 0.519, "IoU.blanket": 0.0402, "IoU.sculpture": 0.4434, "IoU.hood": 0.5812, "IoU.sconce": 0.2392, "IoU.vase": 0.2684, "IoU.traffic light": 0.1992, "IoU.tray": 0.0305, "IoU.ashcan": 0.3684, "IoU.fan": 0.5163, "IoU.pier": 0.3581, "IoU.crt screen": 0.0402, "IoU.plate": 0.5139, "IoU.monitor": 0.0864, "IoU.bulletin board": 0.4406, "IoU.shower": 0.0077, "IoU.radiator": 0.5858, "IoU.glass": 0.0791, "IoU.clock": 0.2303, "IoU.flag": 0.3673, "Acc.wall": 0.8514, "Acc.building": 0.9065, "Acc.sky": 0.9687, "Acc.floor": 0.8836, "Acc.tree": 0.8716, "Acc.ceiling": 0.8823, "Acc.road": 0.8934, "Acc.bed ": 0.9423, "Acc.windowpane": 0.7647, "Acc.grass": 0.8592, "Acc.cabinet": 0.6884, "Acc.sidewalk": 0.7713, "Acc.person": 0.8835, "Acc.earth": 0.47, "Acc.door": 0.6218, "Acc.table": 0.693, "Acc.mountain": 0.7681, "Acc.plant": 0.623, "Acc.curtain": 0.8565, "Acc.chair": 0.6319, "Acc.car": 0.9061, "Acc.water": 0.6655, "Acc.painting": 0.8565, "Acc.sofa": 0.7563, "Acc.shelf": 0.5809, 
"Acc.house": 0.7459, "Acc.sea": 0.8515, "Acc.mirror": 0.7218, "Acc.rug": 0.7969, "Acc.field": 0.5453, "Acc.armchair": 0.5544, "Acc.seat": 0.8609, "Acc.fence": 0.6484, "Acc.desk": 0.7283, "Acc.rock": 0.6883, "Acc.wardrobe": 0.7032, "Acc.lamp": 0.7083, "Acc.bathtub": 0.8115, "Acc.railing": 0.4746, "Acc.cushion": 0.6132, "Acc.base": 0.5948, "Acc.box": 0.2305, "Acc.column": 0.6038, "Acc.signboard": 0.4677, "Acc.chest of drawers": 0.5021, "Acc.counter": 0.4562, "Acc.sand": 0.588, "Acc.sink": 0.7237, "Acc.skyscraper": 0.6147, "Acc.fireplace": 0.8795, "Acc.refrigerator": 0.8517, "Acc.grandstand": 0.8288, "Acc.path": 0.3338, "Acc.stairs": 0.3531, "Acc.runway": 0.954, "Acc.case": 0.8407, "Acc.pool table": 0.9591, "Acc.pillow": 0.7043, "Acc.screen door": 0.8156, "Acc.stairway": 0.4694, "Acc.river": 0.3618, "Acc.bridge": 0.8505, "Acc.bookcase": 0.5925, "Acc.blind": 0.4818, "Acc.coffee table": 0.8091, "Acc.toilet": 0.9159, "Acc.flower": 0.5195, "Acc.book": 0.6065, "Acc.hill": 0.219, "Acc.bench": 0.5155, "Acc.countertop": 0.6636, "Acc.stove": 0.7709, "Acc.palm": 0.7272, "Acc.kitchen island": 0.8012, "Acc.computer": 0.7625, "Acc.swivel chair": 0.7474, "Acc.boat": 0.5022, "Acc.bar": 0.6881, "Acc.arcade machine": 0.8357, "Acc.hovel": 0.4787, "Acc.bus": 0.9661, "Acc.towel": 0.7556, "Acc.light": 0.5349, "Acc.truck": 0.4912, "Acc.tower": 0.6598, "Acc.chandelier": 0.8069, "Acc.awning": 0.4226, "Acc.streetlight": 0.2572, "Acc.booth": 0.6564, "Acc.television receiver": 0.8086, "Acc.airplane": 0.704, "Acc.dirt track": 0.3559, "Acc.apparel": 0.3625, "Acc.pole": 0.1917, "Acc.land": 0.0255, "Acc.bannister": 0.0705, "Acc.escalator": 0.8254, "Acc.ottoman": 0.5913, "Acc.bottle": 0.1328, "Acc.buffet": 0.6984, "Acc.poster": 0.1911, "Acc.stage": 0.452, "Acc.van": 0.5686, "Acc.ship": 0.9553, "Acc.fountain": 0.2597, "Acc.conveyer belt": 0.9528, "Acc.canopy": 0.3348, "Acc.washer": 0.7959, "Acc.plaything": 0.5114, "Acc.swimming pool": 0.835, "Acc.stool": 0.4093, "Acc.barrel": 0.6512, "Acc.basket": 
0.3959, "Acc.waterfall": 0.9322, "Acc.tent": 0.9926, "Acc.bag": 0.0848, "Acc.minibike": 0.8096, "Acc.cradle": 0.8412, "Acc.oven": 0.5692, "Acc.ball": 0.6886, "Acc.food": 0.5119, "Acc.step": 0.0243, "Acc.tank": 0.6731, "Acc.trade name": 0.258, "Acc.microwave": 0.9172, "Acc.pot": 0.4097, "Acc.animal": 0.6256, "Acc.bicycle": 0.7739, "Acc.lake": 0.6362, "Acc.dishwasher": 0.5984, "Acc.screen": 0.8872, "Acc.blanket": 0.0459, "Acc.sculpture": 0.6051, "Acc.hood": 0.656, "Acc.sconce": 0.2872, "Acc.vase": 0.3981, "Acc.traffic light": 0.4289, "Acc.tray": 0.0337, "Acc.ashcan": 0.5113, "Acc.fan": 0.7614, "Acc.pier": 0.7415, "Acc.crt screen": 0.1154, "Acc.plate": 0.6714, "Acc.monitor": 0.1304, "Acc.bulletin board": 0.5742, "Acc.shower": 0.0081, "Acc.radiator": 0.6818, "Acc.glass": 0.0807, "Acc.clock": 0.2586, "Acc.flag": 0.4091}

In addition, I tried the new version, bra_nchw.py, but got an even lower mIoU of about 31%.

Sorry, am I doing something wrong?

We couldn't find the code for the Biformer used for object detection

Thank you very much for such great work. However, we couldn't find the code for the Biformer used for object detection.

There is no Biformer.py in the directory “/BiFormer/object_detection/mmdet/models/backbones/”.

We would be very grateful if you could provide this code.

Thank you very much!

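Visualization in Figure 4 of the paper

Hello author, may I ask whether the top-k value used for the visualization in Figure 4 is 16? Does this visualization come from stage 3, or did you already use a top-k value of 16 at stage 1 when producing the visualization?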

About the "gather"

Hello! I would like to know what the "gather" step is for, and why bra_nchw.py does not have a "gather" step.
I hope you can answer this for me.
Thank you!
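
As a rough illustration of what a "gather" step typically does in this kind of routed attention: once routing has picked the top-k regions for each query region, the key/value tokens of those regions are collected into one list per query region so that ordinary dense attention can be applied to them. The sketch is a pure-Python toy under that assumption; `gather_kv` and the data layout are hypothetical, do not reproduce the repository code, and say nothing about how bra_nchw.py handles this.

```python
def gather_kv(kv_per_region, routing_index):
    """Collect the key/value tokens of the routed regions for each query region.

    kv_per_region: list of regions, each a list of tokens.
    routing_index: for each query region, the indices of its routed regions.
    Returns one flat token list per query region.
    """
    gathered = []
    for routed in routing_index:
        tokens = []
        for region_id in routed:
            tokens.extend(kv_per_region[region_id])
        gathered.append(tokens)
    return gathered


# 3 regions with 2 tokens each; tokens are labelled strings for readability.
kv_per_region = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
routing_index = [[0, 2], [1, 0], [2, 1]]  # top-2 routed regions per query region
gathered = gather_kv(kv_per_region, routing_index)
print(gathered[0])
```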

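Question about the number of region partitions S*S

@rayleizhu
Hello author,
Thank you for your great contribution.
While reading the paper, I had a question: in the paper and the code, the SxS region partition is set to 7x7 (taking classification as an example), but according to the definition in Equation 9 of Section 3.3:
[image: Equation 9]

Therefore, to obtain the minimum computational complexity for a 224*224 classification input with k=4, S should be 41.40031811187035, which is far from the value of 7 set in the paper. May I ask why S=7 was chosen?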
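
The value quoted in this question can be reproduced with a few lines of Python, assuming Equation 9 has the form S = ((k/2) * (H*W)^2)^(1/6). That form is an assumption inferred from the number in the question, since the equation itself is only available as an image here, and the helper name `optimal_s` is hypothetical. H and W are the values plugged in by the question, namely the 224x224 input resolution.

```python
def optimal_s(height, width, k):
    """Region count per side that balances the routing and attention cost terms,
    assuming S = ((k / 2) * (H * W) ** 2) ** (1 / 6)."""
    return ((k / 2) * (height * width) ** 2) ** (1 / 6)


# Same inputs as in the question: a 224x224 map and k = 4.
s_input = optimal_s(224, 224, k=4)
print(s_input)
```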

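Question about Bi-level Routing Attention

Hello author! I recently read your paper and tested the proposed method on an object detection network, and it works very well.
However, I still have some questions I would like to ask. Is there a detailed structure diagram of the Bi-level Routing Attention described in the paper?
If it is convenient, could you please provide one? Many thanks!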

What are the functions of the parameters?

I have noticed some parameters in the code, such as kv_downsample, diff_routing, and so on. What do these parameters do? How will they influence the model?

About the argument 'img_size'

Hello, regarding model initialization in the script 'biformer.py': there is an argument named 'img_size' in the comment, but it is not used anywhere in the script. Is this correct?

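Segmentation model implementation issue

First of all, thank you for doing such excellent work and sharing it. Below is the problem I ran into while running the code.
I want to run the semantic_segmentation part of the code. Following the relevant readme, I downloaded the dataset and modified slurm_train.sh. Before seeing this code I did not know what slurm was; after looking it up, I found that your environment differs from mine, as I am just running the code on a single server in my lab. I also saw that other issues describe situations similar to mine, so I changed slurm_train.sh to the following: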

#!/usr/bin/env bash

NOW=$(date '+%m-%d-%H:%M:%S')
OUTPUT_DIR=../outputs/seg

CONFIG_DIR=/home/katie/code/semantic_segmentation/configs/ade20k

CKPT=/home/katie/code/semantic_segmentation/biformer_base_best.pth
MODEL=upernet.biformer_base

CONFIG=${CONFIG_DIR}/${MODEL}.py
WORK_DIR=${OUTPUT_DIR}/${MODEL}/${NOW}
mkdir -p ${WORK_DIR}

python -m torch.distributed.launch --nproc_per_node=2 --master_port=25643 train.py ${CONFIG} \
    --launcher="pytorch" \
    --work-dir=${WORK_DIR} \
    --options model.pretrained=${CKPT}

Then I ran bash slurm_train.sh in the terminal and encountered the following problem:

(biformer) katie@a-ubuntu-16-04-lts:~/code/semantic_segmentation$ bash slurm_train.sh
/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
WARNING:torch.distributed.run:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
usage: train.py [-h] [--config CONFIG] [--work-dir WORK_DIR] [--load-from LOAD_FROM] [--resume-from RESUME_FROM] [--no-validate]
[--gpus GPUS | --gpu-ids GPU_IDS [GPU_IDS ...]] [--seed SEED] [--deterministic] [--options OPTIONS [OPTIONS ...]]
[--launcher {none,pytorch,slurm,mpi}] [--local_rank LOCAL_RANK]
train.py: error: unrecognized arguments: /home/katie/code/semantic_segmentation/configs/ade20k/upernet.biformer_base.py
usage: train.py [-h] [--config CONFIG] [--work-dir WORK_DIR] [--load-from LOAD_FROM] [--resume-from RESUME_FROM] [--no-validate]
[--gpus GPUS | --gpu-ids GPU_IDS [GPU_IDS ...]] [--seed SEED] [--deterministic] [--options OPTIONS [OPTIONS ...]]
[--launcher {none,pytorch,slurm,mpi}] [--local_rank LOCAL_RANK]
train.py: error: unrecognized arguments: /home/katie/code/semantic_segmentation/configs/ade20k/upernet.biformer_base.py
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 12772) of binary: /home/katie/anaconda3/envs/biformer/bin/python
Traceback (most recent call last):
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/katie/anaconda3/envs/biformer/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

train.py FAILED

Failures:
[1]:
time : 2023-08-15_15:54:52
host : a-ubuntu-16-04-lts
rank : 1 (local_rank: 1)
exitcode : 2 (pid: 12773)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2023-08-15_15:54:52
host : a-ubuntu-16-04-lts
rank : 0 (local_rank: 0)
exitcode : 2 (pid: 12772)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

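Setting aside some lengthy warnings, I think the main error is probably train.py: error: unrecognized arguments: /home/katie/code/semantic_segmentation/configs/ade20k/upernet.biformer_base.py, but I have checked many times and the path looks fine to me. Do you have a better solution, or can you see what I did wrong? I am very much looking forward to your reply.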
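
The usage line in the log lists the configuration file as an option (`[--config CONFIG]`) rather than as a positional argument, which would explain why a bare path is rejected. The following minimal sketch reproduces that behaviour with a hypothetical parser that only mimics part of the usage text; it is not the repository's train.py.

```python
import argparse

# Hypothetical parser mirroring part of the usage text printed in the log.
parser = argparse.ArgumentParser(prog="train.py")
parser.add_argument("--config")
parser.add_argument("--work-dir")
parser.add_argument(
    "--launcher", choices=["none", "pytorch", "slurm", "mpi"], default="none"
)

config_path = "configs/ade20k/upernet.biformer_base.py"

# Passing the config as a bare positional argument leaves it unrecognized.
_, unknown = parser.parse_known_args([config_path, "--launcher", "pytorch"])
print("unrecognized:", unknown)

# Passing it through the --config option is accepted.
args, unknown_ok = parser.parse_known_args(
    ["--config", config_path, "--launcher", "pytorch"]
)
print("config:", args.config, "| unrecognized:", unknown_ok)
```

If the actual train.py matches its usage text, passing `--config=${CONFIG}` instead of a bare `${CONFIG}` should avoid this particular error.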

slurm

I would like to know how to run the .sh script for segmentation without slurm, because slurm is hard to use.

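Question about blocks

Hello author, in the paper topk = 1,4,16,S^23, which means that stage 4 uses topk=S^23, but in the BiFormer tiny code, stage 4 has topk=-2 and uses AttentionLePE rather than BiLevelRoutingAttention.
Question 1: Why is AttentionLePE used in stage 4? Why not use BiLevelRoutingAttention everywhere?
Question 2: Does S^23 have any special meaning?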

Dataset

I am a beginner and am debugging the source code. Where should I put the dataset?

Understanding BiFormer

q_win, k_win = q.mean([2, 3]), kv[..., 0:self.qk_dim].mean([2, 3]) # window-wise qk, (n, p^2, c_qk), (n, p^2, c_qk)

What does this piece of code do? What is the purpose of averaging over these two dimensions?
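
Going by the inline comment (`window-wise qk, (n, p^2, c_qk)`), averaging over dimensions 2 and 3 appears to collapse the tokens inside each window into a single region-level descriptor. The toy function below illustrates that reduction in plain Python for one sample; `window_mean` and the nested-list layout `[window][row][col][channel]` are assumptions made for illustration, not the repository's tensor code.

```python
def window_mean(windows):
    """Average all tokens inside each window.

    windows: nested list indexed as [window][row][col][channel].
    Returns one channel vector per window, i.e. a [window][channel] layout.
    """
    descriptors = []
    for window in windows:
        tokens = [token for row in window for token in row]
        channels = len(tokens[0])
        descriptors.append(
            [sum(token[c] for token in tokens) / len(tokens) for c in range(channels)]
        )
    return descriptors


# Two windows of 2x2 tokens with 2 channels each (made-up values).
windows = [
    [[[1.0, 0.0], [3.0, 0.0]], [[1.0, 4.0], [3.0, 4.0]]],
    [[[0.0, 2.0], [0.0, 2.0]], [[8.0, 6.0], [0.0, 2.0]]],
]
q_win = window_mean(windows)
print(q_win)
```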

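Questions about reproducing the model and about detection

Hello, I have read your paper and starred your source code, and I would like to try reproducing the object detection part, but I have run into a few problems:
1. I found that your command line launches training with slurm's srun and sets the number of GPUs to 8, but I am on a single machine without a cluster environment. I tried changing it to run train.py directly, but there are still several arguments I do not fully understand. Could you provide a configuration that uses a plain python command line?
2. The object detection part of the source code seems to use the mmdet framework throughout, and there is no entry point for changing the dataset, batch_size, or epochs. If I want to train on my own dataset, do I need to learn how to use mmdet first, or is there another way to make these changes?
3. After training, there seems to be no code for running detection. Do you plan to develop visualization later? If the backbone is replaced with BiFormer, how many FPS can detection reach?
My technical skills are limited, and these are the obstacles I have hit while trying to reproduce the results. If you see this message, could you please help answer these questions?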

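Strange results after training on the coco2017 dataset

Hello author, I loaded the weights biformer_small_best.pth into the biformer_mm model and trained on the coco2017 dataset. The results of the first epoch are as follows:
[image]
Is this normal? Also, the apex package is present in my environment, yet I still get the error "apex is not installed". Does this matter?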

mmcv version

Dear author:
Thank you for your excellent work. May I ask whether this model can be used with mmcv-full==1.3.3?

TypeError: '>' not supported between instances of 'tuple' and 'int'

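Thank you for your excellent work. However, when I added the code to my baseline, the error shown in the title occurred.
The error is raised at the following location in bra_legacy.py:
self.lepe = nn.Conv2d(dim, dim, kernel_size=side_dwconv, stride=1, padding=side_dwconv//2, groups=dim) if side_dwconv > 0 else \
    lambda x: torch.zeros_like(x)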
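
The comparison `side_dwconv > 0` raises exactly this TypeError when `side_dwconv` is a tuple such as `(3, 3)` rather than an int. A minimal sketch of the failure and of one possible normalization follows; the helper `as_int_kernel` is hypothetical and assumes a square kernel.

```python
def as_int_kernel(side_dwconv):
    """Normalize a kernel size given as an int or a (k, k) tuple to a single int."""
    if isinstance(side_dwconv, (tuple, list)):
        if len(set(side_dwconv)) != 1:
            raise ValueError(f"expected a square kernel, got {side_dwconv}")
        return int(side_dwconv[0])
    return int(side_dwconv)


# Reproduce the error from the issue title with a tuple-valued kernel size.
try:
    (3, 3) > 0
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

kernel = as_int_kernel((3, 3))
print(kernel, kernel // 2)  # kernel size and the matching padding
```

Passing an int for `side_dwconv`, or normalizing it in the calling code before the module is constructed, should avoid the comparison between a tuple and an int.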

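Hello, a question about the idea behind BiFormer

Hello, first of all congratulations, this is a great piece of work. As far as I can see, BiFormer can improve detection accuracy for some small objects. I would like to know why that is. Is there any explanation?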

Difference in routing attention

Hi Author,
BiFormer inspires me a lot. I was looking for the routing attention and found that ops contains two implementations. Which one is the main routing attention you proposed in your paper?
Thanks!

A question about interpretability

Hello author, I am a newcomer in this field. I would like to ask if your Biformer has interpretability and where it is reflected.

memory problem

When I feed a 1x20x600x500 input into nchwBRA, memory usage reaches about 100 GB. Is there any way to reduce the memory usage? (My images cannot be cropped.) Thank you very much.

one 7x7 conv vs. two 3x3 conv

Thank you for your wonderful work!

It has been noted that some recent works used two 3x3 convs (stride=2) instead of one 7x7 conv (stride=4) as stem, is it because the latter can lead to better results?

Counting the computational cost

Hello author, how is the model's computational cost counted? Is it computed with a library?

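Help with Equation 8 in the paper

In the routing term of the FLOPs, 2(S^2)^2*C, where does the 2 come from? Isn't only the single computation of A^r involved?
If the 2 can be explained, shouldn't the last line then have 2 raised to the power 1/3? The paper writes the power 4/3.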
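
For reference, the inequality step can be written out explicitly. The sketch below takes the routing term 2(S^2)^2*C from the question as given and assumes that the projection term is 3HWC^2 and the attention term is 2k(HW)^2*C/S^2; these two terms are assumptions made for this sketch and are not copied from the paper. It applies the AM-GM inequality after splitting the attention term into two equal parts.

```latex
\begin{aligned}
\mathrm{FLOPs}
&= 3HWC^2 + 2(S^2)^2 C + \frac{2k(HW)^2}{S^2}\,C \\
&= 3HWC^2 + C\left(2S^4 + \frac{k(HW)^2}{S^2} + \frac{k(HW)^2}{S^2}\right) \\
&\ge 3HWC^2 + 3C\left(2S^4 \cdot \frac{k(HW)^2}{S^2} \cdot \frac{k(HW)^2}{S^2}\right)^{1/3} \\
&= 3HWC^2 + 3C \cdot 2^{1/3}\, k^{2/3}\, (HW)^{4/3}
\end{aligned}
```

Under these assumptions the factor 2 enters the bound with exponent 1/3, and equality holds when 2S^4 = k(HW)^2/S^2, that is, S = ((k/2)(HW)^2)^(1/6).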

About visualization of routed region and attention map

Hi author,
BiFormer is such inspiring work! I wonder if you could provide demo code for visualizing the routed key-value regions, or describe how I can obtain results similar to Figure 4 of the paper.

Thanks a lot!

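Segmentation model loss is NaN

Hello, I ran into a NaN loss while reproducing the code. I would like to ask whether I have misconfigured something.
1 My task is semantic segmentation, using the sfpn.biformer_small.py model and loading your biformer_small_best.pth weights. I train on 4 x 4090 GPUs with 24 GB each and a batch size of 8 (you used 8 GPUs with a batch size of 4), so my lr and iters are the same as yours.
2 I did not use slurm; instead I use the pytorch launcher for multi-GPU training. My script is as follows: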
#!/usr/bin/env bash
PARTITION=mediasuper
NOW=$(date '+%m-%d-%H:%M:%S')
JOB_NAME=${MODEL}

CONFIG_DIR=configs/ade20k
MODEL=sfpn.biformer_small
CKPT=pretrained/biformer_small_best.pth
CONFIG=${CONFIG_DIR}/${MODEL}.py

OUTPUT_DIR=../outputs/seg
WORK_DIR=${OUTPUT_DIR}/${MODEL}/${NOW}
mkdir -p ${WORK_DIR}
PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \

export CUDA_VISIBLE_DEVICES=0,1,2,3
torchrun \
    --nproc_per_node=4 \
    --master_port=29501 \
    train.py --config=${CONFIG} \
    --launcher="pytorch" \
    --work-dir=${WORK_DIR} \
    --options model.pretrained=${CKPT} \
    &> ${WORK_DIR}/train.${JOB_NAME}.log &
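3 After training to 32000 iters, the loss vanished (it turned into NaN). When I retrain from scratch, or reload the pth file saved at 32000 iters, the same problem still occurs at the same point. Since the first epoch has already completed, the data is certainly not the problem. I looked around on CSDN, and I would like to ask whether this is related to training precision.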
2024-04-09 17:27:05,483 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
2024-04-09 17:27:28,017 - mmseg - INFO - Iter [32050/80000] lr: 1.307e-04, eta: 5 days, 16:27:25, time: 2.407, data_time: 1.969, memory: 15179, decode.loss_ce: 0.2974, decode.acc_seg: 88.5982, loss: 0.2974
2024-04-09 17:27:50,423 - mmseg - INFO - Iter [32100/80000] lr: 1.305e-04, eta: 2 days, 23:47:03, time: 0.448, data_time: 0.013, memory: 15179, decode.loss_ce: 0.3082, decode.acc_seg: 88.0161, loss: 0.3082
2024-04-09 17:28:12,847 - mmseg - INFO - Iter [32150/80000] lr: 1.303e-04, eta: 2 days, 1:56:17, time: 0.448, data_time: 0.012, memory: 15179, decode.loss_ce: 0.3059, decode.acc_seg: 88.2027, loss: 0.3059
2024-04-09 17:28:37,074 - mmseg - INFO - Iter [32200/80000] lr: 1.302e-04, eta: 1 day, 15:04:37, time: 0.485, data_time: 0.011, memory: 15179, decode.loss_ce: nan, decode.acc_seg: 49.3611, loss: nan
2024-04-09 17:29:03,193 - mmseg - INFO - Iter [32250/80000] lr: 1.300e-04, eta: 1 day, 8:38:24, time: 0.522, data_time: 0.010, memory: 15179, decode.loss_ce: nan, decode.acc_seg: 15.7751, loss: nan
2024-04-09 17:29:29,717 - mmseg - INFO - Iter [32300/80000] lr: 1.298e-04, eta: 1 day, 4:21:26, time: 0.530, data_time: 0.011, memory: 15179, decode.loss_ce: nan, decode.acc_seg: 15.8742, loss: nan
2024-04-09 17:29:55,593 - mmseg - INFO - Iter [32350/80000] lr: 1.296e-04, eta: 1 day, 1:16:04, time: 0.518, data_time: 0.012, memory: 15179, decode.loss_ce: nan, decode.acc_seg: 17.5384, loss: nan

Object detection code

Hello, may I ask when the code for object detection will be released? I look forward to your response. Thank you.

How to deploy on deepstream6.0?

When I use trtexec (TensorRT: 8.0.1.6) to generate an engine file, I get an error.

$ /usr/local/TensorRT-8.6.1.6/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
(base) david@david-ubuntu20:BGF-YOLO$ /usr/local/TensorRT-8.6.1./bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
2023-12-04_08:50:01#(base) david@david-ubuntu20:BGF-YOLO$ /usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
(base) david@david-ubuntu20:BGF-YOLO$ /usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
2023-12-04_08:50:03#&&&& RUNNING TensorRT.trtexec [TensorRT v8503] # /usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
2023-12-04_08:50:03#[12/04/2023-08:50:03] [W] --explicitBatch flag has been deprecated and has no effect!
2023-12-04_08:50:03#[12/04/2023-08:50:03] [W] Explicit batch dim is automatically enabled if input model is ONNX or if dynamic shapes are provided when the engine is built.
2023-12-04_08:50:03#[12/04/2023-08:50:03] [W] --workspace flag has been deprecated by --memPoolSize flag.
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] === Model Options ===
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Format: ONNX
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Model: best.onnx
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Output:
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] === Build Options ===
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Max batch: explicit batch
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Memory Pools: workspace: 1024 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] minTiming: 1
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] avgTiming: 8
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Precision: FP32+FP16
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] LayerPrecisions: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Calibration: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Refit: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Sparsity: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Safe mode: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] DirectIO mode: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Restricted mode: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Build only: Enabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Save engine: best_4.engine
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Load engine: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Profiling verbosity: 0
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Tactic sources: Using default tactic sources
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] timingCacheMode: local
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] timingCacheFile: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Heuristic: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Preview Features: Use default preview flags.
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Input(s)s format: fp32:CHW
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Output(s)s format: fp32:CHW
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Input build shapes: model
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Input calibration shapes: model
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] === System Options ===
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Device: 0
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] DLACore: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Plugins:
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] === Inference Options ===
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Batch: Explicit
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Input inference shapes: model
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Iterations: 10
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Duration: 3s (+ 200ms warm up)
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Sleep time: 0ms
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Idle time: 0ms
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Streams: 1
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] ExposeDMA: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Data transfers: Enabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Spin-wait: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Multithreading: Enabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] CUDA Graph: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Separate profiling: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Time Deserialize: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Time Refit: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] NVTX verbosity: 0
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Persistent Cache Ratio: 0
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Inputs:
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] === Reporting Options ===
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Verbose: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Averages: 10 inferences
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Percentiles: 90,95,99
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Dump refittable layers:Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Dump output: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Profile: Disabled
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Export timing to JSON file: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Export output to JSON file: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] Export profile to JSON file: 
2023-12-04_08:50:03#[12/04/2023-08:50:03] [I] 
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] === Device Information ===
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] Selected Device: NVIDIA GeForce RTX 3080 Ti
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] Compute Capability: 8.6
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] SMs: 80
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] Compute Clock Rate: 1.665 GHz
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] Device Global Memory: 12042 MiB
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] Shared Memory per SM: 100 KiB
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] Memory Bus Width: 384 bits (ECC disabled)
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] Memory Clock Rate: 9.501 GHz
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] 
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] TensorRT version: 8.5.3
2023-12-04_08:50:04#[12/04/2023-08:50:04] [I] [TRT] [MemUsageChange] Init CUDA: CPU +446, GPU +0, now: CPU 459, GPU 486 (MiB)
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] Start parsing network model
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] ----------------------------------------------------------------
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Input filename:   best.onnx
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] ONNX IR version:  0.0.8
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Opset version:    17
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Producer name:    pytorch
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Producer version: 2.1.0
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Domain:           
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Model version:    0
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Doc string:       
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] ----------------------------------------------------------------
2023-12-04_08:50:05#[12/04/2023-08:50:05] [W] [TRT] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] No importer registered for op: Mod. Attempting to import as plugin.
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] [TRT] Searching for plugin: Mod, plugin_version: 1, plugin_namespace: 
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:769: While parsing node number 160 [Mod -> "/model.12/Mod_output_0"]:
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:770: --- Begin node ---
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:771: input: "/model.12/Constant_output_0"
2023-12-04_08:50:05#input: "/model.12/Constant_1_output_0"
2023-12-04_08:50:05#output: "/model.12/Mod_output_0"
2023-12-04_08:50:05#name: "/model.12/Mod"
2023-12-04_08:50:05#op_type: "Mod"
2023-12-04_08:50:05#attribute {
2023-12-04_08:50:05#  name: "fmod"
2023-12-04_08:50:05#  i: 0
2023-12-04_08:50:05#  type: INT
2023-12-04_08:50:05#}
2023-12-04_08:50:05#
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:772: --- End node ---
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:775: ERROR: builtin_op_importers.cpp:4870 In function importFallbackPluginImporter:
2023-12-04_08:50:05#[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] Failed to parse onnx file
2023-12-04_08:50:05#[12/04/2023-08:50:05] [I] Finish parsing network model
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] Parsing model failed
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] Failed to create engine from model or file.
2023-12-04_08:50:05#[12/04/2023-08:50:05] [E] Engine set up failed
2023-12-04_08:50:05#&&&& FAILED TensorRT.trtexec [TensorRT v8503] # /usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8

Could you give me a hint on how to solve this issue without changing the TensorRT version?
Because I find there is no error when I use TensorRT-8.5.3.1.
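
The parser stops at an ONNX `Mod` node with `fmod: 0` (integer modulo), for which this build reports that no importer or plugin is found. One workaround sometimes used in such cases, offered here only as an untested suggestion for this model, is to express the modulo with floor division and multiplication in the model code before export. The sketch below only checks the arithmetic identity in plain Python; `mod_without_mod_op` is a hypothetical helper, and whether the exporter then emits only supported ops depends on the model and the exporter version.

```python
def mod_without_mod_op(a, b):
    """Integer modulo written as a - floor(a / b) * b, avoiding the % operator."""
    return a - (a // b) * b


# Check the identity against Python's % for positive and negative operands.
checked = [
    (a, b, mod_without_mod_op(a, b))
    for a in range(-7, 8)
    for b in (-3, -2, 2, 3, 5)
]
mismatches = [(a, b, r) for a, b, r in checked if r != a % b]
print("mismatches:", mismatches)
```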

Environment settings

Could you please tell me what runtime environment is required to run BiFormer for image classification tasks, such as the torch and Python versions? I don't see this in the readme. Looking forward to your reply.
