atten4vis / lw-detr

This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".

License: Apache License 2.0

Python 78.57% Shell 5.49% C++ 1.48% Cuda 14.45%

lw-detr's Issues

Text Encoder

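Hello. The paper says that the text encoder of OVLW-DETR comes from the text encoder of clip-convnext-large. Where in the code is the part that extracts text features with the text encoder and then computes their similarity with the image embeddings? (I could not find it.)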
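For readers unfamiliar with the operation this question asks about, here is a minimal, generic sketch of CLIP-style matching: both sets of embeddings are L2-normalized and compared by dot product (cosine similarity). It is not taken from this repository. The function names, the `scale` parameter, and the toy vectors are assumptions made for illustration, and the actual OVLW-DETR implementation may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def similarity_logits(image_embeds, text_embeds, scale=1.0):
    """Cosine similarity of every image embedding with every text embedding.

    Returns a nested list with one row per image embedding and one column
    per text embedding. `scale` is a hypothetical temperature-like factor.
    """
    image_unit = [l2_normalize(v) for v in image_embeds]
    text_unit = [l2_normalize(v) for v in text_embeds]
    return [
        [scale * sum(a * b for a, b in zip(img, txt)) for txt in text_unit]
        for img in image_unit
    ]


# Toy embeddings (made up): two image/query embeddings, two class-name embeddings.
image_embeds = [[1.0, 0.0], [0.0, 2.0]]
text_embeds = [[3.0, 0.0], [1.0, 1.0]]

logits = similarity_logits(image_embeds, text_embeds)
# Index of the best-matching text embedding for each image embedding.
best_text = [row.index(max(row)) for row in logits]
```

In an open-vocabulary detector, the rows would typically be per-query features and the columns per-category text features, so each row of `logits` plays the role of classification scores.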

AttributeError: 'LWDETR' object has no attribute 'module'

I am on Windows and have installed everything the model requires. I converted the Linux training commands to Windows commands and ran them. The medium model trains correctly, but the large model fails to load with the error in the title, and I do not know what the issue is. Can you help me identify the problem? The Windows commands and the training log are attached below.
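An `AttributeError` of this form generally means that some code path reads `model.module`, an attribute that exists only on wrappers such as `torch.nn.parallel.DistributedDataParallel`, while the model here is a plain `LWDETR` instance (the log reports "Not using distributed mode"). The report does not show which line in the repository does this, so the following is only a sketch of a common defensive pattern, not a confirmed fix. `unwrap_model`, `PlainModel`, and `Wrapper` are hypothetical names used for illustration.

```python
def unwrap_model(model):
    """Return the underlying model whether or not it is wrapped.

    Wrappers such as DistributedDataParallel expose the original network
    as `.module`; a plain model has no such attribute, so fall back to
    the object itself.
    """
    return getattr(model, "module", model)


# Stand-ins for illustration only (hypothetical, not the repository's classes).
class PlainModel:
    def state_dict(self):
        return {"weight": 1.0}


class Wrapper:
    def __init__(self, module):
        self.module = module


plain = PlainModel()
wrapped = Wrapper(plain)

# Both calls reach the same underlying object.
state_from_plain = unwrap_model(plain).state_dict()
state_from_wrapped = unwrap_model(wrapped).state_dict()
```

Replacing a direct `model.module` access with a helper like this lets the same code run in both single-process and distributed mode.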

commands - bat file

@echo off
setlocal

set model_name=lwdetr_large_coco

set coco_path=%1

python -u main.py ^
--lr 1e-4 ^
--lr_encoder 1.5e-4 ^
--batch_size 2 ^
--weight_decay 1e-4 ^
--epochs 60 ^
--lr_drop 60 ^
--lr_vit_layer_decay 0.7 ^
--lr_component_decay 0.5 ^
--encoder vit_small ^
--drop_path 0.1 ^
--vit_encoder_num_layers 10 ^
--window_block_indexes 0 1 3 6 7 9 ^
--out_feature_indexes 2 4 5 9 ^
--dec_layers 3 ^
--group_detr 13 ^
--two_stage ^
--projector_scale P3 P5 ^
--hidden_dim 384 ^
--sa_nheads 12 ^
--ca_nheads 24 ^
--dec_n_points 4 ^
--bbox_reparam ^
--lite_refpoint_refine ^
--ia_bce_loss ^
--cls_loss_coef 1 ^
--num_select 300 ^
--dataset_file coco ^
--coco_path %coco_path% ^
--square_resize_div_64 ^
--use_ema ^
--pretrained_encoder pretrain_weights/caev2_small_300e_objects365.pth ^
--pretrain_weights pretrain_weights/LWDETR_large_30e_objects365.pth ^
--pretrain_keys_modify_to_load transformer.enc_out_class_embed.0.weight transformer.enc_out_class_embed.1.weight transformer.enc_out_class_embed.2.weight transformer.enc_out_class_embed.3.weight transformer.enc_out_class_embed.4.weight transformer.enc_out_class_embed.5.weight transformer.enc_out_class_embed.6.weight transformer.enc_out_class_embed.7.weight transformer.enc_out_class_embed.8.weight transformer.enc_out_class_embed.9.weight transformer.enc_out_class_embed.10.weight transformer.enc_out_class_embed.11.weight transformer.enc_out_class_embed.12.weight transformer.enc_out_class_embed.0.bias transformer.enc_out_class_embed.1.bias transformer.enc_out_class_embed.2.bias transformer.enc_out_class_embed.3.bias transformer.enc_out_class_embed.4.bias transformer.enc_out_class_embed.5.bias transformer.enc_out_class_embed.6.bias transformer.enc_out_class_embed.7.bias transformer.enc_out_class_embed.8.bias transformer.enc_out_class_embed.9.bias transformer.enc_out_class_embed.10.bias transformer.enc_out_class_embed.11.bias transformer.enc_out_class_embed.12.bias class_embed.weight class_embed.bias ^
--output_dir output/%model_name%

endlocal

======================================

(lwDETR) C:\Users\USER\Desktop\LW-DETR>C:\Users\USER\Desktop\LW-DETR\scripts\lwdetr_large_coco_train.bat C:\Users\USER\Desktop\LW-DETR\dataset\test
Not using distributed mode
git:
sha: f10c0f3, status: has uncommited changes, branch: main

Namespace(lr=0.0001, lr_encoder=0.00015, batch_size=2, weight_decay=0.0001, epochs=60, lr_drop=60, clip_max_norm=0.1, lr_vit_layer_decay=0.7, lr_component_decay=0.5, dropout=0, drop_path=0.1, drop_mode='standard', drop_schedule='constant', cutoff_epoch=0, pretrained_encoder='pretrain_weights/caev2_small_300e_objects365.pth', pretrain_weights='pretrain_weights/LWDETR_large_30e_objects365.pth', pretrain_exclude_keys=None, pretrain_keys_modify_to_load=['transformer.enc_out_class_embed.0.weight', 'transformer.enc_out_class_embed.1.weight', 'transformer.enc_out_class_embed.2.weight', 'transformer.enc_out_class_embed.3.weight', 'transformer.enc_out_class_embed.4.weight', 'transformer.enc_out_class_embed.5.weight', 'transformer.enc_out_class_embed.6.weight', 'transformer.enc_out_class_embed.7.weight', 'transformer.enc_out_class_embed.8.weight', 'transformer.enc_out_class_embed.9.weight', 'transformer.enc_out_class_embed.10.weight', 'transformer.enc_out_class_embed.11.weight', 'transformer.enc_out_class_embed.12.weight', 'transformer.enc_out_class_embed.0.bias', 'transformer.enc_out_class_embed.1.bias', 'transformer.enc_out_class_embed.2.bias', 'transformer.enc_out_class_embed.3.bias', 'transformer.enc_out_class_embed.4.bias', 'transformer.enc_out_class_embed.5.bias', 'transformer.enc_out_class_embed.6.bias', 'transformer.enc_out_class_embed.7.bias', 'transformer.enc_out_class_embed.8.bias', 'transformer.enc_out_class_embed.9.bias', 'transformer.enc_out_class_embed.10.bias', 'transformer.enc_out_class_embed.11.bias', 'transformer.enc_out_class_embed.12.bias', 'class_embed.weight', 'class_embed.bias'], encoder='vit_small', vit_encoder_num_layers=10, window_block_indexes=[0, 1, 3, 6, 7, 9], position_embedding='sine', out_feature_indexes=[2, 4, 5, 9], dec_layers=3, dim_feedforward=2048, hidden_dim=384, sa_nheads=12, ca_nheads=24, num_queries=300, group_detr=13, two_stage=True, projector_scale=['P3', 'P5'], lite_refpoint_refine=True, num_select=300, dec_n_points=4, 
decoder_norm='LN', bbox_reparam=True, set_cost_class=2, set_cost_bbox=5, set_cost_giou=2, cls_loss_coef=1.0, bbox_loss_coef=5, giou_loss_coef=2, focal_alpha=0.25, aux_loss=True, sum_group_losses=False, use_varifocal_loss=False, use_position_supervised_loss=False, ia_bce_loss=True, dataset_file='coco', coco_path='C:\Users\USER\Desktop\LW-DETR\dataset\test', square_resize_div_64=True, output_dir='output/lwdetr_large_coco', checkpoint_interval=10, seed=42, resume='', start_epoch=0, eval=False, use_ema=True, ema_decay=0.9997, num_workers=2, device='cuda', world_size=1, dist_url='env://', sync_bn=True, fp16_eval=False, subcommand=None, distributed=False)
IncompatibleKeys(missing_keys=[], unexpected_keys=['mask_token', 'cls_token', 'norm.weight', 'norm.bias', 'pretext_neck.regressor_blocks.0.gamma_1_cross', 'pretext_neck.regressor_blocks.0.gamma_2_cross', 'pretext_neck.regressor_blocks.0.norm1_q.weight', 'pretext_neck.regressor_blocks.0.norm1_q.bias', 'pretext_neck.regressor_blocks.0.norm1_k.weight', 'pretext_neck.regressor_blocks.0.norm1_k.bias', 'pretext_neck.regressor_blocks.0.norm1_v.weight', 'pretext_neck.regressor_blocks.0.norm1_v.bias', 'pretext_neck.regressor_blocks.0.norm2_cross.weight', 'pretext_neck.regressor_blocks.0.norm2_cross.bias', 'pretext_neck.regressor_blocks.0.cross_attn.q_bias', 'pretext_neck.regressor_blocks.0.cross_attn.v_bias', 'pretext_neck.regressor_blocks.0.cross_attn.q.weight', 'pretext_neck.regressor_blocks.0.cross_attn.k.weight', 'pretext_neck.regressor_blocks.0.cross_attn.v.weight', 'pretext_neck.regressor_blocks.0.cross_attn.proj.weight', 'pretext_neck.regressor_blocks.0.cross_attn.proj.bias', 'pretext_neck.regressor_blocks.0.mlp_cross.fc1.weight', 'pretext_neck.regressor_blocks.0.mlp_cross.fc1.bias', 'pretext_neck.regressor_blocks.0.mlp_cross.fc2.weight', 'pretext_neck.regressor_blocks.0.mlp_cross.fc2.bias', 'pretext_neck.decoder_blocks.0.gamma_1', 'pretext_neck.decoder_blocks.0.gamma_2', 'pretext_neck.decoder_blocks.0.norm1.weight', 'pretext_neck.decoder_blocks.0.norm1.bias', 'pretext_neck.decoder_blocks.0.attn.q_bias', 'pretext_neck.decoder_blocks.0.attn.v_bias', 'pretext_neck.decoder_blocks.0.attn.qkv.weight', 'pretext_neck.decoder_blocks.0.attn.proj.weight', 'pretext_neck.decoder_blocks.0.attn.proj.bias', 'pretext_neck.decoder_blocks.0.norm2.weight', 'pretext_neck.decoder_blocks.0.norm2.bias', 'pretext_neck.decoder_blocks.0.mlp.fc1.weight', 'pretext_neck.decoder_blocks.0.mlp.fc1.bias', 'pretext_neck.decoder_blocks.0.mlp.fc2.weight', 'pretext_neck.decoder_blocks.0.mlp.fc2.bias', 'pretext_neck.norm.weight', 'pretext_neck.norm.bias', 'pretext_neck.norm2.weight', 
'pretext_neck.norm2.bias', 'pretext_neck.head.weight', 'blocks.10.gamma_1', 'blocks.10.gamma_2', 'blocks.10.norm1.weight', 'blocks.10.norm1.bias', 'blocks.10.attn.q_bias', 'blocks.10.attn.v_bias', 'blocks.10.attn.qkv.weight', 'blocks.10.attn.proj.weight', 'blocks.10.attn.proj.bias', 'blocks.10.norm2.weight', 'blocks.10.norm2.bias', 'blocks.10.mlp.fc1.weight', 'blocks.10.mlp.fc1.bias', 'blocks.10.mlp.fc2.weight', 'blocks.10.mlp.fc2.bias', 'blocks.11.gamma_1', 'blocks.11.gamma_2', 'blocks.11.norm1.weight', 'blocks.11.norm1.bias', 'blocks.11.attn.q_bias', 'blocks.11.attn.v_bias', 'blocks.11.attn.qkv.weight', 'blocks.11.attn.proj.weight', 'blocks.11.attn.proj.bias', 'blocks.11.norm2.weight', 'blocks.11.norm2.bias', 'blocks.11.mlp.fc1.weight', 'blocks.11.mlp.fc1.bias', 'blocks.11.mlp.fc2.weight', 'blocks.11.mlp.fc2.bias'])
number of params: 46823650
name: backbone.0.encoder.pos_embed, lr_decay: 0.019773267429999988
name: backbone.0.encoder.pos_embed, weight_decay rate: 0.0
name: backbone.0.encoder.patch_embed.proj.weight, lr_decay: 0.019773267429999988
name: backbone.0.encoder.patch_embed.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.patch_embed.proj.bias, lr_decay: 0.019773267429999988
name: backbone.0.encoder.patch_embed.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.gamma_1, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.gamma_2, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.norm1.weight, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.norm1.bias, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.attn.q_bias, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.attn.v_bias, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.attn.qkv.weight, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.0.attn.proj.weight, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.0.attn.proj.bias, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.norm2.weight, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.norm2.bias, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.mlp.fc1.weight, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.0.mlp.fc1.bias, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.0.mlp.fc2.weight, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.0.mlp.fc2.bias, lr_decay: 0.02824752489999998
name: backbone.0.encoder.blocks.0.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.gamma_1, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.gamma_2, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.norm1.weight, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.norm1.bias, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.attn.q_bias, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.attn.v_bias, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.attn.qkv.weight, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.1.attn.proj.weight, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.1.attn.proj.bias, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.norm2.weight, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.norm2.bias, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.mlp.fc1.weight, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.1.mlp.fc1.bias, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.1.mlp.fc2.weight, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.1.mlp.fc2.bias, lr_decay: 0.04035360699999998
name: backbone.0.encoder.blocks.1.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.gamma_1, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.gamma_2, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.norm1.weight, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.norm1.bias, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.attn.q_bias, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.attn.v_bias, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.attn.qkv.weight, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.2.attn.proj.weight, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.2.attn.proj.bias, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.norm2.weight, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.norm2.bias, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.mlp.fc1.weight, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.2.mlp.fc1.bias, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.2.mlp.fc2.weight, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.2.mlp.fc2.bias, lr_decay: 0.05764800999999997
name: backbone.0.encoder.blocks.2.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.gamma_1, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.gamma_2, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.norm1.weight, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.norm1.bias, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.attn.q_bias, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.attn.v_bias, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.attn.qkv.weight, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.3.attn.proj.weight, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.3.attn.proj.bias, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.norm2.weight, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.norm2.bias, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.mlp.fc1.weight, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.3.mlp.fc1.bias, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.3.mlp.fc2.weight, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.3.mlp.fc2.bias, lr_decay: 0.08235429999999996
name: backbone.0.encoder.blocks.3.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.gamma_1, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.gamma_2, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.norm1.weight, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.norm1.bias, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.attn.q_bias, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.attn.v_bias, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.attn.qkv.weight, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.4.attn.proj.weight, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.4.attn.proj.bias, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.norm2.weight, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.norm2.bias, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.mlp.fc1.weight, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.4.mlp.fc1.bias, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.4.mlp.fc2.weight, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.4.mlp.fc2.bias, lr_decay: 0.11764899999999996
name: backbone.0.encoder.blocks.4.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.gamma_1, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.gamma_2, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.norm1.weight, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.norm1.bias, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.attn.q_bias, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.attn.v_bias, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.attn.qkv.weight, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.5.attn.proj.weight, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.5.attn.proj.bias, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.norm2.weight, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.norm2.bias, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.mlp.fc1.weight, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.5.mlp.fc1.bias, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.5.mlp.fc2.weight, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.5.mlp.fc2.bias, lr_decay: 0.16806999999999994
name: backbone.0.encoder.blocks.5.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.gamma_1, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.gamma_2, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.norm1.weight, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.norm1.bias, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.attn.q_bias, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.attn.v_bias, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.attn.qkv.weight, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.6.attn.proj.weight, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.6.attn.proj.bias, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.norm2.weight, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.norm2.bias, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.mlp.fc1.weight, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.6.mlp.fc1.bias, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.6.mlp.fc2.weight, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.6.mlp.fc2.bias, lr_decay: 0.24009999999999995
name: backbone.0.encoder.blocks.6.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.gamma_1, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.gamma_2, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.norm1.weight, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.norm1.bias, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.attn.q_bias, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.attn.v_bias, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.attn.qkv.weight, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.7.attn.proj.weight, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.7.attn.proj.bias, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.norm2.weight, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.norm2.bias, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.mlp.fc1.weight, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.7.mlp.fc1.bias, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.7.mlp.fc2.weight, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.7.mlp.fc2.bias, lr_decay: 0.3429999999999999
name: backbone.0.encoder.blocks.7.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.gamma_1, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.gamma_2, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.norm1.weight, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.norm1.bias, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.attn.q_bias, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.attn.v_bias, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.attn.qkv.weight, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.8.attn.proj.weight, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.8.attn.proj.bias, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.norm2.weight, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.norm2.bias, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.mlp.fc1.weight, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.8.mlp.fc1.bias, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.8.mlp.fc2.weight, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.8.mlp.fc2.bias, lr_decay: 0.48999999999999994
name: backbone.0.encoder.blocks.8.mlp.fc2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.gamma_1, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.gamma_1, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.gamma_2, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.gamma_2, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.norm1.weight, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.norm1.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.norm1.bias, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.norm1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.attn.q_bias, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.attn.q_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.attn.v_bias, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.attn.v_bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.attn.qkv.weight, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.attn.qkv.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.9.attn.proj.weight, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.attn.proj.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.9.attn.proj.bias, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.attn.proj.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.norm2.weight, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.norm2.weight, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.norm2.bias, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.norm2.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.mlp.fc1.weight, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.mlp.fc1.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.9.mlp.fc1.bias, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.mlp.fc1.bias, weight_decay rate: 0.0
name: backbone.0.encoder.blocks.9.mlp.fc2.weight, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.mlp.fc2.weight, weight_decay rate: 1.0
name: backbone.0.encoder.blocks.9.mlp.fc2.bias, lr_decay: 0.7
name: backbone.0.encoder.blocks.9.mlp.fc2.bias, weight_decay rate: 0.0
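The `lr_decay` values logged above follow a geometric pattern: with `--lr_vit_layer_decay 0.7` and `--vit_encoder_num_layers 10`, block 9 gets 0.7, block 8 gets 0.7^2, down to block 0 at 0.7^10, and the embedding parameters get 0.7^11. The sketch below reproduces those numbers and the logged weight-decay flags. It is inferred from the log output only and is not necessarily how the repository computes them. The function names and the behaviour for non-backbone parameter names (multiplier 1.0) are assumptions.

```python
def vit_lr_decay(name, decay_rate=0.7, num_layers=10):
    """Layer-wise learning-rate multiplier, inferred from the logged values."""
    layer_id = num_layers + 1  # assumed default: no decay (multiplier 1.0)
    if ".pos_embed" in name or ".patch_embed" in name:
        layer_id = 0
    elif ".blocks." in name:
        # "backbone.0.encoder.blocks.3.attn.qkv.weight" -> block index 3
        block_index = int(name.split(".blocks.")[1].split(".")[0])
        layer_id = block_index + 1
    return decay_rate ** (num_layers + 1 - layer_id)


def weight_decay_rate(name):
    """0.0 for biases, norm parameters, gamma scales and pos_embed, else 1.0.

    This substring rule merely matches the names listed in the log.
    """
    no_decay_markers = ("gamma", "pos_embed", "bias", "norm")
    return 0.0 if any(marker in name for marker in no_decay_markers) else 1.0


# Names copied from the log above.
embed_decay = vit_lr_decay("backbone.0.encoder.pos_embed")
first_block_decay = vit_lr_decay("backbone.0.encoder.blocks.0.gamma_1")
last_block_decay = vit_lr_decay("backbone.0.encoder.blocks.9.mlp.fc2.bias")
```

Earlier layers therefore train with a much smaller effective learning rate than later ones, which is the usual intent of layer-wise decay when fine-tuning a pretrained ViT encoder.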
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
Get benchmark
Get model size, FLOPs, and FPS
0%| | 0/20 [00:00<?, ?it/s]
C:\Users\USER\Desktop\LW-DETR\models\backbone\vit.py:41: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size = int(math.sqrt(xy_num))
C:\Users\USER\Desktop\LW-DETR\models\backbone\vit.py:42: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert size * size == xy_num
C:\Users\USER\Desktop\LW-DETR\models\backbone\vit.py:44: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if size != h or size != w:
C:\Users\USER\Desktop\LW-DETR\models\backbone\vit.py:354: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert (H % 4 == 0) and (W % 4 == 0)
C:\Users\USER\Desktop\LW-DETR\models\transformer.py:221: TracerWarning: torch.as_tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
spatial_shapes = torch.as_tensor(spatial_shapes, dtype=torch.long, device=memory.device)
C:\Users\USER\Desktop\LW-DETR\models\transformer.py:85: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
for lvl, (H, W_) in enumerate(spatial_shapes):
C:\Users\USER\anaconda3\envs\lwDETR\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:3484.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
C:\Users\USER\Desktop\LW-DETR\models\transformer.py:54: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pos_tensor.size(-1) == 2:
C:\Users\USER\Desktop\LW-DETR\models\transformer.py:56: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
elif pos_tensor.size(-1) == 4:
C:\Users\USER\Desktop\LW-DETR\models\attention.py:303: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert embed_dim == embed_dim_to_check,
C:\Users\USER\Desktop\LW-DETR\models\attention.py:314: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert head_dim * num_heads == embed_dim, f"embed_dim {embed_dim} not divisible by num_heads {num_heads}"
C:\Users\USER\Desktop\LW-DETR\models\attention.py:320: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert key.shape == value.shape, f"key shape {key.shape} does not match value shape {value.shape}"
C:\Users\USER\Desktop\LW-DETR\models\attention.py:596: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
q = q / math.sqrt(E)
C:\Users\USER\Desktop\LW-DETR\models\attention.py:440: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert list(attn_output.size()) == [bsz * num_heads, tgt_len, v_head_dim]
C:\Users\USER\Desktop\LW-DETR\models\ops\modules\ms_deform_attn.py:111: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert (input_spatial_shapes[:, 0] * input_spatial_shapes[:, 1]).sum() == Len_in
C:\Users\USER\Desktop\LW-DETR\models\ops\modules\ms_deform_attn.py:121: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if reference_points.shape[-1] == 2:
C:\Users\USER\Desktop\LW-DETR\models\ops\modules\ms_deform_attn.py:125: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
elif reference_points.shape[-1] == 4:
C:\Users\USER\Desktop\LW-DETR\models\lwdetr.py:203: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
for a, b in zip(outputs_class[:-1], outputs_coord[:-1])]
C:\Users\USER\anaconda3\envs\lwDETR\lib\collections\__init__.py:860: RuntimeWarning: overflow encountered in scalar add
self[elem] += count
C:\Users\USER\Desktop\LW-DETR\util\benchmark.py:174: RuntimeWarning: overflow encountered in scalar multiply
flop = batch_size * out_size * Cout_dim * Cin_dim * kernel_size
WARNING:root:Skipped operation aten::pad 2 time(s)
WARNING:root:Skipped operation aten::upsample_bicubic2d 1 time(s)
WARNING:root:Skipped operation aten::unbind 15 time(s)
WARNING:root:Skipped operation aten::gelu 10 time(s)
WARNING:root:Skipped operation aten::silu_ 16 time(s)
WARNING:root:Skipped operation aten::mean 4 time(s)
WARNING:root:Skipped operation aten::sqrt 2 time(s)
WARNING:root:Skipped operation aten::bitwise_not 10 time(s)
WARNING:root:Skipped operation aten::prod 1 time(s)
WARNING:root:Skipped operation aten::ScalarImplicit 4 time(s)
WARNING:root:Skipped operation aten::lt 1 time(s)
WARNING:root:Skipped operation aten::exp 3 time(s)
WARNING:root:Skipped operation prim::PythonOp 3 time(s)
WARNING:root:Skipped operation aten::split 1 time(s)
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:17<00:00, 1.13it/s]
{
"nparam": 46823650,
"detailed_flops": {
"aten::sub": 0.005222407,
"aten::add": 0.017623603,
"aten::floor_divide": 1.98e-07,
"aten::mul": 0.02214301,
"norm": 0.081408,
"matmul": 8.6016,
"softmax": 0.689064,
"dropout": 0.0336,
"batchnorm": 0.0546816,
"aten::pow": 0.002611392,
"aten::div": 0.003546599,
"aten::relu_": 0.0006144,
"conv": 0.9437184,
"elementwise": 6.8e-06,
"aten::sum": 4e-07,
"aten::cumsum": 2e-09,
"aten::relu": 0.007872,
"aten::sin": 0.0001152,
"aten::cos": 0.0001152,
"bmm": 0.20736,
"linear": 2.0708352
},
"flops": {
"mean": 12.742138410999996,
"std": 1.7763568394002505e-15,
"min": 12.742138410999997,
"max": 12.742138410999997
},
"time": {
"mean": 0.023655427296956386,
"std": 0.00031874718595618136,
"min": 0.02307140827178955,
"max": 0.02422287464141846
},
"fps": 42.27359698248463
}
Min DP = 0.1000000, Max DP = 0.1000000
Start training
Traceback (most recent call last):
File "C:\Users\USER\Desktop\LW-DETR\main.py", line 426, in <module>
main(args)
File "C:\Users\USER\Desktop\LW-DETR\main.py", line 318, in main
train_stats = train_one_epoch(
File "C:\Users\USER\Desktop\LW-DETR\engine.py", line 43, in train_one_epoch
model.module.update_drop_path(schedules['dp'][it], vit_encoder_num_layers)
File "C:\Users\USER\anaconda3\envs\lwDETR\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LWDETR' object has no attribute 'module'
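
The traceback shows engine.py calling model.module.update_drop_path(...). A `.module` attribute normally exists only when the model is wrapped in torch.nn.parallel.DistributedDataParallel (or DataParallel), so a plausible cause is that this run is not in distributed mode and `model` is the bare LWDETR object. Below is a minimal sketch of a workaround under that assumption. `unwrap_model` is a hypothetical helper that is not part of the repository, and the two stand-in classes only mimic the wrapper so the snippet runs without PyTorch.

```python
def unwrap_model(model):
    """Return the underlying model whether or not it is wrapped.

    DistributedDataParallel / DataParallel expose the real model as `.module`;
    a bare model has no such attribute, so fall back to the model itself.
    """
    return getattr(model, "module", model)


class BareModel:
    """Stand-in for an unwrapped LWDETR instance (illustration only)."""

    def __init__(self):
        self.drop_path = None

    def update_drop_path(self, rate, num_layers):
        self.drop_path = (rate, num_layers)


class Wrapper:
    """Stand-in for a DDP-style wrapper that stores the model in `.module`."""

    def __init__(self, module):
        self.module = module


bare = BareModel()
wrapped = Wrapper(BareModel())

# The same call now works in both the single-process and the wrapped case.
unwrap_model(bare).update_drop_path(0.1, 10)
unwrap_model(wrapped).update_drop_path(0.1, 10)
```

In engine.py this would amount to calling unwrap_model(model).update_drop_path(...) instead of model.module.update_drop_path(...). Why the medium configuration does not hit the same line is not visible from the log.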

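Objects365 pretraining code

Why hasn't the Objects365 pretraining code been open-sourced?

Can you share an approximate release date?

Could you let us know roughly when the source code will be released?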

Code release

Hi,

Looking forward to implementing LW-DETR.

Kindly update ASAP!

About OV version of LW-DETR

There seems to be no code for the OVLW-DETR paper. Where can I find it?

How do I fix this error when running lwdetr_tiny_coco_eval.sh?

File ".\LW-DETR-main\models\matcher.py", line 88, in forward
cost_class = pos_cost_class[:, tgt_ids] - neg_cost_class[:, tgt_ids]
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
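
The failing line indexes the class dimension with `tgt_ids`, so one common cause of this device-side assert is a ground-truth label id that is negative or not smaller than the number of classes the model was built with. This is an assumption about the cause, not a confirmed diagnosis. Running on CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1, usually gives a clearer error message. A minimal sketch of a pre-flight check follows; `find_out_of_range_labels` and the example values are hypothetical.

```python
def find_out_of_range_labels(label_ids, num_classes):
    """Return the sorted set of label ids that fall outside [0, num_classes)."""
    return sorted({int(i) for i in label_ids if not 0 <= int(i) < num_classes})


# Hypothetical example: a model built with 80 classes and a dataset whose
# annotations contain category id 90.
labels = [0, 17, 79, 90, 90]
bad = find_out_of_range_labels(labels, num_classes=80)
print(bad)  # ids that would index past the class dimension
```

If the list is non-empty, either the annotation ids need remapping or the model needs to be built with a larger class count.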

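About Table 5: hyperparameters for the experiments without pretraining

Hello. To validate a newly added module, I need to train directly on COCO without pretraining. For the w/o pretraining experiments in Table 5, is it enough to pass in only the ViT pretrained weights and comment out pretrain_weight? Do the other hyperparameters stay the same?

Loading o365 pretrained weights

If the classes in a custom dataset are not in o365, how should weight loading be handled?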
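
One general approach, offered here as a sketch and not as the repository's own mechanism, is to load only the checkpoint entries whose name and shape match the new model and let the class-dependent layers start from their fresh initialisation. The training command in the earlier issue lists the class_embed and transformer.enc_out_class_embed keys under --pretrain_keys_modify_to_load, which suggests those are the class-dependent parameters. `filter_matching_weights`, `fake_tensor`, and all shapes below are hypothetical.

```python
from types import SimpleNamespace


def filter_matching_weights(checkpoint_state, model_state):
    """Split checkpoint entries into loadable ones and skipped ones.

    An entry is loadable when the model has a parameter with the same name
    and the same shape. Values only need a `.shape` attribute, so this works
    on PyTorch state dicts as well as on the stand-ins used below.
    """
    loadable, skipped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            loadable[name] = value
        else:
            skipped.append(name)
    return loadable, skipped


def fake_tensor(*shape):
    """Stand-in for a tensor; only the shape matters for this illustration."""
    return SimpleNamespace(shape=shape)


# Hypothetical shapes: a checkpoint head with 366 outputs vs. a 5-class model.
checkpoint = {
    "backbone.weight": fake_tensor(384, 384),
    "class_embed.weight": fake_tensor(366, 384),
    "class_embed.bias": fake_tensor(366),
}
model = {
    "backbone.weight": fake_tensor(384, 384),
    "class_embed.weight": fake_tensor(5, 384),
    "class_embed.bias": fake_tensor(5),
}

loadable, skipped = filter_matching_weights(checkpoint, model)
print(skipped)  # names left at their fresh initialisation
```

With real PyTorch objects, the filtered dictionary could then be passed to model.load_state_dict(loadable, strict=False).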

Where is the source code?

There isn't any.

Is there any inference demo code?

Questions about ONNX and PyTorch Inference

I used your code to convert the model to ONNX and wrote the inference code myself. However, when I measured it, inference takes more than 200 ms per image. If possible, could you provide PyTorch or ONNX inference code? My code is below.

======================================
import os
os.environ['CUDA_MODULE_LOADING'] = 'LAZY'
import torchvision
import argparse
import numpy as np
from PIL import Image
import cv2
import onnxruntime as nxrun
import torch
import torchvision.transforms as T
import tqdm
import time

def parser_args():
    parser = argparse.ArgumentParser('Object detection using ONNX model')
    parser.add_argument('--path', type=str, required=True, help='ONNX model file path')
    parser.add_argument('--image_dir', type=str, required=True, help='Directory containing images to run inference on')
    parser.add_argument('--output_dir', type=str, required=True, help='Directory to save output images with detections')
    parser.add_argument('--threshold', type=float, default=0.5, help='Score threshold for displaying bounding boxes')
    parser.add_argument('--iou_threshold', type=float, default=0.5, help='IoU threshold for non-max suppression')
    parser.add_argument('--class_names', type=str, required=True, help='Path to class names file')
    return parser.parse_args()

def findClassNameYOLO(annotationPath):
    with open(annotationPath, 'r') as file:
        className = file.read().splitlines()
    return className

def load_image(file_path):
    return Image.open(file_path).convert("RGB")

def infer_transforms():
    normalize = T.Compose([
        T.ToTensor(),
        T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
    return T.Compose([
        T.Resize((640, 640)),
        normalize,
    ])

def box_cxcywh_to_xyxy(x):
    x_c, y_c, w, h = x.unbind(-1)
    b = [(x_c - 0.5 * w.clamp(min=0.0)), (y_c - 0.5 * h.clamp(min=0.0)),
         (x_c + 0.5 * w.clamp(min=0.0)), (y_c + 0.5 * h.clamp(min=0.0))]
    return torch.stack(b, dim=-1)

def soft_nms(boxes, scores, iou_threshold=0.5, sigma=0.5, score_threshold=0.001):
    N = boxes.shape[0]
    indexes = torch.arange(0, N, dtype=torch.float).view(N, 1)
    dets = torch.cat((boxes, scores.view(N, 1), indexes), dim=1)
    keep = []

    while dets.shape[0]:
        max_idx = torch.argmax(dets[:, 4])
        max_box = dets[max_idx, :4]
        max_score = dets[max_idx, 4]
        keep.append(dets[max_idx, 5].item())

        dets = torch.cat((dets[:max_idx], dets[max_idx+1:]), dim=0)
        if not dets.shape[0]:
            break

        ious = torchvision.ops.box_iou(max_box.unsqueeze(0), dets[:, :4]).squeeze()
        weights = torch.exp(-(ious ** 2) / sigma)
        dets[:, 4] *= weights
        dets = dets[dets[:, 4] > score_threshold]

    return torch.tensor(keep, dtype=torch.long)

def generateColors(numClass):
    colors = []
    golden_ratio_conjugate = 0.618033988749895
    hue = 0

    for i in range(numClass):
        hue += golden_ratio_conjugate
        hue %= 1

        rgb = hsv2rgb(hue, 0.9, 0.95)
        colors.append((int(rgb[2] * 255), int(rgb[1] * 255), int(rgb[0] * 255)))  # BGR format

    return colors

def hsv2rgb(h, s, v):
    if s == 0.0:
        return (v, v, v)

    i = int(h * 6.)
    f = (h * 6.) - i
    p, q, t = v * (1. - s), v * (1. - s * f), v * (1. - s * (1. - f))
    i %= 6

    if i == 0:
        return (v, t, p)
    if i == 1:
        return (q, v, p)
    if i == 2:
        return (p, v, t)
    if i == 3:
        return (p, q, v)
    if i == 4:
        return (t, p, v)
    if i == 5:
        return (v, p, q)

def post_process(outputs, target_sizes, iou_threshold, confidence_threshold):
    out_logits, out_bbox = outputs['labels'], outputs['dets']

    prob = out_logits.sigmoid()
    topk_values, topk_indexes = torch.topk(prob.view(out_logits.shape[0], -1), 300, dim=1)
    scores = topk_values
    topk_boxes = topk_indexes // out_logits.shape[2]
    labels = topk_indexes % out_logits.shape[2]
    boxes = box_cxcywh_to_xyxy(out_bbox)
    boxes = torch.gather(boxes, 1, topk_boxes.unsqueeze(-1).repeat(1,1,4))

    img_h, img_w = target_sizes.unbind(1)
    scale_fct = torch.stack([img_w, img_h, img_w, img_h], dim=1)
    boxes = boxes * scale_fct[:, None, :]

    results = []
    for s, l, b in zip(scores, labels, boxes):
        keep = s > confidence_threshold
        results.append({
            'scores': s[keep],
            'labels': l[keep],
            'boxes': b[keep]
        })

    # Apply NMS
    for result in results:
        keep = torchvision.ops.nms(result['boxes'], result['scores'], iou_threshold)
        result['boxes'] = result['boxes'][keep]
        result['scores'] = result['scores'][keep]
        result['labels'] = result['labels'][keep]

    return results

def saveImage(image, predictions, className, destPath, fileName, colors, original_size, resized_size):
    # original_size and resized_size are kept for the existing call signature but
    # are unused: post_process already scales boxes to original-image pixels.
    num_detections, detected_boxes, detected_scores, detected_labels = predictions

    for i in range(num_detections):
        box = detected_boxes[i]
        score = detected_scores[i]
        label = int(detected_labels[i])

        # Box corners in original-image pixel coordinates
        start_point = (int(box[0]), int(box[1]))
        end_point = (int(box[2]), int(box[3]))
        color = colors[label]
        thickness = 2

        image = cv2.rectangle(image, start_point, end_point, color, thickness)

        label_text = f"{className[label]}: {score:.2f}"
        (label_width, label_height), baseline = cv2.getTextSize(label_text, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
        top_left = (start_point[0], start_point[1] - label_height - baseline)
        bottom_right = (start_point[0] + label_width, start_point[1])

        image = cv2.rectangle(image, top_left, bottom_right, color, cv2.FILLED)
        image = cv2.putText(image, label_text, (start_point[0], start_point[1] - baseline),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1, cv2.LINE_AA)

    savePath = os.path.join(destPath, fileName)
    cv2.imwrite(savePath, image)

def infer_onnx(sess, image_dir, output_dir, threshold, iou_threshold, class_names):
    os.makedirs(output_dir, exist_ok=True)
    log_file_path = os.path.join(output_dir, "inference_log.txt")

    with open(log_file_path, 'w') as log_file:
        image_paths = [os.path.join(image_dir, img_name) for img_name in os.listdir(image_dir) if img_name.endswith('.bmp')]
        colors = generateColors(len(class_names))

        total_time = 0
        min_time = float('inf')
        max_time = 0
        num_images = len(image_paths)

        for idx, img_path in enumerate(tqdm.tqdm(image_paths)):
            image = load_image(img_path)
            width, height = image.size
            orig_target_sizes = torch.Tensor([height, width])
            image_tensor = infer_transforms()(image)

            samples = image_tensor[None].numpy()
            # Only the ONNX Runtime call is timed; pre- and post-processing are excluded.
            start_time = time.time()
            res = sess.run(None, {"input": samples})
            end_time = time.time()

            outputs = {'labels': torch.Tensor(res[1]), 'dets': torch.Tensor(res[0])}
            orig_target_sizes = torch.stack([orig_target_sizes], dim=0)
            results = post_process(outputs, orig_target_sizes, iou_threshold, threshold)

            process_time = (end_time - start_time) * 1000

            image_cv = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
            saveImage(image_cv, (len(results[0]['scores']), results[0]['boxes'], results[0]['scores'], results[0]['labels']),
                      class_names, output_dir, os.path.basename(img_path), colors, (height, width), (height, width))

            total_time += process_time
            min_time = min(min_time, process_time)
            max_time = max(max_time, process_time)

            log_file.write(f"{idx:04d} // Process Time: {process_time:.4f} ms // {img_path}\n")

        avg_time = total_time / num_images
        log_file.write(f"\nMin Inference Time: {min_time:.2f} ms\n")
        log_file.write(f"Max Inference Time: {max_time:.2f} ms\n")
        log_file.write(f"Avg Inference Time: {avg_time:.2f} ms\n")

def main():
    args = parser_args()
    class_names = findClassNameYOLO(args.class_names)
    sess = nxrun.InferenceSession(args.path)
    infer_onnx(sess, args.image_dir, args.output_dir, args.threshold, args.iou_threshold, class_names)

if __name__ == '__main__':
    main()

=================================================

Min Inference Time: 217.61 ms
Max Inference Time: 295.26 ms
Avg Inference Time: 236.82 ms
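
Two things commonly inflate numbers like these, and both are assumptions about this setup, not confirmed causes. First, the session is created without a `providers` argument, so it may be running on the CPU execution provider; sess.get_providers() reports which providers are active, and onnxruntime.InferenceSession accepts a `providers` list such as ["CUDAExecutionProvider", "CPUExecutionProvider"] when a GPU build of onnxruntime is installed. Second, the first few calls often include one-off initialisation cost. The sketch below shows timing with warm-up calls discarded; `benchmark` is a hypothetical helper and the workload is a stand-in.

```python
import statistics
import time


def benchmark(fn, n_warmup=5, n_runs=20):
    """Time `fn()` in milliseconds, discarding the first `n_warmup` calls."""
    for _ in range(n_warmup):
        fn()  # warm-up: lazy initialisation, caches, memory allocation
    timings_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "min": min(timings_ms),
        "max": max(timings_ms),
        "mean": statistics.fmean(timings_ms),
    }


# Stand-in workload; in the script above this would be something like
# lambda: sess.run(None, {"input": samples})
calls = []
stats = benchmark(lambda: calls.append(sum(range(1000))), n_warmup=2, n_runs=5)
print(stats)
```

time.perf_counter is used here because it is a monotonic, high-resolution clock, which suits short intervals better than time.time.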

Looking forward to the releases

Looking forward to LW-DETR

Question about denoising and group detr

Thank you for your amazing work. I have a question. In your Group DETR paper, it is noted that the DN-DETR + Group DETR method improves the mAP. Why did you only use the Group DETR training method in LW-DETR but not the DeNoising training method? Have you tried adding the DeNoising method in LW-DETR experiments? Looking forward to your response.

code

Has the code not been published?

Code Release

Dear Authors,

The work is amazing. When do you intend to make the code available?

Cheers.
