centernet-better-plus's Introduction

centernet-better-plus's People

Contributors

lbin

centernet-better-plus's Issues

FloatingPointError: Loss became infinite or NaN at iteration=167!

dataset: coco2017
IMS_PER_BATCH: 2

[08/21 15:47:57 d2.data.common]: Serializing 117266 elements to byte tensors and concatenating them all ...
[08/21 15:48:01 d2.data.common]: Serialized dataset takes 451.21 MiB
[08/21 15:48:01 d2.data.build]: Using training sampler TrainingSampler
[08/21 15:48:05 fvcore.common.checkpoint]: No checkpoint found. Initializing model from scratch
[08/21 15:48:05 d2.engine.train_loop]: Starting training from iteration 0
[08/21 15:48:06 d2.utils.events]:  eta: 2:36:14  iter: 19  total_loss: 20.71  loss_cls: 18.63  loss_box_wh: 3.366  loss_center_reg: 0.3933  time: 0.0750  data_time: 0.0180  lr: 0.00039962  max_mem: 411M
[08/21 15:48:08 d2.utils.events]:  eta: 2:39:49  iter: 39  total_loss: 12.1  loss_cls: 8.956  loss_box_wh: 1.814  loss_center_reg: 0.368  time: 0.0758  data_time: 0.0027  lr: 0.00079922  max_mem: 411M
[08/21 15:48:10 d2.utils.events]:  eta: 2:40:21  iter: 59  total_loss: 10.76  loss_cls: 6.582  loss_box_wh: 3.11  loss_center_reg: 0.3979  time: 0.0760  data_time: 0.0024  lr: 0.0011988  max_mem: 411M
[08/21 15:48:11 d2.utils.events]:  eta: 2:40:19  iter: 79  total_loss: 11.63  loss_cls: 7.729  loss_box_wh: 2.653  loss_center_reg: 0.2949  time: 0.0762  data_time: 0.0027  lr: 0.0015984  max_mem: 411M
[08/21 15:48:13 d2.utils.events]:  eta: 2:39:15  iter: 99  total_loss: 11.51  loss_cls: 6.495  loss_box_wh: 2.633  loss_center_reg: 0.2932  time: 0.0757  data_time: 0.0027  lr: 0.001998  max_mem: 411M
[08/21 15:48:14 d2.utils.events]:  eta: 2:39:43  iter: 119  total_loss: 17.47  loss_cls: 11.55  loss_box_wh: 2.588  loss_center_reg: 0.2923  time: 0.0757  data_time: 0.0025  lr: 0.0023976  max_mem: 411M
[08/21 15:48:16 d2.utils.events]:  eta: 2:38:11  iter: 139  total_loss: 13.35  loss_cls: 8.074  loss_box_wh: 3.39  loss_center_reg: 0.267  time: 0.0755  data_time: 0.0024  lr: 0.0027972  max_mem: 411M
[08/21 15:48:17 d2.utils.events]:  eta: 2:39:10  iter: 159  total_loss: 12.71  loss_cls: 8.765  loss_box_wh: 2.91  loss_center_reg: 0.2659  time: 0.0758  data_time: 0.0026  lr: 0.0031968  max_mem: 411M
ERROR [08/21 15:48:18 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/train_loop.py", line 141, in train
    self.run_step()
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/train_loop.py", line 244, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/train_loop.py", line 257, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=167!
loss_dict = {'loss_cls': tensor(inf, device='cuda:0', grad_fn=<MulBackward0>), 'loss_box_wh': tensor(2.2988, device='cuda:0', grad_fn=<MulBackward0>), 'loss_center_reg': tensor(0.2414, device='cuda:0', grad_fn=<MulBackward0>), 'data_time': 0.0025475993752479553}
[08/21 15:48:18 d2.engine.hooks]: Overall training speed: 165 iterations in 0:00:12 (0.0762 s / it)
[08/21 15:48:18 d2.engine.hooks]: Total training time: 0:00:12 (0:00:00 on hooks)
Traceback (most recent call last):
  File "train_net.py", line 67, in <module>
    args=(args,),
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "train_net.py", line 55, in main
    return trainer.train()
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/defaults.py", line 402, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/train_loop.py", line 141, in train
    self.run_step()
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/train_loop.py", line 244, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/ma-user/work/Projects/CenterNet-better-plus/detectron2-master/detectron2/engine/train_loop.py", line 257, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=167!
loss_dict = {'loss_cls': tensor(inf, device='cuda:0', grad_fn=<MulBackward0>), 'loss_box_wh': tensor(2.2988, device='cuda:0', grad_fn=<MulBackward0>), 'loss_center_reg': tensor(0.2414, device='cuda:0', grad_fn=<MulBackward0>), 'data_time': 0.0025475993752479553}

Why does the loss blow up? Is something wrong? I only changed IMS_PER_BATCH from 128 to 2, because I ran out of memory.
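One thing worth checking is the learning rate. The logged `lr` values match detectron2's linear warmup with BASE_LR = 0.02, WARMUP_ITERS = 1000 and WARMUP_FACTOR = 0.001. Those three numbers are inferred from the log, not read from the poster's config, so treat them as assumptions. If that rate was tuned for a batch of 128, the common linear scaling heuristic suggests a much smaller rate for a batch of 2. A minimal sketch (both function names are hypothetical helpers, not part of detectron2):

```python
def linear_warmup_lr(base_lr, it, warmup_iters=1000, warmup_factor=0.001):
    """Learning rate during linear warmup, before any step decay is applied."""
    if it >= warmup_iters:
        return base_lr
    alpha = it / warmup_iters
    return base_lr * (warmup_factor * (1 - alpha) + alpha)


def scaled_lr(base_lr, ref_batch, new_batch):
    """Linear scaling heuristic: keep the ratio lr / batch size constant."""
    return base_lr * new_batch / ref_batch


# Assumption: BASE_LR = 0.02, inferred from the lr column of the log above.
for it in (19, 39, 159):
    print(it, round(linear_warmup_lr(0.02, it), 8))

# Suggested starting point when going from a batch of 128 to a batch of 2.
print(scaled_lr(0.02, ref_batch=128, new_batch=2))
```

The printed warmup values can be compared against the `lr:` entries at iterations 19, 39 and 159 in the log. The scaled rate is only a starting point; lowering BASE_LR further or enabling SOLVER.CLIP_GRADIENTS may still be needed at such a small batch size.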

version of detectron2

I installed PyTorch 1.4.0, which is not compatible with the current version of detectron2.

It fails with "ModuleNotFoundError: No module named 'torch.utils.hipify'".

Which version of detectron2 are you using?
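When reporting or comparing environments, it helps to print the installed versions of both packages. The sketch below uses only the standard library (Python 3.8+) and assumes the packages are installed under the distribution names `torch` and `detectron2`; `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("torch", "detectron2"):
    print(name, installed_version(name))
```

A result of `None` means the package is not installed in the current environment under that name.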

I can only get an mAP of 22.8

With ResNet-18, I didn't change any configuration files, but I only got 22.8 mAP.

My config:
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: true
  NUM_WORKERS: 4
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - coco_2017_val
  TRAIN:
  - coco_2017_train
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    ENABLED: false
    SIZE:
    - 0.9
    - 0.9
    TYPE: relative_range
  FORMAT: RGB
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 0
  MIN_SIZE_TRAIN:
  - 640
  - 672
  - 704
  - 736
  - 768
  - 800
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - - -90
      - 0
      - 90
    ASPECT_RATIOS:
    - - 0.5
      - 1.0
      - 2.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - - 32
      - 64
      - 128
      - 256
      - 512
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_torch_backbone
  CENTERNET:
    BIAS_VALUE: -2.19
    DECONV_CHANNEL:
    - 512
    - 256
    - 128
    - 64
    DECONV_KERNEL:
    - 4
    - 4
    - 4
    DOWN_SCALE: 4
    IN_FEATURES:
    - res5
    LOSS:
      CLS_WEIGHT: 1
      REG_WEIGHT: 1
      WH_WEIGHT: 0.1
    MIN_OVERLAP: 0.7
    MODULATE_DEFORM: true
    NUM_CLASSES: 80
    OUTPUT_SIZE:
    - 128
    - 128
    TENSOR_DIM: 128
    TEST_PIPELINES: []
    TRAIN_PIPELINES:
    - - CenterAffine
      - boarder: 128
        output_size:
        - 512
        - 512
        random_aug: true
    - - RandomFlip
      - {}
    - - RandomBrightness
      - intensity_max: 1.4
        intensity_min: 0.6
    - - RandomContrast
      - intensity_max: 1.4
        intensity_min: 0.6
    - - RandomSaturation
      - intensity_max: 1.4
        intensity_min: 0.6
    - - RandomLighting
      - scale: 0.1
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES:
    - res3
    - res4
    - res5
    NORM: ''
    OUT_CHANNELS: 256
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: false
  META_ARCHITECTURE: CenterNet
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 0.485
  - 0.456
  - 0.406
  PIXEL_STD:
  - 0.229
  - 0.224
  - 0.225
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 18
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res5
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: &id001
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS:
    - 10.0
    - 10.0
    - 5.0
    - 5.0
    CLS_AGNOSTIC_BBOX_REG: false
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: ''
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: Res5ROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 80
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id001
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  WEIGHTS: ./output/model_0094999.pth
OUTPUT_DIR: ./output
SEED: -1
SOLVER:
  AMP:
    ENABLED: false
  BASE_LR: 0.002
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 5000
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: false
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 32
  LR_SCHEDULER_NAME: WarmupMultiStepLR
  MAX_ITER: 126000
  MOMENTUM: 0.9
  NESTEROV: false
  REFERENCE_WORLD_SIZE: 0
  STEPS:
  - 81000
  - 108000
  WARMUP_FACTOR: 0.001
  WARMUP_ITERS: 1000
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: 0.0001
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 0
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
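A quick sanity check on a config like this is how many passes over the data the schedule actually makes. The sketch below uses MAX_ITER and IMS_PER_BATCH from the config above and assumes a training set of 117266 images, the element count printed in the log of the first issue; the count for this run may differ. The batch size of 128 used for comparison is the value the first issue says it started from, so whether it is the intended reference here is also an assumption. `epochs_seen` is a hypothetical helper.

```python
def epochs_seen(max_iter, ims_per_batch, dataset_size):
    """Approximate number of passes over the training set for a fixed-iteration schedule."""
    return max_iter * ims_per_batch / dataset_size


DATASET_SIZE = 117266  # assumption: taken from the log of the first issue

for batch in (32, 128):
    epochs = epochs_seen(max_iter=126000, ims_per_batch=batch, dataset_size=DATASET_SIZE)
    print(f"IMS_PER_BATCH={batch}: about {epochs:.1f} epochs")
```

If the 126000-iteration schedule was designed around a batch of 128, running it with a batch of 32 covers roughly a quarter as many images, which could be one reason for a lower mAP. This is only a hypothesis to rule out, not a confirmed diagnosis.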
