
CutLER's Issues

MaskCut with query features

Hey, thanks for your work! I'm trying to test the MaskCut demo step. I find that performing MaskCut with key or value features (--vit-feat 'k' or 'v') produces reasonable segmentation results, while using the query or qkv features does not. Do you have any idea why?
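For readers unfamiliar with what --vit-feat selects, here is a rough sketch of how per-patch q/k/v features are typically pulled out of the last attention block of a DINO ViT. This is my own illustration, not the repository's code; the torch.hub entry point and the blocks[-1].attn.qkv layout follow the DINO convention and are assumptions.

import torch

# Load a DINO ViT-B/8 backbone from torch.hub (the family of backbones MaskCut builds on).
vit_model = torch.hub.load("facebookresearch/dino:main", "dino_vitb8")
vit_model.eval()

feat_out = {}
def hook(module, inp, out):
    feat_out["qkv"] = out

# Capture the qkv projection of the last attention block.
vit_model.blocks[-1].attn.qkv.register_forward_hook(hook)

img = torch.randn(3, 224, 224)  # stand-in for a preprocessed image tensor
with torch.no_grad():
    vit_model(img[None])

bs, nb_tokens, _ = feat_out["qkv"].shape
nb_heads = vit_model.blocks[-1].attn.num_heads
qkv = feat_out["qkv"].reshape(bs, nb_tokens, 3, nb_heads, -1).permute(2, 0, 3, 1, 4)
q, k, v = qkv[0], qkv[1], qkv[2]                        # each: (1, heads, tokens, head_dim)
k_feats = k.transpose(1, 2).reshape(bs, nb_tokens, -1)  # per-token 'k' features, the kind fed to the affinity matrix

The choice between 'q', 'k', 'v', or the concatenated qkv only changes which of these per-token tensors is used to build the patch affinity matrix.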

maskcut.py: the meaning of its arguments

Thanks for sharing your model.
I'm going to train on my custom dataset and make pseudo-masks with MaskCut.
Could I know the meaning of each argument that goes into 'maskcut.py'?

cd maskcut

python maskcut.py
--vit-arch base #1.
--patch-size 8 #2.
--tau 0.15 #3.
--fixed_size 480 #4.
--N 3 #5.
--num-folder-per-job 1000 #6.
--job-index 0 #7.
--dataset-path /path/to/dataset/traindir
--out-dir /path/to/save/annotations

#1. What is '--vit-arch'? Are there any options other than 'base'?
#2. Is '--patch-size' the number of segments for the input image?
#3. What is '--tau'?
#4. What is '--fixed_size'?
#5. Is '--N' the number of objects that must be masked in the image? Is it a maximum number?
#6. Is '--num-folder-per-job' the number of folders that make up the input data?
#7. What is '--job-index'?

If you could answer these questions, it would be really helpful for using CutLER :)
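Not an authoritative answer, but for other readers, here is my reading of the flags based on the MaskCut paper and the README defaults; treat the per-flag notes as educated guesses rather than documentation.

# --vit-arch            DINO ViT backbone size: 'base' or 'small'
# --patch-size          ViT patch size (8 or 16); it is not a number of segments
# --tau                 threshold used to binarize the patch-wise affinity matrix in Normalized Cuts
# --fixed_size          images are resized to this fixed size before running MaskCut
# --N                   maximum number of pseudo-masks extracted per image (fewer may be found)
# --num-folder-per-job  how many dataset sub-folders a single job processes
# --job-index           which shard of folders this job handles, for parallelizing over many jobs
python maskcut.py --vit-arch base --patch-size 8 --tau 0.15 --fixed_size 480 --N 3 \
  --num-folder-per-job 1000 --job-index 0 \
  --dataset-path /path/to/dataset/traindir --out-dir /path/to/save/annotations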

Clarification request about implementation details

Hello,

I have a couple of questions about sections 3.4 and 3.5:
About section 3.4:

"To de-duplicate the predictions and the ground truth from round t, we filter out ground-truth masks with an IoU > 0:5 with the predicted masks."

  1. Is this performed automatically if all annotation files are placed under DETECTRON2_DATASETS/imagenet/annotations/?

About section 3.5: I am a bit confused about which learning rate you used for which training stage. In the Detector paragraph it is written:

"We train the detector on ImageNet with initial masks and bounding boxes for 160K iterations with a batch size of 16."

  1. What is the learning rate here?

A little further it is written:

"We then optimize the detector for 160K iterations using SGD with a learning rate of 0.005, which is decreased by 5 after 80K iterations, and a batch size of 16"

Assuming that these 2 sentences refer to the training stages where you are using DropLoss:

  1. When I check the cascade_mask_rcnn_R_50_FPN.yaml file, the BASE_LR parameter is set to 0.01, and GAMMA is set to 0.02 (i.e. decreased by a factor of 50), which is not consistent with anything in your paper. Is this intended, or might it be a typo?
  2. Did you train only once before moving to the self-training stages (i.e. do the two previous sentences refer to the same training stage)?

Then, in the Self-training paragraph:

"We optimize the detector using SGD with a learning rate of 0.01 for 80K iterations."

  1. When I check the cascade_mask_rcnn_R_50_FPN_self_train.yaml file, the BASE_LR parameter is set to 0.005, which was the learning rate specified for the training with DropLoss. Is this intended, or might it be a typo?

Would it be possible to have further clarifications?
Please let me know.

IOU threshold used for DropLoss

Hello, I am a bit confused by the DropLoss threshold. From my understanding, a high threshold = more loss (the model is penalized for exploring), while a low threshold encourages the model to explore. This seems to be the case when I look at the code:

weights = iou_max.le(self.droploss_iou_thresh).float()
weights = 1 - weights.ge(1.0).float()
losses = self.box_predictor.losses(predictions, proposals, weights=weights.detach())

However, the wording in the paper says the opposite:

Finally, in Table 8d, we vary the IoU threshold used for DropLoss. With a high threshold, we ignore the loss for a higher number of predicted regions while encouraging the model to explore. 0.01 works best for the trade-off between exploration and detection performance.
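For what it's worth, tracing the quoted snippet on a toy input may help reconcile the two readings (this is my own sketch, not the repository's code). Proposals whose maximum IoU with the ground truth is at or below the threshold end up with weight 0, i.e. their loss is dropped, so a higher threshold drops the loss for more predicted regions and encourages exploration, which matches the paper's wording.

import torch

# Toy trace of the quoted weighting logic, assuming droploss_iou_thresh = 0.01
# and three proposals whose max IoU with the ground truth is 0.0, 0.005 and 0.6.
droploss_iou_thresh = 0.01
iou_max = torch.tensor([0.0, 0.005, 0.6])

weights = iou_max.le(droploss_iou_thresh).float()  # -> [1., 1., 0.]  flags low-overlap proposals
weights = 1 - weights.ge(1.0).float()              # -> [0., 0., 1.]  their loss weight becomes 0

print(weights)  # tensor([0., 0., 1.])
# Proposals with IoU <= threshold get weight 0, i.e. their loss is dropped,
# so raising the threshold drops the loss for more regions and encourages exploration.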

About cutler/demo/demo.py

Thank you for posting your code.

When I run cutler/demo/demo.py, I get the results shown in the images below.
I installed detectron2 on Linux (torch 1.12 / CUDA 11.3) and use A6000 GPUs.

I followed this.
But I have two issues:
(1) The results are strange.
(2) Whenever I run this file, the results change.

Result 1: [demo output image attached]

Result 2: [demo output image attached]

Could I have missed something?

I look forward to hearing from you.

MaskCut leaves out some folders

Hello, thank you for sharing this nice work.

I am encountering an issue while running MaskCut on my dataset, as some folders are not being processed. I attached a picture of the masking process for a dummy dataset split into 10 folders of 2 images each.

[screenshot of the MaskCut run attached]

  • This also happens when the 10 folders contain the exact same 2 pictures (1 out of 10 is left out)
  • This also happens when I process the 5 first folders only (1 out of 5 is left out)
  • This also happens when I process the 5 remaining folders (1 out of 5 is left out)

I'm not sure what could be causing this problem; would you have any suggestions?

ModuleNotFoundError: No module named 'third_party'

While trying to run maskcut/demo.py, this error is repeatedly encountered:
ModuleNotFoundError: No module named 'third_party'
The git submodules have all been updated, but the error persists.
This error is also encountered in the Colab notebook for MaskCut.

Any suggestions to fix the issue?
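A common workaround (an assumption on my part, since the exact cause can vary) is that the scripts expect to be run with the repository root on the import path, because third_party/ sits at the root of the CutLER checkout as a git submodule. Adding the root to sys.path before the failing import usually gets past this; the path below is a placeholder for your checkout.

import sys
sys.path.insert(0, "/path/to/CutLER")  # repository root containing third_party/
# example import that should now resolve
from third_party.TokenCut.unsupervised_saliency_detection import metric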

Supervised fine-tuning on my dataset: box mAP is lower than with mmdet Mask R-CNN

1. Training data: my own dataset of about 6k images, size 1024*1024, with box labels only (no mask labels).
The config follows the COCO semi-supervised config, single GPU, with these modifications:
IMS_PER_BATCH: 8; BASE_LR: 0.04/8; MAX_ITER: 20000 (about 30 epochs); MASK_ON: False
INPUT: MAX_SIZE_TRAIN: 1024; MIN_SIZE_TRAIN: (1024,)
TEST: MAX_SIZE: 1024; MIN_SIZES: 1024

I use the pretrained model http://dl.fbaipublicfiles.com/cutler/checkpoints/cutler_cascade_final.pth

results of box AP:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.264
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.421
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.280
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.081
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.217
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.366
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.291
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.438
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.441
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.147
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.366
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.523

2. When I use mmdet Mask R-CNN with a BYOL self-supervised pretrained model and train for just 12 epochs, the results are much better:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.553
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.831
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.632
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.218
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.507
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.610
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.634
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.238
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.570
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.688

(1) How can I improve the mAP? Is my config file incorrect?
(2) If I want to do self-training on my dataset (about 60K unlabeled images), should I do the following:
step 1: use maskcut.py and merge_jsons.py to generate pseudo-masks
step 2: use the pseudo-masks for unsupervised model learning
step 3: self-training

My dataset has some characteristics: small objects; complex scenes.
[example image attached]

Correct steps for self-training (custom dataset w/o annotations)

Hi and thank you for the cool work! :)

I am trying to perform unsupervised segmentation on a custom dataset (let's call it customdataset here for less confusion) using CutLER and have several questions which appeared when performing the following steps
(I referred to #16 already; it is related, but here I am asking more about the usage of the repo and understanding it.)

  1. Generate pseudo-masks using MaskCut -> the output is a .json file.
  2. Modify the dataset scripts to enable registering a custom dataset. For that I added the following in cutler/data/datasets/builtin.py:

_PREDEFINED_SPLITS_customdataset = {}
_PREDEFINED_SPLITS_customdataset["custom_dataset"] = {
    'custom_dataset_train': ("custom_dataset/images/train",
                             "custom_dataset/annotations/merged_imagenet_train_fixsize480_tau0.15_N3.json"),
}

and
def register_all_customdataset(root):
    for dataset_name, splits_per_dataset in _PREDEFINED_SPLITS_customdataset.items():
        for key, (image_root, json_file) in splits_per_dataset.items():
            # Assume pre-defined datasets live in ./datasets.
            register_coco_instances(
                key,
                _get_builtin_metadata(dataset_name),
                os.path.join(root, json_file) if "://" not in json_file else json_file,
                os.path.join(root, image_root),
            )

and
register_all_customdataset(_root)

In the file cutler/data/datasets/builtin_meta.py it is written that for custom datasets it is not necessary to write hard-coded metadata. But when debugging errors with registration, I added the following in the function _get_builtin_metadata (note the added 'custom_dataset'):

elif dataset_name in ["imagenet", "kitti", "cross_domain", "lvis", "voc", "coco_cls_agnostic",
                      "objects365", 'openimages', 'custom_dataset']:  # 'custom_dataset' added by me
    return _get_imagenet_instances_meta()

Question: Is this a correct way to handle the metadata? Or should annotations created by MaskCut be used with coco_instances instead, i.e. should I add my dataset name to this list: if dataset_name in ["coco", "coco_semi"]: return _get_coco_instances_meta()?
Or is the approach wrong altogether? My custom dataset is not real-world data and the categories do not match. At this point, if I only care about segmenting out different objects without naming them, should I use the UVO function?

  3. Use the generated pseudo-masks for self-training.

In the Self-Training section for CutLER, there are 3 steps described for self-training:
step1 - "Firstly, we can get model predictions on ImageNet via running".
step2 - "Secondly, we can run the following command to generate the json file for the first round of self-training"
step3 - "Finally, place "cutler_imagenet1k_train_r1.json" under "DETECTRON2_DATASETS/imagenet/annotations/", then launch the self-training process".

Question: For custom datasets, should I skip step 1 and step 2, since MaskCut already gives us the .json file that can be used for self-training?

I did not run step 1 and step 2 and directly ran the following command from step 3 to train the model on my custom dataset using the MaskCut annotations, with the ImageNet CutLER checkpoint as weight initialization.
python train_net.py --num-gpus 1 \
  --config-file model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml \
  --train-dataset custom_dataset_train \
  MODEL.WEIGHTS http://dl.fbaipublicfiles.com/cutler/checkpoints/cutler_mrcnn_final.pth \
  OUTPUT_DIR outputs/cascade/custom_dataset_selftrain-r1

It launched the training and I got a model.

  4. (optional) Do another round of self-training.

Question: After the first round, do I understand correctly that I would need to run step 1 (get model predictions from my newly trained model), step 2 (generate a json file from them), and then step 3 (launch the self-training process using the new json file)? And the self-training rounds should all be done on the same data? Only the ground-truth annotations are updated, right?

  5. Inference.

I ran only one round of self-training (just on the MaskCut annotations) and then ran the demo to visualize the learned masks using the command:

python demo/demo.py \
  --config-file model_zoo/configs/CutLER-ImageNet/mask_rcnn_R_50_FPN.yaml \
  --input ../../data/custom_dataset/images/train/*.jpg \
  --output outputs/inference/custom_dataset_selftrain1 \
  --opts MODEL.WEIGHTS outputs/custom_dataset_selftrain-r1/model_final.pth

But the demo images were crowded with the label "person" and confidence percentages.
Question: I understand that the problem must be related to using the ImageNet metadata, right? Is there a way to visualize only the segmentations, without any labels?
So far, my intuition is to create a custom Visualizer for detectron2... But I still wanted to ask...
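For what it's worth, a label-free visualization can usually be done without a fully custom Visualizer by overlaying only the predicted masks. A hedged sketch follows; the image path is a placeholder and `predictions` is assumed to be the dict returned by the demo's predictor (standard detectron2 output fields).

import cv2
from detectron2.utils.visualizer import Visualizer

img = cv2.imread("some_image.jpg")               # BGR image read from disk
instances = predictions["instances"].to("cpu")   # assumed output of the demo predictor
vis = Visualizer(img[:, :, ::-1])                # Visualizer expects RGB
# draw only the masks: no labels, no boxes, no scores
out = vis.overlay_instances(masks=instances.pred_masks.numpy())
cv2.imwrite("masks_only.jpg", out.get_image()[:, :, ::-1])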

Looking forward to hearing any feedback! :)

How can I self-train with 1 gpu?

Thank you for the cool work!
I have a question: how can I use just 1 GPU to train on my dataset?
There is a "--num-gpus" option in the argument set, I only have 1 GPU, and my dataset is pretty small.

But there is an error when I launch the script:
python train_net.py --num-gpus 1 --config-file model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml --train-dataset imagenet_train_r1 OUTPUT_DIR ../model_output/self-train-r1/
It doesn't work; here is the relevant part of the log:

[07/21 15:40:26 d2.engine.train_loop]: Starting training from iteration 0
/root/autodl-tmp/project/CutLER/cutler/data/detection_utils.py:437: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:143.)
torch.stack([torch.from_numpy(np.ascontiguousarray(x)) for x in masks])
/root/autodl-tmp/project/CutLER/cutler/data/detection_utils.py:437: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:143.)
torch.stack([torch.from_numpy(np.ascontiguousarray(x)) for x in masks])
ERROR [07/21 15:40:27 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
File "/root/autodl-tmp/project/detectron2/detectron2/engine/train_loop.py", line 155, in train
self.run_step()
File "/root/autodl-tmp/project/CutLER/cutler/engine/defaults.py", line 505, in run_step
self._trainer.run_step()
File "/root/autodl-tmp/project/CutLER/cutler/engine/train_loop.py", line 335, in run_step
loss_dict = self.model(data)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/CutLER/cutler/modeling/meta_arch/rcnn.py", line 160, in forward
features = self.backbone(images.tensor)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/modeling/backbone/fpn.py", line 139, in forward
bottom_up_features = self.bottom_up(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/modeling/backbone/resnet.py", line 445, in forward
x = self.stem(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/modeling/backbone/resnet.py", line 356, in forward
x = self.conv1(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/layers/wrappers.py", line 131, in forward
x = self.norm(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 532, in forward
world_size = torch.distributed.get_world_size(process_group)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 711, in get_world_size
return _get_group_size(group)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 263, in _get_group_size
default_pg = _get_default_group()
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 347, in _get_default_group
raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
[07/21 15:40:27 d2.engine.hooks]: Total training time: 0:00:01 (0:00:00 on hooks)
[07/21 15:40:27 d2.utils.events]: iter: 0 lr: N/A max_mem: 1098M
Traceback (most recent call last):
File "train_net.py", line 170, in
launch(
File "/root/autodl-tmp/project/detectron2/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "train_net.py", line 160, in main
return trainer.train()
File "/root/autodl-tmp/project/CutLER/cutler/engine/defaults.py", line 495, in train
super().train(self.start_iter, self.max_iter)
File "/root/autodl-tmp/project/detectron2/detectron2/engine/train_loop.py", line 155, in train
self.run_step()
File "/root/autodl-tmp/project/CutLER/cutler/engine/defaults.py", line 505, in run_step
self._trainer.run_step()
File "/root/autodl-tmp/project/CutLER/cutler/engine/train_loop.py", line 335, in run_step
loss_dict = self.model(data)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/CutLER/cutler/modeling/meta_arch/rcnn.py", line 160, in forward
features = self.backbone(images.tensor)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/modeling/backbone/fpn.py", line 139, in forward
bottom_up_features = self.bottom_up(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/modeling/backbone/resnet.py", line 445, in forward
x = self.stem(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/modeling/backbone/resnet.py", line 356, in forward
x = self.conv1(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/autodl-tmp/project/detectron2/detectron2/layers/wrappers.py", line 131, in forward
x = self.norm(x)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 532, in forward
world_size = torch.distributed.get_world_size(process_group)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 711, in get_world_size
return _get_group_size(group)
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 263, in _get_group_size
default_pg = _get_default_group()
File "/root/miniconda3/envs/cutler/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 347, in _get_default_group
raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

It looks like it is related to DDP.
Thanks for your help!
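A hedged note from the log: the failure happens inside a SyncBatchNorm layer asking for the distributed world size, which suggests the config sets NORM to "SyncBN". A common single-GPU workaround (an assumption, not an official recommendation) is to override the norm layers to plain BN on the command line; depending on which heads the config also sets to SyncBN, MODEL.ROI_BOX_HEAD.NORM and MODEL.ROI_MASK_HEAD.NORM may need the same override.

python train_net.py --num-gpus 1 \
  --config-file model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml \
  --train-dataset imagenet_train_r1 \
  MODEL.RESNETS.NORM "BN" MODEL.FPN.NORM "BN" \
  OUTPUT_DIR ../model_output/self-train-r1/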

Can I visualize the synthesized training data?

Thank you for the great work!

In the paper, it is mentioned that ImageCut2Video is used to create a synthetic video dataset using pseudo-annotations generated from maskcut. I understand it is generated from video_imagenet_train_fixsize480_tau0.15_N3.json, but the code appears to use internal APIs of Detectron2.

I'm wondering if there is any guide or method for me to view this synthetic data!

Set up panoptic/semantic segmentation with cutler

Hi!
Thanks for your awesome work!

I ran MaskCut and CutLER on a custom dataset, and registered the dataset the same way as ImageNet, so it used 2 categories, background and foreground. I am wondering how to set up a custom CutLER training on a dataset with more classes (or panoptic segmentation, which would include stuff).

My concern and confusion is that from MaskCut we get single foreground object masks, which have no semantic meaning (class) attached to them. How should I tackle this situation?

My intuition was that I could learn foreground object segmentation by training CutLER on class-agnostic masks (obtained from MaskCut or elsewhere), but that also seems wrong, since the class-distinguishing ability of Mask R-CNN would not be used to its full extent in such a case.

looking forward to hearing your thoughts!

Kind regards,
Alexa

Dataset

What does "The datasets are assumed to exist in a directory specified by the environment variable DETECTRON2_DATASETS. Under this directory, detectron2 will look for datasets in the structure described below, if needed." mean?
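In practice (a hedged illustration with placeholder paths), it just means you point an environment variable at the folder that holds your dataset directories, and the loaders then build all dataset paths relative to it:

export DETECTRON2_DATASETS=/path/to/DETECTRON2_DATASETS/
# the code then expects a layout such as
#   $DETECTRON2_DATASETS/imagenet/train/<class folders>/<images>
#   $DETECTRON2_DATASETS/imagenet/annotations/imagenet_train_fixsize480_tau0.15_N3.json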

Pretraining with CutLER

Hello, reading your paper was pretty interesting and insightful.
I was wondering how well an object detector model such as ViTDet can benefit by pretraining with CutLER?
For instance, from the ViTDet paper, the authors achieve 55.6 AP_box and 49.2 AP_mask (Table 5 in Exploring Plain Vision Transformer Backbones for Object Detection); is it possible to pretrain a ViTDet with CutLER and fine-tune it in a supervised way on COCO to improve detection results?

Thanks again for the great paper.

Failed when applied to mice dataset

Hi! When I use the demo images command in the README, MaskCut can separate different instances:
[example output image attached]

Following the instructions, I typed in the following command:
python demo.py --img-path imgs/00000.jpg --N 6 --tau 0.2 --vit-arch base --patch-size 8
and changed the parameters. However, MaskCut always treats different mice as the same instance. What can I do to solve the problem?
[example output image attached]

Generating Annotations for VOC with MaskCut

Hey, thanks for your work!
I have tried to generate annotations for VOC with MaskCut, but there are garbled characters in the generated json file when it comes to annotation_info.
(The command I ran is: python maskcut.py --vit-arch base --patch-size 8 --tau 0.15 --fixed_size 480 --N 3 --num-folder-per-job 1000 --job-index 0 --dataset-path /path/to/dataset/traindir --out-dir /path/to/save/annotations)
[screenshot of the garbled annotations attached]
I have also changed the decoding method from ascii to gbk and utf-8, but it still does not work.
[screenshot attached]

Do you have any idea about this?
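A hedged guess: if the pseudo-masks are stored as compressed RLE (as COCO-style mask annotations often are), the "counts" field is a byte string rather than readable text, so it will always look garbled in a text editor; it is meant to be decoded with pycocotools rather than read. A sketch, with the annotation path as a placeholder:

import json
from pycocotools import mask as mask_util

with open("/path/to/save/annotations/imagenet_train_fixsize480_tau0.15_N3.json") as f:
    coco = json.load(f)

ann = coco["annotations"][0]
rle = ann["segmentation"]              # {"size": [h, w], "counts": <compressed RLE string>}
binary_mask = mask_util.decode(rle)    # (h, w) uint8 array, 1 inside the mask
print(binary_mask.shape, binary_mask.sum(), "foreground pixels")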

VOC annotations generated by maskcut

Hey, this is my json file of VOC annotations generated by MaskCut, but the bbox information is different from the file you submitted on GitHub. I therefore visualized the results from both on JPEGImages/009829.jpg; the yellow boxes are mine and the red ones are yours.
[comparison image attached]
Do you have any idea why there are these differences?
voc_annotations.zip

#28 (comment)

Slow when computing eigenvectors

I think it takes a long time to solve the generalized eigenvalue problem in maskcut.py:

def second_smallest_eigenvector(A, D):
    # get the second smallest eigenvector from affinity matrix
    _, eigenvectors = eigh(D-A, D, subset_by_index=[1,2])
    eigenvec = np.copy(eigenvectors[:, 0])
    second_smallest_vec = eigenvectors[:, 0]
    return eigenvec, second_smallest_vec

Maybe torch.lobpcg would be a better option.
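For reference, a rough, unbenchmarked sketch of the torch.lobpcg alternative suggested above: solve the generalized eigenproblem (D - A) x = lambda D x for the two smallest eigenpairs. This is my own sketch, not tested against maskcut.py; lobpcg can be less accurate near the zero eigenvalue, so results should be checked against scipy.linalg.eigh before swapping it in.

import numpy as np
import torch

def second_smallest_eigenvector_lobpcg(A, D):
    # A: affinity matrix, D: diagonal degree matrix (both numpy, shape (n, n))
    A_t = torch.as_tensor(A, dtype=torch.float64)
    D_t = torch.as_tensor(D, dtype=torch.float64)
    laplacian = D_t - A_t
    # two smallest generalized eigenpairs of  laplacian x = lambda * D x
    eigenvalues, eigenvectors = torch.lobpcg(laplacian, k=2, B=D_t, largest=False)
    order = torch.argsort(eigenvalues)
    return eigenvectors[:, order[1]].numpy()  # eigenvector of the second smallest eigenvalue

# toy usage with a random symmetric affinity matrix
rng = np.random.default_rng(0)
A = rng.random((50, 50)); A = (A + A.T) / 2
D = np.diag(A.sum(axis=1))
v = second_smallest_eigenvector_lobpcg(A, D)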

Error with imports when using the notebook demo for maskcut

When trying out the demo for MaskCut in a Jupyter notebook using the link in the README, I cannot seem to import the metric module from third_party.TokenCut.unsupervised_saliency_detection. The system path doesn't seem to work. I have tried workarounds, but there is still an import in maskcut which I don't wish to disturb, and I haven't been able to find an elegant solution. Can you help me with it?

Official Evaluation on COCO val2017 with MaskCut only

Hi Xudong and all,
I'm new to segmentation/detection and I'm currently working on evaluating the annotations predicted purely by the MaskCut process on the COCO validation 2017 set, which contains 5k images. I'm wondering if you have any existing approach/code to evaluate the generated JSON file? If so, it would be a great help to me!

Thank you so much!

Dependencies for pydensecrf

I was stuck installing pydensecrf just now and finally solved it~ This issue is just for the convenience of anyone who follows~

Cython 3.0.0 currently doesn't support compiling pydensecrf, so directly running the following command will fail:
pip install git+https://github.com/lucasb-eyer/pydensecrf.git

Use the following instead:
pip3 install --force-reinstall cython==0.29.36
pip3 install --no-build-isolation git+https://github.com/lucasb-eyer/pydensecrf.git

Reference:
lucasb-eyer/pydensecrf#123 (comment)

Error when running videocutler demo

I'm trying to run the videocutler demo. I have run the maskcut and cutler demos successfully. However, when running the videocutler demo I get the following error:

Traceback (most recent call last):
  File "demo_video/demo.py", line 27, in <module>
    from mask2former import add_maskformer2_config
  File "/home/bea/CutLER-main/videocutler/demo_video/../mask2former/__init__.py", line 3, in <module>
    from . import modeling
  File "/home/bea/CutLER-main/videocutler/demo_video/../mask2former/modeling/__init__.py", line 4, in <module>
    from .pixel_decoder.msdeformattn import MSDeformAttnPixelDecoder
  File "/home/bea/CutLER-main/videocutler/demo_video/../mask2former/modeling/pixel_decoder/msdeformattn.py", line 19, in <module>
    from .ops.modules import MSDeformAttn
  File "/home/bea/CutLER-main/videocutler/demo_video/../mask2former/modeling/pixel_decoder/ops/modules/__init__.py", line 12, in <module>
    from .ms_deform_attn import MSDeformAttn
  File "/home/bea/CutLER-main/videocutler/demo_video/../mask2former/modeling/pixel_decoder/ops/modules/ms_deform_attn.py", line 24, in <module>
    from ..functions import MSDeformAttnFunction
  File "/home/bea/CutLER-main/videocutler/demo_video/../mask2former/modeling/pixel_decoder/ops/functions/__init__.py", line 12, in <module>
    from .ms_deform_attn_func import MSDeformAttnFunction
  File "/home/bea/CutLER-main/videocutler/demo_video/../mask2former/modeling/pixel_decoder/ops/functions/ms_deform_attn_func.py", line 22, in <module>
    import MultiScaleDeformableAttention as MSDA
ImportError: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory

I am running this in a conda env with:
Python 3.8
PyTorch 1.9
Cuda 10.2

I'm not very sure about my CUDA version, but I believe that I installed that one.
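A hedged observation: a missing libtorch_cuda_cu.so when importing MultiScaleDeformableAttention usually means the CUDA extension was built against a different PyTorch/CUDA build than the one currently installed. A commonly suggested fix (assuming CutLER keeps Mask2Former's usual layout and build script) is to rebuild the ops inside the active environment:

cd videocutler/mask2former/modeling/pixel_decoder/ops
sh make.sh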

Can i train a "videocutler"?

First of all, thank you for your great work.

I am trying to train a "videocutler." After preparing all the necessary datasets and running the code, I encounter the following error:

Exception has occurred: KeyError
'video_id'
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/data_video/datasets/ytvis_api/ytvos.py", line 75, in createIndex
vidToAnns[ann['video_id']].append(ann)
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/data_video/datasets/ytvis_api/ytvos.py", line 66, in init
self.createIndex()
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/data_video/datasets/ytvis.py", line 173, in load_ytvis_json
ytvis_api = YTVOS(json_file)
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/data_video/datasets/ytvis.py", line 310, in
DatasetCatalog.register(name, lambda: load_ytvis_json(json_file, image_root, name))
File "/home/sujung/repo/CutLER/detectron2/detectron2/data/catalog.py", line 58, in get
return f()
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/data_video/build.py", line 92, in
dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in dataset_names]
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/data_video/build.py", line 92, in get_detection_dataset_dicts
dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in dataset_names]
File "/home/sujung/repo/CutLER/videocutler/train_net_video.py", line 84, in build_train_loader
dataset_dict = get_detection_dataset_dicts(
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/engine/defaults.py", line 391, in init
data_loader = self.build_train_loader(cfg)
File "/home/sujung/repo/CutLER/videocutler/train_net_video.py", line 305, in main
trainer = Trainer(cfg)
File "/home/sujung/repo/CutLER/detectron2/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "/home/sujung/repo/CutLER/videocutler/train_net_video.py", line 313, in
launch(
KeyError: 'video_id'

Upon examining the problematic part, I noticed that the 'ann' dictionary does not contain the 'video_id' key. So, I changed 'video_id' to 'image_id' on lines 75 and 88 of 'ytvos.py.' However, when I tried running it again, I encountered the following error:

Exception has occurred: AssertionError
Dataset 'imagenet_video_train_cls_agnostic' is empty!
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/data_video/build.py", line 94, in get_detection_dataset_dicts
assert len(dicts), "Dataset '{}' is empty!".format(dataset_name)
File "/home/sujung/repo/CutLER/videocutler/train_net_video.py", line 84, in build_train_loader
dataset_dict = get_detection_dataset_dicts(
File "/home/sujung/repo/CutLER/videocutler/mask2former_video/engine/defaults.py", line 391, in init
data_loader = self.build_train_loader(cfg)
File "/home/sujung/repo/CutLER/videocutler/train_net_video.py", line 305, in main
trainer = Trainer(cfg)
File "/home/sujung/repo/CutLER/detectron2/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "/home/sujung/repo/CutLER/videocutler/train_net_video.py", line 313, in
launch(
AssertionError: Dataset 'imagenet_video_train_cls_agnostic' is empty!

WARNING [11/06 11:08:38 mask2former_video.data_video.datasets.ytvis]: /home/sujung/repo/CutLER/videocutler/DETECTRON2_DATASETS/imagenet/annotations/video_imagenet_train_fixsize480_tau0.15_N3.json contains 1933347 annotations, but only 0 of them match to images in the file.
[11/06 11:08:38 mask2former_video.data_video.datasets.ytvis]: Loaded 0 videos in YTVIS format from /home/sujung/repo/CutLER/videocutler/DETECTRON2_DATASETS/imagenet/annotations/video_imagenet_train_fixsize480_tau0.15_N3.json

I would greatly appreciate any advice on how to resolve this issue.

backbone change

When I replace the ViT backbone with a ResNet, MaskCut produces incorrect masks. Can the backbone only be a Transformer?

Zero-shot detection evaluation

Thanks for your great work. I have some confusion about the zero-shot evaluation. In the 'cascade_mask_rcnn_R_50_FPN.yaml' config file, ROI_HEADS.NUM_CLASSES is set to 1, so CutLER can only distinguish foreground objects from background. However, in a dataset such as COCO there are 80 different classes; how does CutLER compute the AP50 metric in the zero-shot setting?
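For context, class-agnostic evaluation is usually done by collapsing all ground-truth categories into a single "object" class before running COCO evaluation, so a 1-class detector can be scored on an 80-class dataset. This is a hedged sketch of that conversion (my own illustration of the general trick, not CutLER's evaluation code; file paths are placeholders):

import json

with open("instances_val2017.json") as f:
    coco = json.load(f)

# collapse every category into one class-agnostic "object" category
coco["categories"] = [{"id": 1, "name": "object"}]
for ann in coco["annotations"]:
    ann["category_id"] = 1

with open("instances_val2017_cls_agnostic.json", "w") as f:
    json.dump(coco, f)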

Supervised/unsupervised custom dataset

First of all, thank you for the amazing methods introduced in the paper. As the title suggests, I’m trying to train an object detector for my custom dataset using both approaches to see which one’s better.

  1. I have generated annotations using maskcut.py with fixed_size changed from 480 to 640, and placed the .json under ./datasets/imagenet/annotations. I also placed the images under ./datasets/imagenet/train. I then renamed the path in cutler/data/datasets/builtin.py to 'imagenet_train_fixsize640_tau0.15_N3.json' to reflect my file. However, it gives me a 'no valid images' error when I run train_net.py. Is there something I missed?

  2. For human-made annotations (COCO format), to train a fully supervised model using train_net.py, I'm registering my custom dataset using 'register_coco_instances' and modified the config in the model zoo from 'coco_train_2017' to 'my_dataset'. However, it gives me an error that my custom dataset is not yet registered.

Any help that can point me in the right direction would be greatly appreciated.

Custom COCO dataset in self-training

Thanks for the nice work. I have a question regarding a custom COCO dataset used in self-training. For my COCO data, I have instances_train.json and instances_val.json, and I registered both the train and val datasets, but in the first step of self-training, --test-dataset only takes 'imagenet_train'.

Does this mean ImageNet uses only one json file for both training and validation? Or can the json-file generation for self-training only be applied to the training data itself, not the val data? I am confused about this.

python merge_jsons.py : num-folder-per-job missing argument

Hello,

Thank you for your work.

For MaskCut part the following command is provided by your instructions.

python merge_jsons.py
--base-dir /path/to/save/annotations
--num-folder-per-job 2 --fixed-size 480
--tau 0.15 --N 3
--save-path imagenet_train_fixsize480_tau0.15_N3.json

However, the num-folder-per-job argument does not exist in the merge_jsons.py file.

why use imagenet to pretrain CutLER

During training, multiple masks per image are generated. However, in ImageNet there is only one object per image. How can this training approach be successful? Shouldn't we use the COCO dataset, which contains images with multiple objects?

Train on a custom dataset

Hi, thanks for sharing this interesting work.

How can I train this model on custom datasets, and how should I prepare them? Do I need annotations at all?

Regarding running CutLER on Custom dataset

Hi Folks, excellent read and amazing work! I've been trying to run the CutLER on my dataset and had some queries regarding running the experiment, but also some clarifications regarding the paper in general. Please let me know if this is not the appropriate medium for the question, I'll send a mail instead. Thanks!

  • When I create a custom dataset as mentioned I believe I'll need to run the following script to register a COCO Format dataset
    from detectron2.data.datasets import register_coco_instances
    register_coco_instances("my_dataset", {}, "json_annotation.json", "path/to/image/dir")

Where do I need to run this code snippet from? Can I just create a Jupyter notebook in the CutLER folder and run these snippets? And if I do, I need to provide the annotations file as well, but I'm trying to use the MaskCut approach discussed above to generate the pseudo ground truth; in that case, how do I pass the .json file to register the dataset?

  • Would it be easier to just use the ImageNet naming convention, put my domain-related images in that folder, and train it as ImageNet, or would that make any difference? That approach sounds easier to me than registering a custom dataset.
  • In the command to run merge_jsons.py, the save path passed is --save-path imagenet_train_fixsize480_tau0.15_N3.json; however, the naming convention of the json file generated by running maskcut.py is different. So while running merge_jsons.py, are we supposed to pass imagenet_train_fixsize480_tau0.15_N3.json or the file that was generated after running maskcut.py?
  • While doing self training on the new dataset using the given command
    python train_net.py --num-gpus 8 \
      --config-file model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN.yaml \
      --test-dataset imagenet_train \
      --eval-only TEST.DETECTIONS_PER_IMAGE 30 \
      MODEL.WEIGHTS output/model_final.pth \  # load previous stage/round checkpoints
      OUTPUT_DIR output/  # path to save model predictions
    Could you please explain a little bit about the parameters
  1. test-dataset: are we supposed to pass the whole training dataset?
  2. MODEL.WEIGHTS: it is output/model_final.pth; is the output folder supposed to be created inside the cutler folder?
  3. OUTPUT_DIR: is it the same directory where we are providing the path to the model weights?

And when we want to get the annotations using the following command
python tools/get_self_training_ann.py \
  --new-pred output/inference/coco_instances_results.json \  # load model predictions
  --prev-ann DETECTRON2_DATASETS/imagenet/annotations/imagenet_train_fixsize480_tau0.15_N3.json \  # path to the old annotation file
  --save-path DETECTRON2_DATASETS/imagenet/annotations/cutler_imagenet1k_train_r1.json \  # path to save a new annotation file
  --threshold 0.7

Here we are passing coco_instances_results.json (the model predictions), but are we supposed to pass something else if we are doing custom training on our own dataset? Could you elaborate on what that file is and whether it will be generated when we train?

  • Lastly, let's say that after carrying out a preliminary experiment on N images I want to run the entire Cut-and-Learn pipeline; what is the best way to go about this? Repeat it in another folder, or will the naming convention of the newly created files handle the different runs?

I have some more theoretical doubts as well; let me know if I should add them to this issue or create a separate one. Thanks, and sorry for the extended and (possibly) trivial queries.

GPU memory keeps increasing during training

Hi, thanks for your great work!
But when I train following the README, the GPU memory keeps increasing and then runs out of memory. The commands are as follows:

cd cutler
export DETECTRON2_DATASETS=/path/to/DETECTRON2_DATASETS/
python train_net.py --num-gpus 2
--config-file model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN.yaml

Notebook not working

Running the notebook results in a missing module colormap; installing it results in a missing module easydev; after installing that, I got ImportError: cannot import name 'random_color' from 'colormap' (/usr/local/lib/python3.8/dist-packages/colormap/__init__.py)

I am wondering how you ran it; looking at the torch version, it looks like you may have run it some months ago.

Thank you for the help

Cheers,

Fra

training slow

Hi, thanks for this good work.

I run the training on 8 A100 GPUs and find it will take almost 20 days to finish one round of training. Is that normal? How much time did it take you to train one round?

errors occur when running cutler/demo/demo.py

Hey! Thanks for your work! I came across a problem when trying to run demo.py in the "cutler/demo" folder. I installed detectron2 following the Linux instructions (torch 1.10 / CUDA 11.3) and the detectron2 version is 0.6.
The error below appears when I follow the instructions to run the CutLER demo.py:

Traceback (most recent call last):
File "demo/demo.py", line 23, in
from predictor import VisualizationDemo
File "/root/data/CUTLER_LUKA/CutLER/cutler/demo/predictor.py", line 12, in
from engine.defaults import DefaultPredictor
File "/root/data/CUTLER_LUKA/CutLER/cutler/./engine/init.py", line 7, in
from .defaults import *
File "/root/data/CUTLER_LUKA/CutLER/cutler/./engine/defaults.py", line 41, in
from modeling import build_model
File "/root/data/CUTLER_LUKA/CutLER/cutler/./modeling/init.py", line 3, in
from .roi_heads import (
File "/root/data/CUTLER_LUKA/CutLER/cutler/./modeling/roi_heads/init.py", line 3, in
from .roi_heads import (
File "/root/data/CUTLER_LUKA/CutLER/cutler/./modeling/roi_heads/roi_heads.py", line 25, in
from .fast_rcnn import FastRCNNOutputLayers
File "/root/data/CUTLER_LUKA/CutLER/cutler/./modeling/roi_heads/fast_rcnn.py", line 11, in
from detectron2.data.detection_utils import get_fed_loss_cls_weights
ImportError: cannot import name 'get_fed_loss_cls_weights' from 'detectron2.data.detection_utils' (/opt/conda/envs/CUTLER/lib/python3.8/site-packages/detectron2/data/detection_utils.py)
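A hedged observation: get_fed_loss_cls_weights is not present in the detectron2 0.6 release, so this import error usually means the installed detectron2 is too old. Installing detectron2 from the main branch (the standard install command from the detectron2 docs, shown below) typically resolves it:

python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'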

Compatibility of Cut and Learn (CutLER) Model with Windows

I am interested in using the Cut and Learn (CutLER) Model for my project, but I am uncertain about its compatibility with the Windows operating system. Can anyone confirm if the CutLER Model can run on Windows, and if so, are there any specific steps or considerations I should be aware of?

If the CutLER Model is not compatible with Windows, I would appreciate any recommendations for alternative models with similar capabilities that are known to work seamlessly on the Windows platform. Thank you.

How to deal with the useless masks generated in MaskCut

Hi, thanks for the great work and open-sourcing the code!

I confirmed that when --N is set larger than the number of objects in the demo image, some blank masks or some misdetected masks are generated.

In general, the number of objects present in a single image is not the same across images.

In this case, how did you deal with the masks generated?

How to learn on other datasets

Thanks for the authors' contribution.
We noticed that the Features section mentions that CutLER can learn unsupervised object detectors and instance segmentors solely on ImageNet-1K.
Can CutLER be used on our own datasets, which are not included in ImageNet-1K?

Self-training error

After having generated the json for the first round of training, I get this error. The given snippet asks to provide the train-dataset value for round-1 training, but it seems the script does not take a train-dataset input. Is it "test-dataset" that should be provided instead?

python train_net.py --num-gpus 8
--config-file model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml
--train-dataset imagenet_train_r1
MODEL.WEIGHTS output/model_final.pth \ # load previous stage/round checkpoints
OUTPUT_DIR output/self-train-r1/ # path to save checkpoints

Error:
usage: train_net.py [-h] [--config-file FILE] [--resume] [--eval-only] [--num-gpus NUM_GPUS] [--num-machines NUM_MACHINES] [--machine-rank MACHINE_RANK] [--test-dataset TEST_DATASET]
[--no-segm] [--dist-url DIST_URL]
...
train_net.py: error: unrecognized arguments: --train-dataset

I have used a very, very small subset of ImageNet and have successfully executed everything up to the generation of cutler_imagenet1k_train_r1.json. Please help!

Score for AP calculation

Hi,

Thank you for sharing such a nice work. I would like to ask what is used as the score to compute AP_mask when using only MaskCut, as reported in Table 7.

eigh() got an unexpected keyword argument 'subset_by_index'

Two warnings:
UserWarning: Default upsampling behavior when mode=bicubic is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode)

UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, instead of relying on the computed output size. If you wish to restore the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
"The default behavior for interpolate/upsample with float scale_factor changed "

One error:
TypeError: eigh() got an unexpected keyword argument 'subset_by_index'
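For reference, subset_by_index was added to scipy.linalg.eigh in SciPy 1.5, so this error usually just means the installed SciPy is older; upgrading (pip install -U "scipy>=1.5") should fix it. If upgrading is not an option, a hedged fallback is the older (since-deprecated) eigvals keyword, which selects the same eigenvalue index range:

# Fallback sketch for SciPy < 1.5: select eigenvalue indices 1..2 with the
# older `eigvals` keyword instead of `subset_by_index`.
from scipy.linalg import eigh
_, eigenvectors = eigh(D - A, D, eigvals=(1, 2))
second_smallest_vec = eigenvectors[:, 0]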
