Hi Folks, excellent read and amazing work! I've been trying to run the CutLER on my da

Regarding running CutLER on Custom dataset

facebookresearch commented on September 12, 2024

Regarding running CutLER on Custom dataset

Comments (2)

frank-xwang commented on September 12, 2024

Hi, sorry for the late reply. I'll do my best to answer all of your questions, but please let me know if I miss anything.

Registering a COCO format dataset: Since we're using Detectron2, I recommend checking out the "Use Custom Datasets" tutorial in the Detectron2 documentation for a detailed explanation on how to register custom datasets. You can also follow our approach to registering ImageNet by modifying the "builtin.py" and "builtin_meta.py" files in the "cutler/data/datasets" directory of our GitHub repository.
Would it be easier to just use the naming convention of ImageNet? Yes, it would be, but I recommend registering a new dataset instead.
The command to run merge_jsons.py. Yes, you should use the one that was generated after running maskcut.py.
Parameters. 1) The test-dataset should be the entire training set, as we'll be using the model's predictions on the training set as the pseudo-masks for the next stage of self-training. 2) The MODEL.WEIGHTS parameter should point to the checkpoint obtained from the unsupervised model learning stage. 3) OUTPUT_DIR specifies the path where the model predictions will be saved. These predictions will be used as the "ground-truth" for the next stage of self-training. 4) The default name for the model predictions is "coco_instances_results.json", but you can check the files saved under OUTPUT_DIR/inference/ and modify the name accordingly if needed.
Repeat the self-training process multiple times. If you only care about the final results and not the intermediate ones, the easiest approach is to overwrite the results of the previous runs. This means that you should always use the same file name, such as r1.json or r2.json. However, if you want to keep track of the results from each run, you'll need to register the "new" dataset. The images will be the same as before, but the annotations will be updated for each run. For example, you could name the updated datasets "cutler_imagenet1k_train_r3.json" or "cutler_imagenet1k_train_r4.json".
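For the dataset registration mentioned above, a minimal sketch could look like the following. It assumes Detectron2 is installed and uses its `register_coco_instances` helper. The dataset name pattern, the directory layout, and the `round_spec` and `register_round` helpers are hypothetical, and this is not the logic in CutLER's own `builtin.py`.

```python
import os


def round_spec(round_idx, ann_dir="annotations", image_root="images/train"):
    """Return (dataset_name, json_path, image_root) for one self-training round.

    The naming pattern and directory layout are assumptions for illustration.
    """
    name = f"my_dataset_train_r{round_idx}"
    json_path = os.path.join(ann_dir, f"{name}.json")
    return name, json_path, image_root


def register_round(round_idx, **kwargs):
    """Register one round's annotation file as a new Detectron2 dataset."""
    # Imported lazily so the naming helper works without Detectron2 installed.
    from detectron2.data.datasets import register_coco_instances

    name, json_path, image_root = round_spec(round_idx, **kwargs)
    # The images are the same every round; only the annotation file changes.
    register_coco_instances(name, {}, json_path, image_root)
    return name
```

Calling `register_round(3)` before building the config would make the hypothetical name `my_dataset_train_r3` available wherever a dataset name is expected, for example in `DATASETS.TRAIN`.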
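To illustrate how saved predictions could become the "ground-truth" for the next round, here is a rough sketch that turns a Detectron2-style results list (as in "coco_instances_results.json") into a COCO-format annotation dict. It is not the repository's own conversion logic; the function names, the 0.5 score threshold, the single "fg" category, and the use of box area in place of mask area are all assumptions made for illustration.

```python
import json


def predictions_to_annotations(images, predictions, score_thresh=0.5):
    """Build a COCO-style dict from image records and detection results."""
    valid_ids = {img["id"] for img in images}
    annotations = []
    for pred in predictions:
        # Skip low-confidence predictions and predictions for unknown images.
        if pred["score"] < score_thresh or pred["image_id"] not in valid_ids:
            continue
        x, y, w, h = pred["bbox"]  # COCO results use [x, y, width, height]
        ann = {
            "id": len(annotations) + 1,
            "image_id": pred["image_id"],
            "category_id": 1,  # assumption: one class-agnostic category
            "bbox": [x, y, w, h],
            "area": w * h,  # assumption: box area as a stand-in for mask area
            "iscrowd": 0,
        }
        if "segmentation" in pred:
            ann["segmentation"] = pred["segmentation"]
        annotations.append(ann)
    return {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": 1, "name": "fg"}],
    }


def write_next_round(ann_in, pred_in, ann_out, score_thresh=0.5):
    """Read the current annotations and predictions, write the next round's file."""
    with open(ann_in) as f:
        images = json.load(f)["images"]
    with open(pred_in) as f:
        predictions = json.load(f)
    result = predictions_to_annotations(images, predictions, score_thresh)
    with open(ann_out, "w") as f:
        json.dump(result, f)
```

Writing to the same output path on every round overwrites the previous pseudo-masks, while writing to a new path per round keeps the intermediate results and requires registering each file as a new dataset.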

Hope these answers help.
Best,
XuDong

frank-xwang commented on September 12, 2024

Closing it now. Please feel free to reopen it if you have further questions.
