frank-xwang commented on September 12, 2024

Hi, sorry for the late reply. I'll do my best to answer all of your questions, but please let me know if I miss anything.

  1. Registering a COCO-format dataset: Since we're using Detectron2, I recommend the "Use Custom Datasets" tutorial in the Detectron2 documentation for a detailed explanation of how to register custom datasets. You can also follow our approach to registering ImageNet by modifying the "builtin.py" and "builtin_meta.py" files in the "cutler/data/datasets" directory of our GitHub repository.
  2. Would it be easier to just use the ImageNet naming convention? Yes, it would, but I recommend registering a new dataset instead.
  3. The command for running merge_jsons.py: Yes, you should use the one that was generated after running maskcut.py.
  4. Parameters:
     1) The test-dataset should be the entire training set, as we'll be using the model's predictions on the training set as the pseudo-masks for the next stage of self-training.
     2) The MODEL.WEIGHTS parameter should point to the checkpoint obtained from the unsupervised model learning stage.
     3) OUTPUT_DIR specifies the path where the model predictions will be saved. These predictions will be used as the "ground-truth" for the next stage of self-training.
     4) The default name for the model predictions is "coco_instances_results.json", but you can check the files saved under OUTPUT_DIR/inference/ and modify the name accordingly if needed.
  5. Repeating the self-training process multiple times: If you only care about the final results and not the intermediate ones, the easiest approach is to overwrite the results of the previous run. This means always reusing the same file name, such as r1.json or r2.json. However, if you want to keep the results from each run, you'll need to register a "new" dataset for each one. The images will be the same as before, but the annotations will be updated for each run. For example, you could name the updated annotation files "cutler_imagenet1k_train_r3.json" or "cutler_imagenet1k_train_r4.json".
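
As a rough illustration of points 1 and 5, here is a minimal sketch of registering one dataset per self-training round with Detectron2's `register_coco_instances`. This is not the code from the repository (which registers datasets through "builtin.py" and "builtin_meta.py"). The helper names `round_dataset_spec` and `register_round` are hypothetical, and the directory layout (`datasets/imagenet/annotations` and `datasets/imagenet/train`) is an assumption, so adjust it to wherever your images and annotation files actually live.

```python
def round_dataset_spec(round_idx, root="datasets/imagenet"):
    """Build (name, json_file, image_root) for one self-training round.

    The naming follows the "cutler_imagenet1k_train_r3.json" example above;
    the directory layout under `root` is an assumption.
    """
    name = f"cutler_imagenet1k_train_r{round_idx}"
    json_file = f"{root}/annotations/{name}.json"  # annotations change per round
    image_root = f"{root}/train"                   # images are shared by all rounds
    return name, json_file, image_root


def register_round(round_idx, root="datasets/imagenet"):
    """Register the round's COCO-format annotations as a new Detectron2 dataset."""
    # Imported lazily so round_dataset_spec works without Detectron2 installed.
    from detectron2.data.datasets import register_coco_instances

    name, json_file, image_root = round_dataset_spec(round_idx, root)
    # Empty metadata dict; add keys such as thing_classes here if you need them.
    register_coco_instances(name, {}, json_file, image_root)
    return name
```

Calling `register_round(3)` before training would make the name `cutler_imagenet1k_train_r3` available to the training config, assuming the annotation file for that round already exists at the path the helper builds.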

Hope these answers help.
Best,
XuDong


frank-xwang commented on September 12, 2024

Closing it now. Please feel free to reopen it if you have further questions.
