
semantic-segment-anything's People

Contributors

avivsham, chenxwh, jiaqi-chen-00, lzrobots


semantic-segment-anything's Issues

pycocotools ImportError undefined symbol: __intel_sse2_strchr

Thank you very much for your contribution.

I have installed all the requirements and tried to run it on my own data, using the following command in a miniconda3 environment on a Linux HPC:
python scripts/main_ssa_engine.py --data_dir=data/ --out_dir=output --world_size=8 --save_img --sam --ckpt_path=ckp/sam_vit_h_4b8939.pth

However, I keep getting the following error at import pycocotools.mask as maskUtils:
import pycocotools._mask as _mask
ImportError: ssa/lib/python3.8/site-packages/pycocotools/_mask.cpython-38-x86_64-linux-gnu.so: undefined symbol: __intel_sse2_strchr

Can I ask if this is a dependency issue and is there any way to solve it?

Thank you in advance.
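A hedged workaround, assuming the prebuilt _mask wheel was compiled with Intel's compiler, whose runtime libraries are missing on the HPC node: reinstall pycocotools and force a from-source build so it compiles with the local toolchain:

pip uninstall -y pycocotools
pip install --no-binary pycocotools pycocotools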

Fine-tune the model

Hello,
Is fine-tuning possible on a custom dataset with custom labels? Thank you

Suggestion - Integrate MobileSAM into the pipeline for lightweight and faster inference

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline, except for a change to the image encoder; it is therefore easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:

[comparison images]

Can I adjust the labels myself?

After processing images, I found that many masks overlap: their combined area exceeds the size of the original image. I also don't need such a rich set of classification labels, so can I modify the set of label categories to be annotated according to my own needs?

Is it possible we can define our own labels?

Thanks for sharing this great work!

Can we define our own labels or apply transfer learning to this project? I could not figure out how to run this project if our dataset is not from the SA-1B dataset.
For example, I want to apply it to an indoor scene where the labels are mainly furniture. Would simply adding config files under configs work?

Appreciate your help!

Apple Silicon Support

The requirements section lists CUDA 11.1+ as a requirement. Is it possible to run Semantic-Segment-Anything on Apple Silicon hardware (M1, M2)?
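Not an authoritative answer, but if the scripts were ported off CUDA, the usual PyTorch device-selection pattern is below; whether the MPS backend covers every op used by SAM and the segmentation branches is an open question:

import torch

# Prefer Apple's Metal (MPS) backend when available, otherwise fall back to CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
# every model in the pipeline would then need .to(device) instead of .to(rank)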

Repository license needed

This work looks quite valuable! I would like to explore SSA, but I noticed there is no license.

Would you please add a license that helps clarify the terms under which this work can or cannot be used, edited, extended, and/or redistributed?

About env

Great job!
How about creating an environment without using Conda, for example with a Python virtualenv? Which libraries are needed? A sketch follows below.
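A hedged virtualenv sketch; the package list below is inferred from the imports seen in these issues (not from environment.yaml itself), so check versions against that file:

python -m venv ssa-env
source ssa-env/bin/activate
pip install torch transformers mmcv-full mmdet spacy pycocotools
pip install git+https://github.com/facebookresearch/segment-anything.git
python -m spacy download en_core_web_sm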

Release model weights

Hi

Really appreciate this work. Do you plan on releasing model weights/checkpoints for SSA? Would really appreciate it.

Thanks

Mask/Label quality

It seems there are too many masks and labels for some simple images. Is it possible to use dense CRF to improve mask/label quality?
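SSA itself does not ship a CRF step, so this would be custom post-processing. A minimal sketch with pydensecrf, with illustrative (untuned) hyperparameters:

import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def refine_with_crf(image, probs, n_iters=5):
    # image: HxWx3 uint8 RGB array; probs: CxHxW per-class softmax scores
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))
    d.addPairwiseGaussian(sxy=3, compat=3)  # spatial smoothness term
    d.addPairwiseBilateral(sxy=80, srgb=13, rgbim=np.ascontiguousarray(image), compat=10)  # appearance term
    q = np.array(d.inference(n_iters)).reshape(c, h, w)
    return q.argmax(axis=0)  # refined per-pixel label map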

Can I use my own dataset

Hello, thank you for your work!

I want to use SSA on custom categories and datasets. I saw you mentioned that users can customize the segmentor's architecture and the categories of interest, but can I use my own datasets?
I've made some attempts, but I don't understand what "semantic_branch_processor" is. I tried to use one directly, but an error is reported: ValueError: You have to specify the task_input. Found None. I guess it has something to do with the fact that I didn't set "semantic_branch_processor" correctly.
I want to know how to set up "semantic_branch_processor" if I want to use my own dataset.

Thank you very much for any reply.

Appendix
command:
python scripts/main_ssa.py --ckpt_path ./ckp/sam_vit_h_4b8939.pth --save_img --world_size 1 --dataset VOC2012 --data_dir /media/guo/DATA/chen/lraspp/data/VOCdevkit/VOC2012/JPEGImages --gt_path /media/guo/DATA/chen/lraspp/data/VOCdevkit/VOC2012/Annotations --out_dir output_VOC2012
Complete error report:
Traceback (most recent call last):
  File "scripts/main_ssa_try.py", line 269, in <module>
    main(0, args)
  File "scripts/main_ssa_try.py", line 248, in main
    semantic_segment_anything_inference(file_name, args.out_dir, rank, img=img, save_img=args.save_img,
  File "/media/guo/DATA/chen/SSA/Semantic-Segment-Anything/scripts/pipeline.py", line 168, in semantic_segment_anything_inference
    class_ids = segformer_func(img, semantic_branch_processor, semantic_branch_model, rank)
  File "/media/guo/DATA/chen/SSA/Semantic-Segment-Anything/scripts/segformer.py", line 5, in segformer_segmentation
    inputs = processor(images=image, return_tensors="pt").to(rank)
  File "/home/guo/anaconda3/envs/ssa/lib/python3.8/site-packages/transformers/models/oneformer/processing_oneformer.py", line 112, in __call__
    raise ValueError("You have to specify the task_input. Found None.")
ValueError: You have to specify the task_input. Found None.
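If an OneFormer processor/model pair ends up on the SegFormer code path, the processor call above is missing its required task prompt. A hedged fix (assuming OneFormer is really the intended semantic branch) is to pass the task input where the processor is called in scripts/segformer.py:

inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt").to(rank)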

Apply SSA on a single Image

Hi, I was wondering if it is possible to take a single image as input, apply the Segment Anything Model from Meta, and then use this tool to get the actual labels for the predicted masks.
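For the mask half this is just the official segment-anything API; a minimal sketch (how to feed the resulting masks into SSA's labeling step is the part these scripts wrap, and is not reproduced here):

import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Generate class-agnostic masks for a single image with SAM
sam = sam_model_registry["vit_h"](checkpoint="ckp/sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # list of dicts: 'segmentation', 'bbox', 'area', ...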

Simple Python code available?

Hello!
I love this project and the impressive results! But I would like to use it simply, with a few lines of Python code. Is there anything available? I tried to extract the relevant parts of your project scripts but failed.
Simply giving the path to a single image and in return receiving an array or a list with labels, coordinates, etc. Is that possible?

Best regards
Marc

Is it possible to be used in Google Colab?

Hi! I am new to CV and want to use this model. I am wondering if it is possible to run it in Google Colab. Specifically, assuming that I have a bunch of images in a folder "images", how could I import the library and use it? Thank you, and I look forward to your reply :)
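A hedged Colab sketch; replacing the conda environment with plain pip installs is an assumption and may need manual dependency fixes:

!git clone https://github.com/fudan-zvg/Semantic-Segment-Anything.git
%cd Semantic-Segment-Anything
!pip install git+https://github.com/facebookresearch/segment-anything.git
!python -m spacy download en_core_web_sm
!wget -P ckp/ https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
!python scripts/main_ssa_engine.py --data_dir=/content/images --out_dir=output --world_size=1 --save_img --sam --ckpt_path=ckp/sam_vit_h_4b8939.pth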

Inference is very slow

I used main_ssa.py for single-image inference on a single A100 GPU with 40 GB of memory, and found that the Semantic Voting module takes a long time: almost 1 s per mask, which differs from what is reported in the repo. Is this normal?

cannot import print_log from mmcv.utils using Google Colab

Traceback (most recent call last):
  File "/content/Semantic-Segment-Anything/scripts/main_ssa_engine.py", line 5, in <module>
    from pipeline import semantic_annotation_pipeline
  File "/content/Semantic-Segment-Anything/scripts/pipeline.py", line 8, in <module>
    from mmcv.utils import print_log
ImportError: cannot import name 'print_log' from 'mmcv.utils' (/usr/local/lib/python3.10/dist-packages/mmcv/utils/__init__.py)
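A likely cause, though this is an inference rather than a confirmed answer: Colab resolves mmcv to the 2.x series, which removed mmcv.utils; print_log now lives in mmengine. Either pin the older API or patch the import:

pip install "mmcv<2.0"
# or, editing scripts/pipeline.py:
from mmengine.logging import print_log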

Inference demo

Great work! I'm wondering why not just provide a simple inference demo for a single image?

Run without GPU

Hi,
thanks for the amazing code! Is there any chance of getting it running without a CUDA-capable GPU? Maybe on the CPU (as with SAM)?
Thanks!
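An untested sketch: SAM itself loads and runs on the CPU, so the main blockers are the places where these scripts hard-code a CUDA rank; those would need to receive "cpu" instead:

from segment_anything import sam_model_registry

# Load SAM on the CPU; the transformers models accept .to("cpu") the same way
sam = sam_model_registry["vit_h"](checkpoint="ckp/sam_vit_h_4b8939.pth")
sam.to("cpu")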

Can't find model 'en_core_web_sm'

python scripts/main.py --data=image_20230408/ --out_dir output --world_size=1 --save_img
Traceback (most recent call last):
  File "scripts/main.py", line 4, in <module>
    from pipeline import semantic_annotation_pipeline
  File "/data2/queenie_2023/Semantic-Segment-Anything/scripts/pipeline.py", line 13, in <module>
    from blip import open_vocabulary_classification_blip
  File "/data2/queenie_2023/Semantic-Segment-Anything/scripts/blip.py", line 3, in <module>
    from utils import get_noun_phrases
  File "/data2/queenie_2023/Semantic-Segment-Anything/scripts/utils.py", line 2, in <module>
    nlp = spacy.load('en_core_web_sm')
  File "/data2/queenie/anaconda3/envs/mmTrans/lib/python3.7/site-packages/spacy/__init__.py", line 60, in load
    config=config,
  File "/data2/queenie/anaconda3/envs/mmTrans/lib/python3.7/site-packages/spacy/util.py", line 449, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
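The model simply isn't installed in the mmTrans environment being used. The usual fix, which is also one of the repo's own install steps, is:

python -m spacy download en_core_web_sm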

Don't get any output after inference

Hi All,
Thank you for your amazing work and repo!
I'm trying to run inference with the open-vocabulary model on a random image.
I followed the installation instructions and completed them without any errors, then I tried to run inference as explained in the README file (see the attached screenshot). The inference completed without errors, just warnings, but the output path I provided when calling main.py was empty.
What am I doing wrong?
[screenshot]

Cheers,

Bug in `scripts/pipeline.py`

In scripts/pipeline.py, there is some code like this:

patch_small = mmcv.imcrop(img, np.array(
    [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0] + ann['bbox'][2], ann['bbox'][1] + ann['bbox'][3]]),
    scale=scale_small)
patch_large = mmcv.imcrop(img, np.array(
    [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0] + ann['bbox'][2], ann['bbox'][1] + ann['bbox'][3]]),
    scale=scale_large)
patch_huge = mmcv.imcrop(img, np.array(
    [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0] + ann['bbox'][2], ann['bbox'][1] + ann['bbox'][3]]),
    scale=scale_large)

Shouldn't patch_huge use scale=scale_huge?
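The reporter's implied fix, assuming a scale_huge variable exists alongside scale_small and scale_large:

patch_huge = mmcv.imcrop(img, np.array(
    [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0] + ann['bbox'][2], ann['bbox'][1] + ann['bbox'][3]]),
    scale=scale_huge)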

GPU memory usage

How much memory did the GPUs you used for inference have? I ran out of memory when running inference on a 24 GB RTX 4090. At minimum, how many GPUs, and with how much memory each, are needed?

ModuleNotFoundError: No module named 'cog'

Hi, thanks for your work.
I just built the environment as you suggested, as follows:

git clone git@github.com:fudan-zvg/Semantic-Segment-Anything.git
cd Semantic-Segment-Anything
conda env create -f environment.yaml
conda activate ssa
python -m spacy download en_core_web_sm
# install segment-anything
cd ..
git clone git@github.com:facebookresearch/segment-anything.git
cd segment-anything; pip install -e .; cd ../Semantic-Segment-Anything

Then I just tried to run python predict.py and it shows the following error:

Traceback (most recent call last):
  File "predict.py", line 19, in <module>
    from cog import BasePredictor, Input, Path, BaseModel
ModuleNotFoundError: No module named 'cog'

Can you give me some suggestions on how to solve this?
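predict.py appears to be a Replicate Cog entry point (an inference from the import, not something documented here), so either install that package or use the regular scripts (scripts/main_ssa_engine.py, scripts/main_ssa.py) for local inference:

pip install cog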

Error after running for a while

  File "/media/admin1/envs/anaconda3/envs/leng_lip/lib/python3.7/site-packages/transformers/models/clipseg/modeling_clipseg.py", line 1458, in forward
    conditional_pixel_values=conditional_pixel_values,
  File "/media/admin1/envs/anaconda3/envs/leng_lip/lib/python3.7/site-packages/transformers/models/clipseg/modeling_clipseg.py", line 1360, in get_conditional_embeddings
    raise ValueError("Make sure to pass as many prompt texts as there are query images")
ValueError: Make sure to pass as many prompt texts as there are query images

It successfully processes 100+ images, then stops with this error.
The command used is as follows:

python scripts/main_ssa_engine.py --data_dir=data/UCM_Captions --out_dir=output --world_size=4 --save_img --sam --ckpt_path=../../mydata/sam_vit_h_4b8939.pth --light_mode
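A guess rather than a confirmed diagnosis: for some image, the open-vocabulary branch may produce an empty list of candidate labels, so CLIPSeg receives query images with zero text prompts. A defensive guard in the pipeline (the variable name class_proposals is hypothetical) would be:

# hypothetical name: the per-mask list of candidate label strings
if len(class_proposals) == 0:
    continue  # skip masks with no candidate labels instead of crashing CLIPSeg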

messy category generation data

For the messy generated category data, such as "the word '50' in white letters", "three blue plastic rabbits", "three blue plastic snowflakes", "some very pretty blue and black items", and "1 ultra blue": are there any other methods to clean them further?

evaluation results available?

Looks like, with SSA, it becomes possible to compare the performance of SAM against the SOTA on popular benchmark datasets. Would you report the validation results of SSA for semantic segmentation or instance segmentation on ADE20K or COCO?

TypeError: list indices must be integers or slices, not str

Thank you very much for creating this project & publishing it so soon after the segment-anything release!

I got this error when running SSA inference on a directory of images:

python scripts/main_ssa_engine.py --data_dir=data/examples --out_dir=data/output --save_img --world_size 1

Before running this command, I first ran segment-anything on my directory of images, in "coco_rle" output mode to generate the JSONs:

python scripts/amg.py --checkpoint models/sam_vit_h_4b8939.pth --model-type default --input ../Semantic-Segment-Anything/data/examples --output ../Semantic-Segment-Anything/data/examples/ --convert-to-rle

main_ssa_engine.py threw this error:

torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/laurens/anaconda3/envs/ssa/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/laurens/git/OSS/Semantic-Segment-Anything/scripts/main_ssa_engine.py", line 46, in main
    semantic_annotation_pipeline(file_name, args.data_dir, args.out_dir, rank, save_img=args.save_img,
  File "/home/laurens/git/OSS/Semantic-Segment-Anything/scripts/pipeline.py", line 69, in semantic_annotation_pipeline
    for ann in anns['annotations']:
TypeError: list indices must be integers or slices, not str

I figured this error is caused by this part: https://github.com/fudan-zvg/Semantic-Segment-Anything/blob/main/scripts/pipeline.py#L61-L69

If the pipeline has a mask_generator, the segmentations are put in a dict under the "annotations" key; I'm not sure why.

In my case, I'm not using the mask_generator since the JSONs are already generated, so when I remove the 'annotations' key lookup in the for loop on line 69, it works.

Thank you

I must say bravo and thank you for doing exactly what I would like to start doing now.
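A compatibility guard based on the reporter's observation (it matches the dict-vs-list shapes described above):

# 'anns' is a dict with an 'annotations' key when masks come from mask_generator,
# but a plain list when pre-generated JSONs are loaded from disk
ann_list = anns['annotations'] if isinstance(anns, dict) else anns
for ann in ann_list:
    ...  # unchanged per-annotation processing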

Get each segment from the output of semantic segmentation

Thank you for sharing this wonderful code!

I want to get the segments of a specific class from the result.

The segmentation results contain multiple classes, but I only want to see the results for a specific class. However, the resulting data (JSON) does not provide the coordinates of the segments or an index for each class, making it difficult to access the desired information.

I tried extracting the unique colors of the segments, but the number of unique colors and the number of segments differ, so that failed.

For example, I just want the mask whose pixels have class_name 'building'.

What can be done?
Thanks
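A hedged sketch for the COCO-RLE output case; the key names 'class_name' and 'segmentation' are assumptions about SSA's output JSON, so verify them against an actual output file first:

import json
import numpy as np
from pycocotools import mask as maskUtils

with open("output/example_semantic.json") as f:
    anns = json.load(f)

# Union of all masks whose predicted class is 'building'
building_masks = [maskUtils.decode(a['segmentation'])
                  for a in anns if a.get('class_name') == 'building']
if building_masks:
    building = np.logical_or.reduce(building_masks)  # HxW boolean mask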

torch and transformers version mismatch error

Hi, when I tried to run SSA inference, I met the following error:

Traceback (most recent call last):
  File "scripts/main_ssa.py", line 122, in <module>
    main(0, args)
  File "scripts/main_ssa.py", line 66, in main
    from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation
  File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
  File "/home/rxu37/.local/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1117, in __getattr__
    value = getattr(module, name)
  File "/home/rxu37/.local/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1116, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/rxu37/.local/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1128, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.segformer.modeling_segformer because of the following error (look up to see its traceback):
No module named 'torch.distributed.algorithms.join'

I installed my environment using the exact provided environment.yaml, so my torch version is 1.9.1+cu111 and my transformers version is 4.27.1. I looked up the PyTorch docs and found that only PyTorch 2.0 has the module torch.distributed.algorithms.join. Thus I upgraded PyTorch to 2.0.1+cu118, but now I get this error:

Traceback (most recent call last):
  File "scripts/main_ssa.py", line 6, in <module>
    from pipeline import semantic_segment_anything_inference, eval_pipeline, img_load
  File "/mnt/d/GitHub/Semantic-Segment-Anything/scripts/pipeline.py", line 8, in <module>
    from mmdet.core.visualization.image import imshow_det_bboxes
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/__init__.py", line 3, in <module>
    from .bbox import *  # noqa: F401, F403
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/bbox/__init__.py", line 8, in <module>
    from .samplers import (BaseSampler, CombinedSampler,
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/bbox/samplers/__init__.py", line 12, in <module>
    from .score_hlr_sampler import ScoreHLRSampler
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/bbox/samplers/score_hlr_sampler.py", line 3, in <module>
    from mmcv.ops import nms_match
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 2, in <module>
    from .assign_score_withk import assign_score_withk
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmcv/ops/assign_score_withk.py", line 5, in <module>
    ext_module = ext_loader.load_ext(
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory

I have tried to rebuild the mmcv module after upgrading torch, but it does not seem to work. I would greatly appreciate any pointers/help on this issue. Thank you!
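One hedged suggestion: mmcv-full ships CUDA extensions compiled against a specific torch/CUDA pair, so after upgrading torch it has to be reinstalled from the matching wheel index (cu118/torch2.0 below matches the versions you report, but double-check against OpenMMLab's install table):

pip uninstall -y mmcv-full
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.0/index.html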

Human class

Can we get a mask showing only the human body, without all the other elements in the image/video?

out of memory!!!

CUDA 11.1, torch 1.10, single A6000
Command: python scripts/main_ssa_engine.py --data_dir=/mnt/usb/gxy/dataset_path/images --out_dir=output --world_size=1 --save_img --sam --checkpoint-path=checkpoint-path/sam_vit_h_4b8939.pth
Error: RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 47.54 GiB total capacity; 537.07 MiB already allocated; 12.06 MiB free; 580.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
help!!!help!!!
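Two hedged things to try. The reported numbers (537 MiB allocated on a 48 GiB card) suggest fragmentation or a competing process rather than a genuinely undersized GPU, and the error itself points at the allocator knob; --light_mode appears as a flag in other commands in these issues. Also note that the other commands spell the checkpoint flag --ckpt_path, not --checkpoint-path:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python scripts/main_ssa_engine.py --data_dir=/mnt/usb/gxy/dataset_path/images --out_dir=output --world_size=1 --save_img --sam --ckpt_path=checkpoint-path/sam_vit_h_4b8939.pth --light_mode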

About the visualization.

Thanks for this great work and the open-sourced repo. I want to know how to visualize the results after inference, like the examples shown at the end of the repo.

missing stable_two_stage_multi_segmenter_clip_seg.py file in scripts

I cloned the repo and ran this command:
python scripts/stable_two_stage_multi_segmenter_clip_seg.py --data_dir=data/examples --out_dir=output --world_size=8 --save_img
I got:
python3: can't open file '/content/Semantic-Segment-Anything/scripts/stable_two_stage_multi_segmenter_clip_seg.py': [Errno 2] No such file or directory

OneFormerProcessor

When loading the downloaded model, OneFormer errors out with huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name':

oneformer_ade20k_processor = OneFormerProcessor.from_pretrained("shi-labs/oneformer_ade20k_swin_large")
oneformer_ade20k_model = OneFormerForUniversalSegmentation.from_pretrained("shi-labs/oneformer_ade20k_swin_large").to(rank)
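A hedged workaround: from_pretrained also accepts a local directory, so if the model was downloaded ahead of time, point both calls at the folder containing the config and weights (the path below is hypothetical):

oneformer_ade20k_processor = OneFormerProcessor.from_pretrained("/path/to/oneformer_ade20k_swin_large")
oneformer_ade20k_model = OneFormerForUniversalSegmentation.from_pretrained("/path/to/oneformer_ade20k_swin_large").to(rank)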
