fudan-zvg / semantic-segment-anything
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
License: Apache License 2.0
Hi Team,
I am very interested in semantic segmentation. Is there any way to create an image mask for each segment using the JSON output?
https://replicate.delivery/pbxt/ZNh6er6mvwWfNUSZWdfStS9lTfQl1PnCGkk6QI9zvJWm8XGDB/seg_out.json
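One possible way to get a mask per segment is to decode each annotation's run-length encoding. The sketch below is pure Python and assumes (the post does not confirm this) that the JSON follows the COCO RLE convention used by SAM: an "annotations" list whose entries carry a "segmentation" dict with "size" as [height, width] and "counts" as either a list of run lengths or a compressed string. The helper names are hypothetical; with pycocotools installed, pycocotools.mask.decode should do the same job.

```python
import json


def rle_string_to_counts(encoded):
    """Decode a COCO compressed RLE string into a list of run lengths."""
    counts = []
    pos = 0
    while pos < len(encoded):
        value, shift, more = 0, 0, True
        while more:
            char = ord(encoded[pos]) - 48
            value |= (char & 0x1F) << (5 * shift)
            more = bool(char & 0x20)
            pos += 1
            shift += 1
            if not more and (char & 0x10):
                value |= -1 << (5 * shift)  # sign-extend negative deltas
        if len(counts) > 2:
            value += counts[-2]  # runs after the third are stored as deltas
        counts.append(value)
    return counts


def rle_to_mask(segmentation):
    """Return a mask as a list of rows of 0/1 from a COCO-style RLE dict."""
    height, width = segmentation["size"]
    counts = segmentation["counts"]
    if isinstance(counts, str):
        counts = rle_string_to_counts(counts)
    flat, pixel = [], 0
    for run in counts:  # runs alternate between 0s and 1s, starting with 0s
        flat.extend([pixel] * run)
        pixel = 1 - pixel
    if len(flat) != height * width:
        raise ValueError("run lengths do not cover the image")
    # COCO RLE is column-major, so index by column first.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


def masks_from_json(text):
    """Decode every annotation; accepts a dict with 'annotations' or a bare list."""
    data = json.loads(text)
    anns = data["annotations"] if isinstance(data, dict) else data
    return [rle_to_mask(ann["segmentation"]) for ann in anns]


# Tiny hand-made example (not real SSA output).
example = json.dumps(
    {"annotations": [{"segmentation": {"size": [3, 3], "counts": [3, 2, 4]}}]}
)
masks = masks_from_json(example)
```

Each decoded mask can then be saved as its own image with whatever imaging library is already in the environment.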
Why does my semantic segmentation result clearly contain more regions, while some of them are missing from the SSA result?
Thank you very much for your contribution.
I have installed all the requirements and tried to run it on my own data using the following command in miniconda3 environment on a Linux HPC:
python scripts/main_ssa_engine.py --data_dir=data/ --out_dir=output --world_size=8 --save_img --sam --ckpt_path=ckp/sam_vit_h_4b8939.pth
However I keep getting the following error at import pycocotools.mask as maskUtils
import pycocotools._mask as _mask
ImportError: ssa/lib/python3.8/site-packages/pycocotools/_mask.cpython-38-x86_64-linux-gnu.so: undefined symbol: __intel_sse2_strchr
Can I ask if this is a dependency issue and is there any way to solve it?
Thank you in advance.
Hello,
Is fine-tuning possible on a custom dataset with custom labels? Thank you.
I get an error when running: python -m spacy download en_core_web_sm
ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /../anaconda3/envs/ssa/lib/python3.8/site-packages/spacy/tokens/span_group.cpython-38-x86_64-linux-gnu.so)
Reference: https://github.com/ChaoningZhang/MobileSAM
Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM except for a change to the image encoder; therefore, it is easy to integrate into any project.
MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:
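After processing images, I found that many masks overlap, so that their summed area exceeds the size of the original image. I also do not need such a rich set of class labels. Is it possible to customize which label categories can be annotated?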
Thanks for sharing this great work!
Can we define our own labels or apply transfer learning to this project? I could not figure out how to run this project if our dataset is not from the SA-1B dataset.
For example, I want to apply it to an indoor scene where the labels are mainly furniture. Would simply adding config files under configs work?
Appreciate your help!
The Requirements section lists CUDA 11.1+ as a requirement. Is it possible to run Semantic-Segment-Anything on Apple Silicon hardware (M1, M2)?
This work looks quite valuable! I would like to explore SSA, but I noticed there is no license.
Would you please add a license that helps clarify the terms under which this work can or cannot be used, edited, extended, and/or redistributed?
Great Job!!!
How would one create an environment without Conda, for example with Python from virtualenv, and which libraries are needed?
Hi
Really appreciate this work. Do you plan on releasing model weights/checkpoints for SSA? Would really appreciate it.
Thanks
It seems there are too many masks and labels for some simple images. Is it possible to use a dense CRF to improve the quality of the masks and labels?
Hello, thank you for your work !
I want to use SSA on custom categories and datasets. I saw you mentioned that users can customize the segmentor's architecture and the categories of interest. But can I use my own datasets?
I've made some attempts, but I don't understand what "semantic_branch_processor" is. I tried using one of them directly, but an error is reported: ValueError: You have to specify the task_input. Found None. I guess this is because I didn't set "semantic_branch_processor" correctly.
I would like to know how to set up "semantic_branch_processor" if I want to use my own dataset.
Thank you very much for any reply.
Appendix
command:
python scripts/main_ssa.py --ckpt_path ./ckp/sam_vit_h_4b8939.pth --save_img --world_size 1 --dataset VOC2012 --data_dir /media/guo/DATA/chen/lraspp/data/VOCdevkit/VOC2012/JPEGImages --gt_path /media/guo/DATA/chen/lraspp/data/VOCdevkit/VOC2012/Annotations --out_dir output_VOC2012
Complete error reporting:
Traceback (most recent call last):
  File "scripts/main_ssa_try.py", line 269, in <module>
    main(0, args)
  File "scripts/main_ssa_try.py", line 248, in main
    semantic_segment_anything_inference(file_name, args.out_dir, rank, img=img, save_img=args.save_img,
  File "/media/guo/DATA/chen/SSA/Semantic-Segment-Anything/scripts/pipeline.py", line 168, in semantic_segment_anything_inference
    class_ids = segformer_func(img, semantic_branch_processor, semantic_branch_model, rank)
  File "/media/guo/DATA/chen/SSA/Semantic-Segment-Anything/scripts/segformer.py", line 5, in segformer_segmentation
    inputs = processor(images=image, return_tensors="pt").to(rank)
  File "/home/guo/anaconda3/envs/ssa/lib/python3.8/site-packages/transformers/models/oneformer/processing_oneformer.py", line 112, in __call__
    raise ValueError("You have to specify the task_input. Found None.")
ValueError: You have to specify the task_input. Found None.
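The error is raised by the OneFormer processor in transformers, which expects a task_inputs argument (for example ["semantic"]), whereas the SegFormer-style call in segformer.py passes only images. A minimal sketch of one way to build the call arguments for either processor type; build_processor_kwargs is a hypothetical helper, not part of the repository, and detecting the processor by class name is an assumption made for brevity.

```python
def build_processor_kwargs(processor, image, task="semantic"):
    """Build keyword arguments for a Hugging Face image processor call.

    OneFormer-style processors need `task_inputs`; SegFormer-style ones do not.
    """
    kwargs = {"images": image, "return_tensors": "pt"}
    if type(processor).__name__.startswith("OneFormer"):
        kwargs["task_inputs"] = [task]
    return kwargs


# Intended use at the call site shown in the traceback (not executed here):
#   inputs = processor(**build_processor_kwargs(processor, image)).to(rank)
```

The simpler alternative is to pair a OneFormer processor with the OneFormer segmentation function and a SegFormer processor with the SegFormer one, rather than mixing them.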
Hi, I was wondering whether it is possible to take a single image as input, apply Meta's Segment Anything Model, and then use this tool to get the actual labels for the predicted masks.
Hello!
I love this project and the impressive results! But I would like to use it with just a few lines of Python code. Is anything like that available? I tried to extract the relevant parts of your project scripts but failed.
Ideally I would simply pass the path to a single image and receive an array or a list with labels, coordinates, etc. Is that possible?
Best regards
Marc
Hi! I am new to CV and want to use this model. I am wondering whether it is possible to run it in Google Colab. Specifically, assuming I have a set of images in a folder "images", how could I import the library and use it? Thank you, and I look forward to your reply :)
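I am using main_ssa.py for single-image inference on a single A100 GPU with 40 GB of memory. The Semantic Voting module takes a lot of time: each mask needs almost 1 s to process, which differs from what is reported in the repository. Is this normal?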
Traceback (most recent call last):
  File "/content/Semantic-Segment-Anything/scripts/main_ssa_engine.py", line 5, in <module>
    from pipeline import semantic_annotation_pipeline
  File "/content/Semantic-Segment-Anything/scripts/pipeline.py", line 8, in <module>
    from mmcv.utils import print_log
ImportError: cannot import name 'print_log' from 'mmcv.utils' (/usr/local/lib/python3.10/dist-packages/mmcv/utils/__init__.py)
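This import typically fails on mmcv 2.x, where print_log is no longer exported from mmcv.utils and lives in mmengine.logging instead. A hedged sketch of a compatibility import that could replace the failing line in scripts/pipeline.py; the last fallback is a hypothetical stand-in that just prints.

```python
try:
    from mmcv.utils import print_log  # mmcv 1.x
except ImportError:
    try:
        from mmengine.logging import print_log  # mmcv 2.x keeps logging in mmengine
    except ImportError:
        def print_log(msg, logger=None, level=20):
            """Hypothetical stand-in used when neither package is available."""
            print(msg)
```

Other mmcv and mmdet imports in the pipeline may also have moved between major versions, so installing an mmcv 1.x release that matches the repository's environment file is likely the more robust route.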
Great work. I'm wondering why there isn't a simple inference demo for a single image?
When I saw prediction.py, I ran the command: pip install cog. Is this useful?
For example, can it find all cats by clicking on one of them?
Hi,
Thanks for the amazing code! Is there any chance of running it without a CUDA-capable GPU? Maybe on the CPU (as is possible for SAM)?
Thanks!
python scripts/main.py --data=image_20230408/ --out_dir output --world_size=1 --save_img
Traceback (most recent call last):
  File "scripts/main.py", line 4, in <module>
    from pipeline import semantic_annotation_pipeline
  File "/data2/queenie_2023/Semantic-Segment-Anything/scripts/pipeline.py", line 13, in <module>
    from blip import open_vocabulary_classification_blip
  File "/data2/queenie_2023/Semantic-Segment-Anything/scripts/blip.py", line 3, in <module>
    from utils import get_noun_phrases
  File "/data2/queenie_2023/Semantic-Segment-Anything/scripts/utils.py", line 2, in <module>
    nlp = spacy.load('en_core_web_sm')
  File "/data2/queenie/anaconda3/envs/mmTrans/lib/python3.7/site-packages/spacy/__init__.py", line 60, in load
    config=config,
  File "/data2/queenie/anaconda3/envs/mmTrans/lib/python3.7/site-packages/spacy/util.py", line 449, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
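E050 means spaCy cannot find an installed en_core_web_sm package in the active environment (here a Python 3.7 environment named mmTrans rather than the ssa one). The installation steps quoted elsewhere on this page install it with python -m spacy download en_core_web_sm. A small standard-library sketch for checking this before running the pipeline; the helper name is hypothetical.

```python
import importlib.util


def spacy_model_installed(name="en_core_web_sm"):
    """Return True if the spaCy model package is importable in this environment."""
    return importlib.util.find_spec(name) is not None


if not spacy_model_installed():
    print("Model missing; run: python -m spacy download en_core_web_sm")
```

Running the download command with the same interpreter that runs scripts/main.py matters, because the model is installed as a regular Python package into that interpreter's site-packages.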
Hi All,
Thank you for your amazing work and repo!
I'm trying to run inference with the open-vocabulary model on a random image.
I followed the installation instructions and completed them without any errors, then I tried to run inference as explained in the README file (see the attached photo). The inference completed without errors, only warnings, but when I opened the output path I had passed to main.py, it was empty.
What am I doing wrong?
Cheers,
In scripts/pipeline.py, there is some code like this:
patch_small = mmcv.imcrop(img, np.array(
    [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0] + ann['bbox'][2], ann['bbox'][1] + ann['bbox'][3]]),
    scale=scale_small)
patch_large = mmcv.imcrop(img, np.array(
    [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0] + ann['bbox'][2], ann['bbox'][1] + ann['bbox'][3]]),
    scale=scale_large)
patch_huge = mmcv.imcrop(img, np.array(
    [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0] + ann['bbox'][2], ann['bbox'][1] + ann['bbox'][3]]),
    scale=scale_large)
Should patch_huge use scale=scale_huge instead?
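One way to make this kind of copy-paste slip harder is to build the patch boxes in a loop over named scales, so each scale is written exactly once. The sketch below assumes the bbox is stored as [x, y, width, height], as the indexing in the excerpt implies. The scale values and helper names are hypothetical, and the centre-based scaling only approximates what mmcv.imcrop does with its scale argument (there is no clipping to the image bounds here).

```python
def xywh_to_xyxy(bbox):
    """Convert [x, y, width, height] to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def scaled_box(xyxy, scale):
    """Grow a box around its centre by `scale` (no clipping to image bounds)."""
    x1, y1, x2, y2 = xyxy
    dw = (x2 - x1) * (scale - 1) / 2
    dh = (y2 - y1) * (scale - 1) / 2
    return [x1 - dw, y1 - dh, x2 + dw, y2 + dh]


# Placeholder scale values, not the repository's defaults.
scales = {"small": 1.5, "large": 2.0, "huge": 3.0}

ann = {"bbox": [10, 20, 40, 20]}  # hand-made annotation for illustration
box = xywh_to_xyxy(ann["bbox"])
patch_boxes = {name: scaled_box(box, scale) for name, scale in scales.items()}
```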
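How can this be installed alongside Facebook's Segment Anything?
Why is the SegFormer result lower than what the SegFormer paper reports? Is this due to a difference in evaluation metrics? I believe the paper's mIoU reaches 84%, but the result you reported is 71%. Thanks.
How much GPU memory did you use for inference? I ran out of memory when running inference on a 4090 with 24 GB. What is the minimum number of GPUs, and how much memory per GPU, that should be used?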
Hi, thanks for your work.
I just built the environment as you suggested, as follows:
git clone git@github.com:fudan-zvg/Semantic-Segment-Anything.git
cd Semantic-Segment-Anything
conda env create -f environment.yaml
conda activate ssa
python -m spacy download en_core_web_sm
# install segment-anything
cd ..
git clone git@github.com:facebookresearch/segment-anything.git
cd segment-anything; pip install -e .; cd ../Semantic-Segment-Anything
Then I tried to run python predict.py and it showed the following error:
Traceback (most recent call last):
File "predict.py", line 19, in <module>
from cog import BasePredictor, Input, Path, BaseModel
ModuleNotFoundError: No module named 'cog'
Can you give me some suggestions on how to solve this?
File "/media/admin1/envs/anaconda3/envs/leng_lip/lib/python3.7/site-packages/transformers/models/clipseg/modeling_clipseg.py", line 1458, in forward
conditional_pixel_values=conditional_pixel_values,
File "/media/admin1/envs/anaconda3/envs/leng_lip/lib/python3.7/site-packages/transformers/models/clipseg/modeling_clipseg.py", line 1360, in get_conditional_embeddings
raise ValueError("Make sure to pass as many prompt texts as there are query images")
ValueError: Make sure to pass as many prompt texts as there are query images
It processes over 100 images successfully, and then stops with the error above.
The command used is as follows:
python scripts/main_ssa_engine.py --data_dir=data/UCM_Captions --out_dir=output --world_size=4 --save_img --sam --ckpt_path=../../mydata/sam_vit_h_4b8939.pth --light_mode
For messy generated categories such as "the word '50' in white letters", "three blue plastic rabbits", "three blue plastic snowflakes", "some very pretty blue and black items" and "1 ultra blue", are there any other methods to clean them further?
It looks like SSA makes it possible to compare the performance of SAM against SOTA on popular benchmark datasets. Would you report the validation results of SSA for semantic segmentation or instance segmentation on ADE20K or COCO?
Thank you very much for creating this project & publishing it so soon after the segment-anything release!
I got this error when running SSA inference on a directory of images:
python scripts/main_ssa_engine.py --data_dir=data/examples --out_dir=data/output --save_img --world_size 1
Before running this command, I first ran segment-anything on my directory of images, in "coco_rle" output mode to generate the JSONs:
python scripts/amg.py --checkpoint models/sam_vit_h_4b8939.pth --model-type default --input ../Semantic-Segment-Anything/data/examples --output ../Semantic-Segment-Anything/data/examples/ --convert-to-rle
main_ssa_engine.py threw this error:
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/laurens/anaconda3/envs/ssa/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/home/laurens/git/OSS/Semantic-Segment-Anything/scripts/main_ssa_engine.py", line 46, in main
semantic_annotation_pipeline(file_name, args.data_dir, args.out_dir, rank, save_img=args.save_img,
File "/home/laurens/git/OSS/Semantic-Segment-Anything/scripts/pipeline.py", line 69, in semantic_annotation_pipeline
for ann in anns['annotations']:
TypeError: list indices must be integers or slices, not str
I figured this error is because of this part: https://github.com/fudan-zvg/Semantic-Segment-Anything/blob/main/scripts/pipeline.py#L61-L69
If the pipeline has mask_generator, the segmentations are put in dict with "annotations" key - not sure why.
In my case, I'm not using the mask_generator since the JSONs are already generated, so when I remove the annotations key in the for loop on line 69, it works.
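Instead of editing the loop for one case or the other, the loader could accept both layouts. A minimal sketch, assuming (as the error above suggests) that the pre-generated amg.py files hold a bare JSON list while the pipeline's own path wraps the list under an "annotations" key; load_annotations is a hypothetical helper.

```python
import json


def load_annotations(text):
    """Return the list of annotations from either JSON layout."""
    data = json.loads(text)
    if isinstance(data, dict):
        return data["annotations"]
    return data


# Both layouts yield the same list.
bare = json.dumps([{"bbox": [0, 0, 4, 4]}])
wrapped = json.dumps({"annotations": [{"bbox": [0, 0, 4, 4]}]})
anns_from_bare = load_annotations(bare)
anns_from_wrapped = load_annotations(wrapped)
```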
Can I switch the semantic branch segformer to other semantic segmentation models?
I must say bravo and thank you for doing exactly what I would like to start doing now.
Thank you for sharing a wonderful code!
I want to get the segments of a specific class from the result.
The segmentation results contain multiple classes, but I only want to see the results for one specific class. However, the resulting data (JSON) does not provide the coordinates of the segments or an index for each class, which makes it difficult to access the desired information.
I tried extracting the unique colors of the segments, but the number of unique colors and the number of segments differ, so that approach failed.
For example, I just want the mask of the pixels whose class_name is 'building'.
What can be done?
Thanks
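A possible approach is to filter the annotations by label and merge their masks, rather than working from the colors of the rendered image. The sketch assumes each annotation in the output JSON carries a class_name field, as the post implies, and that each mask has already been decoded to a 2-D array of 0/1 values (for example with pycocotools.mask.decode on its "segmentation" entry). The "mask" key, the helper names and the data below are hypothetical.

```python
def select_by_class(annotations, class_name):
    """Keep only the annotations labelled with the given class name."""
    return [ann for ann in annotations if ann.get("class_name") == class_name]


def union_masks(masks, height, width):
    """Merge binary masks (lists of rows) into a single binary mask."""
    merged = [[0] * width for _ in range(height)]
    for mask in masks:
        for row in range(height):
            for col in range(width):
                if mask[row][col]:
                    merged[row][col] = 1
    return merged


# Hand-made 2x2 example, not real SSA output.
annotations = [
    {"class_name": "building", "mask": [[1, 0], [0, 0]]},
    {"class_name": "sky", "mask": [[0, 1], [0, 0]]},
    {"class_name": "building", "mask": [[0, 0], [0, 1]]},
]
buildings = select_by_class(annotations, "building")
building_mask = union_masks([ann["mask"] for ann in buildings], 2, 2)
```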
Hi, when I tried to run SSA inference, I encountered the following error:
Traceback (most recent call last):
  File "scripts/main_ssa.py", line 122, in <module>
    main(0, args)
  File "scripts/main_ssa.py", line 66, in main
    from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation
  File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
  File "/home/rxu37/.local/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1117, in __getattr__
    value = getattr(module, name)
  File "/home/rxu37/.local/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1116, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/rxu37/.local/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1128, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.segformer.modeling_segformer because of the following error (look up to see its traceback):
No module named 'torch.distributed.algorithms.join'
I installed my environment using the exact provided environment.yaml, so my torch version is 1.9.1+cu111, and my transformers version is 4.27.1. I looked up the PyTorch docs and found that only PyTorch 2.0 has the module torch.distributed.algorithms.join. Thus I upgraded PyTorch to 2.0.1+cu118, but now I get this error:
Traceback (most recent call last):
  File "scripts/main_ssa.py", line 6, in <module>
    from pipeline import semantic_segment_anything_inference, eval_pipeline, img_load
  File "/mnt/d/GitHub/Semantic-Segment-Anything/scripts/pipeline.py", line 8, in <module>
    from mmdet.core.visualization.image import imshow_det_bboxes
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/__init__.py", line 3, in <module>
    from .bbox import * # noqa: F401, F403
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/bbox/__init__.py", line 8, in <module>
    from .samplers import (BaseSampler, CombinedSampler,
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/bbox/samplers/__init__.py", line 12, in <module>
    from .score_hlr_sampler import ScoreHLRSampler
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmdet/core/bbox/samplers/score_hlr_sampler.py", line 3, in <module>
    from mmcv.ops import nms_match
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 2, in <module>
    from .assign_score_withk import assign_score_withk
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmcv/ops/assign_score_withk.py", line 5, in <module>
    ext_module = ext_loader.load_ext(
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "/home/rxu37/anaconda3/envs/ssa/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
I tried to rebuild the mmcv module after upgrading torch, but it does not seem to work. I would greatly appreciate any pointers or help on this issue. Thank you!
Can we get a mask showing only the human body, without all the other elements in the image/video?
Hi, could you please explain the difference between this work and Grounded SAM (https://github.com/IDEA-Research/Grounded-Segment-Anything/tree/main)?
cuda:11.1 torch:1.10 single A6000
Command: python scripts/main_ssa_engine.py --data_dir=/mnt/usb/gxy/dataset_path/images --out_dir=output --world_size=1 --save_img --sam --checkpoint-path=checkpoint-path/sam_vit_h_4b8939.pth
Error: RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 47.54 GiB total capacity; 537.07 MiB already allocated; 12.06 MiB free; 580.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
help!!!help!!!
Thanks for this great work and the open-sourced repo. I would like to know how to visualize the result after inference, as you show at the end of the repo.
I cloned the repo and ran this command:
python scripts/stable_two_stage_multi_segmenter_clip_seg.py --data_dir=data/examples --out_dir=output --world_size=8 --save_img
I got:
python3: can't open file '/content/Semantic-Segment-Anything/scripts/stable_two_stage_multi_segmenter_clip_seg.py': [Errno 2] No such file or directory
When loading the downloaded model, OneFormer fails with: huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name':
oneformer_ade20k_processor = OneFormerProcessor.from_pretrained("shi-labs/oneformer_ade20k_swin_large")
oneformer_ade20k_model = OneFormerForUniversalSegmentation.from_pretrained("shi-labs/oneformer_ade20k_swin_large").to(rank)
How do I switch to a different backbone? I have SegFormer and Segmenter models trained on my custom dataset, which I would like to use. Are there any pointers on how to change the dataset and model?