
sam-tool's Introduction

SAM-Labelimg

Fast annotation with the Segment Anything (SAM) model

1. Download the projects

Project 1: https://github.com/zhouayi/SAM-Tool

Project 2: https://github.com/facebookresearch/segment-anything

git clone https://github.com/zhouayi/SAM-Tool.git

git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
pip install -e .

Download the SAM model: https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

2. Place your data under a path of the form <dataset_path>/images/*, and create an empty folder <dataset_path>/embeddings
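The layout above can also be created from Python. This is a minimal sketch, not part of SAM-Tool; the function name prepare_dataset_dirs is hypothetical, and it only creates the two folders named in this step.

```python
from pathlib import Path


def prepare_dataset_dirs(dataset_path):
    """Create <dataset_path>/images and <dataset_path>/embeddings if missing."""
    root = Path(dataset_path)
    images = root / "images"
    embeddings = root / "embeddings"
    # exist_ok=True makes the call safe to repeat on an existing dataset.
    images.mkdir(parents=True, exist_ok=True)
    embeddings.mkdir(parents=True, exist_ok=True)
    return images, embeddings
```

Calling prepare_dataset_dirs("my_dataset") returns the two folder paths; copy your images into the first one before extracting embeddings.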

3. Copy the helpers folder from project 1 into the root directory of project 2

3.1 Run extract_embeddings.py to extract the image embeddings
# cd to the root directory of project 2
python helpers\extract_embeddings.py --checkpoint-path sam_vit_h_4b8939.pth --dataset-folder <dataset_path> --device cpu
  • checkpoint-path: path to the SAM model downloaded above
  • dataset-folder: path to the data
  • device: defaults to cuda; without a GPU, cpu also works, but it is considerably slower

When the script finishes, the corresponding npy files are generated under <dataset_path>/embeddings
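Before moving on, it can help to confirm that every image received an embedding. The sketch below is not part of SAM-Tool. It assumes each embedding is saved under the image's base name with an .npy extension, and the set of image extensions is also an assumption; adjust both if your files differ.

```python
from pathlib import Path

# Assumed set of image file types; extend it to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def missing_embeddings(dataset_path):
    """Return names of images that have no matching .npy file yet."""
    root = Path(dataset_path)
    images = sorted(
        p for p in (root / "images").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    # Assumption: an embedding shares the image's base name (stem).
    done = {p.stem for p in (root / "embeddings").glob("*.npy")}
    return [p.name for p in images if p.stem not in done]
```

An empty list means every image has a counterpart in the embeddings folder.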

3.2 Run generate_onnx.py to convert the pth file into an ONNX model file
# cd to the root directory of project 2
python helpers\generate_onnx.py --checkpoint-path sam_vit_h_4b8939.pth --onnx-model-path ./sam_onnx.onnx --orig-im-size 1080 1920
  • checkpoint-path: the same SAM model path as above
  • onnx-model-path: where to save the resulting ONNX model
  • orig-im-size: size of the images in the dataset (height, width)

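Note: the ONNX model produced by the provided code does not support dynamic input sizes. If the images in your dataset differ in size, one option is to export a separate ONNX model for each size, using a different orig-im-size value each time, and use the matching model later.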
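For a dataset with mixed image sizes, the per-size export described in the note can be planned with a small helper. This is a sketch under stated assumptions: the functions group_by_size and export_commands are hypothetical, the output file name pattern sam_onnx_<height>x<width>.onnx is invented for illustration, and the image sizes are passed in as a dictionary that you would fill using whatever image library you already have. Only the command-line flags come from the step above.

```python
from collections import defaultdict


def group_by_size(sizes):
    """Group image names by (height, width).

    `sizes` maps image name -> (height, width).
    """
    groups = defaultdict(list)
    for name, (height, width) in sizes.items():
        groups[(height, width)].append(name)
    return dict(groups)


def export_commands(sizes, checkpoint="sam_vit_h_4b8939.pth"):
    """Build one generate_onnx.py command line per distinct image size."""
    commands = []
    for height, width in sorted(group_by_size(sizes)):
        # The output file name pattern is an assumption made for this sketch.
        onnx_path = f"sam_onnx_{height}x{width}.onnx"
        commands.append(
            f"python helpers/generate_onnx.py --checkpoint-path {checkpoint} "
            f"--onnx-model-path {onnx_path} --orig-im-size {height} {width}"
        )
    return commands
```

Each returned string is one command to run from the root directory of project 2; when annotating, pass the model that matches the size of the images being labelled.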

4. Copy the generated sam_onnx.onnx model into the root directory of project 1, then run segment_anything_annotator.py to annotate

# cd to the root directory of project 1
python segment_anything_annotator.py --onnx-model-path sam_onnx.onnx --dataset-path <dataset_path> --categories cat,dog
  • onnx-model-path: path to the exported ONNX model
  • dataset-path: path to the data
  • categories: the dataset's categories (separated by commas, with no spaces)
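The comma-separated, no-spaces rule for categories is easy to get wrong on the command line. The following sketch shows one way to check a value before launching the annotator. The function parse_categories is hypothetical and is not how the tool itself parses the argument; it only rejects empty entries and whitespace around names.

```python
def parse_categories(value):
    """Split a --categories value such as "cat,dog" into a list of names."""
    names = value.split(",")
    for name in names:
        # Reject empty entries ("cat,,dog") and padded names ("cat, dog").
        if not name or name != name.strip():
            raise ValueError(
                f"bad category {name!r}: no spaces or empty entries allowed"
            )
    return names
```

For example, parse_categories("cat,dog") returns the two names, while parse_categories("cat, dog") raises a ValueError.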

Left-click on an object to add a mask at that position; right-click to remove the mask at that position.

Other keyboard shortcuts:

Esc: quit the app    a: previous image    d: next image
k: lower the transparency    l: raise the transparency    n: add an object
r: reset    Ctrl+s: save


The resulting annotation file is in COCO format and is saved at <dataset_path>/annotations.json
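A quick way to sanity-check the saved file is to count annotations per category. This is a minimal sketch, assuming the file follows the usual COCO layout with top-level "images", "annotations", and "categories" keys, and that each category has "id" and "name" fields; the function summarize_annotations is hypothetical.

```python
import json
from collections import Counter


def summarize_annotations(path):
    """Return (number of images, {category name: annotation count})."""
    with open(path, encoding="utf-8") as f:
        coco = json.load(f)
    # Map category ids to names; unknown ids are reported as "unknown".
    id_to_name = {c["id"]: c["name"] for c in coco.get("categories", [])}
    counts = Counter(
        id_to_name.get(a["category_id"], "unknown")
        for a in coco.get("annotations", [])
    )
    return len(coco.get("images", [])), dict(counts)
```

Run it on <dataset_path>/annotations.json after saving with Ctrl+s; a category that never appears in the result may indicate objects that were missed.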

5. Check the annotation results

python cocoviewer.py -i <dataset_path> -a <dataset_path>\annotations.json


6. Miscellaneous

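  • Where to change the line width of the bounding box
# salt/display_utils.py
class DisplayUtils:
    def __init__(self):
        self.transparency = 0.65 # default mask transparency
        self.box_width = 2 # default bounding-box line width
  • Where to change the format of the label text
# salt/display_utils.py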
def draw_box_on_image(self, image, categories, ann, color):
    x, y, w, h = ann["bbox"]
    x, y, w, h = int(x), int(y), int(w), int(h)
    image = cv2.rectangle(image, (x, y), (x + w, y + h), color, self.box_width)

    text = '{} {}'.format(ann["id"],categories[ann["category_id"]])
    txt_color = (0, 0, 0) if np.mean(color) > 127 else (255, 255, 255)
    font = cv2.FONT_HERSHEY_SIMPLEX
    txt_size = cv2.getTextSize(text, font, 1.5, 1)[0]
    cv2.rectangle(image, (x, y + 1), (x + txt_size[0] + 1, y + int(1.5*txt_size[1])), color, -1)
    cv2.putText(image, text, (x, y + txt_size[1]), font, 1.5, txt_color, thickness=5)
    return image
  • 2023.04.14: added the ability to undo the last annotated object, shortcut Ctrl+z

Reference

https://github.com/facebookresearch/segment-anything

https://github.com/anuragxel/salt

https://github.com/trsvchn/coco-viewer
