sam-tool's Introduction

SAM-Labelimg

Fast annotation with the Segment Anything (SAM) model

1. Download the projects

Project 1: https://github.com/zhouayi/SAM-Tool

Project 2: https://github.com/facebookresearch/segment-anything

git clone https://github.com/zhouayi/SAM-Tool.git

git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
pip install -e .

Download the SAM model checkpoint: https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
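If you prefer to script the download, the following is a minimal sketch using only the Python standard library. The helper names `checkpoint_name` and `fetch_checkpoint` are hypothetical and not part of either project; the URL is the one given above.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve

SAM_VIT_H_URL = "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth"


def checkpoint_name(url: str) -> str:
    """Return the file name at the end of a checkpoint URL."""
    return PurePosixPath(urlparse(url).path).name


def fetch_checkpoint(url: str = SAM_VIT_H_URL, target_dir: str = ".") -> Path:
    """Download the checkpoint into target_dir unless it is already there."""
    target = Path(target_dir) / checkpoint_name(url)
    if not target.exists():
        # The checkpoint is a large file, so this call can take a while.
        urlretrieve(url, target)
    return target
```

Calling `fetch_checkpoint(target_dir="segment-anything")` would place the file in the Project 2 directory, so that `--checkpoint-path sam_vit_h_4b8939.pth` resolves when you run the helper scripts from there.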

2. Place your data under <dataset_path>/images/* and create an empty folder <dataset_path>/embeddings
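The folder layout from this step can be created with a few lines of Python. This is only a sketch; `prepare_dataset_dirs` is a hypothetical helper name, and you still need to copy your images into the `images` folder yourself.

```python
from pathlib import Path


def prepare_dataset_dirs(dataset_path: str) -> tuple[Path, Path]:
    """Create <dataset_path>/images and <dataset_path>/embeddings if missing."""
    root = Path(dataset_path)
    images = root / "images"
    embeddings = root / "embeddings"
    # exist_ok=True makes the call safe to repeat on an existing dataset.
    images.mkdir(parents=True, exist_ok=True)
    embeddings.mkdir(parents=True, exist_ok=True)
    return images, embeddings
```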

3. Copy the helpers folder from Project 1 into the root directory of Project 2

3.1 Run extract_embeddings.py to extract the image embeddings
# cd into the root directory of Project 2
python helpers\extract_embeddings.py --checkpoint-path sam_vit_h_4b8939.pth --dataset-folder <dataset_path> --device cpu
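  • checkpoint-path: path to the SAM model checkpoint downloaded above
  • dataset-folder: path to the dataset
  • device: defaults to cuda; cpu also works if you have no GPU, but it is much slower

When the script finishes, the corresponding .npy files are generated under <dataset_path>/embeddings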
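To confirm that every image got an embedding, a check along these lines may help. It is a sketch that assumes each embedding is saved as `<image stem>.npy`, which this README does not state explicitly; the helper name and the list of image suffixes are also assumptions.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def missing_embeddings(dataset_path: str) -> list[str]:
    """Return image file names that have no matching .npy in embeddings/."""
    root = Path(dataset_path)
    done = {p.stem for p in (root / "embeddings").glob("*.npy")}
    return sorted(
        p.name
        for p in (root / "images").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem not in done
    )
```

An empty list means every image under `images` has a corresponding file under `embeddings`.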

3.2 Run generate_onnx.py to convert the .pth file into an ONNX model file
# cd into the root directory of Project 2
python helpers\generate_onnx.py --checkpoint-path sam_vit_h_4b8939.pth --onnx-model-path ./sam_onnx.onnx --orig-im-size 1080 1920
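  • checkpoint-path: the same SAM model checkpoint path

  • onnx-model-path: where to save the resulting ONNX model

  • orig-im-size: the size of the images in your dataset (height, width)

Note: the ONNX model produced by the provided conversion code does not support dynamic input sizes. If the images in your dataset vary in size, one option is to export a separate ONNX model for each orig-im-size value and use the matching model later.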
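For a dataset with mixed image sizes, the workaround in the note can be planned with a small script. The sketch below groups images by (height, width) and builds one export command per size. The helper names and the `sam_onnx_<height>x<width>.onnx` output naming are hypothetical; only the command-line flags come from the command shown above.

```python
from collections import defaultdict


def group_by_size(sizes: dict) -> dict:
    """Group image names by their (height, width) tuple."""
    groups = defaultdict(list)
    for name, hw in sizes.items():
        groups[tuple(hw)].append(name)
    return dict(groups)


def export_commands(sizes: dict, checkpoint: str = "sam_vit_h_4b8939.pth") -> list[str]:
    """Build one generate_onnx.py command per distinct image size."""
    commands = []
    for height, width in sorted(group_by_size(sizes)):
        # Hypothetical naming scheme so models for different sizes do not collide.
        onnx_path = f"sam_onnx_{height}x{width}.onnx"
        commands.append(
            f"python helpers/generate_onnx.py --checkpoint-path {checkpoint} "
            f"--onnx-model-path {onnx_path} --orig-im-size {height} {width}"
        )
    return commands
```

The `sizes` mapping goes from image name to (height, width). With OpenCV, for example, `cv2.imread(path).shape[:2]` yields that pair for a successfully loaded image.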

4. Copy the generated sam_onnx.onnx model into the root directory of Project 1, then run segment_anything_annotator.py to annotate

# cd into the root directory of Project 1
python segment_anything_annotator.py --onnx-model-path sam_onnx.onnx --dataset-path <dataset_path> --categories cat,dog
  • onnx-model-path: path to the exported ONNX model
  • dataset-path: path to the dataset
  • categories: the dataset's categories (separated by commas, with no spaces)
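The "no spaces" rule for categories is easy to get wrong. As an illustration only, not the tool's actual parsing code, a comma-separated value could be validated like this (`parse_categories` is a hypothetical helper):

```python
def parse_categories(value: str) -> list[str]:
    """Split a comma-separated category string and reject stray whitespace."""
    names = value.split(",")
    for name in names:
        # An empty entry or surrounding whitespace usually means a typo
        # such as "cat, dog" or a trailing comma.
        if not name or name != name.strip():
            raise ValueError(f"invalid category entry {name!r} in {value!r}")
    return names
```

Here `parse_categories("cat,dog")` gives `["cat", "dog"]`, while `"cat, dog"` is rejected. On a command line, an unquoted `cat, dog` would also be split into two separate arguments by the shell.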

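Left-click on an object to add a mask at that position; right-click to remove the mask at that position.

Other keyboard shortcuts:

Esc: quit the app a: previous image d: next image
k: decrease transparency l: increase transparency n: add object
r: reset Ctrl+s: save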

The resulting annotation file is in COCO format and is saved to <dataset_path>/annotations.json
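A quick way to see what was annotated is to count annotations per category. The sketch below assumes the file follows the standard COCO layout, with top-level `categories` and `annotations` lists; the helper names are hypothetical.

```python
import json
from collections import Counter


def load_coco(path: str) -> dict:
    """Load a COCO-style annotation file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_per_category(coco: dict) -> dict:
    """Return {category name: number of annotations}."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return dict(Counter(names[a["category_id"]] for a in coco["annotations"]))
```

For example, `count_per_category(load_coco("<dataset_path>/annotations.json"))` returns a dictionary keyed by the category names passed to `--categories`, limited to categories that have at least one annotation.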

5. Check the annotation results

python cocoviewer.py -i <dataset_path> -a <dataset_path>\annotations.json
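Besides inspecting the result visually, a scripted consistency check can catch structural problems. This is a sketch under the assumption that the file uses the standard COCO keys (`images` with `id`, `width`, `height`; `annotations` with `image_id`, `category_id`, and `bbox` as `[x, y, w, h]`); `check_coco` is a hypothetical helper.

```python
def check_coco(coco: dict) -> list[str]:
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in coco["images"]}
    category_ids = {c["id"] for c in coco["categories"]}
    for ann in coco["annotations"]:
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]
        # A valid box has positive size and lies inside the image.
        if w <= 0 or h <= 0 or x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"annotation {ann['id']}: bbox {ann['bbox']} outside image or empty")
    return problems
```

An empty list means no structural problems were found. It does not say anything about the quality of the masks themselves.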

6. Miscellaneous

  • Where in the code to change the bounding-box line width
# salt/display_utils.py
class DisplayUtils:
    def __init__(self):
        self.transparency = 0.65 # default mask transparency
        self.box_width = 2 # default bounding-box line width
  • Where in the code to change the label text format
# salt/display_utils.py
def draw_box_on_image(self, image, categories, ann, color):
    x, y, w, h = ann["bbox"]
    x, y, w, h = int(x), int(y), int(w), int(h)
    image = cv2.rectangle(image, (x, y), (x + w, y + h), color, self.box_width)

    text = '{} {}'.format(ann["id"],categories[ann["category_id"]])
    txt_color = (0, 0, 0) if np.mean(color) > 127 else (255, 255, 255)
    font = cv2.FONT_HERSHEY_SIMPLEX
    txt_size = cv2.getTextSize(text, font, 1.5, 1)[0]
    cv2.rectangle(image, (x, y + 1), (x + txt_size[0] + 1, y + int(1.5*txt_size[1])), color, -1)
    cv2.putText(image, text, (x, y + txt_size[1]), font, 1.5, txt_color, thickness=5)
    return image
  • 2023.04.14: added the ability to undo the last annotated object, shortcut Ctrl+z
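As an illustration of the label-format change mentioned above, the two decisions made inside draw_box_on_image (what text to show and which text color to use) can be pulled into small functions that are easy to adjust. This is a sketch, not code from the project; `format_label` and `text_color_for` are hypothetical names, and the default template reproduces the `'{} {}'` format in the excerpt.

```python
def format_label(ann: dict, categories, template: str = "{id} {name}") -> str:
    """Build the label text for one annotation.

    Change `template` (for example to "{name}") to change what is drawn.
    """
    return template.format(id=ann["id"], name=categories[ann["category_id"]])


def text_color_for(color) -> tuple:
    """Pick black text on light boxes and white text on dark boxes."""
    mean = sum(color) / len(color)
    return (0, 0, 0) if mean > 127 else (255, 255, 255)
```

The threshold of 127 mirrors the `np.mean(color) > 127` test in the excerpt.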

Reference

https://github.com/facebookresearch/segment-anything

https://github.com/anuragxel/salt

https://github.com/trsvchn/coco-viewer
