tucan9389 / semanticsegmentation-coreml

An example project for running semantic segmentation inference with Core ML

Home Page: https://github.com/motlabs/awesome-ml-demos-with-ios

License: MIT License

Swift 90.44% Metal 9.56%

semanticsegmentation-coreml's Issues

Total time (CPU)?

In the performance comparison chart, what is total time (cpu) and how is it measured?

Extract Image

Is there any way to extract an image from the segmentation map, or to convert an MLMultiArray to an image?
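
Independent of Core ML, the conversion comes down to a palette lookup: the segmentation output is a 2D grid of class indices, and each index is replaced by a color. The following is a minimal Python sketch of that idea, not the project's implementation; `colorize`, `to_ppm`, and the two-color palette are hypothetical names and values chosen for illustration. In the app, the same lookup would be done in Swift over the MLMultiArray values.

```python
def colorize(seg_map, palette):
    """Turn a 2D grid of class indices into rows of (R, G, B) tuples."""
    return [[palette[idx] for idx in row] for row in seg_map]


def to_ppm(rgb_rows):
    """Serialize RGB rows as a plain-text PPM (P3) image string."""
    height = len(rgb_rows)
    width = len(rgb_rows[0]) if height else 0
    lines = ["P3", f"{width} {height}", "255"]
    for row in rgb_rows:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"


# Hypothetical 2-class palette: 0 = background (black), 1 = person (red).
palette = {0: (0, 0, 0), 1: (255, 0, 0)}

# A tiny 2x2 map standing in for the real model output.
seg_map = [[0, 1],
           [1, 1]]

image = colorize(seg_map, palette)
ppm_text = to_ppm(image)
```

Writing `ppm_text` to a `.ppm` file gives an image that most viewers can open, which is a quick way to inspect a segmentation map offline.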

The result is rotated

Hello,
the project is very interesting but it's not working:

Any idea why?

Daniel

Metal workflow and ARKit

Hi!

I wonder if the Metal workflow implemented in this repo is compatible with ARKit?

Source Model Link

https://github.com/zllrunning/face-parsing.PyTorch

Core ML Model Download Link

https://github.com/tucan9389/SemanticSegmentation-CoreML/releases/download/support-face-parsing/FaceParsing.mlmodel

Model Spec

Input: 512x512 image
Output: 512x512 (Int32)
- Category index of each pixel
- 19 defined categories: ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r', 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
Size: 52.7 MB
Inference time: 30-50 ms on iPhone 11 Pro
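
Because the output stores one category index per pixel, each value can be mapped back to a name through the category list above. The sketch below counts pixels per category; `label_histogram` is a hypothetical helper, and the 2x3 map stands in for the real 512x512 output.

```python
from collections import Counter

# Category names in index order, copied from the model spec above.
LABELS = ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g',
          'l_ear', 'r_ear', 'ear_r', 'nose', 'mouth', 'u_lip', 'l_lip',
          'neck', 'neck_l', 'cloth', 'hair', 'hat']


def label_histogram(seg_map, labels=LABELS):
    """Count how many pixels fall into each category, keyed by name."""
    counts = Counter(idx for row in seg_map for idx in row)
    return {labels[idx]: n for idx, n in counts.items()}


# Tiny stand-in for the model output (values are category indices).
tiny_map = [[0, 1, 1],
            [17, 17, 1]]

hist = label_histogram(tiny_map)
```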

Conversion Script

import torch

import os.path as osp
import json
from PIL import Image
import torchvision.transforms as transforms
from model import BiSeNet

import coremltools as ct

dspth = 'res/test-img'
cp = '79999_iter.pth'
device = torch.device('cpu')

output_mlmodel_path = "FaceParsing.mlmodel"

labels = ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r',
            'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
n_classes = len(labels)
print("n_classes:", n_classes)

class MyBiSeNet(torch.nn.Module):
    def __init__(self, n_classes, pretrained_model_path):
        super(MyBiSeNet, self).__init__()
        self.model = BiSeNet(n_classes=n_classes)
        self.model.load_state_dict(torch.load(pretrained_model_path, map_location=device))
        self.model.eval()

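    def forward(self, x):
        x = self.model(x)
        x = x[0]                    # keep only the first of the model's outputs
        x = torch.argmax(x, dim=1)  # class index with the highest score per pixel
        x = torch.squeeze(x)        # drop the batch dimension
        return x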

pretrained_model_path = osp.join('res/cp', cp)
model = MyBiSeNet(n_classes=n_classes, pretrained_model_path=pretrained_model_path)
model.eval()

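example_input = torch.rand(1, 3, 512, 512)  # in testing, a 256x256 input produced a 'size mismatch' error
# Reference preprocessing from the PyTorch pipeline (not used during conversion;
# an equivalent scale/bias is passed to ct.ImageType below)
preprocess = transforms.Compose([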
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

traced_model = torch.jit.trace(model, example_input)


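# Convert to Core ML using the Unified Conversion API
print(example_input.shape)

# Fold the normalization into Core ML's image preprocessing
# (output = scale * pixel + bias), using a single std of 0.226 for all channels.
scale = 1.0 / (0.226 * 255.0)
red_bias   = -0.485 / 0.226
green_bias = -0.456 / 0.226
blue_bias  = -0.406 / 0.226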

mlmodel = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input",
                         shape=example_input.shape,
                         scale=scale,
                         color_layout="BGR",
                         bias=[blue_bias, green_bias, red_bias])],  # the coremltools quickstart uses the name "input_1"
)



labels_json = {"labels": labels}

mlmodel.user_defined_metadata["com.apple.coreml.model.preview.type"] = "imageSegmenter"
mlmodel.user_defined_metadata['com.apple.coreml.model.preview.params'] = json.dumps(labels_json)

mlmodel.save(output_mlmodel_path)

import coremltools.proto.FeatureTypes_pb2 as ft

# Reload the saved spec and declare the multi-array output as Int32 (class indices)
spec = ct.utils.load_spec(output_mlmodel_path)

for feature in spec.description.output:
    if feature.type.HasField("multiArrayType"):
        feature.type.multiArrayType.dataType = ft.ArrayFeatureType.INT32

ct.utils.save_spec(spec, output_mlmodel_path)
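
The `scale` and bias values in the script fold the `transforms.Normalize` step into Core ML's image preprocessing, which computes `scale * pixel + bias` per channel. The script uses a single standard deviation of 0.226 for all channels, which approximates the per-channel values (0.229, 0.224, 0.225). The following sketch checks that arithmetic in plain Python; the helper names are hypothetical and nothing here calls coremltools.

```python
MEAN = [0.485, 0.456, 0.406]      # per-channel means (R, G, B) from the script
TRUE_STD = [0.229, 0.224, 0.225]  # per-channel stds used by transforms.Normalize
STD = 0.226                       # single shared std used for the Core ML scale/bias

scale = 1.0 / (STD * 255.0)
bias = [-m / STD for m in MEAN]


def coreml_style(pixel, channel):
    """Core ML image preprocessing: scale * pixel + bias, pixel in 0..255."""
    return scale * pixel + bias[channel]


def normalize_shared_std(pixel, channel):
    """Normalize with the shared std: (pixel / 255 - mean) / 0.226."""
    return (pixel / 255.0 - MEAN[channel]) / STD


def normalize_exact(pixel, channel):
    """Normalize with the true per-channel std."""
    return (pixel / 255.0 - MEAN[channel]) / TRUE_STD[channel]


pairs = [(p, c) for p in range(256) for c in range(3)]

# The scale/bias form is the same formula as the shared-std Normalize.
identity_error = max(abs(coreml_style(p, c) - normalize_shared_std(p, c))
                     for p, c in pairs)

# Using one std for all channels introduces only a small deviation.
approximation_error = max(abs(coreml_style(p, c) - normalize_exact(p, c))
                          for p, c in pairs)
```

`identity_error` is at floating-point noise level, while `approximation_error` shows how far the shared-std shortcut drifts from exact per-channel normalization.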

Support Human-Segmentation-PyTorch repo

https://github.com/thuyngch/Human-Segmentation-PyTorch

Model Size (MB), Minimum iOS Version

Model	Size (MB)	Minimum iOS Version
DeepLabV3	8.6	iOS 12
DeepLabV3FP16	4.3	iOS 12
DeepLabV3Int8LUT	2.3	iOS 12

Inference Time (ms)

Model	iPhone XS	iPhone X
DeepLabV3	135	177
DeepLabV3FP16	136	177
DeepLabV3Int8LUT	135	177

Total Time (ms)

Model	iPhone XS	iPhone X
DeepLabV3	409	531
DeepLabV3FP16	403	530
DeepLabV3Int8LUT	412	517

FPS

Model	iPhone XS	iPhone X
DeepLabV3	2	1
DeepLabV3FP16	2	1
DeepLabV3Int8LUT	2	1
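
How the project measures FPS is not stated here, but the reported figures are consistent with taking 1000 ms divided by the total time and rounding down. The sketch below only checks that consistency against the tables above; `fps_from_total` is a hypothetical helper, not the project's measurement code.

```python
# Total time (ms) and FPS, copied from the tables above.
total_ms = {
    "DeepLabV3":        {"XS": 409, "X": 531},
    "DeepLabV3FP16":    {"XS": 403, "X": 530},
    "DeepLabV3Int8LUT": {"XS": 412, "X": 517},
}
reported_fps = {
    "DeepLabV3":        {"XS": 2, "X": 1},
    "DeepLabV3FP16":    {"XS": 2, "X": 1},
    "DeepLabV3Int8LUT": {"XS": 2, "X": 1},
}


def fps_from_total(ms):
    """Whole frames that fit into one second at `ms` per frame."""
    return 1000 // ms


# True when every reported FPS matches floor(1000 / total time).
consistent = all(
    fps_from_total(total_ms[model][device]) == reported_fps[model][device]
    for model in total_ms
    for device in total_ms[model]
)
```

Under this reading, the frame rate is bounded by the total time per frame rather than by the inference time alone.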

Segment whole image

Hi there - first off, great work on this repo! :D

I wonder if there's a way to segment the whole image by padding the sides - as it stands, since imageCropAndScaleOption is .centerCrop, we only get the center.
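
One common alternative to center-cropping is letterboxing: scale the image so its longer side fits the model input, then pad the shorter side. The sketch below shows only the geometry, in Python, with hypothetical helper names; the 1920x1080 frame and 513x513 input are example values. On the Vision side, the `.scaleFit` value of `imageCropAndScaleOption` may be worth trying for a similar effect.

```python
def letterbox(src_w, src_h, dst):
    """Geometry for fitting a src_w x src_h image into a dst x dst square.

    Returns (scale, new_w, new_h, pad_x, pad_y), where pad_x and pad_y are
    the left and top padding in pixels.
    """
    scale = dst / max(src_w, src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_x = (dst - new_w) // 2
    pad_y = (dst - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def to_source(x, y, scale, pad_x, pad_y):
    """Map a pixel in the padded square back to source-image coordinates."""
    return (x - pad_x) / scale, (y - pad_y) / scale


scale, new_w, new_h, pad_x, pad_y = letterbox(1920, 1080, 513)

# The first non-padding pixel maps back to the source origin.
origin = to_source(pad_x, pad_y, scale, pad_x, pad_y)
```

After inference, the padded rows of the segmentation map can be discarded and the remaining region mapped back with `to_source`, so the whole frame is covered.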

Check mobilenet+unet

https://github.com/akirasosa/mobile-semantic-segmentation

Drawing heatmap view with Metal

I'm looking into this issue slowly. https://stackoverflow.com/questions/61154192/are-there-recommended-ways-for-drawing-2d-array-in-fast-time

Performance issue on post-processing

Now:
Inference -> convert the MLMultiArray to a primitive array -> draw the converted array

Suggested:
Inference -> draw the MLMultiArray directly

Support various colors in GPU live demo

AS-IS

Only human segmentation is visualized.

TO-BE

Visualize not only humans but also the other categories (30 in total).

Do I need to transform the image to model input size? And how to add mosaic on the original image?

Hi, sorry to open an issue just to ask for help.

I have followed the demo.

The input image is 1920 x 1080 pixels, the result is a 513 x 513 array, and renderView is 300 x 300 pt.

Why does renderView render normally even though the input size does not match the model input?
And how can I add a mosaic to the original image based on the segmentation?
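
A rough way to reason about both questions, assuming the frame is center-cropped to a square and then scaled to the model input (an assumption; the demo's exact preprocessing is not shown here): each cell of the 513 x 513 result corresponds to a small region of the central 1080 x 1080 area of the frame. The Python sketch below maps mask cells back to source pixels and pixelates masked blocks of a grayscale image; `mask_to_source` and `mosaic` are hypothetical helpers.

```python
def mask_to_source(mx, my, src_w, src_h, mask_size):
    """Map mask cell (mx, my) to the top-left source pixel it covers,
    assuming a center crop to a square followed by uniform scaling."""
    side = min(src_w, src_h)
    off_x = (src_w - side) // 2
    off_y = (src_h - side) // 2
    scale = side / mask_size
    return off_x + int(mx * scale), off_y + int(my * scale)


def mosaic(image, mask, block):
    """Pixelate a grayscale image (list of rows) block by block.

    A block is replaced by its average when any of its pixels is masked.
    `image` and `mask` must have the same size, so resize the mask first.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cells = [(y, x)
                     for y in range(by, min(by + block, h))
                     for x in range(bx, min(bx + block, w))]
            if any(mask[y][x] for y, x in cells):
                avg = sum(image[y][x] for y, x in cells) // len(cells)
                for y, x in cells:
                    out[y][x] = avg
    return out


top_left = mask_to_source(0, 0, 1920, 1080, 513)

image = [[0, 10, 100, 100],
         [20, 30, 100, 100]]
mask = [[1, 0, 0, 0],
        [0, 0, 0, 0]]
pixelated = mosaic(image, mask, block=2)
```

Pixelating a whole block when any of its pixels is masked is a design choice that avoids sharp edges leaking through at the mask boundary.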

Let's imagine the Core ML network needs a cropped image (only a subpart of the camera feed). No matter how the crop is done (it can be hard-coded for testing purposes), I wonder if there is a way to change DrawingSegmentationView, for example, to achieve this. Right now, if the input image is cropped, the output view resizes the image to the viewport and the result isn't properly registered.
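
One way to keep the overlay registered is to draw the segmentation result only inside the part of the view that corresponds to the crop, instead of stretching it over the whole viewport. The sketch below computes that rectangle in Python. It assumes the camera image fills the view with the same aspect ratio, and `crop_rect_in_view` is a hypothetical helper, not part of the project.

```python
def crop_rect_in_view(crop, image_size, view_size):
    """Convert a crop rect from image pixels to view points.

    crop is (x, y, width, height) in image pixels. The image is assumed
    to be displayed stretched over the full view.
    """
    cx, cy, cw, ch = crop
    img_w, img_h = image_size
    view_w, view_h = view_size
    sx = view_w / img_w
    sy = view_h / img_h
    return (cx * sx, cy * sy, cw * sx, ch * sy)


# Example: the central 1080x1080 square of a 1920x1080 frame,
# shown in a 480x270 pt view.
frame = crop_rect_in_view((420, 0, 1080, 1080), (1920, 1080), (480, 270))
```

The resulting rectangle could then be used as the frame of the segmentation view, so the output lines up with the region the model actually saw.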