zllrunning / face-parsing.pytorch

2.2K 34.0 451.0 3.07 MB

Using modified BiSeNet for face parsing in PyTorch

License: MIT License

Python 66.62% C 1.14% C++ 11.47% Cuda 20.77%

face-parsing semantic-segmentation pytorch celeba-hq-dataset bisenet face-segmentation

face-parsing.pytorch's Issues

Model/weights license

I was looking to use this work for commercial purposes, but found that the original dataset is flagged as non-commercial for research only: https://github.com/switchablenorms/CelebAMask-HQ

Is this something that should be flagged when redistributing the weights?

Error in preprocess_data.py

mask[sep_mask == 225] = l

Shouldn't it be 255?
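
To illustrate why the constant matters, here is a minimal pure-Python sketch. It assumes the per-part masks are binary images whose foreground pixels are 255; nested lists stand in for image arrays, and `merge_part` is a hypothetical helper, not a function from the repository.

```python
def merge_part(mask, sep_mask, label, foreground=255):
    """Write `label` into `mask` wherever `sep_mask` equals `foreground`."""
    return [
        [label if s == foreground else m for m, s in zip(mask_row, sep_row)]
        for mask_row, sep_row in zip(mask, sep_mask)
    ]

mask = [[0, 0], [0, 0]]
sep_mask = [[0, 255], [255, 0]]  # assumed binary part mask (0 or 255)

wrong = merge_part(mask, sep_mask, label=1, foreground=225)  # no pixel equals 225
right = merge_part(mask, sep_mask, label=1, foreground=255)
print(wrong)  # [[0, 0], [0, 0]]
print(right)  # [[0, 1], [1, 0]]
```

Under that assumption, comparing against 225 would leave the merged mask empty for every part.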

Key already registered with the same priority: GroupSpatialSoftmax

When I run the code with the following command:

python test.py

I get only the following output:
Key already registered with the same priority: GroupSpatialSoftmax

Please help me.

unexpected EOF, expected 3832831 more bytes

Hi Team,
While using the pretrained model I get the above error when running test.py. From searching the internet, it seems this might be caused by a corrupted model file. Is that so? Please advise on the cause and the solution of this error.
Below is an image of the error.
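
A truncated download is one common cause of this kind of message. Below is a minimal sketch for inspecting the checkpoint file before loading it, assuming you have some way to learn the expected size or hash (this thread does not provide one). `describe_file` is a hypothetical helper.

```python
import hashlib
import os

def describe_file(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()

ckpt = "79999_iter.pth"  # checkpoint name used elsewhere in these issues
if os.path.exists(ckpt):
    print(describe_file(ckpt))
```

If the size differs from what the download page reports, downloading the file again is the obvious first step.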

An error occurs when I run test.py

os:ubuntu18.04
cuda:10.02
the error:
Traceback (most recent call last):
  File "test.py", line 88, in <module>
    evaluate(dspth='/home/face-parsing/data/', cp='79999_iter.pth')
  File "test.py", line 74, in evaluate
    out = net(img)[0]
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/face-parsing/model.py", line 243, in forward
    feat_res8, feat_cp8, feat_cp16 = self.cp(x) # here return res3b1 feature
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/face-parsing/model.py", line 106, in forward
    feat8, feat16, feat32 = self.resnet(x)
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/face-parsing/resnet.py", line 73, in forward
    x = F.relu(self.bn1(x))
  File "/home/.local/lib/python2.7/site-packages/torch/nn/functional.py", line 914, in relu
    result = torch.relu(input)
RuntimeError: CUDA error: no kernel image is available for execution on the device

Dimension Problem in makeup.py

Can not download pre-trained model

I can open drive.google.com, but I can't download the file.

Torch configuration for Windows

How can I adapt the part below for Windows? As far as I know, this part is written for Ubuntu.
def train():
    args = parse_args()
    torch.cuda.set_device(args.local_rank)
    dist.init_process_group(
        backend = 'nccl',
        init_method = 'tcp://127.0.0.1:33241',
        world_size = torch.cuda.device_count(),
        rank=args.local_rank
    )
    setup_logger(respth)
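
One possible direction, sketched under assumptions: the NCCL backend is built for Linux, while recent PyTorch releases also ship a `gloo` backend that works on Windows, so the `backend` argument could be chosen per platform. `pick_backend` is a hypothetical helper, not part of the repository.

```python
import sys

def pick_backend(platform=sys.platform):
    """Choose a torch.distributed backend name for the given platform string."""
    # Assumption: NCCL is only usable on Linux; gloo is the fallback elsewhere.
    return "nccl" if platform.startswith("linux") else "gloo"

print(pick_backend("win32"))  # gloo
print(pick_backend("linux"))  # nccl
```

The returned string would replace the hard-coded `'nccl'` in the call quoted above. Whether the rest of the training script runs unchanged on Windows is not confirmed here.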

without using GPU

Good day. Is it possible to use face-parsing with only a CPU?

Is there a version that can run on Android/iOS?

The test result

I change the image size here:

But I get a completely different result?

128*128

512*512

Why is that? I just resized the image.

Looking forward to your answer!

How do you implement Resnet?

In your code, you import Resnet18, but there is no publicly available library for PyTorch that provides this net. Which repository have you used to be able to run this code:

from resnet import Resnet18
class ContextPath(nn.Module):
    def __init__(self, *args, **kwargs):
        super(ContextPath, self).__init__()
        #self.resnet = ResNet.from_name("resnet18")
        self.resnet = Resnet18()
        self.arm16 = AttentionRefinementModule(256, 128)
        self.arm32 = AttentionRefinementModule(512, 128)
        self.conv_head32 = ConvBNReLU(128, 128, ks=3, stride=1, padding=1)
        self.conv_head16 = ConvBNReLU(128, 128, ks=3, stride=1, padding=1)
        self.conv_avg = ConvBNReLU(512, 128, ks=1, stride=1, padding=0)
        self.init_weight()

I have tried installing pytorch_resnet, but the implementation is different. Installing resnet from Python PIP leads to a dependency problem with Tensorflow. The following error is thrown if attempting to use pytorch-resnet:

File "/SAFA/face_parsing/model.py", line 112, in forward
    feat8, feat16, feat32 = self.resnet(x)
ValueError: not enough values to unpack (expected 3, got 1)

This project does not indicate how Resnet has been implemented. How have you installed it on your system? Please advise.

Why does changing hair colour from black to another colour not work?

Changing hair colour from one colour to another works, except when going from black to another colour. Something presumably has to change; please explain.

How do I get only the face parsing map?

Could you tell me how to get a black mask like hair.png? Thanks a lot! The mask from my experiment is colourful.
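
A minimal sketch of turning a parsing map into a black-and-white mask for a single part. It assumes the parsing map is a 2D grid of class indices and that hair is class 17, as in the `table` quoted in another issue on this page. `binary_mask` is a hypothetical helper, and nested lists stand in for arrays.

```python
def binary_mask(parsing, class_id, on=255, off=0):
    """Return a grid that is `on` where parsing == class_id and `off` elsewhere."""
    return [[on if p == class_id else off for p in row] for row in parsing]

HAIR = 17  # index taken from the table quoted elsewhere in these issues

parsing = [
    [17, 17, 0],
    [1, 17, 0],
]
print(binary_mask(parsing, HAIR))  # [[255, 255, 0], [0, 255, 0]]
```

Saving such a grid as a greyscale image would give a white part on a black background instead of the coloured visualisation.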

Face occlusion detection?

Is it possible to use this library to detect if a face is occluded or not?

Thanks!

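About the forward result net(img)

The test script test.py uses net(img)[0], but the result of net(img) actually has three components. Why is [0] taken here? Are the other two components actually used in any other module?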

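Hello! How accurate is the network? Can you provide a trained network?

Thank you for your code!
Could you share your experiment logs, accuracy figures and so on? Could you also provide the pretrained network? My hardware is not very good. Thank you very much!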

Training loss increases with time

Hello there!

I have finetuned your model with 2 output classes (skin and background). As a backbone I took your pretrained (19 classes) model in order to speed-up the training process.
So, I just took context path, ffm, conv, conv16, conv32. Thus, only three convolutional layers (feat_out, feat_out16, feat_out32 respectively) are trained from scratch.

I have used all of your hyperparameters and all of your approaches to fit the model. But after about 250 steps the training loss increases from 1.58 to 1.98 and then stays there with small fluctuations.
What could be the problem? Any other ideas?

I tried using a scheduler after 180-200 steps (initial lr=0.01, gamma=0.1) and gradient clipping. Either way, the picture is always the same: an increase in loss.
Thanks in advance for any ideas!

Batchsize = 64
Initial learning rate = 0.01
Optimizer: SGD
Loss: OhemCELoss

P.S. I have waited for 2500 steps (about 7 epochs), and there is not a single hint of a subsequent decrease.

I'm a newcomer in face parsing. Can I ask you a question?

In transform.py, the last two lines are img = Image.open('data/img.jpg') and lb = Image.open('data/label.png'). Which two files of the dataset do these paths correspond to?

Support the Core ML model for iOS

I made a model conversion script for iOS. The script imports the pre-trained .pt file and converts it into a .mlmodel for Core ML.

I uploaded the converted model in my tucan9389/SemanticSegmentation-CoreML repo through release, and here is the Core ML model download link.

The converted model size is 52.7 MB, and the model inference time is measured as 30~50 ms on my iPhone 11 Pro. It looks like the model can run in real time on a high-end mobile device.

If I make a real-time demo app for iOS, I'll share it in this issue.

Thank you for sharing the awesome repo and model!

import torch

import os.path as osp
import json
from PIL import Image
import torchvision.transforms as transforms
from model import BiSeNet

import coremltools as ct

dspth = 'res/test-img'
cp = '79999_iter.pth'
device = torch.device('cpu')

output_mlmodel_path = "FaceParsing.mlmodel"

labels = ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r',
            'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
n_classes = len(labels)
print("n_classes:", n_classes)

class MyBiSeNet(torch.nn.Module):
    def __init__(self, n_classes, pretrained_model_path):
        super(MyBiSeNet, self).__init__()
        self.model = BiSeNet(n_classes=n_classes)
        self.model.load_state_dict(torch.load(pretrained_model_path, map_location=device))
        self.model.eval()

    def forward(self, x):
        x = self.model(x)
        x = x[0]
        x = torch.argmax(x, dim=1)
        x = torch.squeeze(x)
        return x

pretrained_model_path = osp.join('res/cp', cp)
model = MyBiSeNet(n_classes=n_classes, pretrained_model_path=pretrained_model_path)
model.eval()

example_input = torch.rand(1, 3, 512, 512)  # after test, will get 'size mismatch' error message with size 256x256
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

traced_model = torch.jit.trace(model, example_input)


# Convert to Core ML using the Unified Conversion API
print(example_input.shape)

scale = 1.0 / (0.226 * 255.0)
red_bias   = -0.485 / 0.226
green_bias = -0.456 / 0.226
blue_bias  = -0.406 / 0.226

mlmodel = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input",
                         shape=example_input.shape,
                         scale=scale,
                         color_layout="BGR",
                         bias=[blue_bias, green_bias, red_bias])], #name "input_1" is used in 'quickstart'
)



labels_json = {"labels": labels}

mlmodel.user_defined_metadata["com.apple.coreml.model.preview.type"] = "imageSegmenter"
mlmodel.user_defined_metadata['com.apple.coreml.model.preview.params'] = json.dumps(labels_json)

mlmodel.save(output_mlmodel_path)

import coremltools.proto.FeatureTypes_pb2 as ft

spec = ct.utils.load_spec(output_mlmodel_path)

for feature in spec.description.output:
    if feature.type.HasField("multiArrayType"):
        feature.type.multiArrayType.dataType = ft.ArrayFeatureType.INT32

ct.utils.save_spec(spec, output_mlmodel_path)

Where can I find all labels?

    table = {
        'hair': 17,
        'upper_lip': 12,
        'lower_lip': 13
    }

This looks different from CelebAMask-HQ.

Mask labels are defined as follows:

Label list
0: 'background'	1: 'skin'	2: 'nose'
3: 'eye_g'	4: 'l_eye'	5: 'r_eye'
6: 'l_brow'	7: 'r_brow'	8: 'l_ear'
9: 'r_ear'	10: 'mouth'	11: 'u_lip'
12: 'l_lip'	13: 'hair'	14: 'hat'
15: 'ear_r'	16: 'neck_l'	17: 'neck'
18: 'cloth'
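
The two orderings can be reconciled with a lookup table. The sketch below uses the label order from the Core ML conversion script quoted on this page (assumed to match the pretrained model, since it agrees with the `table` above) and the CelebAMask-HQ order listed above. `REPO_TO_CELEBA` is a name introduced here for illustration.

```python
REPO_LABELS = ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye',
               'eye_g', 'l_ear', 'r_ear', 'ear_r', 'nose', 'mouth', 'u_lip',
               'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
CELEBA_LABELS = ['background', 'skin', 'nose', 'eye_g', 'l_eye', 'r_eye',
                 'l_brow', 'r_brow', 'l_ear', 'r_ear', 'mouth', 'u_lip',
                 'l_lip', 'hair', 'hat', 'ear_r', 'neck_l', 'neck', 'cloth']

# Map each index in the repository's order to the CelebAMask-HQ index.
REPO_TO_CELEBA = {i: CELEBA_LABELS.index(name)
                  for i, name in enumerate(REPO_LABELS)}

print(REPO_LABELS.index('hair'), REPO_TO_CELEBA[17])  # 17 13
```

Both lists contain the same 19 names, including background; only the index assigned to each name differs.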

2 images are generated

Whenever I run test.py, I get 2 images: one is the parsing map and the other is a black image. Can anyone explain why 2 images are generated?

train

How do I train if I only have one GPU?

How to apply this method to 1024 resolution images

I need to calculate the loss between adjacent frames over a region of the image at 1024 resolution. Should I interpolate the resulting mask to 1024 size?
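
If the mask is upsampled, nearest-neighbour interpolation keeps every pixel a valid class index, whereas bilinear interpolation would blend neighbouring indices into values that mean something else. A minimal pure-Python sketch, assuming an integer scale factor (512 to 1024 is a factor of 2); `upsample_nearest` is a hypothetical helper.

```python
def upsample_nearest(label_map, factor):
    """Repeat every pixel `factor` times along both axes."""
    out = []
    for row in label_map:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

small = [[1, 17],
         [0, 1]]
print(upsample_nearest(small, 2))
# [[1, 1, 17, 17], [1, 1, 17, 17], [0, 0, 1, 1], [0, 0, 1, 1]]
```

An alternative that may give smoother boundaries is to upsample the network's per-class scores and take the argmax afterwards; that variant is not shown here.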

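Image resolution and glasses issue

Hello, the network resizes the input image to a uniform size. Is there any way to restore the original resolution? Also, the network cannot recognise the eyes when the person is wearing glasses.

Training dataset

Thank you for open-sourcing your work. Did you use all of the celebMask data as the training set?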

Must the width and height of the image be consistent

loss problem

In OhemCELoss, you set a parameter named score_thres. Can you tell me what role this parameter plays? Thank you very much!
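
As a rough illustration of online hard example mining (OHEM) in general, and not a confirmed description of this repository's `OhemCELoss`: per-pixel losses above a threshold derived from a probability `score_thres` are kept, with a floor of `n_min` pixels so the average never runs over too few. The function name and the numbers below are assumptions for illustration.

```python
import math

def ohem_mean(pixel_losses, score_thres, n_min):
    """Average only the hard pixels' cross-entropy losses."""
    # A pixel predicted with probability p for its true class has loss -log(p),
    # so "probability below score_thres" means "loss above -log(score_thres)".
    loss_thres = -math.log(score_thres)
    ordered = sorted(pixel_losses, reverse=True)
    hard = [loss for loss in ordered if loss > loss_thres]
    if len(hard) < n_min:
        hard = ordered[:n_min]  # always keep at least n_min pixels
    return sum(hard) / len(hard)

# Four pixels with true-class probabilities 0.95, 0.9, 0.6 and 0.3.
losses = [-math.log(p) for p in (0.95, 0.9, 0.6, 0.3)]
result = ohem_mean(losses, score_thres=0.7, n_min=1)
print(round(result, 4))  # 0.8574
```

In this sketch only the two pixels with probability below 0.7 contribute, so easy pixels do not dilute the gradient.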

train classes mis-leading..

There are 17 classes in the preparation stage, while there are 19 classes in the pretrained model used by test.py.
So my concern is: what is the difference between those two models?

dataset I think there may be a bug

In the code, the image is resized to 512, but the mask does not perform any operations. Their shapes are different

face-parsing.PyTorch/face_dataset.py

Line 47 in d2e684c

img = img.resize((512, 512), Image.BILINEAR)

face-parsing.PyTorch/face_dataset.py

Line 48 in d2e684c

 label = Image.open(osp.join(self.rootpth, 'mask', impth[:-3]+'png')).convert('P') 

The latter is changed through RandomScale.
I think a line of code should be added:
label = label.resize((512, 512), Image.NEAREST)

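Eye region: eyes cannot be recognised when closed?

When the eyes are closed, the eye region is not segmented. Why is that? Is the position of the eyes not taken into account?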

How can it be used for my own dataset?

Hi there,

I've got my own image and mask sets, could you kindly tell me how to change the codes so that it can be used for a more general situation like an image set and a mask set? Because I see some file path and other codes in your scripts which only fit your system environment.

Appreciate it.

How to run on GPU, for both training and testing the code?

Could you share the code for calculating the PSNR? Thanks a lot!

I want to convert attribute X to attribute Y. Should I use real X and generated Y to calculate PSNR, or use real X and generated X to calculate PSNR? I don't have the real Y because the data is not paired.
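
PSNR itself is a standard formula, 10 * log10(MAX^2 / MSE); which pair of images to compare is the open question above and is not settled here. A minimal sketch for 8-bit images given as flat pixel sequences, where `psnr` is a hypothetical helper:

```python
import math

def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# All-black versus all-white: the MSE equals MAX^2, so the PSNR is 0 dB.
print(psnr([0, 0, 0, 0], [255, 255, 255, 255]))  # 0.0
```

Higher values mean the two images are closer; identical images give infinity.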

How can I fix the error when converting to ONNX?

The class number is 18, why set `BiSeNet(19)`

There are clearly only 18 classes, so why do train.py and test.py both set it as below:

n_classes = 19
net = BiSeNet(n_classes=n_classes)

Also, why was the background label of the HQ dataset removed? Wouldn't adding it back make exactly 19?

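Profile (side-view) recognition problem

The model does not seem to segment 90-degree profile portraits very well. Is there a way to handle segmentation of profile portraits?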

How to get test data

Thanks for your wonderful work!
I have one small question. How did you get the test data? Did you split the original dataset or do something else? Can you go over the steps in detail?
Thanks a lot!