zllrunning / face-parsing.pytorch
Using modified BiSeNet for face parsing in PyTorch
License: MIT License
Thank you.
I was looking to use this work for commercial purposes, but found that the original dataset is licensed for non-commercial research use only: https://github.com/switchablenorms/CelebAMask-HQ
Is this something that should be flagged when redistributing the weights?
mask[sep_mask == 225] = l
Shouldn't it be 255?
How do I batch process a lot of images?
When I run the code with the following command:
python test.py
I get only the following output:
Key already registered with the same priority: GroupSpatialSoftmax
Please help me.
OS: Ubuntu 18.04
CUDA: 10.02
The error:
Traceback (most recent call last):
  File "test.py", line 88, in <module>
    evaluate(dspth='/home/face-parsing/data/', cp='79999_iter.pth')
  File "test.py", line 74, in evaluate
    out = net(img)[0]
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/face-parsing/model.py", line 243, in forward
    feat_res8, feat_cp8, feat_cp16 = self.cp(x)  # here return res3b1 feature
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/face-parsing/model.py", line 106, in forward
    feat8, feat16, feat32 = self.resnet(x)
  File "/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/face-parsing/resnet.py", line 73, in forward
    x = F.relu(self.bn1(x))
  File "/home/.local/lib/python2.7/site-packages/torch/nn/functional.py", line 914, in relu
    result = torch.relu(input)
RuntimeError: CUDA error: no kernel image is available for execution on the device
I can open drive.google.com, but I can't download the file.
How can I adapt the part below for Windows? As far as I know, this part is written for Ubuntu.
def train():
    args = parse_args()
    torch.cuda.set_device(args.local_rank)
    dist.init_process_group(
        backend='nccl',
        init_method='tcp://127.0.0.1:33241',
        world_size=torch.cuda.device_count(),
        rank=args.local_rank
    )
    setup_logger(respth)
Good day. Can face-parsing be used with a CPU only?
In your code, you import Resnet18, but there is no publicly available library for PyTorch that provides this net. Which repository have you used to be able to run this code:
from resnet import Resnet18

class ContextPath(nn.Module):
    def __init__(self, *args, **kwargs):
        super(ContextPath, self).__init__()
        # self.resnet = ResNet.from_name("resnet18")
        self.resnet = Resnet18()
        self.arm16 = AttentionRefinementModule(256, 128)
        self.arm32 = AttentionRefinementModule(512, 128)
        self.conv_head32 = ConvBNReLU(128, 128, ks=3, stride=1, padding=1)
        self.conv_head16 = ConvBNReLU(128, 128, ks=3, stride=1, padding=1)
        self.conv_avg = ConvBNReLU(512, 128, ks=1, stride=1, padding=0)
        self.init_weight()
I have tried installing pytorch_resnet, but the implementation is different. Installing resnet from Python PIP leads to a dependency problem with Tensorflow. The following error is thrown if attempting to use pytorch-resnet:
  File "/SAFA/face_parsing/model.py", line 112, in forward
    feat8, feat16, feat32 = self.resnet(x)
ValueError: not enough values to unpack (expected 3, got 1)
This project does not indicate how Resnet has been implemented. How have you installed it on your system? Please advise.
Changing hair colour from one colour to another works, except when changing black hair to another colour. Something needs to change; please explain.
Is it possible to use this library to detect if a face is occluded or not?
Thanks!
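The test script test.py uses net(img)[0], but net(img) actually returns three outputs. Why is index [0] taken here? Are the other two outputs used in any other module?
Thank you for your code!
Could you share your experiment logs, such as the accuracy? Could you also provide the pretrained network? My hardware is not very good. Thank you very much!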
Hello there!
I have fine-tuned your model with 2 output classes (skin and background). As a backbone I took your pretrained (19 classes) model in order to speed up the training process.
So, I just took the context path, ffm, conv, conv16, and conv32. Thus, only three convolutional layers (feat_out, feat_out16, and feat_out32 respectively) are trained from scratch.
I have used all of your hyperparameters and approaches to fit the model. But after about 250 steps the training loss increases from 1.58 to 1.98 and then stays there with small fluctuations.
What could be the problem? Any other ideas?
I tried using a scheduler after 180-200 steps (initial lr=0.01, gamma=0.1) and gradient clipping. It is always the same picture: an increase in loss.
Thanks in advance for any ideas!
Batchsize = 64
Initial learning rate = 0.01
Optimizer: SGD
Loss: OhemCELoss
P.S. I have waited for 2500 steps (nearly 7 epochs) and there is not a single hint of a subsequent decrease.
In transform.py, the last two lines are img = Image.open('data/img.jpg') and lb = Image.open('data/label.png'). Which two files of the dataset do these paths correspond to?
I made a model conversion script for iOS. The script imports the pre-trained .pt file and converts it into a .mlmodel for Core ML.
I uploaded the converted model to my tucan9389/SemanticSegmentation-CoreML repo as a release, and here is the Core ML model download link.
The converted model size is 52.7 MB, and the measured inference time is 30~50 ms on my iPhone 11 Pro. It looks like the model can run in real time on a high-end mobile device.
If I make a real-time demo app for iOS, I'll share it in this issue.
Thank you for sharing the awesome repo and model!
import torch
import os.path as osp
import json
from PIL import Image
import torchvision.transforms as transforms
from model import BiSeNet
import coremltools as ct
dspth = 'res/test-img'
cp = '79999_iter.pth'
device = torch.device('cpu')
output_mlmodel_path = "FaceParsing.mlmodel"
labels = ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r',
          'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
n_classes = len(labels)
print("n_classes:", n_classes)
class MyBiSeNet(torch.nn.Module):
    def __init__(self, n_classes, pretrained_model_path):
        super(MyBiSeNet, self).__init__()
        self.model = BiSeNet(n_classes=n_classes)
        self.model.load_state_dict(torch.load(pretrained_model_path, map_location=device))
        self.model.eval()

    def forward(self, x):
        x = self.model(x)
        x = x[0]
        x = torch.argmax(x, dim=1)
        x = torch.squeeze(x)
        return x
pretrained_model_path = osp.join('res/cp', cp)
model = MyBiSeNet(n_classes=n_classes, pretrained_model_path=pretrained_model_path)
model.eval()
example_input = torch.rand(1, 3, 512, 512) # after test, will get 'size mismatch' error message with size 256x256
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])
traced_model = torch.jit.trace(model, example_input)
# Convert to Core ML using the Unified Conversion API
print(example_input.shape)
scale = 1.0 / (0.226 * 255.0)
red_bias = -0.485 / 0.226
green_bias = -0.456 / 0.226
blue_bias = -0.406 / 0.226
mlmodel = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input",
                         shape=example_input.shape,
                         scale=scale,
                         color_layout="BGR",
                         bias=[blue_bias, green_bias, red_bias])],  # name "input_1" is used in 'quickstart'
)
labels_json = {"labels": labels}
mlmodel.user_defined_metadata["com.apple.coreml.model.preview.type"] = "imageSegmenter"
mlmodel.user_defined_metadata['com.apple.coreml.model.preview.params'] = json.dumps(labels_json)
mlmodel.save(output_mlmodel_path)
import coremltools.proto.FeatureTypes_pb2 as ft
spec = ct.utils.load_spec(output_mlmodel_path)
for feature in spec.description.output:
    if feature.type.HasField("multiArrayType"):
        feature.type.multiArrayType.dataType = ft.ArrayFeatureType.INT32
ct.utils.save_spec(spec, output_mlmodel_path)
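The `scale` and bias values in the script above fold the torchvision-style normalization `(x / 255 - mean) / std` into the single affine form `x * scale + bias`. Because only one scalar `scale` is passed, the script appears to approximate the three per-channel std values (0.229, 0.224, 0.225) with 0.226. The sketch below is plain Python that only checks this arithmetic; the helper names (`exact_normalize`, `coreml_normalize`, `max_abs_error`) are hypothetical and not part of the original script.

```python
# Per-channel normalization constants from the preprocessing in the script.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]
SHARED_STD = 0.226  # single value the conversion script uses for all channels


def exact_normalize(pixel, channel):
    """torchvision-style: divide by 255 (ToTensor), then Normalize."""
    return (pixel / 255.0 - MEAN[channel]) / STD[channel]


def coreml_normalize(pixel, channel):
    """Affine form x * scale + bias with one shared scale."""
    scale = 1.0 / (SHARED_STD * 255.0)
    bias = -MEAN[channel] / SHARED_STD
    return pixel * scale + bias


def max_abs_error():
    """Largest gap between the two forms over all 8-bit values and channels."""
    return max(
        abs(exact_normalize(p, c) - coreml_normalize(p, c))
        for c in range(3)
        for p in range(256)
    )


if __name__ == "__main__":
    print(f"max abs error: {max_abs_error():.4f}")
```

The gap stays small relative to the normalized value range, which is presumably why a shared std was considered acceptable for the conversion.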
For example:
table = {
    'hair': 17,
    'upper_lip': 12,
    'lower_lip': 13
}
This looks different from the CelebAMask-HQ label list:
| Label list | | |
|---|---|---|
| 0: 'background' | 1: 'skin' | 2: 'nose' |
| 3: 'eye_g' | 4: 'l_eye' | 5: 'r_eye' |
| 6: 'l_brow' | 7: 'r_brow' | 8: 'l_ear' |
| 9: 'r_ear' | 10: 'mouth' | 11: 'u_lip' |
| 12: 'l_lip' | 13: 'hair' | 14: 'hat' |
| 15: 'ear_r' | 16: 'neck_l' | 17: 'neck' |
| 18: 'cloth' | | |
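One way to see where indices such as `'hair': 17` come from is to enumerate the 19-entry `labels` list from the Core ML conversion script posted in this thread and compare it with the label list table above. This is only a sketch: it assumes that the script's list reflects the class order of the pretrained model, and the helper name `differing_labels` is hypothetical.

```python
# Label order taken from the conversion script in this thread (assumed to
# match the class indices predicted by the pretrained model).
MODEL_LABELS = ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye',
                'eye_g', 'l_ear', 'r_ear', 'ear_r', 'nose', 'mouth', 'u_lip',
                'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']

# Label order copied from the CelebAMask-HQ label list table.
CELEBAMASK_HQ = ['background', 'skin', 'nose', 'eye_g', 'l_eye', 'r_eye',
                 'l_brow', 'r_brow', 'l_ear', 'r_ear', 'mouth', 'u_lip',
                 'l_lip', 'hair', 'hat', 'ear_r', 'neck_l', 'neck', 'cloth']

model_index = {name: i for i, name in enumerate(MODEL_LABELS)}
dataset_index = {name: i for i, name in enumerate(CELEBAMASK_HQ)}


def differing_labels():
    """Return {name: (model_idx, dataset_idx)} where the two orders disagree."""
    return {
        name: (model_index[name], dataset_index[name])
        for name in MODEL_LABELS
        if model_index[name] != dataset_index[name]
    }


if __name__ == "__main__":
    for name, (m, d) in differing_labels().items():
        print(f"{name:>10}: model={m:2d}  CelebAMask-HQ={d:2d}")
```

Both lists contain the same 19 names, background included; only the ordering differs, so a lookup by name is safer than hard-coding an index from either list.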
How do I add another face category?
Whenever I run test.py, I get two images: one is the parsing map and the other is a black image. Can anyone explain why two images are generated?
How do I train if I only have one GPU?
I need to calculate the loss between adjacent frames in a region of the image at 1024 resolution. Should I interpolate the resulting mask to 1024?
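Hello, the network resizes its input images to a uniform resolution. Is there a way to restore the original resolution? Also, the network cannot recognize the eyes when the person is wearing glasses.
Thank you for open-sourcing your work. Did you use all of the CelebAMask-HQ data as the training set?
Must the width and height of the image be the same?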
OhemCELoss takes a parameter named score_thres. What is the role of this parameter? Thank you very much!
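For context, online hard example mining (OHEM) losses commonly use a score threshold like this: a pixel counts as "hard" when the predicted probability of its true class is below the threshold, which is the same as its cross-entropy loss exceeding `-log(score_thres)`. The sketch below works on a plain list of per-pixel losses. The function name `ohem_mean_loss` and the sample values are hypothetical, and it illustrates the general pattern rather than quoting this repo's `OhemCELoss`.

```python
import math


def ohem_mean_loss(pixel_losses, score_thres, n_min):
    """Average only the hard pixels.

    A pixel is hard when its loss exceeds -log(score_thres), i.e. the model
    gives its true class a probability below score_thres. At least n_min
    pixels are kept so the average never rests on too few pixels.
    Assumes pixel_losses is non-empty and n_min >= 1.
    """
    loss_thres = -math.log(score_thres)
    ordered = sorted(pixel_losses, reverse=True)  # hardest first
    hard = [loss for loss in ordered if loss > loss_thres]
    if len(hard) < n_min:
        hard = ordered[:n_min]  # fall back to the n_min hardest pixels
    return sum(hard) / len(hard)


if __name__ == "__main__":
    losses = [2.3, 1.2, 0.05, 0.02, 0.01]
    print(ohem_mean_loss(losses, score_thres=0.7, n_min=1))
    print(ohem_mean_loss(losses, score_thres=0.7, n_min=3))
```

Under this reading, a higher `score_thres` lowers the loss cutoff and lets more pixels count as hard, while `n_min` guards against batches in which almost every pixel is already easy.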
In the code, the image is resized to 512, but no such operation is applied to the mask, so their shapes differ:
face-parsing.PyTorch/face_dataset.py
Line 47 in d2e684c
face-parsing.PyTorch/face_dataset.py
Line 48 in d2e684c
The latter is changed through RandomScale.
I think a line of code like this should be added:
label = label.resize((512, 512), Image.NEAREST)
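The `Image.NEAREST` choice in that suggestion matters because a label map stores class indices, not intensities: an interpolating filter such as bilinear can blend two indices into a third, unrelated class. The following pure-Python sketch of nearest-neighbour resizing shows that no new class values appear. The helper `resize_nearest` is hypothetical and only mimics the idea, not Pillow's exact sampling.

```python
def resize_nearest(label, out_h, out_w):
    """Resize a 2D list of class indices by nearest-neighbour sampling."""
    in_h, in_w = len(label), len(label[0])
    # Map the centre of each output pixel back to the nearest input pixel.
    rows = [min(in_h - 1, int((y + 0.5) * in_h / out_h)) for y in range(out_h)]
    cols = [min(in_w - 1, int((x + 0.5) * in_w / out_w)) for x in range(out_w)]
    return [[label[r][c] for c in cols] for r in rows]


if __name__ == "__main__":
    # 2x2 label map with two classes side by side (indices are illustrative).
    small = [[1, 17],
             [1, 17]]
    for row in resize_nearest(small, 4, 4):
        print(row)
    # Averaging 1 and 17 would give 9, a different class index altogether;
    # nearest-neighbour sampling only ever copies existing indices.
```

The same reasoning applies when upsampling a predicted mask to a larger resolution: copying indices keeps every pixel a valid class.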
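When the eyes are closed, the eye region is not segmented separately. Why is that? Is the position of the eyes not taken into account?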
Hi there,
I've got my own image and mask sets. Could you kindly tell me how to change the code so that it can be used in a more general setting, with an arbitrary image set and mask set? I see some file paths and other code in your scripts that only fit your system environment.
Appreciate it.
I want to convert attribute X to attribute Y. Should I use real X and generated Y to calculate PSNR, or use real X and generated X to calculate PSNR? I don't have the real Y because the data is not paired.
There are clearly only 18 classes, so why do train.py and test.py both set the following:
n_classes = 19
net = BiSeNet(n_classes=n_classes)
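Also, why was the background label of the HQ dataset removed? Wouldn't adding it give exactly 19?
The model's segmentation of 90-degree profile portraits does not seem very good. Is there a way to handle profile faces?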
Thanks for your wonderful work!
I have one small question. How did you get the test data? Did you split the original dataset, or something else? Could you go over the steps in detail?
Thanks a lot!
Hello,
How can I mask out the forehead separately?
Thanks!
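First of all, thank you for your project; the results are very good. Suppose I want to segment fruit in images: how much data would I need at minimum to reach similar results? Would fewer than a thousand images be enough? Labelling by hand is really exhausting.
Also, how long did it take you to train this model?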
It's a nice GitHub repo. Could you please tell me how I can get the probability predictions at the end, along with the parsing labels? Thank you.
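If the network output holds per-class scores (logits) for each pixel, as the `torch.argmax(x, dim=1)` call in the conversion script elsewhere in this thread suggests, a common way to get probabilities is a softmax over the class dimension: the parsing label is the argmax and its confidence is the maximum probability. The sketch below shows the arithmetic for a single pixel in plain Python; the function names and logit values are made up for illustration. In PyTorch the corresponding call would presumably be `torch.softmax(out, dim=1)` on the logits tensor.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_and_confidence(logits):
    """Return (predicted class index, probability of that class)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


if __name__ == "__main__":
    pixel_logits = [0.1, 4.0, 0.3, 1.2]  # hypothetical scores for 4 classes
    label, prob = label_and_confidence(pixel_logits)
    print(label, round(prob, 3))
```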
Thank you for your code!
Could you send the pre-trained model to my email? I can no longer connect to Google Drive.
Thank you very much!
Hi,
Thanks for your great work. I used your face parse network as part of an upcoming academic contribution. Do you have a preferred paper that I should cite?