adaptivewingloss's Introduction

AdaptiveWingLoss

Pytorch Implementation of Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression.

Update Logs:

October 28, 2019

  • Pretrained model and evaluation code on the WFLW dataset are released.

Installation

Note: The code was originally developed under Python 2.x and PyTorch 0.4. This released version was revised from the original code and tested on Python 3.5.7 and PyTorch 1.3.0.

Install system requirements:

sudo apt-get install python3-dev python3-pip python3-tk libglib2.0-0

Install python dependencies:

pip3 install -r requirements.txt

Run Evaluation on WFLW dataset

  1. Download and process WFLW dataset

    • Download WFLW dataset and annotation from Here.
    • Unzip WFLW dataset and annotations and move files into ./dataset directory. Your directory should look like this:
      AdaptiveWingLoss
      └───dataset
         │
         └───WFLW_annotations
         │   └───list_98pt_rect_attr_train_test
         │   │
         │   └───list_98pt_test
         │
         └───WFLW_images
             └───0--Parade
             │
             └───...
      
    • Inside ./dataset directory, run:
      python convert_WFLW.py
      
      A new directory ./dataset/WFLW_test should be generated with 2500 processed testing images and corresponding landmarks.
  2. Download pretrained model from Google Drive and put it in ./ckpt directory.

  3. Within the ./Scripts directory, run the following command:

    sh eval_wflw.sh
    
    *GTBbox indicates that the ground-truth landmarks are used as the bounding box to crop faces (see the sketch below).
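    For clarity, here is a minimal sketch of what a GTBbox-style crop means in practice. This is only an illustration of the idea under stated assumptions (the margin, output size, and cv2-based resize are placeholders), not the repo's actual preprocessing in convert_WFLW.py:

      import cv2

      def crop_with_gt_bbox(image, landmarks, out_size=256, margin=0.0):
          # landmarks: (N, 2) numpy array of ground-truth points for one face.
          # Take the tight bounding box around the landmarks, optionally expand it
          # by a relative margin, then resize the crop to the network input size.
          x_min, y_min = landmarks.min(axis=0)
          x_max, y_max = landmarks.max(axis=0)
          w, h = x_max - x_min, y_max - y_min
          x0 = max(int(x_min - margin * w), 0)
          y0 = max(int(y_min - margin * h), 0)
          x1 = min(int(x_max + margin * w), image.shape[1])
          y1 = min(int(y_max + margin * h), image.shape[0])
          return cv2.resize(image[y0:y1, x0:x1], (out_size, out_size))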

Future Plans

  • Release evaluation code and pretrained model on WFLW dataset.

  • Release training code on WFLW dataset.

  • Release pretrained models and code on the 300W, AFLW and COFW datasets.

  • Release facial landmark detection API

Citation

If you find this useful for your research, please cite the following paper.

@InProceedings{Wang_2019_ICCV,
author = {Wang, Xinyao and Bo, Liefeng and Fuxin, Li},
title = {Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}

Acknowledgments

This repository borrows or partially modifies hourglass model and data processing code from face alignment and pose-hg-train.

adaptivewingloss's Issues

Usage of coordconv?

Hi, in your model architecture you return outputs and boundary_channels with sizes [B, landmarks+1, W, H] and [B, 2, W, H]. I am not familiar with CoordConv; should we compute the heatmap loss between concat([B, landmarks, W, H], [B, 1, W, H]) and outputs[i]? If so, why do we need to return boundary_channels? Thanks for your reply.

How to visualize results?

Hi, thanks for your dedication.

How can I visualize the results using custom test data?

Metrics for COFW dataset, Inter-ocular or Inter-pupil?

Hi, in Section 7.2 you state that you use the inter-pupil distance for normalization ("For the COFW dataset, we use inter-pupil (distance of eye centers) as the normalization factor").
However, in Table 2 the previous SOTA methods (Robust face landmark estimation under occlusion, etc.) use the inter-ocular distance as the normalization factor.
Is this a typo?

68 Facial landmarks model

Hi, thanks for the implementation.

Are you planning on releasing the 68 facial landmarks pre-trained model? At the moment, the only model available seems to be the one with 98 landmarks.

Thanks

Why does the output have one more channel than the num of landmarks?

In your model code you have

self.add_module('l' + str(hg_module), nn.Conv2d(256, num_landmarks+1, kernel_size=1, stride=1, padding=0))

and in evaler.py the last channel isn't used

pred_heatmap = outputs[-1][:, :-1, :, :][i].detach().cpu()

So I guess it's used somewhere in the loss?
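
For what it's worth, the training snippet posted in a later issue concatenates the landmark heatmaps with a boundary heatmap as the regression target, so the extra channel appears to be a boundary map. Below is a minimal sketch of how the final output tensor is split, using a dummy tensor with assumed shapes (98 landmarks plus one extra channel, 64x64 heatmaps):

import torch

# Dummy stand-in for outputs[-1] from the forward pass: [B, num_landmarks + 1, H, W].
last_stack = torch.rand(1, 99, 64, 64)

landmark_heatmaps = last_stack[:, :-1, :, :]  # the part evaler.py decodes into landmarks
extra_channel = last_stack[:, -1:, :, :]      # supervised alongside the boundary map during training
print(landmark_heatmaps.shape, extra_channel.shape)  # [1, 98, 64, 64] and [1, 1, 64, 64]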

About the model's generalization ability

Thank you for your implementation!
My experiments require facial landmarks for a dataset that has no ground-truth annotations, so I intended to use your model for preprocessing.
First, I ran the model as instructed on the WFLW dataset, and it performed very well.
Then I ran it on the KDEF dataset, a lab-conditioned, multi-pose facial dataset, but the landmarks were far from the correct positions.
Why doesn't it generalize from an obviously difficult dataset to a much simpler one?
[Attached result images: step10, step20, step30, step40 (two sets)]

inference speed

When I test, the inference speed isn't as fast as the paper reports; I only reach about 50 fps with a single HG stack.
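
For reference, a minimal sketch of how one might measure FPS on GPU; the 256x256 input size is an assumption, and warm-up plus torch.cuda.synchronize() are needed so that asynchronous CUDA kernels are actually counted:

import time
import torch

@torch.no_grad()
def benchmark(model, iters=100, device="cuda"):
    model.eval().to(device)
    x = torch.randn(1, 3, 256, 256, device=device)  # assumed input resolution
    for _ in range(10):          # warm-up so lazy CUDA initialization is excluded
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    print("approx. {:.1f} FPS".format(iters / (time.time() - start)))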

Correctness of the adaptive wing loss implementation

diff_abs = (target - prediction).abs()
loss = diff_abs.clone()

idx_smaller = diff_abs < theta
idx_bigger = diff_abs >= theta

loss[idx_smaller] = width * torch.log(
    1 + torch.pow(torch.abs(diff_abs[idx_smaller] / epsilon), alpha - target[idx_smaller]))

A = width * (1 / (1 + torch.pow(theta / epsilon, alpha - target[idx_bigger]))) \
    * (alpha - target[idx_bigger]) \
    * torch.pow(theta / epsilon, alpha - target[idx_bigger] - 1) * (1.0 / epsilon)
C = theta * A - width * torch.log(1 + torch.pow(theta / epsilon, alpha - target[idx_bigger]))
loss[idx_bigger] = A * torch.abs(diff_abs[idx_bigger]) - C

Hi, I am very interested in your excellent paper, and I implemented it in PyTorch myself. However, it's weird that I didn't get an obvious improvement compared to MSE loss. Can you help me check whether my implementation is correct? Or do you have any plan to upload your implementation soon?
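
For reference when checking implementations, this is the piecewise loss as defined in the paper, to the best of my reading, with $\omega$, $\theta$, $\epsilon$, $\alpha$ the usual hyperparameters and $y$ the target heatmap value:

\[
\mathrm{AWing}(y,\hat{y}) =
\begin{cases}
\omega \ln\!\left(1 + \left|\dfrac{y-\hat{y}}{\epsilon}\right|^{\alpha-y}\right), & |y-\hat{y}| < \theta \\[6pt]
A\,|y-\hat{y}| - C, & \text{otherwise}
\end{cases}
\]

where $A = \omega\bigl(1/(1+(\theta/\epsilon)^{\alpha-y})\bigr)(\alpha-y)(\theta/\epsilon)^{\alpha-y-1}/\epsilon$ and $C = \theta A - \omega\ln\bigl(1+(\theta/\epsilon)^{\alpha-y}\bigr)$, chosen so the two pieces join smoothly at $|y-\hat{y}| = \theta$.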

How to show boundary map

Hi, really, thanks for your excellent contribution.
I wonder how to visualize the boundary map (boundary_channels) in your code, like the boundary heatmaps shown in your paper.

Thanks!
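
A minimal sketch of one way to display a boundary/heatmap channel with matplotlib; the tensor below is a random placeholder standing in for one channel of boundary_channels[-1] (or the last channel of outputs[-1]):

import matplotlib.pyplot as plt
import torch

boundary_map = torch.rand(64, 64)  # placeholder for a real boundary channel

plt.imshow(boundary_map.detach().cpu().numpy(), cmap="jet")
plt.colorbar()
plt.title("boundary heatmap")
plt.savefig("boundary_map.png")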

Get 4.68 NME on WFLW dataset

Hello, I have trained the model and its NME is 4.68. How can I get 4.36 NME? Can you share the details of your training setup?

why facial landmarks for covered parts?

Hi, I just wanted to know how the network is able to draw facial landmarks even when a facial feature is hidden behind some other object.

How can I make the network draw facial landmarks only when the facial features are not hidden?

what is the 'boundary_channels' output by Coordconv and how to use them?

I have noticed that the FAN returns heatmaps (containing landmarks and boundaries) and the boundary channels. The boundary channels seem to be returned by the CoordConv layer; should they be something like [[000,111,222,...], [012,012,012,...]]? I got some other values after training. Should I add a loss for the boundary channels?
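
For comparison, this is how CoordConv-style coordinate channels are usually built (a generic sketch of the technique, not this repo's exact CoordConv code). In the original CoordConv formulation the two channels are just normalized column and row indices appended as fixed inputs, which matches the [[000,111,222,...],[012,012,012,...]] pattern described above:

import torch

def coord_channels(batch, height, width):
    # Row/column indices normalized to [-1, 1], as in the CoordConv paper;
    # these are deterministic inputs appended to the feature map.
    ys = torch.linspace(-1, 1, height).view(1, 1, height, 1).expand(batch, 1, height, width)
    xs = torch.linspace(-1, 1, width).view(1, 1, 1, width).expand(batch, 1, height, width)
    return torch.cat([xs, ys], dim=1)  # shape [B, 2, H, W]

print(coord_channels(1, 4, 4)[0, 0])  # column channel: value increases left to right, same in every row
print(coord_channels(1, 4, 4)[0, 1])  # row channel: each row holds one constant value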

What is the output dimension?

Obviously the model's input dimension is 256x256, but the output landmarks don't seem to fit a 256x256 image. For example, the result drawn on an image looks like this:
[attached screenshot]
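
If I read the usual stacked-hourglass setup correctly (this is an assumption, not something confirmed in this repo's code), the heatmaps are predicted at a resolution 4x smaller than the 256x256 input (i.e. 64x64), so decoded coordinates have to be rescaled before drawing them on the input image:

import torch

# Hypothetical sizes: 256x256 network input, 64x64 output heatmaps
# (typical 4x downsampling in stacked hourglass models -- an assumption here).
input_size, heatmap_size = 256, 64
scale = input_size / heatmap_size

# Dummy heatmaps standing in for outputs[-1][:, :-1]; decode each channel by argmax.
heatmaps = torch.rand(98, heatmap_size, heatmap_size)
flat_idx = heatmaps.view(98, -1).argmax(dim=1)
coords = torch.stack([flat_idx % heatmap_size, flat_idx // heatmap_size], dim=1).float()

coords_on_image = coords * scale   # rescale heatmap coordinates to input-image pixels
print(coords_on_image.shape)       # torch.Size([98, 2])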

NME calculation for 68 points

Why are you taking the average of the eye points to calculate the norm_factor?

left_eye = np.average(gt_landmarks[36:42], axis=0)
right_eye = np.average(gt_landmarks[42:48], axis=0)
norm_factor = np.linalg.norm(left_eye - right_eye)

The challenge suggests the one that you have commented out:

# norm_factor = np.linalg.norm(gt_landmarks[36]- gt_landmarks[45])

Can you please explain whether this is for any kind of improvement?
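
For context, a minimal NME sketch showing both normalization choices discussed here and in the COFW metrics issue above; the 68-point indices follow the snippet (36-41 and 42-47 for the eye contours, 36 and 45 for the outer corners):

import numpy as np

def nme(pred, gt, norm="inter_pupil"):
    # pred, gt: (68, 2) arrays of predicted / ground-truth landmarks.
    if norm == "inter_pupil":            # distance between eye centers (the averaged points)
        left_eye = np.average(gt[36:42], axis=0)
        right_eye = np.average(gt[42:48], axis=0)
        d = np.linalg.norm(left_eye - right_eye)
    else:                                # inter-ocular: distance between outer eye corners
        d = np.linalg.norm(gt[36] - gt[45])
    return np.mean(np.linalg.norm(pred - gt, axis=1)) / d

Since the inter-pupil distance is shorter than the inter-ocular distance, the same predictions produce a larger NME under inter-pupil normalization, so the two conventions are not directly comparable in one table.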

Training code implementation

import matplotlib.pyplot as plt
import cv2
import sys
import os
from PIL import Image, ImageDraw
from utils.utils import fan_NME, show_landmarks, get_preds_fromhm
import numpy as np
from skimage import io
import shutil
from torch.autograd import Variable
import time
import copy
from torch import nn
import torch
import math
import matplotlib
matplotlib.use('Agg')

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


class AdaptiveWingLoss(nn.Module):
    def __init__(self, omega=14, theta=0.5, epsilon=1, alpha=2.1):
        super(AdaptiveWingLoss, self).__init__()
        self.omega = omega
        self.theta = theta
        self.epsilon = epsilon
        self.alpha = alpha

    def forward(self, pred, weight_map, target):
        y = target
        y_hat = pred
        delta_y = (y - y_hat).abs()
        delta_y1 = delta_y[delta_y < self.theta]
        delta_y2 = delta_y[delta_y >= self.theta]
        y1 = y[delta_y < self.theta]
        y2 = y[delta_y >= self.theta]
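        # Editorial note: the paper's AWing definition divides delta_y by epsilon
        # inside the log term; dividing by omega here may be worth double-checking.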
        loss1 = self.omega * torch.log(1 + torch.pow(
            delta_y1 / self.omega, self.alpha - y1)) * weight_map[delta_y < self.theta]
        A = self.omega * (1 / (1 + torch.pow(self.theta / self.epsilon, self.alpha - y2))) * (self.alpha - y2) * (
            torch.pow(self.theta / self.epsilon, self.alpha - y2 - 1)) * (1 / self.epsilon)
        C = self.theta * A - self.omega * \
            torch.log(1 + torch.pow(self.theta / self.epsilon, self.alpha - y2))
        loss2 = (A * delta_y2 - C) * weight_map[delta_y >= self.theta]
        return (loss1.sum() + loss2.sum()) / (len(loss1) + len(loss2))


def train_model(model, dataloaders, dataset_sizes, use_gpu=True, epoches=5,
                save_path='./', num_landmarks=68, start_epoch=0):
    best_acc = 100
    optimizer = torch.optim.RMSprop(
        model.parameters(), lr=0.0000001, weight_decay=0)
    loss_AW = AdaptiveWingLoss()
    for epoch in range(start_epoch, epoches + start_epoch):
        running_loss = 0
        step = 0
        total_nme = 0
        total_count = 0
        fail_count = 0
        nmes = []
        # running_corrects = 0
        step_start = time.time()

        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            # Iterate over data.
            # with torch.set_grad_enabled(True):
            for data in dataloaders[phase]:
                optimizer.zero_grad()
                total_runtime = 0
                run_count = 0

                step += 1
                # get the inputs
                inputs = data['image'].type(torch.FloatTensor)
                labels_heatmap = data['heatmap'].type(torch.FloatTensor)
                labels_boundary = data['boundary'].type(torch.FloatTensor)
                gt_landmarks = data['landmarks'].type(torch.FloatTensor)
                loss_weight_map = data['weight_map'].type(torch.FloatTensor)
                # wrap them in Variable
                if use_gpu:
                    inputs = inputs.to(device)
                    labels_heatmap = labels_heatmap.to(device)
                    labels_boundary = labels_boundary.to(device)
                    loss_weight_map = loss_weight_map.to(device)
                else:
                    inputs, labels_heatmap = Variable(
                        inputs), Variable(labels_heatmap)
                    labels_boundary = Variable(labels_boundary)
                labels = torch.cat((labels_heatmap, labels_boundary), 1)
                single_start = time.time()
                with torch.set_grad_enabled(phase == 'train'):
                    outputs, boundary_channels = model(inputs)
                    pred_labels = torch.cat(
                        (outputs[-1][:, :-1, :, :], boundary_channels[-1][:, :-1, :, :]), 1)
                    ###
                    loss_total = loss_AW(
                        pred_labels, loss_weight_map * 10 + 1, labels)
                    ###
                    #print("Batch Loss: {:.6f}".format(loss.item()))
                    if phase == 'train':
                        loss_total.backward()
                        optimizer.step()
                batch_nme = fan_NME(
                    outputs[-1][:, :-1, :, :].detach().cpu(), gt_landmarks, num_landmarks)
                #print("Batch NME: {:.6f}".format(batch_nme))
                # batch_nme = 0
                total_nme += batch_nme
            epoch_nme = total_nme / dataset_sizes[phase]
            step_end = time.time()
            print(phase + ' NME: {:.6f}'.format(epoch_nme))
            if phase == 'val' and epoch_nme < best_acc:
                state = {
                    'next_epoch': epoch+1,
                    'epoch_total_nme': epoch_nme,
                    'state_dict': model.state_dict(),
                    # 'scheduler' : scheduler.state_dict(),
                    'optimizer': optimizer.state_dict()
                }
                torch.save(state, save_path+'{:02d}'.format(epoch)+'.pth')
        #nme_save_path = os.path.join(save_path, 'nme_log.npy')
        #np.save(nme_save_path, np.array(nmes))
        #print('NME: {:.6f} Failure Rate: {:.6f} Total Count: {:.6f} Fail Count: {:.6f}'.format(epoch_nme, fail_count/total_count, total_count, fail_count))

    #print('Everage runtime for a single batch: {:.6f}'.format(total_runtime/run_count))
    return model

@protossw512 could you please check whether my training implementation is correct?

Run the code on custom data?

Hi, and thank you for making this code available.

What steps do I need to take to run this code on a single image and draw the landmarks?
Is there any example code for this?

Thanks again!
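
There is no single-image example in the repo as far as I can tell, but a rough sketch of the usual flow might look like the following. The model construction, checkpoint path, and the argmax-based decoding are placeholders or simplifications; eval.py and utils/utils.py (get_preds_fromhm) are the authoritative reference:

import cv2
import numpy as np
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# 1. Build the FAN model exactly as eval.py does and load the WFLW checkpoint
#    (build_fan_as_in_eval_py and the checkpoint path are hypothetical placeholders).
model = build_fan_as_in_eval_py()
model.load_state_dict(torch.load("ckpt/wflw_model.pth", map_location=device))
model.eval().to(device)

# 2. Crop the face region, resize to 256x256, scale to [0, 1], make a batch of one.
image = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)
inp = cv2.resize(image, (256, 256)).astype(np.float32) / 255.0
inp = torch.from_numpy(inp).permute(2, 0, 1).unsqueeze(0).to(device)

# 3. Forward pass; decode each landmark heatmap by argmax (a crude stand-in for
#    get_preds_fromhm) and rescale from heatmap resolution to input pixels.
with torch.no_grad():
    outputs, _ = model(inp)
heatmaps = outputs[-1][:, :-1, :, :].squeeze(0).cpu()
hm_size = heatmaps.shape[-1]
idx = heatmaps.view(heatmaps.shape[0], -1).argmax(dim=1)
coords = torch.stack([idx % hm_size, idx // hm_size], dim=1).float() * (256.0 / hm_size)

# 4. Draw the landmarks on the resized image and save it.
vis = cv2.cvtColor(cv2.resize(image, (256, 256)), cv2.COLOR_RGB2BGR)
for x, y in coords.numpy():
    cv2.circle(vis, (int(x), int(y)), 2, (0, 255, 0), -1)
cv2.imwrite("landmarks.jpg", vis)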

The error may be about the model

I tried to run sh eval_wflw.sh, but I get the following error:
'''
Traceback (most recent call last):
File "D:/Face/FaceAlignment/AdaptiveWingLoss-master/eval.py", line 72, in
model_ft.load_state_dict(model_weights)
File "D:\Anaconda3\envs\fa37\lib\site-packages\torch\nn\modules\module.py", line 845, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FAN:
size mismatch for l0.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l0.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).
size mismatch for al0.weight: copying a param with shape torch.Size([256, 99, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 69, 1, 1]).
size mismatch for l1.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l1.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).
size mismatch for al1.weight: copying a param with shape torch.Size([256, 99, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 69, 1, 1]).
size mismatch for l2.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l2.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).
size mismatch for al2.weight: copying a param with shape torch.Size([256, 99, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 69, 1, 1]).
size mismatch for l3.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l3.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).

'''
I think the model file is incomplete, so I downloaded it again, but I still get the same error. How can I solve it?

Thanks for your great work, and I look forward to your future releases!
