adaptivewingloss's Introduction

AdaptiveWingLoss

Pytorch Implementation of Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression.

Update Logs:

October 28, 2019

  • Pretrained model and evaluation code on the WFLW dataset are released.

Installation

Note: The code was originally developed under Python 2.x and PyTorch 0.4. This released version was revised from the original code and tested on Python 3.5.7 and PyTorch 1.3.0.

Install system requirements:

sudo apt-get install python3-dev python3-pip python3-tk libglib2.0-0

Install python dependencies:

pip3 install -r requirements.txt

Run Evaluation on WFLW dataset

  1. Download and process WFLW dataset

    • Download WFLW dataset and annotation from Here.
    • Unzip WFLW dataset and annotations and move files into ./dataset directory. Your directory should look like this:
      AdaptiveWingLoss
      └───dataset
         │
         └───WFLW_annotations
         │   └───list_98pt_rect_attr_train_test
         │   │
         │   └───list_98pt_test
         │
         └───WFLW_images
             └───0--Parade
             │
             └───...
      
    • Inside ./dataset directory, run:
      python convert_WFLW.py
      
      A new directory ./dataset/WFLW_test should be generated with 2500 processed testing images and corresponding landmarks.
  2. Download pretrained model from Google Drive and put it in ./ckpt directory.

  3. Within the ./Scripts directory, run the following command:

    sh eval_wflw.sh
    
    *GTBbox indicates that the ground-truth landmarks are used as the bounding box to crop faces (see the sketch below).
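    For clarity, here is a minimal sketch of what a GTBbox-style crop means in practice. This is only an illustration of the idea under stated assumptions (the margin, output size, and cv2-based resize are placeholders), not the repo's actual preprocessing in convert_WFLW.py:

      import cv2

      def crop_with_gt_bbox(image, landmarks, out_size=256, margin=0.0):
          # landmarks: (N, 2) numpy array of ground-truth points for one face.
          # Take the tight bounding box around the landmarks, optionally expand it
          # by a relative margin, then resize the crop to the network input size.
          x_min, y_min = landmarks.min(axis=0)
          x_max, y_max = landmarks.max(axis=0)
          w, h = x_max - x_min, y_max - y_min
          x0 = max(int(x_min - margin * w), 0)
          y0 = max(int(y_min - margin * h), 0)
          x1 = min(int(x_max + margin * w), image.shape[1])
          y1 = min(int(y_max + margin * h), image.shape[0])
          return cv2.resize(image[y0:y1, x0:x1], (out_size, out_size))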

Future Plans

  • Release evaluation code and pretrained model on WFLW dataset.

  • Release training code on WFLW dataset.

  • Release pretrained models and code on the 300W, AFLW and COFW datasets.

  • Release facial landmark detection API

Citation

If you find this useful for your research, please cite the following paper.

@InProceedings{Wang_2019_ICCV,
author = {Wang, Xinyao and Bo, Liefeng and Fuxin, Li},
title = {Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}

Acknowledgments

This repository borrows or partially modifies hourglass model and data processing code from face alignment and pose-hg-train.

adaptivewingloss's Issues

Usage of coordconv?

Hi, in your model architecture you return outputs and boundary_channels with sizes [B, landmarks+1, W, H] and [B, 2, W, H]. I am not familiar with CoordConv; should we compute the heatmap loss between concat([B, landmarks, W, H], [B, 1, W, H]) and outputs[i]? If so, why do we need to return boundary_channels? Thanks for your reply.

How to visualize results?

Hi, thanks for your dedication.

How can I visualize the results using custom test data?

Metrics for COFW dataset, Inter-ocular or Inter-pupil?

Hi, in Section 7.2 you state that you use the inter-pupil distance for normalization ("For the COFW dataset, we use inter-pupil (distance of eye centers) as the normalization factor").
However, in Table 2 the previous SOTA methods (Robust face landmark estimation under occlusion, etc.) use the inter-ocular distance as the normalization factor.
Is this a typo?

68 Facial landmarks model

Hi, thanks for the implementation.

Are you planning on releasing the 68 facial landmarks pre-trained model? At the moment, the only model available seems to be the one with 98 landmarks.

Thanks

Why does the output have one more channel than the num of landmarks?

In your model code you have

self.add_module('l' + str(hg_module), nn.Conv2d(256, num_landmarks+1, kernel_size=1, stride=1, padding=0))

and in evaler.py the last channel isn't used

pred_heatmap = outputs[-1][:, :-1, :, :][i].detach().cpu()

So I guess it's used somewhere in the loss?
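
For what it's worth, the training snippet posted in a later issue concatenates the landmark heatmaps with a boundary heatmap as the regression target, so the extra channel appears to be a boundary map. Below is a minimal sketch of how the final output tensor is split, using a dummy tensor with assumed shapes (98 landmarks plus one extra channel, 64x64 heatmaps):

import torch

# Dummy stand-in for outputs[-1] from the forward pass: [B, num_landmarks + 1, H, W].
last_stack = torch.rand(1, 99, 64, 64)

landmark_heatmaps = last_stack[:, :-1, :, :]  # the part evaler.py decodes into landmarks
extra_channel = last_stack[:, -1:, :, :]      # supervised alongside the boundary map during training
print(landmark_heatmaps.shape, extra_channel.shape)  # [1, 98, 64, 64] and [1, 1, 64, 64]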

About the model's generalization ability

Thank you for your implementation!
My experiments require facial landmarks for a dataset that has no ground-truth annotations, so I intended to use your model for preprocessing.
First, I ran the model as instructed on the WFLW dataset, and it performed very well.
Then I ran it on the KDEF dataset, a lab-conditioned, multi-pose facial dataset, but the landmarks were far from the correct positions.
Why doesn't it generalize from an obviously difficult dataset to a much simpler one?
[Attached result images: step10, step20, step30, step40 (two sets)]

inference speed

When I test, the inference speed isn't as fast as the paper reports; I only reach about 50 fps with a single HG stack.
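
For reference, a minimal sketch of how one might measure FPS on GPU; the 256x256 input size is an assumption, and warm-up plus torch.cuda.synchronize() are needed so that asynchronous CUDA kernels are actually counted:

import time
import torch

@torch.no_grad()
def benchmark(model, iters=100, device="cuda"):
    model.eval().to(device)
    x = torch.randn(1, 3, 256, 256, device=device)  # assumed input resolution
    for _ in range(10):          # warm-up so lazy CUDA initialization is excluded
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    print("approx. {:.1f} FPS".format(iters / (time.time() - start)))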

Correctness of the adaptive wing loss implementation

diff_abs = (target - prediction).abs()
loss = diff_abs.clone()

idx_smaller = diff_abs < theta
idx_bigger = diff_abs >= theta

loss[idx_smaller] = width * torch.log(
    1 + torch.pow(torch.abs(diff_abs[idx_smaller] / epsilon), alpha - target[idx_smaller]))

A = width * (1 / (1 + torch.pow(theta / epsilon, alpha - target[idx_bigger]))) \
    * (alpha - target[idx_bigger]) \
    * torch.pow(theta / epsilon, alpha - target[idx_bigger] - 1) * (1.0 / epsilon)
C = theta * A - width * torch.log(1 + torch.pow(theta / epsilon, alpha - target[idx_bigger]))
loss[idx_bigger] = A * torch.abs(diff_abs[idx_bigger]) - C

Hi, I am very interested in your excellent paper, and I implemented it in PyTorch myself. However, it's weird that I didn't get an obvious improvement compared to MSE loss. Can you help me check whether my implementation is correct? Or do you have any plan to upload your implementation soon?
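
For reference when checking implementations, this is the piecewise loss as defined in the paper, to the best of my reading, with $\omega$, $\theta$, $\epsilon$, $\alpha$ the usual hyperparameters and $y$ the target heatmap value:

\[
\mathrm{AWing}(y,\hat{y}) =
\begin{cases}
\omega \ln\!\left(1 + \left|\dfrac{y-\hat{y}}{\epsilon}\right|^{\alpha-y}\right), & |y-\hat{y}| < \theta \\[6pt]
A\,|y-\hat{y}| - C, & \text{otherwise}
\end{cases}
\]

where $A = \omega\bigl(1/(1+(\theta/\epsilon)^{\alpha-y})\bigr)(\alpha-y)(\theta/\epsilon)^{\alpha-y-1}/\epsilon$ and $C = \theta A - \omega\ln\bigl(1+(\theta/\epsilon)^{\alpha-y}\bigr)$, chosen so the two pieces join smoothly at $|y-\hat{y}| = \theta$.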

How to show boundary map

Hi, really, thanks for your excellent contribution.
I wonder how to visualize the boundary map (boundary_channels) in your code, like the boundary heatmaps shown in your paper.

Thanks!
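
A minimal sketch of one way to display a boundary/heatmap channel with matplotlib; the tensor below is a random placeholder standing in for one channel of boundary_channels[-1] (or the last channel of outputs[-1]):

import matplotlib.pyplot as plt
import torch

boundary_map = torch.rand(64, 64)  # placeholder for a real boundary channel

plt.imshow(boundary_map.detach().cpu().numpy(), cmap="jet")
plt.colorbar()
plt.title("boundary heatmap")
plt.savefig("boundary_map.png")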

Get 4.68 NME on WFLW dataset

Hello, I have trained the model and its NME is 4.68. How can I get 4.36 NME? Can you share the details of your training setup?

why facial landmarks for covered parts?

Hi, I just wanted to know how the network is able to draw facial landmarks even when a facial feature is hidden behind some other object.

How can I make the network draw facial landmarks only when the facial features are not hidden?

what is the 'boundary_channels' output by Coordconv and how to use them?

I have noticed that the FAN returns heatmaps (containing landmarks and boundaries) and the boundary channels. The boundary channels seem to be returned by the CoordConv layer; should they be something like [[000,111,222,...], [012,012,012,...]]? I got some other values after training. Should I add a loss for the boundary channels?
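
For comparison, this is how CoordConv-style coordinate channels are usually built (a generic sketch of the technique, not this repo's exact CoordConv code). In the original CoordConv formulation the two channels are just normalized column and row indices appended as fixed inputs, which matches the [[000,111,222,...],[012,012,012,...]] pattern described above:

import torch

def coord_channels(batch, height, width):
    # Row/column indices normalized to [-1, 1], as in the CoordConv paper;
    # these are deterministic inputs appended to the feature map.
    ys = torch.linspace(-1, 1, height).view(1, 1, height, 1).expand(batch, 1, height, width)
    xs = torch.linspace(-1, 1, width).view(1, 1, 1, width).expand(batch, 1, height, width)
    return torch.cat([xs, ys], dim=1)  # shape [B, 2, H, W]

print(coord_channels(1, 4, 4)[0, 0])  # column channel: value increases left to right, same in every row
print(coord_channels(1, 4, 4)[0, 1])  # row channel: each row holds one constant value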

What is the output dimension?

Obviously the model's input dimension is 256x256, but the output landmarks don't seem to fit a 256x256 image. For example, the result drawn on an image looks like this:
[attached screenshot]
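
If I read the usual stacked-hourglass setup correctly (this is an assumption, not something confirmed in this repo's code), the heatmaps are predicted at a resolution 4x smaller than the 256x256 input (i.e. 64x64), so decoded coordinates have to be rescaled before drawing them on the input image:

import torch

# Hypothetical sizes: 256x256 network input, 64x64 output heatmaps
# (typical 4x downsampling in stacked hourglass models -- an assumption here).
input_size, heatmap_size = 256, 64
scale = input_size / heatmap_size

# Dummy heatmaps standing in for outputs[-1][:, :-1]; decode each channel by argmax.
heatmaps = torch.rand(98, heatmap_size, heatmap_size)
flat_idx = heatmaps.view(98, -1).argmax(dim=1)
coords = torch.stack([flat_idx % heatmap_size, flat_idx // heatmap_size], dim=1).float()

coords_on_image = coords * scale   # rescale heatmap coordinates to input-image pixels
print(coords_on_image.shape)       # torch.Size([98, 2])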

NME calculation for 68 points

Why are you taking the average of the eye points to calculate the norm_factor?

left_eye = np.average(gt_landmarks[36:42], axis=0)
right_eye = np.average(gt_landmarks[42:48], axis=0)
norm_factor = np.linalg.norm(left_eye - right_eye)

The challenge suggests the one that you have commented out:

# norm_factor = np.linalg.norm(gt_landmarks[36]- gt_landmarks[45])

Can you please explain whether this is for any kind of improvement?
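
For context, a minimal NME sketch showing both normalization choices discussed here and in the COFW metrics issue above; the 68-point indices follow the snippet (36-41 and 42-47 for the eye contours, 36 and 45 for the outer corners):

import numpy as np

def nme(pred, gt, norm="inter_pupil"):
    # pred, gt: (68, 2) arrays of predicted / ground-truth landmarks.
    if norm == "inter_pupil":            # distance between eye centers (the averaged points)
        left_eye = np.average(gt[36:42], axis=0)
        right_eye = np.average(gt[42:48], axis=0)
        d = np.linalg.norm(left_eye - right_eye)
    else:                                # inter-ocular: distance between outer eye corners
        d = np.linalg.norm(gt[36] - gt[45])
    return np.mean(np.linalg.norm(pred - gt, axis=1)) / d

Since the inter-pupil distance is shorter than the inter-ocular distance, the same predictions produce a larger NME under inter-pupil normalization, so the two conventions are not directly comparable in one table.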

Training code implementation

import matplotlib.pyplot as plt
import cv2
import sys
import os
from PIL import Image, ImageDraw
from utils.utils import fan_NME, show_landmarks, get_preds_fromhm
import numpy as np
from skimage import io
import shutil
from torch.autograd import Variable
import time
import copy
from torch import nn
import torch
import math
import matplotlib
matplotlib.use('Agg')

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


class AdaptiveWingLoss(nn.Module):
    def __init__(self, omega=14, theta=0.5, epsilon=1, alpha=2.1):
        super(AdaptiveWingLoss, self).__init__()
        self.omega = omega
        self.theta = theta
        self.epsilon = epsilon
        self.alpha = alpha

    def forward(self, pred, weight_map, target):
        y = target
        y_hat = pred
        delta_y = (y - y_hat).abs()
        delta_y1 = delta_y[delta_y < self.theta]
        delta_y2 = delta_y[delta_y >= self.theta]
        y1 = y[delta_y < self.theta]
        y2 = y[delta_y >= self.theta]
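        # Editorial note: the paper's AWing definition divides delta_y by epsilon
        # inside the log term; dividing by omega here may be worth double-checking.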
        loss1 = self.omega * torch.log(1 + torch.pow(
            delta_y1 / self.omega, self.alpha - y1)) * weight_map[delta_y < self.theta]
        A = self.omega * (1 / (1 + torch.pow(self.theta / self.epsilon, self.alpha - y2))) * (self.alpha - y2) * (
            torch.pow(self.theta / self.epsilon, self.alpha - y2 - 1)) * (1 / self.epsilon)
        C = self.theta * A - self.omega * \
            torch.log(1 + torch.pow(self.theta / self.epsilon, self.alpha - y2))
        loss2 = (A * delta_y2 - C) * weight_map[delta_y >= self.theta]
        return (loss1.sum() + loss2.sum()) / (len(loss1) + len(loss2))


def train_model(model, dataloaders, dataset_sizes, use_gpu=True, epoches=5,
                save_path='./', num_landmarks=68, start_epoch=0):
    best_acc = 100
    optimizer = torch.optim.RMSprop(
        model.parameters(), lr=0.0000001, weight_decay=0)
    loss_AW = AdaptiveWingLoss()
    for epoch in range(start_epoch, epoches + start_epoch):
        running_loss = 0
        step = 0
        total_nme = 0
        total_count = 0
        fail_count = 0
        nmes = []
        # running_corrects = 0
        step_start = time.time()

        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            # Iterate over data.
            # with torch.set_grad_enabled(True):
            for data in dataloaders[phase]:
                optimizer.zero_grad()
                total_runtime = 0
                run_count = 0

                step += 1
                # get the inputs
                inputs = data['image'].type(torch.FloatTensor)
                labels_heatmap = data['heatmap'].type(torch.FloatTensor)
                labels_boundary = data['boundary'].type(torch.FloatTensor)
                gt_landmarks = data['landmarks'].type(torch.FloatTensor)
                loss_weight_map = data['weight_map'].type(torch.FloatTensor)
                # wrap them in Variable
                if use_gpu:
                    inputs = inputs.to(device)
                    labels_heatmap = labels_heatmap.to(device)
                    labels_boundary = labels_boundary.to(device)
                    loss_weight_map = loss_weight_map.to(device)
                else:
                    inputs, labels_heatmap = Variable(
                        inputs), Variable(labels_heatmap)
                    labels_boundary = Variable(labels_boundary)
                labels = torch.cat((labels_heatmap, labels_boundary), 1)
                single_start = time.time()
                with torch.set_grad_enabled(phase == 'train'):
                    outputs, boundary_channels = model(inputs)
                    pred_labels = torch.cat(
                        (outputs[-1][:, :-1, :, :], boundary_channels[-1][:, :-1, :, :]), 1)
                    ###
                    loss_total = loss_AW(
                        pred_labels, loss_weight_map * 10 + 1, labels)
                    ###
                    #print("Batch Loss: {:.6f}".format(loss.item()))
                    if phase == 'train':
                        loss_total.backward()
                        optimizer.step()
                batch_nme = fan_NME(
                    outputs[-1][:, :-1, :, :].detach().cpu(), gt_landmarks, num_landmarks)
                #print("Batch NME: {:.6f}".format(batch_nme))
                # batch_nme = 0
                total_nme += batch_nme
            epoch_nme = total_nme / dataset_sizes[phase]
            step_end = time.time()
            print(phase + ' NME: {:.6f}'.format(epoch_nme))
            if phase == 'val' and epoch_nme < best_acc:
                state = {
                    'next_epoch': epoch+1,
                    'epoch_total_nme': epoch_nme,
                    'state_dict': model.state_dict(),
                    # 'scheduler' : scheduler.state_dict(),
                    'optimizer': optimizer.state_dict()
                }
                torch.save(state, save_path+'{:02d}'.format(epoch)+'.pth')
        #nme_save_path = os.path.join(save_path, 'nme_log.npy')
        #np.save(nme_save_path, np.array(nmes))
        #print('NME: {:.6f} Failure Rate: {:.6f} Total Count: {:.6f} Fail Count: {:.6f}'.format(epoch_nme, fail_count/total_count, total_count, fail_count))

    #print('Everage runtime for a single batch: {:.6f}'.format(total_runtime/run_count))
    return model

@protossw512 could you please check whether my training implementation is correct?

Run the code on custom data?

Hi, and thank you for making this code available.

What steps do I need to take to run this code on a single image and draw the landmarks?
Is there any example code for this?

Thanks again!
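
There is no single-image example in the repo as far as I can tell, but a rough sketch of the usual flow might look like the following. The model construction, checkpoint path, and the argmax-based decoding are placeholders or simplifications; eval.py and utils/utils.py (get_preds_fromhm) are the authoritative reference:

import cv2
import numpy as np
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# 1. Build the FAN model exactly as eval.py does and load the WFLW checkpoint
#    (build_fan_as_in_eval_py and the checkpoint path are hypothetical placeholders).
model = build_fan_as_in_eval_py()
model.load_state_dict(torch.load("ckpt/wflw_model.pth", map_location=device))
model.eval().to(device)

# 2. Crop the face region, resize to 256x256, scale to [0, 1], make a batch of one.
image = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)
inp = cv2.resize(image, (256, 256)).astype(np.float32) / 255.0
inp = torch.from_numpy(inp).permute(2, 0, 1).unsqueeze(0).to(device)

# 3. Forward pass; decode each landmark heatmap by argmax (a crude stand-in for
#    get_preds_fromhm) and rescale from heatmap resolution to input pixels.
with torch.no_grad():
    outputs, _ = model(inp)
heatmaps = outputs[-1][:, :-1, :, :].squeeze(0).cpu()
hm_size = heatmaps.shape[-1]
idx = heatmaps.view(heatmaps.shape[0], -1).argmax(dim=1)
coords = torch.stack([idx % hm_size, idx // hm_size], dim=1).float() * (256.0 / hm_size)

# 4. Draw the landmarks on the resized image and save it.
vis = cv2.cvtColor(cv2.resize(image, (256, 256)), cv2.COLOR_RGB2BGR)
for x, y in coords.numpy():
    cv2.circle(vis, (int(x), int(y)), 2, (0, 255, 0), -1)
cv2.imwrite("landmarks.jpg", vis)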

The error may be about the model

I tried to run sh eval_wflw.sh, but I get the following error:
'''
Traceback (most recent call last):
File "D:/Face/FaceAlignment/AdaptiveWingLoss-master/eval.py", line 72, in
model_ft.load_state_dict(model_weights)
File "D:\Anaconda3\envs\fa37\lib\site-packages\torch\nn\modules\module.py", line 845, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FAN:
size mismatch for l0.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l0.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).
size mismatch for al0.weight: copying a param with shape torch.Size([256, 99, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 69, 1, 1]).
size mismatch for l1.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l1.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).
size mismatch for al1.weight: copying a param with shape torch.Size([256, 99, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 69, 1, 1]).
size mismatch for l2.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l2.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).
size mismatch for al2.weight: copying a param with shape torch.Size([256, 99, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 69, 1, 1]).
size mismatch for l3.weight: copying a param with shape torch.Size([99, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([69, 256, 1, 1]).
size mismatch for l3.bias: copying a param with shape torch.Size([99]) from checkpoint, the shape in current model is torch.Size([69]).

'''
I think the model file is incomplete, so I downloaded it again, but I still get the same error. How can I solve it?

Thanks for your great work, and I look forward to your future releases!
