
STAR's People

Contributors

zhenglinzhou


STAR's Issues

About performance of WFLW subsets

Wonderful work!

We want to cite and compare against your great work, but some results on the WFLW subsets are missing.

Could you provide the performance (NME, FR, and AUC) on each WFLW subset?

Performance speed up

Thank you for your work. The results are very good, but I realised that inference can be quite slow:

~0.07s per frame

RTX 2070 Super mobile, 640x480 frames from a webcam

Is there a way to speed up the inference process?
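
Not an answer from the authors, but the generic PyTorch inference levers usually apply; a minimal sketch, assuming net and input_tensor as in the evaluation snippet quoted in a later issue (FP16 support and a CUDA device are assumptions):

import torch

net.eval()
net = net.cuda().half()                      # FP16 weights; assumes a recent CUDA GPU
with torch.inference_mode():                 # drops autograd bookkeeping
    x = input_tensor.cuda().half()
    output, heatmap, landmarks = net(x)

Exporting to TorchScript or TensorRT would be the usual next step if FP16 alone is not enough.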

Apply to other datasets

Hi,

thanks for your work!

I wanted to use your trained models to predict landmarks for images from the FFHQ dataset containing cropped and centered faces.
I followed the pre- and post-processing steps from your evaluation script:

import cv2
import numpy as np

# preprocess, net, denorm_points, and postprocess come from the evaluation script
image = cv2.imread("test2.jpg")
input_tensor, matrix = preprocess(image, 1, 128.0, 128.0)  # scale=1, center=(128, 128)
output, heatmap, landmarks = net(input_tensor)
landmarks = denorm_points(landmarks)                       # undo landmark normalization
landmarks = landmarks.data.cpu().numpy()[0]
landmarks = postprocess(landmarks, np.linalg.inv(matrix))  # map back to the original image

However, the predicted landmarks do not match the face at all.

Could you please help me to figure out how to correctly use your model here?

Thanks!
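
A hedged guess, not a confirmed answer: preprocess(image, scale, center_w, center_h) crops around the given center, so the fixed scale=1 and center (128, 128) only make sense for a small, face-centered input. For 1024x1024 FFHQ images, deriving the center and scale from a detected face box may fix the mismatch (face_box and the 200.0 reference size below are illustrative, not the repo's exact convention):

x1, y1, x2, y2 = face_box                      # from any face detector
scale = max(x2 - x1, y2 - y1) / 200.0          # 200.0 is a placeholder reference size
center_w = (x1 + x2) / 2.0
center_h = (y1 + y2) / 2.0
input_tensor, matrix = preprocess(image, scale, center_w, center_h)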

Cannot access the parameters

The website is broken, or at least cannot be accessed from Europe; a VPN does not work either. Could you upload the parameters to Google Drive or somewhere else accessible from Europe?

Fit a 3DMM

Has anyone used this to convert the 2D landmarks to a 3D model?

regarding the metadata_path

Hi @ZhenglinZhou, there is a parameter named metadata_path; where can I get it?

python evaluate.py --device_ids=0 \
                   --model_path=${model_path} --metadata_path=${metadata_path} \
                   --image_dir=${image_dir} --data_definition={WFLW, 300W, COFW} \ 

BTW, I downloaded the 300W dataset, and it seems the annotations of the test data are all included in test.tsv. I just wonder how you split it into full, common, and challenge subsets and evaluate them separately? There seems to be no field in the annotation that tells which images belong to challenge. Thank you!
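
For what it's worth, the split is a 300W convention rather than something stored in the annotations: the common subset is the LFPW and HELEN test images, challenge is the iBUG images, and full is their union, so membership can be read off the image path. A minimal sketch (path patterns assumed from the standard 300W layout):

def subset_of(image_path):
    # iBUG images form the challenge subset; HELEN/LFPW test images the common subset
    if "ibug" in image_path.lower():
        return "challenge"
    return "common"   # the full subset is simply common + challenge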

params and flops of the model

Hello! I am very interested in this work and would like to cite it in my own paper. Using the framework you provide, I measured params = 17.18 M and FLOPs = 17.52 GMac. Could you confirm whether this is consistent with your measurements?
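
The GMac unit suggests the ptflops package; a sketch of one way to reproduce such a measurement, assuming the 3x256x256 input from the released config (not necessarily how the numbers above were obtained):

from ptflops import get_model_complexity_info

macs, params = get_model_complexity_info(net, (3, 256, 256), as_strings=True,
                                         print_per_layer_stat=False)
print(params, macs)   # the numbers reported above were 17.18 M and 17.52 GMac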

About modifying model input size

Hi, first of all thanks for your good work with STAR loss. I encountered an issue when trying to reduce the network input size from (256, 256) to (112, 112) and train the model on 300W from scratch:

(screenshot: training error)

Training still works fine with the original config (256x256). Do you have any idea what to change to fix this?

Btw, could you give me some advice on how to prepare other hyperparams when changing the input size?
Thank you very much.
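
A hedged guess at the failure mode: the stacked hourglass keeps a fixed input-to-heatmap ratio of 4, so a 112x112 input implies 28x28 heatmaps, and anything in the config or decoder that still assumes the 64x64 maps of the 256 default will mismatch. An illustrative sanity check (names are made up):

input_size = 112
expected = input_size // 4              # 256 -> 64, 112 -> 28
# heatmap: the network's heatmap output for one training batch
assert heatmap.shape[-1] == expected, "something still assumes 64x64 maps"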

Question about the data augmentation

Hi @ZhenglinZhou :

Here is a question about the data augmentation.

In AlignmentDataset, the raw landmarks of an image are loaded and sent through augmentation:

img, landmarks_target, matrix = \

In the augmentation step, lmk is transformed into aug_lmk:

def process(self, img, lmk, lmk_5pts=None, scale=1.0, center_w=0, center_h=0, is_train=True):

Then the augmented landmarks are passed to _norm_points:

landmarks = self._norm_points(torch.from_numpy(landmarks_target), self.image_height, self.image_width)

Then the processed landmarks serve as the labels in the training step.

So I want to know: what is the mathematical meaning of the transformations applied to the landmarks in this data augmentation step, and into what format are the initial landmarks converted for training?

I am new to this field and not familiar with some of these operations. Hope to get your help, thanks!
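
One hedged reading of the last step, to make the target format concrete: _norm_points appears to map pixel coordinates on the h x w crop into [-1, 1] (which would also explain why raw predictions can come out negative). A sketch of that convention, not the repo's verbatim code:

import torch

def norm_points(points, h, w):
    # (N, 2) pixel coordinates -> (N, 2) in [-1, 1]
    wh = torch.tensor([w, h], dtype=points.dtype)
    return points / (wh - 1) * 2 - 1

def denorm_points(points, h, w):
    # inverse: [-1, 1] -> pixel coordinates on the h x w crop
    wh = torch.tensor([w, h], dtype=points.dtype)
    return (points + 1) / 2 * (wh - 1)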

Change face detection model

Thank you for your work!

I saw this comment: #16 (comment).

So I tried to change the face detection model and retrain STAR.

However, there doesn't seem to be a face detection step during training, so could you tell me which part I need to modify to retrain with my own face detection model?

backward error in training

Hello, I got an error in optimizer.step() when trying to train on my own dataset. Do you have any idea about it?

Dataset links

Hi @ZhenglinZhou,

I've been trying to download the datasets for this project from the links provided in the README, but the links for COFW and WFLW appear to be broken and I'm unable to access the datasets. Could you please provide updated download links?

300w dataset

Thank you very much for providing open-source code that others can refer to.

Output Visualization in tester

Hi, I want to visualize the test results, but I cannot draw the landmarks on an image because they come out normalized (some values are negative). How can I convert them back to unnormalized coordinates so I can draw them on the image?
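
A sketch under the same assumption as in the augmentation question above (landmarks normalized to [-1, 1]); the image and landmarks names are illustrative:

import cv2

h, w = image.shape[:2]
for x, y in landmarks:                        # normalized (x, y) pairs
    px = int(round((x + 1) / 2 * (w - 1)))
    py = int(round((y + 1) / 2 * (h - 1)))
    cv2.circle(image, (px, py), 2, (0, 255, 0), -1)
cv2.imwrite("vis.jpg", image)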

train on new dataset

Thanks for your excellent work. I want to train the STAR model on my own dataset; could you give some advice on how to prepare the training data?
Can I fine-tune the model on my own dataset with only the face images and corresponding landmarks? I see that in your code, only images and landmarks are used to train the STAR model.

star loss question

Hello, Zhenglin. I am trying to reimplement STAR loss in mmpose. However, I have a question: could you please explain the function of the following two lines?

        normal_dist = self.dist_func(normal_error, torch.zeros_like(normal_error).to(normal_error), reduction='none')
        tangent_dist = self.dist_func(tangent_error, torch.zeros_like(tangent_error).to(tangent_error), reduction='none')

From my understanding, the default value of self.dist_func is SmoothL1Loss(). But I am not sure what the values of normal_dist and tangent_dist are, or what they mean.
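
For what it's worth: comparing an error against a zero target with reduction='none' is just the elementwise smooth-L1 magnitude of that error (0.5*e^2 for small e, |e| - 0.5 otherwise, with the default beta=1), so normal_dist and tangent_dist are per-component penalties on the error projected onto the normal and tangent directions. A quick check with torch.nn.functional, which SmoothL1Loss wraps:

import torch
import torch.nn.functional as F

e = torch.tensor([0.3, 2.0])
d = F.smooth_l1_loss(e, torch.zeros_like(e), reduction='none')
print(d)   # tensor([0.0450, 1.5000]) -> 0.5 * 0.3**2 and 2.0 - 0.5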

Maybe it is not easy to explain; if so, may I get your WeChat or email? Thanks a lot! Best regards!

About visualization of the PCA results

I am very interested in the method used to draw the red region in Figure 3, as this type of plot is exactly what I need for my current research work. I would like to learn how to create it.

About reproducing the results of the paper

Hi, I'm trying to reproduce your paper's results. I followed your steps:
python main.py --mode=train --device_ids=0,1,2,3 --image_dir=${image_dir} --annot_dir=${annot_dir} --data_definition=WFLW
and my config is :
loader_type: alignment
loss_func: STARLoss_v2
batch_size: 128
val_batch_size: 32
test_batch_size: 16
channels: 3
width: 256
height: 256
means: (127.5, 127.5, 127.5)
scale: 0.00784313725490196
display_iteration: 10
milestones: [200, 350, 450]
max_epoch: 500
net: stackedHGnet_v1
nstack: 4
optimizer: adam
learn_rate: 0.001
momentum: 0.01
weight_decay: 1e-05
nesterov: False
scheduler: MultiStepLR
gamma: 0.1
loss_weights: [0.125, 1.25, 1.25, 0.25, 2.5, 2.5, 0.5, 5.0, 5.0, 1.0, 10.0, 10.0]
criterions: ['STARLoss_v2', 'AWingLoss', 'AWingLoss', 'STARLoss_v2', 'AWingLoss', 'AWingLoss', 'STARLoss_v2', 'AWingLoss', 'AWingLoss', 'STARLoss_v2', 'AWingLoss', 'AWingLoss']
metrics: ['NME', None, None, 'NME', None, None, 'NME', None, None, 'NME', None, None]
key_metric_index: 9
classes_num: [98, 9, 98]
label_num: 12
ema: True
use_AAM: True
writer: <tensorboardX.writer.SummaryWriter object at 0x7ff5ef0a76d0>
logger: <RootLogger root (NOTSET)>
data_definition: WFLW
test_file: test.tsv
aug_prob: 1.0
val_epoch: 1
valset: test.tsv
norm_type: default
encoder_type: default
decoder_type: default
betas: [0.9, 0.999]
train_num_workers: 16
val_num_workers: 16
test_num_workers: 0
add_coord: True
star_w: 1
star_dist: smoothl1
edge_info: ((False, (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)), (True, (33, 34, 35, 36, 37, 38, 39, 40, 41)), (True, (42, 43, 44, 45, 46, 47, 48, 49, 50)), (False, (51, 52, 53, 54)), (False, (55, 56, 57, 58, 59)), (True, (60, 61, 62, 63, 64, 65, 66, 67)), (True, (68, 69, 70, 71, 72, 73, 74, 75)), (True, (76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87)), (True, (88, 89, 90, 91, 92, 93, 94, 95)))
nme_left_index: 60
nme_right_index: 72
crop_op: True

However, the accuracy (NME) obtained by my training is only 4.08. Do you know what went wrong?


face landmark detection without dlib

Hi @ZhenglinZhou, thank you for the contribution. I found that demo.py uses dlib, and I just wonder if I could do the landmark detection without dlib. Is it possible to get the landmarks with STAR only?
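
A hedged sketch of one option: STAR regresses landmarks inside a face crop, so it still needs a face box from somewhere, but any detector can supply it. For example, swapping dlib for OpenCV's bundled Haar cascade (accuracy will differ from dlib; feeding face_box into the crop/preprocess step is assumed to follow demo.py's flow):

import cv2

image = cv2.imread("face.jpg")
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
    face_box = (x, y, x + w, y + h)    # hand this box to the STAR crop/preprocess step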

test my own image

Hello, how can I test on my own images, and how can I get the TSV file? Looking forward to your reply, thanks!

Confidence Score of Detected Facial Landmarks

Is there any way to get a confidence score for the detected landmarks?
In face detection we get a confidence score that indicates how reliable the detection is; is something similar possible with the STAR loss model?
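
Not something the repo documents, but a common heuristic (an assumption, not the authors' method) is to read the peak value of each landmark's heatmap as a confidence proxy; occluded or uncertain points tend to produce flatter maps:

# heatmap: the heatmap output of net(input_tensor), shape (batch, n_points, H, W)
conf = heatmap.amax(dim=(-2, -1))     # per-landmark peak value, shape (batch, n_points)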
