
STAR's People

Contributors

zhenglinzhou


STAR's Issues

About performance of WFLW subsets

Wonderful work!

We want to cite and compare against your great work, but some results on the WFLW subsets are missing.

Could you provide the performance (NME, FR, and AUC) on each WFLW subset?

Performance speed up

Thank you for your work. The results are very good, but I realised that inference can be quite slow:

~0.07s per frame

RTX 2070 Super mobile, 640x480 frames from a webcam

Is there a way to speed up the inference process?
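
Not an answer from the authors, but the generic PyTorch inference levers usually apply; a minimal sketch, assuming net and input_tensor as in the evaluation snippet quoted in a later issue (FP16 support and a CUDA device are assumptions):

import torch

net.eval()
net = net.cuda().half()                      # FP16 weights; assumes a recent CUDA GPU
with torch.inference_mode():                 # drops autograd bookkeeping
    x = input_tensor.cuda().half()
    output, heatmap, landmarks = net(x)

Exporting to TorchScript or TensorRT would be the usual next step if FP16 alone is not enough.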

Apply to other datasets

Hi,

thanks for your work!

I wanted to use your trained models to predict landmarks for images from the FFHQ dataset containing cropped and centered faces.
I followed the pre- and post-processing steps from your evaluation script:

import cv2
import numpy as np

# preprocess, net, denorm_points, and postprocess come from the evaluation script
image = cv2.imread("test2.jpg")
input_tensor, matrix = preprocess(image, 1, 128.0, 128.0)  # scale=1, center=(128, 128)
output, heatmap, landmarks = net(input_tensor)
landmarks = denorm_points(landmarks)                       # undo landmark normalization
landmarks = landmarks.data.cpu().numpy()[0]
landmarks = postprocess(landmarks, np.linalg.inv(matrix))  # map back to the original image

However, the predicted landmarks do not match the face at all.

Could you please help me to figure out how to correctly use your model here?

Thanks!
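
A hedged guess, not a confirmed answer: preprocess(image, scale, center_w, center_h) crops around the given center, so the fixed scale=1 and center (128, 128) only make sense for a small, face-centered input. For 1024x1024 FFHQ images, deriving the center and scale from a detected face box may fix the mismatch (face_box and the 200.0 reference size below are illustrative, not the repo's exact convention):

x1, y1, x2, y2 = face_box                      # from any face detector
scale = max(x2 - x1, y2 - y1) / 200.0          # 200.0 is a placeholder reference size
center_w = (x1 + x2) / 2.0
center_h = (y1 + y2) / 2.0
input_tensor, matrix = preprocess(image, scale, center_w, center_h)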

Cannot access the parameters

The website is broken, or at least cannot be accessed from Europe; a VPN does not work either. Could you upload the parameters to Google Drive or somewhere else accessible from Europe?

Fit a 3DMM

Has anyone used this to convert the 2D landmarks to a 3D model?

regarding the metadata_path

Hi @ZhenglinZhou, there is a parameter named metadata_path; where can I get it?

python evaluate.py --device_ids=0 \
                   --model_path=${model_path} --metadata_path=${metadata_path} \
                   --image_dir=${image_dir} --data_definition={WFLW, 300W, COFW} \ 

BTW, I downloaded the 300W dataset, and it seems the annotations of the test data are all included in test.tsv. I just wonder how you split it into full, common, and challenge subsets and evaluate them separately? There seems to be no field in the annotation that tells which images belong to challenge. Thank you!
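
For what it's worth, the split is a 300W convention rather than something stored in the annotations: the common subset is the LFPW and HELEN test images, challenge is the iBUG images, and full is their union, so membership can be read off the image path. A minimal sketch (path patterns assumed from the standard 300W layout):

def subset_of(image_path):
    # iBUG images form the challenge subset; HELEN/LFPW test images the common subset
    if "ibug" in image_path.lower():
        return "challenge"
    return "common"   # the full subset is simply common + challenge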

params and flops of the model

Hello! I am very interested in this work and would like to cite it in my own paper. Using the framework you provide, I measured params = 17.18 M and FLOPs = 17.52 GMac. Could you confirm whether this is consistent with your measurements?
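
The GMac unit suggests the ptflops package; a sketch of one way to reproduce such a measurement, assuming the 3x256x256 input from the released config (not necessarily how the numbers above were obtained):

from ptflops import get_model_complexity_info

macs, params = get_model_complexity_info(net, (3, 256, 256), as_strings=True,
                                         print_per_layer_stat=False)
print(params, macs)   # the numbers reported above were 17.18 M and 17.52 GMac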

About modifying model input size

Hi, first of all thanks for your good work with STAR loss. I encountered an issue when trying to reduce the network input size from (256, 256) to (112, 112) and train the model on 300W from scratch:

(screenshot: training error)

Training still works fine with the original config (256x256). Do you have any idea what to change to fix this?

Btw, could you give me some advice on how to prepare other hyperparams when changing the input size?
Thank you very much.
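
A hedged guess at the failure mode: the stacked hourglass keeps a fixed input-to-heatmap ratio of 4, so a 112x112 input implies 28x28 heatmaps, and anything in the config or decoder that still assumes the 64x64 maps of the 256 default will mismatch. An illustrative sanity check (names are made up):

input_size = 112
expected = input_size // 4              # 256 -> 64, 112 -> 28
# heatmap: the network's heatmap output for one training batch
assert heatmap.shape[-1] == expected, "something still assumes 64x64 maps"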

Question about the data augmentation

Hi @ZhenglinZhou :

Here is a question about the data augmentation.

In AlignmentDataset, the raw landmarks of an image are loaded and sent through augmentation:

img, landmarks_target, matrix = \

In the augmentation step, lmk is transformed into aug_lmk:

def process(self, img, lmk, lmk_5pts=None, scale=1.0, center_w=0, center_h=0, is_train=True):

Then the augmented landmarks are passed to _norm_points:

landmarks = self._norm_points(torch.from_numpy(landmarks_target), self.image_height, self.image_width)

Then the processed landmarks serve as the labels in the training step.

So I want to know: what is the mathematical meaning of the transformations applied to the landmarks in this data augmentation step, and into what format are the initial landmarks converted for training?

I am new to this field and not familiar with some of these operations. Hope to get your help, thanks!
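
One hedged reading of the last step, to make the target format concrete: _norm_points appears to map pixel coordinates on the h x w crop into [-1, 1] (which would also explain why raw predictions can come out negative). A sketch of that convention, not the repo's verbatim code:

import torch

def norm_points(points, h, w):
    # (N, 2) pixel coordinates -> (N, 2) in [-1, 1]
    wh = torch.tensor([w, h], dtype=points.dtype)
    return points / (wh - 1) * 2 - 1

def denorm_points(points, h, w):
    # inverse: [-1, 1] -> pixel coordinates on the h x w crop
    wh = torch.tensor([w, h], dtype=points.dtype)
    return (points + 1) / 2 * (wh - 1)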

Change face detection model

Thank you for your work!

I saw this comment: #16 (comment).

So I tried to change the face detection model and retrain STAR.

However, there doesn't seem to be a face detection step during training, so could you tell me which part I need to modify to retrain with my own face detection model?

backward error in training

Hello, I got an error in optimizer.step() when trying to train on my own dataset. Do you have any idea about it?

Dataset links

Hi @ZhenglinZhou,

I've been trying to download the datasets for this project from the links provided in the README, but the links for COFW and WFLW appear to be broken and I'm unable to access the datasets. Could you please provide updated download links?

300w dataset

Thank you very much for providing open-source code that others can refer to.

Output Visualization in tester

Hi, I want to visualize the test results, but I cannot draw the landmarks on an image because they come out normalized (some values are negative). How can I convert them back to unnormalized coordinates so I can draw them on the image?
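
A sketch under the same assumption as in the augmentation question above (landmarks normalized to [-1, 1]); the image and landmarks names are illustrative:

import cv2

h, w = image.shape[:2]
for x, y in landmarks:                        # normalized (x, y) pairs
    px = int(round((x + 1) / 2 * (w - 1)))
    py = int(round((y + 1) / 2 * (h - 1)))
    cv2.circle(image, (px, py), 2, (0, 255, 0), -1)
cv2.imwrite("vis.jpg", image)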

train on new dataset

Thanks for your excellent work. I want to train the STAR model on my own dataset; could you give some advice on how to prepare the training data?
Can I fine-tune the model on my own dataset with only the face images and corresponding landmarks? I see that in your code, only images and landmarks are used to train the STAR model.

star loss question

Hello, Zhenglin. I am trying to reimplement STAR loss in mmpose. However, I have a question: could you please explain the function of the following two lines?

        normal_dist = self.dist_func(normal_error, torch.zeros_like(normal_error).to(normal_error), reduction='none')
        tangent_dist = self.dist_func(tangent_error, torch.zeros_like(tangent_error).to(tangent_error), reduction='none')

From my understanding, the default value of self.dist_func is SmoothL1Loss(). But I am not sure what the values of normal_dist and tangent_dist are, or what they mean.
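
For what it's worth: comparing an error against a zero target with reduction='none' is just the elementwise smooth-L1 magnitude of that error (0.5*e^2 for small e, |e| - 0.5 otherwise, with the default beta=1), so normal_dist and tangent_dist are per-component penalties on the error projected onto the normal and tangent directions. A quick check with torch.nn.functional, which SmoothL1Loss wraps:

import torch
import torch.nn.functional as F

e = torch.tensor([0.3, 2.0])
d = F.smooth_l1_loss(e, torch.zeros_like(e), reduction='none')
print(d)   # tensor([0.0450, 1.5000]) -> 0.5 * 0.3**2 and 2.0 - 0.5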

Maybe it is not easy to explain; if so, may I get your WeChat or email? Thanks a lot! Best regards!

About visualization of the PCA results

I am very interested in the method used to draw the red region in Figure 3, as this type of plot is exactly what I need for my current research work. I would like to learn how to create it.

About reproducing the results of the paper

Hi, I'm trying to reproduce your paper's results. I followed your steps:
python main.py --mode=train --device_ids=0,1,2,3 --image_dir=${image_dir} --annot_dir=${annot_dir} --data_definition=WFLW
and my config is :
loader_type: alignment
loss_func: STARLoss_v2
batch_size: 128
val_batch_size: 32
test_batch_size: 16
channels: 3
width: 256
height: 256
means: (127.5, 127.5, 127.5)
scale: 0.00784313725490196
display_iteration: 10
milestones: [200, 350, 450]
max_epoch: 500
net: stackedHGnet_v1
nstack: 4
optimizer: adam
learn_rate: 0.001
momentum: 0.01
weight_decay: 1e-05
nesterov: False
scheduler: MultiStepLR
gamma: 0.1
loss_weights: [0.125, 1.25, 1.25, 0.25, 2.5, 2.5, 0.5, 5.0, 5.0, 1.0, 10.0, 10.0]
criterions: ['STARLoss_v2', 'AWingLoss', 'AWingLoss', 'STARLoss_v2', 'AWingLoss', 'AWingLoss', 'STARLoss_v2', 'AWingLoss', 'AWingLoss', 'STARLoss_v2', 'AWingLoss', 'AWingLoss']
metrics: ['NME', None, None, 'NME', None, None, 'NME', None, None, 'NME', None, None]
key_metric_index: 9
classes_num: [98, 9, 98]
label_num: 12
ema: True
use_AAM: True
writer: <tensorboardX.writer.SummaryWriter object at 0x7ff5ef0a76d0>
logger: <RootLogger root (NOTSET)>
data_definition: WFLW
test_file: test.tsv
aug_prob: 1.0
val_epoch: 1
valset: test.tsv
norm_type: default
encoder_type: default
decoder_type: default
betas: [0.9, 0.999]
train_num_workers: 16
val_num_workers: 16
test_num_workers: 0
add_coord: True
star_w: 1
star_dist: smoothl1
edge_info: ((False, (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32)), (True, (33, 34, 35, 36, 37, 38, 39, 40, 41)), (True, (42, 43, 44, 45, 46, 47, 48, 49, 50)), (False, (51, 52, 53, 54)), (False, (55, 56, 57, 58, 59)), (True, (60, 61, 62, 63, 64, 65, 66, 67)), (True, (68, 69, 70, 71, 72, 73, 74, 75)), (True, (76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87)), (True, (88, 89, 90, 91, 92, 93, 94, 95)))
nme_left_index: 60
nme_right_index: 72
crop_op: True

However, the accuracy (NME) obtained by my training is only 4.08. Do you know what went wrong?


face landmark detection without dlib

Hi @ZhenglinZhou, thank you for the contribution. I found that demo.py uses dlib, and I just wonder if I could do the landmark detection without dlib. Is it possible to get the landmarks with STAR only?
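
A hedged sketch of one option: STAR regresses landmarks inside a face crop, so it still needs a face box from somewhere, but any detector can supply it. For example, swapping dlib for OpenCV's bundled Haar cascade (accuracy will differ from dlib; feeding face_box into the crop/preprocess step is assumed to follow demo.py's flow):

import cv2

image = cv2.imread("face.jpg")
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
    face_box = (x, y, x + w, y + h)    # hand this box to the STAR crop/preprocess step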

test my own image

Hello, how can I test on my own images, and how can I get the TSV file? Looking forward to your reply, thanks!

Confidence Score of Detected Facial Landmarks

Is there any way to get a confidence score for the detected landmarks?
In face detection we get a confidence score that indicates how reliable the detection is; is something similar possible with the STAR loss model?
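
Not something the repo documents, but a common heuristic (an assumption, not the authors' method) is to read the peak value of each landmark's heatmap as a confidence proxy; occluded or uncertain points tend to produce flatter maps:

# heatmap: the heatmap output of net(input_tensor), shape (batch, n_points, H, W)
conf = heatmap.amax(dim=(-2, -1))     # per-landmark peak value, shape (batch, n_points)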
