FaRL's Issues

Missing file

list_eval_partition.txt is missing when training with CelebAMask-HQ.

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation!!

I tried to train face parsing using the following command, but I got an error:

python -m blueprint.run
farl/experiments/face_parsing/train_lapa_farl-b-ep16_448_refinebb.yaml
--exp_name farl --blob_root ./blob

====== RUNNING farl/experiments/face_parsing/train_lapa_farl-b-ep16_448_refinebb.yaml ======
blueprint: Parsing farl/experiments/face_parsing/train_lapa_farl-b-ep16_448_refinebb.yaml
DistributedGPURun: init_process_group: 0/1
blueprint: Parsing farl/experiments/face_parsing/./trainers/lapa_farl.yaml
blueprint: Parsing farl/experiments/face_parsing/./trainers/../augmenters/lapa/train.yaml
blueprint: Parsing farl/experiments/face_parsing/./trainers/../augmenters/lapa/test.yaml
blueprint: Parsing farl/experiments/face_parsing/./trainers/../augmenters/lapa/test_post.yaml
blueprint: Parsing farl/experiments/face_parsing/./trainers/../networks/farl.yaml
blueprint: Parsing farl/experiments/face_parsing/./trainers/../scorers/lapa.yaml
blueprint: Parsing farl/experiments/face_parsing/./trainers/../optimizers/refine_backbone.yaml
Mon Apr 25 11:21:11 2022 - farl_0 - outputs_dir: ./blob/outputs/farl/face_parsing.train_lapa_farl-b-ep16_448_refinebb
Mon Apr 25 11:21:11 2022 - farl_0 - states_dir: ./blob/states/farl/face_parsing.train_lapa_farl-b-ep16_448_refinebb
Mon Apr 25 11:21:11 2022 - farl_0 - locating the latest loadable state ...
Mon Apr 25 11:21:11 2022 - farl_0 - no valid state files found in ./blob/states/farl/face_parsing.train_lapa_farl-b-ep16_448_refinebb
Mon Apr 25 11:21:11 2022 - farl_0 - There will be 6056 training steps in this epoch.
loss=2.4654557704925537
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.7/dist-packages/blueprint/run.py", line 69, in
_main()
File "/usr/local/lib/python3.7/dist-packages/blueprint/run.py", line 65, in _main
runnable()
File "/usr/local/lib/python3.7/dist-packages/blueprint/ml/distributed.py", line 123, in call
_single_thread_run, args=(num_gpus, self), nprocs=num_gpus, join=True)
File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/usr/local/lib/python3.7/dist-packages/blueprint/ml/distributed.py", line 68, in _single_thread_run
local_run()
File "/usr/local/lib/python3.7/dist-packages/blueprint/ml/trainer.py", line 194, in call
self._backward(loss)
File "/usr/local/lib/python3.7/dist-packages/blueprint/ml/trainer.py", line 120, in _backward
loss.backward()
File "/usr/local/lib/python3.7/dist-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/usr/local/lib/python3.7/dist-packages/torch/autograd/init.py", line 156, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3, 768, 28, 28]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I'm running your project on Colab, and I changed the batch size to 3.
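
A sketch of one way to track this down (the layers below are illustrative guesses, not FaRL's actual modules): enable autograd anomaly detection so PyTorch reports which forward op produced the tensor that was later overwritten, then replace that in-place op with its out-of-place form.

# Hedged sketch: locate the in-place op, then swap it for an out-of-place version.
import torch
import torch.nn as nn

torch.autograd.set_detect_anomaly(True)  # prints the forward op whose output was modified in place

relu = nn.ReLU(inplace=False)            # e.g. instead of nn.ReLU(inplace=True)

def block(x, residual):
    out = relu(x)
    out = out + residual                 # instead of out += residual on a tensor needed for backward
    return out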

Some questions about the application

Hello, I'm a newcomer to facial representation learning. I want to see the results of your trained model, for example face parsing: I want to input a picture and get an output with each facial region marked.
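
If you only need inference, the companion facer package wraps a released FaRL parsing model; below is a minimal sketch, where the detector/parser identifiers and the test.jpg path are my assumptions (check the facer README for the exact names).

# Hedged sketch: face parsing inference with the facer package.
import torch
import facer

device = 'cuda' if torch.cuda.is_available() else 'cpu'
image = facer.hwc2bchw(facer.read_hwc('test.jpg')).to(device)  # 1 x 3 x H x W, uint8

face_detector = facer.face_detector('retinaface/mobilenet', device=device)
face_parser = facer.face_parser('farl/lapa/448', device=device)

with torch.inference_mode():
    faces = face_detector(image)
    faces = face_parser(image, faces)

seg_probs = faces['seg']['logits'].softmax(dim=1)   # per-pixel class probabilities
facer.show_bchw(facer.draw_bchw(image, faces))      # draws each labelled facial region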

Face alignment task

Hello there, thanks for your contribution!
Would it be possible for you to add the landmark detection / face alignment task to this repo?
Thank you.

Demo!!

Hi,
thanks for your amazing results.
Could you please provide some demo code that produces the output for any input image?

Error downloading object

When cloning the repo with git, I encounter the following problem:

Cloning into 'FaRL'...
remote: Enumerating objects: 587, done.
remote: Counting objects: 100% (39/39), done.
remote: Compressing objects: 100% (37/37), done.
remote: Total 587 (delta 24), reused 4 (delta 2), pack-reused 548
Receiving objects: 100% (587/587), 556.46 KiB | 761.00 KiB/s, done.
Resolving deltas: 100% (302/302), done.
Downloading farl/network/ext/p2i_ops/sample.ipynb (2.3 KB)
Error downloading object: farl/network/ext/p2i_ops/sample.ipynb (f7d4c2c): Smudge error: Error downloading farl/network/ext/p2i_ops/sample.ipynb (f7d4c2c0c21613b6c6d6bad83a10723b21a3606c4156d39551d4adba13ef47e1): batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.

Errors logged to /Users/dongdengke/Desktop/FaRL/.git/lfs/logs/20230930T170447.302127.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: farl/network/ext/p2i_ops/sample.ipynb: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

How can I solve this?
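
The quota error only affects LFS-tracked files (here a sample notebook, not the code itself), so one workaround, assuming you mainly need the source, is to skip the LFS smudge step during clone and fetch the large files later with git lfs pull once bandwidth is available:

GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/microsoft/FaRL.git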

Error

raise RuntimeError("Distributed package doesn't have NCCL "
RuntimeError: Distributed package doesn't have NCCL built in
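
This typically means the installed PyTorch build has no NCCL support (for example on Windows or a CPU-only install). The usual workaround is to initialize the process group with the gloo backend instead of nccl; where exactly that plugs into blueprint's distributed runner is an assumption on my part, but the underlying call looks like this.

# Hedged sketch: fall back to gloo when NCCL is not built into this PyTorch install.
import os
import torch.distributed as dist

backend = 'nccl' if dist.is_nccl_available() else 'gloo'
dist.init_process_group(backend=backend, init_method='env://',
                        rank=int(os.environ.get('RANK', 0)),
                        world_size=int(os.environ.get('WORLD_SIZE', 1)))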

Some questions on training downstream face parsing tasks

Has anyone used the authors' pre-trained model to train a downstream face parsing task? I get "RuntimeError: CUDA error: device-side assert triggered" when using multi-GPU training. Has anyone encountered this? Also, I want to convert the .pth model to an ONNX model; has anyone done that?
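
For the device-side assert, running with CUDA_LAUNCH_BLOCKING=1 usually surfaces the actual failing kernel; a common cause is a label index outside the configured number of classes. For the ONNX part, here is a minimal sketch under the assumption that the parsing network is already available as a single module (the checkpoint file name and loading method below are placeholders, not the repo's actual API).

# Hedged sketch: export a loaded parsing network to ONNX.
import torch

model = torch.jit.load('face_parsing.jit.pt', map_location='cpu')  # placeholder checkpoint
model.eval()
dummy = torch.randn(1, 3, 448, 448)                                 # parsing models take 448x448 input
torch.onnx.export(model, dummy, 'face_parsing.onnx',
                  input_names=['image'], output_names=['logits'],
                  opset_version=13,
                  dynamic_axes={'image': {0: 'batch'}, 'logits': {0: 'batch'}})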

Question about input size

Hi, thank you very much for this great work. I see that the input to the CLIP model is 224x224, while the parsing and alignment models take 448x448 input. Would you please clarify this? Thank you.
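
For context, a ViT pre-trained at 224x224 (a 14x14 grid of 16-pixel patches) can accept 448x448 input at fine-tuning time by interpolating its positional embeddings to the 28x28 grid; whether the released configs do exactly this is my assumption, but the step itself looks like the sketch below (standard (1, 1+N, dim) layout with a leading [CLS] token).

# Hedged sketch: resize ViT positional embeddings from a 14x14 to a 28x28 patch grid.
import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed, old_grid=14, new_grid=28):
    cls_tok, patch_tok = pos_embed[:, :1], pos_embed[:, 1:]
    dim = patch_tok.shape[-1]
    patch_tok = patch_tok.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_tok = F.interpolate(patch_tok, size=(new_grid, new_grid),
                              mode='bicubic', align_corners=False)
    patch_tok = patch_tok.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_tok, patch_tok], dim=1)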

LAION-FACE Dataset

Hi,
I appreciate your work.

  1. Are you planning on releasing the LAION-Face subset (metadata)?
  2. You demonstrated that pre-training on LAION-Face improves three downstream tasks; have you also benchmarked the face recognition task?

Downloading pre-trained backbones leads to 404 Not Found

First of all, I would like to thank you for your fantastic project. It has been incredibly helpful and well-designed.

I've come across an issue while trying to access the pretrained backbone. The provided link results in a "404 Not Found" error. Could you please help me resolve this problem? It would be great if you could share an updated link or guide me on how to access the correct resource.

https://github.com/microsoft/FaRL#pre-trained-backbones

Weird time behaviour for face parsing

I believe there is something about the JIT (Just-In-Time) load causing some unusual behavior. The first batch takes around 2 seconds, the second batch takes around 20 seconds, but the third batch and subsequent batches only take 0.1 seconds.

Is there any information available about this issue?
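
The first calls usually pay one-off costs (JIT compilation of the custom ops, cuDNN autotuning, CUDA context setup), so timings are only meaningful after a few warm-up batches and with explicit synchronization. A small measuring sketch; the model and batch here are stand-ins, not the FaRL parsing pipeline.

# Hedged sketch: time steady-state batches, excluding warm-up and async CUDA work.
import time
import torch

model = torch.nn.Conv2d(3, 16, 3).cuda()              # stand-in model
batch = torch.randn(8, 3, 448, 448, device='cuda')    # stand-in input

with torch.inference_mode():
    for _ in range(3):                                 # warm-up: JIT / autotune / context setup
        model(batch)
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(10):
        model(batch)
    torch.cuda.synchronize()                           # flush queued CUDA work before reading the clock
    print((time.time() - t0) / 10, 'seconds per batch')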

Text encoder

Thank you for your awesome work. Do you have plans to release the pretrained text encoder?

How to use the trained model in facer?

Hi, thanks for your work.
We trained your model on our own dataset, which contains the LaPa dataset plus new images, including faces wearing masks.
After training we got checkpoints like these in the blob/states/farl/face_parsing.train_lapa_farl-b-ep64_448_refinebb directory:
2.5G 10902_2.pth
2.5G 21804_4.pth
2.5G 32706_6.pth
2.5G 43608_8.pth
2.5G 54510_10.pth
2.5G 65412_12.pth
2.5G 76314_14.pth
2.5G 87216_16.pth
4.0K _records.pth
Each of these contains the model weights, the optimizer state, etc.

The problem is: how do we use these weights in the facer project?
I checked facer's loading process, and it uses a TorchScript model.
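
One possible bridge, with heavy assumptions: inspect the training .pth to find the entry holding the network weights (the 'model' key and the builder function below are guesses), rebuild the parsing network the same way the training yaml does, load the weights, and trace it to TorchScript so facer can consume it.

# Hedged sketch: turn a blueprint training checkpoint into a TorchScript file for facer.
import torch

ckpt = torch.load('blob/states/farl/face_parsing.train_lapa_farl-b-ep64_448_refinebb/87216_16.pth',
                  map_location='cpu')
print(ckpt.keys())                          # find which entry holds the network weights

model = build_parsing_network()             # hypothetical: assemble the network as in the training yaml
model.load_state_dict(ckpt['model'])        # 'model' is an assumed key name
model.eval()
scripted = torch.jit.trace(model, torch.randn(1, 3, 448, 448))
scripted.save('face_parsing_custom.jit.pt') # point facer at this file instead of the released one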

Did you use the pretrained codebook provided by OpenAI in the "FaRL" paper?

Dear Authors,

I am Deyu Zhou, a Ph.D. student from HKUST (GZ). I am very interested in your work "FaRL". Well done!
I am curious about the codebook you used for masked image modelling.
I notice that DALL-E's codebook produces 1,024 tokens given an image as input.
However, I note that your work uses a sequence of length 196 (tokens) + 1 ([CLS] token).
So did you train the codebook from scratch yourselves, or did you use another pretrained codebook?
Or could you release the codebook on your GitHub?

Thanks,
Deyu
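
For reference on the arithmetic (these resolutions and strides are my assumptions, not a statement about the released setup): the number of dVAE tokens depends only on the image size fed to the tokenizer, which downsamples by 8x, so 1,024 tokens at DALL-E's native 256x256 and a 196-token grid are consistent with the same codebook; in BEiT-style masked image modelling the tokenizer input is simply resized so its grid matches the ViT's 14x14 patches.

# Token-grid arithmetic under the assumed 8x dVAE stride (illustrative only).
def dvae_tokens(size, stride=8):
    return (size // stride) ** 2

print(dvae_tokens(256))   # 1024 tokens at DALL-E's native 256x256
print(dvae_tokens(112))   # 196 tokens, matching ViT-B/16's 14x14 patch grid at 224x224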
