codeformer's Introduction

Hi there 👋

  • 👨🏼‍💻 I am a Ph.D. student at MMLab@NTU, Nanyang Technological University (NTU)
  • 🔭 I’m currently working on image/video restoration, enhancement, and editing ...
  • 🚀 Most of my projects are open-sourced at GitHub
  • 🏠 How to reach me: my homepage
  • 📖 Check my publications: google scholar

codeformer's People

Contributors

chenxwh, sczhou

codeformer's Issues

Background upscale isn't working / Real-ESRGAN ignored?

Hello, Thank you for this great project! 💙

I'm running this on Windows 10 with Anaconda. Installation was very easy and simple to follow thanks to your step-by-step instructions; I appreciate it.

Problem Description:

I've added the argument: --bg_upsampler realesrgan
But it seems to be ignored: only the face is upscaled, not the background, and I get this warning:

inference_codeformer.py:22: RuntimeWarning: The unoptimized RealESRGAN is slow on CPU. We do not use it. If you really want to use it, please modify the corresponding codes.
  warnings.warn('The unoptimized RealESRGAN is slow on CPU. We do not use it. '
Face detection model: retinaface_resnet50
Background upsampling: False, Face upsampling: False
Processing: 5a.png
        detect 1 faces

All results are saved in results/_SOURCE__0.7

Since I'm not a programmer I don't know how to fix or mess with code in general,
Can you please tell me how to make it work?

Thanks ahead!
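
The warning in the log suggests that the script found no CUDA device and therefore dropped the background upsampler, which would explain the line "Background upsampling: False". A minimal sketch of that kind of guard follows. The helper name select_bg_upsampler is hypothetical, and the boolean argument stands in for torch.cuda.is_available(); this is not the repository's actual code.

```python
import warnings


def select_bg_upsampler(requested, cuda_available):
    """Return the upsampler name to use, or None when it is disabled.

    `requested` mirrors the value passed to --bg_upsampler;
    `cuda_available` stands in for torch.cuda.is_available().
    Hypothetical helper for illustration only.
    """
    if requested != "realesrgan":
        return None  # no background upsampling requested
    if not cuda_available:
        # Mirrors the behaviour reported in the log: warn and disable.
        warnings.warn(
            "The unoptimized RealESRGAN is slow on CPU. We do not use it.",
            RuntimeWarning,
        )
        return None
    return "realesrgan"
```

Under this reading, a CPU-only PyTorch build makes the flag a no-op, so a reasonable first check is whether torch.cuda.is_available() prints True inside the same Anaconda environment.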

Code Release

Very interesting work! When do you plan to make the code available?

Sharpness of the restored face

Is there a way for the fixed face to be sharper in the results? As you can see in the fixed result, the transition to the sharp hair on the top of her head is pretty harsh and the overall sharpness of the face is greatly reduced compared to the sharpness of the input image.

Input:

SinCity_Mulan_full_body_Disney_princess_from_Mulan_character_be_727fc4b2-be79-44be-9b3c-f7cd8f1877e5

Fixed with CodeFormer.
This is with 0.5 fidelity and with background upscale on:

SinCity_Mulan_full_body_Disney_princess_from_Mulan_character_be_727fc4b2-be79-44be-9b3c-f7cd8f1877e5

Sharp area (top) going to unsharp area below (face fixed):

Screenshot at Sept 02 08-48-06

Thanks for looking into this :)

realesrgan doesn't appear to be functioning

Hi, this tool is absolutely amazing. I'm not using it for its intended purpose per se, but it produces incredible results regardless.

I'm interested in the new option for BG upscaling via realesrgan. However, when I use the flag --bg_upsampler realesrgan, the results are identical to those without the flag, and the processing time is suspiciously similar too.

RealESRGAN_x2plus.pth is located in the appropriate folder too: ./weights/realesrgan/RealESRGAN_x2plus.pth

Is it possible that I'm missing something?

Testing Datasets

Hi.
Thanks for the awesome work.

How could I get the testing datasets (e.g., CelebA-Test, WIDER-Test) used in your paper?

Black and white photos become colourized

Hi.

Do you think it would be possible to detect whether an input image is black and white and, if so, desaturate the combined result?

I imagine it would be more difficult to achieve this for sepia images though.

Thank you for your work.
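
One way the request could be approached is sketched below, under stated assumptions: pixels are plain (r, g, b) tuples in the range 0..255, the tolerance of 8 is an arbitrary allowance for compression noise, and all function names are hypothetical. In practice the same logic would run on NumPy or OpenCV arrays. A sepia image has unequal channels, so this check would not catch it, which matches the caveat above.

```python
def is_grayscale(pixels, tolerance=8):
    """True when every (r, g, b) pixel has nearly equal channels.

    `pixels` is an iterable of (r, g, b) tuples in 0..255; `tolerance`
    is an assumed threshold that allows for JPEG noise.
    """
    return all(max(p) - min(p) <= tolerance for p in pixels)


def desaturate(pixels):
    """Replace each pixel with its luma (ITU-R BT.601 weights)."""
    out = []
    for r, g, b in pixels:
        y = round(0.299 * r + 0.587 * g + 0.114 * b)
        out.append((y, y, y))
    return out


def match_input_colour(input_pixels, restored_pixels):
    """Desaturate the restored image only if the input was grayscale."""
    if is_grayscale(input_pixels):
        return desaturate(restored_pixels)
    return list(restored_pixels)
```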

FID

Is the FID a non-reference indicator and how did you calculate it?
(The FID I know is calculated with a reference image)

Inpainting?

I've been amazed by the efficacy of CodeFormer, even for difficult, complex, or other-than-use-case pictures. However, I've never been able to get inpainting to work. Is there a switch that isn't listed in the documentation? Is there a particular masking method (transparent regions, white regions, certain file types, etc.) that is required for inpainting to function? Does it require a certain masked/unmasked ratio? I'm afraid I'm just too dense to quite figure out what I'm missing.

metrics

In your article, you wrote:
For the evaluation on real-world datasets without ground truth, we employ the widely-used non-reference perceptual metrics: FID and NIQE.
Is FID a non-reference indicator?
I have some trouble understanding this sentence. Thanks!
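
For context, FID as it is commonly defined compares the statistics of two sets of images rather than image pairs. Inception features of a reference set r and of the restored set g are each summarised by a mean and a covariance, and the distance between the two Gaussians is reported:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

In this form no per-image ground truth is needed, only some collection of high-quality images to supply the reference statistics. Which reference set the paper used is not stated in this thread.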

Results are not saved to disk post-inference

inference_codeformer.py runs without errors, but at the end, when it displays the message "All results are saved in results", the results directory is empty. No error, just an empty directory. Have you run into this issue before?

Improving the sharpness of facial output

Hey @sczhou
I was trying CodeFormer on some test images, and I realised it does not produce really sharp faces as output (samples attached). If we increase the size of the image that goes into the model from 512x512 to 1024x1024, will this help in achieving sharper faces? Or what is the way to achieve really clear facial features (such as eyes, nose, eyebrows)?

--CodeFormer output

codeformer

--Expected sort of output

expected_output

--original Image

original

Please let me know how to proceed forward with this. Thanks

How do I modify the default configuration?

Hello, I would like to ask how to change the network structure. I want to replace the RRDBNet of realesrgan with SRVGGNet, that is, replace the realesrgan model used to restore the background. How do I do this? I can't find a specific configuration file.

PyTorch nightly error

On PyTorch nightly I get the error:

ValueError: invalid literal for int() with base 10: 'dev20220909'

Looks like torch.__version__ contains a dev20220909 component and IS_HIGH_VERSION is expecting a semver?
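
A more tolerant way to compare versions is sketched below: it reads only the leading numeric components and ignores any suffix. The function names, the example nightly string "1.13.0.dev20220909", and the minimum of (1, 12, 0) are assumptions for illustration, not the repository's code.

```python
import re


def parse_release(version):
    """Extract the leading numeric (major, minor, patch) from a version string.

    Handles strings such as '1.12.1+cu113' and nightly builds such as
    '1.13.0.dev20220909', where a plain int() on each dot-separated part
    would raise ValueError on 'dev20220909'.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_high_version(version, minimum=(1, 12, 0)):
    """Tuple comparison against an assumed minimum release."""
    return parse_release(version) >= minimum
```

Comparing tuples of integers also avoids the string-comparison trap where "1.9" sorts after "1.12".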

Can't recognize RealESRGAN on GPU

How can I fix this? Thank you!

RuntimeWarning: The unoptimized RealESRGAN is slow on CPU. We do not use it. If you really want to use it, please modify the corresponding codes.
  warnings.warn('The unoptimized RealESRGAN is slow on CPU. We do not use it. '

Face Detection Help

Scenario: Having 512x512 image created by AI where the face only makes up 5-10% of the image.
It does not detect the face. So I'm cropping the face in Photoshop and reloading it to CodeFormer (It now detects the face), afterwards loading it to Photoshop again to blend it with the original image. It works well but lots of manual work.

Would it be possible to create some kind of interactive selection to guide/assist/help CodeFormer?
Imagine creating some kind of square selection over the image, now it crops automatically that area, uses that for the restoration. Then after the restoration it adds the cropped image on top of the original image again. + Upscaling if needed.
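
The manual crop, restore, and paste-back workflow described above could be automated roughly as follows. This is a sketch under assumptions: the image is a list of pixel rows, the box comes from the user's selection, and restore_fn is a placeholder for whatever restores the crop; none of these names exist in CodeFormer.

```python
def restore_region(image, box, restore_fn):
    """Crop `box` from `image`, restore the crop, and paste it back.

    `image` is a list of rows, each row a list of pixels;
    `box` is (left, top, right, bottom) with right/bottom exclusive;
    `restore_fn` is a placeholder for whatever restores the crop and
    must return a crop of the same size.
    """
    left, top, right, bottom = box
    crop = [row[left:right] for row in image[top:bottom]]
    restored = restore_fn(crop)
    if len(restored) != bottom - top or any(len(r) != right - left for r in restored):
        raise ValueError("restore_fn must preserve the crop size")
    result = [list(row) for row in image]  # copy so the input is untouched
    for dy, row in enumerate(restored):
        result[top + dy][left:right] = row
    return result
```

If the restoration step upscales, the restored crop would have to be resized back to the box size (or the whole image upscaled by the same factor) before pasting, and blending the box edges would hide the seam.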

Colab to run inference on videos downloaded from YouTube

Hi, first of all, congratulations on the amazing work. I wanted to share with you a step-by-step tutorial I built on Colab here: https://github.com/machinelearnear/towards_robust_blind_face_restoration where I download a video from YT, split it into frames, run inference on them, and then reconstruct them as a restored video. The results are excellent, for example here: https://www.youtube.com/watch?v=ZZoB_iD-l4c&ab_channel=machinelearnear, although it loses some temporal coherence as I'm submitting frames one by one.

Thanks again for the great work.
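
The split, restore, and rejoin steps described above can be sketched as three command lines built in Python. This is not the linked notebook's code: the helper names and the %06d.png frame naming are assumptions, while the --w and --test_path options are the ones that appear in commands quoted elsewhere on this page.

```python
def split_command(video_path, frames_dir):
    """ffmpeg arguments that dump every frame as a numbered PNG."""
    return ["ffmpeg", "-i", video_path, f"{frames_dir}/%06d.png"]


def inference_command(frames_dir, w):
    """CodeFormer call that restores every frame in `frames_dir`."""
    return ["python", "inference_codeformer.py", "--w", str(w), "--test_path", frames_dir]


def join_command(frames_dir, fps, output_path):
    """ffmpeg arguments that re-encode numbered PNGs into an H.264 video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),
        "-i", f"{frames_dir}/%06d.png",
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        output_path,
    ]
```

Each list can be passed to subprocess.run(cmd, check=True). Because every frame is restored independently, nothing in this pipeline enforces consistency between neighbouring frames, which is the loss of temporal coherence mentioned above.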

RuntimeWarning: The unoptimized RealESRGAN is slow on CPU

When running codeformer I get this error. Not sure how to fix it.

inference_codeformer.py:22: RuntimeWarning: The unoptimized RealESRGAN is slow on CPU. We do not use it. If you really want to use it, please modify the corresponding codes.
warnings.warn('The unoptimized RealESRGAN is slow on CPU. We do not use it. '

Weight initialization

I see that here there is some logic to initialize weights, but it only covers certain types of layers - in particular, it does not cover Conv2d. What kind of initialization do you use for other layers?
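
As background, a layer that no custom initializer touches keeps the framework default. For PyTorch's nn.Conv2d that default is a Kaiming-uniform draw with a = sqrt(5), which reduces to a uniform distribution on (-1/sqrt(fan_in), 1/sqrt(fan_in)). The sketch below only reproduces that arithmetic; whether CodeFormer relies on this default is not stated in this thread, and the function name is hypothetical.

```python
import math


def conv2d_default_bound(in_channels, kernel_size, groups=1):
    """Bound b such that PyTorch's default Conv2d init draws U(-b, b).

    nn.Conv2d.reset_parameters calls kaiming_uniform_ with a=sqrt(5),
    which works out to b = 1 / sqrt(fan_in).
    """
    k_h, k_w = kernel_size
    fan_in = (in_channels // groups) * k_h * k_w
    gain = math.sqrt(2.0 / (1.0 + 5.0))  # leaky_relu gain with a = sqrt(5)
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std  # equals 1 / sqrt(fan_in)
```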

About training stage III

Thanks for sharing your great work! I've trained CodeFormer on my task for stages I and II. When testing on real images, the overall performance is good except for some corner cases (e.g. large poses and heavy shadow). I understand this is due to my limited dataset (~10k images), and the skipped features in stage III are indeed needed to complement the limited coverage of the codebook. However, since I trained stages I and II with w=0, the training of stage III quickly breaks down when I change w to 1: the discriminator is immediately too strong. Also, I fixed the quantizer and decoder and changed the lr to 2e-5 in stage III. Can you give me some suggestions?

Some problems in environment configuration

First of all, the installation help you provided is very useful, but I still encountered some problems. Running on the lab was completely smooth, but on my Win 10 computer I installed torch 1.10.0+cu102 and torchvision 0.11.1+cu102, with CUDA 10.2 and an RTX 3060, which met your requirements. However, when I executed the command you give on the command line, the program still did not run normally after a long wait, and after about one night I got a bunch of noise pictures.

I think it's a problem with the torch version, but when I check the "torch.cuda.is_available()", the reply is True.

Can you tell me how to solve this problem?

[Feature Suggestion] - Output Control

I noticed that by default I can choose the INPUT directory but not the output.

Explanation how it works with the current version:

At the moment the output directory is set to "results \ Name of chosen input directory \ Weight Number", with 3 subdirectories for: Final result, Cropped faces and Restored faces.
For Example, to check out the Final results I go to:
root master project\results\_SOURCE_1.0\ final_results\

In my case I just created batch files to make it easy to test,
For each weight to experiment, I use 0.7 and 1.0 for now.
Each one of them has a scale of x1, x2, x4 and so on (that's a lot of batch files, but it's easier since there is no GUI and I'm not a programmer). I use them for quick testing so it's faster.

Whenever I change the weight, I will get ANOTHER directory.


Suggestion for Output Control:

The current default output is really nice to keep all the other versions of the images, I wouldn't replace it but I would go for extra control.

But In most cases I just want to get the FINAL result and change the prefix of the output files, probably others will also like that as an extra option.

I can only compare to the way I use it with GFPGAN as example, this could be a very nice addition as extra control probably via more argument / commands.

EXAMPLE using GFPGAN arguments:
-i ./_SOURCE_ -o ./_RESULTS_ -v 1.2 -s 2 --suffix " (v1.2) - x2"

The result will be:
Output directory is "RESULTS" inside the root of the master project.
"Image Name (model version) - scale size" = "Image (v1.2) - x2.png"

In CodeFormer we can do the same with, instead of Model Version the Weight Number, and the scaling instead of me doing it manually, could be also an argument.
The end _prefix is a good control just in case like I did in the example on GFPGAN outputs.

That would give me a single output, the FINAL version only, with specific details such as the weight number and scale used, in just one directory, since I don't need the Cropped and Restored versions. Everything would be much cleaner and easier to compare.


Conclusion:

Just to be clear I'm not a programmer so I can't do this by myself, but maybe it is not very complex to add in the future as it's just adding: Output Path control + End Prefix text.

Sorry about my bad English, I hope my suggestion is clear enough.
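
The suggestion could look roughly like the sketch below. The -o/--output_path and --suffix options are hypothetical and modelled on the GFPGAN example above, and the default values are assumptions. The fallback directory follows the results/<input directory>_<weight> layout seen in logs elsewhere on this page.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--test_path", default="./inputs/whole_imgs")
    parser.add_argument("--w", type=float, default=0.5)
    # The two options below are hypothetical, modelled on the GFPGAN example.
    parser.add_argument("-o", "--output_path", default=None)
    parser.add_argument("--suffix", default="")
    return parser


def result_dir(args):
    """Use --output_path when given, else the results/<input>_<w> layout."""
    if args.output_path:
        return args.output_path
    name = os.path.basename(os.path.normpath(args.test_path))
    return os.path.join("results", f"{name}_{args.w}")


def output_name(filename, suffix):
    """Insert the suffix between the stem and a .png extension."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    return f"{stem}{suffix}.png"
```

With these two options, a call such as --test_path ./_SOURCE_ -o ./_RESULTS_ --suffix " (w0.7) - x2" would write "Image (w0.7) - x2.png" straight into _RESULTS_, while omitting them keeps today's per-weight directories.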

Python errors on Colab

I'm getting this error

  File "<ipython-input-3-f08a15d58857>", line 7
    if $BACKGROUND_ENHANCE:
       ^
SyntaxError: invalid syntax

If I remove the code regarding background enhance, I'm able to run it, but then get the following error:

<ipython-input-5-07a9491e9358> in <module>
     10   basename = os.path.splitext(os.path.basename(input_path))[0]
     11   output_path = os.path.join(result_folder, basename+'.png')
---> 12   img_output = imread(output_path)
     13   display(img_input, img_output)

<ipython-input-1-6275b88f3c05> in imread(img_path)
     30 def imread(img_path):
     31   img = cv2.imread(img_path)
---> 32   img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
     33   return img

error: OpenCV(4.6.0) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'
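
The second error is what OpenCV reports when cvtColor receives an empty image: cv2.imread returns None instead of raising when the file is missing or unreadable, so the real problem is most likely that no output file was written at output_path. A small guard makes that visible earlier. In this sketch the function name is hypothetical and reader stands in for cv2.imread, so the example runs without OpenCV.

```python
import os


def read_image_or_fail(img_path, reader):
    """Read an image and raise a clear error instead of passing None on.

    `reader` stands in for cv2.imread, which returns None (rather than
    raising) when the file is missing or unreadable.
    """
    if not os.path.isfile(img_path):
        raise FileNotFoundError(f"no output image at {img_path!r}")
    img = reader(img_path)
    if img is None:
        raise ValueError(f"could not decode {img_path!r}")
    return img
```

As for the first error, a name written as $BACKGROUND_ENHANCE is only substituted by IPython inside shell lines that start with "!"; in ordinary Python code the dollar sign is a syntax error, so the condition would need to refer to the Python variable BACKGROUND_ENHANCE directly.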

I can't improve the photo

IMG_20220905_161424_678
IMG_20220905_161427_882
IMG_20220905_161443_853

Hi, sometimes there are photos that the algorithm can't handle.
Photo 1: the original. Photo 2: your algorithm with w = 1.0. Photo 3: GFPGAN 1.3.
I tried different W settings, I could not get an acceptable result, try it yourself.

Another difficult photo that no algorithm could make
IMG_20220905_162203_277

Could you try to improve these photos, or is the algorithm simply not able to handle them? Thanks

TypeError: Cannot read properties of undefined (reading '_uploadFiles')

Step 4 on Google Colab

MessageError                              Traceback (most recent call last)
<ipython-input-7-dd9625c8d7bb> in <module>()
     10 os.mkdir(upload_folder)
     11 
---> 12 uploaded = files.upload()
     13 for filename in uploaded.keys():
     14   dst_path = os.path.join(upload_folder, filename)

3 frames
/usr/local/lib/python3.7/dist-packages/google/colab/files.py in upload()
     39   """
     40 
---> 41   uploaded_files = _upload_files(multiple=True)
     42   # Mapping from original filename to filename as saved locally.
     43   local_filenames = dict()

/usr/local/lib/python3.7/dist-packages/google/colab/files.py in _upload_files(multiple)
    116   result = _output.eval_js(
    117       'google.colab._files._uploadFiles("{input_id}", "{output_id}")'.format(
--> 118           input_id=input_id, output_id=output_id))
    119   files = _collections.defaultdict(bytes)
    120 

/usr/local/lib/python3.7/dist-packages/google/colab/output/_js.py in eval_js(script, ignore_result, timeout_sec)
     38   if ignore_result:
     39     return
---> 40   return _message.read_reply_from_input(request_id, timeout_sec)
     41 
     42 

/usr/local/lib/python3.7/dist-packages/google/colab/_message.py in read_reply_from_input(message_id, timeout_sec)
    100         reply.get('colab_msg_id') == message_id):
    101       if 'error' in reply:
--> 102         raise MessageError(reply['error'])
    103       return reply.get('data', None)
    104 

MessageError: TypeError: Cannot read properties of undefined (reading '_uploadFiles')

No module named 'gdown' error

(codeformer) c:\CodeFormer>python inference_codeformer.py --w 0.7 --test_path \inputs\whole_imgs
Traceback (most recent call last):
  File "inference_codeformer.py", line 10, in <module>
    from facelib.utils.face_restoration_helper import FaceRestoreHelper
  File "c:\CodeFormer\facelib\utils\__init__.py", line 2, in <module>
    from .misc import img2tensor, load_file_from_url, download_pretrained_models, scandir
  File "c:\CodeFormer\facelib\utils\misc.py", line 7, in <module>
    import gdown
ModuleNotFoundError: No module named 'gdown'
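
This traceback means the gdown package is not installed in the active (codeformer) environment; installing it there, for example with pip install gdown, or re-running the project's dependency installation step, would normally resolve it. A quick way to see which imports are still missing before launching the script is sketched below; the function name is hypothetical and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]
```

For example, missing_modules(["gdown", "cv2", "torch"]) lists whichever of those three cannot be imported from the current interpreter.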

training code

Hi, I'm very interested in your great work! I want to know when will the training code be released?

Discriminator used in training

Hi @sczhou , first let me thank you for your impressive work!

While reading the paper, I noticed that there are no details about the kind of discriminator that you used, it is just mentioned that you used an adversarial loss without further information. Could you share some information about this?

Applying restoration to the whole image

Is there a way to modify CodeFormer to apply the face restoration to the whole image? Face restoration did a superb job with hair texture, but it's very jarring for long-haired subjects when half their hair is blurred below their face. I can mask everything else with Photoshop.

The training code

Hi, the paper has been accepted by NeurIPS 2022, so when will the training code be released?
Waiting in hope.

about eq.6 in the paper

Hi, thank you for your excellent work.
I have a question when I read your paper.
the latter part of eq.6 is
[screenshot of the latter part of eq.6]
but you said that this loss is used to force the LQ feature Zl to approach the quantized feature Zc.
I cannot understand the relationship between Zh in eq.6 and Zl.
Could you explain how to get Zh in eq.6 or the definition of Zh?
Thank you.

No Upscale on Colab

Hi would it be possible to please add an Upscale feature on the Codeformer Google Colab? The Demo on Hugging Face and Replicate have one but not on Colab. Would love one on Colab if possible. I've tried to add one myself but I'm an idiot. Thanks in advance :)

Identity loss is noticeable

Hey, is there a chance that this is going to be improved in the model? I can see it changing eye colours too much, from dark ones to blue, if the low-res image is a bit noisy in the eye area, for example. Faces look good and realistic, but something like GPEN or GFPGAN has better identity preservation on the same face.
It's not like all upscaled faces lose identity, but a lot of them do.

Results not really good on close portraits

Hey @sczhou Thank you so much for this repo
I have been waiting for the training code, but you have said that it is delayed for some time. CodeFormer performs really well in face restoration for images taken at a distance, but fails badly for selfies and especially for people with glasses. It adds double eyebrows, etc. So if you cannot release the training code right now, then please train CodeFormer on more data to make it robust on real faces, so that it can be used for face restoration on general images.

Use of regularization

Hi.

Thanks for the amazing work!
I have not seen anything regarding the use of regularization like L2 while developing the model. Do you use it during training? Or did you try it and then discard the idea?

CUDNN_STATUS_NOT_SUPPORTED on inference colab

I get this error :

Processing: 00.jpg
Traceback (most recent call last):
  File "inference_codeformer.py", line 83, in <module>
    img_path, upsample_num_times=args.upsample_num_times, only_keep_largest=args.only_keep_largest)
  File "/content/CodeFormer/basicsr/utils/face_util.py", line 70, in detect_faces
    det_faces = self.face_detector(self.input_img, upsample_num_times)
RuntimeError: Error while calling cudnnConvolutionBiasActivationForward( context(), &alpha1, descriptor(data), data.device(), (const cudnnFilterDescriptor_t)filter_handle, filters.device(), (const cudnnConvolutionDescriptor_t)conv_handle, (cudnnConvolutionFwdAlgo_t)forward_algo, forward_workspace, forward_workspace_size_in_bytes, &alpha2, out_desc, out, descriptor(biases), biases.device(), identity_activation_descriptor(), out_desc, out) in file /tmp/pip-install-uwm3abyj/dlib_961ff087af364da4874d302ad406612d/dlib/cuda/cudnn_dlibapi.cpp:1237. code: 9, reason: CUDNN_STATUS_NOT_SUPPORTED

Thanks for the Project

First of all thank you for the research and sharing your work.

While the project is really cool, I have a feature request.
Previous networks, both GFPGAN and GPEN, support background enhancement with "tile processing" (so the GPU doesn't run out of memory).

I liked the face enhancement model but the background enhancement is lacking for the whole images. Is this something we shall expect to be implemented in the future for a robust full image enhancement solution? It is very important both for final image quality and true comparison between projects.

I am hoping there will be an update about this for the repo in future releases.
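
For readers unfamiliar with the idea, tile processing splits a large image into fixed-size tiles, enhances each tile separately so peak GPU memory depends on the tile size rather than the image size, and stitches the results. A minimal sketch of the box arithmetic follows; the function name and parameters are hypothetical and not taken from GFPGAN, GPEN, or CodeFormer.

```python
def tile_boxes(width, height, tile, pad):
    """Yield (core, padded) boxes covering a width x height image.

    Each box is (left, top, right, bottom), right/bottom exclusive.
    `core` regions partition the image; `padded` extends each core by
    `pad` pixels (clamped to the image) so seams can be cropped away.
    """
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            padded = (
                max(left - pad, 0),
                max(top - pad, 0),
                min(right + pad, width),
                min(bottom + pad, height),
            )
            yield (left, top, right, bottom), padded
```

The enhancer would run on each padded box, and only the part corresponding to the core box would be copied into the output, which hides border artefacts at tile seams.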
