ryanxingql commented on June 2, 2024

You can combine the original U and V channels with the enhanced Y channel to form an RGB video frame.
Some useful code:

yuv420p -> yuv444p image
https://github.com/ryanxingql/pythonutils/blob/57c3624b1f89f40f4f81300f0bd0c575379bea9c/conversion.py#L172

yuv444p image -> BGR image for the cv2.imwrite function
https://github.com/ryanxingql/pythonutils/blob/57c3624b1f89f40f4f81300f0bd0c575379bea9c/conversion.py#L157
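
For readers following along, here is a minimal sketch of the first step: putting an enhanced Y plane back together with the original half-resolution U and V planes. It assumes planar uint8 arrays and nearest-neighbour chroma upsampling. The function name `yuv420p_to_yuv444p` is hypothetical, and the linked conversion.py may interpolate differently.

```python
import numpy as np

def yuv420p_to_yuv444p(y, u, v):
    """Stack Y with U and V upsampled to the Y resolution (nearest neighbour)."""
    h, w = y.shape
    # Each chroma sample covers a 2x2 block of luma samples in 4:2:0.
    u_up = np.repeat(np.repeat(u, 2, axis=0), 2, axis=1)[:h, :w]
    v_up = np.repeat(np.repeat(v, 2, axis=0), 2, axis=1)[:h, :w]
    return np.stack([y, u_up, v_up], axis=-1)  # shape (H, W, 3)

# Toy data standing in for a real frame (assumption: 4x6 luma, 2x3 chroma).
enhanced_y = np.full((4, 6), 120, dtype=np.uint8)
u = np.full((2, 3), 100, dtype=np.uint8)
v = np.full((2, 3), 150, dtype=np.uint8)
yuv444 = yuv420p_to_yuv444p(enhanced_y, u, v)
print(yuv444.shape, yuv444.dtype)
```

The cropping to `[:h, :w]` only matters when the luma plane has an odd height or width.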

from stdf-pytorch.

woaicv commented on June 2, 2024

Hi, thank you very much for the steps. I am new to this, and the image I converted has very strange colors; after a day I still have not figured out how to fix it. The strange image is attached below. I found that all the YUV channels read in with the import_yuv you provided are in the range 0-255, but the comments in ycbcr2rgb say the Y channel must be in 16-235. I clamped Y to that range, but the picture is still similarly weird. Then I used the ycbcr2rgb from the second link you gave, which requires Cb and Cr to be in 16-240. I clamped them to 16-240 as well, and the picture is still the same. Do you know why? Thank you very much.

woaicv commented on June 2, 2024

[Attached image: the converted frame with incorrect colors]

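ryanxingql commented on June 2, 2024
  1. About the range. According to BT.601, when converting from RGB to YCbCr, Y takes values from 16 to 235; however, the Y output by our program is not necessarily within this range. You can ignore the function comments and use these functions directly.
  2. The green image is not caused by the value range; my guess is that you got the channel order wrong. How do you write the image? With OpenCV, note that the C in HWC is ordered BGR; with other libraries, such as scikit-image, it may be different. I suggest checking the documentation of your image-writing function.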

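woaicv commented on June 2, 2024

Hello, thank you very much for your answer, and sorry to bother you again. I found that in conversion.py the function you use for YUV-to-RGB is the same one used for RGB-to-YUV: both are skimage.color.rgb2ycbcr. Shouldn't it be ycbcr2rgb? When I use ycbcr2rgb, the output image is black. I pass a uint8 image with range 0-255 into ycbcr2rgb, and the output values are roughly in the range -0.0024 to a bit over 0.011, which still seems very strange. I saw someone on GitHub asking about this in scikit-image; they did the same thing and got output in the range 0-1, and I don't know why mine differs. Also, I save the image with cv.imwrite, and I have already converted the channels to BGR. At this point it seems the conversion itself has a problem.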

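ryanxingql commented on June 2, 2024

Thank you very much for the correction! Yes, the utils in this repository have a problem;
please refer to the version in my pythonutils repository, which does not need scikit-image and uses matrix operations instead.

I still suggest following my steps: concatenate the enhanced Y with the unenhanced U and V, restore the yuv444p sampling format, convert to BGR, and then write with OpenCV.
Note that my functions require uint8 input, because the matrix mat is also written for data in the uint8 range; the program converts to float automatically.
https://github.com/ryanxingql/pythonutils/blob/57c3624b1f89f40f4f81300f0bd0c575379bea9c/conversion.py#L138

If you still run into problems, feel free to send me your data; I will process it and give you a script for reference.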
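
As an illustration of the matrix-based approach described above, the sketch below converts a uint8 yuv444p image to BGR without scikit-image. It assumes the common 8-bit BT.601 limited-range coefficients; the constants in the linked conversion.py are not quoted in this thread and may differ slightly. The name `yuv444p_to_bgr` is hypothetical.

```python
import numpy as np

# Assumed 8-bit BT.601 limited-range coefficients (rows produce Y, Cb, Cr).
RGB_TO_YCBCR = np.array([
    [ 65.481, 128.553,  24.966],
    [-37.797, -74.203, 112.000],
    [112.000, -93.786, -18.214],
]) / 255.0
OFFSET = np.array([16.0, 128.0, 128.0])
# The inverse transform uses the inverse of the forward matrix.
YCBCR_TO_RGB = np.linalg.inv(RGB_TO_YCBCR)

def yuv444p_to_bgr(yuv):
    """Convert an (H, W, 3) uint8 YCbCr image to uint8 BGR for cv2.imwrite."""
    if yuv.dtype != np.uint8:
        raise TypeError("expected uint8 input in the 0-255 range")
    rgb = (yuv.astype(np.float64) - OFFSET) @ YCBCR_TO_RGB.T
    rgb = np.clip(np.round(rgb), 0, 255).astype(np.uint8)
    # Reverse the channel axis (RGB -> BGR), since OpenCV expects BGR order.
    return np.ascontiguousarray(rgb[..., ::-1])

white = np.array([[[235, 128, 128]]], dtype=np.uint8)
black = np.array([[[16, 128, 128]]], dtype=np.uint8)
print(yuv444p_to_bgr(white), yuv444p_to_bgr(black))
```

Skipping the final channel reversal before calling cv2.imwrite swaps red and blue, which is one way to end up with oddly tinted frames.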

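woaicv commented on June 2, 2024

Thank you very, very much! I will try again tomorrow. I am not very experienced and have been struggling with this for a long time. I may have more questions for you tomorrow; sorry for the trouble, and many thanks!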

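woaicv commented on June 2, 2024

Hello, thank you very much for your advice. I tried again today and only now discovered that there was another problem in the yuv420-to-yuv444 conversion: that function in the original repository is buggy. I then used the function from the new repository you pointed to, and the conversion works correctly. There is one more odd thing: for the YUV channels read in with import_yuv, the Y range of gt is 16-235, while the Y range of lq is 0-255. For consistency, I multiplied the enhanced result by 255 to bring it into 0-255 and then substituted it in. What puzzles me is that the Y ranges of gt and lq are inconsistent. The converted images look acceptable; could you check whether this is normal, or whether the mismatched data ranges could degrade the final visual quality? I picked one frame to show you, below: the first image is the ground truth, the second is the compressed (distorted) image, and the third is the restored image.
[Image: gt1331]
[Image: lq1331]
[Image: restore1331]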

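ryanxingql commented on June 2, 2024

Great, glad the problem is solved!

In the new repository, I implemented the conversion between the RGB and YCbCr color spaces with matrix operations, following the BT.601 standard.
In fact, the matrix of the inverse transform is simply the inverse of the forward transform matrix.
So although the Y of lq is not within 16-235, operating on it directly is fine, because the operation is just a multiplication by an inverse matrix. As I understand it, this step should not cause any significant problem.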
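
The relationship described above can be written out explicitly. The following assumes the common 8-bit BT.601 limited-range coefficients; the constants actually used in the pythonutils repository are not quoted in this thread and may differ slightly.

```latex
\begin{aligned}
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix}
&= \frac{1}{255}\, M \begin{bmatrix} R \\ G \\ B \end{bmatrix}
 + \begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix},
\qquad
M = \begin{bmatrix}
 65.481 & 128.553 & 24.966 \\
-37.797 & -74.203 & 112.000 \\
112.000 & -93.786 & -18.214
\end{bmatrix}, \\[4pt]
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
&= 255\, M^{-1} \left( \begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix}
 - \begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix} \right).
\end{aligned}
```

Because the inverse is an affine map, it is defined for any Y, not only for values in 16-235. A Y outside that range simply produces RGB values that may fall outside 0-255 and are clipped when cast back to uint8.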

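Judging from the images, the subjective quality looks normal.

One small suggestion, if you would like to try it: you could compute two kinds of PSNR:

  1. Use the enhanced Y directly for the RGB conversion, and compute the RGB PSNR against gt;
  2. First multiply the enhanced lq by 255, mapping it from 0-1 to 0-255; then set every value below 16 to 16 and every value above 235 to 235; then convert to RGB, and finally compute the RGB PSNR.

Compare the results; my guess is that option 1 gives the higher value?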
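
The two options differ only in how the network output is turned into 8-bit luma before conversion. A minimal sketch of that step and of the PSNR computation is given below. The helper names `to_uint8_y` and `psnr` are hypothetical, the toy arrays are made up, and the YCbCr-to-RGB conversion that would sit between the two steps is omitted for brevity.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def to_uint8_y(y01, limited_range=False):
    """Map a network output in [0, 1] to 8-bit luma; optionally clamp to 16..235."""
    y = np.round(np.clip(y01, 0.0, 1.0) * 255.0)
    if limited_range:
        y = np.clip(y, 16, 235)  # option 2: force the BT.601 luma range
    return y.astype(np.uint8)

# Toy example (assumption): three luma samples and a limited-range reference.
gt = np.array([16, 128, 235], dtype=np.uint8)
out = np.array([0.0, 0.5, 1.0])
print(psnr(to_uint8_y(out), gt))                      # option 1: no clamping
print(psnr(to_uint8_y(out, limited_range=True), gt))  # option 2: clamp to 16..235
```

On real data the comparison should be made on the converted RGB frames, as the suggestion says, not on the luma plane alone.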

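woaicv commented on June 2, 2024

Thank you very much! I just ran a test on four low-resolution videos from test_18; the results are attached below. The second processing method gives a tiny improvement for both the original and the enhanced videos, but the effect is indeed very small. Thank you; I can now settle on my testing procedure. I am a first-year master's student who has just switched to this area, so I still run into problems at the start; sorry for troubling you so much these past two days. Are you a student of Professor Xu Mai (徐迈)? My advisor mentioned Professor Xu to me before; his work in video compression is very strong. Thanks again!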

1.
BQSquare_416x240_600.yuv: [27.382] dB -> [28.361] dB
BasketballPass_416x240_500.yuv: [28.718] dB -> [29.466] dB
BlowingBubbles_416x240_500.yuv: [26.269] dB -> [26.777] dB
RaceHorses_416x240_300.yuv: [27.554] dB -> [28.101] dB

ori: [27.481] dB
ave: [28.176] dB
delta: [0.696] dB

2.
BQSquare_416x240_600.yuv: [27.422] dB -> [28.377] dB
BasketballPass_416x240_500.yuv: [28.719] dB -> [29.466] dB
BlowingBubbles_416x240_500.yuv: [26.270] dB -> [26.777] dB
RaceHorses_416x240_300.yuv: [27.562] dB -> [28.101] dB

ori: [27.493] dB
ave: [28.180] dB
delta: [0.687] dB

ryanxingql commented on June 2, 2024

Thank you for running the experiment!

Haha, yes, I am a second-year PhD student of Professor Xu Mai. Feel free to contact me anytime with any questions~

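woaicv commented on June 2, 2024

Hello, sorry to bother you. I have just started in this area and have run into some questions recently. Over the past two months I have tried stacking some models together; at QP37 the PSNR is 0.06 dB higher than RFDA, but my computational cost is very large. I would like to ask whether this is not very meaningful, and what compressed video restoration currently focuses on. No one in my group works on compression, and I do not know much about video compression myself, so I feel quite lost with no one to ask. If you could reply, I would be extremely grateful! 🙏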

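ryanxingql commented on June 2, 2024

Hello, regarding this direction, I think the following are worth pursuing:
1. Further improve video PSNR. This really requires GPU resources; anything involving video is resource-hungry. You can start from the characteristics of compressed video (see MFQE), incorporate encoder information, improve the network architecture, and so on. A gain of 0.06 dB is not enough. BasicVSR++ is a very good baseline model; its performance can reach over 1 dB.
2. Reduce resource consumption. Many practical scenarios cannot use expensive GPU resources and also have fairly strict speed requirements. For this, see RBQE.
3. Target more practical quality improvement. In many real-world applications, PSNR is not what people care about. Two metrics matter more: subjective quality and bitrate. The former affects user experience; the latter affects bandwidth.
4. Cross-cutting problems such as multi-task learning.
5. Target the more general video restoration task.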

woaicv commented on June 2, 2024

OK, thank you very much. I only have two 3090s here, so is the first point hard for me to pursue? BasicVSR++ must have a fairly high computational complexity, right?

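ryanxingql commented on June 2, 2024

Yes, pushing for state-of-the-art performance is hard going. The latest models take a week to train on eight 32 GB GPUs, and can only barely handle 2K resolution at test time.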

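woaicv commented on June 2, 2024

OK, thank you very much for your reply. By the latest models, do you mean the work from the NTIRE 2022 challenge? I saw that the entries in the competition all have quite high complexity, so training must be slow.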

ryanxingql commented on June 2, 2024

Yes, that's right.

woaicv commented on June 2, 2024

OK, thank you very much for your reply! 🙏

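LHL731 commented on June 2, 2024

I'm moved to tears. I have also been struggling with this problem for the past few days, with no one around to discuss it with. Seeing you two feels like seeing family.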
