vs-femasr's Issues

TensorRT 8.6.1

Since updating to the latest vs-dpir, vs-femasr doesn't work when using TensorRT.
Would be nice if you could fix it.
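When TensorRT versions get mixed up like this, a quick first check is which `tensorrt` Python package the environment actually resolves. A minimal diagnostic sketch (not part of vs-femasr; `trt_version` is a made-up helper name):

```python
# Hypothetical diagnostic: report the TensorRT Python package version,
# or None when the package cannot be imported at all.
import importlib.util

def trt_version():
    """Return the installed tensorrt version string, or None if absent."""
    if importlib.util.find_spec("tensorrt") is None:
        return None
    import tensorrt
    return tensorrt.__version__
```

Comparing this against the version the plugin was built for would confirm whether the breakage is a version mismatch.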

Strange highlights, and TensorRT not working

When using:

# Imports
import vapoursynth as vs
# getting Vapoursynth core
core = vs.core
import site
import os
import ctypes
# Adding torch dependencies to PATH
path = site.getsitepackages()[0]+'/torch_dependencies/'
ctypes.windll.kernel32.SetDllDirectoryW(path)
path = path.replace('\\', '/')
os.environ["PATH"] = path + os.pathsep + os.environ["PATH"]
# Loading Plugins
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/Support/fmtconv.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/SourceFilter/LSmashSource/vslsmashsource.dll")
# source: 'C:\Users\Selur\Desktop\howToSharpenThis.mp4'
# current color space: YUV420P10, bit depth: 10, resolution: 3840x2160, fps: 50, color matrix: 709, yuv luminance scale: limited, scanorder: progressive
# Loading C:\Users\Selur\Desktop\howToSharpenThis.mp4 using LibavSMASHSource
clip = core.lsmas.LibavSMASHSource(source="C:/Users/Selur/Desktop/howToSharpenThis.mp4")
# Setting color matrix to 709.
clip = core.std.SetFrameProps(clip, _Matrix=1)
clip = clip if not core.text.FrameProps(clip,'_Transfer') else core.std.SetFrameProps(clip, _Transfer=1)
clip = clip if not core.text.FrameProps(clip,'_Primaries') else core.std.SetFrameProps(clip, _Primaries=9)
# Setting color range to TV (limited) range.
clip = core.std.SetFrameProp(clip=clip, prop="_ColorRange", intval=1)
# making sure frame rate is set to 50
clip = core.std.AssumeFPS(clip=clip, fpsnum=50, fpsden=1)
clip = core.std.SetFrameProp(clip=clip, prop="_FieldBased", intval=0)
# cropping the video to 820x820
clip = core.std.CropRel(clip=clip, left=1020, right=2000, top=540, bottom=800)

clip = core.resize.Bicubic(clip=clip, format=vs.RGBS, matrix_in_s="470bg", range_s="limited")
org = core.resize.Bicubic(clip=clip, width=1640, height=1640)

from vsfemasr import femasr
clip = femasr(clip)

# adjusting output color from: RGBS to YUV420P10 for x265Model
clip = core.resize.Bicubic(clip=clip, format=vs.YUV420P10, matrix_s="470bg", range_s="limited", dither_type="error_diffusion")
org = core.resize.Bicubic(clip=org, format=vs.YUV420P10, matrix_s="470bg", range_s="limited", dither_type="error_diffusion")
clip = core.std.StackHorizontal([org.text.Text("Original"), clip.text.Text("Filtered")])

# Output
clip.set_output()

[screenshot: original vs. filtered comparison]

The result looks impressive, but I see some strange highlights (which I also see with other sources).
I also checked: enabling or disabling nvfuser and cuda_graphs does not change these highlights.
Are these to be expected, or is this a bug?
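One way to tell whether such highlights are genuine out-of-range overshoot from the model (rather than a display or conversion issue) is to inspect the float RGB frame directly. A minimal NumPy sketch, assuming the frame can be exported as an array (`overshoot_fraction` is a made-up helper, not a vs-femasr function):

```python
import numpy as np

def overshoot_fraction(frame: np.ndarray) -> tuple[float, float]:
    """Fractions of samples above 1.0 and below 0.0 in a float RGB frame.

    Nonzero values suggest the model overshoots the nominal [0, 1] range,
    which shows up as blown highlights after conversion to YUV.
    """
    over = np.count_nonzero(frame > 1.0) / frame.size
    under = np.count_nonzero(frame < 0.0) / frame.size
    return over, under
```

If the fractions are nonzero on the affected frames, clamping the model output before the YUV conversion would be worth trying.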

Using TensorRT:

clip = core.resize.Bicubic(clip=clip, format=vs.RGBH, matrix_in_s="470bg", range_s="limited")

from vsfemasr import femasr
clip = femasr(clip, trt=True, trt_cache_path=r"G:\Temp")

I also tried:

clip = femasr(clip, trt=True, trt_cache_path="G:/Temp")

both failed with:

Python exception: 

Traceback (most recent call last):
File "src\cython\vapoursynth.pyx", line 2866, in vapoursynth._vpy_evaluate
File "src\cython\vapoursynth.pyx", line 2867, in vapoursynth._vpy_evaluate
File "C:\Users\Selur\Desktop\test_2.vpy", line 38, in 
clip = femasr(clip, trt=True, trt_cache_path="G:/Temp")
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vsfemasr\__init__.py", line 196, in femasr
module = lowerer(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 323, in __call__
return do_lower(module, inputs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\passes\pass_utils.py", line 117, in pass_with_validation
processed_module = pass_(module, input, *args, **kwargs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 320, in do_lower
lower_result = pm(module)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\fx\passes\pass_manager.py", line 240, in __call__
out = _pass(out)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\fx\passes\pass_manager.py", line 240, in __call__
out = _pass(out)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\passes\lower_pass_manager_builder.py", line 167, in lower_func
lowered_module = self._lower_func(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 180, in lower_pass
interp_res: TRTInterpreterResult = interpreter(mod, input, module_name)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 132, in __call__
interp_result: TRTInterpreterResult = interpreter.run(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\fx2trt.py", line 252, in run
assert engine
AssertionError

Using TensorRT without specifying trt_cache_path throws:

Python exception: 

Traceback (most recent call last):
File "src\cython\vapoursynth.pyx", line 2866, in vapoursynth._vpy_evaluate
File "src\cython\vapoursynth.pyx", line 2867, in vapoursynth._vpy_evaluate
File "C:\Users\Selur\Desktop\test_2.vpy", line 38, in 
clip = femasr(clip, trt=True)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vsfemasr\__init__.py", line 196, in femasr
module = lowerer(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 323, in __call__
return do_lower(module, inputs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\passes\pass_utils.py", line 117, in pass_with_validation
processed_module = pass_(module, input, *args, **kwargs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 320, in do_lower
lower_result = pm(module)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\fx\passes\pass_manager.py", line 240, in __call__
out = _pass(out)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\fx\passes\pass_manager.py", line 240, in __call__
out = _pass(out)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\passes\lower_pass_manager_builder.py", line 167, in lower_func
lowered_module = self._lower_func(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 180, in lower_pass
interp_res: TRTInterpreterResult = interpreter(mod, input, module_name)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 132, in __call__
interp_result: TRTInterpreterResult = interpreter.run(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\fx2trt.py", line 252, in run
assert engine
AssertionError

I'm using CUDA-11.7_cuDNN-8.6.0_TensorRT-8.5.2.2_win64.7z from vs-animesr and NVIDIA Studio driver 527.56 on a GeForce RTX 4080 under Windows 11.
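Both tracebacks end at `assert engine` in fx2trt.py, which fires when TensorRT returns a null engine after the build step, i.e. the engine build itself failed rather than the cache path being wrong. Until the build issue is fixed, a hedged workaround is to catch the assertion and fall back to the pure-PyTorch path (a sketch only; `femasr_with_fallback` is a made-up wrapper, not vs-femasr API):

```python
# Hypothetical wrapper: try the TensorRT path first and fall back to the
# plain PyTorch path when engine building fails with AssertionError.
def femasr_with_fallback(clip, femasr, **kwargs):
    try:
        return femasr(clip, trt=True, **kwargs)
    except AssertionError:
        # TensorRT engine build failed (null engine); run without TRT.
        return femasr(clip, trt=False, **kwargs)
```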

TensorRT issue

Using:

# Imports
import vapoursynth as vs
import os
import ctypes
# Loading Support Files
Dllref = ctypes.windll.LoadLibrary("i:/Hybrid/64bit/vsfilters/Support/libfftw3f-3.dll")
import sys
# getting Vapoursynth core
core = vs.core
# Import scripts folder
scriptPath = 'i:/Hybrid/64bit/vsscripts'
sys.path.insert(0, os.path.abspath(scriptPath))
import site
# Adding torch dependencies to PATH
path = site.getsitepackages()[0]+'/torch_dependencies/bin/'
ctypes.windll.kernel32.SetDllDirectoryW(path)
path = path.replace('\\', '/')
os.environ["PATH"] = path + os.pathsep + os.environ["PATH"]
os.environ["CUDA_MODULE_LOADING"] = "LAZY"
# Loading Plugins
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/GrainFilter/RemoveGrain/RemoveGrainVS.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/GrainFilter/AddGrain/AddGrain.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/DenoiseFilter/DFTTest/DFTTest.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/DenoiseFilter/FFT3DFilter/fft3dfilter.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/Support/EEDI3m_opencl.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/ResizeFilter/nnedi3/NNEDI3CL.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/Support/libmvtools.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/Support/scenechange.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/Support/fmtconv.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/MiscFilter/MiscFilters/MiscFilters.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/DeinterlaceFilter/Bwdif/Bwdif.dll")
core.std.LoadPlugin(path="i:/Hybrid/64bit/vsfilters/SourceFilter/DGDecNV/DGDecodeNV.dll")
# Import scripts
import havsfunc
# source: 'G:\TestClips&Co\1983-i.mkv'
# current color space: YUV420P8, bit depth: 8, resolution: 720x576, fps: 25, color matrix: 470bg, yuv luminance scale: limited, scanorder: top field first
# Loading G:\TestClips&Co\1983-i.mkv using DGSource
clip = core.dgdecodenv.DGSource("J:/tmp/mkv_85eb1ac8734d03d3e24c3850e2e9287a_853323747.dgi",fieldop=0)# 25 fps, scanorder: top field first
# Setting detected color matrix (470bg).
clip = core.std.SetFrameProps(clip, _Matrix=5)
# Setting color transfer info (470bg), when it is not set
clip = clip if not core.text.FrameProps(clip,'_Transfer') else core.std.SetFrameProps(clip, _Transfer=5)
# Setting color primaries info (), when it is not set
clip = clip if not core.text.FrameProps(clip,'_Primaries') else core.std.SetFrameProps(clip, _Primaries=5)
# Setting color range to TV (limited) range.
clip = core.std.SetFrameProp(clip=clip, prop="_ColorRange", intval=1)
# making sure frame rate is set to 25
clip = core.std.AssumeFPS(clip=clip, fpsnum=25, fpsden=1)
clip = core.std.SetFrameProp(clip=clip, prop="_FieldBased", intval=2) # tff
# Deinterlacing using QTGMC
clip = havsfunc.QTGMC(Input=clip, Preset="Fast", TFF=True, opencl=True) # new fps: 50
# Making sure content is perceived as frame based
clip = core.std.SetFrameProp(clip=clip, prop="_FieldBased", intval=0) # progressive
from vsfemasr import femasr as FeMaSR
# adjusting color space from YUV420P8 to RGBH for VsFeMaSR
clip = core.resize.Bicubic(clip=clip, format=vs.RGBH, matrix_in_s="470bg", range_s="limited")
# resizing using FeMaSR
clip = FeMaSR(clip=clip, device_index=0, trt=True, trt_cache_path=r"J:\tmp") # 1440x1152
# resizing 1440x1152 to 1280x960
# adjusting resizing
clip = core.resize.Bicubic(clip=clip, format=vs.RGBS, range_s="limited")
clip = core.fmtc.resample(clip=clip, w=1280, h=960, kernel="lanczos", interlaced=False, interlacedd=False)
# adjusting output color from: RGBS to YUV420P10 for x265Model
clip = core.resize.Bicubic(clip=clip, format=vs.YUV420P10, matrix_s="470bg", range_s="limited", dither_type="error_diffusion")
# set output frame rate to 50fps (progressive)
clip = core.std.AssumeFPS(clip=clip, fpsnum=50, fpsden=1)
# Output
clip.set_output()

I get:

167 earlier log entries not shown. Save the log to read.

2023-03-26 17:12:10.862 Found bad pattern: y.reshape((x, ...)) for reshape nodes reshape_568, reshape_569, reshape_576, reshape_579, reshape_581, reshape_582, reshape_589, reshape_592, reshape_594, reshape_595, reshape_602, reshape_605, reshape_607, reshape_608, reshape_615, reshape_618, reshape_620, reshape_621, reshape_628, reshape_631, reshape_633, reshape_634, reshape_635
Now lowering submodule _run_on_acc_0
split_name=_run_on_acc_0, input_specs=[InputTensorSpec(shape=torch.Size([1, 3, 576, 736]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.001000
2023-03-26 17:12:16.295 Build TRT engine elapsed time: 0:00:02.024681
Lowering submodule _run_on_acc_0 elapsed time 0:00:03.686569
Now lowering submodule _run_on_acc_2
split_name=_run_on_acc_2, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 288, 368]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.001505
2023-03-26 17:12:37.284 Build TRT engine elapsed time: 0:00:20.702133
Lowering submodule _run_on_acc_2 elapsed time 0:00:20.896665
Now lowering submodule _run_on_acc_4
split_name=_run_on_acc_4, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 288, 368]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True), InputTensorSpec(shape=torch.Size([1, 256, 288, 368]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.001001
2023-03-26 17:12:40.478 Build TRT engine elapsed time: 0:00:02.913266
Lowering submodule _run_on_acc_4 elapsed time 0:00:03.107299
Now lowering submodule _run_on_acc_6
split_name=_run_on_acc_6, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 288, 368]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.001001
2023-03-26 17:12:43.662 Build TRT engine elapsed time: 0:00:02.897112
Lowering submodule _run_on_acc_6 elapsed time 0:00:03.091014
Now lowering submodule _run_on_acc_8
split_name=_run_on_acc_8, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 288, 368]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True), InputTensorSpec(shape=torch.Size([1, 256, 288, 368]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.002001
2023-03-26 17:12:47.461 Build TRT engine elapsed time: 0:00:03.509264
Lowering submodule _run_on_acc_8 elapsed time 0:00:03.711010
Now lowering submodule _run_on_acc_10
split_name=_run_on_acc_10, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 144, 184]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.002001
Build TRT engine elapsed time: 0:00:01.452047
Lowering submodule _run_on_acc_10 elapsed time 0:00:01.644582
Now lowering submodule _run_on_acc_12
split_name=_run_on_acc_12, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 144, 184]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True), InputTensorSpec(shape=torch.Size([1, 256, 144, 184]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.000999
Build TRT engine elapsed time: 0:00:01.452022
Lowering submodule _run_on_acc_12 elapsed time 0:00:01.650969
Now lowering submodule _run_on_acc_14
split_name=_run_on_acc_14, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 144, 184]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.001000
Build TRT engine elapsed time: 0:00:01.454924
Lowering submodule _run_on_acc_14 elapsed time 0:00:01.652692
Now lowering submodule _run_on_acc_16
split_name=_run_on_acc_16, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 144, 184]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True), InputTensorSpec(shape=torch.Size([1, 256, 144, 184]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
TRT INetwork construction elapsed time: 0:00:00.001000
Build TRT engine elapsed time: 0:00:01.471268
Lowering submodule _run_on_acc_16 elapsed time 0:00:01.675096
Now lowering submodule _run_on_acc_18
split_name=_run_on_acc_18, input_specs=[InputTensorSpec(shape=torch.Size([1, 256, 26496]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
2023-03-26 17:12:54.710 Unable to find layer norm plugin, fall back to TensorRT implementation.
2023-03-26 17:12:54.761 TRT INetwork construction elapsed time: 0:00:00.050510
2023-03-26 17:13:16.078 Build TRT engine elapsed time: 0:00:21.268365
Lowering submodule _run_on_acc_18 elapsed time 0:00:21.520086
Now lowering submodule _run_on_acc_20
split_name=_run_on_acc_20, input_specs=[InputTensorSpec(shape=torch.Size([1, 144, 184, 256]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True), InputTensorSpec(shape=torch.Size([1, 144, 184, 1]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\converters\converter_utils.py:457: UserWarning: Both operands of the binary elementwise op floordiv_144 are constant. In this case, please consider constant fold the model first.
(same warning repeated for floordiv_145, floordiv_146 and floordiv_147)
TRT INetwork construction elapsed time: 0:00:00.226765
Build TRT engine elapsed time: 0:00:00.463744
Lowering submodule _run_on_acc_20 elapsed time 0:00:00.918417
Now lowering submodule _run_on_acc_22
split_name=_run_on_acc_22, input_specs=[InputTensorSpec(shape=torch.Size([414, 64, 256]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True), InputTensorSpec(shape=torch.Size([414, 64, 64]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True), InputTensorSpec(shape=torch.Size([1, 414, 1, 64, 64]), dtype=torch.float16, device=device(type='cuda', index=0), shape_ranges=[], has_batch_dim=True)]
Timing cache is used!
(same UserWarning repeated for floordiv_148 and floordiv_149)
TRT INetwork construction elapsed time: 0:00:00.126577

2023-03-26 17:13:22.295 Failed to evaluate the script:
Python exception:

Traceback (most recent call last):
File "src\cython\vapoursynth.pyx", line 2866, in vapoursynth._vpy_evaluate
File "src\cython\vapoursynth.pyx", line 2867, in vapoursynth._vpy_evaluate
File "J:\tmp\tempPreviewVapoursynthFile17_12_01_348.vpy", line 63, in 
clip = FeMaSR(clip=clip, device_index=0, trt=True, trt_cache_path=r"J:\tmp") # 1440x1152
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\vsfemasr\__init__.py", line 171, in femasr
module = lowerer(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 323, in __call__
return do_lower(module, inputs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\passes\pass_utils.py", line 117, in pass_with_validation
processed_module = pass_(module, input, *args, **kwargs)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 320, in do_lower
lower_result = pm(module)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\fx\passes\pass_manager.py", line 240, in __call__
out = _pass(out)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch\fx\passes\pass_manager.py", line 240, in __call__
out = _pass(out)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\passes\lower_pass_manager_builder.py", line 167, in lower_func
lowered_module = self._lower_func(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 180, in lower_pass
interp_res: TRTInterpreterResult = interpreter(mod, input, module_name)
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\lower.py", line 132, in __call__
interp_result: TRTInterpreterResult = interpreter.run(
File "I:\Hybrid\64bit\Vapoursynth\Lib\site-packages\torch_tensorrt\fx\fx2trt.py", line 252, in run
assert engine
AssertionError


With trt=False it works.
Using trt_min_subgraph_size=5 doesn't help either.

Any idea what could be causing this?
(I updated to the NVIDIA Studio driver 531.41 a few days ago; could this be the cause?)

FeMaSR and TensorRT

I noticed that TensorRT support has been removed from master as unusable.
I'm using FeMaSR with TensorRT and it works satisfactorily well: the images are sharp, with more detail.
Like every AI super-resolution filter it sometimes adds artifacts, but they can be mitigated; see for example this post: https://forum.selur.net/thread-3012-post-17861.html#pid17861
So please keep TensorRT support in FeMaSR; without it the filter is too slow to be practically usable.
Thanks,
Dan
