I'm gonna quote your topic.
Basically the "I hate pytorch" part.
I mean, I took a look at the GitHub repo you linked and they're not using anything new, they're just leveraging the different ESRGAN models. So why would you wanna make your life miserable with Python when we have a C++ interface for those models that can be used in Avisynth and is already supported by FFAStrans out of the box?
https://forum.doom9.org/showthread.php?t=184768
All you need to do is either pick a ready-made ONNX model or convert another model to .onnx, and you're gonna be ready to go.
Just download the mlrt plugin from here:
https://github.com/Asd-g/avs-mlrt/releases
Download the models from here:
https://www.dropbox.com/sh/f74ao9t1qarf ... r6M4a?dl=0
Create an FFAStrans workflow with the A/V Decoder and a Custom Avisynth Script like:
Code: Select all
# Load the mlrt plugin dll
LoadPlugin("\\FFAStrans\Processors\avs_plugins\mlrt_ncnn.dll")
# Bring everything to 32bit float planar RGB (the models expect RGB input)
ConvertBits(m_clip, 32)
ConvertToPlanarRGB()
# Run the GAN model with 32bit float precision
mlrt_ncnn(network_path="\\avs000\Ingest\MEDIA\temp\onnx-models\realesr-general-wdn-x4v3_opset16.onnx")
# Dither down to the output bit depth
m_clip=ConvertBits(bits=8, dither=1)
return m_clip
and then add an encoding node to encode to whatever output you want.
A couple of notes:
1) GAN models only work with progressive clips, so you may wanna add a deinterlacer right after the A/V Decoder just in case.
2) The dithering to the output bit depth has to be adjusted to whatever your output is gonna be. Also note that you MUST put ConvertBits() there: even though Avisynth can work in 32bit float with as many values as you'll ever need, FFmpeg can't. That means you can either deliver 16bit planar (which FFmpeg understands) or go directly down to whatever bit depth you want in output.
3) GAN models are processed through mlrt_ncnn(), but they require a GPU and a LOT of VRAM. Keep in mind that my poor man's GTX 980 Ti couldn't render anything when I tested them at home and I had to go to the office to "borrow" some Quadro cards. That aside, nowadays consumer cards are just fine, so if you have a 4090 or similar you're gonna be fine too, but I'm poor, so...
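To make notes 1) and 2) concrete, here's a sketch of how the script above could be hardened. The Bob() deinterlacer and the 16bit planar delivery are just my assumptions for illustration, not the only options:

Code: Select all
LoadPlugin("\\FFAStrans\Processors\avs_plugins\mlrt_ncnn.dll")
# Note 1: make sure the clip is progressive before feeding the GAN model.
# Bob() is just the simplest built-in deinterlacer (it doubles the frame rate);
# QTGMC and friends work too.
deint = m_clip.IsFieldBased() ? m_clip.Bob() : m_clip
# 32bit float planar RGB in, as before
ConvertBits(deint, 32)
ConvertToPlanarRGB()
mlrt_ncnn(network_path="...")   # same .onnx model as in the script above
# Note 2: either dither straight down to the final bit depth...
#m_clip = ConvertBits(bits=10, dither=1)
# ...or hand FFmpeg 16bit planar and let the encoder take it from there.
m_clip = ConvertBits(bits=16)
return m_clip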
emcodem wrote: ↑Fri Aug 04, 2023 4:39 pm
Also i am sure @FranceBB will tell you why he doesnt like the GAN models
Rather than going through several lines on why non-deterministic upscales cannot be reliably used for archival purposes, I'm gonna let the screenshots talk:
Code: Select all
#Indexing
LWLibavVideoSource("Test.mxf")
#ImageSource("\\mibctvan000.avid.mi.bc.sky.it\Ingest\MEDIA\temp\Lenna_(test_image).png")
Bob().Spline64Resize(848, 480)
original=ConvertBits(8).ConvertToYV12().Text("Original", y=66)
#Downscale
SinPowResizeMT(width/4, height/4)
#Various Upscales
point=PointResize(width*4, height*4).ConvertBits(8).ConvertToYV12()
#bilinear=BilinearResize(width*4, height*4).ConvertToYV12()
nnedi3=nnedi3_rpow2(cshift="Spline64ResizeMT", rfactor=2, fwidth=width*4, fheight=height*4, nsize=4, nns=4, qual=1, etype=0, pscrn=2, threads=0, csresize=true, mpeg2=true, threads_rs=0, logicalCores_rs=true, MaxPhysCore_rs=true, SetAffinity_rs=false).ConvertBits(8).ConvertToYV12()
esrgan=last.ConvertToPlanarRGB().ConvertBits(32).mlrt_ncnn(network_path="\\myshare\Ingest\MEDIA\temp\realesr-general-wdn-x4v3_opset16.onnx", builtin=false, list_gpu=false, fp16=true).ConvertBits(8).ConvertToYV12()
#SSIM
pnt=SSIM(original, point, "\\myshare\Ingest\MEDIA\temp\point3SSIM.csv", "\\myshare\Ingest\MEDIA\temp\point3SSIM.txt", lumimask=1, scaled=0).Text("PointResize", y=66)
nne=SSIM(original, nnedi3, "\\myshare\Ingest\MEDIA\temp\nnedi3SSIM.csv", "\\myshare\Ingest\MEDIA\temp\nnedi3SSIM.txt", lumimask=1, scaled=0).Text("NNEDI3", y=66)
esr=SSIM(original, esrgan, "\\myshare\Ingest\MEDIA\temp\esrganSSIM.csv", "\\myshare\Ingest\MEDIA\temp\esrganSSIM.txt", lumimask=1, scaled=0).Text("ESRGAN", y=66)
#Preview
a=StackHorizontal(original, pnt)
b=StackHorizontal(nne, esr)
StackVertical(a,b)