FFAStrans hardware configuration

Questions and answers on how to get the most out of FFAStrans

Post by llittleton »

I have been using FFAStrans 1.0.0 with custom ffmpeg specifying h264_nvenc as the codec, and it has been working well. With an NVIDIA P2000 and local files on an SSD array, I can encode 4 HD 59.94 ProRes files to MP4 in real time! I am upgrading soon to a PC with 20 Xeon cores and 2 NVIDIA P4000 cards, which should increase that to 8 to 12 real-time transcodes.

Post by veks »

llittleton wrote: Tue Mar 10, 2020 2:38 pm I have been using FFAStrans 1.0.0 with custom ffmpeg specifying h264_nvenc as the codec, and it has been working well. With an NVIDIA P2000 and local files on an SSD array, I can encode 4 HD 59.94 ProRes files to MP4 in real time! I am upgrading soon to a PC with 20 Xeon cores and 2 NVIDIA P4000 cards, which should increase that to 8 to 12 real-time transcodes.
This is awesome to know!

May I know why Xeon?
You can get a far superior AMD Epyc or Threadripper with more cores for a far lower price.
We have an AMD Ryzen 9 3900X and it works amazingly fast with FFmpeg + P2000.

Check AMD ;)

Post by emcodem »

veks wrote: Fri Mar 13, 2020 8:17 am May I know why Xeon?
That's easy: it is about rack space, power consumption, maintenance and scaling. A VM with 20 Xeon cores translates to maybe 1 cm of rack space, while a custom-built desktop PC uses about 25 cm. In reality, the racks that I built with AMD and Core i7 machines only hold 4 machines per rack.
Servers with AMD CPUs are a niche product. Running niche products in a datacenter is always problematic: it requires special support contracts, special engineers and special treatment, which makes it very expensive on a medium/large scale. The reason is not the price of the machine but the additional effort for humans, who are the most precious resource in a datacenter, no matter whether it is the engineer or the person who looks after the maintenance contracts.

But on a small scale, it is definitely recommended to use a gaming processor like Ryzen or Core i... for encoding. The higher the clock speed and bus frequency, the better. This holds for decoding/encoding both professional and delivery formats.
emcodem, wrapping since 2009 you got the rhyme?

Post by Ghtais »

Hi all

First of all, I hope you and your families are all fine during this very complex COVID-19 situation.
I would like to share with you my first test with GPU encoding.
I have installed a Quadro K4200 in my computer with the latest driver available from Nvidia.
I tested transcoding a DNxHD120 interlaced video file to H264 720p at 2500 kbit/s.
I created a custom ffmpeg node with the same code, except that I replaced -c:v libx264 with -c:v h264_nvenc. Is that the correct way?

Results:
with -c:v libx264
CPU: 20%
GPU: 0%
Speed: x1.9

with -c:v h264_nvenc
CPU: 12%
GPU: 3%
Speed: x1.9

So it seems that for my purpose there is no gain from using the GPU.
Or perhaps my ffmpeg script is not optimal for the GPU.
Just let me know what you think about that.
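
For readers following along, the encoder swap described in this post could look roughly like the sketch below. The actual command in the custom ffmpeg node is not shown in the thread, so the filter chain (yadif for deinterlacing, scale for the 720p resize), the audio codec and the file names are assumptions for illustration; only the two -c:v values and the 2500 kbit/s target come from the post. Note that ffmpeg's NVENC H.264 encoder is named h264_nvenc.

```python
def build_h264_command(src, dst, encoder="libx264"):
    """Build an ffmpeg argument list for a 720p H.264 encode.

    Only the value after -c:v changes between the CPU and GPU variant.
    The filters and audio codec are assumed, not taken from the thread.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-vf", "yadif,scale=1280:720",  # assumed: deinterlace, then resize to 720p
        "-c:v", encoder,                # libx264 (CPU) or h264_nvenc (NVENC)
        "-b:v", "2500k",                # target video bitrate from the post
        "-c:a", "aac",                  # assumed audio codec
        dst,
    ]


cpu_cmd = build_h264_command("input.mxf", "out_cpu.mp4")
gpu_cmd = build_h264_command("input.mxf", "out_gpu.mp4", encoder="h264_nvenc")
print(" ".join(gpu_cmd))
```

In this sketch decoding, deinterlacing and scaling still run on the CPU in both variants, so if those stages (or the input path) are the bottleneck, swapping the encoder alone will not change the reported speed much.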

bye

Post by emcodem »

Hey @Ghtais
Same to you, happy home office ;-)

As I already said, the reason your stuff encodes so slowly is some filtering. What filters do you have in your workflow?

Post by Ghtais »

Hi emcodem

I am looking for advice or something to speed up the H264 encoding, as demand is increasing very quickly because many editors and managers are at home.
In my workflow, I am not using any filter before the H264 processor.
Here is a screenshot of my H264 processor settings.
[Image: screenshot of the H264 processor settings]

As you can see, I resize and deinterlace, not a big deal.
I have tested many settings but it doesn't change much.
The input file is always DNxHD 120 1080i. I have also tested with XDCAM HD 1080i; it is slower.
I use a QuickTime reference file with an AviSynth script to input my file; perhaps that is the issue.

Note that if I add a timecode node before H264, the speed drops to 0.6x.

Post by admin »

Hi Ghtais,

You must understand that whenever you use the inbuilt encoders, it's not shit in, shit out. Depending on your input source and output template, the processing might involve several filters that affect encoding speed. The reason for this is to make sure the source quality and its properties are honored. This is normally very different from using a custom ffmpeg encoder. So to find out why the speed difference between CPU and GPU is not greater, we must look at the exact command line being used. Also, one must not forget to compare quality. If the GPU is faster but yields worse quality, then the speed comparison is not fair to the CPU.
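
One possible way to put a number on the quality side of that comparison is ffmpeg's ssim filter, which scores an encode against a reference. The sketch below is an illustration, not part of FFAStrans: the helper names and file names are hypothetical, and it assumes both inputs already have the same resolution and frame rate (so a 720p encode would have to be compared against a 720p reference). The filter treats the first input as the main video and the second as the reference, and writes a summary line containing an "All:" score to the log.

```python
import re


def build_ssim_command(encoded, reference):
    """ffmpeg arguments that score `encoded` against `reference` with SSIM.

    No output file is written (-f null -); the score appears in the log.
    """
    return ["ffmpeg", "-i", encoded, "-i", reference,
            "-lavfi", "ssim", "-f", "null", "-"]


_SSIM_ALL = re.compile(r"SSIM .*All:([0-9.]+)")


def parse_ssim_all(log_text):
    """Return the overall SSIM score from ffmpeg's log, or None if absent."""
    match = _SSIM_ALL.search(log_text)
    return float(match.group(1)) if match else None


# Illustrative log line in the shape the ssim filter prints (values made up).
sample = ("[Parsed_ssim_0 @ 0x55d] SSIM Y:0.972345 (15.582) "
          "U:0.981234 (17.266) V:0.983456 (17.813) All:0.975678 (16.140)")
print(build_ssim_command("out_gpu.mp4", "reference_720p.mov"))
print(parse_ssim_all(sample))
```

Running the same measurement on the CPU and the GPU encode at the same bitrate gives two scores to set next to the two speed figures.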

Also, using AviSynth and certainly the QT-plugin will cause some slowdown.

-steinar

Post by Ghtais »

Hi Steinar

Thank you for your help. I did a lot more tests and it is now clear that this speed issue is with the QT-plugin.
If I input the same file directly, I get x12 speed with nvenc :-)
That is very impressive.
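
Speed figures like the x1.9 and x12 quoted in this thread correspond to the speed= field that ffmpeg prints in its progress lines. As a small sketch (the function name and the sample log are made up for illustration), the last reported value can be pulled out of a captured log like this, which makes it easier to compare input paths or encoders in a repeatable way:

```python
import re

_SPEED = re.compile(r"speed=\s*([0-9.]+)x")


def last_reported_speed(log_text):
    """Return the last 'speed=...x' value in an ffmpeg log as a float.

    Returns None when no numeric speed was reported (for example 'speed=N/A').
    """
    matches = _SPEED.findall(log_text)
    return float(matches[-1]) if matches else None


# Illustrative progress lines (values made up).
sample_log = (
    "frame=  100 fps= 47 q=28.0 size=     256kB time=00:00:04.00 "
    "bitrate= 524.3kbits/s speed=1.9x\n"
    "frame= 3000 fps=300 q=25.0 Lsize=    9000kB time=00:02:00.00 "
    "bitrate= 614.4kbits/s speed=12x\n"
)
print(last_reported_speed(sample_log))
```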

Post by Pogle »

Hello!
May I append a question regarding basic hardware configuration here? As I read in this forum, you recommend using as many cores as possible at the highest clock speeds available, but with a tendency toward desktop CPUs like Ryzen, Threadripper or i7, i9 (if only encoding times matter). So I won't benefit from using a dual-CPU (Xeon, EPYC) workstation like one of those Precisions or Z8s and should go for a single-CPU (Threadripper) config with clock speeds beyond 4 GHz?
Do I need a Quadro P GPU to make use of the NVENC algorithms, or can I also use GTX/RTX cards?

Thanks

Post by emcodem »

Hey Pogle, sure, this question fits the thread topic very well, I guess :-)
Pogle wrote: Thu Apr 16, 2020 8:39 am May I append a question regarding basic hardware configuration here? As I read in this forum, you recommend using as many cores as possible at the highest clock speeds available, but with a tendency toward desktop CPUs like Ryzen, Threadripper or i7, i9 (if only encoding times matter). So I won't benefit from using a dual-CPU (Xeon, EPYC) workstation like one of those Precisions or Z8s and should go for a single-CPU (Threadripper) config with clock speeds beyond 4 GHz?
The trade-off is rack space vs. "single job encoding speed". A fast consumer CPU will do one job a lot faster than a Xeon, unless you do lots of stuff that can actually utilize "all cores" (which is typically limited to 32 cores per running task on Windows). Especially when you do very high quality, low bitrate H264 encoding.
At the same time, configurations with consumer hardware typically require A LOT more rack space, so they are only cheap if you don't need to put your hardware into racks.
Pogle wrote: Thu Apr 16, 2020 8:39 am Do I need a Quadro P GPU to make use of the NVENC algorithms, or can I also use GTX/RTX cards?
NVENC, Quick Sync and co. are not 100% tied to the GPU; each is a special chip on the board (an ASIC, application-specific integrated circuit).

If you can afford the additional bitrate that graphics board encoding needs to reach the same quality as x264 (CPU) encoding (it was made for live, not for VOD), then you can go for graphics cards. Quadros allow more than 2 encoding sessions in parallel and generally stay cooler, which allows you to put them into servers 2 rack units high without hassle. That does not mean a single encoding session is faster on a Quadro.
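
If a card only allows a limited number of parallel encoding sessions, the jobs have to be queued so that no more than that number run at once. The following is a minimal sketch of that idea, not how FFAStrans schedules work: the function names are hypothetical, and the default limit of 2 is simply the figure mentioned above; the real limit depends on the card and driver.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_ffmpeg(cmd):
    """Run one ffmpeg command (a list of arguments) and return its exit code."""
    return subprocess.run(cmd, check=True).returncode


def run_with_session_limit(jobs, run_job=run_ffmpeg, max_sessions=2):
    """Run every job, but never more than max_sessions at the same time.

    Results are returned in the same order as the jobs were given.
    """
    with ThreadPoolExecutor(max_workers=max_sessions) as pool:
        return list(pool.map(run_job, jobs))
```

Calling run_with_session_limit(list_of_commands) would then start two encodes and begin the next one as soon as a slot frees up. Passing a different run_job makes it possible to try the queueing without ffmpeg installed.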

Also, do not forget that NVENC is not the only option out there: there is also a solution from AMD (which I would not recommend) and Intel Quick Sync, which is built into your CPU. So if you go for a Core i7 or i9, you have both options in one.
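
In ffmpeg these options show up as different H.264 encoder names: libx264 (CPU), h264_nvenc (NVIDIA), h264_qsv (Intel Quick Sync) and h264_amf (AMD). The sketch below checks which of them a given ffmpeg build lists; the function name and the sample listing are made up for illustration, and the real text would come from running ffmpeg -hide_banner -encoders.

```python
H264_ENCODERS = {
    "libx264": "CPU (x264)",
    "h264_nvenc": "NVIDIA NVENC",
    "h264_qsv": "Intel Quick Sync",
    "h264_amf": "AMD AMF",
}


def available_h264_encoders(encoders_listing):
    """Return the known H.264 encoder names found in an `ffmpeg -encoders` listing.

    Each encoder line has a flags column first and the encoder name second.
    """
    found = []
    for line in encoders_listing.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1] in H264_ENCODERS:
            found.append(parts[1])
    return found


# Shortened, illustrative listing in the shape ffmpeg prints.
sample_listing = """Encoders:
 V..... = Video
 ------
 V....D libx264              libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (codec h264)
 V....D h264_nvenc           NVIDIA NVENC H.264 encoder (codec h264)
 A....D aac                  AAC (Advanced Audio Coding)
"""
for name in available_h264_encoders(sample_listing):
    print(name, "->", H264_ENCODERS[name])
```

An encoder appearing in the listing only means the ffmpeg build was compiled with it; the matching hardware and driver still have to be present for an encode to start.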

Don't forget, all the consumer hardware is pretty much only helpful when you go for consumer codecs. You cannot utilize any hardware acceleration from graphics boards for any broadcast (mezzanine) codec.
If you mention what codecs you are decoding and encoding, I could come up with a few use cases for comparison. If you mention your use case, I can recommend the best hardware config.