FFAStrans hardware configuration

Questions and answers on how to get the most out of FFAStrans
emcodem
Posts: 1149
Joined: Wed Sep 19, 2018 8:11 am

Re: FFAStrans hardware configuration

Post by emcodem » Mon Nov 22, 2021 12:57 pm

Aye @mrazik
to be able to recommend something, we'd first need to determine whether your bottleneck is currently hardware or software, e.g. would the transcoding run massively faster when NOT using any filters?

Also, you basically have to choose between "one transcode running very fast" and "maximum throughput when running multiple transcodes in parallel".
Whether you need a graphics board with an onboard hardware encoder depends on whether you want to use that feature. Maybe you already know that those hardware encoders generally produce worse quality at the same bitrate than x264 (CPU) encoding does.
emcodem, wrapping since 2009 you got the rhyme?

mrazik
Posts: 32
Joined: Sat May 30, 2020 9:48 pm

Post by mrazik » Tue Nov 23, 2021 10:24 am

Hi @emcodem :)

exactly, I'm trying to avoid any bottleneck in the setup.
I already ran some tests with the GPU and decided to use only the CPU, not only because of the quality but also because the transcoding speed to x264 was basically very similar.

I can imagine running jobs in parallel (let's say 2 or 3 if possible). So in general, I'm trying to transcode around 25 hours of hires material in one hour :)
MediaInfo of the hires file and my workflows are attached (filters + no-filters). If I run the workflows with the hires file, the speed is:
- filters - (indexing time is fast) - 1.84x (92fps)
- no filters - 1.2x (60fps)
(which is slightly weird, as it looks faster with filters)
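
The target of 25 hours of material in one hour corresponds to an aggregate speed of 25x real time. As a rough sketch (assuming parallel jobs scale linearly, which real machines rarely do perfectly, and using a hypothetical helper name `jobs_needed`), the number of simultaneous jobs required at a given per-job speed can be estimated like this:

```python
import math

def jobs_needed(hours_of_material, hours_available, speed_per_job):
    """Estimate how many parallel jobs are needed to hit a throughput target.

    speed_per_job is the real-time factor of one job (1.2 means 1.2x real time).
    Assumes perfectly linear scaling across parallel jobs.
    """
    required_factor = hours_of_material / hours_available
    return math.ceil(required_factor / speed_per_job)

# Per-job speeds quoted in this post
for speed in (1.84, 1.2):
    print(f"{speed}x per job -> {jobs_needed(25, 1, speed)} parallel jobs")
```

The result is only an upper-level planning number; it says nothing about whether one machine or several would run those jobs.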

So my point is: I can add a fast drive, a 10 Gb/s NIC if the files are stored on network storage, and more RAM, but I'm trying to find out which CPU would be best for a new machine or multiple machines ;)

Many thanks,
mrazek
Attachments
No-Filters.json (6.93 KiB)
Filters.json (10.31 KiB)
hires.txt (14.96 KiB)
FFAStrans 1.2.2
Webinterface 1.2.0.4
- - - - - - - - - - - - - - - -
Intel Xeon CPU E5-1620 v3 @ 3.5GHz
8.00 GB RAM
NVIDIA Quadro K620
256GB SSD
1GB NIC
- - - - - - - - - - - - - - - -
Windows 10 Pro 64b

Post by emcodem » Mon Nov 29, 2021 10:19 am

Hi,

sorry for the delay, I first tried on my laptop but was not able to reproduce your results, so I had to wait until I got hold of a fast server to retry.

After all, I cannot reproduce your result that "no filters" is slower than "filters". In fact, using a 30s source file, the "no filters" workflow finished encoding before the "filters" workflow had even started encoding.
Besides the startup time, "filters" runs significantly slower for me than the other. In Task Manager I see exactly what I was talking about: some filter causes the processing to run more or less on a single core:
[Image: mrazik.png, Task Manager screenshot]
I used a 720p50 source, but I don't think the resolution of my source matters.
@mrazik, can you confirm your results and maybe upload the source file that you used for testing?

Post by mrazik » Wed Dec 08, 2021 12:12 pm

Hi @emcodem,

yeah, I think I made a mistake in the tests :roll:. Just to be sure, I ran the transcoding several more times and here is the result:
filters:
- Indexing (20s) - SSD looks to be a bottleneck
- A/V Media (183s)
- Deinterlace (25s)
- Timecode (2s)
- FPS (3s)
- H.264 (791s/0.17x) - not sure why it only runs at up to 30% CPU

no filters:
- H.264 (238s/1.2x) - up to 95% CPU
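
To compare the two workflows end to end, the step times can be summed and related to the length of the source. This is a minimal sketch, assuming the steps run one after another and that the source is the 5-minute test file mentioned in this post; the speed factors shown next to the H.264 steps above are the ones reported in the post and are not recomputed here.

```python
SOURCE_SECONDS = 5 * 60  # assumption: the 5-minute test file

# Step durations in seconds, as listed in the post
filters_steps = {
    "Indexing": 20,
    "A/V Media": 183,
    "Deinterlace": 25,
    "Timecode": 2,
    "FPS": 3,
    "H.264": 791,
}
no_filters_steps = {"H.264": 238}

def total_seconds(steps):
    """Wall-clock time of a workflow whose steps run sequentially."""
    return sum(steps.values())

def realtime_factor(steps, source_seconds=SOURCE_SECONDS):
    """Source duration divided by total processing time (1.0 = real time)."""
    return source_seconds / total_seconds(steps)

for name, steps in (("filters", filters_steps), ("no filters", no_filters_steps)):
    print(f"{name}: {total_seconds(steps)}s total, "
          f"{realtime_factor(steps):.2f}x real time")
```

Seen this way, the H.264 step dominates the "filters" workflow, so that is where a speed-up would pay off most.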

You can find the test file at https://www.dropbox.com/s/3v2w94797o4pb ... 1.mxf?dl=0 (5 minutes, 9 GB).

So at this point, I'm still not sure about:
- what the best CPU to purchase is. Ah, another catch is that it should be a 1U rack mount machine...
- whether there is anything I can do to make the transcoding with filters run faster

Post by emcodem » Wed Dec 08, 2021 8:40 pm

About CPU: buy what you can afford that has the highest clock frequency and the most cores at the same time. The number of cores only helps for scaling multiple parallel encodes, not a single one. E.g. 8 cores at 5 GHz would be (likely MUCH) faster than 64 cores at 2.2 GHz for a single transcode. But the 8 cores would be 100% occupied by that one h264 transcode.
Up to about 16 cores can be utilized more or less ~100% by a single Full HD h264 encode, depending a lot on its settings of course.
If you have the choice between 16 cores at 3.8 GHz and 32 cores at 3.5 GHz, I'd probably go for the 32 cores, because you lose only about 10% speed on a single encode but can run 2 in parallel.
If you can go for the highest level, it is better to buy 24 cores at 4.20 GHz than 32 cores at 3.5 GHz, but it's a pretty tight calculation. Anyway, GHz rules!
Also, AMD Threadrippers perform really well but are not common in servers. Personally I am happy with the fastest 16-core Xeon Gold series.
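
The trade-off above can be put into numbers with a deliberately crude model. It is only a sketch under stated assumptions: speed scales linearly with clock, one encode keeps at most about 16 cores busy (the figure from this post), and everything else (memory, cache, turbo behaviour, encoder settings) is ignored. The names `Cpu`, `single_encode_score` and `parallel_encodes` are made up for this illustration.

```python
from dataclasses import dataclass

# Assumption from the post: one Full HD h264 encode uses roughly 16 cores.
USABLE_CORES_PER_ENCODE = 16

@dataclass(frozen=True)
class Cpu:
    cores: int
    ghz: float

def single_encode_score(cpu):
    """Relative speed of ONE encode: usable cores times clock."""
    return min(cpu.cores, USABLE_CORES_PER_ENCODE) * cpu.ghz

def parallel_encodes(cpu):
    """How many encodes fit side by side without sharing cores."""
    return max(1, cpu.cores // USABLE_CORES_PER_ENCODE)

# The CPU pairs discussed in the post
candidates = [Cpu(8, 5.0), Cpu(64, 2.2), Cpu(16, 3.8), Cpu(32, 3.5), Cpu(24, 4.2)]
for cpu in candidates:
    print(f"{cpu.cores} cores @ {cpu.ghz} GHz: "
          f"single-encode score {single_encode_score(cpu):.1f}, "
          f"{parallel_encodes(cpu)} parallel encode(s)")
```

In this model the 32 cores at 3.5 GHz reach about 92% of the single-encode score of the 16 cores at 3.8 GHz while offering two parallel slots. Real encoders do not scale linearly with core count, so the model probably understates how much a high clock helps a single transcode.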

About Software bottleneck:
The "disk speed" only really matters when you use the A/V decoder. The "indexing" step is usually superfluous; it is basically only there for repairing special defects in files, but the AviSynth ffms2 plugin always requires an index. Also, depending on the source file, the A/V decoder may decide to generate an uncompressed version of your source file in the cache directory, which of course slows down the overall process massively and uses basically 100% of your disk speed, or maybe 50% for a very fast M.2 drive.

You'd need to find a way to get rid of the A/V decoder. I believe in your case you should just use the built-in encoder processors instead of a custom h264 one; they automatically insert any needed deinterlace or fps conversion (as well as color handling, audio handling etc.) into the final ffmpeg command that is calculated internally. That's actually one thing FFAStrans is pretty good at: smartly calculating the needed ffmpeg command.
Using only ffmpeg without AviSynth should usually get rid of the single-core filtering...
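
For illustration only, here is roughly what an "ffmpeg only" chain with deinterlacing and fps conversion could look like when built by hand. This is NOT the command FFAStrans generates internally; the file names, the target frame rate and the choice of the `yadif` and `fps` filters are assumptions made for this sketch.

```python
import shlex

def build_ffmpeg_command(source, target, fps=25):
    """Build an ffmpeg argument list: deinterlace, convert fps, encode H.264.

    Hypothetical example; a real workflow would choose filters and
    encoder options to match the source and the delivery spec.
    """
    video_filters = ",".join(["yadif", f"fps={fps}"])
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-vf", video_filters,   # filtering happens inside ffmpeg, no AviSynth
        "-c:v", "libx264",
        "-c:a", "aac",
        target,
    ]

# Hypothetical file names
cmd = build_ffmpeg_command("input.mxf", "output.mp4")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

The point of the sketch is only that decoding, filtering and encoding all happen in one ffmpeg process, so no intermediate AviSynth script or index is involved.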

Post by mrazik » Wed Dec 08, 2021 9:13 pm

Hi @emcodem,

thank you very much for the exhaustive answer. I like the "GHz rules" part 👍

I'll also try to play with the built-in ffmpeg cmd to replace A/V Media.

Thank you!
mrazik

Post by emcodem » Thu Dec 09, 2021 9:04 am

Uhm sorry, I forgot to consider one thing, regarding

"24 cores at 4.20 GHz than 32 cores at 3.5 GHz"

This is only true if your encode process cannot benefit from 32 cores versus 24 cores in general (which, to the best of my knowledge, is true for x264). I am not sure whether, for example with x265/x266 at 8K resolution, a single encode process might be able to utilize all 24/32 cores at 100%. IF this is the case, then we have an exception to the rule quoted above: the 32 cores at 3.5 GHz might be slightly faster than the 24 cores at 4.2 GHz, even if you have 2 processors in the system (which makes it 48 cores with 24-core processors).

The reason is that a single Windows process can only use a maximum of 32 cores at a time.
BUT as we usually deal not only with encoding but also with filtering etc., and we usually encode different formats on the same host, I stick to what I said before: GHz rules ;-)

What you must always do is configure your mainboard not to force the creation of so-called processor groups, e.g. on HP servers you turn off "Numa node Clustering". That is because a Windows process can only run on a single processor group, no matter how many cores you have in the system.
Also, if you have more than 32 cores (e.g. 2x 18-core CPUs), you should turn off Hyperthreading. That is again because Windows MUST build processor groups when the overall logical core count is >64. So for our 2x18 core example, with Hyperthreading turned on, you would have 2 processor groups in the system, each with 36 virtual cores but in reality only 18 physical cores. This would limit a single encoding process to 18 physical cores, which is usually a bad thing compared to giving it 36 physical cores.
Again, Hyperthreading is usually not helpful for encoding processes; I did lots of testing with this.
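
The 2x18 core example can be checked with a few lines of arithmetic. This is a simplified sketch: it assumes a limit of 64 logical processors per processor group and an even split across groups, whereas Windows actually forms groups along NUMA node boundaries, and it assumes the one-group-per-process behaviour described in this post. The function name `processor_groups` is made up for the example.

```python
import math

MAX_LOGICAL_PER_GROUP = 64  # limit of logical processors per Windows processor group

def processor_groups(sockets, cores_per_socket, hyperthreading):
    """Return (groups, logical cores per group, physical cores per group).

    Simplified model: logical processors are split evenly across the
    minimum number of groups needed.
    """
    threads_per_core = 2 if hyperthreading else 1
    logical = sockets * cores_per_socket * threads_per_core
    groups = math.ceil(logical / MAX_LOGICAL_PER_GROUP)
    logical_per_group = logical // groups
    physical_per_group = logical_per_group // threads_per_core
    return groups, logical_per_group, physical_per_group

for ht in (True, False):
    groups, logical, physical = processor_groups(2, 18, hyperthreading=ht)
    print(f"Hyperthreading {'on' if ht else 'off'}: {groups} group(s), "
          f"{logical} logical / {physical} physical cores per group")
```

With Hyperthreading on, a process bound to one group sees 18 physical cores; with it off, the whole machine fits into one group of 36 physical cores.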
