FFAStrans workflows not balancing between hosts in transcoding farm

emcodem
Posts: 1631
Joined: Wed Sep 19, 2018 8:11 am

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by emcodem »

So here is a first try for load balancing.
@Silicon or anyone, wanna try?

To install it, just rename/back up the /processors/exe_manager.exe file, replace it with this one, and restart ffastrans:

Attention, this only works for ffastrans 1.2.1:
exe_manager.exe.txt (1.09 MiB)
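For anyone who prefers to script the swap, here is a minimal sketch of the backup-and-replace step. The function name, the ".bak" suffix and the install directory argument are assumptions made for illustration only, and restarting ffastrans afterwards is still a manual step.

```python
import shutil
from pathlib import Path


def swap_exe_manager(install_dir, patched_file):
    """Back up processors/exe_manager.exe and copy the patched file in its place.

    install_dir and patched_file are hypothetical paths supplied by the caller.
    Returns the path of the backup so the change can be rolled back later.
    """
    processors = Path(install_dir) / "processors"
    original = processors / "exe_manager.exe"
    backup = processors / "exe_manager.exe.bak"  # assumed backup name

    # Refuse to overwrite an earlier backup of the official file.
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")

    original.rename(backup)                  # keep the official version
    shutil.copyfile(patched_file, original)  # downloaded .txt becomes the .exe
    return backup
```

Rolling back is then just a matter of deleting the patched exe_manager.exe and renaming the backup to its original name.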
What should happen is an equal job distribution depending on the number of running jobs per host. Include/exclude configuration of workflows should work just as usual.
It would be cool if anyone found the time to test this.
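To illustrate the idea of distributing by number of running jobs per host, here is a minimal sketch. It is not the code inside the patched exe_manager; the function name, the dictionary of job counts and the host names are all assumptions for illustration.

```python
def pick_least_loaded(running_jobs, allowed_hosts):
    """Return the allowed host with the fewest running jobs, or None.

    running_jobs maps host name to its current job count (missing means 0).
    allowed_hosts stands in for the workflow's include/exclude result.
    Ties are broken alphabetically so the choice is deterministic.
    """
    if not allowed_hosts:
        return None
    return min(allowed_hosts, key=lambda h: (running_jobs.get(h, 0), h))


# Hypothetical farm: hostB is idle but excluded for this workflow.
running = {"hostA": 2, "hostB": 0, "hostC": 1}
print(pick_least_loaded(running, ["hostA", "hostC"]))  # hostC
```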
emcodem, wrapping since 2009 you got the rhyme?
Silicon
Posts: 98
Joined: Fri Sep 04, 2020 6:34 am

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by Silicon »

Hi emcodem
I’ll do my best to test it tomorrow since I’m off today.
And thanks a lot for the quick reaction!
BR,
Silicon
--------
FFAStrans 1.3.0.2; WebInterface 1.3.0.0
Manager: VM: 2x Xeon E5-2630v3@2.4GHz, 8GB RAM
Workers: 3x HP DL360 G9 (2x Xeon E5-2643v3@3.4GHz,16GB RAM, nVidia M2000)+ 2x Lenovo SR665 (2x AMD EPYC730216C@3.0GHz,128GB RAM, nVidia P2200)

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by Silicon »

@emcodem
I have installed the new version and it seems I have found a bug:
- I started a job manually (Submit files to "processor node")
- the workflow involved is limited to just one transcode node named "pr-carb-srv-2"
- the job appeared in the Webinterface but it is stuck in the Queued state (see screenshot)
- the job is not visible in the Status monitor (see screenshot)

What logs should I collect and send to you?
Attachments
Manually started job - not visible in Status monitor.PNG (13.26 KiB) Viewed 4496 times
Manually started job - stucked in Queued state in Webinterface.PNG (38.27 KiB) Viewed 4501 times
Last edited by Silicon on Thu Jul 29, 2021 10:36 am, edited 1 time in total.

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by Silicon »

@emcodem
I had to roll back to the official version of exe_manager, because job distribution was not working as expected :( . What happened:
- there were two jobs running on node "GRFCODER3" (node capacity limited to two simultaneous jobs)
- the Webinterface showed a bunch of new files in Incoming status (see screenshot)
- they disappeared from the list shortly afterwards, but were not assigned to other (free) transcode nodes
Attachments
New jobs in 'incoming' - but not assigned to transcode nodes.PNG (99.25 KiB) Viewed 4499 times

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by emcodem »

@Silicon thanks a lot for checking it out!
Yeah, sorry for that, I did not consider the "max tasks of the nodes" at all, so the current patch would only work if all nodes run the same number of jobs. But on the other hand you verified for me that the basic concept seems to work, because it was preferring the node with the fewest jobs.
Lemme see how we can take the max slots into consideration :D
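A rough sketch of what taking the max slots into account could look like, purely as an illustration and not the actual fix. The function name, the two dictionaries and the node names "small" and "big" are hypothetical.

```python
def pick_with_slots(running_jobs, max_slots):
    """Return the least-loaded host that still has a free slot, or None.

    running_jobs maps host name to current job count (missing means 0).
    max_slots maps host name to its configured maximum of simultaneous jobs.
    """
    # Only hosts below their own limit are candidates at all.
    free = [h for h in max_slots if running_jobs.get(h, 0) < max_slots[h]]
    if not free:
        return None  # every host is full, the job has to stay queued
    return min(free, key=lambda h: (running_jobs.get(h, 0), h))


# "small" has the fewest jobs but is already at its limit,
# so a plain least-jobs rule would pick a host that cannot accept work.
running = {"small": 1, "big": 3}
limits = {"small": 1, "big": 8}
print(pick_with_slots(running, limits))  # big
```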

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by Silicon »

@emcodem
Thanks for your effort again. Looking forward to getting the improved version 8-)

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by Silicon »

@emcodem
If you don't mind, I would like to propose one more improvement:
- I think it could be beneficial to be able to define a "Host priority" (in FFAStrans configuration - Host dialog) for each host / transcode node
- more powerful nodes in the farm should have a higher priority assigned
- this attribute would be taken into account when assigning jobs to transcode nodes: higher-priority nodes would get jobs (if they are free) at the expense of lower-priority nodes
What do you think?
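To make the proposal concrete, here is a small sketch of how such a rule might behave. "Host priority" does not exist in FFAStrans as far as this thread shows; the Host class, its fields and the node names below are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    priority: int   # hypothetical setting, higher means preferred
    max_jobs: int   # maximum simultaneous jobs on this node
    running: int = 0


def pick_by_priority(hosts):
    """Return the free host with the highest priority, or None if all are full.

    Among free hosts of equal priority, the one with fewer running jobs wins.
    """
    free = [h for h in hosts if h.running < h.max_jobs]
    if not free:
        return None
    return max(free, key=lambda h: (h.priority, -h.running))


farm = [
    Host("old-node", priority=1, max_jobs=2, running=0),
    Host("fast-node-1", priority=5, max_jobs=2, running=1),
    Host("fast-node-2", priority=5, max_jobs=2, running=2),  # full
]
print(pick_by_priority(farm).name)  # fast-node-1
```

With this rule the lower-priority node only receives work once every higher-priority node is at its limit.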

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by emcodem »

Yeahhh well, from my perspective we need a much more open system for designating system resources anyway, it's kind of an old topic :D
To be honest, in my mind we need not only host priority but also a "per-node" host dedication and much more... Especially when taking into account that some machines in the farm might have access to special resources like ASIC (GPU) encoding and others don't.
Anyway, your suggestion makes sense to me and it is something that @admin will be interested to read. However, from my perspective we need to work on a much smarter and much more open system for priority and distribution management...
Thanks for all the hard thought you are putting into ffastrans, Silicon, really appreciated and useful!

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by Silicon »

Hi @emcodem and @admin
I’m wondering if there is any progress on the “workflows not balancing between nodes” topic.
Thank you.
admin
Site Admin
Posts: 1658
Joined: Sat Feb 08, 2014 10:39 pm

Re: FFAStrans workflows not balancing between hosts in transcoding farm

Post by admin »

Hi Silicon,

Some. It's not so straightforward, because the masterless design kind of works against the notion of perfectly balancing hosts in a farm. However, we have done something that might improve the situation, but nothing that will perfectly balance your hosts. 1.2.2 will probably be released later today and will include the "fix". When it's released, try it and report back.

-steinar