FFAStrans workflows not balancing between hosts in transcoding farm
Re: FFAStrans workflows not balancing between hosts in transcoding farm
So here is a first try at load balancing.
@Silicon or anyone, wanna try?
To use it, rename/back up the /processors/exe_manager.exe file, replace it with this one, and restart ffastrans.
Attention, this only works for ffastrans 1.2.1. What should happen is an equal job distribution based on the number of running jobs per host. Include/exclude configuration of workflows should work just as usual.
It would be cool if anyone found the time to test this.
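The distribution rule described in this post (send the next job to the host with the fewest running jobs, while honouring a workflow's include/exclude host lists) can be sketched roughly as follows. This is only an illustration in Python, not the code inside the patched exe_manager.exe; the function name `pick_host`, its parameters, and the host names are made up for the example.

```python
from typing import Dict, Optional, Set


def pick_host(running_jobs: Dict[str, int],
              include: Optional[Set[str]] = None,
              exclude: Optional[Set[str]] = None) -> Optional[str]:
    """Return the eligible host with the fewest running jobs, or None.

    running_jobs maps a (hypothetical) host name to its current job count.
    include=None means "no include list configured"; exclude defaults to empty.
    """
    exclude = exclude or set()
    eligible = [
        host for host in running_jobs
        if (include is None or host in include) and host not in exclude
    ]
    if not eligible:
        return None
    # Ties are broken by host name so the result is deterministic.
    return min(eligible, key=lambda host: (running_jobs[host], host))


farm = {"node-a": 2, "node-b": 0, "node-c": 1}
print(pick_host(farm))                                # node-b
print(pick_host(farm, include={"node-a", "node-c"}))  # node-c
```

Note that this rule looks only at job counts. It knows nothing about how many jobs each host is allowed to run.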
emcodem, wrapping since 2009 you got the rhyme?
Re: FFAStrans workflows not balancing between hosts in transcoding farm
Hi emcodem
I’ll do my best to test it tomorrow since I’m off today.
And thanks a lot for the quick reaction!
BR,
Silicon
--------
FFAStrans 1.3.0.2; WebInterface 1.3.0.0
Manager: VM: 2x Xeon E5-2630v3@2.4GHz, 8GB RAM
Workers: 3x HP DL360 G9 (2x Xeon E5-2643v3@3.4GHz,16GB RAM, nVidia M2000)+ 2x Lenovo SR665 (2x AMD EPYC730216C@3.0GHz,128GB RAM, nVidia P2200)
Re: FFAStrans workflows not balancing between hosts in transcoding farm
@emcodem
I have installed the new version and it seems I have found a bug:
- I started a job manually (Submit files to "processor node")
- the workflow involved is limited to just one transcode node named "pr-carb-srv-2"
- the job appeared in the Webinterface but it is stuck in the Queued state (see screenshot)
- the job is not visible in the Status monitor (see screenshot)
What logs should I collect and send to you?
Attachments:
- Manually started job - not visible in Status monitor.PNG
- Manually started job - stucked in Queued state in Webinterface.PNG
Last edited by Silicon on Thu Jul 29, 2021 10:36 am, edited 1 time in total.
BR,
Silicon
Re: FFAStrans workflows not balancing between hosts in transcoding farm
@emcodem
I had to roll back to the official version of exe_manager because job distribution was not working as expected. What happened:
- there were two jobs running on node "GRFCODER3" (node capacity limited to two simultaneous jobs)
- the Webinterface showed a bunch of new files in Incoming status (see screenshot)
- they disappeared from the list shortly afterwards, but were not assigned to other (free) transcode nodes
Attachments:
- New jobs in 'incoming' - but not assigned to transcode nodes.PNG
BR,
Silicon
Re: FFAStrans workflows not balancing between hosts in transcoding farm
@Silicon thanks a lot for checking it out!
Yeah, sorry for that, I did not consider the "max tasks of the nodes" at all, so the current patch would only work if all nodes run the same number of jobs. On the other hand, you verified for me that the basic concept seems to work, because it preferred the node with the fewest jobs.
Lemme see how we can take the max slots into consideration.
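One possible way to take the max slots into account, sketched in Python purely for illustration (this is not the actual exe_manager logic): drop every host that has already reached its limit, then prefer the host that is least utilised relative to that limit. The function name and the second host name are hypothetical; "GRFCODER3" and its limit of two jobs come from the report above.

```python
from typing import Dict, Optional, Tuple


def pick_host_with_slots(hosts: Dict[str, Tuple[int, int]]) -> Optional[str]:
    """Pick a host for the next job, respecting per-host job limits.

    hosts maps host name -> (running_jobs, max_jobs).
    Returns None when every host is full, so the job should stay queued.
    """
    # Only hosts with at least one free slot are candidates.
    free = {
        name: (running, limit)
        for name, (running, limit) in hosts.items()
        if running < limit
    }
    if not free:
        return None
    # Lowest utilisation first, then fewest running jobs, then name.
    return min(
        free,
        key=lambda name: (free[name][0] / free[name][1], free[name][0], name),
    )


farm = {"GRFCODER3": (2, 2), "node-b": (0, 2)}
print(pick_host_with_slots(farm))                   # node-b
print(pick_host_with_slots({"GRFCODER3": (2, 2)}))  # None
```

With a filter like this, a full node is never chosen and new jobs go to the free nodes instead.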
emcodem, wrapping since 2009 you got the rhyme?
Re: FFAStrans workflows not balancing between hosts in transcoding farm
@emcodem
Thanks for your effort again. Looking forward to getting the improved version.
BR,
Silicon
Re: FFAStrans workflows not balancing between hosts in transcoding farm
@emcodem
If you don't mind, I would like to propose one more improvement:
- I think it could be beneficial to be able to define a "Host priority" (in FFAStrans configuration - Host dialog) for each host / transcode node
- more powerful nodes in the farm should be assigned a higher priority
- this attribute would be taken into account when assigning jobs to transcode nodes: higher-priority nodes would get jobs (when they are free) ahead of lower-priority nodes
What do you think?
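To make the proposal concrete, here is a rough Python sketch of how such a rule could behave. "Host priority" is only a proposed setting, so the `Host` fields, the function name, and the host names below are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    priority: int   # proposed setting: higher value = more powerful node
    running: int    # jobs currently running on the host
    max_jobs: int   # maximum simultaneous jobs allowed on the host


def pick_by_priority(hosts: List[Host]) -> Optional[Host]:
    """Return the free host with the highest priority, or None if all are full."""
    free = [h for h in hosts if h.running < h.max_jobs]
    if not free:
        return None
    # Highest priority first; among equals, the host with fewer running jobs.
    return min(free, key=lambda h: (-h.priority, h.running, h.name))


farm = [Host("slow-node", 1, 0, 2), Host("fast-node", 5, 1, 2)]
print(pick_by_priority(farm).name)  # fast-node (still has a free slot)
```

Once "fast-node" reaches its limit, the next job falls through to "slow-node".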
BR,
Silicon
Re: FFAStrans workflows not balancing between hosts in transcoding farm
Yeahhh well, from my perspective we need a much more open system for designating system resources anyway, it's kind of an old topic.
To be honest, in my mind we need not only host priority but also a "per-node" host dedication and much more... Especially when taking into account that some machines in the farm might have access to special resources like ASIC (GPU) encoding and others don't.
Anyway, your suggestion makes sense to me and it is something that @admin is interested in reading. However, from my perspective we need to work on a much smarter and much more open system for priority and distribution management...
Thanks for all the hard thought you are putting into ffastrans, Silicon, really appreciated and useful!
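As a rough illustration of the "special resources" idea (nothing like this exists in ffastrans as far as this thread says), each host could advertise a set of capabilities and each job or node could list what it requires. The capability labels, host names, and function name below are invented for the example.

```python
from typing import Dict, List, Set


def eligible_hosts(host_caps: Dict[str, Set[str]], required: Set[str]) -> List[str]:
    """Return the hosts (sorted by name) that offer every required capability."""
    return sorted(
        host for host, caps in host_caps.items()
        if required <= caps  # subset test: all requirements are available
    )


farm = {"gpu-node": {"cpu", "gpu"}, "cpu-node": {"cpu"}}
print(eligible_hosts(farm, {"gpu"}))  # ['gpu-node']
print(eligible_hosts(farm, {"cpu"}))  # ['cpu-node', 'gpu-node']
```

A priority or least-jobs rule would then only choose among the hosts this filter returns.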
emcodem, wrapping since 2009 you got the rhyme?
Re: FFAStrans workflows not balancing between hosts in transcoding farm
Hi @emcodem and @admin
I’m wondering if there is any progress on the “workflows not balancing between nodes” topic.
Thank you.
BR,
Silicon
Re: FFAStrans workflows not balancing between hosts in transcoding farm
Hi Silicon,
Some. It's not so straightforward because the masterless design kind of works against the notion of perfectly balancing hosts in a farm. However, we have done something that might improve the situation, but nothing that will perfectly balance your hosts. 1.2.2 will probably be released later today and will include the "fix". When it's released, try it and report back.
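To illustrate why balancing is hard without a central dispatcher, here is a toy Python simulation of a generic pull model in which each host, whenever it polls the shared queue, claims jobs until its own slots are full. This is an assumption made for the sake of the example and is not a description of how FFAStrans actually claims jobs; all names are hypothetical.

```python
from typing import Dict, List


def simulate_pull(queue_len: int, slots: Dict[str, int],
                  poll_order: List[str]) -> Dict[str, int]:
    """Simulate hosts pulling jobs from a shared queue without a coordinator.

    Each host in poll_order claims as many queued jobs as it has free slots.
    Returns the number of jobs claimed per host.
    """
    claimed = {host: 0 for host in slots}
    remaining = queue_len
    for host in poll_order:
        take = min(slots[host] - claimed[host], remaining)
        claimed[host] += take
        remaining -= take
    return claimed


# Four queued jobs, three hosts with two slots each, polling in order a, b, c.
print(simulate_pull(4, {"a": 2, "b": 2, "c": 2}, ["a", "b", "c"]))
# {'a': 2, 'b': 2, 'c': 0}
```

In this toy model host "c" stays idle even though an even split would be 2/1/1, because no host knows what the others have taken or are about to take.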
-steinar