Farm vs Workflow specifics

Questions and answers on how to get the most out of FFAStrans
knk
Posts: 38
Joined: Wed Jun 21, 2017 3:50 pm

Farm vs Workflow specifics

Post by knk »

Hi guys, this is a very generic question for which I can't seem to find a proper solution.

Suppose there is a 2-machine farm, each machine with 24 available slots, so up to 48 jobs running at once (hail to the RAM & CPU! lol)
Now, if I have a very specific workflow that can only run one job at a time due to several constraints, how can I make this happen?

I've been wrapping my head around this subject and I can't seem to limit the number of executions in such a scenario without cutting the available slots...

Any ideas?

All the best
emcodem
Posts: 1720
Joined: Wed Sep 19, 2018 8:11 am

Re: Farm vs Workflow specifics

Post by emcodem »

Hi again,

it means that you need more machines in your farm or expand to cloud like @FranceBB does ;-)

Sorry the prio stuff is currently hard to explain.

In case you set 24 slots ("per prio class"), you can in theory have a max of 5x25 (if not 6x25, not sure about that) jobs running, but only if you have 5 different workflows, each set to a different priority, AND the jobs in the lower prio classes are started first. Also read here: viewtopic.php?p=8961#p8961

Manually submitted (via FFAStrans GUI) jobs have their own prio class "5", so we have 6 prio classes: 0 to 4 for watchfolders and 5 for manual submits.
Again, it only works when the lower prio class jobs first fill up the max slots and after that, a higher prio class job starts.

In webint job submitter, i just noticed that i don't set the prio according to workflow settings currently, will fix that.

Also, one important note about priorities: usually you don't want to use anything higher than normal, better stick to very low, low and normal (0, 1, 2). This is because the prio is also set as the CPU priority of the running process. If you encode using high prio and use all CPUs, the Windows GUI and lots of other stuff in Windows will freeze. So anything higher than normal should only be used for very short-running processes.

There is also a field in ffastrans.json config file called "auto_pause", it would send lower prio jobs to pause in case a very high prio job comes in.
emcodem, wrapping since 2009 you got the rhyme?
admin
Site Admin
Posts: 1676
Joined: Sat Feb 08, 2014 10:39 pm

Re: Farm vs Workflow specifics

Post by admin »

Hi knk,

You cannot limit a workflow like this. The closest (without adding another host for this one workflow) might be to use the feature where you tell a node (or even all nodes in your workflow) to occupy more slots in order to limit the total number of executed nodes per host. But a node is limited to occupying 16 slots, so if you have configured your host to 24 you will still have 8 free slots to use. Please also note that an incoming node with higher priority that is set to occupy more slots (above just 1) than are available will hold until that number of free slots is reached.
Just double click the node to configure the "Job processing slots" setting:
job_slots.png
Hope this helps to somewhat accomplish what you need.

-steinar
emcodem
Posts: 1720
Joined: Wed Sep 19, 2018 8:11 am

Re: Farm vs Workflow specifics

Post by emcodem »

But aren't our 2 answers combined the ultimate answer?
E.g. your special workflow is set to higher prio than everything else in the system and you set the processor that can only execute once (or all processors) to occupy only one slot as steinar said?
FranceBB
Posts: 255
Joined: Sat Jun 25, 2016 3:43 pm

Re: Farm vs Workflow specifics

Post by FranceBB »

emcodem wrote: Mon Sep 30, 2024 8:26 pm it means that you need more machines in your farm or expand to cloud like @FranceBB does ;-)
I think that what I do isn't applicable to like 99.9% of users, but... sure!
I mean, you don't really have resource issues when you can spin up 640 EC2 in the blink of an eye on AWS ehehehehe
Behold the 640-server FFAStrans 1.4.0.7 Stable cloud farm:
Screenshot from 2024-10-01 10-13-18.png
The obvious downside is that Amazon isn't a charity fund, it will charge you... a lot...
I'm told that other acceptable forms of payment for the EC2 cost are a kidney, your soul or the life of your first born child. xD
Jokes aside, the cloud is expensive... very expensive... If it were up to me I would have left everything on prem, but as Ben Ben would say "c'est la vie".

Anyway, for a practical on prem approach, I'd say that what Grandmaster suggested is the way to go.
Raising the resources used by a node in your special workflow will hopefully saturate all the slots and prevent the server from picking up more jobs.
They'll still queue up, though, and they're gonna be picked up once it completes.
knk
Posts: 38
Joined: Wed Jun 21, 2017 3:50 pm

Re: Farm vs Workflow specifics

Post by knk »

Hey guys, thanks for the inputs!

This came up when I created a very demanding GPU process inside a specific workflow. From my testing, the machine can keep working on other workflows simultaneously, since everything else can be CPU and RAM demanding but does not seem to interfere much, so I wouldn't "lose" processing capability while said workflow is running. On the other hand, if the GPU-demanding workflow starts in more than 1 instance, the machine can become unresponsive.

My workaround at the moment is very file input based, but it's not "pretty", since it takes 2 workflows and scripts to ensure that only one file ends up on the input folder at a time.
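
A file-input-based gate of this kind could look roughly like the sketch below. It is not knk's actual scripts (those are not shown in the thread), only a minimal illustration. The folder layout (a "staging" folder feeding the watch folder) and the function name feed_one are hypothetical, and it assumes the workflow removes the source file from the watch folder only once processing has finished.

```python
import shutil
from pathlib import Path
from typing import Optional


def feed_one(staging: Path, watch: Path) -> Optional[Path]:
    """Move the oldest staged file into the watch folder, but only if it is empty.

    Returns the new path of the moved file, or None if nothing was moved.
    Assumption: a file still sitting in `watch` means a job is still in progress.
    """
    if any(p.is_file() for p in watch.iterdir()):
        return None

    # Oldest first; the file name breaks ties between equal timestamps.
    candidates = sorted(
        (p for p in staging.iterdir() if p.is_file()),
        key=lambda p: (p.stat().st_mtime, p.name),
    )
    if not candidates:
        return None

    oldest = candidates[0]
    target = watch / oldest.name
    shutil.move(str(oldest), str(target))
    return target
```

Run periodically (for example from a scheduled task), this lets at most one file sit in the watch folder at any time, which is the "not pretty" part: the serialization lives outside FFAStrans.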

If I understood @Steinar correctly, if I lower one of the machines to 16 slots and on the process step occupy the 16 slots, this will prevent the machine from starting other jobs, right?
It's not ideal, since it takes the machine "down" for a few hours while it does the demanding GPU process, but I'll test it...

Thank you guys!

PS @FranceBB now that is a proper farm! 640 nodes... I hope you have a good pickup truck to run that farm end to end and not the old grandpa's tractor :D
emcodem
Posts: 1720
Joined: Wed Sep 19, 2018 8:11 am

Re: Farm vs Workflow specifics

Post by emcodem »

@knk
ah yeah, as you said you work with nvidia, i recalled that i came up with a solution for a very similar problem here:
viewtopic.php?p=6188&hilit=nvidia+smi#p6188

Not perfect but maybe one of the better alternatives.
It is also the reason why i asked @admin for a "hook" that allows executing some script before a watchfolder actually starts a job.
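
The linked post is not reproduced here, so the following is only a rough sketch of the general idea of checking the GPU before letting a job start, not emcodem's actual solution. It queries nvidia-smi for the utilization of each GPU; the function names and the 20% busy threshold are assumptions chosen for illustration.

```python
import subprocess
from typing import List


def parse_utilization(csv_text: str) -> List[int]:
    """Parse one integer percentage per line (csv,noheader,nounits output)."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_gpu_utilization() -> List[int]:
    """Ask nvidia-smi for the current utilization (percent) of every GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_utilization(result.stdout)


def gpu_is_free(utilizations: List[int], busy_threshold: int = 20) -> bool:
    """True when at least one GPU was reported and all are below the threshold.

    The threshold is an assumed value; tune it to the actual workload.
    """
    return bool(utilizations) and all(u < busy_threshold for u in utilizations)
```

A script run before the job could call gpu_is_free(query_gpu_utilization()) and wait or exit with an error when it returns False. Note that a single utilization sample can be misleading (a GPU job may be between kernels), and some GPUs report a non-numeric value for this field, which this sketch does not handle.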
admin
Site Admin
Posts: 1676
Joined: Sat Feb 08, 2014 10:39 pm

Re: Farm vs Workflow specifics

Post by admin »

So knk, it's basically the same use case as we have where we use FFAStrans to drive Whisper transcriptions. If you leave your hosts at 24 slots and set the GPU intensive node to utilize 13 slots, then you will still have 11 free slots for other operations but not enough room for two 13 slot nodes on one host. This is how we utilize this feature and it works very well.

-steinar
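
The slot arithmetic behind this can be sketched in a few lines. This is a simplified model that only counts slots on a single host and ignores priority classes; the function names are made up for illustration.

```python
def max_concurrent(host_slots: int, node_slots: int) -> int:
    """How many copies of a node occupying `node_slots` fit on one host at once."""
    if host_slots <= 0 or node_slots <= 0:
        raise ValueError("slot counts must be positive")
    return host_slots // node_slots


def free_slots_while_running(host_slots: int, node_slots: int) -> int:
    """Slots left for other jobs while one such node is running."""
    return max(host_slots - node_slots, 0)


# Cases mentioned in the thread, plus 12 slots to show where the limit breaks.
for host, node in [(24, 13), (24, 16), (16, 16), (24, 12)]:
    print(f"host={host:2d} node={node:2d} -> "
          f"max concurrent={max_concurrent(host, node)}, "
          f"free while running={free_slots_while_running(host, node)}")
```

In this model the heavy node runs at most once per host as long as it occupies more than half of the host's slots, so 13 is the smallest value that works for a 24-slot host, while 12 would still allow two copies side by side.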
emcodem
Posts: 1720
Joined: Wed Sep 19, 2018 8:11 am

Re: Farm vs Workflow specifics

Post by emcodem »

I just needed a wf that processes one file at a time while the other workflows are not blocked.
The workflow i linked above works perfectly, i should use it more often :D But don't forget to install the plugin processor "wait until file ist oldest"