No way to abort a job that failed on non-existing farm node/workflow

Post by **veks** » Mon Jun 13, 2022 7:55 am

Hi,
there seems to be no way to abort a job that is being started in a workflow assigned to a server that is DOWN.
I've rebooted the main FFAStrans server, restarted the services, and aborted the jobs several times, but they try to start again and again each time...

How to solve this?

Tnx!

Post by **admin** » Tue Jun 14, 2022 11:30 am

Hi veks,

Depending on how your system is configured and where a job was in the process when the server went down, the only host that will respond to the abort may be the host that is down. If that host doesn't come back, we are stuck with an incomplete, orphaned job that is not going anywhere.
You should also be aware that if a workflow is configured to run on a host that is down, it basically won't work until you reconfigure it to run on other available hosts or bring the host back.

Currently there is no mechanism for cleaning up the mess left by this scenario other than manually deleting the db files associated with the job. This is not complicated, but it's not the preferred approach. Will your host never be up and running again?

-steinar

FFAStrans forum
