How to setup farm correctly? Errors when farming
Hi all,
the topic says it all.
The problem: I have a workflow which runs perfectly on a local machine OR when set as the only host in a farming environment. But when I run this workflow on the farming hosts "renderer 1" and "renderer 2", I get various errors.
I attach the workflow for reference.
As an example, I add a log where the AviSynth node "FPS-converter" failed.
The network is 192.168.100.xxx with netmask 255.255.255.0.
The DHCP server is an IPFire installation on a dedicated machine.
All network devices are mapped as network drives (e.g. \\192.168.100.192 = Z: for the NAS).
The FFAStrans config is:
GENERAL:
- Shared global media cache: Z:\ffastrans_CACHE
HOST:
- CHECKED Use Global Shared Media Cache
- CHECKED Run As Application
- URL: http://192.168.100.226:65445/api/json/v1 <- I never changed that
- Resources: CHECKED Enable Local Workflow Processing As Application
- CPU-roof 90
- Max Active Jobs 8 (on an 8-core AMD Ryzen machine)
Maybe... does anyone have an idea how to get the renderers to run in parallel?
cheers,
tom
Attachments:
- 20190626-094638-992-E2E115C046D3.txt (162.05 KiB)
- Full_Scanner_Workflow_Stumm_26JUN2019_copy.xml (95.61 KiB)
Re: How to setup farm correctly? Errors when farming
Hey Thomas!
Is it possible for you to get us the AVS file of interest? It is mentioned in the log:
(Z:\FFasTrans_CACHE\20190626093535\20190626-094638-992-E2E115C046D3\avs_v_fpsconv_20190626-094931-863-3B2E12BAC34A.avs, line 57, column 28)
If needed, you can enable "keep cache files" in your workflow settings...
cheers!
emcodem, wrapping since 2009 you got the rhyme?
Re: How to setup farm correctly? Errors when farming
Hi emcodem,
thank you for your support and sorry for responding late - we were on tour the last few days.
I'm afraid it is not a spoiled AviSynth script but a problem with picking up files and/or distributing tasks between our two render stations. I attach screenshots from three encode sessions with the workflow from the last posting. They clearly show that not only AviSynth scripts but also other nodes on both renderers give an error. All three encodings used the same files. As you can see, the errors occur at different stages in each run. Before each run, the pickup nodes and delivery directories were reset, the history was cleared and all encoded files were deleted.
I thought it might be because of the intensive branching, so I took some other files and a one-branch workflow (attached).
But I was not successful. See the next screenshot for some strange behaviour: the workflow discards some files as if they had no video (they definitely have...) and tries to pick up successful transcodings again.
I attach all the files as a .ZIP because my browser or intranet seems to kill the screenshots after uploading to this forum... Sorry for the inconvenience.
Attachments:
- Screenshots_and_Workflow.zip (892.13 KiB)
- Workflow_Run_No_1.png (133.48 KiB)
Re: How to setup farm correctly? Errors when farming
Hey Thomas,
Yeah, I also guess your problem is about path access only. I asked for the AviSynth script because I wanted to see the paths of the referenced files and the contents of the variables. But your screenshot makes it pretty clear that it is about access to Z:
So, are you 100% certain that both machines see the EXACT path Z:\WATCHFOLDER\FA_Scanner\SCANNER_STUMM\ ?
You can also run the command "net use" on the command line on both machines; all entries have to be exactly the same.
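If comparing the two "net use" listings by eye is tedious, a small script can do it. The following is only a sketch in Python: it assumes you saved the output on each machine to a text file (for example with "net use > renderer1.txt"), and that share names contain no spaces. The function names are made up for this example.

```python
import re

# Matches a drive letter followed by a UNC path on one line of `net use` output.
# Assumption: share names contain no spaces.
MAPPING = re.compile(r"\b([A-Za-z]:)\s+(\\\\\S+)")


def parse_net_use(text):
    """Return {drive letter: UNC path} found in the text printed by `net use`."""
    mappings = {}
    for line in text.splitlines():
        match = MAPPING.search(line)
        if match:
            # Drive letters and UNC paths are case-insensitive on Windows.
            mappings[match.group(1).upper()] = match.group(2).lower()
    return mappings


def compare_mappings(a, b):
    """Return {drive: (target on a, target on b)} for every drive that differs."""
    diffs = {}
    for drive in sorted(set(a) | set(b)):
        if a.get(drive) != b.get(drive):
            diffs[drive] = (a.get(drive), b.get(drive))
    return diffs
```

Feed the contents of both saved files through parse_net_use and pass the two results to compare_mappings. An empty result means the drive mappings match; a drive that shows None on one side is mapped on only one of the machines.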
cheers!
emcodem, wrapping since 2009 you got the rhyme?
Re: How to setup farm correctly? Errors when farming
emcodem...
OMG... In fact I had the same resource mapped twice under different drive letters (T:\ and Z:\; S:\ and Y:\). I let the same files run through the farm again (render 1 and render 2 activated). Better, but still imperfect. I attach the two logs from the files which got stuck on the FPS (!) node again.
I tried to get the .AVS file as you described earlier, but I could not find Z:\FFasTrans_CACHE\20190626104342\20190701-155136-531-DFFFB924F892\avs_v_fpsconv_20190701-155420-600-12B64EC0A85C.avs because the content of that directory changes from second to second. No jobs were running at that time...
Sorry, but it seems I am a little helpless out here...
regards,
tom
Attachments:
- 20190701-155143-164-3EB861429702.txt (163.45 KiB)
- 20190701-155136-531-DFFFB924F892.txt (162.98 KiB)
Re: How to setup farm correctly? Errors when farming
Sorry for the delay!
My feeling is that this error is about the variable %s_assumed_fps_from_log% not being set correctly; the reason might also be file access, as you read its value from a file.
Anyway, as you use AviSynth a lot and this is the second time you have had trouble with it, it is time for you to practise debugging it a little.
Sorry for the misleading tip about the cache files; sure, they disappear immediately.
To get hold of the AviSynth script of interest, just insert a deliver node into your workflow and connect it to the output of an AviSynth node (e.g. "Set Assume FPS" and "FPS Converter"). This will copy the AVS file of interest to the location you configure in the deliver node.
Then investigate and check what it writes in line 57. Let me know the outcome please!
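Once the AVS file has been delivered, a few lines of Python can pull out the line the error message points at and list anything that still looks like an unfilled variable. This is just a sketch. It assumes an unset variable is left in the script as a literal %name% token, which is a guess for illustration and not documented behaviour, and the function name is made up.

```python
import re

# Assumption: a variable that was never filled in stays in the script as a
# literal %name% token. This is a guess, not documented behaviour.
PLACEHOLDER = re.compile(r"%[A-Za-z_][A-Za-z0-9_]*%")


def inspect_avs(script_text, line_number):
    """Return (text of the 1-based line or None, [(line no, token), ...])."""
    lines = script_text.splitlines()
    in_range = 0 < line_number <= len(lines)
    target = lines[line_number - 1] if in_range else None
    # Collect every %name% token together with the line it appears on.
    leftovers = [
        (number, token)
        for number, line in enumerate(lines, start=1)
        for token in PLACEHOLDER.findall(line)
    ]
    return target, leftovers
```

Read the delivered script into a string and call inspect_avs(text, 57). The first value is line 57 as written; the second is a list of suspicious tokens with their line numbers. An empty or obviously wrong number on line 57 would point back at the variable.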
cheers!
emcodem, wrapping since 2009 you got the rhyme?
Re: How to setup farm correctly? Errors when farming
Hi emcodem!
You do not have to apologize for any delay, as I am the noob who has to be thankful for your patience. Big up!
So, what I did now was reorder the nodes a little. Now I do not get the AVS script error anymore. I set the read-out of the variable as one of the first nodes of the workflow.
Moreover, I changed it a bit to get a two-branch workflow rather than three branches, and I put in some more "wait for files" nodes.
Nevertheless, I assume there are still issues regarding file access. I attach the latest monitor log and the latest version of the workflow. The files I submitted were six .AVI files of 100 GB to 200 GB each. Only one succeeded.
Is it maybe possible that the distribution to the renderers is too fast, so that a file has not been written completely?
Or...
Any issues with Synology NAS?
So many bytes... so little time...
regards, tom
Re: How to setup farm correctly? Errors when farming
sorry,
the workflow is always rejected...
Re: How to setup farm correctly? Errors when farming
emcodem,
also thank you for the tip with the .AVS command read-out. Great feature.
But I have another little issue; maybe it has to do with the file access...
When looking at the workflow properties, I cannot see the workstations in the Farming tab. I see only the machine I am currently working on. Also, I can select this workstation multiple times. Is that normal behaviour?
regards,
tom
Attachments:
- Farming_multiple_Renderer01.png (370.59 KiB)
Re: How to setup farm correctly? Errors when farming
Hey Thomas!
I don't think there is a problem with "file is not being written completely" or Synology or so; your monitor node setting "check once for growing" is good. Maybe you also want to check "Forget missing" in order to be able to process the same filename again after it has left the watchfolder once.
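For reference, the general idea behind a "growing" check can be sketched in a few lines of Python. This is only the generic technique, not how FFAStrans implements it, and the function name and default wait time are made up.

```python
import os
import time


def is_still_growing(path, wait_seconds=5.0):
    """Return True if the file size changes while we wait."""
    first_size = os.path.getsize(path)
    time.sleep(wait_seconds)
    return os.path.getsize(path) != first_size
```

A file is treated as complete once two size readings taken some seconds apart agree. Keep in mind that some copy tools reserve the full file size up front, so a size check alone can be fooled.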
I checked all nodes in your workflow (and the workflow settings) for paths and found that you currently use these paths:
Z:\WATCHFOLDER\FA_Scanner\SCANNER-STUMM
Y:\WaitingDirectory\%s_archnumber%
S:\NICHT LOESCHEN FFAStrans\
Z:\FFAStrans_Logs\
All these paths must be exactly the same on all systems, and FFAStrans must be started on all nodes from S:\NICHT LOESCHEN FFAStrans\
Also, make sure to check the "use global shared media cache" option in Configuration->Host, and in Configuration->General set it to a shared directory like S:\.ffastrans_work_root
In general, make sure that no file at all ever goes to a local drive like C:\, especially not temporary files like encoded files or AVS files in the cache directory.
Can you confirm that?
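A quick way to double-check the "nothing on a local drive" rule is to list every path the workflow uses and flag the ones that sit on C:. Here is a minimal Python sketch; the helper names are made up, and the list holds the four paths from this post (without trailing backslashes).

```python
import ntpath

# The paths found in the workflow and its settings.
WORKFLOW_PATHS = [
    r"Z:\WATCHFOLDER\FA_Scanner\SCANNER-STUMM",
    r"Y:\WaitingDirectory\%s_archnumber%",
    r"S:\NICHT LOESCHEN FFAStrans",
    r"Z:\FFAStrans_Logs",
]


def drive_of(path):
    """Return the upper-cased drive part of a Windows path, e.g. 'Z:'."""
    return ntpath.splitdrive(path)[0].upper()


def paths_on_local_drives(paths, local_drives=("C:",)):
    """Return the paths whose drive letter is in local_drives."""
    local = {drive.upper() for drive in local_drives}
    return [path for path in paths if drive_of(path) in local]


# Prints [] when none of the workflow paths is on a local drive.
print(paths_on_local_drives(WORKFLOW_PATHS))
```

This only looks at the drive letters, so it does not prove that Z:, Y: and S: point to the same share on every machine; that still needs the "net use" comparison on each host.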
Also, your screenshot of the Farming tab looks strange; it should only look that way if you added renderer01 multiple times to the list.
It looks like there is something strange about your farm; make sure that on both machines you start FFAStrans from exactly the same path (S:)...
For your log.txt, it would be good if we also had the processing logs for it, because one would see the path/filename that was not found on RENDERER-02.
All your errors seem to come from RENDERER-02; I assume one of the mapped network drives is not the same as on the other machine.
emcodem, wrapping since 2009 you got the rhyme?