Webinterface

Questions and answers on how to get the most out of FFAStrans
DCCentR
Posts: 10
Joined: Thu May 04, 2023 7:15 am

Re: Webinterface

Post by DCCentR »

emcodem wrote: Fri Nov 08, 2024 9:37 pm Hi @DCCentR

Sorry for letting you run into this problem in the first place, and thanks for the report; it's perfect that you delivered the webint log along with it.
So my guess would be that you only see this behaviour when there are jobs running, taking away CPU and especially bandwidth to your NAS09 share.
It is of course important for me to support such problems, but it is kind of hard to test. Can you please try raising the "API timeout" in webui->admin->Network settings from 7000 to e.g. 120000? Meanwhile I'll reactivate an old Raspberry Pi and configure it as a WLAN NAS so I have something slow for testing this kind of situation.

What exactly happens:
Webint reads lots of files from the job history folder (history jobs can currently block the update of active jobs). When there is network activity, the latency to the share rises. E.g. if you have 2000 jobs in the ffastrans history and each one usually takes 0.1 ms, the response usually takes 200 ms. So far so good, but as soon as you have network activity on the same network cable, the time for each file rises and we soon have 10 ms per file or more, leading to a very long update time.
Of course I already have multiple strategies to cache stuff and avoid this, but it looks like some of them are not working as expected.
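
To make the arithmetic in the explanation above concrete, here is a minimal Python sketch. The function name is hypothetical, and comparing the total against the 7000 ms "API timeout" assumes that the timeout covers one complete history scan, which the thread does not confirm.

```python
def history_scan_ms(job_count: int, per_file_ms: float) -> float:
    """Total time to read one file per history job, in milliseconds."""
    return job_count * per_file_ms


# "API timeout" value mentioned in the post (assumed to apply to a whole scan)
API_TIMEOUT_MS = 7000

for per_file_ms in (0.1, 1.0, 10.0):
    total = history_scan_ms(2000, per_file_ms)
    status = "ok" if total <= API_TIMEOUT_MS else "exceeds timeout"
    print(f"{per_file_ms} ms/file -> {total:.0f} ms total ({status})")
```

With 2000 history jobs, 0.1 ms per file gives the 200 ms from the post, while 10 ms per file gives 20 seconds, well past 7000 ms.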
Hi @emcodem

Thank you for your reply and explanation. Changing the API timeout to 120000 did not help. It is not necessary to restart the webinterface service to apply it, right?
logs afte120000.7z
(294.97 KiB) Downloaded 5 times

I tried reorganizing the tasks a bit to take load off NAS09 (the diagram shows the old scheme that caused the problem and the new one):
Диаграмма без названия.png (92.47 KiB)

So far, the problem has not occurred again.
logs_after_reorganize.zip
(783.44 KiB) Downloaded 3 times
DCCentR
Posts: 10
Joined: Thu May 04, 2023 7:15 am

Re: Webinterface

Post by DCCentR »

Unfortunately, reorganizing modules/tasks didn't help. Running and finished workflows are again not displayed while some of the workflows are executing:
Снимок экрана 2024-11-11 171101.png (136.14 KiB)
Снимок экрана 2024-11-11 171038.png (179.75 KiB)
log:
webint.zip
(346.99 KiB) Downloaded 3 times
emcodem
Posts: 1743
Joined: Wed Sep 19, 2018 8:11 am

Re: Webinterface

Post by emcodem »

@DCCentR

Oh man, how unfortunate, sorry for causing you trouble :D
Meanwhile I raised the topic internally of missing documentation/recommendations on the ideal system setup, especially for NAS-based installations. In fact, steinar serves the ffastrans db folder from a separate VM. In a perfect scenario, the connection between this "Database VM" and the ffastrans hosts runs over a separate control network where no media data is transferred, to keep the latency low. Of course not everyone can build a separate control network, and we'll always have to keep an eye on this topic.

On the code front, I experimented with a Raspberry Pi NAS over WiFi and I think I was able to solve the most disturbing problems. You know, WiFi has bad latency by default, and when I put some traffic on it... you know...
The raised timeout was also not applied directly as it should be (just as you suspected, the service had to be restarted); this should be fixed too. But in my testing, just raising the timeouts was not really a good solution anyway; only optimizing the file caching really helped.
emcodem, wrapping since 2009 you got the rhyme?
DCCentR
Posts: 10
Joined: Thu May 04, 2023 7:15 am

Re: Webinterface

Post by DCCentR »

@emcodem
Hey, no problem ;)

Thank you for your efforts in solving this issue.

Regarding the separate management network, I can connect the servers with “FFAStrans & webinterface dir” and “FFAStrans Web server” with 1Gbit (or even 10Gbit if needed :) ):
Снимок экрана 2024-11-12 121735.png (24.38 KiB)
Do you think a separate management network for these servers would be enough to solve the web interface issue?
emcodem
Posts: 1743
Joined: Wed Sep 19, 2018 8:11 am

Re: Webinterface

Post by emcodem »

@DCCentR
Well, in the first place you are definitely facing software shortcomings, e.g. the caching can be optimized a lot, and it must be optimized. Not everyone can build a separate control network ;)

But yes, I also think that a 1Gbit control network is more than enough; as long as you can keep the data transfers off the control network NIC, the latency to the files should be good. But I haven't tested such a setup yet.
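
One way to check whether the latency to the files is actually good on a given setup is to time how long it takes to read a batch of small files from the share. The following Python sketch is only an illustration: the function name and its parameters are hypothetical and it is not part of FFAStrans or the webinterface.

```python
import time
from pathlib import Path


def mean_read_latency_ms(folder: str, pattern: str = "*.json", limit: int = 200) -> float:
    """Average wall-clock time to fully read one file from `folder`, in ms.

    Returns 0.0 when no file matches the pattern.
    """
    files = sorted(Path(folder).glob(pattern))[:limit]
    if not files:
        return 0.0
    start = time.perf_counter()
    for f in files:
        f.read_bytes()  # read the whole file, like a JSON loader would
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / len(files)
```

Running it against the share once while the farm is idle and once while jobs are transferring media should show how much the per-file time grows. Note that the operating system may cache files between runs, so repeated measurements on the same files can look better than a cold read.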

However, I would be glad if you could support me with testing the speed optimisations. Here is what I have right now:
https://github.com/emcodem/ffastrans_we ... 4.0.88.zip

Not everything is optimized yet, e.g. if you have many "incoming" jobs from watchfolders, it would probably still be laggy. But the problems you face now should be solved, because I now only read the JSON files of "new" jobs instead of the JSONs of 100 jobs every 3 seconds.
Also, I will separate reading history jobs from reading active jobs, because updating the history is not as important as updating active jobs.
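
The idea of reading only the JSON files of "new" jobs, instead of re-reading every file on each refresh, can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the actual webinterface code: the class name, the one-file-per-job layout and the `*.json` pattern are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict


class JobCache:
    """Keeps parsed job JSON in memory and only reads files not seen before."""

    def __init__(self, folder: str) -> None:
        self.folder = Path(folder)
        self.jobs: Dict[str, Any] = {}  # file name -> parsed JSON
        self.reads = 0                  # number of files actually read

    def refresh(self) -> int:
        """Scan the folder once and return how many new files were parsed."""
        current = {p.name: p for p in self.folder.glob("*.json")}
        # Forget cached entries whose file no longer exists
        for name in list(self.jobs):
            if name not in current:
                del self.jobs[name]
        new = 0
        for name, path in current.items():
            if name in self.jobs:
                continue  # already cached, no file read needed
            try:
                self.jobs[name] = json.loads(path.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                continue  # unreadable or half-written file, retry on next refresh
            self.reads += 1
            new += 1
        return new
```

Each refresh still lists the directory, but the expensive part (opening and parsing every file) only happens once per job. A sketch like this never re-reads a file whose content changes after the first read, so a real cache would also need some invalidation, for example by comparing modification times.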
DCCentR
Posts: 10
Joined: Thu May 04, 2023 7:15 am

Re: Webinterface

Post by DCCentR »

Visually so far so good :) thanks!
1g.png (200.6 KiB)
2g.png (286.96 KiB)
Logs.zip
(966.27 KiB) Downloaded 2 times
DCCentR
Posts: 10
Joined: Thu May 04, 2023 7:15 am

Re: Webinterface

Post by DCCentR »

Another little report on v1.4.0.88.
Somehow these two jobs are stuck in "running", although they had already finished:
Снимок экрана 2024-11-13 120354.png (295.61 KiB)
Снимок экрана 2024-11-13 120439.png (96.33 KiB)
logs.zip
(1.27 MiB) Downloaded 1 time
cache jobs 20241113-0752-1378-833b-febd8a09d442.zip
(528.02 KiB) Not downloaded yet
Didn't try restarting the webinterface service yet.
emcodem
Posts: 1743
Joined: Wed Sep 19, 2018 8:11 am

Re: Webinterface

Post by emcodem »

@DCCentR Oha, I guess that is not performance related at all :D But most likely "cache" related.
I have not seen this behaviour yet, but it seems to be somehow related to jobs with "multiple tasks", and usually my jobs are just single task.
This time, unfortunately, the logs and cache folders are not really helpful, but I'd like to know:
1) Just refreshing the page does not "solve" the issue?
2) Can you try the localhost:3003 API method GET /jobs and tell me the output?

The GET /jobs method returns the last 100 history jobs, and when you scroll to the bottom you will also see the "active" jobs. Do you see the same 2 jobs there? If yes, can you also check the processors/db/cache/monitor folder and see if there are JSON files for the jobs you see on webint?
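
The two checks above can also be scripted. The Python sketch below is a rough illustration: the port 3003, the GET /jobs method and the processors/db/cache/monitor folder come from this post, while the function names are hypothetical, the structure of the JSON response is not assumed at all, and matching files by "file name contains the job ID" is an assumption.

```python
import json
import urllib.request
from pathlib import Path
from typing import Any, Dict, Iterable, List


def fetch_jobs(base_url: str = "http://localhost:3003", timeout_s: float = 10.0) -> Any:
    """GET <base_url>/jobs and return the decoded JSON body as-is."""
    with urllib.request.urlopen(f"{base_url}/jobs", timeout=timeout_s) as resp:
        return json.load(resp)


def find_cache_files(monitor_dir: str, job_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Map each job ID to the JSON file names in monitor_dir that contain it.

    Assumption: the cache file name includes the job ID.
    """
    names = [p.name for p in Path(monitor_dir).glob("*.json")]
    return {job_id: sorted(n for n in names if job_id in n) for job_id in job_ids}
```

`fetch_jobs()` simply dumps whatever the API returns, so the stuck jobs can be looked up by eye. Their IDs can then be passed to `find_cache_files()` together with the full path to the processors/db/cache/monitor folder to see whether matching JSON files exist.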
DCCentR
Posts: 10
Joined: Thu May 04, 2023 7:15 am

Re: Webinterface

Post by DCCentR »

1) Nope, refreshing didn't help
2) These two "active" jobs are in the output:
jobs.json
(25.31 KiB) Downloaded 1 time
Yes, there are two JSON files with matching IDs:
Снимок экрана 2024-11-13 204403.png (39.22 KiB)
Something is weird with their security permissions: I can't see the owner etc., and they can't be opened for reading either (even under a domain administrator account) :?
I have the "FFAStrans REST-Service" & "FFAStrans Webinterface" services running on every machine in the farm under a domain administrator account.

Here's the log from there just in case:
monitor log.zip
(159.48 KiB) Downloaded 1 time
UPD: rebooting the server that hosts the FFAStrans dir solved the security permissions problem on the JSON files :D
artjuice
Posts: 36
Joined: Mon Mar 20, 2023 11:33 pm

Re: Webinterface

Post by artjuice »

Hi emcodem
We have been using the new webinterface_1.4.0.85 for a few days, and today we noticed a problem: the FFAStrans WebInterface process is using almost 30 GB of RAM :D
Which logs can I send you for review?