Webinterface

Questions and answers on how to get the most out of FFAStrans
emcodem
Posts: 1743
Joined: Wed Sep 19, 2018 8:11 am

Re: Webinterface

Post by emcodem »

DCCentR wrote: Wed Nov 13, 2024 5:57 pm UPD: rebooting the server with FFAStrans dir solved the problem with security permissions on json's :D
Fantastic report, thanks a lot for the debugging you did.
So there is a recent open bug in ffastrans about leaving these files behind, but i guess losing all the security info on them basically means a bug with the storage or even the Microsoft SMB client (unlikely). We have the same issue at work on NetApp sometimes with mxf files, but there are so many servers and clients involved that it's nearly impossible to debug. So we just inform the storage admins from time to time; they are able to delete the stuff from the command line directly on the storage.

However, this case should be much simpler because the reboot solved it. I guess one of our processes is just still holding the file handle and won't let go of it. But we don't know currently if it is ffastrans itself or webint.
Maybe next time you can apply the procedure to find out which process is holding a file handle to the affected file described here: https://serverfault.com/questions/1966/ ... in-windows

Also, if i remember correctly on Netapp and Isilon we had to opt-in a related feature called "oplocks" (opportunistic file locking), maybe you can check back with the storage documentation if there is some related setting and report back about it?

One thing i want to add for future emcodem here is that webinterface opens the files with "share mode delete" enabled. The documentation of these file open flags is not really good but i believe it means that webint can open the file for reading "while" another one (ffastrans) is allowed to delete the same file - in this case when webint closes the filehandle, the OS should send a delete command to the storage (which it kind of did, why did the permissions get lost otherwise).
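To make the "share mode delete" idea concrete, here is a minimal Python sketch of opening a file for reading while allowing other processes to delete it. This is not the webinterface's actual code; the function names `share_mode`, `open_shared_for_read` and `close_handle` are made up for illustration. The constants are the documented Win32 values for `CreateFileW`. As far as I understand the semantics, a file deleted by someone else while such a handle is open typically stays in a "delete pending" state until the last handle is closed.

```python
import sys

# Documented Win32 constants for CreateFileW.
GENERIC_READ = 0x80000000
FILE_SHARE_READ = 0x00000001
FILE_SHARE_WRITE = 0x00000002
FILE_SHARE_DELETE = 0x00000004
OPEN_EXISTING = 3
FILE_ATTRIBUTE_NORMAL = 0x80


def share_mode(allow_delete=True):
    """Build the dwShareMode value; FILE_SHARE_DELETE lets others delete/rename."""
    mode = FILE_SHARE_READ | FILE_SHARE_WRITE
    if allow_delete:
        mode |= FILE_SHARE_DELETE
    return mode


def _kernel32():
    """Load kernel32 lazily so the module can be imported on any OS."""
    if sys.platform != "win32":
        raise OSError("CreateFileW is only available on Windows")
    import ctypes
    from ctypes import wintypes

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.CreateFileW.argtypes = [
        wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD, wintypes.LPVOID,
        wintypes.DWORD, wintypes.DWORD, wintypes.HANDLE,
    ]
    k32.CreateFileW.restype = wintypes.HANDLE
    k32.CloseHandle.argtypes = [wintypes.HANDLE]
    k32.CloseHandle.restype = wintypes.BOOL
    return ctypes, k32


def open_shared_for_read(path, allow_delete=True):
    """Open an existing file for reading and return the raw Win32 handle."""
    ctypes, k32 = _kernel32()
    handle = k32.CreateFileW(
        path, GENERIC_READ, share_mode(allow_delete), None,
        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None,
    )
    # INVALID_HANDLE_VALUE is (HANDLE)-1.
    if handle is None or handle == ctypes.c_void_p(-1).value:
        raise ctypes.WinError(ctypes.get_last_error())
    return handle


def close_handle(handle):
    """Close a handle returned by open_shared_for_read."""
    _, k32 = _kernel32()
    k32.CloseHandle(handle)
```

On Windows, `h = open_shared_for_read(r"C:\some\job.json")` followed by `close_handle(h)` would hold the file open in this mode; the path is just a placeholder.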
Don't get me wrong, i am not sure if the problem is actually anyhow related to webinterface's bare existence, it is absolutely possible that the same error would occur when ffastrans runs alone. We just don't know currently.
[Attachment: file_open_shared.jpg]
emcodem, wrapping since 2009 you got the rhyme?

Re: Webinterface

Post by emcodem »

artjuice wrote: Wed Nov 13, 2024 6:47 pm Hi emcodem
We have been using the new webinterface_1.4.0.85 for a few days, and today we saw a problem: the FFAStrans WebInterface process uses almost 30 GB of RAM :D
Which logs can I send you for review?
Dear @artjuice,
it is so nice of you to support us in getting this webint release done, thanks also for this report. The issues are of course related to the latest changes, which optimize caching. I already found and mitigated most if not all of them. I'll upload another prerelease as soon as i am happy with the changes and notify you here.
But man, really, you got 30 gig ram usage from job json file caching in a few days? How many jobs do you run through every day? :D
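For readers wondering how a job json cache can be kept from growing without limit, here is a rough Python sketch of a size-bounded LRU cache. It is only an illustration of the general technique, not what webint actually does; the class name `BoundedJsonCache` and the `max_entries` parameter are made up for this example.

```python
from collections import OrderedDict
import json
import os


class BoundedJsonCache:
    """LRU cache for parsed JSON files with a hard cap on the entry count."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # path -> (mtime_ns, parsed object)

    def get(self, path):
        """Return the parsed JSON for path, re-reading it if the file changed."""
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            # Cache hit: mark as most recently used.
            self._entries.move_to_end(path)
            return cached[1]
        with open(path, "r", encoding="utf-8") as fh:
            parsed = json.load(fh)
        self._entries[path] = (mtime, parsed)
        self._entries.move_to_end(path)
        # Evict least recently used entries once the cap is exceeded.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return parsed

    def __len__(self):
        return len(self._entries)
```

Capping by entry count is the simplest option; if individual job files vary a lot in size, capping by an estimated byte total would probably be the better choice.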
DCCentR
Posts: 11
Joined: Thu May 04, 2023 7:15 am

Re: Webinterface

Post by DCCentR »

emcodem wrote: Wed Nov 13, 2024 11:37 pm However, this case should be much simpler because the reboot solved it. I guess one of our processes is just still holding the file handle and won't let go of it. But we don't know currently if it is ffastrans itself or webint.
Maybe next time you can apply the procedure to find out which process is holding a file handle to the affected file described here: https://serverfault.com/questions/1966/ ... in-windows
The next one didn't take long :) This time it was jobs of the same workflow as before, but they were stuck in the web interface and in the status monitor too:
[Attachment: Снимок экрана 2024-11-14 110110.png (Screenshot 2024-11-14 110110.png)]
I found that these files are locked only by the host machine (SRV-CINEGY1), which serves the FFAStrans dir and also does local processing:
[Attachment: FFAStrans dir server3.png]
Opened by Administrator for reading:
[Attachment: FFAStrans dir server.png]
I also managed to start Process Monitor on SRV-CINEGY1 around 13 minutes before the jobs got stuck. It seems that the problem started when cmd.exe appeared:
[Attachment: FFAStrans dir server2.png]
Here is the event list from Process Monitor, filtered by one of the stuck json files (20241114-0613-4367-35a0-e076c917518b~1-0-0.json): https://dropmefiles.com/LHJuA
And the full event list without the filter, which contains all three json files (20241114-0615-1434-9971-43cca26ef65f~1-0-0.json, 20241114-0613-4367-35a0-e076c917518b~1-0-0.json, 20241114-0616-4487-7178-ca416d100302~1-0-0.json): https://dropmefiles.com/wuW7z
emcodem wrote: Wed Nov 13, 2024 11:37 pm Also, if i remember correctly on Netapp and Isilon we had to opt-in a related feature called "oplocks" (opportunistic file locking), maybe you can check back with the storage documentation if there is some related setting and report back about it?
For the farm we use self-built NASes and servers, which are Windows Server 2019+ machines with hardware RAID controllers. In this case, the host machine for the FFAStrans dir is a regular Windows 10 Pro.
It looks like Windows has registry settings for disabling oplocks for SMB1: https://support.storeporter.com/hc/en-u ... 20networks. But it seems that this is no longer relevant, as SMB1 is disabled by default in current versions of Windows.
emcodem wrote: Wed Nov 13, 2024 11:37 pm Don't get me wrong, i am not sure if the problem is actually anyhow related to webinterface's bare existence, it is absolutely possible that the same error would occur when ffastrans runs alone. We just don't know currently.
This time I didn't restart the server to solve the problem. Instead, I stopped the “FFAStrans REST-Service” and “FFAStrans Webinterface” services one by one on all servers involved in the farm.
After stopping all services, the files were still not unlocked (I could read them, but could not move them) and were still held by System.
I then restarted all services, and after starting them on the last server, I noticed that the json files were gone from the monitor directory.
I don't know after which server and service this happened (next time I'll try to wait longer after restarting each service), but the last server was the FFAStrans web interface server, and it failed to start the “FFAStrans REST-Service” service at first; it succeeded only on the fourth attempt.
When starting the service there was an error (I don't remember the number), after which there were warnings in the event log:
Child process [11576 -\\\192.168.100.231\FF_Install\processors\rest_service.exe /ErrorStdOut] finished with 1
and
The “FFAStrans REST-API Service” service was unexpectedly terminated. This occurred (once): 3.
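
The "readable but not movable" state described above can be checked with a small script instead of by hand. The following is only a sketch in Python; the function name `probe_file` is made up, and it assumes that a temporary rename inside the same directory is acceptable. Note that the rename is visible to any process watching that directory, so it may be safer to run it only against files that are already known to be stuck.

```python
import os
import uuid


def probe_file(path):
    """Report whether path can be read and whether it can be renamed (moved)."""
    result = {"readable": False, "movable": False}

    # Step 1: try to open the file and read one byte.
    try:
        with open(path, "rb") as fh:
            fh.read(1)
        result["readable"] = True
    except OSError:
        pass

    # Step 2: try to rename to a unique sibling name, then rename back.
    temp_name = "%s.%s.probe" % (path, uuid.uuid4().hex)
    try:
        os.rename(path, temp_name)
    except OSError:
        # A holder that does not share delete access usually makes this fail on Windows.
        return result
    # If renaming back fails, the exception is raised on purpose so it is not missed.
    os.rename(temp_name, path)
    result["movable"] = True
    return result
```

A result of `{"readable": True, "movable": False}` would match the situation from this post; it does not say which process holds the handle, only that something does.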