Hey,
it depends a lot on the requirements but I can try to give you some tips from experience to avoid unnecessary trouble:
1) To be robust, your script should be simple and do only what it must, not more.
2) Avoid the need to check whether files have stopped growing, and do not copy files in your script (ffastrans has retry, reporting and failure handling built in).
3) Avoid the need for a "database" (where you remember which files have been processed).
4) Avoid a long-running script; only do the list-and-compare work and then exit the process. If you need multiple runs, schedule the script to run every minute (either from the Windows Task Scheduler or the webint scheduler).
5) Let your script write a log file (just append log messages to a text file and delete/recreate the file when it gets larger than 10 MB or so). At a minimum, log the startup time, whenever you find something new to process, and when the script ends gracefully. See the sketch after this list.
6) Last but not least, avoid unnecessary extra installed Python modules; the script is easily portable if you use only core modules.
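For tip 5, here is a minimal logging sketch using only core modules (the log path and the exact 10 MB limit are just assumptions, adjust to taste):
Code:
import datetime
import os

LOG_FILE = r'C:\output\compare_script.log'  # assumed location, pick your own
MAX_LOG_BYTES = 10 * 1024 * 1024            # delete/recreate the log above ~10 MB

def log(msg):
    # Keep the log small: drop it once it grows past the size limit
    if os.path.exists(LOG_FILE) and os.path.getsize(LOG_FILE) > MAX_LOG_BYTES:
        os.remove(LOG_FILE)
    with open(LOG_FILE, 'a') as f:
        f.write(f"{datetime.datetime.now():%Y-%m-%d %H:%M:%S} {msg}\n")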
I assume you can avoid the growth check because you just run the stuff at night; if not, you can probably just ignore all files younger than X hours (where X is the maximum recording duration you expect).
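Such an age check could look like this (untested sketch; the value of MAX_AGE_HOURS and the helper name old_enough are my assumptions):
Code:
import os
import time

MAX_AGE_HOURS = 6  # X = the maximum recording duration you expect (assumption)

def old_enough(path):
    # Ignore files modified within the last MAX_AGE_HOURS hours,
    # i.e. recordings that might still be growing
    return (time.time() - os.path.getmtime(path)) > MAX_AGE_HOURS * 3600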
So your approach number 2 sounds good to me.
I would maybe have only one monitor processor and one directory where all the txt files go.
The .txt can have the same name as the file to be transferred, and the content of the txt is the full UNC path (which already contains the server name).
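For example (hypothetical clip name), one trigger file in the watchfolder would look like this:
Code:
C:\output\airspeed_txt_files\clip0001.mxf.txt
  content (single line): \\server2\share\directory2\clip0001.mxf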
In this case, you use a populate processor at workflow start and set s_source = $read("%s_source%").
If you leave the txt file in the monitored folder after processing, you have a "database" of what was processed this way, too. Your script would just need to do its work and only write a new txt file if it does not exist already.
I hope what I wrote is easy to understand.
Here is roughly what I have in mind (untested); the ffastrans watchfolder would be C:\output\airspeed_txt_files in this case.
For scheduling the whole thing, you could have a small ffastrans workflow with 2 monitors, one for each airspeed. After the monitors, just execute your script and finish. Have another workflow for the txt monitoring and the real processing.
Code:
import glob
import os
# Specify the UNC paths
unc_path_1 = r'\\server1\share\directory1'
unc_path_2 = r'\\server2\share\directory2'
output_dir = r'C:\output\airspeed_txt_files'
# Ensure the output directory exists
os.makedirs(output_dir, exist_ok=True)
# List all .mxf files in both directories
files_1 = set(os.path.basename(file) for file in glob.glob(os.path.join(unc_path_1, '*.mxf')))
files_2 = set(os.path.basename(file) for file in glob.glob(os.path.join(unc_path_2, '*.mxf')))
# Build the intersection and difference sets
intersection_files = files_1.intersection(files_2)
only_in_files_1 = files_1.difference(files_2)
only_in_files_2 = files_2.difference(files_1)
def write_files_with_path(file_set, unc_path, output_dir):
    for file_name in file_set:
        full_path = os.path.join(unc_path, file_name)
        output_file = os.path.join(output_dir, f"{file_name}.txt")
        if os.path.exists(output_file):
            continue  # txt file already exists, the file was processed before
        with open(output_file, 'w') as f:
            f.write(full_path)
        print(f"Written {output_file} with path: {full_path}")
# Write txt files
write_files_with_path(only_in_files_1, unc_path_1, output_dir)
write_files_with_path(only_in_files_2, unc_path_2, output_dir)
write_files_with_path(intersection_files, unc_path_2, output_dir) # if material is on both, copy from server2
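If you do need the age check after all, it would slot into the listing step like this (again untested, reusing the hypothetical old_enough() helper from the sketch above):
Code:
# Only list files that are old enough, i.e. certainly done recording
files_1 = set(os.path.basename(f)
              for f in glob.glob(os.path.join(unc_path_1, '*.mxf'))
              if old_enough(f))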