Hi, I would like to know whether, in the Python SWIG wrapper for fesapi, it is possible to vectorize array operations for efficient array conversion on very large models (i.e., are there methods like “setitems”/“getitems” besides “setitem”/“getitem”)? I couldn’t find anything about this in example.py and I’m not sure where exactly to look. I’d appreciate any ideas on this topic.
I think there are actually two questions!
A) If you have very large models, I don’t think you really want “setitems” and “getitems” methods on your SWIG arrays. Instead, you would want to read/write HDF5 in chunks, so that you never have to hold a giant array in memory at any time.
Some methods in FESAPI allow chunking, some do not. For example setValuesOfInt64Hdf5Array3dOfValues(): see the resqml2::AbstractValuesProperty class reference in the FESAPI documentation.
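To make the chunking idea concrete, here is a minimal sketch of iterating over a large property in fixed-size chunks. The `iter_chunks` helper is generic, runnable Python; the commented fesapi call is hypothetical (the method name comes from the documentation linked above, but the exact signature and argument order are assumptions you should check against the C++ reference).

```python
def iter_chunks(total_count, chunk_size):
    """Yield (offset, count) pairs covering [0, total_count) in chunks."""
    offset = 0
    while offset < total_count:
        count = min(chunk_size, total_count - offset)
        yield offset, count
        offset += count

# Hypothetical usage against a fesapi property (NOT executed here;
# signature is an assumption -- check the resqml2::AbstractValuesProperty docs):
# for offset, count in iter_chunks(n_values, 1_000_000):
#     chunk = produce_next_chunk(count)   # your data source, one chunk at a time
#     prop.setValuesOfInt64Hdf5Array1dOfValues(chunk, count, offset)

print(list(iter_chunks(10, 4)))  # -> [(0, 4), (4, 4), (8, 2)]
```

This way only one chunk ever lives in Python memory, regardless of the total property size.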
B) You keep your very large model in memory, and you just want to be more efficient by setting the in-memory array in chunks instead of one item at a time.
This is essentially a pure SWIG question. Looking at the SWIG documentation, there are several ways to achieve this. The main one looks to be letting the FESAPI C (SWIG) arrays accept a NumPy array instead. This looks possible using numpy.i, but it will require time for me to work on it…
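For the record, a rough, untested sketch of what the numpy.i approach could look like in the fesapi SWIG interface file. The typemap pattern below is the standard numpy.i idiom, but the argument names and types in the `%apply` line are assumptions and must match the actual fesapi C++ signatures exactly (note also that numpy.i’s default `DIM1` is `int`, while fesapi counts are typically 64-bit, which may need an extra typemap):

```swig
/* Hypothetical excerpt from a fesapi .i file -- a sketch, not a working patch.
   numpy.i ships with NumPy (tools/swig/numpy.i). */
%include "numpy.i"
%init %{
import_array();
%}

/* Let a (pointer, count) C argument pair be filled from one NumPy array.
   The names in braces must match the wrapped C++ parameter names. */
%apply (long long* IN_ARRAY1, int DIM1) {(long long* values, int valueCount)};
```

With such a typemap in place, a Python caller could pass a NumPy array directly to the wrapped setter, avoiding the per-item `setitem` loop entirely.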
Thank you. I am not sure I understood the context of part A of your response; could you elaborate on that? Did you mean a use case where the large model already exists in memory, and you are pointing out ways to optimize writing the bulk data to HDF5? Does it also apply to reading bulk data? For the latter, that would mean the model already exists on disk.
I think my question falls under part B of your response. I tried a multi-threaded approach with a chunk size, using setitem and getitem, but I found it made no difference, likely because of the Python GIL. I have heard that Python 3.13 and later can run without the GIL to enable true multi-threading. Would you recommend switching to this more current Python version and rebuilding the FESAPI Python bindings with 3.13 or later to enable multi-threading?
Yes. Or: you receive a property live, but you cannot wait until you have received all of it before writing it to disk, because it does not fit in your available memory. Then you are forced to write it to disk in chunks.
In this scenario you still use “setitem” and “getitem” with FESAPI (in memory), but you write the property to disk in chunks instead of writing it all at once.
Yes
OK, I think so. Note that answers A and B are not exclusive: you could write to FESAPI (in memory) with “setitems” and “getitems” and still decide to write to disk in chunks.
Indeed, “setitem” and “getitem” only write in memory, not to disk.
I have seen a lot of discussion about the Python GIL and SWIG on various forums, but I really have not had enough time to dig into it. I am sorry, but I cannot advise you on that for now. Maybe someone else on this forum can…
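One general point that may help: CPython 3.13+ can be built in a free-threaded mode (PEP 703) where the GIL is removed, but extension modules (including anything SWIG-generated) must also be built against, and declare support for, that ABI before threads actually run in parallel. A small, self-contained check for whether the interpreter itself is a free-threaded build (this only inspects the interpreter; it says nothing about whether the FESAPI bindings would be thread-safe):

```python
import sysconfig

def is_free_threaded_build():
    """Return True if this CPython was built with --disable-gil (PEP 703).

    On regular builds the config variable is absent (None), so this
    returns False everywhere except a free-threaded 3.13+ build.
    """
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

print(is_free_threaded_build())
```

Even on a free-threaded build, per-item `setitem` calls from multiple threads may still be slower than a single bulk copy, so chunked bulk transfer remains the safer optimization target.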
The Python wrapper of FESAPI is quite new compared to the Java and C# ones. Perhaps as a consequence, and perhaps because some people use resqpy instead, I have not received much feedback about it and I do not personally use it much. So I am sorry, but my expertise on this topic is quite limited.
I am really interested in improving it, but for now I cannot find the time to work on it deeply.