How to represent a fracture network

GaelleDufour · January 17, 2019, 11:51pm

Hello Philippe,

I am looking at representing a fracture network in RESQML.

We originally started with a TriangulatedSetRepresentation, but the performance proved to be a problem for our use case, probably because it is not intended to represent such large number of patches. For example: we have a case with 300 000 patches, each patch comprised of 62 nodes, and it took around 20 minutes to create the RESQML file (using fesapi C#). We are targeting around 1 minute.
Afterwards, I saw this comment in the documentation: “Similarly, a triangulated mesh could be represented as an unstructured cell GP grid with a cell count of 0, but with nodes and faces.” (source: http://docs.energistics.org/#RESQML/RESQML_TOPICS/RESQML-000-243-0-C-sv2010.html). Would you recommend using a “degenerate” cell representation in an unstructured grid, with each cell having one face and four nodes? That would probably have better performance than the TriangulatedSetRepresentation, but I feel that it would break the spirit of the Unstructured Grid. We cannot use a GP Grid itself because our other tools don’t support it.
I appreciate any thoughts you have on this (performance and representation) and f you see another way, please let me know.

Thank you!

Gaelle

philippeVerney · January 18, 2019, 9:38am

Hi Gaelle,

I have no experience on such use case which involves a lot of patches with fesapi. I think that the first step is to make a profiling to know what does take so much time.
I am thinking about the following things which can affect performance (probably not an exhaustive list)

XML/EPC : how much time do you save if you don’t call serialize method on the EPC document? The HDF5 file would be created but not the EPC file (even if all EPC objects would have been created in memory).
HDF5 : how much time do you save if you don’t write any geometry at all? The EPC file would be created but not the HDF5 file.
C# : how much time do you save if you run your program in C++ only (quite long to port your program, I guess).
exponential complexity : how much time do you save if you write “only” 200 000 patches? about one third?

Potential solutions for the above cases:

XML/EPC :
- No magical solution if you are forced to exchange with EPC (not ETP for example) which I guess you are. See general solution below.
HDF5:
- One problem could be that each patch is written in a separate HDF5 dataset. So, fesapi opens and closes 300 000 datasets which has probably a significant impact. RESQML 2.0.1 has no solution for that unfortunately, but I think RESQML 2.2 will solve this issue by allowing all patches to be written at once in a single HDF5 dataset.
- Another problem could be it is too much data. Using HDF5 parallel (IX does that as far as I know) probably with a distributed file system could solve this problem. Even if you would have 300 000 datasets, it could be ok because you would divide the total open, close, write amount by the number of ranks you use. If you do that, you would have to code your own HDF proxy class in fesapi because fesapi does not provide such a functionality yet but provides a way to plug your own HDF proxy class.
C#
- I don’t think to have a solution to propose except to rewrite your code in pure C or C++. Even rewriting in pure C# (without fesapi) would have, imho, some performance impact because HDF5 library is in C. HDF5 optimizes probably better their own C# library than fesapi does but I think you would still have a significant impact.
exponential complexity
- That would clearly be a fesapi problem which falls in the first below general solution.
General solution:
- You might try to optimize fesapi code by your own since it is open source, You might want to wait for the fesapi “community” to optimize it for you. You might want to participate to the fesapi consortium to vote for a particular task to be a high priority one.
- You might think about another kind of storage than EPC and/or HDF5 and use ETP for transfer. You could optimize your storage much more than EPC allows to and, depending on your chosen storage, your access time to the data could be lower.

Hope it helps

PS : I would not use GP or unstructured or pebi or ijk grid at all for surface objects, at least as a first resolution step. That would lead to too much transfer issues with other independent softwares in my opinion. And I am not sure at all that it would be significantly faster.

GaelleDufour · January 18, 2019, 3:43pm

Thank you for your quick and detailed answer Philippe!
My profiling showed that most of the time is spent in pushBackTrianglePatch of TriangulatedSetRepresentation. You are right about it being too many datasets: we also wrote the 300 000 surfaces as 1 patch (for comparison only) and it took 1 minute. So RESQML 2.2 offers a solution to this problem of too many datasets, which is great! When will it be ready?

In the meantime, let me try some of your suggestions (C++ only, and check for exponential complexity) - but in light of the above, I guess the outcome may not be very different.

Thanks!

philippeVerney · January 18, 2019, 4:19pm

Actually it is even 600 000 datasets because one dataset is used for the geometry (XY points) and one for the topology (node indices of each triangle).
And your test to write a single patch is the best you could have done to confirm the problem.

RESQML2.2 should solve this problem allowing to tell that the numerical values are not stored in a whole dataset but at a startIndex+count in a dataset. You could consequently write all your numerical values in two datasets at once (XYZ and node indices) which looks to be the main solution.
RESQML 2.2 is already available for testing for Energistics members. A public developer release is more or less planned for end of this year.
Fesapi is quite far to support RESQML2.2 because the main focus now is more on ETP.

Other potential solutions:

HDF5 Parallelization and distributed file system should reduce the problem by dividing the problem. But it might be hard to implement and you would need a specific infra structure.
HDF group might know some possible optimizations with their liibrary which fesapi uses. If you have support with them, you can ask them. For example, I have given a very quick eye to https://support.hdfgroup.org/HDF5/docNewFeatures/NewFeaturesVirtualDatasetDocs.html and if we can virtually simulate 300 000 datasets from a single dataset with HDF5 library, I would say that RESQML2.2 is not even necessary for this problem.
Endly, Fesapi is a bit defensive (and non optimized) and consequently could be optimized significantly (still more than 1 minute but less than 20 minutes) for such a use case.
If you look at the HDF5 write method line 513 of https://github.com/F2I-Consulting/fesapi/blob/master/src/common/HdfProxy.cpp you would notice that line 520 to 527 (maybe to 534) are redundant for your use case. We could imagine to ignore them because we would not check if the hdf file is open (we know it is opened), if the group is created or opened (we would keep it opened somehow), if we need to create a dataspace (I don’t know if we can share a dataspace if all your patches have the same dimensions). I am just guessing that it would potentially save something like 5 minutes.

philippeVerney · July 10, 2019, 4:18pm

Hi Gael,

The Fesapi initiative (DGI, Total and XOM with Energistics and F2I) decided to spend up to 2 days of work on this topic.
Results look promising: I tested to write 300 000 patches each of 60 XYZ triplets + 40 triangle node indice triplets (non initialized/dummy values) and it took a bit more than 1 minute (low c++ optimization and on my laptop).

Here was the problem:

For each dataset writing, fesapi looked for its group first, then created or opened it then wrote the dataset to it
Searching the existence of the group was the main issue. The complexity of this search was even worst than linear…
Another more little problem was to close and reopen the group each time needed even if all datasets to be written belonged to the same group.

The solution is:

To buffer the opened HDF groups in a map
Now the search of a group is done with a constant to linear complexity. I don’t think that RESQML2.2 would do significantly better.

The overall complexity of your use case is now close to linear regarding the number of data to write. It was closer to exponential before.

The latest commit contain the modification : Commits · F2I-Consulting/fesapi · GitHub It is named Performance : Bufferize opened HDF groups
There are other modifications as well due to other “Fesapi initiative” voted priorities.

Here is my tested code if you are interested in looking at it:

/*-----------------------------------------------------------------------
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"; you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-----------------------------------------------------------------------*/

#include <chrono>

#include "common/EpcDocument.h"
#include "common/DataObjectRepository.h"
#include "common/HdfProxy.h"

#include "resqml2_0_1/LocalDepth3dCrs.h"
#include "resqml2_0_1/TectonicBoundaryFeature.h"
#include "resqml2_0_1/FaultInterpretation.h"
#include "resqml2_0_1/TriangulatedSetRepresentation.h"

void serialize(const std::string & filePath)
{
	COMMON_NS::EpcDocument pck(filePath);
	COMMON_NS::DataObjectRepository repo;

	// HDF
	COMMON_NS::AbstractHdfProxy* hdfProxy = repo.createHdfProxy("", "Hdf Proxy", pck.getStorageDirectory(), pck.getName() + ".h5", COMMON_NS::DataObjectRepository::OVERWRITE);
	repo.setDefaultHdfProxy(hdfProxy);

	//CRS
	RESQML2_NS::AbstractLocal3dCrs* local3dCrs = repo.createLocalDepth3dCrs("", "Default local CRS", .0, .0, .0, .0, gsoap_resqml2_0_1::eml20__LengthUom__m, 23031, gsoap_resqml2_0_1::eml20__LengthUom__m, "Unknown", false);
	repo.setDefaultCrs(local3dCrs);

	RESQML2_0_1_NS::TectonicBoundaryFeature* fault1 = repo.createFault("1424bcc2-3d9d-4f30-b1f9-69dcb897e33b", "Fault1");
	RESQML2_0_1_NS::FaultInterpretation* fault1Interp1 = repo.createFaultInterpretation(fault1, "ba224651-7dd3-4952-85b0-cff6fe37508d", "Fault1 Interp1");

	RESQML2_0_1_NS::TriangulatedSetRepresentation* f1i1triRep = repo.createTriangulatedSetRepresentation(fault1Interp1,
		"1a4112fa-c4ef-4c8d-aed0-47d9273bebc5",
		"Fault1 Interp1 TriRep");

	double* patchXyzNodes = new double[180];
	unsigned int* triangleNodeIndices = new unsigned int [120];

	auto start = std::chrono::system_clock::now();
	for (size_t patchIndex = 0; patchIndex < 300000; ++patchIndex) {
		//auto substart = std::chrono::system_clock::now();
		f1i1triRep->pushBackTrianglePatch(60, patchXyzNodes, 40, triangleNodeIndices, hdfProxy);
		//auto subend = std::chrono::system_clock::now();
		//std::chrono::duration<double> subelapsed_seconds = subend - substart;
		//std::cout << "patch index : " << patchIndex << " : elapsed time: " << subelapsed_seconds.count() << "s\n";
	}
	auto end = std::chrono::system_clock::now();

	std::chrono::duration<double> elapsed_seconds = end - start;

	std::cout << "elapsed time: " << elapsed_seconds.count() << "s\n";

	std::string done;

	std::cout << "Press enter to finish";
	std::getline(std::cin, done);
}


// filepath is defined in a macro to better check memory leak
#define filePath "../../testingPackageCpp.epc"
int main()
{
	serialize(filePath);

#ifdef _WIN32
	_CrtDumpMemoryLeaks();
#endif

	return 0;
}

GaelleDufour · August 2, 2019, 3:55pm

Thank you for the fix Philippe! Great to see the results of your testing.
We will try it out (not in the short term though) and later use RESQML 2.2.

Gaelle