# Serialization
This page describes the serialization subsystem used for HDF5 checkpoint I/O and restart redistribution across different MPI partition counts.
## Overview

DNDSR serializes `ArrayPair` data (father + optional son/ghost arrays) through a layered design:

| Layer | Responsibility |
|---|---|
| `Serializer` (abstract interface) | Abstract read/write interface for scalars, vectors, byte arrays. |
| `SerializerH5` | MPI-parallel HDF5 implementation (collective I/O). |
| JSON serializer | Per-rank JSON implementation (no MPI coordination). |
| `Array` | Reads/writes a single array (metadata, structure, data). |
| `ParArray` | MPI-aware wrapper: resolves `EvenSplit`, CSR global offsets. |
| `ArrayPair` | Father-son pair: `WriteSerialize`, `ReadSerialize`, `ReadSerializeRedistributed`. |
| Redistribution | Rendezvous redistribution via `ArrayTransformer`. |
## Serializer interface

### Offset modes

`ArrayGlobalOffset` describes a rank's portion of a global dataset:

| Sentinel | Meaning |
|---|---|
| `Unknown` | Auto-detect from the `rank_offsets` companion dataset. |
| `EvenSplit` | Compute offset via `EvenSplitRange`. |
|  | Rank 0 writes/reads the entire dataset; others write/read nothing. |
|  | Read only: each rank reads … |
| `isDist()` | Explicit per-rank size and global offset. |

`EvenSplit` is resolved by `ParArray::ReadSerializer` into an `isDist()` offset before calling `Array::ReadSerializer`; `Array::ReadSerializer` asserts that it never receives `EvenSplit` directly.
### Collective semantics (HDF5)
All `Read*Vector`, `ReadShared*Vector`, `ReadUint8Array`, and their `Write` counterparts are MPI-collective in `SerializerH5`. Every rank must call them in the same order, even when its local element count is 0. Failing to participate causes a hang.
### Two-pass read pattern

`SerializerH5` reads vectors in two passes internally:

1. Pass 1 (`buf == nullptr`): queries the dataset size and resolves the offset.
2. Pass 2 (`buf != nullptr`): performs the collective `H5Dread`.

The `Read*Vector` and `ReadShared*Vector` methods handle both passes internally and are single-call for the user. `ReadUint8Array` exposes the two-pass pattern to the caller: the first call with `data == nullptr` returns the size; the second call reads the data.
### Zero-size partitions

When `nGlobal < nRanks` (e.g., 5 elements across 8 ranks), `EvenSplitRange` assigns 0 rows to some ranks. This is valid and handled throughout the stack:

- `ReadDataVector`: accepts `size == 0` in the `isDist()` second-pass branch. The `H5_ReadDataset` call proceeds with a 0-count hyperslab selection (selects nothing, but the rank participates in the collective).
- Callers (`ReadIndexVector`, `ReadRealVector`, etc.): when `size == 0`, `std::vector<>::data()` may return `nullptr` on an empty vector. A `nullptr` `buf` would skip the `H5Dread` block (guarded by `if (buf != nullptr)`) and hang the other ranks. Each caller therefore passes a dummy stack pointer when `size == 0`:
  `index dummy; ReadDataVector<index>(name, size == 0 ? &dummy : v.data(), ...);`
- `ReadUint8Array` exposes the two-pass pattern to the caller directly. When the queried size is 0, the caller must pass a non-null pointer on the second call. `Array::__ReadSerializerData`'s `treatAsBytes` lambda does this:
  `uint8_t dummy; serializerP->ReadUint8Array("data", bufferSize == 0 ? &dummy : (uint8_t*)_data.data(), ...);`
## Array serialization

### Write path

`ParArray::WriteSerializer` (called by `ArrayPair::WriteSerialize`):

1. Delegates to `Array::WriteSerializer` for metadata, structural data, and the flat data buffer.
2. Writes `sizeGlobal` (the sum of all ranks' sizes) as a scalar.
3. For CSR arrays: computes global data offsets via `MPI_Scan` and writes `pRowStart` in global data coordinates as a contiguous `(nRowsGlobal + 1)` dataset. Non-last ranks write `nRows` entries; the last rank writes `nRows + 1`.
### Read path (same partition)

`ParArray::ReadSerializer` with an `Unknown` offset:

1. Reads the per-rank size from the `size` dataset (auto-detected via the `rank_offsets` companion).
2. For CSR: reads the per-rank size, computes the row offset via `MPI_Scan`, resolves to `isDist()`.
3. Delegates to `Array::ReadSerializer`.
### Read path (different partition / EvenSplit)

`ParArray::ReadSerializer` with an `EvenSplit` offset:

1. Reads `sizeGlobal` from the file.
2. Computes `EvenSplitRange(rank, nRanks, sizeGlobal)` to get `{localRows, globalRowStart}`.
3. Resolves to `isDist()` and delegates to `Array::ReadSerializer`.

Some ranks may get `localRows == 0`. The read proceeds with 0-count hyperslab selections in all collective HDF5 calls.
## ArrayPair serialization

### WriteSerialize

Writes under a sub-path `name`:

| Dataset | Content |
|---|---|
|  | Per-rank serializer only. |
|  | Number of MPI ranks at write time. |
|  | Father array via `ParArray::WriteSerializer`. |
|  | Son (ghost) array, if present. |
|  | Ghost pull indices, if `includePIG`. |
The `origIndex` overload additionally writes:

| Dataset | Content |
|---|---|
| `origIndex` | Partition-independent key per row (e.g., CGNS cell index). |
|  | Integer flag (1). |
### ReadSerialize

Reads data written by `WriteSerialize` with the same MPI size. Resizes the father (and optionally son) arrays internally. If `includePIG` is true, the caller must call `trans.createMPITypes()` afterward.
### ReadSerializeRedistributed

Handles three cases:

1. No `origIndex`, same `np`: falls back to `ReadSerialize`.
2. Has `origIndex`, same `np`: reads the father normally, reads `origIndex`, then redistributes via `RedistributeArrayWithTransformer`.
3. Has `origIndex`, different `np`: reads the father and `origIndex` via `EvenSplit`, then redistributes via `RedistributeArrayWithTransformer`.
In case 3, the redistribution uses a rendezvous pattern (`BuildRedistributionPullingIndex`) with three rounds of `MPI_Alltoallv` to build a directory mapping `origIdx -> globalReadIdx`, then an `ArrayTransformer` pull to move the data to the correct ranks. Ranks with 0 rows from `EvenSplit` participate in all collective calls with empty arrays and empty `Alltoallv` buffers.
## Restart redistribution (Euler solver)

`EulerSolver::PrintRestart` writes checkpoint data with `origIndex` (from `cell2cellOrig`) for H5 serializers. `EulerSolver::ReadRestart` calls `ReadSerializeRedistributed` to load the data, handling both same-`np` and cross-`np` restarts transparently.

`ReadRestartOtherSolver` peeks the DOF count (`nVars`) from the file, constructs a temporary `ArrayPair` with a matching layout, reads via `ReadSerializeRedistributed`, and copies the data.