|
DNDSR 0.2.1
Distributed Numeric Data Structure for CFV
|
Status: Largely implemented (see notes below) Date: 2026-04-21 (updated 2026-04-26) Depends on: Mesh DAG Design
Implementation status: Phases 0-3 (MeshConnectivity DSL, Inverse, Compose, InterpolateGlobal, evaluateGhostTree) are implemented. Phase A (AdjIndexInfo + AdjPairTracked) is implemented. The actual implementation uses inheritance for
AdjPairTracked(not composition as originally designed in Section 10.2.2). Phase B (group state as derived queries) remains future work. Phases D-E (simplify conversion methods, Python bindings) are partially complete.Note on file paths: Source files were reorganized into
src/Geom/Mesh/subdirectory. Paths referencingsrc/Geom/MeshConnectivity.hppshould readsrc/Geom/Mesh/MeshConnectivity.hpp.
Implement MeshConnectivity as a standalone class in src/Geom/Mesh/ with its own unit tests, independent of UnstructuredMesh. The class provides a small DSL of composable operations on layered adjacency graphs.
tAdjPair wraps ssp<ParArray> (father) and ssp<ParArray> (son). A C++ move would null the source pair's shared_ptrs, breaking any existing code that holds a reference to the legacy member. Moving back after DAG operations is fragile and error-prone.
Instead, MeshConnectivity stores adjacency data as ssp<tAdjPair>. The legacy members in UnstructuredMesh share the same ssp<>. Both sides see the same data. No move, no invalidation.
The simplest approach: MeshConnectivity::InterLayerAdj stores tAdjPair by value. When UnstructuredMesh adopts arrays, it shallow-copies the ssp<> pointers (father, son) from its legacy members into the DAG's slots. After adoption, both the legacy member and the DAG slot point to the same ParArray objects. Mutations (resize, ghost build) are visible to both.
The DSL is a set of free functions (or methods on MeshConnectivity) that compose adjacency relations. All operations work on tAdjPair or equivalent CSR structures.
Given a cone adjacency A → B (CSR: for each A-entity, list of B-entities), compute the support adjacency B → A (for each B-entity, list of A-entities that reference it).
This is the distributed node-inversion currently done by RecoverNode2CellAndNode2Bnd (with MPI push-back for cross-rank completeness).
Signature:
Key property: The result is globally complete — each "to" entity's row contains ALL "from" entities across all ranks (via MPI push-back).
Given two adjacencies, compose them (flatten one hop through B):
Signature:
Behavior: For each row a in AB, iterate over all b in AB[a], collect all c in BC[b], deduplicate, optionally remove a itself. This is a sparse matrix-matrix product with boolean semiring.
Filtering should happen INSIDE the compose operation, not as a separate pass over the result. When composing A→B + B→C → A→C, the filter predicate evaluates each candidate C-entity as it is discovered through B, before adding it to the result row. This avoids materializing a large unfiltered intermediate.
The filter predicate is general: it receives the A-entity, the candidate C-entity, and the set of intermediate B-entities through which the connection was established. The predicate decides whether to keep the (A, C) pair.
For the common case of "shared sub-entity filtering," the predicate checks whether A and C share a sub-entity of a specific type (face, edge, or vertex). The sharing test uses element topology, not just vertex counting — though vertex counting is a valid fast-path for the common case:
cell2cell node-neighbor): A and C share ≥ 1 vertexcell2cellFace):** A and C share ≥ dim vertices forming a recognized face sub-elementThe share judgment currently relies on counting shared vertices (≥ dim for face-share, ≥ 2 for edge-share, ≥ 1 for vertex-share), which is correct for standard simplex/hex elements. Future extension: use element topology tables to verify the shared vertices actually form a sub-element of the requested type (handles degenerate cases like pyramids where 4 shared vertices might form a face or two triangular faces depending on topology).
Signature:
Usage:
Given a direct adjacency A → C (e.g., cell → node), create an intermediate entity type B (e.g., faces) by extracting sub-entities from A's element topology, deduplicating, and building both A → B and B → C adjacencies.
This is the generalization of InterpolateFace:
interpolate(cell2node, depth=dim-1) → creates cell2face and face2nodeinterpolate(face2node, depth=1) → creates face2edge and edge2node (3D)Signature:
The current mesh build pipeline expressed as DSL operations:
For future node-based FV with edges:
Goal: Implement MeshConnectivity struct, Inverse(), and unit tests.
Status: Implemented and migrated. RecoverNode2CellAndNode2Bnd() now uses DSL Inverse internally.
Files:
src/Geom/MeshConnectivity.hpp — class definitionsrc/Geom/MeshConnectivity.cpp — Inverse() implementationtest/cpp/Geom/test_MeshConnectivity.cpp — unit testsTests:
inverse(inverse(cell2node)) ⊇ cell2node (every cell appears in the re-inverted result, possibly with extra entries from other cells sharing the same nodes)Implementation notes:
GeneralCell2NodeToNode2Cell pattern from RecoverNode2CellAndNode2Bnd but as a standalone function.tAdjPair in Adj_PointToGlobal state (global indices).Goal: Implement Compose() and FilterBySharedNodes() with tests.
Status: Implemented and migrated. RecoverCell2CellAndBnd2Cell() now uses DSL ComposeFiltered for cell2cell internally. The bnd2cell part retains its periodic pbi filter logic.
Tests:
Goal: Implement Interpolate() — the generalized face/edge generator.
Status: Fully implemented. Three tiers:
InterpolateLocal (rank-only, no MPI): extracts and deduplicates sub-entities from parent→node. Used directly for local analysis/testing and as Step 1 of InterpolateGlobal.InterpolateDistributed (legacy, 2-parent only): extends InterpolateLocal with push-based ghost exchange. Superseded by InterpolateGlobal. Retained for backward compatibility.InterpolateGlobal (production): distributed sub-entity creation with global dedup, N-parent edges, pbi extraction, and parent2entityPbi output. Used by InterpolateFace() in Mesh.cpp.Key design decisions:
entity2parent is variable-width tAdjPair (not fixed-2), supporting N-parent edges.entity2nodePbi stores pbi from the first-discovered parent's perspective.parent2entityPbi stores the uniform XOR between each parent's view and the entity's stored pbi. Computed locally (no push needed). Faces: at most 1 bit. Edges: multi-bit.evaluateGhostTree.(globalA, globalB, subIdx) triplets — subIdx disambiguates which slot of a parent with multiple non-owned sub-entities receives which global B ID.Tests:
InterpolateFace on UniformSquare_10After periodic node deduplication (Deduplicate1to1Periodic), cells on opposite sides of a periodic boundary share the same node indices but carry different NodePeriodicBits (pbi) in cell2nodePbi. The pbi bits record which periodic translations must be applied to recover each node's physical coordinates as seen from a given cell.
When extracting faces (sub-entities) from cells, face deduplication compares sorted vertex indices. On a doubly- or triply-periodic mesh, two distinct physical faces can share identical vertex index sets — because periodic node deduplication merged nodes at intersections of periodic boundaries (corners in 2D, edges/corners in 3D).
Example: 2×2 doubly-periodic quad mesh.
After dedup, 9 original nodes collapse to 4. All 4 cells reference the same 4 nodes. Every cell has the same 4 edge vertex sets: {0,1}, {1,3}, {2,3}, {0,2}. Without disambiguation, cell 0's edges would be falsely merged with cell 3's edges (the diagonally opposite cell), producing a sub-entity with 3+ parents — which is topologically impossible for a manifold mesh.
Two faces with the same vertex indices are the same physical face if and only if the periodic-bit difference between the two cells' views of the face nodes is uniform across all node pairs.
Concretely, for candidate face match between cell A (sub-entity iSub) and cell B (sub-entity jSub):
(nodeIndex, pbi) pairs from both cells for their respective face nodes: {(n_k, pbi_A_k)} and {(n_k, pbi_B_k)}.(nodeIndex, pbi).v0 = pbi_A_0 XOR pbi_B_0.k > 0, check pbi_A_k XOR pbi_B_k == v0.If the XOR is uniform, the faces are collaborating — they represent the same physical face, possibly viewed through a single periodic translation (v0 != 0) or from the same side (v0 == 0). If non-uniform, the faces are on different periodic images and must remain distinct entities.
MeshConnectivity::Interpolate is topology-agnostic — it does not know about periodic bits. Instead, the SubEntityQuery struct has an optional matchExtra callback:
After a vertex-set match, if matchExtra is set, Interpolate calls it to validate the match. The callback receives both the current sub-entity's origin (iParent, iSub) and the candidate entity's creating origin (candidateParent, candidateSub), allowing the caller to extract pbi from both and perform the uniform XOR check.
In UnstructuredMesh::InterpolateFace, the callback is wired up for periodic meshes (isPeriodic == true):
For non-periodic meshes, matchExtra is left unset (null), and Interpolate performs vertex-only dedup — which is correct since non-periodic meshes never have two distinct faces sharing the same vertex set.
With variable-width entity2parent (tAdjPair), there is no overflow. If the collaborating check is omitted on a periodic mesh, false merges produce entities with >2 parents (faces) or >4 parents (edges). Tests detect this by checking entity2parent.RowSize(i) bounds.
In 2D doubly-periodic: without matchExtra, at least one face gets 3+ parents. In 3D triply-periodic: without matchExtra, at least one face/edge gets extra parents. With matchExtra: faces have exactly 1-2 parents, edges have exactly 1-4 parents.
entity2nodePbi is stored from the first-discovered parent's perspective (the entity's own frame). When parent cell A accesses entity B, A needs to know the relative periodic transform between its own frame and B's stored frame.
parent2entityPbi[iParent][iSub] is a single NodePeriodicBits value — the uniform XOR between parent A's sub-entity node-pbi and entity B's stored entity2nodePbi, matched by node identity (not position).
Why match by node identity: Different parents may enumerate the same face/edge nodes in different local orderings. The per-position XOR can be non-uniform even when the per-node XOR is uniform. Sorting (node, pbi) pairs by node index before XORing handles this correctly.
Properties:
{0}.cell2nodePbi and entity2nodePbi, both available after InterpolateLocal.Usage: To get entity B's node coordinates in parent A's frame:
| Test | What it proves |
|---|---|
2×2 periodic quad, no matchExtra | >2-parent entity detected (false merge) |
2×2 periodic quad, with matchExtra | 8 faces, all internal, correct cell pairs, no self-connections |
| 2×2×2 triply-periodic hex, faces | 24 Quad4, all 2-parent, 3 distinct face-neighbors per cell |
| 2×2×2 triply-periodic hex, edges | 24 Line2, all 4-parent, 12 edges per cell |
| Distributed 4×4×4 hex, non-periodic | Exact face/edge counts, boundary/internal split |
| Distributed 4×4×4 hex, X-periodic | No X-boundary faces, adjusted counts |
| Distributed 4×4×4 hex, triply-periodic | 3*np*N^3 faces/edges, all internal |
| entity2nodePbi value verification | Stored pbi matches first-parent extraction (faces + edges) |
| parent2entityPbi value verification | Uniform XOR matches cell-to-entity pbi difference |
| parent2entityPbi 1-bit for faces | At most 1 bit set per face slot |
| Coordinate center verification | Face/edge center via entity2nodePbi + parent2entityPbi matches cell's direct computation |
| Regression: InterpolateFace | DSL matches legacy on UniformSquare_10 |
Goal: Implement generic ghost building using explicit adjacency chains compiled into a BFS tree with pull barriers between levels.
Status: Core types implemented and tested. BFS evaluation skeleton done. Scratch pull orchestration deferred to Phase 4 integration.
Files:
src/Geom/MeshConnectivity.hpp — EntityKind, AdjKind, GhostChain, GhostSpec, CompiledGhostTree, GhostResultsrc/Geom/MeshConnectivity_Ghost.cpp — compilation + evaluationEach hop names a specific adjacency relation in the DAG. AdjKind is a struct with (from, to, via) fields rather than a flat enum — this gives a compact, extensible representation without combinatorial explosion.
from != to): via is ignored. E.g., AdjKind(Cell, Node) = cell2node cone. All direct cones and supports are allowed.from == to): via specifies the intermediary. E.g., AdjKind(Cell, Cell, Node) = cell2cell via node, AdjKind(Cell, Cell, Face) = cell2cellFace.Predefined constants in namespace Adj:
Registry policy: The adjacency registry (MeshConnectivity::adjRegistry) stores only:
Cell2Node, Node2Cell, Cell2Face, Face2Cell, Bnd2Node, Node2Bnd, Bnd2Cell, etc.Cell2Cell, Cell2CellFace, Bnd2Bnd.More complex composed adjacencies (multi-hop, filtered) are NOT stored in the registry and cannot be used as ghost chain hops.
Current pipeline expressed as GhostSpec:
Chains are compiled into a forest of BFS-level-ordered trees. Chains sharing common prefixes merge into shared trie paths to avoid redundant traversals.
Example compiled tree for current pipeline:
Tip merging: Node (level 2, COLLECT) from the Cell subtree and Node (level 3, COLLECT) from the Bnd subtree both contribute to ghostResult[Node] via union.
Evaluation has two phases: BFS traversal with conservative scratch pulls to enable adjacency lookups on non-local entities, then a definitive pull using the exact computed ghost set.
The tree is evaluated level-by-level (BFS). At each level, all tree nodes at that level traverse their hop adjacency to produce an entity set. After each level, the evaluator checks which entity kinds have sets containing non-local entries AND have children in the tree. For those kinds, a scratch pull is performed: a temporary ghost mapping is built and adjacency data is pulled so that the next level can traverse from those ghost entities.
Scratch pulls are conservative — they may pull data for entities that no COLLECT node ultimately needs. This is acceptable because they are temporary; the definitive pull replaces them.
After BFS completes, the final ghost set per EntityKind is the union of all COLLECT nodes' non-owned entities for that kind:
Then, for each entity kind with ghost indices, build the definitive ghost mapping and pull all associated arrays:
createFatherGlobalMapping() (if not already done)createGhostMapping(ghostIndices[kind]) — the exact, minimal setcreateMPITypes() + pullOnce() on the primary arrayBorrowAndPull() on all secondary arrays sharing the same ghost layoutThis replaces any scratch pull state. The definitive pull is what persists in the ArrayTransformer for later use (persistent pull/push in solvers).
Scratch pull optimization: During BFS, scratch pulls only need the adjacency arrays required by subsequent hops — NOT all arrays for that entity kind. The compiled tree knows which AdjKind hops follow from each kind, so it can determine the minimal scratch set. E.g., if the only hop from Cell at level 2 is Cell2Node, the scratch pull for Cell at level 1 only needs cell2node ghost data — not cellElemInfo, not cell2nodePbi. This keeps scratch pulls lightweight.
Pull groups: All arrays sharing the same entity kind are pulled together. The caller provides a registry mapping EntityKind to the list of ArrayPairs:
E.g., for entity kind Cell:
cell2cell (full 4-step setup)cell2node, cellElemInfo, cell2nodePbi, cell2cellOrigFor Node:
coordsnode2nodeOrigFor Bnd:
bnd2cellbnd2node, bndElemInfo, bnd2nodePbi, bnd2bndOrigMeshConnectivity does NOT need full ArrayPair move semantics. Ownership transfer between UnstructuredMesh legacy members and the DAG uses ssp<> (shared_ptr) swap on father/son pointers, which is O(1):
ArrayTransformer stays on the ArrayPair in UnstructuredMesh and is (re)initialized during ghost building. Full move semantics on Array, ParArray, ArrayTransformer, ArrayPair is deferred to a separate effort requiring DNDS-level unit test coverage.
MeshConnectivity holds a restricted registry mapping AdjKind to tAdjPair*:
The registry stores raw pointers to adjacency pairs owned elsewhere (typically by UnstructuredMesh). The caller is responsible for ensuring pointer validity.
During evaluateGhostTree, each hop in the BFS tree is resolved via resolveAdj. If any hop is unresolved, the evaluator aborts with a diagnostic message listing the missing adjacencies.
The forest of ghost chains can be unified into a single flat state-vector propagation that is mathematically equivalent (for non-cross-contaminating specs) and simpler to implement efficiently.
State vector. $\mathbf{S}^L$ is a vector indexed by EntityKind, where $\mathbf{S}^L(K)$ is the set of global indices of kind $K$ at BFS level $L$.
Initialization (level 0): $$\mathbf{S}^0(K) = \begin{cases} \text{Owned}(K) & \text{if } K \text{ is an anchor of any chain} \ \emptyset & \text{otherwise} \end{cases}$$
Transfers. Each chain $C_k$ with hops $h_k^1, \ldots, h_k^{n_k}$ generates transfers $(h_k^j, \text{level} = j)$ for $j = 1, \ldots, n_k$. Let $T_L$ be all transfers at level $L$.
Update rule: $$\mathbf{S}^L(K) = \mathbf{S}^{L-1}(K) \;\cup\; \bigcup_{\substack{(h, L) \in T_L \ \text{to}(h) = K}} h\bigl(\mathbf{S}^{L-1}(\text{from}(h))\bigr)$$
where $h(S) = \bigcup_{e \in S} h[e]$ (row lookup and union).
Ghost result: $$G_{\text{flat}}(K) = \mathbf{S}^{L_{\max}}(K) \setminus \text{Owned}(K)$$
Lemma 1. $\mathbf{S}^L(K) \subseteq \mathbf{S}^{L+1}(K)$ for all $K$, $L$.
Proof. The update rule takes a union with $\mathbf{S}^{L-1}(K)$, so the state can only grow. $\square$
Lemma 2. If $A \subseteq B$ then $h(A) \subseteq h(B)$.
Proof. $h(A) = \bigcup_{e \in A} h[e] \subseteq \bigcup_{e \in B} h[e] = h(B)$. $\square$
Theorem. $G_{\text{flat}}(K) \supseteq G_{\text{forest}}(K)$ for all collected kinds $K$. Equality holds when the spec has no cross-chain contamination (defined below).
*Proof of $\supseteq$.*
For chain $C_k$ with hops $h_k^1, \ldots, h_k^{n_k}$, anchor $a_k$, target $t_k = K$, define the chain's intermediate sets:
$$P_k^0 = \text{Owned}(a_k), \qquad P_k^j = h_k^j(P_k^{j-1})$$
The chain result is $R_k = P_k^{n_k}$.
Claim: $P_k^j \subseteq \mathbf{S}^j(\text{to}(h_k^j))$ for all $j$.
Induction on $j$:
Therefore $R_k \subseteq \mathbf{S}^{n_k}(K) \subseteq \mathbf{S}^{L_{\max}}(K)$ (by Lemma 1). Taking the union over all chains targeting $K$ and subtracting owned: $G_{\text{forest}}(K) \subseteq G_{\text{flat}}(K)$. $\square$
*Proof of $\subseteq$ (when no cross-chain contamination).*
Cross-chain contamination occurs when a transfer $(h, L)$ from chain $B$ reads $\mathbf{S}^{L-1}(\text{from}(h))$ which contains entities injected by a different chain $A$ at an earlier level, and the resulting entities are not reachable by any single chain.
Formally, an entity $e \in \mathbf{S}^{L_{\max}}(K) \setminus \text{Owned}(K)$ is cross-contaminated if every reachability path from owned entities to $e$ passes through transfers from two or more distinct chains in an order that does not correspond to any single chain's hop sequence.
When no cross-contamination exists, every entity in $\mathbf{S}^{L_{\max}}(K)$ is either owned or reachable through a single chain's hop sequence, hence belongs to some $R_k$. Therefore $G_{\text{flat}}(K) \subseteq G_{\text{forest}}(K)$.
Sufficient condition for no cross-contamination: For every pair of transfers $(h_A, L)$ from chain $A$ and $(h_B, L')$ from chain $B$ with $L' > L$ and $\text{from}(h_B) = \text{to}(h_A)$, there exists a chain $C$ whose hop sequence includes both $h_A$ at position $L$ and $h_B$ at position $L'$ (i.e., the cross-chain path is also a valid single-chain path).
The default spec chains:
Transfers by level:
Check cross-contamination pairs:
No cross-contamination. $G_{\text{flat}} = G_{\text{forest}}$ exactly. $\square$
Even when cross-contamination exists, $G_{\text{flat}} \supseteq G_{\text{forest}}$. Extra ghost entities are safe — they waste some memory but do not cause incorrect results. The definitive pull uses the exact computed set. For ghost building, this superset property is acceptable and often desirable (simpler implementation, no need to track per-chain provenance).
The flat formulation (Section 3.10) is efficient for MPI pulls (one pull per EntityKind per level) but may produce a strict superset of the forest result due to cross-chain contamination. The hybrid approach combines the best of both: exact per-tree-node traversal with efficient union-based pulls.
Data structures. All indices are global throughout.
S[n] — sorted vector of global indices for tree node n.Pull[K] — sorted vector of global indices for EntityKind K, computed as the union of all tree nodes' sets of that kind at the current level. Used solely for ghost communication (scratch pull).Evaluation.
Claim: $G_{\text{hybrid}}(K) = G_{\text{forest}}(K)$ exactly.
Proof. Each tree node $n$ at level $L$ with parent $p$ computes $S[n] = n.hop(S[p])$, using only the parent's per-node set $S[p]$, not the full union $\text{Pull}[\text{from}(n.hop)]$. This is identical to the forest formulation. The pull union is only used for MPI communication — it ensures that the adjacency data is available for $S[p]$, since $S[p] \subseteq \text{Pull}[p.kind]$ (by construction of Pull as a union containing $S[p]$). Therefore every index in $S[p]$ is resolvable after the scratch pull.
No cross-chain contamination occurs because each tree node's traversal is driven by its own per-node set, which was produced by its own chain's hop sequence. The pull union is a superset used only for communication, not for traversal.
The COLLECT step reads from per-node sets, producing exactly the forest's ghost sets. Union across COLLECT nodes of the same kind gives the same result as the forest's union of chain results. $\square$
The pull union $\text{Pull}[K]$ may contain more entities than any single tree node needs. This means the scratch pull fetches some adjacency data that no tree node will actually traverse. The overhead is:
In practice, for the default spec, the pull unions match the per-node sets (no overlap between tree nodes of the same kind at the same level), so there is zero overhead.
After a scratch pull at level $L$, every global index in every per-node set $S[n]$ at level $L$ (where $n$ has children) must be locally resolvable in the adjacency used by the child's hop. If an index is not found in the father range and not in the ghost mapping, this indicates a bug in the pull union computation or the MPI communication. The traversal asserts this with DNDS_assert rather than silently skipping.
The only exception is level 0 (before any pull), where the per-node sets are owned entities — always locally available by definition.
All sets, adjacency lookups, and ghost results operate on global indices throughout the evaluation. The adjacency arrays are in Adj_PointToGlobal state. Local-to-global and global-to-local conversions happen only at the boundary between the evaluator and the ArrayTransformer (for pull setup and for resolving row indices in the adjacency arrays).
|---—|-------------—| | defaultPrimary compiles | Tree structure: 2 roots, correct levels, COLLECT flags, prefix merging | | prefix merging | Two chains {Cell2Cell} and {Cell2Cell,Cell2Cell} share prefix | | invalid chain detection | Empty hops, anchor mismatch, target mismatch, consecutive mismatch | | checkAvailable | Missing adjacencies correctly identified | | dump | Diagnostic output contains expected entity names | | face ghost chain | Cell→Cell2Face→Face compiles correctly | | AdjKind equality and hash | Direct ignores via; intra-level uses via | | entityDepth | Correct depths for 2D and 3D, Edge==Face in 2D | | adjKindName | Formatted names: "Cell2Node", "Cell2Cell(Node)" | | Evaluate on partitioned mesh | TODO (Phase 4 integration) |
Goal: Wire MeshConnectivity DSL operations into UnstructuredMesh.
Status: Implemented. All ghost operations migrated to evaluateGhostTree. InterpolateFace migrated to InterpolateGlobal + pull-based ghost faces.
What was done:
BuildGhostPrimary: node ghosts via evaluateGhostTree(Cell→Cell2Cell→Cell2Node→Node), bnd ghosts via evaluateGhostTree(Node→Node2Bnd→Bnd).RecoverCell2CellAndBnd2Cell: N2CB ghost pull via evaluateGhostTree(Cell→Cell2Node→Node ∪ Bnd→Bnd2Node→Node).ReadSerializeAndDistribute: ghost cell collection via evaluateGhostTree(Cell→Cell2Cell→Cell).InterpolateFace: uses InterpolateGlobal + pull-based ghost faces via createGhostMapping, not push protocol inside the DSL.Not done (kept as manual code):
BuildCell2CellFace: 9 lines of local code, uses tAdj2Pair not supported by ComposeFiltered.BuildCell2Cell, BuildBnd2CellSerial): rank-0 only.ReadSerializeAndDistribute facial filtering: needs vertex-only counting.InterpolateGlobal for edges in production: API and tests ready, not yet called by the mesh pipeline (no edge adjacency in current solver).Goal: Rewrite RecoverNode2CellAndNode2Bnd, RecoverCell2CellAndBnd2Cell, InterpolateFace as DSL operation sequences.
Approach: The legacy methods become thin wrappers that call DSL operations:
Tests: Existing test_MeshPipeline.cpp must pass unchanged.
Boundaries are NOT a separate entity stratum. A boundary is a codim-1 face with a non-internal zone label. In the current code, bnd2node is a separate array; in the DAG, it's simply a subset of face2node selected by zone ID.
However, for backward compatibility, the DAG can store bnd2node as a separate InterLayerAdj with fromDepth = dim - 1 (boundary faces) and toDepth = 0 (nodes). The "boundary" stratum overlaps with the face stratum in entity depth but represents a subset.
bnd2cell is derived:
Or more directly: after face interpolation, each boundary face has a face ID, and face2cell (the support of the cell→face cone) directly gives the 1-2 cells. So bnd2cell[iBnd] = face2cell[bnd2face[iBnd]].
All tests are MPI-aware (np=1, 2, 4, 8) using doctest + mpirun.
| Phase | Test File | Key Assertions |
|---|---|---|
| 0 | test_MeshConnectivity.cpp | Inverse correctness, MPI completeness |
| 1 | same | Compose correctness, filter correctness, periodic pbi filter |
| 2 | test_MeshConnectivity_Interpolate.cpp | Face/edge counts, dedup, periodic collab check, distributed InterpolateGlobal, pbi value verification, coordinate center verification |
| 3 | test_MeshConnectivity_Ghost.cpp | Ghost set matches known results, chain merging, performance benchmarks |
| 4 | test_MeshPipeline.cpp (existing) | Full pipeline regression |
Small test meshes (hand-crafted 4-cell quads, periodic 2×2, 2×2×2) for local tests. Distributed NxNxN hex meshes (N=4) for InterpolateGlobal tests. Existing CGNS meshes (UniformSquare_10) for regression.
Shared builders in SyntheticMeshBuilders.hpp: HandCraftedMesh, Periodic2x2Mesh, Periodic2x2x2Mesh, DistributedHex3D, makeFaceQuery, makeEdgeQuery, makePeriodicMatchExtra, makeHex8FaceQueryPbi, makeHex8EdgeQueryPbi.
Date: 2026-04-23 Goal: Make every adjacency in MeshConnectivity capable of using any ArrayAdjacencyPair<rs> directly, instead of forcing tAdjPair (NonUniformSize) everywhere. This enables fixed-width adjacencies (face2cell=2, bnd2face=1, etc.) to flow through the DSL without conversion, and supports custom mesh topologies.
| Decision | Choice | Rationale |
|---|---|---|
| DSL method dispatch | Function templates (template<int rs_...>) | Compile-time specialization; if constexpr for ResizeRow differences |
| Result structs | Per-field template params | InterpolateResult<p2e_rs, e2n_rs, e2p_rs> lets callers choose output width |
| ConeAdj/SupportAdj storage | ssp<AdjVariant> shared ownership | Both DAG and legacy mesh members share the same allocation |
| adjRegistry | Owns ssp<AdjVariant>, no raw pointers | Safe shared ownership; std::visit dispatches at ghost traversal time |
| Backward compat | Default template params = NonUniformSize | Existing callers compile unchanged |
Before:
After:
All DSL methods become function templates parameterized on input/output row sizes. Default template args = NonUniformSize for backward compatibility.
For fixed-width arrays (rs > 0), ResizeRow is not supported (the width is compile-time fixed). The DSL methods must use if constexpr to skip ResizeRow calls when the output is fixed-width (the rows are already the correct size after Resize(n)).
Key if constexpr sites:
ResizeRow: skip for fixed-width outputs, assert width matches insteadCompress: skip for fixed-width (no CSR to compress)operator[] (AdjacencyRow handles both)RowSize: compile-time constant for fixed-width, runtime for NonUniformSizeGhost traversal (evaluateGhostTree) dispatches via std::visit:
Phase A: Shared ownership for ConeAdj/SupportAdj/Registry
ConeAdj::adj from AdjVariant to ssp<AdjVariant>.SupportAdj::adj from AdjVariant to ssp<AdjVariant>.adjRegistry from map<AdjKind, tAdjPair*> to map<AdjKind, ssp<AdjVariant>>.registerAdj to accept ssp<AdjVariant>.addCone/addSupport to allocate make_ssp<AdjVariant>(...).asAdj(), asAdj2(), etc. to dereference through ssp.evaluateGhostTree to use std::visit on *resolveAdj(kind).Phase B: Templatize DSL methods
Inverse to template<int cone_rs>. Keep return as tAdjPair.ComposeFiltered to template<int rs_AB, int rs_BC, class Pred>.Interpolate (local) to template<int p2n_rs, int p2e_rs, int e2n_rs, int e2p_rs>.InterpolateGlobal similarly.if constexpr for ResizeRow/Compress differences.Phase C: Templatize result structs
InterpolateResult<p2e_rs, e2n_rs, e2p_rs>.InterpolateGlobalResult<p2e_rs, e2n_rs, e2p_rs>.InterpolateDistributedResult<p2e_rs, e2n_rs, e2p_rs>.Phase D: Production usage with fixed-width
e2p_rs=2 for entity2parent (faces always have ≤2 parents).tAdj2Pair already — wire through the DAG.tAdj1Pair through the DAG.| Risk | Mitigation |
|---|---|
| Template bloat (many instantiations) | Only instantiate used combinations; explicit instantiation in .cpp |
| Compile time increase | Templates in headers are unavoidable for if-constexpr; mitigate with explicit instantiation |
| AdjVariant size grows | Currently 6 alternatives; no change needed |
| ssp overhead vs raw ptr | Negligible — these are large arrays; one extra indirection per access |
| evaluateGhostTree perf with std::visit | Visit dispatch is one virtual call per hop, not per element; negligible |
The current global/local index state management has five group-level state variables (adjPrimaryState, adjFacialState, adjC2FState, adjN2CBState, adjC2CFaceState) that each govern 1-4 adjacency arrays. This creates three problems:
adjPrimaryState covers cell2node (points to nodes), cell2cell (points to cells), bnd2node (points to nodes), and bnd2cell (points to cells). Converting any one forces converting all four, even when only one needs to change. The "ForBnd" variants (AdjGlobal2LocalPrimaryForBnd) are ad-hoc workarounds.pLGhostMapping to use: cell2cell uses cellElemInfo.trans.pLGhostMapping, cell2node uses coords.trans.pLGhostMapping. This coupling is scattered across 10+ methods and is easy to break during refactoring.MatchFaceBoundary, BuildCell2CellFace, and ReorderLocalCells do expensive local->global->local round-trips on entire groups because the group state forces all-or-nothing conversion.A lightweight struct recording what an adjacency array points to and its current state:
A thin wrapper that pairs an ArrayPair with its AdjIndexInfo:
Each adjacency points to a specific entity kind. This table shows which pLGhostMapping and pLGlobalMapping each adjacency needs:
| Adjacency | Points to | targetMapping source | targetGlobalMapping source |
|---|---|---|---|
cell2node | Node | coords.trans.pLGhostMapping | coords.father->pLGlobalMapping |
bnd2node | Node | coords.trans.pLGhostMapping | coords.father->pLGlobalMapping |
face2node | Node | coords.trans.pLGhostMapping | coords.father->pLGlobalMapping |
cell2cell | Cell | cellElemInfo.trans.pLGhostMapping | cell2node.father->pLGlobalMapping |
bnd2cell | Cell | cellElemInfo.trans.pLGhostMapping | cell2node.father->pLGlobalMapping |
face2cell | Cell | cellElemInfo.trans.pLGhostMapping | cell2node.father->pLGlobalMapping |
node2cell | Cell | cellElemInfo.trans.pLGhostMapping | cell2node.father->pLGlobalMapping |
cell2cellFace | Cell | cellElemInfo.trans.pLGhostMapping | cell2node.father->pLGlobalMapping |
node2bnd | Bnd | bndElemInfo.trans.pLGhostMapping | bnd2node.father->pLGlobalMapping |
face2bnd | Bnd | bndElemInfo.trans.pLGhostMapping | bnd2node.father->pLGlobalMapping |
cell2face | Face | face2node.trans.pLGhostMapping | face2node.father->pLGlobalMapping |
bnd2face | Face | face2node.trans.pLGhostMapping | face2node.father->pLGlobalMapping |
Mappings are wired at well-defined points in the mesh pipeline:
PartitionReorderToMeshCell2Cell / ReadDistributed_Redistribute:pLGlobalMapping exist (from createFatherGlobalMapping).state = Adj_PointToGlobal on primary adjacencies.targetGlobalMapping where available.BuildGhostPrimary:cellElemInfo.trans.pLGhostMapping and coords.trans.pLGhostMapping exist.targetMapping for all adjacencies that point to cells or nodes: InterpolateFace builds face ghosts:face2node.trans.pLGhostMapping exists.cell2face.idx.targetMapping and bnd2face.idx.targetMapping.BuildGhostN2CB builds node2cell/node2bnd ghosts:face2bnd.idx.targetMapping and node2bnd.idx.targetMapping (if bnd ghost mapping is available).Each wiring step is a single shared_ptr copy per adjacency — O(1) cost.
The 5 legacy group state variables become read-only computed properties:
This maintains backward compatibility: callers that read m->adjPrimaryState continue to compile (now as a function call) and get the same semantics.
For the DeviceView, the state values are still plain members (POD for GPU), populated from the derived queries at construction time:
The grouped conversion methods remain as thin wrappers:
The ForBnd variants become trivially:
Phase A: Introduce AdjIndexInfo + AdjPairTracked<TPair> (non-breaking)
AdjIndexInfo struct (new header src/Geom/AdjIndexInfo.hpp or within Mesh_DeviceView.hpp).AdjPairTracked<TPair> wrapper template.AdjPairTracked has implicit conversion operator TPair&() so existing code that passes the pair to functions compiles unchanged.AdjIndexInfo::toLocal / toGlobal standalone.Phase B: Replace member types on UnstructuredMesh
tAdjPair cell2node; → AdjPairTracked<tAdjPair> cell2node; and similarly for all adjacency pairs.AdjPairTracked has implicit conversion and forwarding operators, most callers should compile unchanged. Fix any that break.MeshAdjState adjPrimaryState() const methods. The old data members are removed.device_array_list_primary() etc. to return references to pair member of each AdjPairTracked.Phase C: Wire target mappings
BuildGhostPrimary, wire targetMapping for all primary adjacencies.InterpolateFace, wire for facial/C2F adjacencies.BuildGhostN2CB, wire for N2CB adjacencies.BuildCell2CellFace, wire for C2CFace.Phase D: Simplify conversion methods
AdjGlobal2Local* / AdjLocal2Global* with calls to adj.toLocal() / adj.toGlobal().Phase E: Enable fine-grained conversions (optional, future)
BuildCell2CellFace only needs cell2cell in local state — no need to convert bnd2node.ReorderLocalCells can convert only the adjacencies it touches.MatchFaceBoundary can convert only cell2face without touching bnd2face.Potential breakage sites for implicit operator TPair&():
template<class T> void f(T&), passing AdjPairTracked<tAdjPair> deduces T = AdjPairTracked<tAdjPair>, not T = tAdjPair. If the function body accesses .father, .son, .trans, it breaks. Mitigation: Add forwarding members: auto &father(), auto &son(), auto &trans() on AdjPairTracked, or use .pair.father at those sites.ConvertAdjEntries and ConvertAdjEntriesOMP.** These templates use TAdj::RowSize() and TAdj::operator()() — both forwarded by AdjPairTracked, so they should work.PermuteRows.** Uses pair.father directly. Will need .pair.father or forwarding.device_array_list_*().** The DNDS_MAKE_1_MEMBER_REF macro takes a member name. Will need to reference cell2node.pair or adapt the macro.ReadSerialize/WriteSerialize).** Accesses .father, .son, .trans directly. Will need .pair. prefix.| Risk | Mitigation |
|---|---|
| Implicit conversion not deduced in templates | Add forwarding members or use .pair at breakage sites |
device_array_list_* macros break | Adapt macros to access .pair member |
| DeviceView constructor copies state | Populate from derived query methods — same semantics |
| Serialization/deserialization breaks | Access .pair explicitly at serialization sites |
| Python bindings reference member types | pybind11 bindings need .pair access; update Mesh_bind.hpp |
evaluateGhostTree still needs global state | Unchanged; the per-adj state just makes the precondition checkable per-adjacency |
| OMP parallel conversion safety | toLocalOMP/toGlobalOMP use same #pragma omp parallel for pattern as existing code |
toLocal() / toGlobal(). The improvement is that the mapping dependency is recorded (not hard-coded) and conversion is per-adjacency (not per-group).AdjPairTracked<TPair> wrapper over modifying ArrayPair directly, keeping DNDS core types unchanged.