Status: Implemented. evaluateGhostTree supports multi-hop chains with scratch pulls between BFS levels. BuildGhostPrimary(nGhostLayers) passes through to GhostSpec::defaultPrimary(nLayers).
Current State
BuildGhostPrimary supports N layers of ghost cells by traversing cell2cell N times via evaluateGhostTree:
GhostSpec::defaultPrimary(nLayers)
Cell chain: nLayers hops of Cell2Cell
Node chain: nLayers hops of Cell2Cell + Cell2Node
Bnd chain: Bnd2Node -> Node2Bnd (unchanged)
cell2cell uses node-sharing (minShared=1). Nodes, boundaries, and N2CB are derived from the ghost cells.
Goal
Upgrade evaluateGhostTree to handle ANY multi-hop chain with internal scratch pulls between levels, without mutating any input arrays. Use this to support N-layer ghost cells.
Resolved decisions
- Node-sharing for cell2cell: kept as-is.
- Face ghost range: unchanged; separate concern.
- No input mutation: evaluator must not modify the caller's arrays.
Design
Core principle: evaluator-owned scratch transformers
The evaluator accepts adjacency arrays as read-only father-only inputs (global-indexed). When a multi-hop chain requires ghost data at an intermediate level, the evaluator:
- Creates a temporary son array (same type as the father).
- Creates a temporary ArrayTransformer with setFatherSon(father, tempSon).
- Reuses the father's existing global mapping (falling back to createFatherGlobalMapping only when the father has none), then calls createGhostMapping(ghostIndices), createMPITypes, and pullOnce.
- Now the temporary pair (father + tempSon) has ghost data available. traverseHop at the next level reads from this temporary pair.
The input arrays are never touched. The temporary son arrays are discarded after evaluation.
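The scratch-pull idea above can be illustrated without MPI. The following is a minimal sketch, not the DNDS implementation: ToyFather, ScratchView and scratchPull are hypothetical names, and the vector of all ranks' fathers stands in for the communication that pullOnce performs. It only shows that the ghost rows land in an evaluator-owned son while the shared father stays read-only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

using Index = std::int64_t;
using Row = std::vector<Index>;

// Hypothetical stand-in for a father array: the rows one rank owns, starting
// at global index `offset`. All adjacency entries are global indices.
struct ToyFather
{
    Index offset;
    std::vector<Row> rows;
    bool owns(Index g) const { return g >= offset && g < offset + Index(rows.size()); }
};

// Hypothetical stand-in for the evaluator-owned (father + tempSon) pair.
// The father is shared and const; the son belongs to this view only.
struct ScratchView
{
    std::shared_ptr<const ToyFather> father;
    std::map<Index, Row> son; // pulled ghost rows, keyed by global index

    // Row for global index g, or nullptr if g is neither owned nor pulled.
    const Row *row(Index g) const
    {
        if (father->owns(g))
            return &father->rows[std::size_t(g - father->offset)];
        auto it = son.find(g);
        return it == son.end() ? nullptr : &it->second;
    }
};

// Simulated pull: copies the rows for `ghosts` from whichever rank owns them.
// `allRanks` plays the role of the communicator (an assumption of this sketch).
inline ScratchView scratchPull(const std::shared_ptr<const ToyFather> &father,
                               const std::vector<std::shared_ptr<const ToyFather>> &allRanks,
                               const std::vector<Index> &ghosts)
{
    ScratchView view{father, {}};
    for (Index g : ghosts)
    {
        assert(!father->owns(g) && "ghost indices must be non-owned");
        for (const auto &other : allRanks)
            if (other->owns(g))
                view.son[g] = other->rows[std::size_t(g - other->offset)];
    }
    return view;
}
```

Because the view holds the father as a pointer to const, any attempt to write through it fails to compile, which is the toy analogue of the "no input mutation" rule. Dropping the ScratchView discards the temporary son.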
Registration: father-only, global-indexed
registerAdj stores a reference to the father array only. The adjacency entries must be global indices (the evaluator needs global indices to identify non-owned entities and to build ghost mappings).
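template <class TPair>
void registerAdj(AdjKind kind, const TPair &pair) // signature inferred from the call sites
{
    auto adjVar = makeAdjVariant<TPair>();
    auto &stored = std::get<TPair>(*adjVar);
    stored.father = pair.father; // father only: no son, no transformer
    // adjVar is then kept so that resolveAdj(kind) can return it (storage step not shown)
}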
evaluateGhostTree: internal scratch pull loop
GhostResult MeshConnectivity::evaluateGhostTree(
    const CompiledGhostTree &tree,
    const MPIInfo &mpi) const
{
    GhostResult result;
    std::vector<std::vector<index>> nodeSets(tree.totalNodes);

    // Evaluator-owned scratch pairs, keyed by hop; rebuilt at every level.
    struct ScratchState
    {
        ssp<AdjVariant> liveAdj;
    };
    std::unordered_map<AdjKind, ScratchState, AdjKindHash> scratchStates;

    // Prefer the scratch pair (father + temporary son) when one exists.
    auto resolveAdjLive = [&](AdjKind kind) -> ssp<AdjVariant>
    {
        auto sit = scratchStates.find(kind);
        if (sit != scratchStates.end())
            return sit->second.liveAdj;
        return resolveAdj(kind);
    };

    // Level 0: seed each root with the global indices owned by this rank.
    for (auto &entry : tree.levels[0])
    {
        auto gm = getGlobalMapping(entry.kind);
        index myOffset = gm->operator()(mpi.rank, 0);
        index mySize = gm->RLengths()[mpi.rank];
        auto &set = nodeSets[entry.nodeId];
        for (index i = 0; i < mySize; i++)
            set.push_back(myOffset + i);
    }

    for (int level = 0; level <= tree.maxLevel; level++)
    {
        // COLLECT: merge the non-owned entries of this level into the cumulative ghost set.
        for (auto &entry : tree.levels[level])
        {
            if (!entry.collect)
                continue;
            auto gm = getGlobalMapping(entry.kind);
            index myOffset = gm->operator()(mpi.rank, 0);
            index myEnd = myOffset + gm->RLengths()[mpi.rank];
            auto ghosts = filterNonOwned(nodeSets[entry.nodeId], myOffset, myEnd);
            sortedMergeInto(result.ghostIndices[entry.kind], ghosts);
        }
        if (level >= tree.maxLevel)
            break;

        // Scratch pull: one temporary transformer per hop whose parent kind already has ghosts.
        for (auto &childEntry : tree.levels[level + 1])
        {
            AdjKind hop = childEntry.hop;
            // parentKind: entity kind of the parent set feeding this hop
            // (its definition is not shown in this listing).
            auto git = result.ghostIndices.find(parentKind);
            if (git == result.ghostIndices.end() || git->second.empty())
                continue;
            if (scratchStates.count(hop))
                continue;
            auto origAdj = resolveAdj(hop);
            if (!origAdj)
                continue;
            std::visit([&](auto &adj)
            {
                using TPair = std::decay_t<decltype(adj)>;
                using TArray = typename TPair::t_pArray::element_type;
                auto tempSon = make_ssp<TArray>(ObjName{"scratch_son"}, adj.father->getMPI());
                auto livePair = makeAdjVariant<TPair>();
                auto &live = std::get<TPair>(*livePair);
                live.father = adj.father;
                live.son = tempSon;
                live.trans.setFatherSon(adj.father, tempSon);
                // Reuse the father's mapping when present; create one only as a fallback.
                if (adj.father->pLGlobalMapping)
                    live.trans.pLGlobalMapping = adj.father->pLGlobalMapping;
                else
                    live.trans.createFatherGlobalMapping();
                auto ghostCopy = git->second;
                live.trans.createGhostMapping(ghostCopy);
                live.trans.createMPITypes();
                live.trans.pullOnce();
                scratchStates[hop] = ScratchState{std::move(livePair)};
            }, *origAdj);
        }

        // Traverse every hop into the next level, reading ghost rows from the scratch pairs.
        for (auto &childEntry : tree.levels[level + 1])
        {
            auto &parentSet = nodeSets[childEntry.parentId];
            auto adjVar = resolveAdjLive(childEntry.hop);
            nodeSets[childEntry.nodeId] = traverseHop(parentSet, *adjVar, false);
        }
        // Drop the scratch pairs so the next level re-pulls with the expanded ghost set.
        scratchStates.clear();
    }
    return result;
}
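To make the level loop concrete, here is a serial toy model of the same collect / scratch-pull / traverse cycle. It is a sketch under stated assumptions, not the evaluator itself: ToyMesh and ghostCellLayers are hypothetical names, the mesh is a 1D chain where cell i neighbors i-1 and i+1, ranks own equal contiguous blocks, and ToyMesh::globalRow stands in for whatever a remote rank would send during a pull.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Index = std::int64_t;
using Row = std::vector<Index>;

// Hypothetical distributed cell2cell: rank r owns global rows
// [r * perRank, (r + 1) * perRank) of a 1D chain of nCells cells.
struct ToyMesh
{
    Index nCells;
    Index perRank;
    Row globalRow(Index g) const // the row the owning rank would send in a pull
    {
        Row r;
        if (g > 0)
            r.push_back(g - 1);
        if (g + 1 < nCells)
            r.push_back(g + 1);
        return r;
    }
};

// Sorted ghost cells of `rank` after nLayers hops of Cell2Cell.
inline std::vector<Index> ghostCellLayers(const ToyMesh &mesh, Index rank, int nLayers)
{
    assert(mesh.perRank > 0);
    const Index myOffset = rank * mesh.perRank;
    const Index myEnd = myOffset + mesh.perRank;

    std::set<Index> frontier; // node set of the current level
    for (Index g = myOffset; g < myEnd; ++g)
        frontier.insert(g); // level 0: owned cells
    std::set<Index> ghosts; // cumulative COLLECT result

    for (int level = 0; level < nLayers; ++level)
    {
        // Scratch pull: rows of the cumulative ghost set go into a temporary
        // son that is rebuilt at every level; owned rows are read directly.
        std::map<Index, Row> scratchSon;
        for (Index g : ghosts)
            scratchSon[g] = mesh.globalRow(g);

        // Traverse one hop from every entity in the current node set.
        std::set<Index> next;
        for (Index g : frontier)
        {
            const bool owned = g >= myOffset && g < myEnd;
            const Row row = owned ? mesh.globalRow(g) : scratchSon.at(g);
            next.insert(row.begin(), row.end());
        }
        // COLLECT: accumulate the non-owned entries reached at this level.
        for (Index g : next)
            if (g < myOffset || g >= myEnd)
                ghosts.insert(g);
        frontier = std::move(next);
    }
    return std::vector<Index>(ghosts.begin(), ghosts.end());
}
```

With 12 cells on 3 ranks (ToyMesh{12, 4}), rank 1 owns cells 4 to 7. One layer reaches cells 3 and 8; the second layer needs the rows of cells 3 and 8, which only the scratch son provides, and adds cells 2 and 9.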
Key properties
- No input mutation. The original AdjPairTracked members (father, son, transformer, ghost mapping, idx state) are never modified. All pulls happen on evaluator-owned temporary transformers with temporary son arrays. registerAdj already unwraps AdjPairTracked<TPair> to the base TPair, so the evaluator never sees idx.
- Shared father, independent son. Scratch transformers share the same father shared_ptr as the input pair, but create their own temporary son. setFatherSon on the scratch transformer doesn't affect the input pair's transformer.
- No createFatherGlobalMapping on a mapped father. Whenever the father already has a global mapping, the scratch transformer must reuse it directly (via trans.pLGlobalMapping = father->pLGlobalMapping). Calling createFatherGlobalMapping() in that case would replace the shared father's mapping pointer (the same BuildSerialOut side effect we already fixed). The call is reserved for the fallback where the father has no mapping yet.
- Global-indexed inputs. The evaluator expects adjacency entries to be global indices. BuildGhostPrimary is called when adjPrimaryState == Adj_PointToGlobal, so this is satisfied. The tracked pairs' idx.state() is Adj_PointToGlobal at that point.
- Any chain depth. The mechanism works for any number of hops. Each level that needs ghost data gets a scratch pull. The scratchStates.clear() at the end of each level ensures re-pull with the expanded cumulative ghost set.
- Correct ghost accumulation. COLLECT at intermediate levels adds to the cumulative ghost set. The scratch pull at the next level uses the full cumulative set, so all ghost entries are resolvable.
- Backward compatible. For 1-hop chains (current usage), level 0 is roots, level 1 is the single hop. No scratch pull is needed at level 0->1 because the parent set is owned entities (always in father). The evaluator produces the same result as today.
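The ghost accumulation described above relies on the two helpers called in the evaluator's COLLECT step, filterNonOwned and sortedMergeInto, whose bodies are not shown in this document. The following is one plausible sketch of their semantics. It assumes that node sets and ghost lists are sorted, duplicate-free vectors of global indices, and that the owned range is the half-open interval [myOffset, myEnd); the real signatures may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

using Index = std::int64_t;

// Keeps the entries of `set` that fall outside the owned range [myOffset, myEnd).
// The relative order of the input is preserved.
inline std::vector<Index> filterNonOwned(const std::vector<Index> &set, Index myOffset, Index myEnd)
{
    std::vector<Index> out;
    std::copy_if(set.begin(), set.end(), std::back_inserter(out),
                 [=](Index g) { return g < myOffset || g >= myEnd; });
    return out;
}

// Merges sorted `src` into sorted `dst`. The result stays sorted, and stays
// duplicate-free as long as both inputs were duplicate-free.
inline void sortedMergeInto(std::vector<Index> &dst, const std::vector<Index> &src)
{
    assert(std::is_sorted(dst.begin(), dst.end()) && std::is_sorted(src.begin(), src.end()));
    std::vector<Index> merged;
    merged.reserve(dst.size() + src.size());
    std::set_union(dst.begin(), dst.end(), src.begin(), src.end(), std::back_inserter(merged));
    dst.swap(merged);
}
```

Because the merge is a set union, a ghost collected at level 1 and reached again at level 2 appears once in the cumulative list, so the ghost mapping built for the next scratch pull has no repeated indices.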
Intermediate COLLECT marking
For multi-hop same-kind chains (e.g., Cell2Cell -> Cell2Cell), the intermediate Cell node must also COLLECT so the ghost set accumulates between layers.
In insertChainIntoTrie, when creating/finding a child node, set collect = true whenever the child's entity kind matches the chain's target. This marks the intermediate nodes in addition to the last hop, whose kind equals the target by construction:
    if (childKind == chain.target)
        child->collect = true;
This is safe because COLLECT only adds to the ghost set – it never removes or changes traversal behavior.
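The marking rule can be tried on a small stand-alone trie. This is a sketch only: TrieNode, Chain, Hop, hopTarget and insertChain are hypothetical simplifications and do not reproduce the real CompiledGhostTree or insertChainIntoTrie types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

enum class EntityKind { Cell, Node, Bnd };
enum class Hop { Cell2Cell, Cell2Node, Bnd2Node, Node2Bnd };

// Entity kind reached by following a hop (hypothetical helper).
inline EntityKind hopTarget(Hop h)
{
    switch (h)
    {
    case Hop::Cell2Cell: return EntityKind::Cell;
    case Hop::Cell2Node: return EntityKind::Node;
    case Hop::Bnd2Node:  return EntityKind::Node;
    case Hop::Node2Bnd:  return EntityKind::Bnd;
    }
    return EntityKind::Cell; // not reached for valid enum values
}

struct TrieNode
{
    explicit TrieNode(EntityKind k) : kind(k) {}
    EntityKind kind;
    bool collect = false;
    std::map<Hop, std::unique_ptr<TrieNode>> children;
};

struct Chain
{
    EntityKind root;
    std::vector<Hop> hops;
    EntityKind target;
};

// Inserts a chain below `root`, sharing existing prefixes. Every node on the
// path whose kind equals the chain target is marked COLLECT, which covers the
// final node and any same-kind intermediate node.
inline void insertChain(TrieNode &root, const Chain &chain)
{
    assert(root.kind == chain.root);
    TrieNode *cur = &root;
    for (std::size_t i = 0; i < chain.hops.size(); ++i)
    {
        std::unique_ptr<TrieNode> &slot = cur->children[chain.hops[i]];
        if (!slot)
            slot.reset(new TrieNode(hopTarget(chain.hops[i])));
        if (slot->kind == chain.target)
            slot->collect = true; // only ever sets the flag, never clears it
        cur = slot.get();
    }
    assert(cur->kind == chain.target); // a well-formed chain ends on its target kind
}
```

Inserting the 2-layer cell chain and then the node chain shares the two Cell2Cell nodes; both stay COLLECT because the node chain never clears the flag. In the Bnd chain, the intermediate Node is not marked because its kind differs from the target Bnd.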
GhostSpec for N-layer cells
GhostSpec GhostSpec::defaultPrimary(int nLayers)
{
    using namespace Adj;
    std::vector<AdjKind> cellHops(nLayers, Cell2Cell);
    std::vector<AdjKind> nodeHops(nLayers, Cell2Cell);
    nodeHops.push_back(Cell2Node);
    return GhostSpec{{
        {EntityKind::Cell, cellHops, EntityKind::Cell},
        {EntityKind::Cell, nodeHops, EntityKind::Node},
        {EntityKind::Bnd, {Bnd2Node, Node2Bnd}, EntityKind::Bnd},
    }};
}
Compiled tree for N=2
Cell (root, level 0)
\-[Cell2Cell]-> Cell (level 1, COLLECT) <-- layer 1 ghost cells
\-[Cell2Cell]-> Cell (level 2, COLLECT) <-- layer 2 ghost cells
\-[Cell2Node]-> Node (level 2, COLLECT) <-- nodes of outermost layer
Bnd (root, level 0)
\-[Bnd2Node]-> Node (level 1)
\-[Node2Bnd]-> Bnd (level 2, COLLECT) <-- ghost bnds
\-[Bnd2Node]-> Node (level 3, COLLECT) <-- nodes of ghost bnds
Evaluation for N=2:
- Level 0: roots = owned Cells, owned Bnds.
- Level 1: traverse Cell2Cell for owned cells -> cell set with ghosts. COLLECT ghost cells (layer 1). Traverse Bnd2Node for owned bnds -> node set. No scratch pull needed (parents are all owned).
- Scratch pull: create temp transformer on cell2cell.father, ghost mapping = layer-1 ghost cells, pullOnce. Now layer-1 ghost cells' cell2cell rows are available in temp son.
- Level 2: traverse Cell2Cell for layer-1 cells (including ghosts, now resolvable via temp son) -> cell set with layer-2 ghosts. COLLECT ghost cells (layers 1+2). Traverse Cell2Node for outermost cells -> node set. COLLECT ghost nodes. Traverse Node2Bnd -> bnd set. COLLECT ghost bnds.
- Level 3 (bnd chain): traverse Bnd2Node for ghost bnds -> node set. COLLECT ghost nodes (merged with cell-derived nodes).
Caller code in BuildGhostPrimary
void UnstructuredMesh::BuildGhostPrimary(int nLayers)
{
    // cell2cell needs a father global mapping before it is registered.
    cell2cell.TransAttach();
    cell2cell.trans.createFatherGlobalMapping();

    // Register the adjacencies (father-only, global-indexed) and the per-kind global mappings.
    MeshConnectivity dag;
    dag.meshDim = dim;
    dag.registerAdj(Adj::Cell2Cell, cell2cell);
    dag.registerAdj(Adj::Bnd2Node, bnd2node);
    dag.registerAdj(Adj::Node2Bnd, node2bnd);
    dag.registerGlobalMapping(EntityKind::Cell, cell2cell.trans.pLGlobalMapping);
    dag.registerGlobalMapping(EntityKind::Node, coords.trans.pLGlobalMapping);
    dag.registerGlobalMapping(EntityKind::Bnd, bnd2node.trans.pLGlobalMapping);

    // nLayers is forwarded to defaultPrimary; the evaluator handles the multi-hop chains.
    auto spec = GhostSpec::defaultPrimary(nLayers);
    auto tree = CompiledGhostTree::compile(spec);
    auto result = dag.evaluateGhostTree(tree, mpi);

    cell2cell.trans.createMPITypes();
    cell2cell.trans.pullOnce();
}
The caller code is almost identical to today. The only change is passing nLayers to defaultPrimary. The evaluator handles the complexity internally.
Implementation order
- Mark intermediate COLLECT in insertChainIntoTrie.
- Add scratch pull mechanism to evaluateGhostTree (temp transformers).
- Add GhostSpec::defaultPrimary(int nLayers).
- Update BuildGhostPrimary(int nLayers = 1).
- Propagate through Mesh_Helpers.hpp, Python read_mesh.
- Tests.
Files to modify