CFV Module Refactoring Plan¶
Branch: dev/harry_refac1
Baseline commit: c774b89 (on dev/harry_refac1)
Date: 2026-04-15
1. Current State¶
The CFV module (~8,000 lines in src/CFV/) implements physics-agnostic Compact
Finite Volume discretizations. Its core is a two-level class hierarchy:
FiniteVolume (1,262 lines) -- geometric metrics, array factories, device mgmt
^ (inheritance)
VariationalReconstruction<dim> (3,538 lines) -- reconstruction, limiters, base functions
Supporting files: Limiters.hpp (789 lines), VRDefines.hpp (types),
VRSettings.hpp / FiniteVolumeSettings.hpp (configuration),
ModelEvaluator (test physics), test_FiniteVolume (benchmark kernels).
Physics is injected via callbacks (TFBoundary, tFGetBoundaryWeight,
TFTrans), which is a clean Strategy pattern. The implementation has
accumulated structural debt that this plan addresses.
Module Dependency Graph (current)¶
DNDS (Core)
^
Geom (Mesh, Elements, Quadrature, Base Functions)
^
CFV (FiniteVolume, VariationalReconstruction, Limiters, ModelEvaluator)
^ ^
| Euler -- uses full VR + VRDefines types (deep structural dep)
EulerP -- uses only FiniteVolume (clean composition)
2. Refactoring Phases¶
Phase 0: Unit Tests for Correctness (DO FIRST) – DONE¶
Goal: Establish baseline correctness tests before touching any CFV code.
Golden reference values captured from commit c774b89.
Implementation: C++ doctest in test/cpp/CFV/test_Reconstruction.cpp.
CMake target: cfv_test_reconstruction. CTest names:
cfv_reconstruction_np1, cfv_reconstruction_np2, cfv_reconstruction_np4.
All iterative VR uses Jacobi iteration (SORInstead=false) to ensure
deterministic results across MPI partitions. Golden values are constant
for all np.
Error metric: L1 pointwise error at cell quadrature points (intOrder >= 5), normalized by domain volume, so its dimension matches the field function.
Meshes:
Uniform_3x3_wall.cgns(9 quads, wall BC, [-1,2]^2) – polynomial testsIV10_10.cgns(100 quads, periodic, [0,10]^2) + bisections 0,1,2 -> 100, 400, 1600 cells – convergence series on structured meshIV10U_10.cgns(322 tris, periodic, [0,10]^2) + bisections 0,1,2 -> 322, ~1288, ~5152 cells – convergence series on unstructured mesh
Reconstruction methods tested:
Gauss-Green (explicit 2nd-order gradient,
DoReconstruction2ndGrad)VFV P1, P2, P3 with HQM weights (dirWeightScheme=HQM_OPT, geomWeightScheme=HQM_SD)
VFV P1, P3 with default Factorial weights
Test cases (17 total, 50 assertions):
Category |
Tests |
Assertion |
|---|---|---|
Wall/constant (5 methods) |
u=1 on wall mesh |
L1 err < 1e-12 (exact) |
Wall/linear (5 methods) |
u=x+2y on wall mesh |
Sanity check < 10 |
Wall/quadratic (3 methods) |
u=x^2+y^2 |
Sanity check < 10 |
Wall/cubic (2 methods) |
u=x^3+xy^2 |
Sanity check < 10 |
Periodic convergence (11 configs x 3 bisections) |
sin*cos and cos+cos on IV10/IV10U |
Golden value match (1e-6 rel) |
Convergence check |
P2-HQM on IV10 base |
Converges in < 200 Jacobi iters |
Convergence series (golden, representative):
Mesh |
Method |
bis=0 |
bis=1 |
bis=2 |
Rate |
|---|---|---|---|---|---|
IV10 |
GG |
1.56e-02 |
3.42e-03 |
7.89e-04 |
~2 |
IV10 |
P1-HQM |
4.66e-02 |
9.30e-03 |
1.53e-03 |
~2.5 |
IV10 |
P2-HQM |
3.05e-03 |
2.31e-04 |
2.44e-05 |
~3.5 |
IV10 |
P3-HQM |
1.91e-03 |
4.67e-05 |
1.49e-06 |
~5 |
IV10 |
P3-Def |
1.88e-03 |
2.50e-05 |
8.37e-07 |
~5 |
IV10U |
P3-HQM |
1.59e-04 |
9.88e-06 |
4.34e-07 |
~4.5 |
Verified passing: np=1, np=2, np=4. Total CTest time ~10s.
Build and run:
cmake --build build -t cfv_test_reconstruction -j8
ctest --test-dir build -R cfv_reconstruction --output-on-failure
Phase 1: Safety Fixes – DONE¶
Fix latent bugs that could cause silent incorrect results.
# |
Issue |
Location |
Fix |
|---|---|---|---|
1.1 |
Thread-unsafe |
|
Replace with local variable |
1.2 |
Raw |
|
Use |
1.3 |
Settings slicing (two divergent copies) |
|
Single settings instance; base reads via accessor |
1.4 |
Missing |
|
Add annotations |
Phase 2: Code Deduplication – DONE (4 of 6 items; 2 skipped)¶
~480 lines removed. Items 2.2 and 2.3 were skipped after analysis showed the structural differences between the paired functions are too deep for a clean merge; forcing unification would reduce readability without meaningful maintenance benefit.
# |
Duplication |
Status |
Fix |
Lines Saved |
|---|---|---|---|---|
2.1 |
Periodic transform block (4 copies) |
Done |
|
~40 |
2.2 |
|
Skipped |
Sweep strategies differ fundamentally (successive SR vs. three-sweep) |
– |
2.3 |
|
Skipped |
Diff version has different callback signature, no GG1 block, no u subtraction |
– |
2.4 |
Polynomial norm in |
Done |
|
~300 |
2.5 |
|
Done |
|
~60 |
2.6 |
Biway limiter dispatch switch (2 copies) |
Done |
|
~30 |
New helpers introduced:
VRDefines.hpp:GetRecDOFRange<dim>(pOrder)– returns{LimStart, LimEnd}pairLimiters.hpp:PolynomialSquaredNorm<dim>(theta),PolynomialDotProduct<dim>(t1, t2)VariationalReconstruction.hpp:ApplyPeriodicTransform(if2c, faceID, data...)– variadic CRTPLimiterProcedure.hxx:DispatchBiwayLimiter<dim, nVarsFixed>(alter, u1, u2, out, n)
Verified: Phase 0 tests pass at np=1,2,4. Euler target compiles clean.
Phase 3: Module Boundary Corrections – DONE (1 of 2 items; 1 deferred)¶
# |
Item |
Status |
Action |
|---|---|---|---|
3.1 |
|
Deferred |
Added placement comment; actual move to |
3.2 |
|
Done |
Renamed to |
Phase 4: Extract FFaceFunctional – DONE¶
File:
VariationalReconstruction.hppWhat was done:
Extracted
AccumulateDiffOrderContributions<dim, powV>()free function template that replaces 4 copies of theswitch(cnDiffs)fallthrough pattern (~160 lines removed)Replaced
#define __POWV/#undef __POWVmacro withconstexpr int powVUses compile-time
Eigen::segment<N>()instead of dynamic{6,7,8,9}initializer-list slicingReplaced
std::pow(faceL, order*2)with integer multiplicationsFFaceFunctional body remains inline in
VariationalReconstruction.hpp(moving to separate.hxxprovides no functional benefit given the above improvements)
Phase 5: Decompose VariationalReconstruction into Composed Sub-Objects – DONE (2 of 4 items; 2 skipped)¶
Split the monolithic class’s data members into two nested sub-objects:
Component |
Status |
Owns |
|---|---|---|
|
Done |
|
|
Done |
|
|
Skipped |
Methods already in separate |
|
Skipped |
Same reason; already in separate |
VariationalReconstruction<dim> holds baseWeight_ and coefficients_ members
and delegates through them. The public API is preserved unchanged. All methods
continue to live on the VR class but access data through the sub-objects.
Phase 6: Separate FiniteVolume Concerns – DONE¶
Concern |
Current State |
Refactored |
|---|---|---|
Geometric metrics |
18 protected array pairs + |
Keep in |
Array factory / DOF builder |
|
DOF logic extracted to free functions in |
Device management |
|
Extracted to |
Additional improvements:
MakePairDefaultOnCell/MakePairDefaultOnFacenow require anameparameter (33 call sites updated with descriptive names like"FV::volumeLocal","VR::matrixA")Array::to_device()/to_host()/deviceView()error messages now include the array’s object identityArrayPair::deviceView()error messages include the array’s object identity
Phase 6a: Unified Device Management Mixin – DONE¶
Implemented: DNDS/DeviceTransferable.hpp — CRTP mixin providing to_device(),
to_host(), device(), getDeviceArrayBytes(). Derived class provides
for_each_device_member(F&&).
Classes migrated:
CFV::FiniteVolume— inheritsDeviceTransferable<FiniteVolume>, delegates viafor_each_device_member→device_array_list()Geom::UnstructuredMesh— inheritsDeviceTransferable<UnstructuredMesh>, delegates viafor_each_device_member→op_on_device_arrays()(conditional 4-group iteration preserved)
Classes skipped (documented rationale):
EulerP::Evaluator,BCHandler,Physics,BC— use manual delegation pattern (callingto_deviceon heterogeneous sub-objects of different types:shared_ptr<FV>,shared_ptr<BCHandler>,host_device_vector, individualArrayPairs). The mixin’s tuple-basedMemberRefiteration doesn’t fit without significant restructuring. The manual delegation is appropriate here since these classes compose otherDeviceTransferableobjects.
CUDA unit tests: test/cpp/CFV/test_DeviceTransferable.cu — 5 tests:
test_device_view_trivially_copyable— compile-timestatic_assert+ host view correctnesstest_device_state_tracking—device()returns correct backend through transfer lifecycletest_host_device_transfer— Host backend preserves data accesstest_cuda_round_trip— host→CUDA kernel read→host verification of cell volumes, barycenters, face areastest_cuda_multiple_round_trips— 3 round-trips with bitwise value stability
Phase 7: API Hygiene and Encapsulation – SKIPPED¶
All three items were evaluated and skipped. The cost/benefit ratio does not justify the changes given the current codebase state after Phases 0-6+8.
# |
Item |
Status |
Rationale |
|---|---|---|---|
7.1 |
|
Skipped |
Stable interface (8/9 params), 6 invocation sites + 8 lambda definitions to update atomically; Euler lambdas lack Phase 0 test coverage; low value for a cosmetic change |
7.2 |
Const accessors for VR→FV protected members |
Skipped |
11 protected members accessed directly by VR (23 reads, all read-only except 1 settings write); protected access is the intended C++ pattern for inheritance; only worthwhile if switching to composition, which was already evaluated and skipped in Phase 5 |
7.3 |
Make |
Skipped |
|
Detailed findings (preserved for future reference):
TFBoundary: 8 params(UBL, UMEAN, iCell, iFace, ig, Norm, pPhy, fType).TFBoundaryDiff: 9 params (addsdUMEANas 2nd arg). Defined inVariationalReconstruction.hpp:885-904. Invoked in 6 sites (all in_Reconstruction.hxx). Lambda definitions: 4 insrc/Euler/, 1 inModelEvaluator.hpp, 1 inEulerSolver.hxx(diff), 2 in test code.VR directly accesses 11 FV protected members (23 total reads). Heaviest:
axisSymmetric(5),faceMeanNorm(4),cellAlignedHBox(2),cellMajorHBox(2),cellMajorCoord(2). 5 members lack any public accessor:cellAlignedHBox,cellMajorCoord,periodicity,cellCent,faceCent. External code (Euler/EulerP) never touches protected members.meshis publicssp<UnstructuredMesh>;mpiis publicMPIInfovalue. Externalmeshaccess: ~25 sites in EulerP (fv->mesh->), ~35 in tests/app. Euler keeps its ownmeshshared_ptr copy and never goes throughfv->mesh. Externalmpiaccess through FV: essentially zero.
Phase 8: Code Cleanup – DONE (3 of 4 items; 1 preserved)¶
Removed ~170 lines of pure debug output across 7 files. Deleted the incomplete
FWBAP_LE_Multiway limiter stub. Fixed bcWeight condition for periodic faces.
# |
Item |
Status |
Notes |
|---|---|---|---|
8.1 |
Delete debug prints ( |
Done |
~170 lines removed from 7 files; all commented-out experimental/alternative code preserved per policy |
8.2 |
Replace |
Done |
Completed in Phase 4 ( |
8.3 |
Remove |
Preserved |
Useful for IDE support; deliberately kept |
8.4 |
Delete incomplete |
Done |
Empty TODO body with no callers; 11 lines removed |
Behavior change: ConstructBaseAndWeight() face weight condition changed
from FaceIDIsExternalBC(...) || FaceIDIsPeriodic(...) to
FaceIDIsExternalBC(...) only — periodic faces are now treated as internal
faces for weight settings, not assigned bcWeight.
3. Implementation Order and Risk Assessment¶
Phase |
Risk |
Effort |
Prerequisite |
Status |
|---|---|---|---|---|
Phase 0: Unit Tests |
None |
1 day |
– |
DONE |
Phase 1: Safety Fixes |
Low |
0.5 day |
Phase 0 |
DONE |
Phase 2: Deduplication |
Low |
2-3 days |
Phase 0 |
DONE (4/6) |
Phase 3: Module Boundaries |
Low |
1 day |
Phase 0 |
DONE (1/2) |
Phase 4: FFaceFunctional |
Medium |
1-2 days |
Phase 2 |
DONE |
Phase 5: VR Decomposition |
Medium-High |
3-5 days |
Phase 4 |
DONE (2/4) |
Phase 6: FV Separation + Device Mixin |
High |
4-6 days |
Phase 0 |
DONE |
Phase 7: API Hygiene |
Medium |
2-3 days |
Phase 5, 6 |
SKIPPED |
Phase 8: Cleanup |
Trivial |
0.5 day |
Any time |
DONE (3/4) |
Each phase must pass the full test suite (pytest test/ + C++ ctest) after
completion. Phase 0 tests serve as the regression guard for all subsequent
phases.
4. Dependency Graph After Refactoring¶
DNDS (Core)
DeviceTransferable<T> -- CRTP mixin for device management
^
Geom (Mesh, Elements, Quadrature, Base Functions)
^
CFV/
FiniteVolume -- geometric metrics (core)
DOFFactory (free fns) -- array allocation helpers
VRBaseWeight -- base functions, caches, face weights
VRCoefficients -- reconstruction matrices
Limiters (free fns) -- individual limiter functions
VariationalReconstruction<dim> -- facade composing above
ModelEvaluator -- advection-diffusion test physics (stays in CFV)
BenchmarkFiniteVolume -- benchmark kernels (renamed from test_FiniteVolume)
^ ^
| Euler (uses full VR facade + DOFFactory)
|
EulerP (uses only FiniteVolume)
5. Typical VR Configurations (for Testing Reference)¶
Based on analysis of ~90 case files, two configurations dominate:
HQM Production (~85% of real cases):
{
"maxOrder": 3, "intOrder": 5, "intOrderVR": 5,
"cacheDiffBase": true, "SORInstead": false, "jacobiRelax": 1.0,
"smoothThreshold": 0.001, "WBAP_nStd": 10.0,
"subs2ndOrder": 1, "subs2ndOrderGGScheme": 0,
"baseSettings": {"localOrientation": false, "anisotropicLengths": false},
"functionalSettings": {
"scaleType": "BaryDiff",
"dirWeightScheme": "HQM_OPT",
"geomWeightScheme": "HQM_SD",
"geomWeightPower": 0.5, "geomWeightBias": 1
}
}
Default Factorial (~15% of cases, code default):
{
"maxOrder": 3, "intOrder": 5,
"SORInstead": true, "jacobiRelax": 1.0,
"smoothThreshold": 0.01, "WBAP_nStd": 10.0,
"subs2ndOrder": 0,
"baseSettings": {"localOrientation": false, "anisotropicLengths": false},
"functionalSettings": {
"scaleType": "BaryDiff",
"dirWeightScheme": "Factorial",
"geomWeightScheme": "GWNone"
}
}
6. Final Summary¶
Refactoring completed on branch dev/harry_refac1 (9 commits from
baseline c774b89). All Phase 0 regression tests pass at np=1,2,4.
Commits¶
# |
Hash |
Phase |
Summary |
|---|---|---|---|
1 |
|
0 |
17 test cases, 50 assertions in |
2 |
|
1 |
Thread-unsafe static, raw new/delete, |
3 |
|
2 |
|
4 |
|
3 |
|
5 |
|
4 |
|
6 |
|
5 |
|
7 |
|
6 |
|
8 |
|
8 |
~170 lines debug prints removed, |
9 |
(this) |
doc |
Final documentation with Phase 7 skip rationale |
What was achieved¶
Safety: Thread-unsafe statics, raw memory management, missing fallthrough annotations, settings slicing – all fixed.
Deduplication: ~480 lines removed via 6 extracted helpers.
Structure: VR data split into
VRBaseWeight+VRCoefficients; DOF allocation extracted toDOFFactory.hpp; device management unified inDeviceTransferable<T>CRTP mixin.Naming: Benchmark files renamed; 33 array pairs given descriptive names; error messages include object identity.
Cleanup: ~170 lines of debug prints removed; dead stub deleted; periodic face weight bug fixed.
Testing: 17 regression test cases (50 assertions) at np=1,2,4; 5 CUDA device transfer tests.
What was deliberately skipped (with rationale)¶
Phase 2.2/2.3:
DoLimiterWBAP_Cvs_3andGetBoundaryRHSvsGetBoundaryRHSDiff– structural differences too deep for clean merge.Phase 3.1: ModelEvaluator move to
src/Model/– blocked by Python binding coupling; stays in CFV.Phase 5c/5d: Reconstruction/limiter methods as free functions – already in separate
.hxxfiles; rewriting explicit instantiation infrastructure for no functional benefit.Phase 6a EulerP: DeviceTransferable mixin for EulerP classes – manual delegation pattern is appropriate for heterogeneous sub-object composition.
Phase 7.1:
BoundaryContextparam struct – stable 8-param interface, Euler lambdas lack test coverage, cosmetic-only.Phase 7.2: Const accessors for VR->FV – protected access is the intended C++ pattern for inheritance; only worthwhile if switching to composition.
Phase 7.3: Private
mesh/mpi– ~897mesh->sites;getMesh()wrapper provides no real encapsulation gain.Phase 8.3:
DNDS_SWITCH_INTELLISENSEhack – useful for IDE support, deliberately preserved.