DNDSR 0.1.0.dev1+gcd065ad
Distributed Numeric Data Structure for CFV
Loading...
Searching...
No Matches
CFV Module Refactoring Plan

Branch: dev/harry_refac1 Baseline commit: c774b89 (on dev/harry_refac1) Date: 2026-04-15


1. Current State

The CFV module (~8,000 lines in src/CFV/) implements physics-agnostic Compact Finite Volume discretizations. Its core is a two-level class hierarchy:

FiniteVolume (1,262 lines) -- geometric metrics, array factories, device mgmt
^ (inheritance)
VariationalReconstruction<dim> (3,538 lines) -- reconstruction, limiters, base functions

Supporting files: Limiters.hpp (789 lines), VRDefines.hpp (types), VRSettings.hpp / FiniteVolumeSettings.hpp (configuration), ModelEvaluator (test physics), test_FiniteVolume (benchmark kernels).

Physics is injected via callbacks (TFBoundary, tFGetBoundaryWeight, TFTrans), which is a clean Strategy pattern. The implementation has accumulated structural debt that this plan addresses.

Module Dependency Graph (current)

DNDS (Core)
^
Geom (Mesh, Elements, Quadrature, Base Functions)
^
CFV (FiniteVolume, VariationalReconstruction, Limiters, ModelEvaluator)
^ ^
| Euler -- uses full VR + VRDefines types (deep structural dep)
EulerP -- uses only FiniteVolume (clean composition)

2. Refactoring Phases

Phase 0: Unit Tests for Correctness (DO FIRST) – DONE

Goal: Establish baseline correctness tests before touching any CFV code. Golden reference values captured from commit c774b89.

Implementation: C++ doctest in test/cpp/CFV/test_Reconstruction.cpp. CMake target: cfv_test_reconstruction. CTest names: cfv_reconstruction_np1, cfv_reconstruction_np2, cfv_reconstruction_np4.

All iterative VR uses Jacobi iteration (SORInstead=false) to ensure deterministic results across MPI partitions. Golden values are constant for all np.

Error metric: L1 pointwise error at cell quadrature points (intOrder >= 5), normalized by domain volume, so its dimension matches the field function.

Meshes:

  • Uniform_3x3_wall.cgns (9 quads, wall BC, [-1,2]^2) – polynomial tests
  • IV10_10.cgns (100 quads, periodic, [0,10]^2) + bisections 0,1,2 -> 100, 400, 1600 cells – convergence series on structured mesh
  • IV10U_10.cgns (322 tris, periodic, [0,10]^2) + bisections 0,1,2 -> 322, ~1288, ~5152 cells – convergence series on unstructured mesh

Reconstruction methods tested:

  • Gauss-Green (explicit 2nd-order gradient, DoReconstruction2ndGrad)
  • VFV P1, P2, P3 with HQM weights (dirWeightScheme=HQM_OPT, geomWeightScheme=HQM_SD)
  • VFV P1, P3 with default Factorial weights

Test cases (17 total, 50 assertions):

Category Tests Assertion
Wall/constant (5 methods) u=1 on wall mesh L1 err < 1e-12 (exact)
Wall/linear (5 methods) u=x+2y on wall mesh Sanity check < 10
Wall/quadratic (3 methods) u=x^2+y^2 Sanity check < 10
Wall/cubic (2 methods) u=x^3+xy^2 Sanity check < 10
Periodic convergence (11 configs x 3 bisections) sin*cos and cos+cos on IV10/IV10U Golden value match (1e-6 rel)
Convergence check P2-HQM on IV10 base Converges in < 200 Jacobi iters

Convergence series (golden, representative):

Mesh Method bis=0 bis=1 bis=2 Rate
IV10 GG 1.56e-02 3.42e-03 7.89e-04 ~2
IV10 P1-HQM 4.66e-02 9.30e-03 1.53e-03 ~2.5
IV10 P2-HQM 3.05e-03 2.31e-04 2.44e-05 ~3.5
IV10 P3-HQM 1.91e-03 4.67e-05 1.49e-06 ~5
IV10 P3-Def 1.88e-03 2.50e-05 8.37e-07 ~5
IV10U P3-HQM 1.59e-04 9.88e-06 4.34e-07 ~4.5

Verified passing: np=1, np=2, np=4. Total CTest time ~10s.

Build and run:

cmake --build build -t cfv_test_reconstruction -j8
ctest --test-dir build -R cfv_reconstruction --output-on-failure

Phase 1: Safety Fixes – DONE

Fix latent bugs that could cause silent incorrect results.

# Issue Location Fix
1.1 Thread-unsafe static local in template Reconstruction.hxx:418 Replace with local variable
1.2 Raw new[]/delete[] in limiter helper Limiters.hpp:315-331 Use Eigen::VectorXd or stack array
1.3 Settings slicing (two divergent copies) VR.hpp:44-48 Single settings instance; base reads via accessor
1.4 Missing [[fallthrough]] annotations VR.cpp:386-397, 405-416 Add annotations

Phase 2: Code Deduplication – DONE (4 of 6 items; 2 skipped)

~480 lines removed. Items 2.2 and 2.3 were skipped after analysis showed the structural differences between the paired functions are too deep for a clean merge; forcing unification would reduce readability without meaningful maintenance benefit.

# Duplication Status Fix Lines Saved
2.1 Periodic transform block (4 copies) Done ApplyPeriodicTransform() variadic template method ~40
2.2 DoLimiterWBAP_C vs DoLimiterWBAP_3 Skipped Sweep strategies differ fundamentally (successive SR vs. three-sweep)
2.3 GetBoundaryRHS vs GetBoundaryRHSDiff Skipped Diff version has different callback signature, no GG1 block, no u subtraction
2.4 Polynomial norm in Limiters.hpp (8 copies) Done PolynomialSquaredNorm<dim>() + PolynomialDotProduct<dim>() ~300
2.5 LimStart/LimEnd DOF index lookup (2 copies) Done GetRecDOFRange<dim>() in VRDefines.hpp ~60
2.6 Biway limiter dispatch switch (2 copies) Done DispatchBiwayLimiter<dim, nVarsFixed>() ~30

New helpers introduced:

  • VRDefines.hpp: GetRecDOFRange<dim>(pOrder) – returns {LimStart, LimEnd} pair
  • Limiters.hpp: PolynomialSquaredNorm<dim>(theta), PolynomialDotProduct<dim>(t1, t2)
  • VariationalReconstruction.hpp: ApplyPeriodicTransform(if2c, faceID, data...) – variadic CRTP
  • LimiterProcedure.hxx: DispatchBiwayLimiter<dim, nVarsFixed>(alter, u1, u2, out, n)

Verified: Phase 0 tests pass at np=1,2,4. Euler target compiles clean.

Phase 3: Module Boundary Corrections – DONE (1 of 2 items; 1 deferred)

# Item Status Action
3.1 ModelEvaluator (concrete physics model inside CFV) Deferred Added placement comment; actual move to src/Model/ blocked by Python binding coupling (CFV.ModelEvaluator)
3.2 test_FiniteVolume* (benchmark kernels with test_ prefix) Done Renamed to BenchmarkFiniteVolume*; updated CMake, includes, pybind11 C++ function names; Python-visible names unchanged

Phase 4: Extract <tt>FFaceFunctional</tt> – DONE

  • File: VariationalReconstruction.hpp
  • What was done:
    • Extracted AccumulateDiffOrderContributions<dim, powV>() free function template that replaces 4 copies of the switch(cnDiffs) fallthrough pattern (~160 lines removed)
    • Replaced #define __POWV / #undef __POWV macro with constexpr int powV
    • Uses compile-time Eigen::segment<N>() instead of dynamic {6,7,8,9} initializer-list slicing
    • Replaced std::pow(faceL, order*2) with integer multiplications
    • FFaceFunctional body remains inline in VariationalReconstruction.hpp (moving to separate .hxx provides no functional benefit given the above improvements)

Phase 5: Decompose <tt>VariationalReconstruction</tt> into Composed Sub-Objects – DONE (2 of 4 items; 2 skipped)

Split the monolithic class's data members into two nested sub-objects:

Component Status Owns
VRBaseWeight Done cellBaseMoment, faceWeight, diff-base caches (4), faceAlignedScales, faceMajorCoordScale, bndVRCaches
VRCoefficients Done matrixAB, vectorB, matrixAAInvB, vectorAInvB, matrixSecondary, matrixAHalf_GG, matrixA, Cholesky caches (4)
VRReconstructor (free functions) Skipped Methods already in separate _Reconstruction.hxx; making free functions would require rewriting all _explicit_instantiation/ files for no practical benefit
VRLimiter (free functions) Skipped Same reason; already in separate _LimiterProcedure.hxx

VariationalReconstruction<dim> holds baseWeight_ and coefficients_ members and delegates through them. The public API is preserved unchanged. All methods continue to live on the VR class but access data through the sub-objects.

Phase 6: Separate <tt>FiniteVolume</tt> Concerns – DONE

Concern Current State Refactored
Geometric metrics 18 protected array pairs + Construct*() Keep in FiniteVolume (core responsibility)
Array factory / DOF builder MakePairDefaultOnCell, BuildUDof, etc. DOF logic extracted to free functions in DOFFactory.hpp; FV methods are thin wrappers
Device management to_host(), to_device(), device(), deviceView() Extracted to DeviceTransferable<Derived> CRTP mixin (see Phase 6a)

Additional improvements:

  • MakePairDefaultOnCell/MakePairDefaultOnFace now require a name parameter (33 call sites updated with descriptive names like "FV::volumeLocal", "VR::matrixA")
  • Array::to_device()/to_host()/deviceView() error messages now include the array's object identity
  • ArrayPair::deviceView() error messages include the array's object identity

Phase 6a: Unified Device Management Mixin – DONE

Implemented: DNDS/DeviceTransferable.hpp — CRTP mixin providing to_device(), to_host(), device(), getDeviceArrayBytes(). Derived class provides for_each_device_member(F&&).

Classes migrated:

  • CFV::FiniteVolume — inherits DeviceTransferable<FiniteVolume>, delegates via for_each_device_memberdevice_array_list()
  • Geom::UnstructuredMesh — inherits DeviceTransferable<UnstructuredMesh>, delegates via for_each_device_memberop_on_device_arrays() (conditional 4-group iteration preserved)

Classes skipped (documented rationale):

  • EulerP::Evaluator, BCHandler, Physics, BC — use manual delegation pattern (calling to_device on heterogeneous sub-objects of different types: shared_ptr<FV>, shared_ptr<BCHandler>, host_device_vector, individual ArrayPairs). The mixin's tuple-based MemberRef iteration doesn't fit without significant restructuring. The manual delegation is appropriate here since these classes compose other DeviceTransferable objects.

CUDA unit tests: test/cpp/CFV/test_DeviceTransferable.cu — 5 tests:

  1. test_device_view_trivially_copyable — compile-time static_assert + host view correctness
  2. test_device_state_trackingdevice() returns correct backend through transfer lifecycle
  3. test_host_device_transfer — Host backend preserves data access
  4. test_cuda_round_trip — host→CUDA kernel read→host verification of cell volumes, barycenters, face areas
  5. test_cuda_multiple_round_trips — 3 round-trips with bitwise value stability

Phase 7: API Hygiene and Encapsulation – SKIPPED

All three items were evaluated and skipped. The cost/benefit ratio does not justify the changes given the current codebase state after Phases 0-6+8.

# Item Status Rationale
7.1 BoundaryContext<nVarsFixed> param struct Skipped Stable interface (8/9 params), 6 invocation sites + 8 lambda definitions to update atomically; Euler lambdas lack Phase 0 test coverage; low value for a cosmetic change
7.2 Const accessors for VR→FV protected members Skipped 11 protected members accessed directly by VR (23 reads, all read-only except 1 settings write); protected access is the intended C++ pattern for inheritance; only worthwhile if switching to composition, which was already evaluated and skipped in Phase 5
7.3 Make mesh/mpi private Skipped mesh-> appears ~897 times across the codebase; making private requires ~60 external call-site changes for a getMesh() wrapper that provides no real encapsulation gain (the shared_ptr grants full mutability anyway)

Detailed findings (preserved for future reference):

  • TFBoundary: 8 params (UBL, UMEAN, iCell, iFace, ig, Norm, pPhy, fType). TFBoundaryDiff: 9 params (adds dUMEAN as 2nd arg). Defined in VariationalReconstruction.hpp:885-904. Invoked in 6 sites (all in _Reconstruction.hxx). Lambda definitions: 4 in src/Euler/, 1 in ModelEvaluator.hpp, 1 in EulerSolver.hxx (diff), 2 in test code.
  • VR directly accesses 11 FV protected members (23 total reads). Heaviest: axisSymmetric (5), faceMeanNorm (4), cellAlignedHBox (2), cellMajorHBox (2), cellMajorCoord (2). 5 members lack any public accessor: cellAlignedHBox, cellMajorCoord, periodicity, cellCent, faceCent. External code (Euler/EulerP) never touches protected members.
  • mesh is public ssp<UnstructuredMesh>; mpi is public MPIInfo value. External mesh access: ~25 sites in EulerP (fv->mesh->), ~35 in tests/app. Euler keeps its own mesh shared_ptr copy and never goes through fv->mesh. External mpi access through FV: essentially zero.

Phase 8: Code Cleanup – DONE (3 of 4 items; 1 preserved)

Removed ~170 lines of pure debug output across 7 files. Deleted the incomplete FWBAP_LE_Multiway limiter stub. Fixed bcWeight condition for periodic faces.

# Item Status Notes
8.1 Delete debug prints (std::cout/printf/std::abort) Done ~170 lines removed from 7 files; all commented-out experimental/alternative code preserved per policy
8.2 Replace #define __POWV/#undef Done Completed in Phase 4 (constexpr int powV)
8.3 Remove DNDS_SWITCH_INTELLISENSE hack Preserved Useful for IDE support; deliberately kept
8.4 Delete incomplete FWBAP_LE_Multiway Done Empty TODO body with no callers; 11 lines removed

Behavior change: ConstructBaseAndWeight() face weight condition changed from FaceIDIsExternalBC(...) || FaceIDIsPeriodic(...) to FaceIDIsExternalBC(...) only — periodic faces are now treated as internal faces for weight settings, not assigned bcWeight.


3. Implementation Order and Risk Assessment

Phase Risk Effort Prerequisite Status
Phase 0: Unit Tests None 1 day DONE
Phase 1: Safety Fixes Low 0.5 day Phase 0 DONE
Phase 2: Deduplication Low 2-3 days Phase 0 DONE (4/6)
Phase 3: Module Boundaries Low 1 day Phase 0 DONE (1/2)
Phase 4: FFaceFunctional Medium 1-2 days Phase 2 DONE
Phase 5: VR Decomposition Medium-High 3-5 days Phase 4 DONE (2/4)
Phase 6: FV Separation + Device Mixin High 4-6 days Phase 0 DONE
Phase 7: API Hygiene Medium 2-3 days Phase 5, 6 SKIPPED
Phase 8: Cleanup Trivial 0.5 day Any time DONE (3/4)

Each phase must pass the full test suite (pytest test/ + C++ ctest) after completion. Phase 0 tests serve as the regression guard for all subsequent phases.


4. Dependency Graph After Refactoring

DNDS (Core)
DeviceTransferable<T> -- CRTP mixin for device management
^
Geom (Mesh, Elements, Quadrature, Base Functions)
^
CFV/
FiniteVolume -- geometric metrics (core)
DOFFactory (free fns) -- array allocation helpers
VRBaseWeight -- base functions, caches, face weights
VRCoefficients -- reconstruction matrices
Limiters (free fns) -- individual limiter functions
VariationalReconstruction<dim> -- facade composing above
ModelEvaluator -- advection-diffusion test physics (stays in CFV)
BenchmarkFiniteVolume -- benchmark kernels (renamed from test_FiniteVolume)
^ ^
| Euler (uses full VR facade + DOFFactory)
|
EulerP (uses only FiniteVolume)

5. Typical VR Configurations (for Testing Reference)

Based on analysis of ~90 case files, two configurations dominate:

HQM Production (~85% of real cases):

{
"maxOrder": 3, "intOrder": 5, "intOrderVR": 5,
"cacheDiffBase": true, "SORInstead": false, "jacobiRelax": 1.0,
"smoothThreshold": 0.001, "WBAP_nStd": 10.0,
"subs2ndOrder": 1, "subs2ndOrderGGScheme": 0,
"baseSettings": {"localOrientation": false, "anisotropicLengths": false},
"functionalSettings": {
"scaleType": "BaryDiff",
"dirWeightScheme": "HQM_OPT",
"geomWeightScheme": "HQM_SD",
"geomWeightPower": 0.5, "geomWeightBias": 1
}
}

Default Factorial (~15% of cases, code default):

{
"maxOrder": 3, "intOrder": 5,
"SORInstead": true, "jacobiRelax": 1.0,
"smoothThreshold": 0.01, "WBAP_nStd": 10.0,
"subs2ndOrder": 0,
"baseSettings": {"localOrientation": false, "anisotropicLengths": false},
"functionalSettings": {
"scaleType": "BaryDiff",
"dirWeightScheme": "Factorial",
"geomWeightScheme": "GWNone"
}
}

6. Final Summary

Refactoring completed on branch dev/harry_refac1 (9 commits from baseline c774b89). All Phase 0 regression tests pass at np=1,2,4.

Commits

# Hash Phase Summary
1 6537b94 0 17 test cases, 50 assertions in test_Reconstruction.cpp
2 5523abe 1 Thread-unsafe static, raw new/delete, [[fallthrough]], settings slicing
3 700ed4f 2 GetRecDOFRange, PolynomialSquaredNorm, PolynomialDotProduct, ApplyPeriodicTransform, DispatchBiwayLimiter
4 54d9de9 3 test_FiniteVolume* -> BenchmarkFiniteVolume*; ModelEvaluator placement comment
5 048fe4d 4 AccumulateDiffOrderContributions with compile-time segment<N>(), integer powers, constexpr powV
6 959f751 5 VRBaseWeight (9 members) + VRCoefficients (12 members) nested sub-objects
7 217fc09 6 DeviceTransferable<Derived> CRTP mixin, DOF factory extraction, named arrays, 5 CUDA tests
8 f119cbc 8 ~170 lines debug prints removed, FWBAP_LE_Multiway deleted, periodic face weight fix
9 (this) doc Final documentation with Phase 7 skip rationale

What was achieved

  • Safety: Thread-unsafe statics, raw memory management, missing fallthrough annotations, settings slicing – all fixed.
  • Deduplication: ~480 lines removed via 6 extracted helpers.
  • Structure: VR data split into VRBaseWeight + VRCoefficients; DOF allocation extracted to DOFFactory.hpp; device management unified in DeviceTransferable<T> CRTP mixin.
  • Naming: Benchmark files renamed; 33 array pairs given descriptive names; error messages include object identity.
  • Cleanup: ~170 lines of debug prints removed; dead stub deleted; periodic face weight bug fixed.
  • Testing: 17 regression test cases (50 assertions) at np=1,2,4; 5 CUDA device transfer tests.

What was deliberately skipped (with rationale)

  • Phase 2.2/2.3: DoLimiterWBAP_C vs _3 and GetBoundaryRHS vs GetBoundaryRHSDiff – structural differences too deep for clean merge.
  • Phase 3.1: ModelEvaluator move to src/Model/ – blocked by Python binding coupling; stays in CFV.
  • Phase 5c/5d: Reconstruction/limiter methods as free functions – already in separate .hxx files; rewriting explicit instantiation infrastructure for no functional benefit.
  • Phase 6a EulerP: DeviceTransferable mixin for EulerP classes – manual delegation pattern is appropriate for heterogeneous sub-object composition.
  • Phase 7.1: BoundaryContext param struct – stable 8-param interface, Euler lambdas lack test coverage, cosmetic-only.
  • Phase 7.2: Const accessors for VR->FV – protected access is the intended C++ pattern for inheritance; only worthwhile if switching to composition.
  • Phase 7.3: Private mesh/mpi – ~897 mesh-> sites; getMesh() wrapper provides no real encapsulation gain.
  • Phase 8.3: DNDS_SWITCH_INTELLISENSE hack – useful for IDE support, deliberately preserved.