|
DNDSR 0.1.0.dev1+gcd065ad
Distributed Numeric Data Structure for CFV
|
Branch: dev/harry_refac1 Baseline commit: c774b89 (on dev/harry_refac1) Date: 2026-04-15
The CFV module (~8,000 lines in src/CFV/) implements physics-agnostic Compact Finite Volume discretizations. Its core is a two-level class hierarchy:
Supporting files: Limiters.hpp (789 lines), VRDefines.hpp (types), VRSettings.hpp / FiniteVolumeSettings.hpp (configuration), ModelEvaluator (test physics), test_FiniteVolume (benchmark kernels).
Physics is injected via callbacks (TFBoundary, tFGetBoundaryWeight, TFTrans), which is a clean Strategy pattern. The implementation has accumulated structural debt that this plan addresses.
Goal: Establish baseline correctness tests before touching any CFV code. Golden reference values captured from commit c774b89.
Implementation: C++ doctest in test/cpp/CFV/test_Reconstruction.cpp. CMake target: cfv_test_reconstruction. CTest names: cfv_reconstruction_np1, cfv_reconstruction_np2, cfv_reconstruction_np4.
All iterative VR uses Jacobi iteration (SORInstead=false) to ensure deterministic results across MPI partitions. Golden values are constant for all np.
Error metric: L1 pointwise error at cell quadrature points (intOrder >= 5), normalized by domain volume, so its dimension matches the field function.
Meshes:
Uniform_3x3_wall.cgns (9 quads, wall BC, [-1,2]^2) – polynomial testsIV10_10.cgns (100 quads, periodic, [0,10]^2) + bisections 0,1,2 -> 100, 400, 1600 cells – convergence series on structured meshIV10U_10.cgns (322 tris, periodic, [0,10]^2) + bisections 0,1,2 -> 322, ~1288, ~5152 cells – convergence series on unstructured meshReconstruction methods tested:
DoReconstruction2ndGrad)Test cases (17 total, 50 assertions):
| Category | Tests | Assertion |
|---|---|---|
| Wall/constant (5 methods) | u=1 on wall mesh | L1 err < 1e-12 (exact) |
| Wall/linear (5 methods) | u=x+2y on wall mesh | Sanity check < 10 |
| Wall/quadratic (3 methods) | u=x^2+y^2 | Sanity check < 10 |
| Wall/cubic (2 methods) | u=x^3+xy^2 | Sanity check < 10 |
| Periodic convergence (11 configs x 3 bisections) | sin*cos and cos+cos on IV10/IV10U | Golden value match (1e-6 rel) |
| Convergence check | P2-HQM on IV10 base | Converges in < 200 Jacobi iters |
Convergence series (golden, representative):
| Mesh | Method | bis=0 | bis=1 | bis=2 | Rate |
|---|---|---|---|---|---|
| IV10 | GG | 1.56e-02 | 3.42e-03 | 7.89e-04 | ~2 |
| IV10 | P1-HQM | 4.66e-02 | 9.30e-03 | 1.53e-03 | ~2.5 |
| IV10 | P2-HQM | 3.05e-03 | 2.31e-04 | 2.44e-05 | ~3.5 |
| IV10 | P3-HQM | 1.91e-03 | 4.67e-05 | 1.49e-06 | ~5 |
| IV10 | P3-Def | 1.88e-03 | 2.50e-05 | 8.37e-07 | ~5 |
| IV10U | P3-HQM | 1.59e-04 | 9.88e-06 | 4.34e-07 | ~4.5 |
Verified passing: np=1, np=2, np=4. Total CTest time ~10s.
Build and run:
Fix latent bugs that could cause silent incorrect results.
| # | Issue | Location | Fix |
|---|---|---|---|
| 1.1 | Thread-unsafe static local in template | Reconstruction.hxx:418 | Replace with local variable |
| 1.2 | Raw new[]/delete[] in limiter helper | Limiters.hpp:315-331 | Use Eigen::VectorXd or stack array |
| 1.3 | Settings slicing (two divergent copies) | VR.hpp:44-48 | Single settings instance; base reads via accessor |
| 1.4 | Missing [[fallthrough]] annotations | VR.cpp:386-397, 405-416 | Add annotations |
~480 lines removed. Items 2.2 and 2.3 were skipped after analysis showed the structural differences between the paired functions are too deep for a clean merge; forcing unification would reduce readability without meaningful maintenance benefit.
| # | Duplication | Status | Fix | Lines Saved |
|---|---|---|---|---|
| 2.1 | Periodic transform block (4 copies) | Done | ApplyPeriodicTransform() variadic template method | ~40 |
| 2.2 | DoLimiterWBAP_C vs DoLimiterWBAP_3 | Skipped | Sweep strategies differ fundamentally (successive SR vs. three-sweep) | – |
| 2.3 | GetBoundaryRHS vs GetBoundaryRHSDiff | Skipped | Diff version has different callback signature, no GG1 block, no u subtraction | – |
| 2.4 | Polynomial norm in Limiters.hpp (8 copies) | Done | PolynomialSquaredNorm<dim>() + PolynomialDotProduct<dim>() | ~300 |
| 2.5 | LimStart/LimEnd DOF index lookup (2 copies) | Done | GetRecDOFRange<dim>() in VRDefines.hpp | ~60 |
| 2.6 | Biway limiter dispatch switch (2 copies) | Done | DispatchBiwayLimiter<dim, nVarsFixed>() | ~30 |
New helpers introduced:
VRDefines.hpp: GetRecDOFRange<dim>(pOrder) – returns {LimStart, LimEnd} pairLimiters.hpp: PolynomialSquaredNorm<dim>(theta), PolynomialDotProduct<dim>(t1, t2)VariationalReconstruction.hpp: ApplyPeriodicTransform(if2c, faceID, data...) – variadic CRTPLimiterProcedure.hxx: DispatchBiwayLimiter<dim, nVarsFixed>(alter, u1, u2, out, n)Verified: Phase 0 tests pass at np=1,2,4. Euler target compiles clean.
| # | Item | Status | Action |
|---|---|---|---|
| 3.1 | ModelEvaluator (concrete physics model inside CFV) | Deferred | Added placement comment; actual move to src/Model/ blocked by Python binding coupling (CFV.ModelEvaluator) |
| 3.2 | test_FiniteVolume* (benchmark kernels with test_ prefix) | Done | Renamed to BenchmarkFiniteVolume*; updated CMake, includes, pybind11 C++ function names; Python-visible names unchanged |
VariationalReconstruction.hppAccumulateDiffOrderContributions<dim, powV>() free function template that replaces 4 copies of the switch(cnDiffs) fallthrough pattern (~160 lines removed)#define __POWV / #undef __POWV macro with constexpr int powVEigen::segment<N>() instead of dynamic {6,7,8,9} initializer-list slicingstd::pow(faceL, order*2) with integer multiplicationsVariationalReconstruction.hpp (moving to separate .hxx provides no functional benefit given the above improvements)Split the monolithic class's data members into two nested sub-objects:
| Component | Status | Owns |
|---|---|---|
VRBaseWeight | Done | cellBaseMoment, faceWeight, diff-base caches (4), faceAlignedScales, faceMajorCoordScale, bndVRCaches |
VRCoefficients | Done | matrixAB, vectorB, matrixAAInvB, vectorAInvB, matrixSecondary, matrixAHalf_GG, matrixA, Cholesky caches (4) |
VRReconstructor (free functions) | Skipped | Methods already in separate _Reconstruction.hxx; making free functions would require rewriting all _explicit_instantiation/ files for no practical benefit |
VRLimiter (free functions) | Skipped | Same reason; already in separate _LimiterProcedure.hxx |
VariationalReconstruction<dim> holds baseWeight_ and coefficients_ members and delegates through them. The public API is preserved unchanged. All methods continue to live on the VR class but access data through the sub-objects.
| Concern | Current State | Refactored |
|---|---|---|
| Geometric metrics | 18 protected array pairs + Construct*() | Keep in FiniteVolume (core responsibility) |
| Array factory / DOF builder | MakePairDefaultOnCell, BuildUDof, etc. | DOF logic extracted to free functions in DOFFactory.hpp; FV methods are thin wrappers |
| Device management | to_host(), to_device(), device(), deviceView() | Extracted to DeviceTransferable<Derived> CRTP mixin (see Phase 6a) |
Additional improvements:
MakePairDefaultOnCell/MakePairDefaultOnFace now require a name parameter (33 call sites updated with descriptive names like "FV::volumeLocal", "VR::matrixA")Array::to_device()/to_host()/deviceView() error messages now include the array's object identityArrayPair::deviceView() error messages include the array's object identityImplemented: DNDS/DeviceTransferable.hpp — CRTP mixin providing to_device(), to_host(), device(), getDeviceArrayBytes(). Derived class provides for_each_device_member(F&&).
Classes migrated:
CFV::FiniteVolume — inherits DeviceTransferable<FiniteVolume>, delegates via for_each_device_member → device_array_list()Geom::UnstructuredMesh — inherits DeviceTransferable<UnstructuredMesh>, delegates via for_each_device_member → op_on_device_arrays() (conditional 4-group iteration preserved)Classes skipped (documented rationale):
EulerP::Evaluator, BCHandler, Physics, BC — use manual delegation pattern (calling to_device on heterogeneous sub-objects of different types: shared_ptr<FV>, shared_ptr<BCHandler>, host_device_vector, individual ArrayPairs). The mixin's tuple-based MemberRef iteration doesn't fit without significant restructuring. The manual delegation is appropriate here since these classes compose other DeviceTransferable objects.CUDA unit tests: test/cpp/CFV/test_DeviceTransferable.cu — 5 tests:
test_device_view_trivially_copyable — compile-time static_assert + host view correctnesstest_device_state_tracking — device() returns correct backend through transfer lifecycletest_host_device_transfer — Host backend preserves data accesstest_cuda_round_trip — host→CUDA kernel read→host verification of cell volumes, barycenters, face areastest_cuda_multiple_round_trips — 3 round-trips with bitwise value stabilityAll three items were evaluated and skipped. The cost/benefit ratio does not justify the changes given the current codebase state after Phases 0-6+8.
| # | Item | Status | Rationale |
|---|---|---|---|
| 7.1 | BoundaryContext<nVarsFixed> param struct | Skipped | Stable interface (8/9 params), 6 invocation sites + 8 lambda definitions to update atomically; Euler lambdas lack Phase 0 test coverage; low value for a cosmetic change |
| 7.2 | Const accessors for VR→FV protected members | Skipped | 11 protected members accessed directly by VR (23 reads, all read-only except 1 settings write); protected access is the intended C++ pattern for inheritance; only worthwhile if switching to composition, which was already evaluated and skipped in Phase 5 |
| 7.3 | Make mesh/mpi private | Skipped | mesh-> appears ~897 times across the codebase; making private requires ~60 external call-site changes for a getMesh() wrapper that provides no real encapsulation gain (the shared_ptr grants full mutability anyway) |
Detailed findings (preserved for future reference):
TFBoundary: 8 params (UBL, UMEAN, iCell, iFace, ig, Norm, pPhy, fType). TFBoundaryDiff: 9 params (adds dUMEAN as 2nd arg). Defined in VariationalReconstruction.hpp:885-904. Invoked in 6 sites (all in _Reconstruction.hxx). Lambda definitions: 4 in src/Euler/, 1 in ModelEvaluator.hpp, 1 in EulerSolver.hxx (diff), 2 in test code.axisSymmetric (5), faceMeanNorm (4), cellAlignedHBox (2), cellMajorHBox (2), cellMajorCoord (2). 5 members lack any public accessor: cellAlignedHBox, cellMajorCoord, periodicity, cellCent, faceCent. External code (Euler/EulerP) never touches protected members.mesh is public ssp<UnstructuredMesh>; mpi is public MPIInfo value. External mesh access: ~25 sites in EulerP (fv->mesh->), ~35 in tests/app. Euler keeps its own mesh shared_ptr copy and never goes through fv->mesh. External mpi access through FV: essentially zero.Removed ~170 lines of pure debug output across 7 files. Deleted the incomplete FWBAP_LE_Multiway limiter stub. Fixed bcWeight condition for periodic faces.
| # | Item | Status | Notes |
|---|---|---|---|
| 8.1 | Delete debug prints (std::cout/printf/std::abort) | Done | ~170 lines removed from 7 files; all commented-out experimental/alternative code preserved per policy |
| 8.2 | Replace #define __POWV/#undef | Done | Completed in Phase 4 (constexpr int powV) |
| 8.3 | Remove DNDS_SWITCH_INTELLISENSE hack | Preserved | Useful for IDE support; deliberately kept |
| 8.4 | Delete incomplete FWBAP_LE_Multiway | Done | Empty TODO body with no callers; 11 lines removed |
Behavior change: ConstructBaseAndWeight() face weight condition changed from FaceIDIsExternalBC(...) || FaceIDIsPeriodic(...) to FaceIDIsExternalBC(...) only — periodic faces are now treated as internal faces for weight settings, not assigned bcWeight.
| Phase | Risk | Effort | Prerequisite | Status |
|---|---|---|---|---|
| Phase 0: Unit Tests | None | 1 day | – | DONE |
| Phase 1: Safety Fixes | Low | 0.5 day | Phase 0 | DONE |
| Phase 2: Deduplication | Low | 2-3 days | Phase 0 | DONE (4/6) |
| Phase 3: Module Boundaries | Low | 1 day | Phase 0 | DONE (1/2) |
| Phase 4: FFaceFunctional | Medium | 1-2 days | Phase 2 | DONE |
| Phase 5: VR Decomposition | Medium-High | 3-5 days | Phase 4 | DONE (2/4) |
| Phase 6: FV Separation + Device Mixin | High | 4-6 days | Phase 0 | DONE |
| Phase 7: API Hygiene | Medium | 2-3 days | Phase 5, 6 | SKIPPED |
| Phase 8: Cleanup | Trivial | 0.5 day | Any time | DONE (3/4) |
Each phase must pass the full test suite (pytest test/ + C++ ctest) after completion. Phase 0 tests serve as the regression guard for all subsequent phases.
Based on analysis of ~90 case files, two configurations dominate:
HQM Production (~85% of real cases):
Default Factorial (~15% of cases, code default):
Refactoring completed on branch dev/harry_refac1 (9 commits from baseline c774b89). All Phase 0 regression tests pass at np=1,2,4.
| # | Hash | Phase | Summary |
|---|---|---|---|
| 1 | 6537b94 | 0 | 17 test cases, 50 assertions in test_Reconstruction.cpp |
| 2 | 5523abe | 1 | Thread-unsafe static, raw new/delete, [[fallthrough]], settings slicing |
| 3 | 700ed4f | 2 | GetRecDOFRange, PolynomialSquaredNorm, PolynomialDotProduct, ApplyPeriodicTransform, DispatchBiwayLimiter |
| 4 | 54d9de9 | 3 | test_FiniteVolume* -> BenchmarkFiniteVolume*; ModelEvaluator placement comment |
| 5 | 048fe4d | 4 | AccumulateDiffOrderContributions with compile-time segment<N>(), integer powers, constexpr powV |
| 6 | 959f751 | 5 | VRBaseWeight (9 members) + VRCoefficients (12 members) nested sub-objects |
| 7 | 217fc09 | 6 | DeviceTransferable<Derived> CRTP mixin, DOF factory extraction, named arrays, 5 CUDA tests |
| 8 | f119cbc | 8 | ~170 lines debug prints removed, FWBAP_LE_Multiway deleted, periodic face weight fix |
| 9 | (this) | doc | Final documentation with Phase 7 skip rationale |
VRBaseWeight + VRCoefficients; DOF allocation extracted to DOFFactory.hpp; device management unified in DeviceTransferable<T> CRTP mixin.DoLimiterWBAP_C vs _3 and GetBoundaryRHS vs GetBoundaryRHSDiff – structural differences too deep for clean merge.src/Model/ – blocked by Python binding coupling; stays in CFV..hxx files; rewriting explicit instantiation infrastructure for no functional benefit.BoundaryContext param struct – stable 8-param interface, Euler lambdas lack test coverage, cosmetic-only.mesh/mpi – ~897 mesh-> sites; getMesh() wrapper provides no real encapsulation gain.DNDS_SWITCH_INTELLISENSE hack – useful for IDE support, deliberately preserved.