DNDSR 0.1.0.dev1+gcd065ad
Distributed Numeric Data Structure for CFV
VectorUtils.hpp
#pragma once
/// @file VectorUtils.hpp
/// @brief Small utilities for MPI-indexed type layouts (hindexed optimisation).

#include "Defines.hpp"
#include "MPI.hpp"

namespace DNDS
{
    /**
     * @brief Coalesce contiguous blocks in an `MPI_Type_create_hindexed` layout.
     *
     * @details MPI derived-type performance depends heavily on the number of
     * blocks. If two consecutive blocks happen to be adjacent in memory
     * (`disps[i+1] == disps[i] + blk_sizes[i] * sizeofElem`), they can be
     * merged into a single larger block. This helper walks the input arrays
     * and returns a compacted `(n_size, blk_sizes, disps)` tuple.
     *
     * Used by @ref DNDS::ArrayTransformer "ArrayTransformer"::createMPITypes before calling the actual
     * `MPI_Type_create_hindexed`.
     *
     * @tparam TBlkSiz Element type of the block-size array (e.g., `MPI_int`).
     * @tparam TDisp Element type of the displacement array (e.g., `MPI_Aint`).
     * @tparam TSizeof Element-size type (usually `MPI_Aint`).
     *
     * @param o_size Number of input blocks.
     * @param blk_sizes Block sizes (in element count).
     * @param disps Displacements (in bytes).
     * @param sizeofElem Element size in bytes (matches @ref MPI_Type_extent).
     * @return `(new_size, new_blk_sizes, new_disps)` -- a merged layout.
     */
    template <class TBlkSiz, class TDisp, class TSizeof>
    auto optimize_hindexed_layout(index o_size, TBlkSiz *blk_sizes, TDisp *disps, TSizeof sizeofElem)
    {
        index n_size = 0;
        for (index i = 0; i < o_size; i++)
        {
            while (i + 1 < o_size && (blk_sizes[i] * sizeofElem + disps[i] == disps[i + 1]))
                i++;
            n_size++;
        }
        std::vector<TBlkSiz> new_blk_sizes(n_size, 0);
        std::vector<TDisp> new_disps(n_size);
        n_size = 0;
        for (index i = 0; i < o_size; i++)
        {
            new_blk_sizes[n_size] += blk_sizes[i];
            new_disps[n_size] = disps[i];
            while (i + 1 < o_size && (blk_sizes[i] * sizeofElem + disps[i] == disps[i + 1]))
            {
                i++;
                new_blk_sizes[n_size] += blk_sizes[i];
            }
            n_size++;
        }
        return std::make_tuple(n_size, new_blk_sizes, new_disps);
    }
}
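The merging rule described in the comment above can be illustrated with a small stand-alone sketch. This is not the DNDS implementation: `coalesce_blocks_sketch` and `example_layout` are hypothetical names, the sketch uses a single pass with `push_back` instead of the two-pass count-then-fill scheme in the listing, and it substitutes `int` and `std::ptrdiff_t` for `MPI_int` and `MPI_Aint` so that it builds without the DNDS or MPI headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Stand-ins so the sketch builds without DNDS or MPI headers (assumptions):
using index = std::int64_t;  // mirrors DNDS::index
using BlkSize = int;         // block lengths, in elements
using Disp = std::ptrdiff_t; // stands in for MPI_Aint (byte displacements)

// Hypothetical single-pass variant of the coalescing idea: a block is merged
// into the previous output block when it starts exactly where that one ends.
inline std::tuple<index, std::vector<BlkSize>, std::vector<Disp>>
coalesce_blocks_sketch(const std::vector<BlkSize> &blk_sizes,
                       const std::vector<Disp> &disps,
                       Disp sizeofElem)
{
    assert(blk_sizes.size() == disps.size());
    std::vector<BlkSize> out_sizes;
    std::vector<Disp> out_disps;
    for (std::size_t i = 0; i < disps.size(); i++)
    {
        if (!out_disps.empty() &&
            out_disps.back() + out_sizes.back() * sizeofElem == disps[i])
        {
            out_sizes.back() += blk_sizes[i]; // adjacent: extend previous block
        }
        else
        {
            out_sizes.push_back(blk_sizes[i]); // gap: start a new block
            out_disps.push_back(disps[i]);
        }
    }
    return std::make_tuple(static_cast<index>(out_disps.size()), out_sizes, out_disps);
}

// Four blocks of 8-byte elements; the first three are back-to-back in memory
// (0..16, 16..40, 40..48), the fourth starts at byte 64 after a gap.
inline auto example_layout()
{
    std::vector<BlkSize> sizes{2, 3, 1, 4};
    std::vector<Disp> disps{0, 16, 40, 64};
    return coalesce_blocks_sketch(sizes, disps, 8);
}
```

With this input the four blocks collapse to two: one of 6 elements at byte 0 and one of 4 elements at byte 64. The two-pass form in the listing sizes its output vectors exactly before filling them, while the single-pass form here trades that for simpler control flow.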
Defines.hpp: Core type aliases, constants, and metaprogramming utilities for the DNDS framework.
MPI.hpp: MPI wrappers (MPIInfo, collective operations, type mapping, CommStrategy).
index (int64_t): Global row / DOF index type (signed 64-bit; handles multi-billion-cell meshes). Defined at Defines.hpp:107.