Class NDIterCoalescing

Namespace
NumSharp.Backends.Iteration
Assembly
NumSharp.dll

Axis coalescing logic for NDIter. Merges adjacent compatible axes to reduce iteration overhead.

NUMSHARP DIVERGENCE: This implementation supports an unlimited number of dimensions. It uses StridesNDim for stride array indexing (allocated based on the actual ndim).

public static class NDIterCoalescing
Inheritance
NDIterCoalescing
Inherited Members

Methods

CoalesceAxes(ref NDIterState)

Coalesce adjacent axes that have compatible strides for all operands. Reduces ndim, improving iteration efficiency.

public static void CoalesceAxes(ref NDIterState state)

Parameters

state NDIterState
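
The merge rule can be illustrated with a small model. This is only a sketch, written in Python rather than C# so that it does not depend on the layout of NDIterState, which this page does not document. The function name `coalesce_axes`, the list-based representation, and the "innermost axis first" ordering are assumptions made for illustration; strides are in elements. Two adjacent axes are merged when `stride[i] * shape[i] == stride[i+1]` holds for every operand.

```python
def coalesce_axes(shape, strides_per_operand):
    """Merge adjacent axes whose strides line up for every operand.

    shape: axis lengths, innermost axis first (hypothetical layout).
    strides_per_operand: one stride list per operand, same axis order.
    Returns (new_shape, new_strides_per_operand).
    """
    if not shape:
        return [], [[] for _ in strides_per_operand]

    out_shape = [shape[0]]
    out_strides = [[op[0]] for op in strides_per_operand]

    for axis in range(1, len(shape)):
        inner_len = out_shape[-1]
        # The outer axis must start exactly where the inner axis ends,
        # and this must hold for all operands at once.
        mergeable = all(
            merged[-1] * inner_len == op[axis]
            for merged, op in zip(out_strides, strides_per_operand)
        )
        if mergeable:
            out_shape[-1] *= shape[axis]  # ndim shrinks by one
        else:
            out_shape.append(shape[axis])
            for merged, op in zip(out_strides, strides_per_operand):
                merged.append(op[axis])
    return out_shape, out_strides


# A C-contiguous (2, 3, 4) array has element strides (12, 4, 1);
# listed innermost first that is shape [4, 3, 2], strides [1, 4, 12].
single = coalesce_axes([4, 3, 2], [[1, 4, 12]])

# Adding a second operand with a gap between rows blocks the first merge.
pair = coalesce_axes([4, 3, 2], [[1, 4, 12], [1, 8, 24]])
```

In this model the contiguous operand alone collapses to a single axis of length 24, while the pair of operands keeps two axes because the second operand cannot merge its two innermost axes.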

FlipNegativeStrides(ref NDIterState)

Flip axes with all-negative strides for memory-order iteration.

This mirrors NumPy's npyiter_flip_negative_strides():

  • For each axis, check whether ALL operands have negative or zero strides
  • If so, negate the strides, adjust the base pointers to start at the end, and mark the axis as flipped in the Perm array (perm[d] = -1 - perm[d])
  • Set the NEGPERM flag and clear IDENTPERM

This allows the iterator to traverse memory in ascending order even for reversed arrays, improving cache efficiency.

public static bool FlipNegativeStrides(ref NDIterState state)

Parameters

state NDIterState

Iterator state to modify

Returns

bool

True if any axes were flipped
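
The flip step can be modeled as follows. This is a sketch under stated assumptions, not the NumSharp implementation: the function name, the list-based state, and the use of element offsets in place of base pointers are hypothetical. It also requires at least one strictly negative stride on the axis before flipping, which is how NumPy's routine behaves; the description above only states the "negative or zero" condition, so treat that extra check as an assumption here.

```python
def flip_negative_strides(shape, strides, base_offsets, perm):
    """Flip every axis whose strides are all <= 0 (and not all zero).

    shape: axis lengths. strides: one stride list per operand.
    base_offsets: starting element offset per operand (stands in for pointers).
    perm: internal-axis to original-axis mapping.
    All arguments are modified in place. Returns True if any axis flipped.
    """
    flipped_any = False
    for axis, length in enumerate(shape):
        axis_strides = [op[axis] for op in strides]
        if all(s <= 0 for s in axis_strides) and any(s < 0 for s in axis_strides):
            for op_index, op in enumerate(strides):
                # Move the start to the lowest address touched on this axis,
                # then walk forward instead of backward.
                base_offsets[op_index] += (length - 1) * op[axis]
                op[axis] = -op[axis]
            perm[axis] = -1 - perm[axis]  # negative entry marks a flipped axis
            flipped_any = True
    return flipped_any


# A reversed 5-element view: starts at element 4 and steps by -1.
shape, strides, bases, perm = [5], [[-1]], [4], [0]
changed = flip_negative_strides(shape, strides, bases, perm)
```

After the call the model iterates from offset 0 with stride 1, and the Perm entry 0 has become -1, recording that the axis is traversed in reverse relative to the original view.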

IsContiguousForCoalescing(ref NDIterState)

Check if all operands are contiguous in the current internal axis order. This determines whether coalescing would preserve the iteration semantics for C/F order iteration.

For coalescing to preserve iteration order, every operand must be contiguous in the sense that stride[i] * shape[i] == stride[i+1] holds for each pair of adjacent axes i and i+1.

public static bool IsContiguousForCoalescing(ref NDIterState state)

Parameters

state NDIterState

Returns

bool
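
A minimal sketch of the stated condition, in Python for illustration only. The function name and list-based arguments are hypothetical, axes are listed in internal order, and only the rule quoted above is checked; any further conditions the real method applies are not documented on this page.

```python
def is_contiguous_for_coalescing(shape, strides_per_operand):
    """Return True when stride[i] * shape[i] == stride[i+1] holds for every
    adjacent axis pair of every operand (axes in internal order)."""
    for op in strides_per_operand:
        for axis in range(len(shape) - 1):
            if op[axis] * shape[axis] != op[axis + 1]:
                return False
    return True


# Shape [4, 3, 2] in internal order with element strides [1, 4, 12]:
# each axis starts where the previous one ends, so the check passes.
dense = is_contiguous_for_coalescing([4, 3, 2], [[1, 4, 12]])

# Strides [1, 8, 24] leave a gap after every run of 4 elements.
gapped = is_contiguous_for_coalescing([4, 3, 2], [[1, 4, 12], [1, 8, 24]])
```

A single failing operand is enough to make the whole check fail, which matches the "all operands" wording above.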

RemoveUnitAxes(ref NDIterState)

Remove size-1 axes from the internal representation. Each such axis contributes exactly one coordinate (always 0) and has stride 0 (the fill invariant), so removing it never changes the element-visit sequence. Removal restores a meaningful innermost axis for EXLOOP, kernels, and the buffer manager's linearity test.

NumPy reaches the same state through its UNCONDITIONAL npyiter_coalesce_axes, whose strict trivial branch absorbs every stride-0 size-1 axis. NumSharp's full coalesce is gated on all operands being contiguous, so the non-coalesced branch calls this method instead. Without it, a trailing size-1 axis sits innermost and collapses EXLOOP to one-element inner loops: (N,1) strided views ran N kernel invocations of count 1.

Must NOT be used when a multi-index or flat index is tracked: index reconstruction needs the original axis structure (NumPy likewise skips coalescing there).

public static void RemoveUnitAxes(ref NDIterState state)

Parameters

state NDIterState
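
The claim that removal leaves the visit sequence unchanged can be checked with a small model. This is a sketch only: `remove_unit_axes` and `visit_offsets` are hypothetical helpers, axes are listed innermost first, and strides are in elements. The model lets ndim drop to 0 when every axis has size 1; whether NumSharp keeps a minimum of one axis is not stated on this page.

```python
from itertools import product


def remove_unit_axes(shape, strides_per_operand):
    """Drop every size-1 axis (innermost axis first in both inputs)."""
    keep = [axis for axis, length in enumerate(shape) if length != 1]
    new_shape = [shape[axis] for axis in keep]
    new_strides = [[op[axis] for axis in keep] for op in strides_per_operand]
    return new_shape, new_strides


def visit_offsets(shape, strides):
    """Element offsets in iteration order, with axis 0 varying fastest."""
    offsets = []
    # product() varies its last argument fastest, so feed the axes reversed.
    for index in product(*(range(n) for n in reversed(shape))):
        offsets.append(sum(i * s for i, s in zip(index, reversed(strides))))
    return offsets


# An (N, 1) strided view with N = 5: the size-1 axis sits innermost.
shape, strides = [1, 5], [[0, 3]]
new_shape, new_strides = remove_unit_axes(shape, strides)

before = visit_offsets(shape, strides[0])
after = visit_offsets(new_shape, new_strides[0])
```

Before removal the innermost loop has length 1, so a kernel would be invoked once per element. After removal the length-5 axis is innermost and the same offsets are visited in the same order.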

ReorderAxes(ref NDIterState)

Reorder axes for optimal memory access pattern. Prioritizes axes with stride=1 as innermost.

[Obsolete("Use ReorderAxesForCoalescing with order parameter instead")]
public static void ReorderAxes(ref NDIterState state)

Parameters

state NDIterState

ReorderAxesForCoalescing(ref NDIterState, NPY_ORDER, bool)

Reorder axes for iteration based on the specified order. This is called BEFORE CoalesceAxes to enable full coalescing of contiguous arrays.

Order semantics (matching NumPy):

  • C-order (NPY_CORDER): Last axis innermost (row-major logical order). Forces axes to [n-1, n-2, ..., 0] order regardless of memory layout.
  • F-order (NPY_FORTRANORDER): First axis innermost (column-major logical order). Forces axes to [0, 1, ..., n-1] order regardless of memory layout.
  • K-order (NPY_KEEPORDER): Follow memory layout (smallest stride innermost). Sorts by stride to maximize cache efficiency.
  • A-order (NPY_ANYORDER): Same as K-order.

The Perm array tracks the mapping Perm[internal_axis] = original_axis. This allows GetMultiIndex to return coordinates in the original axis order.

public static void ReorderAxesForCoalescing(ref NDIterState state, NPY_ORDER order, bool forCoalescing = true)

Parameters

state NDIterState

Iterator state to modify

order NPY_ORDER

Iteration order

forCoalescing bool

If true, sort for coalescing (ascending). If false, sort for memory-order iteration with MULTI_INDEX (descending). Only affects K-order; C and F orders are deterministic.
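
The order semantics and the Perm mapping can be modeled for a single operand as follows. This is a sketch, not the NumSharp code: `reorder_perm` and `to_original_index` are hypothetical names, orders are passed as one-letter strings instead of NPY_ORDER values, and sorting by absolute stride in K-order is an assumption. How the real method combines strides across several operands is not described on this page and is not modeled here.

```python
def reorder_perm(order, strides, for_coalescing=True):
    """Return perm with perm[internal_axis] = original_axis.

    Internal axis 0 is treated as the innermost axis.
    strides: element strides of one operand in original axis order.
    """
    ndim = len(strides)
    if order == "C":  # last original axis innermost
        return list(range(ndim - 1, -1, -1))
    if order == "F":  # first original axis innermost
        return list(range(ndim))
    if order in ("K", "A"):
        # Ascending puts the smallest stride innermost; descending is the
        # variant described for forCoalescing = false.
        return sorted(range(ndim), key=lambda axis: abs(strides[axis]),
                      reverse=not for_coalescing)
    raise ValueError(f"unknown order: {order!r}")


def to_original_index(perm, internal_index):
    """Map coordinates in internal axis order back to original axis order."""
    original = [0] * len(perm)
    for internal_axis, original_axis in enumerate(perm):
        original[original_axis] = internal_index[internal_axis]
    return original


c_strides = [12, 4, 1]  # C-contiguous (2, 3, 4) array
f_strides = [1, 2, 6]   # F-contiguous (2, 3, 4) array

perm_c = reorder_perm("C", f_strides)  # fixed by the order, ignores layout
perm_k = reorder_perm("K", f_strides)  # follows the memory layout
coords = to_original_index(reorder_perm("C", c_strides), [3, 1, 0])
```

For the F-contiguous operand, C-order still yields the reversed permutation while K-order yields the identity, which illustrates why only K-order is affected by the forCoalescing flag.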

TryCoalesceInner(ref NDIterState)

Try to coalesce the inner dimension for better vectorization. Returns true if inner loop size increased.

public static bool TryCoalesceInner(ref NDIterState state)

Parameters

state NDIterState

Returns

bool