Class ILKernelGenerator

Namespace
NumSharp.Backends.Kernels
Assembly
NumSharp.dll

Generates per-chunk IL kernels for NDIter-driven execution.

Kernels emitted here are called as the inner loop of an NDIter iteration — once per chunk, with dataptrs/strides/count provided by the iterator. The kernel does no axis or stride walking of its own.

Add new kernel families in ILKernelGenerator.<Op>.cs partial files. See DirectILKernelGenerator for the legacy whole-array kernels currently being migrated to this model.
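The per-chunk contract in the summary above can be sketched without any IL emission. This is only an illustrative model: the `InnerLoop` delegate, the `ChunkKernelSketch` class, and the use of managed arrays with element offsets in place of raw data pointers and byte strides are all assumptions, not the library's actual NDInnerLoopFunc signature.

```csharp
using System;

public static class ChunkKernelSketch
{
    // Hypothetical stand-in for the inner-loop delegate: one array, one start
    // offset and one element stride per operand, plus the chunk length.
    public delegate void InnerLoop(double[][] data, long[] offsets, long[] strides, long count);

    // Add-reduce chunk: operand 0 is the input, operand 1 is the output.
    // The kernel only steps by the strides it is handed; a stride of 0 on the
    // output makes every element fold into the same slot.
    public static void AddChunk(double[][] data, long[] offsets, long[] strides, long count)
    {
        long src = offsets[0];
        long dst = offsets[1];
        for (long i = 0; i < count; i++)
        {
            data[1][dst] += data[0][src];
            src += strides[0];
            dst += strides[1];
        }
    }

    // Plays the role of the iterator for a single chunk.
    public static double Demo()
    {
        var input = new double[] { 1, 2, 3, 4, 5, 6 };
        var output = new double[] { 0 };
        InnerLoop kernel = AddChunk;
        // Visit every second input element (1, 3, 5) and fold into output[0].
        kernel(new[] { input, output }, new long[] { 0, 0 }, new long[] { 2, 0 }, 3);
        return output[0];
    }
}
```

In this model the caller decides which elements form a chunk and how far apart they are; the kernel itself never inspects shapes or axes.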

public static class ILKernelGenerator
Inheritance
ILKernelGenerator

Methods

GetCumSumInnerLoop(NPTypeCode, NPTypeCode)

Returns the NDIter-driven cumulative-sum inner loop for the given input type (inType) and accumulator type (accType). The returned delegate matches NDInnerLoopFunc; drive it with an iterator whose scan axis has been removed (see DefaultEngine.AccumulateAxis), passing a pointer to an ILKernelGenerator.ScanAxisAux as the kernel's auxdata.

public static NDInnerLoopFunc GetCumSumInnerLoop(NPTypeCode inType, NPTypeCode accType)

Parameters

inType NPTypeCode
accType NPTypeCode

Returns

NDInnerLoopFunc
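One plausible reading of the "scan axis removed" arrangement is sketched below: the iterator visits only the remaining axes, and the auxdata tells the kernel how to walk the scan axis for each visited position. The `ScanAux` struct and its fields are hypothetical (the real ScanAxisAux layout is not shown on this page), and managed arrays with element offsets stand in for raw pointers.

```csharp
using System;

public static class CumSumSketch
{
    // Hypothetical auxdata describing the scan axis that the iterator no longer walks.
    public readonly struct ScanAux
    {
        public readonly long Length, InStride, OutStride;
        public ScanAux(long length, long inStride, long outStride)
        {
            Length = length; InStride = inStride; OutStride = outStride;
        }
    }

    // For each of the `count` positions in the chunk, run a prefix sum along
    // the scan axis described by `aux`.
    public static void CumSumChunk(double[] input, long inStart, long inStep,
                                   double[] output, long outStart, long outStep,
                                   long count, ScanAux aux)
    {
        for (long i = 0; i < count; i++)
        {
            long src = inStart + i * inStep;
            long dst = outStart + i * outStep;
            double acc = 0;
            for (long k = 0; k < aux.Length; k++)
            {
                acc += input[src];
                output[dst] = acc;
                src += aux.InStride;
                dst += aux.OutStride;
            }
        }
    }

    // 2x3 row-major matrix, cumulative sum along axis 1 (within each row).
    public static double[] AlongAxis1()
    {
        var input = new double[] { 1, 2, 3, 4, 5, 6 };
        var output = new double[6];
        CumSumChunk(input, 0, 3, output, 0, 3, 2, new ScanAux(3, 1, 1));
        return output;
    }

    // Same matrix, cumulative sum along axis 0 (down each column).
    public static double[] AlongAxis0()
    {
        var input = new double[] { 1, 2, 3, 4, 5, 6 };
        var output = new double[6];
        CumSumChunk(input, 0, 1, output, 0, 1, 3, new ScanAux(2, 3, 3));
        return output;
    }
}
```

The same kernel serves both axes; only the chunk step and the auxdata strides change.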

GetReduceInnerLoop(ReduceKernelKey)

Returns the cached per-chunk reduction kernel for the given (op, input, accumulator) triple, or null when no NDIter-driven kernel exists yet, in which case the caller falls back to the DirectILKernelGenerator path. The returned delegate matches NDInnerLoopFunc; hand it to an iterator built by NewReduce(NDArray, NDArray, int, NDIterGlobalFlags).

public static NDInnerLoopFunc GetReduceInnerLoop(ILKernelGenerator.ReduceKernelKey key)

Parameters

key ILKernelGenerator.ReduceKernelKey

Returns

NDInnerLoopFunc
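The "cached kernel or null" pattern described above might look roughly like the following. Everything here is a stand-in: the `Op` enum, the tuple key, and the `Func<double, double, double>` fold are hypothetical substitutes for ReduceKernelKey and NDInnerLoopFunc, and no IL is emitted.

```csharp
using System;
using System.Collections.Concurrent;

public static class KernelCacheSketch
{
    public enum Op { Sum, Prod, Median }

    // One cache entry per (op, input type, accumulator type) triple.
    // A null value records that no per-chunk kernel exists for that triple.
    private static readonly ConcurrentDictionary<(Op, Type, Type), Func<double, double, double>> Cache =
        new ConcurrentDictionary<(Op, Type, Type), Func<double, double, double>>();

    private static Func<double, double, double> Build(Op op)
    {
        switch (op)
        {
            case Op.Sum: return (acc, x) => acc + x;
            case Op.Prod: return (acc, x) => acc * x;
            default: return null; // not migrated yet
        }
    }

    public static Func<double, double, double> TryGetFold(Op op, Type input, Type acc)
        => Cache.GetOrAdd((op, input, acc), key => Build(key.Item1));

    // Caller-side dispatch: use the per-chunk kernel when present,
    // otherwise take the legacy path.
    public static string Dispatch(Op op)
        => TryGetFold(op, typeof(double), typeof(double)) != null ? "per-chunk" : "legacy";
}
```

Returning null rather than throwing keeps the fallback decision with the caller, which suits a migration where only some operations have per-chunk kernels.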

MeanDivideByCount(NDArray, long)

Divide every element of output by count in place — the post-pass that turns an accumulated axis Sum into a Mean. For Complex this divides both components by the real count (NumPy: mean = sum / n), which is exactly what the legacy MeanAxisComplex did per element but without its per-output-row NDArray allocation. Writes through SetAtIndex(object, long) so any output layout is honored.

public static void MeanDivideByCount(NDArray output, long count)

Parameters

output NDArray
count long
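The Complex case described above amounts to dividing the real and imaginary parts by the same real count. A minimal sketch on a plain `Complex[]` follows; the `MeanSketch` class is hypothetical and bypasses NDArray and SetAtIndex entirely, so it ignores output layout.

```csharp
using System;
using System.Numerics;

public static class MeanSketch
{
    // In-place post-pass: turns accumulated sums into means.
    public static void DivideByCount(Complex[] sums, long count)
    {
        double n = count;
        for (int i = 0; i < sums.Length; i++)
        {
            // Both components are divided by the real count n.
            sums[i] = new Complex(sums[i].Real / n, sums[i].Imaginary / n);
        }
    }

    public static Complex[] Demo()
    {
        var sums = new[] { new Complex(2, 4), new Complex(6, -3) };
        DivideByCount(sums, 2);
        return sums;
    }
}
```

Because the buffer is updated in place, this form needs no temporary array per output row.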

SeedReduceIdentity(NDArray, ReductionOp)

Pre-fill output with the reduction identity for op before driving a REDUCE iterator. Required because the per-chunk kernels fold into the existing output slot(s). Writes through SetAtIndex(object, long) so any output layout (contiguous fresh alloc or user-supplied view) is honored.

public static void SeedReduceIdentity(NDArray output, ReductionOp op)

Parameters

output NDArray
op ReductionOp
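Why seeding matters can be seen in a small model where a reduction is folded chunk by chunk into one output slot. The `Op` enum and the identity values below are assumptions: they are the mathematical identities for double, and the library's own per-type choices for ReductionOp may differ.

```csharp
using System;

public static class SeedSketch
{
    public enum Op { Sum, Prod, Min, Max }

    // Mathematical identities for double (assumed, not taken from the library).
    public static double Identity(Op op)
    {
        switch (op)
        {
            case Op.Sum: return 0.0;
            case Op.Prod: return 1.0;
            case Op.Min: return double.PositiveInfinity;
            case Op.Max: return double.NegativeInfinity;
            default: throw new ArgumentOutOfRangeException(nameof(op));
        }
    }

    public static double Fold(Op op, double acc, double x)
    {
        switch (op)
        {
            case Op.Sum: return acc + x;
            case Op.Prod: return acc * x;
            case Op.Min: return Math.Min(acc, x);
            case Op.Max: return Math.Max(acc, x);
            default: throw new ArgumentOutOfRangeException(nameof(op));
        }
    }

    public static double Reduce(Op op, double[] values, int chunkSize)
    {
        // Seed first: each chunk folds into whatever the slot already holds.
        var output = new double[] { Identity(op) };
        for (int start = 0; start < values.Length; start += chunkSize)
        {
            int end = Math.Min(start + chunkSize, values.Length);
            for (int i = start; i < end; i++)
                output[0] = Fold(op, output[0], values[i]);
        }
        return output[0];
    }
}
```

If the slot were left at a freshly allocated 0, a product would stay 0 and a minimum over positive values would wrongly report 0; seeding with the identity avoids both.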