Class DirectILKernelGenerator

Namespace
NumSharp.Backends.Kernels
Assembly
NumSharp.dll

Binary operations (same-type) - contiguous kernels and generic helpers.

public static class DirectILKernelGenerator
Inheritance
DirectILKernelGenerator
Fields

VectorBits

Detected vector width at startup: 512, 256, 128, or 0 (no SIMD).

public static readonly int VectorBits

Field Value

int

VectorBytes

Number of bytes per vector register.

public static readonly int VectorBytes

Field Value

int
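
As an illustration of how a startup width probe like this could work, the sketch below picks the widest accelerated vector size reported by the runtime. It assumes .NET 8 or later (for Vector512); the class and method names are hypothetical and this is not NumSharp's actual detection code.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical helper, not the NumSharp implementation.
public static class VectorWidthSketch
{
    // Returns 512, 256, 128, or 0 depending on what the runtime accelerates.
    public static int DetectBits()
    {
        if (Vector512.IsHardwareAccelerated) return 512;
        if (Vector256.IsHardwareAccelerated) return 256;
        if (Vector128.IsHardwareAccelerated) return 128;
        return 0; // no SIMD available
    }

    // Bytes per register follow directly from the bit width.
    public static int BytesFor(int bits) => bits / 8;
}
```

Under these assumptions, a 256-bit machine would report 32 bytes per register, which is 8 floats or 4 doubles per vector.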

Properties

Enabled

Whether IL generation is enabled. Can be disabled for debugging.

public static bool Enabled { get; set; }

Property Value

bool

Name

Provider name for diagnostics.

public static string Name { get; }

Property Value

string

Methods

Clip(NPTypeCode, ClipMode, ClipBoundsKind, void*, void*, long, void*, void*)

Run a clip operation. Picks (and on first call, IL-generates) the appropriate DynamicMethod for the (dtype, mode, kind) tuple and invokes it with the supplied pointers.

public static void Clip(NPTypeCode dtype, DirectILKernelGenerator.ClipMode mode, DirectILKernelGenerator.ClipBoundsKind kind, void* src, void* dst, long size, void* lo, void* hi)

Parameters

dtype NPTypeCode
mode DirectILKernelGenerator.ClipMode
kind DirectILKernelGenerator.ClipBoundsKind
src void*
dst void*
size long
lo void*
hi void*
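
The "pick, and on first call generate" behavior described above is a get-or-create cache keyed by the (dtype, mode, kind) tuple. The sketch below shows that pattern only. The integer keys stand in for the real enums, the cached lambda stands in for an IL-generated DynamicMethod, and every name in it is hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch of a per-tuple kernel cache; not the NumSharp implementation.
public static class ClipCacheSketch
{
    // Counts how many times a kernel was actually built (for illustration).
    public static int BuildCount;

    private static readonly ConcurrentDictionary<(int, int, int), Func<double, double, double, double>> Cache = new();

    // The first call for a key builds the kernel; later calls are a dictionary lookup.
    public static Func<double, double, double, double> Get(int dtype, int mode, int kind) =>
        Cache.GetOrAdd((dtype, mode, kind), _ =>
        {
            BuildCount++;
            // Stand-in for IL generation: clamp x into [lo, hi].
            return (x, lo, hi) => Math.Min(Math.Max(x, lo), hi);
        });
}
```

Note that ConcurrentDictionary.GetOrAdd may run the factory more than once under contention; a production cache that must build exactly once would need extra care (for example, storing Lazy values).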

GetArgwhereCountKernel(Type)

IL-emitted count of non-zero elements. Returns null only when Enabled is false: every supported dtype has a kernel (SIMD where Vector<T> exists, scalar IL via op_Inequality otherwise).

public static ArgwhereCountKernel GetArgwhereCountKernel(Type elementType)

Parameters

elementType Type

Returns

ArgwhereCountKernel

GetArgwhereExpandKernel()

IL-emitted coord expand (singleton — same kernel handles any ndim).

public static ArgwhereExpandKernel GetArgwhereExpandKernel()

Returns

ArgwhereExpandKernel

GetArgwhereFlatKernel(Type)

IL-emitted bit-scan that writes flat indices of non-zero elements into a pre-sized buffer.

public static ArgwhereFlatKernel GetArgwhereFlatKernel(Type elementType)

Parameters

elementType Type

Returns

ArgwhereFlatKernel
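
The count and flat kernels suggest a two-pass scheme: count the non-zero elements to size the buffer, then write their flat indices. A scalar reference version for int data might look like the sketch below. The names are hypothetical, and the real kernels are IL-emitted and use SIMD where available.

```csharp
using System;

// Hypothetical scalar reference for the two-pass argwhere scheme.
public static class ArgwhereSketch
{
    // Pass 1: count non-zero elements so the output can be pre-sized.
    public static long CountNonZero(ReadOnlySpan<int> data)
    {
        long n = 0;
        foreach (int v in data)
            if (v != 0) n++;
        return n;
    }

    // Pass 2: write the flat index of each non-zero element into dst.
    public static void FillFlatIndices(ReadOnlySpan<int> data, Span<long> dst)
    {
        int w = 0;
        for (int i = 0; i < data.Length; i++)
            if (data[i] != 0) dst[w++] = i;
    }

    public static long[] FlatNonZero(int[] data)
    {
        var dst = new long[CountNonZero(data)];
        FillFlatIndices(data, dst);
        return dst;
    }
}
```

For the input { 0, 3, 0, 0, 7, 1 }, FlatNonZero returns the indices 1, 4, 5.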

GetBinaryScalarDelegate(BinaryScalarKernelKey)

Get or generate an IL-based binary scalar delegate. Returns a Func<TLhs, TRhs, TResult> delegate.

public static Delegate GetBinaryScalarDelegate(BinaryScalarKernelKey key)

Parameters

key BinaryScalarKernelKey

Returns

Delegate
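
Because the method returns an untyped Delegate, callers cast it to the matching Func type. The sketch below shows the general technique with System.Reflection.Emit: it emits a DynamicMethod for int + int and exposes it as a Delegate. It is a minimal illustration with hypothetical names and does not use BinaryScalarKernelKey, whose fields are not shown on this page.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical sketch of emitting a binary scalar delegate with IL.
public static class ScalarAddEmitSketch
{
    public static Delegate BuildInt32Add()
    {
        var dm = new DynamicMethod("add_i4", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push lhs
        il.Emit(OpCodes.Ldarg_1); // push rhs
        il.Emit(OpCodes.Add);     // lhs + rhs
        il.Emit(OpCodes.Ret);     // return the sum
        return dm.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

A caller would write `var add = (Func<int, int, int>)ScalarAddEmitSketch.BuildInt32Add();` and then `add(2, 3)` evaluates to 5.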

GetComparisonKernel(ComparisonKernelKey)

Get or generate a comparison kernel for the specified key.

public static ComparisonKernel GetComparisonKernel(ComparisonKernelKey key)

Parameters

key ComparisonKernelKey

Returns

ComparisonKernel

GetComparisonScalarDelegate(ComparisonScalarKernelKey)

Get or generate a comparison scalar delegate. Returns a Func<TLhs, TRhs, bool> delegate.

public static Delegate GetComparisonScalarDelegate(ComparisonScalarKernelKey key)

Parameters

key ComparisonScalarKernelKey

Returns

Delegate

GetCopyKernel(CopyKernelKey)

public static CopyKernel GetCopyKernel(CopyKernelKey key)

Parameters

key CopyKernelKey

Returns

CopyKernel

GetCumulativeKernel(CumulativeKernelKey)

Get or generate a cumulative (scan) kernel. Returns a delegate that computes running accumulation over all elements.

public static CumulativeKernel GetCumulativeKernel(CumulativeKernelKey key)

Parameters

key CumulativeKernelKey

Returns

CumulativeKernel
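
A running accumulation over all elements can be pictured with the scalar cumulative-sum sketch below. The choice of sum as the operation and long as the accumulator type are assumptions made for the example, and the names are hypothetical.

```csharp
// Hypothetical scalar reference for a cumulative (scan) sum.
public static class CumSumSketch
{
    public static long[] CumSum(int[] src)
    {
        var dst = new long[src.Length];
        long acc = 0;
        for (int i = 0; i < src.Length; i++)
        {
            acc += src[i]; // running total up to and including element i
            dst[i] = acc;
        }
        return dst;
    }
}
```

For the input { 1, 2, 3, 4 } this produces 1, 3, 6, 10.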

GetFilterAxisKernel(long)

IL-emitted kernel cached by innerSize. Pass the actual innerSize you'll use at call time; the function buckets {1,2,4,8,16} into typed-copy variants and anything else into the bulk-cpblk variant. Returns null only when Enabled is false.

public static FilterAxisKernel GetFilterAxisKernel(long innerSize)

Parameters

innerSize long

Returns

FilterAxisKernel
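
The bucketing rule above keeps the cache small: only six kernel variants can exist regardless of how many distinct innerSize values are requested. A sketch of such a bucketing function follows. Using 0 as the key for the bulk variant is an assumption, and the names are hypothetical.

```csharp
// Hypothetical sketch of the innerSize bucketing described above.
public static class FilterAxisBucketSketch
{
    // Sizes 1, 2, 4, 8 and 16 get their own typed-copy variant;
    // every other size shares the bulk-copy variant (key 0 here).
    public static long BucketFor(long innerSize) =>
        innerSize is 1 or 2 or 4 or 8 or 16 ? innerSize : 0;
}
```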

GetIndicesKernel()

IL-emitted indices fill kernel (singleton — same kernel handles any ndim). Returns null only when Enabled is false.

public static IndicesKernel GetIndicesKernel()

Returns

IndicesKernel

GetMixedTypeKernel(MixedTypeKernelKey)

Get or generate a mixed-type kernel for the specified key.

public static MixedTypeKernel GetMixedTypeKernel(MixedTypeKernelKey key)

Parameters

key MixedTypeKernelKey

Returns

MixedTypeKernel

GetNonZeroPerDimKernel()

IL-emitted per-dim coord expander (singleton — same kernel handles any ndim). Returns null only when Enabled is false.

public static NonZeroPerDimKernel GetNonZeroPerDimKernel()

Returns

NonZeroPerDimKernel

GetPlaceKernel()

IL-emitted place kernel (singleton — same kernel handles any dtype via the elemBytes runtime argument). Returns null only when Enabled is false.

public static PlaceKernel GetPlaceKernel()

Returns

PlaceKernel

GetPutKernel()

IL-emitted put kernel (singleton — same kernel handles any dtype via the elemBytes runtime argument and any mode). Returns null only when Enabled is false.

public static PutKernel GetPutKernel()

Returns

PutKernel

GetRavelMultiIndexKernel()

IL-emitted multi→flat folder (singleton — same kernel handles any ndim, both orders, and arbitrary per-axis mode tuples). Returns null only when Enabled is false.

public static RavelMultiIndexKernel GetRavelMultiIndexKernel()

Returns

RavelMultiIndexKernel
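
Folding a multi-index into a flat index is a Horner-style accumulation over the axes, walked first-to-last for C order and last-to-first for F order. The sketch below shows that arithmetic only. It assumes every coordinate is already in range, so the per-axis modes are not modeled, and the names are hypothetical.

```csharp
// Hypothetical scalar reference for multi-index to flat-index folding.
public static class RavelSketch
{
    public static long Ravel(long[] coords, long[] shape, bool cOrder)
    {
        long flat = 0;
        if (cOrder)
        {
            // C order: the last axis varies fastest.
            for (int d = 0; d < shape.Length; d++)
                flat = flat * shape[d] + coords[d];
        }
        else
        {
            // F order: the first axis varies fastest.
            for (int d = shape.Length - 1; d >= 0; d--)
                flat = flat * shape[d] + coords[d];
        }
        return flat;
    }
}
```

For shape (3, 4) and coordinates (1, 2), C order gives 1 * 4 + 2 = 6 and F order gives 1 + 2 * 3 = 7.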

GetRepeatBroadcastKernel(int)

Returns the cached IL-emitted broadcast-repeat kernel for the given slab size. The first call for a size triggers IL generation; later calls are a dictionary lookup.

public static DirectILKernelGenerator.RepeatBroadcastKernel GetRepeatBroadcastKernel(int chunkBytes)

Parameters

chunkBytes int

Returns

DirectILKernelGenerator.RepeatBroadcastKernel

GetRepeatPerJKernel(int)

Returns the cached IL-emitted per-j repeat kernel for the given slab size.

public static DirectILKernelGenerator.RepeatPerJKernel GetRepeatPerJKernel(int chunkBytes)

Parameters

chunkBytes int

Returns

DirectILKernelGenerator.RepeatPerJKernel

GetSearchSortedKernel(NPTypeCode, bool, bool, bool)

Get or generate a searchsorted kernel.

public static DirectILKernelGenerator.SearchSortedKernel GetSearchSortedKernel(NPTypeCode type, bool leftSide, bool hasSorter, bool contiguousA)

Parameters

type NPTypeCode

Element dtype of a (and contiguous v, after caller normalizes).

leftSide bool

true = side='left', false = side='right'.

hasSorter bool

true = sorter param non-null (kernel emits sort_idx indirection).

contiguousA bool

true = a is contiguous (arrStrideBytes == elemSize). Lets the JIT use scaled-index addressing instead of imul, which closes the gap to NumPy on the random-key hot path. When false, arrStrideBytes is honored as a runtime parameter.

Returns

DirectILKernelGenerator.SearchSortedKernel
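
The leftSide and hasSorter flags correspond to two small variations in a binary search: which comparison moves the lower bound, and whether elements are read through an index array. The scalar sketch below illustrates both for double data. It ignores NaN ordering and strides, and the names are hypothetical.

```csharp
#nullable enable

// Hypothetical scalar reference for searchsorted with side and sorter options.
public static class SearchSortedSketch
{
    public static long Find(double[] a, double v, bool leftSide, long[]? sorter = null)
    {
        long lo = 0, hi = a.Length;
        while (lo < hi)
        {
            long mid = lo + (hi - lo) / 2;
            // With a sorter, a is read indirectly through the sort indices.
            double x = sorter is null ? a[mid] : a[sorter[mid]];
            // side='left' stops before equal elements, side='right' after them.
            bool goRight = leftSide ? x < v : x <= v;
            if (goRight) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

For a = { 1, 2, 2, 3 } and v = 2, the left side returns 1 and the right side returns 3. The same results come from the unsorted array { 3, 1, 2, 2 } with sorter { 1, 2, 3, 0 }.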

GetShiftArrayKernel<T>(bool)

Get or generate a shift kernel for element-wise shift amounts.

public static DirectILKernelGenerator.ShiftArrayKernel<T>? GetShiftArrayKernel<T>(bool isLeftShift) where T : unmanaged

Parameters

isLeftShift bool

True for left shift, false for right shift

Returns

DirectILKernelGenerator.ShiftArrayKernel<T>

Kernel delegate or null if not supported

Type Parameters

T

Integer element type

GetShiftScalarKernel<T>(bool)

Get or generate a SIMD-optimized shift kernel for uniform shift amount.

public static DirectILKernelGenerator.ShiftScalarKernel<T>? GetShiftScalarKernel<T>(bool isLeftShift) where T : unmanaged

Parameters

isLeftShift bool

True for left shift, false for right shift

Returns

DirectILKernelGenerator.ShiftScalarKernel<T>

Kernel delegate or null if not supported

Type Parameters

T

Integer element type
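
The two shift kernels differ only in where the shift amount comes from: one value for the whole array, or one value per element. A scalar sketch for int data is shown below with hypothetical names. It uses plain C# shift operators, which mask the count to the low 5 bits for int; whether the real kernels treat over-wide shifts the same way is not stated here.

```csharp
// Hypothetical scalar reference for uniform and element-wise shifts.
public static class ShiftSketch
{
    // Uniform shift: every element is shifted by the same amount.
    public static int[] ShiftScalar(int[] data, int amount, bool isLeftShift)
    {
        var dst = new int[data.Length];
        for (int i = 0; i < data.Length; i++)
            dst[i] = isLeftShift ? data[i] << amount : data[i] >> amount;
        return dst;
    }

    // Element-wise shift: element i is shifted by amounts[i].
    public static int[] ShiftArray(int[] data, int[] amounts, bool isLeftShift)
    {
        var dst = new int[data.Length];
        for (int i = 0; i < data.Length; i++)
            dst[i] = isLeftShift ? data[i] << amounts[i] : data[i] >> amounts[i];
        return dst;
    }
}
```

The uniform case is the natural SIMD candidate, since a single shift count can be applied to a whole vector at once.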

GetStridedUnaryKernel(UnaryKernelKey)

Get or generate a fused strided-SIMD unary kernel for the given key. Gating (same-width SIMD-capable op, supported dtype) is the caller's responsibility.

public static StridedUnaryKernel GetStridedUnaryKernel(UnaryKernelKey key)

Parameters

key UnaryKernelKey

Returns

StridedUnaryKernel

GetTakeKernel()

IL-emitted take kernel (singleton — same kernel handles any ndim, any elemBytes, any innerSize, both axis=None and axis=k). Returns null only when Enabled is false.

public static TakeKernel GetTakeKernel()

Returns

TakeKernel
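
A single kernel can serve every dtype when it treats elements as opaque byte runs of length elemBytes. The sketch below shows that idea for the flat (axis=None) case only. The names are hypothetical and the real kernel's signature is not shown on this page.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical dtype-agnostic take: elements are copied as elemBytes-sized byte runs.
public static class TakeSketch
{
    public static void TakeFlat(ReadOnlySpan<byte> src, ReadOnlySpan<long> indices, int elemBytes, Span<byte> dst)
    {
        for (int i = 0; i < indices.Length; i++)
        {
            int from = checked((int)(indices[i] * elemBytes));
            src.Slice(from, elemBytes).CopyTo(dst.Slice(i * elemBytes, elemBytes));
        }
    }

    // Convenience wrapper that reinterprets int arrays as raw bytes.
    public static int[] TakeInt32(int[] src, long[] indices)
    {
        var dst = new int[indices.Length];
        TakeFlat(MemoryMarshal.AsBytes(src.AsSpan()), indices, sizeof(int), MemoryMarshal.AsBytes(dst.AsSpan()));
        return dst;
    }
}
```

Taking indices { 3, 0, 2 } from { 10, 20, 30, 40 } yields 40, 10, 30. The same TakeFlat call would work unchanged for 8-byte elements with elemBytes = 8.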

GetTraceAccumTypeCode(NPTypeCode)

Maps src NPTypeCode → (result NPTypeCode, supported). Convenience for callers that already have the type-code.

public static (NPTypeCode, bool) GetTraceAccumTypeCode(NPTypeCode src)

Parameters

src NPTypeCode

Returns

(NPTypeCode, bool)

GetTraceKernel(Type)

IL-emitted kernel, cached as a singleton per srcType. Returns null when the dtype has no kernel. The current implementation supports every dtype, so null is not expected today; the null return exists for graceful future expansion.

public static TraceKernel GetTraceKernel(Type srcType)

Parameters

srcType Type

Returns

TraceKernel

GetTypedElementReductionKernel<TResult>(ElementReductionKernelKey)

Get or generate a typed element-wise reduction kernel. Returns a delegate that reduces all elements to a single value of type TResult.

public static TypedElementReductionKernel<TResult> GetTypedElementReductionKernel<TResult>(ElementReductionKernelKey key) where TResult : unmanaged

Parameters

key ElementReductionKernelKey

Returns

TypedElementReductionKernel<TResult>

Type Parameters

TResult

GetUnaryKernel(UnaryKernelKey)

Get or generate a unary kernel for the specified key.

public static UnaryKernel GetUnaryKernel(UnaryKernelKey key)

Parameters

key UnaryKernelKey

Returns

UnaryKernel

GetUnaryScalarDelegate(UnaryScalarKernelKey)

Get or generate an IL-based unary scalar delegate. Returns a Func<TInput, TOutput> delegate.

public static Delegate GetUnaryScalarDelegate(UnaryScalarKernelKey key)

Parameters

key UnaryScalarKernelKey

Returns

Delegate

GetUnravelIndexKernel()

IL-emitted unravel kernel (singleton — same kernel handles any ndim and both C / F order via the runtime idxStart / idxStep args). Returns null only when Enabled is false.

public static UnravelIndexKernel GetUnravelIndexKernel()

Returns

UnravelIndexKernel
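
One way a single loop can serve both orders is to parameterize the axis walk by a start index and a step. The sketch below assumes that idxStart and idxStep play this role (last axis first with step -1 for C order, first axis first with step +1 for F order). That reading is an assumption, and the names are hypothetical.

```csharp
// Hypothetical scalar reference for flat-index to multi-index unraveling.
public static class UnravelSketch
{
    public static long[] Unravel(long flat, long[] shape, bool cOrder)
    {
        int ndim = shape.Length;
        int idxStart = cOrder ? ndim - 1 : 0; // assumed meaning of idxStart
        int idxStep = cOrder ? -1 : 1;        // assumed meaning of idxStep
        var coords = new long[ndim];
        for (int n = 0, d = idxStart; n < ndim; n++, d += idxStep)
        {
            coords[d] = flat % shape[d]; // coordinate along the fastest remaining axis
            flat /= shape[d];
        }
        return coords;
    }
}
```

For shape (3, 4), flat index 6 in C order and flat index 7 in F order both unravel to (1, 2).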

GetWeightedSumIterKernel(WeightedSumKernelKey)

Returns the cached IL-emitted weighted-sum kernel for the given dtype, or null if the dtype isn't supported (Bool/Char/Half/Complex/Decimal). The kernel signature matches NDInnerLoopFunc; pass it to NDIter.ForEach over the 4-operand [a, w, num_out, scl_out] iterator.

public static NDInnerLoopFunc? GetWeightedSumIterKernel(DirectILKernelGenerator.WeightedSumKernelKey key)

Parameters

key DirectILKernelGenerator.WeightedSumKernelKey

Returns

NDInnerLoopFunc

GetWhereKernel<T>()

Get or generate an IL-based where kernel for the specified type. Returns null if IL generation is disabled or fails.

public static WhereKernel<T>? GetWhereKernel<T>() where T : unmanaged

Returns

WhereKernel<T>

Type Parameters

T

GetWhereScalarXKernel<T>()

public static WhereScalarXKernel<T> GetWhereScalarXKernel<T>() where T : unmanaged

Returns

WhereScalarXKernel<T>

Type Parameters

T

GetWhereScalarXYKernel<T>()

public static WhereScalarXYKernel<T> GetWhereScalarXYKernel<T>() where T : unmanaged

Returns

WhereScalarXYKernel<T>

Type Parameters

T

GetWhereScalarYKernel<T>()

public static WhereScalarYKernel<T> GetWhereScalarYKernel<T>() where T : unmanaged

Returns

WhereScalarYKernel<T>

Type Parameters

T

ModfHelper(double*, double*, long)

SIMD-optimized Modf operation for contiguous double arrays. Overwrites data with the fractional parts and writes the integral parts to integral. Handles special values (NaN, Inf) according to C standard modf.

public static void ModfHelper(double* data, double* integral, long size)

Parameters

data double*

Input array (will contain fractional parts after)

integral double*

Output array for integral parts

size long

Number of elements

ModfHelper(float*, float*, long)

SIMD-optimized Modf operation for contiguous float arrays. Overwrites data with the fractional parts and writes the integral parts to integral. Handles special values (NaN, Inf) according to C standard modf.

public static void ModfHelper(float* data, float* integral, long size)

Parameters

data float*

Input array (will contain fractional parts after)

integral float*

Output array for integral parts

size long

Number of elements
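
The expected results, including the special values, can be pinned down with a scalar reference. The sketch below follows C modf semantics: both parts carry the sign of the input, an infinite input yields a signed zero fraction and an infinite integral part, and NaN propagates to both outputs. It uses arrays instead of pointers and hypothetical names; it is not the SIMD implementation.

```csharp
using System;

// Hypothetical scalar reference for in-place modf on doubles.
public static class ModfSketch
{
    public static void Modf(double[] data, double[] integral)
    {
        for (int i = 0; i < data.Length; i++)
        {
            double x = data[i];
            double whole = Math.Truncate(x);                     // NaN and Inf pass through
            double frac = double.IsInfinity(x) ? 0.0 : x - whole; // avoid Inf - Inf = NaN
            integral[i] = whole;
            data[i] = Math.CopySign(frac, x);                     // fraction keeps the sign of x
        }
    }
}
```

For the input { 3.75, -2.5, +Inf, NaN }, data becomes { 0.75, -0.5, 0, NaN } and integral becomes { 3, -2, +Inf, NaN }.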

Quantile(NPTypeCode, NPTypeCode, QuantileMethod, void*, void*, long, int, int*, int, double*, int, void*, long, bool, int*)

Run the cached quantile kernel for the given dtype triple. First call for a tuple emits and caches the DynamicMethod; later calls jump straight into the specialized native code.

public static void Quantile(NPTypeCode srcType, NPTypeCode outType, QuantileMethod method, void* srcBase, void* scratchBase, long outer, int n, int* kSorted, int nKs, double* q, int nQs, void* dstBase, long dstOuterStride, bool ignoreNaN = false, int* rowKScratch = null)

Parameters

srcType NPTypeCode
outType NPTypeCode
method QuantileMethod
srcBase void*
scratchBase void*
outer long
n int
kSorted int*
nKs int
q double*
nQs int
dstBase void*
dstOuterStride long
ignoreNaN bool
rowKScratch int*

TryGetAxisReductionKernel(AxisReductionKernelKey)

Try to get an axis reduction kernel. Supports all reduction operations and all types including type promotion. Uses SIMD for capable types, scalar loop for others.

public static AxisReductionKernel? TryGetAxisReductionKernel(AxisReductionKernelKey key)

Parameters

key AxisReductionKernelKey

Returns

AxisReductionKernel

TryGetBooleanAxisReductionKernel(AxisReductionKernelKey)

Try to get a boolean axis reduction kernel (All / Any). Returns null for non-SIMD-capable dtypes (Half, Complex, Decimal, Char) so the caller can fall back to the NDAxisIter scalar path.

public static AxisReductionKernel? TryGetBooleanAxisReductionKernel(AxisReductionKernelKey key)

Parameters

key AxisReductionKernelKey

Returns

AxisReductionKernel

TryGetCastKernel(NPTypeCode, NPTypeCode)

Get or generate a contiguous cast kernel for the given pair. Returns null for unsupported pairs (Boolean/Char/Half/Complex/Decimal involved).

public static DirectILKernelGenerator.CastKernel TryGetCastKernel(NPTypeCode srcType, NPTypeCode dstType)

Parameters

srcType NPTypeCode
dstType NPTypeCode

Returns

DirectILKernelGenerator.CastKernel

TryGetCopyKernel(CopyKernelKey)

public static CopyKernel? TryGetCopyKernel(CopyKernelKey key)

Parameters

key CopyKernelKey

Returns

CopyKernel

TryGetCumulativeAxisKernel(CumulativeAxisKernelKey)

Try to get or generate a cumulative axis kernel.

public static CumulativeAxisKernel? TryGetCumulativeAxisKernel(CumulativeAxisKernelKey key)

Parameters

key CumulativeAxisKernelKey

Returns

CumulativeAxisKernel

TryGetCumulativeKernel(CumulativeKernelKey)

Try to get or generate a cumulative kernel.

public static CumulativeKernel? TryGetCumulativeKernel(CumulativeKernelKey key)

Parameters

key CumulativeKernelKey

Returns

CumulativeKernel

TryGetInnerCastKernel(NPTypeCode, NPTypeCode)

Get or emit the scalar inner-loop cast kernel for the pair. Non-null for every dtype pair (all 225 Converts.To{Dst}({Src}) methods exist); returns null only if IL generation is disabled or a method unexpectedly fails to resolve.

public static DirectILKernelGenerator.InnerCastLoop TryGetInnerCastKernel(NPTypeCode srcType, NPTypeCode dstType)

Parameters

srcType NPTypeCode
dstType NPTypeCode

Returns

DirectILKernelGenerator.InnerCastLoop

TryGetMaskedCastKernel(NPTypeCode, NPTypeCode)

Get or generate a masked-cast kernel for the given (src, dst) pair. Returns null for unsupported pairs (Boolean/Char/Half/Complex/Decimal involved).

public static DirectILKernelGenerator.MaskedCastKernel TryGetMaskedCastKernel(NPTypeCode srcType, NPTypeCode dstType)

Parameters

srcType NPTypeCode
dstType NPTypeCode

Returns

DirectILKernelGenerator.MaskedCastKernel

TryGetNanAxisReductionKernel(AxisReductionKernelKey)

Try to get a NaN-aware axis reduction kernel. SIMD kernels exist only for float/double; Half and Complex route to scalar fallback paths (Default.Reduction.Nan.cs ExecuteNanAxisReductionScalar / np.nanmean.cs / np.nanvar.cs / np.nanstd.cs) which handle them directly.

public static AxisReductionKernel? TryGetNanAxisReductionKernel(AxisReductionKernelKey key)

Parameters

key AxisReductionKernelKey

Returns

AxisReductionKernel

TryGetStridedCastKernel(NPTypeCode, NPTypeCode)

Get or generate a strided/broadcast cast kernel for the given pair. Returns null for unsupported pairs.

public static DirectILKernelGenerator.StridedCastKernel TryGetStridedCastKernel(NPTypeCode srcType, NPTypeCode dstType)

Parameters

srcType NPTypeCode
dstType NPTypeCode

Returns

DirectILKernelGenerator.StridedCastKernel

TryGetTypedElementReductionKernel<TResult>(ElementReductionKernelKey)

Try to get or generate an element reduction kernel.

public static TypedElementReductionKernel<TResult>? TryGetTypedElementReductionKernel<TResult>(ElementReductionKernelKey key) where TResult : unmanaged

Parameters

key ElementReductionKernelKey

Returns

TypedElementReductionKernel<TResult>

Type Parameters

TResult

WhereExecute<T>(bool*, T*, T*, T*, long)

Executes the where operation using the IL-generated kernel, falling back to a static helper when no kernel is available.

public static void WhereExecute<T>(bool* cond, T* x, T* y, T* result, long count) where T : unmanaged

Parameters

cond bool*
x T*
y T*
result T*
count long

Type Parameters

T
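
For reference, the element-wise selection that a where operation performs can be written as the scalar loop below. This is a sketch of the semantics only, using arrays instead of the pointer parameters listed above, with hypothetical names.

```csharp
// Hypothetical scalar reference for where: result[i] = cond[i] ? x[i] : y[i].
public static class WhereSketch
{
    public static T[] Where<T>(bool[] cond, T[] x, T[] y) where T : unmanaged
    {
        var result = new T[cond.Length];
        for (int i = 0; i < cond.Length; i++)
            result[i] = cond[i] ? x[i] : y[i];
        return result;
    }
}
```

With cond = { true, false, true }, x = { 1, 2, 3 } and y = { 10, 20, 30 }, the result is 1, 20, 3.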