Class ILKernelGenerator
Binary operations (same-type) - contiguous kernels and generic helpers.
public static class ILKernelGenerator
Inheritance
object
ILKernelGenerator
Fields
VectorBits
Vector width in bits, detected at startup: 512, 256, 128, or 0 (no SIMD).
public static readonly int VectorBits
Field Value
VectorBytes
Number of bytes per vector register.
public static readonly int VectorBytes
Field Value
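The two fields above can be pictured with the following minimal sketch. It assumes .NET 8 or later (for `Vector512`) and uses the runtime's `IsHardwareAccelerated` flags. The class and method names are hypothetical, and the library's actual detection logic may differ.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical helper, not part of ILKernelGenerator.
public static class VectorWidthProbe
{
    // Returns the widest hardware-accelerated vector size in bits,
    // or 0 when the runtime reports no SIMD support.
    public static int DetectVectorBits()
    {
        if (Vector512.IsHardwareAccelerated) return 512;
        if (Vector256.IsHardwareAccelerated) return 256;
        if (Vector128.IsHardwareAccelerated) return 128;
        return 0;
    }

    // Bytes per vector register follow directly from the bit width.
    public static int ToBytes(int bits) => bits / 8;
}
```

Under this assumption, a 256-bit machine would report 32 bytes per register, which is 8 floats or 4 doubles per vector.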
Properties
AxisReductionCachedCount
Number of axis reduction kernels in cache.
public static int AxisReductionCachedCount { get; }
Property Value
AxisScanCachedCount
Number of axis scan kernels in cache.
public static int AxisScanCachedCount { get; }
Property Value
BinaryScalarCachedCount
Number of binary scalar kernels in cache.
public static int BinaryScalarCachedCount { get; }
Property Value
CachedCount
Number of IL-generated kernels in cache.
public static int CachedCount { get; }
Property Value
ComparisonCachedCount
Number of comparison kernels in cache.
public static int ComparisonCachedCount { get; }
Property Value
ComparisonScalarCachedCount
Number of comparison scalar kernels in cache.
public static int ComparisonScalarCachedCount { get; }
Property Value
ElementReductionCachedCount
Number of element reduction kernels in cache.
public static int ElementReductionCachedCount { get; }
Property Value
Enabled
Whether IL generation is enabled. Can be disabled for debugging.
public static bool Enabled { get; set; }
Property Value
MixedTypeCachedCount
Number of mixed-type kernels in cache.
public static int MixedTypeCachedCount { get; }
Property Value
Name
Provider name for diagnostics.
public static string Name { get; }
Property Value
NanAxisReductionCachedCount
Number of NaN axis reduction kernels in cache.
public static int NanAxisReductionCachedCount { get; }
Property Value
ScanCachedCount
Number of scan kernels in cache.
public static int ScanCachedCount { get; }
Property Value
UnaryCachedCount
Number of unary kernels in cache.
public static int UnaryCachedCount { get; }
Property Value
UnaryScalarCachedCount
Number of unary scalar kernels in cache.
public static int UnaryScalarCachedCount { get; }
Property Value
Methods
ClipArrayBounds<T>(T*, T*, T*, int)
Clip with element-wise array bounds (both min and max arrays). All three arrays must be broadcast to the same shape by the caller. For contiguous arrays of SIMD-supported types, uses Vector operations.
public static void ClipArrayBounds<T>(T* output, T* minArr, T* maxArr, int size) where T : unmanaged, IComparable<T>
Parameters
output (T*)
minArr (T*)
maxArr (T*)
size (int)
Type Parameters
T
Remarks
NumPy clip semantics: result[i] = min(max(a[i], min[i]), max[i]). When min[i] > max[i], the result is max[i], matching NumPy behavior.
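The clip rule in the remarks can be written as a scalar reference loop. This is only a sketch of the semantics: it uses spans instead of raw pointers, skips the SIMD path, and does not model NaN handling for floating-point types. `ClipReference` is a hypothetical name.

```csharp
using System;

// Hypothetical scalar reference, not the library's implementation.
public static class ClipReference
{
    // In place: data[i] = min(max(data[i], lo[i]), hi[i]).
    // Applying the upper bound last means hi[i] wins when lo[i] > hi[i].
    public static void ClipArrayBounds<T>(Span<T> data, ReadOnlySpan<T> lo, ReadOnlySpan<T> hi)
        where T : unmanaged, IComparable<T>
    {
        for (int i = 0; i < data.Length; i++)
        {
            T v = data[i];
            if (v.CompareTo(lo[i]) < 0) v = lo[i]; // max(data[i], lo[i])
            if (v.CompareTo(hi[i]) > 0) v = hi[i]; // min(..., hi[i])
            data[i] = v;
        }
    }
}
```

With data {1, 5, 9, 4}, lower bounds {2, 2, 2, 6}, and upper bounds {8, 8, 8, 3}, the last element has an inverted bound pair and ends up at the upper bound 3.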
ClipArrayMax<T>(T*, T*, int)
Clip with element-wise max array bounds only (no min).
public static void ClipArrayMax<T>(T* output, T* maxArr, int size) where T : unmanaged, IComparable<T>
Parameters
output (T*)
maxArr (T*)
size (int)
Type Parameters
T
ClipArrayMin<T>(T*, T*, int)
Clip with element-wise min array bounds only (no max).
public static void ClipArrayMin<T>(T* output, T* minArr, int size) where T : unmanaged, IComparable<T>
Parameters
output (T*)
minArr (T*)
size (int)
Type Parameters
T
ClipHelper<T>(T*, int, T, T)
SIMD-optimized Clip operation for contiguous arrays (min and max). Modifies the array in place: data[i] = Min(Max(data[i], minVal), maxVal).
public static void ClipHelper<T>(T* data, int size, T minVal, T maxVal) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
minVal (T)
maxVal (T)
Type Parameters
T
ClipMaxHelper<T>(T*, int, T)
SIMD-optimized Max-only Clip operation (no lower bound).
public static void ClipMaxHelper<T>(T* data, int size, T maxVal) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
maxVal (T)
Type Parameters
T
ClipMaxStrided<T>(T*, int, T, Shape)
Max-only Clip operation for strided arrays.
public static void ClipMaxStrided<T>(T* data, int size, T maxVal, Shape shape) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
maxVal (T)
shape (Shape)
Type Parameters
T
ClipMaxUnified<T>(T*, int, T, Shape)
Unified Max-only Clip operation.
public static void ClipMaxUnified<T>(T* data, int size, T maxVal, Shape shape) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
maxVal (T)
shape (Shape)
Type Parameters
T
ClipMinHelper<T>(T*, int, T)
SIMD-optimized Min-only Clip operation (no upper bound).
public static void ClipMinHelper<T>(T* data, int size, T minVal) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
minVal (T)
Type Parameters
T
ClipMinStrided<T>(T*, int, T, Shape)
Min-only Clip operation for strided arrays.
public static void ClipMinStrided<T>(T* data, int size, T minVal, Shape shape) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
minVal (T)
shape (Shape)
Type Parameters
T
ClipMinUnified<T>(T*, int, T, Shape)
Unified Min-only Clip operation.
public static void ClipMinUnified<T>(T* data, int size, T minVal, Shape shape) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
minVal (T)
shape (Shape)
Type Parameters
T
ClipStrided<T>(T*, int, T, T, Shape)
Clip operation for strided (non-contiguous) arrays. Uses coordinate-based iteration via Shape.TransformOffset.
public static void ClipStrided<T>(T* data, int size, T minVal, T maxVal, Shape shape) where T : unmanaged, IComparable<T>
Parameters
data (T*)
size (int)
minVal (T)
maxVal (T)
shape (Shape)
Type Parameters
T
Remarks
This handles arrays that are:
- Transposed (stride order differs from dimension order)
- Sliced with step (e.g., arr[::2])
- Views with non-standard memory layout
Performance is O(n) with coordinate overhead per element. For contiguous arrays, use ClipHelper instead.
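The coordinate-based iteration described above can be sketched as follows. This is an illustration only: it takes explicit `dims`, `strides`, and `offset` arguments (all in elements, not bytes) instead of the library's Shape type, works on a managed `double[]`, and does not call Shape.TransformOffset. `StridedClipSketch` is a hypothetical name.

```csharp
using System;

// Hypothetical sketch of strided, coordinate-based clipping.
public static class StridedClipSketch
{
    // Clips every logical element of a strided view in place.
    public static void Clip(double[] buffer, int[] dims, int[] strides, int offset, double lo, double hi)
    {
        int size = 1;
        foreach (int d in dims) size *= d;

        var coord = new int[dims.Length];
        for (int n = 0; n < size; n++)
        {
            // Map the current coordinate to a flat buffer index.
            int idx = offset;
            for (int k = 0; k < dims.Length; k++) idx += coord[k] * strides[k];

            buffer[idx] = Math.Min(Math.Max(buffer[idx], lo), hi);

            // Advance the coordinate like an odometer, last axis fastest.
            for (int k = dims.Length - 1; k >= 0; k--)
            {
                if (++coord[k] < dims[k]) break;
                coord[k] = 0;
            }
        }
    }
}
```

For a view equivalent to arr[::2] over a six-element buffer, this would pass dims {3}, strides {2}, and offset 0, so only indices 0, 2, and 4 are touched. The per-element index computation is the coordinate overhead the remarks mention.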
ClipUnified<T>(T*, int, T, T, Shape)
Unified Clip operation that handles both contiguous and strided arrays. Automatically selects the optimal path based on array contiguity.
public static void ClipUnified<T>(T* data, int size, T minVal, T maxVal, Shape shape) where T : unmanaged, IComparable<T>
Parameters
data (T*): Pointer to the data buffer (at offset 0, not adjusted for shape.offset)
size (int): Number of elements to process
minVal (T): Minimum value to clip to
maxVal (T): Maximum value to clip to
shape (Shape): Shape describing the memory layout
Type Parameters
T
CumSumHelper<TIn, TOut>(void*, void*, int)
SIMD-optimized cumulative sum for contiguous arrays with type conversion. Called directly by DefaultEngine for the fast path.
public static void CumSumHelper<TIn, TOut>(void* input, void* output, int totalSize) where TIn : unmanaged where TOut : unmanaged
Parameters
input (void*): Pointer to input data
output (void*): Pointer to output data
totalSize (int): Number of elements
Type Parameters
TIn: Input element type
TOut: Output element type
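A cumulative sum with type conversion can be illustrated with one fixed type pair. The sketch below is scalar, uses spans rather than void pointers, and hard-codes int input with long output. The real helper is generic over TIn and TOut and SIMD-optimized, neither of which is modeled here. `CumSumSketch` is a hypothetical name.

```csharp
using System;

// Hypothetical scalar sketch for one (TIn, TOut) pair: int -> long.
public static class CumSumSketch
{
    public static void CumSum(ReadOnlySpan<int> input, Span<long> output)
    {
        long running = 0;
        for (int i = 0; i < input.Length; i++)
        {
            running += input[i]; // each element is widened to long before accumulating
            output[i] = running;
        }
    }
}
```

Accumulating in the wider output type keeps the running total from overflowing when the inputs alone would exceed the range of int.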
GetBinaryScalarDelegate(BinaryScalarKernelKey)
Get or generate an IL-based binary scalar delegate. Returns a Func<TLhs, TRhs, TResult> delegate.
public static Delegate GetBinaryScalarDelegate(BinaryScalarKernelKey key)
Parameters
Returns
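The get-or-generate pattern behind this method can be sketched with a `ConcurrentDictionary` and `DynamicMethod`. Everything below is an assumption made for illustration: the `SketchOp` enum, the tuple cache key, and the class name are hypothetical stand-ins, not the library's BinaryScalarKernelKey or its actual emitter.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection.Emit;

public enum SketchOp { Add, Subtract, Multiply } // hypothetical, not the library's BinaryOp

// Hypothetical cache of IL-emitted scalar delegates.
public static class ScalarKernelCacheSketch
{
    private static readonly ConcurrentDictionary<(Type, SketchOp), Delegate> Cache = new();

    public static int CachedCount => Cache.Count;

    // Returns a Func<T, T, T> typed as Delegate; the caller casts it to the concrete type.
    public static Delegate GetOrGenerate(Type type, SketchOp op) =>
        Cache.GetOrAdd((type, op), key => Generate(key.Item1, key.Item2));

    private static Delegate Generate(Type type, SketchOp op)
    {
        if (type != typeof(int) && type != typeof(long) && type != typeof(float) && type != typeof(double))
            throw new NotSupportedException($"No IL sketch for {type}.");

        // Emit: load both arguments, apply the arithmetic opcode, return.
        var method = new DynamicMethod($"{op}_{type.Name}", type, new[] { type, type });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(op switch
        {
            SketchOp.Add => OpCodes.Add,
            SketchOp.Subtract => OpCodes.Sub,
            _ => OpCodes.Mul,
        });
        il.Emit(OpCodes.Ret);
        return method.CreateDelegate(typeof(Func<,,>).MakeGenericType(type, type, type));
    }
}
```

A caller would cast the result, for example `(Func<int, int, int>)ScalarKernelCacheSketch.GetOrGenerate(typeof(int), SketchOp.Add)`. A second request with the same key returns the cached delegate instead of emitting IL again.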
GetComparisonKernel(ComparisonKernelKey)
Get or generate a comparison kernel for the specified key.
public static ComparisonKernel GetComparisonKernel(ComparisonKernelKey key)
Parameters
Returns
GetComparisonScalarDelegate(ComparisonScalarKernelKey)
Get or generate a comparison scalar delegate. Returns a Func<TLhs, TRhs, bool> delegate.
public static Delegate GetComparisonScalarDelegate(ComparisonScalarKernelKey key)
Parameters
Returns
GetContiguousKernel<T>(BinaryOp)
Get or generate an IL-based kernel for contiguous (SimdFull) operations. Returns null if IL generation is not supported for this type/operation.
public static ContiguousKernel<T>? GetContiguousKernel<T>(BinaryOp op) where T : unmanaged
Parameters
op (BinaryOp)
Returns
Type Parameters
T
GetCumulativeAxisKernel(CumulativeAxisKernelKey)
Get or generate a cumulative axis (scan along axis) kernel. Returns a delegate that computes running accumulation along a specific axis.
public static CumulativeAxisKernel GetCumulativeAxisKernel(CumulativeAxisKernelKey key)
Parameters
Returns
GetCumulativeKernel(CumulativeKernelKey)
Get or generate a cumulative (scan) kernel. Returns a delegate that computes running accumulation over all elements.
public static CumulativeKernel GetCumulativeKernel(CumulativeKernelKey key)
Parameters
Returns
GetMatMulKernel<T>()
Get or generate an IL-based high-performance MatMul kernel. Returns null if the type is not supported for SIMD optimization.
public static MatMul2DKernel<T>? GetMatMulKernel<T>() where T : unmanaged
Returns
Type Parameters
T
GetMixedTypeKernel(MixedTypeKernelKey)
Get or generate a mixed-type kernel for the specified key.
public static MixedTypeKernel GetMixedTypeKernel(MixedTypeKernelKey key)
Parameters
Returns
GetShiftArrayKernel<T>(bool)
Get or generate a shift kernel for element-wise shift amounts.
public static ILKernelGenerator.ShiftArrayKernel<T>? GetShiftArrayKernel<T>(bool isLeftShift) where T : unmanaged
Parameters
isLeftShift (bool): True for left shift, false for right shift
Returns
- ILKernelGenerator.ShiftArrayKernel<T>
Kernel delegate or null if not supported
Type Parameters
T: Integer element type
GetShiftScalarKernel<T>(bool)
Get or generate a SIMD-optimized shift kernel for uniform shift amount.
public static ILKernelGenerator.ShiftScalarKernel<T>? GetShiftScalarKernel<T>(bool isLeftShift) where T : unmanaged
Parameters
isLeftShift (bool): True for left shift, false for right shift
Returns
- ILKernelGenerator.ShiftScalarKernel<T>
Kernel delegate or null if not supported
Type Parameters
T: Integer element type
GetTypedElementReductionKernel<TResult>(ElementReductionKernelKey)
Get or generate a typed element-wise reduction kernel. Returns a delegate that reduces all elements to a single value of type TResult.
public static TypedElementReductionKernel<TResult> GetTypedElementReductionKernel<TResult>(ElementReductionKernelKey key) where TResult : unmanaged
Parameters
Returns
- TypedElementReductionKernel<TResult>
Type Parameters
TResult
GetUnaryKernel(UnaryKernelKey)
Get or generate a unary kernel for the specified key.
public static UnaryKernel GetUnaryKernel(UnaryKernelKey key)
Parameters
key (UnaryKernelKey)
Returns
GetUnaryScalarDelegate(UnaryScalarKernelKey)
Get or generate an IL-based unary scalar delegate. Returns a Func<TInput, TOutput> delegate.
public static Delegate GetUnaryScalarDelegate(UnaryScalarKernelKey key)
Parameters
Returns
ModfHelper(double*, double*, int)
SIMD-optimized Modf operation for contiguous double arrays. Overwrites data with the fractional parts and writes the integral parts to integral. Handles special values (NaN, Inf) according to C standard modf.
public static void ModfHelper(double* data, double* integral, int size)
Parameters
data (double*): Input array (contains the fractional parts afterwards)
integral (double*): Output array for integral parts
size (int): Number of elements
ModfHelper(float*, float*, int)
SIMD-optimized Modf operation for contiguous float arrays. Overwrites data with the fractional parts and writes the integral parts to integral. Handles special values (NaN, Inf) according to C standard modf.
public static void ModfHelper(float* data, float* integral, int size)
Parameters
data (float*): Input array (contains the fractional parts afterwards)
integral (float*): Output array for integral parts
size (int): Number of elements
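The C standard modf behavior that both overloads follow can be sketched in scalar form. This version uses spans instead of pointers and omits the SIMD path. `ModfSketch` is a hypothetical name, and the special-value handling shown is the C rule (infinite input gives a signed zero fraction, NaN gives NaN in both outputs), not a transcription of the library's code.

```csharp
using System;

// Hypothetical scalar sketch of C-style modf over an array.
public static class ModfSketch
{
    public static void Modf(Span<double> data, Span<double> integral)
    {
        for (int i = 0; i < data.Length; i++)
        {
            double x = data[i];
            double whole = Math.Truncate(x); // NaN stays NaN, +/-Inf stays +/-Inf
            integral[i] = whole;

            // Inf - Inf would be NaN, so infinities get a zero fraction instead.
            // CopySign keeps the fraction's sign equal to the sign of x.
            double frac = double.IsInfinity(x) ? 0.0 : x - whole;
            data[i] = Math.CopySign(frac, x);
        }
    }
}
```

For example, 3.75 splits into 0.75 and 3, while -2.5 splits into -0.5 and -2.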
TryGetAxisReductionKernel(AxisReductionKernelKey)
Try to get an axis reduction kernel. Supports all reduction operations and all types including type promotion. Uses SIMD for capable types, scalar loop for others.
public static AxisReductionKernel? TryGetAxisReductionKernel(AxisReductionKernelKey key)
Parameters
Returns
TryGetComparisonKernel(ComparisonKernelKey)
Try to get or generate a comparison kernel. Returns null if generation fails.
public static ComparisonKernel? TryGetComparisonKernel(ComparisonKernelKey key)
Parameters
Returns
TryGetCumulativeAxisKernel(CumulativeAxisKernelKey)
Try to get or generate a cumulative axis kernel.
public static CumulativeAxisKernel? TryGetCumulativeAxisKernel(CumulativeAxisKernelKey key)
Parameters
Returns
TryGetCumulativeKernel(CumulativeKernelKey)
Try to get or generate a cumulative kernel.
public static CumulativeKernel? TryGetCumulativeKernel(CumulativeKernelKey key)
Parameters
Returns
TryGetMixedTypeKernel(MixedTypeKernelKey)
Try to get or generate a mixed-type kernel. Returns null if generation fails.
public static MixedTypeKernel? TryGetMixedTypeKernel(MixedTypeKernelKey key)
Parameters
Returns
TryGetNanAxisReductionKernel(AxisReductionKernelKey)
Try to get a NaN-aware axis reduction kernel. Only supports float and double types (NaN is only defined for floating-point).
public static AxisReductionKernel? TryGetNanAxisReductionKernel(AxisReductionKernelKey key)
Parameters
Returns
TryGetTypedElementReductionKernel<TResult>(ElementReductionKernelKey)
Try to get or generate an element reduction kernel.
public static TypedElementReductionKernel<TResult>? TryGetTypedElementReductionKernel<TResult>(ElementReductionKernelKey key) where TResult : unmanaged
Parameters
Returns
- TypedElementReductionKernel<TResult>
Type Parameters
TResult
TryGetUnaryKernel(UnaryKernelKey)
Try to get or generate a unary kernel. Returns null if generation fails.
public static UnaryKernel? TryGetUnaryKernel(UnaryKernelKey key)
Parameters
key (UnaryKernelKey)