Class DirectILKernelGenerator
Binary operations (same-type) - contiguous kernels and generic helpers.
public static class DirectILKernelGenerator
- Inheritance
- object
DirectILKernelGenerator
- Inherited Members
Fields
VectorBits
Vector width in bits, detected at startup: 512, 256, 128, or 0 (no SIMD).
public static readonly int VectorBits
Field Value
- int
VectorBytes
Number of bytes per vector register.
public static readonly int VectorBytes
Field Value
- int
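The two fields above could be derived along the following lines. This is a minimal sketch, not the library's actual detection code: it assumes .NET 8 or later (for `Vector512`), and `VectorWidthSketch` and `LanesFor` are hypothetical names used only for illustration.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical stand-in for the VectorBits / VectorBytes fields.
public static class VectorWidthSketch
{
    // Widest hardware-accelerated vector size in bits, or 0 when no SIMD is available.
    public static readonly int VectorBits =
        Vector512.IsHardwareAccelerated ? 512 :
        Vector256.IsHardwareAccelerated ? 256 :
        Vector128.IsHardwareAccelerated ? 128 : 0;

    // Bytes per vector register; 0 when VectorBits is 0.
    public static readonly int VectorBytes = VectorBits / 8;

    // Number of elements of the given byte size that fit in one register.
    public static int LanesFor(int elemBytes) =>
        elemBytes > 0 ? VectorBytes / elemBytes : 0;
}
```

A caller would typically check `VectorBits == 0` first and take a scalar path in that case.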
Properties
Enabled
Whether IL generation is enabled. Can be disabled for debugging.
public static bool Enabled { get; set; }
Property Value
- bool
Name
Provider name for diagnostics.
public static string Name { get; }
Property Value
- string
Methods
Clip(NPTypeCode, ClipMode, ClipBoundsKind, void*, void*, long, void*, void*)
Run a clip operation. Picks (and on first call, IL-generates) the appropriate DynamicMethod for the (dtype, mode, kind) tuple and invokes it with the supplied pointers.
public static void Clip(NPTypeCode dtype, DirectILKernelGenerator.ClipMode mode, DirectILKernelGenerator.ClipBoundsKind kind, void* src, void* dst, long size, void* lo, void* hi)
Parameters
dtype NPTypeCode
mode DirectILKernelGenerator.ClipMode
kind DirectILKernelGenerator.ClipBoundsKind
src void*
dst void*
size long
lo void*
hi void*
GetArgwhereCountKernel(Type)
IL-emitted kernel that counts non-zero elements. Returns null only when Enabled is false; every supported dtype has a kernel (SIMD where Vector<T> exists, scalar IL via op_Inequality otherwise).
public static ArgwhereCountKernel GetArgwhereCountKernel(Type elementType)
Parameters
elementType Type
Returns
- ArgwhereCountKernel
GetArgwhereExpandKernel()
IL-emitted coordinate-expand kernel (singleton: the same kernel handles any ndim).
public static ArgwhereExpandKernel GetArgwhereExpandKernel()
Returns
- ArgwhereExpandKernel
GetArgwhereFlatKernel(Type)
IL-emitted bit-scan that writes flat indices of non-zero elements into a pre-sized buffer.
public static ArgwhereFlatKernel GetArgwhereFlatKernel(Type elementType)
Parameters
elementType Type
Returns
- ArgwhereFlatKernel
GetBinaryScalarDelegate(BinaryScalarKernelKey)
Get or generate an IL-based binary scalar delegate. Returns a Func<TLhs, TRhs, TResult> delegate.
public static Delegate GetBinaryScalarDelegate(BinaryScalarKernelKey key)
Parameters
key BinaryScalarKernelKey
Returns
- Delegate
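The "get or generate" pattern used by the scalar-delegate getters can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the cache is keyed by a plain string instead of a `BinaryScalarKernelKey`, only `int` operands are handled, and `ScalarDelegateCacheSketch`, `GetOrGenerate`, and `GeneratedCount` are hypothetical names. As with the documented method, the caller receives an untyped `Delegate` and casts it to the matching `Func<,,>`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection.Emit;
using System.Threading;

// Hypothetical sketch of a get-or-generate delegate cache backed by DynamicMethod.
public static class ScalarDelegateCacheSketch
{
    private static readonly ConcurrentDictionary<string, Delegate> Cache = new();

    // Counts how many times IL was actually emitted (for demonstration only).
    public static int GeneratedCount;

    public static Delegate GetOrGenerate(string op)
    {
        return Cache.GetOrAdd(op, key =>
        {
            OpCode code = key switch
            {
                "add" => OpCodes.Add,
                "mul" => OpCodes.Mul,
                _ => throw new NotSupportedException(key)
            };

            Interlocked.Increment(ref GeneratedCount);

            // Emit: return arg0 <op> arg1;
            var dm = new DynamicMethod("scalar_" + key, typeof(int),
                                       new[] { typeof(int), typeof(int) });
            ILGenerator il = dm.GetILGenerator();
            il.Emit(OpCodes.Ldarg_0);
            il.Emit(OpCodes.Ldarg_1);
            il.Emit(code);
            il.Emit(OpCodes.Ret);
            return dm.CreateDelegate(typeof(Func<int, int, int>));
        });
    }
}
```

The first request for an operation pays the IL-generation cost; later requests for the same key are a dictionary lookup that returns the same delegate instance.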
GetComparisonKernel(ComparisonKernelKey)
Get or generate a comparison kernel for the specified key.
public static ComparisonKernel GetComparisonKernel(ComparisonKernelKey key)
Parameters
key ComparisonKernelKey
Returns
- ComparisonKernel
GetComparisonScalarDelegate(ComparisonScalarKernelKey)
Get or generate a comparison scalar delegate. Returns a Func<TLhs, TRhs, bool> delegate.
public static Delegate GetComparisonScalarDelegate(ComparisonScalarKernelKey key)
Parameters
key ComparisonScalarKernelKey
Returns
- Delegate
GetCopyKernel(CopyKernelKey)
public static CopyKernel GetCopyKernel(CopyKernelKey key)
Parameters
key CopyKernelKey
Returns
- CopyKernel
GetCumulativeKernel(CumulativeKernelKey)
Get or generate a cumulative (scan) kernel. Returns a delegate that computes running accumulation over all elements.
public static CumulativeKernel GetCumulativeKernel(CumulativeKernelKey key)
Parameters
key CumulativeKernelKey
Returns
- CumulativeKernel
GetFilterAxisKernel(long)
IL-emitted kernel cached by innerSize. Pass the actual innerSize you'll use at call time; the function buckets {1,2,4,8,16} into typed-copy variants and anything else into the bulk-cpblk variant. Returns null only when Enabled is false.
public static FilterAxisKernel GetFilterAxisKernel(long innerSize)
Parameters
innerSize long
Returns
- FilterAxisKernel
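The bucketing described above can be pictured with a small helper. This is only a sketch of the stated rule: `FilterAxisBucketSketch`, `BucketKey`, and the use of 0 as the key for the bulk variant are assumptions made for illustration, not part of the library.

```csharp
// Hypothetical sketch: map an innerSize to the cache bucket it would select.
public static class FilterAxisBucketSketch
{
    // Key reserved here (by assumption) for the bulk-copy variant.
    public const long BulkBucket = 0;

    public static long BucketKey(long innerSize) => innerSize switch
    {
        // Sizes with a dedicated typed-copy variant keep their own bucket.
        1 or 2 or 4 or 8 or 16 => innerSize,
        // Every other size shares the bulk variant.
        _ => BulkBucket
    };
}
```

Under this scheme at most six kernels are ever generated, regardless of how many distinct innerSize values callers pass.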
GetIndicesKernel()
IL-emitted indices fill kernel (singleton: the same kernel handles any ndim). Returns null only when Enabled is false.
public static IndicesKernel GetIndicesKernel()
Returns
- IndicesKernel
GetMixedTypeKernel(MixedTypeKernelKey)
Get or generate a mixed-type kernel for the specified key.
public static MixedTypeKernel GetMixedTypeKernel(MixedTypeKernelKey key)
Parameters
key MixedTypeKernelKey
Returns
- MixedTypeKernel
GetNonZeroPerDimKernel()
IL-emitted per-dimension coordinate expander (singleton: the same kernel handles any ndim). Returns null only when Enabled is false.
public static NonZeroPerDimKernel GetNonZeroPerDimKernel()
Returns
- NonZeroPerDimKernel
GetPlaceKernel()
IL-emitted place kernel (singleton: the same kernel handles any dtype via the elemBytes runtime argument). Returns null only when Enabled is false.
public static PlaceKernel GetPlaceKernel()
Returns
- PlaceKernel
GetPutKernel()
IL-emitted put kernel (singleton: the same kernel handles any dtype via the elemBytes runtime argument, and any mode). Returns null only when Enabled is false.
public static PutKernel GetPutKernel()
Returns
- PutKernel
GetRavelMultiIndexKernel()
IL-emitted multi→flat folder (singleton: the same kernel handles any ndim, both orders, and arbitrary per-axis mode tuples). Returns null only when Enabled is false.
public static RavelMultiIndexKernel GetRavelMultiIndexKernel()
Returns
- RavelMultiIndexKernel
GetRepeatBroadcastKernel(int)
Returns the cached IL-emitted broadcast-repeat kernel for the given slab size. First call for a size triggers IL generation, later calls hit a dictionary lookup.
public static DirectILKernelGenerator.RepeatBroadcastKernel GetRepeatBroadcastKernel(int chunkBytes)
Parameters
chunkBytes int
Returns
- DirectILKernelGenerator.RepeatBroadcastKernel
GetRepeatPerJKernel(int)
Returns the cached IL-emitted per-j repeat kernel for the given slab size.
public static DirectILKernelGenerator.RepeatPerJKernel GetRepeatPerJKernel(int chunkBytes)
Parameters
chunkBytes int
Returns
- DirectILKernelGenerator.RepeatPerJKernel
GetSearchSortedKernel(NPTypeCode, bool, bool, bool)
Get or generate a searchsorted kernel.
public static DirectILKernelGenerator.SearchSortedKernel GetSearchSortedKernel(NPTypeCode type, bool leftSide, bool hasSorter, bool contiguousA)
Parameters
type NPTypeCode: Element dtype of a (and of contiguous v, after the caller normalizes it).
leftSide bool: true = side='left', false = side='right'.
hasSorter bool: true = sorter param is non-null (the kernel emits sort_idx indirection).
contiguousA bool: true = a is contiguous (arrStrideBytes == elemSize). Lets the JIT use scaled-index addressing instead of imul, which closes the gap to NumPy on the random-key hot path. When false, arrStrideBytes is honored as a runtime parameter.
Returns
- DirectILKernelGenerator.SearchSortedKernel
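The meaning of leftSide and hasSorter can be illustrated with a plain scalar reference. This is a sketch of the usual searchsorted semantics for `double` keys, not the emitted kernel; `SearchSortedSketch` and its method are hypothetical, and NaN ordering is not handled.

```csharp
using System;

// Hypothetical scalar reference for one searchsorted lookup.
public static class SearchSortedSketch
{
    // leftSide = true : returns the first index i with a[i] >= key (side='left').
    // leftSide = false: returns the first index i with a[i] >  key (side='right').
    // A non-empty sorter supplies the indices that sort a (the sort_idx indirection).
    public static long SearchSorted(ReadOnlySpan<double> a, double key, bool leftSide,
                                    ReadOnlySpan<long> sorter = default)
    {
        long lo = 0, hi = a.Length;
        while (lo < hi)
        {
            long mid = lo + (hi - lo) / 2;
            double v = sorter.IsEmpty ? a[(int)mid] : a[(int)sorter[(int)mid]];
            bool goRight = leftSide ? v < key : v <= key;
            if (goRight) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }
}
```

The two sides differ only when the key is already present: for `a = [1, 2, 2, 3]` and key 2, the left insertion point is before the run of 2s and the right insertion point is after it.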
GetShiftArrayKernel<T>(bool)
Get or generate a shift kernel for element-wise shift amounts.
public static DirectILKernelGenerator.ShiftArrayKernel<T>? GetShiftArrayKernel<T>(bool isLeftShift) where T : unmanaged
Parameters
isLeftShift bool: true for left shift, false for right shift
Returns
- DirectILKernelGenerator.ShiftArrayKernel<T>
Kernel delegate or null if not supported
Type Parameters
T: Integer element type
GetShiftScalarKernel<T>(bool)
Get or generate a SIMD-optimized shift kernel for uniform shift amount.
public static DirectILKernelGenerator.ShiftScalarKernel<T>? GetShiftScalarKernel<T>(bool isLeftShift) where T : unmanaged
Parameters
isLeftShift bool: true for left shift, false for right shift
Returns
- DirectILKernelGenerator.ShiftScalarKernel<T>
Kernel delegate or null if not supported
Type Parameters
T: Integer element type
GetStridedUnaryKernel(UnaryKernelKey)
Get or generate a fused strided-SIMD unary kernel for the given key. Gating (same-width SIMD-capable op, supported dtype) is the caller's responsibility.
public static StridedUnaryKernel GetStridedUnaryKernel(UnaryKernelKey key)
Parameters
key UnaryKernelKey
Returns
- StridedUnaryKernel
GetTakeKernel()
IL-emitted take kernel (singleton: the same kernel handles any ndim, any elemBytes, any innerSize, and both axis=None and axis=k). Returns null only when Enabled is false.
public static TakeKernel GetTakeKernel()
Returns
- TakeKernel
GetTraceAccumTypeCode(NPTypeCode)
Maps src NPTypeCode → (result NPTypeCode, supported). Convenience for callers that already have the type-code.
public static (NPTypeCode, bool) GetTraceAccumTypeCode(NPTypeCode src)
Parameters
src NPTypeCode
Returns
- (NPTypeCode, bool)
GetTraceKernel(Type)
IL-emitted kernel, one singleton per srcType. Returns null when the dtype has no kernel. The current implementation has no unsupported triples, so the null return exists only for graceful future expansion.
public static TraceKernel GetTraceKernel(Type srcType)
Parameters
srcType Type
Returns
- TraceKernel
GetTypedElementReductionKernel<TResult>(ElementReductionKernelKey)
Get or generate a typed element-wise reduction kernel. Returns a delegate that reduces all elements to a single value of type TResult.
public static TypedElementReductionKernel<TResult> GetTypedElementReductionKernel<TResult>(ElementReductionKernelKey key) where TResult : unmanaged
Parameters
key ElementReductionKernelKey
Returns
- TypedElementReductionKernel<TResult>
Type Parameters
TResult
GetUnaryKernel(UnaryKernelKey)
Get or generate a unary kernel for the specified key.
public static UnaryKernel GetUnaryKernel(UnaryKernelKey key)
Parameters
key UnaryKernelKey
Returns
- UnaryKernel
GetUnaryScalarDelegate(UnaryScalarKernelKey)
Get or generate an IL-based unary scalar delegate. Returns a Func<TInput, TOutput> delegate.
public static Delegate GetUnaryScalarDelegate(UnaryScalarKernelKey key)
Parameters
key UnaryScalarKernelKey
Returns
- Delegate
GetUnravelIndexKernel()
IL-emitted unravel kernel (singleton: the same kernel handles any ndim and both C and F order via the runtime idxStart / idxStep args). Returns null only when Enabled is false.
public static UnravelIndexKernel GetUnravelIndexKernel()
Returns
- UnravelIndexKernel
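One way a single routine can serve both C and F order is to walk the dimensions with a start index and a step. The sketch below shows that idea for one flat index. It is an assumption about how idxStart / idxStep might be used, not the emitted kernel, and `UnravelIndexSketch` is a hypothetical name.

```csharp
// Hypothetical scalar reference for unraveling one flat index.
public static class UnravelIndexSketch
{
    public static long[] Unravel(long flat, long[] shape, bool cOrder = true)
    {
        int ndim = shape.Length;
        // Assumed convention: C order peels dimensions from the last to the first,
        // F order from the first to the last.
        int idxStart = cOrder ? ndim - 1 : 0;
        int idxStep = cOrder ? -1 : 1;

        var coords = new long[ndim];
        for (int n = 0, d = idxStart; n < ndim; n++, d += idxStep)
        {
            coords[d] = flat % shape[d];
            flat /= shape[d];
        }
        return coords;
    }
}
```

For shape (2, 3) and flat index 4, this yields (1, 1) in C order and (0, 2) in F order, matching NumPy's `unravel_index`.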
GetWeightedSumIterKernel(WeightedSumKernelKey)
Returns the cached IL-emitted weighted-sum kernel for the given dtype, or null if the dtype isn't supported (Bool/Char/Half/Complex/Decimal). The kernel signature matches NDInnerLoopFunc; pass it to NDIter.ForEach over the 4-operand [a, w, num_out, scl_out] iterator.
public static NDInnerLoopFunc? GetWeightedSumIterKernel(DirectILKernelGenerator.WeightedSumKernelKey key)
Parameters
key DirectILKernelGenerator.WeightedSumKernelKey
Returns
- NDInnerLoopFunc
GetWhereKernel<T>()
Get or generate an IL-based where kernel for the specified type. Returns null if IL generation is disabled or fails.
public static WhereKernel<T>? GetWhereKernel<T>() where T : unmanaged
Returns
- WhereKernel<T>
Type Parameters
T
GetWhereScalarXKernel<T>()
public static WhereScalarXKernel<T> GetWhereScalarXKernel<T>() where T : unmanaged
Returns
- WhereScalarXKernel<T>
Type Parameters
T
GetWhereScalarXYKernel<T>()
public static WhereScalarXYKernel<T> GetWhereScalarXYKernel<T>() where T : unmanaged
Returns
- WhereScalarXYKernel<T>
Type Parameters
T
GetWhereScalarYKernel<T>()
public static WhereScalarYKernel<T> GetWhereScalarYKernel<T>() where T : unmanaged
Returns
- WhereScalarYKernel<T>
Type Parameters
T
ModfHelper(double*, double*, long)
SIMD-optimized Modf operation for contiguous double arrays. Overwrites data with the fractional parts in place and writes the integral parts to integral. Handles special values (NaN, Inf) according to C standard modf.
public static void ModfHelper(double* data, double* integral, long size)
Parameters
data double*: Input array (contains the fractional parts afterwards)
integral double*: Output array for integral parts
size long: Number of elements
ModfHelper(float*, float*, long)
SIMD-optimized Modf operation for contiguous float arrays. Overwrites data with the fractional parts in place and writes the integral parts to integral. Handles special values (NaN, Inf) according to C standard modf.
public static void ModfHelper(float* data, float* integral, long size)
Parameters
data float*: Input array (contains the fractional parts afterwards)
integral float*: Output array for integral parts
size long: Number of elements
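The in-place contract and the special-value rules of C `modf` can be made concrete with a scalar reference. The sketch below is not the SIMD helper itself; it uses spans instead of raw pointers, and `ModfReferenceSketch` is a hypothetical name.

```csharp
using System;

// Hypothetical scalar reference for the documented Modf behaviour.
public static class ModfReferenceSketch
{
    public static void Modf(Span<double> data, Span<double> integral)
    {
        for (int i = 0; i < data.Length; i++)
        {
            double x = data[i];
            if (double.IsNaN(x))
            {
                // NaN in gives NaN in both outputs; data[i] is already NaN.
                integral[i] = x;
                continue;
            }
            if (double.IsInfinity(x))
            {
                // The integral part keeps the infinity; the fraction is a signed zero.
                integral[i] = x;
                data[i] = Math.CopySign(0.0, x);
                continue;
            }
            double ip = Math.Truncate(x);
            integral[i] = ip;
            // CopySign keeps the sign of x, so -2.0 yields a fraction of -0.0.
            data[i] = Math.CopySign(x - ip, x);
        }
    }
}
```

Both outputs carry the sign of the input, which is why truncation (rounding toward zero) is used rather than floor.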
Quantile(NPTypeCode, NPTypeCode, QuantileMethod, void*, void*, long, int, int*, int, double*, int, void*, long, bool, int*)
Run the cached quantile kernel for the given dtype triple. First call for a tuple emits and caches the DynamicMethod; later calls jump straight into the specialized native code.
public static void Quantile(NPTypeCode srcType, NPTypeCode outType, QuantileMethod method, void* srcBase, void* scratchBase, long outer, int n, int* kSorted, int nKs, double* q, int nQs, void* dstBase, long dstOuterStride, bool ignoreNaN = false, int* rowKScratch = null)
Parameters
srcType NPTypeCode
outType NPTypeCode
method QuantileMethod
srcBase void*
scratchBase void*
outer long
n int
kSorted int*
nKs int
q double*
nQs int
dstBase void*
dstOuterStride long
ignoreNaN bool
rowKScratch int*
TryGetAxisReductionKernel(AxisReductionKernelKey)
Try to get an axis reduction kernel. Supports all reduction operations and all types including type promotion. Uses SIMD for capable types, scalar loop for others.
public static AxisReductionKernel? TryGetAxisReductionKernel(AxisReductionKernelKey key)
Parameters
key AxisReductionKernelKey
Returns
- AxisReductionKernel
TryGetBooleanAxisReductionKernel(AxisReductionKernelKey)
Try to get a boolean axis reduction kernel (All / Any). Returns null for non-SIMD-capable dtypes (Half, Complex, Decimal, Char) so the caller can fall back to the NDAxisIter scalar path.
public static AxisReductionKernel? TryGetBooleanAxisReductionKernel(AxisReductionKernelKey key)
Parameters
key AxisReductionKernelKey
Returns
- AxisReductionKernel
TryGetCastKernel(NPTypeCode, NPTypeCode)
Get or generate a contiguous cast kernel for the given pair.
Returns null for unsupported pairs (Boolean/Char/Half/Complex/Decimal involved).
public static DirectILKernelGenerator.CastKernel TryGetCastKernel(NPTypeCode srcType, NPTypeCode dstType)
Parameters
srcType NPTypeCode
dstType NPTypeCode
Returns
- DirectILKernelGenerator.CastKernel
TryGetCopyKernel(CopyKernelKey)
public static CopyKernel? TryGetCopyKernel(CopyKernelKey key)
Parameters
key CopyKernelKey
Returns
- CopyKernel
TryGetCumulativeAxisKernel(CumulativeAxisKernelKey)
Try to get or generate a cumulative axis kernel.
public static CumulativeAxisKernel? TryGetCumulativeAxisKernel(CumulativeAxisKernelKey key)
Parameters
key CumulativeAxisKernelKey
Returns
- CumulativeAxisKernel
TryGetCumulativeKernel(CumulativeKernelKey)
Try to get or generate a cumulative kernel.
public static CumulativeKernel? TryGetCumulativeKernel(CumulativeKernelKey key)
Parameters
key CumulativeKernelKey
Returns
- CumulativeKernel
TryGetInnerCastKernel(NPTypeCode, NPTypeCode)
Get or emit the scalar inner-loop cast kernel for the pair. Non-null for every dtype pair (all 225 Converts.To{Dst}({Src}) methods exist); returns null only if IL generation is disabled or a method unexpectedly fails to resolve.
public static DirectILKernelGenerator.InnerCastLoop TryGetInnerCastKernel(NPTypeCode srcType, NPTypeCode dstType)
Parameters
srcType NPTypeCode
dstType NPTypeCode
Returns
- DirectILKernelGenerator.InnerCastLoop
TryGetMaskedCastKernel(NPTypeCode, NPTypeCode)
Get or generate a masked-cast kernel for the given (src, dst) pair.
Returns null for unsupported pairs (Boolean/Char/Half/Complex/Decimal involved).
public static DirectILKernelGenerator.MaskedCastKernel TryGetMaskedCastKernel(NPTypeCode srcType, NPTypeCode dstType)
Parameters
srcType NPTypeCode
dstType NPTypeCode
Returns
- DirectILKernelGenerator.MaskedCastKernel
TryGetNanAxisReductionKernel(AxisReductionKernelKey)
Try to get a NaN-aware axis reduction kernel. SIMD kernels exist only for float/double; Half and Complex route to scalar fallback paths (Default.Reduction.Nan.cs ExecuteNanAxisReductionScalar / np.nanmean.cs / np.nanvar.cs / np.nanstd.cs) which handle them directly.
public static AxisReductionKernel? TryGetNanAxisReductionKernel(AxisReductionKernelKey key)
Parameters
key AxisReductionKernelKey
Returns
- AxisReductionKernel
TryGetStridedCastKernel(NPTypeCode, NPTypeCode)
Get or generate a strided/broadcast cast kernel for the given pair.
Returns null for unsupported pairs.
public static DirectILKernelGenerator.StridedCastKernel TryGetStridedCastKernel(NPTypeCode srcType, NPTypeCode dstType)
Parameters
srcType NPTypeCode
dstType NPTypeCode
Returns
- DirectILKernelGenerator.StridedCastKernel
TryGetTypedElementReductionKernel<TResult>(ElementReductionKernelKey)
Try to get or generate an element reduction kernel.
public static TypedElementReductionKernel<TResult>? TryGetTypedElementReductionKernel<TResult>(ElementReductionKernelKey key) where TResult : unmanaged
Parameters
key ElementReductionKernelKey
Returns
- TypedElementReductionKernel<TResult>
Type Parameters
TResult
WhereExecute<T>(bool*, T*, T*, T*, long)
Execute where operation using IL-generated kernel or fallback to static helper.
public static void WhereExecute<T>(bool* cond, T* x, T* y, T* result, long count) where T : unmanaged
Parameters
cond bool*
x T*
y T*
result T*
count long
Type Parameters
T
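Whichever path WhereExecute takes, the element-wise result should match a simple scalar selection. The sketch below shows that reference behaviour using arrays instead of raw pointers; `WhereReferenceSketch` is a hypothetical name and this is not the library's fallback helper.

```csharp
using System;

// Hypothetical scalar reference for the where operation.
public static class WhereReferenceSketch
{
    // result[i] = cond[i] ? x[i] : y[i] for the first count elements.
    public static void Where<T>(bool[] cond, T[] x, T[] y, T[] result, long count)
        where T : unmanaged
    {
        for (long i = 0; i < count; i++)
            result[i] = cond[i] ? x[i] : y[i];
    }
}
```

A reference like this is also convenient for checking an IL-generated or SIMD kernel against known-good output in tests.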