
Class SimdDot

Namespace
NumSharp.Backends.Kernels
Assembly
NumSharp.dll

SIMD fused multiply-accumulate dot product for contiguous float and double vectors. Computes sum(a[i] * b[i]) in a single pass with no temporary product array. By contrast, left * right followed by ReduceAdd materializes an n-element temporary and walks the data twice.
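The difference between the two approaches can be sketched in plain scalar C#. This is only an illustration of the memory-traffic argument, not the library's code; the class and method names (DotPassesSketch, TwoPass, Fused) are hypothetical.

```csharp
using System;

public static class DotPassesSketch
{
    // Two-pass form: allocates an n-element temporary for the products,
    // then walks that temporary a second time to reduce it.
    public static double TwoPass(double[] a, double[] b)
    {
        var tmp = new double[a.Length];
        for (int i = 0; i < a.Length; i++)
            tmp[i] = a[i] * b[i];

        double sum = 0.0;
        for (int i = 0; i < tmp.Length; i++)
            sum += tmp[i];
        return sum;
    }

    // Single-pass form: multiplies and accumulates in the same loop,
    // so no temporary array is allocated or re-read.
    public static double Fused(double[] a, double[] b)
    {
        double sum = 0.0;
        for (int i = 0; i < a.Length; i++)
            sum += a[i] * b[i];
        return sum;
    }
}
```

Both methods return the same value for small exact inputs; the single-pass form simply avoids the extra allocation and the second traversal.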

Four independent Vector256 accumulators give the out-of-order core enough instruction-level parallelism to hide FMA latency; a scalar tail loop handles the elements left over after the last full vector block. The accumulation type matches the element type (double accumulates in double, float in float), so the result dtype mirrors NumPy's np.dot.
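A minimal sketch of the four-accumulator pattern with a scalar tail follows. It is an illustration under stated assumptions, not the NumSharp source: the names DotSketch and Dot4Acc are hypothetical, it targets .NET 7 or later, it needs AllowUnsafeBlocks, and it uses the portable Vector256 operators (multiply, then add) rather than an explicit FMA intrinsic, so whether the operations are actually fused is left to the JIT and hardware.

```csharp
using System;
using System.Runtime.Intrinsics;

public static unsafe class DotSketch
{
    // Hypothetical illustration of the accumulator layout described above.
    public static double Dot4Acc(double* a, double* b, long n)
    {
        int w = Vector256<double>.Count;   // 4 doubles per 256-bit vector
        long step = 4L * w;                // 16 elements per unrolled iteration

        // Four accumulators with no dependency on each other, so the
        // multiply-adds of one iteration can overlap in the pipeline.
        var acc0 = Vector256<double>.Zero;
        var acc1 = Vector256<double>.Zero;
        var acc2 = Vector256<double>.Zero;
        var acc3 = Vector256<double>.Zero;

        long i = 0;
        for (; i + step <= n; i += step)
        {
            acc0 += Vector256.Load(a + i)         * Vector256.Load(b + i);
            acc1 += Vector256.Load(a + i + w)     * Vector256.Load(b + i + w);
            acc2 += Vector256.Load(a + i + 2 * w) * Vector256.Load(b + i + 2 * w);
            acc3 += Vector256.Load(a + i + 3 * w) * Vector256.Load(b + i + 3 * w);
        }

        // Combine the four accumulators, then sum the lanes horizontally.
        double sum = Vector256.Sum((acc0 + acc1) + (acc2 + acc3));

        // Scalar tail: the fewer-than-16 elements that did not fill a block.
        for (; i < n; i++)
            sum += a[i] * b[i];

        return sum;
    }
}
```

A caller would pin managed arrays with fixed before passing pointers, for example `fixed (double* pa = x, pb = y) { r = DotSketch.Dot4Acc(pa, pb, x.Length); }`. Because the accumulators sum in a different order than a plain left-to-right loop, results can differ from a scalar loop in the last bits for inputs that are not exactly representable.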

Callers route only contiguous (stride == 1), same-type operands here. Strided views take a scalar strided loop, and non-float dtypes take the INumber<T> path; both fallbacks live in Default.Dot.Fused.cs.
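The routing described above might look roughly like the sketch below. Everything here is hypothetical and written only to show the shape of the decision: the names DotDispatchSketch, DotContiguous, DotStrided and DotGeneric are not taken from Default.Dot.Fused.cs, strides are assumed to be measured in elements, and the contiguous kernel is a scalar stand-in for the SIMD one. It assumes .NET 7 or later and AllowUnsafeBlocks.

```csharp
using System;
using System.Numerics;

public static unsafe class DotDispatchSketch
{
    // Stand-in for the contiguous kernel (the real one would be SIMD).
    static double DotContiguous(double* a, double* b, long n)
    {
        double sum = 0.0;
        for (long i = 0; i < n; i++)
            sum += a[i] * b[i];
        return sum;
    }

    // Scalar loop for strided views; strides are in elements (assumption).
    static double DotStrided(double* a, long strideA, double* b, long strideB, long n)
    {
        double sum = 0.0;
        for (long i = 0; i < n; i++)
            sum += a[i * strideA] * b[i * strideB];
        return sum;
    }

    // Only operands with stride == 1 on both sides reach the contiguous kernel.
    public static double Dot(double* a, long strideA, double* b, long strideB, long n)
        => (strideA == 1 && strideB == 1)
            ? DotContiguous(a, b, n)
            : DotStrided(a, strideA, b, strideB, n);

    // Generic fallback for non-float element types via INumber<T>.
    public static T DotGeneric<T>(ReadOnlySpan<T> a, ReadOnlySpan<T> b)
        where T : INumber<T>
    {
        T sum = T.Zero;
        for (int i = 0; i < a.Length; i++)
            sum += a[i] * b[i];
        return sum;
    }
}
```

Keeping the stride check in the caller means the fast kernel never has to branch on layout inside its inner loop.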

public static class SimdDot
Inheritance
SimdDot
Inherited Members

Methods

DotDouble(double*, double*, long)

Computes the fused dot product sum(a[i] * b[i]) of two contiguous double vectors, each of length n.

public static double DotDouble(double* a, double* b, long n)

Parameters

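a double*
Pointer to the first element of the first contiguous vector.
b double*
Pointer to the first element of the second contiguous vector.
n long
Number of elements in each vector.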

Returns

double

DotFloat(float*, float*, long)

Computes the fused dot product sum(a[i] * b[i]) of two contiguous float vectors, each of length n.

public static float DotFloat(float* a, float* b, long n)

Parameters

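a float*
Pointer to the first element of the first contiguous vector.
b float*
Pointer to the first element of the second contiguous vector.
n long
Number of elements in each vector.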

Returns

float