Class SimdDot
SIMD fused multiply-accumulate dot product for contiguous float / double vectors.
Computes sum(a[i] * b[i]) in a single pass, with no temporary product array.
By contrast, left * right followed by ReduceAdd materializes an n-element
temporary and walks the data twice.
Four independent Vector256 accumulators give the out-of-order core enough
instruction-level parallelism to hide FMA latency; a scalar tail handles the
elements left over after the last full unrolled block. The accumulation type
matches the element type (double for double inputs, float for float inputs), so
the result dtype mirrors NumPy's np.dot.
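The accumulator layout described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SimdDot source: the class name DotSketch is hypothetical, the x86 Fma/Avx intrinsics are one plausible way to express the fused multiply-add, and the project is assumed to target .NET 7 or later with AllowUnsafeBlocks enabled.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical sketch of a four-accumulator fused dot product.
// Not the actual SimdDot implementation.
public static unsafe class DotSketch
{
    public static double DotDouble(double* a, double* b, long n)
    {
        long i = 0;
        double sum = 0.0;

        if (Fma.IsSupported)
        {
            const int W = 4; // doubles per Vector256<double>
            var acc0 = Vector256<double>.Zero;
            var acc1 = Vector256<double>.Zero;
            var acc2 = Vector256<double>.Zero;
            var acc3 = Vector256<double>.Zero;

            // Each iteration issues four FMAs with no dependency on each other,
            // so the core can overlap their latencies.
            for (; i + 4 * W <= n; i += 4 * W)
            {
                acc0 = Fma.MultiplyAdd(Avx.LoadVector256(a + i),         Avx.LoadVector256(b + i),         acc0);
                acc1 = Fma.MultiplyAdd(Avx.LoadVector256(a + i + W),     Avx.LoadVector256(b + i + W),     acc1);
                acc2 = Fma.MultiplyAdd(Avx.LoadVector256(a + i + 2 * W), Avx.LoadVector256(b + i + 2 * W), acc2);
                acc3 = Fma.MultiplyAdd(Avx.LoadVector256(a + i + 3 * W), Avx.LoadVector256(b + i + 3 * W), acc3);
            }

            // Combine the four accumulators, then reduce the lanes to a scalar.
            var acc = Avx.Add(Avx.Add(acc0, acc1), Avx.Add(acc2, acc3));
            sum = Vector256.Sum(acc);
        }

        // Scalar tail: the elements that did not fill a full unrolled block
        // (or all of them when FMA is unavailable).
        for (; i < n; i++)
            sum += a[i] * b[i];

        return sum;
    }
}

public static class Program
{
    public static unsafe void Main()
    {
        var x = new double[19];
        var y = new double[19];
        for (int k = 0; k < x.Length; k++) { x[k] = k + 1; y[k] = 2.0; }

        // Pin the managed arrays so the raw pointers stay valid during the call.
        fixed (double* px = x, py = y)
            Console.WriteLine(DotSketch.DotDouble(px, py, x.Length)); // 380
    }
}
```

With n = 19 the unrolled loop consumes 16 elements in one iteration and the scalar tail handles the last 3. Because the vector path sums in a different order than a plain scalar loop, results on general floating-point data may differ from a naive loop in the last few bits.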
Callers route only contiguous (stride == 1), same-type operands here. Strided views
take a scalar strided loop, and non-float dtypes take the INumber<T> path; both
live in Default.Dot.Fused.cs.
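As a rough illustration of that routing rule, the sketch below picks the contiguous kernel only when both strides are 1 and otherwise falls back to a scalar strided loop. Every name here (DotRouting, ContiguousDot, StridedDot) is hypothetical, strides are assumed to be measured in elements, and ContiguousDot is a scalar stand-in for the fused kernel; the real dispatch in Default.Dot.Fused.cs may look quite different.

```csharp
using System;

// Hypothetical dispatcher; names are illustrative and not taken from the library.
public static unsafe class DotRouting
{
    // Stand-in for the contiguous fused kernel (the role SimdDot.DotDouble plays).
    static double ContiguousDot(double* a, double* b, long n)
    {
        double sum = 0.0;
        for (long i = 0; i < n; i++) sum += a[i] * b[i];
        return sum;
    }

    // Scalar loop for strided views. Strides are assumed to be in elements, not bytes.
    static double StridedDot(double* a, long strideA, double* b, long strideB, long n)
    {
        double sum = 0.0;
        for (long i = 0; i < n; i++) sum += a[i * strideA] * b[i * strideB];
        return sum;
    }

    // Route to the contiguous kernel only when both operands have stride 1.
    public static double Dot(double* a, long strideA, double* b, long strideB, long n)
        => (strideA == 1 && strideB == 1)
            ? ContiguousDot(a, b, n)
            : StridedDot(a, strideA, b, strideB, n);
}

public static class Program
{
    public static unsafe void Main()
    {
        double[] x = { 1, 2, 3, 4, 5, 6 };
        double[] y = { 10, 20, 30 };

        fixed (double* px = x, py = y)
        {
            // Every second element of x (1, 3, 5) against contiguous y.
            Console.WriteLine(DotRouting.Dot(px, 2, py, 1, 3)); // 220
            // First three elements of x against y, both contiguous.
            Console.WriteLine(DotRouting.Dot(px, 1, py, 1, 3)); // 140
        }
    }
}
```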
public static class SimdDot
Inheritance
object
SimdDot
Methods
DotDouble(double*, double*, long)
Fused dot of two contiguous double vectors of length n.
public static double DotDouble(double* a, double* b, long n)
Parameters
a double*
b double*
n long
Returns
double
DotFloat(float*, float*, long)
Fused dot of two contiguous float vectors of length n.
public static float DotFloat(float* a, float* b, long n)
Parameters
a float*
b float*
n long
Returns
float