Namespace NumSharp.Backends.Unmanaged.Pooling
Classes
- SizeBucketedBufferPool
Thread-safe pool of recently-freed unmanaged buffers, bucketed by exact byte size. Acts as a tcache-like front for Alloc(nuint) / Free(void*): a successful Take is just a pop from a per-size ConcurrentStack<T>; a failed Take falls through to NativeMemory.Alloc.
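The Take/Return flow described above can be sketched roughly as follows. This is a minimal illustration, not the NumSharp implementation: the class name TinyBufferPool is hypothetical, pointers are surfaced as nint rather than void* so callers need no unsafe context, and the window and per-bucket caps are left out. It assumes .NET 6 or later (for NativeMemory) and a project with AllowUnsafeBlocks enabled.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;

// Hypothetical sketch; not the NumSharp implementation.
public static class TinyBufferPool
{
    // One lock-free stack of free pointers per exact byte size.
    // Pointers are stored as nint because pointer types cannot be generic arguments.
    private static readonly ConcurrentDictionary<nuint, ConcurrentStack<nint>> Buckets = new();

    // Hit: pop a previously returned buffer. Miss: fall through to the native allocator.
    public static unsafe nint Take(nuint bytes)
    {
        if (Buckets.TryGetValue(bytes, out var stack) && stack.TryPop(out nint warm))
            return warm;
        return (nint)NativeMemory.Alloc(bytes);
    }

    // Push the buffer onto the stack for its exact size (no cap in this sketch).
    public static void Return(nint ptr, nuint bytes)
    {
        Buckets.GetOrAdd(bytes, _ => new ConcurrentStack<nint>()).Push(ptr);
    }
}
```

In this sketch, a Take that follows a Return of the same byte count gets the same pointer back, while a Take with an empty bucket pays for a fresh allocation.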
WHY THIS EXISTS
Profiling NumSharp's binary-op pipeline shows that ~500 µs of every 1024×1024 float32 a + b is spent on first-touch overhead of the fresh output buffer: page-faulting each cache line on first write, plus the kernel-mode cost of Alloc(nuint) reaching out to the OS for a fresh chunk. NumPy hides the same cost via glibc tcache reuse: a buffer freed by the previous op is handed back warm to the next call. This pool replicates that behaviour at the NumSharp layer.
SIZING POLICY
• The window is MinPoolableBytes (1 B) to MaxPoolableBytes (64 MiB). Wave 2.4 opened both ends: the 1000-element float32 result (4000 B) missed the old 4 KiB floor by 96 bytes, and every 4M-element output (16–32 MiB) missed the old 1 MiB cap, paying ~2× in demand-zero page faults per call (in-place toggle-verified: P1 contig add 4M 3.37→1.74 ms).
• Above the cap: no pooling. Huge buffers are rare and the memory cost of keeping them around dwarfs the alloc-cost savings.
• Per-bucket cap of MaxBuffersPerBucket entries (MaxBuffersPerLargeBucket at ≥ 1 MiB) to bound peak resident memory.
• Bucket key is the EXACT byte count requested (no rounding). Same-size repeated allocs are the dominant pattern in element-wise ops; rounding to power-of-2 would waste memory and break exact-fit reuse for typical workloads (e.g. 4 MiB float32 1K×1K).
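As a rough illustration of the sizing rules above, the window check and the per-bucket cap selection might look like the sketch below. The window bounds (1 B, 64 MiB) and the 1 MiB large-bucket threshold come from the text; the class name PoolSizingSketch, the LargeBucketThreshold constant, and the two cap values are placeholders, since the actual values of MaxBuffersPerBucket and MaxBuffersPerLargeBucket are not stated here.

```csharp
using System;

// Hypothetical sketch of the sizing policy; the cap values are placeholders.
public static class PoolSizingSketch
{
    public const ulong MinPoolableBytes = 1;                     // from the text
    public const ulong MaxPoolableBytes = 64UL * 1024 * 1024;    // 64 MiB, from the text
    public const ulong LargeBucketThreshold = 1UL * 1024 * 1024; // 1 MiB, from the text

    public const int MaxBuffersPerBucket = 8;      // placeholder value, not from the source
    public const int MaxBuffersPerLargeBucket = 2; // placeholder value, not from the source

    // A request is poolable only if its exact byte count falls inside the window.
    public static bool IsPoolable(ulong bytes) =>
        bytes >= MinPoolableBytes && bytes <= MaxPoolableBytes;

    // Buckets at or above 1 MiB use the large-bucket cap; smaller ones use the default cap.
    public static int BucketCap(ulong bytes) =>
        bytes >= LargeBucketThreshold ? MaxBuffersPerLargeBucket : MaxBuffersPerBucket;

    // The bucket key is the exact byte count: element count times element size, no rounding.
    public static ulong BucketKey(ulong elements, ulong elementSize) => elements * elementSize;
}
```

With these rules, a 1000-element float32 result keys to 4000 bytes (96 bytes under the old 4 KiB floor of 4096) and is poolable, a 1K×1K float32 result keys to exactly 4 MiB, and a request one byte over 64 MiB bypasses the pool.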
CORRECTNESS
• Stored buffers are NOT zero-filled. Callers that need zeroed memory must zero on Take (the same contract NativeMemory.Alloc has).
• Buffer ownership transfers fully on Take: the pool no longer references the pointer, so subsequent Return calls aren't at risk of double-pop.
• Return is best-effort: when the bucket is full or the size falls outside the pool's window the pointer is freed immediately via Free(void*).
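The contract above could be exercised along these lines. This is a sketch under assumptions rather than the pool's actual code: BoundedBucket, AllocZeroed, and Free are hypothetical names, the cap is enforced with an Interlocked counter (one possible way to bound a ConcurrentStack<T>), and the size-window check is omitted. It assumes .NET 6 or later and AllowUnsafeBlocks.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical single-size bucket with a best-effort Return.
public sealed unsafe class BoundedBucket
{
    private readonly ConcurrentStack<nint> _free = new();
    private readonly int _cap;
    private int _count; // slots reserved; never lower than the number of stored buffers

    public BoundedBucket(int cap) => _cap = cap;

    // Popping hands ownership to the caller; the bucket keeps no reference to the pointer.
    public bool TryTake(out nint ptr)
    {
        if (!_free.TryPop(out ptr)) return false;
        Interlocked.Decrement(ref _count);
        return true;
    }

    // Returns true if the buffer was kept, false if the bucket was full and it was freed.
    public bool Return(nint ptr)
    {
        if (Interlocked.Increment(ref _count) > _cap)
        {
            Interlocked.Decrement(ref _count);
            NativeMemory.Free((void*)ptr);
            return false;
        }
        _free.Push(ptr);
        return true;
    }

    // Caller-side zeroing: freshly allocated or pooled memory has undefined contents.
    // The int cast is safe for sizes inside a 64 MiB window.
    public static nint AllocZeroed(nuint bytes)
    {
        void* p = NativeMemory.Alloc(bytes);
        new Span<byte>(p, checked((int)bytes)).Clear();
        return (nint)p;
    }

    public static void Free(nint ptr) => NativeMemory.Free((void*)ptr);
}
```

Reserving a slot before the push means the stack can never hold more than the cap, at the price of occasionally freeing a buffer that would have fit had a concurrent Take finished a moment earlier.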