LLamaKvCacheViewSafeHandle

Namespace: LLama.Native

A safe handle for a LLamaKvCacheView

public sealed class LLamaKvCacheViewSafeHandle : SafeLLamaHandleBase, System.IDisposable

Inheritance Object → CriticalFinalizerObject → SafeHandle → SafeLLamaHandleBase → LLamaKvCacheViewSafeHandle
Implements IDisposable
Attributes NullableContextAttribute, NullableAttribute

Fields

handle

protected IntPtr handle;

Properties

CellCount

Number of KV cache cells. This will be the same as the context size.

public int CellCount { get; }

Property Value

Int32

TokenCount

Get the total number of tokens in the KV cache.

For example, if there are two populated cells, the first with 1 sequence id and the second with 2 sequence ids, the total is 3 tokens.

public int TokenCount { get; }

Property Value

Int32
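
The counting rule above can be illustrated with a small stand-alone sketch. It does not call LLamaSharp at all: the `CountTokens` helper and the per-cell arrays are hypothetical stand-ins for cache contents, used only to show the arithmetic.

```csharp
using System;
using System.Linq;

public static class TokenCountExample
{
    // Hypothetical helper: each inner array holds the sequence ids stored in one cell.
    // Every (cell, sequence id) pair counts as one token; empty cells contribute nothing.
    public static int CountTokens(int[][] cells)
    {
        return cells.Sum(seqIds => seqIds.Length);
    }

    public static void Main()
    {
        int[][] cells =
        {
            new[] { 0 },        // first populated cell: 1 sequence id
            new[] { 0, 1 },     // second populated cell: 2 sequence ids
            Array.Empty<int>()  // an empty cell
        };

        Console.WriteLine(CountTokens(cells)); // prints 3
    }
}
```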

MaxSequenceCount

Maximum number of sequences visible for a cell. There may be more sequences than this in reality; this is simply the maximum number this view can see.

public int MaxSequenceCount { get; }

Property Value

Int32

UsedCellCount

Number of populated cache cells.

public int UsedCellCount { get; }

Property Value

Int32

MaxContiguous

Maximum contiguous empty slots in the cache.

public int MaxContiguous { get; }

Property Value

Int32

MaxContiguousIdx

Index of the start of the MaxContiguous slot range. Can be negative when the cache is full.

public int MaxContiguousIdx { get; }

Property Value

Int32
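
One way to read MaxContiguous and MaxContiguousIdx together is as the length and start index of the longest run of empty cells. The sketch below is a stand-alone illustration of that idea on made-up data; `LongestEmptyRun` is a hypothetical helper, and it is not taken from the library's implementation. It returns a start index of -1 when no cell is empty, mirroring the negative index described above.

```csharp
using System;

public static class MaxContiguousExample
{
    // Hypothetical helper: used[i] is true when cell i is populated.
    // Returns the length of the longest run of empty cells and the index where it starts,
    // or (0, -1) when every cell is populated.
    public static (int Length, int StartIndex) LongestEmptyRun(bool[] used)
    {
        int bestLength = 0, bestStart = -1;
        int runLength = 0, runStart = 0;

        for (int i = 0; i < used.Length; i++)
        {
            if (used[i])
            {
                runLength = 0; // a populated cell ends the current run
                continue;
            }

            if (runLength == 0) runStart = i; // a new run of empty cells begins here
            runLength++;

            if (runLength > bestLength)
            {
                bestLength = runLength;
                bestStart = runStart;
            }
        }

        return (bestLength, bestStart);
    }

    public static void Main()
    {
        var partlyUsed = new[] { true, false, false, true, false, false, false, true };
        Console.WriteLine(LongestEmptyRun(partlyUsed)); // prints (3, 4)

        var full = new[] { true, true, true };
        Console.WriteLine(LongestEmptyRun(full)); // prints (0, -1)
    }
}
```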

IsInvalid

public bool IsInvalid { get; }

Property Value

Boolean

IsClosed

public bool IsClosed { get; }

Property Value

Boolean

Methods

Allocate(SafeLLamaContextHandle, Int32)

Allocate a new KV cache view which can be used to inspect the KV cache.

public static LLamaKvCacheViewSafeHandle Allocate(SafeLLamaContextHandle ctx, int maxSequences)

Parameters

ctx SafeLLamaContextHandle

maxSequences Int32
The maximum number of sequences visible in this view per cell

Returns

LLamaKvCacheViewSafeHandle

ReleaseHandle()

protected bool ReleaseHandle()

Returns

Boolean

Update()

Read the current KV cache state into this view.

public void Update()
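
A minimal usage sketch, using only the members listed on this page: allocate a view, call Update() to snapshot the cache, then read the summary properties. The `KvCacheViewSummary` class, the `Print` method, and the choice of 4 for maxSequences are illustrative assumptions; `ctx` is assumed to be a valid context handle created elsewhere.

```csharp
using System;
using LLama.Native;

public static class KvCacheViewSummary
{
    // Hypothetical helper. 'ctx' is assumed to be a valid, open context handle.
    public static void Print(SafeLLamaContextHandle ctx)
    {
        // The view is a SafeHandle, so dispose it when finished.
        using var view = LLamaKvCacheViewSafeHandle.Allocate(ctx, maxSequences: 4);

        // Read the current cache state into the view before querying it.
        view.Update();

        Console.WriteLine($"Cells used:       {view.UsedCellCount} of {view.CellCount}");
        Console.WriteLine($"Tokens:           {view.TokenCount}");
        Console.WriteLine($"Max sequences:    {view.MaxSequenceCount}");
        Console.WriteLine($"Largest free run: {view.MaxContiguous} cells");

        // The index can be negative when the cache is full.
        if (view.MaxContiguousIdx >= 0)
            Console.WriteLine($"Free run starts at cell {view.MaxContiguousIdx}");
    }
}
```

Because the view only reflects the cache at the time of the last Update() call, call Update() again before re-reading the properties if the context has processed more tokens in the meantime.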

GetCell(Int32)

Get the cell at the given index.

public LLamaPos GetCell(int index)

Parameters

index Int32
The index of the cell [0, CellCount)

Returns

LLamaPos
Data about the cell at the given index

Exceptions

ArgumentOutOfRangeException
Thrown if index is out of range (0 <= index < CellCount)

GetCellSequences(Int32)

Get all of the sequences assigned to the cell at the given index. This will contain LLamaKvCacheViewSafeHandle.MaxSequenceCount entries even if the cell actually has more than that many sequences; allocate a new view with a larger maxSequences parameter if necessary. Invalid sequences will be negative values.

public Span<LLamaSeqId> GetCellSequences(int index)

Parameters

index Int32
The index of the cell [0, CellCount)

Returns

Span<LLamaSeqId>
A span containing the sequences assigned to this cell

Exceptions

ArgumentOutOfRangeException
Thrown if index is out of range (0 <= index < CellCount)
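
The sketch below walks every cell with GetCell and GetCellSequences and skips the negative (invalid) sequence entries. `KvCacheCellDump` and `Dump` are hypothetical names, and the explicit `(int)` conversion from LLamaSeqId is an assumption about that type which is not documented on this page; adjust it if the type exposes its value differently.

```csharp
using System;
using System.Collections.Generic;
using LLama.Native;

public static class KvCacheCellDump
{
    // Hypothetical helper. 'view' is assumed to have been created with Allocate.
    public static void Dump(LLamaKvCacheViewSafeHandle view)
    {
        // Refresh the snapshot so the cells reflect the current cache state.
        view.Update();

        // Valid indices are 0 <= i < CellCount; anything else throws ArgumentOutOfRangeException.
        for (var i = 0; i < view.CellCount; i++)
        {
            var pos = view.GetCell(i);

            var ids = new List<int>();
            foreach (var seq in view.GetCellSequences(i))
            {
                // Assumption: LLamaSeqId supports an explicit conversion to int.
                var id = (int)seq;

                // Negative values mark invalid (unused) entries.
                if (id >= 0)
                    ids.Add(id);
            }

            // Cells with no valid sequence ids are not printed.
            if (ids.Count == 0)
                continue;

            var joined = string.Join(", ", ids);
            Console.WriteLine($"cell {i}: {pos}, sequences [{joined}]");
        }
    }
}
```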

