BatchedExecutor
Namespace: LLama.Batched
A batched executor that can infer multiple separate "conversations" simultaneously.
Inheritance Object → BatchedExecutor
Implements IDisposable
Attributes NullableContextAttribute, NullableAttribute
Properties
Context
The LLamaContext this executor is using
Property Value
Model
The LLamaWeights this executor is using
Property Value
BatchedTokenCount
The number of tokens in the batch that are waiting for BatchedExecutor.Infer(CancellationToken) to be called
Property Value
BatchQueueCount
The number of batches in the queue that are waiting for BatchedExecutor.Infer(CancellationToken) to be called
Property Value
IsDisposed
Check if this executor has been disposed.
Property Value
Constructors
BatchedExecutor(LLamaWeights, IContextParams)
Create a new batched executor
Parameters
model LLamaWeights
The model to use
contextParams IContextParams
Parameters to create a new context
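A minimal sketch of constructing an executor. It assumes the LLamaSharp package is referenced and that `ModelParams` (from `LLama.Common`) serves as the `IContextParams` implementation; the model path is a placeholder, and loading APIs may differ between library versions.

```csharp
using LLama;
using LLama.Batched;
using LLama.Common;

// Placeholder path (assumption): point this at a real model file.
var parameters = new ModelParams("path/to/model.gguf");

// Load the weights once. The executor creates its own context from the
// parameters passed as the second argument.
using var model = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(model, parameters);
```

Both the weights and the executor are disposable, so `using` declarations release them when they go out of scope.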
Methods
Create()
Start a new Conversation
Returns
Load(String)
Load a conversation that was previously saved to a file. Once loaded, the conversation will need to be prompted.
Parameters
filepath String
Returns
Exceptions
Load(State)
Load a conversation that was previously saved into memory. Once loaded, the conversation will need to be prompted.
Parameters
state State
Returns
Exceptions
Infer(CancellationToken)
Run inference for all conversations in the batch which have pending tokens.
If the result is NoKvSlot, there is not enough memory for inference; try disposing some conversation threads and running inference again.
Parameters
cancellation CancellationToken
Returns
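The flow described above might look like the following sketch. It assumes that `ModelParams` is used for the context parameters, that `Conversation.Prompt` accepts the tokens produced by `Context.Tokenize`, and that the result type is `DecodeResult` from `LLama.Native`; these details vary between LLamaSharp versions, so treat the names as assumptions to check against the version in use.

```csharp
using System;
using System.Threading;
using LLama;
using LLama.Batched;
using LLama.Common;
using LLama.Native;

// Placeholder path (assumption): point this at a real model file.
var parameters = new ModelParams("path/to/model.gguf");
using var model = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(model, parameters);

// Start a conversation and queue some tokens for it.
// Assumption: this Prompt overload takes the tokenized text directly.
using var conversation = executor.Create();
conversation.Prompt(executor.Context.Tokenize("Hello"));

// Run inference for every conversation that has pending tokens.
var result = await executor.Infer(CancellationToken.None);

if (result == DecodeResult.NoKvSlot)
{
    // Not enough memory for this batch: dispose a conversation that is
    // no longer needed, then call Infer again.
    Console.WriteLine("No KV slot available; free a conversation and retry.");
}
```

Checking the result after every call matters because a NoKvSlot result leaves the pending tokens unprocessed until memory is freed and inference is run again.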
Dispose()