BatchedExecutor
Namespace: LLama.Batched
A batched executor that can infer multiple separate "conversations" simultaneously.
Inheritance Object → BatchedExecutor
Implements IDisposable
Attributes NullableContextAttribute, NullableAttribute
Properties
Context
The LLamaContext this executor is using
Property Value
LLamaContext
Model
The LLamaWeights this executor is using
Property Value
LLamaWeights
BatchedTokenCount
Get the number of tokens in the batch, waiting for BatchedExecutor.Infer(CancellationToken) to be called
Property Value
Int32
BatchQueueCount
Number of batches in the queue, waiting for BatchedExecutor.Infer(CancellationToken) to be called
Property Value
Int32
IsDisposed
Check if this executor has been disposed.
Property Value
Boolean
Constructors
BatchedExecutor(LLamaWeights, IContextParams)
Create a new batched executor
Parameters
model
LLamaWeights
The model to use
contextParams
IContextParams
Parameters to create a new context
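As an illustration, a minimal construction sketch might look like the following. The model path and parameter values are placeholders, and using ModelParams (which implements IContextParams elsewhere in LLamaSharp) is an assumption, not something stated on this page.

```csharp
using LLama;
using LLama.Batched;
using LLama.Common;

// Placeholder path and settings; ModelParams is assumed to satisfy IContextParams.
var parameters = new ModelParams("path/to/model.gguf")
{
    ContextSize = 4096
};

// Load the weights once, then share them across the batched executor.
using var weights = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(weights, parameters);
```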
Methods
Create()
Start a new Conversation
Returns
Conversation
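A short sketch of starting a conversation, continuing from the construction example above. The prompt text is arbitrary, and passing tokenized input to Conversation.Prompt is an assumption about the Conversation API rather than something documented here.

```csharp
// Start a fresh conversation and queue a prompt; nothing is evaluated
// until the executor's Infer method is called.
var conversation = executor.Create();
conversation.Prompt(executor.Context.Tokenize("The quick brown fox"));
```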
Load(String)
Load a conversation that was previously saved to a file. Once loaded, the conversation will need to be prompted.
Parameters
filepath
String
Returns
Conversation
Exceptions
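A hedged sketch of resuming a conversation from disk. The file path is a placeholder, and it assumes the conversation was previously written out with a matching save call on the Conversation, which is not documented on this page.

```csharp
// Resume a previously saved conversation; it must be prompted again
// before the next Infer call produces output for it.
var resumed = executor.Load("conversation.bin");
resumed.Prompt(executor.Context.Tokenize("Continue from here"));
```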
Load(State)
Load a conversation that was previously saved into memory. Once loaded, the conversation will need to be prompted.
Parameters
state
State
Returns
Conversation
Exceptions
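The in-memory overload can be sketched the same way. The snapshot variable and the save call that produced it are assumptions; only the Load(State) call itself comes from this page.

```csharp
// Hypothetical: 'snapshot' is a State object captured earlier from a conversation.
var restored = executor.Load(snapshot);
restored.Prompt(executor.Context.Tokenize("Pick up where we left off"));
```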
Infer(CancellationToken)
Run inference for all conversations in the batch which have pending tokens.
If the result is NoKvSlot, there is not enough memory for inference; try disposing some conversations and running inference again.
Parameters
cancellation
CancellationToken
Returns
Task&lt;DecodeResult&gt;
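A sketch of the inference loop this describes, continuing from the examples above. The DecodeResult comparison and the idea of disposing a conversation to free KV-cache space follow the description on this page; the finishedConversation variable is hypothetical, and the sampling step is only hinted at in a comment.

```csharp
// Run queued batches until nothing is pending, backing off when the KV cache is full.
while (executor.BatchQueueCount > 0)
{
    var result = await executor.Infer(CancellationToken.None);

    if (result == DecodeResult.NoKvSlot)
    {
        // Not enough KV-cache memory: dispose a conversation that is no longer
        // needed (hypothetical 'finishedConversation') and try again.
        finishedConversation.Dispose();
        continue;
    }

    // Sample the next token for each conversation here and prompt it back in
    // (sampling API not shown on this page).
}
```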
Dispose()