BatchedExecutor
Namespace: LLama.Batched
A batched executor that can infer multiple separate "conversations" simultaneously.
Inheritance Object → BatchedExecutor
Implements IDisposable
Properties
Context
The LLamaContext this executor is using
Property Value
LLamaContext
Model
The LLamaWeights this executor is using
Property Value
LLamaWeights
BatchedTokenCount
Get the number of tokens in the batch, waiting for BatchedExecutor.Infer(CancellationToken) to be called
Property Value
Int32
IsDisposed
Check if this executor has been disposed.
Property Value
Boolean
Constructors
BatchedExecutor(LLamaWeights, IContextParams)
Create a new batched executor
Parameters
model
LLamaWeights
The model to use
contextParams
IContextParams
Parameters to create a new context
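A minimal construction sketch, not taken from this page: it assumes LLamaWeights.LoadFromFile and ModelParams from the wider LLamaSharp API (ModelParams supplies the IContextParams), and the model path is a placeholder.

```csharp
using LLama;
using LLama.Batched;
using LLama.Common;

// Load the weights once; the executor creates its own LLamaContext from the given parameters.
var parameters = new ModelParams("path/to/model.gguf");     // placeholder path
using var model = LLamaWeights.LoadFromFile(parameters);
using var executor = new BatchedExecutor(model, parameters);
```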
Methods
Prompt(String)
Caution: Use BatchedExecutor.Create instead
Start a new Conversation with the given prompt
Parameters
prompt
String
Returns
Conversation
Create()
Start a new Conversation
Returns
Conversation
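A hedged sketch of starting and prompting a conversation, continuing from the construction example above; Conversation.Prompt and LLamaContext.Tokenize are assumptions about the wider LLamaSharp API rather than members documented on this page.

```csharp
// Start an empty conversation, then queue prompt tokens for the next Infer call.
using var conversation = executor.Create();
var tokens = executor.Context.Tokenize("What is the tallest mountain?");
conversation.Prompt(tokens);   // assumed overload accepting tokenized input
```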
Infer(CancellationToken)
Run inference for all conversations in the batch which have pending tokens. If the result is NoKvSlot then there is not enough memory for inference; try disposing some conversation threads and running inference again.
Parameters
cancellation
CancellationToken
Returns
Task&lt;DecodeResult&gt;
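A hedged sketch of reacting to NoKvSlot, continuing from the examples above; DecodeResult is assumed to be the enum carried by the returned task, and spareConversation stands in for whichever conversation the caller is willing to give up.

```csharp
// Run one batched inference pass over every conversation with pending tokens.
var result = await executor.Infer(CancellationToken.None);

if (result == DecodeResult.NoKvSlot)
{
    // Not enough KV cache space for the whole batch: free a conversation and retry.
    spareConversation.Dispose();   // hypothetical conversation chosen by the caller
    result = await executor.Infer(CancellationToken.None);
}
```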
Dispose()
GetNextSequenceId()