StatefulExecutorBase
Namespace: LLama
The base class for stateful LLama executors.
Inheritance Object → StatefulExecutorBase
Implements ILLamaExecutor
Attributes NullableContextAttribute, NullableAttribute
Fields
_logger
The logger used by this executor.
_pastTokensCount
The number of tokens already processed by the model.
_consumedTokensCount
The number of tokens consumed by the model during the current inference.
_n_session_consumed
The number of tokens from the loaded session that have been consumed so far.
_n_matching_session_tokens
The number of input tokens that match the tokens loaded from the session file.
_pathSession
The path of the session file.
_embeds
A container for the tokens waiting to be processed and those that have already been processed.
_embed_inps
A container for the input tokens.
_session_tokens
The tokens loaded from the session file.
_last_n_tokens
The last tokens generated by the model.
Properties
Context
The context used by the executor.
Property Value
LLamaContext
IsMultiModal
Whether the executor has multi-modal (LLava) weights loaded.
Property Value
Boolean
ClipModel
The multi-modal projection (LLava) weights, if loaded.
Property Value
LLavaWeights
Images
Property Value
Constructors
StatefulExecutorBase(LLamaContext, ILogger)
Parameters
context
LLamaContext
logger
ILogger
StatefulExecutorBase(LLamaContext, LLavaWeights, ILogger)
Parameters
context
LLamaContext
lLavaWeights
LLavaWeights
logger
ILogger
Methods
WithSessionFile(String)
This API has not yet been verified.
Parameters
filename
String
Returns
Exceptions
SaveSessionFile(String)
This API has not yet been verified.
Parameters
filename
String
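Although both session-file methods are marked as unverified, a tentative usage sketch looks like this. InteractiveExecutor is one concrete subclass, `context` is assumed to be an existing LLamaContext, and the file name is a placeholder:

```csharp
// Tentative sketch: attach a session file to a stateful executor so that
// evaluated tokens can be restored on the next run. `context` is assumed to
// be an existing LLamaContext; the path is illustrative.
var executor = new InteractiveExecutor(context)
    .WithSessionFile("chat-session.bin");     // load the session if the file exists

// ... run some inference through the executor ...

executor.SaveSessionFile("chat-session.bin"); // persist the evaluated tokens
```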
HandleRunOutOfContext(Int32)
When the model runs out of context, keep a number of tokens from the original prompt and recompute the logits in batches.
Parameters
tokensToKeep
Int32
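The method is a protected implementation detail, but the underlying idea resembles the context-shift strategy used by llama.cpp's examples: keep a prefix of the prompt, drop part of the older history, and re-evaluate the remainder. A rough conceptual sketch with illustrative names only, not the actual implementation:

```csharp
using System.Collections.Generic;

// Conceptual sketch only, not the actual implementation.
static void ShiftContextSketch(List<int> evaluatedTokens, int tokensToKeep)
{
    // Keep the first `tokensToKeep` tokens (typically the start of the prompt),
    // then drop roughly half of the remaining history to free up context space.
    int discard = (evaluatedTokens.Count - tokensToKeep) / 2;
    evaluatedTokens.RemoveRange(tokensToKeep, discard);

    // The tokens following the kept prefix must then have their logits
    // recomputed, which the executor does in batches.
}
```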
TryReuseMatchingPrefix()
Try to reuse the matching prefix from the session file.
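Conceptually, reusing the prefix means counting how many leading tokens of the new input match the tokens restored from the session, so that only the remainder has to be evaluated again. An illustrative sketch with hypothetical variable names:

```csharp
// Illustrative sketch; `inputTokens` and `sessionTokens` are hypothetical
// stand-ins for the executor's internal token lists.
int matching = 0;
while (matching < inputTokens.Count
       && matching < sessionTokens.Count
       && inputTokens[matching] == sessionTokens[matching])
{
    matching++;
}
// Tokens [0, matching) are reused from the session; evaluation resumes at `matching`.
```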
GetLoopCondition(InferStateArgs)
Decide whether to continue the loop.
Parameters
args
InferStateArgs
Returns
PreprocessInputs(String, InferStateArgs)
Preprocess the inputs before inference.
Parameters
text
String
args
InferStateArgs
Returns
PostProcess(IInferenceParams, InferStateArgs)
Perform post-processing after the inference.
Parameters
inferenceParams
IInferenceParams
args
InferStateArgs
Returns
Task<ValueTuple<Boolean, IReadOnlyList<String>>>
InferInternal(IInferenceParams, InferStateArgs)
The core inference logic.
Parameters
inferenceParams
IInferenceParams
args
InferStateArgs
Returns
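PreprocessInputs, GetLoopCondition, InferInternal and PostProcess are the hooks that concrete executors override, and InferAsync drives them. A simplified, hypothetical sketch of that control flow follows; it is not the actual source, and the text-decoding helper is assumed:

```csharp
// Hypothetical sketch of the template-method flow inside a subclass; the
// DecodeNewText helper is assumed, not part of the real API.
async IAsyncEnumerable<string> InferSketchAsync(string text, IInferenceParams inferenceParams)
{
    var args = new InferStateArgs();                 // per-call loop state
    await PreprocessInputs(text, args);              // tokenize and stage the prompt

    while (await GetLoopCondition(args))             // keep looping while there is work left
    {
        await InferInternal(inferenceParams, args);  // evaluate a batch and sample tokens

        yield return DecodeNewText();                // assumed helper: decode freshly sampled tokens

        var (breakGeneration, _) = await PostProcess(inferenceParams, args);
        if (breakGeneration)
            break;                                   // e.g. an antiprompt or EOS was hit
    }
}
```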
SaveState(String)
Save the current state to a file.
Parameters
filename
String
Returns
GetStateData()
Get the current state data.
Returns
ExecutorBaseState
LoadState(ExecutorBaseState)
Load the state from data.
Parameters
data
ExecutorBaseState
Returns
LoadState(String)
Load the state from a file.
Parameters
filename
String
Returns
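Together, SaveState, GetStateData and the two LoadState overloads allow an executor's progress to be persisted and restored. A small usage sketch, assuming an existing executor instance and Task-returning overloads; the file path is a placeholder:

```csharp
// Sketch: persist the executor's state to disk, then restore it later.
// `executor` is assumed to be an existing StatefulExecutorBase subclass
// (e.g. InteractiveExecutor); the path is illustrative.
await executor.SaveState("executor-state.bin");

// ... later, possibly after recreating the executor on the same context ...
await executor.LoadState("executor-state.bin");

// Alternatively, keep the state in memory instead of writing it to disk:
var state = executor.GetStateData();
await executor.LoadState(state);
```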
InferAsync(String, IInferenceParams, CancellationToken)
Execute the inference.
Parameters
text
String
The prompt. If null, generation will continue where it left off previously.
inferenceParams
IInferenceParams
cancellationToken
CancellationToken
Returns
IAsyncEnumerable<String>
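A typical call streams the generated text as it is produced. A minimal usage sketch with a concrete subclass; the model path, prompt and parameter values are placeholders:

```csharp
using LLama;
using LLama.Common;

// Minimal sketch: load a model, create an interactive executor and stream output.
// Model path, prompt text and parameter values are placeholders.
var parameters = new ModelParams("path/to/model.gguf") { ContextSize = 2048 };
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);
var executor = new InteractiveExecutor(context);

var inferenceParams = new InferenceParams
{
    MaxTokens = 128,
    AntiPrompts = new[] { "User:" }
};

await foreach (var piece in executor.InferAsync("User: Hello!\nAssistant:", inferenceParams))
{
    Console.Write(piece); // pieces of text are yielded as they are generated
}
```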
PrefillPromptAsync(String)
Asynchronously runs a prompt through the model to compute the KV cache, without generating any new tokens. This can reduce the latency of the first response if the user's first input does not arrive immediately.
Parameters
prompt
String
Prompt to process
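For instance, a fixed system prompt can be pushed through the model while waiting for the user's first message. A sketch, assuming an existing executor and inference parameters:

```csharp
// Sketch: warm the KV cache with a system prompt before the user sends anything.
// `executor` and `inferenceParams` are assumed to exist already.
const string systemPrompt = "You are a helpful assistant.";
await executor.PrefillPromptAsync(systemPrompt);

// The first real request now starts from an already-computed KV cache:
await foreach (var piece in executor.InferAsync("User: Hi!\nAssistant:", inferenceParams))
{
    Console.Write(piece);
}
```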