Inference Parameters
Unlike with LLamaModel, when using an executor, InferenceParams is passed to the Infer method instead of the constructor. This is because executors only define how the model is run, so you can change the settings for each individual inference run.
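As a minimal sketch of this pattern: the InteractiveExecutor and ModelParams types, the model path, and the prompts below are assumptions not stated in this section, based on the LLamaSharp releases that expose LLamaModel. Check the names against the version you have installed.

```csharp
using System;
using System.Collections.Generic;
using LLama;
using LLama.Common;

// Assumption: placeholder path; point this at a real model file.
string modelPath = "<Your model path>";

// Assumption: an InteractiveExecutor wrapping a LLamaModel.
// Note that no sampling settings are given when the executor is created.
var executor = new InteractiveExecutor(new LLamaModel(new ModelParams(modelPath)));

// Settings for the first run only: a short answer that stops at "User:".
var shortAnswer = new InferenceParams
{
    MaxTokens = 64,
    AntiPrompts = new List<string> { "User:" }
};

foreach (var text in executor.Infer("User: What is a llama?\nAssistant:", shortAnswer))
{
    Console.Write(text);
}

// A later run on the same executor can use different settings.
var longAnswer = new InferenceParams
{
    MaxTokens = 256,
    Temperature = 0.6f,
    AntiPrompts = new List<string> { "User:" }
};

foreach (var text in executor.Infer("User: Tell me more.\nAssistant:", longAnswer))
{
    Console.Write(text);
}
```

Because the settings travel with each call, one executor can serve both short and long generations without being rebuilt.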
InferenceParams
Namespace: LLama.Common
public class InferenceParams
Inheritance Object → InferenceParams
Properties
TokensKeep
Number of tokens to keep from the initial prompt.
public int TokensKeep { get; set; }
Property Value
Int32
MaxTokens
Number of new tokens to predict (n_predict). Set to -1 to generate without a limit until the response completes.
public int MaxTokens { get; set; }
Property Value
Int32
LogitBias
Logit bias to apply to specific tokens, keyed by token ID.
public Dictionary<int, float> LogitBias { get; set; }
Property Value
Dictionary<Int32, Single>
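A short sketch of setting a bias follows. The token ID is an assumption: 2 is the end-of-sequence ID in the original LLaMA tokenizer, but other models may differ, so look it up in your model's vocabulary. The bias value is illustrative only.

```csharp
using System.Collections.Generic;
using LLama.Common;

// Assumption: hypothetical token ID; verify it against your model's vocabulary.
const int eosTokenId = 2;

var inferenceParams = new InferenceParams
{
    LogitBias = new Dictionary<int, float>
    {
        // The bias is added to the token's logit before sampling,
        // so a negative value makes the token less likely to be chosen.
        [eosTokenId] = -5.0f
    }
};
```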
AntiPrompts
Sequences where the model will stop generating further tokens.
public IEnumerable<string> AntiPrompts { get; set; }
Property Value
IEnumerable<String>
PathSession
Path to the file used for saving and loading the model evaluation state.
public string PathSession { get; set; }
Property Value
String
InputSuffix
String to append to user inputs.
public string InputSuffix { get; set; }
Property Value
String
InputPrefix
String to prepend to user inputs.
public string InputPrefix { get; set; }
Property Value
String
TopK
Top-k sampling limit. Set to 0 or lower to use the vocabulary size.
public int TopK { get; set; }
Property Value
Int32
TopP
Top-p (nucleus) sampling threshold. 1.0 = disabled.
public float TopP { get; set; }
Property Value
Single
TfsZ
Tail-free sampling parameter z. 1.0 = disabled.
public float TfsZ { get; set; }
Property Value
Single
TypicalP
Locally typical sampling parameter p. 1.0 = disabled.
public float TypicalP { get; set; }
Property Value
Single
Temperature
Sampling temperature. 1.0 = disabled (logits are left unscaled).
public float Temperature { get; set; }
Property Value
Single
RepeatPenalty
Penalty applied to repeated tokens. 1.0 = disabled.
public float RepeatPenalty { get; set; }
Property Value
Single
RepeatLastTokensCount
Number of most recent tokens to penalize (repeat_last_n). 0 = penalty disabled, -1 = context size.
public int RepeatLastTokensCount { get; set; }
Property Value
Int32
FrequencyPenalty
Frequency penalty coefficient. 0.0 = disabled.
public float FrequencyPenalty { get; set; }
Property Value
Single
PresencePenalty
Presence penalty coefficient. 0.0 = disabled.
public float PresencePenalty { get; set; }
Property Value
Single
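The sampling and penalty properties above are usually set together. The sketch below shows two illustrative configurations; the specific numbers in the first one are assumptions chosen for demonstration, not recommended defaults, while the second uses only the "disabled" values documented above.

```csharp
using LLama.Common;

// Illustrative values (assumptions): a more focused, less repetitive run.
var focused = new InferenceParams
{
    Temperature = 0.2f,          // below 1.0 sharpens the distribution
    TopK = 40,                   // sample only from the 40 most likely tokens
    TopP = 0.9f,                 // nucleus sampling threshold
    RepeatPenalty = 1.1f,        // above 1.0 discourages repeats
    RepeatLastTokensCount = 64   // look back over the last 64 tokens
};

// Every sampler and penalty set to its documented "disabled" value.
var unfiltered = new InferenceParams
{
    TopK = 0,                    // 0 or lower: use the whole vocabulary
    TopP = 1.0f,
    TfsZ = 1.0f,
    TypicalP = 1.0f,
    Temperature = 1.0f,
    RepeatPenalty = 1.0f,
    FrequencyPenalty = 0.0f,
    PresencePenalty = 0.0f
};
```

Either object can then be passed to the executor's Infer method for a given run.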
Mirostat
Mirostat sampling mode, using the algorithm described in the paper https://arxiv.org/abs/2007.14966. Mirostat operates on tokens rather than words. 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0.
public MiroStateType Mirostat { get; set; }
Property Value
MiroStateType
MirostatTau
Mirostat target entropy (tau).
public float MirostatTau { get; set; }
Property Value
Single
MirostatEta
Mirostat learning rate (eta).
public float MirostatEta { get; set; }
Property Value
Single
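A hedged sketch of enabling Mirostat 2.0 follows. The member names of MiroStateType are not listed in this section, so the example casts from the documented numeric value instead of naming a member. It assumes that MiroStateType lives in LLama.Common and that its underlying values follow the 0/1/2 mapping described for Mirostat. The tau and eta values are illustrative.

```csharp
using LLama.Common;

var inferenceParams = new InferenceParams
{
    // Assumption: the enum's underlying values match the documented mapping
    // (0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).
    Mirostat = (MiroStateType)2,
    MirostatTau = 5.0f,  // target entropy (illustrative value)
    MirostatEta = 0.1f   // learning rate (illustrative value)
};
```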
PenalizeNL
Whether to treat newlines as a repeatable token when applying the repeat penalty (penalize_nl).
public bool PenalizeNL { get; set; }