LLamaContextParams
Namespace: LLama.Native
A C# representation of the llama.cpp llama_context_params struct
Inheritance Object → ValueType → LLamaContextParams
Remarks:
Changing the default values of parameters marked as [EXPERIMENTAL] may cause crashes or incorrect results in certain configurations. See https://github.com/ggerganov/llama.cpp/pull/7544
Fields
n_ctx
text context size, 0 = from model
n_batch
logical maximum batch size that can be submitted to llama_decode
n_ubatch
physical maximum batch size
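The relationship between n_batch (logical) and n_ubatch (physical) can be pictured as chunking: a submitted batch of up to n_batch tokens is processed in pieces of at most n_ubatch tokens. The sketch below is a conceptual illustration only; the BatchSplit class and its Split method are hypothetical names, and this is not llama.cpp's actual scheduling code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
static class BatchSplit
{
    // Splits a logical batch of nTokens tokens into physical chunks
    // of at most nUbatch tokens each, returned as (start, count) pairs.
    public static List<(int Start, int Count)> Split(int nTokens, int nUbatch)
    {
        if (nUbatch <= 0)
            throw new ArgumentOutOfRangeException(nameof(nUbatch));

        var chunks = new List<(int Start, int Count)>();
        for (int start = 0; start < nTokens; start += nUbatch)
        {
            // The last chunk may be shorter than nUbatch.
            chunks.Add((start, Math.Min(nUbatch, nTokens - start)));
        }
        return chunks;
    }
}
```

For example, BatchSplit.Split(1200, 512) yields three chunks: (0, 512), (512, 512) and (1024, 176).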
n_seq_max
max number of sequences (i.e. distinct states for recurrent models)
n_threads
number of threads to use for generation
n_threads_batch
number of threads to use for batch processing
rope_scaling_type
RoPE scaling type, from enum llama_rope_scaling_type
llama_pooling_type
whether to pool (sum) embedding results by sequence id
attention_type
Attention type to use for embeddings
rope_freq_base
RoPE base frequency, 0 = from model
rope_freq_scale
RoPE frequency scaling factor, 0 = from model
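To make rope_freq_base and rope_freq_scale more concrete, the following sketch computes the rotation angle that plain RoPE with linear frequency scaling would apply to one dimension pair. It assumes no YaRN mixing (an extrapolation mix factor of 0); RopeSketch, Angle and the nDims parameter are hypothetical names introduced for this illustration, not part of LLamaSharp.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class RopeSketch
{
    // Rotation angle for dimension pair `pairIndex` (0-based) at token position `pos`,
    // assuming plain RoPE with linear scaling:
    //   angle = freqScale * pos * freqBase^(-2 * pairIndex / nDims)
    public static double Angle(int pos, int pairIndex, int nDims,
                               double freqBase, double freqScale)
    {
        double inverseFrequency = Math.Pow(freqBase, -2.0 * pairIndex / nDims);
        return freqScale * pos * inverseFrequency;
    }
}
```

Under this formula a rope_freq_scale of 0.5 halves every angle, so position 10 is rotated like position 5 would be at scale 1.0, which is how linear scaling stretches the usable context.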
yarn_ext_factor
YaRN extrapolation mix factor, negative = from model
yarn_attn_factor
YaRN magnitude scaling factor
yarn_beta_fast
YaRN low correction dim
yarn_beta_slow
YaRN high correction dim
yarn_orig_ctx
YaRN original context size
defrag_threshold
defragment the KV cache if holes/size > defrag_threshold; set to < 0 to disable (default)
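The defrag_threshold rule reads as a simple ratio test. The sketch below restates it in code, taking "holes" and "size" directly from the description above; how llama.cpp counts holes internally is not covered here, and KvDefrag and ShouldDefrag are hypothetical names.

```csharp
// Hypothetical helper, for illustration only.
static class KvDefrag
{
    // Returns true when the fraction of holes in the KV cache exceeds the threshold.
    // A negative threshold disables defragmentation, matching the documented default.
    public static bool ShouldDefrag(int holes, int size, float defragThreshold)
    {
        if (defragThreshold < 0f || size <= 0)
            return false;

        return (float)holes / size > defragThreshold;
    }
}
```

With a threshold of 0.1, a cache of 1000 cells would qualify for defragmentation once more than 100 of them are holes.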
cb_eval
ggml_backend_sched_eval_callback
cb_eval_user_data
User data passed into cb_eval
type_k
data type for K cache. EXPERIMENTAL
type_v
data type for V cache. EXPERIMENTAL
abort_callback
ggml_abort_callback
abort_callback_user_data
User data passed into abort_callback
Properties
embeddings
if true, extract embeddings (together with logits)
Property Value
offload_kqv
whether to offload the KQV ops (including the KV cache) to GPU
Property Value
flash_attention
whether to use flash attention. EXPERIMENTAL
Property Value
no_perf
whether to measure performance timings (despite the wording, setting no_perf to true disables the measurements)
Property Value
Methods
Default()
Get the default LLamaContextParams
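A typical use of Default() is to start from llama.cpp's defaults and override only a few members. The sketch below assumes the LLamaSharp package is referenced, the native library can be loaded, and the fields and properties listed on this page are publicly writable; the numeric values are illustrative, not recommendations.

```csharp
using LLama.Native;

// Start from the native defaults. A plain `new LLamaContextParams()` would
// zero every field instead of applying llama.cpp's default values.
LLamaContextParams contextParams = LLamaContextParams.Default();

// Override a few fields (values chosen only for illustration).
contextParams.n_ctx = 4096;          // context size; 0 would take it from the model
contextParams.n_batch = 512;         // logical maximum batch size
contextParams.n_ubatch = 512;        // physical maximum batch size
contextParams.n_threads = 8;         // threads for generation
contextParams.n_threads_batch = 8;   // threads for batch processing

// Boolean settings are exposed as properties.
contextParams.embeddings = false;
contextParams.offload_kqv = true;
```

Members marked EXPERIMENTAL (type_k, type_v, flash_attention) are left at their defaults here, in line with the remark at the top of this page.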