LLamaContextParams
Namespace: LLama.Native
A C# representation of the llama.cpp llama_context_params struct
Inheritance Object → ValueType → LLamaContextParams
Remarks:
Changing the default values of parameters marked as [EXPERIMENTAL] may cause crashes or incorrect results in certain configurations. See https://github.com/ggerganov/llama.cpp/pull/7544
Fields
n_ctx
text context size, 0 = from model
n_batch
logical maximum batch size that can be submitted to llama_decode
n_ubatch
physical maximum batch size
n_seq_max
max number of sequences (i.e. distinct states for recurrent models)
n_threads
number of threads to use for generation
n_threads_batch
number of threads to use for batch processing
rope_scaling_type
RoPE scaling type, from enum llama_rope_scaling_type
llama_pooling_type
whether to pool (sum) embedding results by sequence id
attention_type
Attention type to use for embeddings
rope_freq_base
RoPE base frequency, 0 = from model
rope_freq_scale
RoPE frequency scaling factor, 0 = from model
yarn_ext_factor
YaRN extrapolation mix factor, negative = from model
yarn_attn_factor
YaRN magnitude scaling factor
yarn_beta_fast
YaRN low correction dim
yarn_beta_slow
YaRN high correction dim
yarn_orig_ctx
YaRN original context size
defrag_threshold
defragment the KV cache if holes/size > defrag_threshold; set to < 0 to disable (default)
cb_eval
ggml_backend_sched_eval_callback
cb_eval_user_data
User data passed into cb_eval
type_k
data type for K cache. EXPERIMENTAL
type_v
data type for V cache. EXPERIMENTAL
abort_callback
ggml_abort_callback
abort_callback_user_data
User data passed into abort_callback
Properties
embeddings
if true, extract embeddings (together with logits)
Property Value
offload_kqv
whether to offload the KQV ops (including the KV cache) to GPU
Property Value
flash_attention
whether to use flash attention. EXPERIMENTAL
Property Value
no_perf
whether to measure performance timings (the name is inverted: setting no_perf to true disables the measurements)
Property Value
Methods
Default()
Get the default LLamaContextParams
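A minimal usage sketch, assuming Default() is a static method, that the fields listed above are publicly writable, and that the native llama.cpp backend can be loaded at runtime. The ContextParamsExample class, the Build helper, and the numeric values are hypothetical and only illustrate the pattern of starting from the defaults and overriding selected fields:

```csharp
using System;
using LLama.Native;

public static class ContextParamsExample
{
    // Hypothetical helper (not part of LLamaSharp): start from the library
    // defaults and override a few of the fields documented above.
    public static LLamaContextParams Build()
    {
        // Assumption: Default() is static and calls into the native backend.
        LLamaContextParams p = LLamaContextParams.Default();

        p.n_ctx = 4096;         // text context size; 0 would take it from the model
        p.n_batch = 512;        // logical maximum batch size for llama_decode
        p.n_threads = 4;        // threads used for generation
        p.n_threads_batch = 4;  // threads used for batch processing
        p.embeddings = false;   // do not extract embeddings

        return p;
    }

    public static void Main()
    {
        LLamaContextParams p = Build();
        Console.WriteLine($"n_ctx={p.n_ctx}, n_batch={p.n_batch}, n_threads={p.n_threads}");
    }
}
```

Starting from Default() rather than a zero-initialised struct keeps the fields marked EXPERIMENTAL at the values llama.cpp expects, in line with the remark at the top of this page.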