LLamaContextParams
Namespace: LLama.Native
A C# representation of the llama.cpp llama_context_params
struct
Inheritance Object → ValueType → LLamaContextParams
Fields
seed
RNG seed, -1 for random
n_ctx
text context, 0 = from model
n_batch
prompt processing batch size
n_threads
number of threads to use for generation
n_threads_batch
number of threads to use for batch processing
rope_scaling_type
RoPE scaling type, from enum llama_rope_scaling_type
rope_freq_base
RoPE base frequency, 0 = from model
rope_freq_scale
RoPE frequency scaling factor, 0 = from model
yarn_ext_factor
YaRN extrapolation mix factor, negative = from model
yarn_attn_factor
YaRN magnitude scaling factor
yarn_beta_fast
YaRN low correction dim
yarn_beta_slow
YaRN high correction dim
yarn_orig_ctx
YaRN original context size
defrag_threshold
Defragment the KV cache if holes/size > defrag_threshold; set to a value less than 0 to disable (the default)
cb_eval
Evaluation callback, of type ggml_backend_sched_eval_callback
cb_eval_user_data
User data passed into cb_eval
type_k
data type for K cache
type_v
data type for V cache
Properties
embedding
embedding mode only
offload_kqv
whether to offload the KQV ops (including the KV cache) to GPU
do_pooling
Whether to pool (sum) embedding results by sequence id (ignored if no pooling layer)
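As a usage sketch (an assumption, not a verbatim API guarantee: it presumes LLamaSharp exposes llama.cpp's llama_context_default_params through NativeApi, which may vary between versions), the struct is normally obtained with library defaults and then selectively overridden rather than zero-initialised:

```csharp
using LLama.Native;

// Start from the library defaults so that any fields added in newer
// llama.cpp versions keep sensible values.
// (NativeApi.llama_context_default_params is assumed here.)
var p = NativeApi.llama_context_default_params();

p.seed = 42;                // fixed RNG seed for reproducible sampling
p.n_ctx = 2048;             // text context; 0 would take the model's value
p.n_batch = 512;            // prompt processing batch size
p.n_threads = 8;            // threads for generation
p.n_threads_batch = 8;      // threads for batch processing
p.rope_freq_base = 0;       // 0 = use the model's RoPE base frequency
p.defrag_threshold = -1.0f; // < 0 disables KV cache defragmentation (default)

p.offload_kqv = true;       // keep KQV ops and the KV cache on the GPU
p.embedding = false;        // full generation, not embedding-only mode
```

Fields left untouched (the YaRN parameters, rope_scaling_type, type_k/type_v, and so on) retain whatever the defaults function supplied.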