LLamaModelParams
Namespace: LLama.Native
A C# representation of the llama.cpp llama_model_params struct
public struct LLamaModelParams
Inheritance Object → ValueType → LLamaModelParams
Fields
n_gpu_layers
number of layers to store in VRAM
public int n_gpu_layers;
split_mode
how to split the model across multiple GPUs
public GPUSplitMode split_mode;
main_gpu
the GPU that is used for scratch and small tensors
public int main_gpu;
tensor_split
how to split layers across multiple GPUs (size: NativeApi.llama_max_devices())
public Single* tensor_split;
progress_callback
called with a progress value between 0 and 1; pass NULL to disable. If the callback returns true, model loading continues; if it returns false, loading is aborted immediately.
public LlamaProgressCallback progress_callback;
progress_callback_user_data
context pointer passed to the progress callback
public Void* progress_callback_user_data;
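A hedged sketch of wiring up the progress callback described above. The delegate shape used here, (float progress, IntPtr userData) returning bool, is assumed from the underlying llama.cpp llama_progress_callback and may not match the exact LlamaProgressCallback declaration in your LLamaSharp version:

```csharp
using System;
using LLama.Native;

// Assumed delegate shape: (float progress, IntPtr userData) => bool.
// Returning true lets loading continue; returning false aborts it.
LlamaProgressCallback cb = (progress, userData) =>
{
    Console.WriteLine($"loading: {progress:P0}");
    return true; // return false here to abort loading immediately
};
```

The callback must be kept alive (e.g. stored in a field) for as long as the native side can invoke it, or the garbage collector may reclaim the delegate out from under the unmanaged caller.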
kv_overrides
override key-value pairs of the model meta data
public LLamaModelMetadataOverride* kv_overrides;
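A minimal sketch of filling in the GPU-related fields by hand, assuming GPUSplitMode.None is a valid enum member (check your GPUSplitMode definition). This is not the library's recommended path; in practice you would normally start from llama.cpp's default parameters rather than a zeroed struct. Note that tensor_split must point at NativeApi.llama_max_devices() floats and must stay pinned while the native side can still read it, hence the GCHandle:

```csharp
using System;
using System.Runtime.InteropServices;
using LLama.Native;

public static unsafe class ModelParamsExample
{
    public static void Configure(float[] splits)
    {
        // Pin the managed array so the native pointer stays valid.
        var handle = GCHandle.Alloc(splits, GCHandleType.Pinned);
        try
        {
            var p = new LLamaModelParams
            {
                n_gpu_layers = 32,              // layers to keep in VRAM
                main_gpu = 0,                   // scratch/small-tensor device
                split_mode = GPUSplitMode.None, // assumed enum member name
                tensor_split = (float*)handle.AddrOfPinnedObject(),
            };
            // ... pass 'p' to the native model-loading call here ...
        }
        finally
        {
            handle.Free(); // only safe once the native side no longer reads it
        }
    }
}
```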
Properties
vocab_only
only load the vocabulary, no weights
public bool vocab_only { get; set; }
Property Value
Boolean
use_mmap
use mmap if possible
public bool use_mmap { get; set; }
Property Value
Boolean
use_mlock
force the system to keep the model in RAM
public bool use_mlock { get; set; }
Property Value
Boolean
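The boolean properties above can be combined; a sketch of a metadata-only configuration, useful when you only need the tokenizer/vocabulary and want to skip reading the weights:

```csharp
using LLama.Native;

var p = new LLamaModelParams
{
    vocab_only = true,  // load the vocabulary only, no weights
    use_mmap   = true,  // map the file instead of copying it into RAM
    use_mlock  = false, // don't pin pages; let the OS page as needed
};
```

use_mmap trades startup time against resident memory, while use_mlock prevents the OS from paging the model out at the cost of locking that memory for the process lifetime.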