LLamaModelParams
Namespace: LLama.Native
A C# representation of the llama.cpp llama_model_params struct.
Inheritance Object → ValueType → LLamaModelParams
Fields
n_gpu_layers
Number of layers to store in VRAM.
split_mode
How to split the model across multiple GPUs.
main_gpu
The GPU that is used for scratch and small tensors.
tensor_split
How to split layers across multiple GPUs (size: NativeApi.llama_max_devices()).
progress_callback
Called with a progress value between 0 and 1; pass NULL to disable. If the provided progress_callback returns true, model loading continues; if it returns false, model loading is immediately aborted.
progress_callback_user_data
Context pointer passed to the progress callback.
kv_overrides
Override key-value pairs of the model metadata.
Properties
vocab_only
Only load the vocabulary, not the weights.
use_mmap
Use mmap if possible.
use_mlock
Force the system to keep the model in RAM.