LLamaModelParams
Namespace: LLama.Native
A C# representation of the llama.cpp `llama_model_params` struct
Inheritance Object → ValueType → LLamaModelParams
Fields
tensor_buft_overrides
NULL-terminated list of buffer types to use for tensors that match a pattern
n_gpu_layers
number of layers to store in VRAM
split_mode
how to split the model across multiple GPUs
main_gpu
the GPU that is used for the entire model when split_mode is LLAMA_SPLIT_MODE_NONE
tensor_split
how to split layers across multiple GPUs (size: NativeApi.llama_max_devices())
progress_callback
called with a progress value between 0 and 1, pass NULL to disable. If the provided progress_callback returns true, model loading continues. If it returns false, model loading is immediately aborted.
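As a hedged sketch of wiring up this callback (the `LlamaProgressCallback` delegate name and its `(float progress, IntPtr ctx)` shape are assumed from the llama.cpp C signature `bool (*llama_progress_callback)(float, void*)`; verify against your LLamaSharp version):

```csharp
using System;
using LLama.Native;

class ProgressExample
{
    static void Main()
    {
        var p = LLamaModelParams.Default();

        // Return true to continue loading, false to abort immediately.
        p.progress_callback = (float progress, IntPtr ctx) =>
        {
            Console.WriteLine($"loading: {progress:P0}");
            return true; // continue loading
        };
    }
}
```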
progress_callback_user_data
context pointer passed to the progress callback
kv_overrides
override key-value pairs of the model meta data
Properties
vocab_only
only load the vocabulary, no weights
use_mmap
use mmap if possible
use_mlock
force the system to keep the model in RAM
check_tensors
validate model tensor data
Methods
Default()
Create a LLamaModelParams with default values
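A minimal usage sketch (assumes the LLamaSharp package is referenced; the field values below are illustrative, not recommendations):

```csharp
using LLama.Native;

class Example
{
    static void Main()
    {
        // Start from the library defaults, then adjust only what you need.
        var p = LLamaModelParams.Default();

        p.n_gpu_layers = 32;   // offload up to 32 layers to VRAM
        p.use_mmap = true;     // memory-map the weights if possible
        p.vocab_only = false;  // load weights, not just the vocabulary
    }
}
```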