IModelParams
Namespace: LLama.Abstractions
The parameters for initializing a LLama model.
Attributes: NullableContextAttribute
Properties
MainGpu
main_gpu interpretation depends on split_mode:
- None - The GPU that is used for the entire model.
- Row - The GPU that is used for small tensors and intermediate results.
- Layer - Ignored.
SplitMode
How to split the model across multiple GPUs
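The MainGpu/SplitMode interaction above can be sketched as follows. This is a minimal, non-authoritative example assuming the concrete ModelParams type from LLama.Common (which implements IModelParams) and the GPUSplitMode enum from LLama.Native; the model path is illustrative.

```csharp
using LLama.Common;
using LLama.Native;

// Sketch: with SplitMode = None, MainGpu selects the single GPU
// that runs the entire model; with Row it holds small tensors and
// intermediate results; with Layer it is ignored.
var parameters = new ModelParams("models/example-model.gguf")
{
    SplitMode = GPUSplitMode.None,
    MainGpu = 0 // device index of the GPU to use
};
```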
TensorBufferOverrides
Buffer type overrides for specific tensor patterns, allowing you to specify hardware devices to use for individual tensors or sets of tensors. Equivalent to --override-tensor or -ot on the llama.cpp command line or tensor_buft_overrides internally.
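As a hypothetical sketch of the override mechanism described above: this assumes a TensorBufferOverride entry type taking a tensor-name pattern and a buffer-type name, mirroring llama.cpp's `--override-tensor`/`-ot` flag; verify the exact type and constructor against the current LLamaSharp API before relying on it.

```csharp
using LLama.Abstractions;
using LLama.Common;

// Hypothetical: keep tensors matching the pattern on the CPU buffer,
// roughly equivalent to `-ot "ffn_.*_exps=CPU"` on the llama.cpp CLI.
var parameters = new ModelParams("models/example-model.gguf");
parameters.TensorBufferOverrides.Add(new TensorBufferOverride("ffn_.*_exps", "CPU"));
```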
GpuLayerCount
Number of layers to run in VRAM / GPU memory (n_gpu_layers)
UseMemorymap
Use mmap for faster loads (use_mmap)
UseMemoryLock
Use mlock to keep model in memory (use_mlock)
ModelPath
Model path (model)
TensorSplits
How split tensors should be distributed across multiple GPUs (tensor_split)
VocabOnly
Load vocab only (no weights)
CheckTensors
Validate model tensor data before loading
MetadataOverrides
Override specific metadata items in the model
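Putting the loading-related properties together, here is a minimal end-to-end sketch. It assumes LLama.Common.ModelParams as the IModelParams implementation and LLamaWeights.LoadFromFile as the loading entry point; the path and layer count are illustrative, not recommendations.

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("models/example-model.gguf")
{
    GpuLayerCount = 32,    // n_gpu_layers: offload 32 layers to VRAM
    UseMemorymap = true,   // use_mmap: map the file for faster loads
    UseMemoryLock = false, // use_mlock: do not pin the model in RAM
    CheckTensors = true    // validate tensor data before loading
};

// Load the model weights using these parameters.
using var weights = LLamaWeights.LoadFromFile(parameters);
```

Adjust GpuLayerCount to the amount of VRAM available; setting it to 0 keeps the whole model on the CPU.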