SafeLlamaModelHandle
Namespace: LLama.Native
A reference to a set of llama model weights
public sealed class SafeLlamaModelHandle : SafeLLamaHandleBase, System.IDisposable
Inheritance Object → CriticalFinalizerObject → SafeHandle → SafeLLamaHandleBase → SafeLlamaModelHandle
Implements IDisposable
Properties
VocabCount
Total number of tokens in vocabulary of this model
public int VocabCount { get; }
Property Value
Int32
ContextSize
Total number of tokens in the context
public int ContextSize { get; }
Property Value
Int32
RopeFrequency
Get the rope frequency this model was trained with
public float RopeFrequency { get; }
Property Value
Single
EmbeddingSize
Dimension of embedding vectors
public int EmbeddingSize { get; }
Property Value
Int32
SizeInBytes
Get the size of this model in bytes
public ulong SizeInBytes { get; }
Property Value
UInt64
ParameterCount
Get the number of parameters in this model
public ulong ParameterCount { get; }
Property Value
UInt64
Description
Get a description of this model
public string Description { get; }
Property Value
String
MetadataCount
Get the number of metadata key/value pairs
public int MetadataCount { get; }
Property Value
Int32
IsInvalid
Gets a value indicating whether the handle is invalid
public bool IsInvalid { get; }
Property Value
Boolean
IsClosed
Gets a value indicating whether the handle is closed
public bool IsClosed { get; }
Property Value
Boolean
Constructors
SafeLlamaModelHandle()
public SafeLlamaModelHandle()
Methods
ReleaseHandle()
protected bool ReleaseHandle()
Returns
Boolean
true if the handle was released successfully
LoadFromFile(String, LLamaModelParams)
Load a model from the given file path into memory
public static SafeLlamaModelHandle LoadFromFile(string modelPath, LLamaModelParams lparams)
Parameters
modelPath
String
lparams
LLamaModelParams
Returns
SafeLlamaModelHandle
Exceptions
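A minimal usage sketch (the model path is a placeholder, and how default LLamaModelParams are obtained depends on your LLamaSharp version; `default` below is only a stand-in, not a recommendation):

```csharp
using System;
using LLama.Native;

class LoadExample
{
    static void Main()
    {
        // Placeholder: obtain sensible defaults however your LLamaSharp
        // version exposes them; `default` here is only a stand-in.
        LLamaModelParams mparams = default;

        // "model.gguf" is a hypothetical path to a GGUF model file.
        using SafeLlamaModelHandle model = SafeLlamaModelHandle.LoadFromFile("model.gguf", mparams);
        Console.WriteLine($"Vocab: {model.VocabCount}, parameters: {model.ParameterCount}");
    }
}
```

The `using` declaration ensures the native weights are released deterministically via `ReleaseHandle` rather than waiting for finalization.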
llama_model_apply_lora_from_file(SafeLlamaModelHandle, String, Single, String, Int32)
Apply a LoRA adapter to a loaded model. path_base_model is the path to a higher quality model to use as a base for the layers modified by the adapter; it can be NULL to use the currently loaded model. The model needs to be reloaded before applying a new adapter, otherwise the adapter will be applied on top of the previous one.
public static int llama_model_apply_lora_from_file(SafeLlamaModelHandle model_ptr, string path_lora, float scale, string path_base_model, int n_threads)
Parameters
model_ptr
SafeLlamaModelHandle
path_lora
String
scale
Single
path_base_model
String
n_threads
Int32
Returns
Int32
Returns 0 on success
llama_model_meta_val_str(SafeLlamaModelHandle, Byte*, Byte*, Int64)
Get metadata value as a string by key name
public static int llama_model_meta_val_str(SafeLlamaModelHandle model, Byte* key, Byte* buf, long buf_size)
Parameters
model
SafeLlamaModelHandle
key
Byte*
buf
Byte*
buf_size
Int64
Returns
Int32
The length of the string on success, or -1 on failure
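The return value follows the usual llama.cpp convention: the full string length on success, -1 on failure. A hedged wrapper sketch (the helper name and the 256-byte initial buffer are assumptions, not part of this API):

```csharp
using System;
using System.Text;
using LLama.Native;

static class MetadataHelper
{
    // Hypothetical helper: look up a metadata value by key, or return null.
    public static unsafe string? TryGetMetadataValue(SafeLlamaModelHandle model, string key)
    {
        var keyBytes = Encoding.UTF8.GetBytes(key + "\0"); // native side expects a NUL-terminated key
        var buf = new byte[256];
        fixed (byte* k = keyBytes)
        fixed (byte* b = buf)
        {
            int len = SafeLlamaModelHandle.llama_model_meta_val_str(model, k, b, buf.Length);
            if (len < 0)
                return null; // -1: no such key

            // If len >= buf.Length the value was truncated; a real caller
            // would retry with a buffer of len + 1 bytes.
            return Encoding.UTF8.GetString(buf, 0, Math.Min(len, buf.Length));
        }
    }
}
```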
ApplyLoraFromFile(String, Single, String, Nullable<Int32>)
Apply a LoRA adapter to a loaded model
public void ApplyLoraFromFile(string lora, float scale, string modelBase, Nullable<int> threads)
Parameters
lora
String
scale
Single
modelBase
String
A path to a higher quality model to use as a base for the layers modified by the adapter. Can be NULL to use the currently loaded model.
threads
Nullable<Int32>
Exceptions
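A usage sketch (the adapter path is hypothetical; per the parameter docs, null for modelBase uses the currently loaded model as the base, and a null thread count defers to the library):

```csharp
using LLama.Native;

static class LoraExample
{
    static void Apply(SafeLlamaModelHandle model)
    {
        // "adapter.bin" is a hypothetical adapter path. modelBase: null uses
        // the currently loaded model as the base; threads: null lets the
        // library choose a thread count.
        model.ApplyLoraFromFile("adapter.bin", scale: 1.0f, modelBase: null, threads: null);
    }
}
```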
TokenToSpan(LLamaToken, Span<Byte>)
Convert a single llama token into bytes
public uint TokenToSpan(LLamaToken token, Span<byte> dest)
Parameters
token
LLamaToken
Token to decode
dest
Span<Byte>
A span to attempt to write into. If this is too small, nothing will be written
Returns
UInt32
The size of this token. Nothing will be written if this is larger than dest
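Because nothing is written when the token is larger than dest, a caller can size a retry buffer from the return value. A sketch of that pattern (decoding the bytes as UTF-8 is an assumption; the actual byte encoding depends on the model's vocabulary):

```csharp
using System;
using System.Text;
using LLama.Native;

static class TokenDecodeExample
{
    static string Decode(SafeLlamaModelHandle model, LLamaToken token)
    {
        // First attempt with a small stack buffer.
        Span<byte> dest = stackalloc byte[32];
        uint size = model.TokenToSpan(token, dest);
        if (size > dest.Length)
        {
            // Too small: nothing was written. Retry with an exact-size buffer.
            dest = new byte[size];
            size = model.TokenToSpan(token, dest);
        }
        return Encoding.UTF8.GetString(dest.Slice(0, (int)size));
    }
}
```

Note that for multi-token sequences the deprecation notice above applies: StreamingTokenDecoder handles tokens whose bytes split a multi-byte character across token boundaries.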
TokensToSpan(IReadOnlyList<LLamaToken>, Span<Char>, Encoding)
Caution
Use a StreamingTokenDecoder instead
Convert a sequence of tokens into characters.
internal Span<char> TokensToSpan(IReadOnlyList<LLamaToken> tokens, Span<char> dest, Encoding encoding)
Parameters
tokens
IReadOnlyList<LLamaToken>
dest
Span<Char>
encoding
Encoding
Returns
Span<Char>
The section of the span which has valid data in it. If there was insufficient space in the output span, this will be filled with as many characters as possible, starting from the last token.
Tokenize(String, Boolean, Boolean, Encoding)
Convert a string of text into tokens
public LLamaToken[] Tokenize(string text, bool add_bos, bool special, Encoding encoding)
Parameters
text
String
add_bos
Boolean
special
Boolean
Allow tokenizing special and/or control tokens which otherwise are not exposed and treated as plaintext.
encoding
Encoding
Returns
LLamaToken[]
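A usage sketch (the prompt text is arbitrary; whether a BOS token should be added depends on the model):

```csharp
using System.Text;
using LLama.Native;

static class TokenizeExample
{
    static LLamaToken[] Run(SafeLlamaModelHandle model)
    {
        // add_bos: true prepends the beginning-of-sequence token;
        // special: false treats special-token text as plain text.
        return model.Tokenize("Hello, world!", add_bos: true, special: false, Encoding.UTF8);
    }
}
```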
CreateContext(LLamaContextParams)
Create a new context for this model
public SafeLLamaContextHandle CreateContext(LLamaContextParams params)
Parameters
params
LLamaContextParams
Returns
SafeLLamaContextHandle
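A sketch of creating and disposing a context (as with model loading, how default LLamaContextParams are obtained depends on your LLamaSharp version; `default` is only a stand-in):

```csharp
using LLama.Native;

static class ContextExample
{
    static void Run(SafeLlamaModelHandle model)
    {
        // Placeholder: obtain real defaults however your LLamaSharp version
        // exposes them; `default` here is only a stand-in.
        LLamaContextParams cparams = default;
        using SafeLLamaContextHandle ctx = model.CreateContext(cparams);
        // ... evaluate tokens against the context here ...
    }
}
```

Multiple contexts can be created from one model handle; each holds its own KV cache while sharing the loaded weights.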
MetadataKeyByIndex(Int32)
Get the metadata key for the given index
public Nullable<Memory<byte>> MetadataKeyByIndex(int index)
Parameters
index
Int32
The index to get
Returns
Nullable<Memory<Byte>>
The key, null if there is no such key or if the buffer was too small
MetadataValueByIndex(Int32)
Get the metadata value for the given index
public Nullable<Memory<byte>> MetadataValueByIndex(int index)
Parameters
index
Int32
The index to get
Returns
Nullable<Memory<Byte>>
The value, null if there is no such value or if the buffer was too small
ReadMetadata()
internal IReadOnlyDictionary<string, string> ReadMetadata()
Returns
IReadOnlyDictionary<String, String>
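The by-index accessors above compose into a full metadata read. A sketch of the pattern (decoding keys and values as UTF-8 is an assumption, consistent with the GGUF format; the internal ReadMetadata implementation may differ):

```csharp
using System.Collections.Generic;
using System.Text;
using LLama.Native;

static class MetadataReadExample
{
    static IReadOnlyDictionary<string, string> Read(SafeLlamaModelHandle model)
    {
        var result = new Dictionary<string, string>();
        for (var i = 0; i < model.MetadataCount; i++)
        {
            var key = model.MetadataKeyByIndex(i);
            var val = model.MetadataValueByIndex(i);
            // Either can be null if the index is invalid or a buffer was too small.
            if (key.HasValue && val.HasValue)
                result[Encoding.UTF8.GetString(key.Value.Span)] = Encoding.UTF8.GetString(val.Value.Span);
        }
        return result;
    }
}
```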
<llama_model_meta_key_by_index>g__llama_model_meta_key_by_index_native|23_0(SafeLlamaModelHandle, Int32, Byte*, Int64)
internal static int <llama_model_meta_key_by_index>g__llama_model_meta_key_by_index_native|23_0(SafeLlamaModelHandle model, int index, Byte* buf, long buf_size)
Parameters
model
SafeLlamaModelHandle
index
Int32
buf
Byte*
buf_size
Int64
Returns
Int32
<llama_model_meta_val_str_by_index>g__llama_model_meta_val_str_by_index_native|24_0(SafeLlamaModelHandle, Int32, Byte*, Int64)
internal static int <llama_model_meta_val_str_by_index>g__llama_model_meta_val_str_by_index_native|24_0(SafeLlamaModelHandle model, int index, Byte* buf, long buf_size)
Parameters
model
SafeLlamaModelHandle
index
Int32
buf
Byte*
buf_size
Int64