SafeLlamaModelHandle
Namespace: LLama.Native
A reference to a set of llama model weights
public sealed class SafeLlamaModelHandle : SafeLLamaHandleBase, System.IDisposable
Inheritance Object → CriticalFinalizerObject → SafeHandle → SafeLLamaHandleBase → SafeLlamaModelHandle
Implements IDisposable
Properties
VocabCount
Total number of tokens in vocabulary of this model
public int VocabCount { get; }
Property Value
Int32
ContextSize
Total number of tokens in the context
public int ContextSize { get; }
Property Value
Int32
RopeFrequency
Get the rope frequency this model was trained with
public float RopeFrequency { get; }
Property Value
Single
EmbeddingSize
Dimension of embedding vectors
public int EmbeddingSize { get; }
Property Value
Int32
SizeInBytes
Get the size of this model in bytes
public ulong SizeInBytes { get; }
Property Value
UInt64
ParameterCount
Get the number of parameters in this model
public ulong ParameterCount { get; }
Property Value
UInt64
Description
Get a description of this model
public string Description { get; }
Property Value
String
MetadataCount
Get the number of metadata key/value pairs
public int MetadataCount { get; }
Property Value
Int32
IsInvalid
Gets a value indicating whether the handle is invalid
public bool IsInvalid { get; }
Property Value
Boolean
IsClosed
Gets a value indicating whether the handle is closed
public bool IsClosed { get; }
Property Value
Boolean
Constructors
SafeLlamaModelHandle()
public SafeLlamaModelHandle()
Methods
ReleaseHandle()
protected bool ReleaseHandle()
Returns
Boolean
true if the handle was released successfully
LoadFromFile(String, LLamaModelParams)
Load a model from the given file path into memory
public static SafeLlamaModelHandle LoadFromFile(string modelPath, LLamaModelParams lparams)
Parameters
modelPath
String
lparams
LLamaModelParams
Returns
SafeLlamaModelHandle
Exceptions
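A minimal usage sketch (the model path is a placeholder, and how default LLamaModelParams are obtained depends on your LLamaSharp version; `default` below is only a stand-in, not a recommendation):

```csharp
using System;
using LLama.Native;

class LoadExample
{
    static void Main()
    {
        // Placeholder: obtain sensible defaults however your LLamaSharp
        // version exposes them; `default` here is only a stand-in.
        LLamaModelParams mparams = default;

        // "model.gguf" is a hypothetical path to a GGUF model file.
        using SafeLlamaModelHandle model = SafeLlamaModelHandle.LoadFromFile("model.gguf", mparams);
        Console.WriteLine($"Vocab: {model.VocabCount}, parameters: {model.ParameterCount}");
    }
}
```

The `using` declaration ensures the native weights are released deterministically via `ReleaseHandle` rather than waiting for finalization.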
llama_model_apply_lora_from_file(SafeLlamaModelHandle, String, Single, String, Int32)
Apply a LoRA adapter to a loaded model. path_base_model is the path to a higher quality model to use as a base for the layers modified by the adapter; it can be NULL to use the currently loaded model. The model needs to be reloaded before applying a new adapter, otherwise the adapter will be applied on top of the previous one.
public static int llama_model_apply_lora_from_file(SafeLlamaModelHandle model_ptr, string path_lora, float scale, string path_base_model, int n_threads)
Parameters
model_ptr
SafeLlamaModelHandle
path_lora
String
scale
Single
path_base_model
String
n_threads
Int32
Returns
Int32
Returns 0 on success
llama_model_meta_val_str(SafeLlamaModelHandle, Byte*, Byte*, Int64)
Get metadata value as a string by key name
public static int llama_model_meta_val_str(SafeLlamaModelHandle model, Byte* key, Byte* buf, long buf_size)
Parameters
model
SafeLlamaModelHandle
key
Byte*
buf
Byte*
buf_size
Int64
Returns
Int32
The length of the string on success, or -1 on failure
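The return value follows the usual llama.cpp convention: the full string length on success, -1 on failure. A hedged wrapper sketch (the helper name and the 256-byte initial buffer are assumptions, not part of this API):

```csharp
using System;
using System.Text;
using LLama.Native;

static class MetadataHelper
{
    // Hypothetical helper: look up a metadata value by key, or return null.
    public static unsafe string? TryGetMetadataValue(SafeLlamaModelHandle model, string key)
    {
        var keyBytes = Encoding.UTF8.GetBytes(key + "\0"); // native side expects a NUL-terminated key
        var buf = new byte[256];
        fixed (byte* k = keyBytes)
        fixed (byte* b = buf)
        {
            int len = SafeLlamaModelHandle.llama_model_meta_val_str(model, k, b, buf.Length);
            if (len < 0)
                return null; // -1: no such key

            // If len >= buf.Length the value was truncated; a real caller
            // would retry with a buffer of len + 1 bytes.
            return Encoding.UTF8.GetString(buf, 0, Math.Min(len, buf.Length));
        }
    }
}
```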
ApplyLoraFromFile(String, Single, String, Nullable<Int32>)
Apply a LoRA adapter to a loaded model
public void ApplyLoraFromFile(string lora, float scale, string modelBase, Nullable<int> threads)
Parameters
lora
String
scale
Single
modelBase
String
A path to a higher quality model to use as a base for the layers modified by the adapter. Can be NULL to use the currently loaded model.
threads
Nullable<Int32>
Exceptions
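A usage sketch (the adapter path is hypothetical; per the parameter docs, null for modelBase uses the currently loaded model as the base, and a null thread count defers to the library):

```csharp
using LLama.Native;

static class LoraExample
{
    static void Apply(SafeLlamaModelHandle model)
    {
        // "adapter.bin" is a hypothetical adapter path. modelBase: null uses
        // the currently loaded model as the base; threads: null lets the
        // library choose a thread count.
        model.ApplyLoraFromFile("adapter.bin", scale: 1.0f, modelBase: null, threads: null);
    }
}
```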
TokenToSpan(LLamaToken, Span<Byte>)
Convert a single llama token into bytes
public uint TokenToSpan(LLamaToken token, Span<byte> dest)
Parameters
token
LLamaToken
Token to decode
dest
Span<Byte>
A span to attempt to write into. If this is too small, nothing will be written
Returns
UInt32
The size of this token. Nothing will be written if this is larger than dest
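Because nothing is written when the token is larger than dest, a caller can size a retry buffer from the return value. A sketch of that pattern (decoding the bytes as UTF-8 is an assumption; the actual byte encoding depends on the model's vocabulary):

```csharp
using System;
using System.Text;
using LLama.Native;

static class TokenDecodeExample
{
    static string Decode(SafeLlamaModelHandle model, LLamaToken token)
    {
        // First attempt with a small stack buffer.
        Span<byte> dest = stackalloc byte[32];
        uint size = model.TokenToSpan(token, dest);
        if (size > dest.Length)
        {
            // Too small: nothing was written. Retry with an exact-size buffer.
            dest = new byte[size];
            size = model.TokenToSpan(token, dest);
        }
        return Encoding.UTF8.GetString(dest.Slice(0, (int)size));
    }
}
```

Note that for multi-token sequences the deprecation notice above applies: StreamingTokenDecoder handles tokens whose bytes split a multi-byte character across token boundaries.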
TokensToSpan(IReadOnlyList<LLamaToken>, Span<Char>, Encoding)
Caution
Use a StreamingTokenDecoder instead
Convert a sequence of tokens into characters.
internal Span<char> TokensToSpan(IReadOnlyList<LLamaToken> tokens, Span<char> dest, Encoding encoding)
Parameters
tokens
IReadOnlyList<LLamaToken>
dest
Span<Char>
encoding
Encoding
Returns
Span<Char>
The section of the span which has valid data in it. If there was insufficient space in the output span, this will be filled with as many characters as possible, starting from the last token.
Tokenize(String, Boolean, Boolean, Encoding)
Convert a string of text into tokens
public LLamaToken[] Tokenize(string text, bool add_bos, bool special, Encoding encoding)
Parameters
text
String
add_bos
Boolean
special
Boolean
Allow tokenizing special and/or control tokens which otherwise are not exposed and treated as plaintext.
encoding
Encoding
Returns
LLamaToken[]
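A usage sketch (the prompt text is arbitrary; whether a BOS token should be added depends on the model):

```csharp
using System.Text;
using LLama.Native;

static class TokenizeExample
{
    static LLamaToken[] Run(SafeLlamaModelHandle model)
    {
        // add_bos: true prepends the beginning-of-sequence token;
        // special: false treats special-token text as plain text.
        return model.Tokenize("Hello, world!", add_bos: true, special: false, Encoding.UTF8);
    }
}
```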
CreateContext(LLamaContextParams)
Create a new context for this model
public SafeLLamaContextHandle CreateContext(LLamaContextParams params)
Parameters
params
LLamaContextParams
Returns
SafeLLamaContextHandle
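A sketch of creating and disposing a context (as with model loading, how default LLamaContextParams are obtained depends on your LLamaSharp version; `default` is only a stand-in):

```csharp
using LLama.Native;

static class ContextExample
{
    static void Run(SafeLlamaModelHandle model)
    {
        // Placeholder: obtain real defaults however your LLamaSharp version
        // exposes them; `default` here is only a stand-in.
        LLamaContextParams cparams = default;
        using SafeLLamaContextHandle ctx = model.CreateContext(cparams);
        // ... evaluate tokens against the context here ...
    }
}
```

Multiple contexts can be created from one model handle; each holds its own KV cache while sharing the loaded weights.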
MetadataKeyByIndex(Int32)
Get the metadata key for the given index
public Nullable<Memory<byte>> MetadataKeyByIndex(int index)
Parameters
index
Int32
The index to get
Returns
Nullable<Memory<Byte>>
The key, null if there is no such key or if the buffer was too small
MetadataValueByIndex(Int32)
Get the metadata value for the given index
public Nullable<Memory<byte>> MetadataValueByIndex(int index)
Parameters
index
Int32
The index to get
Returns
Nullable<Memory<Byte>>
The value, null if there is no such value or if the buffer was too small
ReadMetadata()
internal IReadOnlyDictionary<string, string> ReadMetadata()
Returns
IReadOnlyDictionary<String, String>
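The by-index accessors above compose into a full metadata read. A sketch of the pattern (decoding keys and values as UTF-8 is an assumption, consistent with the GGUF format; the internal ReadMetadata implementation may differ):

```csharp
using System.Collections.Generic;
using System.Text;
using LLama.Native;

static class MetadataReadExample
{
    static IReadOnlyDictionary<string, string> Read(SafeLlamaModelHandle model)
    {
        var result = new Dictionary<string, string>();
        for (var i = 0; i < model.MetadataCount; i++)
        {
            var key = model.MetadataKeyByIndex(i);
            var val = model.MetadataValueByIndex(i);
            // Either can be null if the index is invalid or a buffer was too small.
            if (key.HasValue && val.HasValue)
                result[Encoding.UTF8.GetString(key.Value.Span)] = Encoding.UTF8.GetString(val.Value.Span);
        }
        return result;
    }
}
```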
<llama_model_meta_key_by_index>g__llama_model_meta_key_by_index_native|23_0(SafeLlamaModelHandle, Int32, Byte*, Int64)
internal static int <llama_model_meta_key_by_index>g__llama_model_meta_key_by_index_native|23_0(SafeLlamaModelHandle model, int index, Byte* buf, long buf_size)
Parameters
model
SafeLlamaModelHandle
index
Int32
buf
Byte*
buf_size
Int64
Returns
Int32
<llama_model_meta_val_str_by_index>g__llama_model_meta_val_str_by_index_native|24_0(SafeLlamaModelHandle, Int32, Byte*, Int64)
internal static int <llama_model_meta_val_str_by_index>g__llama_model_meta_val_str_by_index_native|24_0(SafeLlamaModelHandle model, int index, Byte* buf, long buf_size)
Parameters
model
SafeLlamaModelHandle
index
Int32
buf
Byte*
buf_size
Int64