LLamaQuantizer
Namespace: LLama
Provides static methods for quantizing a model file.
public static class LLamaQuantizer
Inheritance Object → LLamaQuantizer
Methods
Quantize(String, String, LLamaFtype, Int32, Boolean, Boolean)
Quantizes the model file, with the quantization type given as an LLamaFtype value.
public static bool Quantize(string srcFileName, string dstFilename, LLamaFtype ftype, int nthread, bool allowRequantize, bool quantizeOutputTensor)
Parameters
srcFileName
String
The model file to be quantized.
dstFilename
String
The path to save the quantized model.
ftype
LLamaFtype
The type of quantization.
nthread
Int32
The number of threads to use during quantization. Defaults to the number of physical cores.
allowRequantize
Boolean
Whether tensors that are already quantized may be quantized again.
quantizeOutputTensor
Boolean
Whether the output tensor is quantized as well.
Returns
Boolean
true if the quantization succeeded; otherwise, false.
Exceptions
Quantize(String, String, String, Int32, Boolean, Boolean)
Quantizes the model file, with the quantization type given by name as a string.
public static bool Quantize(string srcFileName, string dstFilename, string ftype, int nthread, bool allowRequantize, bool quantizeOutputTensor)
Parameters
srcFileName
String
The model file to be quantized.
dstFilename
String
The path to save the quantized model.
ftype
String
The type of quantization.
nthread
Int32
The number of threads to use during quantization. Defaults to the number of physical cores.
allowRequantize
Boolean
Whether tensors that are already quantized may be quantized again.
quantizeOutputTensor
Boolean
Whether the output tensor is quantized as well.
Returns
Boolean
true if the quantization succeeded; otherwise, false.
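Example
A minimal sketch of calling the string overload, assuming the LLamaSharp package is referenced and a source model exists on disk. The file paths are hypothetical, and "q4_0" is assumed to be one of the accepted type names; check the names supported by the installed version. All six arguments are passed explicitly because the signature above does not show default values.

```csharp
using System;
using System.IO;
using LLama;

// Hypothetical paths: replace with the location of a real model file.
string source = "models/model-f16.gguf";
string destination = "models/model-q4_0.gguf";

if (!File.Exists(source))
{
    Console.Error.WriteLine($"Source model not found: {source}");
    return;
}

try
{
    bool succeeded = LLamaQuantizer.Quantize(
        source,
        destination,
        "q4_0",                     // ftype: assumed to be a supported type name
        Environment.ProcessorCount, // nthread: logical processor count, set explicitly
        true,                       // allowRequantize
        false);                     // quantizeOutputTensor

    Console.WriteLine(succeeded
        ? $"Quantized model written to {destination}"
        : "Quantization reported failure.");
}
catch (Exception ex)
{
    // The Exceptions section above does not name a type, so this catches broadly.
    Console.Error.WriteLine($"Quantization threw: {ex.Message}");
}
```

Checking the Boolean return value matters here: a false result signals failure without an exception being thrown. The LLamaFtype overload takes the same remaining arguments, with an enum value in place of the string.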