How to quantize

#10
by supercharge19 - opened

Is there example code for quantizing this model, or is a quantized version already available?

WhereIsAI org

@supercharge19 Hi, you can use Optimum to load the quantized ONNX model as follows:

from optimum.onnxruntime import ORTModelForFeatureExtraction
from optimum.pipelines import pipeline

# Load the quantized ONNX file from the model repository
model = ORTModelForFeatureExtraction.from_pretrained('WhereIsAI/UAE-Large-V1', file_name="onnx/model_quantized.onnx")
# Wrap the model in a feature-extraction pipeline
extractor = pipeline('feature-extraction', model=model)
# Run the pipeline on a single input string
output = extractor('hello world')
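
If you want a single sentence vector out of `output`, something like the sketch below may help. It assumes the pipeline returns a nested list shaped `[1, seq_len, hidden]` (the usual feature-extraction layout for one input string) and that first-token (CLS) pooling is appropriate for this model; check the model card before relying on that. `cls_pool`, `cosine_similarity`, and the tiny stand-in lists are hypothetical names and data for illustration, not part of Optimum.

```python
import math


def cls_pool(features):
    """Return the first-token vector from output shaped [1, seq_len, hidden]."""
    # Assumption: batch of one input, and the first token is the CLS token.
    return features[0][0]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in data with the assumed layout (2 tokens, hidden size 3).
# With the real model you would pass extractor(...) results instead.
fake_output_a = [[[1.0, 0.0, 0.0], [0.2, 0.3, 0.4]]]
fake_output_b = [[[0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]]

vec_a = cls_pool(fake_output_a)
vec_b = cls_pool(fake_output_b)
similarity = cosine_similarity(vec_a, vec_b)
```

With the real pipeline, `cls_pool(extractor('hello world'))` would give one vector per sentence, which you can then compare with `cosine_similarity`.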

Thanks man.