LiteRT Inference
This is a reference implementation of quantized inference, based on the
LiteRT Interpreter.
It runs quantized LiteRT inference on test images from the MNIST dataset.
It returns each layer’s tensor output, which is useful for verifying the correctness of
NumPy Inference and PyRTL Inference.
The litert_inference demo uses load_tflite_model() and run_tflite_model()
to implement quantized inference with LiteRT.
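As an illustration of how per-layer outputs can help verify another implementation, the sketch below compares a reference tensor against a candidate tensor and reports the first element that differs. The helper name `first_mismatch` and the stand-in arrays are hypothetical and not part of pyrtlnet, and the sketch assumes the two implementations are expected to agree element for element.

```python
import numpy as np


def first_mismatch(reference: np.ndarray, candidate: np.ndarray):
    """Return the index of the first differing element, or None if all match.

    Hypothetical helper, not part of pyrtlnet.
    """
    if reference.shape != candidate.shape:
        raise ValueError(
            f"shape mismatch: {reference.shape} vs {candidate.shape}"
        )
    # argwhere returns one row of indices per differing element.
    differing = np.argwhere(reference != candidate)
    if differing.size == 0:
        return None
    return tuple(int(i) for i in differing[0])


# Stand-in data shaped like a first-layer output for a batch of 2 images.
reference_layer0 = np.arange(36, dtype=np.int8).reshape(2, 18)
candidate_layer0 = reference_layer0.copy()
candidate_layer0[1, 3] += 1  # Inject a single-element error.

mismatch = first_mismatch(reference_layer0, candidate_layer0)
```

In practice the reference tensor would come from the LiteRT layer outputs, and the candidate tensor from the implementation under test.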
- pyrtlnet.litert_inference.load_tflite_model(tensor_path)
Load the quantized model and return an initialized LiteRT Interpreter.
The quantized model should be produced by quantize_model().
- Parameters:
quantized_model_name – Name of the .tflite file created by quantize_model().
- Return type:
Interpreter
- Returns:
An initialized LiteRT Interpreter.
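For readers unfamiliar with LiteRT, the sketch below shows one common way to create and initialize an Interpreter from a `.tflite` file, using the `ai_edge_litert` package. This is an assumption about what "initialized" involves, not a copy of pyrtlnet's code; the function name `load_interpreter_sketch` is hypothetical.

```python
def load_interpreter_sketch(model_path: str):
    """Create a LiteRT Interpreter and allocate its tensors.

    Hypothetical sketch; load_tflite_model() may differ in its details.
    """
    # Imported lazily so this sketch can be imported without LiteRT installed.
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    # Tensors must be allocated before inputs can be set or inference invoked.
    interpreter.allocate_tensors()
    return interpreter
```

Calling this function requires the `ai_edge_litert` package and an existing `.tflite` file on disk.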
- pyrtlnet.litert_inference.run_tflite_model(interpreter, test_batch)
Run quantized inference on an image batch with a TFLite Interpreter.
- Parameters:
interpreter (Interpreter) – An initialized TFLite Interpreter, produced by load_tflite_model().
test_batch (ndarray) – An image batch of shape (batch_size, 12, 12) to run through the Interpreter.
- Return type:
tuple
- Returns:
(layer0_output, layer1_output, actuals), where layer0_output is the first layer’s raw tensor output, with shape (batch_size, 18). layer1_output is the second layer’s raw tensor output, with shape (batch_size, 10). actuals is the list of predicted digits, with shape (batch_size,).
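The returned tuple can be consumed as in the sketch below, which checks the documented shapes and counts how many predictions match the expected labels. The helper `summarize_batch`, the stand-in arrays, and the label values are hypothetical; in real use the three outputs would come from run_tflite_model() and the labels from the MNIST test set.

```python
import numpy as np


def summarize_batch(layer0_output, layer1_output, actuals, expected_labels):
    """Check the documented output shapes and return (correct, accuracy).

    Hypothetical helper, not part of pyrtlnet.
    """
    batch_size = actuals.shape[0]
    if layer0_output.shape != (batch_size, 18):
        raise ValueError(f"unexpected layer0 shape: {layer0_output.shape}")
    if layer1_output.shape != (batch_size, 10):
        raise ValueError(f"unexpected layer1 shape: {layer1_output.shape}")
    # Count the predictions that equal the expected labels.
    correct = int(np.sum(actuals == expected_labels))
    return correct, correct / batch_size


# Stand-in values shaped like the documented outputs for a batch of 4.
layer0_output = np.zeros((4, 18), dtype=np.int8)
layer1_output = np.zeros((4, 10), dtype=np.int8)
actuals = np.array([7, 2, 1, 0])
expected_labels = np.array([7, 2, 1, 4])

correct, accuracy = summarize_batch(
    layer0_output, layer1_output, actuals, expected_labels
)
```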