Training Implementation

Training uses TensorFlow to produce a quantized tflite file, which can be loaded by LiteRT Inference, NumPy Inference, or PyRTL Inference.

TensorFlow Training

These functions train a quantized two-layer dense MNIST neural network.

This implementation is based on “Quantization aware training in Keras”.

The tensorflow_training demo uses train_unquantized_model() and quantize_model() to implement quantized training with TensorFlow Keras.
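
As a rough sketch of how these functions might be chained, the snippet below calls load_mnist_images(), train_unquantized_model(), evaluate_model(), and quantize_model() using the signatures documented in this section. The helper name train_and_quantize, the default hyperparameter values, and the "quantized" prefix are assumptions made for illustration, not the demo's actual settings. Evaluating the unquantized model before quantizing is likewise only one plausible ordering.

```python
def train_and_quantize(learning_rate=0.001, epochs=10, prefix="quantized"):
    """Hypothetical helper: train, evaluate, then quantize and save a model.

    The default hyperparameters and prefix are illustrative assumptions.
    """
    # Imported here rather than at module level; importing tensorflow is slow.
    from pyrtlnet.mnist_util import load_mnist_images
    from pyrtlnet.tensorflow_training import (
        evaluate_model,
        quantize_model,
        train_unquantized_model,
    )

    (train_images, train_labels), (test_images, test_labels) = load_mnist_images()

    # Step 1: train the unquantized two-layer dense model.
    model = train_unquantized_model(
        learning_rate=learning_rate,
        epochs=epochs,
        train_images=train_images,
        train_labels=train_labels,
    )

    # Step 2: measure loss and accuracy on the test data set.
    loss, accuracy = evaluate_model(model, test_images, test_labels)

    # Step 3: quantize; this saves {prefix}.tflite and {prefix}.npz.
    quantized_model = quantize_model(
        model=model,
        learning_rate=learning_rate,
        epochs=epochs,
        train_images=train_images,
        train_labels=train_labels,
        quantized_model_prefix=prefix,
    )
    return quantized_model, loss, accuracy
```

Calling train_and_quantize() requires pyrtlnet and tensorflow to be installed, and is expected to take noticeably longer than the inference scripts because it trains the model.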

Model Architecture

The model processes 12×12 8-bit images of hand-drawn digits from the MNIST data set. load_mnist_images() reduces the images from the data set’s original 28×28 size to 12×12.

The model consists of two dense layers:

One input image, shape: (12, 12)
   │
   │
   ▼
┌─────────┐
│ flatten │
└─────────┘
   │
   │ Tensor shape: (1, 144)
   ▼
┌──────────────────┐
│ layer0: 18 units │
└──────────────────┘
   │
   │ Tensor shape: (1, 18)
   ▼
┌──────┐
│ ReLU │
└──────┘
   │
   │ Tensor shape: (1, 18)
   ▼
┌──────────────────┐
│ layer1: 10 units │
└──────────────────┘
   │
   │
   ▼
Output tensor, shape: (1, 10)
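
The tensor shapes in the diagram can be traced with a small NumPy sketch. The weights below are random stand-ins rather than trained values, the (inputs, units) weight layout is an assumption borrowed from the usual Keras convention, and treating the largest output as the predicted digit is the customary reading of a 10-unit MNIST output rather than something this section states.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in parameters; real values come from training.
layer0_weights = rng.normal(size=(144, 18))
layer0_bias = np.zeros(18)
layer1_weights = rng.normal(size=(18, 10))
layer1_bias = np.zeros(10)

# One input image, shape (12, 12), with pixel values in [0.0, 1.0).
image = rng.random(size=(12, 12))

flat = image.reshape(1, -1)                         # flatten: (1, 144)
hidden = flat @ layer0_weights + layer0_bias        # layer0:  (1, 18)
activated = np.maximum(hidden, 0.0)                 # ReLU:    (1, 18)
output = activated @ layer1_weights + layer1_bias   # layer1:  (1, 10)

# Assumption: the index of the largest output is the predicted digit.
predicted_digit = int(np.argmax(output))

# Total number of weights and biases across both dense layers.
num_parameters = sum(
    array.size
    for array in (layer0_weights, layer0_bias, layer1_weights, layer1_bias)
)
```

With these shapes the model has 144 × 18 + 18 + 18 × 10 + 10 = 2800 trainable parameters.
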

pyrtlnet.tensorflow_training.evaluate_model(model, test_images, test_labels)

Evaluate a model on its test data set.

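Parameters:
  • model (Model) – Keras Model to evaluate.

  • test_images (Tensor) – Test image data from load_mnist_images().

  • test_labels (Tensor) – Test labels from load_mnist_images().
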
Return type:

tuple[float, float]

Returns:

(loss, accuracy), where loss is the loss function’s output (lower is better) and accuracy is the model’s accuracy on the test data set (higher is better).
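
To make the two returned numbers concrete, here is a minimal NumPy sketch that computes a loss and an accuracy from raw model outputs. This section does not name the loss function, so the sketch assumes sparse categorical cross-entropy over logits, a common choice for MNIST classifiers. The helper name loss_and_accuracy and the example values are hypothetical.

```python
import numpy as np


def loss_and_accuracy(logits, labels):
    """Return (loss, accuracy) for integer class labels.

    Assumes sparse categorical cross-entropy on logits; the loss
    actually used by evaluate_model() is not stated in this section.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Mean negative log-probability assigned to the correct class.
    loss = float(-log_probs[np.arange(len(labels)), labels].mean())

    # Fraction of examples whose highest-scoring class matches the label.
    accuracy = float((logits.argmax(axis=1) == labels).mean())
    return loss, accuracy


# Three made-up examples with three classes each.
logits = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 3.0], [1.0, 0.5, 0.0]])
labels = np.array([0, 2, 1])
loss, accuracy = loss_and_accuracy(logits, labels)
```

Here two of the three predictions match their labels, so the accuracy is 2/3, while the loss also reflects how confident each prediction was.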

pyrtlnet.tensorflow_training.quantize_model(model, learning_rate, epochs, train_images, train_labels, quantized_model_prefix)

Quantize and save a model.

The model should be trained with train_unquantized_model().

The quantized model will be saved to a file named {quantized_model_prefix}.tflite, and can be loaded with the LiteRT Interpreter, load_tflite_model(), NumPyInference, or PyRTLInference.

The quantized model’s NumPy weights, biases, and quantization metadata will also be saved with save_tensors(), to a file named {quantized_model_prefix}.npz. This file can be loaded with SavedTensors.

Parameters:
  • model (Model) – A trained Keras Model from train_unquantized_model().

  • learning_rate (float) – Controls how quickly the neural network adjusts its weights.

  • epochs (int) – Number of times the train_images are processed.

  • train_images (Tensor) – Training image data from load_mnist_images().

  • train_labels (Tensor) – Training labels from load_mnist_images().

  • quantized_model_prefix (str) – Prefix for the saved quantized .tflite model file, and the NumPy .npz file containing the model’s weights, biases, and quantization parameters.

Return type:

Model

Returns:

A quantized Keras Model.

pyrtlnet.tensorflow_training.train_unquantized_model(learning_rate, epochs, train_images, train_labels)

Train an unquantized, two-layer, dense MNIST neural network model.

Parameters:
  • learning_rate (float) – Controls how quickly the neural network adjusts its weights.

  • epochs (int) – Number of times the train_images are processed.

  • train_images (Tensor) – Training image data from load_mnist_images().

  • train_labels (Tensor) – Training labels from load_mnist_images().

Return type:

Model

Returns:

A trained Keras Model.

Training Utilities

pyrtlnet.training_util.get_tensor_scale_zero(interpreter, tensor_index)

Retrieve a tensor’s scale and zero point from the LiteRT Interpreter.

These scales and zero points may be per-axis or per-tensor.

For more information, see NumPy Inference.

Parameters:
  • interpreter (Interpreter) – LiteRT Interpreter to retrieve tensor metadata from.

  • tensor_index (int) – Index of the tensor to retrieve. These indices can be extracted from the Model Explorer.

Return type:

tuple[ndarray, ndarray]

Returns:

(scale, zero_point) for the tensor. These are one-dimensional tensors with length 1 for per-tensor quantization, and length > 1 for per-axis quantization.
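
The difference between the two cases can be illustrated with the affine mapping real_value = scale × (quantized_value − zero_point). The sketch below is not pyrtlnet code: the helper name dequantize, its axis parameter, and the example values are assumptions. In a real model the quantized axis comes from the tensor's metadata rather than a default argument.

```python
import numpy as np


def dequantize(quantized, scale, zero_point, axis=0):
    """Map quantized integers back to real values.

    scale and zero_point are one-dimensional: length 1 for per-tensor
    quantization, length > 1 for per-axis quantization along `axis`.
    """
    scale = np.asarray(scale, dtype=np.float64)
    zero_point = np.asarray(zero_point, dtype=np.int64)
    if scale.size > 1:
        # Per-axis: reshape so each entry lines up with one slice of `axis`.
        shape = [1] * quantized.ndim
        shape[axis] = -1
        scale = scale.reshape(shape)
        zero_point = zero_point.reshape(shape)
    # Widen before subtracting so int8 arithmetic cannot overflow.
    return scale * (quantized.astype(np.int64) - zero_point)


weights = np.array([[10, -10], [20, 0]], dtype=np.int8)

# Per-tensor: one scale and one zero point for the whole tensor.
per_tensor = dequantize(weights, scale=[0.5], zero_point=[0])

# Per-axis: one scale and zero point per row (axis 0).
per_axis = dequantize(weights, scale=[0.5, 0.25], zero_point=[0, 0], axis=0)

# A non-zero zero point shifts the representable range.
activations = np.array([-128, 127], dtype=np.int8)
shifted = dequantize(activations, scale=[0.1], zero_point=[-128])
```

In the per-axis case the second row uses a scale of 0.25, so the stored value 20 maps to 5.0 instead of 10.0.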

pyrtlnet.training_util.save_mnist_data(tensor_path, test_images, test_labels)

Save the resized MNIST test data so the inference scripts can use it without importing tensorflow.

Importing tensorflow is slow, and the inference scripts should not depend on it.

pyrtlnet.training_util.save_tensors(interpreter, quantized_model_prefix)

Save a quantized model’s weights, biases, and quantization metadata.

The tensors are saved to a NumPy .npz file with numpy.savez_compressed(), and can be loaded by SavedTensors or numpy.load().

Parameters:
  • interpreter (Interpreter) – LiteRT Interpreter to retrieve tensor metadata from.

  • quantized_model_prefix (str) – Prefix for the .npz file to save, without the .npz suffix.
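
The .npz round trip can be sketched with NumPy alone. The array names used as keys below (layer0_weights and so on) are hypothetical placeholders; the keys that save_tensors() actually writes are not listed in this section, so inspect the file's names before relying on any of them.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for a layer's quantized weights and metadata.
tensors = {
    "layer0_weights": np.arange(6, dtype=np.int8).reshape(2, 3),
    "layer0_scale": np.array([0.02], dtype=np.float32),
    "layer0_zero_point": np.array([0], dtype=np.int32),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    prefix = os.path.join(tmp_dir, "quantized")

    # Each keyword argument becomes one named array in the archive.
    np.savez_compressed(f"{prefix}.npz", **tensors)

    # numpy.load() returns a dict-like archive; .files lists its keys.
    with np.load(f"{prefix}.npz") as saved:
        loaded = {name: saved[name] for name in saved.files}
```

Listing saved.files first is a convenient way to discover what a given .npz file contains.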

MNIST Utilities

pyrtlnet.mnist_util.load_mnist_images()

Load the MNIST data set, normalize pixel data to [0.0, 1.0], and resize images from 28×28 to 12×12.

Return type:

tuple[tuple[Tensor, Tensor], tuple[Tensor, Tensor]]

Returns:

(train_images, train_labels), (test_images, test_labels).
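
The two preprocessing steps can be approximated in NumPy as follows. This section does not say which resizing method load_mnist_images() uses, so the sketch substitutes plain nearest-neighbor sampling, and the helper name normalize_and_resize and the random stand-in images are assumptions. The real function may interpolate and produce different pixel values.

```python
import numpy as np


def normalize_and_resize(images, size=12):
    """Scale uint8 pixels to [0.0, 1.0] and shrink images to size x size.

    Uses nearest-neighbor sampling as a stand-in resize method.
    """
    normalized = images.astype(np.float32) / 255.0

    # Pick `size` evenly spaced source rows and columns.
    height, width = images.shape[1:3]
    rows = (np.arange(size) * height) // size
    cols = (np.arange(size) * width) // size

    # Broadcasting the two index arrays selects a size x size grid per image.
    return normalized[:, rows[:, None], cols[None, :]]


# Five random stand-in images with the MNIST shape and dtype.
raw = np.random.default_rng(0).integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
resized = normalize_and_resize(raw)
```

The result has shape (5, 12, 12) with every value in [0.0, 1.0], matching the input shape shown in the Model Architecture diagram.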