Inference Utilities
Neural Network Tensor Utilities
- class pyrtlnet.saved_tensors.QuantizedLayer(input_scale, weight_scale, weight_zero, output_scale, output_zero, weight, bias)
Stores a layer’s weights, biases, and quantization metadata.
This class performs some additional pre-processing on the raw quantization metadata. For example, the layer’s floating-point scale factor m is converted to a fixed-point scale factor m0 and a bitwise right-shift n with normalization_constants().
- __init__(input_scale, weight_scale, weight_zero, output_scale, output_zero, weight, bias)
Store a layer’s weights, biases, and quantization metadata.
- Parameters:
input_scale (ndarray) – Scale factor for the layer’s input. The first layer’s input is special, and must be retrieved from the model separately. The input for subsequent layers comes from the preceding layer, so subsequent layer inputs use the preceding layer’s scale factor. A layer’s scale factor can be retrieved with scale.
weight_scale (ndarray) – Scale factor for the layer’s weight.
weight_zero (ndarray) – Zero point for the layer’s weight.
output_scale (ndarray) – Scale factor for the layer’s output.
output_zero (ndarray) – Zero point for the layer’s output.
weight (ndarray) – The layer’s weight.
bias (ndarray) – The layer’s bias.
- m0: ndarray
The layer’s scale can be expressed as a fixed-point scale factor m0 and a bitwise right-shift n. See normalization_constants().
- class pyrtlnet.saved_tensors.SavedTensors(quantized_model_name)
Loads weights, biases, and quantization metadata saved by save_tensors().
- input_scale: ndarray
Floating-point scale factor for the neural network’s input.
This scale factor can be converted to a fixed-point multiplier by normalization_constants().
- layer: list[QuantizedLayer]
List of QuantizedLayer objects containing per-layer weights, biases, and quantization metadata.
- pyrtlnet.saved_tensors.normalization_constants(s1, s2, s3)
Normalize multiplier m to fixed-point m0 and bit-shift n.
See Section 2.2 in Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. The multiplier m (Equation 5) is computed from three scale factors s1, s2, s3.
This multiplier m can then be expressed as a pair of (m0, n), where m0 is a fixed-point 32-bit multiplier and n is a bitwise right-shift amount. A floating-point multiplication by m is equivalent to a fixed-point multiplication by m0, followed by a bitwise right-shift by n. This fixed-point multiplication and bitwise shift are done by normalize().
In other words, m == (2 ** -n) * m0, where m0 must be in the interval [0.5, 1).
A layer can have per-axis scale factors, so s1, s2, and s3 are vectors of scale factors. This function returns a vector of fixed-point m0 values and a vector of integer n values. See per-axis quantization for details.
- Parameters:
s1 (ndarray) – Scale factors for the matrix multiplication’s left input, which is the layer’s weight matrix.
s2 (ndarray) – Scale factors for the matrix multiplication’s right input, which is the layer’s input matrix.
s3 (ndarray) – Scale factors for the matrix multiplication’s output, which is the layer’s output matrix.
- Return type:
- Returns:
(m0, n), where m0 is a fixed-point multiplier in the interval [0.5, 1), n is a bitwise right-shift amount, and m == (2 ** -n) * m0.
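The decomposition m == (2 ** -n) * m0 can be illustrated with a short NumPy sketch. This is not pyrtlnet's implementation: the function name normalization_constants_sketch is hypothetical, m0 is kept as a float rather than encoded as a 32-bit fixed-point integer, and the scale values are made up. It relies on Equation 5 of the paper (m = s1 * s2 / s3) and on np.frexp, which returns a mantissa in [0.5, 1) for positive inputs.

```python
import numpy as np


def normalization_constants_sketch(s1, s2, s3):
    """Split m = s1 * s2 / s3 into (m0, n) such that m == (2 ** -n) * m0."""
    # Equation 5 of the paper: the multiplier from the three scale factors.
    m = np.asarray(s1, dtype=np.float64) * s2 / s3
    # np.frexp returns (mantissa, exponent) with m == mantissa * 2 ** exponent.
    # For positive m, the mantissa lies in [0.5, 1).
    m0, exponent = np.frexp(m)
    # A right-shift by n corresponds to multiplying by 2 ** -n.
    n = -exponent
    return m0, n


# Made-up per-axis scale factors, for illustration only.
s1 = np.array([0.02, 0.004])  # weight scales
s2 = np.array([0.5, 0.5])     # input scales
s3 = np.array([0.1, 0.1])     # output scales

m0, n = normalization_constants_sketch(s1, s2, s3)
print(m0, n)
```

Converting the float m0 to an integer (for example, by scaling it by a power of two and rounding) is the step that makes the multiplier fixed-point. That encoding is left out here because the source does not specify it.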
Inference Script Utilities
- pyrtlnet.inference_util.add_common_arguments(parser)
Add common command line arguments supported by all inference scripts to an ArgumentParser.
This defines all the shared command line flags in one place, so it’s easier to keep them in sync.
- Parameters:
parser (ArgumentParser) – Parser to add arguments to.
- pyrtlnet.inference_util.batched_images(images, start_image, num_images, batch_size)
Generator that yields batched image data.
- Parameters:
images (ndarray) – Image data to group into batches.
start_image (int) – Index of the first image in the first batch, in images.
num_images (int) – Total number of images to yield. The yielded images may be grouped into batches, see batch_size below. The generator always stops at the end of images and does not wrap around to the beginning, so fewer than num_images images will be yielded if there aren’t enough images.
batch_size (int) – Maximum size of each batch. If num_images is not evenly divisible by batch_size, the last batch will be padded out to batch_size with null images. batch_size must be greater than zero.
- Raises:
ValueError – If start_image exceeds the number of available images, or if batch_size is less than or equal to zero.
- Return type:
- Returns:
Yields batch_start_index, test_batch, where batch_start_index is the index of the first image in the yielded batch, and test_batch is a batch of image data, with shape (batch_size, 12, 12). The last batch may be padded with null images, see the description of batch_size above. images[batch_start_index] is equivalent to test_batch[0].
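The batching and padding behavior described above can be sketched as a small generator. This is an illustration under stated assumptions, not the library's code: the name batched_images_sketch is hypothetical, a "null image" is assumed to be an all-zero image, and "exceeds" is read as start_image >= len(images).

```python
import numpy as np


def batched_images_sketch(images, start_image, num_images, batch_size):
    """Yield (batch_start_index, batch), zero-padding the final batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be greater than zero")
    if start_image >= len(images):
        raise ValueError("start_image exceeds the number of available images")
    # Stop at the end of images; never wrap around to the beginning.
    stop = min(start_image + num_images, len(images))
    for batch_start_index in range(start_image, stop, batch_size):
        batch_end = min(batch_start_index + batch_size, stop)
        batch = images[batch_start_index:batch_end]
        missing = batch_size - len(batch)
        if missing > 0:
            # Assumption: a null image is an all-zero image.
            padding = np.zeros((missing, *images.shape[1:]), dtype=images.dtype)
            batch = np.concatenate([batch, padding])
        yield batch_start_index, batch


# Five fake 12x12 images; request three of them, starting at index 1.
images = np.arange(5 * 12 * 12, dtype=np.float32).reshape(5, 12, 12)
batches = list(
    batched_images_sketch(images, start_image=1, num_images=3, batch_size=2)
)
for batch_start_index, test_batch in batches:
    print(batch_start_index, test_batch.shape)
```

With three images and a batch size of two, the second batch holds one real image followed by one padding image, so every yielded batch keeps the shape (batch_size, 12, 12).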
- pyrtlnet.inference_util.load_mnist_data(tensor_path)
Load MNIST test data and labels from mnist_test_data.npz.
mnist_test_data.npz contains a saved copy of the preprocessed MNIST image data and its corresponding labels. This .npz file is generated by tensorflow_training.py, after evaluating the trained model. Loading this .npz file is much faster than calling load_mnist_images(), and avoids a dependency on tensorflow.
- Parameters:
tensor_path (str) – Path to mnist_test_data.npz.
- Raises:
FileNotFoundError – If mnist_test_data.npz is not found.
- Return type:
- Returns:
(test_images, test_labels), where test_images has shape (10000, 12, 12) and test_labels has shape (10000,).
- pyrtlnet.inference_util.preprocess_image(test_batch, input_scale, input_zero)
Preprocess the raw image data in the batch. This is required by the quantized neural network.
This adjusts the batch image data by input_scale and input_zero. Then, it flattens each 2D image into a 1D column vector and stores these vectors as the columns of a matrix of shape (144, batch_size).
- Parameters:
test_batch (ndarray) – Batch data to preprocess. This data should have already been normalized to [0.0, 1.0] and resized to (batch_size, 12, 12), usually by load_mnist_images().
input_scale (ndarray) – Scale factor for test_batch.
input_zero (ndarray) – Zero point for test_batch.
- Return type:
- Returns:
Flattened batch data of shape (144, batch_size), adjusted by the quantized neural network’s input_scale and input_zero.
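A rough sketch of this kind of preprocessing follows. The source only says the data is "adjusted by" input_scale and input_zero, so the exact arithmetic is an assumption: the sketch uses the usual affine quantization q = round(x / scale) + zero_point, clipped to int8. The name preprocess_image_sketch and the example scale and zero point are hypothetical.

```python
import numpy as np


def preprocess_image_sketch(test_batch, input_scale, input_zero):
    """Quantize a (batch_size, 12, 12) batch and flatten it to (144, batch_size)."""
    # Assumed affine quantization: q = round(x / scale) + zero_point,
    # clipped to the int8 range.
    quantized = np.round(test_batch / input_scale) + input_zero
    quantized = np.clip(quantized, -128, 127).astype(np.int8)
    batch_size = quantized.shape[0]
    # Flatten each image to 144 values, then transpose so that each image
    # becomes one column of the result.
    return quantized.reshape(batch_size, -1).T


# Random stand-in for three normalized 12x12 images in [0.0, 1.0).
rng = np.random.default_rng(0)
test_batch = rng.random((3, 12, 12), dtype=np.float32)

flat = preprocess_image_sketch(
    test_batch, input_scale=np.float32(1 / 255), input_zero=np.int8(-128)
)
print(flat.shape, flat.dtype)
```

Storing one image per column means a weight matrix can be multiplied by the whole batch in a single matrix multiplication.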
Command-line Interface Utilities
- class pyrtlnet.cli_util.Accuracy
Update and display accuracy statistics over multiple tests.
- display()
Display accuracy statistics over all tests.
The printed summary looks like:
9/10 correct predictions, 90.0% accuracy
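The bookkeeping behind a summary line like the one above can be sketched as follows. The class AccuracySketch and its update(expected, actual) and summary() methods are hypothetical. The source documents only display() for the real Accuracy class, so this shows the idea and not the library's interface.

```python
class AccuracySketch:
    """Hypothetical stand-in that counts correct predictions."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, expected, actual):
        # Assumed signature: one call per test, comparing label and prediction.
        self.total += 1
        if expected == actual:
            self.correct += 1

    def summary(self):
        # Guard against division by zero when no tests have been recorded.
        percent = 100.0 * self.correct / self.total if self.total else 0.0
        return (
            f"{self.correct}/{self.total} correct predictions, "
            f"{percent:.1f}% accuracy"
        )

    def display(self):
        print(self.summary())


accuracy = AccuracySketch()
# Nine correct predictions and one wrong prediction.
for expected, actual in [(7, 7)] * 9 + [(3, 5)]:
    accuracy.update(expected, actual)
accuracy.display()
```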
- class pyrtlnet.cli_util.PrintElapsedTime(message)
Report how long it takes to run the code in a with statement.
This context manager first prints a message, then runs the code in the with statement, then prints a "done" message followed by the elapsed time. All output is printed on one line.
When an interactive script pauses for more than a second, users will start to wonder if something is wrong. So use PrintElapsedTime to let the user know what’s going on before starting an operation that’s expected to take more than a second.
Example:
    with PrintElapsedTime(message="Sleeping"):
        time.sleep(2)
Example output:
Sleeping... done (2.0 seconds)
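For readers curious how such a context manager can be built, here is a minimal sketch. It is not the pyrtlnet source: the class name PrintElapsedTimeSketch is hypothetical, and the output format is inferred from the example output above.

```python
import time


class PrintElapsedTimeSketch:
    """Hypothetical context manager that times the body of a with statement."""

    def __init__(self, message):
        self.message = message

    def __enter__(self):
        # Print without a newline and flush, so the message is visible
        # while the body of the with statement is still running.
        print(f"{self.message}...", end=" ", flush=True)
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        elapsed = time.perf_counter() - self.start
        print(f"done ({elapsed:.1f} seconds)")
        # Returning False lets any exception from the body propagate.
        return False


with PrintElapsedTimeSketch("Sleeping"):
    time.sleep(0.1)
```

Flushing in __enter__ matters here. Without it, a buffered terminal could hold back the message until the operation has finished, which defeats the purpose.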
- pyrtlnet.cli_util.display_image(script_name, image, image_index, batch_number, batch_index, verbose)
Print an image and its metadata in a terminal, rendering the image as ASCII art.
A header line is always printed, which looks like:
LiteRT Inference image_index 2 batch_number 1 batch_index 0
This header line displays the script_name (LiteRT Inference), the image_index (2), the batch_number (1), and the batch_index (0).
Next, when verbose is True, the image is displayed. The image display requires a terminal that supports 24-bit color.
The image is presented as a 2D array of grayscale pixel values. The pixel values are normalized such that the largest value displays as white, and the smallest value displays as black. One line of terminal output contains up to two rows of pixels.
After the image, a footer line is displayed, when verbose is True:
shape (12, 12) dtype float32
This footer line displays image’s shape and dtype.
- Parameters:
script_name (str) – Name of the script processing the image data.
image (ndarray) – Image to display in the terminal.
image_index (int) – Index of the displayed image in the full test data set.
batch_number (int) – Batch number that the displayed image belongs to. Multiple consecutive images may be grouped into batches for processing. When this grouping occurs, multiple images will share the same batch_number. The first batch processed is batch_number 0, the second batch processed is batch_number 1, and so on.
batch_index (int) – Index of the displayed image in its batch. When batching is disabled, the batch_size is 1, so every batch_index will be 0.
verbose (bool) – When False, only the header line is displayed. When True, the header line, image, and footer line are all displayed.
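One common way to fit two pixel rows into one terminal line is to print the upper half block character with a 24-bit foreground color for the top pixel and a 24-bit background color for the bottom pixel. The sketch below shows that technique together with the min/max normalization described above. Whether display_image works this way is an assumption, and render_image_sketch is a hypothetical name.

```python
import numpy as np


def render_image_sketch(image):
    """Return terminal lines that draw image with two pixel rows per line."""
    image = np.asarray(image, dtype=np.float64)
    low, high = image.min(), image.max()
    # Avoid dividing by zero when every pixel has the same value.
    span = high - low if high > low else 1.0
    # Smallest value maps to 0 (black), largest value maps to 255 (white).
    gray = np.round((image - low) / span * 255).astype(int)
    lines = []
    for top in range(0, gray.shape[0], 2):
        cells = []
        for col in range(gray.shape[1]):
            upper = gray[top, col]
            # ESC[38;2;r;g;bm sets a 24-bit foreground color (top pixel).
            cell = f"\x1b[38;2;{upper};{upper};{upper}m"
            if top + 1 < gray.shape[0]:
                lower = gray[top + 1, col]
                # ESC[48;2;r;g;bm sets a 24-bit background color (bottom pixel).
                cell += f"\x1b[48;2;{lower};{lower};{lower}m"
            cells.append(cell + "▀")
        # ESC[0m resets colors at the end of each line.
        lines.append("".join(cells) + "\x1b[0m")
    return lines


# A 12x12 gradient from black (top left) to white (bottom right).
image = np.linspace(0.0, 1.0, 144).reshape(12, 12)
lines = render_image_sketch(image)
print("\n".join(lines))
```

A 12-row image therefore needs six terminal lines. An image with an odd number of rows leaves the bottom half of its last line at the default background.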
- pyrtlnet.cli_util.display_outputs(script_name, layer0_output, layer1_output, expected, actual, verbose)
Display the neural network’s outputs.
Prints the raw outputs of each neural network layer, followed by a bar chart that interprets the final layer’s output as each digit’s un-normalized probability.
Bars for higher probability digits are displayed before bars for lower probability digits.
The bar corresponding to the expected digit is always colored green. If the actual digit is not the same as the expected digit, the bar corresponding to the actual digit will be colored red.
Sample output with colors omitted:
LiteRT Inference layer0 output shape (18,) dtype int8:
[-123 -114 -123 -76 -123 -23 -94 -123 -65 -68 -123 -1 -63 -112 -123 ...]
LiteRT Inference layer1 output shape (10,) dtype int8:
[ 33 -48 29 58 -50 31 -87 93 9 49]
LiteRT Inference layer1 output as bar chart:
7▕ ▄▄▄▄▄▄▄▄▄▄ 93 (expected, actual)
3▕ ▄▄▄▄▄▄ 58
9▕ ▄▄▄▄▄ 49
0▕ ▄▄▄▄ 33
5▕ ▄▄▄▄ 31
2▕ ▄▄▄ 29
8▕ ▄ 9
1▕ ▄▄▄▄▄ -48
4▕ ▄▄▄▄▄ -50
6▕ ▄▄▄▄▄▄▄▄▄ -87
In the sample output above, the digit corresponding to each bar is displayed on the left, so the digit 7 has the highest probability, followed by the digit 3. The model predicted the digit is a 7, and the digit actually was a 7 according to the labeled test data, so the first bar is annotated with (expected, actual).
- Parameters:
script_name (str) – Name of the script processing the image data.
layer0_output (ndarray) – Output of the neural network’s first layer.
layer1_output (ndarray) – Output of the neural network’s second layer.
expected (int) – Expected prediction from labeled test data.
actual (int) – Actual prediction from the neural network.
verbose (bool) – When False, just print a summary of the expected and actual predictions. When True, print each layer’s output and an annotated bar chart.
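The ordering of the bars follows directly from sorting the final layer's output in descending order. The sketch below reproduces that ordering for the sample output above. rank_digits_sketch is a hypothetical helper, and the drawing of the bars themselves is omitted because the source does not specify how bar lengths are scaled.

```python
import numpy as np


def rank_digits_sketch(layer1_output):
    """Return digits ordered from highest to lowest output value."""
    # Widen from int8 before negating, so that -(-128) cannot overflow.
    values = layer1_output.astype(int)
    order = np.argsort(-values, kind="stable")
    return [int(digit) for digit in order]


# The layer1 output from the sample above.
layer1_output = np.array([33, -48, 29, 58, -50, 31, -87, 93, 9, 49], dtype=np.int8)
expected = 7  # label from the test data, matching the sample

ranking = rank_digits_sketch(layer1_output)
actual = ranking[0]  # the prediction is the digit with the largest output

for digit in ranking:
    tags = [
        name
        for name, value in (("expected", expected), ("actual", actual))
        if value == digit
    ]
    suffix = f" ({', '.join(tags)})" if tags else ""
    print(f"{digit}▕ {layer1_output[digit]}{suffix}")
```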