Reference Models

Timings

The tables in this section contain inference timings for a set of representative models. The quantized models have been imported and compiled offline using the SyNAP toolkit. The floating-point models are benchmarked for comparison purposes with the corresponding quantized models.
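For reference, the offline import and compilation step looks roughly like the sketch below. The synap convert sub-command and the options shown are assumptions based on the toolkit documentation; the exact invocation may differ in your installation.

    # Compile a quantized tflite model offline for a given target NPU (options illustrative)
    synap convert --model mobilenet_v2_1.0_224_quant.tflite --target VS680 --out-dir compiled

The resulting compiled model is what the Offline NPU columns in the tables below measure.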

The mobilenet_v1, mobilenet_v2, posenet and inception models are open-source models available in tflite format from the TensorFlow Hosted Models page: https://www.tensorflow.org/lite/guide/hosted_models

yolov5 models are available from https://github.com/ultralytics/yolov5, while yolov5_face comes from https://github.com/deepcam-cn/yolov5-face.

Other models come from the AI-Benchmark APK: https://ai-benchmark.com/ranking_IoT.html.

Some of the models are Synaptics proprietary, including test models, object detection (mobilenet224), super-resolution and format conversion models.

The model test_64_128x128_5_132_132 has been designed to take maximum advantage of the computational capabilities of the NPU. It consists of 64 5x5 convolutions with a [1, 128, 128, 132] input and output. Its execution requires 913,519,411,200 operations (about 0.913 TOPs). Inference times show that in the right conditions VS640 and SL1640 achieve above 1.6 TOP/s, while VS680 and SL1680 are able to achieve above 7.9 TOP/s. For 16-bit inference the maximum throughput can be achieved with test_64_64x64_5_132_132: with this model we achieve 0.45 TOP/s on VS640/SL1640 and above 1.7 TOP/s on VS680/SL1680. For actual models used in practice it is very difficult to get close to this level of performance, and it is hard to predict the inference time of a model from the number of operations it contains. The only reliable way is to execute the model on the platform and measure.
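As an example of the arithmetic, throughput in TOP/s is simply the operation count divided by the measured inference time; the 114 ms below is a hypothetical measurement, not a reference timing:

    # TOP/s = operations / inference time (time converted from milliseconds)
    ops=913519411200   # operations in one inference of test_64_128x128_5_132_132
    ms=114             # hypothetical measured inference time in milliseconds
    echo "scale=2; $ops / ($ms * 10^9)" | bc   # prints ~8.01, i.e. about 8 TOP/s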

Remarks:

  • In the following tables all timing values are expressed in milliseconds

  • The Online CPU and Online NPU columns report the inference times obtained by running the original tflite model directly on the board (online conversion)

  • Online CPU tests have been done with 4 threads (--num_threads=4) on both VS680 and VS640

  • Online CPU tests of floating-point models on VS640 have been done in fp16 mode (--allow_fp16=true)

  • Online NPU tests have been done with the timvx delegate (--external_delegate_path=libvx_delegate.so); example invocations are shown after these remarks

  • The Offline NPU Infer column reports the inference time obtained with a model converted offline using the SyNAP toolkit (median time over 10 consecutive inferences)

  • The Online timings represent the minimum time measured (for both init and inference). We took the minimum instead of the average because this measure is less sensitive to outliers caused by the test process being temporarily suspended by the CPU scheduler

  • Online timings, in particular for init and CPU inference, can be influenced by other processes running on the board and by the total amount of free memory available. We ran all tests on 64-bit Android AOSP with 4 GB of memory on VS680 and 2 GB on VS640. Running on Android GMS, on a 32-bit OS, or with less memory can result in longer init and inference times

  • Timings for SL1640 and SL1680 correspond to those of VS640 and VS680, respectively

  • Offline tests have been done with non-contiguous memory allocation and no cache flush

  • Models marked with * come precompiled and preinstalled on the platform
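As an example of the online test setup described above, the measurements can be reproduced with the standard TensorFlow Lite benchmark_model tool, assuming the tool and the timvx delegate library are available on the board (paths and the model name are illustrative):

    # Online CPU inference, 4 threads (add --allow_fp16=true for float models on VS640)
    ./benchmark_model --graph=mobilenet_v2_1.0_224_quant.tflite --num_threads=4

    # Online NPU inference through the timvx external delegate
    ./benchmark_model --graph=mobilenet_v2_1.0_224_quant.tflite \
        --external_delegate_path=libvx_delegate.so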

Table 2 Inference timings on VS680, 64-bit OS, 4 GB memory

Model                                        | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer
---------------------------------------------|------------------|------------------|-----------------|------------------|------------------|------------------
inception_v4_299_quant                       | 610.15           |                  | 24486           | 18.93            | 100.79           | 19.59
mobilenet_v1_0.25_224_quant *                | 2.68             |                  | 250             | 1.30             | 3.70             | 0.77
mobilenet_v2_1.0_224_quant *                 | 16.62            |                  | 1353            | 2.46             | 10.98            | 1.79
convert_nv12@1920x1080_rgb@1920x1080 *       |                  |                  |                 |                  | 17.52            | 32.30
convert_nv12@1920x1080_rgb@224x224 *         |                  |                  |                 |                  | 14.25            | 1.49
convert_nv12@1920x1080_rgb@640x360 *         |                  |                  |                 |                  | 13.55            | 5.14
sr_fast_y_uv_1280x720_3840x2160 *            |                  | 299              |                 | 36.26            | 18.09            | 12.04
sr_fast_y_uv_1920x1080_3840x2160 *           |                  | 776              |                 | 56.49            | 20.40            | 17.50
sr_qdeo_y_uv_1280x720_3840x2160 *            |                  | 153              |                 | 36.34            | 21.84            | 21.50
sr_qdeo_y_uv_1920x1080_3840x2160 *           |                  | 246              |                 | 41.96            | 24.11            | 26.97
posenet_mobilenet_075_float *                | 43.03            |                  |                 |                  |                  | 53.71
posenet_mobilenet_075_quant                  | 39.01            |                  | 564             | 6.89             | 1.84             | 2.32
mobilenet224_full80 *                        |                  |                  |                 |                  | 755.52           | 26.14
yolov5m-640x480                              |                  |                  | 11464           | 167.74           | 54.11            | 118.82
yolov5s-640x480                              |                  |                  | 4506            | 111.81           | 22.17            | 75.83
yolov5s_face_640x480_onnx_mq *               |                  |                  |                 |                  | 21.98            | 35.31
mobilenet224_full1 *                         |                  |                  |                 |                  | 615.31           | 16.02
deeplab_v3_plus_quant                        | 297.82           |                  | 4693            | 62.48            | 7.68             | 59.81
dped_quant                                   | 335.63           |                  | 1191            | 9.58             | 4.74             | 8.82
inception_v3_float                           | 432.57           |                  |                 |                  |                  | 415.76
inception_v3_quant                           | 328.16           |                  | 13504           | 10.62            | 59.55            | 10.22
mobilenet_v2_b4_quant                        | 67.14            |                  | 1445            | 14.08            | 11.53            | 13.63
mobilenet_v2_float                           | 28.16            |                  |                 |                  |                  | 29.84
mobilenet_v2_quant                           | 16.57            |                  | 1431            | 2.65             | 9.27             | 1.98
mobilenet_v3_quant                           | 59.63            |                  | 1760            | 10.62            | 13.15            | 10.15
pynet_quant                                  | 1067.53          |                  | 6389            | 19.76            | 24.45            | 19.30
srgan_quant                                  | 1816.22          |                  | 4680            | 56.43            | 14.72            | 56.95
unet_quant                                   | 288.01           |                  | 906             | 10.38            | 7.73             | 14.80
vgg_quant                                    | 1641.18          |                  | 1987            | 30.50            | 10.74            | 30.07
test_64_128x128_5_132_132                    |                  |                  |                 |                  | 50.07            | 119.34
sublima_cnn_model_relu_400_pruned_mq         |                  |                  |                 |                  | 1.49             | 45.69
sublima_cnn_model_relu_400_pruned_uint8      |                  |                  |                 |                  | 1.49             | 27.12
sublima_cnn_model_relu_400_pruned_uint8full  |                  |                  |                 |                  | 1.49             | 26.25

Table 3 Inference timings on VS640, 64-bit OS, 2 GB memory

Model                                        | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer
---------------------------------------------|------------------|------------------|-----------------|------------------|------------------|------------------
inception_v4_299_quant                       | 1006.29          |                  | 38736           | 54.07            | 127.13           | 53.82
mobilenet_v1_0.25_224_quant *                | 5.12             |                  | 381             | 1.82             | 4.99             | 0.93
mobilenet_v2_1.0_224_quant *                 | 29.30            |                  | 1947            | 3.20             | 14.21            | 2.31
convert_nv12@1920x1080_rgb@1920x1080 *       |                  |                  |                 |                  | 17.48            | 34.49
convert_nv12@1920x1080_rgb@224x224 *         |                  |                  |                 |                  | 15.14            | 1.25
convert_nv12@1920x1080_rgb@640x360 *         |                  |                  |                 |                  | 14.70            | 5.29
sr_fast_y_uv_1280x720_3840x2160 *            |                  | 309              |                 | 54.75            | 17.87            | 17.01
sr_fast_y_uv_1920x1080_3840x2160 *           |                  | 618              |                 | 88.67            | 20.35            | 25.90
sr_qdeo_y_uv_1280x720_3840x2160 *            |                  |                  |                 |                  | 20.33            | 26.16
sr_qdeo_y_uv_1920x1080_3840x2160 *           |                  |                  |                 |                  | 22.03            | 33.56
posenet_mobilenet_075_float *                | 125.53           |                  |                 |                  |                  | 90.06
posenet_mobilenet_075_quant                  | 49.06            |                  | 827             | 10.80            | 2.48             | 4.13
mobilenet224_full80 *                        |                  |                  |                 |                  | 718.96           | 52.98
yolov5m-640x480                              |                  |                  | 17885           | 234.41           | 60.64            | 178.00
yolov5s-640x480                              |                  |                  | 7132            | 145.38           | 24.90            | 103.36
yolov5s_face_640x480_onnx_mq *               |                  |                  |                 |                  | 27.31            | 63.06
mobilenet224_full1 *                         |                  |                  |                 |                  | 595.52           | 36.53
deeplab_v3_plus_quant                        | 442.51           |                  | 4877            | 84.37            | 8.31             | 70.85
dped_quant                                   | 630.28           |                  | 1287            | 26.65            | 6.71             | 25.72
inception_v3_float                           | 991.44           |                  |                 |                  |                  | 706.98
inception_v3_quant                           | 536.96           |                  | 21292           | 31.00            | 80.70            | 29.82
mobilenet_v2_b4_quant                        | 120.55           |                  | 2144            | 19.72            | 13.60            | 18.39
mobilenet_v2_float                           | 70.24            |                  |                 |                  |                  | 49.92
mobilenet_v2_quant                           | 29.37            |                  | 2097            | 3.33             | 13.81            | 2.44
mobilenet_v3_quant                           | 107.90           |                  | 2770            | 13.62            | 16.38            | 11.91
pynet_quant                                  | 1932.73          |                  | 10494           | 59.03            | 31.04            | 56.30
srgan_quant                                  | 2766.75          |                  | 5232            | 121.92           | 15.97            | 121.75
unet_quant                                   | 543.22           |                  | 1474            | 19.93            | 9.93             | 24.20
vgg_quant                                    | 2969.34          |                  | 2871            | 103.66           | 10.65            | 102.65
test_64_128x128_5_132_132                    |                  |                  |                 |                  | 63.95            | 563.81

Super Resolution

Synaptics provides two proprietary families of super-resolution models: fast and qdeo. The former provides better inference time, the latter better upscaling quality. They can be tested using the synap_cli_ip application, see synap_cli_ip Application.

These models are preinstalled in $MODELS/image_processing/super_resolution.
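A quick way to check what is installed on the target is to list that directory (this assumes a shell on the board and the $MODELS variable set as in the SyNAP environment):

    # List the preinstalled super-resolution models; names correspond to Table 4
    ls $MODELS/image_processing/super_resolution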

Table 4 Synaptics SuperResolution Models on Y+UV Channels

Name                             | Input Image | Output Image | Factor
---------------------------------|-------------|--------------|-------
sr_fast_y_uv_960x540_3840x2160   | 960x540     | 3840x2160    | 4
sr_fast_y_uv_1280x720_3840x2160  | 1280x720    | 3840x2160    | 3
sr_fast_y_uv_1920x1080_3840x2160 | 1920x1080   | 3840x2160    | 2
sr_qdeo_y_uv_960x540_3840x2160   | 960x540     | 3840x2160    | 4
sr_qdeo_y_uv_1280x720_3840x2160  | 1280x720    | 3840x2160    | 3
sr_qdeo_y_uv_1920x1080_3840x2160 | 1920x1080   | 3840x2160    | 2
sr_qdeo_y_uv_640x360_1920x1080   | 640x360     | 1920x1080    | 3

Format Conversion

Conversion models can be used to convert an image from NV12 format to RGB. A set of models is provided for the most commonly used resolutions. These models have been generated by taking advantage of the preprocessing feature of the SyNAP toolkit (see Preprocessing) and can be used to convert an image so that it can be fed to a processing model with RGB input.

These models are preinstalled in $MODELS/image_processing/preprocess and can be tested using the synap_cli_ic2 application, see synap_cli_ic2 Application.
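The model names encode the conversion performed: input format and resolution, followed by output format and resolution. The following snippet is an illustrative helper, not part of the SDK, showing how the naming convention can be parsed in a shell script:

    # Decode a model name of the form convert_nv12@<in_res>_rgb@<out_res>
    name="convert_nv12@1920x1080_rgb@224x224"
    in_res=${name#convert_nv12@}; in_res=${in_res%%_rgb@*}
    out_res=${name##*_rgb@}
    echo "NV12 ${in_res} -> RGB ${out_res}"   # prints: NV12 1920x1080 -> RGB 224x224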

Table 5 Synaptics Conversion Models NV12 to RGB 224x224

Name                                 | Input Image (NV12) | Output Image (RGB)
-------------------------------------|--------------------|-------------------
convert_nv12@426x240_rgb@224x224     | 426x240            | 224x224
convert_nv12@640x360_rgb@224x224     | 640x360            | 224x224
convert_nv12@854x480_rgb@224x224     | 854x480            | 224x224
convert_nv12@1280x720_rgb@224x224    | 1280x720           | 224x224
convert_nv12@1920x1080_rgb@224x224   | 1920x1080          | 224x224
convert_nv12@2560x1440_rgb@224x224   | 2560x1440          | 224x224
convert_nv12@3840x2160_rgb@224x224   | 3840x2160          | 224x224
convert_nv12@7680x4320_rgb@224x224   | 7680x4320          | 224x224

Table 6 Synaptics Conversion Models NV12 to RGB 640x360

Name                                 | Input Image (NV12) | Output Image (RGB)
-------------------------------------|--------------------|-------------------
convert_nv12@426x240_rgb@640x360     | 426x240            | 640x360
convert_nv12@640x360_rgb@640x360     | 640x360            | 640x360
convert_nv12@854x480_rgb@640x360     | 854x480            | 640x360
convert_nv12@1280x720_rgb@640x360    | 1280x720           | 640x360
convert_nv12@1920x1080_rgb@640x360   | 1920x1080          | 640x360
convert_nv12@2560x1440_rgb@640x360   | 2560x1440          | 640x360
convert_nv12@3840x2160_rgb@640x360   | 3840x2160          | 640x360
convert_nv12@7680x4320_rgb@640x360   | 7680x4320          | 640x360

Table 7 Synaptics Conversion Models NV12 to RGB 1920x1080

Name                                  | Input Image (NV12) | Output Image (RGB)
--------------------------------------|--------------------|-------------------
convert_nv12@426x240_rgb@1920x1080    | 426x240            | 1920x1080
convert_nv12@640x360_rgb@1920x1080    | 640x360            | 1920x1080
convert_nv12@854x480_rgb@1920x1080    | 854x480            | 1920x1080
convert_nv12@1280x720_rgb@1920x1080   | 1280x720           | 1920x1080
convert_nv12@1920x1080_rgb@1920x1080  | 1920x1080          | 1920x1080
convert_nv12@2560x1440_rgb@1920x1080  | 2560x1440          | 1920x1080
convert_nv12@3840x2160_rgb@1920x1080  | 3840x2160          | 1920x1080
convert_nv12@7680x4320_rgb@1920x1080  | 7680x4320          | 1920x1080