Reference Models
Timings
The tables in this section contain inference timings for a set of representative models. The quantized models have been imported and compiled offline using the SyNAP toolkit. The floating point models are benchmarked for comparison purposes with the corresponding quantized models.
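For reference, offline import and compilation is a single invocation of the SyNAP toolkit. The sketch below assumes the toolkit's synap convert front-end and uses placeholder file names; the exact command and option names depend on the toolkit version installed, so refer to the toolkit documentation.

```sh
# Sketch only: compile a quantized tflite model offline for a specific target SoC.
# Model file, metafile and output directory are placeholders.
synap convert --model mobilenet_v2_1.0_224_quant.tflite \
              --meta mobilenet_v2_1.0_224_quant.yaml \
              --target VS680 \
              --out-dir mobilenet_v2_compiled
```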
The mobilenet_v1, mobilenet_v2, posenet and inception models are open-source models available in tflite format from the TensorFlow Hosted Models page: https://www.tensorflow.org/lite/guide/hosted_models. The yolov5 models are available from https://github.com/ultralytics/yolov5, while yolov5_face comes from https://github.com/deepcam-cn/yolov5-face. Other models come from the AI-Benchmark APK: https://ai-benchmark.com/ranking_IoT.html.
Some of the models are Synaptics proprietary, including test models, object detection (mobilenet224), super-resolution and format conversion models.
The model test_64_128x128_5_132_132 has been designed to take maximum advantage of the computational capabilities of the NPU. It consists of 64 5x5 convolutions with a [1, 128, 128, 132] input and output, and its execution requires 913,519,411,200 operations (0.913 TOPs). The inference times show that in the right conditions VS640 and SL1640 achieve above 1.6 TOP/s, while VS680 and SL1680 are able to achieve above 7.9 TOP/s. For 16-bit inference the maximum TOP/s can be achieved with test_64_64x64_5_132_132: with this model we achieve 0.45 TOP/s on VS640/SL1640 and above 1.7 TOP/s on VS680/SL1680. For actual models used in practice it is very difficult to get close to this level of performance, and it is hard to predict the inference time of a model from the number of operations it contains. The only reliable way is to execute the model on the platform and measure it.
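As a cross-check, the throughput figures follow directly from the operation count divided by the measured inference time. Applying the same arithmetic to the offline inference times reported in the tables below (which were measured with non-contiguous memory allocation and no cache flush, hence slightly below the peak figures quoted above) gives:

```
913,519,411,200 ops / 119.34 ms ≈ 7.7 TOP/s   (VS680 / SL1680)
913,519,411,200 ops / 563.81 ms ≈ 1.6 TOP/s   (VS640 / SL1640)
```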
Remarks:

- In the following tables all timing values are expressed in milliseconds.
- The Online CPU and Online NPU columns represent the inference time obtained by running the original tflite model directly on the board (online conversion); example commands are shown after these remarks.
- Online CPU tests have been done with 4 threads (--num_threads=4) on both VS680 and VS640.
- Online CPU tests of floating point models on VS640 have been done in fp16 mode (--allow_fp16=true).
- Online NPU tests have been executed with the timvx delegate (--external_delegate_path=libvx_delegate.so).
- The Offline Infer column represents the inference time obtained by using a model converted offline with the SyNAP toolkit (median time over 10 consecutive inferences).
- The Online timings represent the minimum time measured (for both init and inference). We took the minimum instead of the average because this measure is less sensitive to outliers caused by the test process being temporarily suspended by the CPU scheduler.
- Online timings, in particular for init and CPU inference, can be influenced by other processes running on the board and by the total amount of free memory available. We ran all tests on Android AOSP 64-bit with 4GB of memory on VS680 and 2GB on VS640. Running on Android GMS, on a 32-bit OS, or with less memory can result in longer init and inference times.
- Offline tests have been done with non-contiguous memory allocation and no cache flush.
- Models marked with * come precompiled and preinstalled on the platform.
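For reference, the online timings can be reproduced with the standard TensorFlow Lite benchmark_model tool. The commands below are only a sketch: the benchmark binary, the model file and the delegate library path on the board are placeholders.

```sh
# Online CPU inference with 4 threads (add --allow_fp16=true for float models on VS640)
./benchmark_model --graph=mobilenet_v2_1.0_224_quant.tflite --num_threads=4

# Online NPU inference through the timvx external delegate
./benchmark_model --graph=mobilenet_v2_1.0_224_quant.tflite \
    --external_delegate_path=libvx_delegate.so
```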
Inference timings on VS680 and SL1680
These tables show the inference timings for a set of models on VS680 and SL1680. All tests have been done on a 64-bit OS with 4GB of memory.

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| convert_nv12@1920x1080_rgb@1920x1080 | | | | | 17.52 | 30.81 | * |
| convert_nv12@1920x1080_rgb@224x224 | | | | | 14.25 | 1.27 | * |
| convert_nv12@1920x1080_rgb@640x360 | | | | | 13.55 | 5.14 | * |
| sr_fast_y_uv_1280x720_3840x2160 | 317 | 32.88 | | | 18.09 | 11.46 | * |
| sr_fast_y_uv_1920x1080_3840x2160 | 776 | 50.56 | | | 20.40 | 17.50 | * |
| sr_qdeo_y_uv_1280x720_3840x2160 | 149 | 32.49 | | | 21.84 | 20.59 | * |
| sr_qdeo_y_uv_1920x1080_3840x2160 | 233 | 38.41 | | | 24.11 | 25.84 | * |
| mobilenet224_full80 | | | | | 66.28 | 25.17 | * |
| mobilenet224_full1 | | | | | 57.71 | 14.23 | * |
| test_64_128x128_5_132_132 | | | | | 50.07 | 119.34 | |

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| inception_v4_299_quant | 500.54 | | 13502 | 17.80 | 100.79 | 19.59 | |
| mobilenet_v1_0.25_224_quant | 3.37 | | 166 | 0.81 | 2.61 | 0.77 | * |
| mobilenet_v2_1.0_224_quant | 18.60 | | 854 | 1.85 | 6.13 | 1.79 | * |
| posenet_mobilenet_075_float | 34.44 | 61.78 | | | | | * |
| posenet_mobilenet_075_quant | 28.60 | | 382 | 6.01 | 1.84 | 2.32 | |
| yolov8s-pose | | | | | 14.61 | 30.79 | * |
| yolov5m-640x480 | | | 6606 | 113.88 | 54.11 | 118.82 | |
| yolov5s-640x480 | | | 2672 | 72.27 | 22.17 | 75.83 | |
| yolov5s_face_640x480_onnx_mq | | | | | 13.00 | 31.88 | * |

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| deeplab_v3_plus_quant | 231.76 | | 4090 | 60.73 | 7.68 | 59.81 | |
| dped_quant | 335.63 | | 1019 | 8.93 | 4.74 | 8.82 | |
| inception_v3_float | 370.95 | 436.74 | | | | | |
| inception_v3_quant | 267.95 | | 7210 | 9.47 | 59.55 | 10.22 | |
| mobilenet_v2_b4_quant | 53.54 | | 875 | 12.50 | 11.53 | 13.63 | |
| mobilenet_v2_float | 20.84 | 35.40 | | | | | |
| mobilenet_v2_quant | 18.72 | | 886 | 2.02 | 9.27 | 1.98 | |
| mobilenet_v3_quant | 51.89 | | 1089 | 9.76 | 13.15 | 10.15 | |
| pynet_quant | 976.61 | | 3175 | 18.56 | 24.45 | 19.30 | |
| srgan_quant | 1513.86 | | 3517 | 54.58 | 14.72 | 56.95 | |
| unet_quant | 265.51 | | 487 | 9.34 | 7.73 | 14.80 | |
| vgg_quant | 1641.18 | | 2177 | 29.77 | 10.74 | 30.07 | |

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| crnn_float | 240.53 | 217.89 | | | | | |
| crnn_quant | 113.52 | | 40641 | 22.54 | 217.19 | 23.03 | |
| deeplab_v3_plus_float | 1181.26 | 1447.74 | | | | | |
| deeplab_v3_plus_quant | 674.10 | | 2381 | 97.98 | 16.68 | 103.05 | |
| dped_float | 4043.67 | 2353.72 | | | | | |
| dped_instance_float | 2236.11 | 1386.78 | | | | | |
| dped_instance_quant | 3422.17 | | 288 | 229.41 | 5.43 | 953.44 | |
| dped_quant | 2249.53 | | 3266 | 196.30 | 18.70 | 199.15 | |
| efficientnet_b4_float | 432.00 | 579.98 | | | | | |
| efficientnet_b4_quant | 228.86 | | 9649 | 162.06 | 54.48 | 166.33 | |
| esrgan_float | 1445.08 | 1522.63 | | | | | |
| esrgan_quant | 770.69 | | 2119 | 93.78 | 5.08 | 101.56 | |
| imdn_float | 2553.12 | 2382.78 | | | | | |
| imdn_quant | 1350.97 | | 3215 | 165.92 | 8.63 | 155.46 | |
| inception_v3_float | 371.39 | 437.60 | | | | | |
| inception_v3_quant | 221.61 | | 7254 | 10.26 | 76.98 | 11.38 | |
| mobilenet_v2_b8_float | 152.22 | 203.15 | | | | | |
| mobilenet_v2_b8_quant | 91.44 | | 889 | 25.91 | 14.69 | 27.18 | |
| mobilenet_v2_float | 20.98 | 36.08 | | | | | |
| mobilenet_v2_quant | 12.28 | | 968 | 2.11 | 10.04 | 2.07 | |
| mobilenet_v3_b4_float | 351.62 | 461.38 | | | | | |
| mobilenet_v3_b4_quant | 359.39 | | 1497 | 97.86 | 20.36 | 101.09 | |
| mobilenet_v3_float | 91.33 | 114.77 | | | | | |
| mobilenet_v3_quant | 97.05 | | 1706 | 19.82 | 15.14 | 20.92 | |
| mv3_depth_float | 132.65 | 194.37 | | | | | |
| mv3_depth_quant | 218.06 | | 1513 | 71.09 | 15.74 | 90.96 | |
| punet_float | 2612.87 | 1796.33 | | | | | |
| punet_quant | 1660.59 | | 2019 | 155.60 | 14.06 | 149.79 | |
| pynet_float | 2836.85 | 1620.06 | | | | | |
| pynet_quant | 2100.18 | | 3441 | 137.39 | 15.80 | 135.94 | |
| resnet_float | 0.10 | 2.86 | | | | | |
| resnet_quant | 0.41 | | 132 | 0.13 | 3.95 | 0.12 | |
| srgan_float | 6192.96 | 2921.47 | | | | | |
| srgan_quant | 4220.89 | | 12224 | 200.90 | 29.85 | 208.23 | |
| unet_float | 2909.00 | 2132.16 | | | | | |
| unet_quant | 1710.41 | | 775 | 69.08 | 19.11 | 95.29 | |
| vsr_float | 820.35 | 974.12 | | | | | |
| vsr_quant | 580.30 | | 2124 | 155.28 | 20.45 | 133.86 | |
| xlsr_float | 518.61 | 532.46 | | | | | |
| xlsr_quant | 470.63 | | 1700 | 36.20 | 3.93 | 31.38 | |
| yolo_v4_tiny_float | 187.81 | 157.75 | | | | | |
| yolo_v4_tiny_quant | 311.65 | | 1406 | 6.62 | 4.69 | 6.03 | |
Inference timings on VS640 and SL1640
These tables show the inference timings for a set of models on VS640 and SL1640. All tests have been done on a 64-bit OS with 2GB of memory.

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| convert_nv12@1920x1080_rgb@1920x1080 | | | | | 17.48 | 34.49 | * |
| convert_nv12@1920x1080_rgb@224x224 | | | | | 15.14 | 1.25 | * |
| convert_nv12@1920x1080_rgb@640x360 | | | | | 14.70 | 5.29 | * |
| sr_fast_y_uv_1280x720_3840x2160 | 274 | 53.03 | | | 17.87 | 17.01 | * |
| sr_fast_y_uv_1920x1080_3840x2160 | 524 | 86.39 | | | 20.35 | 25.90 | * |
| sr_qdeo_y_uv_1280x720_3840x2160 | | | | | 20.33 | 26.16 | * |
| sr_qdeo_y_uv_1920x1080_3840x2160 | | | | | 22.03 | 33.56 | * |
| mobilenet224_full80 | | | | | 718.96 | 52.98 | * |
| mobilenet224_full1 | | | | | 595.52 | 36.53 | * |
| test_64_128x128_5_132_132 | | | | | 63.95 | 563.81 | |

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| inception_v4_299_quant | 255.94 | | 21481 | 54.07 | 127.13 | 53.82 | |
| mobilenet_v1_0.25_224_quant | 2.80 | | 244 | 1.00 | 4.99 | 0.93 | * |
| mobilenet_v2_1.0_224_quant | 12.31 | | 1203 | 2.40 | 14.21 | 2.31 | * |
| posenet_mobilenet_075_float | 27.96 | 90.06 | | | | | * |
| posenet_mobilenet_075_quant | 18.70 | | 565 | 9.76 | 2.48 | 4.13 | |
| yolov8s-pose | | | | | 20.66 | 54.59 | * |
| yolov5m-640x480 | | | 10657 | 175.90 | 60.64 | 178.00 | |
| yolov5s-640x480 | | | 4264 | 101.73 | 24.90 | 103.36 | |
| yolov5s_face_640x480_onnx_mq | | | | | 27.31 | 59.63 | * |

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| deeplab_v3_plus_quant | 158.81 | | 3679 | 82.47 | 8.31 | 70.85 | |
| dped_quant | 134.25 | | 1694 | 25.84 | 6.71 | 25.72 | |
| inception_v3_float | 229.07 | 706.98 | | | | | |
| inception_v3_quant | 146.25 | | 11130 | 30.21 | 80.70 | 29.82 | |
| mobilenet_v2_b4_quant | 47.85 | | 1373 | 18.95 | 13.60 | 18.39 | |
| mobilenet_v2_float | 18.27 | 52.41 | | | | | |
| mobilenet_v2_quant | 12.44 | | 1282 | 2.57 | 13.81 | 2.44 | |
| mobilenet_v3_quant | 47.99 | | 1593 | 12.57 | 16.38 | 11.91 | |
| pynet_quant | 447.61 | | 4803 | 57.11 | 31.04 | 56.30 | |
| srgan_quant | 829.17 | | 5232 | 121.92 | 15.97 | 121.75 | |
| unet_quant | 159.01 | | 745 | 18.58 | 9.93 | 24.20 | |
| vgg_quant | 572.70 | | 3258 | 103.66 | 10.65 | 102.65 | |

| Model | Online CPU Infer | Online GPU Infer | Online NPU Init | Online NPU Infer | Offline NPU Init | Offline NPU Infer |  |
|---|---|---|---|---|---|---|---|
| crnn_float | 140.70 | 352.45 | | | | | |
| crnn_quant | 87.40 | | 70679 | 33.96 | 284.46 | 33.66 | |
| deeplab_v3_plus_float | 976.50 | 2465.12 | | | | | |
| deeplab_v3_plus_quant | 580.25 | | 3889 | 137.92 | 19.01 | 133.41 | |
| dped_float | 2197.13 | 4492.49 | | | | | |
| dped_instance_float | 1458.77 | 2556.78 | | | | | |
| dped_instance_quant | 3565.00 | | 438 | 366.10 | | | |
| dped_quant | 1166.57 | | 4025 | 340.13 | 14.97 | 326.37 | |
| efficientnet_b4_float | 396.99 | 945.97 | | | | | |
| efficientnet_b4_quant | 266.18 | | 15393 | 202.61 | 74.93 | 200.67 | |
| esrgan_float | 971.15 | 2632.86 | | | | | |
| esrgan_quant | 501.45 | | 2394 | 147.30 | 5.28 | 147.75 | |
| imdn_float | 1596.76 | 4181.09 | | | | | |
| imdn_quant | 1123.80 | | 5018 | 281.54 | 5.57 | 269.71 | |
| inception_v3_float | 228.96 | 724.73 | | | | | |
| inception_v3_quant | 157.53 | | 11300 | 30.83 | 96.69 | 30.12 | |
| mobilenet_v2_b8_float | 132.81 | 338.71 | | | | | |
| mobilenet_v2_b8_quant | 88.94 | | 1395 | 39.21 | 15.75 | 38.26 | |
| mobilenet_v2_float | 18.09 | 52.04 | | | | | |
| mobilenet_v2_quant | 11.68 | | 1395 | 2.69 | 14.11 | 2.58 | |
| mobilenet_v3_b4_float | 327.02 | 765.60 | | | | | |
| mobilenet_v3_b4_quant | 364.18 | | 2358 | 136.42 | 22.52 | 135.36 | |
| mobilenet_v3_float | 90.62 | 181.89 | | | | | |
| mobilenet_v3_quant | 100.55 | | 2586 | 24.74 | 20.79 | 23.85 | |
| mv3_depth_float | 189.75 | 331.05 | | | | | |
| mv3_depth_quant | 301.72 | | 2307 | 80.36 | 18.43 | 98.45 | |
| punet_float | 1856.32 | 3173.80 | | | | | |
| punet_quant | 1572.03 | | 2736 | 259.45 | 11.74 | 249.44 | |
| pynet_float | 2747.17 | 2833.27 | | | | | |
| pynet_quant | 2126.74 | | 5113 | 282.27 | 17.12 | 275.28 | |
| resnet_float | 0.08 | 2.16 | | | | | |
| resnet_quant | 0.04 | | 205 | 0.17 | 5.09 | 0.23 | |
| srgan_float | 3295.12 | 5420.38 | | | | | |
| srgan_quant | 1740.11 | | 17125 | 423.24 | 29.30 | 420.17 | |
| unet_float | 1726.05 | 3776.94 | | | | | |
| unet_quant | 1315.86 | | 1089 | 155.08 | 14.68 | 195.52 | |
| vsr_float | 680.60 | 1905.83 | | | | | |
| vsr_quant | 621.03 | | 1732 | 200.22 | 11.42 | 156.74 | |
| xlsr_float | 595.84 | 987.90 | | | | | |
| xlsr_quant | 570.30 | | 2063 | 42.01 | 3.27 | 41.15 | |
| yolo_v4_tiny_float | 125.44 | 254.04 | | | | | |
| yolo_v4_tiny_quant | 321.50 | | 2064 | 13.68 | 8.08 | 12.73 | |
Super Resolution
Synaptics provides two proprietary families of super resolution models: fast and qdeo. The former provides better inference time, the latter better upscaling quality.
They can be tested using the synap_cli_ip application (see the synap_cli_ip application section and the sketch below). These models are preinstalled in $MODELS/image_processing/super_resolution.
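For example, a preinstalled super resolution model can be benchmarked directly on the target. The invocation below is only a sketch: the -m option, the random input generator and the model directory layout are assumptions, so refer to the synap_cli_ip application section for the exact usage.

```sh
# Sketch only: run a preinstalled super resolution model with generated input data.
# Option names and directory layout are assumptions; see the synap_cli_ip section.
synap_cli_ip -m $MODELS/image_processing/super_resolution/<model_dir>/model.synap random
```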

| Name | Input Image | Output Image | Factor |
|---|---|---|---|
| sr_fast_y_uv_960x540_3840x2160 | 960x540 | 3840x2160 | 4 |
| sr_fast_y_uv_1280x720_3840x2160 | 1280x720 | 3840x2160 | 3 |
| sr_fast_y_uv_1920x1080_3840x2160 | 1920x1080 | 3840x2160 | 2 |
| sr_qdeo_y_uv_960x540_3840x2160 | 960x540 | 3840x2160 | 4 |
| sr_qdeo_y_uv_1280x720_3840x2160 | 1280x720 | 3840x2160 | 3 |
| sr_qdeo_y_uv_1920x1080_3840x2160 | 1920x1080 | 3840x2160 | 2 |
| sr_qdeo_y_uv_640x360_1920x1080 | 640x360 | 1920x1080 | 3 |
Format Conversion
Conversion models can be used to convert an image from NV12
format to RGB
.
A set of models is provided for the most commonly used resolutions.
These models have been generated by taking advantage of the
preprocessing feature of the SyNAP
toolkit (see Preprocessing) and can be used to convert
an image so that it can be fed to a processing model with RGB
input.
These models are preinstalled in $MODELS/image_processing/preprocess
and can be
tested using synap_cli_ic2
application, see synap_cli_ic2 application.

| Name | Input Image (NV12) | Output Image (RGB) |
|---|---|---|
| convert_nv12@426x240_rgb@224x224 | 426x240 | 224x224 |
| convert_nv12@640x360_rgb@224x224 | 640x360 | 224x224 |
| convert_nv12@854x480_rgb@224x224 | 854x480 | 224x224 |
| convert_nv12@1280x720_rgb@224x224 | 1280x720 | 224x224 |
| convert_nv12@1920x1080_rgb@224x224 | 1920x1080 | 224x224 |
| convert_nv12@2560x1440_rgb@224x224 | 2560x1440 | 224x224 |
| convert_nv12@3840x2160_rgb@224x224 | 3840x2160 | 224x224 |
| convert_nv12@7680x4320_rgb@224x224 | 7680x4320 | 224x224 |

| Name | Input Image (NV12) | Output Image (RGB) |
|---|---|---|
| convert_nv12@426x240_rgb@640x360 | 426x240 | 640x360 |
| convert_nv12@640x360_rgb@640x360 | 640x360 | 640x360 |
| convert_nv12@854x480_rgb@640x360 | 854x480 | 640x360 |
| convert_nv12@1280x720_rgb@640x360 | 1280x720 | 640x360 |
| convert_nv12@1920x1080_rgb@640x360 | 1920x1080 | 640x360 |
| convert_nv12@2560x1440_rgb@640x360 | 2560x1440 | 640x360 |
| convert_nv12@3840x2160_rgb@640x360 | 3840x2160 | 640x360 |
| convert_nv12@7680x4320_rgb@640x360 | 7680x4320 | 640x360 |

| Name | Input Image (NV12) | Output Image (RGB) |
|---|---|---|
| convert_nv12@426x240_rgb@1920x1080 | 426x240 | 1920x1080 |
| convert_nv12@640x360_rgb@1920x1080 | 640x360 | 1920x1080 |
| convert_nv12@854x480_rgb@1920x1080 | 854x480 | 1920x1080 |
| convert_nv12@1280x720_rgb@1920x1080 | 1280x720 | 1920x1080 |
| convert_nv12@1920x1080_rgb@1920x1080 | 1920x1080 | 1920x1080 |
| convert_nv12@2560x1440_rgb@1920x1080 | 2560x1440 | 1920x1080 |
| convert_nv12@3840x2160_rgb@1920x1080 | 3840x2160 | 1920x1080 |
| convert_nv12@7680x4320_rgb@1920x1080 | 7680x4320 | 1920x1080 |