Intel NCS2 Compute Stick
Intel® Neural Compute Stick 2 (Intel® NCS2)
A plug-and-play development kit for AI inference
- Build and scale on the Intel® Movidius™ Myriad™ X Vision Processing Unit (VPU) with excellent performance per watt and per dollar
- Start developing quickly on Windows® 10, Ubuntu*, or macOS*
- Develop on common frameworks and with out-of-the-box sample applications
- Run without any cloud-compute dependency
- Prototype on low-cost edge devices such as the Raspberry Pi* 3 and other ARM* host devices
Installing the NCS2 driver with OpenVINO
$ source /opt/intel/openvino_2022/setupvars.sh
$ /opt/intel/openvino_2022/install_dependencies/install_NCS_udev_rules.sh
$ ./hello_query_device
[ INFO ] OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Available devices:
[ INFO ] CPU
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] Immutable: AVAILABLE_DEVICES : ""
[ INFO ] Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
[ INFO ] Immutable: RANGE_FOR_STREAMS : 1 4
[ INFO ] Immutable: FULL_DEVICE_NAME : Intel(R) Celeron(R) J6413 @ 1.80GHz
[ INFO ] Immutable: OPTIMIZATION_CAPABILITIES : FP32 FP16 INT8 BIN EXPORT_IMPORT
[ INFO ] Immutable: CACHE_DIR : ""
[ INFO ] Mutable: NUM_STREAMS : 1
[ INFO ] Mutable: AFFINITY : CORE
[ INFO ] Mutable: INFERENCE_NUM_THREADS : 0
[ INFO ] Mutable: PERF_COUNT : NO
[ INFO ] Mutable: INFERENCE_PRECISION_HINT : f32
[ INFO ] Mutable: PERFORMANCE_HINT : ""
[ INFO ] Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]
[ INFO ] GNA
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] Immutable: AVAILABLE_DEVICES : GNA_SW
[ INFO ] Immutable: OPTIMAL_NUMBER_OF_INFER_REQUESTS : 1
[ INFO ] Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
[ INFO ] Immutable: OPTIMIZATION_CAPABILITIES : INT16 INT8 EXPORT_IMPORT
[ INFO ] Immutable: FULL_DEVICE_NAME : GNA_SW
[ INFO ] Immutable: GNA_LIBRARY_FULL_VERSION : 3.0.0.1455
[ INFO ] Mutable: GNA_SCALE_FACTOR_PER_INPUT : ""
[ INFO ] Mutable: GNA_FIRMWARE_MODEL_IMAGE : ""
[ INFO ] Mutable: GNA_DEVICE_MODE : GNA_SW_EXACT
[ INFO ] Mutable: GNA_HW_EXECUTION_TARGET : UNDEFINED
[ INFO ] Mutable: GNA_HW_COMPILE_TARGET : UNDEFINED
[ INFO ] Mutable: GNA_PWL_DESIGN_ALGORITHM : UNDEFINED
[ INFO ] Mutable: GNA_PWL_MAX_ERROR_PERCENT : 1.000000
[ INFO ] Mutable: PERFORMANCE_HINT : ""
[ INFO ] Mutable: INFERENCE_PRECISION_HINT : undefined
[ INFO ] Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 1
[ INFO ] Mutable: LOG_LEVEL : LOG_NONE
[ INFO ]
[ INFO ] GPU
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] Immutable: AVAILABLE_DEVICES : 0
[ INFO ] Immutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
[ INFO ] Immutable: RANGE_FOR_STREAMS : 1 2
[ INFO ] Immutable: OPTIMAL_BATCH_SIZE : 1
[ INFO ] Immutable: MAX_BATCH_SIZE : 1
[ INFO ] Immutable: FULL_DEVICE_NAME : Intel(R) UHD Graphics [0x4555] (iGPU)
[ INFO ] Immutable: DEVICE_TYPE : integrated
[ INFO ] Immutable: DEVICE_GOPS : f16 409.6 f32 204.8 i8 204.8 u8 204.8
[ INFO ] Immutable: OPTIMIZATION_CAPABILITIES : FP32 BIN FP16
[ INFO ] Immutable: GPU_DEVICE_TOTAL_MEM_SIZE : 3162546176
[ INFO ] Immutable: GPU_UARCH_VERSION : 11.0.0
[ INFO ] Immutable: GPU_EXECUTION_UNITS_COUNT : 16
[ INFO ] Immutable: GPU_MEMORY_STATISTICS : ""
[ INFO ] Mutable: PERF_COUNT : NO
[ INFO ] Mutable: MODEL_PRIORITY : MEDIUM
[ INFO ] Mutable: GPU_HOST_TASK_PRIORITY : MEDIUM
[ INFO ] Mutable: GPU_QUEUE_PRIORITY : MEDIUM
[ INFO ] Mutable: GPU_QUEUE_THROTTLE : MEDIUM
[ INFO ] Mutable: GPU_ENABLE_LOOP_UNROLLING : YES
[ INFO ] Mutable: CACHE_DIR : ""
[ INFO ] Mutable: PERFORMANCE_HINT : ""
[ INFO ] Mutable: COMPILATION_NUM_THREADS : 4
[ INFO ] Mutable: NUM_STREAMS : 1
[ INFO ] Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ] Mutable: DEVICE_ID : 0
[ INFO ]
[ INFO ] MYRIAD
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] Mutable: AVAILABLE_DEVICES : 1.4-ma2480
[ INFO ] Mutable: FULL_DEVICE_NAME : Intel Movidius Myriad X VPU
[ INFO ] Mutable: OPTIMIZATION_CAPABILITIES : EXPORT_IMPORT FP16
[ INFO ] Mutable: RANGE_FOR_ASYNC_INFER_REQUESTS : 3 6 1
[ INFO ] Mutable: DEVICE_THERMAL : EMPTY VALUE
[ INFO ] Mutable: DEVICE_ARCHITECTURE : MYRIAD
[ INFO ] Mutable: NUM_STREAMS : AUTO
[ INFO ] Mutable: PERFORMANCE_HINT : ""
[ INFO ] Mutable: PERFORMANCE_HINT_NUM_REQUESTS : 0
[ INFO ]
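The OPTIMIZATION_CAPABILITIES lines above are what decide which precisions each plugin will accept. A small pure-Python sketch (the dict simply transcribes the hello_query_device output above; `devices_supporting` is a hypothetical helper, not an OpenVINO API):

```python
# OPTIMIZATION_CAPABILITIES per device, copied from the hello_query_device log.
CAPABILITIES = {
    "CPU": {"FP32", "FP16", "INT8", "BIN", "EXPORT_IMPORT"},
    "GNA": {"INT16", "INT8", "EXPORT_IMPORT"},
    "GPU": {"FP32", "BIN", "FP16"},
    "MYRIAD": {"EXPORT_IMPORT", "FP16"},
}

def devices_supporting(precision):
    """Return the devices whose reported capabilities include `precision`."""
    return sorted(d for d, caps in CAPABILITIES.items() if precision in caps)

print(devices_supporting("FP16"))  # ['CPU', 'GPU', 'MYRIAD']
print(devices_supporting("FP32"))  # ['CPU', 'GPU'] -- MYRIAD is FP16-only
```

This makes the later note concrete: MYRIAD is the only inference device here that does not list FP32 among its capabilities.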
Testing
Testing with my own YOLOv5s model produced errors
$ ./benchmark_app -m ~/model/model.xml -i ~/model/1.jpg -d "MULTI:MYRIAD,GPU,CPU"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[ INFO ] For input 1 files were added.
[Step 2/11] Loading Inference Engine
[ INFO ] OpenVINO: OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] openvino_intel_cpu_plugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] GPU
[ INFO ] Intel GPU plugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] MULTI
[ INFO ] MultiDevicePlugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] MYRIAD
[ INFO ] openvino_intel_myriad_plugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(MYRIAD) performance hint will be set to THROUGHPUT.
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[ WARNING ] GPU throttling is turned on. Multi-device execution with the CPU + GPU performs best with GPU throttling hint, which releases another CPU thread (that is otherwise used by the GPU driver for active polling).
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 29.29 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Number of test configurations is calculated basing on number of input images
[ WARNING ] images: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Network batch size: 1
Network inputs:
images (node: images) : u8 / [N,C,H,W]
Network outputs:
output (node: output) : f32 / [...]
712 (node: 712) : f32 / [...]
732 (node: 732) : f32 / [...]
[Step 7/11] Loading the model to the device
W: [global] [ 552256] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
[ INFO ] Load network took 18661.76 ms
[Step 8/11] Setting optimal runtime parameters
[ INFO ] Device: MYRIAD
[ INFO ] { NETWORK_NAME , torch-jit-export }
[ INFO ] { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 12 }
[ INFO ] { MULTI_DEVICE_PRIORITIES , MYRIAD,GPU,CPU }
[ INFO ] Device: GPU
[ INFO ] { NETWORK_NAME , torch-jit-export }
[ INFO ] { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 12 }
[ INFO ] { MULTI_DEVICE_PRIORITIES , MYRIAD,GPU,CPU }
[ INFO ] Device: CPU
[ INFO ] { NETWORK_NAME , torch-jit-export }
[ INFO ] { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 12 }
[ INFO ] { MULTI_DEVICE_PRIORITIES , MYRIAD,GPU,CPU }
[Step 9/11] Creating infer requests and preparing input blobs with data
[ WARNING ] Some image input files will be duplicated: 12 files are required but only 1 are provided
[ WARNING ] Image is resized from (1280, 720) to (1280, 1280)
W: [global] [ 553256] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
W: [global] [ 554256] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
W: [global] [ 555255] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
W: [global] [ 556255] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
W: [global] [ 557255] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
[ INFO ] Test Config 0
[ INFO ] images ([N,C,H,W], u8, {1, 3, 1280, 1280}, static): /home/aibox/model/1.jpg
[Step 10/11] Measuring performance (Start inference asynchronously, 12 inference requests, limits: 60000 ms duration)
[ INFO ] BENCHMARK IS IN INFERENCE ONLY MODE.
[ INFO ] Input blobs will be filled once before performance measurements.
W: [global] [ 558254] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
[ INFO ] First inference took 856.61 ms
W: [global] [ 559254] [WatchdogThread] XLinkWriteDataWithTimeout:185 XLinkWriteDataWithTimeout is not fully supported yet. The XLinkWriteData method is called instead. Desired timeout = 12000
...
On the error: later verification suggested the problem lies with benchmark_app itself, not with the model.
Note: the NCS2 only supports FP16.
The NCS2 did not produce correct results in this test.
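The Step 5/6 warnings above stem from feeding an HWC JPEG into a model that expects the [N,C,H,W] u8 layout shown in the log. A minimal numpy sketch of that layout conversion (`to_nchw` is a hypothetical helper; the resize itself, e.g. YOLOv5-style letterboxing from 1280x720 to 1280x1280, is assumed to have been done elsewhere):

```python
import numpy as np

def to_nchw(image_hwc: np.ndarray) -> np.ndarray:
    """Convert an HWC uint8 image to the [N,C,H,W] u8 batch layout
    that the benchmark log reports for the `images` input."""
    assert image_hwc.ndim == 3 and image_hwc.shape[2] == 3
    chw = image_hwc.transpose(2, 0, 1)        # HWC -> CHW
    return np.expand_dims(chw, axis=0)        # CHW -> NCHW (batch of 1)

frame = np.zeros((1280, 1280, 3), dtype=np.uint8)  # already resized
batch = to_nchw(frame)
print(batch.shape)  # (1, 3, 1280, 1280), matching Test Config 0 above
```

Setting the layout explicitly at model-conversion or preprocessing time avoids the "layout is not set explicitly" warning seen in Step 5.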
Testing the examples from the official documentation
$ git clone --depth 1 https://github.com/openvinotoolkit/open_model_zoo
$ cd open_model_zoo/tools/model_tools
$ python3 -m pip install --upgrade pip
$ python3 -m pip install -r requirements.in
$ python3 downloader.py --name squeezenet1.1
$ python3 converter.py --name squeezenet1.1
$ ~/inference_engine_cpp_samples_build/intel64/Release/hello_classification ~/open_model_zoo/tools/model_tools/public/squeezenet1.1/FP32/squeezenet1.1.xml ~/model/1.jpg MYRIAD
[ INFO ] OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Loading model files: /home/aibox/open_model_zoo/tools/model_tools/public/squeezenet1.1/FP32/squeezenet1.1.xml
[ INFO ] model name: squeezenet1.1
[ INFO ] inputs
[ INFO ] input name: data
[ INFO ] input type: f32
[ INFO ] input shape: {1, 3, 227, 227}
[ INFO ] outputs
[ INFO ] output name: prob
[ INFO ] output type: f32
[ INFO ] output shape: {1, 1000, 1, 1}
Top 10 results:
Image /home/aibox/model/1.jpg
classid probability
------- -----------
619 0.2119141
641 0.1959229
922 0.1434326
719 0.0561829
415 0.0392151
950 0.0392151
692 0.0234222
721 0.0191193
988 0.0153503
964 0.0146637
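The "Top 10 results" table printed by hello_classification is just a descending sort of the {1, 1000, 1, 1} `prob` output. A pure-numpy sketch of that step (`top_k` is a hypothetical helper; the values below reuse a few of the squeezenet1.1 numbers above purely for illustration):

```python
import numpy as np

def top_k(probs: np.ndarray, k: int = 10):
    """Return (classid, probability) pairs sorted descending,
    in the order hello_classification prints them."""
    flat = probs.reshape(-1)               # {1, 1000, 1, 1} -> (1000,)
    idx = np.argsort(flat)[::-1][:k]       # indices of the k largest values
    return [(int(i), float(flat[i])) for i in idx]

probs = np.zeros((1, 1000, 1, 1), dtype=np.float32)
probs[0, 619], probs[0, 641], probs[0, 922] = 0.2119, 0.1959, 0.1434
for cid, p in top_k(probs, 3):
    print(f"{cid:7d} {p:.7f}")
```

The same post-processing applies unchanged to the mobilenet-v2 run below, since both models emit a {1, 1000, 1, 1} probability tensor.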
$ python3 downloader.py --name mobilenet-v2
$ python3 converter.py --name mobilenet-v2
$ ~/inference_engine_cpp_samples_build/intel64/Release/hello_classification ~/open_model_zoo/tools/model_tools/public/mobilenet-v2/FP32/mobilenet-v2.xml ~/model/1.jpg MYRIAD
[ INFO ] OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Loading model files: /home/aibox/open_model_zoo/tools/model_tools/public/mobilenet-v2/FP32/mobilenet-v2.xml
[ INFO ] model name: MOBILENET_V2
[ INFO ] inputs
[ INFO ] input name: data
[ INFO ] input type: f32
[ INFO ] input shape: {1, 3, 224, 224}
[ INFO ] outputs
[ INFO ] output name: prob
[ INFO ] output type: f32
[ INFO ] output shape: {1, 1000, 1, 1}
Top 10 results:
Image /home/aibox/model/1.jpg
classid probability
------- -----------
725 0.0995483
953 0.0653076
954 0.0500793
700 0.0444946
729 0.0383606
934 0.0344238
926 0.0308380
411 0.0287476
618 0.0236359
923 0.0223999
Reference: the official NCS2 site
Get Started with Intel® Neural Compute Stick 2
Note This article is based on the 2019 R1 release of the Intel® Distribution of OpenVINO™ toolkit.
The 2019 R1 release can no longer be downloaded; the material is badly outdated and of little reference value.
The Raspberry Pi openvinotoolkit download repository requires a VPN to access.
Starting from 2022.?, a Raspbian build is no longer provided.