OpenVINO - benchmark
Hardware: Intel Celeron J6412 (integrated GPU, 16 EU)
All runs below use OpenVINO 2022.1's benchmark_app with default settings (asynchronous inference, THROUGHPUT hint, 60 s measurement window).
Object detection: YOLOv5s
$ ./benchmark_app -m ~/model/model.xml -d CPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading Inference Engine
[ INFO ] OpenVINO: OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] openvino_intel_cpu_plugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 94.79 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ WARNING ] images: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Network batch size: 1
Network inputs:
images (node: images) : u8 / [N,C,H,W]
Network outputs:
output (node: output) : f32 / [...]
712 (node: 712) : f32 / [...]
732 (node: 732) : f32 / [...]
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 313.38 ms
[Step 8/11] Setting optimal runtime parameters
[ INFO ] Device: CPU
[ INFO ] { NETWORK_NAME , torch-jit-export }
[ INFO ] { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 4 }
[ INFO ] { NUM_STREAMS , 4 }
[ INFO ] { AFFINITY , CORE }
[ INFO ] { INFERENCE_NUM_THREADS , 0 }
[ INFO ] { PERF_COUNT , NO }
[ INFO ] { INFERENCE_PRECISION_HINT , f32 }
[ INFO ] { PERFORMANCE_HINT , THROUGHPUT }
[ INFO ] { PERFORMANCE_HINT_NUM_REQUESTS , 0 }
[Step 9/11] Creating infer requests and preparing input blobs with data
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] images ([N,C,H,W], u8, {1, 3, 1280, 1280}, static): random (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
[ INFO ] BENCHMARK IS IN INFERENCE ONLY MODE.
[ INFO ] Input blobs will be filled once before performance measurements.
[ INFO ] First inference took 9943.55 ms
[Step 11/11] Dumping statistics report
[ INFO ] Count: 28 iterations
[ INFO ] Duration: 78439.01 ms
[ INFO ] Latency:
[ INFO ] Median: 11095.19 ms
[ INFO ] Average: 11124.67 ms
[ INFO ] Min: 11079.07 ms
[ INFO ] Max: 11305.68 ms
[ INFO ] Throughput: 0.36 FPS
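Both warnings in the run above are avoidable: the performance hint and the input layout can be passed explicitly instead of being defaulted. A minimal sketch using benchmark_app's -hint, -layout, -nstreams and -nireq options (flag spellings taken from the 2022.1 help text; verify against your build):

# pin the hint and layout instead of relying on defaults
$ ./benchmark_app -m ~/model/model.xml -d CPU -hint throughput -layout "images[NCHW]"
# or skip the hint and set streams/requests by hand
$ ./benchmark_app -m ~/model/model.xml -d CPU -nstreams 4 -nireq 4 -layout "images[NCHW]"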
$ ./benchmark_app -m ~/model/model.xml -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading Inference Engine
[ INFO ] OpenVINO: OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Intel GPU plugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 29.36 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ WARNING ] images: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Network batch size: 1
Network inputs:
images (node: images) : u8 / [N,C,H,W]
Network outputs:
output (node: output) : f32 / [...]
712 (node: 712) : f32 / [...]
732 (node: 732) : f32 / [...]
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 6507.84 ms
[Step 8/11] Setting optimal runtime parameters
[ INFO ] Device: GPU
[ INFO ] { NETWORK_NAME , torch-jit-export }
[ INFO ] { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 4 }
[ INFO ] { PERF_COUNT , NO }
[ INFO ] { MODEL_PRIORITY , MEDIUM }
[ INFO ] { GPU_HOST_TASK_PRIORITY , MEDIUM }
[ INFO ] { GPU_QUEUE_PRIORITY , MEDIUM }
[ INFO ] { GPU_QUEUE_THROTTLE , MEDIUM }
[ INFO ] { GPU_ENABLE_LOOP_UNROLLING , YES }
[ INFO ] { CACHE_DIR , }
[ INFO ] { PERFORMANCE_HINT , THROUGHPUT }
[ INFO ] { COMPILATION_NUM_THREADS , 4 }
[ INFO ] { NUM_STREAMS , 2 }
[ INFO ] { PERFORMANCE_HINT_NUM_REQUESTS , 0 }
[ INFO ] { DEVICE_ID , 0 }
[Step 9/11] Creating infer requests and preparing input blobs with data
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] images ([N,C,H,W], u8, {1, 3, 1280, 1280}, static): random (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
[ INFO ] BENCHMARK IS IN INFERENCE ONLY MODE.
[ INFO ] Input blobs will be filled once before performance measurements.
[ INFO ] First inference took 691.17 ms
[Step 11/11] Dumping statistics report
[ INFO ] Count: 92 iterations
[ INFO ] Duration: 63092.38 ms
[ INFO ] Latency:
[ INFO ] Median: 2741.97 ms
[ INFO ] Average: 2698.46 ms
[ INFO ] Min: 1234.79 ms
[ INFO ] Max: 2751.01 ms
[ INFO ] Throughput: 1.46 FPS
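The 16 EU iGPU is roughly 4x faster than the CPU on this model (1.46 vs 0.36 FPS; the numbers are self-consistent: 4 in-flight requests / 2.742 s median latency ≈ 1.46 FPS). Note that this IR was exported at 1280x1280; inference cost scales roughly with input area, so a standard 640x640 YOLOv5s export should land near 4x these figures. A hedged sketch using benchmark_app's -shape option, assuming this IR tolerates the reshape:

$ ./benchmark_app -m ~/model/model.xml -d GPU -shape "images[1,3,640,640]"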
Classification (ResNet50, MobileNetV1)
$ ./benchmark_app -m ~/model/resnet50.xml -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading Inference Engine
[ INFO ] OpenVINO: OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Intel GPU plugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 246.00 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ WARNING ] input: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Network batch size: 1
Network inputs:
input (node: input) : u8 / [N,C,H,W]
Network outputs:
features (node: features) : f32 / [...]
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 5694.90 ms
[Step 8/11] Setting optimal runtime parameters
[ INFO ] Device: GPU
[ INFO ] { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 4 }
[ INFO ] { NETWORK_NAME , torch-jit-export }
[ INFO ] { AUTO_BATCH_TIMEOUT , 1000 }
[Step 9/11] Creating infer requests and preparing input blobs with data
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] input ([N,C,H,W], u8, {1, 3, 256, 256}, static): random (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
[ INFO ] BENCHMARK IS IN INFERENCE ONLY MODE.
[ INFO ] Input blobs will be filled once before performance measurements.
[ INFO ] First inference took 101.57 ms
[Step 11/11] Dumping statistics report
[ INFO ] Count: 624 iterations
[ INFO ] Duration: 60743.61 ms
[ INFO ] Latency:
[ INFO ] Median: 389.24 ms
[ INFO ] Average: 388.42 ms
[ INFO ] Min: 173.69 ms
[ INFO ] Max: 392.76 ms
[ INFO ] Throughput: 10.27 FPS
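Unlike the other GPU runs, Step 8 here reports only three parameters, one of them AUTO_BATCH_TIMEOUT: with the THROUGHPUT hint, OpenVINO 2022.1 can wrap a GPU network in the automatic batching plugin, which collects requests into batches behind the scenes (hence the 1000 ms timeout). To benchmark the plain GPU path for comparison, force the latency hint (a sketch; same caveat about flag support in your build):

$ ./benchmark_app -m ~/model/resnet50.xml -d GPU -hint latency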
$ ./benchmark_app -m ~/model/MobileNetV1.xml -d GPU
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading Inference Engine
[ INFO ] OpenVINO: OpenVINO Runtime version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ] Device info:
[ INFO ] GPU
[ INFO ] Intel GPU plugin version ......... 2022.1.0
[ INFO ] Build ........... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(GPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 52.78 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ WARNING ] input: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[Step 6/11] Configuring input of the model
[ INFO ] Network batch size: 1
Network inputs:
input (node: input) : u8 / [N,C,H,W]
Network outputs:
features (node: features) : f32 / [...]
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 4170.93 ms
[Step 8/11] Setting optimal runtime parameters
[ INFO ] Device: GPU
[ INFO ] { NETWORK_NAME , torch-jit-export }
[ INFO ] { OPTIMAL_NUMBER_OF_INFER_REQUESTS , 4 }
[ INFO ] { PERF_COUNT , NO }
[ INFO ] { MODEL_PRIORITY , MEDIUM }
[ INFO ] { GPU_HOST_TASK_PRIORITY , MEDIUM }
[ INFO ] { GPU_QUEUE_PRIORITY , MEDIUM }
[ INFO ] { GPU_QUEUE_THROTTLE , MEDIUM }
[ INFO ] { GPU_ENABLE_LOOP_UNROLLING , YES }
[ INFO ] { CACHE_DIR , }
[ INFO ] { PERFORMANCE_HINT , THROUGHPUT }
[ INFO ] { COMPILATION_NUM_THREADS , 4 }
[ INFO ] { NUM_STREAMS , 2 }
[ INFO ] { PERFORMANCE_HINT_NUM_REQUESTS , 0 }
[ INFO ] { DEVICE_ID , 0 }
[Step 9/11] Creating infer requests and preparing input blobs with data
[ WARNING ] No input files were given: all inputs will be filled with random values!
[ INFO ] Test Config 0
[ INFO ] input ([N,C,H,W], u8, {1, 3, 256, 256}, static): random (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 60000 ms duration)
[ INFO ] BENCHMARK IS IN INFERENCE ONLY MODE.
[ INFO ] Input blobs will be filled once before performance measurements.
[ INFO ] First inference took 17.50 ms
[Step 11/11] Dumping statistics report
[ INFO ] Count: 4272 iterations
[ INFO ] Duration: 60056.18 ms
[ INFO ] Latency:
[ INFO ] Median: 56.20 ms
[ INFO ] Average: 56.18 ms
[ INFO ] Min: 23.77 ms
[ INFO ] Max: 57.59 ms
[ INFO ] Throughput: 71.13 FPS
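Recap of the four runs (60 s, THROUGHPUT hint, random inputs):

Model        Device  Input shape      First infer   Median latency  Throughput
YOLOv5s      CPU     {1,3,1280,1280}  9943.55 ms    11095.19 ms      0.36 FPS
YOLOv5s      GPU     {1,3,1280,1280}   691.17 ms     2741.97 ms      1.46 FPS
ResNet50     GPU     {1,3,256,256}     101.57 ms      389.24 ms     10.27 FPS
MobileNetV1  GPU     {1,3,256,256}      17.50 ms       56.20 ms     71.13 FPS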