Improving the speed

Our implementation is massively multithreaded and is SIMD and GPGPU accelerated, but it can still be slow on some low-end devices.

This section explains how to improve the speed (frame rate).

Restrict the format

By default the detector searches for both E-13B and CMC-7 lines to make sure the application works with all formats. Depending on your country, you will usually need to detect only one format. To speed up the detection process we recommend changing the format (JSON configuration entry: “format”) to “e13b” or “cmc7” instead of “e13b+cmc7”.

Configuration entry: format
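
As an illustration, the configuration can be built as a JSON string with only the “format” entry changed. This is a minimal sketch: the helper name build_config is hypothetical, and how the resulting string is handed to the SDK is not shown here.

```python
import json

# The three values the text lists for the "format" entry.
SUPPORTED_FORMATS = ("e13b", "cmc7", "e13b+cmc7")


def build_config(micr_format):
    """Return a JSON configuration string that sets the "format" entry.

    build_config is a hypothetical helper, not part of the SDK.
    """
    if micr_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {micr_format!r}")
    return json.dumps({"format": micr_format})


# Restrict detection to E-13B only.
print(build_config("e13b"))
```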

GPU / CPU workload balancing

A device contains a CPU and a GPU. Both can be used for math operations. Enable the gpgpu_workload_balancing_enabled configuration entry to use both units. On some devices the CPU is faster than the GPU and on others it’s slower. When the application starts, the work (math operations to perform) is divided equally: 50% for the CPU and 50% for the GPU. Our code contains a profiler that determines which unit is faster and by how much (as a percentage). The profiler then changes how the work is divided based on the time each unit takes to complete its part. This is why this configuration entry is named “workload balancing”.

On x86 there is a known issue, so we only recommend enabling this option on ARM devices. On ARM devices this could speed up the detection by up to 100%.

Configuration entry: gpgpu_workload_balancing_enabled
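
The idea of dividing the work according to measured times can be sketched as follows. This is only an illustration of proportional balancing under simple assumptions (each unit's throughput is constant and the timings are given). It is not the SDK's profiler, and the function name rebalance is hypothetical.

```python
def rebalance(cpu_share, cpu_time, gpu_time):
    """Return the CPU share of the work for the next round.

    cpu_share: fraction of the work the CPU handled in the last round
               (strictly between 0 and 1); the GPU handled the rest.
    cpu_time, gpu_time: time each unit took to finish its part
                        (same unit, both positive).
    """
    if not 0.0 < cpu_share < 1.0:
        raise ValueError("cpu_share must be strictly between 0 and 1")
    if cpu_time <= 0 or gpu_time <= 0:
        raise ValueError("times must be positive")
    cpu_rate = cpu_share / cpu_time          # work per time unit on the CPU
    gpu_rate = (1.0 - cpu_share) / gpu_time  # work per time unit on the GPU
    # Each unit gets a share proportional to its measured throughput.
    return cpu_rate / (cpu_rate + gpu_rate)


# Start with the 50/50 split, then rebalance after one measured round
# (hypothetical timings: CPU 10 ms, GPU 30 ms for the same amount of work).
share = rebalance(0.5, cpu_time=10.0, gpu_time=30.0)
print(f"CPU: {share:.0%}, GPU: {1 - share:.0%}")
```

In this example the CPU finished the same amount of work in a third of the time, so it receives three quarters of the work in the next round.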

Segmenter

As explained in the configuration section, the segmenter accuracy accepts 5 values (JSON strings): veryhigh, high, medium, low, and verylow. The default value is high. If you’re using a low-end mobile device, consider using medium.

Configuration entry: segmenter_accuracy

Backpropagation

Disable backpropagation if the MICR lines are clear enough.

Configuration entry: backpropagation_enabled

Interpolation

As explained in the configuration section, the interpolation operation accepts 3 values (JSON strings): bicubic, bilinear, and nearest. The default value is bilinear. The interpolation operations are used when pixels are scaled, deskewed or deslanted. bicubic offers the best quality but is slow as there is no SIMD or GPU acceleration yet. bilinear and nearest interpolations are multithreaded and SIMD accelerated.

For most scenarios bilinear interpolation is good enough to provide accurate results while the code still runs very fast. Change the interpolation value to bicubic if you’re getting low recognition scores.

Configuration entry: interpolation
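
The segmenter_accuracy, backpropagation_enabled and interpolation entries can be set together. The sketch below builds such a configuration for a low-end device. The entry names and string values come from the text above. The helper name low_end_config is hypothetical, and the use of a JSON boolean for backpropagation_enabled is an assumption to check against the configuration section.

```python
import json

# Accepted string values, as listed in the text.
SEGMENTER_ACCURACIES = ("veryhigh", "high", "medium", "low", "verylow")
INTERPOLATIONS = ("bicubic", "bilinear", "nearest")


def low_end_config(segmenter_accuracy="medium",
                   interpolation="bilinear",
                   backpropagation_enabled=False):
    """Return a JSON configuration string tuned for speed.

    low_end_config is a hypothetical helper, not part of the SDK.
    """
    if segmenter_accuracy not in SEGMENTER_ACCURACIES:
        raise ValueError(f"bad segmenter_accuracy: {segmenter_accuracy!r}")
    if interpolation not in INTERPOLATIONS:
        raise ValueError(f"bad interpolation: {interpolation!r}")
    return json.dumps({
        "segmenter_accuracy": segmenter_accuracy,
        # Assumption: this entry takes a JSON boolean.
        "backpropagation_enabled": bool(backpropagation_enabled),
        "interpolation": interpolation,
    }, indent=2)


print(low_end_config())
```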

Region of interest

Unlike other applications you can find on the market, we don’t define a region of interest (ROI) by default: the entire frame is processed to look for MICR lines. The default resolution used in our sample application is HD (720p). If you’re using a low-end mobile device, consider setting a region of interest instead of downscaling the resolution.

Configuration entry: roi
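
To see why a region of interest helps, compare the number of pixels to process. The ROI size below is made up for illustration, and the sketch deliberately does not show the value layout expected by the roi entry.

```python
def processed_fraction(frame_w, frame_h, roi_w, roi_h):
    """Return the fraction of the frame's pixels that fall inside the ROI."""
    if not (0 < roi_w <= frame_w and 0 < roi_h <= frame_h):
        raise ValueError("ROI must be non-empty and fit inside the frame")
    return (roi_w * roi_h) / (frame_w * frame_h)


# 720p frame with a hypothetical ROI covering one third of its height.
fraction = processed_fraction(1280, 720, 1280, 240)
print(f"pixels processed: {fraction:.1%} of the full frame")
```

Unlike downscaling, the pixels inside the ROI keep their original resolution, so the characters are not made smaller.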

Device orientation

When the device is in portrait mode, the image is rotated by 90 or 270 degrees (an odd multiple of 90 degrees). In landscape mode it’s rotated by 0 or 180 degrees (a multiple of 180 degrees). On some devices the image could also be horizontally or vertically mirrored in addition to being rotated.

Our deep learning model can natively handle rotations of up to 45 degrees but not 90, 180 or 270 degrees. There is a pre-processing operation to rotate the image back to 0 degrees and remove the mirroring effect, but this operation could be time consuming on some mobile devices. We recommend using the device in landscape mode to avoid the pre-processing operation.

Memory alignment

Make sure to provide memory aligned data to the SDK. On ARM the preferred alignment is 16-byte (NEON) while on x86 it’s 32-byte (AVX). If the input data is an image and the width isn’t aligned to the preferred alignment size, then it should be strided. Please check the memory management section for more information.
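
As a rough illustration of striding, the sketch below rounds a row width up to the preferred alignment and copies an unpadded 8-bit image into a padded buffer. The function names are hypothetical. The sketch only covers the row stride: the alignment of the buffer's start address is a separate concern and depends on how the memory is allocated.

```python
def aligned_stride(width_in_bytes, alignment):
    """Round width_in_bytes up to the next multiple of alignment.

    alignment must be a power of two (16 for NEON, 32 for AVX).
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (width_in_bytes + alignment - 1) & ~(alignment - 1)


def to_strided(pixels, width, height, alignment):
    """Copy a tightly packed 8-bit image into a buffer with padded rows.

    Returns (buffer, stride). Padding bytes are left at zero.
    """
    if len(pixels) != width * height:
        raise ValueError("pixels must contain exactly width * height bytes")
    stride = aligned_stride(width, alignment)
    out = bytearray(stride * height)
    for row in range(height):
        start = row * stride
        out[start:start + width] = pixels[row * width:(row + 1) * width]
    return bytes(out), stride


# A 1277-pixel-wide grayscale row needs 3 bytes of padding on both targets.
print(aligned_stride(1277, 16), aligned_stride(1277, 32))
```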

Planar formats

Both the detector and the recognizer expect a grayscale image as input, but most likely your camera doesn’t support such a format. Your camera will probably output YUV frames. Converting YUV frames to grayscale is a no-op (very fast): we just need to map the Y plane.
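
Mapping the Y plane can be sketched as below, assuming the frame is a single contiguous buffer in a layout where the luma plane comes first (for example I420, NV12 or NV21). The function name y_plane is hypothetical. A memoryview is used so that no pixel data is copied.

```python
def y_plane(frame, width, height, stride=0):
    """Return a zero-copy view on the Y (luma) plane of a YUV frame.

    frame:  bytes-like object holding the whole frame, Y plane first.
    stride: bytes per row of the Y plane; defaults to width (no padding).
    """
    stride = stride or width
    if stride < width:
        raise ValueError("stride cannot be smaller than width")
    needed = stride * height
    if len(frame) < needed:
        raise ValueError("frame is too small for the given dimensions")
    # Slicing a memoryview creates a new view, not a copy of the pixels.
    return memoryview(frame)[:needed]


# Tiny 4x2 NV21-like frame: 8 luma bytes followed by 4 chroma bytes.
frame = bytes(range(8)) + bytes([128] * 4)
gray = y_plane(frame, width=4, height=2)
print(len(gray))
```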

Reducing camera frame rate

The CPU is a shared resource and all background tasks compete for their share of it. Requesting high-resolution images at a high frame rate from the camera means it will take a big share. A frame rate above 25 fps brings no benefit. What matters most is the frame resolution: the higher the resolution, the better the detection and recognition quality will be. Try to use a very high resolution (2K if possible) with a low frame rate.