Improving the speed

This section explains how to improve the speed (frame rate).

Memory alignment

Make sure to provide memory-aligned data to the SDK. On ARM the preferred alignment is 16 bytes, while on x86 it's 32 bytes. If the input data is an image and the row width isn't a multiple of the preferred alignment, then each row should be padded up to the next aligned size (i.e. the image should be strided).

Please check the memory management section for more information.
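
For illustration, here is a minimal C++ sketch of allocating a row-padded (strided) buffer at the preferred alignment; the alignment value, the helper name and the buffer layout are our own example, not part of the SDK API.

```cpp
#include <cstddef>
#include <cstdlib>

// Round "value" up to the next multiple of "alignment" (alignment must be a power of two).
static std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

int main() {
    const std::size_t width = 1280, height = 720, bytesPerPixel = 3; // RGB_888 frame
    const std::size_t alignment = 32; // 16 bytes on ARM, 32 bytes on x86 (see above)

    // Stride: each row is padded up to the next multiple of the alignment.
    const std::size_t stride = alignUp(width * bytesPerPixel, alignment);

    // stride * height is already a multiple of "alignment", as std::aligned_alloc requires.
    void* buffer = std::aligned_alloc(alignment, stride * height);

    // ... copy the camera frame into "buffer" row by row using "stride",
    //     then pass the buffer together with width, height and stride to the SDK ...

    std::free(buffer);
    return 0;
}
```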

Landscape mode

When the device is in portrait mode, the image is rotated by 90 or 270 degrees (an odd multiple of 90 degrees). In landscape mode it's rotated by 0 or 180 degrees (an even multiple of 90 degrees). On some devices the image could also be horizontally/vertically mirrored in addition to being rotated. Our deep learning model can natively handle rotations up to 45 degrees but not 90, 180 or 270 degrees. There is a pre-processing operation to rotate the image back to 0 degrees and remove the mirroring effect, but such an operation is time-consuming on mobile devices. We recommend using the device in landscape mode to avoid this pre-processing operation.
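
To give a rough idea of why this pre-processing is expensive, the sketch below shows a naive clockwise 90-degree rotation of a packed RGB_888 frame. It is an illustration of the per-pixel work involved, not the SDK's actual implementation.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Naive clockwise 90-degree rotation of a packed RGB_888 image.
// Every pixel is read and written once: O(width * height) memory traffic per frame,
// with a cache-unfriendly access pattern on the destination buffer.
std::vector<std::uint8_t> rotate90(const std::uint8_t* src, int width, int height) {
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(width) * height * 3);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::uint8_t* in = src + (static_cast<std::size_t>(y) * width + x) * 3;
            // Pixel (x, y) in the source maps to (height - 1 - y, x) in the rotated
            // image, whose row length is "height" pixels.
            std::uint8_t* out = dst.data() + (static_cast<std::size_t>(x) * height + (height - 1 - y)) * 3;
            std::memcpy(out, in, 3);
        }
    }
    return dst;
}
```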

Removing rectification layer

On ARM devices you should not add the rectification layer, which adds significant latency to the inference pipeline. The current code can already handle moderately distorted credit cards. If your images are highly distorted and require the rectification layer, then we recommend changing the camera position or using multiple cameras if possible. On x86, adding the rectification layer is not an issue.

Please check the configuration section for how to add/remove the rectification layer.
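
As an illustration only, disabling the layer typically comes down to a single boolean flag in the JSON configuration passed to the SDK. The key name below is a placeholder; check the configuration section for the actual one.

```cpp
#include <string>

int main() {
    // Placeholder key name: check the configuration section for the exact flag.
    // On ARM, leave the rectification layer disabled to keep the inference pipeline fast.
    const std::string jsonConfig = R"({
        "recogn_rectify_enabled": false
    })";
    // ... pass jsonConfig to the SDK initialization function as described in the configuration section ...
    return 0;
}
```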

Planar formats

Both the detector and the recognizer expect an RGB_888 image as input, but most likely your camera doesn't support such a format and will output YUV frames instead. If you can choose, prefer the planar formats (e.g. YUV420P) over the semi-planar ones (e.g. YUV420SP, a.k.a. NV21 or NV12). The issue with semi-planar formats is that we have to deinterleave the UV plane, which takes extra time.
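
To illustrate the extra cost, the sketch below deinterleaves the chroma plane of a semi-planar (NV12) frame into the separate U and V planes of a planar (YUV420P) layout. This is our own illustration of the operation, not the SDK's code.

```cpp
#include <cstdint>
#include <vector>

// Split the interleaved chroma plane of an NV12 frame (U,V,U,V,...) into two
// separate planes, producing the planar layout (YUV420P) the pipeline prefers.
// For NV21 (V,U,V,U,...) swap the two output assignments.
void deinterleaveUV(const std::uint8_t* uvInterleaved, int width, int height,
                    std::vector<std::uint8_t>& uPlane, std::vector<std::uint8_t>& vPlane) {
    const std::size_t chromaPixels = static_cast<std::size_t>(width / 2) * (height / 2);
    uPlane.resize(chromaPixels);
    vPlane.resize(chromaPixels);
    for (std::size_t i = 0; i < chromaPixels; ++i) {
        uPlane[i] = uvInterleaved[2 * i];     // one extra read/write per chroma sample...
        vPlane[i] = uvInterleaved[2 * i + 1]; // ...that planar formats avoid entirely
    }
}
```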

Reducing camera frame rate and resolution

The CPU is a shared resource and all background tasks compete for their share of it. Asking the camera to provide high-resolution images at a high frame rate means it will take a large share. There is no benefit to any frame rate above 25fps or any resolution above 720p (1280x720) unless you're monitoring a very large zone, in which case we recommend using multiple cameras.
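
If your camera API exposes its supported modes, a simple way to apply this advice is to select the mode closest to 1280x720 at 25fps instead of the maximum available one. The sketch below assumes the modes are available as plain structs and uses an arbitrary scoring function.

```cpp
#include <cstdlib>
#include <vector>

struct CameraMode { int width; int height; int fps; };

// Pick the supported mode closest to the 1280x720 @ 25fps target recommended above.
// Assumes "supported" is non-empty; the weighting between resolution and
// frame-rate mismatch is arbitrary.
CameraMode pickMode(const std::vector<CameraMode>& supported) {
    const int targetW = 1280, targetH = 720, targetFps = 25;
    CameraMode best = supported.front();
    double bestScore = 1e30;
    for (const CameraMode& m : supported) {
        const double score =
            std::abs(m.width - targetW) + std::abs(m.height - targetH) +
            10.0 * std::abs(m.fps - targetFps); // weight fps mismatch more heavily
        if (score < bestScore) { bestScore = score; best = m; }
    }
    return best;
}
```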