Architecture overview

Supported operating systems

We support any OS with a C++11 compiler. The code has been tested on Android, iOS, Windows, Linux, Raspberry Pi 4 and many custom embedded devices (e.g. scanners).

The GitHub repository contains binaries for Android, Raspberry Pi, Windows, Linux and iOS as reference code to allow developers to test the implementation. These reference implementations come with Java, Obj-C, Python and C++ APIs. The API is common to all operating systems, which means you can develop and test your application on Android, Raspberry Pi, Windows, Linux or iOS, and when you're ready to move forward we'll provide the binaries for your OS.

Supported CPUs

We officially support any ARM32 (AArch32), ARM64 (AArch64), x86 and x86_64 architecture. The SDK has been tested on all of these CPUs.

MIPS32/64 may work but has not been tested and would be extremely slow, as no SIMD acceleration has been written for these architectures.

Almost all computer vision functions are written in assembly and accelerated with SIMD code (NEON, SSE and AVX). Some computer vision functions have been open sourced and shared in the CompV project.

Supported GPUs

We support any OpenCL 1.2+ compatible GPU for the computer vision and OCR parts.

In addition to being GPGPU accelerated, the implementation is SIMD accelerated. The GPU implementation requires support for 64-bit floating-point math (the cl_khr_fp64 extension), which is not available on most ARM Mali GPUs. On such devices the code is massively multithreaded and accelerated using assembly code and NEON instructions. When you run the code on GPU devices without support for the cl_khr_fp64 extension, you'll see the following warning:

W org.doubango.compv: **[COMPV WARN]: function: "newObj()"
W org.doubango.compv: file: "..\source\ml\ultimate_base_ml_predict_rbf.cxx"
W org.doubango.compv: line: "153"
W org.doubango.compv: message: [UltBaseMachineLearningPredictRBF] GPGPU instance requested but failed as double precision extension (cl_khr_fp64) is missing

You can safely ignore the warning.
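As an illustrative sketch (not SDK code), the presence of cl_khr_fp64 can be checked by scanning the device's space-separated extension list. In a real program that list would come from the OpenCL call clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...); the helper name below is hypothetical.

```cpp
#include <sstream>
#include <string>

// Returns true if the space-separated OpenCL extension list contains the
// double-precision extension cl_khr_fp64. In practice the string comes from
// clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...). Hypothetical helper.
static bool hasFp64(const std::string& extensions) {
    std::istringstream iss(extensions);
    std::string ext;
    while (iss >> ext) {
        if (ext == "cl_khr_fp64") {
            return true;
        }
    }
    return false;
}
```

On devices where this check fails, the SDK falls back to the NEON-accelerated CPU path, so the missing extension is a performance detail rather than an error.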

Supported programming languages

The code was developed using C++11 and assembler but the API (Application Programming Interface) has many bindings thanks to SWIG.

Bindings: ANSI-C, C++, C#, Java, ObjC, Swift, Perl, Ruby and Python.

Supported raw formats

We support the following image/video formats: RGBA32, RGB24, BGRA32, NV12, NV21, Y (grayscale), YUV420P, YVU420P, YUV422P and YUV444P. NV12 and NV21 are semi-planar formats, also known as YUV420SP.
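The semi-planar/planar distinction can be made concrete by deinterleaving the UV plane of an NV12 buffer into the separate U and V planes of YUV420P (I420). This is an illustrative sketch only; the SDK accepts both layouts directly.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert NV12 (Y plane followed by interleaved UVUV...) to YUV420P/I420
// (Y plane, then U plane, then V plane). Illustrative only.
static std::vector<uint8_t> nv12ToI420(const std::vector<uint8_t>& nv12,
                                       size_t width, size_t height) {
    const size_t ySize = width * height;
    const size_t uvPairs = ySize / 4; // one U and one V sample per 2x2 block
    std::vector<uint8_t> i420(nv12.size());
    // The Y plane is identical in both layouts.
    std::copy(nv12.begin(), nv12.begin() + ySize, i420.begin());
    // Deinterleave UVUV... into a U plane followed by a V plane.
    for (size_t i = 0; i < uvPairs; ++i) {
        i420[ySize + i] = nv12[ySize + 2 * i];               // U
        i420[ySize + uvPairs + i] = nv12[ySize + 2 * i + 1]; // V
    }
    return i420;
}
```

NV21 is the same layout as NV12 with the U and V bytes swapped, which is why both are grouped under the YUV420SP name.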

The list of supported formats is wide enough to make sure any camera will work.

Performance optimizations

The implementation is highly optimized:

  • Hand-written assembler (YASM for x86 and GNU ASM for ARM)

  • SIMD (SSE, AVX, NEON) using intrinsics or assembler

  • GPGPU (OpenCL 1.2+) acceleration

  • Massively multithreaded

  • Smart multithreading (minimal context switch, no forking, no false-sharing, no boundaries crossing…)

  • Smart memory access (data alignment, cache pre-load, cache blocking, non-temporal load/store for minimal cache pollution, smart reference counting…)

  • Fixed-point math

  • 8-bit Quantization

  • … and many more
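As a sketch of the fixed-point math technique listed above (not the SDK's actual code), a Q16.16 multiply widens the intermediate product to 64 bits and shifts back, avoiding floating-point entirely in the hot path.

```cpp
#include <cstdint>

// Q16.16 fixed-point: 16 integer bits, 16 fractional bits.
// Sketch of the fixed-point technique; not SDK code.
static const int Q = 16;

static int32_t toFixed(double v) { return static_cast<int32_t>(v * (1 << Q)); }
static double toDouble(int32_t f) { return static_cast<double>(f) / (1 << Q); }

static int32_t fixedMul(int32_t a, int32_t b) {
    // Widen to 64 bits so the intermediate product cannot overflow,
    // then shift right to return to Q16.16.
    return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> Q);
}
```

On CPUs without a fast FPU (or inside SIMD integer pipelines), this kind of arithmetic is typically much cheaper than the floating-point equivalent.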

Many functions have been open sourced and included in the CompV project. More functions from the deep learning parts will be open sourced in the coming months. You can contact us to get access to closed-source code we're planning to open.

Thread safety

All the functions in the SDK are thread-safe, which means you can invoke them concurrently from multiple threads. However, you should avoid doing so, for several reasons:

  • The SDK is already massively multithreaded in an efficient way (see the threading model section).

  • You’ll end up saturating the CPU and making everything run slower. The threading model ensures the SDK never uses more threads than the number of virtual CPU cores. Calling the engine from different threads breaks this rule, as we cannot control threads created outside the SDK.

  • Unless you have access to the private API, the engine uses a single context, which means concurrent calls are serialized when they try to write to a shared resource.
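If your application must submit work from several of its own threads anyway, a simple pattern is to funnel every call through one lock so the shared engine context is touched by a single caller at a time. The Engine class below is a hypothetical stand-in for the SDK, shown only to illustrate the serialization pattern.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for the SDK engine: one mutex guards the single
// shared context, so concurrent callers are serialized rather than racing.
class Engine {
public:
    int process(const std::string& frame) {
        std::lock_guard<std::mutex> lock(mutex_); // one caller at a time
        ++processed_; // shared state mutated safely under the lock
        return processed_;
    }
    int processedCount() {
        std::lock_guard<std::mutex> lock(mutex_);
        return processed_;
    }
private:
    std::mutex mutex_;
    int processed_ = 0;
};
```

Note that this only makes concurrent calls safe, not fast: while one thread holds the lock, the others simply wait, which is why a single calling thread is the recommended design.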