Data validation

A useful feature of the MRZ / MRTD specifications is that they define check digits for the most important fields (document number, expiry date, date of birth…), so the data can be verified after it has been read. The next sections explain how the validation is done.

The SDK comes with C++ sample code to validate MRZ / MRTD data. The code is at https://github.com/DoubangoTelecom/ultimateMRZ-SDK/tree/master/samples/c++/validation.

A check digit consists of a single digit computed from the other digits in a series. Check digits in the MRZ are calculated on specified numerical data elements in the MRZ. The check digits permit readers to verify that data in the MRZ is correctly interpreted.

A special check digit calculation has been adopted for use in MRTDs. The check digits shall be calculated on modulus 10 with a continuously repetitive weighting of 731 731 …, as follows:
  • Step 1. Going from left to right, multiply each digit of the pertinent numerical data element by the weighting figure appearing in the corresponding sequential position.

  • Step 2. Add the products of each multiplication.

  • Step 3. Divide the sum by 10 (the modulus).

  • Step 4. The remainder shall be the check digit.
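The four steps can be sketched for a field that contains digits only. This is a minimal illustration, not the SDK's code: the function name `checkDigitForDigits` is hypothetical, and the input is assumed to hold only the characters 0 to 9.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the SDK): check digit of a digits-only field.
// Assumption: every character of `field` is between '0' and '9'.
int checkDigitForDigits(const std::string& field) {
    static const int kWeights[] = { 7, 3, 1 };
    int sum = 0;
    for (std::size_t i = 0; i < field.size(); ++i) {
        // Steps 1 and 2: multiply each digit by its weight and add the products.
        sum += (field[i] - '0') * kWeights[i % 3];
    }
    // Steps 3 and 4: the remainder of the division by 10 is the check digit.
    return sum % 10;
}
```

For the date 740812 the products are 7×7 + 4×3 + 0×1 + 8×7 + 1×3 + 2×1 = 122, and 122 modulo 10 gives the check digit 2, which is what `checkDigitForDigits("740812")` returns.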

For data elements in which the number does not occupy all available character positions, the symbol < shall be used to complete vacant positions and shall be given the value of zero for the purpose of calculating the check digit.

When the check digit calculation is applied to data elements containing alphabetic characters, the characters A to Z shall have the values 10 to 35 consecutively, as follows:

Character:  A   B   C   D   E   F   G   H   I   J   K   L   M
Value:      10  11  12  13  14  15  16  17  18  19  20  21  22

Character:  N   O   P   Q   R   S   T   U   V   W   X   Y   Z
Value:      23  24  25  26  27  28  29  30  31  32  33  34  35
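The character values and the filler rule can be combined into one routine. The sketch below is an illustration under stated assumptions rather than the SDK's implementation: `mrzCharValue` and `mrzCheckDigit` are hypothetical names, the letters A to Z are assumed to be contiguous in the execution character set (true for ASCII), and returning -1 for a character outside the MRZ alphabet is a choice made here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: numeric value of one MRZ character,
// or -1 when the character is not part of the MRZ alphabet.
int mrzCharValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';       // digits keep their value
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;  // A..Z map to 10..35 (assumes ASCII)
    if (c == '<') return 0;                         // filler counts as zero
    return -1;
}

// Hypothetical helper: check digit (0..9) of a field,
// or -1 when the field contains an invalid character.
int mrzCheckDigit(const std::string& field) {
    static const int kWeights[] = { 7, 3, 1 };
    int sum = 0;
    for (std::size_t i = 0; i < field.size(); ++i) {
        const int value = mrzCharValue(field[i]);
        if (value < 0) return -1;
        sum += value * kWeights[i % 3];
    }
    return sum % 10;
}
```

As a worked case, the document number D23145890 gives 13×7 + 2×3 + 3×1 + 1×7 + 4×3 + 5×1 + 8×7 + 9×3 + 0×1 = 207, so its check digit is 7.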

The next sections explain how to implement the above algorithm in C++. The complete code is available at https://github.com/DoubangoTelecom/ultimateMRZ-SDK/blob/master/samples/c++/validation/main.cxx and is used in the validation sample application.

The SDK itself doesn’t contain the validation code because we want it to stay generic and work with any MRZ format, even if the data is malformed or non-standard.

Local variables and macros

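Weighting:
// Repeating 7-3-1 weights defined by the specification.
static const int __Weights[] = { 7, 3, 1 };
Mapped values:
// Requires <map> and <string>.
static std::map<char, int> __MappedValues;
// The loop below must run once, inside a function, before the first check.
const std::string charset = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
for (int i = 0; i < static_cast<int>(charset.size()); ++i) {
        __MappedValues[charset[i]] = i; // '0'..'9' -> 0..9, 'A'..'Z' -> 10..35
}
__MappedValues['<'] = 0; // the filler character counts as zero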
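Weighted sum:
// Adds the weighted values of _line_[_start_.._end_] (inclusive) to a variable
// named sum, which must already exist in the calling scope. w is the weight
// position of the first character (0 when the span starts a new sequence).
#define MRZ_COMPUTE_WEIGHTED_SUM(_line_, _start_, _end_, w) { \
        for (size_t i = _start_, j = w; i <= _end_; ++i, ++j) { \
                sum += __MappedValues[(_line_)[i]] * __Weights[j % 3]; \
        } \
}
Validity check:
// Sets _ret_ to true when the check digit computed over _line_[_start_.._end_]
// equals the character stored at index _check_.
#define MRZ_CHECK_VALIDITY(_line_, _start_, _end_, _check_, _ret_) { \
        const std::string __line__ = (_line_); \
        int sum = 0; \
        MRZ_COMPUTE_WEIGHTED_SUM(__line__, _start_, _end_, 0); \
        _ret_ = ((sum % 10) == __MappedValues[__line__[_check_]]); \
}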
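The macros index the line directly, so they expect every position to exist. As an alternative sketch, the same two operations can be written as functions that reject lines that are too short or contain unexpected characters. The names `addWeightedSum` and `checkField` are hypothetical and do not appear in the sample code; the bounds checks and the -1 convention are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one MRZ character, -1 if not allowed.
int mrzCharValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;  // assumes ASCII ordering
    if (c == '<') return 0;
    return -1;
}

// Function counterpart of the weighted-sum macro: adds the weighted values of
// line[start..end] (inclusive) to `sum`. `weightOffset` is the number of
// characters already weighted in the same sequence.
// Returns false when the span is out of range or holds an invalid character.
bool addWeightedSum(const std::string& line, std::size_t start, std::size_t end,
                    std::size_t weightOffset, int& sum) {
    static const int kWeights[] = { 7, 3, 1 };
    if (start > end || end >= line.size()) return false;
    for (std::size_t i = start, j = weightOffset; i <= end; ++i, ++j) {
        const int value = mrzCharValue(line[i]);
        if (value < 0) return false;
        sum += value * kWeights[j % 3];
    }
    return true;
}

// Function counterpart of the validity macro: true when the check digit
// stored at `checkPos` matches the one computed over line[start..end].
bool checkField(const std::string& line, std::size_t start, std::size_t end,
                std::size_t checkPos) {
    if (checkPos >= line.size()) return false;
    int sum = 0;
    if (!addWeightedSum(line, start, end, 0, sum)) return false;
    return (sum % 10) == mrzCharValue(line[checkPos]);
}
```

With this shape, `checkField(line, 0, 5, 6)` plays the role of `MRZ_CHECK_VALIDITY(line, 0, 5, 6, result)`, and a truncated line yields `false` instead of reading past the end of the string.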

Validating TD1 format

bool documentNumber, dateOfBirth, dateOfExpiry, upperAndMiddleLines;

MRZ_CHECK_VALIDITY(lines[0], 5, 13, 14, documentNumber);
MRZ_CHECK_VALIDITY(lines[1], 0, 5, 6, dateOfBirth);
MRZ_CHECK_VALIDITY(lines[1], 8, 13, 14, dateOfExpiry);

int sum = 0;
MRZ_COMPUTE_WEIGHTED_SUM(lines[0], 5, 29, 0);
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 0, 6, 25);
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 8, 14, 32);
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 18, 28, 39); // 39 = 25 + 7 + 7 characters already weighted
upperAndMiddleLines = (sum % 10) == __MappedValues[lines[1][29]];
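The weight position passed to each call is the number of characters already consumed, because the composite check digit treats the selected spans as one continuous string. An equivalent way to see this is to concatenate the spans and compute a single check digit. The sketch below assumes 30-character TD1 lines; `td1CompositeIsValid` and the two helpers are hypothetical names, not part of the sample code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one MRZ character, -1 if not allowed.
int mrzCharValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;  // assumes ASCII ordering
    if (c == '<') return 0;
    return -1;
}

// Hypothetical helper: check digit of a field, -1 on an invalid character.
int mrzCheckDigit(const std::string& field) {
    static const int kWeights[] = { 7, 3, 1 };
    int sum = 0;
    for (std::size_t i = 0; i < field.size(); ++i) {
        const int value = mrzCharValue(field[i]);
        if (value < 0) return -1;
        sum += value * kWeights[i % 3];
    }
    return sum % 10;
}

// TD1 composite check: upper line indexes 5..29, then middle line indexes
// 0..6, 8..14 and 18..28, compared with the digit at middle line index 29.
bool td1CompositeIsValid(const std::string& upper, const std::string& middle) {
    if (upper.size() != 30 || middle.size() != 30) return false;
    const std::string composite = upper.substr(5, 25) + middle.substr(0, 7)
                                + middle.substr(8, 7) + middle.substr(18, 11);
    const int expected = mrzCheckDigit(composite);
    return expected >= 0 && expected == mrzCharValue(middle[29]);
}
```

For the illustrative lines `I<UTOD231458907<<<<<<<<<<<<<<<` and `7408122F1204159UTO<<<<<<<<<<<6`, the weighted sum of the concatenated string is 376, so the composite check digit is 6 and the function returns `true`.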

Validating TD2 format

bool documentNumber, dateOfBirth, dateOfExpiry, composite;

MRZ_CHECK_VALIDITY(lines[1], 0, 8, 9, documentNumber);
MRZ_CHECK_VALIDITY(lines[1], 13, 18, 19, dateOfBirth);
MRZ_CHECK_VALIDITY(lines[1], 21, 26, 27, dateOfExpiry);

int sum = 0;
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 0, 9, 0);
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 13, 19, 10);
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 21, 34, 17);
composite = (sum % 10) == __MappedValues[lines[1][35]];
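The TD2 checks can also be grouped into one function that returns all four results together. This is a sketch under assumptions: the `Td2Result` struct, `validateTd2`, and the helpers are hypothetical names, and the lower line is assumed to be exactly 36 characters long.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one MRZ character, -1 if not allowed.
int mrzCharValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;  // assumes ASCII ordering
    if (c == '<') return 0;
    return -1;
}

// Hypothetical helper: true when `check` is the check digit of `data`.
bool matchesCheckDigit(const std::string& data, char check) {
    static const int kWeights[] = { 7, 3, 1 };
    int sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const int value = mrzCharValue(data[i]);
        if (value < 0) return false;
        sum += value * kWeights[i % 3];
    }
    return (sum % 10) == mrzCharValue(check);
}

// Hypothetical result type: one flag per TD2 check.
struct Td2Result {
    bool documentNumber;
    bool dateOfBirth;
    bool dateOfExpiry;
    bool composite;
};

// Validates the lower line of a TD2 document (36 characters expected).
Td2Result validateTd2(const std::string& lower) {
    Td2Result result = { false, false, false, false };
    if (lower.size() != 36) return result;
    result.documentNumber = matchesCheckDigit(lower.substr(0, 9), lower[9]);
    result.dateOfBirth = matchesCheckDigit(lower.substr(13, 6), lower[19]);
    result.dateOfExpiry = matchesCheckDigit(lower.substr(21, 6), lower[27]);
    // Composite: indexes 0..9, 13..19 and 21..34 treated as one string.
    result.composite = matchesCheckDigit(
        lower.substr(0, 10) + lower.substr(13, 7) + lower.substr(21, 14), lower[35]);
    return result;
}
```

Returning separate flags keeps the behaviour of the snippet above, where a caller can tell which field failed instead of getting a single pass or fail answer.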

Validating TD3 format

bool passportNumber, dateOfBirth, dateOfExpiry, personalNumber, composite;

MRZ_CHECK_VALIDITY(lines[1], 0, 8, 9, passportNumber);
MRZ_CHECK_VALIDITY(lines[1], 13, 18, 19, dateOfBirth);
MRZ_CHECK_VALIDITY(lines[1], 21, 26, 27, dateOfExpiry);
MRZ_CHECK_VALIDITY(lines[1], 28, 41, 42, personalNumber);

int sum = 0;
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 0, 9, 0);
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 13, 19, 10);
MRZ_COMPUTE_WEIGHTED_SUM(lines[1], 21, 42, 17);
composite = (sum % 10) == __MappedValues[lines[1][43]];
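One detail of the personal number check is worth illustrating: because the filler character is mapped to zero, an unused personal number field (all `<`) whose check position also holds `<` still passes, since both the computed and the stored value are 0. To our understanding this matches ICAO Doc 9303, which allows either 0 or `<` there, but treat that reading as an assumption. The function names below are hypothetical, and the lower line is assumed to be 44 characters long.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one MRZ character, -1 if not allowed.
int mrzCharValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;  // assumes ASCII ordering
    if (c == '<') return 0;                         // filler counts as zero
    return -1;
}

// Hypothetical helper: true when `check` is the check digit of `data`.
bool matchesCheckDigit(const std::string& data, char check) {
    static const int kWeights[] = { 7, 3, 1 };
    int sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const int value = mrzCharValue(data[i]);
        if (value < 0) return false;
        sum += value * kWeights[i % 3];
    }
    return (sum % 10) == mrzCharValue(check);
}

// Personal number: lower line indexes 28..41, check digit at index 42.
bool td3PersonalNumberIsValid(const std::string& lower) {
    if (lower.size() != 44) return false;
    return matchesCheckDigit(lower.substr(28, 14), lower[42]);
}

// Composite: indexes 0..9, 13..19 and 21..42, check digit at index 43.
bool td3CompositeIsValid(const std::string& lower) {
    if (lower.size() != 44) return false;
    return matchesCheckDigit(
        lower.substr(0, 10) + lower.substr(13, 7) + lower.substr(21, 22), lower[43]);
}
```

For the illustrative line `L898902C36UTO7408122F1204159ZE184226B<<<<<10`, the personal number ZE184226B sums to 401 (check digit 1) and the composite sums to 880 (check digit 0), so both functions return `true`.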

Validating MRVA and MRVB formats

bool documentNumber, dateOfBirth, dateOfExpiry;

MRZ_CHECK_VALIDITY(lines[1], 0, 8, 9, documentNumber);
MRZ_CHECK_VALIDITY(lines[1], 13, 18, 19, dateOfBirth);
MRZ_CHECK_VALIDITY(lines[1], 21, 26, 27, dateOfExpiry);