Pipeline
Every image provided to the SDK passes through the same pipeline:
- Run the STN (Spatial Transformer Network) to retrieve the document position and an attention map.
- Rectify (de-skew) the document and ignore anything outside the attention map.
- Perform a full MRZ search on the rectified image.
- Perform MRZ recognition on the image, if an MRZ was found.
- Perform document classification on the detected cards (in batches if there are several). If a card has an MRZ, use the MRZ result to filter the candidate classes.
- Perform graph matching on the detected cards (against the matched graphs from the dataset) to localize each field (text, portrait, signature…).
- Perform image registration on each field and run OCR to extract the text.
- Return the result as a JSON string.
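The control flow of the steps above can be sketched in a few lines of Python. This is only an illustration of the ordering and of the two conditional points (MRZ recognition runs only when an MRZ is found, and the MRZ result is passed to the classifier). Every name here (`Card`, `run_pipeline`, the stage keys, the shape of the JSON) is hypothetical and not part of the SDK, and the stages are plain callables standing in for the real models.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Card:
    """Per-document state carried through the stages (hypothetical type)."""
    pixels: str                       # stand-in for the rectified image
    mrz: Optional[str] = None         # MRZ text, when the card has one
    doc_type: Optional[str] = None    # classifier decision
    fields: Dict[str, str] = field(default_factory=dict)


def run_pipeline(image: str, stages: Dict[str, Callable]) -> str:
    # Detection and rectification: one rectified crop per detected document.
    cards = [Card(pixels=p) for p in stages["detect_and_rectify"](image)]

    for card in cards:
        # MRZ search, then recognition only if a zone was found.
        zone = stages["find_mrz"](card.pixels)
        if zone is not None:
            card.mrz = stages["read_mrz"](zone)

    # Classification runs on all cards as one batch; the MRZ (possibly None)
    # is passed along so the classifier can narrow its candidates.
    doc_types = stages["classify"]([(c.pixels, c.mrz) for c in cards])

    for card, doc_type in zip(cards, doc_types):
        card.doc_type = doc_type
        # Graph matching localizes the fields for the matched document type.
        regions = stages["graph_match"](card.pixels, doc_type)
        # Each field is registered (aligned) and then read by the OCR.
        for name, region in regions.items():
            card.fields[name] = stages["ocr"](stages["register"](region))

    # The final result is a JSON string (layout is an assumption).
    return json.dumps(
        [{"type": c.doc_type, "mrz": c.mrz, "fields": c.fields} for c in cards]
    )


# Trivial stand-in stages, used only to exercise the control flow.
stub_stages = {
    "detect_and_rectify": lambda image: [image.strip()],
    "find_mrz": lambda pixels: "P<UTO" if "P<UTO" in pixels else None,
    "read_mrz": lambda zone: zone,
    "classify": lambda batch: ["passport" if mrz else "unknown" for _, mrz in batch],
    "graph_match": lambda pixels, doc_type: {"surname": "ERIKSSON"},
    "register": lambda region: region,
    "ocr": lambda crop: crop,
}

result = run_pipeline("  P<UTOERIKSSON  ", stub_stages)
print(result)
```

Passing the stages in as a dictionary keeps the sketch independent of any concrete model. In the stub configuration, an input without an MRZ skips recognition and reaches the classifier with `mrz` set to `None`.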
The next sections explain how each module (MRZ recognizer, Card Detector, Card Classifier, Graph computation…) works.