Technical Architecture of the Project

In this session, we will discuss the technical architecture of our project, that is, our approach to implementing it. We will use the following:

  1. pdf2image library to convert a PDF document into a list of images

  2. OpenCV for basic image processing to make the images clearer

  3. Pytesseract library to extract text from an image

  4. Regular expressions (a.k.a. regex) to extract useful information (or fields) from a block of text
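
The four steps above can be sketched end to end as follows. This is only a minimal illustration under stated assumptions, not the project's actual implementation: the function names pdf_to_text and extract_fields, the adaptive-threshold settings, and the "Name:" and "Date:" field patterns are all hypothetical, chosen to show how the pieces might fit together.

```python
import re


def pdf_to_text(pdf_path):
    """Steps 1-3: PDF -> page images -> cleaned images -> raw text."""
    # Third-party imports are kept local so that the regex step below can be
    # tried on its own without installing the OCR stack.
    import cv2
    import numpy as np
    import pytesseract
    from pdf2image import convert_from_path

    # Step 1: pdf2image returns one PIL image per page.
    pages = convert_from_path(pdf_path)

    texts = []
    for page in pages:
        # Step 2: convert to grayscale, then binarize. Adaptive thresholding
        # is one possible clean-up choice; the block size (61) and constant
        # (11) are illustrative values, not tuned settings.
        rgb = np.array(page.convert("RGB"))
        gray = cv2.cvtColor(rgb, cv2.COLOR_RGB2GRAY)
        cleaned = cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 61, 11
        )

        # Step 3: run Tesseract OCR on the cleaned image.
        texts.append(pytesseract.image_to_string(cleaned, lang="eng"))

    return "\n".join(texts)


# Step 4: hypothetical field patterns; real documents need their own.
FIELD_PATTERNS = {
    "name": re.compile(r"Name:\s*(.+)"),
    "date": re.compile(r"Date:\s*(\d{1,2}/\d{1,2}/\d{4})"),
}


def extract_fields(text):
    """Return a dict of field name -> first match, or None if not found."""
    fields = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[field] = match.group(1).strip() if match else None
    return fields
```

With this sketch, a call such as extract_fields(pdf_to_text("document.pdf")) would run the whole pipeline, where "document.pdf" is a placeholder path. Keeping the OCR stage and the regex stage in separate functions means the patterns can be developed against plain text samples before any PDF is processed.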

