This repository uses two Java libraries, PDFBox and iText, to extract data from a particular set of bank statements. The extracted data is written to an XML file using a third-party library, which is one of the easiest ways to produce it, and a Python library is also used for data extraction from PDFs. The main purpose of this repo is to find out which approach is the most accurate and readable, needs the least boilerplate code, and is the most convenient for extracting data from PDFs and generating XML files.
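As a rough illustration of the "extracted text to XML" step, here is a minimal sketch in Java. It is not the repository's code: the line layout (`dd/MM/yyyy description amount`), the class name `StatementXml`, and the element names `statement`, `transaction`, `date`, `description`, and `amount` are all assumptions made for the example. It also uses the XML APIs bundled with the JDK rather than a third-party library, and it assumes the text lines have already been pulled out of the PDF by PDFBox or iText.

```java
import java.io.StringWriter;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StatementXml {
    // Hypothetical row layout: "dd/MM/yyyy <description> <signed amount>".
    // Real bank statements will need their own pattern.
    private static final Pattern ROW =
        Pattern.compile("^(\\d{2}/\\d{2}/\\d{4})\\s+(.+?)\\s+(-?\\d+(?:\\.\\d{2})?)$");

    // Converts already-extracted text lines into an XML string.
    // Lines that do not look like a transaction row are skipped.
    static String toXml(List<String> lines) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .newDocument();
        Element root = doc.createElement("statement");
        doc.appendChild(root);

        for (String line : lines) {
            Matcher m = ROW.matcher(line.trim());
            if (!m.matches()) {
                continue; // headers, footers, page numbers, etc.
            }
            Element tx = doc.createElement("transaction");
            appendChild(doc, tx, "date", m.group(1));
            appendChild(doc, tx, "description", m.group(2));
            appendChild(doc, tx, "amount", m.group(3));
            root.appendChild(tx);
        }

        // Serialize the DOM tree to a string without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Adds <name>text</name> under the given parent element.
    private static void appendChild(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    public static void main(String[] args) throws Exception {
        // Sample lines are made up for illustration.
        List<String> lines = List.of(
            "Page 1 of 3",
            "01/02/2023 ATM WITHDRAWAL -50.00",
            "03/02/2023 SALARY 1200.00");
        System.out.println(toXml(lines));
    }
}
```

Keeping the parsing and XML steps separate from the PDF library means the same `toXml` method could be fed text from either PDFBox or iText, which makes the accuracy comparison between the two easier to set up.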
regain001/Data-Extraction-From-PDF