Hi Praveen,
The most powerful way to extract a paragraph’s position and other data from a PDF document is the iText 7 add-on pdf2Data, which also has an online demo: https://pdf2data.online/
Maybe this Stack Overflow answer by iText’s Alexey Subach can help you: https://stackoverflow.com/questions/55807256/how-can-i-get-the-position-of-the-specified-keyword-in-itext7
While pdf2Data is the more capable option, you can do basic extraction with iText 7 Core alone, using a regular expression:
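import java.util.Collection;
import com.itextpdf.kernel.pdf.*;
import com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor;
import com.itextpdf.kernel.pdf.canvas.parser.listener.*;
// inputFile is the path to the PDF you want to search
PdfDocument pdfDocument = new PdfDocument(new PdfReader(inputFile));
ILocationExtractionStrategy strategy = new RegexBasedLocationExtractionStrategy("regular expression");
PdfCanvasProcessor canvasProcessor = new PdfCanvasProcessor(strategy);
canvasProcessor.processPageContent(pdfDocument.getPage(1)); // page numbers start at 1
pdfDocument.close();
// each IPdfTextLocation gives the matched text and its rectangle on the page
Collection<IPdfTextLocation> locations = strategy.getResultantLocations();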
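Before passing a pattern to the extraction strategy, it can help to check it against some sample text with plain java.util.regex. The following is only a minimal sketch: the class name RegexCheck, the helper findOffsets, and the sample string are made up for illustration and are not part of iText.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCheck {
    // Hypothetical helper: returns {start, end} character offsets
    // of every match of regex in text (end is exclusive).
    static List<int[]> findOffsets(String regex, String text) {
        List<int[]> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(new int[] {m.start(), m.end()});
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for text you expect in the PDF
        String sample = "Invoice 42 and Invoice 7";
        for (int[] hit : findOffsets("Invoice\\s+\\d+", sample)) {
            System.out.println(sample.substring(hit[0], hit[1])
                    + " at " + hit[0] + "-" + hit[1]);
        }
    }
}
```

Note that these are character offsets within a string, not positions on the page. The page rectangles still come from the iText strategy shown above.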
If you want an answer for your specific case, then it is better to post a more detailed question on Stack Overflow pointing out what you have tried and where you are stuck.
If you have a commercial license, you will also have access to iText customer support over Jira.
Kind regards,
Kenneth Holvoet
iText Software