What is the best way to get good quality conversions from PDFs to Word documents that have complicated pictures and text?

Question

Freelance Translator Spanish to English at Self-Employed

Asked almost 5 years ago

I have a translation or two to do where the source documents are PDFs with a good number of random marks on them. I was wondering if there's a good way to convert these to Word documents while preserving the formatting. I am not interested in the random marks, of course, only in the text.

Sean F. · Answer 1 · 2020-04-15T13:52:52-05:00

Sean F.

CEO at SCS Computer Consultants, Inc.

Answered almost 5 years ago

The OCR does a phenomenal job of handling stray marks, creases in the paper, etc. They are turned into graphics, while the text is handled as separate blocks. Normally, the document can be edited as-is by deleting the extraneous graphic blocks, but I've found that if the OCRed document is too cluttered, you will have to resort to copying and pasting the text. Several scans I made had graphics UNDER the text. The last time this happened, it was a picture of a crease on the page. I had to resort to copying and pasting because the graphic could not be removed without removing most of the text along with it.

Gary F. · Answer 2 · 2020-04-14T13:48:06-05:00

Gary F.

Independent Publishing Professional

Answered almost 5 years ago

It's unlikely you'll find anything that can perfectly differentiate between legitimate text and random blemishes or marks. I would suggest using PDFelement's OCR module (under the "Convert" menu in PDFelement Pro) to extract as much of the text as possible, then making corrections to the text as needed (while still in PDFelement), and finally saving as a Word document (also under "Convert"). I tried this on my own system before writing the above; it works fine.
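
If you end up with the OCR text pasted into a plain text file, a small script can help with the "making corrections" step by pointing out lines that are probably stray marks rather than words. The following is only a rough sketch and has nothing to do with PDFelement itself: the function names, the 0.6 threshold, and the sample text are all assumptions made up for illustration, and the heuristic will miss some debris and may flag some legitimate lines, so treat the result as a list to check by hand.

```python
def looks_like_noise(line, min_ratio=0.6):
    """Guess whether a line is OCR debris (stray marks) rather than real text.

    Heuristic (an assumption, not a standard): a real line is mostly letters,
    digits and spaces, and contains at least one token with two or more
    alphanumeric characters.
    """
    stripped = line.strip()
    if not stripped:
        return False  # blank lines are layout, not noise
    readable = sum(ch.isalnum() or ch.isspace() for ch in stripped)
    has_word = any(
        sum(ch.isalnum() for ch in token) >= 2 for token in stripped.split()
    )
    return (readable / len(stripped)) < min_ratio or not has_word


def flag_noise(text):
    """Return (line_number, line) pairs that deserve a manual look."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if looks_like_noise(line)
    ]


# Hypothetical OCR output: two real lines mixed with two lines of stray marks.
sample = "Estimado cliente:\n~ . ' | _\nAdjunto el contrato firmado.\n;;/ \\ ,\n"

for number, line in flag_noise(sample):
    print(f"line {number}: {line!r}")
```

The script only reports line numbers; it does not delete anything, since, as noted above, nothing can perfectly tell text from blemishes.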
