
Node.js PDF text extraction shift enter issue

Haribabu

New Coder
AI
I'm having trouble with text shifting during PDF extraction. I'm using the pdf-parse library, and when I extract text from certain PDFs, the content shifts or becomes misaligned. For example, text that should be in one column ends up in another, or lines overlap. I've attached a couple of screenshots to illustrate the problem.

Has anyone else encountered this issue? Are there any known solutions or workarounds? Any help or suggestions would be greatly appreciated! I'm also interested in hearing about other PDF extraction tools that handle complex layouts well.

I also checked with a Python library, and it extracts the text like this:


Z WEI DRIT TEL
ALLER FÜHRENDEN FAHRZEUGHERSTELLER
ENTSCHEIDEN SICH FÜR
GETRIEBEÖLE VON CASTROL*
WÄHLEN SIE CASTROL TRANSMAX.
VERLÄNGERT DIE GETRIEBELEBENSDAUER.
* Basierend auf LMCA-Daten für die OEMs mit den
meisten Verkäufen (gesamte Neuwagenverkäufe)
im Jahr 2019. Verwendung im Rahmen der
OEM-Werksbefüllung.
Cyan
CR16761_H267503_P605434 Castrol Magenta
Yellow
Germany 594.00 x 420.00 mm Black
600.00 x 426.00 mm
Hogarth Worldwide
01/06/2023 18:46


If anyone has any insights into this issue, your help would be greatly appreciated. Thanks in advance!
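One workaround that may be worth trying is to rebuild the lines from the coordinates of each text item instead of relying on the default text order. The sketch below is only an illustration under assumptions: it assumes each text item has the pdf.js shape `{ str, transform }` with x at `transform[4]` and y at `transform[5]`, and the names `itemsToLines`, `renderPage`, and the `yTolerance` value are hypothetical. Joining fragments with a single space is also an assumption; tightly kerned headings may need joining based on the gap between items instead.

```javascript
// Group pdf.js-style text items into visual lines using their coordinates.
// Assumption: each item looks like { str, transform: [a, b, c, d, x, y] }.
function itemsToLines(items, yTolerance = 2) {
  const lines = [];
  for (const item of items) {
    const x = item.transform[4];
    const y = item.transform[5];
    // Reuse an existing line if its baseline is within the tolerance.
    let line = lines.find((l) => Math.abs(l.y - y) <= yTolerance);
    if (!line) {
      line = { y, parts: [] };
      lines.push(line);
    }
    line.parts.push({ x, str: item.str });
  }
  // PDF y coordinates grow upward, so sort descending to read top to bottom.
  lines.sort((a, b) => b.y - a.y);
  return lines.map((l) =>
    l.parts
      .sort((a, b) => a.x - b.x) // left to right within a line
      .map((p) => p.str)
      .join(' ')
  );
}

// Hypothetical page renderer meant to be passed as pdf-parse's `pagerender` option.
async function renderPage(pageData) {
  const textContent = await pageData.getTextContent({
    normalizeWhitespace: false,
    disableCombineTextItems: false,
  });
  return itemsToLines(textContent.items).join('\n');
}
```

With pdf-parse this would be used as `pdf(dataBuffer, { pagerender: renderPage })`, where `pdf` is the function exported by `require('pdf-parse')` and `dataBuffer` is the PDF file read into a Buffer. This will not fix every layout (multi-column pages need an extra split on x), but it makes the reading order explicit and easier to debug.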
 

Attachments

  • Screen Shot 2024-11-05 at 11.11.01 PM.png (123.8 KB)
  • Screen Shot 2024-11-06 at 11.04.33 PM.png (83.6 KB)
