Rank: Member
Groups: Registered
Joined: 6/1/2016(UTC) Posts: 25  Location: Hessen
|
Hi,
Is there a way to get the logical structure of a PDF document from .NET code? For example, if a PDF document is generated by Microsoft Word, the logical structure contains information about headings, links, headers, footers, and so on.
Thanks and best regards, Julian
|
Rank: Administration
Groups: Administrators
Joined: 1/5/2016(UTC) Posts: 844
Thanks: 2 times Was thanked: 103 time(s) in 101 post(s)
|
Hi,
If this information is contained in the PDF file, then it is possible to get it with the SDK. The PDF specification describes it as the logical structure (Tagged PDF): the document catalog has a StructTreeRoot entry that points to a tree of structure elements, and each element carries a structure type such as H1, P, or Link. In any case, you can access all the contents of the PDF. Start your research with Document.Root.

In addition, the qpdf utility helps a lot. If you run it with the following command line, it writes a copy of the PDF whose contents are readable as text, which greatly simplifies the investigation:

qpdf.exe --stream-data=uncompress --normalize-content=y --object-streams=disable %1 %1_decoded.pdf

You can download the qpdf utility here: http://qpdf.sourceforge.net
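Once qpdf has written the decoded copy, a quick check for logical structure is to look for the /StructTreeRoot key and to count the structure types (/S entries) of the structure elements. The following is only a rough Python sketch for inspecting the decoded file, not SDK code: the function name and the sample bytes are made up for illustration, and it only recognizes elements that carry the optional /Type /StructElem entry, so it can undercount.

```python
import re
from collections import Counter

# One indirect object as written in a decoded PDF: "12 0 obj ... endobj".
OBJECT = re.compile(rb"\d+\s+\d+\s+obj\b(.*?)\bendobj", re.DOTALL)
# Heuristic: /Type is optional on structure elements, so this may miss some.
STRUCT_ELEM = re.compile(rb"/Type\s*/StructElem\b")
# The structure type of an element, for example "/S /H1" or "/S /P".
STRUCT_TYPE = re.compile(rb"/S\s*/([A-Za-z0-9]+)")


def summarize_structure(pdf_bytes):
    """Return (has_struct_tree_root, Counter of structure types)."""
    has_root = b"/StructTreeRoot" in pdf_bytes
    types = Counter()
    for obj in OBJECT.finditer(pdf_bytes):
        body = obj.group(1)
        # Skip objects such as actions, which also use an /S key.
        if STRUCT_ELEM.search(body):
            match = STRUCT_TYPE.search(body)
            if match:
                types[match.group(1).decode("ascii")] += 1
    return has_root, types


# Hypothetical fragment imitating a decoded file, for demonstration only.
SAMPLE = b"""1 0 obj
<< /Type /Catalog /StructTreeRoot 2 0 R >>
endobj
3 0 obj
<< /Type /StructElem /S /H1 /P 2 0 R /K 0 >>
endobj
4 0 obj
<< /Type /StructElem /S /P /P 2 0 R /K 1 >>
endobj
5 0 obj
<< /Type /Action /S /URI /URI (http://example.com) >>
endobj
"""

has_root, types = summarize_structure(SAMPLE)
print(has_root, sorted(types.items()))
```

For a real file, read the qpdf output in binary mode and pass the bytes to the function, for example summarize_structure(open("input.pdf_decoded.pdf", "rb").read()), where the file name is a placeholder. If the first value is False, the document is not tagged and there is no logical structure to extract.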