sau002  
#1 Posted : Wednesday, September 4, 2019 9:29:06 AM(UTC)

Hi All,
I am using the method PdfTextObject.GetCharInfo to read individual characters from a document. This works fine in most cases. However, for some documents (possibly ones using Unicode fonts) the returned character codes do not match the text.

For example, the PDF document contains the text "Page 10".
The property PdfTextObject.TextUnicode correctly gives me "Page 10".
However, the charcode values from GetCharInfo(index,charcode,kerning) are as follows:

Index 0->44
Index 1->6
Index 2->27
Index 3->11
Index 4->1
Index 5->28
etc.

I assume the right encoding has to be applied to make sense of "charcode". Any help?

Many thanks,
Sau

Paul Rayman  
#2 Posted : Thursday, September 5, 2019 9:28:02 PM(UTC)

Try using PdfTextObject.Font.ToUnicode(...) to convert each charcode to its Unicode character.
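
For background on why the charcodes look arbitrary: in a PDF, a charcode is an index into the font's own encoding, and fonts that are embedded as subsets often number their glyphs in whatever order suits them. The font can carry a ToUnicode CMap that maps each code back to Unicode. The sketch below is not SDK code; it is a small Python illustration of that lookup. The CMap fragment is hypothetical, with code/character pairs inferred from the "Page 10" values in the question, and the names `SAMPLE_CMAP`, `parse_bfchar` and `decode` are made up for the example. The code for "0" is not listed in the question, so only the first six characters are decoded.

```python
import re

# Hypothetical ToUnicode CMap fragment. The code/Unicode pairs are inferred
# from the "Page 10" example in the question; a real font has its own table.
SAMPLE_CMAP = """
6 beginbfchar
<0001> <0020>
<0006> <0061>
<000B> <0065>
<001B> <0067>
<001C> <0031>
<002C> <0050>
endbfchar
"""


def parse_bfchar(cmap_text):
    """Return {charcode: unicode string} for every bfchar entry in cmap_text."""
    mapping = {}
    # Each bfchar section lists pairs of <source code> <UTF-16BE destination>.
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def decode(charcodes, mapping):
    """Translate font-specific charcodes to text; unknown codes become U+FFFD."""
    return "".join(mapping.get(code, "\ufffd") for code in charcodes)


to_unicode = parse_bfchar(SAMPLE_CMAP)
text = decode([44, 6, 27, 11, 1, 28], to_unicode)
print(text)
```

A real ToUnicode CMap may also contain bfrange sections, which this sketch ignores. In practice the SDK call suggested above should do this lookup for you, so there is no need to parse the CMap by hand.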