apostol.bakalov  
#1 Posted : Friday, February 21, 2020 10:17:03 AM(UTC)

Rank: Newbie

Groups: Registered
Joined: 11/28/2019(UTC)
Posts: 4
Man
Bulgaria

Hello,

We are working on an application that uses the PdfViewer supplied by the Pdfium library. We have noticed that in certain cases the text boxes cannot display some Unicode characters, while other readers such as Acrobat or Foxit Reader are able to read and display them properly.

The way we use the viewer is relatively simple:
1. We first load the document from a path by calling PdfDocument.Load.
2. We assign the result to the PdfViewer's Document property.

Please find attached what we see in Acrobat and the result in PdfViewer. In this particular example, "ĳ" is a special single symbol that is frequently found in Dutch. Acrobat reads the whole string, while PdfViewer stops after "vr". In addition, "vr" in PdfViewer is only visible while the text box has focus. Otherwise, the field appears blank.
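
For reference, the "ĳ" here is the single code point U+0133 (LATIN SMALL LIGATURE IJ), not the two letters "i" and "j". The following is a minimal sketch of how one might flag such characters before they reach a form field. It assumes that Latin-1 (U+0000 to U+00FF) is an acceptable rough stand-in for the single-byte encodings a standard PDF font can handle, which is an approximation; the class and method names are hypothetical and are not part of any SDK.

Code:
```csharp
using System;
using System.Linq;

static class LigatureCheck
{
    // True if every UTF-16 code unit of the text lies in the Latin-1 range
    // U+0000..U+00FF. Assumption: Latin-1 only approximates the standard PDF
    // encodings, which differ from it in a few positions.
    public static bool FitsLatin1(string text) => text.All(c => c <= '\u00FF');
}
```

With this helper, LigatureCheck.FitsLatin1("vrij") is true for the plain four-letter spelling, while LigatureCheck.FitsLatin1("vr\u0133") is false because the ligature lies outside that range.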

Attachments: Test Fonts - Acrobat.png, Test Fonts - Pdfium.png


Does anybody have any idea how to handle that problem?

Here is the document that is displayed in the screenshots: TestFonts.pdf (125 KB).

Thanks in advance.


Best regards,
Apostol Bakalov
Paul Rayman  
#2 Posted : Wednesday, March 4, 2020 9:11:09 AM(UTC)

Rank: Administration

Groups: Administrators
Joined: 1/5/2016(UTC)
Posts: 918

Thanks: 3 times
Was thanked: 111 time(s) in 108 post(s)
Hello,

In the latest version, the symbol "ĳ" is displayed, but there are some nuances to be aware of.

This symbol is not available in the standard encodings described in the PDF specification, so it cannot be displayed with the standard fonts. To show it, we need to use a Unicode font, and the question is which font to choose.
Since the user can type any character that exists in Unicode, including hieroglyphs, this task is not trivial.
Currently, the algorithm searches for a glyph for the specific character in the following fonts, depending on the code page: Helvetica, SimSun, MingLiU, MS Gothic, Batang, Arial and Tahoma. If the glyph is not found in these fonts, Arial Unicode MS is used. If the glyph is not found in that font either, the character is not displayed. If none of these fonts is present on the system, the character is not displayed either.
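
The lookup order described above can be sketched roughly as follows. This is only an illustration of the idea, not the SDK's actual implementation: the class name, the method and the font-to-code-point map are hypothetical stand-ins (a real lookup would consult each installed font's glyph tables), and the dependence of the candidate order on the code page is simplified to a list passed in by the caller.

Code:
```csharp
using System;
using System.Collections.Generic;

static class GlyphFallback
{
    // installedFonts is a hypothetical stand-in for the system fonts:
    // font name -> set of code points the font has glyphs for.
    // Returns the name of the first font that covers codePoint, or null
    // when no font does (the character would then not be displayed).
    public static string FindFontFor(
        int codePoint,
        IReadOnlyList<string> candidates,
        string lastResort,
        IReadOnlyDictionary<string, HashSet<int>> installedFonts)
    {
        foreach (var name in candidates)
        {
            // Candidates that are not installed are simply skipped.
            if (installedFonts.TryGetValue(name, out var glyphs) && glyphs.Contains(codePoint))
                return name;
        }

        // Nothing in the ordered list matched: try the last-resort font.
        if (installedFonts.TryGetValue(lastResort, out var fallback) && fallback.Contains(codePoint))
            return lastResort;

        return null;
    }
}
```

For example, with candidates { "Helvetica", "SimSun", "MingLiU", "MS Gothic", "Batang", "Arial", "Tahoma" } and "Arial Unicode MS" as the last resort, a code point covered by none of the installed fonts yields null, which corresponds to the blank field described in the first post.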

In addition, even if a glyph is found, the corresponding font is still not embedded in the PDF; only a reference to it is recorded. This means that on another machine where this font is missing, the symbol will not be displayed (unless a third-party viewer substitutes the missing font).

Thus, if you want to support the maximum number of languages, you need to:
1. Find a font that contains glyphs for all the languages you want to support.
2. Embed this font into the PDF document.
3. Tell the AcroForm fields to use this font.
If you follow these steps, you will solve all the problems described above.

You can use code2000.ttf as a font - it can be downloaded here: https://forum.patagames.com/posts/t682-How-to-generate-multilingual-content-Pdf

This font contains, if I am not mistaken, more than 60,000 glyphs and covers almost all languages, except perhaps extremely rare ones. It is also small: about 3 MB when embedded, versus 20 MB for Arial Unicode MS, for example, which has fewer glyphs.
To embed and use it, write the following code:

Code:
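// Run once the viewer has finished loading a document.
pdfViewer1.DocumentLoaded += (s, e) =>
{
	// Read the TrueType font file from disk (adjust the path for your machine).
	var data = System.IO.File.ReadAllBytes(@"e:\45\code2000.ttf");
	// Embed the font into the loaded document as a Unicode font.
	var font = PdfFont.CreateEmbeddedFont(pdfViewer1.Document, data, FontCharSet.UNICODE_CHARSET, false);
	// Register it as the substitution font for this document.
	Pdfium.FPDFDOC_SetSubstitutionFont(pdfViewer1.Document.Handle, font.Handle);
};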


apostol.bakalov  
#3 Posted : Wednesday, March 4, 2020 11:37:55 AM(UTC)

Hi Paul,

Thank you for your answer. I ran a couple of tests and so far the results look good. The SDK version that we used is 4.24.2704. Is that the one you were referring to?

I did, however, run into a problem, but it is related to licensing. Since I updated the SDK version, the current key is no longer valid. There seems to be a limit on the size of the files we can open: as I embed the fonts, the files grow larger and I cannot open them anymore :)

Could you please tell me who I need to contact in order to solve this issue? I need a licence key for the version.

Best regards,
Apostol Bakalov
Paul Rayman  
#4 Posted : Wednesday, March 4, 2020 9:32:02 PM(UTC)
Originally Posted by: apostol.bakalov

Could you please tell me who I need to contact in order to solve this issue? I need a licence key for the version.


Please provide the initial order information to Tany Olivo at sales@patagames.com.
apostol.bakalov  
#5 Posted : Tuesday, March 17, 2020 4:44:37 AM(UTC)

Hi Paul,

We have obtained a license for the new version and we have integrated it in our software. I have just one more question on that subject. I have a small PDF with just a text box. It seems that the size increases by about 3 MB every time I change the text box value and save the document. So after a while the document becomes quite large. Do we need to embed the font every time we open the document?

Thank you for your help.

Best regards,
Apostol Bakalov
Chris_1987  
#6 Posted : Tuesday, March 17, 2020 5:14:53 AM(UTC)

Rank: Newbie

Groups: Registered
Joined: 7/24/2018(UTC)
Posts: 1
United Kingdom


Is it possible to extend that list of fallback system fonts to include Code2000 rather than embedding the font? Also, is it possible to set the font of the viewer to the system Code2000 font? That way the glyphs would be visible while editing, and on save we could embed a subset containing only what we need.

Paul Rayman  
#7 Posted : Wednesday, March 18, 2020 2:18:03 AM(UTC)
Originally Posted by: apostol.bakalov
Hi Paul,
I have a small PDF with just a text box. It seems that the size increases by about 3 MB every time I change the text box value and save the document.


Replace the code I provided earlier with the following:
Code:

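pdfViewer1.DocumentLoaded += (s, e) =>
{
	// Custom key that tags the font dictionary we embedded earlier.
	string marker = "MyMarker";
	// Reuse the font if this document already contains a tagged one.
	PdfFont substFont = FindFontByMarker(pdfViewer1.Document, marker);
	if (substFont == null)
	{
		// First time for this document: embed the font and tag its dictionary.
		var data = System.IO.File.ReadAllBytes(@"e:\45\code2000.ttf");
		substFont = PdfFont.CreateEmbeddedFont(pdfViewer1.Document, data, FontCharSet.UNICODE_CHARSET, false);
		substFont.Dictionary[marker] = PdfTypeNumber.Create(1);
	}
	// Register the new or reused font as the substitution font.
	Pdfium.FPDFDOC_SetSubstitutionFont(pdfViewer1.Document.Handle, substFont.Handle);
};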

and
Code:

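// Returns the font whose dictionary contains the marker key, or null if there is none.
private PdfFont FindFontByMarker(PdfDocument doc, string marker)
{
	var list = PdfIndirectList.FromPdfDocument(doc);
	var crossRef = PdfCrossReferenceTable.FromPdfDocument(doc);
	// Walk every object listed in the cross-reference table.
	foreach (var num in crossRef)
	{
		var item = list[num.ObjectNumber];
		// Skip missing objects and anything that is not a dictionary.
		if (item == null || !item.Is<PdfTypeDictionary>())
			continue;
		// A dictionary carrying the marker is the font we embedded earlier.
		if (item.As<PdfTypeDictionary>().ContainsKey(marker))
			return PdfFont.CreateFont(doc, item.As<PdfTypeDictionary>());
	}
	return null;
}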
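
The idea behind the marker, as I read the code above, is that the font is embedded only once: the first load tags the font dictionary, and later loads find the tag and reuse the font instead of adding another copy of about 3 MB. Below is a minimal, SDK-independent sketch of that find-or-create pattern. It assumes a document can be modelled as a plain list of key/value dictionaries; the class and method names are hypothetical.

Code:
```csharp
using System;
using System.Collections.Generic;

static class MarkerDemo
{
    // objects is a stand-in for a document's objects. Returns the object
    // that carries the marker key, creating and tagging one only when no
    // such object exists yet.
    public static Dictionary<string, object> GetOrAddMarked(
        List<Dictionary<string, object>> objects,
        string marker,
        Func<Dictionary<string, object>> create)
    {
        foreach (var obj in objects)
        {
            if (obj.ContainsKey(marker))
                return obj; // already present: reuse it, add nothing
        }

        var created = create();
        created[marker] = 1; // tag it so the next call can find it
        objects.Add(created);
        return created;
    }
}
```

Calling GetOrAddMarked twice on the same list with the same marker leaves exactly one tagged object in the list, which mirrors why the saved file should stop growing on every edit.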


Paul Rayman  
#8 Posted : Wednesday, March 18, 2020 2:22:35 AM(UTC)
Originally Posted by: Chris_1987

Is it possible to extend that list of fallback system fonts to include Code2000 rather than embedding the font? Also, is it possible to set the font of the viewer to the system Code2000 font? That way the glyphs would be visible while editing, and on save we could embed a subset containing only what we need.


Instead of Code2000, you can use any font.
