rhnatiuk  
#1 Posted : Friday, July 12, 2019 3:30:23 AM(UTC)
rhnatiuk

Rank: Advanced Member

Groups: Registered
Joined: 4/30/2019(UTC)
Posts: 35
Man
Finland
Location: Raisio

Thanks: 9 times
Hi,

We process PDF documents in background worker threads (running at background priority), and also on user request, in a separate thread.

When background processing is off, everything is fast, but when it is on, everything slows to a crawl. One of the culprits is this code in the background workers:

Code:
using (var d = PdfDocument.Load(bytes)) {
    // do something with the document (reading text runs, etc.)
}


In this case, most of the time is spent in the implicit Dispose(), as a screenshot from a profiler shows (attachment: MicrosoftTeams-image.png, 79kb).
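
One way to confirm the profiler's picture without a profiler is to time the body and the Dispose() call separately. The following is only a minimal sketch: `DisposeTiming` and `Measure` are hypothetical names, not part of the SDK, and the helper works with any IDisposable.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of any SDK): times the work done on a
// disposable resource separately from the time spent in Dispose().
public static class DisposeTiming
{
    public static (TimeSpan Work, TimeSpan Dispose) Measure<T>(Func<T> open, Action<T> work)
        where T : IDisposable
    {
        T resource = open();               // opening is deliberately not timed
        TimeSpan workTime = TimeSpan.Zero;
        var sw = Stopwatch.StartNew();
        try
        {
            work(resource);
        }
        finally
        {
            workTime = sw.Elapsed;         // time spent in the body
            sw.Restart();
            resource.Dispose();            // the explicit equivalent of leaving a using block
        }
        return (workTime, sw.Elapsed);     // second value: time spent in Dispose()
    }
}
```

With the code above it could be called as `DisposeTiming.Measure(() => PdfDocument.Load(bytes), d => { /* read text runs */ })`, logging both values from each background worker.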

The other thread spends a lot of time getting the BoundingBoxes, Locations, and Matrices of PdfTextRun objects.

Are we doing something wrong? Or is it a bug in pdfium?

Paul Rayman  
#2 Posted : Friday, July 12, 2019 9:54:39 AM(UTC)
Paul Rayman

Rank: Administration

Groups: Administrators
Joined: 1/5/2016(UTC)
Posts: 814

Thanks: 1 times
Was thanked: 100 time(s) in 98 post(s)
Are you using latest version of SDK?
rhnatiuk  
#3 Posted : Monday, July 15, 2019 1:00:11 AM(UTC)
Originally Posted by: Paul Rayman
Are you using latest version of SDK?


No, we had 4.7.2704. I have now updated to the latest, 4.11.2704, and will ask a tester to check whether the problem is still there. Was anything related to threading fixed recently?
rhnatiuk  
#4 Posted : Wednesday, July 17, 2019 1:40:18 AM(UTC)
Originally Posted by: Paul Rayman
Are you using latest version of SDK?


Strange... We saw the problem on one computer (a DELL Intel machine), but we could not reproduce it in our testing environment (not the same kind of machine). The trouble is that the person who uses the problem machine is on vacation and will be back on August 12th. :( So we will not be able to confirm whether updating the SDK fixed the problem until then.

Have a nice summer!
rhnatiuk  
#5 Posted : Wednesday, August 14, 2019 7:37:57 AM(UTC)
Originally Posted by: Paul Rayman
Are you using latest version of SDK?


Hi Paul,

The problem is still reproducible with the latest SDK. I have made a short repro project; see the attached PdfiumNetParallelism.zip (697kb). Just drop Patagames.Pdf.dll and pdfium.dll from the SDK into the ParallelPdfTest folder. The project references them; I had to remove them because the zip was too big for this forum.

The code is simple and should need no explanation. Just run it, and you will see that in the multithreaded scenario the SDK is about 10 times slower.
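
For readers without the attachment, a comparison of this kind can be sketched as follows. This is not the content of PdfiumNetParallelism.zip; `ParallelismCheck` and `Compare` are hypothetical names, and the helper simply runs the same job over the same inputs first sequentially and then through Parallel.ForEach.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical harness (not the attached repro project): runs the same job
// over the same inputs sequentially and concurrently, and returns both times.
public static class ParallelismCheck
{
    public static (TimeSpan Sequential, TimeSpan Concurrent) Compare<T>(
        IReadOnlyList<T> inputs, Action<T> process, int maxDegreeOfParallelism)
    {
        var sw = Stopwatch.StartNew();
        foreach (var input in inputs)
        {
            process(input);                // one item at a time on the calling thread
        }
        TimeSpan sequential = sw.Elapsed;

        sw.Restart();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
        Parallel.ForEach(inputs, options, process);   // same work, several threads
        return (sequential, sw.Elapsed);
    }
}
```

A call might look like `ParallelismCheck.Compare(files, bytes => { using (var d = PdfDocument.Load(bytes)) { /* read text runs */ } }, Environment.ProcessorCount)`, where `files` is a list of byte arrays. On a library that scales across threads, the concurrent time would normally be no worse than the sequential one.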
rhnatiuk  
#6 Posted : Saturday, August 17, 2019 1:31:13 PM(UTC)
Originally Posted by: Paul Rayman
Are you using latest version of SDK?


By the way, the same approx. 10x slowdown occurs when processing pages of one document in parallel.
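
If concurrent access really is the trigger, one thing to try is to funnel all PDF work through a single dedicated thread and let user requests jump ahead of queued background jobs. The sketch below is only an assumption-laden workaround, not a fix confirmed anywhere in this thread: `SerialPdfWorker` and `Enqueue` are hypothetical names, and a job that is already running (including a slow Dispose()) still blocks everything queued behind it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs all submitted jobs on one dedicated thread,
// always taking queued urgent jobs before queued background jobs.
public sealed class SerialPdfWorker : IDisposable
{
    private readonly ConcurrentQueue<Action> _urgent = new ConcurrentQueue<Action>();
    private readonly ConcurrentQueue<Action> _background = new ConcurrentQueue<Action>();
    private readonly SemaphoreSlim _pending = new SemaphoreSlim(0);   // counts queued jobs
    private readonly CancellationTokenSource _stop = new CancellationTokenSource();
    private readonly Thread _thread;

    public SerialPdfWorker()
    {
        _thread = new Thread(Loop) { IsBackground = true, Name = "pdf-worker" };
        _thread.Start();
    }

    // Queues a job and returns a task that completes with its result.
    public Task<T> Enqueue<T>(Func<T> job, bool urgent)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        Action run = () =>
        {
            try { tcs.SetResult(job()); }
            catch (Exception ex) { tcs.SetException(ex); }
        };
        (urgent ? _urgent : _background).Enqueue(run);
        _pending.Release();
        return tcs.Task;
    }

    private void Loop()
    {
        try
        {
            while (true)
            {
                _pending.Wait(_stop.Token);            // sleep until a job is queued
                if (_urgent.TryDequeue(out var job) || _background.TryDequeue(out job))
                {
                    job();
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Dispose() was called; jobs still queued are abandoned.
        }
    }

    public void Dispose()
    {
        _stop.Cancel();
        _thread.Join();
        _stop.Dispose();
        _pending.Dispose();
    }
}
```

A user request would then be submitted as `worker.Enqueue(() => ReadTextRuns(bytes), urgent: true)` and a background job with `urgent: false`, where `ReadTextRuns` stands for whatever the application does with the document. This trades parallelism for predictable ordering, so it only helps if contention between threads is what causes the slowdown.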
rhnatiuk  
#7 Posted : Wednesday, August 21, 2019 9:38:07 AM(UTC)
Hi Paul Rayman,

Any news on the subject? Did you manage to reproduce those problems?

Paul Rayman  
#8 Posted : Thursday, September 5, 2019 9:31:38 PM(UTC)
I apologize for the delay in responding. This problem requires additional research on the capabilities of the Pdfium engine.
rhnatiuk  
#9 Posted : Thursday, September 12, 2019 7:14:16 AM(UTC)
Hi Paul,

Originally Posted by: Paul Rayman
I apologize for the delay in responding. This problem requires additional research on the capabilities of the Pdfium engine.


Any success on the topic? This issue is biting us quite badly at the moment. We have already considered using pdfium to read PDFs and the old library to write them. But it seems that even on a "normal" read, some documents can have .Dispose() calls that take 20 seconds, which makes parallelism the only possible solution.
rhnatiuk  
#10 Posted : Thursday, September 19, 2019 1:59:40 AM(UTC)
Hi Paul,

Originally Posted by: Paul Rayman
I apologize for the delay in responding. This problem requires additional research on the capabilities of the Pdfium engine.


Any hope for a solution? We are heading towards release, and this issue is slowly becoming a blocker.
Paul Rayman  
#11 Posted : Wednesday, September 25, 2019 1:49:58 AM(UTC)
Hello,

I apologize for the delay in responding.
Currently, a stable working solution has not yet been found.
rhnatiuk  
#12 Posted : Wednesday, September 25, 2019 3:15:52 AM(UTC)
Originally Posted by: Paul Rayman
Hello,

I apologize for the delay in responding.
Currently, a stable working solution has not yet been found.


:(