Paul Rayman  
#1 Posted : Monday, February 6, 2017 8:56:37 AM(UTC)
Paul Rayman

Rank: Administration

Groups: Administrators
Joined: 1/5/2016(UTC)
Posts: 1,011

Thanks: 5 times
Was thanked: 121 time(s) in 118 post(s)
Question:
I am trying to search for text in a PDF file and return its coordinates if the text exists.
From researching online, I found out that this can be done with the Pdfium.Net SDK.

Could you please provide some examples of how to do that?

Answer:
Please see the code below.
Code:

//Open PDF document
using (var doc = PdfDocument.Load(@"d:\0\test_big.pdf"))
{
	//Enumerate pages
	foreach(var page in doc.Pages)
	{
		var found = page.Text.Find("text for search", FindFlags.None, 0);
		if (found != null)
		{
			do
			{
				var textInfo = found.FindedText;
				foreach(var rect in textInfo.Rects)
				{
					float x = rect.left;
					float y = rect.top;
					//...
				}
			} while (found.FindNext());
		}
		page.Dispose();
	}
}


You can also use the PdfSearch class for an asynchronous search.
Code:

//Open PDF document
var doc = PdfDocument.Load(@"d:\0\test_big.pdf");
PdfSearch search = new PdfSearch(doc);
search.FoundTextAdded += (s, e) =>
	{
		var textInfo = doc.Pages[e.FoundText.PageIndex].Text.GetTextInfo(e.FoundText.CharIndex, e.FoundText.CharsCount);
		foreach (var rect in textInfo.Rects)
		{
			float x = rect.left;
			float y = rect.top;
			Console.WriteLine(string.Format("Found text: {0}, Page = {1}, x= {2}, y={3}", textInfo.Text, e.FoundText.PageIndex, x, y));
			//...
		}
	};
search.SearchCompleted += (s, e) =>
	{
		doc.Dispose();
	};
search.SearchProgressChanged += (s, e) =>
	{
		Console.WriteLine(string.Format("Progress: {0}%", e.ProgressPercentage));
	};
search.Start("document", FindFlags.MatchWholeWord);
Console.ReadLine();

Edited by user Wednesday, November 7, 2018 9:50:21 PM(UTC)  | Reason: Not specified

fmotsch  
#2 Posted : Monday, February 26, 2018 1:48:46 AM(UTC)
fmotsch

Rank: Member

Groups: Registered
Joined: 2/6/2018(UTC)
Posts: 11
France
Location: Paris

Thanks: 6 times
Hello,

I tried your first code sample, but it throws a StackOverflowException, which I think I have handled.
My remaining problem is that ScrollToPoint is not precise at all: it zooms in far away from the word I'm looking for!
Here are the two functions I'm working on:




Code:
public void highligtText(string text)
{
    int cnt = this.pdfViewer.Document.Pages.Count;
    for (int i = 0; i < cnt; i++)
    {
        var found = this.pdfViewer.Document.Pages[i].Text.Find(text, Patagames.Pdf.Enums.FindFlags.None, 0); //1)
        if (found == null)
            continue;

        do
        {
            try
            {
                zoomRecherche(text, found);
            }
            catch (StackOverflowException e)
            {
                if (e.Source != null)
                    Console.WriteLine("IOException source: {0}", e.Source);
                throw;
            }

            this.pdfViewer.HighlightText(i, found.CharIndex, found.CharsCount, System.Windows.Media.Color.FromArgb(90, 219, 0, 25));
        } while (found.FindNext());
    }
}

/// <summary>
/// Zoom in on the searched element
/// </summary>
public void zoomRecherche(string text, dynamic found)
{
    int idx = pdfViewer.CurrentIndex;

    if (idx >= 0)
    {
        var textInfo = found.FindedText;
        foreach (var rect in textInfo.Rects)
        {
            Point p = rect.Position;
            // Go to the position of point p
            this.pdfViewer.ScrollToPoint(idx, p);
        }
        pdfViewer.Zoom = 2f;
    }
}



Do you have any idea how I can zoom in on the word I'm looking for accurately? Maybe there is a function that decides how the PDF is opened and makes the opening a bit messy?
Thanks

Edited by user Tuesday, February 27, 2018 8:39:56 AM(UTC)  | Reason: Not specified

Paul Rayman  
#3 Posted : Monday, February 26, 2018 8:39:06 AM(UTC)
Hi,

It looks like the following thread will be helpful for you:
http://forum.patagames.c...or-Position-on-PdfViewer

Update:
A simpler solution may be acceptable, though.
Just change the order of your actions:
1. First zoom the page.
2. Then call the ScrollToPoint method.
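
The reordering described in the two steps above could look roughly like the sketch below. It only uses members that already appear in this thread (Zoom, ScrollToPoint, FindedText, Rects, rect.left, rect.top). The class and method names, the Patagames.Pdf.Net.Controls.Wpf namespace, and the choice to pass the page index of the match explicitly instead of reading pdfViewer.CurrentIndex are assumptions made for illustration, not something confirmed in this thread.

```csharp
using System.Windows;                  // Point
using Patagames.Pdf.Net.Controls.Wpf;  // PdfViewer (assumed namespace of the WPF viewer)

// Hypothetical helper, named for illustration only.
public static class SearchZoomSketch
{
    // pageIndex is assumed to be the index of the page the match was found on
    // (the loop variable i in highligtText), not pdfViewer.CurrentIndex.
    public static void ZoomToMatch(PdfViewer pdfViewer, int pageIndex, dynamic found)
    {
        // 1. Zoom first, so the viewer is already at its final scale.
        pdfViewer.Zoom = 2f;

        // 2. Then scroll to the first rectangle of the found text.
        var textInfo = found.FindedText;
        foreach (var rect in textInfo.Rects)
        {
            // Build the point from the rectangle fields used in the first post.
            var p = new Point(rect.left, rect.top);
            pdfViewer.ScrollToPoint(pageIndex, p);
            break; // one scroll per match is enough
        }
    }
}
```

In highligtText this would be called as SearchZoomSketch.ZoomToMatch(this.pdfViewer, i, found) in place of zoomRecherche(text, found). Whether this reordering alone gives accurate positioning is not established here.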

Edited by user Monday, February 26, 2018 9:02:01 AM(UTC)  | Reason: Not specified

fmotsch  
#4 Posted : Tuesday, February 27, 2018 8:54:22 AM(UTC)
Thanks for the answer.
I have made your modifications, but the word I'm looking for is still never in the window (even though I'm sure this code should work).
I believe that something else (another padding or function) is getting in the way of precise positioning.
Do you have any idea what I should be looking for?

Thanks anyway, you do a great job!!! :)

Edited by user Tuesday, February 27, 2018 9:01:31 AM(UTC)  | Reason: Not specified

Paul Rayman  
#5 Posted : Tuesday, February 27, 2018 9:32:37 AM(UTC)
Well...

You may also look at the source code of the original PdfToolStripSearch.
This code does exactly what you need: it performs the text search, selects the text, and scrolls to the found match.
https://github.com/Patag...rs/PdfToolStripSearch.cs

How to zoom in on the page around a point is shown at the link in my previous post.



thanks 1 user thanked Paul Rayman for this useful post.
fmotsch on 3/6/2018(UTC)