jacksonsavitraz  
#1 Posted : Wednesday, November 23, 2016 9:56:56 PM(UTC)
Quote
jacksonsavitraz

Rank: Newbie

Groups: Registered
Joined: 11/23/2016(UTC)
Posts: 5
Man
Brazil

Thanks: 2 times
I need to take a PdfPage from a PdfDocument and draw a rectangle around each PdfImageObject detected on it, but the BoundingBox coordinates are wrong when OriginalRotation is anything other than normal. Why?
Paul Rayman  
#2 Posted : Wednesday, November 23, 2016 11:57:08 PM(UTC)
Quote
Paul Rayman

Rank: Administration

Groups: Administrators
Joined: 1/5/2016(UTC)
Posts: 898

Thanks: 3 times
Was thanked: 110 time(s) in 107 post(s)
What do you mean by "the BoundingBox coordinates are not OK"?
I have just checked it and everything works like a charm.
jacksonsavitraz  
#3 Posted : Thursday, November 24, 2016 3:17:51 PM(UTC)
Originally Posted by: Paul Rayman
What do you mean by "the BoundingBox coordinates are not OK"?
I have just checked it and everything works like a charm.


Paul, I think the BoundingBox for the object is incorrect because the page.OriginalRotation is PageRotate.Rotate90.
See my code:

Code:

			PdfCommon.Initialize();
			const int dpi = 96;
			using (var document = PdfDocument.Load(Path.Combine(Directory.GetCurrentDirectory(), "test.pdf")))
			{
				for (var i = 0; i < document.Pages.Count; i++)
				{
					using (var page = document.Pages[i])
					{
						int width = Convert.ToInt32(page.Width / 72.0 * dpi);
						int height = Convert.ToInt32(page.Height / 72.0 * dpi);
						using (var bmp = new PdfBitmap(width, height, true))
						{
							using (var g = Graphics.FromImage(bmp.Image))
							{
								g.Clear(Color.White);
								page.Render(bmp, 0, 0, bmp.Width, bmp.Height, PageRotate.Normal, RenderFlags.FPDF_LCD_TEXT);
								foreach (var obj in page.PageObjects)
								{
									if (obj is PdfImageObject)
									{
										var rect = new Rectangle(Convert.ToInt32(obj.BoundingBox.X / 72.0 * dpi), Convert.ToInt32(obj.BoundingBox.Y / 72.0 * dpi), Convert.ToInt32(obj.BoundingBox.Width / 72.0 * dpi), Convert.ToInt32(obj.BoundingBox.Height / 72.0 * dpi));
										g.DrawRectangle(new Pen(Color.Red, 3), rect);
									}
								}

							}
							bmp.Image.Save(Path.Combine(Directory.GetCurrentDirectory(), string.Format("page-{0}.png", i + 1)), ImageFormat.Png);
						}
					}
				}
			}
			PdfCommon.Release();


Use this PDF file: test.zip (262kb)

It generates this image: page-1.png
Paul Rayman  
#4 Posted : Thursday, November 24, 2016 8:49:22 PM(UTC)
The engine does not know how you plan to render the page. You may render it unrotated, or rotated by 90, 180 or 270 degrees, and at any size.
For this reason the bounding box coordinates are given in the page's coordinate system, which is independent of the rendering parameters.

So you need to translate the bounding box coordinates from the page's coordinate system to the device coordinate system.
You can do this with the page.PageToDevice method.
Pass that method exactly the same parameters as you pass to the render method. The engine can then translate the coordinates from one system to the other, taking your rendering parameters into account.

After that you need to build the rectangle and normalize it, because it can have a negative size.
Please look at the code below.

Code:

page.Render(bmp, 0, 0, bmp.Width, bmp.Height, PageRotate.Normal, RenderFlags.FPDF_LCD_TEXT);
foreach (var obj in page.PageObjects)
{
	if (obj is PdfImageObject)
	{
		//Translate bounding box's coordinates into device's coordinate system
		//Using exactly the same values as in the render method.
		var pt1 = page.PageToDevice(0, 0, bmp.Width, bmp.Height, PageRotate.Normal, obj.BoundingBox.Left, obj.BoundingBox.Top);
		var pt2 = page.PageToDevice(0, 0, bmp.Width, bmp.Height, PageRotate.Normal, obj.BoundingBox.Right, obj.BoundingBox.Bottom);

		//Build rectangle
		var rect = new Rectangle(pt1.X, pt1.Y, pt2.X - pt1.X, pt2.Y - pt1.Y);
		//And  normalize it if it has the negative size in any dimension
		if (rect.Width < 0)
		{
			rect.Width = -rect.Width;
			rect.X -= rect.Width;
		}
		if (rect.Height < 0)
		{
			rect.Height = -rect.Height;
			rect.Y -= rect.Height;
		}
		//Now you can draw the rectangle
		g.DrawRectangle(new Pen(Color.Red, 3), rect);
	}
}
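
A more compact way to do the normalization step is sketched below. It only assumes that the two translated corners are available as System.Drawing.Point values; the class and method names (RectangleSketch, FromCorners) are hypothetical and not part of the SDK.

```csharp
using System;
using System.Drawing;

public static class RectangleSketch
{
    // Builds a rectangle with non-negative width and height from two
    // opposite corners, whichever order they arrive in.
    public static Rectangle FromCorners(Point p1, Point p2)
    {
        return Rectangle.FromLTRB(
            Math.Min(p1.X, p2.X),   // left
            Math.Min(p1.Y, p2.Y),   // top
            Math.Max(p1.X, p2.X),   // right
            Math.Max(p1.Y, p2.Y));  // bottom
    }
}
```

With pt1 and pt2 from the code above, RectangleSketch.FromCorners(pt1, pt2) should give the same rectangle as the two if blocks, since taking the minimum and maximum of each coordinate removes any dependence on corner order.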

jacksonsavitraz  
#5 Posted : Monday, November 28, 2016 6:32:45 PM(UTC)
When the PdfImageObject is inside a PdfFormObject, the BoundingBox does not work correctly.

Using test.zip (453kb) in GenerateImages("d:\\test.pdf");

With this code:
Code:

		public const int DPI = 96;

		public void GenerateImages(string filename)
		{
			PdfCommon.Initialize();

			using (var document = PdfDocument.Load(filename))
			{
				for (var i = 0; i < document.Pages.Count; i++)
				{
					using (var page = document.Pages[i])
					{
						int width = Convert.ToInt32(page.Width / 72.0 * DPI);
						int height = Convert.ToInt32(page.Height / 72.0 * DPI);

						using (var bmp = new PdfBitmap(width, height, true))
						{
							bmp.FillRect(0, 0, bmp.Width, bmp.Height, Color.White);
							page.Render(bmp, 0, 0, bmp.Width, bmp.Height, PageRotate.Normal, RenderFlags.FPDF_LCD_TEXT);

							using (var image = new Bitmap(bmp.Image))
							{
								using (var g = Graphics.FromImage(image))
								{
									using (var pen = new Pen(Color.FromArgb(128, Color.Red), 4))
									{
										foreach (var rectangle in FindImagesRectangles(page, page.PageObjects))
										{
											g.DrawRectangle(pen, rectangle);
										}
									}
								}

								image.Save(Path.Combine(Path.GetDirectoryName(filename), string.Format("page-{0}.png", i+1)), ImageFormat.Png);
							}
						}
					}
				}
			}

			PdfCommon.Release();
		}

		public Rectangle BoundingBoxToRectangle(PdfPage page, Rectangle boundingBox)
		{
			int width = Convert.ToInt32(page.Width / 72.0 * DPI);
			int height = Convert.ToInt32(page.Height / 72.0 * DPI);

			var top = page.PageToDevice(0, 0, width, height, PageRotate.Normal, boundingBox.Left, boundingBox.Top);
			var bottom = page.PageToDevice(0, 0, width, height, PageRotate.Normal, boundingBox.Right, boundingBox.Bottom);

			var rectangle = new Rectangle(top.X, top.Y, bottom.X - top.X, bottom.Y - top.Y);
			if (rectangle.Width < 0)
			{
				rectangle.Width = -rectangle.Width;
				rectangle.X -= rectangle.Width;
			}
			if (rectangle.Height < 0)
			{
				rectangle.Height = -rectangle.Height;
				rectangle.Y -= rectangle.Height;
			}
			if (rectangle.Y < 0)
			{
				rectangle.Height = rectangle.Height + rectangle.Y;
				rectangle.Y = 0;
			}
			if (rectangle.X < 0)
			{
				rectangle.Width = rectangle.Width + rectangle.X;
				rectangle.X = 0;
			}
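			// Clip to the bitmap: shrink by the overshoot so that Bottom <= height - 1.
			if (rectangle.Bottom >= height)
			{
				rectangle.Height = rectangle.Height - (rectangle.Bottom - height + 1);
			}
			// Likewise for the right edge.
			if (rectangle.Right >= width)
			{
				rectangle.Width = rectangle.Width - (rectangle.Right - width + 1);
			}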

			return rectangle;
		}

		public Rectangle[] FindImagesRectangles(PdfPage page, PdfPageObjectsCollection objects)
		{
			var rectangles = new List<Rectangle>();

			foreach (var obj in objects)
			{
				if (obj is PdfImageObject)
				{
					rectangles.Add(BoundingBoxToRectangle(page, obj.BoundingBox));
				}
				else if (obj is PdfFormObject)
				{
					var form = obj as PdfFormObject;
					rectangles.AddRange(FindImagesRectangles(page, form.PageObjects));
				}
			}

			return rectangles.ToArray();
		}


It's generating this:
page-1.png
Paul Rayman  
#6 Posted : Monday, November 28, 2016 11:39:42 PM(UTC)
You have to calculate the offset of the target bounding box as the difference between the form object's bounding box and the sub-object's bounding box.

Please look at the following code
Code:

var formBBox = obj.BoundingBox; // the BBox of an object on the page (form object)
var imgBBox = img.BoundingBox; // the BBox of an object inside the form (found image object)
imgBBox.Offset(formBBox.X - imgBBox.X, formBBox.Y - imgBBox.Y);

//Translate and draw imgBBox
PageToDevice(...imgBBox...);
...


jacksonsavitraz  
#7 Posted : Tuesday, November 29, 2016 8:14:09 PM(UTC)
Originally Posted by: Paul Rayman
You have to calculate the offset of the target bounding box as the difference between the form object's bounding box and the sub-object's bounding box.


It worked for the attached file, but not for this other file:
test2.zip (448kb)

Coordinates are incorrect:
page-1.png

This is my current code:
Code:

		public const int DPI = 96;

		public Rectangle BoundingBoxToRectangle(PdfPage page, Rectangle boundingBox, PdfFormObject form = null)
		{
			int width = Convert.ToInt32(page.Width / 72.0 * DPI);
			int height = Convert.ToInt32(page.Height / 72.0 * DPI);

			if (form != null)
			{
				boundingBox.Offset(form.BoundingBox.X - boundingBox.X, form.BoundingBox.Y - boundingBox.Y);
			}

			var top = page.PageToDevice(0, 0, width, height, PageRotate.Normal, boundingBox.Left, boundingBox.Top);
			var bottom = page.PageToDevice(0, 0, width, height, PageRotate.Normal, boundingBox.Right, boundingBox.Bottom);

			var rectangle = new Rectangle(top.X, top.Y, bottom.X - top.X, bottom.Y - top.Y);
			if (rectangle.Width < 0)
			{
				rectangle.Width = -rectangle.Width;
				rectangle.X -= rectangle.Width;
			}
			if (rectangle.Height < 0)
			{
				rectangle.Height = -rectangle.Height;
				rectangle.Y -= rectangle.Height;
			}
			if (rectangle.Y < 0)
			{
				rectangle.Height = rectangle.Height + rectangle.Y;
				rectangle.Y = 0;
			}
			if (rectangle.X < 0)
			{
				rectangle.Width = rectangle.Width + rectangle.X;
				rectangle.X = 0;
			}
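			// Clip to the bitmap: shrink by the overshoot so that Bottom <= height - 1.
			if (rectangle.Bottom >= height)
			{
				rectangle.Height = rectangle.Height - (rectangle.Bottom - height + 1);
			}
			// Likewise for the right edge.
			if (rectangle.Right >= width)
			{
				rectangle.Width = rectangle.Width - (rectangle.Right - width + 1);
			}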

			return rectangle;
		}

		public Rectangle[] FindImagesRectangles(PdfPage page, PdfPageObjectsCollection objects, PdfFormObject form = null)
		{
			var rectangles = new List<Rectangle>();

			foreach (var obj in objects)
			{
				if (obj is PdfImageObject)
				{
					rectangles.Add(BoundingBoxToRectangle(page, obj.BoundingBox, form));
				}
				else if (obj is PdfFormObject)
				{
					var parent = obj as PdfFormObject;
					rectangles.AddRange(FindImagesRectangles(page, parent.PageObjects, parent));
				}
			}

			return rectangles.ToArray();
		}

		public void GenerateImages(string filename)
		{
			PdfCommon.Initialize();

			using (var document = PdfDocument.Load(filename))
			{
				for (var i = 0; i < document.Pages.Count; i++)
				{
					using (var page = document.Pages[i])
					{
						int width = Convert.ToInt32(page.Width / 72.0 * DPI);
						int height = Convert.ToInt32(page.Height / 72.0 * DPI);

						using (var bmp = new PdfBitmap(width, height, true))
						{
							bmp.FillRect(0, 0, bmp.Width, bmp.Height, Color.White);
							page.Render(bmp, 0, 0, bmp.Width, bmp.Height, PageRotate.Normal, RenderFlags.FPDF_LCD_TEXT);

							using (var image = new Bitmap(bmp.Image))
							{
								using (var g = Graphics.FromImage(image))
								{
									using (var pen = new Pen(Color.FromArgb(128, Color.Red), 4))
									{
										foreach (var rectangle in FindImagesRectangles(page, page.PageObjects))
										{
											g.DrawRectangle(pen, rectangle);
										}
									}
								}

								image.Save(Path.Combine(Path.GetDirectoryName(filename), string.Format("page-{0}.png", i+1)), ImageFormat.Png);
							}
						}
					}
				}
			}

			PdfCommon.Release();
		}
Paul Rayman  
#8 Posted : Thursday, December 1, 2016 6:25:08 PM(UTC)
Well... my previous answer was incorrect.
Thanks for pointing out my misunderstanding of this issue.
I have read the PDF reference and found the following.

The nuance with forms is this: a form object has a transformation matrix, and this matrix must be taken into account when calculating any coordinates of the form's objects.
In this case, you must use the following code to obtain the bounding box of an image object inside the form.

Code:

int l, t, r, b;
Pdfium.FPDFPageObj_GetBBox(img.Handle, form.Matrix, out l, out t, out r, out b);
var imgBBox = new Rectangle(l, t, r - l, b - t);


Where
form is a PdfFormObject
img is a PdfImageObject from the form.PageObjects collection
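
To illustrate what "taking the matrix into account" means, here is a minimal sketch of applying a PDF transformation matrix [a b c d e f] to a bounding box by hand. It is not the SDK's implementation: the class and method names are hypothetical, and it assumes the box is given as left, bottom, right, top in the form's own coordinate space.

```csharp
using System;

public static class FormMatrixSketch
{
    // Applies a PDF matrix [a b c d e f] to the point (x, y):
    // x' = a*x + c*y + e,  y' = b*x + d*y + f.
    public static (double X, double Y) Apply(
        double a, double b, double c, double d, double e, double f,
        double x, double y)
    {
        return (a * x + c * y + e, b * x + d * y + f);
    }

    // Transforms all four corners of the box and returns the axis-aligned
    // box that encloses them. Using four corners (not two) keeps the
    // result correct when the matrix rotates or skews.
    public static (double Left, double Bottom, double Right, double Top) TransformBox(
        double a, double b, double c, double d, double e, double f,
        double left, double bottom, double right, double top)
    {
        var corners = new[]
        {
            Apply(a, b, c, d, e, f, left, bottom),
            Apply(a, b, c, d, e, f, right, bottom),
            Apply(a, b, c, d, e, f, left, top),
            Apply(a, b, c, d, e, f, right, top),
        };

        double minX = double.MaxValue, minY = double.MaxValue;
        double maxX = double.MinValue, maxY = double.MinValue;
        foreach (var p in corners)
        {
            minX = Math.Min(minX, p.X);
            minY = Math.Min(minY, p.Y);
            maxX = Math.Max(maxX, p.X);
            maxY = Math.Max(maxY, p.Y);
        }
        return (minX, minY, maxX, maxY);
    }
}
```

For example, a matrix that scales by 2 and translates by (10, 20), that is [2 0 0 2 10 20], maps the box (0, 0, 5, 5) to (10, 20, 20, 30). This is why offsetting by the form's bounding box alone cannot work in general: it can only account for translation, not for scaling or rotation.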
Guest  
#9 Posted : Saturday, May 2, 2020 7:07:06 AM(UTC)
Quote
Guest

Rank: Guest

Groups: Guests
Joined: 1/5/2016(UTC)
Posts: 144

Was thanked: 4 time(s) in 4 post(s)
Pdfium.FPDFPageObj_GetBBox(img.Handle, form.Matrix, out l, out t, out r, out b)
Paul,
can you rewrite the code above using the object method _GetBBox?
I also did not find any code example showing how to extract images from a PDF together with their coordinates (locations) in pixels.

I am still getting wrong rectangle coordinates for the extracted images.


Guest  
#10 Posted : Monday, May 18, 2020 9:51:01 AM(UTC)
Originally Posted by: Guest

I also did not find any code example showing how to extract images from a PDF together with their coordinates (locations) in pixels.


Try checking PdfImageObject.Matrix. The fields e and f of the matrix give the location of the image. For an image that is not rotated or skewed, the width and height are in a and d.
These values are given in PDF points.
1 PDF point is 1/72 of an inch. Therefore, if you need pixels, just calculate them from the DPI of the image you have rendered.

pixel = point / 72.0 * dpi;
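
As a worked example of that formula, here is a minimal sketch; the class and method names are hypothetical. Note that it only scales a length: it does not flip the y axis or handle page rotation, which is what PageToDevice is for.

```csharp
using System;

public static class PointConversion
{
    // 1 PDF point = 1/72 inch, so pixels = points / 72 * dpi.
    public static int ToPixels(double points, double dpi)
    {
        return (int)Math.Round(points / 72.0 * dpi, MidpointRounding.AwayFromZero);
    }
}
```

At 96 DPI, 72 points (one inch) become 96 pixels, and a 595 point wide page becomes 793 pixels wide.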
Guest  
#11 Posted : Tuesday, June 2, 2020 1:50:26 PM(UTC)
Extracting images from a PDF with coordinates.

I am still getting wrong pixel coordinates.

Can somebody help and give a link to a complete C# code example?
I use a very simple PDF with an image in the page header.

PdfImageObject.Matrix gives the same results.

After such a long time, does nobody use this feature?

Paul Rayman  
#12 Posted : Tuesday, June 2, 2020 6:46:11 PM(UTC)
Hello,

Please clarify your question in more detail. What do you mean by coordinates? In what coordinate system should these coordinates be expressed? Why do you think that the coordinates obtained using GetBoundingBox are incorrect? Please provide your test PDF and minimal code illustrating the problem.
Guest  
#13 Posted : Wednesday, June 17, 2020 2:37:20 AM(UTC)
Hi Paul,

1. Based on the discussion above, can you post in this thread a corrected version showing how to extract the images from a PDF together with their coordinates, first in points and then in pixels (as described for ..ToDevice...)?

2. Are mr. jacksonsavitraz's calculations of the extracted rectangles correct?

We don't have any documentation on how to treat rectangles that come with negative values, like (x,-y,w,-h...).

Once I have a correct code example (I could not find one anywhere on the site), I'll check it.

I can then supply my demo PDF. It is simple, with one image at the top of the page (a logo) and an image of a signature at the bottom of the page.

Waiting for help,



Paul Rayman  
#14 Posted : Wednesday, June 17, 2020 6:45:17 AM(UTC)
I really do not understand your difficulties.

1. PdfImageObject.BoundingBox gives you the actual location and size of the image in points (in the page coordinate system). These coordinates are independent of render size, rotation, etc. One PDF point is 1/72 of an inch.
2. PdfPage.PageToDevice allows you to translate points into pixels. Pixels, of course, depend on how you rendered the document, so you need to pass the same parameters to this method as when rendering.
3. The origin of the page coordinate system (the point with coordinates 0; 0) is located in the lower left corner of the page.
4. The origin of the "pixel coordinate system" is the upper left corner of the bitmap into which you render the page.
5. You must take into account that the page can be rotated by 90, 180 or 270 degrees. The coordinates in the page coordinate system are independent of page rotation, in contrast to the coordinates in pixels. You should take this into account so that you do not get a negative width or height.

At what stage do you encounter difficulties?
Please formulate a specific question with code examples and a PDF document.
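
For the simplest case, the relationship between the two coordinate systems described in points 3 and 4 can be sketched as follows. This is a simplified model under stated assumptions, not the library's implementation: it assumes an unrotated page, a page origin at the lower left corner with no crop box offset, and the class, method and parameter names are hypothetical. For real documents, PageToDevice should be preferred.

```csharp
using System;

public static class PageToDeviceSketch
{
    // Maps a point in page space (origin lower left, y up, units of
    // points) to device space (origin upper left, y down, pixels) for a
    // page rendered unrotated into the area startX, startY, sizeX, sizeY.
    public static (int X, int Y) ToDevice(
        double pageWidth, double pageHeight,
        int startX, int startY, int sizeX, int sizeY,
        double pageX, double pageY)
    {
        // x only needs scaling; y must be flipped before scaling.
        int x = startX + (int)Math.Round(pageX * sizeX / pageWidth);
        int y = startY + (int)Math.Round((pageHeight - pageY) * sizeY / pageHeight);
        return (x, y);
    }
}
```

For a 612 x 792 point page rendered into an 816 x 1056 pixel bitmap (96 DPI), the page point (72, 720), one inch in from the left and one inch down from the top, maps to the pixel (96, 96). Because the y axis is flipped, the top of a bounding box in page space has a larger y than its bottom, while in device space it has a smaller one, which is where the negative heights come from if the corners are not reordered.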

