Paul Rayman  
#1 Posted : Tuesday, January 5, 2016 6:20:16 AM(UTC)
Paul Rayman

Rank: Administration

Groups: Administrators
Joined: 1/5/2016(UTC)
Posts: 1,115

Thanks: 8 times
Was thanked: 130 time(s) in 127 post(s)
Question:
I want to extract images from a PDF. I have tried many approaches, but none of them has worked. Please help. Thanks in advance.

Answer:
We suggest using Pdfium.Net SDK. The following example enumerates every page of a document and saves each image object it finds to disk as a PNG file.

Code:

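//Namespaces required by this example
using System.Drawing.Imaging; //ImageFormat
using Patagames.Pdf.Net;      //PdfCommon, PdfDocument, PdfPage, PdfImageObject

private int _writedImageIndex = 0;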

public void ExtractAllImages()
{
    //Initialize the SDK library
    //You have to call this function before you can call any PDF processing functions.
    PdfCommon.Initialize();

    //Open and load a PDF document from a file.
    using (var doc = PdfDocument.Load(@"c:\test001.pdf"))
    {
        //Enumerate all pages sequentially in a given document
        foreach (var page in doc.Pages)
        {
            //Extract and save images
            ExtractImagesFromPage(page);

            //Dispose the page object to unload it from memory
            page.Dispose();
        }
    }

    //Release all resources allocated by the SDK library
    PdfCommon.Release();
}

private void ExtractImagesFromPage(PdfPage page)
{
    //Enumerate all objects on a page
    foreach (var obj in page.PageObjects)
    {
        var imageObject = obj as PdfImageObject;
        if (imageObject == null)
            continue; //not an image object, so skip it

        //Save the image to disk (the c:\Images folder must already exist)
        var path = string.Format(@"c:\Images\image_{0}.png", ++_writedImageIndex);
        imageObject.Bitmap.Image.Save(path, ImageFormat.Png);
    }
}

Edited by user Thursday, April 28, 2016 9:29:55 AM(UTC)  | Reason: Not specified

eagleview  
#2 Posted : Monday, February 1, 2016 5:59:39 PM(UTC)
eagleview

Rank: Member

Groups: Registered
Joined: 1/28/2016(UTC)
Posts: 17
United States

Thanks: 3 times
Hi - I've tried the code and continue to get errors. I've added the assembly and namespace references to fix the Bitmap errors, but I cannot find a solution to the following errors.

Severity Code Description Project File Line Suppression State
Error CS1503 Argument 4: cannot convert from 'System.Windows.Media.PixelFormat' to 'System.Drawing.Imaging.PixelFormat'
Error CS0117 'PixelFormat' does not contain a definition for 'Undefined'
Error CS0117 'PixelFormat' does not contain a definition for 'Format24bppRgb'
Error CS0117 'PixelFormat' does not contain a definition for 'Format32bppRgb'
Error CS0117 'PixelFormat' does not contain a definition for 'Format32bppArgb'
Error CS0103 The name 'ImageFormat' does not exist in the current context

I'm trying to use Tesseract to convert an Image PDF to text so I can process it with PDFium.

Thank you for any help you can provide,

Michael
Paul Rayman  
#3 Posted : Monday, February 1, 2016 7:21:55 PM(UTC)
The code above is for WinForms (it uses GDI+ bitmaps), but you are trying to use WPF.

So you need to rewrite CreateGdiBitmap for WPF.

Something like this:

Code:
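private BitmapSource CreateGdiBitmap(PdfBitmap bitmap)
{
    //Determine the WPF pixel format (System.Windows.Media.PixelFormats).
    //The mapping below is an assumption based on PDFium's BGR byte order; verify it against your documents.
    PixelFormat pixelFormat;
    switch (bitmap.Format)
    {
        case BitmapFormats.FXDIB_Rgb: pixelFormat = PixelFormats.Bgr24; break;
        case BitmapFormats.FXDIB_Rgb32: pixelFormat = PixelFormats.Bgr32; break;
        case BitmapFormats.FXDIB_Argb: pixelFormat = PixelFormats.Bgra32; break;
        default: throw new NotSupportedException("Unsupported bitmap format: " + bitmap.Format);
    }

    //Create a WPF BitmapSource from the raw pixel buffer
    return BitmapSource.Create(bitmap.Width, bitmap.Height, 96, 96, pixelFormat, null, bitmap.Buffer, bitmap.Stride*bitmap.Height, bitmap.Stride);
}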
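
Since ImageFormat belongs to System.Drawing, a WPF project needs a different way to write the resulting BitmapSource to a PNG file. A minimal sketch using WPF's PngBitmapEncoder follows; the class name WpfImageSaver and the method name SavePng are hypothetical and are not part of Pdfium.Net SDK.

```csharp
using System.IO;
using System.Windows.Media.Imaging;

// Hypothetical helper (not part of Pdfium.Net SDK): saves a WPF BitmapSource as PNG.
public static class WpfImageSaver
{
    // Encodes the bitmap as PNG and writes it to an already open stream.
    public static void SavePng(BitmapSource source, Stream output)
    {
        var encoder = new PngBitmapEncoder();
        encoder.Frames.Add(BitmapFrame.Create(source));
        encoder.Save(output);
    }

    // Convenience overload that creates (or overwrites) a file at the given path.
    public static void SavePng(BitmapSource source, string path)
    {
        using (var stream = File.Create(path))
        {
            SavePng(source, stream);
        }
    }
}
```

With this in place, the WinForms call to Image.Save(path, ImageFormat.Png) could be replaced by something like WpfImageSaver.SavePng(CreateGdiBitmap(imageObject.Bitmap), path).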
thanks 1 user thanked Paul Rayman for this useful post.
eagleview on 2/1/2016(UTC)
eagleview  
#4 Posted : Monday, February 1, 2016 8:47:18 PM(UTC)
Thanks, Paul. Understood. I will recode using your example. Thanks much!
tudorc  
#5 Posted : Monday, March 10, 2025 8:35:14 AM(UTC)
tudorc

Rank: Newbie

Groups: Registered
Joined: 3/7/2025(UTC)
Posts: 3
Romania

I'm trying this code with the latest version of Pdfium.Net.Sdk (4.98.2704) and it doesn't compile. PdfBitmap has no Image property at all on .NET 6.0, only on .NET Framework. I'm unsure what the recommended way of accessing the bytes or a stream is. Ideally I'd like to save the image to a stream so that I can process it further with other libraries such as Emgu. There is a Buffer property of type IntPtr, but I'm unsure what it points to, in what format, and how long it is.

Also, do images in Pdfium.Net.Sdk require libgdi support, or System.Drawing.Image, which is no longer supported as of .NET 6?
Paul Rayman  
#6 Posted : Thursday, March 13, 2025 5:36:00 AM(UTC)
Hello,

All code related to System.Drawing has been moved to the Patagames.Pdf.Gdi.dll assembly.

If your project is <TargetFramework>net8.0-windows</TargetFramework>, then this assembly should be added automatically when installing via NuGet. Otherwise, you need to add this assembly to the project dependencies yourself. You can find this assembly in the net80-window folder of the NuGet package.

Once the assembly is referenced, the PdfBitmap.Image property becomes available again through extension members.

Edited by user Thursday, March 13, 2025 5:38:24 AM(UTC)  | Reason: Not specified
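
On the question about PdfBitmap.Buffer: the WPF example earlier in this thread passes bitmap.Stride*bitmap.Height as the buffer size, which suggests that the buffer holds Height scanlines of Stride bytes each. Under that assumption, the raw pixels can be copied into a managed array without System.Drawing. The sketch below is only an illustration: PixelBufferSketch and CopyPixels are hypothetical names, and the bytes-per-pixel value (3 for FXDIB_Rgb, 4 for FXDIB_Rgb32 and FXDIB_Argb) is an assumption based on PDFium's native bitmap formats that should be checked against the SDK documentation.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (not part of Pdfium.Net SDK).
public static class PixelBufferSketch
{
    // Copies a strided native pixel buffer into a tightly packed managed array.
    // buffer:        pointer to the first scanline
    // stride:        bytes per scanline, including any row padding
    // bytesPerPixel: assumed 3 for 24-bit and 4 for 32-bit formats
    public static byte[] CopyPixels(IntPtr buffer, int width, int height, int stride, int bytesPerPixel)
    {
        int rowBytes = width * bytesPerPixel;
        if (buffer == IntPtr.Zero)
            throw new ArgumentNullException(nameof(buffer));
        if (stride < rowBytes)
            throw new ArgumentException("Stride is smaller than one row of pixels.", nameof(stride));

        var result = new byte[rowBytes * height];
        for (int y = 0; y < height; y++)
        {
            // Copy one scanline at a time and skip the padding at the end of each row.
            Marshal.Copy(IntPtr.Add(buffer, y * stride), result, y * rowBytes, rowBytes);
        }
        return result;
    }
}
```

A call might look like PixelBufferSketch.CopyPixels(bitmap.Buffer, bitmap.Width, bitmap.Height, bitmap.Stride, 4). The returned array can then be wrapped in a MemoryStream or handed to another imaging library, which would also need to be told the channel order.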
