Rank: Member
Groups: Registered
Joined: 1/28/2016(UTC) Posts: 17  Thanks: 3 times
Hi,
Does the Tesseract.Net SDK support Tesseract's ability to produce a searchable PDF as output? I found that newer versions of Tesseract (3.03 RC and later) support PDF output directly, which would make it much easier to work with the text that Tesseract finds. Is this feature already included in the .Net SDK? If so, can you show a quick example?
Thanks!
Michael
Rank: Administration
Groups: Administrators
Joined: 1/5/2016(UTC) Posts: 1,115
Thanks: 8 times Was thanked: 130 time(s) in 127 post(s)
Yes, of course. Please look at the OcrPdfRenderer class. Code:
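public void Tiff2Pdf()
{
    using (var api = OcrApi.Create())
    {
        api.Init(Languages.English);
        // First argument: output file name; second argument: path to the tessdata folder.
        using (var renderer = OcrPdfRenderer.Create("multipage_pdf_file", "c:\\YourApp\\tessdata\\"))
        {
            renderer.BeginDocument("Title");
            // Recognize every page of the multipage TIFF and pass the results to the PDF renderer.
            api.ProcessPages(@"c:\multipage.tif", null, 0, renderer);
            renderer.EndDocument();
        }
    }
}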
Or, to OCR an existing PDF by rendering each page to a bitmap first. Code:
static void Main(string[] args)
{
    PdfCommon.Initialize();
    // Scale applied to the page size when rendering; 1 keeps the size reported by the PDF.
    double scaleFactor = 1;
    // The using block disposes the OCR engine even if an exception is thrown.
    using (var ocr = OcrApi.Create())
    {
        ocr.Init(Languages.English);
        using (var renderer = OcrPdfRenderer.Create(@"d:\3\multipage_pdf_file", "tessdata\\"))
        {
            renderer.BeginDocument("document title");
            int i = 0;
            using (var doc = PdfDocument.Load(@"d:\3\review.pdf"))
            {
                foreach (var page in doc.Pages)
                {
                    Console.WriteLine("Page {0}", i++);
                    int width = (int)(page.Width * scaleFactor);
                    int height = (int)(page.Height * scaleFactor);
                    using (var bitmap = new PdfBitmap(width, height, true))
                    {
                        // Paint a white background, then render the page on top of it.
                        bitmap.FillRect(0, 0, width, height, Color.White);
                        page.Render(bitmap, 0, 0, width, height, PageRotate.Normal, RenderFlags.FPDF_LCD_TEXT);
                        // Recognize the rendered page and append it to the output PDF.
                        ocr.ProcessPage(OcrPix.FromBitmap(bitmap.Image as Bitmap), null, 0, renderer);
                    }
                }
            }
            renderer.EndDocument();
        }
    }
}
Rank: Member
Groups: Registered
Joined: 1/28/2016(UTC) Posts: 17  Thanks: 3 times
Looks perfect for what I need to do! Thanks, Paul!!