Date: 2021-10-29

I’m trying to use IronOcr in .NET 5, and I’m having trouble understanding the coordinates it gives me.

Context: IronOcr is a Tesseract-based OCR library. I’m using it to scan a 688x688 image with the following code:

    using System;
    using System.IO;
    using System.Linq;
    using System.Collections.Generic;
    using System.Drawing;
    using IronOcr;
    using System.Text.RegularExpressions;

    namespace ScratchConsole
    {
        class Program
        {
            static void Main()
            {
                string file = @"a_valid_file_path.jpg";
                Image saveme = Image.FromFile(file);
                Graphics graphic = Graphics.FromImage(saveme); // This is for later manipulation of the original picture.
                var tess = new IronTesseract();
                using var Input = new OcrInput();
                var ContentArea = new Rectangle() { X = 300, Y = 200, Height = 300, Width = 388 };
                Input.AddImage(saveme, ContentArea);
                tess.Configuration.TesseractVariables["classify_font_name"] = "Arial";
                var res = tess.Read(Input);
                string text = res.Text;
                Console.WriteLine(text);
            }
        }
    }

Note: I am limiting the scan area of the OCR with the rectangle, as shown in their example.

When I break at the Console.WriteLine call and inspect res, I can find the block that contains the line of text (Blocks[1], i.e. the second block). The Location of that block is:

    res.Blocks[1].Location    {X = 1162 Y = 817 Width = 336 Height = 36}
        Bottom: 853
        Height: 36
        IsEmpty: false
        Left: 1162
        Location: {X = 1162 Y = 817}
        Right: 1498
        Size: {Width = 336 Height = 36}
        Top: 817
        Width: 336
        X: 1162
        Y: 817
        height: 36
        width: 336
        x: 1162
        y: 817

The dimensions of this block don’t look right, and neither do the X/Y coordinates. The original image is only 688x688, so how is IronOcr finding a bounding box that starts at (1162, 817)? If I box the original text in MS Paint, it is roughly 150x20, with its top-left corner somewhere around (495, 347).

Am I missing something obvious? Is there a scaling factor somewhere that I can use to translate back to my original image? All I’m trying to do is draw a bounding box around a regex-defined phrase (one of a pair of phrases, really) so that I can erase it from the original image.
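
To make the scaling question concrete, here is a minimal sketch that compares the reported coordinates with the hand-measured ones and maps a rectangle back to original-image pixels. It uses only System.Drawing.Rectangle and does not call IronOcr. The names OcrCoordinateCheck, ImpliedScale and ToOriginal are hypothetical helpers made up for this illustration. The uniform scale factor and the shared (0, 0) origin are assumptions, not documented IronOcr behaviour.

```csharp
using System;
using System.Drawing;

static class OcrCoordinateCheck
{
    // Ratio between a coordinate reported by the OCR result and the same
    // coordinate measured by hand in the original image.
    public static double ImpliedScale(double reported, double measured) =>
        reported / measured;

    // Maps a rectangle from OCR space back to original-image pixels.
    // ASSUMPTION: one uniform scale factor and a shared (0, 0) origin;
    // this is a guess drawn from the numbers, not a documented IronOcr rule.
    public static Rectangle ToOriginal(Rectangle ocr, double scale) =>
        new Rectangle(
            (int)Math.Round(ocr.X / scale),
            (int)Math.Round(ocr.Y / scale),
            (int)Math.Round(ocr.Width / scale),
            (int)Math.Round(ocr.Height / scale));

    static void Main()
    {
        // Values copied from res.Blocks[1].Location in the question.
        var reported = new Rectangle(1162, 817, 336, 36);

        // Hand-measured top-left corner of the text in the 688x688 image.
        double scaleX = ImpliedScale(reported.X, 495);
        double scaleY = ImpliedScale(reported.Y, 347);
        Console.WriteLine($"implied scale: x={scaleX:F2}, y={scaleY:F2}");

        // Average the two estimates and map the block back.
        double scale = (scaleX + scaleY) / 2;
        Console.WriteLine(ToOriginal(reported, scale));
    }
}
```

With the numbers above, both ratios come out near 2.35. That is at least consistent with the whole image being enlarged before recognition, and with the reported coordinates being relative to the full image rather than to ContentArea. Neither point is confirmed here, so the factor should be checked against a second known region before it is used to erase anything.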