Ranorex Studio primarily relies on UI element recognition to interact with applications. However, some applications render text as images or graphics, making it difficult to capture with standard object recognition. In these cases, you can use Tesseract OCR (optical character recognition) to extract and validate text from screenshots.
This guide explains how to integrate Tesseract OCR with your Ranorex tests.
Install Tesseract packages
- In the Projects view, right-click your solution.
- Click Manage packages…
- In the search field of the window that appears, enter "Tesseract" and click Add to install the package.
- Then, enter "IronOCR" and click Add to install the package.
- Check if the TessData folder has been added to your project.
- If it’s missing, copy it from another solution and paste it into the project folder.
- You can download a sample project that includes the folder: OCRDesktop
- Under the Installed tab in the Manage Packages window, verify that both the Tesseract and IronOCR packages are listed.
Using Tesseract OCR in Ranorex Studio
You can implement OCR either as a user code action (for single use) or a user code collection (to reuse across multiple recording modules).
Example
// Initialize the Tesseract engine.
// Relative paths resolve from the '\bin\Debug\' output folder by default,
// so '..\..\' points back to the project folder that contains 'tessdata'.
var engine = new TesseractEngine(@"..\..\tessdata", "eng", EngineMode.Default);
engine.DefaultPageSegMode = PageSegMode.SingleLine;
// Capture a screenshot of the target element
var element = Host.Local.FindSingle<Cell>(cell.GetPath());
Bitmap img_1 = Imaging.CaptureImage(element);
Report.Screenshot(element);
// Process the image and extract its text
using (Page page = engine.Process(img_1))
{
    string strOCRText = page.GetText();
    Report.Info("OCR Text: " + strOCRText);
    return strOCRText;
}
If results are unclear, you can improve accuracy by:
- Upscaling the bitmap image.
- Converting to grayscale.
- Adjusting the DefaultPageSegMode depending on the type of text expected.
- Using a character whitelist, for example:
engine.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
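The first two suggestions in the list above (upscaling and grayscale conversion) can be sketched with the standard System.Drawing classes. This is only an illustration: the class name OcrPreprocessing, the method name UpscaleToGrayscale, and the scale factor are hypothetical and not part of Ranorex or Tesseract, and the best scale depends on the size of the text in your screenshots.

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;

// Hypothetical helper class, not part of Ranorex or Tesseract.
public static class OcrPreprocessing
{
    // Returns a grayscale copy of 'source', enlarged by the integer factor 'scale'.
    public static Bitmap UpscaleToGrayscale(Bitmap source, int scale)
    {
        var result = new Bitmap(source.Width * scale, source.Height * scale,
                                PixelFormat.Format24bppRgb);

        // Color matrix that writes the same weighted sum of R, G, and B
        // (common luminance weights) into all three output channels.
        var grayMatrix = new ColorMatrix(new float[][]
        {
            new float[] { 0.299f, 0.299f, 0.299f, 0, 0 },
            new float[] { 0.587f, 0.587f, 0.587f, 0, 0 },
            new float[] { 0.114f, 0.114f, 0.114f, 0, 0 },
            new float[] { 0, 0, 0, 1, 0 },
            new float[] { 0, 0, 0, 0, 1 }
        });

        using (var graphics = Graphics.FromImage(result))
        using (var attributes = new ImageAttributes())
        {
            attributes.SetColorMatrix(grayMatrix);
            // Bicubic interpolation keeps character edges smoother when enlarging.
            graphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
            graphics.DrawImage(
                source,
                new Rectangle(0, 0, result.Width, result.Height),
                0, 0, source.Width, source.Height,
                GraphicsUnit.Pixel,
                attributes);
        }

        return result;
    }
}
```

Assuming the variables from the example earlier in this guide, you could then pass the prepared image to the engine instead of the raw capture, for instance Bitmap prepared = OcrPreprocessing.UpscaleToGrayscale(img_1, 3); followed by engine.Process(prepared). A factor of 2 or 3 is a reasonable starting point to experiment with.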