In this article I will explain, with an example, how to read (extract) text from an image using the Tesseract OCR library in a Windows Forms (WinForms) Application using C# and VB.Net.
The process of reading or extracting text from images is known as Optical Character Recognition (OCR).
 
 
Installing and configuring Tesseract Library
Installing Tesseract Library
You will need to install the Tesseract package using the following command.
Install-Package Tesseract -Version 5.2.0
 
For more details on how to install a package from NuGet, please refer to my article, Install Nuget package in Visual Studio 2017, 2019, 2022.
 
Downloading and configuring Tesseract Data Files
You will need to download the Tesseract Data files from the following link.
Once downloaded, unzip it.
 
Then copy the unzipped folder to the project root folder and rename it to tessdata.
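Since a misplaced or wrongly named folder is an easy mistake at this step, a small check can confirm the data files are where the application expects them. The following is only a sketch: the TessdataCheck class and HasLanguageData method are hypothetical names, and it assumes the English data file is named eng.traineddata, following the language-code naming that Tesseract data files use.

```csharp
using System;
using System.IO;

public static class TessdataCheck
{
    // Hypothetical helper: returns true when the folder contains
    // the trained data file for the given language, e.g. eng.traineddata.
    public static bool HasLanguageData(string tessdataPath, string language)
    {
        string trainedData = Path.Combine(tessdataPath, language + ".traineddata");
        return File.Exists(trainedData);
    }
}
```

Calling TessdataCheck.HasLanguageData(tessdataPath, "eng") before initializing the engine makes it possible to show a clear message when the folder is missing or incomplete.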
 
 
Form Design
The Windows Form consists of a Button, a Label and an OpenFileDialog control.
Note: For more details on how to use OpenFileDialog, please refer to my article, Using OpenFileDialog in C# and VB.Net.
 
 
Namespaces
You will need to import the following namespaces.
C#
using System.IO;
using Tesseract;
 
VB.Net
Imports System.IO
Imports Tesseract
 
 
Reading Text from Image File using C# and VB.Net
Inside the Button Click event handler, the path of the selected file is read from the FileName property of the OpenFileDialog and passed to the ExtractTextFromImage method.
Inside the ExtractTextFromImage method, the Tesseract engine is first initialized with the tessdata folder path and the language (eng, i.e. English).
Then, the image is loaded from the path into a Tesseract Pix object and processed by the engine, which returns a Tesseract Page object from which the text is extracted.
Finally, the extracted text is assigned to the Label control.
C#
private void btnSelect_Click(object sender, EventArgs e)
{
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        string filePath = openFileDialog1.FileName;
        string extractText = this.ExtractTextFromImage(filePath);
        lblText.Text = extractText;
    }
}
 
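private string ExtractTextFromImage(string filePath)
{
    // The tessdata folder is in the project root, so \bin\Debug is stripped from the startup path.
    string tessdataPath = Application.StartupPath.Replace("\\bin\\Debug", "") + Path.DirectorySeparatorChar + "tessdata";

    // Initialize the engine with the tessdata folder and the English language data.
    using (TesseractEngine engine = new TesseractEngine(tessdataPath, "eng", EngineMode.Default))
    {
        // Load the image from disk.
        using (Pix pix = Pix.LoadFromFile(filePath))
        {
            // Run OCR on the image and return the recognized text.
            using (Tesseract.Page page = engine.Process(pix))
            {
                return page.GetText();
            }
        }
    }
}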
 
VB.Net
Private Sub btnSelect_Click(ByVal sender As Object, ByVal e As EventArgs) Handles btnSelect.Click
    If openFileDialog1.ShowDialog() = DialogResult.OK Then
        Dim filePath As String = openFileDialog1.FileName
        Dim extractText As String = Me.ExtractTextFromImage(filePath)
        lblText.Text = extractText
    End If
End Sub
 
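Private Function ExtractTextFromImage(ByVal filePath As String) As String
    ' The tessdata folder is in the project root, so \bin\Debug is stripped from the startup path.
    Dim tessdataPath As String = Application.StartupPath.Replace("\bin\Debug", "") & Path.DirectorySeparatorChar & "tessdata"

    ' Initialize the engine with the tessdata folder and the English language data.
    Using engine As TesseractEngine = New TesseractEngine(tessdataPath, "eng", EngineMode.Default)
        ' Load the image from disk.
        Using pix As Pix = Pix.LoadFromFile(filePath)
            ' Run OCR on the image and return the recognized text.
            Using page As Tesseract.Page = engine.Process(pix)
                Return page.GetText()
            End Using
        End Using
    End Using
End Function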
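 
The tessdata path in ExtractTextFromImage is built by removing \bin\Debug from Application.StartupPath, which only matches when the application runs from exactly that folder (a Release build, for example, would not match). As a hedged alternative, the sketch below walks up from a starting directory until it finds a tessdata sub-folder. The TessdataLocator class and FindTessdata method are hypothetical names introduced for illustration, not part of the Tesseract library.

```csharp
using System;
using System.IO;

public static class TessdataLocator
{
    // Hypothetical helper: walks up from startDirectory until a "tessdata"
    // sub-folder is found. Returns its full path, or null when neither the
    // start directory nor any of its parents contains one.
    public static string FindTessdata(string startDirectory)
    {
        DirectoryInfo current = new DirectoryInfo(startDirectory);
        while (current != null)
        {
            string candidate = Path.Combine(current.FullName, "tessdata");
            if (Directory.Exists(candidate))
            {
                return candidate;
            }

            // Move one level up; Parent is null at the drive root.
            current = current.Parent;
        }

        return null;
    }
}
```

With such a helper, the tessdataPath variable could be assigned from TessdataLocator.FindTessdata(Application.StartupPath). A null result means no tessdata folder was found and is worth reporting to the user before the engine is created.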
 
 
Screenshots
[Screenshot: image with some text]
 
[Screenshot: the extracted text]
 
 