Create your Info Smart Extractor using AI

This tutorial explains how to create a Smart Extractor in Cogniflow to extract specific text from images, plain text, or PDF documents.

From the dashboard view, click “New Project”.
For this tutorial, select "Image and PDF" to extract from images or PDF documents. Choose "Plain Text" instead if you plan to integrate Cogniflow with services that typically use text as input, such as Gmail, X, or WhatsApp.

Select Smart Extractor project

In this step, you specify the entities you want to extract. For this example, let’s say we want to extract information from a receipt that you have photographed with your cell phone. Click "Add manually".

Define the entities you want to extract

Define entity

Name: This is the entity or field name you want to extract, and it will be used in the output as the key identifier.
Description: This is optional, but extra instructions can help the model identify an entity that is not a common type such as a date, number, or currency.
Output format: You can use this to convert an extracted value to a specific format. For example, for dates, you can use “MM/DD/YYYY” or “MM-DD-YYYY”.
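
To make the effect of an output format concrete, here is a minimal Python sketch of how a pattern such as “MM/DD/YYYY” could be applied to a date. It only illustrates the idea and is not how Cogniflow implements the conversion; the function name format_date and the token table are assumptions made for this example.

```python
from datetime import date

# Hypothetical token table (an assumption for this sketch): maps the
# pattern tokens shown in the tutorial to Python strftime directives.
_TOKENS = {"YYYY": "%Y", "MM": "%m", "DD": "%d"}


def format_date(value: date, pattern: str) -> str:
    """Render a date using a pattern such as 'MM/DD/YYYY'."""
    fmt = pattern
    for token, directive in _TOKENS.items():
        # Swap each pattern token for its strftime equivalent.
        fmt = fmt.replace(token, directive)
    return value.strftime(fmt)


# The same date rendered with the two patterns mentioned above.
slash_style = format_date(date(2024, 3, 7), "MM/DD/YYYY")
dash_style = format_date(date(2024, 3, 7), "MM-DD-YYYY")
```

The separator is taken from the pattern itself, so switching from slashes to dashes only requires changing the output format, not the entity.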

In this example, we added receipt_date and total_amount entities, as you can see in the image below:

Initial entities definition
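
If it helps to think of the entity definitions as data, the sketch below models the three fields (name, description, output format) in Python and checks that a result contains one key per entity name. The Entity class, the missing_entities helper, the sample description, and the sample values are all made up for illustration; the actual structure of a Cogniflow result is not shown in this tutorial.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical model of one entity as defined in the form above."""
    name: str                 # used as the key identifier in the output
    description: str = ""     # optional extra instructions
    output_format: str = ""   # optional, e.g. "MM/DD/YYYY" for dates


# The two entities from this example; the description text is invented.
entities = [
    Entity(name="receipt_date", output_format="MM/DD/YYYY"),
    Entity(name="total_amount", description="Final amount paid, including taxes"),
]


def missing_entities(entities, result):
    """Return the names of entities that have no key in the result."""
    return [e.name for e in entities if e.name not in result]


# Illustrative result only: the values are placeholders, not real output.
example_result = {"receipt_date": "03/07/2024", "total_amount": "42.50"}
not_found = missing_entities(entities, example_result)
```

A check like this can be useful when you integrate the extractor with another service and want to confirm that every expected field came back before using it.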

Once the experiment is created, click the “Use this model” button.
Upload an image of a receipt to test the model. Suppose you forgot to add the name of the seller, so let's add that entity as well. You can use "Edit entities" or the "Settings" tab.

Edit entities from test page