
Invoice Data Extractor
Discover how an Invoice Data Extractor OCR Flow can streamline your financial processes by automating the extraction and organization of invoice data. Learn abo...
Learn how to automate invoice data extraction using AI-based OCR and Python with FlowHunt’s API, enabling fast, accurate, and scalable document processing.
AI-driven OCR goes beyond the capabilities of traditional OCR in that it uses artificial intelligence to understand context, handle numerous layout varieties, and produce high-quality structured data extraction out of even the most complex documents. While traditional OCR is designed to pick up text from a fixed format, AI OCR can handle many types of layouts and configurations common in invoices and other business documents.
Invoices have to be processed efficiently and with a high degree of accuracy, whether it is related to the accounting, logistics, or procurement department. AI OCR automates data extraction and smooths out workflows, improving data accuracy.
Most conventional companies extract data from invoices manually by using employees for these tasks. This is a very time-consuming and costly operation which can be automated in many different fields and companies, such as tax, legal, and finance companies, and more.
This process takes 5 to 15 seconds and costs 0.01 – 0.02 credits, where you normally would have to pay $15 – $30 per hour for an employee to do the same task.
Processor | Cost per Year | Invoices Processed per Year | Cost per Invoice |
---|---|---|---|
Human | $30,000 | 12,000 | $2.50 |
FlowHunt | $162 | 12,000 | $0.013 |
FlowHunt (at $30,000) | $30,000 | 2,250,000 | $0.0133 |
I would say FlowHunt is more efficient by a huge margin.
While OCR is highly beneficial, it comes with some challenges:
To tackle these challenges, it’s essential to use a powerful and flexible OCR tool. FlowHunt’s API provides a robust OCR solution capable of handling complex document structures, making it ideal for large-scale OCR projects.
To automate the process, you’ll need to install the following Python libraries:
pip install requests pdf2image git+https://github.com/QualityUnit/flowhunt-python-sdk.git
This installs:
This code will take a PDF, convert it into images, send each image to FlowHunt for OCR processing, and save the output in CSV format.
Import Libraries
import json
import os
import re
import time
import requests
import flowhunt
from flowhunt.rest import ApiException
from pprint import pprint
from pdf2image import convert_from_path
json
, os
, re
, and time
help with JSON handling, file management, regular expressions, and time intervals.requests
: Used to handle HTTP requests, like downloading the OCR results.flowhunt
: FlowHunt’s SDK handles authentication and communication with the OCR API.pdf2image
: Converts PDF pages to images, enabling individual page OCR.Function to Convert PDF Pages to Images
def convert_pdf_to_image(path: str) -> None:
"""
Convert a PDF file to images, storing each page as a JPEG.
"""
images = convert_from_path(path)
for i in range(len(images)):
images[i].save('data/images/' + 'page' + str(i) + '.jpg', 'JPEG')
convert_from_path
: Converts each PDF page to an image.images[i].save
: Saves each page as an individual JPEG for OCR processing.Extracting the Output Attachment URL
def extract_attachment_url(data_string):
pattern = r'```flowhunt\n({.*})\n```'
match = re.search(pattern, data_string, re.DOTALL)
if match:
json_string = match.group(1)
try:
json_data = json.loads(json_string)
return json_data.get('download_link', None)
except json.JSONDecodeError:
print("Error: Failed to decode JSON.")
return None
return None
API Configuration and Authentication
convert_pdf_to_image("data/test.pdf")
FLOW_ID = "<FLOW_ID_HERE>"
configuration = flowhunt.Configuration(
host="https://api.flowhunt.io",
api_key={"APIKeyHeader": "<API_KEY_HERE>"}
)
Initializing the API Client
with flowhunt.ApiClient(configuration) as api_client:
auth_api = flowhunt.AuthApi(api_client)
api_response = auth_api.get_user()
workspace_id = api_response.api_key_workspace_id
workspace_id
for subsequent API calls.Starting a Flow Session
flows_api = flowhunt.FlowsApi(api_client)
from_flow_create_session_req = flowhunt.FlowSessionCreateFromFlowRequest(flow_id=FLOW_ID)
create_session_rsp = flows_api.create_flow_session(workspace_id, from_flow_create_session_req)
Uploading Images for OCR Processing
for image in os.listdir("data/images"):
image_name, image_extension = os.path.splitext(image)
with open("data/images/" + image, "rb") as file:
try:
flow_sess_attachment = flows_api.upload_attachments(
create_session_rsp.session_id,
file.read()
)
Invoking OCR Processing and Polling for Results
invoke_rsp = flows_api.invoke_flow_response(
create_session_rsp.session_id,
flowhunt.FlowSessionInvokeRequest(message="")
)
while True:
get_flow_rsp = flows_api.poll_flow_response(
create_session_rsp.session_id, invoke_rsp.message_id
)
print("Flow response: ", get_flow_rsp)
if get_flow_rsp.response_status == "S":
print("done OCR")
break
time.sleep(3)
Downloading and Saving OCR Output
attachment_url = extract_attachment_url(get_flow_rsp.final_response[0])
if attachment_url:
response = requests.get(attachment_url)
with open("data/results/" + image_name + ".csv", "wb") as file:
file.write(response.content)
To execute this script:
data/
folder.<FLOW_ID_HERE>
and <API_KEY_HERE>
with your FlowHunt credentials.This Python script offers an efficient solution for scaling OCR processes, ideal for industries with high document processing demands. With FlowHunt’s API, this solution handles document-to-CSV conversion, streamlining workflows and boosting productivity.
Click HERE for the Gist version.
import json
import os
import re
import time
import requests
import flowhunt
from flowhunt.rest import ApiException
from pprint import pprint
from pdf2image import convert_from_path
def convert_pdf_to_image(path: str) -> None:
"""
Convert a pdf file to an image
:return:
"""
images = convert_from_path(path)
for i in range(len(images)):
images[i].save('data/images/' + 'page'+ str(i) +'.jpg', 'JPEG')
def extract_attachment_url(data_string):
pattern = r'```flowhunt\n({.*})\n```'
match = re.search(pattern, data_string, re.DOTALL)
if match:
json_string = match.group(1)
try:
json_data = json.loads(json_string)
return json_data.get('download_link', None)
except json.JSONDecodeError:
print("Error: Failed to decode JSON.")
return None
return None
convert_pdf_to_image("data/test.pdf")
FLOW_ID = "<FLOW_ID_HERE>"
configuration = flowhunt.Configuration(host = "https://api.flowhunt.io",
api_key = {"APIKeyHeader": "<API_KEY_HERE>"})
with flowhunt.ApiClient(configuration) as api_client:
auth_api = flowhunt.AuthApi(api_client)
api_response = auth_api.get_user()
workspace_id = api_response.api_key_workspace_id
flows_api = flowhunt.FlowsApi(api_client)
from_flow_create_session_req = flowhunt.FlowSessionCreateFromFlowRequest(
flow_id=FLOW_ID
)
create_session_rsp = flows_api.create_flow_session(workspace_id, from_flow_create_session_req)
for image in os.listdir("data/images"):
image_name, image_extension = os.path.splitext(image)
with open("data/images/" + image, "rb") as file:
try:
flow_sess_attachment = flows_api.upload_attachments(
create_session_rsp.session_id,
file.read()
)
invoke_rsp = flows_api.invoke_flow_response(create_session_rsp.session_id, flowhunt.FlowSessionInvokeRequest(
message="",
))
while True:
get_flow_rsp = flows_api.poll_flow_response(create_session_rsp.session_id, invoke_rsp.message_id)
print("Flow response: ", get_flow_rsp)
if get_flow_rsp.response_status == "S":
print("done OCR")
attachment_url = extract_attachment_url(get_flow_rsp.final_response[0])
if attachment_url:
print("Attachment URL: ", attachment_url, "\n Downloading the file...")
response = requests.get(attachment_url)
with open("data/results/" + image_name + ".csv", "wb") as file:
file.write(response.content)
break
time.sleep(3)
except ApiException as e:
print("error for file ", image)
print(e)
AI-based OCR leverages machine learning and NLP to understand document context, handle complex layouts, and extract structured data from invoices, unlike traditional OCR which relies on fixed-format text recognition.
AI OCR delivers speed, accuracy, scalability, and structured outputs, reducing manual work, minimizing errors, and enabling seamless integration with business systems.
By using FlowHunt’s Python SDK, you can convert PDFs to images, send them to FlowHunt’s API for OCR, and retrieve structured data in CSV format, automating the entire extraction process.
Common challenges include poor image quality, complex document layouts, and varied languages. FlowHunt’s API is designed to handle these with advanced AI models and flexible processing capabilities.
FlowHunt’s AI OCR can process invoices in seconds at a fraction of human cost, delivering massive efficiency gains and scalability for growing businesses.
Arshia is an AI Workflow Engineer at FlowHunt. With a background in computer science and a passion for AI, he specializes in creating efficient workflows that integrate AI tools into everyday tasks, enhancing productivity and creativity.
Automate invoice data extraction with FlowHunt’s robust AI OCR. Save time, reduce errors, and streamline your workflows by converting PDFs to structured data in seconds.
Discover how an Invoice Data Extractor OCR Flow can streamline your financial processes by automating the extraction and organization of invoice data. Learn abo...
Discover how AI-powered OCR is transforming data extraction, automating document processing, and driving efficiency in industries like finance, healthcare, and ...
Optical Character Recognition (OCR) is a transformative technology that converts documents such as scanned papers, PDFs, or images into editable and searchable ...