Building a Receipt Scanner App With OCR, OpenAI, and PostgreSQL
In this article, you'll learn how to build a receipt scanner app using OCR for text extraction, OpenAI for processing, and PostgreSQL for efficient data storage.
Managing expenses and keeping track of receipts can be cumbersome. Digitizing receipts and extracting product information automatically can greatly improve efficiency. In this article, we’ll build a Receipt Scanner App where users can scan receipts with their phone, extract the text using OCR (Optical Character Recognition), process it with OpenAI to identify products and prices, store the data in PostgreSQL, and analyze product prices across different stores.
What Does the Receipt Scanner App Do?
This app allows users to:
- Scan receipts: Users can take pictures of their receipts with their phone.
- Extract text: The app will use OCR to recognize the text from the receipt images.
- Analyze product information: With OpenAI’s natural language processing capabilities, we can intelligently extract the product names and prices from the receipt text.
- Store data: The extracted data is stored in a PostgreSQL database.
- Track prices: Users can later retrieve price ranges for products across different stores, providing insights into spending patterns and price comparisons.
Tech Stack Overview
We'll be using the following technologies:
Frontend (Mobile)
- Expo - React Native: For the mobile app that captures receipt images and uploads them to the backend.
Backend
- Node.js with Express: For handling API requests and managing interactions between the frontend, Google Cloud Vision API, OpenAI, and PostgreSQL.
- Google Cloud Vision API: For Optical Character Recognition (OCR) to extract text from receipt images.
- OpenAI GPT-4: For processing and extracting meaningful information (product names, prices, etc.) from the raw receipt text.
- PostgreSQL: For storing receipt and product information in a structured way.
Step 1: Setting Up the Backend with Node.js and PostgreSQL
1. Install the Required Dependencies
Let’s start by setting up a Node.js project that will serve as the backend for processing and storing receipt data. Navigate to your project folder and run:
mkdir receipt-scanner-backend
cd receipt-scanner-backend
npm init -y
npm install express multer @google-cloud/vision openai pg body-parser cors dotenv
2. Set Up PostgreSQL
We need to create a PostgreSQL database that will store information about receipts and products.
Create two tables:
- receipts: Stores metadata about each receipt.
- products: Stores individual product data, including names, prices, and receipt reference.
CREATE TABLE receipts (
id SERIAL PRIMARY KEY,
store_name VARCHAR(255),
receipt_date DATE
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
product_name VARCHAR(255),
price DECIMAL(10, 2),
receipt_id INTEGER REFERENCES receipts(id)
);
3. Set Up Google Cloud Vision API
- Go to the Google Cloud Console, create a project, and enable the Cloud Vision API.
- Download your API credentials as a JSON file and save it in your backend project directory.
4. Set Up OpenAI API
- Create an account at OpenAI and obtain your API key.
- Store your OpenAI API key in a .env file like this:
OPENAI_API_KEY=your-openai-api-key-here
5. Write the Backend Logic
Google Vision API (vision.js)
This script will use the Google Cloud Vision API to extract text from the receipt image.
const vision = require('@google-cloud/vision');
const client = new vision.ImageAnnotatorClient({
keyFilename: 'path-to-your-google-vision-api-key.json',
});
async function extractTextFromImage(imagePath) {
const [result] = await client.textDetection(imagePath);
const detections = result.textAnnotations;
return detections[0]?.description || '';
}
module.exports = { extractTextFromImage };
OpenAI Text Processing (openaiService.js)
This service will use OpenAI GPT-4 to analyze the extracted text and identify products and their prices.
const OpenAI = require('openai');
// Uses the client interface from version 4 or later of the openai package
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
async function processReceiptText(text) {
  const prompt = `
You are an AI that extracts product names and prices from receipt text.
Here’s the receipt data:
"${text}"
Return the data as a JSON array of products with their prices, like this:
[{"name": "Product1", "price": 9.99}, {"name": "Product2", "price": 4.50}]
Return only the JSON array, with no extra text.
`;
  // GPT-4 is a chat model, so it is called through the chat completions endpoint
  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 500,
  });
  return response.choices[0].message.content.trim();
}
module.exports = { processReceiptText };
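The prompt asks for a JSON array, but the model's reply arrives as a plain string and may be wrapped in a Markdown code fence or contain incomplete entries. Below is a minimal sketch of a defensive parser that could run before the data is stored. The name parseProducts and the validation rules are assumptions for illustration, not part of the original service.

```javascript
// Hypothetical helper (not in the original article): convert the model's raw
// reply into a validated array of { name, price } objects.
function parseProducts(raw) {
  // Strip an optional Markdown code fence around the JSON (\x60 is a backtick)
  const cleaned = String(raw)
    .trim()
    .replace(/^\x60{3}(?:json)?\s*/i, '')
    .replace(/\s*\x60{3}$/, '');

  let data;
  try {
    data = JSON.parse(cleaned);
  } catch (err) {
    throw new Error('Model reply is not valid JSON');
  }
  if (!Array.isArray(data)) {
    throw new Error('Expected a JSON array of products');
  }

  // Keep only entries with a non-empty name and a finite numeric price
  return data
    .map((item) => ({
      name: typeof item?.name === 'string' ? item.name.trim() : '',
      price: typeof item?.price === 'number' ? item.price : Number.parseFloat(item?.price),
    }))
    .filter((p) => p.name !== '' && Number.isFinite(p.price));
}
```

With a helper like this, the upload route could call parseProducts(processedData) instead of JSON.parse(processedData), so a malformed reply fails with a clear error instead of inserting bad rows.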
Setting Up Express (app.js)
Now, we’ll integrate the OCR and AI processing in our Express server. This server will handle image uploads, extract text using Google Vision API, process the text with OpenAI, and store the results in PostgreSQL.
require('dotenv').config();
const express = require('express');
const multer = require('multer');
const { Pool } = require('pg');
const { extractTextFromImage } = require('./vision');
const { processReceiptText } = require('./openaiService');
const app = express();
app.use(express.json());
const pool = new Pool({
user: 'your-db-user',
host: 'localhost',
database: 'your-db-name',
password: 'your-db-password',
port: 5432,
});
const upload = multer({ dest: 'uploads/' });
app.get('/product-price-range/:productName', async (req, res) => {
const { productName } = req.params;
try {
// Query to get product details, prices, and store names
const productDetails = await pool.query(
`SELECT p.product_name, p.price, r.store_name, r.receipt_date
FROM products p
JOIN receipts r ON p.receipt_id = r.id
WHERE p.product_name ILIKE $1
ORDER BY p.price ASC`,
[`%${productName}%`]
);
if (productDetails.rows.length === 0) {
return res.status(404).json({ message: 'Product not found' });
}
res.json(productDetails.rows);
} catch (error) {
console.error(error);
res.status(500).json({ error: 'Failed to retrieve product details.' });
}
});
app.post('/upload-receipt', upload.single('receipt'), async (req, res) => {
try {
const imagePath = req.file.path;
const extractedText = await extractTextFromImage(imagePath);
const processedData = await processReceiptText(extractedText);
const products = JSON.parse(processedData);
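// 'StoreName' and the current date are placeholders; this version does not
// extract the store name or purchase date from the receipt text
const receiptResult = await pool.query(
'INSERT INTO receipts (store_name, receipt_date) VALUES ($1, $2) RETURNING id',
['StoreName', new Date()]
);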
const receiptId = receiptResult.rows[0].id;
for (const product of products) {
await pool.query(
'INSERT INTO products (product_name, price, receipt_id) VALUES ($1, $2, $3)',
[product.name, product.price, receiptId]
);
}
res.json({ message: 'Receipt processed and stored successfully.' });
} catch (error) {
console.error(error);
res.status(500).json({ error: 'Failed to process receipt.' });
}
});
app.listen(5000, () => {
console.log('Server running on port 5000');
});
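The upload route inserts the receipt row and then each product row as separate queries, so a failure partway through can leave a receipt with only some of its products. One possible refinement is to wrap the inserts in a transaction. The sketch below assumes a client obtained from pool.connect(); the function name saveReceipt is hypothetical and not part of the original app.js.

```javascript
// Sketch: save a receipt and its products atomically.
// `client` is assumed to be a connected pg client (e.g. `await pool.connect()`).
// `saveReceipt` is a hypothetical helper name, not part of the original app.js.
async function saveReceipt(client, storeName, receiptDate, products) {
  try {
    await client.query('BEGIN');
    const receiptResult = await client.query(
      'INSERT INTO receipts (store_name, receipt_date) VALUES ($1, $2) RETURNING id',
      [storeName, receiptDate]
    );
    const receiptId = receiptResult.rows[0].id;
    for (const product of products) {
      await client.query(
        'INSERT INTO products (product_name, price, receipt_id) VALUES ($1, $2, $3)',
        [product.name, product.price, receiptId]
      );
    }
    await client.query('COMMIT');
    return receiptId;
  } catch (err) {
    // Undo the partial insert so no incomplete receipt is left behind
    await client.query('ROLLBACK');
    throw err;
  }
}
```

In the route handler, the client would be acquired with pool.connect() before the call and returned to the pool with client.release() in a finally block.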
Step 2: Building the React Native Frontend
Now that our backend is ready, we’ll build the React Native app for capturing and uploading receipts.
1. Install React Native and Required Libraries
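npx create-expo-app receipt-scanner-app --template blank
cd receipt-scanner-app
npm install axios @react-navigation/native @react-navigation/stack
npx expo install expo-image-picker react-native-screens react-native-safe-area-context react-native-gesture-handler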
2. Create the Receipt Scanner Component
This component will allow users to capture an image of a receipt and upload it to the backend for processing.
App.js
import React from 'react';
import { NavigationContainer } from '@react-navigation/native';
import { createStackNavigator } from '@react-navigation/stack';
import ProductPriceSearch from './ProductPriceSearch'; // Import the product price search screen
import ReceiptUpload from './ReceiptUpload'; // Import the receipt upload screen
const Stack = createStackNavigator();
export default function App() {
return (
<NavigationContainer>
<Stack.Navigator initialRouteName="ReceiptUpload">
<Stack.Screen name="ReceiptUpload" component={ReceiptUpload} />
<Stack.Screen name="ProductPriceSearch" component={ProductPriceSearch} />
</Stack.Navigator>
</NavigationContainer>
);
}
ProductPriceSearch.js
import React, { useState } from 'react';
import { View, Text, TextInput, Button, FlatList, StyleSheet } from 'react-native';
import axios from 'axios';
const ProductPriceSearch = () => {
const [productName, setProductName] = useState('');
const [productDetails, setProductDetails] = useState([]);
const [message, setMessage] = useState('');
// Function to search for a product and retrieve its details
const handleSearch = async () => {
try {
// On a physical device, replace localhost with your computer's LAN IP address
const response = await axios.get(`http://localhost:5000/product-price-range/${encodeURIComponent(productName)}`);
setProductDetails(response.data);
setMessage('');
} catch (error) {
console.error(error);
setMessage('Product not found or error retrieving data.');
setProductDetails([]); // Clear previous search results if there was an error
}
};
const renderProductItem = ({ item }) => (
<View style={styles.item}>
<Text style={styles.productName}>Product: {item.product_name}</Text>
<Text style={styles.storeName}>Store: {item.store_name}</Text>
<Text style={styles.price}>Price: ${item.price}</Text>
</View>
);
return (
<View style={styles.container}>
<Text style={styles.title}>Search Product Price by Store</Text>
<TextInput
style={styles.input}
placeholder="Enter product name"
value={productName}
onChangeText={setProductName}
/>
<Button title="Search" onPress={handleSearch} />
{message ? <Text style={styles.error}>{message}</Text> : null}
<FlatList
data={productDetails}
keyExtractor={(item, index) => index.toString()}
renderItem={renderProductItem}
style={styles.list}
/>
</View>
);
};
const styles = StyleSheet.create({
container: {
flex: 1,
justifyContent: 'center',
padding: 20,
},
title: {
fontSize: 24,
textAlign: 'center',
marginBottom: 20,
},
input: {
height: 40,
borderColor: '#ccc',
borderWidth: 1,
padding: 10,
marginBottom: 20,
},
list: {
marginTop: 20,
},
item: {
padding: 10,
backgroundColor: '#f9f9f9',
borderBottomWidth: 1,
borderBottomColor: '#eee',
marginBottom: 10,
},
productName: {
fontSize: 18,
fontWeight: 'bold',
},
storeName: {
fontSize: 16,
marginTop: 5,
},
price: {
fontSize: 16,
color: 'green',
marginTop: 5,
},
error: {
color: 'red',
marginTop: 10,
textAlign: 'center',
},
});
export default ProductPriceSearch;
ReceiptUpload.js
import React, { useState } from 'react';
import { View, Button, Image, Text, StyleSheet } from 'react-native';
import * as ImagePicker from 'expo-image-picker';
import axios from 'axios';
const ReceiptUpload = () => {
const [receiptImage, setReceiptImage] = useState(null);
const [message, setMessage] = useState('');
// Function to open the camera and capture a receipt image
const captureReceipt = async () => {
const permissionResult = await ImagePicker.requestCameraPermissionsAsync();
if (permissionResult.granted === false) {
alert('Permission to access camera is required!');
return;
}
const result = await ImagePicker.launchCameraAsync();
// Recent expo-image-picker versions return `canceled` and an `assets` array
if (!result.canceled) {
setReceiptImage(result.assets[0].uri);
}
};
// Function to upload the receipt image to the backend
const handleUpload = async () => {
if (!receiptImage) {
alert('Please capture a receipt image first!');
return;
}
const formData = new FormData();
formData.append('receipt', {
uri: receiptImage,
type: 'image/jpeg',
name: 'receipt.jpg',
});
try {
// On a physical device, replace localhost with your computer's LAN IP address
const response = await axios.post('http://localhost:5000/upload-receipt', formData, {
headers: { 'Content-Type': 'multipart/form-data' },
});
setMessage(response.data.message);
} catch (error) {
console.error(error);
setMessage('Failed to upload receipt.');
}
};
return (
<View style={styles.container}>
<Text style={styles.title}>Upload Receipt</Text>
<Button title="Capture Receipt" onPress={captureReceipt} />
{receiptImage && (
<Image source={{ uri: receiptImage }} style={styles.receiptImage} />
)}
<Button title="Upload Receipt" onPress={handleUpload} />
{message ? <Text style={styles.message}>{message}</Text> : null}
</View>
);
};
const styles = StyleSheet.create({
container: {
flex: 1,
justifyContent: 'center',
padding: 20,
},
title: {
fontSize: 24,
textAlign: 'center',
marginBottom: 20,
},
receiptImage: {
width: 300,
height: 300,
marginTop: 20,
marginBottom: 20,
},
message: {
marginTop: 20,
textAlign: 'center',
color: 'green',
},
});
export default ReceiptUpload;
Explanation
- expo-image-picker is used to request permission to access the device's camera and to capture an image of the receipt.
- The captured image is displayed on the screen and then uploaded to the backend using axios.
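Both screens call the backend at http://localhost:5000. When the app runs in Expo Go on a physical phone, localhost refers to the phone itself, so requests will not reach a backend running on your computer. A minimal sketch of one way to keep the address in a single place is shown below. The names API_BASE_URL, productPriceUrl, and uploadReceiptUrl are hypothetical, and the IP address is a placeholder for your own machine's LAN address.

```javascript
// Hypothetical config helpers (e.g. in an api.js file); not part of the original app.
// The IP below is a placeholder: use your computer's LAN address.
const API_BASE_URL = 'http://192.168.1.10:5000';

function productPriceUrl(productName, baseUrl = API_BASE_URL) {
  // Encode the name so spaces, slashes, and "%" stay inside one path segment
  return `${baseUrl}/product-price-range/${encodeURIComponent(productName.trim())}`;
}

function uploadReceiptUrl(baseUrl = API_BASE_URL) {
  return `${baseUrl}/upload-receipt`;
}
```

The screens could then pass productPriceUrl(productName) to axios.get and uploadReceiptUrl() to axios.post, so switching between an emulator and a real device means changing one constant.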
3. Running the App
To run the app:
- Start the Expo development server:
npx expo start
- Scan the QR code using the Expo Go app on your phone. The app will load, allowing you to capture and upload receipts.
Step 3: Running the Application
Start the Backend
Run the backend on port 5000:
node app.js
Run the React Native App
Open the iOS or Android emulator and start the app from the project folder:
cd receipt-scanner-app
npx expo start
Once the app is running:
- Capture a receipt image.
- Upload the receipt to the backend.
- The backend will extract the text, process it with OpenAI, and store the data in PostgreSQL.
Step 4: Next Steps
Enhancements
- Authentication: Implement user authentication so that users can manage their personal receipts and data.
- Price comparison: Provide analytics and price comparison across different stores for the same product.
- Improve parsing: Enhance the receipt parsing logic to handle more complex receipt formats with OpenAI.
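As a starting point for the price comparison enhancement, the rows returned by the /product-price-range endpoint can be grouped into a price range per store. The sketch below is illustrative only: priceRangeByStore is a hypothetical helper, and it assumes rows shaped like the endpoint's response. Note that node-postgres returns DECIMAL columns as strings by default, so the prices are converted to numbers first.

```javascript
// Hypothetical helper: summarize rows shaped like the /product-price-range
// response ({ product_name, price, store_name, receipt_date }) per store.
function priceRangeByStore(rows) {
  const byStore = new Map();
  for (const row of rows) {
    if (row.price == null) continue; // skip rows without a price
    // node-postgres returns DECIMAL/NUMERIC values as strings by default
    const price = Number(row.price);
    if (!Number.isFinite(price)) continue;
    const entry = byStore.get(row.store_name) ??
      { store: row.store_name, min: price, max: price, count: 0 };
    entry.min = Math.min(entry.min, price);
    entry.max = Math.max(entry.max, price);
    entry.count += 1;
    byStore.set(row.store_name, entry);
  }
  // Store with the lowest observed price first
  return [...byStore.values()].sort((a, b) => a.min - b.min);
}
```

The same summary could also be computed in SQL with MIN, MAX, and GROUP BY; doing it in JavaScript keeps the existing endpoint unchanged.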
Conclusion
We built a Receipt Scanner App from scratch using:
- Expo - React Native for the frontend.
- Node.js, Google Cloud Vision API, and OpenAI for text extraction and data processing.
- PostgreSQL for storing and querying receipt data.
The Receipt Scanner App we built provides users with a powerful tool to manage their receipts and gain valuable insights into their spending habits. By leveraging AI-powered text extraction and analysis, the app automates the process of capturing, extracting, and storing receipt data, saving users from the hassle of manual entry. This app allows users to:
- Easily scan receipts: Using their mobile phone, users can capture receipts quickly and effortlessly without needing to manually input data.
- Track spending automatically: Extracting product names, prices, and other details from receipts helps users keep a detailed log of their purchases, making expense tracking seamless.
- Compare product prices: The app can provide price ranges for products across different stores, empowering users to make smarter shopping decisions and find the best deals.
- Organize receipts efficiently: By storing receipts in a structured database, users can easily access and manage their purchase history. This is particularly useful for budgeting, tax purposes, or warranty claims.
Overall, the Receipt Scanner App is a valuable tool for anyone looking to streamline receipt management, track spending patterns, and make data-driven decisions when shopping. With AI-powered text processing, automatic product identification, and price comparison, users get a more organized and efficient way of managing their personal finances and shopping habits.
By automating these tasks, the app frees up time and reduces errors, allowing users to focus on more important things. Whether you're tracking business expenses, managing household finances, or simply looking for the best deals on products, this app simplifies the process and adds value to everyday tasks.