PostgresML: Extension That Turns PostgreSQL Into a Platform for AI Apps
Learn how to use the PostgresML extension for text translation, sentiment analysis, and other AI-related tasks within PostgreSQL.
Join the DZone community and get the full member experience.
Join For FreePostgresML is an extension of the PostgreSQL ecosystem that allows the training, fine-tuning, and use of various machine learning and large language models within the database. This extension turns PostgreSQL into a complete MLOps platform, supporting various natural language processing tasks and expanding Postgres's capabilities as a vector database.
The extension complements pgvector, another foundational extension for apps wishing to use Postgres as a vector database for AI use cases. With pgvector, applications can easily store and work with embeddings generated by large language models (LLMs). PostgresML takes it further by enabling the training and execution of models within the database.
Let's look at the PostgresML extension in action by using PostgreSQL for language translation tasks and user sentiment analysis.
Enable PostgresML
The easiest way to start with PostgresML is by deploying a database instance with the pre-installed extension in Docker.
Use the following command to launch PostgreSQL with PostgresML in a container and open a database session with the psql tool:
docker run \
-it \
-v postgresml_data:/var/lib/postgresql \
-p 5432:5432 \
-p 8000:8000 \
ghcr.io/postgresml/postgresml:2.7.12 \
sudo -u postgresml psql -d postgresml
Once the container has started and the psql session is open, check that the pgml extension (short for PostgresML) is on the extensions list:
select * from pg_extension;
oid | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-------+---------+----------+--------------+----------------+------------+-----------+--------------
13540 | plpgsql | 10 | 11 | f | 1.0 | |
16388 | pgml | 16385 | 16387 | f | 2.7.12 | |
(2 rows)
Finally, if you run the \d
command, you'll see a list of database objects used internally by PostgresML.
\d
List of relations
Schema | Name | Type | Owner
--------+-----------------------+----------+------------
pgml | deployed_models | view | postgresml
pgml | deployments | table | postgresml
pgml | deployments_id_seq | sequence | postgresml
pgml | files | table | postgresml
pgml | files_id_seq | sequence | postgresml
pgml | models | table | postgresml
pgml | models_id_seq | sequence | postgresml
...truncated
Text Translation With PostgresML
PostgresML integrates with Hugging Face Transformers to enable the latest natural language processing (NLP) models in PostgreSQL. Hugging Face features thousands of pre-trained models that can be used for tasks like sentiment analysis, text classification, summarization, translation, question answering, and more.
For instance, suppose you store a product catalog in PostgreSQL, with all the product descriptions in English. Now, you need to display these descriptions in French for customers visiting your e-commerce website from France.
What if someone gets interested in Apple's AirTag? PostgresML can facilitate the translation from English to French using one of the translation transformers:
SELECT pgml.transform(
'translation_en_to_fr',
inputs => ARRAY[
'AirTag is a supereasy way to keep track of your stuff.
Attach one to your keys, slip another in your backpack.
And just like that, they’re on your radar in the Find My app,
where you can also track down your Apple devices and keep up with
friends and family.'
]
) AS french;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
french | [{"translation_text": "AirTag est un moyen super facile de suivre vos objets. Attachez-leur à vos clés, glissez-leur dans votre sac à dos. Et comme ça, ils sont sur votre radar dans l’app Find My, où vous pouvez aussi retrouver vos appareils Apple et suivre vos amis et votre famille."}]
translation_en_to_fr
- the name of a pre-configured transformer utilizing one of the models from Hugging Face.inputs
- an array of text that needs translation.
If the e-commerce website also caters to Spanish-speaking countries, then product descriptions can be translated into Spanish using a different model:
select pgml.transform(
task => '{"task": "translation",
"model": "Helsinki-NLP/opus-mt-en-es"
}'::JSONB,
inputs => ARRAY[
'AirTag is a supereasy way to keep track of your stuff.
Attach one to your keys, slip another in your backpack.
And just like that, they’re on your radar in the Find My app,
where you can also track down your Apple devices and keep up with
friends and family.'
]
) as spanish;
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
spanish | [{"translation_text": "AirTag es una manera superfácil de hacer un seguimiento de tus cosas. Conecta una a tus llaves, desliza otra en tu mochila. Y así mismo, están en tu radar en la aplicación Find My, donde también puedes rastrear tus dispositivos Apple y mantenerte al día con tus amigos y familiares."}]
task
- a custom task for translation using one of Helsinki-NLP's models. You can choose from thousands of models available on the Hugging Face hub.
Overall, PostgresML can improve user experience by returning text that has already been translated back to the application layer.
Sentiment Analysis With PostgresML
What about engaging in more sophisticated ML and AI-related tasks with PostgresML? One such task is the sentiment analysis of data being inserted or stored in the database.
Imagine that customers of the e-commerce website can share their feedback on the products. PostgresML can assist in monitoring customer sentiment about specific products and proactively responding to various concerns and complaints.
For example, a customer purchased a headset and shared feedback that PostgresML classified as negative:
SELECT pgml.transform(
task => 'text-classification',
inputs => ARRAY[
'I regret buying this headset. It does not connect to my laptop over Bluetooth.'
]
) AS positivity;
-[ RECORD 1 ]----------------------------------------------------
positivity | [{"label": "NEGATIVE", "score": 0.9996261596679688}]
task
- a pre-configured transformation for text classification tasks.inputs
- the text for sentiment analysis.
A company representative reached out to the customer promptly and helped to solve the problem. As a result, the customer shared follow-up feedback that was classified as positive.
SELECT pgml.transform(
task => 'text-classification',
inputs => ARRAY[
'I regret buying this headset. It does not connect to my laptop over Bluetooth.',
'The problem is solved. Jane reached out to me and helped with the setup. Love the product!'
]
) AS positivity;
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------
positivity |
[{"label": "NEGATIVE", "score": 0.9996261596679688},
{"label": "POSITIVE", "score": 0.999795138835907}]
Just like with the translation tasks, you can utilize thousands of other models from Hugging Face for sentiment analysis and other text classification tasks. For instance, here's how you can switch to the RoBERTa model, which was trained on approximately 40,000 English posts on X (Twitter):
SELECT pgml.transform(
task => '{"task": "text-classification",
"model": "finiteautomata/bertweet-base-sentiment-analysis"
}'::jsonb,
inputs => ARRAY[
'I regret buying this headset. It does not connect to my laptop over Bluetooth.',
'The problem is solved. Jane reached out to me and helped with the setup. Love the product!'
]
) AS positivity;
-[ RECORD 1 ]----------------------------------------------------------------------------------------------
positivity | [{"label": "NEG", "score": 0.9769334197044371},
{"label": "POS", "score": 0.9884902238845824}]
The RoBERTa model has also accurately classified the sentiment of the comments, allowing the e-commerce company to address user concerns and complaints promptly as soon as negative feedback gets into PostgreSQL.
Summary
As a vector database, Postgres isn't limited to storing and querying embeddings. With the PostgresML extension, Postgres can be transformed into a computational platform for various AI and ML tasks.
Discover more about PostgresML and PostgreSQL as a vector database in the following hands-on practical guides:
Opinions expressed by DZone contributors are their own.
Comments