Build a Philosophy Quote Generator With Vector Search and Astra DB (Part 3)
In the final installment of this series, complete the quote generator application by using an LLM to generate philosophical quotes.
This is the third and last part of a mini-series about building a vector search-based GenAI application with Python and DataStax Astra DB. In this post, we complete the application by using the search (from Part 2) as a starting point to implement a "philosophical quote generator."
If you haven't already, check out Part 1, where all the important concepts are explained, and Part 2, which shows how the search engine takes advantage of vector embeddings stored in Astra DB.
Construct the Quote Generator
Now, we’ll build the part of the application that, when given a textual input (and, optionally, filter clauses such as author name and/or labels), uses an LLM to generate a new "philosophical quote," similar in tone and content to existing entries.
Retrieval-Augmented Generation (RAG)
At the heart of this flow, we'll have the vector-powered search facility we implemented; we’ll invoke the LLM's "chat completion" with a prompt that’s essentially structured as follows:
Generate a novel philosophical quote similar in tone and spirit to the provided examples: <EXAMPLE1>, <EXAMPLE2>, …, and pertaining to the topic: <TOPIC>.
In this prompt, "topic" will be the very input supplied by the user, and the "examples" will be the results found through the search we implemented.
It turns out that even sketchy input formulations, such as just "politics and virtue," do a satisfactory job as search strings in that they still lead to relevant quotes for the generation task.
This is the essence of RAG (retrieval-augmented generation): we use search to retrieve relevant pieces of text, which are then fed into an LLM-powered text-generation prompt. As stated at the beginning of this series, the overwhelming success of this approach and its variants is what made vector search a key ingredient in modern GenAI applications.
The pattern demonstrated in this sample application can be adapted, with slight changes, to serve many different use cases: from question answering over a knowledge base (stored as text snippets, each paired with its embedding vector) to giving personalized advice (with past user interactions likewise stored in a form suited to vector-search retrieval).
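To make this concrete, here is a minimal sketch of how the same pattern could serve question answering over a knowledge base. Note that find_relevant_snippets is a hypothetical retrieval function (playing the same role as the quote search of Part 2), and the openai usage matches the pre-1.0 package used throughout this series:
import openai  # the pre-1.0 openai package, configured with an API key as in the earlier parts

answer_prompt_template = """Answer the question using only the provided context.
If the context is not enough, say that you cannot answer.
QUESTION: "{question}"
CONTEXT:
{context}
"""

def answer_question(question, n=4):
    # Retrieval step: a vector search over the knowledge base
    # (find_relevant_snippets is a hypothetical stand-in for such a search).
    snippets = find_relevant_snippets(query=question, n=n)
    prompt = answer_prompt_template.format(
        question=question,
        context="\n".join(f" - {snippet}" for snippet in snippets),
    )
    # Generation step, identical in structure to the quote generator below.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
    )
    return response.choices[0].message.content.strip()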
There are many ideas for enhancing the capabilities of LLM-based algorithms, ranging from sampling several "chains of thought" in parallel (to better validate candidate solutions to a problem) to checking the validity of partial answers with "trick questions" designed to catch the LLM in the act of making up plausible falsehoods (the ubiquitous plague of "hallucinations"). Strikingly, however, most of these advanced approaches are, or can be, complemented by the basic RAG mechanism: vector search, as an information retrieval technology, is a successful way to provide an arbitrarily large, easy-to-update body of "knowledge" to almost any generative AI system.
Implementing Quote Generation
Let's take a look at the code for quote generation. Now that the search subproblem has been solved, what remains doesn’t require directly interacting with the vector store; all we need from the store are the search results.
import openai  # the pre-1.0 openai package, configured with an API key as in the earlier parts

completion_model_name = "gpt-3.5-turbo"

generation_prompt_template = """Generate a single short philosophical quote on the given topic,
similar in spirit and form to the provided actual example quotes.
Do not exceed 20-30 words in your quote.
REFERENCE TOPIC: "{topic}"
ACTUAL EXAMPLES:
{examples}
"""

def generate_quote(topic, n=2, author=None, tags=None):
    # Retrieval step: the vector search implemented in Part 2.
    quotes = find_quote_and_author_p(
        query_quote=topic,
        n=n,
        author=author,
        tags=tags,
    )
    if quotes:
        # Fill the prompt template with the topic and the retrieved example quotes.
        prompt = generation_prompt_template.format(
            topic=topic,
            examples="\n".join(f" - {quote[0]}" for quote in quotes),
        )
        # Generation step: a single chat-completion request to the LLM.
        response = openai.ChatCompletion.create(
            model=completion_model_name,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,
            max_tokens=320,
        )
        # Strip stray quotation marks the model may add around its output.
        return response.choices[0].message.content.replace('"', '').strip()
    else:
        print("** no quotes found.")
        return None
As you can see, all we do is define a suitable prompt template; then, whenever a synthetic quote is requested, we run a search, fill the prompt with the results, and invoke an LLM for a completion that follows the instructions in the prompt. Let's check how this generation routine performs:
>>> print(generate_quote("politics and virtue"))
Politics without virtue is like a ship without a captain - destined to be guided by turbulent currents, lacking true direction.
>>> print(generate_quote("animals", author="schopenhauer"))
Neglecting the moral worth of animals reflects a crude and barbaric mindset. True morality lies in universal compassion.
Not bad at all! This really sounds like something these great thinkers might actually have said.
Next Steps: Bring It to Production and Scale
You have seen how easy it is to get started with vector search using Astra DB; in just a few lines of code, we have built a semantic text retrieval and generation pipeline, including the creation and population of the storage backend, i.e., the vector store.
Moreover, you retain some choice as to the particular technology to use. You can achieve the same goals whether working with the convenient, more abstract CassIO library or by constructing and executing statements directly with the CQL drivers — each choice comes with its pros and cons.
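To give a flavor of the driver-level route, here is a minimal sketch of a vector ANN search expressed directly in CQL through the Python driver. The table and column names here are illustrative placeholders (not necessarily those from the earlier parts), and session is assumed to be an already-connected Session object such as the one created in the Part 2 CQL setup:
# A minimal sketch of ANN vector search at the CQL level.
# Table and column names are illustrative placeholders; the vector column
# is assumed to have a SAI index, and 'session' is an already-connected
# cassandra-driver Session bound to the right keyspace.
search_statement = """
SELECT body, author
    FROM philosophy_quotes
    ORDER BY embedding_vector ANN OF %s
    LIMIT %s
"""

def cql_find_quotes(query_vector, n=2):
    # The query vector is passed as a plain list of floats.
    rows = session.execute(search_statement, (query_vector, n))
    return [(row.body, row.author) for row in rows]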
If you plan to bring this application to a production-like setup, there is, of course, more to be done. First, you might want to work at an even higher abstraction level, namely that provided by the various LLM frameworks available, such as LangChain or LlamaIndex (both of which support Astra DB / Cassandra as a vector store backend).
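As a taste of that higher abstraction level, here is a minimal sketch of connecting to a quotes collection through LangChain's Astra DB integration. This assumes the langchain-astradb and langchain-openai packages; the collection name and environment-variable names are placeholder choices:
import os
from langchain_astradb import AstraDBVectorStore
from langchain_openai import OpenAIEmbeddings

# A minimal sketch, assuming the langchain-astradb integration package.
# Collection name and environment-variable names are placeholders.
vector_store = AstraDBVectorStore(
    embedding=OpenAIEmbeddings(),
    collection_name="philosophy_quotes",
    api_endpoint=os.environ["ASTRA_DB_API_ENDPOINT"],
    token=os.environ["ASTRA_DB_APPLICATION_TOKEN"],
)

# The framework takes care of embedding the query and running the ANN search:
for document in vector_store.similarity_search("politics and virtue", k=2):
    print(document.page_content)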
Second, you would need something like a REST API exposing the functions we built. This is something you can achieve, for example, with a few lines of FastAPI code, essentially wrapping the generate_quote and find_quote_and_author_p functions seen earlier. There'll soon be a post on this blog showing in detail how an API around LLM functions can be structured.
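As a hint, a minimal sketch of such a wrapper might look like this (the route and parameter names are illustrative choices, and generate_quote is assumed importable from the code above):
from typing import Optional
from fastapi import FastAPI

app = FastAPI()

# A minimal sketch: one GET endpoint wrapping the generate_quote function.
# Route and parameter names are illustrative, not a prescribed API design.
@app.get("/quote")
def quote_endpoint(topic: str, author: Optional[str] = None):
    generated = generate_quote(topic, author=author)
    return {"topic": topic, "author": author, "quote": generated}

You would then serve this with an ASGI server, e.g., uvicorn quote_api:app, assuming the file is named quote_api.py.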
A last consideration is the scale of your data. In a production application, you will probably handle way more than the 500 or so items inserted here. You might need to work with a vector store consisting of millions of entries. Should you be concerned about performance? The short answer is no; Astra DB is designed to handle massive data sizes with extremely low read and write latencies. Your vector-based application will remain snappy even after throwing loads of data at it.
Conclusion
We started with a general introduction to vector search (what it is, how it works, what problems it solves) and then put these concepts to use by building a small yet representative GenAI application with vector technology at its heart. While convincingly impersonating Immanuel Kant might not be everyone's idea of an enterprise application need (nor, for that matter, of a fun party game), the key ideas and concepts illustrated in this post have a broader validity and are highly relevant in the current, rapidly evolving AI landscape.
We have used Astra DB as the backend for the vector store. Its ease of use, paired with its performance and ability to scale, makes it the perfect fit for your next GenAI endeavor, regardless of the size of your data. You can sign up now for a free-tier Astra account and start experimenting with its vector capabilities in minutes.
Additionally, as we have shown, you can choose your preferred tooling when working with Astra DB as a vector store, whether CQL or the CassIO library. We invite you to try out this example, tinker with it, and perhaps use it as inspiration for your own GenAI application.
That's not all. The vector capabilities of Astra DB are integrated into the most popular LLM frameworks, so you can easily develop a more complex application based, for example, on LangChain or LlamaIndex, and still take advantage of Astra DB for your vector storage.
Reference Links
If you want to know more, here is a curated list of references and starting points to continue your journey with vectors and Astra DB:
- CassIO, reference end-to-end notebook (notebook, colab)
- CQL, reference end-to-end notebook (notebook, colab)
- Astra DB, Vector overview
- CassIO (the site covers LlamaIndex and LangChain Astra integration as well)
- CQL reference and Python drivers for Cassandra / Astra DB
Attribution Statement
Apache Cassandra, Cassandra, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.