Build a Philosophy Quote Generator With Vector Search and Astra DB
In Part 1 of this 'Infinite Wisdom Series,' learn the how and the why of vector search and vector embeddings by building a vector store from scratch.
The field of generative AI (GenAI), which has ignited a computing revolution this year, encompasses several technologies, key ideas, and paradigms. Although there will surely be more astonishing developments to come, vector search has become one of the most crucial tools in GenAI.
In this three-part series, we will demonstrate the power of vector search by building a vector store from scratch and using it to accomplish two standard tasks: semantic search and text generation based on provided examples.
This is what we'll build: first, a semantic search engine for finding quotes by famous philosophers. This search facility will then serve as the basis for creating, in true GenAI style, a tool that can invent new, plausible snippets of philosophical wisdom in the style of your favorite philosopher! Along the way, we'll illustrate how the various parts work together, so that by the end you'll have acquired not just a taste for Arthur Schopenhauer but also the working knowledge to get started on your own GenAI application.
We will work with Python and use DataStax Astra DB for the vector-search back-end. Though most concepts are generally applicable, some aspects of the particular implementation take advantage of the specific architecture of this database.
The reference application this series is about has been developed in two flavors:
- One version interfaces with the database directly through its drivers.
- The other leverages the "CassIO" library, which abstracts away the database-specific details and offers a more Pythonic interface that "just works."
We will point out the differences between these two approaches and provide comprehensive references at the end for further reading. Both implementations are available as notebooks hosted in the OpenAI Cookbook repo: all you need is a free-tier Astra DB account and an OpenAI API key. If you're eager to do some hands-on work, you can open the driver-based or the CassIO-based version right now as interactive Google Colab notebooks.
This three-part series is structured as follows:
- In this post, we summarize where the need for vector search arises and give a brief account of how it works.
- In Part 2, we build a search engine based on vector embeddings to find quotes by famous philosophers.
- In Part 3, the search engine becomes the heart of a full GenAI application: a generator of new philosophical quotes.
Let's start!
Why Vector Search?
In a typical GenAI application, large language models (LLMs) are employed to produce text: for instance, answers to customers' questions, summaries, or suggestions based on context and past interactions. In most cases, however, one cannot just use the LLM out of the box, as it may lack the required domain-specific knowledge. To solve this problem while avoiding an often expensive (or outright unavailable) model fine-tuning step, the approach known as retrieval-augmented generation (RAG) has emerged.
In practice, in the RAG paradigm, a search is first performed to obtain pieces of textual information relevant to the specific task (for example, documentation snippets pertinent to the customer question being asked). In a second step, these pieces of text are inserted into a suitably designed prompt and passed to the LLM, which is instructed to craft an answer using the supplied information. The RAG approach has proven to be one of the main workhorses for expanding the capabilities of LLMs: while the range of methods to augment them is evolving rapidly (even fine-tuning approaches are experiencing a sort of comeback right now), RAG is considered one of the key ingredients.
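To make the two steps concrete, here is a minimal sketch in Python. The retrieval helper is a stub standing in for the vector search built in the rest of this series, and the LLM call assumes the "openai" package (v1-style client) with a model name picked purely for illustration:

```python
# A minimal sketch of the two RAG steps described above. The retrieval helper
# is a stub standing in for the vector search built later in this series; the
# LLM call assumes the `openai` Python package (v1-style client).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def retrieve_similar_snippets(question: str, top_k: int = 3) -> list[str]:
    # Placeholder: in the real application, this is the vector search step.
    return ["(a documentation snippet relevant to the question)"] * top_k

def answer_with_rag(question: str) -> str:
    # Step 1: retrieve pieces of text relevant to the question.
    context = "\n".join(retrieve_similar_snippets(question))
    # Step 2: put them into a suitably designed prompt and ask the LLM.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer_with_rag("How do I rotate my API credentials?"))
```

In Parts 2 and 3, the stub above gives way to an actual vector search against quotes stored in Astra DB.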
So, the first problem at hand is that of retrieving "relevant" information. This is not a new problem: it has traditionally been solved with keyword-based search (possibly with preprocessing steps such as lemmatization/stemming or other variants). Recently, however, vector search has emerged as a superior approach. Let's see what it does and why it works so well.
There are two key concepts that play together, namely:
- For a given piece of text, an "embedding vector" can be computed. This is a fixed-length sequence of numbers that encodes the meaning of the sentence, rather than its exact wording, to a striking degree of accuracy.
- In the space where these vectors live, the typical mathematical definitions of the "distance between vectors" happen to measure the degree of (semantic) similarity between the corresponding sentences fairly well.
In practice, this means that one can map a set of phrases to the corresponding vectors (i.e., points in a certain space) and then look for phrases with similar contents by actually looking for points close to each other in this space. Vector embeddings allow mapping a task in the domain of natural language processing (NLP) to a simpler, better-understood mathematical task in…geometry!
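To make these two concepts tangible, here is a small sketch (assuming the "openai" v1 Python client and "numpy"; the embedding model name is just one possible choice):

```python
# Compute embedding vectors for a few sentences and compare them with cosine
# similarity. Assumes the `openai` (v1) and `numpy` packages; the embedding
# model name is one possible choice.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    # Map a piece of text to its fixed-length embedding vector.
    response = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(response.data[0].embedding)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Close to 1.0 = very similar meaning; lower = less related.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = embed("All truth passes through three stages.")
v2 = embed("Every truth goes through phases before being accepted.")
v3 = embed("The stock market closed higher today.")

print(cosine_similarity(v1, v2))  # expected: relatively high
print(cosine_similarity(v1, v3))  # expected: noticeably lower
```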
So, suppose you are building a service to find phrases similar to a user-provided input, searching a possibly large corpus of text. You can pre-compute the vector embedding for all phrases in the corpus and store them, along with the texts, in a suitable database. Once this is done, a query works like this: first, calculate the vector V for the query sentence; second, run a database query for the rows whose vector is closest to V.
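Here is a toy, in-memory rendition of that flow, just to make the mechanics concrete; the real application will use Astra DB for storage, and the embedding helper repeats the hypothetical one from the previous sketch:

```python
# A toy, in-memory version of the store-then-query flow described above.
# Assumes the `openai` (v1) and `numpy` packages.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(response.data[0].embedding)

corpus = [
    "Life must be understood backwards, but lived forwards.",
    "Happiness is the feeling that power increases.",
    "He who has a why to live can bear almost any how.",
]

# Pre-compute and "store" the embedding vector for every phrase in the corpus.
stored_vectors = [embed(text) for text in corpus]

def search(query: str, top_k: int = 2) -> list[str]:
    v = embed(query)  # first: calculate the vector V for the query sentence
    # second: rank stored rows by closeness to V (cosine similarity here)
    scores = [
        float(np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w)))
        for w in stored_vectors
    ]
    best = np.argsort(scores)[::-1][:top_k]
    return [corpus[i] for i in best]

print(search("Why should one keep on living?"))
```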
A database with this kind of capability is called a vector database. Nowadays, with the extraordinary growth of GenAI, many databases have started to offer vector-oriented features, and new databases have sprung up, explicitly built around this need.
Astra DB, a DBaaS offering built on the planet-scale and ultra-high-availability distributed database Apache Cassandra®, provides solid support for vector search workloads with top-class performance. You can try the vector capabilities of Astra DB right now; you can create a free tier account and start experimenting with it, for example, by running the demo application outlined throughout the rest of this post!
An important need that emerges when setting up a vector-search-based application is filtering based on metadata. For instance, you may want to look for items in your e-commerce catalog whose description is "similar to" a provided search query, but restrict the search to entries with "special_offer = True" or "style = casual" in the query itself. You will see a practical application of vector-search filtering, and two different ways to accommodate it, when creating the vector store in the database.
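As a peek ahead, here is a hedged sketch of what such a filtered query can look like in CQL, run through the Cassandra Python driver. The keyspace, table, and column names are hypothetical, the connection details are placeholders, and the query presumes a vector-enabled Astra DB table with a SAI index on the metadata column:

```python
# A sketch of metadata filtering combined with vector (ANN) search, expressed
# as CQL run through the Cassandra Python driver. Every name here (keyspace,
# table, columns) is hypothetical and the connection details are placeholders.
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from openai import OpenAI

# Embed the query text, as in the earlier sketches.
query_vector = OpenAI().embeddings.create(
    model="text-embedding-ada-002",
    input="quotes about the nature of suffering",
).data[0].embedding

cluster = Cluster(
    cloud={"secure_connect_bundle": "/path/to/secure-connect-bundle.zip"},
    auth_provider=PlainTextAuthProvider("token", "AstraCS:your-token-here"),
)
session = cluster.connect()

# ANN search ordered by vector similarity, restricted by a metadata column.
rows = session.execute(
    """
    SELECT quote, author
    FROM my_keyspace.philosophers_quotes
    WHERE author = %s
    ORDER BY embedding ANN OF %s
    LIMIT 3
    """,
    ("schopenhauer", query_vector),
)
for row in rows:
    print(f"{row.author}: {row.quote}")
```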
Coming Up Next
In the next installment of this mini-series, we will put these concepts to use by building a vector store and developing a search engine on top of it.