Observability and Performance
The dawn of observability across the software ecosystem has fully disrupted standard performance monitoring and management. Enhancing these approaches with sophisticated, data-driven, and automated insights allows your organization to better identify anomalies and incidents across applications and wider systems. While monitoring and standard performance practices are still necessary, they now serve to complement organizations' comprehensive observability strategies. This year's Observability and Performance Trend Report moves beyond metrics, logs, and traces — we dive into essential topics around full-stack observability, like security considerations, AIOps, the future of hybrid and cloud-native observability, and much more.
Coupling Go's lightweight programming capabilities with AWS' robust AI services allows developers to build performant, scalable, and intelligent microservices devoted to diverse business needs. This blog explains how Go and AWS AI services can be combined to create intelligent microservices, discusses the benefits of this approach, and provides a step-by-step guide to getting started. Why Use Go for Microservices? Golang, or Go, is a statically typed, compiled programming language developed at Google. It was designed with simplicity, performance, and scalability in mind, which together make it an excellent choice for building microservices: Concurrency. Its built-in concurrency support through goroutines and channels lets developers easily handle multiple tasks without incurring a big performance overhead. Fast compilation and execution. Because it is a compiled language, Go offers high execution speeds and fast build times, which is essential for microservices needing to respond quickly to user requests. Minimal memory footprint. Efficient memory usage keeps Go microservices small and, hence, cheap to run. Rich standard library. Its extensive built-in standard library includes tools for networking, HTTP handling, and JSON parsing, making it easier to develop microservices. Scalability. Go was designed from the start around a simple, pragmatic philosophy, helping developers build and maintain scalable systems with ease. Why Choose AWS AI Services? AWS offers a suite of developer-friendly AI services for NLP, computer vision, ML, and predictive analytics. Combining AWS AI services with microservices offers the following: The major advantage of AWS AI services is their SDKs and APIs, which make integration much easier for microservices written in Go. AWS automatically scales its services with demand to maintain consistent performance under varying workloads. AWS's pay-as-you-go model ensures one only pays for the resources utilized. Pre-trained models cover NLP (Amazon Comprehend), image recognition (Amazon Rekognition), text-to-speech (Amazon Polly), and more. AWS follows industry-standard security practices to protect user data across its AI services. Key AWS AI Services for Intelligent Microservices Highlighted below are some AWS AI services that can be used for building intelligent microservices: Amazon Rekognition. Provides image and video analysis capabilities such as object detection, facial recognition, and content moderation. Amazon Comprehend. A natural language processing service for sentiment analysis, entity recognition, and language detection. Amazon Polly. A text-to-speech conversion service used to build apps with voice-enabled functionality. Amazon SageMaker. A tool for building, training, and deploying ML models. Amazon Translate. Provides real-time and batch language translation. Amazon Textract. Extracts text and data from forms and tables in scanned documents. Amazon Lex. Enables the creation of conversational interfaces for applications using voice and text. Amazon Transcribe. Converts speech into text for applications like transcription services and voice analytics. The Architecture of Intelligent Microservices With Go and AWS The architecture of intelligent microservices involves several layers: Frontend layer. User interfaces or APIs that interact with end users. Microservices layer. Go-based microservices that handle specific business functionalities. 
Each microservice communicates with the AWS AI services for processing. Data layer. Includes databases or data storage solutions, such as Amazon RDS, DynamoDB, or S3, for managing application data. AWS AI integration layer. AWS AI services that process data and return results to the microservices. Monitoring and logging. Tools like AWS CloudWatch and AWS X-Ray to monitor performance and diagnose issues in the microservices. A Step-by-Step Guide Step 1: Setting Up the Development Environment Go Configuration Basics Download and install Go from the official Go website. After installation, set up your Go workspace and specify the required environment variables. Once Go is ready, install the AWS SDK for Go for AWS service integration. Configure your AWS credentials using the AWS CLI for secure, authenticated access to your services. Step 2: Design the Microservices Design each microservice around its specialization. For an image analysis service, use Amazon Rekognition to identify objects in an image; use Amazon Comprehend for a sentiment analysis service that analyzes user feedback; and use Amazon Polly for a text-to-speech conversion service that speaks textual notifications. Each microservice solves a particular business requirement without losing flexibility. Step 3: Integrating AWS AI Services Connect the microservices to AWS AI services by creating AWS sessions, initializing the service clients, and calling the appropriate APIs. This keeps communication between the microservices and the AI services correct and efficient, producing intelligent results. Step 4: Deployment of the Microservices After microservice development, dockerize the microservices for portability and consistent behavior across environments. Configure the containers appropriately for the various services. Use Kubernetes or AWS ECS to orchestrate and manage the deployment of the containerized microservices for greater availability and scalability. Monitor performance and enable logging through AWS CloudWatch, and use Auto Scaling groups to handle varying workloads. Step 5: Testing and Optimization Conduct thorough unit and integration tests to verify that every microservice works as it should. Profile how the microservices communicate with AWS services to improve responsiveness and resource utilization. Frequent testing and iteration help ensure the reliability and scalability of the system. Benefits of Using Go and AWS AI Services Improved productivity. Go's simplicity and AWS's managed services reduce the time and effort needed to build intelligent applications. Improved scalability. Lightweight Go services combined with elastic AWS infrastructure scale seamlessly. Cost efficiency. AWS's pay-as-you-go pricing model and Go's low memory footprint enhance cost savings. Intelligence. AWS AI services add advanced capabilities to microservices, such as sentiment analysis, image recognition, and speech synthesis. Conclusion Building intelligent microservices by combining Go and AWS AI services offers strong performance, scale, and advanced functionality. Drawing on Go's efficient design and AWS's AI technologies, developers are already creating microservices that meet modern business needs. 
Whatever the goal, whether a better customer experience, improved business propositions, or real-time analysis, integrating Go and AWS brings both adaptability and sturdiness to application ecosystems. Deploying microservices allows businesses to innovate faster and adapt easily to changing requirements without breaking the whole system. Meanwhile, AWS AI services provide many easily integrated pre-trained models and tools. This reduces the complexity of AI-driven solutions, giving teams the time and space to deliver value to their users.
The blend of retrieval-augmented generation (RAG) and generative AI models has brought changes to natural language processing by improving the responses to queries. In the realm of Agentic RAG, this conventional method of relying on a monolithic model for tasks has been enhanced by introducing modularity and autonomy. By breaking down the problem-solving process into tools integrated within an agent, Agentic RAG provides benefits like accuracy, transparency, scalability, and debugging capabilities. The Vision Behind Agentic RAG for Text-to-SQL Traditional RAG systems often retrieve relevant documents and rely on a single monolithic model to generate responses. Although this is an effective method in some cases, when it comes to structural outputs like the case of generating SQL, this approach may not be the most effective. This is where we can leverage the power of the Agentic RAG framework, where we: Divide the tasks into smaller, more manageable tools within an agentImprove accuracy by assigning tasks to specialized toolsEnhance transparency by tracing the reasoning and workflow of each toolSimplify scaling and debugging through modular design Let's talk about how this tool works and the role each component plays in transforming user questions into accurate SQL queries. Architecture Overview The structure comprises an agent utilizing tools within the text-to-SQL workflow. The process can be summarized as follows: User Query → Query Transformation Tool → Few Shot Prompting Tool → Hybrid Search Tool → Re Ranking Tool → Table Retrieval Tool → Prompt Building Tool → LLM Execution Tool → SQL Execution Tool → Final Output 1. User Query Transformation Tool This tool would entail processing the user query for a better understanding of the LLM. It addresses ambiguities, rephrases user questions, translates abbreviations into their forms, and provides context when necessary. Enhancements Handle temporal references. Map terms like "as of today" or "till now" to explicit dates.Replace ambiguous words. For example, "recent" could be replaced by "last 7 days."Connecting shorthand or abbreviations to their names. Example Input: "Show recent sales MTD." Transformed query: "Retrieve sales data for the last 7 days (Month to Date)." Python from datetime import date, timedelta def transform_query(user_query): # Handle open-ended temporal references today = date.today() transformations = { "as of today": f"up to {today}", "till now": f"up to {today}", "recent": "last 7 days", "last week": f"from {today - timedelta(days=7)} to {today}", } for key, value in transformations.items(): user_query = user_query.replace(key, value) # Map common abbreviations abbreviations = { "MTD": "Month to Date", "YTD": "Year to Date", } for abbr, full_form in abbreviations.items(): user_query = user_query.replace(abbr, full_form) return user_query query_transform_tool = Tool( name="Query Transformer", func=transform_query, description="Refines user queries for clarity and specificity, handles abbreviations and open-ended terms." ) 2. Few Shot Prompting Tool This tool makes a call to the LLM to identify the question of a kind from a set (we can also say matching the template). The matched question enhances the prompt with an example SQL query. Example Workflow 1. Input question: "Show me total sales by product for the 7 days." 2. Predefined templates: "Show sales grouped by region." → Example SQL; SELECT region, SUM(sales) ..."Show total sales by product." → Example SQL; SELECT product_name, SUM(sales) ... 3. 
Most similar question: "Show total sales by product." 4. Output example SQL: SELECT product_name, SUM(sales) FROM ...

Python

from langchain.chat_models import ChatOpenAI
from langchain.agents import Tool

llm = ChatOpenAI(model="gpt-4")

predefined_examples = {
    "Show sales grouped by region": "SELECT region, SUM(sales) FROM sales_data GROUP BY region;",
    "Show total sales by product": "SELECT product_name, SUM(sales) FROM sales_data GROUP BY product_name;",
}

def find_similar_question(user_query):
    prompt = "Find the most similar question type for the following user query:\n"
    prompt += f"User Query: {user_query}\n\nOptions:\n"
    for example in predefined_examples.keys():
        prompt += f"- {example}\n"
    prompt += "\nRespond with the closest match."
    # Single LLM call that returns the closest matching template as plain text
    most_similar = llm.predict(prompt).strip()
    return predefined_examples.get(most_similar, "")

few_shot_tool = Tool(
    name="Few-Shot Prompting",
    func=find_similar_question,
    description="Finds the most similar question type using an additional LLM call and retrieves the corresponding example SQL."
)

3. Hybrid Search Tool For robust retrieval, this tool combines semantic search, BM25 keyword search, and keyword-based table mapping. The results from these search methods are combined using reciprocal rank fusion. How does it all come together? Keyword Table Mapping This approach maps tables to the keywords contained in the query. For example: The presence of "sales" results in the sales table being shortlisted. The presence of "product" results in the products table being shortlisted. Keyword Overlap Mapping (BM25) This search method is based on keyword overlap and shortlists tables by relevance. For this, we apply the BM25 technique, which ranks documents by relevance to a user query. BM25 takes term saturation into account as well as TF-IDF (Term Frequency-Inverse Document Frequency). Term Frequency (TF) measures how often a term appears in a given document. Inverse Document Frequency (IDF) down-weights terms that appear in many documents. Length normalization accounts for document length to prevent bias toward longer documents. Given: sales_data: Contains terms like "sales," "date," "product." products: Contains terms like "product," "category." orders: Contains terms like "order," "date," "customer." financials: Contains terms like "revenue," "profit," "expense." User query: "Show total sales by product." Identify terms in the user query: ["sales", "product"]. Score every table based on the frequency and relevance of these terms. Relevance of documents: sales_data: High relevance due to both "sales" and "product." products: High relevance due to "product." orders: Lower relevance, with little overlap with the query terms. financials: Not relevant. Output: Ranked list: [products, sales_data, orders, financials] Semantic Search In this search method, as the name suggests, we find semantically similar tables using vector embeddings. We achieve this by calculating a similarity score, such as cosine similarity, between the document (table) vectors and the user query vector. Reciprocal Rank Fusion Combines the BM25 and semantic search results using the reciprocal rank fusion strategy, explained in a little more detail below. Reciprocal Rank Fusion (RRF) is a method to combine results from multiple ranking algorithms (e.g., BM25 and semantic search). 
It assigns a score to each document based on its rank in the individual methods, giving higher scores to documents ranked higher across multiple methods. RRF formula: RRF(d) = Σ(r ∈ R) 1 / (k + r(d)) Where: d is a document. R is the set of rankers (search methods). k is a constant (typically 60). r(d) is the rank of document d in search method r. Step-by-Step Example Input data. 1. BM25 ranking results: products (Rank 1), sales_data (Rank 2), orders (Rank 3) 2. Semantic search ranking results: sales_data (Rank 1), financials (Rank 2), products (Rank 3) Step-by-Step Fusion For each table, compute the score: 1. sales_data BM25 Rank = 2, Semantic Rank = 1. RRF Score = 1/(60+2) + 1/(60+1) = 0.03252 2. products BM25 Rank = 1, Semantic Rank = 3. RRF Score = 1/(60+1) + 1/(60+3) = 0.03226 3. orders BM25 Rank = 3, Semantic Rank = Not Ranked. RRF Score = 1/(60+3) = 0.01587 4. financials BM25 Rank = Not Ranked, Semantic Rank = 2. RRF Score = 1/(60+2) = 0.01613 5. Sort by RRF score sales_data (highest score due to top rank in semantic search). products (high score from BM25). financials (edges ahead of orders thanks to its semantic rank). orders (lowest combined score). Final output: ['sales_data', 'products', 'financials', 'orders'] Tables retrieved using keyword table mapping are always included.

Python

from rank_bm25 import BM25Okapi

# Table descriptions serve as the BM25 corpus
table_docs = {
    "sales_data": "sales date product",
    "products": "product category",
    "orders": "order date customer",
    "financials": "revenue profit expense",
}
table_names = list(table_docs.keys())
bm25 = BM25Okapi([doc.split() for doc in table_docs.values()])

def hybrid_search(query):
    # Keyword-based table mapping
    keyword_to_table = {
        "sales": "sales_data",
        "product": "products",
    }
    keyword_results = [table for keyword, table in keyword_to_table.items()
                       if keyword in query.lower()]

    # BM25 search over the table descriptions
    bm25_results = bm25.get_top_n(query.lower().split(), table_names, n=4)

    # Semantic search; vector_store is assumed to be a pre-built vector store
    # whose documents carry the table name in their metadata
    semantic_results = [doc.metadata["table"]
                        for doc in vector_store.similarity_search(query, k=4)]

    # Reciprocal rank fusion (k = 60, matching the formula above)
    def reciprocal_rank_fusion(ranked_lists, k=60):
        scores = {}
        for ranked in ranked_lists:
            for rank, table in enumerate(ranked, start=1):
                scores[table] = scores.get(table, 0) + 1 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)

    combined_results = reciprocal_rank_fusion([bm25_results, semantic_results])
    # Tables from keyword mapping are always included
    return list(dict.fromkeys(keyword_results + combined_results))

hybrid_search_tool = Tool(
    name="Hybrid Search",
    func=hybrid_search,
    description="Combines keyword mapping, BM25, and semantic search with RRF for table retrieval."
)

4. Re-Ranking Tool This tool ensures the most relevant tables are prioritized. Example Input tables: ["sales_data", "products", "financials"] Re-ranking logic For each table, compute a relevance score by concatenating the query and the table description. Sort by relevance score. Output: ["sales_data", "products"] A little more on the re-ranking logic: The cross-encoder calculates a relevance score by analyzing the concatenated query and table description as a single input pair. This process involves: Pair input. The query and each table description are paired and passed as input to the cross-encoder. Joint encoding. Unlike separate encoders (e.g., bi-encoders), the cross-encoder jointly encodes the pair, allowing it to better capture context and dependencies between the query and the table description. Scoring. The model outputs a relevance score for each pair, indicating how well the table matches the query. 
Python from transformers import pipeline reranker = pipeline("text-classification", model="cross-encoder/ms-marco-TinyBERT-L-2") def re_rank_context(query, results): scores = [(doc, reranker(query + " " + doc)[0]['score']) for doc in results] return [doc for doc, score in sorted(scores, key=lambda x: x[1], reverse=True)] re_rank_tool = Tool( name="Re-Ranker", func=re_rank_context, description="Re-ranks the retrieved tables based on relevance to the query." ) 5. Prompt Building Tool This tool constructs a detailed prompt for the language model, incorporating the user’s refined query, retrieved schema, and examples from the Few-Shot Prompting Tool. Assume you are someone who is proficient in generating SQL queries. Generate an SQL query to: Retrieve total sales grouped by product for the last 7 days. Relevant tables: sales_data: Contains columns [sales, date, product_id].products: Contains columns [product_id, product_name]. Example SQL: Plain Text SELECT product_name, SUM(sales) FROM sales_data JOIN products ON sales_data.product_id = products.product_id GROUP BY product_name; Future Scope While this system uses a single agent with multiple tools to simplify modularity and reduce complexity, a multi-agent framework could be explored in the future. We could possibly explore the following: Dedicated agents for context retrieval. Separate agents for semantic and keyword searches.Task-specific agents. Agents specialized in SQL validation or optimization.Collaboration between agents. Using a coordination agent to manage task delegation. This approach could enhance scalability and allow for more sophisticated workflows, especially in enterprise-level deployments. Conclusion Agentic RAG for text-to-SQL applications offers a scalable, modular approach to solving structured query tasks. By incorporating hybrid search, re-ranking, few-shot prompting, and dynamic prompt construction within a single-agent framework, this system ensures accuracy, transparency, and extensibility. This enhanced workflow demonstrates a powerful blueprint for turning natural language questions into actionable SQL queries.
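A quick addendum to the walkthrough above: the Prompt Building Tool is the one component described only in prose. A minimal sketch of what it might look like, following the same Tool pattern as the other components, is shown below; the function name, its inputs, and the prompt_building_tool wrapper are illustrative assumptions rather than code from the original system.

Python

def build_prompt(refined_query, table_schemas, example_sql):
    """Assemble the final LLM prompt from the refined query, retrieved schema, and example SQL."""
    schema_lines = "\n".join(
        f"- {table}: Contains columns {columns}"
        for table, columns in table_schemas.items()
    )
    return (
        "Assume you are someone who is proficient in generating SQL queries.\n"
        f"Generate an SQL query to: {refined_query}\n\n"
        f"Relevant tables:\n{schema_lines}\n\n"
        f"Example SQL:\n{example_sql}\n"
    )

prompt_building_tool = Tool(
    name="Prompt Builder",
    func=lambda inputs: build_prompt(**inputs),
    description="Builds the final LLM prompt from the refined query, retrieved tables, and example SQL."
)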
In this article, I’ll walk through the development of a Flask-based web application that interacts with an SQL Server database to analyze population data. The application allows users to query population ranges, fetch counties by state, and retrieve states within specific population ranges. I shall also discuss how to integrate Redis for caching query results to improve performance. Why Flask, SQL Server, and Redis? Flask is a lightweight and flexible Python web framework that is perfect for building small to medium-sized web applications. It provides the necessary tools to create RESTful APIs, render dynamic HTML templates, and interact with databases. On the other hand, SQL Server is a robust relational database management system (RDBMS) that is widely used in enterprise applications. Combining Flask with SQL Server allows us to build a powerful application for data analysis and visualization. To further enhance performance, we’ll integrate Redis, an in-memory data store, to cache frequently accessed query results. This reduces the load on the database and speeds up response times for repeated queries. Application Overview Our Flask application performs the following tasks: Query population ranges. Users can specify a year and population range to get counts of states falling within those ranges.Fetch counties by state. Users can input a state code to retrieve a list of counties.Retrieve states by population range. Users can specify a population range and year to get a list of states within that range.Note. To test, feel free to create your own schema in the database and insert sample data as needed based on the following APIs shared using SQL queries. Also, the HTML pages that are used here can be basic table design that grabs the returned data from the Flask app code and display the results. Let’s dive into the implementation details. Setting Up the Flask Application 1. Prerequisites Before starting, ensure you have the following installed through your terminal root (commands compatible with MacOS): Python 3.x Flask (pip install flask)SQLAlchemy (pip install sqlalchemy)PyODBC (pip install pyodbc)Redis (pip install redis) 2. Database Connection We use SQLAlchemy to connect to the SQL Server database. Here’s how the connection can be configured: Python from sqlalchemy import create_engine import urllib # SQL Server connection string params = urllib.parse.quote_plus( "Driver={ODBC Driver 17 for SQL Server};" "Server=tcp:username.database.windows.net,1433;" "Database=population;" "Uid=user@username;" "Pwd={azure@123};" "Encrypt=yes;" "TrustServerCertificate=no;" "Connection Timeout=30;" ) engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params) This connection string uses the ODBC Driver for SQL Server and includes parameters for encryption and timeout. 3. Redis Configuration Redis is used to cache query results. Here’s how to set up the Redis connection: Python import redis # Redis connection redis_client = redis.StrictRedis( host='username.redis.cache.windows.net', port=6380, db=0, password='encryptedpasswordstring', ssl=True ) 4. Implementing the Application Routes Home Page Route The home page route renders the main page of the application: Python @app.route('/') def index(): return render_template('index.html') Population Range Query With Redis Caching This route handles queries for population ranges. It first checks if the result is cached in Redis. 
If not, it queries the database and caches the result for future use: Python @app.route('/population-range', methods=['GET', 'POST']) def population_range(): if request.method == 'POST': # input params defined for this api year = request.form['yr1'] range1_start = request.form['r1'] range1_end = request.form['r2'] range2_start = request.form['r3'] range2_end = request.form['r4'] range3_start = request.form['r5'] range3_end = request.form['r6'] # Map year to column name year_map = { '2010': 'ten', '2011': 'eleven', '2012': 'twelve', '2013': 'thirteen', '2014': 'fourteen', '2015': 'fifteen', '2016': 'sixteen', '2017': 'seventeen', '2018': 'eighteen' } year_column = year_map.get(year, 'ten') # Default to 'ten' if year not found # Build cache key cache_key = f"population_range_{year_column}_{range1_start}_{range1_end}_{range2_start}_{range2_end}_{range3_start}_{range3_end}" # Check if result is cached cached_result = redis_client.get(cache_key) if cached_result: result = eval(cached_result) # Deserialize cached result time_taken = 0 # No database query, so time taken is negligible cache_status = "Cache Hit" else: # Build SQL query query = f""" SELECT SUM(CASE WHEN {year_column} BETWEEN '{range1_start}' AND '{range1_end}' THEN 1 ELSE 0 END) AS range1_count, SUM(CASE WHEN {year_column} BETWEEN '{range2_start}' AND '{range2_end}' THEN 1 ELSE 0 END) AS range2_count, SUM(CASE WHEN {year_column} BETWEEN '{range3_start}' AND '{range3_end}' THEN 1 ELSE 0 END) AS range3_count FROM popul """ print(query) # For debugging # Execute query and measure time start_time = time() result = engine.execute(query).fetchall() end_time = time() time_taken = end_time - start_time cache_status = "Cache Miss" # Cache the result redis_client.set(cache_key, str(result), ex=3600) # Cache for 1 hour return render_template('display.html', data1=result, t1=time_taken, cache_status=cache_status) return render_template('index.html') Fetch Counties by State With Redis Caching This route retrieves counties for a given state code. 
It also uses Redis to cache the results: Python @app.route('/counties-by-state', methods=['GET', 'POST']) def counties_by_state(): if request.method == 'POST': state_code = request.form['state_code'] # Build cache key cache_key = f"counties_by_state_{state_code}" # Check if result is cached cached_result = redis_client.get(cache_key) if cached_result: result = eval(cached_result) # Deserialize cached result time_taken = 0 # No database query, so time taken is negligible cache_status = "Cache Hit" else: # Build SQL query query = f""" SELECT county FROM dbo.county WHERE state = (SELECT state FROM codes WHERE code = '{state_code}') """ print(query) # For debugging # Execute query and measure time start_time = time() result = engine.execute(query).fetchall() end_time = time() time_taken = end_time - start_time cache_status = "Cache Miss" # Cache the result redis_client.set(cache_key, str(result), ex=3600) # Cache for 1 hour return render_template('counties.html', data=result, time_taken=time_taken, cache_status=cache_status) return render_template('index.html') Retrieve States by Population Range With Redis Caching This route fetches states within a specified population range and caches the results: Python @app.route('/states-by-population', methods=['GET', 'POST']) def states_by_population(): if request.method == 'POST': year = request.form['year'] population_start = request.form['population_start'] population_end = request.form['population_end'] # Map year to column name year_map = { '2010': 'ten', '2011': 'eleven', '2012': 'twelve', '2013': 'thirteen', '2014': 'fourteen', '2015': 'fifteen', '2016': 'sixteen', '2017': 'seventeen', '2018': 'eighteen' } year_column = year_map.get(year, 'ten') # Default to 'ten' if year not found # Build cache key cache_key = f"states_by_population_{year_column}_{population_start}_{population_end}" # Check if result is cached cached_result = redis_client.get(cache_key) if cached_result: result = eval(cached_result) # Deserialize cached result time_taken = 0 # No database query, so time taken is negligible cache_status = "Cache Hit" else: # Build SQL query query = f""" SELECT state FROM popul WHERE {year_column} BETWEEN '{population_start}' AND '{population_end}' """ print(query) # For debugging # Execute query and measure time start_time = time() result = engine.execute(query).fetchall() end_time = time() time_taken = end_time - start_time cache_status = "Cache Miss" # Cache the result redis_client.set(cache_key, str(result), ex=3600) # Cache for 1 hour return render_template('states.html', data=result, time_taken=time_taken, cache_status=cache_status) return render_template('index.html') Performance Comparison: SQL Server vs. Redis Query TypeRedis Fetch TimeSQL Execution TimePopulation Range Query (Cached)0.002 seconds0.000 secondsPopulation Range Query (Fresh)0.002 seconds1.342 seconds Key takeaway: Redis reduces execution time from ~1.3 seconds to ~0.002 seconds, making queries 650x faster! How Redis Improves Performance Redis is an in-memory data store that acts as a caching layer between the application and the database. Here’s how it works in our application: Cache key. A unique key is generated for each query based on its parameters.Cache check. Before executing a database query, the application checks if the result is already cached in Redis.Cache hit. If the result is found in Redis, it is returned immediately, avoiding a database query.Cache miss. 
If the result is not found, the query is executed, and the result is cached in Redis for future use.Cache expiry. Cached results are set to expire after a specified time (e.g., 1 hour) to ensure data freshness. By caching frequently accessed query results, Redis significantly reduces the load on the database and improves response times for repeated queries. Conclusion In this article, we built a Flask application that interacts with a SQL Server database to analyze population data. We integrated Redis to cache query results, improving performance and reducing database load. By following best practices, you can extend this application to handle more complex queries and scale it for production use. Link: The source code of this full application can be found on GitHub.
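One closing note on the caching flow described above: the cache key/check/hit/miss/expiry steps repeat in every route, so they can be factored into a reusable helper. The sketch below is illustrative rather than part of the original code; it assumes the redis_client and engine objects defined earlier and uses JSON instead of str/eval for serialization.

Python

import json
from functools import wraps
from time import time

def redis_cached(key_prefix, ttl=3600):
    """Cache-aside decorator: check Redis first, otherwise run the query and cache it."""
    def decorator(fetch_fn):
        @wraps(fetch_fn)
        def wrapper(*args):
            cache_key = f"{key_prefix}_" + "_".join(str(a) for a in args)
            cached = redis_client.get(cache_key)
            if cached:
                return json.loads(cached), 0.0, "Cache Hit"
            start = time()
            result = fetch_fn(*args)
            time_taken = time() - start
            redis_client.set(cache_key, json.dumps(result), ex=ttl)
            return result, time_taken, "Cache Miss"
        return wrapper
    return decorator

# Example usage with the counties query from earlier (parameterized queries are
# preferable to string formatting in production code)
@redis_cached("counties_by_state")
def fetch_counties(state_code):
    query = f"SELECT county FROM dbo.county WHERE state = (SELECT state FROM codes WHERE code = '{state_code}')"
    return [row[0] for row in engine.execute(query).fetchall()]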
DZone events bring together industry leaders, innovators, and peers to explore the latest trends, share insights, and tackle industry challenges. From Virtual Roundtables to Fireside Chats, our events cover a wide range of topics, each tailored to provide you, our DZone audience, with practical knowledge, meaningful discussions, and support for your professional growth. DZone Events Happening Soon Below, you’ll find upcoming events that you won't want to miss. Modernizing Enterprise Java Applications: Jakarta EE, Spring Boot, and AI Integration Date: February 25, 2025Time: 1:00 PM ET Register for Free! Unlock the potential of AI integration in your enterprise Java applications with our upcoming webinar! Join Payara and DZone to explore how to enhance your Spring Boot and Jakarta EE systems using generative AI tools like Spring AI and REST client patterns. What to Consider When Building an IDP Date: March 4, 2025Time: 1:00 PM ET Register for Free! Is your development team bogged down by manual tasks and “TicketOps”? Internal Developer Portals (IDPs) streamline onboarding, automate workflows, and enhance productivity—but should you build or buy? Join Harness and DZone for a webinar to explore key IDP capabilities, compare Backstage vs. managed solutions, and learn how to drive adoption while balancing cost and flexibility. DevOps for Oracle Applications with FlexDeploy: Automation nd Compliance Made Easy Date: March 11, 2025Time: 1:00 PM ET Register for Free! Join Flexagon and DZone as Flexagon's CEO unveils how FlexDeploy is helping organizations future-proof their DevOps strategy for Oracle Applications and Infrastructure. Explore innovations for automation through compliance, along with real-world success stories from companies who have adopted FlexDeploy. Make AI Your App Development Advantage: Learn Why and How Date: March 12, 2025Time: 10:00 AM ET Register for Free! The future of app development is here, and AI is leading the charge. Join Outsystems and DZone, on March 12th at 10am ET, for an exclusive Webinar with Luis Blando, CPTO of OutSystems, and John Rymer, industry analyst at Analysis.Tech, as they discuss how AI and low-code are revolutionizing development.You will also hear from David Gilkey, Leader of Solution Architecture, Americas East at OutSystems, and Roy van de Kerkhof, Director at NovioQ. This session will give you the tools and knowledge you need to accelerate your development and stay ahead of the curve in the ever-evolving tech landscape. Developer Experience: The Coalescence of Developer Productivity, Process Satisfaction, and Platform Engineering Date: March 12, 2025Time: 1:00 PM ET Register for Free! Explore the future of developer experience at DZone’s Virtual Roundtable, where a panel will dive into key insights from the 2025 Developer Experience Trend Report. Discover how AI, automation, and developer-centric strategies are shaping workflows, productivity, and satisfaction. Don’t miss this opportunity to connect with industry experts and peers shaping the next chapter of software development. Unpacking the 2025 Developer Experience Trends Report: Insights, Gaps, and Putting it into Action Date: March 19, 2025Time: 1:00 PM ET Register for Free! We’ve just seen the 2025 Developer Experience Trends Report from DZone, and while it shines a light on important themes like platform engineering, developer advocacy, and productivity metrics, there are some key gaps that deserve attention. 
Join Cortex Co-founders Anish Dhar and Ganesh Datta for a special webinar, hosted in partnership with DZone, where they’ll dive into what the report gets right—and challenge the assumptions shaping the DevEx conversation. Their take? Developer experience is grounded in clear ownership. Without ownership clarity, teams face accountability challenges, cognitive overload, and inconsistent standards, ultimately hampering productivity. Don’t miss this deep dive into the trends shaping your team’s future. What's Next? DZone has more in store! Stay tuned for announcements about upcoming Webinars, Virtual Roundtables, Fireside Chats, and other developer-focused events. Whether you’re looking to sharpen your skills, explore new tools, or connect with industry leaders, there’s always something exciting on the horizon. Don’t miss out — save this article and check back often for updates!
In the digital age, the ability to find relevant information quickly and accurately has become increasingly critical. From simple web searches to complex enterprise knowledge management systems, search technology has evolved dramatically to meet growing demands. This article explores the journey from index-based basic search engines to retrieval-based generation, examining how modern techniques are revolutionizing information access. The Foundation: Traditional Search Systems Traditional search systems were built on relatively simple principles: matching keywords and ranking results based on relevance, user signals, frequency, positioning, and many more. While effective for basic queries, these systems faced significant limitations. They struggled with understanding context, handling complex multi-part queries, resolving indirect references, performing nuanced reasoning, and providing user-specific personalization. These limitations became particularly apparent in enterprise settings, where information retrieval needs to be both precise and comprehensive. Python from collections import defaultdict import math class BasicSearchEngine: def __init__(self): self.index = defaultdict(list) self.document_freq = defaultdict(int) self.total_docs = 0 def add_document(self, doc_id, content): # Simple tokenization terms = content.lower().split() # Build inverted index for position, term in enumerate(terms): self.index[term].append((doc_id, position)) # Update document frequencies unique_terms = set(terms) for term in unique_terms: self.document_freq[term] += 1 self.total_docs += 1 def search(self, query): terms = query.lower().split() scores = defaultdict(float) for term in terms: if term in self.index: idf = math.log(self.total_docs / self.document_freq[term]) for doc_id, position in self.index[term]: tf = 1 # Simple TF scoring scores[doc_id] += tf * idf return sorted(scores.items(), key=lambda x: x[1], reverse=True) # Usage example search_engine = BasicSearchEngine() search_engine.add_document("doc1", "Traditional search systems use keywords") search_engine.add_document("doc2", "Modern systems employ advanced techniques") results = search_engine.search("search systems") Enterprise Search: Bridging the Gap Enterprise search introduced new complexities and requirements that consumer search engines weren't designed to handle. Organizations needed systems that could search across diverse data sources, respect complex access controls, understand domain-specific terminology, and maintain context across different document types. These challenges drove the development of more sophisticated retrieval techniques, setting the stage for the next evolution in search technology. The Paradigm Shift: From Document Retrieval to Answer Generation The landscape of information access underwent a dramatic transformation in early 2023 with the widespread adoption of large language models (LLMs) and the emergence of retrieval-augmented generation (RAG). Traditional search systems, which primarily focused on returning relevant documents, were no longer sufficient. Instead, organizations needed systems that could not only find relevant information but also provide it in a format that LLMs could effectively use to generate accurate, contextual responses. 
This shift was driven by several key developments: The emergence of powerful embedding models that could capture semantic meaning more effectively than keyword-based approaches The development of efficient vector databases that could store and query these embeddings at scale The recognition that LLMs, while powerful, needed accurate and relevant context to provide reliable responses The traditional retrieval problem thus evolved into an intelligent, contextual answer generation problem, where the goal wasn't just to find relevant documents, but to identify and extract the most pertinent pieces of information that could be used to augment LLM prompts. This new paradigm required rethinking how we chunk, store, and retrieve information, leading to the development of more sophisticated ingestion and retrieval techniques. Python import numpy as np from transformers import AutoTokenizer, AutoModel import torch class ModernRetrievalSystem: def __init__(self, model_name="sentence-transformers/all-MiniLM-L6-v2"): self.tokenizer = AutoTokenizer.from_pretrained(model_name) self.model = AutoModel.from_pretrained(model_name) self.document_store = {} def _get_embedding(self, text: str) -> np.ndarray: """Generate embedding for a text snippet""" inputs = self.tokenizer(text, return_tensors="pt", max_length=512, truncation=True, padding=True) with torch.no_grad(): outputs = self.model(**inputs) embedding = outputs.last_hidden_state[:, 0, :].numpy() return embedding[0] def chunk_document(self, text: str, chunk_size: int = 512) -> list: """Implement late chunking strategy""" # Get document-level embedding first doc_embedding = self._get_embedding(text) # Chunk the document words = text.split() chunks = [] current_chunk = [] current_length = 0 for word in words: word_length = len(self.tokenizer.encode(word)) if current_length + word_length > chunk_size: chunks.append(" ".join(current_chunk)) current_chunk = [word] current_length = word_length else: current_chunk.append(word) current_length += word_length if current_chunk: chunks.append(" ".join(current_chunk)) return chunks def add_document(self, doc_id: str, content: str): """Process and store document with context-aware chunking""" chunks = self.chunk_document(content) for i, chunk in enumerate(chunks): context = f"Document: {doc_id}, Chunk: {i+1}/{len(chunks)}" enriched_chunk = f"{context}\n\n{chunk}" embedding = self._get_embedding(enriched_chunk) self.document_store[f"{doc_id}_chunk_{i}"] = { "content": chunk, "context": context, "embedding": embedding } The Rise of Modern Retrieval Systems An Overview of Modern Retrieval Using Embedding Models Modern retrieval systems employ a two-phase approach to efficiently access relevant information. During the ingestion phase, documents are intelligently split into meaningful chunks, which preserve context and document structure. These chunks are then transformed into high-dimensional vector representations (embeddings) using neural models and stored in specialized vector databases. During retrieval, the system converts the user's query into an embedding using the same neural model and then searches the vector database for chunks whose embeddings have the highest cosine similarity to the query embedding. This similarity-based approach allows the system to find semantically relevant content even when exact keyword matches aren't present, making retrieval more robust and context-aware than traditional search methods. 
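The ModernRetrievalSystem snippet above covers ingestion but stops short of the retrieval step described in the previous paragraph. A minimal sketch of that step is shown below, written as an extra method attached to the same illustrative class and assuming the document_store and _get_embedding names from the snippet; it ranks stored chunks by cosine similarity to the query embedding.

Python

import numpy as np

def search(self, query: str, top_k: int = 5) -> list:
    """Return (chunk_id, similarity) pairs for the chunks closest to the query."""
    query_embedding = self._get_embedding(query)
    scores = []
    for chunk_id, record in self.document_store.items():
        doc_embedding = record["embedding"]
        # Cosine similarity between the query and the stored chunk embedding
        similarity = float(np.dot(query_embedding, doc_embedding) / (
            np.linalg.norm(query_embedding) * np.linalg.norm(doc_embedding) + 1e-10))
        scores.append((chunk_id, similarity))
    # Highest-scoring chunks first
    return sorted(scores, key=lambda x: x[1], reverse=True)[:top_k]

ModernRetrievalSystem.search = search  # attach to the class defined above

This also gives the recursive retriever shown later in this article a concrete search method to call.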
At the heart of these modern systems lies the critical process of document chunking and retrieval from embeddings, which has evolved significantly over time. Evolution of Document Ingestion The foundation of modern retrieval systems starts with document chunking — breaking down large documents into manageable pieces. This critical process has evolved from basic approaches to more sophisticated techniques: Traditional Chunking Document chunking began with two fundamental approaches: Fixed-size chunking. Documents are split into chunks of exactly specified token length (e.g., 256 or 512 tokens), with configurable overlap between consecutive chunks to maintain context. This straightforward approach ensures consistent chunk sizes but may break natural textual units. Semantic chunking. A more sophisticated approach that respects natural language boundaries while maintaining approximate chunk sizes. This method analyzes the semantic coherence between sentences and paragraphs to create more meaningful chunks Drawbacks of Traditional Chunking Consider an academic research paper split into 512-token chunks. The abstract might be split midway into two chunks, disconnecting the context of its introduction and conclusions. A retrieval model would struggle to identify the abstract as a cohesive unit, potentially missing the paper’s central theme. In contrast, semantic chunking may keep the abstract intact but might struggle with other sections, such as cross-referencing between the discussion and conclusion. These sections might end up in separate chunks, and the links between them could still be missed. Late Chunking: A Revolutionary Approach Legal documents, such as contracts, frequently contain references to clauses defined in other sections. Consider a 50-page employment contract where Section 2 states, 'The Employee shall be subject to the non-compete obligations detailed in Schedule A' while Schedule A, appearing 40 pages later, contains the actual restrictions like 'may not work for competing firms within 100 miles.' If someone searches for 'what are the non-compete restrictions?', traditional chunking that processes sections separately would likely miss this connection — the chunk with Section 2 lacks the actual restrictions, while the Schedule A chunk lacks the context that these are employee obligations Traditional chunking methods would likely split these references across chunks, making it difficult for retrieval models to maintain context. Late chunking, by embedding the entire document first, captures these cross-references seamlessly, enabling precise extraction of relevant clauses during a legal search. Late chunking represents a significant advancement in how we process documents for retrieval. 
Unlike traditional methods that chunk documents before processing, late chunking: First, processes the entire document through a long context embedding model Creates embeddings that capture the full document context Only then applies chunking boundaries to create final chunk representations This approach offers several advantages: Preserves long-range dependencies between different parts of the document Maintains context across chunk boundaries Improves handling of references and contextual elements Late chunking is particularly effective when combined with reranking strategies, where it has been shown to reduce retrieval failure rates by up to 49% Contextual Enablement: Adding Intelligence to Chunks Consider a 30-page annual financial report where critical information is distributed across different sections. The Executive Summary might mention "ACMECorp achieved significant growth in the APAC region," while the Regional Performance section states, "Revenue grew by 45% year-over-year," the Risk Factors section notes, "Currency fluctuations impacted reported earnings," and the Footnotes clarify "All APAC growth figures are reported in constant currency, excluding the acquisition of TechFirst Ltd." Now, imagine a query like "What was ACME's organic revenue growth in APAC?" A basic chunking system might return just the "45% year-over-year" chunk because it matches "revenue" and "growth." However, this would be misleading as it fails to capture critical context spread across the document: that this growth number includes an acquisition, that currency adjustments were made, and that the number is specifically for APAC. A single chunk in isolation could lead to incorrect conclusions or decisions — someone might cite the 45% as organic growth in investor presentations when, in reality, a significant portion came from M&A activity. One of the major limitations of basic chunking is the loss of context. This method aims to solve that context problem by adding relevant context to each chunk before processing. The process works by: Analyzing the original document to understand the broader context Generating concise, chunk-specific context (typically 50-100 tokens) Prepending this context to each chunk before creating embeddings Using both semantic embeddings and lexical matching (BM25) for retrieval This technique has shown impressive results, reducing retrieval failure rates by up to 49% in some implementations. Evolution of Retrieval Retrieval methods have seen dramatic advancement from simple keyword matching to today's sophisticated neural approaches. Early systems like BM25 relied on statistical term-frequency methods, matching query terms to documents based on word overlap and importance weights. The rise of deep learning brought dense retrieval methods like DPR (Dense Passage Retriever), which could capture semantic relationships by encoding both queries and documents into vector spaces. This enabled matching based on meaning rather than just lexical overlap. More recent innovations have pushed retrieval capabilities further. Hybrid approaches combining sparse (BM25) and dense retrievers help capture both exact matches and semantic similarity. The introduction of cross-encoders allowed for more nuanced relevance scoring by analyzing query-document pairs together rather than independently. With the emergence of large language models, retrieval systems gained the ability to understand and reason about content in increasingly sophisticated ways. 
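To make the contextual enablement idea described earlier in this section concrete, here is a minimal sketch of chunk enrichment. The generate_context helper is hypothetical and stands in for whatever summarization or LLM call produces the 50-100 token context; the enriched text then feeds both the embedding model and the BM25 index.

Python

def enrich_chunks(doc_id: str, document: str, chunks: list) -> list:
    """Prepend a short, document-aware context to each chunk before indexing."""
    enriched = []
    for i, chunk in enumerate(chunks):
        # Hypothetical helper: returns a 50-100 token summary situating this
        # chunk within the overall document (e.g., via an LLM call)
        context = generate_context(document, chunk)
        enriched_text = f"{context}\n\n{chunk}"
        enriched.append({
            "id": f"{doc_id}_chunk_{i}",
            "content": chunk,
            "context": context,
            "text_for_embedding": enriched_text,                # dense / semantic index
            "tokens_for_bm25": enriched_text.lower().split(),   # lexical (BM25) index
        })
    return enriched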
Recursive Retrieval: Understanding Relationships Recursive retrieval advances the concept further by exploring relationships between different pieces of content. Instead of treating each chunk as an independent unit, it recognizes that chunks often have meaningful relationships with other chunks or structured data sources. Consider a real-world example of a developer searching for help with a memory leak in a Node.js application: 1. Initial Query "Memory leak in Express.js server handling file uploads." The system first retrieves high-level bug report summaries with similar symptoms A matching bug summary describes: "Memory usage grows continuously when processing multiple file uploads" 2. First Level Recursion From this summary, the system follows relationships to: Detailed error logs showing memory patterns Similar bug reports with memory profiling data Discussion threads about file upload memory management 3. Second Level Recursion Following the technical discussions, the system retrieves: Code snippets showing proper stream handling in file uploads Memory leak fixes in similar scenarios Relevant middleware configurations 4. Final Level Recursion For implementation, it retrieves: Actual code commits diffs that fixed similar issues Unit tests validating the fixes Performance benchmarks before and after fixes At each level, the retrieval becomes more specific and technical, following the natural progression from problem description to solution implementation. This layered approach helps developers not only find solutions but also understand the underlying causes and verification methods. This example demonstrates how recursive retrieval can create a comprehensive view of a problem and its solution by traversing relationships between different types of content. Other applications might include: A high-level overview chunk linking to detailed implementation chunks A summary chunk referencing an underlying database table A concept explanation connecting to related code examples During retrieval, the system not only finds the most relevant chunks but also explores these relationships to gather comprehensive context. A Special Case of Recursive Retrieval Hierarchical chunking represents a specialized implementation of recursive retrieval, where chunks are organized in a parent-child relationship. 
The system maintains multiple levels of chunks: Parent chunks – larger pieces providing a broader context Child chunks – smaller, more focused pieces of content The beauty of this approach lies in its flexibility during retrieval: Initial searches can target precise child chunks The system can then "zoom out" to include parent chunks for additional context Overlap between chunks can be carefully managed at each level

Python

import networkx as nx
from typing import Set, Dict, List

class RecursiveRetriever:
    def __init__(self, base_retriever):
        self.base_retriever = base_retriever
        self.relationship_graph = nx.DiGraph()

    def add_relationship(self, source_id: str, target_id: str, relationship_type: str):
        """Add a relationship between chunks"""
        self.relationship_graph.add_edge(source_id, target_id,
                                         relationship_type=relationship_type)

    def _get_related_documents(self, doc_id: str, visited: Set[str]) -> List[str]:
        """Follow outgoing relationship edges from a chunk, skipping already-visited chunks"""
        if doc_id not in self.relationship_graph:
            return []
        return [target for target in self.relationship_graph.successors(doc_id)
                if target not in visited]

    def recursive_search(self, query: str, max_depth: int = 2) -> Dict[str, List[str]]:
        """Perform recursive retrieval"""
        results = {}
        visited = set()

        # Get initial results
        initial_results = self.base_retriever.search(query)
        first_level_ids = [doc_id for doc_id, _ in initial_results]
        results["level_0"] = first_level_ids
        visited.update(first_level_ids)

        # Recursively explore relationships
        for depth in range(max_depth):
            current_level_results = []
            for doc_id in results[f"level_{depth}"]:
                related_docs = self._get_related_documents(doc_id, visited)
                current_level_results.extend(related_docs)
                visited.update(related_docs)
            if current_level_results:
                results[f"level_{depth + 1}"] = current_level_results
            else:
                break
        return results

# Usage example
retriever = ModernRetrievalSystem()
recursive = RecursiveRetriever(retriever)

# Add relationships
recursive.add_relationship("doc1_chunk_0", "doc2_chunk_0", "related_concept")
results = recursive.recursive_search("modern retrieval techniques")

Putting It All Together: Modern Retrieval Architecture Modern retrieval systems often combine multiple techniques to achieve optimal results. A typical architecture might: Use hierarchical chunking to maintain document structure Apply contextual embeddings to preserve semantic meaning Implement recursive retrieval to explore relationships Employ reranking to fine-tune results This combination can reduce retrieval failure rates by up to 67% compared to basic approaches. Multi-Modal Retrieval: Beyond Text As organizations increasingly deal with diverse content types, retrieval systems have evolved to handle multi-modal data effectively. The challenge extends beyond simple text processing to understanding and connecting information across images, audio, and video formats. The Multi-Modal Challenge Multi-modal retrieval faces two fundamental challenges: 1. Modality-Specific Complexity Each type of content presents unique challenges. Images, for instance, can range from simple photographs to complex technical diagrams, each requiring different processing approaches. A chart or graph might contain dense information that requires specialized understanding. 2. Cross-Modal Understanding Perhaps the most significant challenge is understanding relationships between different modalities. How does an image relate to its surrounding text? How can we connect a technical diagram with its explanation? These relationships are crucial for accurate retrieval. Solutions and Approaches Modern systems address these challenges through three main approaches: 1. 
Unified Embedding Space Uses models like CLIP to encode all content types in a single vector space Enables direct comparison between different modalities Simplifies retrieval but may sacrifice some nuanced understanding 2. Text-Centric Transformation Converts all content into text representations Leverages advanced language models for understanding Works well for text-heavy applications but may lose modal-specific details 3. Hybrid Processing Maintains specialized processing for each modality Uses sophisticated reranking to combine results Achieves better accuracy at the cost of increased complexity The choice of approach depends heavily on specific use cases and requirements, with many systems employing a combination of techniques to achieve optimal results. Looking Forward: The Future of Retrieval As AI and machine learning continue to advance, retrieval systems are becoming increasingly sophisticated. Future developments might include: More nuanced understanding of document structure and relationships Better handling of multi-modal content (text, images, video) Improved context preservation across different types of content More efficient processing of larger knowledge bases Conclusion The evolution from basic retrieval to answer generation systems reflects our growing need for more intelligent information access. Organizations can build more effective knowledge management systems by understanding and implementing techniques like contextual retrieval, recursive retrieval, and hierarchical chunking. As these technologies continue to evolve, we can expect even more sophisticated approaches to emerge, further improving our ability to find and utilize information effectively.
When developing a product, issues inevitably arise that can impact both its performance and stability. Slow system response times, error rate increases, bugs, and failed updates can all damage the reputation and efficiency of your project. However, before addressing these problems, it is essential to gather and analyze statistics on their occurrence. This data will help you make informed decisions regarding refactoring, optimization, and error-fixing strategies. Step 1: Performance Analysis Performance is a crucial metric that directly affects user experience. To improve it, the first step is to regularly track its key indicators: Monitor server response time. Measure and track response time variations based on time of day, server load, or system changes. Track memory and resource consumption. Regular monitoring helps identify issues early. By analyzing this data, you can assess the quality of releases and patches, detect memory leaks, and plan for hardware upgrades or refactoring. Analyze SQL queries. Gather statistics on the slowest queries and their frequency. There are numerous ways to collect these data points. Various comprehensive Application Performance Monitoring (APM) systems, such as New Relic, Datadog, Dynatrace, AppSignal, and Elastic APM, provide deep insights into your applications. These tools help identify performance bottlenecks, troubleshoot issues, and optimize services by profiling applications and tracking performance at specific code segments. However, these solutions are often paid services; they can be complex to configure or just overkill, especially for smaller teams. Datadog web interface A lighter-weight starting point for SQL analysis is MySQL's built-in slow query log, which can be enabled with a few configuration settings:

Plain Text

slow_query_log = 1                                    # Enables logging
long_query_time = 20                                  # Defines slow query threshold (seconds)
slow_query_log_file = /var/log/mysql/slow-query.log   # Log file location
log-queries-not-using-indexes = 1                     # Log queries without indexes

You can then view the log using:

Shell

tail -f /var/log/mysql/slow-query.log

Or:

Shell

mysqldumpslow /var/log/mysql/slow-query.log

Then, you can analyze the query plan in a conventional way using tools like EXPLAIN (EXPLAIN ANALYZE), etc. By the way, for a better understanding of EXPLAIN results, you can use services to visualize problem areas, such as https://explain.dalibo.com or https://explain.depesz.com. explain.dalibo query plan visualization For real-time server monitoring, Zabbix collects data on memory, CPU, disk usage, network activity, and other critical resources. It can be easily deployed on any operating system and supports push data collection models, auto-registration of agents, custom alerts, and data visualization. Zabbix web interface Another powerful alternative is the Grafana + Prometheus combination. Prometheus collects metrics from servers, applications, databases, and other sources using exporters, stores these metrics in the database, and provides access via the powerful PromQL query language. Grafana connects to Prometheus (and other data sources), allowing the creation of graphs, dashboards, alerts, and reports with an intuitive interface for visualization and filtering. Notably, there are already hundreds of pre-built Prometheus exporters for metric collection, such as: node_exporter, mysql_exporter, nginx_exporter. Grafana dashboard Step 2: Debugging Project Issues Bugs are inevitable, so it is essential not just to fix them, but also to properly track and analyze their causes and how long they take to fix. 
Every new bug or defect should be logged in a task management system, with a corresponding ticket for resolution. This allows for: Correlating bug frequency with product version releases.Measuring bug resolution time.Evaluating debugging efficiency for future planning. If you use Jira, the Jira Dashboards feature provides filtered statistics using JQL (Jira Query Language). The Created vs Resolved Chart offers a clear visualization of bug trends. But before you can analyze fixing times, you should first set up tools for error aggregation and prioritization. ELK Stack (Elasticsearch, Logstash, Kibana) is a popular standard, allowing log collection from multiple sources and storing them in Elasticsearch for deep analysis with Kibana. Kibana web interface However, the ELK Stack is not the only solution. You can also use Grafana Loki. Loki is easier to configure, integrates seamlessly with Grafana for visualization, and is ideal for projects that require a lightweight and user-friendly log management solution. A great approach is to set up error notifications. For example, if a previously unseen error occurs or the number of such errors exceeds a set threshold, the system can notify developers. In some teams, a ticket is automatically created in a task tracker for further investigation and resolution. This helps reduce response time for critical bugs and ensures project stability, especially during frequent releases and updates. Another popular error-tracking tool worth mentioning is Sentry. It easily integrates with any application or web server, allowing log collection from various sources for in-depth analysis. Key features include: Tracking error occurrences and their frequency. Configurable alerts based on specific rules (e.g., sending notifications to a messenger or email). Flexible integrations with task management systems (e.g., automatic bug task creation in Jira). Sentry web interface APM systems such as Datadog or New Relic (mentioned earlier) also provide tools for error analysis. If you're already using an APM solution, it might be a suitable choice for your needs. Finally, user feedback should not be overlooked. Users may report issues that automated systems fail to detect but significantly impact their experience. Since most systems are developed for users, collecting and analyzing their feedback is an invaluable data source that should never be ignored. Step 3: Collecting Product Metrics During both the development and usage stages, issues don’t always manifest directly as bugs or errors. Sometimes, they appear through changes in product metrics. For example, a minor bug or hidden error might lead to a drop in sales, reduced user session duration, or an increase in bounce rates. Such changes can go unnoticed if product metrics are not actively monitored. This is why collecting and tracking product metrics is a crucial part of any project. Metrics help detect problems before they result in significant financial losses and serve as an early warning system for necessary analysis, changes, or optimizations. The specific product metrics to track will vary depending on the type of project, but some are common across industries. 
These are key examples: User Engagement Metrics Average time spent on the website or in the app Number of active users (DAU – Daily Active Users, MAU – Monthly Active Users) Retention rate – how often users return Financial Metrics Number of sales or subscriptions Average revenue per user (ARPU) Conversion rate – the percentage of users who complete a target action User Acquisition Metrics Advertising campaign effectiveness Bounce rate – percentage of users who leave without interaction Conversion rates from different traffic sources (SEO, social media, email marketing) Each metric should be aligned with business goals. For example, an e-commerce store prioritizes purchase conversion rates, while a media platform focuses on average content watch time. Context Matters When analyzing any metrics (whether technical or product-related), always take external factors into account. Weekends, holidays, marketing campaigns, and seasonal activity spikes all influence system performance and the statistics you collect. Compare data across different time frames: year-over-year, week-over-week, or day-to-day. If your project operates internationally, consider regional differences – local holidays, cultural variations, and user habits can significantly impact results. The effectiveness of changes can vary greatly depending on the audience. For example, performance improvements in one region may have little impact on metrics in another. Conclusion Almost no serious issue can be identified without collecting large amounts of data. Regular monitoring, careful analysis, and consideration of context will help your product grow and evolve under any circumstances. However, keep in mind that collecting excessive data can hinder your analysis rather than help. Focus on gathering only the most relevant metrics and indicators for your specific project.
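To make the engagement and conversion definitions above concrete, here is a toy sketch of computing DAU and conversion rate from a raw event log. The data and field names are hypothetical; in practice this calculation would live in an analytics platform or a SQL warehouse. Python
from datetime import date

# Hypothetical event log: (user_id, event_type, day)
events = [
    (1, "visit", date(2025, 2, 1)), (2, "visit", date(2025, 2, 1)),
    (1, "purchase", date(2025, 2, 1)), (3, "visit", date(2025, 2, 2)),
    (1, "visit", date(2025, 2, 2)),
]

def dau(day: date) -> int:
    """Daily Active Users: distinct users with any event on the given day."""
    return len({user for user, _, d in events if d == day})

def conversion_rate(day: date) -> float:
    """Share of that day's visitors who completed the target action (purchase)."""
    visitors = {user for user, kind, d in events if d == day and kind == "visit"}
    buyers = {user for user, kind, d in events if d == day and kind == "purchase"}
    return len(buyers & visitors) / len(visitors) if visitors else 0.0

print(dau(date(2025, 2, 1)))              # 2 active users
print(conversion_rate(date(2025, 2, 1)))  # 0.5 -> 50% of visitors purchased
The same pattern extends to MAU, retention, and ARPU; what matters is agreeing on the event definitions and keeping them stable over time.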
In recent years, cloud-native applications have become the go-to standard for many businesses to build scalable applications. Among the many advancements in cloud technologies, serverless architectures stand out as a transformative approach. Ease-of-use and efficiency are the two most desirable properties for modern application development, and serverless architectures offer these. This has made serverless the game changer for both the cloud providers and the consumers. For companies that are looking to build applications with this approach, major cloud providers offer several serverless solutions. In this article, we will explore the features, benefits, and challenges of this architecture, along with use cases. In this article, I used AWS as an example to explore the concepts, but the same concepts are applicable across all major cloud providers. Serverless Serverless does not mean there are no servers. It simply means that the underlying infrastructure for those services is managed by the cloud providers. This allows the architects and developers to design and build the applications without worrying about managing the infrastructure. It is similar to using the ride-sharing app Uber: when you need a ride, you don’t worry about owning or maintaining a car. Uber handles all that, and you just focus on getting where you need to go by paying for the ride. Serverless architectures offer many benefits that make them suitable and attractive for many use cases. Here are some of the key advantages: Auto Scaling One of the biggest advantages of serverless architecture is that it inherently supports scaling. Cloud providers handle the heavy lifting to offer near-infinite, out-of-the-box scalability. For instance, if an app built using Serverless technologies suddenly gains popularity, the tools or services automatically scale to meet the app’s needs. We don’t have to wake up in the middle of the night to add the servers or other resources. Focus on Innovation Since you are no longer burdened with managing servers, you can instead focus on building the application, adding features towards app’s growth. For any organization, whether small, medium, or large, this approach helps in concentrating on what truly matters — business growth. Cost Efficiency With traditional server models, you often end up paying for unused resources as they are bought upfront and managed even when they are not in use. Serverless changes this by switching to a pay-as-you-use model. In most of the scenarios, you only pay for the resources that you actually use. If the app you build doesn’t get traction right away, your costs will be minimal, like paying for a single session instead of an entire year. As the app’s traffic grows, the cost will grow accordingly. Faster Time-to-Market With serverless frameworks, you can build and deploy applications much faster compared to traditional server models. When the app is ready, it can be deployed with minimal effort using serverless resources. Instead of spending time on server management, you can focus on development and adding new features, shipping them at a faster pace. Reduced Operational Maintenance Since cloud providers manage the infrastructure, the consumers need not worry about provisioning, maintaining, scaling, or handling security patches and vulnerabilities. Serverless frameworks offer flexibility and can be applied to a variety of use cases. 
Whether it is building web applications or processing real-time data, they provide the scalability and efficiency needed for these use cases. Building Web Service APIs With AWS Serverless Now that we have discussed the benefits of serverless architectures, let us dive into some practical examples. In this section, we will create a simple backend web application using AWS serverless resources. The above backend application design contains three layers to provide APIs for a web application. Once deployed on AWS, the gateway endpoint is available for API consumption. When the APIs are called by the users, the requests are routed through the API gateway to appropriate lambda functions. For each API request, Lambda function gets triggered, and it accesses the DynamoDB to store and retrieve data. This design is a streamlined, cost-effective solution that scales automatically as demand grows, making it an ideal choice for building APIs with minimal overhead. The components in this design integrate well with each other providing flexibility. There are two major components in this architecture — computing and storage. Serverless Computing Serverless computing changed the way cloud-native applications and services are built and deployed. It promises a real pay-as-you-go model with millisecond-level granularity without wasting any resources. Due to its simplicity and economic advantages, this approach gained popularity, and many cloud providers support these capabilities. The simplest way to use serverless computing is by providing code to be executed by the platform on demand. This approach led to the rise of Function-as-a-service (FaaS) platforms focused on allowing small pieces of code represented as functions to run for a limited amount of time. The functions are triggered by events like HTTP requests, storage changes, messages, or notifications. As these functions are invoked and stopped when the code execution is complete, they don’t keep any persistent state. To maintain the state or persist the data, they use services like DynamoDB which provide durable storage capabilities. AWS Lambda is capable of scaling as per the demand. For example, AWS Lambda processed more than 1.3 trillion invocations on Prime Day 2024. Such capabilities are crucial in handling the sudden spurts of traffic. Serverless Storage In the serverless computing ecosystem, serverless storage refers to cloud-based storage solutions that scale automatically without having the consumers manage the infrastructure. These services offer many capabilities, including on-demand scalability, high availability, and pay-as-you-go. For instance, DynamoDB is a fully managed, serverless NoSQL database designed to handle key-value and document data models. It is purpose-built for applications requiring consistent performance at any scale, offering single-digit millisecond latency. It also provides seamless integration capabilities with many other services. Major cloud providers offer numerous serverless storage options for specific needs, such as S3, ElastiCache, Aurora, and many more. Other Use Cases In the previous section, we discussed how to leverage serverless architecture to build backend APIs for a web application. There are several other use cases that can benefit from serverless architecture. A few of those use cases include: Data Processing Let’s explore another example of how serverless architecture can be used to notify services based on data changes in a datastore. 
For instance, in an e-commerce platform, let’s say on the creation of an order, several services need to be informed. Within the AWS ecosystem, the order can be stored in DynamoDB upon creation. In order to notify other services, multiple events can be triggered based on this storage event. Using DynamoDB Streams, a Lambda function can be invoked when this event occurs. This lambda function can then push the change event to SNS (Simple Notification Service). SNS acts as the notification service to notify several other services that are interested in these events. Real-Time File Processing In many applications, users upload images that need to be stored, processed for resizing, converted to different formats, and analyzed. We can achieve this functionality using AWS serverless architecture in the following way. When an image is uploaded, it is pushed to an S3 bucket configured to trigger an event to invoke a Lambda function. The Lambda function can process the image, store metadata in DynamoDB, and store resized images in another S3 bucket. This scalable architecture can be used to process millions of images without requiring to manage any infrastructure or any manual intervention. Challenges Serverless architectures offer many benefits, but they also bring certain challenges that need to be addressed. Cold Start When a serverless function is invoked, the platform needs to create, initialize, and run a new container to execute the code. This process, known as cold start, can introduce additional latency in the workflow. Techniques like keeping functions warm or using provisioned concurrency can help reduce this delay. Monitoring and Debugging As there can be a large number of invocations, monitoring and debugging can become complex. It can be challenging to identify and debug issues in applications that are heavily used. Configuring tools like AWS Cloudwatch for metrics, logs, and alerts is highly recommended to address these issues. Although serverless architectures scale automatically, the resource configurations must be optimized to prevent bottlenecks. Proper resource allocation and implementation of cost optimization strategies are essential. Conclusion The serverless architecture is a major step towards the development of cloud-native applications backed by serverless computing and storage. It is heavily used in many types of applications, including event-driven workflows, data processing, file processing, and big data analytics. Due to its scalability, agility, and high availability, serverless architecture has become a reliable choice for businesses of all sizes.
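As a concrete footnote to the API Gateway + Lambda + DynamoDB design discussed above, here is a minimal sketch of what such a Lambda handler might look like, assuming Python, boto3, and a hypothetical orders table. It is an illustration under those assumptions, not a reference implementation. Python
import json
import boto3

# Hypothetical table name; the table itself is created separately (e.g., via IaC).
table = boto3.resource("dynamodb").Table("orders")

def lambda_handler(event, context):
    """Handles API Gateway proxy requests: POST creates an order, GET fetches one."""
    method = event.get("httpMethod", "GET")

    if method == "POST":
        # Note: DynamoDB expects Decimal (not float) for numeric attributes.
        order = json.loads(event.get("body") or "{}")
        table.put_item(Item=order)
        return {"statusCode": 201, "body": json.dumps({"orderId": order.get("orderId")})}

    order_id = (event.get("pathParameters") or {}).get("orderId")
    result = table.get_item(Key={"orderId": order_id})
    item = result.get("Item")
    status = 200 if item else 404
    # default=str handles DynamoDB's Decimal values during JSON serialization.
    return {"statusCode": status, "body": json.dumps(item or {"error": "not found"}, default=str)}
API Gateway invokes the handler per request and DynamoDB persists the data, so there is no infrastructure to provision or patch.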
Are you a software developer looking to accelerate your career, enhance your skills, and expand your professional network? If so, contributing to an open-source project in 2025 might be your best decision. Open source is more than just a technical exercise; it’s a gateway to learning from industry experts, mastering new technologies, and creating a lasting impact on the developer community. Over the years, one of the most common career-related questions I have encountered is: Why should I participate in an open-source project? With 2025 upon us, this question remains as relevant as ever. In this article, I will explore the reasons for engaging in open source, explain how to get started, and highlight some projects to consider contributing to this year. Why Participate in an Open-Source Project? Using Simon Sinek’s Golden Circle, let’s start with the fundamental question: Why? Participating in an open-source project is one of the best ways to enhance both hard and soft skills as a software engineer. Here’s how: Hard Skills Learn how to write better code by collaborating with some of the best developers in the industry.Gain experience with cutting-edge technologies, such as the latest Java versions, Hibernate best practices, and JVM internals.Expand your knowledge of software design patterns, architecture styles, and problem-solving approaches professionals use worldwide. Soft Skills Improve your communication skills in written discussions (PR reviews, documentation) and real-time interactions. Additionally, enhance your verbal communication skills by participating in meetings, discussions, and presentations, which will help you become more confident in explaining technical concepts to a broader audience.Develop negotiation and persuasion skills when proposing changes or advocating for new features.Expand your professional network, allowing more people to recognize your contributions and capabilities. When you contribute to open source, you distinguish yourself from the vast number of software engineers who use the software. Only a tiny percentage build and maintain these projects. A track record of contributions adds credibility to your resume and LinkedIn profile, making you stand out in the job market. How to Get Started in Open Source A common misconception is that contributing to open source is complicated or reserved for experts. This is not true. Anyone can start contributing by following a structured approach. Here are five steps to begin your open-source journey: 1. Choose a Project Select a project that aligns with your interests or career goals. To become a database expert, contribute to an open-source database. If you want to improve your API development skills, work on frameworks related to API design. Since open-source contributions often start as a hobby in your free time, ensure that the project provides valuable learning opportunities and supports your career aspirations. 2. Join the Team Communication Channels Once you have selected a project, join the community. Open-source projects use various communication channels such as Slack, Discord, mailing lists, or forums. Introduce yourself and observe discussions, pull requests, and issue tracking to understand how the community operates. 3. Read the Documentation Documentation is the bridge between you, as a contributor, and the project maintainers. Many developers rely on tutorials, blog posts, and YouTube videos, but reading the official documentation gives you a deeper understanding of how the project works. 
This also helps you identify documentation gaps that you can improve later. 4. Start with Tests, Documentation, and Refactoring Before jumping into feature development, focus on tasks that are valuable but often overlooked, such as: Improving documentation clarity.Writing tests to increase code coverage.Refactoring legacy code to align with modern Java features (e.g., replacing Java 5 code with Java 17 constructs like Streams and Lambdas). These contributions are always welcome, and since they are difficult to reject, they serve as a great entry point into any project. 5. Propose Enhancements and New Features Once you have built credibility within the project by handling documentation, testing, and refactoring tasks, you can propose enhancements and new features. Many developers start by suggesting new features immediately, but without familiarity with the project's goals and context, such proposals may be disregarded. Establishing yourself first as a reliable contributor makes it easier for your ideas to be accepted and integrated into the project. Open-Source Projects to Contribute to in 2025 If you are looking for projects to contribute to this year, consider well-established ones under foundations like Eclipse and Apache, as well as other impactful open-source projects: Jakarta Data – for those interested in Java persistence and data accessJakarta NoSQL – ideal for developers exploring NoSQL databases with Jakarta EEEclipse JNoSQL – a great entry point for those working with NoSQL in JavaWeld – a core implementation of CDI (Contexts and Dependency Injection)Spring Framework – one of the most widely used frameworks in Java developmentQuarkus – a Kubernetes-native Java stack tailored for GraalVM and cloud-native applicationsOracle NoSQL – a high-performance distributed NoSQL database for enterprise applicationsMongoDB – a widely-used NoSQL document database for modern applications Conclusion In this article, I explained why participating in open source is beneficial, how to start contributing, and which projects to consider in 2025. Contrary to popular belief, contributing is not difficult — it simply requires time, discipline, and consistency. I have been contributing to open source for over a decade, and chances are, you are already using some of the projects I have worked on. I hope this guide helps you get started, and I look forward to seeing you on a mailing list or a pull request soon! Video
There are many situations where you may need to export data from XML to MongoDB. Despite the fact that XML and JSON(B) formats used in MongoDB have much in common, they also have a number of differences that make them non-interchangeable. Therefore, before you face the task of exporting data from XML to MongoDB, you will need to: Write your own XML parsing scripts;Use ETL tools. Although modern language models can write parsing scripts quite well in languages like Python, these scripts will have a serious problem — they won't be unified. For each file type, modern language models will generate a separate script. If you have more than one type of XML, this already creates significant problems in maintaining more than one parsing script. The above problem is usually solved using specialized ETL tools. In this article, we will look at an ETL tool called SmartXML. Although SmartXML also supports converting XML to a relational representation we will only look at the process of uploading XML into MongoDB. The actual XML can be extremely large and complex. This article is an introductory article, so we will dissect a situation in which: All XML has the same structure;The logical model of the XML is the same as the storage model in MongoDB;Extracted fields don't need complex processing; We'll cover those cases later, but first, let's examine a simple example: XML <marketingData> <customer> <name>John Smith</name> <email>john.smith@example.com</email> <purchases> <purchase> <product>Smartphone</product> <category>Electronics</category> <price>700</price> <store>TechWorld</store> <location>New York</location> <purchaseDate>2025-01-10</purchaseDate> </purchase> <purchase> <product>Wireless Earbuds</product> <category>Audio</category> <price>150</price> <store>GadgetStore</store> <location>New York</location> <purchaseDate>2025-01-11</purchaseDate> </purchase> </purchases> <importantInfo> <loyaltyStatus>Gold</loyaltyStatus> <age>34</age> <gender>Male</gender> <membershipID>123456</membershipID> </importantInfo> <lessImportantInfo> <browser>Chrome</browser> <deviceType>Mobile</deviceType> <newsletterSubscribed>true</newsletterSubscribed> </lessImportantInfo> </customer> <customer> <name>Jane Doe</name> <email>jane.doe@example.com</email> <purchases> <purchase> <product>Laptop</product> <category>Electronics</category> <price>1200</price> <store>GadgetStore</store> <location>San Francisco</location> <purchaseDate>2025-01-12</purchaseDate> </purchase> <purchase> <product>USB-C Adapter</product> <category>Accessories</category> <price>30</price> <store>TechWorld</store> <location>San Francisco</location> <purchaseDate>2025-01-13</purchaseDate> </purchase> <purchase> <product>Keyboard</product> <category>Accessories</category> <price>80</price> <store>OfficeMart</store> <location>San Francisco</location> <purchaseDate>2025-01-14</purchaseDate> </purchase> </purchases> <importantInfo> <loyaltyStatus>Silver</loyaltyStatus> <age>28</age> <gender>Female</gender> <membershipID>654321</membershipID> </importantInfo> <lessImportantInfo> <browser>Safari</browser> <deviceType>Desktop</deviceType> <newsletterSubscribed>false</newsletterSubscribed> </lessImportantInfo> </customer> <customer> <name>Michael Johnson</name> <email>michael.johnson@example.com</email> <purchases> <purchase> <product>Headphones</product> <category>Audio</category> <price>150</price> <store>AudioZone</store> <location>Chicago</location> <purchaseDate>2025-01-05</purchaseDate> </purchase> </purchases> <importantInfo> 
<loyaltyStatus>Bronze</loyaltyStatus> <age>40</age> <gender>Male</gender> <membershipID>789012</membershipID> </importantInfo> <lessImportantInfo> <browser>Firefox</browser> <deviceType>Tablet</deviceType> <newsletterSubscribed>true</newsletterSubscribed> </lessImportantInfo> </customer> <customer> <name>Emily Davis</name> <email>emily.davis@example.com</email> <purchases> <purchase> <product>Running Shoes</product> <category>Sportswear</category> <price>120</price> <store>FitShop</store> <location>Los Angeles</location> <purchaseDate>2025-01-08</purchaseDate> </purchase> <purchase> <product>Yoga Mat</product> <category>Sportswear</category> <price>40</price> <store>FitShop</store> <location>Los Angeles</location> <purchaseDate>2025-01-09</purchaseDate> </purchase> </purchases> <importantInfo> <loyaltyStatus>Gold</loyaltyStatus> <age>25</age> <gender>Female</gender> <membershipID>234567</membershipID> </importantInfo> <lessImportantInfo> <browser>Edge</browser> <deviceType>Mobile</deviceType> <newsletterSubscribed>false</newsletterSubscribed> </lessImportantInfo> </customer> <customer> <name>Robert Brown</name> <email>robert.brown@example.com</email> <purchases> <purchase> <product>Smartwatch</product> <category>Wearable</category> <price>250</price> <store>GadgetPlanet</store> <location>Boston</location> <purchaseDate>2025-01-07</purchaseDate> </purchase> <purchase> <product>Fitness Band</product> <category>Wearable</category> <price>100</price> <store>HealthMart</store> <location>Boston</location> <purchaseDate>2025-01-08</purchaseDate> </purchase> </purchases> <importantInfo> <loyaltyStatus>Silver</loyaltyStatus> <age>37</age> <gender>Male</gender> <membershipID>345678</membershipID> </importantInfo> <lessImportantInfo> <browser>Chrome</browser> <deviceType>Mobile</deviceType> <newsletterSubscribed>true</newsletterSubscribed> </lessImportantInfo> </customer> </marketingData> In this example, we will upload in the MongoDB only the fields that serve a practical purpose, rather than the entire XML. Create a New Project It is recommended to create a new project from the GUI. This will automatically create the necessary folder structure and parsing rules. A full description of the project structure can be found in the official documentation. All parameters described in this article can be configured in graphical mode, but for clarity, we will focus on the textual representation. In addition to the config.txt file with project settings, job.txt for batch work, the project itself consists of: Template of intermediate internal SmartDOM view, located in the project folder templates/data-templates.red.Rules for processing and transformation of SmartDOM itself, located in the rules folder. Let's consider the structure of data-templates.red: Plain Text #[ sample: #[ marketing_data: #[ customers: [ customer: [ name: none email: none purchases: [ purchase: [ product: none category: none price: none store: none location: none purchase_date: none ] ] ] ] ] ] ] Note The name sample is the name of the category, and it doesn't matter.The marketing_data is the name of the subcategory. We need at least one code subcategory (subtype).The intermediate view names don't require exact matches with XML tag names. In this example, we intentionally used the snake_case style. Extract Rules The rules are located in the rules directory in the project folder. 
When working with MongoDB we will only be interested in two rules: tags-matching-rules.red — sets the matches between the XML tag tree and SmartDOMgrow-rules.red — describes the relationship between SmartDOM nodes and real XML nodes Plain Text sample: [ purchase: ["purchase"] customer: ["customer"] ] The key will be the name of the node in SmartDOM; the value will be an array containing the node spelling variants from the real XML file. In our example, these names are the same. Ignored Tags To avoid loading minor data into MongoDB in the example above, we create files in the ignores folder — one per section, named after each section. These files contain lists of tags to skip during extraction. For our example, we'll have a sample.txt file containing: Plain Text ["marketingData" "customer" "lessImportantInfo" "browser"] ["marketingData" "customer" "lessImportantInfo" "deviceType"] ["marketingData" "customer" "lessImportantInfo" "newsletterSubscribed"] As a result, when analyzing morphology, the intermediate representation will take the next form: Plain Text customers: [ customer: [ name: "John Smith" email: "john.smith@example.com" loyalty_status: "Gold" age: "34" gender: "Male" membership_id: "123456" purchases: [ purchase: [ product: "Smartphone" category: "Electronics" price: "700" store: "TechWorld" location: "New York" purchase_date: "2025-01-10" ] ] ] ] Note that after morphological analysis, only a minimal representation is shown containing data from the first found nodes. Here's the JSON file that will be generated: JSON { "customers": [ { "name": "John Smith", "email": "john.smith@example.com", "loyalty_status": "Gold", "age": "34", "gender": "Male", "membership_id": "123456", "purchases": [ { "product": "Smartphone", "category": "Electronics", "price": "700", "store": "TechWorld", "location": "New York", "purchase_date": "2025-01-10" }, { "product": "Wireless Earbuds", "category": "Audio", "price": "150", "store": "GadgetStore", "location": "New York", "purchase_date": "2025-01-11" } ] }, { "name": "Jane Doe", "email": "jane.doe@example.com", "loyalty_status": "Silver", "age": "28", "gender": "Female", "membership_id": "654321", "purchases": [ { "product": "Laptop", "category": "Electronics", "price": "1200", "store": "GadgetStore", "location": "San Francisco", "purchase_date": "2025-01-12" }, { "product": "USB-C Adapter", "category": "Accessories", "price": "30", "store": "TechWorld", "location": "San Francisco", "purchase_date": "2025-01-13" }, { "product": "Keyboard", "category": "Accessories", "price": "80", "store": "OfficeMart", "location": "San Francisco", "purchase_date": "2025-01-14" } ] }, { "name": "Michael Johnson", "email": "michael.johnson@example.com", "loyalty_status": "Bronze", "age": "40", "gender": "Male", "membership_id": "789012", "purchases": [ { "product": "Headphones", "category": "Audio", "price": "150", "store": "AudioZone", "location": "Chicago", "purchase_date": "2025-01-05" } ] }, { "name": "Emily Davis", "email": "emily.davis@example.com", "loyalty_status": "Gold", "age": "25", "gender": "Female", "membership_id": "234567", "purchases": [ { "product": "Running Shoes", "category": "Sportswear", "price": "120", "store": "FitShop", "location": "Los Angeles", "purchase_date": "2025-01-08" }, { "product": "Yoga Mat", "category": "Sportswear", "price": "40", "store": "FitShop", "location": "Los Angeles", "purchase_date": "2025-01-09" } ] }, { "name": "Robert Brown", "email": "robert.brown@example.com", "loyalty_status": "Silver", "age": "37", "gender": 
"Male", "membership_id": "345678", "purchases": [ { "product": "Smartwatch", "category": "Wearable", "price": "250", "store": "GadgetPlanet", "location": "Boston", "purchase_date": "2025-01-07" }, { "product": "Fitness Band", "category": "Wearable", "price": "100", "store": "HealthMart", "location": "Boston", "purchase_date": "2025-01-08" } ] } ] } Configuring Connection to MongoDB Since MongoDB doesn't support direct HTTP data insertion, an intermediary service will be required. Let's install the dependencies: pip install flask pymongo. The service itself: Python from flask import Flask, request, jsonify from pymongo import MongoClient import json app = Flask(__name__) # Connection to MongoDB client = MongoClient('mongodb://localhost:27017') db = client['testDB'] collection = db['testCollection'] @app.route('/insert', methods=['POST']) def insert_document(): try: # Flask will automatically parse JSON if Content-Type: application/json data = request.get_json() if not data: return jsonify({"error": "Empty JSON payload"}), 400 result = collection.insert_one(data) return jsonify({"insertedId": str(result.inserted_id)}), 200 except Exception as e: import traceback print(traceback.format_exc()) return jsonify({"error": str(e)}), 500 if __name__ == '__main__': app.run(port=3000) We'll set up the MongoDB connection settings in the config.txt file (see nosql-url): Plain Text job-number: 1 root-xml-folder: "D:/data/data-samples" xml-filling-stat: false ; table: filling_percent_stat should exists ignore-namespaces: false ignore-tag-attributes: false use-same-morphology-for-same-file-name-pattern: false skip-schema-version-tag: true use-same-morphology-for-all-files-in-folder: false delete-data-before-insert: none connect-to-db-at-project-opening: true source-database: "SQLite" ; available values: PostgreSQL/SQLite target-database: "SQLite" ; available values: PostgreSQL/SQLite/NoSQL bot-chatID: "" bot-token: "" telegram-notifications: true db-driver: "" db-server: "127.0.0.1" db-port: "" db-name: "" db-user: "" db-pass: "" sqlite-driver-name: "SQLite3 ODBC Driver" sqlite-db-path: "" nosql-url: "http://127.0.0.1:3000/insert" append-subsection-name-to-nosql-url: false no-sql-login: "" ; login and pass are empty no-sql-pass: "" Remember that MongoDB will automatically create a database and a collection of the same name if they do not exist. However, this behavior may cause errors, and it is recommended to disable it by default. Let's run the service itself: Python python .\app.py Next, click Parse, then Send JSON to NoSQL. 
Now connect to the MongoDB console in any convenient way and execute the following commands: Plain Text show databases admin 40.00 KiB config 72.00 KiB local 72.00 KiB testDB 72.00 KiB use testDB switched to db testDB db.testCollection.find().pretty() The result should look like the following: JSON { _id: ObjectId('278e1b2c7c1823d4fde120ef'), customers: [ { name: 'John Smith', email: 'john.smith@example.com', loyalty_status: 'Gold', age: '34', gender: 'Male', membership_id: '123456', purchases: [ { product: 'Smartphone', category: 'Electronics', price: '700', store: 'TechWorld', location: 'New York', purchase_date: '2025-01-10' }, { product: 'Wireless Earbuds', category: 'Audio', price: '150', store: 'GadgetStore', location: 'New York', purchase_date: '2025-01-11' } ] }, { name: 'Jane Doe', email: 'jane.doe@example.com', loyalty_status: 'Silver', age: '28', gender: 'Female', membership_id: '654321', purchases: [ { product: 'Laptop', category: 'Electronics', price: '1200', store: 'GadgetStore', location: 'San Francisco', purchase_date: '2025-01-12' }, { product: 'USB-C Adapter', category: 'Accessories', price: '30', store: 'TechWorld', location: 'San Francisco', purchase_date: '2025-01-13' }, { product: 'Keyboard', category: 'Accessories', price: '80', store: 'OfficeMart', location: 'San Francisco', purchase_date: '2025-01-14' } ] }, { name: 'Michael Johnson', email: 'michael.johnson@example.com', loyalty_status: 'Bronze', age: '40', gender: 'Male', membership_id: '789012', purchases: [ { product: 'Headphones', category: 'Audio', price: '150', store: 'AudioZone', location: 'Chicago', purchase_date: '2025-01-05' } ] }, { name: 'Emily Davis', email: 'emily.davis@example.com', loyalty_status: 'Gold', age: '25', gender: 'Female', membership_id: '234567', purchases: [ { product: 'Running Shoes', category: 'Sportswear', price: '120', store: 'FitShop', location: 'Los Angeles', purchase_date: '2025-01-08' }, { product: 'Yoga Mat', category: 'Sportswear', price: '40', store: 'FitShop', location: 'Los Angeles', purchase_date: '2025-01-09' } ] }, { name: 'Robert Brown', email: 'robert.brown@example.com', loyalty_status: 'Silver', age: '37', gender: 'Male', membership_id: '345678', purchases: [ { product: 'Smartwatch', category: 'Wearable', price: '250', store: 'GadgetPlanet', location: 'Boston', purchase_date: '2025-01-07' }, { product: 'Fitness Band', category: 'Wearable', price: '100', store: 'HealthMart', location: 'Boston', purchase_date: '2025-01-08' } ] } ] } Conclusion In this example, we have seen how we can automate the uploading of XML files to MongoDB without having to write any code. Although the example considers only one file, it is possible within the framework of one project to a huge number of types and subtypes of files with different structures, as well as to perform quite complex manipulations, such as type conversion and the use of external services to process field values in real time. This allows not only the unloading of data from XML but also the processing of some of the values via external API, including the use of large language models.
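Once imported, the documents can also be queried from code like any other collection. Here is a small pymongo sketch that finds customers with purchases in a given category, using the field names produced above: Python
from pymongo import MongoClient

collection = MongoClient("mongodb://localhost:27017")["testDB"]["testCollection"]

# Documents whose nested purchases contain at least one "Audio" item
for doc in collection.find({"customers.purchases.category": "Audio"}):
    for customer in doc["customers"]:
        if any(p["category"] == "Audio" for p in customer["purchases"]):
            print(customer["name"], "bought audio products")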
Hello, mate! Today, let’s talk about what database migrations are and why they’re so important. In today’s world, it’s no surprise that any changes to a database should be done carefully and according to a specific process. Ideally, these steps would be integrated into our CI/CD pipeline so that everything runs automatically. Here’s our agenda: What’s the problem?How do we fix it?A simple exampleA more complex exampleRecommendationsResultsConclusion What’s the Problem? If your team has never dealt with database migrations and you’re not entirely sure why they’re needed, let’s sort that out. If you already know the basics, feel free to skip ahead. Main Challenge When we make “planned” and “smooth” changes to the database, we need to maintain service availability and meet SLA requirements (so that users don’t suffer from downtime or lag). Imagine you want to change a column type in a table with 5 million users. If you do this “head-on” (e.g., simply run ALTER TABLE without prep), the table could get locked for a significant amount of time — and your users would be left without service. To avoid such headaches, follow two rules: Apply migrations in a way that doesn’t lock the table (or at least minimizes locks).If you need to change a column type, it’s often easier to create a new column with the correct type first and then drop the old one afterward. Another Problem: Version Control and Rollbacks Sometimes you need to roll back a migration. Doing this manually — going into the production database and fiddling with data — is not only risky but also likely impossible if you don’t have direct access. That’s where dedicated migration tools come in handy. They let you apply changes cleanly and revert them if necessary. How Do We Fix It? Use the Right Tools Each language and ecosystem has its own migration tools: For Java, Liquibase or Flyway are common.For Go, a popular choice is goose (the one we’ll look at here).And so on. Goose: What It Is and Why It’s Useful Goose is a lightweight Go utility that helps you manage migrations automatically. It offers: Simplicity. Minimal dependencies and a transparent file structure for migrations.Versatility. Supports various DB drivers (PostgreSQL, MySQL, SQLite, etc.).Flexibility. Write migrations in SQL or Go code. Installing Goose Shell go install github.com/pressly/goose/v3/cmd/goose@latest How It Works: Migration Structure By default, Goose looks for migration files in db/migrations. Each migration follows this format: Shell NNN_migration_name.(sql|go) NNN is the migration number (e.g., 001, 002, etc.).After that, you can have any descriptive name, for example init_schema.The extension can be .sql or .go. Example of an SQL Migration File: 001_init_schema.sql: SQL -- +goose Up CREATE TABLE users ( id SERIAL PRIMARY KEY, username VARCHAR(255) NOT NULL, created_at TIMESTAMP NOT NULL DEFAULT now() ); -- +goose Down DROP TABLE users; Our First Example Changing a Column Type (String → Int) Suppose we have a users table with a column age of type VARCHAR(255). Now we want to change it to INTEGER. Here’s what the migration might look like (file 005_change_column_type.sql): SQL -- +goose Up ALTER TABLE users ALTER COLUMN age TYPE INTEGER USING (age::INTEGER); -- +goose Down ALTER TABLE users ALTER COLUMN age TYPE VARCHAR(255) USING (age::TEXT); What’s happening here: Up migration We change the age column to INTEGER. 
The USING (age::INTEGER) clause tells PostgreSQL how to convert existing data to the new type. Note that this migration will fail if there's any data in age that isn't numeric. In that case, you'll need a more complex strategy (see below). Down migration If we roll back, we return age to VARCHAR(255). We again use USING (age::TEXT) to convert from INTEGER back to text. The Second, More Complex Case: Multi-Step Migrations If the age column might contain messy data (not just numbers), it's safer to do this in several steps: Add a new column (age_int) of type INTEGER. Copy valid data into the new column, dealing with or removing invalid entries. Drop the old column. SQL
-- +goose Up
-- Step 1: Add a new column
ALTER TABLE users ADD COLUMN age_int INTEGER;

-- Step 2: Try to move data over
UPDATE users
SET age_int = CASE
    WHEN age ~ '^[0-9]+$' THEN age::INTEGER
    ELSE NULL
END;

-- (optional) remove rows where data couldn’t be converted
-- DELETE FROM users WHERE age_int IS NULL;

-- Step 3: Drop the old column
ALTER TABLE users DROP COLUMN age;

-- +goose Down
-- Step 1: Recreate the old column
ALTER TABLE users ADD COLUMN age VARCHAR(255);

-- Step 2: Copy data back
UPDATE users SET age = age_int::TEXT;

-- Step 3: Drop the new column
ALTER TABLE users DROP COLUMN age_int;
To allow a proper rollback, the Down section just mirrors the actions in reverse. Automation Is Key To save time, it's really convenient to add migration commands to a Makefile (or any other build system). Below is an example Makefile with the main Goose commands for PostgreSQL. Let's assume: The DSN for the database is postgres://user:password@localhost:5432/dbname?sslmode=disable. Migration files are in db/migrations. Shell
# File: Makefile
DB_DSN = "postgres://user:password@localhost:5432/dbname?sslmode=disable"
MIGRATIONS_DIR = db/migrations

# Install Goose (run once)
install-goose:
	go install github.com/pressly/goose/v3/cmd/goose@latest

# Create a new SQL migration file
new-migration:
ifndef NAME
	$(error Usage: make new-migration NAME=your_migration_name)
endif
	goose -dir $(MIGRATIONS_DIR) create $(NAME) sql

# Apply all pending migrations
migrate-up:
	goose -dir $(MIGRATIONS_DIR) postgres $(DB_DSN) up

# Roll back the last migration
migrate-down:
	goose -dir $(MIGRATIONS_DIR) postgres $(DB_DSN) down

# Roll back all migrations (be careful in production!)
migrate-reset:
	goose -dir $(MIGRATIONS_DIR) postgres $(DB_DSN) reset

# Check migration status
migrate-status:
	goose -dir $(MIGRATIONS_DIR) postgres $(DB_DSN) status
How to Use It? 1. Create a new migration (SQL file). This generates a file db/migrations/002_add_orders_table.sql. Shell
make new-migration NAME=add_orders_table
2. Apply all migrations. Goose will create its version-tracking table (goose_db_version) in your database if it doesn't already exist and apply any new migrations in ascending order. Shell
make migrate-up
3. Roll back the last migration. Only the most recent migration is reverted. Shell
make migrate-down
4. Roll back all migrations (use caution in production). Full reset. Shell
make migrate-reset
5. Check migration status.
Shell
make migrate-status
Output example: Shell
$ goose status
$ Applied At                  Migration
$ =======================================
$ Sun Jan 6 11:25:03 2013 -- 001_basics.sql
$ Sun Jan 6 11:25:03 2013 -- 002_next.sql
$ Pending                  -- 003_and_again.go
Summary By using migration tools and a Makefile, we can: Restrict direct access to the production database, making changes only through migrations. Easily track database versions and roll them back if something goes wrong. Maintain a single, consistent history of database changes. Perform "smooth" migrations that won't break a running production environment in a microservices world. Gain extra validation — every change will go through a PR and code review process (assuming you have those settings in place). Another advantage is that it's easy to integrate all these commands into your CI/CD pipeline. And remember — security above all else; for instance, keep the connection string in a CI secret rather than in the repository: YAML
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - name: Install Goose
        run: |
          make install-goose
      - name: Run database migrations
        env:
          DB_DSN: ${{ secrets.DATABASE_URL }}
        run: |
          make migrate-up
Conclusion and Tips The main ideas are simple: Keep your migrations small and frequent. They're easier to review, test, and revert if needed. Use the same tool across all environments so dev, stage, and prod are in sync. Integrate migrations into CI/CD so you're not dependent on any one person manually running them. In this way, you'll have a reliable and controlled process for changing your database structure — one that doesn't break production and lets you respond quickly if something goes wrong. Good luck with your migrations! Thanks for reading!