DevOps and CI/CD Resources

DZone's Featured DevOps and CI/CD Resources

Build Serverless Applications Using Rust on AWS Lambda

By Eric Jonathan

Serverless computing has changed how teams build apps that scale effortlessly. But here’s the catch: popular tools like Node.js and Python often face delays when starting up, hog memory, or just don’t perform as smoothly as needed. That’s where Rust shines. Built for lightning speed and reliability without the bulk, it’s quickly becoming the secret weapon for serverless setups. In this walkthrough, we’ll teach you how to build and launch serverless functions using Rust on AWS Lambda. Why Rust for AWS Lambda? Blazing-Fast Cold Starts AWS Lambda cold starts — the delay when a function initializes — are a critical performance bottleneck. Unlike interpreted languages (e.g., Python), Rust compiles to machine-native binaries, eliminating interpreter startup overhead. Combined with Rust’s lack of a garbage collector (GC), this can result in cold starts as low as 50–75 ms, even for complex functions. Memory Safety Without Compromise Rust’s ownership model guarantees memory safety at compile time, preventing common vulnerabilities like buffer overflows. This is critical for serverless, where functions often process untrusted input (e.g., data from an API Gateway). Tiny Binaries, Lower Costs Rust binaries are often just 5–10 MB when optimized, compared to 50–100 MB for equivalent Node.js or Python deployments. Smaller binaries mean: Faster deployment timesReduced memory usage (leading to lower AWS Lambda costs)Compatibility with restrictive environments like AWS Lambda@Edge Async-First Concurrency Rust’s async/await syntax, paired with runtimes like Tokio, enables non-blocking I/O operations. This is ideal for serverless functions handling concurrent API requests or database queries. Setting Up Rust for AWS Lambda Install the Rust Toolchain Start by installing Rust and the AWS Lambda-specific tools: Rust # Install Rust + Cargo curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh # Add nightly toolchain (required for some optimizations) rustup default nightly rustup target add x86_64-unknown-linux-musl # Install cargo-lambda cargo install cargo-lambda Create a New Lambda Project Use cargo-lambda to scaffold a new function: Rust cargo lambda new my-lambda-function Your generated Cargo.toml might include essential dependencies like: TOML [dependencies] lambda_runtime = "0.8" tokio = { version = "1.0", features = ["macros"] } serde = { version = "1.0", features = ["derive"] } Write a Basic Handler Replace src/main.rs with a Lambda function that processes JSON input: Rust use lambda_runtime::{handler_fn, Context, Error}; use serde_json::{json, Value}; Rust #[tokio::main] async fn main() -> Result<(), Error> { lambda_runtime::run(handler_fn(handler)).await } async fn handler(event: Value, _: Context) -> Result<Value, Error> { let name = event["name"].as_str().unwrap_or("World"); Ok(json!({ "message": format!("Hello, {}!", name) })) } Key Components #[tokio::main] – configures the async runtimehandler_fn – wraps the handler for AWS Lambda compatibilityserde_json – parses and serializes JSON payloads Code Snippets With Expected Outputs Below, you’ll see additional code examples illustrating structured logging, error handling, and Terraform deployment, each paired with expected inputs and outputs. Basic Lambda Handler (Extended Example) Rust async fn handler(event: Value, _: Context) -> Result<Value, Error> { let name = event["name"].as_str().unwrap_or("World"); Ok(json!({ "message": format!("Hello, {}!", name) })) } Input JSON { "name": "Alice" } Output JSON { "message": "Hello, Alice!" } Input (No Name) JSON {} Output JSON { "message": "Hello, World!" } Structured Logging With Tracing Rust use tracing::{info, Level}; use tracing_subscriber::FmtSubscriber; fn main() { let subscriber = FmtSubscriber::builder() .with_max_level(Level::INFO) .finish(); tracing::subscriber::set_global_default(subscriber).unwrap(); info!("Lambda initialized"); // ... } CloudWatch Log Output JSON 2023-10-05T12:34:56Z INFO my_lambda_function Lambda initialized Logs appear in AWS CloudWatch, queryable via CloudWatch Insights for deeper analysis. Error Handling With thiserror Rust #[derive(thiserror::Error, Debug)] enum LambdaError { #[error("Missing field: {0}")] MissingField(String), #[error(transparent)] SerdeJson(#[from] serde_json::Error), } async fn handler(event: Value, _: Context) -> Result<Value, LambdaError> { let name = event["name"] .as_str() .ok_or(LambdaError::MissingField("name".into()))?; Ok(json!({ "message": format!("Hello, {}!", name) })) } Input (Missing Name) JSON { "age": 30 } Output (Error) JSON { "errorMessage": "Missing field: name", "errorType": "LambdaError" } AWS Deployment JSON { "resource": { "aws_lambda_function": { "rust_lambda": { "function_name": "rust-serverless", "runtime": "provided.al2", "handler": "bootstrap", "filename": "target/lambda/my-lambda-function/bootstrap.zip", "role": "${aws_iam_role.lambda_exec.arn}", "memory_size": 128, "timeout": 10 } }, "aws_iam_role": { "lambda_exec": { "name": "rust-lambda-role", "assume_role_policy": { "Version": "2012-10-17", "Statement": [ { "Action": "sts:AssumeRole", "Effect": "Allow", "Principal": { "Service": "lambda.amazonaws.com" } } ] } } } } } Output After terraform Apply JSON aws_lambda_function.rust_lambda: Creating... aws_lambda_function.rust_lambda: Creation complete after 5s Apply complete! Resources: 1 added, 0 changed, 0 destroyed. Outputs JSON lambda_arn = "arn:aws:lambda:us-east-1:123456789012:function:rust-serverless" Optimizing Rust for AWS Lambda Reduce Binary Size AWS Lambda charges for memory usage, so smaller binaries can save costs: Plain Text # Compile with musl for static linking cargo lambda build --release --target x86_64-unknown-linux-musl # Strip debug symbols (saves ~30% size) strip target/x86_64-unknown-linux-musl/release/bootstrap Pro tip: Use cargo udeps to audit unused dependencies. Cold Start Mitigation Precompiled binaries. The x86_64-unknown-linux-musl target ensures compatibility with AWS Lambda’s Amazon Linux 2 environment.Provisioned concurrency. Pre-initialize Lambda instances via the AWS Console, Terraform, or CloudFormation to reduce cold starts for high-traffic functions. Async Best Practices Rust’s async runtime (Tokio) helps you run multiple I/O-bound tasks concurrently. Rust async fn fetch_s3_object(bucket: &str, key: &str) -> Result<Vec<u8>, Error> { let client = aws_sdk_s3::Client::new(&aws_config::load_from_env().await); let resp = client.get_object().bucket(bucket).key(key).send().await?; let data = resp.body.collect().await?; Ok(data.into_bytes().to_vec()) } Use concurrency to fetch data from multiple sources without blocking the main thread. Observability and Debugging Structured logging. Already shown above with the tracing crate.Error handling. thiserror for typed errors that help you quickly pinpoint issues in logs or metrics.AWS X-Ray. Consider X-Ray for advanced tracing if you need deeper visibility into call chains, especially across microservices. Advanced Optimization Example Fetching S3 Data Concurrently Rust async fn fetch_s3_object(bucket: &str, key: &str) -> Result<Vec<u8>, Error> { let client = aws_sdk_s3::Client::new(&aws_config::load_from_env().await); let resp = client.get_object().bucket(bucket).key(key).send().await?; let data = resp.body.collect().await?; Ok(data.into_bytes().to_vec()) } Input JSON { "bucket": "my-bucket", "key": "data.json" } Output JSON { "content": "<base64_encoded_data>", "metadata": { "last_modified": "2023-10-05T12:34:56Z" } } You can initiate multiple fetch_s3_object calls concurrently using tokio::join!, slashing overall execution time. Final Deployment Workflow Build JSON cargo lambda build --release --target x86_64-unknown-linux-musl strip target/x86_64-unknown-linux-musl/release/bootstrap Deploy JSON terraform apply -auto-approve If you’re using aws_lambda_function_url, you can expose the function publicly via HTTPS once the apply step completes. Invoke JSON aws lambda invoke \ --function-name rust-serverless \ --payload '{"name":"Alice"}' output.json Response (output.json) JSON { "message": "Hello, Alice!" } Conclusion Rust’s combination of speed, safety, and efficiency makes it ideal for serverless computing. By leveraging tools like cargo-lambda, tokio, and Terraform, you can deploy production-ready functions that outperform traditional runtimes in cold starts, memory usage, and overall cost. Next Steps Explore AWS Lambda Extensions for secrets management and advanced logging.Integrate with AWS SQS or EventBridge for event-driven architectures.Benchmark your own functions using AWS X-Ray to visualize call traces. By adopting Rust for serverless, you’re not just optimizing performance — you’re future-proofing your architecture for the next wave of modern, scalable applications. Further Reading AWS Lambda Rust Runtime GitHubA Guide to AWS Software DevelopmentRust Documentation More

The Right ETL Architecture for Multi-Source Data Integration

By Murat Balkan

CORE

When building ETL (Extract, Transform, Load) pipelines for marketing analytics, customer insights, or similar data-driven use cases, there are two primary architectural approaches: dedicated pipelines per source and common pipeline with integration, core, and sink layers. Each has its distinct non-functional trade-offs in terms of maintainability, performance, cost efficiency, and operational visibility. Let’s explore the best practices and pros and cons of both approaches. A Common Use Case: Multi-Source Marketing Data Aggregation Consider a scenario where an organization needs to aggregate marketing data from sources like Google Ads, TikTok Ads, Facebook Ads, and internal customer data sources. The collected data needs to be transformed, analyzed, and stored in different tables or databases for further insights. Two approaches exist: Dedicated ETL pipelines per source. Each data source has a separate ETL pipeline deployment that independently extracts, transforms, and loads data into the target(s).A common ETL pipeline with an integration layer. A unified pipeline that includes an integration layer (handling ingestion from different sources and filtering), a core processing layer (handling common transformations, deduplication, and business logic), and a sink layer (handling writes to a single or different destination(s) as needed). Within this approach, there are two main variants: Common core only. The integration layer processes data per source, but the core layer handles transformations centrally before distributing data to destination(s).Common core + destination. A fully unified model where transformations and formatting are done in the core layer before data is directed to the appropriate destination(s). Dedicated Pipelines: Pros and Cons Pros Simplicity. Each pipeline is tailored for a single source, making it easier to understand and troubleshoot.Granular optimization. Since each ETL is independent, performance optimizations can be source-specific.Less complexity in initial setup. Teams can get started quickly with isolated pipelines, without worrying about commonality between pipelines. Cons High maintenance overhead. More pipelines mean more configurations, monitoring, and operational overhead.Scalability challenges. Independent pipelines may require redundant processing logic and infrastructure, leading to inefficient resource utilization and resource duplication.Limited cross-source insights. Since each source has its own pipeline, cross-source event correlation (e.g., deduplication, attribution modeling, complex event processing scenarios) becomes challenging. Common Pipeline With Integration, Core, and Sink Layers: Pros and Cons Pros Cross-pipeline visibility. A common processing layer allows event correlation across different sources, enabling advanced insights such as complex event processing (CEP).Better resource utilization. With a shared compute layer, economies of scale are achieved in terms of hardware and license costs.Consistency across data sources. Business logic, transformations, and quality checks are centralized, reducing inconsistencies.Scalable and maintainable. Instead of maintaining multiple ETL jobs, a single pipeline can be optimized and scaled efficiently. Cons Increased complexity. A shared pipeline requires robust orchestration and error-handling mechanisms with more moving parts.Single point of failure risk. If the core layer fails, multiple data sources are affected.Higher initial investment. Designing a robust common pipeline with abstraction layers takes more design and development effort upfront. Common Data Model Considerations One major challenge of a common ETL pipeline is the need for a common (or canonical) data model (CDM). Since data sources often have different schemas and formats, a CDM must be established to standardize the data before it reaches the core processing layer. This model ensures that transformations are uniform across sources and enables complex event-processing scenarios across sources. There are different implementations of normalizing to a common data model. These can be schema-based, where the schema is stored and version-controlled in schema registries while being enforced at the entry points, or staging-table-based, where data is written into an intermediary staging table from the ingestion layer, enforcing schema at the database level. The schema-based approach is generally considered best practice as it allows easy versioning, flexibility, and validation at multiple stages of the pipeline. However, in real-world implementations, the staging-table-based approach is also used, especially when the ETL pipeline exists within a single data source, such as reverse ELT scenarios where data is transformed after being loaded into the final destination. Conclusion In this blog, I talked about the pros and cons of building dedicated versus common pipelines for a single use case (marketing analytics). For organizations dealing with scattered external or internal data sources, the choice depends on scalability needs, maintenance costs, and operational complexity. A common ETL pipeline with an integration layer offers better visibility, scalability, and efficiency but requires upfront investment in orchestration and fault tolerance. On the other hand, dedicated pipelines are quick to deploy but may lead to inefficiencies in the long run. Additionally, adopting a common data model is crucial when implementing a shared ETL approach, as it ensures data consistency across sources and simplifies processing logic. Choosing the right ETL approach is not just about tech — it’s about balancing business needs, operational efficiency, and long-term maintainability. More

From Zero to Scale With AWS Serverless

By Ravi Laudya

Enhancing API Integration Efficiency With a Mock Client

By Sumudu Nissanka

Build a Stateless Microservice With GitHub Copilot in VSCode

By Sibasis Padhi

Using the Log Node in IBM App Connect Enterprise

In the world of IBM App Connect Enterprise (ACE), effective logging is crucial for monitoring and troubleshooting. With the introduction of the Log node, it's now easier than ever to log ExceptionList inserts directly into the activity log, which can be viewed from the WebUI. The Log node can be especially valuable, often replacing the Trace node in various scenarios. This article contains two sections: the first will guide you through the process of using the Log node to log these inserts, helping you streamline your debugging and monitoring processes. The second section explores some scenarios that provide Log node hints and tips around default values. Section 1: Logging ACE's ExceptionList Inserts (In the Activity Log With the Log Node) Introducing the Log Node The Log node is a recent addition to ACE (v12.0.11.0), originating from the Designer, which simplifies logging activities. By using the Log node, you can log custom messages directly into the activity log for easy monitoring and follow-up. Understanding the ExceptionList in ACE Since this article mentions and works with the ExceptionList, I’ll recap very quickly what the ExceptionList is. The ExceptionList in ACE is a structured (built-in) way of displaying captured exceptions within your message flows. It provides detailed information about errors, including the file, line, function, type, and additional context that can be invaluable during troubleshooting. As most of you know, the insert fields contain the most usable information so we will be focussing on those. Setting up the Demo Flow To demonstrate how to log ExceptionList inserts using the Log node, we'll set up a simple flow: 1. Create the Flow Add an HTTP Input node.Add an HTTP Request node.Add a Log node.Add an HTTP Reply node. 2. Connect the Nodes Connect the nodes as shown in the diagram to form a complete flow. Configuring the Log Node Next, we need to configure the Log node to capture and log the ExceptionList inserts: 1. Properties Go to the Map Inputs part of the properties.Configure the Log node to include the ExceptionList as a Map input. 2. Configure Open the Configure wizard for the Log Node. Set the basic values for the log message: “Log Level” and “Message detail”. Next, add a new property, give it a name, and set the type to “Array of strings”. Click on “Edit mappings” and click on the left button that looks like a menu with 3 items. Click on “ExceptionList”, scroll down, and select “Insert”. This gives you all the inserts of the ExceptionList. If that is what you want, great — we are done here. But if you only require the insert fields of the last two ExceptionList items, which tend to be the most interesting ones, you can select only those as well. It’s rather important to know that the ExceptionList ordering here is reversed compared to the message tree. So the last two ExceptionList items in the flow are the first two in this JSON representation. 3. Filter Click on Insert and go to “Edit expression”. Change the expression to “$mappingInput_ExceptionList[0..1].Insert”. Do not forget to hit “Save”! Sending a Message Through the Flow To test our configuration, we'll send a message through the flow using the Flow Exerciser: 1. Send Message Open the Flow Exerciser, create a new message, and send it through the HTTP Input node. 2. Monitor the Progress Observe the message flow and ensure the Log node captures and logs the ExceptionList details. To make sure the log message has been written, wait until you get a reply back. Viewing the ExceptionList in the Activity Log Once the message has been processed, you can view the logged ExceptionList inserts in the activity log through the WebUI: 1. Access the Activity Log Navigate to the activity log section in the WebUI to see the logged ExceptionList inserts: Integration Server > Application > Message flows > flow 2. Review Logged Details The activity log should display detailed information about the exceptions captured during the flow execution. The log entry in detail: This is enough to tell me what the issue is. Section 2: ACE Log Node Tips and Tricks, Default Values Having explored how to handle and parse the ExceptionList, let's now examine some scenarios using the Log node. Fallback or Default Values Imagine you want to log a specific field from an incoming message that may or may not be present. Let's use the HTTP Authorization header as an example. If you configure the Log node to use this header as an input parameter, it will either display it or omit it entirely. The syntax to retrieve the Authorization header from your input message is: {{$mappingInput_InputRoot.HTTPInputHeader.Authorization} Apply this to your Log node: When your message contains the header, it appears in the Activity Log Tags. If the field is missing, the tag disappears. This behavior isn’t always ideal and can complicate troubleshooting. Adding default values can help clarify the situation. When you go through the functions, there is no default or coalesce function available in JSONata, then how can you do it? If you would write this in JavaScript, you would simply type the following: Authorization = $mappingInput_InputRoot.HTTPInputHeader.Authorization || UNDEFINED But that doesn’t work in JSONata. What you can do is either one of these: Use a ternary operator expression.Use sequence flattening. Ternary Operator A ternary operator is essentially an IF statement in expression form: condition ? ifTrue : ifFalse Apply this to our JSONata example: {{$mappingInput_InputRoot.HTTPInputHeader.Authorization ? $mappingInput_InputRoot.HTTPInputHeader.Authorization: "FALLBACK"} What happens is that the first parameter, the actual field, is cast to a Boolean and that result is used to choose between the field value or the fallback value. If the field is an empty sequence, empty array, empty object, empty string, zero or null the expression will default to FALLBACK. Sequence Flattening Sequence flattening in JSONata is a useful feature for handling non-existent fields (i.e., an empty sequence). Consider this example: [$field1, $field2, “FALLBACK”][0] The code returns the first value of a flattened sequence. If $field1 has a value, it is returned; otherwise, if $field2 has a value, $field2 is returned. If neither has a value, “FALLBACK” is returned. This functions similarly to a chained COALESCE in ESQL. Here’s how it applies to our example: {{[$mappingInput_InputRoot.HTTPInputHeader.Authorization, "UNDEFINED"][0]} Example Here’s how both options look in a test setup with HEADER_TO and Header_SF: You can see the fields directly from the header as expected: When the fields are unavailable, the output is: These examples might not be the most realistic, but you could use them to determine the type of authentication used (e.g., basic, OAuth) or if a specific message field is filled in. Now that you know how to do it, it's up to you to find the right use case to apply it to. Conclusion By using the Log node to capture and log ExceptionList inserts, you can significantly enhance your ability to monitor and troubleshoot message flows in IBM App Connect Enterprise. This approach ensures that all relevant error details are readily available in the activity log, making it easier to diagnose and resolve issues. Acknowledgment to Dan Robinson and David Coles for their contribution to this article. Resources IBM App Connect Documentation: Log nodeIBM App Connect Documentation: Activity logsJSONata: Sequences"The One Liner If Statement (Kinda): Ternary Operators Explained"IBM App Connect Documentation: Adding entries to the activity log by using a Log node

By Matthias Blomme

Configuring Autoscaling for Various Machine Learning Model Types

AWS Sagemaker has simplified the deployment of machine learning models at scale. Configuring effective autoscaling policies is crucial for balancing performance and cost. This article aims to demonstrate how to set up various autoscaling policies using TypeScript CDK, focusing on request, memory, and CPU-based autoscaling for different ML model types. Model Types Based on Invocation Patterns At a high level, model deployment in SageMaker can be broken into three main categories based on invocation patterns: 1. Synchronous (Real-Time) Inference Synchronous inference is suitable when immediate response or feedback is required by end users, such as when a website interaction is required. This approach is particularly well-suited for applications that demand quick response times with minimal delay. Examples include fraud detection in financial transactions and dynamic pricing in ride-sharing. 2. Asynchronous Inference Asynchronous inference is ideal for handling queued requests when it is acceptable to process messages with a delay. This type of inference is preferred when the model is memory/CPU intensive and takes more than a few seconds to respond. For instance, video content moderation, analytics pipeline, and Natural Language Processing (NLP) for textbooks. 3. Batch Processing Batch processing is ideal when data needs to be processed in chunks (batches) or at scheduled intervals. Batch processing is mostly used for non-time-sensitive tasks when you need the output to be available at periodic intervals like daily or weekly. For example, periodic recommendation updates, where an online retailer generates personalized product recommendations for its customers weekly. Predictive maintenance, where daily jobs are run to predict machines that are likely to fail, is another good example. Types of Autoscaling in SageMaker With CDK Autoscaling in SageMaker can be tailored to optimize different aspects of performance based on the model’s workload: 1. Request-Based Autoscaling Use Case Best for real-time (synchronous) inference models that need low latency. Example Scaling up during peak shopping seasons for an e-commerce recommendation model to meet high traffic. 2. Memory-Based Autoscaling Use Case Beneficial for memory-intensive models, such as large NLP models. Example Increasing instance count when memory usage exceeds 80% for image processing models that require high resolution. 3. CPU-Based Autoscaling Use Case Ideal for CPU-bound models that require more processing power. Example Scaling for high-performance recommendation engines by adjusting instance count as CPU usage reaches 75%. Configuring Autoscaling Policies in TypeScript CDK Below is an example configuration of different scaling policies using AWS CDK with TypeScript: TypeScript import * as cdk from 'aws-cdk-lib'; import * as sagemaker from 'aws-cdk-lib/aws-sagemaker'; import * as autoscaling from 'aws-cdk-lib/aws-applicationautoscaling'; import { Construct } from 'constructs'; export class SageMakerEndpointStack extends cdk.Stack { constructor(scope: Construct, id: string, props?: cdk.StackProps) { super(scope, id, props); const AUTO_SCALE_CONFIG = { MIN_CAPACITY: 1, MAX_CAPACITY: 3, TARGET_REQUESTS_PER_INSTANCE: 1000, CPU_TARGET_UTILIZATION: 70, MEMORY_TARGET_UTILIZATION: 80 }; // Create SageMaker Endpoint const endpointConfig = new sagemaker.CfnEndpointConfig(this, 'EndpointConfig', { productionVariants: [{ modelName: 'YourModelName', // Replace with your model name variantName: 'prod', initialInstanceCount: AUTO_SCALE_CONFIG.MIN_CAPACITY, instanceType: 'ml.c5.2xlarge' }] }); const endpoint = new sagemaker.CfnEndpoint(this, 'Endpoint', { endpointName: 'YourEndpointName', // Replace with your endpoint name endpointConfig: endpointConfig }); // Set up autoscaling const scalableTarget = endpoint.createScalableInstanceCount({ minCapacity: AUTO_SCALE_CONFIG.MIN_CAPACITY, maxCapacity: AUTO_SCALE_CONFIG.MAX_CAPACITY }); this.setupRequestBasedAutoscaling(scalableTarget); this.setupCpuBasedAutoscaling(scalableTarget, endpoint); this.setupMemoryBasedAutoscaling(scalableTarget, endpoint); this.setupStepAutoscaling(scalableTarget, endpoint); } private setupRequestBasedAutoscaling(scalableTarget: sagemaker.ScalableInstanceCount) { scalableTarget.scaleOnRequestCount('ScaleOnRequestCount', { targetRequestsPerInstance: AUTO_SCALE_CONFIG.TARGET_REQUESTS_PER_INSTANCE }); } private setupCpuBasedAutoscaling(scalableTarget: sagemaker.ScalableInstanceCount, endpoint: sagemaker.CfnEndpoint) { scalableTarget.scaleOnMetric('ScaleOnCpuUtilization', { metric: endpoint.metricCPUUtilization(), targetValue: AUTO_SCALE_CONFIG.CPU_TARGET_UTILIZATION }); } private setupMemoryBasedAutoscaling(scalableTarget: sagemaker.ScalableInstanceCount, endpoint: sagemaker.CfnEndpoint) { scalableTarget.scaleOnMetric('ScaleOnMemoryUtilization', { metric: endpoint.metricMemoryUtilization(), targetValue: AUTO_SCALE_CONFIG.MEMORY_TARGET_UTILIZATION }); } // Example configuration of step scaling. // Changes the number of instances to scale up and down based on CPU usage private setupStepAutoscaling(scalableTarget: sagemaker.ScalableInstanceCount, endpoint: sagemaker.CfnEndpoint) { scalableTarget.scaleOnMetric('StepScalingOnCpu', { metric: endpoint.metricCPUUtilization(), scalingSteps: [ { upper: 30, change: -1 }, { lower: 60, change: 0 }, { lower: 70, upper: 100, change: 1 }, { lower: 100, change: 2 } ], adjustmentType: autoscaling.AdjustmentType.CHANGE_IN_CAPACITY }); } } Note: CPU metrics can exceed 100% when instances have multiple cores, as they measure total CPU utilization. Balancing Autoscaling Policies by Model Type Autoscaling policies differ based on model requirements: Batch Processing Models Request- or CPU-based autoscaling is ideal here since you won't have to pay for resources when traffic is low or none. Synchronous Models In order to provide a swift response to spikes in real-time requests, request-based autoscaling is recommended. Asynchronous Models CPU-based scaling with longer cooldowns prevents over-scaling and maintains efficiency. Key Considerations for Effective Autoscaling 1. Cost Management Tune metric thresholds to optimize cost without sacrificing performance. 2. Latency Requirements For real-time models, prioritize low-latency scaling; batch and asynchronous models can handle slight delays. 3. Performance Monitoring Regularly assess model performance and adjust configurations to adapt to demand changes. Like in the example above, we can use more than one autoscaling policy to balance cost and performance, but that can lead to increased complexity in setup and management. Conclusion With AWS SageMaker's autoscaling options, you can effectively configure resource management for different types of ML models. By setting up request-based, memory-based, and CPU-based policies in CDK, you can optimize both performance and costs across diverse applications.

By Koushik Balaji Venkatesan

Setting Up DBT and Snowpark for Machine Learning Pipelines

AI/ML workflows excel on structured, reliable data pipelines. To streamline these processes, DBT and Snowpark offer complementary capabilities: DBT is for modular SQL transformations, and Snowpark is for programmatic Python-driven feature engineering. Here are some key benefits of using DBT, Snowpark, and Snowflake together: Simplifies SQL-based ETL with DBT’s modularity and tests.Handles complex computations with Snowpark’s Python UDFs.Leverages Snowflake’s high-performance engine for large-scale data processing. Here’s a step-by-step guide to installing, configuring, and integrating DBT and Snowpark into your workflows. Step 1: Install DBT In Shell, you can use Python’s pip command for installing packages. Assuming Python is already installed and added to your PATH, follow these steps: Shell # Set up a Python virtual environment (recommended): python3 -m venv dbt_env source dbt_env/bin/activate # Install DBT and the Snowflake adapter: pip install dbt-snowflake # Verify DBT installation dbt --version Step 2: Install Snowpark Shell # Install Snowpark for Python pip install snowflake-snowpark-python # Install additional libraries for data manipulation pip install pandas numpy # Verify Snowpark installation python -c "from snowflake.snowpark import Session; print('successful Snowpark installation')" Step 3: Configuring DBT for Snowflake DBT requires a profiles.yml file to define connection settings for Snowflake. Locate the DBT Profiles Directory By default, DBT expects the profiles.yml file in the ~/.dbt/ directory. Create the directory if it doesn’t exist: Shell mkdir -p ~/.dbt Create the profiles.yml File Define your Snowflake credentials in the following format: YAML my_project: outputs: dev: type: snowflake account: your_account_identifier user: your_username password: your_password role: your_role database: your_database warehouse: your_warehouse schema: your_schema target: dev Replace placeholders like your_account_identifier with your Snowflake account details. Test the Connection Run the following command to validate your configuration: Shell dbt debug If the setup is correct, you’ll see a success message confirming the connection. Step 4: Setting Up Snowpark Ensure Snowflake Permissions Before using Snowpark, ensure your Snowflake user has the following permissions: Access to the warehouse and schema.Ability to create and register UDFs (User-Defined Functions). Create a Snowpark Session Set up a Snowpark session using the same credentials from profiles.yml: Python from snowflake.snowpark import Session def create_session(): connection_params = { "account": "your_account_identifier", "user": "your_username", "password": "your_password", "role": "your_role", "database": "your_database", "warehouse": "your_warehouse", "schema": "your_schema", } return Session.builder.configs(connection_params).create() session = create_session() print("Snowpark session created successfully") Register a Sample UDF Here’s an example of registering a simple Snowpark UDF for text processing: Python def clean_text(input_text): return input_text.strip().lower() session.udf.register( func=clean_text, name="clean_text_udf", input_types=["string"], return_type="string", is_permanent=True ) print("UDF registered successfully") Step 5: Integrating DBT With Snowpark You have a DBT model named raw_table that contains raw data. raw_table DBT Model Definition SQL -- models/raw_table.sql SELECT * FROM my_database.my_schema.source_table Use Snowpark UDFs in DBT Models Once you’ve registered a UDF in Snowflake using Snowpark, you can call it directly from your DBT models. SQL -- models/processed_data.sql WITH raw_data AS ( SELECT id, text_column FROM {{ ref('raw_table') } ), cleaned_data AS ( SELECT id, clean_text_udf(text_column) AS cleaned_text FROM raw_data ) SELECT * FROM cleaned_data; Run DBT Models Execute your DBT models to apply the transformation: Shell dbt run --select processed_data Step 6: Advanced AI/ML Use Case For AI/ML workflows, Snowpark can handle tasks like feature engineering directly in Snowflake. Here’s an example of calculating text embeddings: Create an Embedding UDF Using Python and a pre-trained model, you can generate text embeddings: Python from transformers import pipeline def generate_embeddings(text): model = pipeline("feature-extraction", model="bert-base-uncased") return model(text)[0] session.udf.register( func=generate_embeddings, name="generate_embeddings_udf", input_types=["string"], return_type="array", is_permanent=True ) Integrate UDF in DBT Call the embedding UDF in a DBT model to create features for ML: SQL -- models/embedding_data.sql WITH raw_text AS ( SELECT id, text_column FROM {{ ref('raw_table') } ), embedded_text AS ( SELECT id, generate_embeddings_udf(text_column) AS embeddings FROM raw_text ) SELECT * FROM embedded_text; Best Practices Use DBT for reusable transformations: Break down complex SQL logic into reusable models.Optimize Snowpark UDFs: Write lightweight, efficient UDFs to minimize resource usage.Test Your Data: Leverage DBT’s testing framework for data quality.Version Control Everything: Track changes in DBT models and Snowpark scripts for traceability. Conclusion By combining DBT’s SQL-based data transformations with Snowpark’s advanced programming capabilities, you can build AI/ML pipelines that are both scalable and efficient. This setup allows teams to collaborate effectively while leveraging Snowflake’s computational power to process large datasets. Whether you’re cleaning data, engineering features, or preparing datasets for ML models, the DBT-Snowpark integration provides a seamless workflow to unlock your data’s full potential.

By Sevinthi Kali Sankar Nagarajan

How to Automate Blob Deletion in Azure Storage Using PowerShell

Azure storage accounts are a cornerstone for data storage solutions in the Azure ecosystem, supporting various workloads, from storing SQL backups to serving media files. Automating tasks like deleting outdated or redundant blobs from storage containers can optimize storage costs and ensure efficiency. This guide will walk you through using PowerShell to safely and effectively delete blobs from an Azure storage account. Whether you're managing SQL backups, application logs, or other unstructured data, this process can be applied to a wide range of scenarios where cleanup is a routine requirement. New to Storage Account? One of the core services within Microsoft Azure is the storage account service. Many services utilize storage accounts for storing data, such as Virtual Machine Disks, Diagnostics logs (especially application logs), SQL backups, and others. You can also use the Azure storage account service to store your own data, such as blobs or binary data. As per MSDN, Azure blob storage allows you to store large amounts of unstructured object data. You can use blob storage to gather or expose media, content, or application data to users. Because all blob data is stored within containers, you must create a storage container before you can begin to upload data. Step-by-Step Step 1: Get the Prerequisite Inputs In this example, I will delete a SQL database (backed up or imported to storage) stored in bacpac format in SQL container. PowerShell ## prerequisite Parameters $resourceGroupName="rg-dgtl-strg-01" $storageAccountName="sadgtlautomation01" $storageContainerName="sql" $blobName = "core_2022110824.bacpac" Step 2: Connect to Your Azure Subscription Using the az login command with a service principal is a secure and efficient way to authenticate and connect to your Azure subscription for automation tasks and scripts. In scenarios where you need to automate Azure management tasks or run scripts in a non-interactive manner, you can authenticate using a service principal. A service principal is an identity created for your application or script to access Azure resources securely. PowerShell ## Connect to your Azure subscription az login --service-principal -u "210f8f7c-049c-e480-96b5-642d6362f464" -p "c82BQ~MTCrPr3Daz95Nks6LrWF32jXBAtXACccAV" --tenant "cf8ba223-a403-342b-ba39-c21f78831637" Step 3: Check if the Container Exists in the Storage Account When working with Azure Storage, you may need to verify if a container exists in a storage account or create it if it doesn’t. You can use the Get-AzStorageContainer cmdlet to check for the existence of a container. PowerShell ## Get the storage account to check container exist or need to be create $storageAccount = Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName ## Get the storage account context $context = $storageAccount.Context Step 4: Ensure the Container Exists Before Deleting the Blob We need to use Remove-AzStorageBlob cmdlet to delete a blob from the Azure Storage container. PowerShell ## Check if the storage container exists if(Get-AzStorageContainer -Name $storageContainerName -Context $context -ErrorAction SilentlyContinue) { Write-Host -ForegroundColor Green $storageContainerName ", the requested container exit,started deleting blob" ## Create a new Azure Storage container Remove-AzStorageBlob -Container $storageContainerName -Context $context -Blob $blobName Write-Host -ForegroundColor Green $blobName deleted } else { Write-Host -ForegroundColor Magenta $storageContainerName "the requested container does not exist" } Here is the full code: PowerShell ## Delete a Blob from an Azure Storage ## Input Parameters $resourceGroupName="rg-dgtl-strg-01" $storageAccountName="sadgtlautomation01" $storageContainerName="sql" $blobName = "core_2022110824.bacpac" ## Connect to your Azure subscription az login --service-principal -u "210f8f7c-049c-e480-96b5-642d6362f464" -p "c82BQ~MTCrPr3Daz95Nks6LrWF32jXBAtXACccAV" --tenant "cf8ba223-a403-342b-ba39-c21f78831637" ## Function to create the storage container Function DeleteblogfromStorageContainer { ## Get the storage account to check container exist or need to be create $storageAccount = Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName ## Get the storage account context $context = $storageAccount.Context ## Check if the storage container exists if(Get-AzStorageContainer -Name $storageContainerName -Context $context -ErrorAction SilentlyContinue) { Write-Host -ForegroundColor Green $storageContainerName ", the requested container exit,started deleting blob" ## Remove the blob in Azure Storage container Remove-AzStorageBlob -Container $storageContainerName -Context $context -Blob $blobName Write-Host -ForegroundColor Green $blobName deleted } else { Write-Host -ForegroundColor Magenta $storageContainerName "the requested container does not exist" } } #Call the Function DeleteblogfromStorageContainer Here is the output: Conclusion Automating blob deletion in Azure storage accounts using PowerShell is a practical approach for maintaining a clutter-free and efficient storage system. By following the steps outlined, you can seamlessly integrate this process into your workflows, saving time and reducing manual efforts. This method is not just limited to SQL backup files. It can also be extended to managing other types of data stored in Azure Storage, such as application logs, diagnostic files, or media content. By ensuring the existence of containers and leveraging PowerShell's robust cmdlets, you can confidently manage your Azure resources in an automated, error-free manner.

By thiyagu selvaraj

Running Docker Containers in HashiCorp Nomad: A Beginner’s Guide

Nomad, a flexible and lightweight orchestrator developed by HashiCorp, is an excellent tool for managing containerized applications like Docker. This guide walks you through running Docker containers with Nomad, designed specifically for beginners. Whether you're deploying a simple web server or experimenting with microservices, this guide will provide you with the foundation to get started. What Is Nomad? Nomad is a simple, flexible, and scalable workload orchestrator that supports running containerized and non-containerized applications. Though it is not as popular as Kubernetes, which currently dominates the container orchestration space, Nomad has its advantages: ease of use, lightweight architecture, and support for mixed workloads. Prerequisites Before running Docker containers in Nomad, ensure the following: Nomad Installed: Download and install Nomad from HashiCorp's website.Docker Installed: Install Docker and confirm it is running with: Plain Text docker --version Nomad Agent Running: Start a Nomad agent in development mode for simplicity. The -dev flag starts a local Nomad cluster in development mode: Plain Text nomad agent -dev Step 1: Write a Nomad Job File Nomad uses job specification files written in HashiCorp Configuration Language (HCL) to define the jobs it manages. Below is a basic example of a Nomad job file (docker.nomad) to run an NGINX web server. Example: Simple Nomad Job File YAML job "nginx-job" { datacenters = ["dc1"] group "web-group" { count = 1 task "nginx" { driver = "docker" config { image = "nginx:latest" # Docker image to run ports = ["http"] } resources { network { port "http" { static = 8080 # Expose container on host's port 8080 } } } } } } Explanation of the Job File job: Defines the job name and datacenter to deploy to.group: Groups related tasks (containers) together.task: Specifies a single container to run, including the driver (docker) and configuration.config: Contains Docker-specific settings like the image to use and ports to expose.resources: Defines resource limits and networking settings for the container. Step 2: Run the Nomad Job Submit the job file to the Nomad cluster using the nomad run command: YAML nomad run docker.nomad This will schedule the NGINX container on the Nomad agent. If successful, you’ll see output indicating that the job has been successfully deployed. Step 3: Verify the Job Check the status of the job using: YAML nomad status nginx-job You should see the job details, including its allocation ID and deployment status. Step 4: Access the Running Container Find the IP address of the host running the container. If you're running locally, it’s likely 127.0.0.1.Open a web browser and visit http://localhost:8080. You should see the default NGINX welcome page. Step 5: Stop the Job To stop the container, use the nomad stop command: YAML nomad stop nginx-job This will cleanly shut down the container managed by Nomad. Advanced Examples 1. Add Environment Variables You can pass environment variables to your Docker container using the env block in the task section: YAML task "nginx" { driver = "docker" config { image = "nginx:latest" ports = ["http"] } env { APP_ENV = "production" } } 2. Mount Volumes To mount a host directory into the container, use the volumes option in the config block: YAML task "nginx" { driver = "docker" config { image = "nginx:latest" ports = ["http"] volumes = ["/host/path:/container/path"] } } 3. Scale Containers To scale the container to multiple instances, modify the count parameter in the group section: YAML group "web-group" { count = 3 # Run 3 instances of the container } Nomad will distribute the instances across available nodes in your cluster. Tips for Beginners Test in Development Mode: Start with the nomad agent -dev command for quick local testing before deploying to a production cluster.Leverage Nomad's Web UI: Use the Nomad UI (enabled by default in dev mode) to monitor jobs and allocations visually. Access it at http://localhost:4646.Use Logs for Debugging: Check logs for troubleshooting with: Plain Text nomad logs <allocation-id> Conclusion Running Docker containers with HashiCorp Nomad is a straightforward and powerful way to orchestrate workloads. By defining jobs in simple HCL files, you can easily deploy, monitor, and scale containers across your infrastructure. Whether you're just starting or looking for an alternative to Kubernetes, Nomad’s simplicity and flexibility make it an excellent choice for managing containerized applications.

By Nilesh Jain

Management Capabilities 101: Ensuring On-Time Delivery in Agile-Driven Projects

People may perceive Agile methodology and hard deadlines as two incompatible concepts. The word “Agile” is often associated with flexibility, adaptability, iterations, and continuous improvement, while “deadline” is mostly about fixed dates, finality, and time pressure. Although the latter may sound threatening, project teams can prioritize non-negotiable deadlines and simultaneously modify those that are flexible. The correct approach is the key. In this article, we’ll analyze how deadlines are perceived within an Agile framework and what techniques can help successfully manage deadlines in Agile-driven projects. Immersing Into the Vision of a Powerful Methodology RAD, Scrumban, Lean, XP, AUP, FDD... do these words sound familiar? If you’re involved in IT, you surely must have heard them before. They all are about Agile. This methodology presupposes splitting the software creation process within a project into small iterations called sprints (each typically lasting 2-3 weeks). Agile enables regular delivery of a working product increment as an alternative to a single extensive software rollout. It also fosters openness to any changes, quick feedback for continuous IT product enhancement, and more intensive communication between teams. This approach is ideal for complex projects with dynamic requirements, frequent functionality updates, and the need for continuous alignment with user feedback. Grasping How Time Limitations Are Woven Into an Agile-Driven Landscape Although Agile emphasizes boosted flexibility, it doesn’t mean that deadlines can be neglected. They must be addressed with the same level of responsibility and attention but with a more adaptable mindset. As sprints are short, unforeseen issues or alterations are contained within that specific sprint. This helps mitigate the risks of delaying the entire project and simplifies problem-solving, as only a limited part of the project is impacted at a time. Moreover, meeting deadlines in Agile projects relies heavily on accurate task estimations. If they are off the mark, project teams risk either falling behind schedule because of overcommitting or spending time aimlessly due to an insufficient workload for the sprint. If such situations happen even once, team members must reevaluate their approach to estimating tasks to better align them with team capacity. Proven Practices for Strategic Navigation of Time Constraints Let’s have a closer look at a number of practices for ensuring timely releases throughout the entire Agile development process and keep project teams moving in the right direction: 1. Foster a Steady Dialogue The majority of Agile frameworks support specific ceremonies that ensure transparency and keep team members and stakeholders informed of all project circumstances, thus effectively managing deadlines. For instance, during a daily stand-up meeting, project teams discuss current progress, objectives, and the quickest and most impactful ways of overcoming hurdles to complete all sprint tasks on time. A backlog refinement meeting is another pivotal activity during which a product owner reviews tasks in the backlog to confirm that prioritized activities are completed before each due date. A retrospective meeting performed after each sprint analyzes completed work and considers an improved approach to addressing problems in the future to minimize their effect on hitting deadlines. 2. Set Up Obligatory Sprint Planning Before each sprint, a product owner or a Scrum master needs to conduct a sprint planning meeting, during which they collaborate with software developers to decide on the efforts for each task and prioritize which items from the backlog should be completed further. To achieve this, they analyze what objectives should be attained during this sprint, what techniques will be used to fulfill them, and who will be responsible for each backlog item. This helps ensure that team members continuously progress towards specific goals, have clarity regarding the upcoming activities, and deliver high-quality output, always staying on schedule. 3. Promote Clarity for Everyone Meeting deadlines requires a transparent work environment where everyone has quick access to the current project status, especially in distributed teams. Specific tools, such as Kanban boards or task cards, contribute to achieving this. They provide a flexible shared space that gives a convenient overview of the entire workflow of tasks with highlighted priorities and due dates. This enables team members to prioritize critical tasks without delays, control task completion time, and take full accountability for their work. 4. Implement a Resilient Change Management Framework The ability to swiftly and proficiently process probable modifications in scope or objectives within a sprint directly impacts a team’s ability to adhere to time constraints. Change-handling workflows enable teams to manage adjustments continuously, reducing the risk of downtime or missed deadlines. Therefore, key project contributors, product owners, and Scrum masters can formulate a prioritization system to define which alterations should be addressed first. They also should discuss how each adjustment corresponds to milestones and the end goal. 5. Create a Clear Definition of Done The definition of done is a win-win practice that fosters straightforward criteria for marking tasks as complete. When everyone understands these criteria, they deliver more quality achievements aligned with high standards, minimize the chance of last-minute rework, and decrease the accumulation of technical debt on the project. 6. Follow Time Limits To enhance task execution, team leaders can adopt time limits — for example, restricting daily stand-ups to 15 minutes. This helps to focus on the task and avoid distractions to meet deadlines. Final Thoughts Navigating deadlines in Agile projects is a fully attainable goal that requires an effective strategy. By incorporating practices such as regular communication, sprint planning, transparency, a change management approach, a definition of done, and timeboxing, specialists can successfully accomplish short — and long-term targets without compromising set deadlines.

By Pavel Novik

Karpenter vs. Kubernetes Cluster Autoscaler: Which Is Right for You?

As organizations scale their workloads in Kubernetes, managing cluster resources efficiently becomes paramount. Kubernetes provides built-in scaling capabilities, such as the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA), but scaling the underlying nodes is where Cluster Autoscaler (CA) has been the go-to solution for years. However, Karpenter, an open-source node provisioning solution, has emerged as a strong alternative, promising more efficient and dynamic scaling. In this article, we will delve deep into the features, benefits, limitations, and use cases of Karpenter and Kubernetes Cluster Autoscaler. By the end, you will be able to determine which tool best suits your needs. Understanding Kubernetes Cluster Autoscaler Kubernetes Cluster Autoscaler (CA) is a time-tested tool that adjusts the size of a cluster to match the demands of running workloads. It achieves this by adding or removing nodes based on unschedulable pods and underutilized nodes. Here’s a breakdown of its functionality: Key Features of Cluster Autoscaler Pod-centric scaling: Detects unschedulable pods and adjusts node capacity to fit them.Node downsizing: Identifies underutilized nodes and scales them down when they no longer host any workloads.Integration with cloud providers: Works seamlessly with major cloud platforms like AWS, GCP, and Azure.Customizable behavior: Supports scaling policies, taints, tolerations, and labels to meet specific workload requirements. Strengths of Cluster Autoscaler Mature and proven: It has been part of Kubernetes since 2016, making it highly reliable.Cloud-native compatibility: Excellent support for managed Kubernetes services like Amazon EKS, Google GKE, and Azure AKS.Cost optimization: Shrinks the cluster by removing unused nodes to save costs. Challenges of Cluster Autoscaler Static scaling mechanism: Decisions are based on a set of preconfigured rules, leading to inefficiencies in dynamic workloads.Latency in scaling: CA may introduce delays in scaling, especially when managing large or bursty workloads.Limited customizability: While it supports scaling parameters, its operational model is less flexible compared to newer solutions like Karpenter. Introducing Karpenter Karpenter, developed by AWS, is an open-source, next-generation provisioning system designed to optimize Kubernetes cluster resources. Unlike Cluster Autoscaler, Karpenter focuses on flexibility, efficiency, and workload-aware scaling. Key Features of Karpenter Real-time scaling: Rapidly provisions and deprovisions nodes based on workload requirements.Dynamic instance selection: Chooses the most cost-effective and performant instances in real-time, leveraging Spot Instances when possible.Workload awareness: Supports fine-grained workload characteristics, such as GPU-based instances, ephemeral storage, and specific labels.Native integration: Deeply integrates with Kubernetes APIs, making it cloud-agnostic. Strengths of Karpenter Dynamic scaling: Optimized for real-time scaling needs, making it ideal for bursty or unpredictable workloads.Cloud-agnostic: Works with any Kubernetes cluster, including on-premise and edge environments.Cost efficiency: Supports automatic selection of Spot Instances, Reserved Instances, or Savings Plans to minimize cost.Resource optimization: Provisions nodes tailored to workload requirements, reducing resource waste. Challenges of Karpenter Newer technology: Still maturing compared to Cluster Autoscaler, with fewer production case studies.Learning curve: Requires a deeper understanding of Kubernetes workload patterns and configuration.AWS-centric focus: While cloud-agnostic, its tight integration with AWS services makes it more suitable for AWS-heavy environments. Head-to-Head Comparison Feature/Aspect Cluster Autoscaler Karpenter Scaling Speed Moderate Fast Flexibility in Node Types Limited to predefined configurations Dynamic, based on workload needs Cloud Provider Support Comprehensive Cloud-agnostic Ease of Use Simpler to configure Requires deeper expertise Cost Efficiency Moderate High, with Spot/Reserved instance optimization Maturity Established and stable Emerging and rapidly evolving Integration Works with cloud-managed Kubernetes Direct integration with Kubernetes APIs Use Cases: When to Choose What Choose Kubernetes Cluster Autoscaler if: You prefer a tried-and-true solution with broad community support.Your workloads have predictable scaling patterns.You are using a managed Kubernetes service, such as EKS, AKS, or GKE.You have limited expertise in managing node-level scaling and need a simpler tool. Choose Karpenter if: Your workloads are dynamic and bursty, requiring real-time scaling.You need fine-grained control over node provisioning and types.Cost optimization through Spot Instances or Savings Plans is a top priority.You are building cloud-agnostic or hybrid Kubernetes deployments.You operate in an AWS-heavy environment, leveraging its deep integration. Practical Considerations Setup and Configuration Cluster Autoscaler requires setting scaling policies and often depends on the node group configuration provided by the cloud provider.Karpenter requires installing a controller in the cluster and configuring it to interact with workload-specific needs dynamically. Performance Tuning Cluster Autoscaler may need fine-tuning of its thresholds and settings to avoid overprovisioning or underutilization.Karpenter adapts to real-time requirements but needs workload profiling to achieve optimal efficiency. The Verdict Both Karpenter and Kubernetes Cluster Autoscaler are powerful tools, but their utility depends on your unique requirements: Cluster Autoscaler is a reliable and straightforward choice for most managed Kubernetes clusters.Karpenter excels in environments requiring rapid scaling, workload-specific node provisioning, and cost optimization. For organizations prioritizing agility and cost efficiency, Karpenter is a forward-looking solution. For those needing stability and ease of use, Cluster Autoscaler remains a solid option. By carefully evaluating your workload patterns, cloud environment, and scaling requirements, you can choose the tool that aligns with your operational goals. Author’s Note: As the Kubernetes ecosystem evolves, scaling strategies will continue to improve. Keeping abreast of the latest developments in tools like Karpenter and Cluster Autoscaler can ensure your clusters remain efficient, cost-effective, and reliable.

By Sai Sandeep Ogety

CORE

Mastering Cloud Containerization: A Step-by-Step Guide to Deploying Containers in the Cloud

Containers have transformed how we deploy, scale, and manage applications by packaging code and dependencies in a standardized unit that can run consistently across any environment. When used in cloud environments, containers offer: Portability across development, testing, and production.Scalability to quickly adapt to traffic and demand.Efficiency with reduced overhead compared to traditional virtual machines. In this tutorial, we’ll walk through a full setup of a cloud-hosted containerized application, covering: Basics of containers and why they’re ideal for the cloud.Setting up a Dockerized application.Deploying the container to a cloud provider (using Google Cloud Platform as an example).Scaling and managing your container in the cloud. Container Basics: How Containers Fit Into Cloud Workflows Containers encapsulate all the libraries and dependencies needed to run an application. Unlike traditional virtual machines, in which each has an OS, containers share the host OS, making them lightweight and efficient. Why Containers for Cloud? Fast startup times mean quicker scaling to handle variable traffic.Consistency across environments ensures that code behaves the same from developer laptops to production.Resource efficiency enables high-density deployment on the same infrastructure. Core Components of Cloud Containerization Container Engine: Manages and runs containers (e.g., Docker, containerd).Orchestration: Ensures app reliability, scaling, and load balancing (e.g., Kubernetes, ECS).Registry: Stores container images for access across environments (e.g., Docker Hub, GCR). Setting Up a Dockerized Application We’ll start by containerizing a simple Node.js application. Step 1: Create the Application 1. In a project folder, initialize a Node.js project: Shell mkdir cloud-container-app && cd cloud-container-app npm init -y 2. Create a basic server file, app.js: JavaScript const express = require('express'); const app = express(); app.get('/', (req, res) => { res.send('Hello, Cloud Container!'); }); app.listen(3000, () => { console.log('App running on port 3000'); }); 3. Add Express to the project: Shell npm install express Step 2: Create a Dockerfile This Dockerfile specifies how to package the app in a Docker container. Dockerfile # Use the Node.js image as a base FROM node:14 # Set working directory WORKDIR /app # Copy files and install dependencies COPY . . RUN npm install # Expose the app’s port EXPOSE 3000 # Start the application CMD ["node", "app.js"] Step 3: Build and Test the Docker Image Locally 1. Build the Docker image: Shell docker build -t cloud-container-app . 2. Run the container locally: Shell docker run -p 3000:3000 cloud-container-app 3. Visit `http://localhost:3000` in your browser. You should see "Hello, Cloud Container!" displayed. Deploying the Container to Google Cloud Platform (GCP) In this section, we’ll push the container image to Google Container Registry (GCR) and deploy it to Google Kubernetes Engine (GKE). Step 1: Set Up a GCP Project 1. Create a GCP Project. Go to the [Google Cloud Console] (https://console.cloud.google.com) and create a new project. 2. Enable the Kubernetes Engine and Container Registry APIs for your project. Step 2: Push the Image to Google Container Registry 1. Tag the Docker image for Google Cloud: Shell docker tag cloud-container-app gcr.io/<YOUR_PROJECT_ID>/cloud-container-app 2. Push the image to GCR: Shell docker push gcr.io/<YOUR_PROJECT_ID>/cloud-container-app Step 3: Create a Kubernetes Cluster 1. Initialize a GKE Cluster: Shell gcloud container clusters create cloud-container-cluster --num-nodes=2 2. Configure kubectl to connect to your new cluster: Shell gcloud container clusters get-credentials cloud-container-cluster Step 4: Deploy the Containerized App to GKE 1. Create a k8s-deployment.yaml file to define the deployment and service: YAML apiVersion: apps/v1 kind: Deployment metadata: name: cloud-container-app spec: replicas: 2 selector: matchLabels: app: cloud-container-app template: metadata: labels: app: cloud-container-app spec: containers: - name: cloud-container-app image: gcr.io/<YOUR_PROJECT_ID>/cloud-container-app ports: - containerPort: 3000 --- apiVersion: v1 kind: Service metadata: name: cloud-container-service spec: type: LoadBalancer selector: app: cloud-container-app ports: - protocol: TCP port: 80 targetPort: 3000 2. Deploy the application to GKE: Shell kubectl apply -f k8s-deployment.yaml 3. Get the external IP to access the app: Shell kubectl get services Scaling and Managing Containers in GKE Step 1: Scale the Deployment To adjust the number of replicas (e.g., to handle higher traffic), use the following command: Shell kubectl scale deployment cloud-container-app --replicas=5 GKE will automatically scale the app by adding replicas and distributing the load. Step 2: Monitor and Manage Logs 1. Use the command below to view logs for a specific pod in Kubernetes: Shell kubectl logs <POD_NAME> 2. Enable the GKE dashboard to monitor pod status and resource usage and manage deployments visually. Conclusion: Leveraging Containers for Scalable Cloud Deployments This tutorial covered: Containerizing a Node.js app using Docker.Deploying to Google Kubernetes Engine (GKE) for scalable, managed hosting.Managing and scaling containers effectively in the cloud. With containers in the cloud, your applications gain scalability, portability, and efficiency — ensuring they’re ready to handle demand, adapt quickly, and remain consistent across environments.

By Kuppusamy Vellamadam Palavesam

Introducing the MERGE Command in PostgreSQL 15

Developers often need to merge data from external sources to the base table. The expectation of this merge operation is that data from external sources refresh data at the main table. Refresh means inserting new records, updating existing records, and deleting if a record is not found. Since the 9.4 version release, PostgreSQL has supported the INSERT command with the ‘ON CONFLICT’ clause. While this proved to be a workaround for the MERGE command, many features, such as conditional delete and simplicity of query, were missing. Other databases, such as Oracle and SQL Server, already support MERGE commands. Until PostgreSQL 15, it was still one of the long-awaited features for PostgreSQL users. This article discusses the advantages, use cases, and benchmarking of the MERGE command, which is newly supported by PostgreSQL 15. How the MERGE Command Works in PostgreSQL The MERGE command selects rows from one or more external data sources to UPDATE, DELETE, or INSERT on the target table. Depending on the condition used in the MERGE command, the user running the MERGE command should have UPDATE, DELETE, and INSERT privileges on the target table. The user should also have SELECT privileges on the source table(s). A typical PostgreSQL MERGE command looks like as below: SQL MERGE INTO target_table t1 USING source_table t2 ON t2.col1 = t2.col1 WHEN MATCHED THEN UPDATE/INSERT/DELETE ... WHEN NOT MATCHED THEN UPDATE/INSERT/DELETE; Example of a MERGE Command The image below explains a typical use case of the MERGE command in PostgreSQL. The source data that matches the target data can be updated or deleted, and the source data that does not match can be inserted into the target data. For example, the image below shows a classic data warehouse populated with data from various data marts. Assume the data warehouse hosts data from IOT sensor stations. These stations host data marts 1, 2, 3, and so forth at respective stations. These data marts are hosted locally on the stations. In the steps below, we’ll merge data from different stations to the main stations. Step-by-Step Step 1: Create the Data Warehouse Table (Target Station Table) Create the data warehouse table, the target station table that is loaded by other stations: SQL Postgres=> CREATE table station_main ( station_id integer primary key , data text , create_time timestamp default current_timestamp , update_time timestamp default current_timestamp ); CREATE TABLE Step 2: Insert Sample Data into the Target Table Load some sample records into the target table: SQL postgres=> INSERT into station_main VALUES (1, 'data11'), (2, 'data22'), (3, 'data44'), (4, 'data44'), (5, 'data55'); INSERT 0 5 Step 3: Verify Data Insertion Check the data to ensure it's inserted correctly: SQL postgres=> SELECT * from station_main; station_id | data | create_time | update_time ------------+--------+---------------------------+--------------------------- 1 | data11 | 2023-08-11 21:21:30.19364 | 2023-08-11 21:21:30.19364 2 | data22 | 2023-08-11 21:21:30.19364 | 2023-08-11 21:21:30.19364 3 | data44 | 2023-08-11 21:21:30.19364 | 2023-08-11 21:21:30.19364 4 | data44 | 2023-08-11 21:21:30.19364 | 2023-08-11 21:21:30.19364 5 | data55 | 2023-08-11 21:21:30.19364 | 2023-08-11 21:21:30.19364 (5 rows) Step 4: Create a Table for Various Stations Create a table for data from other stations: SQL postgres=> create table station_1 ( station_id integer , data text ); CREATE TABLE Time: 4.288 ms Step 5: Insert Sample Data into the Station Table Insert sample records into the newly created station table: SQL postgres=> INSERT INTO station_1 VALUES (1, 'data11'), (2, 'data22'), (3, 'data44'), (4, 'data44'), (5, 'data55'); INSERT 0 5 Time: 2.221 ms Step 6: Verify Data in Station 1 Check the inserted data in station_1: SQL postgres=> SELECT * from station_1; station_id | data ------------+-------- 1 | data11 2 | data22 3 | data44 4 | data44 5 | data55 (5 rows) Step 7: Insert New Data and Update Existing Data in Station 1 Assuming station_1 updates some base values and inserts new values, insert new data and update existing data: SQL postgres=> INSERT INTO station_1 VALUES (6, 'data66'); INSERT 0 1 postgres=> postgres=> UPDATE station_1 set data='data10' where station_id=1; UPDATE 1 postgres=> postgres=> SELECT * from station_1; station_id | data ------------+--------- 2 | data22 3 | data44 4 | data44 5 | data55 6 | data66 1 | data10 (6 rows) Step 8: Merge Data from Station 1 to Station Main If station_1 only inserts new data, we could have used INSERT INTO SELECT FROM command. But, in this scenario, few data have been modified, so we need to merge the data: SQL postgres=> MERGE INTO station_main sm USING station_1 s ON sm.station_id=s.station_id when matched then update set data=s.data WHEN NOT MATCHED THEN INSERT (station_id, data) VALUES (s.station_id, s.data); MERGE 6 Verify data after the merge: SQL postgres=> SELECT * from station_main ; postgres=> SELECT * from station_main ; station_id | data | create_time | update_time ------------+--------+----------------------------+---------------------------- 2 | data22 | 2023-08-11 21:27:23.076226 | 2023-08-11 21:27:23.076226 3 | data44 | 2023-08-11 21:27:23.076226 | 2023-08-11 21:27:23.076226 4 | data44 | 2023-08-11 21:27:23.076226 | 2023-08-11 21:27:23.076226 5 | data55 | 2023-08-11 21:27:23.076226 | 2023-08-11 21:27:23.076226 6 | data66 | 2023-08-11 21:29:13.354859 | 2023-08-11 21:29:13.354859 1 | data10 | 2023-08-11 21:27:23.076226 | 2023-08-11 21:27:23.076226 (6 rows) Advantages of the Merge Command MERGE command helps easily manage a set of external data files such as application log files. For example, you can host data and log files into separate tablespaces hosted at cheaper storage disks and then create a MERGE table to use them as one. Other advantages include: Better speed: Small source tables perform better than a single large table. You can split a large source table based on some clause and then use individual tables to merge into the target table.More efficient searches: Searching in underlying smaller tables is quicker than searching in one large table.Easier table repair: Repairing individual smaller tables is easier than a large table. Here, repairing tables means removing data anomalies.Instant mapping: A MERGE table does not need to maintain an index of its own, making it fast to create or remap. As a result, MERGE table collections are very fast to create or remap. (You must still specify the index definitions when you create a MERGE table, even though no indexes are created.) If you have a set of tables from which you create a large table on demand, you can instead create a MERGE table from them on demand. This is much faster and saves a lot of disk space. Use Cases of MERGE Command in PostgreSQL 15 Below are the common use cases of MERGE commands: IoT data collecting from various machines: Data coming from various machines can be saved in smaller temporary tables and can be merged with target data at certain intervals. E-commerce data: Product availability data coming from various fulfillment centers can be merge with central repository of target product table. Customer data: Customer transaction data coming from different sources can be merged with main customer data. Automation of MERGE Command in PostgreSQL Using pg_cron Automating MERGE is important for error-prone MERGE execution and reducing maintenance hassle. PostgreSQL supports pg_cron extension. This extension is used to configure scheduled tasks for a PostgreSQL database. A typical pg_cron command mentioned below shows the scheduling options: SELECT cron.schedule('Minute Hour Date Month Day of the week', 'Task'); The pg_cron job below schedules MERGE command every minute: SQL SELECT cron.schedule('30 * * * *', $$MERGE INTO station_main sm USING station_1 s ON sm.station_id=s.station_id when matched then update set data=s.data WHEN NOT MATCHED THEN INSERT (station_id, data) VALUES (s.station_id, s.data);$$); Benchmarking: MERGE vs. UPSERT In this section, we will benchmark and compare traditional UPSERT (i.e., INSERT with ‘ON CONFLICT’ clause) and the MERGE command. Setup Step 1: Create a target table with 1,000,000 records, with id column as primary key: SQL CREATE table target_table as (SELECT generate_series(1,1000000) AS id, floor(random() * 1000) AS data); ALTER TABLE target_table ADD PRIMARY KEY (id); Step 2: Create a source data table with all the contents of the target table and id column as a primary key: SQL CREATE table source_table as SELECT * from target_table; ALTER TABLE source_table ADD PRIMARY KEY (id); Step 3: Create a temporary table with additional data: SQL CREATE table source_table_temp as (SELECT generate_series(1000001,1200000) AS id, floor(random() * 1000) AS data); Step 4: Insert data from the temporary table to the source table and update 400,000 rows: Step 5: Turn timing on and run the UPSERT command. The total time taken for the UPSERT operation is 5904 ms. Now, we repeat the above steps and run the MERGE command. The total time it takes for the MERGE operation is 4,484 ms. For our specific use case, we'll find that the MERGE command performed 30% better compared to the UPSERT operation. Conclusion MERGE command has been one of the most popular features added to PostgreSQL 15. It improved the performance of table refresh and makes table maintenance easier. Further, newly supported extensions, such as pg_cron can be used to automate MERGE operation.

By Vivek Singh

A Framework for Developing Service-Level Objectives: Essential Guidelines and Best Practices for Building Effective Reliability Targets

Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, Observability and Performance: The Precipice of Building Highly Performant Software Systems. "Quality is not an act, it's a habit," said Aristotle, a principle that rings true in the software world as well. Specifically for developers, this means delivering user satisfaction is not a one-time effort but an ongoing commitment. To achieve this commitment, engineering teams need to have reliability goals that clearly define the baseline performance that users can expect. This is precisely where service-level objectives (SLOs) come into picture. Simply put, SLOs are reliability goals for products to achieve in order to keep users happy. They serve as the quantifiable bridge between abstract quality goals and the day-to-day operational decisions that DevOps teams must make. Because of this very importance, it is critical to define them effectively for your service. In this article, we will go through a step-by-step approach to define SLOs with an example, followed by some challenges with SLOs. Steps to Define Service-Level Objectives Like any other process, defining SLOs may seem overwhelming at first, but by following some simple steps, you can create effective objectives. It's important to remember that SLOs are not set-and-forget metrics. Instead, they are part of an iterative process that evolves as you gain more insight into your system. So even if your initial SLOs aren't perfect, it's okay — they can and should be refined over time. Figure 1. Steps to define SLOs Step 1: Choose Critical User Journeys A critical user journey refers to the sequence of interactions a user takes to achieve a specific goal within a system or a service. Ensuring the reliability of these journeys is important because it directly impacts the customer experience. Some ways to identify critical user journeys can be through evaluating revenue/business impact when a certain workflow fails and identifying frequent flows through user analytics. For example, consider a service that creates virtual machines (VMs). Some of the actions users can perform on this service are browsing through the available VM shapes, choosing a region to create the VM in, and launching the VM. If the development team were to order them by business impact, the ranking would be: Launching the VM because this has a direct revenue impact. If users cannot launch a VM, then the core functionality of the service has failed, affecting customer satisfaction and revenue directly.Choosing a region to create the VM. While users can still create a VM in a different region, it may lead to a degraded experience if they have a regional preference. This choice can affect performance and compliance.Browsing through the VM catalog. Although this is important for decision making, it has a lower direct impact on the business because users can change the VM shape later. Step 2: Determine Service-Level Indicators That Can Track User Journeys Now that the user journeys are defined, the next step is to measure them effectively. Service-level indicators (SLIs) are the metrics that developers use to quantify system performance and reliability. For engineering teams, SLIs serve a dual purpose: They provide actionable data to detect degradation, guide architectural decisions, and validate infrastructure changes. They also form the foundation for meaningful SLOs by providing the quantitative measurements needed to set and track reliability targets. For instance, when launching a VM, some of the SLIs can be availability and latency. Availability: Out of the X requests to launch a VM, how many succeeded? A simple formula to calculate this is: If there were 1,000 requests and 998 requests out of them succeeded, then the availability is = 99.8%. Latency: Out of the total number of requests to launch a VM, what time did the 50th, 95th, or 99th percentile of requests take to launch the VM? The percentiles here are just examples and can vary depending on the specific use case or service-level expectations. In a scenario with 1,000 requests where 900 requests were completed in 5 seconds and the remaining 100 took 10 seconds, the 95th percentile latency would be = 10 seconds.While averages can also be used to calculate latencies, percentiles are typically recommended because they account for tail latencies, offering a more accurate representation of the user experience. Step 3: Identify Target Numbers for SLOs Simply put, SLOs are the target numbers we want our SLIs to achieve in a specific time window. For the VM scenario, the SLOs can be: The availability of the service should be greater than 99% over a 30-day rolling window.The 95th percentile latency for launching the VMs should not exceed eight seconds. When setting these targets, some things to keep in mind are: Using historical data. If you need to set SLOs based on a 30-day rolling period, gather data from multiple 30-day windows to define the targets. If you lack this historical data, start with a more manageable goal, such as aiming for 99% availability each day, and adjust it over time as you gather more information. Remember, SLOs are not set in stone; they should continuously evolve to reflect the changing needs of your service and customers. Considering dependency SLOs. Services typically rely on other services and infrastructure components, such as databases and load balancers. For instance, if your service depends on a SQL database with an availability SLO of 99.9%, then your service's SLO cannot exceed 99.9%. This is because the maximum availability is constrained by the performance of its underlying dependencies, which cannot guarantee higher reliability. Challenges of SLOs It might be intriguing to set the SLO as 100%, but this is impossible. A 100% availability, for instance, means that there is no room for important activities like shipping features, patching, or testing, which is not realistic. Defining SLOs requires collaboration across multiple teams, including engineering, product, operations, QA, and leadership. Ensuring that all stakeholders are aligned and agree on the targets is essential for the SLO to be successful and actionable. Step 4: Account for Error Budget An error budget is the measure of downtime a system can afford without upsetting customers or breaching contractual obligations. Below is one way of looking at it: If the error budget is nearly depleted, the engineering team should focus on improving reliability and reducing incidents rather than releasing new features.If there's plenty of error budget left, the engineering team can afford to prioritize shipping new features as the system is performing well within its reliability targets. There are two common approaches to measuring the error budget: time based and event based. Let's explore how the statement, "The availability of the service should be greater than 99% over a 30-day rolling window," applies to each. Time-Based Measurement In a time-based error budget, the statement above translates to the service being allowed to be down for 43 minutes and 50 seconds in a month, or 7 hours and 14 minutes in a year. Here's how to calculate it: Determine the number of data points. Start by determining the number of time units (data points) within the SLO time window. For instance, if the base time unit is 1 minute and the SLO window is 30 days: Calculate the error budget. Next, calculate how many data points can "fail" (i.e., downtime). The error budget is the percentage of allowable failure. Convert this to time: This means the system can experience 7 hours and 14 minutes of downtime in a 30-day window. Last but not least, the remaining error budget is the difference between the total possible downtime and the downtime already used. Event-Based Measurement For event-based measurement, the error budget is measured in terms of percentages. The aforementioned statement translates to a 1% error budget in a 30-day rolling window. Let's say there are 43,200 data points in that 30-day window, and 100 of them are bad. You can calculate how much of the error budget has been consumed using this formula: Now, to find out how much error budget remains, subtract this from the total allowed error budget (1%): Thus, the service can still tolerate 0.77% more bad data points. Advantages of Error Budget Error budgets can be utilized to set up automated monitors and alerts that notify development teams when the budget is at risk of depletion. These alerts enable them to recognize when a greater caution is required while deploying changes to production. Teams often face ambiguity when it comes to prioritizing features vs. operations. Error budget can be one way to address this challenge. By providing clear, data-driven metrics, engineering teams are able to prioritize reliability tasks over new features when necessary. The error budget is among the well-established strategies to improve accountability and maturity within the engineering teams. Cautions to Take With Error Budgets When there is extra budget available, developers should actively look into using it. This is a prime opportunity to deepen the understanding of the service by experimenting with techniques like chaos engineering. Engineering teams can observe how the service responds and uncover hidden dependencies that may not be apparent during normal operations. Last but not least, developers must monitor error budget depletion closely as unexpected incidents can rapidly exhaust it. Conclusion Service-level objectives represent a journey rather than a destination in reliability engineering. While they provide important metrics for measuring service reliability, their true value lies in creating a culture of reliability within organizations. Rather than pursuing perfection, teams should embrace SLOs as tools that evolve alongside their services. Looking ahead, the integration of AI and machine learning promises to transform SLOs from reactive measurements into predictive instruments, enabling organizations to anticipate and prevent failures before they impact users. Additional resources: Implementing Service Level Objectives, Alex Hidalgo, 2020 "Service Level Objects," Chris Jones et al., 2017 "Implementing SLOs," Steven Thurgood et al., 2018 Uptime/downtime calculator This is an excerpt from DZone's 2024 Trend Report, Observability and Performance: The Precipice of Building Highly Performant Software Systems.Read the Free Report

By Siri Varma Vegiraju

CORE

DevOps and CI/CD

DZone's Featured DevOps and CI/CD Resources

Top DevOps and CI/CD Experts

The Latest DevOps and CI/CD Topics