Sales Forecasting With Snowflake Cortex ML Functions
Snowflake Cortex is a collection of integrated machine learning models and AI features designed to implement AI-driven solutions within the Snowflake environment.
Join the DZone community and get the full member experience.
Join For FreeSnowflake Cortex is a suite of Machine Learning (ML) and Artificial Intelligence (AI) capabilities letting businesses leverage the power of computing on their data. The machine learning functions like FORECAST, TOP_INSIGHTS
and ANOMALY_DETECTION
allows access to the leading large language models (LLMs) for working on both structured and unstructured data through SQL statements. Using these functions, data/business analysts can produce estimations, and recommendations and identify abnormalities within their data without knowing Python or other programming languages and without an understanding of building large language models.
FORECAST: SNOWFLAKE.ML.FORECAST
function enables businesses to forecast the metrics based on historical performance. You can use these functions to forecast future demand, Pipeline gen, sales, and revenue over a period.ANOMALY_DETECTION
:SNOWFLAKE.ML.ANOMALY_DETECTION
function helps flag outliers based on both unsupervised and supervised learning models. These functions can be used to identify the spikes in your key performance indicators and track the abnormal trends.TOP_INSIGHTS: SNOWFLAKE.ML.TOP_INSIGHTS
function enables the analysts to root cause the significant contributors to a particular metric of interest. This can help you track the drivers like demand channels driving your sales, and agents dragging your customer satisfaction down.
In this article, I will focus on exploring the FORECAST
function to implement the time series forecast model to estimate the sales for a superstore based on the historical sales.
Data Setup and Exploration
For the purpose of this article, we will use the historical Superstore Sales data along with the holiday calendar. The following code block can be used to create both the tables being used in this article and visualize the historical sales data.
CREATE OR REPLACE TABLE superstore.superstore_ml_functions.superstore_sales(
Order_Date DATE,
Segment VARCHAR(16777216),
Region VARCHAR(16777216),
Category VARCHAR(16777216),
Sub_Category VARCHAR(16777216),
Sales NUMBER(17,0)
);
CREATE OR REPLACE TABLE superstore.superstore_ml_functions.us_calender(
Date DATE,
HOLIDAY VARCHAR(16777216)
);
select * from superstore.superstore_ml_functions.superstore_sales where category = 'Technology';
Having explored the historical sales, I would train the forecast model based on the last 12 months of sales. The following code can be used to create the training data table.
CREATE OR REPLACE TABLE superstore_sales_last_year AS (
SELECT
to_timestamp_ntz(Order_Date) AS timestamp,
Segment,
Category,
Sub_Category,
Sales
FROM
superstore_sales
WHERE
Order_Date > (SELECT max(Order_Date) - interval '1 year' FROM superstore_sales)
GROUP BY
all
);
Train the Forecast Model
SNOWFLAKE.ML.FORECAST SQL
function can be used to train the forecast model based on the historical data, in this section we will create a view to be used as a training dataset for technology sales and train the model.
CREATE OR REPLACE VIEW technology_sales AS (
SELECT
timestamp,
sum(Sales) as Sales
FROM
superstore_sales_last_year
WHERE
category = 'Technology'
group by timestamp
);
CREATE OR REPLACE SNOWFLAKE.ML.FORECAST technology_forecast (
INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'technology_sales'),
TIMESTAMP_COLNAME => 'TIMESTAMP',
TARGET_COLNAME => 'SALES'
);
SHOW SNOWFLAKE.ML.FORECAST;
Creating and Visualizing the Forecasts
Having trained the forecast model, let’s use the following code block to create predictions for the next 90 days.
CALL technology_forecast!FORECAST(FORECASTING_PERIODS => 90);
-- Run immediately after the above call to store results!
CREATE OR REPLACE TABLE technology_predictions AS (
SELECT
*
FROM
TABLE(RESULT_SCAN(-1))
);
SELECT
timestamp,
sales,
NULL AS forecast
FROM
technology_sales
WHERE
timestamp > '2023-01-01'
UNION
SELECT
TS AS timestamp,
NULL AS sales,
forecast
FROM
technology_predictions
ORDER BY
timestamp asc;
The trend line in YELLOW in the above chart visualizes the predictions for the same in the next 90 days.
Conclusion
In the end, in this article, we have explored the SNOWFLAKE.ML.FORECAST
function to build an LLM forecast model for a superstore sales prediction, visualized the historical data, created necessary training datasets, build the forecast model, and visualized the estimations. As a next step, I would recommend continued exploration of the Snowflake Cortex framework to build multiple forecast models based on dimensions, anomaly detection, and top insights based on in-house large language models.
Opinions expressed by DZone contributors are their own.
Comments