Snowflake Data Time Travel
Explore Snowflake's data recovery time-travel feature that restores past data at the table, schema, or database level and is helpful for restores and backups.
Snowflake is a cloud-based data storage and analytics service that provides solutions for data warehousing, data engineering, AI/ML modeling, and related workloads. One of its most powerful data recovery features is Time Travel, which lets users access historical data (data that has since been changed or deleted) at any point within a defined retention period. It is useful in scenarios such as:
- Retrieving the previous row or column value before the current DML operation
- Recovering the last state of data for backup or redundancy
- Updating or deleting records from the table by mistake
- Restoring the previous state of the table, schema, or database
Time Travel is part of Snowflake's Continuous Data Protection life cycle. The retention period defaults to 1 day. Standard edition allows a retention of 0 or 1 day, while Enterprise edition and higher allow up to 90 days for permanent objects.
Time Travel SQL Extensions
Time Travel is used through the AT and BEFORE clauses, each of which takes one of three parameters: OFFSET, TIMESTAMP, or STATEMENT.
Offset
To query past data or recover a table as it existed a given amount of time ago, use OFFSET, which is the difference from the current time in seconds.
SELECT * FROM any_table AT(OFFSET => -60*5); -- For 5 Minutes
CREATE TABLE recovered_table CLONE any_table AT(OFFSET => -3600); -- For 1 Hour
Timestamp
To query data or recover a schema as of a specific point in time, pass a timestamp instead, as in the queries below.
SELECT * FROM any_table AT(TIMESTAMP => 'Sun, 05 May 2024 16:20:00 -0700'::timestamp_tz);
CREATE SCHEMA recovered_schema CLONE any_schema AT(TIMESTAMP => 'Wed, 01 May 2024 01:01:00 +0300'::timestamp_tz);
Statement
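A query ID can also serve as the reference point. With BEFORE, Snowflake returns the data as it was immediately before the given statement ran; with AT, the changes made by that statement are included.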
SELECT * FROM any_table BEFORE(STATEMENT => '9f6e1bq8-006f-55d3-a757-beg5a45c1234');
CREATE DATABASE recovered_db CLONE any_db BEFORE(STATEMENT => '9f6e1bq8-006f-55d3-a757-beg5a45c1234');
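As a rough illustration of the "deleted by mistake" scenario, the sketch below looks up the query ID of a recent DELETE and then puts back the rows it removed. The ILIKE pattern on query_text and the placeholder <query_id> are illustrative, and the sketch assumes the id column uniquely identifies a row.

```sql
-- Find the query ID of the offending statement (the ILIKE pattern is illustrative).
SELECT query_id, query_text, start_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE query_text ILIKE 'DELETE FROM any_table%'
ORDER BY start_time DESC
LIMIT 5;

-- Re-insert only the rows that existed before that statement and are missing now.
-- Replace <query_id> with the value returned above; assumes id is a unique key.
INSERT INTO any_table
SELECT old.*
FROM any_table BEFORE(STATEMENT => '<query_id>') AS old
LEFT JOIN any_table AS cur
  ON cur.id = old.id
WHERE cur.id IS NULL;
```

Re-inserting only the missing rows leaves any changes made after the mistake untouched. If the whole table should return to its earlier state instead, cloning it with BEFORE(STATEMENT => ...) as shown above is the simpler route.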
The commands below set the data retention period when a table is created and change it afterward.
CREATE TABLE any_table(id NUMERIC, name VARCHAR, created_date DATE) DATA_RETENTION_TIME_IN_DAYS=90;
ALTER TABLE any_table SET DATA_RETENTION_TIME_IN_DAYS=30;
If data retention is not required, set the period to zero with SET DATA_RETENTION_TIME_IN_DAYS=0; which disables Time Travel for that object.
Objects that do not have an explicitly defined retention period inherit it from the level above. For instance, a table without a specified retention period inherits it from its schema, and a schema without one inherits it from its database. The account level is the top of the hierarchy. Setting it to 0 days means new objects get no Time Travel unless a retention period is set on them explicitly, so that value should be chosen deliberately.
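The inheritance chain can be sketched as follows. The retention values are arbitrary (values above 1 day assume Enterprise edition or higher), and any_db and any_schema reuse the placeholder names from the earlier examples.

```sql
-- Set a default at the database level; schemas and tables without
-- their own setting inherit it.
ALTER DATABASE any_db SET DATA_RETENTION_TIME_IN_DAYS = 7;

-- Override the default for one schema.
ALTER SCHEMA any_db.any_schema SET DATA_RETENTION_TIME_IN_DAYS = 14;

-- Remove a table-level override so the table falls back to the inherited value.
ALTER TABLE any_db.any_schema.any_table UNSET DATA_RETENTION_TIME_IN_DAYS;

-- Check the effective value and the level it comes from.
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE any_db.any_schema.any_table;
```

The level column in the SHOW PARAMETERS output indicates where the effective value was set, which helps when tracing why a table has a particular retention period.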
Now consider a case where a table, schema, or database is dropped by accident. A dropped object is not removed immediately: Snowflake keeps it until its data retention period expires. During that window, the object can be brought back with the SQL below.
UNDROP TABLE any_table;
UNDROP SCHEMA any_schema;
UNDROP DATABASE any_database;
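If a user creates a table with the same name as a dropped table, Snowflake creates a new table rather than restoring the old one. UNDROP restores the dropped object under its original name, so it fails while another object with that name exists; the newer object has to be renamed first. The user also needs the OWNERSHIP privilege on the object, and the privilege to create that type of object in its parent, to restore it.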
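When the original name has been reused, one way to get the dropped table back is to move the newer table out of the way first. A minimal sketch, where any_table_new is a hypothetical name:

```sql
-- List current and dropped versions of the table; dropped versions
-- show a timestamp in the dropped_on column.
SHOW TABLES HISTORY LIKE 'any_table';

-- Free up the name held by the newer table.
ALTER TABLE any_table RENAME TO any_table_new;

-- Restore the most recently dropped version under its original name.
UNDROP TABLE any_table;
```

If the table was dropped and recreated several times, repeating the rename and UNDROP steps walks back through the dropped versions one at a time, newest first.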
Once the retention period ends, a permanent object moves into Snowflake Fail-safe, where users can no longer query or restore it themselves. Fail-safe keeps the data for a further 7 days, and during that time it can be recovered only with the help of Snowflake support.
Challenges
Time travel, though useful, has a few challenges, as shown below.
- Transient and temporary tables support a retention period of only 0 or 1 day, with 1 day as the default.
- Time Travel applies to tables, schemas, and databases; other objects such as views, UDFs, and stored procedures are not covered.
- If a table is dropped and recreated with the same name, Time Travel refers to the latest version by default, so reaching the older version requires renaming the current table first.
Conclusion
The Time Travel feature is quick, easy, and powerful. It's always handy and gives users more comfort while operating on production-sensitive data. The great thing is that users with the right privileges can run these queries themselves without having to involve admins. With up to 90 days of retention on Enterprise edition and higher, users have more than enough time to query back in time or fix any incorrectly updated data. In my opinion, it is Snowflake's strongest feature.