Snowflake Administration: A Comprehensive Step-by-Step Guide
Managing Snowflake involves overseeing various tasks to ensure optimal performance, security, and data integrity.
Join the DZone community and get the full member experience.
Join For FreeSnowflake is a powerful cloud-based data warehousing platform renowned for its scalability, flexibility, and ease of use.
As an administrator, managing Snowflake involves overseeing various tasks to ensure:
- Optimal performance
- Security
- Data integrity
Let’s deep dive into the day-to-day activities of an administrator.
Snowflake Architecture and Administration View
Snowflake Administration Overview
1. User Management
User management is a critical activity of an administrator. Effective user management is crucial for maintaining security and operational efficiency.
Administrators can create, modify, and remove users as needed.
Creating Users
To provision a new user, administrators should use the CREATE USER
command.
SQL
CREATE USER XXXXX
PASSWORD = 'StrongPassword123'
DEFAULT_ROLE = 'PUBLIC'
DEFAULT_WAREHOUSE = 'my_warehouse'
DEFAULT_NAMESPACE = 'my_database.public'
MUST_CHANGE_PASSWORD = TRUE;
XXXXX
user is created with a specified password and default role.- The
MUST_CHANGE_PASSWORD
parameter ensures that the user changes their password upon their first login.
Modifying Users
Amendments to user attributes of changing a password can be made using the ALTER USER
command:
SQL
ALTER USER XXXXX
SET PASSWORD = 'NewStrongPassword456';
Removing Users
The DROP USER
command is used to drop the user from the DB:
SQL
DROP USER john_doe;
2. Role Management
Snowflake Role Management helps in assigning resources to the user base. Administrators can create, assign, and remove roles to manage user permissions effectively.
Creating Roles
New roles are created with the CREATE ROLE
command:
SQL
CREATE ROLE data_scientist;
Assigning Roles
To grant a role to a user:
SQL
GRANT ROLE data_scientist TO USER XXXXX;
Removing Roles
To drop a role:
SQL
DROP ROLE data_scientist;
3. Managing Warehouses
Snowflake warehouses are virtual compute clusters that execute queries. Efficient management of warehouses ensures efficient query performance and resource utilization.
Creating Warehouses
To set up a new warehouse:
SQL
CREATE WAREHOUSE my_warehouse
WITH
WAREHOUSE_SIZE = 'X-Small'
AUTO_SUSPEND = 300
AUTO_RESUME = TRUE;
This command defines the warehouse size and auto-suspend/resume settings.
- Note: The
Auto_Suspend
parameter can be set at the command level rather than from the front end.
Modifying Warehouses
Changes in the size or auto-suspend time can be made using the ALTER WAREHOUSE
command:
SQL
ALTER WAREHOUSE my_warehouse
SET WAREHOUSE_SIZE = 'Small'
SET AUTO_SUSPEND = 600;
Scaling Virtual Warehouses
To scale up or down based on workload, adjust the size:
SQL
ALTER WAREHOUSE my_warehouse SET WAREHOUSE_SIZE ='Medium';
Using Multi-Cluster Warehouses
Enable multi-cluster warehouses to handle large numbers of concurrent queries.
SQL
ALTER WAREHOUSE my_warehouse SET MAX_CLUSTERS=3;
Removing Warehouses
To drop the warehouse, the DROP WAREHOUSE
command is used.
SQL
DROP WAREHOUSE my_warehouse;
4. Databases and Schemas Administration
Databases and schemas in Snowflake are organized and managed in a systematic, isolated, and hierarchical manner with controlled access mechanisms. Administrators can create, modify, and drop the objects and structure data efficiently.
Creating Databases
To create a new database use the command CREATE DATABASE
:
SQL
CREATE DATABASE my_database;
Creating Schemas
Schemas help us organize the objects and datasets effectively within a database and can be created using CREATE SCHEMA
:
SQL
CREATE SCHEMA my_database.my_schema;
Removing Databases
To drop a database along with all its objects, use the command DROP DATABASE
:
SQL
DROP DATABASE my_database CASCADE;
5. Managing Tables
Tables are the fundamental units for storing data. Administrators create, modify, and delete tables as required.
Creating Tables
To establish a new table, use CREATE TABLE
:
SQL
CREATE TABLE my_database.my_schema.my_table (
id INT AUTOINCREMENT,
name STRING,
created_at TIMESTAMP_LTZ DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id)
);
Modifying Tables
Adding columns or altering table structures can be accomplished with the ALTER TABLE
command:
SQL
ALTER TABLE my_database.my_schema.my_table
ADD COLUMN email STRING;
Removing Tables
To drop a table, use the command DROP TABLE
:
SQL
DROP TABLE my_database.my_schema.my_table;
6. Data Loading and Unloading
Loading and unloading data are critical tasks in Snowflake's data management.
Loading Data
To load data from a stage into a table:
SQL
COPY INTO my_database.my_schema.my_table
FROM @my_stage/my_data_file.csv
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');
Unloading Data
To unload data from a table to a stage:
SQL
COPY INTO @my_stage/unloaded_data/
FROM my_database.my_schema.my_table
FILE_FORMAT = (TYPE = 'CSV');
7. Monitoring and Optimization
Monitoring system performance and optimizing resource usage are key to maintaining an efficient Snowflake environment.
Query History
To view historical queries:
SQL
SELECT * FROM TABLE(information_schema.query_history())
WHERE query_text ILIKE '%my_query%';
Optimizing Query Performance
Using Query Profiling
Analyze query execution plans and optimize queries.
SQL
EXPLAIN SELECT * FROM my_table WHERE column = 'value';
Using Materialized Views
Speed up complex queries by precomputing and storing results.
SQL
CREATE MATERIALIZED VIEW my_view AS
SELECT column1, SUM(column2) FROM my_table GROUP BY column1;
Warehouse Usage
Checking the usage of warehouses helps in understanding the performance:
SQL
SELECT * FROM INFORMATION_SCHEMA.WAREHOUSE_LOAD_HISTORY
WHERE WAREHOUSE_NAME = 'my_warehouse'
AND START_TIME > DATEADD(day, -1, CURRENT_TIMESTAMP);
8. Security and Access Control
Security is a key aspect of Snowflake data management. Snowflake provides granular control over data access in administering snowflake objects and is key for managing and controlling the environment.
Granting Privileges
To give access to a table:
SQL
GRANT SELECT ON TABLE my_database.my_schema.my_table TO ROLE data_scientist;
Revoking Privileges
To revoke access:
SQL
REVOKE SELECT ON TABLE my_database.my_schema.my_table FROM ROLE data_scientist;
9. Data Sharing
Data sharing is one of Snowflake's capabilities and it facilitates collaboration between different accounts.
Creating a Share
To create a share:
SQL
CREATE SHARE my_share;
Adding Objects to a Share
To include tables in a share:
SQL
ALTER SHARE my_share ADD TABLE my_database.my_schema.my_table;
Granting Access to a Share
To allow access to a share:
SQL
GRANT USAGE ON SHARE my_share TO ROLE consumer_role;
10. Backup and Restore
Snowflake offers automated data protection, but manual backup and restore operations can be performed as needed.
Creating a Backup
A backup can be made using database cloning:
SQL
CREATE DATABASE my_database_backup CLONE my_database;
Restoring From a Backup
To restore data:
SQL
CREATE DATABASE my_restored_database CLONE my_database_backup;
11. Account Management
Administrative tasks related to account management ensure proper configuration and monitoring of Snowflake settings.
Viewing Account Information
To access account parameters:
SQL
SHOW PARAMETERS IN ACCOUNT;
Configuring Account Parameters
Adjusting account settings:
SQL
ALTER ACCOUNT SET PARAMETER = 'value';
In Snowflake, account parameters play a crucial role in managing and configuring your account's behavior and features. These parameters influence the Snowflake environment operations.
- Auto Resume: This determines whether a warehouse should automatically resume if a query is submitted while it is suspended.
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'AUTO_RESUME';
- Auto Suspend: Specifies the amount of inactivity time (in seconds) before a warehouse is automatically suspended
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'AUTO_SUSPEND';
- Default Role: Defines the default role assigned to new users
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'DEFAULT_ROLE';
- Encryption: Indicates whether data encryption is enabled for the account
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'ENCRYPTION';
- Fail-Safe: Determines whether the failsafe feature is enabled, which provides data recovery options beyond the time-travel period
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'FAILSAFE';
- Max Concurrency Level: Sets the maximum number of concurrent queries that can be executed in a warehouse
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'MAX_CONCURRENCY_LEVEL'
- Query Tag: Allows setting default query tags that can be used for monitoring and tracking query performance
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'QUERY_TAG';
- Replication: Configures the replication settings for databases, enabling data replication between Snowflake regions or accounts
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'REPLICATION';
- Share: Manages the settings related to data sharing, including the default share settings
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'SHARE';
- Timestamp Output Format: Defines the default format for timestamp output
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'TIMESTAMP_OUTPUT_FORMAT';
- Use Catalog: Specifies the default catalog used for querying and operations
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'USE_CATALOG';
- Use Schema: Defines the default schema to be used for querying and operations
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'USE_SCHEMA';
- Warehouse Size: Sets the default size of newly created warehouses
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'WAREHOUSE_SIZE';
- Time Zone: Specifies the default time zone for the account
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'TIMEZONE';
- Result Scan Timeout: Sets the timeout period for scanning query results
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'RESULT_SCAN_TIMEOUT';
- Auto Scale: Determines whether auto-scaling is enabled for warehouses to adjust compute resources based on workload
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'AUTO_SCALE';
- Login History Retention Days: Specifies the number of days to retain login history
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'LOGIN_HISTORY_RETENTION_DAYS';
- Database Restore: Configures settings related to database restore operations
SQL
SHOW PARAMETERS IN ACCOUNT LIKE 'DATABASE_RESTORE';
12. Storage Management
Snowflake manages storage automatically, but understanding and monitoring storage usage is essential for cost management and optimization.
Viewing Storage Usage
To check how much storage is being used by a database or schema:
SQL
SELECT * FROM INFORMATION_SCHEMA.STORAGE_USAGE
WHERE TABLE_SCHEMA = 'my_schema';
Managing Data Retention
Snowflake provides features like time travel and fail-safe for data recovery, managing the retention period effectively is a crucial aspect of the data journey
- Time Travel: Allows access to historical data for a specific retention period
SQL
ALTER TABLE my_table SET DATA_RETENTION_TIME_IN_DAYS=7;
- Fail-safe: Provides an additional layer of data recovery beyond the time travel period, but it is not configurable by users
Dropping Unnecessary Data
To manage storage effectively, periodically drop old or unused tables and databases:
SQL
DROP TABLE my_database.my_schema.old_table;
Cost Management
Controlling and managing costs associated with Snowflake resources is crucial for budget management.
Tracking Costs
Snowflake’s cost-tracking features to monitor and analyze your spending are critical to managing costs within the budgets defined.
SQL
SELECT * FROM ACCOUNT_USAGE.COST_HISTORY
WHERE START_TIME > DATEADD(day,-30,CURRRENT_TIMESTAMP);
Setting Up Budget Alerts
Implement alerts and notifications for cost thresholds to avoid unexpected charges.
Reviewing and Adjusting Resource Usage
Regularly review resource usage and adjust warehouse sizes, data retention settings, and concurrency settings to optimize costs.
Conclusion
Snowflake administration involves a diverse range of tasks, from user and role management to data loading and security. By mastering these tasks, administrators can maintain a secure, efficient, and well-organized data environment. This approach not only ensures optimal performance but also enhances the overall effectiveness of data management within the Snowflake environment.
Opinions expressed by DZone contributors are their own.
Comments