Snowflake Key Concepts

January 19, 2024

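Snowflake is a cloud-based data warehouse that offers scalable and secure storage and processing of structured and semi-structured data. Its architecture is a hybrid of the shared-disk and shared-nothing models: data lives in a central storage layer that all compute nodes can access, while queries are processed in parallel by independent compute clusters. Snowflake offers native support for SQL, and other programming languages such as Python, Java, and Scala can be used through Snowpark.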

Snowflake’s unique architecture consists of three key layers:

- Database Storage
- Query Processing
- Cloud Services

Snowflake decouples storage from compute, so each can be scaled and paid for independently.

Virtual Warehouse:

- A compute resource in Snowflake that processes queries and performs data loading and unloading.
- It can be independently scaled up or down based on demand.
- It can be suspended and resumed on demand, either manually or automatically (auto-suspend and auto-resume).
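
As a rough sketch of how this looks in SQL, the statements below create, resize, suspend, and resume a warehouse. The name demo_wh, the sizes, and the 60-second timeout are illustrative assumptions, not recommendations.

```sql
-- demo_wh is a hypothetical warehouse name.
CREATE WAREHOUSE IF NOT EXISTS demo_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60        -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE;      -- resume automatically when a query arrives

-- Make sure the warehouse is running, then scale it up.
ALTER WAREHOUSE demo_wh RESUME IF SUSPENDED;
ALTER WAREHOUSE demo_wh SET WAREHOUSE_SIZE = 'MEDIUM';

-- Suspend it manually once the work is done.
ALTER WAREHOUSE demo_wh SUSPEND;
```

A suspended warehouse does not consume compute credits, which is why a short AUTO_SUSPEND value is a common choice for intermittent workloads.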

Micro-Partition:

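- A contiguous unit of storage in Snowflake that holds a subset of a table's rows (between 50 MB and 500 MB of data before compression).
- Snowflake creates micro-partitions automatically and keeps metadata about them, such as the range of values in each column, so that a query can skip micro-partitions that cannot contain matching rows.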
- Within each micro-partition, data is stored in a columnar data structure, allowing better compression and efficient access only to those columns required by a query.
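
The effect of micro-partition metadata and columnar storage can be illustrated with a simple query. This is only a sketch: the table orders and its columns are hypothetical, and how many micro-partitions are actually skipped depends on how the data is laid out.

```sql
-- orders, order_id, amount, and order_date are hypothetical names.
-- Only the referenced columns are read, and micro-partitions whose
-- order_date range cannot contain the filter value can be skipped.
SELECT order_id, amount
FROM orders
WHERE order_date = '2024-01-19';

-- Returns a JSON description of how well the table is clustered
-- on order_date (for example, how much micro-partitions overlap).
SELECT SYSTEM$CLUSTERING_INFORMATION('orders', '(order_date)');
```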

Time Travel:

- A feature in Snowflake that allows users to query historical data as it existed at a specific point in time, within the data retention period (1 day by default, up to 90 days on Enterprise Edition and higher).
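
A minimal sketch of Time Travel queries, assuming a hypothetical table named orders. The timestamp and the query ID placeholder are illustrative and need to be replaced with real values.

```sql
-- orders is a hypothetical table.
-- State of the table five minutes ago (the offset is in seconds).
SELECT * FROM orders AT(OFFSET => -60*5);

-- State of the table at a specific timestamp.
SELECT * FROM orders AT(TIMESTAMP => '2024-01-19 09:00:00'::TIMESTAMP_LTZ);

-- State of the table just before a given statement ran.
-- Replace the placeholder with a real query ID.
SELECT * FROM orders BEFORE(STATEMENT => '<query_id>');
```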

Data Sharing:

- Snowflake’s Secure Data Sharing feature allows you to share objects (such as tables) from a database in your account with another Snowflake account without copying or transferring the data.
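
Sharing a table might look roughly like the following. All object names and account identifiers here (sales_db, sales_share, consumer_org.consumer_account, and so on) are hypothetical placeholders.

```sql
-- Provider account: create the share and grant access to the objects.
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;

-- Make the share visible to the consumer account.
ALTER SHARE sales_share ADD ACCOUNTS = consumer_org.consumer_account;

-- Consumer account: create a read-only database on top of the share.
CREATE DATABASE shared_sales
  FROM SHARE provider_org.provider_account.sales_share;
```

The consumer queries shared_sales like any other database, but the data stays in the provider's storage.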

Restoring:

- Dropped objects can be restored with simple SQL commands such as UNDROP TABLE, as long as they are still within the Time Travel retention period.
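
For example, assuming a hypothetical table named orders that was dropped by mistake:

```sql
-- orders is a hypothetical table.
DROP TABLE orders;

-- Restore the most recently dropped version of the table.
UNDROP TABLE orders;
```

UNDROP SCHEMA and UNDROP DATABASE work the same way for schemas and databases.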

Multi-cluster:

- In traditional data warehouses, users and processes must compete for the same resources, which causes concurrency problems. A multi-cluster warehouse in Snowflake addresses this by adding compute clusters as concurrent load increases and removing them as it drops.
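
A sketch of a multi-cluster warehouse definition follows. The name reporting_wh and the cluster counts are assumptions chosen for illustration; multi-cluster warehouses require Enterprise Edition or higher.

```sql
-- reporting_wh is a hypothetical warehouse name.
CREATE WAREHOUSE IF NOT EXISTS reporting_wh
  WAREHOUSE_SIZE = 'SMALL'
  MIN_CLUSTER_COUNT = 1        -- run a single cluster when load is light
  MAX_CLUSTER_COUNT = 3        -- add up to two more clusters under load
  SCALING_POLICY = 'STANDARD'; -- favor starting clusters over queuing queries
```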

Caching Results:

- To speed up queries and reduce costs, Snowflake caches data at several levels. For example, it keeps the result of each query for 24 hours. If the same query is run again within that window, by the same user or by another user in the account with the required privileges, the stored result is returned without re-executing the query, provided the underlying data has not changed. This is especially useful for analysis work, since complex queries do not have to be rerun to revisit earlier results or to compare results before and after a change.
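
The result cache can be observed with a pair of identical queries. The table orders is hypothetical; whether the second run is actually served from the cache depends on the conditions described above.

```sql
-- orders is a hypothetical table.
-- First run: executed on the virtual warehouse.
SELECT customer_id, SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id;

-- Second run with identical text: can be answered from the result
-- cache if the underlying data has not changed.
SELECT customer_id, SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id;

-- Turn result reuse off for the session, e.g. when benchmarking.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
```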

Multi-table INSERT:

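- Snowflake supports multi-table INSERT: a single statement can insert rows from one source query into several tables, either unconditionally or based on conditions, instead of running a separate INSERT statement per table.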
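
A minimal sketch of a conditional multi-table INSERT. The tables orders, large_orders, and small_orders are hypothetical, and the two target tables are assumed to have columns matching the SELECT list.

```sql
-- Route each source row to one of two target tables by amount.
INSERT ALL
  WHEN amount >= 1000 THEN
    INTO large_orders
  ELSE
    INTO small_orders
SELECT order_id, customer_id, amount
FROM orders;
```

With INSERT ALL, a row is inserted for every WHEN branch whose condition is true; INSERT FIRST stops at the first matching branch.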

QUALIFY:

- In a SELECT statement, the QUALIFY clause filters the results of window functions. QUALIFY does with window functions what HAVING does with aggregate functions and GROUP BY clauses.
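
As an illustration, the query below keeps only the most recent order per customer. The table orders and its columns are hypothetical.

```sql
-- orders and its columns are hypothetical names.
SELECT customer_id, order_id, order_date, amount
FROM orders
QUALIFY ROW_NUMBER() OVER (
          PARTITION BY customer_id
          ORDER BY order_date DESC
        ) = 1;
```

Without QUALIFY, the same filter would need a subquery or CTE that computes the row number first and an outer query that filters on it.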

Pricing:

- Pricing is consumption-based: you pay only for the storage and compute you actually use.

Yogesh Shinde
