In this lesson we will:
- Learn about Snowflake's Data Cloud positioning.
The Data Cloud
Though most people think of Snowflake as a data warehouse, it is continuing to grow into more of a complete platform for managing data, covering ETL, data science, data sharing and machine learning activities. For this reason, Snowflake refer to their platform as a Data Cloud, reflecting their aim to meet the more end-to-end and higher-level requirements of data teams.
Some of the key features of the Data Cloud are:
Data Lake
As well as storing relational data within the traditional data warehouse, Snowflake can also support a Data Lake pattern. This means that data is stored outside of Snowflake (for instance in an AWS S3 bucket), but queried and accessed via Snowflake. This combines the strengths of the Data Warehouse and the Data Lake, in a similar way to the Data Lakehouse pattern.
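As a rough illustration of the Data Lake pattern, the Python sketch below assembles the kind of SQL that exposes files held in S3 as an external table. The bucket URL, stage, storage integration and table names are all hypothetical, and the Parquet file format is an assumption. The script only builds and prints the statements; it does not connect to Snowflake.

```python
def external_table_sql(stage, table, s3_url, integration):
    """Return SQL statements that expose files in S3 as a queryable external table.

    Every object name passed in is a placeholder chosen for illustration.
    """
    # A stage points Snowflake at the external storage location.
    create_stage = (
        f"CREATE STAGE IF NOT EXISTS {stage}\n"
        f"  URL = '{s3_url}'\n"
        f"  STORAGE_INTEGRATION = {integration};"
    )
    # The external table reads the staged files in place; the data stays in S3.
    create_table = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table}\n"
        f"  WITH LOCATION = @{stage}\n"
        f"  FILE_FORMAT = (TYPE = PARQUET)\n"
        f"  AUTO_REFRESH = FALSE;"
    )
    # With no columns declared, each row is exposed through the VALUE column.
    query = f"SELECT value FROM {table} LIMIT 10;"
    return [create_stage, create_table, query]


statements = external_table_sql(
    stage="lake_stage",                      # hypothetical stage name
    table="lake_events",                     # hypothetical table name
    s3_url="s3://example-bucket/events/",    # hypothetical bucket
    integration="example_s3_integration",    # hypothetical storage integration
)
for statement in statements:
    print(statement + "\n")
```

In practice these statements would be run from a worksheet or a client library, and the storage integration would need to be configured by an administrator beforehand.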
Data Sharing
Snowflake includes features for sharing data across accounts or even with other organisations such as business partners or clients. Critically, this sharing is done in a controlled and governed manner, which opens up a number of business use cases. Where appropriate, this data can be sold and licensed through the Snowflake Data Marketplace.
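To make the controlled nature of sharing more concrete, here is a minimal Python sketch that generates the grants typically involved in sharing a single table with another account. The share, database, schema, table and consumer account names are hypothetical, and the script only prints the SQL rather than executing it.

```python
def share_table_sql(share, database, schema, table, consumer_account):
    """Return the SQL statements needed to share one table with one account.

    All names are illustrative placeholders, not real objects.
    """
    fq_schema = f"{database}.{schema}"
    return [
        # The share is a named container for the grants that follow.
        f"CREATE SHARE {share};",
        # The consumer needs usage on the database and schema to reach the table.
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {fq_schema} TO SHARE {share};",
        # Only the explicitly granted table becomes visible to the consumer.
        f"GRANT SELECT ON TABLE {fq_schema}.{table} TO SHARE {share};",
        # Finally, the share is made available to the consumer account.
        f"ALTER SHARE {share} ADD ACCOUNTS = {consumer_account};",
    ]


share_statements = share_table_sql(
    share="sales_share",
    database="analytics",
    schema="public",
    table="daily_sales",
    consumer_account="partner_org.partner_account",
)
print("\n".join(share_statements))
```

Because access is granted object by object, the provider keeps control over exactly what the consumer can see.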
Data Science
Snowflake allows you to carry out Data Science activities such as model building or analytics directly within the platform. This is enabled by the Snowpark feature, which allows you to run Python code directly within Snowflake. Once developed, Snowflake can also host your machine learning models to simplify production deployment.
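The sketch below gives a flavour of what running Python inside Snowflake can look like: an ordinary Python function, plus a helper that would register it as a UDF through Snowpark. The scoring function, its coefficients and the UDF name are invented for illustration. The registration helper assumes the snowflake-snowpark-python package and an existing session, so it is defined here but never called.

```python
import math


def churn_score(days_since_last_order: float, orders_last_year: float) -> float:
    """Toy logistic score between 0 and 1; the coefficients are made up."""
    z = 0.03 * days_since_last_order - 0.4 * orders_last_year
    return 1.0 / (1.0 + math.exp(-z))


def register_in_snowflake(session):
    """Register churn_score as a UDF using an existing Snowpark session.

    Requires the snowflake-snowpark-python package and a live connection,
    which is why the import is deferred and this helper is not called here.
    """
    from snowflake.snowpark.types import FloatType

    return session.udf.register(
        func=churn_score,
        name="churn_score",  # hypothetical UDF name
        return_type=FloatType(),
        input_types=[FloatType(), FloatType()],
        replace=True,
    )


# The function can be tried locally before it is pushed into Snowflake.
print(round(churn_score(0, 0), 2))  # prints 0.5
```

Once registered, a function like this could be called from SQL alongside the data, rather than exporting the data to a separate environment for scoring.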
Data Engineering
Snowflake also strays into Data Engineering areas which have historically been managed by third-party tools. These include incrementally loading data and data transformations. This approach simplifies your data pipelines by avoiding the introduction of additional tooling.
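One way incremental loading might be expressed natively is with a stream that tracks changes on a source table and a scheduled task that moves new rows to a target. The lesson does not name these features, so treat their use here as an assumption. The Python sketch below only generates the SQL, and the table, column, stream, task and warehouse names are hypothetical.

```python
def incremental_pipeline_sql(source, target, stream, task, warehouse):
    """Return SQL for a simple stream-plus-task incremental load.

    Object and column names are placeholders chosen for illustration.
    """
    # The stream records row-level changes made to the source table.
    create_stream = f"CREATE STREAM IF NOT EXISTS {stream} ON TABLE {source};"
    # The task runs on a schedule, but only does work when the stream has data.
    create_task = (
        f"CREATE TASK IF NOT EXISTS {task}\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"  SCHEDULE = '5 MINUTE'\n"
        f"  WHEN SYSTEM$STREAM_HAS_DATA('{stream}')\n"
        f"AS\n"
        f"  INSERT INTO {target} (order_id, amount)\n"
        f"  SELECT order_id, amount FROM {stream}\n"
        f"  WHERE METADATA$ACTION = 'INSERT';"
    )
    # Tasks are created suspended, so they must be resumed explicitly.
    resume_task = f"ALTER TASK {task} RESUME;"
    return [create_stream, create_task, resume_task]


pipeline = incremental_pipeline_sql(
    source="raw_orders",
    target="clean_orders",
    stream="raw_orders_stream",
    task="load_clean_orders",
    warehouse="transform_wh",
)
for statement in pipeline:
    print(statement + "\n")
```

Everything here runs inside Snowflake, which is the sense in which a separate orchestration or ETL tool can sometimes be avoided.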