Your company has probably been in this position before: a groundbreaking new technology has emerged, and you are trying to separate market hype from actual performance. A self-managing data warehouse such as Snowflake can provide insight in just a few weeks, without the time-consuming process of building a data warehouse from scratch. Does Snowflake deliver on its promise? And if so, does it still need a defined implementation strategy? In both cases, the answer is yes.
Snowflake Deployment Best Practices: A CTO’s Guide to a Modern Data Platform originally appeared in our eBook. For the full eBook, click here.
How Does a Snowflake Differ From Other Droplets?
Low overhead…Massive scale
Compared to other cloud data warehouses, Snowflake provides enterprise-grade functionality without sacrificing simplicity. It balances performance and cost by automatically scaling compute up or down. With Snowflake, computation and storage are separated. This is key because virtually all other databases, Redshift included, combine the two, so you must size for your greatest workload and be prepared to incur those costs at all times.
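To see why per-second compute billing changes the sizing calculus, here is a back-of-the-envelope sketch. It assumes Snowflake's published model of per-second billing with a 60-second minimum and credit rates that double with each warehouse size (X-Small = 1 credit/hour); treat the exact rates as assumptions to verify against your contract.

```python
# Illustrative Snowflake compute-cost arithmetic: per-second billing with a
# 60-second minimum, and credits/hour doubling per warehouse size.
# Rates are assumptions based on Snowflake's standard published model.

CREDITS_PER_HOUR = {
    "XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8,
    "XLARGE": 16, "XXLARGE": 32,
}

def compute_credits(size: str, runtime_seconds: int) -> float:
    """Credits consumed by one run, applying the 60-second billing minimum."""
    billed = max(runtime_seconds, 60)
    return CREDITS_PER_HOUR[size.upper()] * billed / 3600
```

The point: an X-Large warehouse that finishes a job in 5 minutes consumes the same credits as an X-Small grinding for 80 minutes, so scaling up is not automatically more expensive if it shortens the run.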
All your data can be stored in Snowflake in one location, along with your computation. Scripts enable you to create a massive Snowflake warehouse on demand and scale it back down once the job completes – all while performing complex transformations in near real time. Hence, you can reduce your costs without compromising on the result.
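The "scale up, run, scale back down" pattern described above can be sketched as a short sequence of Snowflake SQL statements, generated here as plain strings so the workflow is easy to review. The warehouse name and job are hypothetical; in practice you would execute these through a client such as snowflake-connector-python.

```python
# Sketch of an elastic batch run: resize a warehouse up, run the job,
# then shrink and suspend it so per-second billing stops.
# Warehouse and job names are hypothetical examples.

def elastic_run(warehouse: str, big_size: str, small_size: str, job_sql: str):
    """Return the Snowflake SQL statements for one elastic batch run."""
    return [
        f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{big_size}'",
        f"ALTER WAREHOUSE {warehouse} RESUME IF SUSPENDED",
        job_sql,
        f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{small_size}'",
        f"ALTER WAREHOUSE {warehouse} SUSPEND",
    ]
```

Pairing this with `AUTO_SUSPEND` on the warehouse gives you a safety net if the final `SUSPEND` is never reached.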
Developing and testing in elastic environments
Duplicate database environments are no longer required for development and testing. Rather than creating multiple clusters for each environment, you can spin up a test environment whenever you need it, point it to Snowflake storage, and test it before moving it to production. Maintaining and managing three clusters simultaneously can be challenging with Redshift. Because Snowflake charges by the second, you stop paying when your workload finishes.
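One way to stand up such an on-demand test environment is Snowflake's zero-copy cloning, where the clone shares the source database's storage until data diverges, so it is near-instant. The database and warehouse names below are hypothetical; this is a sketch of the pattern, not a prescribed setup.

```python
# Sketch of a throwaway test environment: zero-copy clone of production
# plus a small, auto-suspending warehouse to query it.
# All object names are hypothetical examples.

def make_test_env(prod_db: str, test_db: str, test_wh: str, size: str = "SMALL"):
    """Return the Snowflake SQL statements for a disposable test environment."""
    return [
        f"CREATE DATABASE {test_db} CLONE {prod_db}",
        f"CREATE WAREHOUSE {test_wh} WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = 60 AUTO_RESUME = TRUE INITIALLY_SUSPENDED = TRUE",
    ]
```

When testing is done, a `DROP DATABASE` and `DROP WAREHOUSE` tear the environment down with nothing left to maintain.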
DevOps practices for continuous integration and delivery (CI/CD) can streamline testing, making it more like modern application development than traditional data warehouse practice. Just imagine how painful this could be if you used Redshift.
Avoiding FTP with External Data Sharing
Furthermore, the separation of storage and computing enables data sharing and other differentiating features. You can share data even with a vendor, partner, or customer who is not a Snowflake customer. Snowflake indexes your data in the background, based on the security requirements you've specified. If you typically share data via FTP using scripts, you now have an easier, more secure, and auditable way to share your data outside of your organization. Instead of cumbersome manual processes that often lead to data security headaches, healthcare organizations can create a shared data repository that their providers can access.
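As a rough illustration of what replaces those FTP scripts, the following sketch generates the Snowflake SQL for a basic outbound share: create the share, grant it access down to a table, and attach a consumer account. All object and account names are hypothetical; consumers without their own Snowflake account would be served through a provider-managed reader account instead.

```python
# Sketch of Snowflake secure data sharing: no files are copied, the
# consumer queries the provider's data in place under granted privileges.
# Share, database, schema, table, and account names are hypothetical.

def share_table(share: str, db: str, schema: str, table: str, consumer: str):
    """Return the Snowflake SQL statements to share one table externally."""
    fq = f"{db}.{schema}.{table}"
    return [
        f"CREATE SHARE {share}",
        f"GRANT USAGE ON DATABASE {db} TO SHARE {share}",
        f"GRANT USAGE ON SCHEMA {db}.{schema} TO SHARE {share}",
        f"GRANT SELECT ON TABLE {fq} TO SHARE {share}",
        f"ALTER SHARE {share} ADD ACCOUNTS = {consumer}",
    ]
```

Because every grant is an auditable statement, access can be reviewed and revoked centrally rather than chased across FTP drop sites.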
Despite its role in your data ecosystem, Snowflake does not live in a silo
This should always be on your mind. As your organization evolves, a modern data platform will include application integration, machine learning, data science, and many other components. Snowflake excels at analytics, but it isn't designed to handle the rest of the house.
Even if you're not aware of future tools, consider other possible components of your Snowflake deployment. Choosing the Snowflake public cloud flavor (Azure or AWS) will be the most important decision. Are SQL Server, Azure Machine Learning, or other Azure PaaS services likely to be part of your stack, or is the AWS ecosystem a better fit?
Snowflake itself recognizes that not every workload is suited to its platform; it partners with Databricks to enable heavy data science workloads. Following the partnership with Microsoft, we can expect further announcements of new partnerships in the next year.