If you have been tasked with building a data analytics stack at a scaling commerce brand, you already know what a massive and complex undertaking it is—all in the pursuit of creating a single source of truth around your data.
We’ve built thousands of instances for brands and from that experience, we both understand that complexity and know how to navigate it.
We just published a piece that walks through the components of a data analytics stack in detail, but we wanted to dive deeper into the cloud data warehouse piece. Choosing among the Big Three (Snowflake, BigQuery, and Redshift) is complicated. We partner and integrate with each, and so we want to help you to make the best cloud data warehouse choice that you can for your brand.
For each cloud data warehouse, we’ll describe and analyze the overall product, pricing, and implementation experience required to build your analytics stack on top of them.
Let’s start with Snowflake.
Snowflake is a multipurpose cloud data platform that, among other data products, offers virtual data warehouses that are ideal for brands. Snowflake can be connected to the Extract + Load and Business Intelligence/Visualization tools of your choice; it supports connections to all the major Extract + Load providers and BI tools (Looker, Tableau, and others).
Snowflake has become popular as a result of its clean UI and smooth UX, it is the third-highest rated data platform (behind BigQuery and Redshift) according to Forrester’s study, and it does not require heavy technical expertise to set up. Furthermore, it is a highly scalable, flexible, and fast platform that can handle the data demands of brands with high-volume, fast-moving streaming data. For example, if your brand collects user site interaction data and requires significant data storage that you can quickly query, Snowflake is well-built for that type of demand.
Snowflake offers both pre-paid and on-demand pricing models. Each model has three pricing tiers: Standard, Enterprise, and Business Critical; each of these tiers increases the quality or availability of variables such as security and customer support.
Most of the price you pay for Snowflake will be CPU/querying rather than data storage, which is charged by TB per month and varies slightly by region. Storage costs are usually between 5% and 20% of your total Snowflake bill.
Snowflake querying costs are based on Snowflake Credits (units of measure based on the time you spend using the Virtual Warehouse), and Snowflake Credits cost vary both by overall pricing tier (e.g., Enterprise) and region, though regional variability is less significant. Larger Snowflake Virtual Warehouses will consume more credits per hour (ranging from 1 credit per hour to 128 credits per hour).
If you opt for Snowflake’s on-demand pricing model, it will be crucial to know how much time your brand will need to spend running queries every day and how much your data storage needs may change month to month. Otherwise, you may end up paying far more than you’ve budgeted for.
Snowflake implementation and other considerations
- Snowflake requires a lower level of technical expertise to set up; SQL and data warehouse experience is required, and there is robust documentation to facilitate the process and potential troubleshooting.
- There is more of a “hands off” user experience with Snowflake; it scales with your needs, and Snowflake manages all of it.
- Because Snowflake runs on AWS, Microsoft, and Google Cloud, it is more agnostic toward online ecosystems.
BigQuery is a serverless and multicloud data warehouse system, and like Snowflake, it supports Extract + Load tools as well as BI/Visualization tools.
As an integral component of Google Cloud Platform, BigQuery’s UI resembles much of the rest of Google:
As of 2021, BigQuery is the overall highest-rated cloud data warehouse according to Forrester’s short eBook, and it performs the best among Snowflake and Redshift for the most extreme data storage and querying needs.
A small note relevant to commerce is that BigQuery was once not optimized around DML statements (modifying data within tables), but Google addressed this limitation in 2020); this was relevant to larger brands because of how frequently data may be updated. As of 2021, there may be some remaining edge cases for high volume merchants (such as large concurrent INSERT SQL statements) who collect a ton of streaming data.
BigQuery offers both on-demand and flat-rate pricing. However, it does not offer pricing tiers as Snowflake does; instead, it offers one overall pricing model that charges for data storage and data analysis. BigQuery categorizes data analysis as querying and anything else that activates data, such as some data manipulation language (DML) and data definition language (DDL).
BigQuery’s querying charge is a flat rate of $5 (regardless of location within the U.S.), and that cost varies by a couple dollars for international brands. For a discussion of BigQuery’s flat rate pricing, read Google’s documentation. An important note is that if you opt for flat rate pricing, and your usage exceeds what you have paid for, the queries will queue up and you will not be able to run them until you have expanded your usage capacity.
BigQuery implementation and other considerations
- BigQuery requires a similar level of technical expertise as Snowflake to implement: SQL and data warehouse experience is necessary.
- Originally, BigQuery did not support querying with SQL (this was changed several years ago), however some small differences remain with BigQuery SQL syntax.
- BigQuery allows for highly compressed test queries, which may save your brand money over time.
Redshift is a cloud data warehouse that, like Snowflake and BigQuery, supports Extract + Load tools as well as BI/Visualization tools for your data analytics stack.
Redshift’s UI tends to be a bit less user-friendly than Snowflake’s or BigQuery’s (shown below), but it ranks second overall (behind BigQuery, ahead of Snowflake) according to Forrester’s study.
Redshift is able to handle high-volume, low-speed data needs quite well, but it is less optimized for streaming data—particularly, frequently-changing streaming data. Redshift also offers a different data storage approach than Snowflake and BigQuery, in that you rent nodes from Amazon when you purchase data storage space. The nodes come in many sizes, which provides flexible monthly data storage options. However, scaling may present challenges or require frequent updates to payment.
Redshift offers the simplest and most predictable pricing model among the three providers. There are no pricing tiers in Redshift’s pricing model, and offerings are based around fixed rates: this goes for both data storage and querying.
However, as noted in the previous section, because frequent updates may be required if your brand is quickly scaling or captures streaming data, pricing and storage likewise may have to be updated frequently. Redshift’s model tends not to be an issue for even larger commerce brands that only capture basic user data and record standard commerce data, such as Orders and Fulfillment.
Redshift implementation and other considerations
- Redshift is a more “hands on” cloud data warehouse option, and it has high levels of sophisticated customization for experienced engineers.
- Redshift can function well with less data experience, but optimization may be harder.
- Redshift has poorer JSON support than Snowflake and BigQuery because of how JSON is stored.
- Redshift works particularly well if you use other AWS products and is engineered with the AWS ecosystem in mind.
There are pros and cons to each of these cloud data warehouse options, and the one you choose to use will be based on your team’s technical experience and ecosystem expertise.
If you’re interested in having all of your data analytics stack (Extract, Load, Transform + BI) and your cloud data warehouse purpose-built, automated, and optimized for commerce (so that you don’t have to worry about choosing a data warehouse and the yearlong process of building your data analytics stack from scratch), Daasity can help: that’s what we do.
If you want to talk to one of Daasity’s data analytics experts today, head over here.