This tutorial shows you how to count conditionally using Snowflake-SQL.
Snowflake is, in my opinion, the best data warehousing solution on the market. It is easy to manage and the clear separation of storage and compute make for both performant and cost effective solution. Unless there is an extremely compelling reason to chose a competitor (Redshift, Bigquery) it’s what I recommend to most of my clients.
The flavor of SQL that Snowflake supports will be familiar to those who have used Redshift (or PostgreSQL) before. In most cases, you can write SQL as you ‘know it’ and it will be accepted fine. That said, there are many extra functions and ‘hidden gems’ in Snowflake. This post is part of a series of tutorials for lesser-known Snowflake functions.
A common pattern in SQL is having to count based on a specific condition. An example of this might be to find out how many null values are in a data set.
The canonical way to do this in SQL is to define the condition in a
case statement with
1 for where the condition is true and
0 otherwise, and then
sum() those values. Let’s take a look at an example:
select sum(case when shipping_price is null then 1 else 0 end) as shipping_is_null, sum(case when shiping_price is not null then 1 else 0 end) as shipping_is_not_null from products
Snowflake-SQL provides a function to handle this pattern neatly:
count_if() function takes a single argument, being a condition that evaluates
Let’s write our query from above using
select count_if(shipping_price is null) as shipping_is_null, count_if(shipping_price is not null) as shipping_is_not_null from products
As you can see, this has the benefit of being both shorter and more explicit — definitely a better option in my perspective.
count_if() is a standard aggregate function, so you can use it with
group by as well:
select product_category, count_if(shipping_price is null) as shipping_is_null, count_if(shipping_price is not null) as shipping_is_not_null from products group by 1
I hope you find this function a helpful addition to your Snowflake-SQL toolbox.
Note: although this post focuses on Snowflake, BigQuery has an equivalent function
countif() which works in exactly the same way. Unfortunately Redshift does not have an equivalent function.