image_pdfimage_print

This tutorials shows you how to use the lag function to calculate moving averages in Snowflake.

It builds upon work we shared in Snowflake SQL Aggregate Functions & Table Joins and Snowflake Window Functions: Partition By and Order By.

Using lag to calculate a moving average

We can use the lag() function to calculate a moving average. We use the moving average when we want to spot trends or to reduce the volatility from the daily number as it can vary widely.

In other words, it’s better to look at a week of sales versus one day to see how a product is performing.

Create sample data

To study this, first create these two tables.

CREATE TABLE customers
  ( 
     customernumber     varchar(100) PRIMARY KEY, 
    customername varchar(50),
    phonenumber varchar(50),
    postalcode varchar(50),
    locale varchar(10),
    datecreated date,
    email varchar(50)
  );


CREATE TABLE orders
  ( 
     customernumber     varchar(100) PRIMARY KEY,
    ordernumber varchar(100),
    comments varchar(200),
    orderdate date,
    ordertype varchar(10),
    shipdate date,
discount number,
quantity int,
    productnumber varchar(50)
)

Then paste in this SQL data. The top of the data looks like this:

insert into customers (customernumber,customername,phonenumber,postalcode,locale,datecreated,email) values ('440','tiqthogsjwsedifisiir','3077854','vdew','','2020-09-27','twtp@entt.com');

insert into orders (customernumber,ordernumber,comments,orderdate,ordertype,shipdate,discount,quantity,productnumber) values ('440','402','swgstdhmju','2020-09-27','sale','2020-10-01','0.7005950240358919','61','BB111');

insert into customers (customernumber,customername,phonenumber,postalcode,locale,datecreated,email) values ('802','hrdngzutwelfhgwcyznt','1606845','rnmk','','2020-09-27','ympv@zfze.com');

insert into orders (customernumber,ordernumber,comments,orderdate,ordertype,shipdate,discount,quantity,productnumber) values ('802','829','jybwzvoyzb','2020-09-27','sale','2020-10-06','0.3702248922841853','75','FF4444');

insert into customers (customernumber,customername,phonenumber,postalcode,locale,datecreated,email) values ('199','ogvaevvhhqtjcqggafnv','8452159','hyxm','','2020-09-27','znqo@rftp.com');

Write SQL statement

Now we want to calculate the moving average total sales over the previous four days.

Here, we have a select statement inside a select statement because we want one order total per day. Then the lag statement looks over that one record to look at the previous day.

select  shipdate, (quantity + lag(quantity,1) over (order by shipdate) + 
lag(quantity,2) over (order by shipdate) + 
lag(quantity,3)over (order by shipdate) + lag(quantity,4) 
over (order by shipdate)) / 5  as movingaverage from  

(select shipdate, sum(quantity) as quantity from orders group by shipdate);

Here is the moving average. The first rows are null as the lag function looks back further that the window for rows that don’t exist.

We can prove that this calculation is correct, by calculating this another way.

Let’s sum orders by ship date.

select  shipdate, sum(quantity)
from orders group by shipdate
order by shipdate;

Then we copy the results into a spreadsheet:

  • On the left is the windows function.
  • On the right is the query above.

I have added a column using the spreadsheet function average() to show that the numbers are the same. So, you can easily see how the windows lag function works.

windows function	sum and group by	=AVERAGE(E3:E7)
SHIPDATE	MOVINGAVERAGE	SHIPDATE	SUM(QUANTITY)	moving average
2020-09-30		2020-09-30	427	
2020-10-01		2020-10-01	230	
2020-10-02		2020-10-02	657	
2020-10-03		2020-10-03	604	
2020-10-04	488.6	2020-10-04	525	488.6
2020-10-05	462	2020-10-05	294	462
2020-10-06	547.2	2020-10-06	656	547.2
2020-10-07	485.2	2020-10-07	347	485.2
2020-10-08	470.8	2020-10-08	532	470.8
2020-10-09	486.2	2020-10-09	602	486.2
2020-10-10	495.2	2020-10-10	339	495.2
2020-10-11	465.8	2020-10-11	509	465.8

Additional resources

For more tutorials like this, explore these resources:

Learn ML with our free downloadable guide

This e-book teaches machine learning in the simplest way possible. This book is for managers, programmers, directors – and anyone else who wants to learn machine learning. We start with very basic stats and algebra and build upon that.


Last updated: 10/09/2020

These postings are my own and do not necessarily represent BMC's position, strategies, or opinion.

See an error or have a suggestion? Please let us know by emailing blogs@bmc.com.

BMC Bring the A-Game

From core to cloud to edge, BMC delivers the software and services that enable nearly 10,000 global customers, including 84% of the Forbes Global 100, to thrive in their ongoing evolution to an Autonomous Digital Enterprise.
Learn more about BMC ›

About the author

Walker Rowe

Walker Rowe

Walker Rowe is an American freelancer tech writer and programmer living in Cyprus. He writes tutorials on analytics and big data and specializes in documenting SDKs and APIs. He is the founder of the Hypatia Academy Cyprus, an online school to teach secondary school children programming. You can find Walker here and here.