Machine Learning & Big Data Blog

Tableau: Calculate Covariance and Correlation Between Stock Prices and Earnings

Walker Rowe
4 minute read
Walker Rowe
image_pdfimage_print

Continuing our series on Tableau, here we explore two important components: how to calculate covariance and correlation and how to use the trend line.

We want to answer: “To what extent is the stock price positively correlated with earning?” That answer, as most people who follow the stock market know, is somewhat. Let’s get started.

Data for Tableau

We will use the same financial data that we used in our last Tableau tutorial: the stock prices, earnings, and dividends for Starbucks, Johnson & Johnson, and Disney. We will pick just one company, the J&J stock price. To simplify things (meaning: avoid having to join two tables and collapse one of them), we will get the stock prices from their quarterly reports instead of using the stock price data.

Understanding correlation and covariance

Correlation means a change in one component or area causes an equal increase in the other. The change may be positive or negative. Covariance measures how those variables change together, but it can scale to any number, making it difficult to understand.

Here we use stock price and earnings. We create a line chart of the quarter ending date against the price column. The correlation is shown to the left. This is the CORR function. We say that two items are positively correlated when this value is 1.

The value in our graph is 0.65, which indicates some but not very strong correlation. It would not make sense to plot the correlation value across the whole chart, since it’s a single number. So, Tableau shows the one number.

Plotting and using a trend line

Basically, a trend line will reaffirm what we observation from the correlation value. Think of this line in terms of linear regression: It is the line that most neatly slices the data points down the middle.

To add a trend line to your chart, pick the trend line from the worksheet design screen on the analytics tab. This adds a dashed line between the points on the plot line.

Since this is a XY plot (i.e., the line y = mx + b), if you hover the mouse over the trend line, Tableau shows you:

In this case:

  • y (price) is the dependent value
  • Earnings is the independent value
  • 9 is the value b
  • The y intercept, the point where x crosses the vertical (y) axis
  • m is the slope of the line y = mx + b or 0.0000000187
  • Price = (0.0000000187 * earnings) + 32.9

There are various types of trend lines. This one shows additional information:

  • R-squared = 0.608871 means the variation in price given variance in the earnings. As you can see it’s almost the same value as the correlation—which is what we would expect.
  • P-value < 0.0001 is a test of the null hypotheses. It tests whether there is no relationship between price and earnings. Clearly, they are related, as the P-value is a small number.

Disabling aggregation

Tableau always assumes you want to do aggregation. So, it adds the function SUM() to add numeric values and uses the YEAR() function on dates.

This is annoying behavior when you want to do a scatter or line chart. To use individual values and not the sum, elect the value dimension when you click on a number, like price.

The same is true with the date. Pick the exact date so it won’t use the YEAR function, although it will still scale the chart so it’s easy to read. You can still pick year anyway, or year with month to be more specific.

Calculating correlation

We calculate correlation by clicking Create Calculated field. The formula is below.

The curly braces {} means to make it a table calculation as opposed to running the calculation only against what is shown on the chart. That’s what we need to do, as Tableau automatically scales charts to make the data fit on the screen thus skipping over some data points.

Automate workflows to simplify your big data lifecycle

In this e-book, you’ll learn how you can automate your entire big data lifecycle from end to end—and cloud to cloud—to deliver insights more quickly, easily, and reliably.


These postings are my own and do not necessarily represent BMC's position, strategies, or opinion.

See an error or have a suggestion? Please let us know by emailing blogs@bmc.com.

Run and Reinvent Your Business with BMC

BMC has unmatched experience in IT management, supporting 92 of the Forbes Global 100, and earning recognition as an ITSM Gartner Magic Quadrant Leader for six years running. Our solutions offer speed, agility, and efficiency to tackle business challenges in the areas of service management, automation, operations, and the mainframe. Learn more about BMC ›

About the author

Walker Rowe

Walker Rowe

Walker Rowe is a freelance tech writer and programmer. He specializes in big data, analytics, and programming languages. Find him on LinkedIn or Upwork.