Dual-Axis Charts: Better Alternatives


Alternatives to Dual-Axis Charts

In a previous blog post called 'Dual-Axis Charts: Temptations, Traps, Tips', I went through some of the pitfalls of using dual-axis charts. These are charts where you want to compare two metrics or data attributes but there are vastly different scales involved, so you reach for a layout with two y-axes in order to deal with the scales separately. I acknowledged this as a tempting choice, but fraught with danger due to at least a couple of common issues:

  1. Misrepresentation due to mixed scales: changing the relative scales arbitrarily can suggest different conclusions and imply relationships that are stronger (or weaker) than they really are.
  2. Difficulty in interpretation: these charts require extra mental effort to untangle the lines and associate each with its respective axis.

I also provided some basic tips on how to avoid or minimize these issues with dual-axis charts, so here I want to try out some alternatives that may provide even better options for communicating effectively with data.

As noted in the previous post, there are two common scenarios where dual-axis charts come up and we will walk through alternatives for each of them.

  1. Comparing trends in two (or more) similar data sets that have vastly different scales.
  2. Comparing a volume metric with a related rate or ratio metric. (so, again, vastly different scales)

As usual, the examples here are produced in R, with the ggplot2 package as the preferred visualization tool.

Scenario 1: Compare trends in similar metrics from two datasets

Same example as the previous post: the price history of two different cryptocurrencies – Cardano, with its token ADA, and Bitcoin (BTC).

To recap our initial setup:

As always, the visualization choice should be based on what questions we are trying to answer, what we are hoping to learn, and, ultimately, what decisions we want to make.

If we are starting with general exploration, we still need to frame it up. Our first thought may be to compare prices over a recent period, to answer questions like:

  • What are the relative changes in the currencies over time?
  • Do the two follow a similar pattern of ups and downs?
  • Are there any points where a general pattern breaks? (could provide focus for further investigation)
  • Eventually: are there ways we can take advantage of these patterns to make investment decisions? (probably beyond initial scope but helps to have that broader perspective)

In this random sample of recent data (CDN$), we can see the two sets of prices are on very different scales.

date        BTC_CAD  ADA_CAD
2021-02-19    70534     1.17
2021-05-20    49186     2.18
2021-08-08    55067     1.79
2021-09-21    52157     2.55
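For reference, here's a minimal sketch of the setup the examples below assume – the package loads plus a data import, where the file name is just a placeholder for whatever source of daily prices you have:

## packages used throughout the examples
library(tidyverse)   ## dplyr, tidyr, ggplot2
library(scales)      ## dollar_format, percent_format
library(lubridate)   ## wday and other date helpers
library(cowplot)     ## plot_grid
library(gridExtra)   ## grid.arrange
## placeholder file name - substitute your own source of daily prices
crypto_data <- read_csv('crypto_prices.csv') ## columns: date, BTC_CAD, ADA_CAD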

Line charts stacked vertically

## BTC price line chart (x-axis labels suppressed, since shared with chart below)
cp01 <- crypto_data %>% ggplot(aes(x=date, y=BTC_CAD))+geom_line()+
  scale_y_continuous(labels=dollar_format())+
  labs(title='BTC (top) and ADA (bottom) prices (CDN$)', x='')+
  theme(axis.text.x = element_blank())
## ADA price line chart
cp02 <- crypto_data %>% ggplot(aes(x=date, y=ADA_CAD))+geom_line()+
  scale_y_continuous(labels=dollar_format())+
  labs(x='')
## stack the two charts vertically (cowplot); grid.arrange is an alternative
#grid.arrange(cp01, cp02, nrow=2)
plot_grid(cp01, cp02, nrow=2, align='v')

[Plot: BTC (top) and ADA (bottom) price lines, stacked vertically]

Here we still have the issue of scale ratios, but we have some advantages:

  • clear and easy to see which dataset is which.
  • separating the lines puts the focus on comparing general patterns, creates less of an implication about the magnitude of comparative changes, and avoids distractions like cross-over points, which are meaningless here.

We are able to focus on the comparison, not on bending our mind around untangling the lines. This might more easily lead to a follow-up question, like:

  • seems to be some similarity in the trends but not super-consistent – I wonder what the correlation between the lines is?

## calculate correlation between daily BTC and ADA prices, with confidence interval
corel <- cor.test(crypto_data$BTC_CAD, crypto_data$ADA_CAD)
corelcoef <- corel$estimate
corelci_lower <- corel$conf.int[1]
corelci_upper <- corel$conf.int[2]

A quick calculation shows r = 0.386, which is not that strong, and the 95% confidence interval runs from 0.292 to 0.474, which seems pretty wide, lowering our sense of the strength of the relationship even further.

There's a whole other rabbit hole we can go down here, if we choose – creating scatterplots and all kinds of things – but for our purposes we'll continue with other visualization strategies.

% Change Comparison

Since a key part of what we are trying to understand is relative change in prices, a logical way to get away from the scale issue is to look at % changes. After the calculation, we get this view:

## calculate % changes day-over-day
crypto_data_pc <- crypto_data %>% mutate(BTC_CAD_pc=BTC_CAD/lag(BTC_CAD)-1,
                                         ADA_CAD_pc=ADA_CAD/lag(ADA_CAD)-1)
crypto_data_pc <- crypto_data_pc[-1,] ## drop first row, since NA for % chg
## produce chart of % changes for each currency
crypto_data_pc %>% ggplot(aes(x=date, y=BTC_CAD_pc))+geom_line(color='goldenrod')+
  scale_y_continuous(labels=percent_format())+
  geom_line(aes(y=ADA_CAD_pc), color='blue')+
  labs(title='Daily % Changes in Prices (gold=BTC, blue=ADA)', x="", y='Daily % Chg')

[Plot: daily % changes in prices, BTC and ADA overlaid]

Pretty messy with this amount of data, but hopefully you can see how this approach could be useful for comparing the two currencies. It could be made more readable either by zooming in on a shorter period OR by aggregating the data by week or month.

Here's the example with weeks (using the lubridate package to filter for weekday == 7):

## use lubridate to get day of week for each date, filter for single day of week, calc WoW % chg
crypto_data_pc_wk <- crypto_data %>% mutate(
  weekday=wday(date)
) %>% filter(weekday==7) %>% mutate(
  BTC_CAD_pc=BTC_CAD/lag(BTC_CAD)-1,
  ADA_CAD_pc=ADA_CAD/lag(ADA_CAD)-1
)
crypto_data_pc_wk <- crypto_data_pc_wk[-1,] ## drop first row, since NA for % chg
## plot weekly change comparison
crypto_data_pc_wk %>% ggplot(aes(x=date, y=BTC_CAD_pc))+geom_line(color='goldenrod')+
  scale_y_continuous(labels=percent_format())+
  geom_line(aes(y=ADA_CAD_pc), color='blue')+
  labs(title='Weekly % Changes in Prices (gold=BTC, blue=ADA)', x="", y='Weekly % Chg')

[Plot: weekly % changes in prices, BTC and ADA overlaid]
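For the monthly option, the same pattern works. Here's a sketch of one way to do it – using lubridate's floor_date() and keeping the last observation in each month (the grouping approach is my own; the post only shows the weekly version):

## group by calendar month, keep the last observation, calc MoM % chg
crypto_data_pc_mo <- crypto_data %>%
  group_by(month=floor_date(date, 'month')) %>%
  slice_max(date, n=1) %>%
  ungroup() %>%
  mutate(BTC_CAD_pc=BTC_CAD/lag(BTC_CAD)-1,
         ADA_CAD_pc=ADA_CAD/lag(ADA_CAD)-1) %>%
  filter(!is.na(BTC_CAD_pc)) ## drop first row, since NA for % chg
## plot monthly change comparison
crypto_data_pc_mo %>% ggplot(aes(x=date, y=BTC_CAD_pc))+geom_line(color='goldenrod')+
  scale_y_continuous(labels=percent_format())+
  geom_line(aes(y=ADA_CAD_pc), color='blue')+
  labs(title='Monthly % Changes in Prices (gold=BTC, blue=ADA)', x='', y='Monthly % Chg')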

We can also try this view with bars, after a bit of manipulation to a longer data shape for ease of bar chart comparison:

## pivot data longer to make it easier to display side-by-side bars with legend
crypto_data_pc_wk_lg <- crypto_data_pc_wk %>% select(date, BTC_CAD_pc, ADA_CAD_pc) %>% 
  pivot_longer(cols=c(BTC_CAD_pc, ADA_CAD_pc), names_to='currency', values_to = 'pc_chg')  
## side-by-side bar plot
crypto_data_pc_wk_lg %>% ggplot(aes(x=date, y=pc_chg, fill=currency))+
  geom_col(position = position_dodge2())+
  scale_y_continuous(labels=percent_format())+
  labs(title='Weekly % Changes in Prices', x="", y='Weekly % Chg')+
  theme(legend.position = 'top', legend.title = element_blank())

[Plot: weekly % changes in prices, side-by-side bars by currency]

Still some challenges with the density of the chart, but this highlights how bar/column charts can facilitate side-by-side comparisons, whereas line charts favour reading trends.

Difference in Difference

Once we see the comparison of % change week-over-week, we might want to go further and compare the differences in those changes. We can take the next logical step and calculate the difference between the % change in ADA and the % change in BTC – a version of the 'difference in differences' approach used in statistics.

## difference in weekly % changes: ADA minus BTC
crypto_data_pc_wk <- crypto_data_pc_wk %>% mutate(
  ADA_BTC_diff=ADA_CAD_pc-BTC_CAD_pc
)
crypto_data_pc_wk %>% ggplot(aes(x=date, y=ADA_BTC_diff*100))+geom_col()+
  labs(title="Difference in % Difference: ADA-BTC", x="", y='Difference in Chg (percentage pts)')

[Plot: difference in % difference, ADA minus BTC, as bars]

This view highlights even further the lack of a consistent relationship between changes in the two currencies:

  • if the two had the same % changes (increase or decrease) week-over-week, the bars would be at/close to 0.
  • if there was a consistent difference in difference, for example, if a 10% change in BTC accompanied a 15% change in ADA, and a 5% change in BTC accompanied a 10% change in ADA, the bars would all be at 5.

Although the bars appear to be somewhat centered around 0, there is a LOT of variation on either side. Getting back to our original questions: there is no identifiable pattern in comparative price changes, nor any apparent change in the relationship over time or at different periods.

Center and Spread

Looking at the relative patterns over time, one direction this could lead us is to ask questions around the center and distribution of daily changes in the two currencies. This is going off on a tangent from the main goal of exploring alternatives to dual-axis time series charts, but I can't resist. 😉
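A sketch of the kind of calls that produce the output below – base R's summary() and sd() on the daily % change columns:

## center and spread of daily % changes
summary(crypto_data_pc$BTC_CAD_pc)
sd(crypto_data_pc$BTC_CAD_pc)
summary(crypto_data_pc$ADA_CAD_pc)
sd(crypto_data_pc$ADA_CAD_pc)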

## BTC-CAD summary
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.1335 -0.0216 -0.0004  0.0024  0.0245  0.1855
## std deviation:  0.0422
## 
## ADA-CAD summary
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -0.256  -0.027   0.003   0.009   0.032   0.322
## std deviation:  0.0692

Both appear to be pretty tightly centered around 0, with Cardano being a bit more volatile. This could lead us into some distribution visualizations, like histograms with the changes in the two currencies overlaid on each other…

## overlay histograms of daily % changes for both currencies
crypto_data_pc %>% ggplot()+
  geom_histogram(aes(x=BTC_CAD_pc), fill='goldenrod', alpha=0.2)+
  geom_histogram(aes(x=ADA_CAD_pc), fill='blue', alpha=0.2)+
  labs(title='Distribution of Daily % Chg, gold=BTC-CAD, blue=ADA-CAD', x='')

[Plot: overlaid histograms of daily % changes, BTC and ADA]

…or, my personal preference, a boxplot…

## pivot data longer to put both currencies in one column for the boxplot
crypto_data_pc_lg <- crypto_data_pc %>% select(date, BTC_CAD_pc, ADA_CAD_pc) %>% 
  pivot_longer(cols=c(BTC_CAD_pc, ADA_CAD_pc), names_to='currency', values_to = 'pc_chg')
## boxplot
crypto_data_pc_lg %>% ggplot(aes(x=currency, y=pc_chg))+geom_boxplot(fill='dodgerblue')+
  scale_y_continuous(labels=percent_format())+
  labs(title='Distribution of Daily % Chg', x='', y='Daily % Chg')

[Plot: boxplots of daily % changes by currency]

Re-Scale the Data

Another option we have is to rescale both sets of prices so that they are on a common scale, and therefore more comparable. This is the kind of thing that is often done in machine learning in order to balance the weights of features. There are a number of potential pitfalls, so it is best to proceed with caution, with an understanding of your data and your objectives in mind. This info is presented as demonstration, not necessarily endorsement. 😉

Two main approaches: Normalization and Standardization

This is a whole area unto itself and there are variations in the terminology used. Broadly, there are two general approaches, described as:

  • Normalization: scale the values to a 0 – 1 range, using 'min-max scaling'. Doesn't treat outliers well. This is often described as maintaining the same distribution as the original data, just shifted to a different scale – and since the transformation is linear, that is the case, although histograms of volatile data can look slightly different before and after due to binning.
  • Standardization: aka 'z-score': scale the values so that mean = 0 and standard deviation = 1, via z = (X - mean) / sd. No upper or lower bound, so tends to be better at handling outliers. Note this is also a linear transformation, so despite how it is sometimes described, it does not actually make the data more normally distributed.

There are general guidelines but no hard and fast rules around when to use one or the other, so let's check them both out.

Normalization

According to info on analyticsvidhya.com, normalization is considered 'good to use when you know that the distribution of your data does not follow a Gaussian distribution'. Let's check:

## use density function to compare actual vs normal ideal
## BTC
hist01 <- crypto_data %>% ggplot(aes(x=BTC_CAD))+geom_histogram(aes(y=..density..))+
  ## function to calculate ideal normal dist based on mean and sd in the dataset 
  stat_function(fun=dnorm, args=list(mean=mean(crypto_data$BTC_CAD), sd=sd(crypto_data$BTC_CAD)), color='red')+
  labs(title='BTC_CAD price distribution')
## ADA
hist02 <- crypto_data %>% ggplot(aes(x=ADA_CAD))+geom_histogram(aes(y=..density..))+
  ## function to calculate ideal normal dist based on mean and sd in the dataset
  stat_function(fun=dnorm, args=list(mean=mean(crypto_data$ADA_CAD), sd=sd(crypto_data$ADA_CAD)), color='red')+
  labs(title='ADA_CAD price distribution')
## print both
grid.arrange(hist01, hist02, nrow=1)

[Plot: BTC_CAD and ADA_CAD price distributions vs. fitted normal curves]

This is pretty 'abnormal' data – hello, cryptocurrency! – so another reason to exercise caution.

The formula for min-max normalization is pretty straightforward – for each value in the data set, you calculate the distance from the minimum value and then divide by the full range of the data:

X_norm = (X - X_min) / (X_max - X_min)

For fun, we can use the caret package, based on example code from JournalDev.com:

## use caret pkg functions for fun
library(caret)
process <- preProcess(crypto_data, method=c('range'))
crypto_norm <- predict(process, crypto_data)

## alternative methods:
## - mutate
crypto_data_minmax <- crypto_data %>% mutate(
  BTC_mm=(BTC_CAD-min(crypto_data$BTC_CAD))/(max(crypto_data$BTC_CAD)-min(crypto_data$BTC_CAD)),
  ADA_mm=(ADA_CAD-min(crypto_data$ADA_CAD))/(max(crypto_data$ADA_CAD)-min(crypto_data$ADA_CAD))
)
## - simple function with lapply
fminmax <- function(x){
  (x-min(x))/(max(x)-min(x))
  }
crypto_data_fminmax <- as.data.frame(lapply(crypto_data[,2:3], fminmax))

Create plot for display later:

## create a plot of normalized lines
p_norm <- crypto_norm %>% ggplot(aes(x=date))+
  geom_line(aes(y=BTC_CAD), color='goldenrod')+
  geom_line(aes(y=ADA_CAD), color='blue')+
  labs(title='Nrmlized Prices (gold=BTC, blue=ADA)', x='', y='normalized prices')

One thing to note: the histogram of the newly normalized data can look slightly different from the original, but since min-max scaling is a linear transformation, the underlying distribution is unchanged – any visual differences come down to histogram binning rather than the rescaling itself:
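Here's a sketch of that before/after comparison for BTC, reusing grid.arrange from earlier (the original chunk isn't shown, so the bin count is an assumption):

## histograms of original vs min-max normalized BTC prices
h_orig <- crypto_data %>% ggplot(aes(x=BTC_CAD))+geom_histogram(bins=30)+
  labs(title='BTC_CAD original')
h_norm <- crypto_norm %>% ggplot(aes(x=BTC_CAD))+geom_histogram(bins=30)+
  labs(title='BTC_CAD normalized (0-1)')
grid.arrange(h_orig, h_norm, nrow=1)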

[Plot: BTC_CAD price histogram, original vs. normalized]

Standardization

For standardization, we can use the built-in scale() function in R:
## z-score scaling
crypto_scale <- as.data.frame(scale(crypto_data[2:3]))
crypto_scale <- crypto_scale %>% rename(
  BTC_CAD_scale=BTC_CAD,
  ADA_CAD_scale=ADA_CAD
)
## bind the values back to original data set, with dates
crypto_scale <- bind_cols(crypto_data, crypto_scale)

Here again the before/after distributions look essentially the same, apart from the axis scale. Like min-max scaling, standardization is a linear transformation: it recenters and rescales the values but does not change the shape of the distribution.

[Plot: price histograms, original vs. standardized]

Create line plot:

## create a plot of standardized lines
p_std <- crypto_scale %>% ggplot(aes(x=date))+
  geom_line(aes(y=BTC_CAD_scale), color='goldenrod')+
  geom_line(aes(y=ADA_CAD_scale), color='blue')+
  labs(title='Stdized Prices (gold=BTC, blue=ADA)', x='', y='standardized prices')

Compare the two methods:
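Using the plot objects created above with cowplot's plot_grid:

## normalized (left) and standardized (right) side by side
plot_grid(p_norm, p_std, nrow=1)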

[Plot: normalized (left) vs. standardized (right) prices]

Very similar results, although it is interesting to see how the standardized view on the right has a bit more spread for ADA relative to BTC, since values are not constrained between an upper and lower bound. This may be a more accurate reflection of the higher volatility of ADA.

So if we focus on the standardized version, what can we learn from this view, relative to the questions we want to answer? A few things to unpack, so let me take a stab at it:

  • we see the overall upward trend in both data sets, with peaks and valleys along the way, at different points for each currency
  • lots of volatility in each data set, with a bit more relative volatility in ADA, including higher peaks and lower lows.
  • ADA was slower off the mark than BTC at the beginning of the year, peaked a bit after, trended down in concert with BTC and then took off, before falling more consistently and harder than BTC at the end of the year.

Now, with dual-axis charts like the companion versions in the previous post, we can draw similar conclusions, depending on how we configure the two axes – which is the crux of the problem. The point is this:

with a standardized comparison, we can draw these conclusions with more confidence.

Percentile comparisons

Data can also be scaled using percentiles, resulting in a 0 – 1 scale (displayed as 0 – 100%), based on the ranking of each value. This doesn't seem to be recommended for machine learning feature engineering, but it could have application as an alternative to dual-axis charts where a more even comparison is wanted.

The tidyverse (dplyr) has a handy percent_rank() function for easy calculation.

## rank each day's price as a percentile within its own series
crypto_data_pctl <- crypto_data %>% mutate(
  BTC_CAD_pctl=percent_rank(BTC_CAD),
  ADA_CAD_pctl=percent_rank(ADA_CAD)
)

## plot percentile-ranked prices
p_pctl <- crypto_data_pctl %>% ggplot(aes(x=date))+
  geom_line(aes(y=BTC_CAD_pctl), color='goldenrod')+
  geom_line(aes(y=ADA_CAD_pctl), color='blue')+
  scale_y_continuous(labels=percent_format())+
  labs(title='Prcntile Prices (gold=BTC, blue=ADA)', x='', y='percentile rank')
p_pctl

[Plot: percentile-ranked prices, BTC and ADA]

This also appears potentially usable, although it does seem to diverge from the other two approaches. Compare all three methods:
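Again using plot_grid with the three plot objects created above:

## all three rescaling approaches side by side
plot_grid(p_norm, p_std, p_pctl, nrow=1)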

[Plot: normalized, standardized, and percentile-ranked prices compared]

The percentile version stands out as having different relative patterns compared to the other two. This is because, by definition, the data points for each currency map onto the same set of evenly spaced percentile values (0 to 1, in steps of 1/(n - 1) where n = number of rows, assuming no ties); the only difference between the two series is the order in which those values occur in time. So the percentile approach is one to avoid if the relative amount, and not just rank position, matters. A quick illustration follows.
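A toy example with hypothetical numbers shows how percent_rank() maps any tie-free vector onto the same evenly spaced values, regardless of magnitude:

percent_rank(c(50, 10, 90, 30))
## [1] 0.6666667 0.0000000 1.0000000 0.3333333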

Conclusion – Scenario 1

There are some viable alternatives to dual-axis charts when trying to answer questions about the relationships/trends between two metrics on wildly different scales. In particular, comparing percentage changes can provide insights without the hazards of dual-axis charts.

Scenario 2: Comparing a Count and a Ratio

In other cases, we may want to compare patterns in a volume or count metric with a related key indicator. Here's an example using some Google Analytics data for a website:

  • daily users
  • daily conversion rate

Interesting questions with these metrics can include:

  • what is the relationship between patterns in site traffic and conversion rates?
  • do increases in daily users correspond to decreases in conversion rates or vice versa?
  • are there any points where breaking of the typical relationship between these metrics warrants further investigation?

Quick look at the data:

date        users    conv_rate
2021-11-06  464654       0.024
2021-11-10  439000       0.016
2021-11-12  576465       0.020
2021-11-17  414958       0.022

The approaches shown above can be applied to this data as well, so we'll focus on some further alternatives particularly well-suited to this type of data.

Line chart with Bar chart below

We saw above how line charts stacked one above the other can be easier to interpret than dual-axis charts. In the previous blog post, there was an example of how a dual-axis chart can be improved by combining a line chart for a percentage/ratio metric with a bar chart for a volume metric. We can take from each of these examples and set up a line chart with a bar chart below it – similar to the typical way stock market charts display price data with volume data underneath.
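Here's one way to build this, as a sketch along the same lines as the earlier plot_grid examples. The data frame name ga_data, the linear trend line, and the 2:1 height ratio are my assumptions; the columns match the sample data above:

## conversion rate line chart on top, with a linear trend line for context
## ga_data is a hypothetical name for the Google Analytics extract
p_rate <- ga_data %>% ggplot(aes(x=date, y=conv_rate))+geom_line()+
  geom_smooth(method='lm', se=FALSE)+
  scale_y_continuous(labels=percent_format())+
  labs(title='Daily Conversion Rate (top) and Users (bottom)', x='')+
  theme(axis.text.x = element_blank())
## daily users as bars below, on their own scale
p_users <- ga_data %>% ggplot(aes(x=date, y=users))+geom_col()+
  scale_y_continuous(labels=comma_format())+
  labs(x='')
## stack the charts, giving more height to the key metric of interest
plot_grid(p_rate, p_users, nrow=2, align='v', rel_heights=c(2, 1))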

[Plot: conversion rate line chart with daily users bar chart below]

Here we're abandoning any attempt to align the scales in favour of considering the metrics separately, but within the same time frame. This also:

  • frees us up to adjust the height ratio of the charts in order to focus on the key metric of interest.
  • reduces confusion over the scales, since the separation leaves less ambiguity.
  • allows us to add context like a trend (regression) line without creating the confusion and clutter that we would have on a dual-axis chart.

My view is that these changes make it easier and more intuitive to arrive at the same conclusion as in the previous blog post: conversion rates rise and fall out of sync with variations in user counts.

Again, this gets us to follow-up questions and directed investigation more quickly, and enables us to communicate more clearly with others about what we are seeing in the data.

Conclusion – Scenario 2

As with the previous scenario, there are viable alternatives to dual-axis charts for comparing a volume metric with a rate or ratio metric, even though they are on very different scales.

Overall Conclusion

With these viable alternatives to dual-axis charts, there should be few cases where an analyst needs to succumb to the temptation of the dual-axis monster. The solutions offered here, or variations on them to fit your needs, will usually provide more clarity and better reveal the answers to your data questions.
