Causal Impact

A Python package for causal inference using Bayesian structural time-series models. It's a port of the R package CausalImpact, see https://github.com/google/CausalImpact.

What does the package do?

This package implements an approach to estimating the causal effect of a designed intervention on a time series. For example, how many additional daily clicks were generated by an advertising campaign? Answering a question like this can be difficult when a randomized experiment is not available.

How does it work?

Given a response time series (e.g., clicks) and a set of control time series (e.g., clicks in non-affected markets or clicks on other sites), the package constructs a Bayesian structural time-series model. This model is then used to try and predict the counterfactual, i.e., how the response metric would have evolved after the intervention if the intervention had never occurred. For details, see: Brodersen et al., Annals of Applied Statistics (2015).

What assumptions does the model make?

As with all non-experimental approaches to causal inference, valid conclusions require strong assumptions. In the case of CausalImpact, we assume that there is a set control time series that were themselves not affected by the intervention. If they were, we might falsely under- or overestimate the true effect. Or we might falsely conclude that there was an effect even though in reality there wasn't. The model also assumes that the relationship between covariates and treated time series, as established during the pre-period, remains stable throughout the post-period (see impact.model_args["dynamic_regression"] for a way of relaxing this assumption). Finally, it's important to be aware of the priors that are part of the model (see impact.model_args["prior_level_sd"] in particular).

How is the package structured?

The package is designed to make counterfactual inference as easy as fitting a regression model, but much more powerful, provided the assumptions above are met. The package has a single entry point, the function CausalImpact(). Given a response time series and a set of control time series, the function constructs a time-series model, performs posterior inference on the counterfactual, and returns a CausalImpact object. The results can be summarized in terms of a table, a verbal description, or a plot.

1. Installing the package

To install causalimpact run:

In [1]:
# !pip install git+https://github.com/jamalsenouci/causalimpact.git

Once installed, the package can be imported using:

In [2]:
from causalimpact import CausalImpact

2. Creating an example dataset

To illustrate how the package works, we create a simple toy dataset. It consists of a response variable y and a predictor x1. Note that in practice, we'd strive for including many more predictor variables and let the model choose an appropriate subset. The example data has 100 observations. We create an intervention effect by lifting the response variable by 10 units after timepoint 71.

In [3]:
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_process import arma_generate_sample
import matplotlib
import seaborn as sns
%matplotlib inline
matplotlib.rcParams['figure.figsize'] = (15, 6)

np.random.seed(1)

x1 = arma_generate_sample(ar=[0.999], ma=[0.9], nsample=100) + 100
y = 1.2 * x1 + np.random.randn(100)

y[71:100] = y[71:100] + 10
data = pd.DataFrame(np.array([y, x1]).T, columns=["y","x1"])

We now have a simple matrix with 100 rows and two columns:

In [4]:
data.shape
data.head()
Out[4]:
y x1
0 121.308920 101.463374
1 120.563149 99.448868
2 119.832495 99.524170
3 119.433612 99.033362
4 119.840664 100.779647

We can visualize the generated data using:

In [5]:
data.plot()
Out[5]:
<matplotlib.axes._subplots.AxesSubplot at 0x23b92db3978>

3. Running an Analysis

To estimate the causal effect, we begin by specifying which period in the data should be used for training the model (pre-intervention period) and which period for computing a counterfactual prediction (post-intervention period).

In [6]:
pre_period = [0,69]
post_period = [70,99]

This says that time points 0...70 will be used for training, and time points 71...99 will be used for computing predictions. Alternatively, we could specify the periods in terms of dates or time points; see Section 5 for an example.

To perform inference, we run the analysis using:

In [7]:
impact = CausalImpact(data, pre_period, post_period)

This initialises a CausalImpact object

In [8]:
impact.run()

This instructs the package to assemble a structural time-series model, fit the model using MLE by default, and compute estimates of the causal effect. We can view the results in a dataframe as follows:

In [9]:
impact.inferences
Out[9]:
response cum_response point_pred point_pred_upper point_pred_lower cum_pred cum_pred_lower cum_pred_upper point_effect point_effect_lower point_effect_upper cum_effect cum_effect_lower cum_effect_upper
0 121.308920 121.308920 121.983438 3039.732063 -2795.765187 121.983438 -2795.765187 3039.732063 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
1 120.563149 241.872070 118.645144 121.268874 116.021414 240.628582 -2679.743773 3161.000937 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
2 119.832495 361.704565 119.703718 121.975935 117.431500 360.332300 -2562.312273 3282.976872 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
3 119.433612 481.138177 119.097649 121.239916 116.955382 479.429949 -2445.356891 3404.216789 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
4 119.840664 600.978841 121.490748 123.564989 119.416507 600.920697 -2325.940383 3527.781778 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
5 117.681232 718.660074 117.388078 119.420411 115.355745 718.308775 -2210.584639 3647.202189 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
6 122.626839 841.286913 122.257189 124.261096 120.253281 840.565964 -2090.331357 3771.463285 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
7 118.223373 959.510286 119.324678 121.308032 117.341325 959.890642 -1972.990032 3892.771317 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
8 120.078689 1079.588975 120.473868 122.441666 118.506070 1080.364511 -1854.483962 4015.212983 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
9 119.763025 1199.352000 119.752836 121.708450 117.797223 1200.117347 -1736.686739 4136.921433 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
10 120.207540 1319.559540 121.792669 123.738480 119.846859 1321.910016 -1616.839880 4260.659913 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
11 118.087980 1437.647520 117.452656 119.390409 115.514902 1439.362672 -1501.324978 4380.050322 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
12 120.497602 1558.145122 119.575679 121.506692 117.644665 1558.938351 -1383.680313 4501.557014 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
13 118.725290 1676.870412 119.573170 121.498462 117.647878 1678.511521 -1266.032435 4623.055476 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
14 121.576243 1798.446654 121.320728 123.241103 119.400354 1799.832249 -1146.632080 4746.296579 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
15 117.498645 1915.945299 118.676894 120.592996 116.760792 1918.509143 -1029.871288 4866.889575 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
16 119.774896 2035.720195 119.708102 121.620458 117.795746 2038.217246 -912.075542 4988.510033 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
17 117.435192 2153.155387 118.871681 120.780726 116.962637 2157.088927 -795.112905 5109.290759 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
18 121.167054 2274.322441 119.887921 121.794017 117.981824 2276.976847 -677.131081 5231.084775 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
19 121.038971 2395.361412 120.599240 122.502694 118.695786 2397.576087 -558.435295 5353.587469 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
20 118.785524 2514.146936 118.615820 120.516893 116.714746 2516.191907 -441.720549 5474.104363 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
21 120.462378 2634.609314 121.298686 123.197603 119.399769 2637.490593 -322.320779 5597.301965 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
22 122.248449 2756.857762 120.971038 122.867992 119.074084 2758.461631 -203.246696 5720.169957 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
23 122.510339 2879.368101 120.551150 122.446310 118.655990 2879.012781 -84.590706 5842.616268 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
24 119.115916 2998.484018 121.107335 123.000849 119.213820 3000.120116 34.623115 5965.617117 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25 120.496999 3118.981017 119.140028 121.032027 117.248029 3119.260144 151.871144 6086.649144 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
26 121.494796 3240.475813 119.860322 121.750921 117.969723 3239.120467 269.840867 6208.400066 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
27 119.326369 3359.802182 118.952509 120.841811 117.063207 3358.072975 386.904074 6329.241876 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
28 118.511123 3478.313305 119.761481 121.649577 117.873384 3477.834456 504.777458 6450.891454 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
29 121.436703 3599.750008 120.669280 122.556254 118.782306 3598.503736 623.559764 6573.447707 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
70 117.791879 8527.215896 118.328108 120.196571 116.459646 8527.103835 5475.290680 11578.916989 -0.536229 1.332234 -2.404691 -0.536229 1.332234 -2.404691
71 130.356118 8657.572014 119.447472 121.315935 117.579010 8646.551307 5592.869690 11700.232923 10.908646 12.777109 9.040184 10.372417 14.109342 6.635492
72 132.701339 8790.273353 120.239067 122.107530 118.370605 8766.790374 5711.240295 11822.340453 12.462271 14.330734 10.593809 22.834689 28.440076 17.229301
73 130.698575 8920.971928 121.092166 122.960629 119.223704 8887.882540 5830.463999 11945.301082 9.606409 11.474871 7.737946 32.441097 39.914947 24.967247
74 130.384896 9051.356824 120.424425 122.292887 118.555962 9008.306965 5949.019961 12067.593969 9.960471 11.828934 8.092009 42.401568 51.743881 33.059256
75 127.587522 9178.944346 117.639456 119.507919 115.770994 9125.946421 6064.790955 12187.101888 9.948066 11.816529 8.079604 52.349635 63.560410 41.138859
76 131.000426 9309.944772 119.683654 121.552116 117.815191 9245.630075 6182.606146 12308.654004 11.316772 13.185235 9.448310 63.666407 76.745645 50.587169
77 130.607800 9440.552571 121.034754 122.903217 119.166292 9366.664829 6301.772437 12431.557221 9.573046 11.441508 7.704583 73.239453 88.187153 58.291752
78 130.928821 9571.481392 120.322524 122.190987 118.454062 9486.987353 6420.226499 12553.748208 10.606297 12.474759 8.737834 83.845749 100.661912 67.029586
79 130.503994 9701.985387 120.956175 122.824637 119.087712 9607.943528 6539.314211 12676.572845 9.547820 11.416282 7.679357 93.393569 112.078194 74.708944
80 128.487086 9830.472473 119.783571 121.652034 117.915109 9727.727099 6657.229320 12798.224879 8.703515 10.571978 6.835053 102.097084 122.650172 81.543996
81 130.096512 9960.568985 119.809267 121.677729 117.940804 9847.536366 6775.170124 12919.902608 10.287245 12.155708 8.418782 112.384329 134.805879 89.962779
82 130.704873 10091.273858 120.270665 122.139127 118.402202 9967.807031 6893.572326 13042.041735 10.434208 12.302671 8.565745 122.818537 147.108550 98.528524
83 131.736525 10223.010383 120.536900 122.405362 118.668437 10088.343931 7012.240764 13164.447098 11.199625 13.068088 9.331163 134.018162 160.176638 107.859687
84 130.103931 10353.114314 120.284648 122.153111 118.416186 10208.628579 7130.656949 13286.600208 9.819283 11.687745 7.950820 143.837445 171.864383 115.810507
85 129.511296 10482.625610 120.190192 122.058654 118.321729 10328.818771 7248.978679 13408.658863 9.321104 11.189567 7.452641 153.158549 183.053950 123.263149
86 129.837721 10612.463331 119.249489 121.117952 117.381027 10448.068260 7366.359705 13529.776815 10.588232 12.456694 8.719769 163.746781 195.510644 131.982918
87 130.648914 10743.112245 120.498198 122.366661 118.629736 10568.566458 7484.989441 13652.143475 10.150716 12.019179 8.282254 173.897497 207.529822 140.265171
88 130.412364 10873.524608 120.193543 122.062005 118.325080 10688.760001 7603.314521 13774.205480 10.218821 12.087284 8.350359 184.116318 219.617106 148.615530
89 131.147951 11004.672559 121.393930 123.262392 119.525467 10810.153930 7722.839988 13897.467873 9.754021 11.622484 7.885559 193.870339 231.239590 156.501088
90 132.456466 11137.129025 121.476644 123.345106 119.608181 10931.630574 7842.448169 14020.812979 10.979822 12.848285 9.111360 204.850161 244.087875 165.612448
91 130.569662 11267.698687 120.268991 122.137454 118.400529 11051.899565 7960.848698 14142.950432 10.300671 12.169133 8.432208 215.150832 256.257008 174.044656
92 131.498945 11399.197633 119.601360 121.469823 117.732898 11171.500925 8078.581596 14264.420255 11.897585 13.766048 10.029123 227.048417 270.023055 184.073779
93 130.420537 11529.618170 119.287528 121.155991 117.419066 11290.788454 8196.000661 14385.576246 11.133009 13.001471 9.264546 238.181426 283.024527 193.338325
94 131.116882 11660.735051 120.552913 122.421376 118.684451 11411.341367 8314.685112 14507.997622 10.563968 12.432431 8.695506 248.745394 295.456958 202.033831
95 128.456173 11789.191224 120.140554 122.009016 118.272091 11531.481921 8432.957204 14630.006638 8.315619 10.184081 6.447156 257.061013 305.641039 208.480987
96 130.230586 11919.421810 119.638803 121.507266 117.770340 11651.120724 8550.727544 14751.513904 10.591783 12.460245 8.723320 267.652796 318.101284 217.204307
97 130.467414 12049.889223 120.100357 121.968820 118.231894 11771.221081 8668.959439 14873.482723 10.367057 12.235519 8.498594 278.019853 330.336803 225.702902
98 130.140680 12180.029904 119.309840 121.178303 117.441378 11890.530921 8786.400816 14994.661026 10.830840 12.699303 8.962378 288.850693 343.036106 234.665279
99 131.799071 12311.828975 120.879959 122.748421 119.011496 12011.410880 8905.412312 15117.409448 10.919112 12.787575 9.050650 299.769805 355.823681 243.715929

100 rows 14 columns

4. Plotting the results

The easiest way of visualising the results is to use the plot() method of the CausalImpact object

In [10]:
impact.plot()

By default, the plot contains three panels. The first panel shows the data and a counterfactual prediction for the post-treatment period. The second panel shows the difference between observed data and counterfactual predictions. This is the pointwise causal effect, as estimated by the model. The third panel adds up the pointwise contributions from the second panel, resulting in a plot of the cumulative effect of the intervention.

Remember, once again, that all of the above inferences depend critically on the assumption that the covariates were not themselves affected by the intervention. The model also assumes that the relationship between covariates and treated time series, as established during the pre-period, remains stable throughout the post-period.

5. Working with dates and times

It is often more natural to feed a time-series object into CausalImpact() rather than a data frame. For example, we might create a data variable as follows:

In [11]:
date_range = pd.date_range(start="2014-01-01", periods=100)
data.index = date_range
data.head()
Out[11]:
y x1
2014-01-01 121.308920 101.463374
2014-01-02 120.563149 99.448868
2014-01-03 119.832495 99.524170
2014-01-04 119.433612 99.033362
2014-01-05 119.840664 100.779647

We can now specify the pre_period and post_period in terms of time points rather than indices:

In [12]:
pre_period = [pd.to_datetime(date) for date in ["2014-01-01", "2014-03-11"]]
post_period = [pd.to_datetime(date) for date in ["2014-03-12", "2014-04-10"]]

As a result, the x-axis of the plot shows time points instead of indices:

In [13]:
impact = CausalImpact(data, pre_period, post_period)
impact.run()
In [14]:
impact.inferences.head(2)
Out[14]:
response cum_response point_pred point_pred_upper point_pred_lower cum_pred cum_pred_lower cum_pred_upper point_effect point_effect_lower point_effect_upper cum_effect cum_effect_lower cum_effect_upper
2014-01-01 121.308920 121.30892 121.983438 3039.732063 -2795.765187 121.983438 -2795.765187 3039.732063 0.0 0.0 0.0 0.0 0.0 0.0
2014-01-02 120.563149 241.87207 118.645144 121.268874 116.021414 240.628582 -2679.743773 3161.000937 0.0 0.0 0.0 0.0 0.0 0.0
In [15]:
impact.plot()

6. Printing a summary table

To obtain a numerical summary of the analysis we use:

In [16]:
impact.summary()
                      Average    Cumulative
Actual                    130          3902
Predicted                 120          3602
95% CI             [118, 121]  [3546, 3658]
                                           
Absolute Effect             9           299
95% CI                [11, 8]    [355, 243]
                                           
Relative Effect          8.3%          8.3%
95% CI           [9.2%, 6.7%]  [9.9%, 6.7%]

The Average column talks about the average (across time) during the post-intervention period (in the example: time points 71 through 100). The Cumulative column sums up individual time points, which is a useful perspective if the response variable represents a flow quantity (such as queries, clicks, visits, installs, sales, or revenue) rather than a stock quantity (such as number of users or stock price).

In the example, the estimated average causal effect of treatment was 11 (rounded to a whole number; for full precision see impact$summary). This is because we observed an average value of 99 but would have expected an average value of only 89. The 95% posterior interval of the average effect is [9.8, 11]. Since this excludes 0, we (correctly) conclude that the intervention had a causal effect on the response variable. Since we generated the data ourselves, we know that we injected a true effect of 10, and so the model accurately recovered ground truth. One reason for this is that we ensured, by design, that the covariate x1 was not itself affected by the intervention. In practice, we must always reason whether this assumption is justified.

For additional guidance about the correct interpretation of the summary table, the package provides a verbal interpretation, which we can print using:

In [17]:
#TODO impact.summary("report")

7. Adjusting the model

So far, we've simply let the package decide how to construct a time-series model for the available data. However, there are several options that allow us to gain a little more control over this process. These options are passed into model_args as individual list elements, for example:

In [18]:
impact = CausalImpact(data, pre_period, post_period, model_args={"niter":5000, "nseasons":7})
impact.run()
In [19]:
impact.plot()

Available options

  • niter Number of MCMC samples to draw. More samples lead to more accurate inferences. Defaults to 1000.

  • standardize_data Whether to standardize all columns of the data before fitting the model. This is equivalent to an empirical Bayes approach to setting the priors. It ensures that results are invariant to linear transformations of the data. Defaults to TRUE.

  • prior_level_sd Prior standard deviation of the Gaussian random walk of the local level. Expressed in terms of data standard deviations. Defaults to 0.01, a typical choice for well-behaved and stable datasets with low residual volatility after regressing out known predictors (e.g., web searches or sales in high quantities). When in doubt, a safer option is to use 0.1, as validated on synthetic data, although this may sometimes give rise to unrealistically wide prediction intervals.

  • nseasons Period of the seasonal components. In order to include a seasonal component, set this to a whole number greater than 1. For example, if the data represent daily observations, use 7 for a day-of-week component. This interface currently only supports up to one seasonal component. To specify multiple seasonal components, use bsts to specify the model directly, then pass the fitted model in as bsts.model. Defaults to 1, which means no seasonal component is used.

  • season_duration Duration of each season, i.e., number of data points each season spans. Defaults to 1. For example, to add a day-of-week component to data with daily granularity, use model_args = list(nseasons = 7, season_duration = 1). To add a day-of-week component to data with hourly granularity, set model_args = list(nseasons = 7, season_duration = 24).

  • dynamic_regression Whether to include time-varying regression coefficients. In combination with a time-varying local trend or even a time-varying local level, this often leads to overspecification, in which case a static regression is safer. Defaults to FALSE.

8. Using a custom model

Instead of using the default model constructed by the CausalImpact package, we can use the bsts package to specify our own model. This provides the greatest degree of flexibility.

Before constructing a custom model, we set the observed data in the post-treatment period to NA, reflecting the fact that the counterfactual response is unobserved after the intervention. We keep a copy of the actual observed response in the variable post_period_response.

In [20]:
post_period = [70,100]
post_period_response = y[post_period[0]:post_period[1]].copy()
y[post_period[0]:post_period[1]] = np.nan

We next set up and estimate a time series model using the statsmodels package. Here is a simple example:

In [21]:
from statsmodels.tsa.statespace.structural import UnobservedComponents
ucm_model = UnobservedComponents(endog=y, exog=x1, level="llevel")
impact = CausalImpact(ucm_model=ucm_model, post_period_response=post_period_response)
impact.run()
#TODO
impact.plot()
In [22]:
impact.inferences
Out[22]:
response cum_response point_pred point_pred_upper point_pred_lower cum_pred cum_pred_lower cum_pred_upper point_effect point_effect_lower point_effect_upper cum_effect cum_effect_lower cum_effect_upper
0 121.308920 121.308920 134.164372 2094.129235 -1825.800490 134.164372 -1825.800490 2094.129235 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
1 120.563149 241.872070 118.645163 121.268872 116.021455 252.809536 -1709.779035 2215.398107 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
2 119.832495 361.704565 119.703727 121.975925 117.431528 372.513262 -1592.347507 2337.374032 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
3 119.433612 481.138177 119.097657 121.239906 116.955408 491.610920 -1475.392099 2458.613938 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
4 119.840664 600.978841 121.490747 123.564971 119.416524 613.101667 -1355.975575 2582.178909 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
5 117.681232 718.660074 117.388089 119.420405 115.355772 730.489756 -1240.619803 2701.599314 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
6 122.626839 841.286913 122.257183 124.261074 120.253292 852.746939 -1120.366510 2825.860388 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
7 118.223373 959.510286 119.324683 121.308020 117.341345 972.071621 -1003.025165 2947.168408 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
8 120.078689 1079.588975 120.473868 122.441650 118.506086 1092.545489 -884.519079 3069.610057 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
9 119.763025 1199.352000 119.752838 121.708435 117.797241 1212.298327 -766.721838 3191.318492 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
10 120.207540 1319.559540 121.792665 123.738459 119.846870 1334.090992 -646.874967 3315.056952 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
11 118.087980 1437.647520 117.452664 119.390402 115.514927 1451.543657 -531.360040 3434.447354 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
12 120.497602 1558.145122 119.575680 121.506678 117.644683 1571.119337 -413.715357 3555.954031 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
13 118.725290 1676.870412 119.573172 121.498447 117.647896 1690.692508 -296.067462 3677.452479 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
14 121.576243 1798.446654 121.320725 123.241083 119.400366 1812.013233 -176.667095 3800.693561 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
15 117.498645 1915.945299 118.676899 120.592984 116.760813 1930.690132 -59.906283 3921.286546 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
16 119.774896 2035.720195 119.708103 121.620443 117.795763 2050.398235 57.889480 4042.906989 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
17 117.435192 2153.155387 118.871684 120.780713 116.962656 2169.269919 174.852136 4163.687702 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
18 121.167054 2274.322441 119.887920 121.794001 117.981840 2289.157840 292.833976 4285.481703 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
19 121.038971 2395.361412 120.599238 122.502676 118.695799 2409.757077 411.529776 4407.984379 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
20 118.785524 2514.146936 118.615824 120.516881 116.714766 2528.372901 528.244542 4528.501260 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
21 120.462378 2634.609314 121.298682 123.197583 119.399781 2649.671583 647.644323 4651.698843 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
22 122.248449 2756.857762 120.971035 122.867973 119.074096 2770.642618 766.718419 4774.566816 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
23 122.510339 2879.368101 120.551149 122.446293 118.656004 2891.193766 885.374423 4897.013109 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
24 119.115916 2998.484018 121.107332 123.000831 119.213833 3012.301098 1004.588256 5020.013940 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25 120.496999 3118.981017 119.140031 121.032014 117.248048 3131.441129 1121.836304 5141.045954 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
26 121.494796 3240.475813 119.860323 121.750907 117.969740 3251.301452 1239.806044 5262.796861 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
27 119.326369 3359.802182 118.952512 120.841799 117.063226 3370.253965 1356.869270 5383.638659 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
28 118.511123 3478.313305 119.761482 121.649563 117.873401 3490.015446 1474.742670 5505.288222 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
29 121.436703 3599.750008 120.669278 122.556236 118.782320 3610.684724 1593.524990 5627.844458 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
70 117.791879 8527.215896 118.328114 120.196561 116.459667 8539.284806 6445.256528 10633.313084 -0.536235 1.332213 -2.404682 -0.536235 1.332213 -2.404682
71 130.356118 8657.572014 119.447475 121.315922 117.579027 8658.732281 6562.835555 10754.629006 10.908644 12.777091 9.040197 10.372409 14.109303 6.635515
72 132.701339 8790.273353 120.239067 122.107514 118.370620 8778.971348 6681.206175 10876.736520 12.462271 14.330718 10.593824 22.834681 28.440022 17.229340
73 130.698575 8920.971928 121.092163 122.960610 119.223716 8900.063511 6800.429892 10999.697131 9.606412 11.474859 7.737965 32.441092 39.914881 24.967304
74 130.384896 9051.356824 120.424424 122.292871 118.555977 9020.487935 6918.985869 11121.990002 9.960472 11.828919 8.092025 42.401564 51.743799 33.059329
75 127.587522 9178.944346 117.639464 119.507911 115.771017 9138.127399 7034.756886 11241.497913 9.948058 11.816505 8.079611 52.349623 63.560305 41.138941
76 131.000426 9309.944772 119.683655 121.552102 117.815208 9257.811054 7152.572094 11363.050015 11.316771 13.185218 9.448324 63.666393 76.745523 50.587264
77 130.607800 9440.552571 121.034752 122.903199 119.166305 9378.845806 7271.738398 11485.953213 9.573048 11.441495 7.704601 73.239442 88.187018 58.291865
78 130.928821 9571.481392 120.322524 122.190971 118.454077 9499.168330 7390.192475 11608.144184 10.606297 12.474744 8.737850 83.845739 100.661762 67.029715
79 130.503994 9701.985387 120.956172 122.824619 119.087725 9620.124502 7509.280200 11730.968803 9.547822 11.416269 7.679375 93.393561 112.078031 74.709090
80 128.487086 9830.472473 119.783573 121.652020 117.915126 9739.908074 7627.195326 11852.620823 8.703514 10.571961 6.835067 102.097075 122.649992 81.544157
81 130.096512 9960.568985 119.809268 121.677715 117.940821 9859.717343 7745.136147 11974.298538 10.287244 12.155691 8.418797 112.384318 134.805683 89.962954
82 130.704873 10091.273858 120.270665 122.139112 118.402218 9979.988007 7863.538364 12096.437650 10.434208 12.302655 8.565761 122.818527 147.108338 98.528715
83 131.736525 10223.010383 120.536899 122.405346 118.668452 10100.524906 7982.206816 12218.842995 11.199626 13.068073 9.331179 134.018153 160.176412 107.859895
84 130.103931 10353.114314 120.284648 122.153095 118.416201 10220.809553 8100.623017 12340.996090 9.819283 11.687730 7.950836 143.837436 171.864142 115.810731
85 129.511296 10482.625610 120.190192 122.058639 118.321745 10340.999745 8218.944762 12463.054729 9.321104 11.189551 7.452657 153.158540 183.053693 123.263388
86 129.837721 10612.463331 119.249492 121.117939 117.381045 10460.249238 8336.325807 12584.172668 10.588229 12.456676 8.719782 163.746769 195.510368 131.983170
87 130.648914 10743.112245 120.498197 122.366644 118.629750 10580.747435 8454.955557 12706.539312 10.150717 12.019164 8.282270 173.897486 207.529532 140.265440
88 130.412364 10873.524608 120.193543 122.061990 118.325095 10700.940977 8573.280652 12828.601302 10.218821 12.087268 8.350374 184.116307 219.616801 148.615814
89 131.147951 11004.672559 121.393926 123.262373 119.525479 10822.334903 8692.806132 12951.863675 9.754025 11.622472 7.885578 193.870332 231.239273 156.501392
90 132.456466 11137.129025 121.476640 123.345087 119.608193 10943.811543 8812.414324 13075.208762 10.979826 12.848273 9.111379 204.850158 244.087546 165.612771
91 130.569662 11267.698687 120.268991 122.137438 118.400544 11064.080534 8930.814868 13197.346200 10.300671 12.169118 8.432224 215.150829 256.256664 174.044995
92 131.498945 11399.197633 119.601362 121.469809 117.732915 11183.681896 9048.547783 13318.816009 11.897583 13.766030 10.029136 227.048413 270.022694 184.074131
93 130.420537 11529.618170 119.287531 121.155978 117.419084 11302.969427 9165.966867 13439.971987 11.133006 13.001453 9.264559 238.181419 283.024147 193.338690
94 131.116882 11660.735051 120.552912 122.421359 118.684465 11423.522339 9284.651332 13562.393346 10.563969 12.432416 8.695522 248.745388 295.456564 202.034212
95 128.456173 11789.191224 120.140554 122.009001 118.272107 11543.662893 9402.923439 13684.402347 8.315619 10.184066 6.447171 257.061007 305.640629 208.481384
96 130.230586 11919.421810 119.638805 121.507252 117.770358 11663.301698 9520.693797 13805.909599 10.591781 12.460228 8.723334 267.652788 318.100857 217.204718
97 130.467414 12049.889223 120.100357 121.968804 118.231910 11783.402055 9638.925707 13927.878403 10.367057 12.235504 8.498610 278.019844 330.336361 225.703328
98 130.140680 12180.029904 119.309843 121.178290 117.441396 11902.711898 9756.367103 14049.056693 10.830838 12.699285 8.962391 288.850682 343.035646 234.665718
99 131.799071 12311.828975 120.879957 122.748404 119.011510 12023.591855 9875.378613 14171.805097 10.919115 12.787562 9.050668 299.769796 355.823207 243.716386

100 rows 14 columns

Finally, we call CausalImpact(). Instead of providing input data, we simply pass in the fitted model object (bsts_model). We also need to provide the actual observed response. This is needed so that the package can compute the difference between predicted response (stored in bsts_model) and actual observed response (stored in post_period_response).

9. FAQ

How do I cite the package in my work?

We recommend referencing the use of the CausalImpact R package as shown in the example below:

"CausalImpact r packageVersion("CausalImpact"), Brodersen et al., Annals of Applied Statistics (2015). http://google.github.io/CausalImpact/"

To find out which package version you are using, type import causalimpact; causalimpact.__version__. See the bottom of this page for full bibliographic details.

How can I check whether the model assumptions are fulfilled?

It's the elephant in the room with any causal analysis on observational data: how can we verify the assumptions that go into the model? Here are a few ways of getting started. First of all, it is critical to reason why the covariates that are included in the model (this was x1 in the example) were not themselves affected by the intervention. Sometimes it helps to plot all covariates and do a visual sanity check. Next, it is a good idea to examine how well the outcome data y can be predicted before the beginning of the intervention. This can be done by running CausalImpact() on an imaginary intervention. Then check how well the model predicted the data following this imaginary intervention. We would expect not to find a significant effect, i.e., counterfactual estimates and actual data should agree reasonably closely. Finally, when presenting or writing up results, be sure to list the above assumptions explicitly, including the priors in model_args, and discuss them with your audience.

May the data contain missing values?

The response variable (i.e., the first column in data) may contain missing values (NA), but covariates (all other columns in data) may not. If one of your covariates contains missing values, consider imputing (i.e., estimating) the missing values; if this is not feasible, leave the regressor out.

How can I customize the default plot?

By default, impact.plot() creates three panels, showing the counterfactual, pointwise, and cumulative impact estimates. One way of customizing the plot is to specify which panels should be included:

In [23]:
impact.plot(["original", "pointwise"])

This creates a plot without cumulative impact estimates. This is sensible whenever the response variable represents a stock quantity that cannot be meaningfully summed up across time (e.g., number of current subscribers), rather than a flow quantity (e.g., number of clicks).

How can I change the font size in the plot?

The plot() function for CausalImpact objects returns a matplotlib object. This means we can customize the plot using standard bokeh functions. For example, to increase the font size, we can do:

In [24]:
# TODO

How can I obtain 90% intervals instead of 95% intervals?

The size of the intervals is specified by the argument alpha, which defaults to 0.05. To obtain 90% intervals instead, we would use:

In [25]:
impact = CausalImpact(data, pre_period, post_period, alpha = 0.1)

Which predictor variables were used in the model?

Analyses may easily contain tens or hundreds of potential predictors (i.e., columns in the data function argument). Which of these were informative? We can plot the posterior probability of each predictor being included in the model using:

In [26]:
# TODO impact.coefficient_plot()