Smoothing with Holt-Winters

Note: This article was originally published at Planet PHP on 21 April 7800.

In one of his talks at QCon, John Allspaw mentioned using Holt-Winters exponential smoothing on various monitoring instances. Wikipedia has a good entry on the subject, of course, but the basic idea is to take a noisy, spiky time series and smooth it out so that unexpected changes stand out even more. That's often initially done by taking a moving average: say, averaging the last 7 days of data and using that as the current day's value. More complicated schemes weight that average so that older data contributes less.
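To make the two averaging schemes concrete, here is a minimal sketch. The function names, the linear weighting and the sample data are my own assumptions for illustration; none of them come from the original article.

```php
<?php
// Hypothetical helper: trailing moving average over the last $window points.
function moving_average(array $data, int $window): array
{
    $smoothed = [];
    $count = count($data);
    for ($i = $window - 1; $i < $count; $i++) {
        // Average the current value and the ($window - 1) values before it
        $smoothed[$i] = array_sum(array_slice($data, $i - $window + 1, $window)) / $window;
    }
    return $smoothed;
}

// Hypothetical helper: linearly weighted variant, newest point weighted most.
function weighted_moving_average(array $data, int $window): array
{
    $smoothed = [];
    $count = count($data);
    $weight_total = $window * ($window + 1) / 2; // 1 + 2 + ... + $window
    for ($i = $window - 1; $i < $count; $i++) {
        $sum = 0;
        for ($w = 1; $w <= $window; $w++) {
            // Weight 1 goes to the oldest point in the window, $window to the newest
            $sum += $w * $data[$i - $window + $w];
        }
        $smoothed[$i] = $sum / $weight_total;
    }
    return $smoothed;
}

// Made-up sample data with a spike on the last day
$hits = [10, 12, 11, 13, 12, 11, 40];
$plain = moving_average($hits, 3);
$weighted = weighted_moving_average($hits, 3);
```

With this sample, the weighted version reacts more strongly to the final spike than the plain average does, because the newest point carries the largest weight.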

Simple exponential smoothing effectively takes this weighted average further, with more recent values being exponentially more important than older ones. However, this has problems in the face of a long-term trend, so double exponential smoothing includes a factor for the general tendencies in the data (e.g. an increasing trend over time). Triple exponential smoothing, which we're using here, also includes a factor to consider seasonal changes, so I thought I'd give that one a go at implementing. Each of those three smoothing aspects has its own weighting factor (alpha, beta and gamma) that controls how much of an impact it has, and by setting the unneeded ones to 0 we can have the same code do any one of the three algorithms. Below I've broken out the function into its component parts, but you can see the whole thing on GitHub.
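As a rough illustration of the first two schemes, the sketch below uses the usual textbook recurrences. It is not the article's implementation: the function names and the choice of the first difference as the starting trend are assumptions made for this example.

```php
<?php
// Simple exponential smoothing, textbook form (not the article's code):
//   s[0] = x[0];  s[t] = alpha * x[t] + (1 - alpha) * s[t-1]
function simple_exponential_smoothing(array $data, float $alpha): array
{
    $smoothed = [$data[0]];
    for ($i = 1; $i < count($data); $i++) {
        $smoothed[$i] = $alpha * $data[$i] + (1 - $alpha) * $smoothed[$i - 1];
    }
    return $smoothed;
}

// Double exponential smoothing (Holt's linear method), textbook form:
//   level[t] = alpha * x[t] + (1 - alpha) * (level[t-1] + trend[t-1])
//   trend[t] = beta * (level[t] - level[t-1]) + (1 - beta) * trend[t-1]
// Needs at least two data points.
function double_exponential_smoothing(array $data, float $alpha, float $beta): array
{
    $level = $data[0];
    $trend = $data[1] - $data[0]; // assumed starting trend: the first difference
    $smoothed = [$level];
    for ($i = 1; $i < count($data); $i++) {
        $previous_level = $level;
        $level = $alpha * $data[$i] + (1 - $alpha) * ($previous_level + $trend);
        $trend = $beta * ($level - $previous_level) + (1 - $beta) * $trend;
        $smoothed[$i] = $level;
    }
    return $smoothed;
}

// A steadily rising series: the simple version lags behind it,
// while the version with a trend term keeps up.
$rising = [1, 2, 3, 4];
$simple = simple_exponential_smoothing($rising, 0.5);
$double = double_exponential_smoothing($rising, 0.5, 0.5);
```

On the rising series above, the simple version falls further behind at each step, which is the long-term trend problem described in the text; the double version follows the line exactly because its trend term accounts for the steady increase.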

We'll give it a go on some web data that has an unexpected spike, to see how visible that is against the timeline. The algorithm is pretty simple, but we need to set up a bunch of variables first. We start off by calculating an initial trend value by looking at the difference in the average values over the first two 'seasons' (the length being a configurable parameter of the function).

// Calculate an initial trend level
$trend1 = 0;
for ($i = 0; $i < $season_length; $i++) {
    $trend1 += $data[$i];
}
$trend1 /= $season_length;
$trend2 = 0;
for ($i = $season_length; $i < 2 * $season_length; $i++) {
    $trend2 += $data[$i];
}
$trend2 /= $season_length;
$initial_trend = ($trend2 - $trend1) / $season_length;

Next we create an initial value for the 'level' part, the direct data smoothing parameter, map the data for the season index, and calculate the seasonal changes for the first period.

// Take the first value as the initial level
$initial_level = $data[0];
// Build index

Truncated by Planet PHP, read more at the original (another 24725 bytes)