In this tutorial, we will explore several detrending models using three popular Python libraries: scipy, statsmodels, and pandas.

While there are several detrending methods, we will focus on four models:

We will start with a constant model from the scipy library, which assumes that the trend of the time series is a straight horizontal line.

Then we move to a model that captures a linear trend in the data.

After that, we will explore a quadratic model, using the statsmodels library.

Finally, we will look at a moving average model, which is particularly useful when the trend changes over time and a single line or curve cannot describe it.

After these, you can check my tutorial using another detrending method: differencing.

What Is Detrending?

If you already know what detrending is, feel free to skip this section.

Time series data is a collection of observations made over time.

These observations may exhibit a trend.

In plain English, a trend means that the values of the time series may increase or decrease over time.

For example, the number of visitors to a website may increase as the website becomes more popular.

Sometimes we want to analyze the data without the trend to capture other patterns like seasonality or noise.

In these situations, detrending can help us.

We assume the time series is an additive combination of a trend and other components.

$$\mathrm{Y}(t) = \mathrm{T}\left(t\right) + \mathrm{S}\left(t\right) + \mathrm{e}\left(t\right)$$

Here:

  • $Y(t)$ is the original time series
  • $T(t)$ is the trend
  • $S(t)$ is the seasonal component
  • $e(t)$ is the noise

So, to detrend a time series, we need to estimate the trend component and then subtract it from the original data, leaving us with only the seasonal component and the noise.
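To make the additive model concrete, here is a minimal sketch on synthetic data (not the Sao Paulo series). The trend slope, seasonal amplitude, and noise level are made-up values chosen only for illustration, and the trend is estimated with a simple straight-line fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly series: Y(t) = T(t) + S(t) + e(t), all values are assumptions
t = np.arange(120)                                # 10 years of monthly steps
trend = 0.05 * t                                  # T(t): slow linear increase
seasonal = 2.0 * np.sin(2 * np.pi * t / 12)       # S(t): yearly cycle
noise = rng.normal(scale=0.3, size=t.size)        # e(t): random noise
y = trend + seasonal + noise                      # Y(t): what we would observe

# Estimate the trend with a least-squares line, then subtract it
slope, intercept = np.polyfit(t, y, deg=1)
trend_hat = slope * t + intercept
detrended_demo = y - trend_hat                    # roughly S(t) + e(t)
```

What remains in `detrended_demo` is approximately the seasonal cycle plus the noise, centered around zero.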

I will use a time series with monthly historical temperature in my city (Sao Paulo, Brazil) to demonstrate how you can perform detrending in Python.

Here’s a preview of the data:

TIMESTAMP            TEMP
1980-12-01 00:00:00  22.63
1952-10-01 00:00:00  20.74
1991-01-01 00:00:00  23.3
2012-04-01 00:00:00  22.45
2018-02-01 00:00:00  23.25

Monthly Temperature in Sao Paulo, Brazil

Detrending With A Constant Model (Scipy)

The simplest way to detrend a time series is by subtracting the mean value of the data.

This is called a constant model, and it assumes that the trend of the time series is a straight horizontal line.

We just need to use the detrend function from the scipy library.

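import pandas as pd
from scipy.signal import detrend

# detrend returns a numpy array, so wrap it in a Series to keep the original index
detrended = detrend(data['TEMP'], type='constant')
detrended = pd.Series(detrended, index=data.index)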

Monthly Temperature in Sao Paulo, Brazil, Detrended With A Constant Model

Scipy returns a numpy array, so we need to convert it to a pandas Series.

You pass the sequence of values as the first argument and the type of detrending model through the type keyword argument.
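As a quick sanity check, the constant model should give the same result as subtracting the mean by hand. The sketch below uses a handful of made-up temperature values rather than the real dataset.

```python
import numpy as np
from scipy.signal import detrend

# Made-up values, used only to compare the two approaches
values = np.array([20.0, 21.5, 23.0, 22.5, 24.0])

by_scipy = detrend(values, type='constant')   # constant model from scipy
by_hand = values - values.mean()              # subtract the mean manually

print(np.allclose(by_scipy, by_hand))
```

Both arrays hold the same numbers, so the constant model is just a vertical shift of the series.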

As you can see, it didn’t do a great job of removing the trend. Subtracting the mean only shifts the series up or down, so what looks like a positive linear trend is still there.

Let’s try a linear model.

Detrending With A Linear Model (Scipy)

It’s rare to find a time series where the trend is a simple horizontal line through time, but it’s very common to find a linear trend.

To perform linear detrending in Python, we can use the same detrend function from the scipy library, but this time we pass type='linear' (which is also the function's default).

detrended = detrend(data['TEMP'], type='linear')
detrended = pd.Series(detrended, index=data.index)

Monthly Temperature in Sao Paulo, Brazil, Detrended With A Linear Model

As you can see, the linear model did a much better job at removing the trend. The data is well aligned with the horizontal axis.
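If you are curious what the linear model does under the hood, the sketch below compares it with fitting a least-squares line and subtracting it. The data is synthetic (a made-up slope plus noise), so treat it as an illustration rather than a result from the temperature series.

```python
import numpy as np
from scipy.signal import detrend

rng = np.random.default_rng(1)

# Synthetic series: made-up intercept and slope plus random noise
t = np.arange(100)
values = 18.0 + 0.03 * t + rng.normal(scale=0.5, size=t.size)

# Linear detrending with scipy
by_scipy = detrend(values, type='linear')

# The same thing by hand: fit a straight line and subtract it
slope, intercept = np.polyfit(t, values, deg=1)
by_hand = values - (slope * t + intercept)

print(np.allclose(by_scipy, by_hand))
```

The two residuals match up to floating point error, which is why the detrended plot ends up aligned with the horizontal axis.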

In the next section, we will explore a quadratic model, which can capture more complex trends in the data.

Detrending With A Quadratic Model (Statsmodels)

If you find that the linear model is not enough to remove the trend, you can try a quadratic model.

This time we will use the detrend function from the statsmodels library.

# Note: this import reuses the name detrend, replacing the scipy function imported earlier
from statsmodels.tsa.tsatools import detrend

detrended = detrend(data['TEMP'], order=2)

Monthly Temperature in Sao Paulo, Brazil, Detrended With A Quadratic Model

The function takes the sequence of values as the first argument and the order of the model as the second argument.

You can use this same function to detrend with a constant model (order=0) or a linear model (order=1), but I prefer to use the scipy library for that.

And you can go even further and use higher order models, but it’s usually not necessary.
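To show what the order means, here is a small sketch that does polynomial detrending with numpy instead of statsmodels. The helper polynomial_detrend is a hypothetical name I made up for this example, and the curved series is synthetic.

```python
import numpy as np

def polynomial_detrend(values, order):
    """Hypothetical helper: subtract a least-squares polynomial of the given order."""
    values = np.asarray(values, dtype=float)
    t = np.arange(values.size, dtype=float)
    coeffs = np.polyfit(t, values, deg=order)    # fit the polynomial trend
    return values - np.polyval(coeffs, t)        # subtract the fitted trend

# Synthetic series with a purely quadratic trend and no noise
t = np.arange(50, dtype=float)
curved = 10.0 + 0.2 * t - 0.004 * t**2

residual_linear = polynomial_detrend(curved, order=1)     # curvature is left behind
residual_quadratic = polynomial_detrend(curved, order=2)  # trend fully removed
```

On this toy series, the linear residual still has a visible arch, while the quadratic residual is essentially zero everywhere.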

Detrending With A Moving Average Model (Pandas)

When your trend changes over time, you can use a moving average model to estimate it.

This model calculates the average value of the data over a certain window, which gives a smoothed version of the time series. Subtracting that smoothed version from each data point leaves the detrended series.

We can use the rolling function from the pandas library to calculate the moving average.

rolling_mean = data['TEMP'].rolling(window=60, center=True).mean()
detrended = data['TEMP'] - rolling_mean

Monthly Temperature in Sao Paulo, Brazil, Detrended With A Moving Average Model

You can tune the window size to adjust for the length of the trend.

Trends in temperature should not change very quickly, so I used a window size of 60 months (5 years).

Statistical textbooks typically use a centered moving average, which means that the window takes into account the values before and after the current value instead of just the values before. This is why we use center=True.
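One practical detail: with the default settings, the rolling mean is only defined where a full window fits, so the first and last points of the detrended series come out as NaN. Here is a small sketch on a synthetic series (made-up dates and values) showing how many points are lost and how to drop them.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series, 10 years long, with made-up values
index = pd.date_range('2000-01-01', periods=120, freq='MS')
series = pd.Series(np.linspace(20.0, 23.0, 120), index=index)

window = 60
rolling_mean = series.rolling(window=window, center=True).mean()
detrended_ma = series - rolling_mean

# Points near both edges have no full window, so they are NaN
n_missing = int(detrended_ma.isna().sum())
trimmed = detrended_ma.dropna()

print(n_missing, len(trimmed))
```

In total, window - 1 points are lost, split between the start and the end of the series, so a larger window costs you more data.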

How To Select The Best Detrending Model?

Always start by visually inspecting the data after applying each model.

This will give you a good idea of how well each model is detrending the data.

Look for any remaining trend, oscillations, or patterns that might suggest that the model is not fully capturing the underlying trend in the data.
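If you want a number to go with the visual check, one rough option is to fit a straight line to the detrended values and look at its slope. The helper residual_slope below is a hypothetical name, the data is synthetic, and a slope near zero only rules out a leftover linear trend, not other patterns.

```python
import numpy as np
from scipy.signal import detrend

def residual_slope(residual):
    """Hypothetical helper: slope of a straight line fitted to the residual."""
    residual = np.asarray(residual, dtype=float)
    t = np.arange(residual.size)
    slope, _ = np.polyfit(t, residual, deg=1)
    return slope

rng = np.random.default_rng(2)

# Synthetic series with a made-up upward trend of 0.02 per step
t = np.arange(200)
values = 20.0 + 0.02 * t + rng.normal(scale=0.5, size=t.size)

slope_constant = residual_slope(detrend(values, type='constant'))
slope_linear = residual_slope(detrend(values, type='linear'))

print(slope_constant, slope_linear)
```

The constant model leaves a slope close to the original trend, while the linear model leaves a slope that is practically zero. If the residual contains NaN values (for example from a moving average), drop them before calling the helper.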

It’s always a good idea to consider any domain-specific knowledge that might help you select the best detrending model.

For example, if you are modeling physical newspaper sales, you might expect a downward trend over time.

If you are detrending as a preprocessing step for a machine learning model, you can compare the detrending methods by the performance of the full model pipeline on the validation set.