Deseasonalizing Time Series Data With Python

Time series data can be a valuable tool for predicting trends and making informed business decisions.

However, it can be difficult to analyze due to seasonal patterns and other fluctuations that can obscure underlying trends.

That’s where deseasonalizing comes in, allowing you to isolate trends and make more accurate predictions.

In this tutorial, we’ll explore two different approaches to deseasonalize time series data in Python: additive models and multiplicative models.

We’ll use the powerful Python library statsmodels to perform our analysis, and walk through the code step by step so that you can easily follow along.

What Is Seasonality?

If you already know what seasonality is, feel free to skip this section.

Seasonality in time series refers to a pattern that repeats itself at regular intervals over time.

This pattern could be daily, weekly, monthly, or even yearly, and it’s often caused by factors such as weather patterns, holidays, and other recurring events.

Let’s take the example of a company that sells ice cream.

The sales of ice cream may have a seasonal pattern, with higher sales during the summer and lower sales during the winter.

This is because people tend to eat more ice cream in the summer when it’s hot outside, and less in the winter when it’s cold.

This pattern of higher sales in the summer and lower sales in the winter is an example of seasonality in a time series.

Although a forecasting model has no information about the temperature, it can still capture the seasonal pattern in the data by looking at the time of year.

It only needs to learn that in summer (July in the northern hemisphere, January in the southern hemisphere) it must adjust its predictions upwards, and in winter (December in the northern hemisphere, June in the southern hemisphere) it must adjust them downwards.

In some cases, analyzing the data without this seasonal effect can help you better understand the underlying trends.

This is where deseasonalization comes in.

We can do this by using an additive model or a multiplicative model.

I will use a time series with monthly historical temperature in my city (Sao Paulo, Brazil) to demonstrate how you can perform deseasonalization in Python.

Here’s a preview of the data:

TIMESTAMP	TEMP
1980-12-01 00:00:00	22.63
1952-10-01 00:00:00	20.74
1991-01-01 00:00:00	23.3
2012-04-01 00:00:00	22.45
2018-02-01 00:00:00	23.25

Monthly Temperature in Sao Paulo, Brazil

Deseasonalizing With An Additive Model

In an additive model, the observations are modeled as the sum of the trend component, the seasonal component, and the error.

$\mathrm{Y}(t) = \mathrm{T}\left(t\right) + \mathrm{S}\left(t\right) + \mathrm{e}\left(t\right)$

where:

$\mathrm{Y}(t)$ is the observation at time $t$
$\mathrm{T}\left(t\right)$ is the trend at time $t$
$\mathrm{S}\left(t\right)$ is the seasonal component at time $t$
$\mathrm{e}\left(t\right)$ is the error at time $t$
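To make the formula concrete, here is a minimal sketch that builds a monthly series from the three components. The numbers (a base of 20, a swing of 3, the noise level) are made up for illustration and have nothing to do with the Sao Paulo data.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series built from the three additive components.
rng = np.random.default_rng(0)
index = pd.date_range("2000-01-01", periods=120, freq="MS")
t = np.arange(len(index))

trend = 20 + 0.01 * t                        # T(t): slow linear rise
seasonal = 3 * np.cos(2 * np.pi * t / 12)    # S(t): repeats every 12 months
error = rng.normal(0, 0.3, size=len(index))  # e(t): random noise

y = pd.Series(trend + seasonal + error, index=index, name="TEMP")

# Subtracting the (here, known) seasonal component leaves trend plus noise.
deseasonalized = y - seasonal
```

In real data we don't know S(t), which is why we need a function to estimate it.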

The function seasonal_decompose from statsmodels decomposes a time series into these components.

from statsmodels.tsa.seasonal import seasonal_decompose

decomposition = seasonal_decompose(data['TEMP'], model='additive', period=12)

This function can’t handle missing values, so if you have any, you need to impute or remove them before calling it.

You can do it easily with the interpolate method from pandas.

data['TEMP'] = data['TEMP'].interpolate()

I like to sort my data by timestamp before decomposing it, because the function treats the rows as consecutive observations and won’t reorder them for you.
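Here is a small sketch of that preparation step. The frame below is hypothetical: it only reuses the TIMESTAMP and TEMP column names from the preview, with four unsorted rows and one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the same columns as the preview: unsorted, with one gap.
data = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime(["2020-03-01", "2020-01-01", "2020-04-01", "2020-02-01"]),
    "TEMP": [22.0, 24.0, 20.0, np.nan],
})

# Put the rows in chronological order and use the timestamp as the index.
data = data.sort_values("TIMESTAMP").set_index("TIMESTAMP")

# Fill the gap; interpolate() is linear by default.
data["TEMP"] = data["TEMP"].interpolate()
```

Sorting first matters here, because linear interpolation fills each gap from its neighboring rows.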

We want an additive model, so we set the model parameter to additive.

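Our data is monthly and the pattern repeats every year, so we set the period parameter to 12, the number of observations in one seasonal cycle.

For other combinations of sampling frequency and cycle length:

hourly data with a daily cycle, set period to 24
daily data with a weekly cycle, set period to 7
daily data with a yearly cycle, set period to 365
weekly data with a yearly cycle, set period to 52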

You get the idea.

After the function runs, it returns a DecomposeResult object with attributes for each component, including trend, seasonal, and resid.

We just want the seasonal component, so we can access it with decomposition.seasonal.

To deseasonalize our data, we just need to subtract the seasonal component from the original data.

deseason_temp = data['TEMP'] - decomposition.seasonal

Deseasonalized Monthly Temperature in Sao Paulo, Brazil, Additive Model

Notice that the plot looks more jagged than the original one. With the strong seasonal component removed, what remains is the trend plus noise, so the trend is easier to see.

Deseasonalizing With A Multiplicative Model

In a multiplicative model, the observations are modeled as a product of the seasonal component, the trend component, and the error.

$\mathrm{Y}(t) = \mathrm{T}\left(t\right) \times \mathrm{S}\left(t\right) \times \mathrm{e}\left(t\right)$

where:

$\mathrm{Y}(t)$ is the observation at time $t$
$\mathrm{T}\left(t\right)$ is the trend at time $t$
$\mathrm{S}\left(t\right)$ is the seasonal component at time $t$
$\mathrm{e}\left(t\right)$ is the error at time $t$

Deseasonalizing with a multiplicative model involves dividing the observed values by the seasonal component to obtain the deseasonalized values.

We can use the same seasonal_decompose function from statsmodels to obtain the seasonal component for a multiplicative model.

The only difference is that we need to set the model parameter to 'multiplicative'.

decomposition = seasonal_decompose(data['TEMP'], model='multiplicative', period=12)

To deseasonalize our data, we just need to divide the original data by the seasonal component.

deseason_temp = data['TEMP'] / decomposition.seasonal

Deseasonalized Monthly Temperature in Sao Paulo, Brazil, Multiplicative Model

Again, we can see that the deseasonalized data has a clearer trend, without the influence of the seasonal component.

How To Choose Between An Additive Model And A Multiplicative Model?

In general, an additive model is more appropriate when the seasonal pattern in the data is consistent over time, regardless of whether the overall trend of the data is increasing or decreasing.

Let’s say we have a time series of monthly sales for a retail store.

The data shows that sales have a consistent increase each year during the holiday season (November and December).

This increase in sales is roughly the same amount every year, regardless of the overall level of sales.

In this case, we can use an additive model to deseasonalize the data.

On the other hand, a multiplicative model is more appropriate when the seasonal fluctuations are proportional to the trend of the time series.

Let’s say we have a time series of monthly electricity usage for a household.

The data shows that usage increases during the summer months and decreases during the winter.

However, the increase in usage during the summer is larger when the overall electricity usage is higher (e.g. due to more people living in the household or using more appliances).

In this case, we can use a multiplicative model to deseasonalize the data since the amplitude of the seasonal component changes with the level of the trend.
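One rough way to check whether the amplitude changes with the level is to compare, year by year, the average value with the gap between the highest and lowest month. This is only a heuristic sketch on made-up data, not a formal test, and the names usage, yearly, and swing are mine.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly usage whose seasonal swing grows with its level.
index = pd.date_range("2010-01-01", periods=96, freq="MS")
t = np.arange(96)
usage = pd.Series(
    (100 + 2.0 * t) * (1 + 0.2 * np.sin(2 * np.pi * t / 12)),
    index=index,
)

# Per-year level (mean) and seasonal swing (max minus min).
yearly = usage.groupby(usage.index.year).agg(["mean", "min", "max"])
yearly["swing"] = yearly["max"] - yearly["min"]

# Correlation near 1: the swing scales with the level (multiplicative-like).
# Correlation near 0: the swing is roughly constant (additive-like).
corr = yearly["mean"].corr(yearly["swing"])
```

Treat the result as a hint to combine with a plot and domain knowledge, especially when you only have a few years of data.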

Plot your data and use your domain knowledge (or ask domain experts) to decide which model is more appropriate.

And if you want to take the practical approach, you can try both models and see which one gives you better results according to your business objective.

If you are forecasting, for example, you can compare the forecasted values after deseasonalizing with both models and see which one gives you the best results.
