In this blog post, I will walk you through a complete example of how to use Prophet for multiple time series forecasting.

Prophet, developed by Facebook (Meta), is an alternative to popular univariate time series models like ARIMA, and it is claimed to work better for business use cases.

I will cover everything from installing Prophet to saving a trained model, and along the way I will explain each step of the process in detail.

Let’s dive in!

## Installing Prophet

Installing Prophet is very simple.

Do it via pip:

```
python -m pip install prophet
```

Or via conda:

```
conda install -c conda-forge prophet
```

## Preparing the Data for Prophet

We will use real sales data from the Favorita store chain, from Ecuador.

We have sales data from 2013 to 2017 for multiple stores and product categories.

To measure the model’s performance, we will use WMAPE (Weighted Mean Absolute Percentage Error) with the absolutes of the actual values as weights.

```
import pandas as pd
import numpy as np

def wmape(y_true, y_pred):
    # Sum of absolute errors divided by the sum of absolute actuals
    return np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()
```

This is an adapted version of MAPE (Mean Absolute Percentage Error) that solves the problem of division by zero when there are no sales for a specific day.
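
As a quick sanity check of the metric, here is a toy example (the numbers are made up, not from the dataset). Note that a zero actual value is handled without any division-by-zero issue:

```
# Total absolute error = 2 + 1 + 1 = 4; total absolute actuals = 15
y_true = np.array([10, 0, 5])
y_pred = np.array([12, 1, 4])
print(wmape(y_true, y_pred))  # 4/15, roughly 0.267
```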

For this tutorial I will use only the data from one store.

You can use as many stores, categories, SKUs, etc. as you want.

```
import os

data = pd.read_csv(os.path.join(path, 'train.csv'), index_col='id', parse_dates=['date'])

# Keep only store number 1 and the columns we need
data2 = data.loc[data['store_nbr'] == 1, ['date', 'family', 'sales', 'onpromotion']]
```

The columns are:

• `date`: date of the record
• `family`: product category
• `sales`: sales amount
• `onpromotion`: how many products of that category were on promotion on that day

This data doesn’t contain a record for December 25, so to test Prophet’s ability to handle missing values, I inserted rows with NaN values for that day.

```
# Add an explicit NaN record for December 25 of each year, for every product category
dec25 = list()
for year in range(2013, 2017):
    for family in data2['family'].unique():
        dec25 += [{'date': pd.Timestamp(f'{year}-12-25'), 'family': family,
                   'sales': np.nan, 'onpromotion': np.nan}]
data2 = pd.concat([data2, pd.DataFrame(dec25)], ignore_index=True).sort_values('date')

# Prophet expects the timestamp column to be named 'ds' and the target 'y'
data2 = data2.rename(columns={'date': 'ds', 'sales': 'y'})
```

This is what the first 5 rows of `data2` look like:

| ds | family | y | onpromotion |
| --- | --- | --- | --- |
| 2013-01-01 00:00:00 | AUTOMOTIVE | 0 | 0 |
| 2013-01-01 00:00:00 | SEAFOOD | 0 | 0 |
| 2013-01-01 00:00:00 | SCHOOL AND OFFICE SUPPLIES | 0 | 0 |
| 2013-01-01 00:00:00 | PRODUCE | 0 | 0 |
| 2013-01-01 00:00:00 | PREPARED FOODS | 0 | 0 |

One row per day per product category.

Prophet requires a minimum of two columns: the timestamp named `ds` and the value to forecast named `y`.

We have data from holidays in Ecuador and Prophet can use them to capture specific patterns around those days.

```
holidays_df = pd.read_csv(os.path.join(path, 'holidays_events.csv'))

# Keep only national holidays, since we don't know each store's city
holidays_df = holidays_df.loc[holidays_df['locale'] == 'National', ['date', 'description']]
holidays_df['lower_window'] = 0
holidays_df['upper_window'] = 0
holidays_df = holidays_df.rename(columns={'date': 'ds', 'description': 'holiday'})
```

This is what the first 5 rows of `holidays_df` look like:

| ds | holiday | lower_window | upper_window |
| --- | --- | --- | --- |
| 2012-08-10 | Primer Grito de Independencia | 0 | 0 |
| 2012-10-09 | Independencia de Guayaquil | 0 | 0 |
| 2012-10-12 | Traslado Independencia de Guayaquil | 0 | 0 |
| 2012-11-02 | Dia de Difuntos | 0 | 0 |
| 2012-11-03 | Independencia de Cuenca | 0 | 0 |

Prophet requires a column named `ds` with the date of the holiday and a column `holiday` with the name of the holiday.

The columns `lower_window` and `upper_window` specify how many days before and after the holiday to consider as influenced by the holiday in the model.

This is a useful feature to consider in any time series model.

Take, for example, the days before Christmas.

People tend to buy more gifts as the date approaches, so it’s important to capture that pattern in the model.

In Brazil, in the days before Carnival, people buy costumes and other items to celebrate the holiday. During it, people tend to buy more food and drinks.

As I don’t have the specific city of the stores, I will use only the national holidays.
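
As a sketch of how those windows could be used (an assumption for illustration, not something done in this tutorial): if we wanted the two days before Christmas and the day after it to share the holiday effect, we could widen the window for that holiday.

```
# Assumption: the holidays file has a Christmas entry whose description mentions 'Navidad'
christmas = holidays_df['holiday'].str.contains('Navidad', case=False, na=False)
holidays_df.loc[christmas, 'lower_window'] = -2   # include the 2 days before the holiday
holidays_df.loc[christmas, 'upper_window'] = 1    # include the day after the holiday
```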

Finally, we do a simple time series split into train and validation sets.

```
train = data2.loc[data2['ds'] < '2017-01-01']
valid = data2.loc[(data2['ds'] >= '2017-01-01') & (data2['ds'] < '2017-04-01')]
```

## Training Prophet

We need to train a Prophet model for each time series we want to forecast.

Let’s learn to do it with a simple loop first and then we will parallelize it.

```
from prophet import Prophet

p = list()
for family in train['family'].unique():
    print('family:', family)
    train_ = train.loc[train['family'] == family]
    valid_ = valid.loc[valid['family'] == family]

    m = Prophet(yearly_seasonality=True,
                weekly_seasonality=True, daily_seasonality=False,
                holidays=holidays_df)
    m.add_regressor('onpromotion')
    m.fit(train_)

    future = m.make_future_dataframe(periods=90, include_history=False)
    future = future.merge(valid_[['ds', 'onpromotion']], on='ds', how='left')
    forecast = m.predict(future)
    forecast['family'] = family
    p.append(forecast[['ds', 'yhat', 'family']])

p = pd.concat(p, ignore_index=True)
p['yhat'] = p['yhat'].clip(lower=0)
p = p.merge(valid, on=['ds', 'family'], how='left')
wmape(p['y'], p['yhat'])
```

We start by importing Prophet and creating a list `p` to store the predictions.

Then we have a loop that iterates over the unique values of the `family` column in the train dataset to create a separate forecast for each series.

Next, the code creates two new datasets, `train_` and `valid_`, that contain only the rows of the original training and validation datasets that correspond to the current family being forecasted.

The next few lines create a new `Prophet` object called `m` and set a few parameters.

`seasonality_mode` is set to `additive` by default, but it could be set to `multiplicative`. This parameter controls how the seasonal components are modeled.
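
As a quick sketch (not the model used in this tutorial), switching to multiplicative seasonality only requires changing the constructor argument. This tends to fit better when the seasonal swings grow with the level of the series:

```
# Same constructor as above, but with multiplicative seasonal components
m_mult = Prophet(seasonality_mode='multiplicative',
                 yearly_seasonality=True, weekly_seasonality=True,
                 daily_seasonality=False, holidays=holidays_df)
```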

`yearly_seasonality` corresponds to seasonality across the entire year. It can be used to capture back-to-school seasonality, summer vacation seasonality, and so on.

`weekly_seasonality` models patterns that repeat across the week. It can be used to capture patterns like weekends being busier than weekdays.

`daily_seasonality` models patterns that repeat across the day. It can be used to capture patterns like lunchtime being busier than dinnertime.

The `holidays` parameter is set to `holidays_df`, which is our dataframe with information about the holidays in Ecuador.

Next, we use the function `add_regressor` to add the column `onpromotion` to the model. You can use this function to add as many external variables as you want to the model.
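
For instance, a sketch with extra hypothetical regressors (these columns are not in the Favorita data; every regressor you add must also exist in the `future` dataframe at prediction time):

```
m.add_regressor('onpromotion')
m.add_regressor('oil_price')     # hypothetical external variable
m.add_regressor('is_payday')     # hypothetical external variable
```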

The function `fit` trains the model using the training data.

Next, the code creates a new dataframe called `future` that contains the dates for the next 90 days after the end of the `train_` dataset.

I set the `include_history` parameter to `False` because I don’t want to include the dates from the training data in the `future` dataset.

This is what the first 5 rows of `future` look like at this point:

| ds |
| --- |
| 2017-01-01 |
| 2017-01-02 |
| 2017-01-03 |
| 2017-01-04 |
| 2017-01-05 |

The next line adds the column `onpromotion` to the `future` dataset. This is necessary for the model to use the information about the promotions in the forecast.

If we had external variables that we could not plan in advance (like temperature or commodity prices), we would need to use estimates.
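
For example, a minimal sketch assuming a hypothetical `temperature` regressor (this column is not part of this dataset), where we carry the last observed training value forward as a naive estimate:

```
# Hypothetical column, for illustration only: we can't know future temperature,
# so we reuse the last observed value from the training data as an estimate.
future['temperature'] = train_['temperature'].iloc[-1]
```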

This is what the first 5 rows of `future` look like now:

| ds | onpromotion |
| --- | --- |
| 2017-01-01 00:00:00 | 0 |
| 2017-01-02 00:00:00 | 0 |
| 2017-01-03 00:00:00 | 0 |
| 2017-01-04 00:00:00 | 0 |
| 2017-01-05 00:00:00 | 0 |

The hours in the timestamps are not important for this model; Pandas just added them automatically.

## Forecasting With Prophet

We use the function `predict` to make predictions for the dates in `future`.

The `forecast` variable is the dataframe with the predicted values for each date in the `future` dataframe.

This is the input dataframe for the `predict` function:

| ds | onpromotion |
| --- | --- |
| 2017-01-01 00:00:00 | 0 |
| 2017-01-02 00:00:00 | 0 |
| 2017-01-03 00:00:00 | 0 |
| 2017-01-04 00:00:00 | 0 |
| 2017-01-05 00:00:00 | 0 |

The prediction dataframe returned by the function contains many columns, but we will focus on the following:

| ds | yhat | yhat_lower | yhat_upper |
| --- | --- | --- | --- |
| 2017-01-01 00:00:00 | -0.398921 | -3.12561 | 2.52133 |
| 2017-01-02 00:00:00 | 4.8447 | 1.82306 | 7.55953 |
| 2017-01-03 00:00:00 | 5.70195 | 2.66954 | 8.71459 |
| 2017-01-04 00:00:00 | 5.14328 | 2.05284 | 8.22002 |
| 2017-01-05 00:00:00 | 4.60457 | 1.48575 | 7.53965 |

• `yhat` is the point forecast: the most likely value according to the model.
• `yhat_lower` is the lower bound of the prediction interval.
• `yhat_upper` is the upper bound of the prediction interval.
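
By default, Prophet computes an 80% uncertainty interval. If you want wider or narrower bounds, you can set `interval_width` when creating the model (a small sketch, not used in this tutorial):

```
# Widen the uncertainty interval from the default 0.80 to 0.95
m_wide = Prophet(interval_width=0.95, holidays=holidays_df)
```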

Next, the code adds a new column to the `forecast` dataframe called `family` and sets its value to the name of the family that was just forecasted.

This step is important because the final `p` dataframe will contain predictions for multiple families, and we need to be able to distinguish between them.

The `forecast` dataframe is then appended to the `p` list.

This process is repeated for each family (series) in the dataset.

After all of the series have been forecasted, we use the `pd.concat` function to concatenate all of the dataframes in the `p` list into a single dataframe.

I clipped the values in the `yhat` column to be greater than or equal to 0 because the `yhat` column contains negative values, which don’t make sense for this dataset.

Sometimes you do see negative values in the target variable of demand forecasting problems, when more items were returned than sold on a given day.

If negative values make sense for your problem, you can skip this step.

Finally, we merge the `p` dataframe with the `valid` dataframe to calculate the WMAPE metric.

WMAPE was 17.19% for the validation set.

In practice, we compare this value with the WMAPE of a baseline model to see if the model is worth using.
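
Here is a minimal sketch of such a comparison, assuming a naive baseline that simply predicts each family's mean sales from the training period (this baseline is not part of the original tutorial):

```
# Naive baseline: predict the training-period mean sales of each family
family_means = train.groupby('family', as_index=False)['y'].mean()
family_means = family_means.rename(columns={'y': 'yhat_naive'})
baseline = valid.merge(family_means, on='family', how='left')
print(wmape(baseline['y'], baseline['yhat_naive']))
```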

These are the first 5 rows of the `p` dataframe:

| ds | yhat | family | y | onpromotion |
| --- | --- | --- | --- | --- |
| 2017-01-01 00:00:00 | 0 | AUTOMOTIVE | 0 | 0 |
| 2017-01-01 00:00:00 | 0 | DAIRY | 0 | 0 |
| 2017-01-01 00:00:00 | 0 | SEAFOOD | 0 | 0 |
| 2017-01-01 00:00:00 | 28.1703 | HOME CARE | 0 | 0 |
| 2017-01-01 00:00:00 | 0 | BEAUTY | 0 | 0 |

I used the following code to plot the forecast for a few families:

```
import matplotlib.pyplot as plt

# One subplot per family, showing actual sales and the forecast
fig, ax = plt.subplots(3, 2, figsize=(1280/96, 720/96), dpi=96)
ax = ax.flatten()
for ax_, family in enumerate(p['family'].unique()[:6]):
    p_ = p.loc[p['family'] == family]
    p_.plot(x='ds', y='y', ax=ax[ax_], label='Sales')
    p_.plot(x='ds', y='yhat', ax=ax[ax_], label='Forecast')
    ax[ax_].set_title(family)
    ax[ax_].legend()
    ax[ax_].set_xlabel('Date')
    ax[ax_].set_ylabel('Sales')
fig.tight_layout()
plt.show()
```

## Parallelizing the Training And Prediction

We can make the training and prediction process MUCH faster by parallelizing it.

My favorite ways of doing it are using the `joblib` or `dask` libraries.

Here I will show you how to do it with `joblib`.

First we need to take the code from the loop and put it in a function.

```
from prophet import Prophet
from joblib import Parallel, delayed

def run_one(train_, valid_, family):
    print('family:', family)
    m = Prophet(yearly_seasonality=True,
                weekly_seasonality=True, daily_seasonality=False,
                holidays=holidays_df)
    m.add_regressor('onpromotion')
    m.fit(train_)

    future = m.make_future_dataframe(periods=90, include_history=False)
    future = future.merge(valid_[['ds', 'onpromotion']], on='ds', how='left')
    forecast = m.predict(future)
    forecast['family'] = family
    return forecast[['ds', 'yhat', 'family']]
```

This function takes a training and validation set for a single family and returns a dataframe with the forecast. The third argument is the name of the family.

We could subset the `train` and `valid` dataframes inside the function, but doing it beforehand sometimes helps us avoid extra copies of the data in memory.

Next, we need to create a list of jobs to run in parallel.

```
jobs = list()
for family in train['family'].unique():
    train_ = train.loc[train['family'] == family]
    valid_ = valid.loc[valid['family'] == family]
    jobs.append(delayed(run_one)(train_, valid_, family))
```

First, we prepare the training and validation sets for each family.

Then we use the `delayed` function from the `joblib` library to create a job for each family.

We pass the `run_one` function as the argument and then pass the function arguments to a second set of parentheses.

This will not run the function, it will just create the jobs that we will run in parallel.

```
p = Parallel(n_jobs=4)(jobs)
p = pd.concat(p, ignore_index=True)
p['yhat'] = p['yhat'].clip(lower=0)
p = p.merge(valid, on=['ds', 'family'], how='left').sort_values('ds')
wmape(p['y'], p['yhat'])
```

We then create a `Parallel` object from the `joblib` library, passing the number of jobs we want to run in parallel as the `n_jobs` argument. Usually the number of jobs should be equal to the number of cores on your machine.

In the second set of parentheses, we pass the list of jobs that we created earlier.
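
If you don't want to hard-code the number of cores, a small convenience (not in the original code) is to ask the machine:

```
import os

# Use one job per available CPU core (joblib also accepts n_jobs=-1 for all cores)
p = Parallel(n_jobs=os.cpu_count())(jobs)
```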

Finally, with the `p` list of dataframes, we proceed with the same steps as before.

## Saving a Prophet Model

If you want to save a Prophet model for later use, you can use the functions from the `prophet.serialize` module inside the function we created.

The models are saved as JSON files.

```
from prophet.serialize import model_to_json, model_from_json

...
m.fit(train_)

# Save the fitted model for this family as a JSON file
with open(f'prophet_{family}.json', 'w') as f:
    f.write(model_to_json(m))

future = m.make_future_dataframe(periods=90, include_history=False)
...
```

Then, to load the model, you can use the following code:

```
with open(f'prophet_{family}.json', 'r') as f:
    m = model_from_json(f.read())
```