Multiple Time Series Forecasting With XGBoost In Python

Forecasting multiple time series can be a daunting task, especially when dealing with large amounts of data. However, XGBoost is a powerful gradient boosting algorithm that has been shown to perform exceptionally well in time series forecasting tasks. In combination with MLForecast, which is a scalable and easy-to-use time series forecasting library, we can make the process of training an XGBoost model for multiple time series forecasting a breeze. Let’s dive into the step-by-step process of preparing our data, defining our XGBoost model, and training it using MLForecast in Python....

February 28, 2023 · 14 min · Mario Filho

Multivariate Time Series Forecasting in Python

In this article, we’ll explore how to use scikit-learn with mlforecast to train multivariate time series models in Python. Instead of wasting time and making mistakes in manual data preparation, let’s use the mlforecast library. It has tools that transform our raw time series data into the correct format for training and prediction with scikit-learn. It computes the main features we want when modeling time series, such as aggregations over sliding windows, lags, differences, etc....

February 25, 2023 · 11 min · Mario Filho

Multiple Time Series Forecasting With Holt-Winters In Python

In today’s article, we’re going to explore the ins and outs of training a Holt-Winters model for forecasting multiple time series in Python. Holt-Winters is a very popular forecasting algorithm that can capture seasonality and trends in time series data through exponential smoothing. I’ll use StatsForecast, a scalable and easy-to-use Python library that can help you train a Holt-Winters model quickly and efficiently. You don’t need to be a programming wizard to get started with this library, and it can save you hours of coding time....

May 26, 2023 · 9 min · Mario Filho

Multiple Time Series Forecasting with DeepAR in Python

In this post, we will learn how to use DeepAR to forecast multiple time series using GluonTS in Python. DeepAR is a deep learning algorithm based on recurrent neural networks designed specifically for time series forecasting. It works by learning a model based on all the time series data, instead of creating a separate model for each one. In my experience, this often works better than creating a separate model for each time series....

February 23, 2023 · 11 min · Mario Filho

Multiple Time Series Forecasting with Prophet in Python

In this blog post, I will walk you through a complete example of how to use Prophet for multiple time series forecasting. Prophet, developed by Facebook (Meta), is an alternative to popular univariate time series models like ARIMA and is claimed to be better suited to business use cases. I will teach everything from installing Prophet to saving a trained model, and along the way, I will explain each step of the process in detail....

February 20, 2023 · 10 min · Mario Filho

Volatility Forecasting In Python

In this blog post, we will explore how we can use Python to forecast volatility using three methods: a naive baseline, the popular GARCH model, and machine learning with scikit-learn. Volatility here is the standard deviation of the returns of a financial instrument. I will give you starting points to kickstart your own research. Installing ARCH and mlforecast: first we need to install the required packages. You can do it with pip (pip install arch, pip install mlforecast) or with conda:...

February 18, 2023 · 9 min · Mario Filho
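
As a rough illustration of the definition in the volatility entry above (volatility as the standard deviation of returns), here is a minimal sketch using only the Python standard library. The prices are made up, the use of log returns is an assumption, and the "naive" forecast shown (reuse the volatility of the most recent window) is one common baseline, not necessarily the one used in the post.

```python
import math
import statistics


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def realized_volatility(returns):
    """Sample standard deviation of a list of returns."""
    return statistics.stdev(returns)


def naive_volatility_forecast(returns, window):
    """Forecast next-period volatility as the volatility of the last `window` returns."""
    if window < 2 or len(returns) < window:
        raise ValueError("window must be >= 2 and no longer than the return history")
    return realized_volatility(returns[-window:])


# Made-up closing prices, for illustration only.
prices = [100.0, 101.0, 99.5, 102.0, 101.5, 103.0, 102.0]
returns = log_returns(prices)
forecast = naive_volatility_forecast(returns, window=5)
print(f"naive volatility forecast: {forecast:.4f}")
```

A baseline like this gives GARCH or a machine learning model something concrete to beat.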

Bayesian Time Series Forecasting in Python with Orbit

In this article, you will learn how to use Orbit, a Python library for Bayesian time series forecasting. Orbit is a very straightforward library developed at Uber that offers an interface to train Bayesian exponential smoothing models implemented via the probabilistic programming languages Stan and Pyro. This is a practical guide: the goal here is not to go into the math behind the models, but rather to show how you can use Orbit in practice to forecast time series data using Bayesian models....

February 16, 2023 · 11 min · Mario Filho

Intermittent Time Series Forecasting in Python

In this post, you will learn how to easily forecast intermittent time series data using the StatsForecast library in Python. Intermittent time series data is unique in the world of forecasting because it often includes missing data, irregular values, or a large number of zeros. Intermittent series usually start appearing when you go down the hierarchy levels of a time series. Traditional forecasting methods can struggle with these types of data, but after reading this post, you’ll have all the tools you need to tackle even the most complex intermittent time series....

February 14, 2023 · 14 min · Mario Filho

Multiple Time Series Forecasting with Temporal Convolutional Networks (TCN) in Python

In this article you will learn an easy, fast, step-by-step way to use Convolutional Neural Networks for multiple time series forecasting in Python. We will use the NeuralForecast library, which implements the Temporal Convolutional Network (TCN) architecture. Temporal Convolutional Network (TCN): this architecture is a variant of the Convolutional Neural Network (CNN) architecture that is specially designed for time series forecasting. It was first presented as WaveNet (source: WaveNet: A Generative Model for Raw Audio)...

February 12, 2023 · 13 min · Mario Filho

Naive Time Series Forecasting in Python

What Is Naive Forecasting? Whenever you start a time series forecasting project, you should start with a naive model. A naive model is a very simple rule that you use to generate predictions for the future. It’s easy to implement and it gives you a baseline to compare your more complex models against. Here you will learn how to use the StatsForecast library, which provides the most popular naive models for time series forecasting in Python....

February 11, 2023 · 7 min · Mario Filho
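
To make the idea in the naive forecasting entry above concrete, here is a minimal sketch of two simple rules in plain Python: repeat the last value, and repeat the last season. The function names and the sample series are hypothetical and written for this illustration; they are not the StatsForecast API.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    if not history:
        raise ValueError("history must not be empty")
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last full season, cycling through it as needed."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


# Made-up series with a period of 4, for illustration only.
sales = [10, 20, 30, 40, 12, 22, 32, 42]

print(naive_forecast(sales, 3))              # [42, 42, 42]
print(seasonal_naive_forecast(sales, 6, 4))  # [12, 22, 32, 42, 12, 22]
```

Either rule works as the baseline that a more complex model has to beat.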

Multiple Time Series Forecasting with ARIMA in Python

ARIMA is one of the most popular univariate statistical models used for time series forecasting. Here you will learn how to use the StatsForecast library, which provides a fast, scalable and easy-to-use interface for us to train ARIMA models in Python. To understand ARIMA, let’s take an example of sales forecasting. Suppose a retail store has historical sales data for the past 12 months. To make a sales forecast for the next 3 months, we can fit an ARIMA model to this data....

February 10, 2023 · 7 min · Mario Filho

Multiple Time Series Forecasting With LSTM In Python

You’ve probably heard about LSTMs, and might be curious about how they can help you with multiple time series forecasting. As machine learning practitioners, we come across various forecasting tasks, and choosing the right model can sometimes be a challenge. LSTMs have gained attention for their ability to handle long-term dependencies in sequential data, making them a promising choice for time series problems. By the end of this tutorial, you’ll have a deeper understanding of LSTMs and be better prepared to use them effectively for multiple time series forecasting projects....

May 24, 2023 · 13 min · Mario Filho

Multiple Time Series Forecasting With Scikit-learn

Forecasting time series is a very common task in the daily life of a data scientist, yet it gets surprisingly little coverage in beginner machine learning courses. It can be predicting future demand for a product, city traffic or even the weather. With accurate time series forecasts, companies can adjust their production strategies, inventory management, resource allocation and other key decisions, leading to significant cost reduction and increased efficiency. Furthermore, forecasts also allow companies to be more proactive rather than reactive, anticipating market trends and adjusting their strategies accordingly....

February 8, 2023 · 15 min · Mario Filho

Does XGBoost Need Feature Scaling Or Normalization?

If you are using XGBoost with decision trees as your base model, you don’t need to worry about scaling or normalizing your features. Decision trees are not sensitive to the scale of the features. In practice, I have seen very minor differences in score from scaling features for decision trees, but these come from numerical computing implementations and are not significant. If you are using XGBoost with linear models as base models, it is a good idea to scale or normalize the features....

December 30, 2022 · 7 min · Mario Filho
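
The claim in the XGBoost scaling entry above can be checked on a toy case. The sketch below is a hand-written, single-split regression stump in plain Python, not XGBoost itself. It assumes distinct feature values and made-up data, and it shows that the best split separates the same rows before and after the feature is rescaled, because the search depends only on the ordering of the feature.

```python
def best_split_index(feature, target):
    """Return how many of the smallest-feature rows go left in the best split.

    The best split minimizes the total squared error of predicting each
    side by its mean. Only the ordering of `feature` is used.
    """
    order = sorted(range(len(feature)), key=lambda i: feature[i])
    sorted_target = [target[i] for i in order]

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_k, best_error = None, float("inf")
    for k in range(1, len(sorted_target)):
        error = sse(sorted_target[:k]) + sse(sorted_target[k:])
        if error < best_error:
            best_k, best_error = k, error
    return best_k


# Made-up data, for illustration only.
feature = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
target = [5.0, 6.0, 5.5, 20.0, 21.0, 19.5]
scaled_feature = [x * 1000.0 for x in feature]  # same ordering, different scale

print(best_split_index(feature, target), best_split_index(scaled_feature, target))
```

Any strictly increasing transformation of the feature preserves the ordering, so the same argument covers standardization and min-max scaling.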

Adstock in Marketing Mix Modeling

What Is Adstock in Marketing Mix Modeling? Adstock in marketing mix modeling is a way to take into account that the impact of an advertisement on consumer behavior may not be immediate, but rather may build over time. In other words, we need to adjust our model to the fact that changes in how people feel about a product or brand after seeing an ad can last even after the ad is no longer being shown....

May 19, 2023 · 10 min · Mario Filho
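
The carryover effect described in the adstock entry above is often modeled with a geometric decay. The sketch below assumes that form, which the excerpt itself does not specify; the function name, the `decay` parameter, and the spend figures are illustrative.

```python
def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of the previous period's adstock into the current one."""
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    adstocked = []
    carryover = 0.0
    for value in spend:
        carryover = value + decay * carryover
        adstocked.append(carryover)
    return adstocked


# Made-up weekly ad spend: one burst, then nothing.
spend = [100.0, 0.0, 0.0, 0.0]
print(geometric_adstock(spend, decay=0.5))  # [100.0, 50.0, 25.0, 12.5]
```

With `decay=0.5`, half of each week's accumulated effect is still present the following week, even though no new ad is running.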

Generating Text With Contrastive Search vs GPT-3/ChatGPT

I got very excited about a new technique in natural language processing (NLP) called Contrastive Search. It is a powerful tool that improves the quality of text generated by smaller language models. In this blog post, I will compare the outputs of contrastive search in a few open-source models with GPT-3. Finally, I will talk about the implications of this technique and how it could be used in the future. So, let’s get started!...

December 5, 2022 · 13 min · Mario Filho

How To Do Time Series Cross-Validation In Python

One can’t simply use a random train-test split when building a machine learning model for time series. Doing so would not only allow the model to learn from data in the future but also show you an overoptimistic (and wrong) performance evaluation. In real-life projects, you always have a time component to deal with. Changes can happen in nanoseconds or centuries, but they happen and you are interested in predicting what will come next....

July 12, 2023 · 5 min · Mario Filho
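
The point of the cross-validation entry above, that the model must never train on data from the future, can be sketched with an expanding-window splitter in plain Python. The function name and parameters are hypothetical and written for this illustration; scikit-learn's `TimeSeriesSplit` offers a similar scheme.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each test block comes strictly after its training block, and the
    training block grows from one split to the next.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for split in range(n_splits):
        test_start = first_test_start + split * test_size
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# 10 time-ordered observations, 3 splits, 2 test points per split.
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Every training index is smaller than every test index in the same split, which is the property a random train-test split breaks.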

Implementing Uber's Marketing Mix Model With Orbit

There’s a very interesting marketing mix modeling approach published by Uber’s data science team that uses coefficients that vary over time to estimate a media channel’s effects. The modeling approach is called Bayesian Time-Varying Coefficients (BTVC) and it’s available on Orbit, their forecasting package, as Kernel Time-Varying Regression. Instead of getting a single coefficient to understand each media effect, we can see how the effect varied through time with confidence intervals....

July 5, 2022 · 9 min · Mario Filho

How To Create A Marketing Mix Model With LightweightMMM (Python)

The future of advertising attribution is modeled, predicted, estimated, or whatever other word you want. One of the coolest tools (although still under early development) we have to model the impact of advertising campaigns on revenue is LightweightMMM, an implementation of Bayesian marketing mix models developed by Google. I talked about using this tool with my course sales data and really liked the results. I was surprised by the number of people who are trying to solve the same problem in their companies....

May 19, 2023 · 11 min · Mario Filho