Multiple Time Series Forecasting With XGBoost In Python

Forecasting multiple time series can be a daunting task, especially when dealing with large amounts of data. However, XGBoost is a powerful algorithm that has been shown to perform exceptionally well in time series forecasting tasks. In combination with MLForecast, which is a scalable and easy-to-use time series forecasting library, we can make the process of training an XGBoost model for multiple time series forecasting a breeze. Let’s dive into the step-by-step process of preparing our data, defining our XGBoost model, and training it using MLForecast in Python....

February 28, 2023 · 10 min · Mario Filho

Multivariate Time Series Forecasting in Python

In this article, we’ll explore how to use scikit-learn with mlforecast to train multivariate time series models in Python. Instead of wasting time and making mistakes in manual data preparation, let’s use the mlforecast library. It has tools that transform our raw time series data into the correct format for training and prediction with scikit-learn. It computes the main features we want when modeling time series, such as aggregations over sliding windows, lags, differences, etc....

February 25, 2023 · 11 min · Mario Filho
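
The multivariate forecasting excerpt above mentions lags, differences, and aggregations over sliding windows. As a rough illustration of what such features look like, here is a minimal pure-Python sketch. The function name `make_lag_features` and its parameters are hypothetical and are not part of the mlforecast API; the library computes these features for you.

```python
def make_lag_features(series, lags=(1, 2), window=3):
    """Build one training row per time step from past values only.

    Hypothetical helper for illustration; not the mlforecast API.
    """
    rows = []
    # Skip the first steps, where not enough history exists for every feature.
    start = max(max(lags), window, 2)
    for t in range(start, len(series)):
        row = {f"lag{k}": series[t - k] for k in lags}
        past = series[t - window:t]  # the `window` values strictly before t
        row[f"rolling_mean_{window}"] = sum(past) / window
        row["diff1"] = series[t - 1] - series[t - 2]  # most recent first difference
        row["y"] = series[t]  # target to predict
        rows.append(row)
    return rows


series = [10, 12, 14, 13, 15, 18]
rows = make_lag_features(series)
for row in rows:
    print(row)
```

Every feature uses only values before time `t`, so the target never leaks into its own inputs.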

Does XGBoost Need Feature Scaling Or Normalization?

If you are using XGBoost with decision trees as your base model, you don’t need to worry about scaling or normalizing your features. Decision trees split on thresholds, so they are not sensitive to the scale of the features. I have seen very minor differences in score after scaling features for decision trees, but these come from numerical implementation details and are not significant in practice. If you are using XGBoost with linear models as base models, it is a good idea to scale or normalize the features....

December 30, 2022 · 7 min · Mario Filho
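
To see why tree splits do not depend on feature scale, here is a small sketch, assuming a single-feature regression stump that picks the split with the lowest squared error. The helper `best_split` is hypothetical and written from scratch; it is not XGBoost's implementation, which uses a different split criterion and optimizations.

```python
def best_split(xs, ys):
    """Return the (left, right) index sets of the best single split on xs.

    Hypothetical stump for illustration; minimizes the sum of squared errors.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best_sse, best_cut = float("inf"), None
    for cut in range(1, len(order)):
        # A threshold cannot separate two equal feature values.
        if xs[order[cut - 1]] == xs[order[cut]]:
            continue
        sse = 0.0
        for side in (order[:cut], order[cut:]):
            mean = sum(ys[i] for i in side) / len(side)
            sse += sum((ys[i] - mean) ** 2 for i in side)
        if sse < best_sse:
            best_sse, best_cut = sse, cut
    if best_cut is None:  # all feature values identical: no split possible
        return set(order), set()
    return set(order[:best_cut]), set(order[best_cut:])


xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9]
scaled_xs = [x * 1000 + 5 for x in xs]  # arbitrary rescaling and shift

original_partition = best_split(xs, ys)
scaled_partition = best_split(scaled_xs, ys)
print(original_partition == scaled_partition)
```

The split only depends on the ordering of the feature values, and an increasing transformation such as scaling or shifting keeps that ordering, so both calls group the rows the same way.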

Can Gradient Boosting Learn Simple Arithmetic?

During a technical meeting a few weeks ago, we had a discussion about feature interactions, and how far we have to go with them so that we can capture possible relationships with our targets. Should we create (and select) arithmetic interactions between our features? A few years ago I remember visiting a website that showed how different models approximated these simple operations. It went from linear models to a complex Random Forest....

January 20, 2020 · 4 min · Mario Filho