Predicting Customer Traffic with Machine Learning

How to predict customer traffic to better plan teams and inventory: a pragmatic approach with XGBoost, weather, and calendar data, applicable to restaurants, retail, museums, and leisure venues.

The Problem

How many visitors tomorrow? The question comes up every day for shops, restaurants, museums, gyms, and leisure spaces. Plan too much staff and you lose margin; plan too little and service suffers.

We worked on this subject in the restaurant industry, where the problem is particularly acute. On a 150-cover service, a 15-cover error goes unnoticed. On a 25-cover service, the same error changes everything. The examples in this article come from this project, but the approach applies to any establishment whose attendance varies.

In this article, we share our approach: understanding the business with the teams, choosing the right prediction method, preparing the data, and deploying a model that works day-to-day.

What Influences Attendance

An establishment's attendance depends on factors we know well:

  • Weather. A sunny Saturday has nothing to do with a rainy Saturday.
  • Calendar. Day of the week, school holidays, public holidays.
  • Events. A concert or match nearby can change everything.
  • History. Patterns repeat from one year to the next.

These factors interact with each other. Weather doesn't have the same effect on a Tuesday as on a Saturday. February holidays don't look like July holidays.

Prediction Approaches

Several families of methods exist for forecasting time series. Each has its strengths and limitations.

Classic Statistical Methods

ARIMA and Holt-Winters are the traditional approaches. They model the series from its past values: trend, seasonality, noise. These methods are well documented and work on regular data.

Their limitation: they model the series from its own past only, with no external variables. Extensions like ARIMAX/SARIMAX can add exogenous regressors such as weather, but they keep a linear structure and still miss the interactions that matter here. For attendance, that's a major handicap.

References: Forecasting: Principles and Practice - ARIMA, Holt-Winters

Prophet

Prophet, developed by Meta, popularized time series forecasting. It decomposes the series into trend, seasonality (annual, weekly), and holiday effects. Its simple API and robustness to missing data made it popular.

It's a good starting point, but Prophet remains an additive model: it assumes effects add up. It doesn't capture complex interactions well — for example, that weather affects attendance differently depending on the day of the week.
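A toy calculation (with invented cover counts) shows the limit: when sunshine adds far more covers on a weekend than on a weekday, the four cell means carry a non-zero interaction term that no additive decomposition can reproduce.

```python
import numpy as np

# Toy example: mean covers for the four weekday/weekend x rain/sun cases.
# Sunshine adds +5 covers on a weekday but +40 on a weekend: the two
# factors interact instead of simply adding up.
covers = np.array([[50.0, 55.0],    # weekday: rain, sun
                   [80.0, 120.0]])  # weekend: rain, sun

# A purely additive model predicts base + day_effect + weather_effect.
# For any such model, the interaction term below is exactly zero.
interaction = covers[1, 1] - covers[1, 0] - covers[0, 1] + covers[0, 0]
print(interaction)  # 35.0 -> no additive decomposition fits all four cells
```

A tree-based model, by contrast, can split on day type first and on weather second, fitting each cell independently.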

Reference: Prophet - Official Documentation

Deep Learning

Recurrent neural networks (LSTM) and specialized architectures (N-BEATS, Temporal Fusion Transformer) can capture very complex patterns. These models learn directly from raw data, without manual feature engineering.

The problem: they require a lot of data to generalize well — often several years of high-frequency history. They also require GPU infrastructure and specific skills for training and deployment. For an establishment with 2-3 years of daily history, it's often oversized.

References: N-BEATS (arXiv), Temporal Fusion Transformer (arXiv)

Gradient Boosting

XGBoost and LightGBM build ensembles of decision trees by gradient boosting. They excel on tabular data — exactly what we have here: one row per day, with columns for weather, calendar, and history.

Their strength: they naturally capture interactions between variables without needing to specify them. They easily integrate heterogeneous variables (numerical, categorical, binary). And they're interpretable: you can see which variables carry the most weight in the predictions.

These algorithms dominate academic benchmarks on tabular data, as confirmed by the study "Tabular Data: Deep Learning is Not All You Need" and recent benchmarks (2024) comparing 20 models on over 100 datasets.

References: XGBoost - Documentation, LightGBM - Documentation

Our Choice: XGBoost

For this use case, XGBoost offers the best tradeoff:

  • It integrates external variables (weather, calendar, events)
  • It captures interactions without coding them manually
  • It works well with a few years of history
  • It runs on any server, without GPU
  • It's stable, documented, maintainable

It's not the most sophisticated tool, but it's the one that fits the field constraints.

With the algorithm chosen, the real work begins. A prediction model is only as good as the data and assumptions you feed it. Before preparing the data, you need to understand the business.

Working with Teams

Before talking technology, there's essential work: understanding the business.

Which error costs more? Predicting 80 covers when there are 100, or predicting 120 when there are 100? The answer isn't obvious and depends on context: the cost of idle staff versus the cost of degraded service, whether products are perishable, whether extra staff can be called in at short notice...

These discussions with on-site teams help define how the model should be penalized when it's wrong. If underestimating is more serious than overestimating, we configure the model accordingly. It's a business choice, not a technical one.
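If under-forecasting is judged more costly, XGBoost accepts a custom objective. Below is a minimal sketch of an asymmetric squared error; the 2x penalty factor is an invented illustration of a business choice, and the thin wrapper needed to match xgboost's exact callback signature (native `obj=` versus the sklearn-style estimator) is left out.

```python
import numpy as np

# Illustrative business choice: under-predicting costs twice as much.
UNDER_PENALTY = 2.0

def asymmetric_objective(preds, labels):
    """Return (gradient, hessian) for a weighted squared error,
    as a gradient-boosting custom objective expects."""
    residual = preds - labels                       # negative -> under-predicted
    weight = np.where(residual < 0, UNDER_PENALTY, 1.0)
    grad = 2.0 * weight * residual                  # d/dpred of weight * residual**2
    hess = 2.0 * weight
    return grad, hess
```

With this objective, an under-prediction of 10 covers pulls the next boosting round twice as hard as an over-prediction of the same size.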

Same thing for evaluating prediction quality. A 10-cover error on a 30-cover service isn't the same as a 10-cover error on a 150-cover service. The metrics we track must reflect what really matters for operations.
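To make that concrete, here is a toy comparison (invented numbers) of absolute versus percentage error on those two service sizes:

```python
import numpy as np

# The same 10-cover error, scored two ways. MAE treats both services
# identically; a percentage error reflects that it hurts a 30-cover
# service far more than a 150-cover one.
def mae(actual, predicted):
    return np.mean(np.abs(actual - predicted))

def mape(actual, predicted):
    return np.mean(np.abs(actual - predicted) / actual) * 100

small = (np.array([30.0]), np.array([40.0]))
large = (np.array([150.0]), np.array([160.0]))
print(mae(*small), mae(*large))    # 10.0 vs 10.0 -> same absolute error
print(mape(*small), mape(*large))  # ~33.3% vs ~6.7% -> very different impact
```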

Finally, training data is defined with the field. Which periods are representative? Should we exclude months of construction or an atypical year? Are there recurring local events to integrate? Each establishment has its context: a city center brasserie doesn't have the same dynamics as a commercial zone restaurant. This business knowledge guides data preparation and selection of relevant variables.

Once these business choices are made, we move on to the concrete preparation of the data that will feed the model.

Data Preparation

The algorithm doesn't do everything. What makes the difference is the quality of the data we provide:

  • Attendance history, cleaned and structured
  • Weather data retrieved automatically (via Open-Meteo)
  • Enriched calendar: holidays by zone, public holidays, local events
  • The right temporal variables: day of week, week of year, distance to holidays...
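The temporal variables above can be sketched with pandas; the holiday date below is a placeholder, since in practice holidays come from an official calendar for the establishment's zone.

```python
import pandas as pd

def add_calendar_features(df, holidays):
    """df has a DatetimeIndex; holidays is a DatetimeIndex of public holidays."""
    out = df.copy()
    out["day_of_week"] = out.index.dayofweek          # 0 = Monday
    out["week_of_year"] = out.index.isocalendar().week.astype(int)
    out["is_weekend"] = out.index.dayofweek >= 5
    out["is_holiday"] = out.index.isin(holidays)
    # "Distance to holidays": days to the nearest holiday, past or future.
    out["days_to_holiday"] = [
        min(abs((h - d).days) for h in holidays) for d in out.index
    ]
    return out

dates = pd.date_range("2024-07-10", periods=7, freq="D")
holidays = pd.to_datetime(["2024-07-14"])  # placeholder public holiday
features = add_calendar_features(pd.DataFrame(index=dates), holidays)
```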

The model learns the relationships between these factors and past attendance. Then it applies what it learned to future days.

What You Need to Get Started

  • Attendance history. Ideally 1 to 2 years, to capture seasonal variations.
  • Sufficient granularity. At least daily, by time slot if possible.

Weather and calendar data are retrieved automatically.
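For the weather side, a minimal sketch of an Open-Meteo request for daily history; the coordinates (Lyon) and date range are placeholders, and the endpoint and variable names follow Open-Meteo's public documentation.

```python
from urllib.parse import urlencode

def build_weather_url(latitude, longitude, start_date, end_date):
    """Build a request to Open-Meteo's historical weather API."""
    params = urlencode({
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,
        "end_date": end_date,
        "daily": "temperature_2m_max,precipitation_sum",
        "timezone": "Europe/Paris",
    })
    return f"https://archive-api.open-meteo.com/v1/archive?{params}"

url = build_weather_url(45.76, 4.83, "2024-01-01", "2024-01-31")
# Fetch with e.g. urllib.request.urlopen(url) and parse the JSON "daily" block.
```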

Other Sectors Concerned

The approach described applies wherever attendance fluctuates and affects the resources you need to mobilize:

  • Retail. Anticipate attendance to adjust checkout schedules and restocking.
  • Museums and cultural sites. Manage visitor flows, optimize reception and mediation staff.
  • Gyms. Predict peak attendance to avoid equipment saturation.
  • Events. Estimate participation to size logistics (catering, security, parking).
  • Parks and leisure. Adapt attraction openings and staffing according to expected attendance.

The variables change (weather matters more in some sectors than others, events are specific to each territory), but the principle remains the same: exploit history and context to anticipate better.

Conclusion

Predicting attendance isn't a complex data science problem. It's a common sense problem, powered by the right methods. The pragmatic choice of XGBoost, working with business teams to understand what really matters, well-prepared data: these are the ingredients that make the difference.

A model that works day-to-day is worth more than a sophisticated model that nobody maintains. The goal isn't perfect prediction, but reliable decision support that teams actually adopt.

Want to discuss it? Contact us.

Going Further