AutoML Attainment Forecast Model
Please note: Specific program names, internal tool references, and other potentially sensitive details mentioned in the original project document have been generalized or masked in this summary for confidentiality.
The Problem That Doesn't Sleep
Every week, our S&OP team faces a deceptively simple question: How much volume can we actually deliver?
This isn't just about demand forecasting. It's about attainment—the percentage of volume our network can actually capture and fulfill when reality collides with capacity constraints, cost pressures, and operational dynamics. Traditional models treated this like a math problem. Reality taught us otherwise.
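Put semi-formally (an illustrative definition; the source doesn't pin down the exact denominator):

$$\text{attainment} = \frac{\text{volume the network actually captures and fulfills}}{\text{total volume planned for the period}}$$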
Picture this: You're managing logistics across hundreds of stations, each with unique characteristics. Some have abundant capacity, others are perpetually constrained. Transportation partners actively shift volume between internal networks and third-party carriers based on cost optimization. A cost offset lever pulled at the network level ripples through to station-level performance in ways that simple averaging models can't predict.
We were navigating by flashlight when we needed night vision goggles.
The Cascade of Complexity
Attainment isn't uniform. It's personal. Each station responds differently to the same network-level intervention.
Consider a cost offset designed to shift 5% more volume to our internal network. Station A, located in a dense urban area with multiple carrier options, might see minimal impact. Station B, serving rural routes where alternatives are scarce, could experience dramatic changes. Station C, already running near capacity, might hit constraints that force spillover to competitors.
The traditional approach—taking historical averages and hoping for the best—was like using last year's weather to plan tomorrow's outdoor wedding. Technically informed, practically useless.
What we needed was a model that understood context. Not just "what happened before" but "why it happened and how those drivers are changing."
Building Intelligence Into Uncertainty
Here's what changed everything: instead of treating attainment as a time-series problem, we treated it as a feature-rich learning problem.
We built a custom model using AutoML—not because we love algorithms, but because we love results. The model ingests known covariates: historical cost offset data, planned cost offset changes, geographic context, seasonal patterns, and station-specific characteristics.
Think of it like a GPS that doesn't just know where you are, but understands traffic patterns, construction schedules, and how your driving style affects arrival time. It's predictive intelligence, not pattern matching.
The model architecture follows a clear philosophy:
- Time-series foundation: Recent actuals drive baseline expectations
- Feature richness: Cost offsets, regional data, operational context all influence predictions
- Stability with responsiveness: Consistent enough for planning, adaptive enough for reality
- Explainability: Planners can understand why forecasts change
- Control mechanisms: Human judgment remains in the loop through quantile-based outputs
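The source doesn't name the underlying AutoML library, so treat the following as a minimal sketch of how such a setup could look, here using AutoGluon's TimeSeriesPredictor. Every station ID, column name, and parameter value is an illustrative assumption, not the project's actual configuration:

```python
import numpy as np
import pandas as pd
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

# Synthetic weekly history for two stations, with a covariate known in advance.
rng = np.random.default_rng(0)
weeks = pd.date_range("2023-01-02", periods=80, freq="W-MON")
df = pd.DataFrame({
    "station_id": np.repeat(["stn_a", "stn_b"], len(weeks)),
    "week": np.tile(weeks, 2),
    "planned_cost_offset": rng.uniform(0.0, 0.05, 2 * len(weeks)),
    "attainment": rng.uniform(0.85, 0.99, 2 * len(weeks)),
})
train = TimeSeriesDataFrame.from_data_frame(
    df, id_column="station_id", timestamp_column="week"
)

predictor = TimeSeriesPredictor(
    target="attainment",
    prediction_length=3,                             # e.g. a three-week horizon
    known_covariates_names=["planned_cost_offset"],  # levers known ahead of time
    quantile_levels=[0.4, 0.5, 0.6],                 # the P40/P50/P60 planners pick from
).fit(train)

# Planned offsets must also be supplied for the weeks being forecast.
future_weeks = pd.date_range(weeks[-1] + pd.Timedelta(weeks=1), periods=3, freq="W-MON")
future = pd.DataFrame({
    "station_id": np.repeat(["stn_a", "stn_b"], 3),
    "week": np.tile(future_weeks, 2),
    "planned_cost_offset": 0.03,  # a hypothetical planned lever for coming weeks
})
forecasts = predictor.predict(
    train,
    known_covariates=TimeSeriesDataFrame.from_data_frame(
        future, id_column="station_id", timestamp_column="week"
    ),
)
```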
The Real Work Happens in Data Preparation
Here's what most people don't understand about machine learning: the algorithm is maybe 20% of the work. The other 80% is data preparation, and that's where the magic really happens.
We built a sophisticated preprocessing pipeline that handles three critical challenges:
First, the outlier problem. Network-wide events—holidays, weather disasters, system outages—create data points that look like anomalies but are actually business intelligence. We maintain a curated list of known outlier dates and handle them systematically rather than letting them confuse the model.
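A minimal sketch of what "handling them systematically" could look like; the dates, column names, and flag-then-impute approach are assumptions based on the description above:

```python
import pandas as pd

# Curated network-wide outlier dates (holidays, weather events, outages).
# These specific dates are illustrative placeholders.
KNOWN_OUTLIER_DATES = pd.to_datetime(["2024-07-04", "2024-11-28", "2024-12-25"])

def flag_known_outliers(df: pd.DataFrame) -> pd.DataFrame:
    """Mark rows falling on curated outlier dates so downstream steps
    can impute them instead of letting them distort training."""
    out = df.copy()
    out["is_known_outlier"] = out["date"].isin(KNOWN_OUTLIER_DATES)
    return out
```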
Second, the new station problem. Stations launch and close. They don't have consistent time-series histories. Rather than exclude them or fill with zeros, we built logic that identifies insufficient data patterns and handles them gracefully.
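A sketch of one way to handle short histories gracefully, assuming a simple minimum-history threshold (the real cutoff and fallback logic aren't specified in the source):

```python
import pandas as pd

MIN_HISTORY_DAYS = 56  # illustrative: roughly eight weeks of daily data

def split_by_history(df: pd.DataFrame, min_days: int = MIN_HISTORY_DAYS):
    """Separate stations with enough history to forecast from newly
    launched (or recently closed) stations that need fallback handling."""
    counts = df.groupby("station_id")["date"].nunique()
    mature_ids = counts[counts >= min_days].index
    modelable = df[df["station_id"].isin(mature_ids)]
    fallback = df[~df["station_id"].isin(mature_ids)]  # e.g. route to a simpler baseline
    return modelable, fallback
```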
Third, the imputation challenge. When we identify outliers at the station level, we don't just remove them. We replace them with statistically sound estimates based on same-day-of-week patterns from recent weeks. This preserves data continuity while eliminating noise.
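A hedged sketch of the same-day-of-week imputation described here, assuming rows were already flagged upstream and a four-week lookback (the actual window isn't stated):

```python
import pandas as pd

def impute_same_weekday(df: pd.DataFrame, lookback_weeks: int = 4) -> pd.DataFrame:
    """Replace flagged outlier values with the mean of the same weekday
    over the previous `lookback_weeks` weeks for the same station."""
    out = df.sort_values(["station_id", "date"]).copy()
    for idx in out.index[out["is_known_outlier"]]:
        row = out.loc[idx]
        start = row["date"] - pd.Timedelta(weeks=lookback_weeks)
        peers = out[
            (out["station_id"] == row["station_id"])
            & (out["date"] >= start)
            & (out["date"] < row["date"])
            & (out["date"].dt.dayofweek == row["date"].dayofweek)
            & ~out["is_known_outlier"]
        ]
        if not peers.empty:  # otherwise leave the original value in place
            out.loc[idx, "attainment"] = peers["attainment"].mean()
    return out
```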
The result: Clean data that tells true stories, not statistical artifacts.
Two Models, One Truth
We discovered something counterintuitive: sometimes you need multiple perspectives on the same problem.
Our default ensemble model excels under stable conditions. It's like a seasoned analyst who's seen every pattern and weighs evidence carefully. But when cost offset plans shift dramatically week-over-week, ensemble approaches can be slow to react.
So we added a second model—a gradient boosting machine we call "DirectTabular." It's more responsive to sudden changes in offset plans, like a trader who reacts quickly to market signals.
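"DirectTabular" is an internal name; the source says only that it's a gradient boosting machine. As one plausible framing, here's a quantile-loss GBM on tabular features using scikit-learn, with entirely synthetic features and data:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
# Synthetic tabular rows: one per (station, target week), pairing recent
# actuals with the planned cost offset so offset changes feed predictions directly.
X = np.column_stack([
    rng.uniform(0.85, 0.99, n),   # last week's attainment (baseline signal)
    rng.uniform(0.00, 0.05, n),   # planned cost offset for the target week
    rng.integers(0, 5, n),        # coarse region code
])
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.01, n)  # toy target

# Quantile loss yields P50 directly; separate models could cover P40/P60.
model = HistGradientBoostingRegressor(loss="quantile", quantile=0.5)
model.fit(X, y)
p50 = model.predict(X[:5])
```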
Both models train on the same data, but they weight evidence differently. In stable periods, we rely on the ensemble. During volatile times, we lean toward the DirectTabular output. The choice isn't algorithmic—it's strategic, made by people who understand the business context.
This isn't complexity for complexity's sake. It's optionality for when reality doesn't match expectations.
Human Intelligence Amplified
The best forecasting systems don't replace human judgment—they amplify it.
Every week, our subject matter experts review both model outputs alongside recent actuals and contextual factors. They ask questions the algorithm can't: Is demand unusually light this week? Are there operational changes that haven't hit the data yet? What do we know about upcoming capacity constraints?
Based on this analysis, they choose not just which model to use, but which quantile (P40, P50, P60) provides the most appropriate forecast given current uncertainty levels.
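In code terms, that weekly human decision could reduce to something like this (the function, its arguments, and the quantile column names are hypothetical):

```python
def select_forecast(ensemble_fc, direct_fc, volatile_week: bool, quantile: str = "0.5"):
    """Hypothetical weekly selection step. `ensemble_fc` and `direct_fc` are
    forecast frames with quantile columns ("0.4", "0.5", "0.6"); planners,
    not an automated rule, decide both arguments each week."""
    chosen = direct_fc if volatile_week else ensemble_fc
    return chosen[quantile]
```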
We built supporting tools that visualize station-level patterns over time, plotting historical attainment against cost offsets with forecast overlays. This isn't just about transparency—it's about building confidence. When planners can see why the model makes its predictions, they're better equipped to know when to trust it and when to override it.
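A rough sketch of such a review plot with matplotlib; all data is synthetic, and the real tool's layout isn't described beyond attainment, cost offsets, and forecast overlays:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic single-station history: attainment plus cost offsets over time.
weeks = pd.date_range("2024-01-01", periods=26, freq="W-MON")
rng = np.random.default_rng(1)
attain = 0.9 + rng.normal(0, 0.01, 26)
offset = rng.uniform(0, 0.05, 26)

fig, ax1 = plt.subplots(figsize=(10, 4))
ax1.plot(weeks, attain, label="actual attainment")
ax1.plot(weeks[-3:], attain[-3:] + 0.005, "--", label="P50 forecast")  # toy overlay
ax1.set_ylabel("attainment")
ax2 = ax1.twinx()  # second axis so offsets share the same timeline
ax2.bar(weeks, offset, width=4, alpha=0.3, color="gray")
ax2.set_ylabel("cost offset")
ax1.legend(loc="upper left")
plt.show()
```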
The Results That Matter
Numbers tell the story: a 48 basis point improvement in W-3 MAPE and a 20 basis point improvement in W-1 MAPE.
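For reference, MAPE and the basis-point arithmetic (one basis point is 0.01 percentage points, so 48 basis points means the error rate fell by 0.48 percentage points):

```python
import numpy as np

def mape(actual: np.ndarray, forecast: np.ndarray) -> float:
    """Mean absolute percentage error, expressed in percent."""
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

actual = np.array([0.92, 0.95, 0.90])
forecast = np.array([0.91, 0.96, 0.93])
print(mape(actual, forecast))  # ~1.82% on this toy data
# A 48 bp improvement means W-3 MAPE dropped by 0.48 percentage points,
# e.g. from 5.00% to 4.52% (numbers illustrative, not from the source).
```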
But here's what those numbers really mean: Better predictions lead to better capacity planning. Better capacity planning reduces both excess costs and stockout risks. Reduced manual override effort frees planners to focus on strategic analysis rather than tactical corrections.
The model hasn't just improved accuracy—it's improved the quality of decision-making across the organization.
The Compound Effect of Better Forecasting
This project demonstrates something crucial about operational excellence: small improvements in foundational processes compound into significant business advantages.
When your forecasting improves by 20 to 48 basis points, you're not just getting better numbers. You're getting better decisions. Better resource allocation. Less waste, fewer stockouts, more predictable operations.
The custom attainment model isn't just a technical achievement—it's a business intelligence multiplier. It transforms weekly planning from reactive adjustment to proactive optimization.
And that's how you build sustainable competitive advantage: one better decision at a time, compounded across thousands of decisions, week after week.