Trainator — Probabilistic Rail Delay Predictions

01 · The problem

Detect missed-connection risk before checkout.

A 12-minute transfer in Frankfurt or Zürich is the difference between a kept appointment and a missed one — and today the passenger only finds out which one they bought after they're already on the platform. Live feeds tell you whether a train is "late" once it already is. They do not tell you, at booking time, how late, how confident, and how to price the risk.

01.A

Frustrated passengers, blamed platform

A missed connection turns into a 1-star review and a support ticket — even when the booking platform did nothing wrong. The brand wears the delay.

01.B

Churn after a single bad trip

One late arrival to a meeting, wedding or flight is enough to push a high-value traveller back to the airline app for the next booking. Trust is hard to rebuild.

01.C

Refunds, rebookings, lost revenue

Missed connections trigger compensation payouts, free rebookings and live-agent support calls — costs that quietly compound across every itinerary the platform sells.

With Trainator

Predictions at booking time turn each of those into an upside.

Because the forecast exists before the itinerary is committed, the platform gets to act on it — quietly re-routing, surfacing safer alternatives, or pricing the risk into a guarantee instead of absorbing it later.

01.A → SOLVED

Confident passengers, trusted brand

Risky transfers get flagged or hidden at booking. The traveller arrives when they expected to — and credits the platform for the smooth trip, not the train operator.

01.B → UNLOCKED

Honest forecasts as a differentiator

Show the realistic arrival window — not just the schedule — on every itinerary. Transparent expectations are something competitors don't surface, and they're a reason to come back to your app for the next booking.

01.C → SAVED

Fewer payouts, fewer support calls

Steering bookings away from high-risk transfers shrinks the long tail of compensation, rebookings and live-agent calls — a margin improvement that compounds across every itinerary sold.

EU Multimodal Booking Reg. compliant

Trainator's model uses only connection-level performance signals (route, time-of-day, rolling stock, network state) — never carrier identity. This means any ranking or filtering downstream is driven purely by empirical reliability, satisfying Article 7's neutrality requirement by construction and letting the same predictions flow to multimodal distributors without legal friction.

02 · How predictions work

One distribution per connection — Introducing the Zero-Inflated q-Exponential.

Rail delay data has a stubborn shape: a sharp spike at zero, because most trains really do arrive exactly on time, followed by a heavy positive tail of occasional severe delays. No standard textbook distribution fits both halves at once. The Zero-Inflated q-Exponential (ZIqE) does, by construction: a point mass at zero handles the on-time spike, and a q-exponential tail handles the rare-but-large delays. The result is the closest match to the empirical reality of European rail we have been able to find, and the entire curve is described by just three parameters.

p₀ = 0.62

Probability the train arrives exactly on time — the spike at t = 0.

λ = 0.17

Decay rate of the tail when a delay occurs. Larger λ → faster decay, shorter tail.

q = 1.245

Shape of the tail — fitted globally across the fleet and held constant. Higher q means a heavier, slower-decaying tail.

// CDF
F(t) = p₀ + (1 − p₀) · [1 − (1 + λ·(q−1)·t)^{(2−q)/(1−q)}]

Cumulative probability of delay ≤ t

t ∈ [0, 60] min

The jump at t = 0 is p₀ — the probability of being exactly on time. From there the curve climbs along the heavy q-exponential tail toward 1.

P50: 0 min · P80: 6 min · P95: 22 min · P99: 54 min
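
The percentiles above follow from the CDF by solving F(t) = p for t, which has a closed form. The sketch below is one way to do that, assuming 1 < q < 2 and rounding to the nearest minute (the rounding rule is an assumption that happens to reproduce the figures shown). The function names `ziqeCdf` and `ziqeQuantile` are illustrative and not part of the Trainator API.

```javascript
// ZIqE CDF as given above: point mass p0 at zero plus a q-exponential tail.
function ziqeCdf(t, { p0, lambda, q }) {
  if (t < 0) return 0;
  const survival = (1 + lambda * (q - 1) * t) ** ((2 - q) / (1 - q));
  return p0 + (1 - p0) * (1 - survival);
}

// Closed-form inverse of the CDF (assumes 1 < q < 2).
function ziqeQuantile(p, { p0, lambda, q }) {
  if (p <= p0) return 0; // the point mass at zero already covers p
  if (p >= 1) return Infinity;
  const survival = (1 - p) / (1 - p0); // tail survival S(t) needed to reach p
  return (survival ** ((1 - q) / (2 - q)) - 1) / (lambda * (q - 1));
}

const params = { p0: 0.62, lambda: 0.17, q: 1.245 };
for (const p of [0.5, 0.8, 0.95, 0.99]) {
  const minutes = Math.round(ziqeQuantile(p, params));
  console.log(`P${Math.round(p * 100)}: ${minutes} min`);
}
// → P50: 0 min, P80: 6 min, P95: 22 min, P99: 54 min
```

Because the inverse is analytic, a client can derive any percentile from the three parameters without a numerical solver.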

03 · API delivery

One URL. Modular by design.

Drop in the parts you have — trip, station, optional minute. Trainator returns exactly the resolution you asked for: a full delay distribution for the whole trip, a station-specific forecast, or a single scalar probability when all your booking flow needs is a yes/no answer. Sub-400 ms latency means responses arrive in real time, and clients can compute multiple probabilities from one payload without refetching.

$ curl 'https://api.trainator.eu/DE/DI?trip_id=ICE%20936&date=2026-05-20' \
  -H "Authorization: Bearer $TRAINATOR_TOKEN"

Returns predictions for every station on the trip — one ZIqE distribution per stop.

GET /DE/DI?trip_id=ICE%20936&date=2026-05-20
200 OK
{
  "trip_id": "ICE 936",
  "operator": "DB",
  "date": "2026-05-20",
  "stations": [
    {
      "station_id": "de:02000:8002553",
      "station_name": "Hamburg Hbf",
      "distribution": {
        "family": "ZIqE",
        "p0": 0.796,
        "lambda": 0.216,
        "q": 1.245
      },
      "percentiles_min": {
        "p50": 0,
        "p80": 0,
        "p95": 11,
        "p99": 31
      }
    },
    {
      "station_id": "de:03241:6500",
      "station_name": "Hannover Hbf",
      "distribution": {
        "family": "ZIqE",
        "p0": 0.729,
        "lambda": 0.239,
        "q": 1.245
      },
      "percentiles_min": {
        "p50": 0,
        "p80": 2,
        "p95": 12,
        "p99": 33
      }
    },
    {
      "station_id": "de:03152:5300",
      "station_name": "Göttingen",
      "distribution": {
        "family": "ZIqE",
        "p0": 0.679,
        "lambda": 0.257,
        "q": 1.245
      },
      "percentiles_min": {
        "p50": 0,
        "p80": 3,
        "p95": 13,
        "p99": 33
      }
    },
    {
      "station_id": "de:09663:7100",
      "station_name": "Würzburg Hbf",
      "distribution": {
        "family": "ZIqE",
        "p0": 0.578,
        "lambda": 0.293,
        "q": 1.245
      },
      "percentiles_min": {
        "p50": 0,
        "p80": 4,
        "p95": 14,
        "p99": 33
      }
    },
    {
      "station_id": "de:09161:5100",
      "station_name": "Augsburg Hbf",
      "distribution": {
        "family": "ZIqE",
        "p0": 0.512,
        "lambda": 0.316,
        "q": 1.245
      },
      "percentiles_min": {
        "p50": 0,
        "p80": 4,
        "p95": 14,
        "p99": 33
      }
    },
    {
      "station_id": "de:09762:8200",
      "station_name": "München Hbf",
      "distribution": {
        "family": "ZIqE",
        "p0": 0.461,
        "lambda": 0.334,
        "q": 1.245
      },
      "percentiles_min": {
        "p50": 0,
        "p80": 5,
        "p95": 14,
        "p99": 32
      }
    }
  ]
}

p95 latency 380 ms · Batch endpoint available · No client-side quantile solver needed
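
As a rough illustration of computing several probabilities from one payload, the sketch below walks a trimmed copy of the illustrative response (two of the six stops) and evaluates P(delay ≤ t) at a few thresholds per station. The helper names `probWithin` and `summarise` are hypothetical, and the thresholds are arbitrary.

```javascript
// Trimmed copy of the illustrative response above (two of the six stops).
const payload = {
  trip_id: "ICE 936",
  stations: [
    { station_name: "Hamburg Hbf", distribution: { family: "ZIqE", p0: 0.796, lambda: 0.216, q: 1.245 } },
    { station_name: "München Hbf", distribution: { family: "ZIqE", p0: 0.461, lambda: 0.334, q: 1.245 } },
  ],
};

// ZIqE CDF: probability that the delay is at most `minutes`.
function probWithin(minutes, { p0, lambda, q }) {
  if (minutes < 0) return 0;
  const survival = (1 + lambda * (q - 1) * minutes) ** ((2 - q) / (1 - q));
  return p0 + (1 - p0) * (1 - survival);
}

// One row per station, with a probability for every requested threshold.
function summarise(response, thresholds) {
  return response.stations.map(({ station_name, distribution }) => ({
    station: station_name,
    within: Object.fromEntries(thresholds.map((t) => [t, probWithin(t, distribution)])),
  }));
}

const summary = summarise(payload, [0, 5, 15]);
for (const row of summary) {
  const cells = Object.entries(row.within)
    .map(([t, p]) => `≤${t} min: ${(p * 100).toFixed(1)}%`)
    .join(", ");
  console.log(`${row.station}: ${cells}`);
}
```

All thresholds are evaluated locally, so changing them does not require another request.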

03·B · Integration example · How a downstream system turns the same request into a probability, in JS or Python.

predict.js · node 20 · async
// Fetch the ZIqE delay distribution for ICE 936 at München Hbf
const url = new URL("https://api.trainator.eu/DE/DI");
url.searchParams.set("trip_id",    "ICE 936");
url.searchParams.set("date",       "2026-05-20");
url.searchParams.set("station_id", "de:09762:8200");

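const res = await fetch(url, {
  headers: { Authorization: `Bearer ${process.env.TRAINATOR_TOKEN}` }
});
// Fail loudly on auth or lookup errors instead of destructuring an error body
if (!res.ok) throw new Error(`Trainator request failed: HTTP ${res.status}`);
const { distribution } = await res.json();
const { p0, lambda, q } = distribution;
// → p0 = 0.461, lambda = 0.334, q = 1.245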

// ZIqE CDF:  F(t) = p₀ + (1 − p₀) · [1 − S(t)]
// where    S(t) = [1 + λ·(q−1)·t]^((2−q)/(1−q))
function cdf(t) {
  if (t <= 0) return p0;
  const s = (1 + lambda * (q - 1) * t) ** ((2 - q) / (1 - q));
  return p0 + (1 - p0) * (1 - s);
}

const threshold = 5; // minutes
const probability = cdf(threshold);

console.log(`P(delay ≤ ${threshold} min) = ${(probability * 100).toFixed(1)}%`);
// → P(delay ≤ 5 min) = 81.3%

Probability curve · München Hbf

ZIqE · p₀ 0.46 · λ 0.33

Cumulative probability that the arrival delay is ≤ t minutes, computed from the returned distribution parameters.

P(delay ≤ 5 min) at München Hbf · example threshold

81.3%

P50: 0 min · P80: 5 min · P95: 14 min · P99: 32 min
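
To connect the distribution back to missed-connection risk, a booking flow could compare the inbound delay against the slack in a transfer. The sketch below is one simplified way to do that: `missRisk`, `transferMin` and `minChangeMin` are hypothetical names, the 5-minute platform change is an assumed figure, and the connecting train is assumed to depart exactly on schedule (a conservative simplification, since a late departure would only help the passenger).

```javascript
// ZIqE CDF: probability that the delay is at most `minutes`.
function probWithin(minutes, { p0, lambda, q }) {
  if (minutes < 0) return 0;
  const survival = (1 + lambda * (q - 1) * minutes) ** ((2 - q) / (1 - q));
  return p0 + (1 - p0) * (1 - survival);
}

// Hypothetical helper: chance that the inbound delay eats up the transfer slack.
// Assumes the connecting train leaves exactly on schedule.
function missRisk(distribution, transferMin, minChangeMin) {
  const slack = transferMin - minChangeMin;
  if (slack < 0) return 1; // the transfer is infeasible even when on time
  return 1 - probWithin(slack, distribution);
}

// Illustrative München Hbf parameters from the example above.
const muenchen = { p0: 0.461, lambda: 0.334, q: 1.245 };

// A 12-minute scheduled transfer, with 5 minutes assumed for changing platforms.
const risk = missRisk(muenchen, 12, 5);
console.log(`Estimated miss risk: ${(risk * 100).toFixed(1)}%`);
```

A platform could then flag or hide itineraries whose risk exceeds a threshold of its own choosing.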

Trip numbers, station IDs and reliability figures shown here are illustrative — inspired by real Deutsche Bahn services and IFOPT identifiers but not live. The production API exposes the live fleet.

04 · Accuracy

Calibrated where it counts, sharper than any baseline.

Calibration is what makes probabilistic forecasts usable downstream. Sharpness is what makes them better than a coin-flip. Trainator delivers both, measured on n = 27,000 German long-distance connections from May 2026.

Percentile calibration

n = 27,000 · May 2026

If we say 80% of trains will arrive within X minutes, roughly 80% actually do. Worst deviation: 5.9 pp at the 30th percentile — the model is slightly conservative at the low end and converges to perfect calibration above the 80th.
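
A calibration check of this kind can be sketched as follows: for each nominal level, count how often the observed delay fell at or below the predicted percentile. The record shape, the `observedCoverage` name and the five toy records are made up for illustration and are not Trainator's evaluation data.

```javascript
// For each nominal level, the share of records whose observed delay is
// at or below the percentile predicted for that record.
function observedCoverage(records, levels) {
  return levels.map((level) => {
    const hits = records.filter((r) => r.observedDelay <= r.quantiles[level]).length;
    return { level, observed: hits / records.length };
  });
}

// Toy holdout: every record predicts P50 = 0 min and P80 = 5 min.
const records = [0, 0, 2, 4, 9].map((observedDelay) => ({
  quantiles: { 0.5: 0, 0.8: 5 },
  observedDelay,
}));

const coverage = observedCoverage(records, [0.5, 0.8]);
for (const { level, observed } of coverage) {
  const deviationPp = (observed - level) * 100;
  console.log(`nominal ${level * 100}% → observed ${(observed * 100).toFixed(1)}% (${deviationPp.toFixed(1)} pp)`);
}
```

Perfect calibration corresponds to a deviation of 0 pp at every level.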

Legend: Trainator observed · Perfect calibration

Predicted vs. observed CDF

n = 27,000 · May 2026

The model's predicted cumulative probability of delay ≤ t (averaged across the holdout) plotted against the empirical CDF we actually observed. The two curves stay within ∼1 pp of each other across the entire 0–60 min range — the predicted distribution describes reality on a per-minute level, not just at the headline percentiles.
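
The comparison described above can be sketched as two small functions: one averages the per-connection predicted CDFs at minute t, the other computes the empirical CDF of the observed delays. The four distributions reuse illustrative parameters from the API example, and the observed delays are invented for the sketch.

```javascript
// ZIqE CDF for a single connection.
function ziqeCdf(t, { p0, lambda, q }) {
  if (t < 0) return 0;
  const survival = (1 + lambda * (q - 1) * t) ** ((2 - q) / (1 - q));
  return p0 + (1 - p0) * (1 - survival);
}

// Average of the per-connection predicted CDFs at minute t.
function meanPredictedCdf(distributions, t) {
  const total = distributions.reduce((sum, d) => sum + ziqeCdf(t, d), 0);
  return total / distributions.length;
}

// Share of observed delays that are at most t minutes.
function empiricalCdf(observedDelays, t) {
  return observedDelays.filter((d) => d <= t).length / observedDelays.length;
}

// Toy holdout of four connections (parameters illustrative, delays invented).
const distributions = [
  { p0: 0.796, lambda: 0.216, q: 1.245 },
  { p0: 0.729, lambda: 0.239, q: 1.245 },
  { p0: 0.578, lambda: 0.293, q: 1.245 },
  { p0: 0.461, lambda: 0.334, q: 1.245 },
];
const observedDelays = [0, 0, 3, 10];

for (const t of [0, 5, 15, 30]) {
  const predicted = meanPredictedCdf(distributions, t);
  const observed = empiricalCdf(observedDelays, t);
  console.log(`t=${t}: predicted ${predicted.toFixed(3)}, observed ${observed.toFixed(3)}`);
}
```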

Legend: Trainator predicted · Empirical observed

Reading these together: the left chart confirms the percentiles we report are the percentiles you actually get; the right chart confirms the underlying CDF — at every minute, not just the deciles — matches what really happens. Both diagnostics tell the same story from different angles.

05 · Technical evaluation

How every claim above is measured.

A reference for evaluators who want to reproduce or extend the comparison. Each metric below was computed on the same May 2026 German long-distance holdout (n = 27,000).

M1 · Brier Score · Mean squared error of probability forecasts. Lower is better.

The Brier score averages (p − y)² across all (probability, outcome) pairs, where y ∈ {0, 1} is the realised event "train was on time". It rewards confident-and-correct probabilities and punishes confident-and-wrong ones symmetrically. Scores here are weighted against the empirical event frequency, so a model that just predicts the base rate cannot get a free pass on the easy cases.

Trainator's probabilities aren't just more accurate on average; they're sharper. The naive base-rate predictor sits safely in the middle of the (0,1) interval and hedges; Trainator commits closer to 0 or 1 when the data warrants it, and that confidence, when correctly placed, is what drives the Brier gap. The no-information reference is 0.25, the unweighted score of a forecaster that always predicts 0.5.

BS = (1/n) · Σ w_i · (p_i − y_i)²
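
A minimal sketch of the formula above. How the weights w_i are derived from the empirical event frequency is not spelled out here, so the function takes them as an input and defaults to 1 for every pair (the plain, unweighted Brier score). The name `brierScore` and the toy inputs are illustrative.

```javascript
// BS = (1/n) · Σ w_i · (p_i − y_i)², with w_i defaulting to 1.
function brierScore(probs, outcomes, weights = probs.map(() => 1)) {
  let total = 0;
  for (let i = 0; i < probs.length; i++) {
    total += weights[i] * (probs[i] - outcomes[i]) ** 2;
  }
  return total / probs.length;
}

const outcomes = [1, 0, 1, 1]; // 1 = train was on time
const hedged = brierScore([0.5, 0.5, 0.5, 0.5], outcomes);
const confident = brierScore([0.9, 0.2, 0.8, 0.7], outcomes);

console.log(`always 0.5: ${hedged}`); // → always 0.5: 0.25
console.log(`confident and mostly right: ${confident.toFixed(3)}`);
```

The constant 0.5 forecast lands on 0.25 regardless of the outcomes, which is why that value serves as the no-information reference.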

M2 · ROC curve & AUC · Discriminative power: can the model separate delayed from on-time trains?

For any choice of "delayed = late by more than X minutes", the ROC curve plots true-positive rate against false-positive rate as the model's decision threshold sweeps from 0 to 1. AUC — the area under that curve — is the probability that a randomly chosen delayed connection scores higher than a randomly chosen on-time connection.

The chart shows the curve at the strictest definition (X = 0 min, meaning "any delay at all"), where Trainator hits AUC = 0.854. Across X from 0 to 30 min the AUC stays between 0.81 and 0.86: the model discriminates well no matter where you draw the line. A random classifier scores 0.50, the diagonal of the ROC plot.

AUC = P(score(delayed) > score(on-time))

AUC by "delayed > X min"

0 min: 0.854 · 3 min: 0.826 · 7 min: 0.813 · 15 min: 0.826 · 20 min: 0.831 · 30 min: 0.816
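
The pairwise definition of AUC can be computed directly, as in the sketch below. It assumes the score for each connection is the predicted probability of exceeding X minutes, 1 − F(X), and it counts ties as half a win, which is a common convention rather than something stated above. The scores and labels are toy values.

```javascript
// AUC as the share of (delayed, on-time) pairs in which the delayed
// connection received the higher score. Ties count as half a win.
function auc(scores, labels) {
  const delayed = scores.filter((_, i) => labels[i] === 1);
  const onTime = scores.filter((_, i) => labels[i] === 0);
  let wins = 0;
  for (const d of delayed) {
    for (const o of onTime) {
      if (d > o) wins += 1;
      else if (d === o) wins += 0.5;
    }
  }
  return wins / (delayed.length * onTime.length);
}

// Toy data: score = predicted P(delay > X), label 1 = actually delayed beyond X.
const scores = [0.9, 0.7, 0.4, 0.2];
const labels = [1, 0, 1, 0];
console.log(`AUC = ${auc(scores, labels)}`); // → AUC = 0.75
```

The double loop is quadratic in the sample size; for a holdout of tens of thousands of connections a rank-based computation would be the more practical choice.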

M3 · CDF residual error · MAE & RMSE of the predicted-vs-observed CDF, with empirical-frequency-weighted counterparts.

For each minute t we have a predicted cumulative probability and an observed one. The residual is the gap between them: r(t) = F_pred(t) − F_obs(t). Mean absolute error and root mean squared error summarise that residual series into a single number.

Weighted versions multiply each t's residual by the empirical share of trains arriving in that minute. Because 56% of trains arrive on time, the bin at t = 0 dominates the weights, so weighted RMSE and MAE measure how well the model fits the bins customers actually book around, rather than treating a residual at the rare 50-min tail as equally important as the residual at the on-time spike.

MAE = (1/n) · Σ |r(t)|
RMSE = √[ (1/n) · Σ r(t)² ]
wMAE = Σ w(t)·|r(t)| / Σ w(t) — w(t) = empirical density at t

Weighted RMSE: 1.14 pp · Weighted MAE: 1.11 pp · RMSE: 0.75 pp · MAE: 0.62 pp
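
The residual metrics can be sketched in a few lines. MAE, RMSE and wMAE follow the formulas given above; the weighted RMSE is assumed to use the same weights in the analogous way, since its formula is not written out. The function name `residualMetrics` and the three-bin toy input are illustrative only.

```javascript
// Summarise r(t) = F_pred(t) − F_obs(t) over the evaluated minutes.
function residualMetrics(predicted, observed, weights) {
  const r = predicted.map((p, i) => p - observed[i]);
  const n = r.length;
  const sum = (xs) => xs.reduce((a, b) => a + b, 0);
  const wSum = sum(weights);
  return {
    mae: sum(r.map(Math.abs)) / n,
    rmse: Math.sqrt(sum(r.map((x) => x * x)) / n),
    wMae: sum(r.map((x, i) => weights[i] * Math.abs(x))) / wSum,
    // Assumed by analogy with wMAE; not stated in the text above.
    wRmse: Math.sqrt(sum(r.map((x, i) => weights[i] * x * x)) / wSum),
  };
}

// Toy example with three minute-bins; weights are the empirical density per bin.
const metrics = residualMetrics(
  [0.56, 0.70, 0.80], // predicted CDF
  [0.55, 0.72, 0.80], // observed CDF
  [0.56, 0.10, 0.05]  // share of trains arriving in each bin
);
for (const [name, value] of Object.entries(metrics)) {
  console.log(`${name}: ${(value * 100).toFixed(2)} pp`);
}
```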

06 · Contact

Let's talk in depth.

We're open to conversations that go beyond this page — whether you need more rigorous benchmarks, want to walk through a specific workflow, or are evaluating direct model access for your platform.

In-depth data

Holdout breakdowns by route, operator, season, and delay magnitude. Raw prediction logs available under NDA for technical due diligence.

Performance benchmarks & workflows

Custom evaluation runs on your historical data, plus end-to-end integration walkthroughs tailored to your existing passenger information stack.

Model access

Dedicated inference endpoints, higher rate limits, SLA tiers, and on-premise deployment options for operators with strict data residency requirements.

paulin@trainator.eu

I respond to all serious enquiries as soon as possible. Please be aware that I prioritise existing customers and technical conversations over general interest. If you don't get a reply within a week, feel free to send a follow-up email.

Probabilistic delay forecasts for every rail connection.

Trainator inside bahn.de.
