Logistic regression
The simplest possible approach. It multiplies each feature by a fixed weight (learned once during training), adds them up, and passes the total through a sigmoid to convert it to a probability. The same weights apply to every customer, every time.
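As a minimal sketch of those mechanics (the feature names, weights, and customers below are invented for illustration, not learned from real data):

```python
import math

# Illustrative weights, one per feature, fixed once "training" is done.
# Feature names and values are made up for this sketch.
WEIGHTS = {"monthly_spend": 0.8, "support_tickets": -1.2, "tenure_years": 0.5}
BIAS = -0.3

def predict_proba(customer):
    """Weighted sum of features plus bias, squashed through a sigmoid."""
    z = BIAS + sum(w * customer[f] for f, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

# The same weights score every customer, every time.
loyal = {"monthly_spend": 1.0, "support_tickets": 0, "tenure_years": 2.0}
at_risk = {"monthly_spend": 0.2, "support_tickets": 3, "tenure_years": 0.5}
```

This is also why its explainability is so high: each weight is a plain number you can read straight off the model.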
Random forest
Grows 500 independent trees, each trained on a random sample of customers and a random subset of features. They never talk to each other. When a new customer arrives, all 500 vote simultaneously — majority wins.
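The sample-then-vote idea can be sketched end to end. Everything here is invented for illustration, and the "trees" are one-split stumps rather than the deeper trees a real forest would grow:

```python
import random

random.seed(0)  # reproducible sketch

# Invented toy data: high-spend, low-ticket customers are labelled 1.
TRAIN = [({"spend": s, "tickets": t}, y) for s, t, y in [
    (2.0, 0, 1), (1.8, 0, 1), (1.6, 1, 1), (1.4, 1, 1),
    (0.8, 2, 0), (0.6, 2, 0), (0.4, 3, 0), (0.2, 3, 0),
]]
FEATURES = ["spend", "tickets"]

def train_stump(rows):
    """A one-split 'tree': bootstrap sample of customers, one random feature."""
    sample = [random.choice(rows) for _ in rows]   # random sample of customers
    feature = random.choice(FEATURES)              # random feature subset (here: size 1)
    best, best_acc = None, -1.0
    for thresh in sorted({f[feature] for f, _ in sample}):
        for above_is_positive in (True, False):
            acc = sum(((f[feature] > thresh) == above_is_positive) == bool(y)
                      for f, y in sample) / len(sample)
            if acc > best_acc:
                best, best_acc = (feature, thresh, above_is_positive), acc
    feature, thresh, above_is_positive = best
    return lambda c: int((c[feature] > thresh) == above_is_positive)

# Grow 500 independent trees; they never see each other during training.
forest = [train_stump(TRAIN) for _ in range(500)]

def predict(customer):
    votes = sum(tree(customer) for tree in forest)  # all 500 vote at once
    return int(votes * 2 > len(forest))             # majority wins
```

Because each tree sees a different random slice of the data, their individual mistakes tend to cancel out in the vote, which is where the robustness to noise comes from.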
How all three compare
| | Logistic regression | Random forest | Gradient boosting |
|---|---|---|---|
| How it decides | Weighted sum of features → sigmoid | 500 independent trees vote | Trees correct each other sequentially |
| Explainability | Very high — each weight is a plain number | Medium — feature importance but no single path | High — SHAP shows each tree's contribution |
| Edge cases | Poor — a straight line misses curved patterns | Good — averaging smooths unusual customers | Best — each tree specifically targets hard cases |
| Overfitting risk | Low — hard to overfit a simple model | Low — 500-tree average reduces variance | Medium — needs careful depth tuning |
| Typical accuracy | ~74% | ~78% | ~80% |
| Best when | You need to explain every decision to a non-technical audience | Data is noisy or robustness matters more than precision | You want maximum accuracy with clean, well-chosen features |
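The "trees correct each other sequentially" row is the key contrast with the forest: gradient boosting trains each new tree on the residual errors of the ensemble built so far, so later trees specifically target the cases earlier ones got wrong. A minimal sketch, assuming squared-error loss and one-split regression trees, on invented one-dimensional data:

```python
# Invented 1-D toy data: learn y = x^2 on [0, 1].
DATA = [(x / 10.0, (x / 10.0) ** 2) for x in range(11)]

def fit_stump(rows):
    """Regression stump: split at the threshold minimizing squared error."""
    best = None
    for thresh in sorted({x for x, _ in rows}):
        left = [y for x, y in rows if x <= thresh]
        right = [y for x, y in rows if x > thresh]
        if not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - (lmean if x <= thresh else rmean)) ** 2 for x, y in rows)
        if best is None or err < best[0]:
            best = (err, thresh, lmean, rmean)
    _, thresh, lmean, rmean = best
    return lambda x: lmean if x <= thresh else rmean

LEARNING_RATE = 0.3  # shrinks each correction; part of the depth/rate tuning
trees = []

def predict(x):
    """Ensemble prediction: the sum of every tree's scaled correction."""
    return sum(LEARNING_RATE * t(x) for t in trees)

for _ in range(100):
    # Each new tree is fitted to what the ensemble so far still gets wrong.
    residuals = [(x, y - predict(x)) for x, y in DATA]
    trees.append(fit_stump(residuals))
```

The sequential dependence is also the source of the "medium" overfitting risk in the table: unlike the forest's independent averaging, every new tree chases remaining errors, so depth and learning rate need tuning to stop it chasing noise.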