Isolation Forest
Best for: General anomaly detection
Aliases: iForest
How it works
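$$s(x,n)=2^{-\,E[h(x)]\,/\,c(n)}$$

Builds an ensemble of random isolation trees, each splitting on a randomly chosen feature at a random threshold until points are isolated. Anomalies, being sparse and different, require fewer random partitions to separate and thus have shorter path lengths $h(x)$. The anomaly score normalises the mean path length $E[h(x)]$ over the trees by $c(n)=2H(n-1)-2(n-1)/n$, the average path length of an unsuccessful search in a binary search tree of $n$ points, where $H(i)$ is the $i$-th harmonic number and $n$ is the number of points used to build each tree. Scores approaching 1 flag clear anomalies, scores well below 0.5 indicate normal points, and $s\approx 0.5$ (where $E[h(x)]=c(n)$) means the point is isolated no faster than average.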
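The scoring rule can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names, the toy data, the sub-sample size of 64, and the height limit of ceil(log2(sample size)) are assumptions made for the example. To stay short it builds only the branch of each tree that the query point follows, which should give the same path length as building the full tree and walking down it.

```python
import math
import random

def c(n):
    """Average path length of an unsuccessful BST search among n points."""
    if n <= 1:
        return 0.0
    harmonic = sum(1.0 / k for k in range(1, n))  # exact H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def score(mean_path, n):
    """s(x, n) = 2^(-E[h(x)] / c(n)); assumes n >= 2 so that c(n) > 0."""
    return 2.0 ** (-mean_path / c(n))

def path_length(x, sample, rng, max_depth, depth=0):
    """Depth at which x is isolated in one random tree (lazy, single branch)."""
    if len(sample) <= 1 or depth >= max_depth:
        # Unresolved leaves get the c(size) adjustment for the unbuilt subtree.
        return depth + c(len(sample))
    f = rng.randrange(len(x))                # random feature
    lo = min(p[f] for p in sample)
    hi = max(p[f] for p in sample)
    if lo == hi:
        # Simplification for this sketch: treat a constant feature as a leaf.
        return depth + c(len(sample))
    split = rng.uniform(lo, hi)              # random threshold
    if x[f] < split:
        branch = [p for p in sample if p[f] < split]
    else:
        branch = [p for p in sample if p[f] >= split]
    return path_length(x, branch, rng, max_depth, depth + 1)

def iforest_score(x, data, n_trees=100, sample_size=64, seed=0):
    """Mean path length over n_trees random trees, turned into a score."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))  # assumed height limit
    total = 0.0
    for _ in range(n_trees):
        sample = rng.sample(data, sample_size)
        total += path_length(x, sample, rng, max_depth)
    return score(total / n_trees, sample_size)

# Hypothetical toy data: one Gaussian cluster plus a single far-away point.
gen = random.Random(42)
cluster = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
data = cluster + [outlier]

s_out = iforest_score(outlier, data)
s_in = iforest_score((0.0, 0.0), data)
print(f"outlier: {s_out:.2f}, inlier: {s_in:.2f}")
```

The far-away point should score noticeably higher than the point at the cluster centre, because most random splits cut it off within a few steps.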
When to use
General-purpose unsupervised anomaly detection across mixed tabular features at scale.
Watch out
The contamination parameter sets the score threshold, so a misjudged value directly changes how many points are flagged. Results are less reliable on low-dimensional or categorical-heavy data.
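A small sketch of why the contamination setting matters, assuming the threshold is simply the score that cuts off the top fraction of points. The helper name and the scores are hypothetical, and libraries may compute the cutoff differently (for example with an interpolated percentile).

```python
def threshold_from_contamination(scores, contamination):
    """Score cutoff that flags roughly the top `contamination` fraction."""
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be between 0 and 1")
    ranked = sorted(scores, reverse=True)
    k = max(1, round(contamination * len(ranked)))  # number of points to flag
    return ranked[k - 1]

# Hypothetical anomaly scores: two points stand out, the rest look normal.
scores = [0.42, 0.45, 0.47, 0.44, 0.46, 0.43, 0.48, 0.61, 0.78, 0.41]

for contamination in (0.1, 0.2, 0.3):
    cut = threshold_from_contamination(scores, contamination)
    flagged = [s for s in scores if s >= cut]
    print(contamination, cut, flagged)
```

With contamination set to 0.3, the point scoring 0.48 is flagged even though it sits below 0.5. The parameter fixes how many points are reported regardless of how anomalous they actually look, so it is worth setting from domain knowledge rather than leaving it at a default.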
Common fields
Fraud · cybersecurity · operations · IoT