Implement the **Softmax Function** for multi-class classification.
Formula:
softmax(z_i) = exp(z_i) / sum_j exp(z_j), where the sum runs over all classes j
**Numerical Stability:** To avoid overflow with large values, subtract the maximum value before computing exponentials:
z_stable = z - max(z)
softmax(z_i) = exp(z_stable_i) / sum(exp(z_stable_j))
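The stability trick above can be sketched directly in Python. This is one possible implementation using NumPy (the function name `softmax` follows the task statement; the use of NumPy is an assumption, not a requirement):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: shift logits by their max before exponentiating."""
    z = np.asarray(z, dtype=float)
    z_stable = z - np.max(z)      # subtracting a constant cancels in the ratio
    exp_z = np.exp(z_stable)
    return exp_z / np.sum(exp_z)
```

Subtracting `max(z)` makes the largest shifted logit 0, so `exp` never overflows; because the same constant is subtracted from every logit, it cancels between numerator and denominator and the result is unchanged.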
Write a function softmax(z) that takes an array of logits and returns the softmax probabilities.
Example:
z = [2.0, 1.0, 0.1]
exp_z = [exp(2.0), exp(1.0), exp(0.1)] = [7.389, 2.718, 1.105]
sum_exp = 7.389 + 2.718 + 1.105 = 11.212
softmax = [7.389/11.212, 2.718/11.212, 1.105/11.212]
= [0.6590, 0.2424, 0.0986]
# Output sums to 1.0
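The arithmetic above can be reproduced step by step in plain Python; a minimal sketch (the variable names mirror the example and are illustrative only):

```python
import math

z = [2.0, 1.0, 0.1]
exp_z = [math.exp(v) for v in z]       # [7.389..., 2.718..., 1.105...]
sum_exp = sum(exp_z)                   # about 11.2125
probs = [e / sum_exp for e in exp_z]   # about [0.6590, 0.2424, 0.0986]
```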
**Explanation:** Softmax converts raw scores (logits) into a probability distribution. Each output lies between 0 and 1, and the outputs sum to 1 (up to floating-point rounding). It is the standard output activation for multi-class classification.
Test Cases:

| Input | Expected Output |
|---|---|
| [2.0, 1.0, 0.1] | [0.6590011388859679, 0.2424329707047139, 0.09856589040931818] |
| [1.0, 1.0, 1.0] | [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] |
| [0, 0] | [0.5, 0.5] |