Question of the Day
by JS
Do shorter hypotheses generalize better?
We have Occam’s razor, as borrowed by statistical learning theory:
Entities should not be multiplied beyond necessity.
Vapnik provides two reinterpretations. The common one:
The simplest explanation is the best.
And the structural risk minimization version:
The explanation by the machine with the smallest capacity (VC dimension) is the best.
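
To make the capacity version concrete, here is a minimal sketch of the usual VC generalization bound: with probability at least 1 - eta, the true risk is at most the empirical risk plus sqrt((h (ln(2n/h) + 1) - ln(eta/4)) / n), where h is the VC dimension and n the sample size. The function names, the default eta, and the candidate tuples are illustrative assumptions, not anything from Vapnik's text.

```python
import math


def vc_confidence(h: int, n: int, eta: float = 0.05) -> float:
    """Capacity term of the VC bound for VC dimension h and n samples.

    Assumes 0 < h <= n and 0 < eta < 1; the bound holds with
    probability at least 1 - eta.
    """
    if not 0 < h <= n:
        raise ValueError("need 0 < h <= n")
    if not 0 < eta < 1:
        raise ValueError("need 0 < eta < 1")
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


def srm_choice(candidates, n: int, eta: float = 0.05):
    """Pick the candidate with the smallest bound on the true risk.

    candidates: iterable of (name, empirical_risk, vc_dimension) tuples
    (a hypothetical format chosen for this sketch).
    """
    return min(candidates, key=lambda c: c[1] + vc_confidence(c[2], n, eta))


if __name__ == "__main__":
    n = 1000
    # Made-up empirical risks and VC dimensions, purely for illustration.
    machines = [("small", 0.10, 5), ("medium", 0.08, 50), ("large", 0.01, 500)]
    for name, risk, h in machines:
        print(name, round(risk + vc_confidence(h, n), 3))
    print("chosen:", srm_choice(machines, n)[0])
```

For a fixed sample size the capacity term grows with h, so a larger machine has to buy a correspondingly lower empirical risk to come out ahead. That is the sense in which the bound favors the machine with the smallest capacity.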
