Question of the Day

by JS

Do shorter hypotheses generalize better?

We start with Occam’s razor, as stated in Statistical Learning Theory:

Entities should not be multiplied beyond necessity.

Vapnik provides two reinterpretations. The common one:

The simplest explanation is the best.

And the structural risk minimization version:

The explanation by the machine with the smallest capacity (VC dimension) is the best.
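
To make the second reading concrete, here is a minimal Python sketch of the capacity term in one commonly quoted form of Vapnik's VC bound for binary classification with 0-1 loss: with probability at least 1 - eta, the true risk is at most the empirical risk plus sqrt((h (ln(2n/h) + 1) - ln(eta/4)) / n), where h is the VC dimension and n is the number of training examples. The function name vc_confidence and the example values of h, n, and eta are illustrative assumptions, not part of the original statement.

```python
import math


def vc_confidence(h, n, eta=0.05):
    """Capacity term of the VC bound (assumed form, see lead-in).

    h   : VC dimension of the learning machine
    n   : number of training examples (the bound is stated for n > h)
    eta : the bound holds with probability at least 1 - eta
    """
    if not 0 < h < n:
        raise ValueError("expected 0 < h < n")
    if not 0 < eta < 1:
        raise ValueError("expected 0 < eta < 1")
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


if __name__ == "__main__":
    n = 1000  # hypothetical sample size
    for h in (5, 50, 500):  # hypothetical capacities
        print(f"h={h:4d}  capacity term={vc_confidence(h, n):.3f}")
```

For a fixed sample size the term grows with h, so among machines with the same training error the bound is tightest for the one with the smallest capacity. Note that this is a statement about capacity, not about the length of a hypothesis's description.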