Learning Decision Trees
by JS
Decision tree learning algorithms have been around for a long time, but I have only come to appreciate them lately. Truth be told, I had never spent any serious time trying to understand them.
One key benefit of decision tree learning is that the resulting tree can be read as a sequence of simple questions that classify the data. Say, for example, you want to figure out whether a loan will default or not. Loans have a number of parameters, like loan-to-value, interest rate, and credit score of the borrower. You might try to come up with a set of if-then criteria based on these parameters that lead you to the answer. Rules of exactly this kind are what decision tree learning algorithms generate, except that they are derived from training data rather than written by hand.
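To make the idea concrete, here is a hand-written sketch of what such a tree looks like when written out as code. The function name, the parameter names, and every threshold are made up for illustration; a learning algorithm would choose the questions and cut-offs from training data.

```python
def predict_default(loan_to_value, interest_rate, credit_score):
    """Toy decision tree: each `if` is one question asked at a tree node.

    All thresholds below are hypothetical, not learned from real loan data.
    """
    if credit_score < 620:
        # Weak credit: a highly leveraged loan is predicted to default outright.
        if loan_to_value > 0.9:
            return True
        # Otherwise the interest rate decides.
        return interest_rate > 0.08
    # Strong credit: predict default only when both leverage and rate are extreme.
    return loan_to_value > 0.95 and interest_rate > 0.10


print(predict_default(loan_to_value=0.95, interest_rate=0.05, credit_score=600))
print(predict_default(loan_to_value=0.80, interest_rate=0.12, credit_score=700))
```

Each path from the first question to a return statement corresponds to one root-to-leaf path in the tree, which is why the result is easy to read and explain.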
I’ve implemented a simple version, ID3, and added it to my collection of algorithms in the sidebar. You’ll note that I used SQLite instead of trying to represent the training data using dictionaries. It turns out that a lot of the primitives that go into decision tree algorithms are naturally expressed as operations on table data (see for example the entropy calculation). SQLite (or any RDBMS) is ideally suited for this.
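As a rough illustration of why table operations fit so well, here is a minimal sketch of an entropy and information gain calculation on top of Python's built-in sqlite3 module. This is not the sidebar implementation; the `loans` table, its columns, its rows, and the helper names `entropy` and `information_gain` are all assumptions made up for the example.

```python
import math
import sqlite3


def entropy(conn, table, target, where="1=1", params=()):
    """Shannon entropy (in bits) of the `target` column over rows matching `where`."""
    # Identifiers are interpolated directly, so they must come from trusted code.
    query = f"SELECT COUNT(*) FROM {table} WHERE {where} GROUP BY {target}"
    counts = [row[0] for row in conn.execute(query, params)]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts)


def information_gain(conn, table, attribute, target):
    """Entropy reduction from splitting `table` on `attribute` (the ID3 criterion)."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    groups = conn.execute(
        f"SELECT {attribute}, COUNT(*) FROM {table} GROUP BY {attribute}"
    ).fetchall()
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(
        (n / total) * entropy(conn, table, target, f"{attribute} = ?", (value,))
        for value, n in groups
    )
    return entropy(conn, table, target) - remainder


# Hypothetical training data: (credit, ltv, defaulted).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (credit TEXT, ltv TEXT, defaulted INTEGER)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [("low", "high", 1), ("low", "low", 1), ("low", "high", 1), ("low", "low", 0),
     ("high", "high", 0), ("high", "low", 0), ("high", "high", 0), ("high", "low", 1)],
)

for attribute in ("credit", "ltv"):
    print(attribute, information_gain(conn, "loans", attribute, "defaulted"))
```

The class counts come from a single GROUP BY, and restricting to a branch of the tree is just a WHERE clause, so the bookkeeping that a dictionary-based version would do by hand is left to the database. On this toy data, `credit` yields the larger gain, so ID3 would split on it first.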
