Debugging and Machine Learning
by JS
As I neared completion of my final project for a course on reinforcement learning, I came across the following passage on Sutton’s page on tile coding:
With the code described so far, there is a small probability that unrelated inputs will hash into some of the same tiles. In a group of tilings, usually there will be no more than one such “collision”, so that it is not a big problem; the learning process will sort it out. There will not be a big effect on performance unless the memory is too small or the hash functions are poorly designed. Nevertheless, the possibility of such a problem is annoying. When one’s program doesn’t work, there is a tendency, deserved or not, to suspect a failure of the hashing function.
I did, in fact, discover that my memory size was too small, which resulted in a number of collisions. That was only one of many problems with my agent.
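To make the memory-size effect concrete, here is a minimal sketch of how one might count collisions for a given table size. It is not Sutton's tile-coding software: the function name count_collisions, the 8-tilings-by-10x10 layout, and the use of Python's built-in hash as the hash function are all assumptions chosen for illustration.

```python
from itertools import product


def count_collisions(tiles, memory_size):
    """Count tiles that hash into a slot already claimed by a different tile.

    `tiles` is an iterable of hashable tile identifiers (here, tuples of ints).
    Python's built-in hash() stands in for the tile coder's hash function.
    """
    slots = {}  # slot index -> first tile that claimed it
    collisions = 0
    for tile in tiles:
        index = hash(tile) % memory_size
        owner = slots.setdefault(index, tile)
        if owner != tile:
            collisions += 1
    return collisions


# Hypothetical layout: 8 tilings over a 10x10 grid gives 800 distinct tiles.
tiles = list(product(range(8), range(10), range(10)))

for memory_size in (256, 1024, 4096, 65536):
    print(memory_size, count_collisions(tiles, memory_size))
```

Whatever the hash function, a table with fewer slots than distinct tiles must produce collisions: with 800 tiles and 256 slots, at least 544 tiles share a slot with another tile. Running a check like this before training is a cheap way to rule the hashing in or out as a suspect.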
On a recent, related note, the UTCS/ART autonomous vehicle team did not make the finals of the Urban Challenge. One of the technical problems the team faced was a bad Ethernet cable that delayed critical sensor readings by as much as five seconds. The common thread is that debugging (in the classic sense, as a programmer's art) does not apply easily to systems that exhibit some degree of homeostasis or non-determinism (e.g. the Ethernet protocol, TD-learning). Such systems tend to compensate for a fault rather than fail outright, so the symptom is degraded performance instead of a clear error.
