An Example of Category Error
by JS
According to Wikipedia:
I encountered an interesting (and I hope uncontroversial) example of a category error in a meeting of the UTCS reinforcement learning group today. Richard Sutton was in town for a thesis defense, and the meeting served as a follow-up to a previous reading group discussion of Sutton’s 14 principles of learning and intelligence, an informal document circulating among reinforcement learning researchers recently.
During the course of the discussion, the problem of characterizing “reward” arose. Is reward something internal to an agent, or something provided by the environment? This is an easy question to answer in a computational setting, since we (the researchers) determine the reward signal. What about the real world? What constitutes external reward?
What prevented the group from reaching consensus on this issue was a rather subtle and pernicious form of category error: ascribing to the real world the property of reward. Rewards are interpretations made by agents in the presence of environmental signals. The environment doesn’t (and can’t) ascribe any meaning to its own state of being or to the way it interacts with the agent.
To see this clearly, consider being given a million dollars. You’d feel happy about it, in part because of a spike in dopamine in your brain. But would another agent in the world, say a dog, feel the same? The million dollars would mean nothing to the dog, even though we know the dog to be a reinforcement learning agent: it responds to external signals that it interprets as rewarding, like a nice juicy cut of meat.
The same environmental signal can be interpreted as a reward by one agent and not by another. This rules out any notion of external reward; there are only external signals. Of course, the external world does provide another method of feedback: selection. The relationship between reward and selection deserves its own post.
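For readers who think in code, here is a rough sketch of the million-dollar example. Everything in it is an assumption made for illustration: the `Agent` class, the two interpretation functions, and the numeric values are hypothetical and come from neither the meeting nor Sutton's document. The only point it tries to show is that the environment emits a bare signal, and the number we call "reward" appears only once a particular agent interprets that signal.

```python
from dataclasses import dataclass
from typing import Callable

# An environmental signal is just a label here; it carries no reward value.
Signal = str


@dataclass
class Agent:
    """Hypothetical agent whose reward is its own interpretation of a signal."""
    name: str
    interpret: Callable[[Signal], float]

    def reward(self, signal: Signal) -> float:
        # Reward is computed inside the agent, not supplied by the environment.
        return self.interpret(signal)


def human_interpretation(signal: Signal) -> float:
    # Arbitrary illustrative values, not measurements of anything.
    return {"million dollars": 1.0, "cut of meat": 0.2}.get(signal, 0.0)


def dog_interpretation(signal: Signal) -> float:
    # The dog has no mapping for money, so that signal yields no reward.
    return {"cut of meat": 1.0}.get(signal, 0.0)


human = Agent("human", human_interpretation)
dog = Agent("dog", dog_interpretation)

# The same signals are presented to both agents.
for signal in ("million dollars", "cut of meat"):
    print(signal, {agent.name: agent.reward(signal) for agent in (human, dog)})
```

Notice that nothing on the environment side of this sketch ever produces a number. In a typical computational setup the researcher writes the reward function and attaches it to the environment, which is probably why it is so tempting to think of reward as external.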
UPDATE: A question: Is evolutionary fitness reward?
Comments
[...] better fitness. The presentation made a point about reward functions that I’ve thought of independently, but which psychologists have already enunciated in various forums. The point is this: reward [...]