This review focuses on biological aspects of reinforcement learning. Since W. Schultz's influential discovery of an analogy between the reward prediction error signal of the temporal difference algorithm and the firing pattern of some dopaminergic neurons in the midbrain during classical conditioning, biological models have emerged that use computational reinforcement learning concepts to explain adaptive behavior. In particular, the basal ganglia have been proposed to implement, among other things, reinforcement learning for action selection, motor control, or working memory. We discuss to what extent the analogy between the temporal difference algorithm and the firing of dopamine cells can be considered valid. Our review then focuses on the basal ganglia, their anatomy, and their key computational properties as demonstrated by three recent, influential models.