Restorative justice has been subject to a number of attacks, both empirical and philosophical. This paper attempts to address some of these criticisms and suggests that they stem partly from misunderstandings about what restorative justice seeks to achieve and partly from demanding too much of restorative justice at this stage in its development. Attempts to evaluate restorative justice are also relatively recent, yet critics tend either to ignore the available research findings or to present them negatively. Critics also fail to compare what restorative justice has achieved, and may still achieve, with what conventional criminal justice systems have achieved. Drawing on research, particularly from New Zealand, which has put restorative justice principles into practice to a greater extent than other jurisdictions, this review suggests that there are reasons to be relatively positive about the re-emergence of restorative justice.
To err is human. But to catch that error, does it take a computer? That's the question psychologists have been wrestling with in recent months, as automated software checks their published findings on a huge scale.
In this paper, we propose a novel reinforcement learning architecture, the Natural Actor-Critic. The actor updates are achieved using stochastic policy gradients based on Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing, as they are independent of the coordinate frame of the chosen policy representation and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis-function parameterization motivated by the compatible function approximation of the policy-gradient theorem. We show that several well-known reinforcement learning methods, such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning, are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods and demonstrate their applicability for learning control on an anthropomorphic robot arm.
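To make the architecture concrete, here is a minimal, episodic Python sketch of the core idea for a softmax policy with one preference per state-action pair. It is illustrative only: the environment interface (env.reset / env.step), the tabular parameterization, the hyperparameters, and the batch least-squares critic (standing in for the paper's linear-regression critic) are all assumptions, not the paper's exact algorithm.

```python
import numpy as np

# Sketch of Natural Actor-Critic with a tabular softmax policy.
# Assumed interface: env.reset() -> state index; env.step(a) -> (state, reward, done).

def softmax_probs(theta, s):
    prefs = theta[s] - theta[s].max()      # shift for numerical stability
    e = np.exp(prefs)
    return e / e.sum()

def grad_log_pi(theta, s, a):
    """Compatible features: the gradient of log pi(a|s) w.r.t. theta."""
    g = np.zeros_like(theta)
    g[s] = -softmax_probs(theta, s)
    g[s, a] += 1.0
    return g

def natural_actor_critic(env, n_states, n_actions,
                         episodes=500, gamma=0.99, step=0.05):
    theta = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        # Actor rollout: collect one episode under the current policy.
        traj, s, done = [], env.reset(), False
        while not done:
            a = np.random.choice(n_actions, p=softmax_probs(theta, s))
            s_next, r, done = env.step(a)
            traj.append((s, a, r))
            s = s_next
        # Critic: regress discounted returns onto the compatible features
        # (plus a constant baseline). The key NAC result is that the fitted
        # weights on the compatible features are the natural policy gradient.
        rows, targets, ret = [], [], 0.0
        for s, a, r in reversed(traj):
            ret = r + gamma * ret
            rows.append(np.append(grad_log_pi(theta, s, a).ravel(), 1.0))
            targets.append(ret)
        w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
        # Actor: step along the natural gradient (drop the baseline weight).
        theta += step * w[:-1].reshape(theta.shape)
    return theta
```

The point the sketch tries to convey is the one the abstract emphasizes: once the critic fits a compatible function approximator, the natural gradient comes for free as the fitted weight vector, with no explicit Fisher-matrix inversion in the actor.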
Hamilton’s theory of kin selection is the best-known framework for understanding the evolution of social behavior but has long been a source of controversy in evolutionary biology. A recent critique of the theory by Nowak, Tarnita, and Wilson sparked a new round of debate, which shows no signs of abating. In this overview, we highlight a number of conceptual issues that lie at the heart of the current debate. We begin by emphasizing that there are various alternative formulations of Hamilton’s rule, including a general version, which is always true; an approximate version, which assumes weak selection; and a special version, which demands other restrictive assumptions. We then examine the relationship between the neighbor-modulated fitness and inclusive fitness approaches to kin selection. Finally, we consider the often-strained relationship between the theories of kin and multilevel selection.
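As a sketch of what the "general version" amounts to, the regression-based formulation of Hamilton's rule can be derived from the Price equation. The notation below is one common convention (associated with Queller and with Gardner, West, and Wild), assumed here for illustration rather than taken from the paper: $w$ is fitness, $g$ the focal individual's breeding value for the social trait, and $g'$ the average breeding value of its social partners.

```latex
\begin{align*}
  \Delta \bar{g} &\;\propto\; \mathrm{Cov}(w, g)
     &&\text{(selection term of the Price equation)}\\
  w &= \alpha + \beta_{wg\cdot g'}\, g + \beta_{wg'\cdot g}\, g' + \varepsilon
     &&\text{(least-squares partition of fitness)}\\
  -c &\equiv \beta_{wg\cdot g'}, \qquad
  b \equiv \beta_{wg'\cdot g}, \qquad
  r \equiv \frac{\mathrm{Cov}(g, g')}{\mathrm{Var}(g)}\\
  \Delta \bar{g} > 0 &\iff rb - c > 0
     &&\text{(general Hamilton's rule)}
\end{align*}
```

Because the $\beta$'s are least-squares partial regression coefficients, $\mathrm{Cov}(w,g) = \mathrm{Var}(g)\,(rb - c)$ holds as an identity, which is why this version is "always true"; the approximate and special versions arise when further assumptions are imposed on how $b$, $c$, and $r$ may be interpreted.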
We present four new reinforcement learning algorithms based on actor–critic, natural-gradient, and function-approximation ideas, and we provide their convergence proofs. Actor–critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Policy-gradient methods of this kind are of special interest because of their compatibility with function approximation, which is needed to handle large or infinite state spaces. Temporal difference learning is attractive here because in many applications it dramatically reduces the variance of the gradient estimates. The natural gradient is of interest because it can produce better-conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor–critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor–critic methods by Peters, Vijayakumar, and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
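The two-timescale structure the abstract refers to can be sketched in a few lines of Python: the critic is updated on a faster step-size schedule than the actor, so that from the actor's point of view the critic has effectively converged. This is a minimal illustration, not the paper's exact algorithms; the feature map phi, the policy helpers, and the environment interface are assumptions.

```python
import numpy as np

# Incremental two-timescale actor-critic sketch: TD(0) critic on a fast
# schedule alpha_t, policy-gradient actor on a slow schedule beta_t, with
# beta_t / alpha_t -> 0 as required by two-timescale analyses.

def two_timescale_actor_critic(env, phi, policy_sample, grad_log_pi,
                               d, theta_dim, steps=100_000, gamma=0.99):
    v = np.zeros(d)              # critic weights: V(s) ~= v . phi(s)
    theta = np.zeros(theta_dim)  # actor (policy) parameters
    s = env.reset()
    for t in range(1, steps + 1):
        alpha = 1.0 / t ** 0.6   # critic step size (fast timescale)
        beta = 1.0 / t           # actor step size (slow timescale)
        a = policy_sample(theta, s)
        s_next, r, done = env.step(a)
        # TD(0) error under linear value-function approximation; it serves
        # as the advantage estimate in the actor update below.
        delta = r + gamma * np.dot(v, phi(s_next)) - np.dot(v, phi(s))
        v += alpha * delta * phi(s)                        # critic (TD)
        theta += beta * delta * grad_log_pi(theta, s, a)   # actor (ascent)
        s = env.reset() if done else s_next
    return theta, v
```

A natural-gradient variant in the spirit of the paper's later algorithms would additionally maintain compatible-feature weights and precondition the actor step by an estimate of the inverse Fisher information; the plain update above corresponds to the regular-gradient case.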