Description

Balci et al. (2009) describe a simple timing task in which reward appears after a short latency at one location, with probability p(S), and at another location after a longer latency, with the complementary probability. Human and mouse subjects began each trial waiting at the short location. As time elapsed in a trial, they switched to the long location. In response to changes in p(S), they made approximately optimal changes in the distribution of their switch latencies.

We have replicated the Balci et al. finding with mouse subjects, while measuring the latency and abruptness of the adjustments caused by changes in p(S). The latencies are short, implying that mice rapidly detect a change in probability and rapidly estimate the new probability. The changes from the old to the new distributions are also abrupt, making them indistinguishable from step changes. This suggests the explicit detection of the change in p(S), followed by the computation of a new decision criterion (a new target switch time), which requires an enduring representation of the subject's temporal uncertainty together with a new estimate of p(S).

The abruptness of the adjustments does not appear to be consistent with the gradual attainment of a new dynamic equilibrium through "hill-climbing," as in simple reinforcement-learning models. To achieve the observed degree of abruptness, the learning-rate parameter must be set very high, but then a reinforcement-learning model would track the stochastic noise in the sequence of short and long trials, which the mice do not do.
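To make the decision-criterion computation concrete, the following is a minimal sketch of how a target switch time could be derived from p(S) together with a representation of temporal uncertainty. It assumes Gaussian switch-time variability with a fixed coefficient of variation (scalar timing) and illustrative latencies (a 3 s short and 9 s long feed latency); the latencies, the CV, and the function names are assumptions for illustration, not values or code from the study.

```python
import numpy as np
from scipy.stats import norm

def expected_gain(target, p_short, t_short=3.0, t_long=9.0, cv=0.2):
    """Expected reward per trial for a given target switch time.

    Assumes the realized switch time is Gaussian around `target` with
    standard deviation cv * target (scalar timing noise). A short trial
    pays off if the subject has not yet switched at t_short (switch
    time > t_short); a long trial pays off if it has switched by t_long
    (switch time < t_long).
    """
    sd = cv * target
    p_win_short = 1.0 - norm.cdf(t_short, loc=target, scale=sd)
    p_win_long = norm.cdf(t_long, loc=target, scale=sd)
    return p_short * p_win_short + (1.0 - p_short) * p_win_long

def optimal_target(p_short, grid=np.linspace(3.01, 8.99, 500)):
    """Grid-search the target switch time that maximizes expected gain."""
    gains = [expected_gain(t, p_short) for t in grid]
    return grid[int(np.argmax(gains))]

# As p(S) rises, the optimal criterion shifts toward later switches:
for p in (0.25, 0.50, 0.75):
    print(f"p(S) = {p}: optimal target switch time = {optimal_target(p):.2f} s")
```

On this account, a change in p(S) requires recomputing only the criterion; the stored representation of temporal uncertainty (here, the CV) is unchanged, which is why the adjustment can be a single step.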
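The learning-rate trade-off in the final point can likewise be illustrated. The sketch below applies a delta-rule ("hill-climbing") update to a binary short/long trial sequence with a mid-session step in p(S); it is a generic instance of such a model, with hypothetical parameter values, not a model fitted to the data.

```python
import numpy as np

rng = np.random.default_rng(0)

def delta_rule_estimate(trials, alpha):
    """Track p(S) with a delta-rule update: p_hat += alpha * (outcome - p_hat)."""
    p_hat, trace = 0.5, []
    for outcome in trials:              # 1 = short trial, 0 = long trial
        p_hat += alpha * (outcome - p_hat)
        trace.append(p_hat)
    return np.array(trace)

# p(S) steps from 0.25 to 0.75 halfway through the session.
trials = np.concatenate([rng.random(200) < 0.25,
                         rng.random(200) < 0.75]).astype(float)

slow = delta_rule_estimate(trials, alpha=0.02)   # smooth but sluggish
fast = delta_rule_estimate(trials, alpha=0.50)   # abrupt but noisy

# A low learning rate adjusts gradually after the step (many trials to
# reach the new value); a high one adjusts almost at once but then
# chases the trial-to-trial noise in the binary outcome sequence.
print("trials to cross 0.5 after the step:",
      int(np.argmax(slow[200:] > 0.5)) + 1, "(slow) vs",
      int(np.argmax(fast[200:] > 0.5)) + 1, "(fast)")
print("post-change estimate sd:",
      round(float(slow[260:].std()), 3), "(slow) vs",
      round(float(fast[260:].std()), 3), "(fast)")
```

The sketch shows the dilemma the abstract identifies: no single learning rate yields both a step-like adjustment and a stable post-change estimate, whereas the mice exhibit both.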