The forward type of optimality in active inference is closely related to the optimality introduced recently for the control of stochastic nonlinear problems with solenoidal or periodic motion, such as locomotion, in which "the stationary state-distribution of the optimally-controlled process" is approximated (Tassa et al., 2011). In short, optimal motion is determined by prior beliefs, which endow states with a particular value; however, value is a consequence, not a cause, of optimal behavior. The crucial point is that cost-to-go and surprise are the same thing. This ensures that maximizing the long-term average of value is the same as minimizing the entropy of sensory states, which is mandated by the free-energy principle and is equivalent to maximizing Bayesian model evidence. Both value and surprise are optimized by Bayesian inference, but neither depends on cost functions. We will see an example of cost-free optimality below.

In summary, the tenet of optimal control lies in the reduction of optimal motion to flow on a value function, like the downhill flow of water. Conversely, in active inference, flow is specified directly in terms of equations of motion that constitute prior beliefs, like patterns of wind flow. The essential difference is that prior beliefs can include solenoidal flow (e.g., atmospheric circulation, or the Coriolis effect) that cannot be specified with (scalar) value functions. Having said this, I do not want to overstate the shortcomings of optimal control in specifying limit-cycle or solenoidal motion; for example, there are compelling examples in the recent literature on simulated walking (Wang et al., 2009). These schemes employ simultaneous trajectory optimization, which uses an explicit representation of the trajectory (as opposed to sequential algorithms that represent only the action sequence) (Kameswaran and Biegler, 2006). This generalization replaces cost functions of a particular state with a cost function over trajectories.
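The claim that solenoidal flow cannot come from a scalar value function can be checked numerically: a flow of the form f = −∇V always has a symmetric Jacobian, whereas a purely rotational flow does not. A minimal sketch in Python (the two example fields and the helper below are illustrative, not taken from the cited work):

```python
import numpy as np

def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of a vector field f at point x."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

# Gradient flow on the value function V(x, y) = x^2 + y^2:
# f = -grad V = (-2x, -2y), i.e. downhill flow toward the origin.
grad_flow = lambda p: np.array([-2 * p[0], -2 * p[1]])

# Solenoidal flow (circular, divergence-free motion): f = (-y, x).
solenoidal = lambda p: np.array([-p[1], p[0]])

x0 = np.array([0.3, -0.7])
Jg = jacobian(grad_flow, x0)
Js = jacobian(solenoidal, x0)

print(np.allclose(Jg, Jg.T))  # → True: symmetric, a scalar potential exists
print(np.allclose(Js, Js.T))  # → False: asymmetric, no value function can generate it
```

The asymmetric part of the Jacobian is exactly the rotational component that a (scalar) value function cannot encode, which is why prior beliefs expressed directly as equations of motion are strictly more expressive here.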

Effectively, this converts the problem of optimizing a sequence of movements into optimizing a value function on a high-dimensional state space, whose coordinates are states at different times. A point in this space encodes a sequence or trajectory. However, this begs the question of how one would specify an itinerant sequence of sequences, without invoking even higher-dimensional representations of state space. This is accommodated easily in inference, in which prior beliefs about sequences of sequences are encoded directly by hierarchies of attractors or central pattern generators (Kiebel et al., 2008). Another generalization of optimal control is to consider value functions that change with time (Todorov and Jordan, 2002). Intuitively, this would be like guiding a donkey with a moving carrot (as opposed to placing the carrot at a fixed location and hoping the donkey finds it).
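The idea of a trajectory as a single point in a high-dimensional space can be made concrete with a toy sketch: every state of a discretized path becomes one coordinate of the optimization variable, and one cost function over that variable is minimized directly. The quadratic smoothness cost and all names below are hypothetical, chosen only to illustrate the simultaneous (as opposed to sequential) formulation:

```python
import numpy as np

T = 20                  # number of time steps in the discretized trajectory
start, goal = 0.0, 1.0  # fixed boundary states

# Decision variable: ALL interior states of the trajectory at once,
# i.e. one point in a (T-1)-dimensional space.
x = np.zeros(T - 1)

def cost(x):
    """Toy cost over the whole trajectory: penalize finite-difference velocity."""
    full = np.concatenate([[start], x, [goal]])
    return 0.5 * np.sum(np.diff(full) ** 2)

def grad(x):
    """Analytic gradient of the trajectory cost w.r.t. the interior states."""
    full = np.concatenate([[start], x, [goal]])
    return 2 * full[1:-1] - full[:-2] - full[2:]

# Plain gradient descent on the entire path simultaneously.
for _ in range(2000):
    x -= 0.25 * grad(x)

# The minimizer of this smoothness cost is the straight line between endpoints.
print(np.allclose(x, np.linspace(start, goal, T + 1)[1:-1], atol=1e-3))  # → True
```

A sequential method would instead optimize only a control sequence and roll the dynamics forward; here the cost is literally a function on the high-dimensional space whose coordinates are states at different times, which is what makes cost functions over trajectories a direct generalization of cost functions over states.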
