Many problems in artificial intelligence are set in non-deterministic domains, so we must reason under uncertainty. In other words:

  • We might not know exactly what state we start in, because our knowledge of the world is imperfect.
    • For instance, in Poker, we can’t see our opponents’ cards.
  • We might not know all of the effects of an action (i.e., we don’t know the state transitions).
    • The action might have a random component (like rolling a die).
    • We might not know all long-term effects (as in drug design).
    • We might not know the status of something the action depends on (like the condition of a road when we choose to drive down it).

So we want to maximise expected utility, i.e., to gamble rationally. We do this by assigning probabilities to states and utilities to the outcomes of actions, then choosing the action whose probability-weighted average utility is highest.
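
As a rough illustration of maximising expected utility, here is a minimal Python sketch. The action names, probabilities and utilities are made up for the road example above, and the helpers `expected_utility` and `best_action` are hypothetical names, not part of any library.

```python
from typing import Dict, List, Tuple

# Each action leads to a list of (probability, utility) outcome pairs.
Outcomes = List[Tuple[float, float]]


def expected_utility(outcomes: Outcomes) -> float:
    """Return the probability-weighted average utility of an action."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


def best_action(actions: Dict[str, Outcomes]) -> str:
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Illustrative numbers only (assumed, not from any real data):
# the main road is usually clear (utility 10) but sometimes blocked (-20);
# the back road is slower but always open (utility 3).
actions = {
    "main_road": [(0.8, 10.0), (0.2, -20.0)],
    "back_road": [(1.0, 3.0)],
}

choice = best_action(actions)
```

With these assumed numbers, the main road has expected utility 0.8 × 10 + 0.2 × (−20) = 4, which beats the back road's 3, so the rational gamble is the main road even though it carries a risk of a bad outcome.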