Compare historical actions to what the POMDP recommendation would have been.
hindcast_pomdp(transition, observation, reward, discount, obs, action,
  state_prior = rep(1, dim(observation)[[1]]) / dim(observation)[[1]],
  alpha = NULL, ...)
Argument | Description
---|---
transition | Transition matrix, dimension n_s x n_s x n_a (shapes illustrated in the sketch below this table)
observation | Observation matrix, dimension n_s x n_z x n_a
reward | Reward matrix, dimension n_s x n_a
discount | The discount factor
obs | A given sequence of observations
action | The corresponding sequence of actions
state_prior | Initial belief state; optional, defaults to uniform over states
alpha | The matrix of alpha vectors returned by sarsop()
... | Additional arguments, passed on to the underlying call
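For orientation, here is a minimal sketch of the array shapes these arguments expect, using made-up values for 2 states, 2 observations, and 2 actions (the fisheries example in the Examples section builds realistic versions of these objects):

## Toy shapes only, not a meaningful model
n_s <- 2; n_z <- 2; n_a <- 2
transition  <- array(1 / n_s, dim = c(n_s, n_s, n_a))  # P(s' | s, a); each row sums to 1
observation <- array(1 / n_z, dim = c(n_s, n_z, n_a))  # P(z | s', a); each row sums to 1
reward      <- matrix(0, n_s, n_a)                     # R(s, a)
discount    <- 0.95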
A list containing: a data frame with columns for time, obs, action, and optimal action; and an array containing the posterior belief distribution at each time t.
## Takes > 5s
## Use example code to generate matrices for pomdp problem:
source(system.file("examples/fisheries-ex.R", package = "sarsop"))
alpha <- sarsop(transition, observation, reward, discount, precision = 10)
sim <- hindcast_pomdp(transition, observation, reward, discount,
                      obs = rnorm(21, 15, .1), action = rep(1, 21),
                      alpha = alpha)
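The element and column names of the returned list are not spelled out above, so the comparison below is only a sketch: str() shows what is actually returned, and the commented line assumes a data frame element df with columns action and optimal.

str(sim)  # inspect the returned list: the data frame of time/obs/action/optimal action plus the belief array
## Assuming an element sim$df with columns action and optimal (check the str() output first):
# with(sim$df, mean(action == optimal, na.rm = TRUE))  # share of historical actions matching the POMDP recommendation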