Initializations
We rely primarily on the tidyverse and MDPtoolbox packages for our simulation. We use a set of additional packages for plotting, which can be found in ../R/load-libraries.R.
rm(list = ls(all.names = TRUE)) # clean environment
# data / MDP libraries
library(tidyverse)
library(MDPtoolbox)
source("../R/load-libraries.R")
Set MDP parameters
We first define vectors of possible actions, \(A\), and states, \(S\):
\[A \in [0,1]\] \[S \in [0,1.5] \]
# define action and state space
actions <- seq(0, 1, length = 100)    # vector of possible action values
states  <- seq(0, 1.5, length = 100)  # vector of possible state values
We also define the set of parameters for the MDP model:
- \(b\) = benefit parameter
- \(c\) = cost parameter
- \(\sigma\) = ecological stochasticity
- \(r\) = ecological change rate
- \(\gamma\) = discount factor
# set global mdp parameters
params <- list(
  benefit = 1.57,  # benefit param for MDP
  cost = 1,        # cost param for MDP
  sigma = 0.1,     # ecological change stochasticity
  r = 0.1,         # ecological change rate
  discount = 0.97, # myopic discount factor
  Tmax = 20        # max simulation time
)
Define state transition probability and utility functions
We assume a simple model of the farmer’s perceived utility \(u(s_t, a_t)\) as a function of the difference between the cost \(c_a\) associated with the diversification practice action \(a_t\) and the expected benefit \(b_s\) derived from the ecosystem state \(s_t\) at time \(t\), such that
\[u(s_t, a_t) = b_s s_t - c_a a_t\]
Agents’ initial ecosystem states were distributed normally around a mean of \(s = 0.5\). The ecosystem state is also dynamic, evolving according to the stochastic transition function \[s_{t+1} = s_t + r \left(a_t - s_t \right) + \epsilon, \quad \epsilon \sim N(0, \sigma),\] which defines the transition probability \(P(s_{t+1} \mid s_t, a_t)\).
#' Calculate transition probability based on current state and action.
#'
#' @param s The current state.
#' @param a The current action.
#' @param params Must be a list with element r.
#' @return The state transition probability.
transition_fn <- function(s, a, params) s + params$r * (a - s)

#' Calculate utility based on current state and action.
#'
#' @param s The current state.
#' @param a The current action.
#' @param params Must be a list with elements [benefit, cost].
#' @return The state utility.
utility_fn <- function(s, a, params) params$benefit * s - params$cost * a
Calculate transition and utility matrices
We can use the above equations to create both the transition and utility matrices.
#' Calculate transition probability matrix and utility matrix based on MDP parameters.
#'
#' @param states A vector representing the state space.
#' @param actions A vector representing the action space.
#' @param params Must be a list with elements [benefit, cost, sigma].
#' @param transition_fn A function(state, action, params) giving the transition
#'   probability to the next state.
#' @param utility_fn A function(state, action, params) giving the value of the next state.
#' @return A list containing the transition probability matrix \code{P} and utility
#'   matrix.
continuous_model <- function(states, actions, params, transition_fn, utility_fn){
  transition_matrix <- function(states, actions, f, params){
    n_s <- length(states)
    n_a <- length(actions)
    transition <- array(0, dim = c(n_s, n_s, n_a))
    for(i in 1:n_a){
      for (k in 1:n_s) {
        nextpop <- transition_fn(states[k], actions[i], params)
        if (nextpop <= 0) {
          x <- c(1, rep(0, n_s - 1))
        } else {
          x <- truncnorm::dtruncnorm(
            states, 0, max(states), nextpop, params$sigma) # assumes truncated normal error
          if(sum(x) <= 0){
            x <- c(1, rep(0, n_s - 1))
          } else {
            x <- x / sum(x)
          }
        }
        transition[k, , i] <- x
      }
    }
    if(any(is.na(transition))) stop("error creating transition matrix")
    transition
  }
  utility_matrix <- function(states, actions, utility_fn, params){
    utility <- array(dim = c(length(states), length(actions)))
    for(i in 1:length(states)){
      for(j in 1:length(actions)){
        utility[i,j] <- utility_fn(states[i], actions[j], params)
      }
    }
    utility
  }
  list(P = transition_matrix(states, actions, f, params),
       U = utility_matrix(states, actions, utility_fn, params))
}
Define and solve MDP for an infinite time horizon case
We can solve for the optimal action strategy, \(\pi^*\), which maps a given state \(s\) to an optimal action \(a\), using a recursive Bellman equation (value iteration; see Marescot et al., 2013 for a thorough introduction to these methods).
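For reference, value iteration repeatedly applies the Bellman update to the value function \(V\) until convergence, \[V(s) \leftarrow \max_{a} \left[\, u(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s') \,\right],\] and the optimal policy \(\pi^*(s)\) then selects, for each state, the action that attains this maximum.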
# compute P and U matrices for the MDP
model <- continuous_model(states, actions, params, transition_fn, utility_fn)
# solve the MDP to find the optimal decision policy
base_soln <- MDPtoolbox::mdp_value_iteration(model$P, model$U, params$discount)
## [1] "MDP Toolbox: iterations stopped, epsilon-optimal policy found"
base_soln_df <- tibble(state = states, action = actions[base_soln$policy])
Land tenure experiment: Calculating finite time horizon MDP solution
While the above solves for the optimal action strategy under a discounted infinite decision horizon, we can also find the optimal action strategy for a discounted finite time horizon. In our main text, we set this time horizon to 10 time steps for this analysis (Figure 5).
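The chunk that computes the finite-horizon solution (the short_tenure_soln object used below) is not echoed in this document. A minimal sketch of one way to obtain it, assuming MDPtoolbox::mdp_finite_horizon() is used as in the benefit-sweep functions further down, with the 10-step horizon named here as tenure_horizon for illustration:
# sketch only: the actual chunk is not shown in this document
# mdp_finite_horizon() returns a time-dependent policy (a states x N matrix),
# unlike the single policy vector returned by value iteration
tenure_horizon <- 10 # decision horizon used in the main text (Figure 5)
short_tenure_soln <- MDPtoolbox::mdp_finite_horizon(model$P, model$U,
                                                    params$discount,
                                                    N = tenure_horizon)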
Code for simulations
Below we provide the full code base for the simulations and corresponding main text figures. We rely on the infinite- and finite-horizon MDP methods described above.
Define main simulation function
#' Run DFS-MDP simulation for a single agent based on MDP simulation parameters.
#'
#' @param P Transition probability matrix, output from continuous_model().
#' @param U Utility matrix, output from continuous_model().
#' @param policy Optimal decision policy.
#' @param discount Myopic discount factor.
#' @param x0 A vector of initial states.
#' @param Tmax Max time for the simulation.
#' @return A dataframe with time, state, action, and value.
sim_mdp <- function(P, U, policy, discount, x0, Tmax){

  n_states <- dim(P)[1]
  state <- action <- value <- numeric(Tmax+1)
  state[1] <- x0
  tsteps <- 1:(Tmax+1)
  for(t in tsteps){
    # select action, determine value, transition to next state
    action[t] <- policy[state[t]]
    value[t] <- U[state[t], action[t]] * discount^(t-1)
    state[t+1] <- sample(1:n_states, 1, prob = P[state[t], , action[t]])
  }
  data.frame(time = 0:Tmax, state = state[tsteps],
             action = action[tsteps], value = value[tsteps])
}
Set global simulation parameters
reps <- 500 # number of replicate simulations to run
init <- truncnorm::rtruncnorm(reps, 0, 1, 0.5, 0.2) %>% # sample init s from norm dist
  map_int(function(x) which.min(abs(x - states)))       # map to values in states vector
Run base-case simulation
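The chunk under this heading is not echoed here, but the sim() wrapper and the sims_baseline object it produces are used throughout the rest of the code. Below is a minimal sketch of what such a wrapper could look like, assuming it simply replicates sim_mdp() over the sampled initial states; the names sim() and sims_baseline are taken from their later usage, and the body is an illustrative assumption rather than the exact implementation.
# illustrative sketch only (assumed implementation): replicate sim_mdp() across
# agents, one sampled initial state per replicate, and label each run with a reps id
sim <- function(soln, Tmax = params$Tmax, x0 = init, P = model$P, U = model$U) {
  purrr::map_dfr(seq_along(x0), function(i) {
    sim_mdp(P, U, soln$policy, params$discount, x0 = x0[i], Tmax = Tmax) %>%
      mutate(reps = i)
  })
}
sims_baseline <- sim(base_soln) # base-case runs used in the figures below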
Define timeseries and density plotting functions
These are just plotting functions used to generate plots of simulations below.
# timeseries plot of multiple repetitions
sim_plot_ts <- function(sims, title = ggtitle(NULL), tpos = "plot", ytxtoff = FALSE,
                        endcol = pal[1], annotate = FALSE, an_x = NULL, an_lab = NULL,
                        dnmarmod = 0, upmarmod = 0, lmarmod = 0, rmarmod = 0){
  df <- sims %>%
    select(-value) %>% # tidy
    mutate(state = states[state], action = actions[action]) # rescale
  Tmax <- max(sims$time)
  stcol <- col2rgb(endcol)
  stcol <- stcol/5
  stcol <- rgb(t(stcol), maxColorValue=255)
  ytitc <- "black"
  ytxtc <- "gray30"
  if (ytxtoff) {
    ytitc <- NA
    ytxtc <- NA
  }
  p <- df %>%
    ggplot(aes(time, state, group = reps, col = time)) +
    geom_path(alpha = 0.1, show.legend = FALSE) +
    title +
    labs(x="Decision cycle", y="Ecosystem service state") +
    scale_x_continuous(limits = c(0,Tmax), breaks = seq(0, Tmax, by=5),
                       expand = c(.01,.01)) +
    scale_y_continuous(limits = c(0,1.5), expand = c(.02,.02)) +
    scale_color_gradient(low=stcol, high=endcol) +
    theme(axis.text.x=element_text(size=10),
          axis.text.y=element_text(size=10, color = ytxtc),
          axis.title.x=element_text(size=10),
          axis.title.y=element_text(size=10, color = ytitc),
          plot.title = element_text(size = 10, face = "bold"),
          plot.title.position = tpos,
          panel.grid.minor = element_blank(),
          plot.margin=grid::unit(c(5+upmarmod,5+rmarmod,5+dnmarmod,5+lmarmod), "mm"))
  if(annotate) {
    p <- p +
      geom_vline(xintercept = an_x, linetype="dashed", color = "red3", size=.5) +
      annotate('label', x = an_x + 2, y = 1.3, label = an_lab, hjust = 0, vjust = .5,
               family = "Roboto", size = 3.25, label.padding = unit(.15, "lines"),
               label.size = 0, alpha = .6) +
      annotate("segment", x = an_x + 1.75, xend = an_x + .5, y = 1.3, yend = 1.3,
               size=.5, arrow=arrow(length = unit(0.22, "cm")))
  }
  p
}
# density plot showing initial ES distribution and final distribution
sim_plot_dens <- function(sims, title = ggtitle(NULL), tpos = "plot", endcol = pal[1],
                          lab_lo_peak = FALSE, lab_hi_peak = FALSE,
                          dnmarmod = 0, upmarmod = 0, lmarmod = 0, rmarmod = 0){
  df <- sims %>%
    mutate(state = states[state]) %>% # rescale
    select(state, time)
  Tmax <- max(sims$time)
  stcol <- col2rgb(endcol)
  stcol <- stcol/5
  stcol <- rgb(t(stcol), maxColorValue=255)
  p <- df %>% filter(time %in% c(0, Tmax)) %>%
    ggplot() + geom_density(aes(state, group = time, fill = time, color = time),
                            alpha=0.8) +
    coord_flip() +
    title +
    labs(x="", y="Density", fill="Time") +
    scale_x_continuous(limits = c(0,1.5), expand = c(.02,.02)) +
    scale_y_continuous(breaks = scales::pretty_breaks(n = 2)) +
    scale_fill_gradient(low=stcol, high=endcol, guide = guide_colorbar(barwidth = .5),
                        breaks=c(0, Tmax)) +
    scale_color_gradient(low=stcol, high=endcol, guide = NULL) +
    theme(axis.text.x=element_text(size=10),
          axis.text.y=element_blank(),
          axis.title.x=element_text(size=10),
          axis.title.y=element_text(size=10),
          legend.text = element_text(size=10),
          legend.title = element_text(size=10),
          legend.box.margin=margin(0,0,0,-5),
          plot.title = element_text(size = 10, face = "bold"),
          plot.title.position = tpos,
          panel.grid.minor = element_blank(),
          plot.margin=grid::unit(c(5+upmarmod,5+rmarmod,5+dnmarmod,5+lmarmod), "mm"))
  if(lab_lo_peak) {
    ymax <- ggplot_build(p)$layout$panel_scales_y[[1]]$range$range[2]
    peak_lo <- ggplot_build(p)$data[[1]] %>%
      filter(group == 2, x < .4) %>%
      arrange(desc(y)) %>%
      select(x,y) %>%
      filter(row_number()==1)
    # dotted line
    p <- p +
      annotate('segment', x = peak_lo$x, xend = peak_lo$x, y = peak_lo$y - .1 * ymax,
               yend = peak_lo$y + .1 * ymax, linetype="dotted")
    # text annotation (decide whether to place left or right of line)
    if(peak_lo$y > .7 * ymax) {
      p <- p +
        annotate('label', x = peak_lo$x, y = peak_lo$y - .32 * ymax,
                 label = sprintf('%.2f', peak_lo$x), hjust = .5, vjust = .5,
                 family = "Roboto Condensed", size = 3,
                 label.padding = unit(.15, "lines"), label.size = 0, alpha = .8)
    } else {
      p <- p +
        annotate('label', x = peak_lo$x, y = peak_lo$y + .26 * ymax,
                 label = sprintf('%.2f', peak_lo$x), hjust = .5, vjust = .5,
                 family = "Roboto Condensed", size = 3,
                 label.padding = unit(.15, "lines"), label.size = 0, alpha = .8)
    }
  }
  if(lab_hi_peak){
    ymax <- ggplot_build(p)$layout$panel_scales_y[[1]]$range$range[2]
    peak_hi <- ggplot_build(p)$data[[1]] %>%
      filter(group == 2, x > .6) %>%
      arrange(desc(y)) %>%
      select(x,y) %>%
      filter(row_number()==1)
    # dotted line
    p <- p +
      annotate('segment', x = peak_hi$x, xend = peak_hi$x, y = peak_hi$y - .1 * ymax,
               yend = peak_hi$y + .1 * ymax, linetype="dotted")
    # text annotation (decide whether to place left or right of line)
    if(peak_hi$y > .7 * ymax) {
      p <- p +
        annotate('label', x = peak_hi$x, y = peak_hi$y - .32 * ymax,
                 label = sprintf('%.2f', peak_hi$x), hjust = .5, vjust = .5,
                 family = "Roboto Condensed", size = 3,
                 label.padding = unit(.15, "lines"), label.size = 0, alpha = .8)
    } else {
      p <- p +
        annotate('label', x = peak_hi$x, y = peak_hi$y + .26 * ymax,
                 label = sprintf('%.2f', peak_hi$x), hjust = .5, vjust = .5,
                 family = "Roboto Condensed", size = 3,
                 label.padding = unit(.15, "lines"), label.size = 0, alpha = .8)
    }
  }
  p
}
Define base-case optimal decision strategy plotting function
# solution point plot with threshold annotated
soln_plot <- function(soln_df, tpt) {
  ggplot(soln_df, aes(state, action)) +
    geom_point(size = 1) +
    geom_vline(xintercept = tpt, linetype="dashed", color = "red3", size=.7) +
    annotate('label', x = tpt + .15, y = .375, label = "Tipping point",
             hjust = 0, vjust = .5, family = "Roboto", size = 3.25,
             label.padding = unit(.15, "lines"), label.size = 0, alpha = .65) +
    annotate("segment", x = tpt + .15, xend = tpt + .025, y = .375, yend = .375,
             size=.5, arrow=arrow(length = unit(0.22, "cm"))) +
    labs(x="Ecosystem service state", y="Optimal adoption") +
    scale_x_continuous(limits = c(0,NA), expand = c(.01,.01)) +
    scale_y_continuous(limits = c(0,1), expand = c(.01,.01),
                       labels = scales::percent_format(accuracy = 1)) +
    theme(axis.text.x=element_text(size=10),
          axis.text.y=element_text(size=10),
          axis.title.x=element_text(size=10),
          axis.title.y=element_text(size=10),
          panel.grid.minor = element_blank(),
          plot.margin=grid::unit(c(5,5,5,5), "mm"))
}
Generate optimal policy plot
state99th <- sims_baseline %>%
  mutate(state = states[state]) %>%
  pull(state) %>%
  quantile(.99) %>%
  unname()
tpt <- 0
for(si in 1:(length(states) - 1)) {
  if(base_soln_df$action[si] < .1 && base_soln_df$action[si+1] > .9) {
    tpt <- base_soln_df$state[si] + ((base_soln_df$state[2] - base_soln_df$state[1]) / 2)
  }
}
soln_plot(base_soln_df %>% filter(state <= state99th), tpt = tpt)
Optimal decision strategy \(\pi\) as a function of ES state, showing a tipping point at \(s \approx 0.48\). The upper \(x\) axis limit is the 99th percentile of observed states in our simulation results (\(s \approx 1.3\)).
Generate base-case simulation figure
Initial ecosystem states are distributed normally (mean = 0.5; S.D. = 0.2; truncated at [0,1]). Agents follow decision strategy \(\pi\) as shown in Fig \ref{fig:res_strategy} until \(t = 20\). (A) Ecosystem state of each agent over time (500 simulations). (B) Initial ES distribution (dark blue) and final bimodal distribution at \(t = 20\) (light blue).
ggarrange(
soln_plot(base_soln_df %>% filter(state <= state99th), tpt = tpt),
sim_plot_ts(sims_baseline, title = ggtitle("B. Timeseries"), rmarmod = -5),
sim_plot_dens(sims_baseline, title = ggtitle("C. State distributions"),
tpos = "panel", lab_lo_peak = TRUE, lab_hi_peak = TRUE, lmarmod = -5),
widths = c(3,2.75,2.2), nrow = 1, ncol = 3
)
Simulate short time horizon case
sims_short_tenure <- sim(short_tenure_soln)
Define density curve comparison plotting function
# plot formatting helper functions
# increase vertical spacing between legend keys
# written by @clauswilke
draw_key_polygon3 <- function(data, params, size) {
  lwd <- min(data$size, min(size) / 4)
  grid::rectGrob(
    width = grid::unit(0.3, "npc"),
    height = grid::unit(0.8, "npc"),
    gp = grid::gpar(
      col = data$colour,
      fill = alpha(data$fill, data$alpha),
      lty = data$linetype,
      lwd = lwd * .pt,
      linejoin = "mitre"
    ))
}
# register new key drawing function,
# the effect is global & persistent throughout the R session
GeomDensity$draw_key = draw_key_polygon3
# plot comparing final and initial density curves of two simulations
sim_plot_dens_comp <- function(sims1, sims2, sims_base = "init",
                               label1 = "Gp. A Final", label2 = "Gp. B Final",
                               label_base = "Initial", title = ggtitle(NULL),
                               tpos = "plot", cvec = c(pal[2], pal[1], pal[3]),
                               dnmarmod = 0, upmarmod = 0, lmarmod = 0, rmarmod = 0){
  if (sims_base == "init") {
    df_base <- sims1 %>%
      mutate(state = states[state]) %>% # rescale
      filter(time %in% 0) %>%
      select(state) %>%
      add_column(id = label_base)
  } else {
    Tmax_base <- max(sims_base$time)
    df_base <- sims_base %>%
      mutate(state = states[state]) %>% # rescale
      filter(time %in% Tmax_base) %>%
      select(state) %>%
      add_column(id = label_base)
  }
  Tmax1 <- max(sims1$time)
  df1 <- sims1 %>%
    mutate(state = states[state]) %>% # rescale
    filter(time %in% Tmax1) %>%
    select(state) %>%
    add_column(id = label1)
  Tmax2 <- max(sims2$time)
  df2 <- sims2 %>%
    mutate(state = states[state]) %>% # rescale
    filter(time %in% Tmax2) %>%
    select(state) %>%
    add_column(id = label2)
  rbind(df_base, df1, df2) %>% mutate(id = factor(id, unique(id))) %>%
    ggplot() + geom_density(aes(state, group = id, fill = id, color = id), alpha=0.4) +
    coord_flip() +
    title +
    labs(x="", y="Density") +
    scale_x_continuous(limits = c(0,1.5), expand = c(.02,.02)) +
    scale_y_continuous(breaks = scales::pretty_breaks(n = 2)) +
    scale_fill_manual(values = cvec) +
    scale_color_manual(values = cvec) +
    theme(axis.text.x=element_text(size=10),
          axis.text.y=element_blank(),
          axis.title.x=element_text(size=10),
          axis.title.y=element_text(size=10),
          legend.text=element_text(size=9),
          legend.title = element_blank(),
          legend.key.size = unit(12, "mm"),
          legend.spacing.x = unit(-3, 'mm'),
          legend.box.margin=margin(0,0,0,-18),
          plot.title = element_text(size = 10, face = "bold"),
          plot.title.position = tpos,
          panel.grid.minor = element_blank(),
          plot.margin=grid::unit(c(5+upmarmod,5+rmarmod,5+dnmarmod,5+lmarmod), "mm"))
}
Generate land tenure experiment plot
The simulation is identical to that in Fig \ref{fig:res_bimodal}, but the MDP is solved under a finite, 10-year time horizon. (A) Result of short land tenure on ES state over time. (B) Comparison of the final state distributions of short- vs. long-tenure model runs.
ggarrange(
sim_plot_ts(sims_baseline, title = ggtitle("A. Long-tenure"),
annotate = TRUE,
rmarmod = -10),
sim_plot_ts(sims_short_tenure,
title = ggtitle("B. Short-tenure"),
endcol = pal[3],
tpos = "panel", ytxtoff = T,rmarmod = -5, lmarmod = -5),
sim_plot_dens_comp(sims_baseline, sims_short_tenure,
label1 = "Long\ntenure\nfinal",
label2 = "Short\ntenure\nfinal",
title = ggtitle("C. State distribution"),
tpos = "panel", lmarmod = -5),
nrow = 1, ncol = 3, widths = c(1,1,1))
Incentive duration experiment
Incentive simulation parameters and setup
# set incentive simulation parameters
T_incent_1 <- 2
T_incent_2 <- 10
# calc modified cost during incentive = 1 - 2 / T_incent
C_incent_1 <- 1 - 2 / T_incent_1
C_incent_2 <- 1 - 2 / T_incent_2
# incentive simulation fxn
incent_sim <- function(params) {
  incent_model <- continuous_model(states, actions, params,
                                   transition_fn, utility_fn)
  incent_soln <- MDPtoolbox::mdp_value_iteration(incent_model$P,
                                                 incent_model$U, params$discount)
  # simulate incentive period
  start <- sim(incent_soln, Tmax = params$T_incent, x0 = init)
  xi <- start %>% filter(time == params$T_incent) %>% pull(state)
  # pad to Tmax with baseline decision rule
  rest <- sim(base_soln, Tmax = params$Tmax - params$T_incent, x0 = xi) %>%
    filter(time != 0) %>% mutate(time = time + params$T_incent)
  bind_rows(start, rest)
}
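Setting the per-cycle cost to \(C = 1 - 2/T_{\mathrm{incent}}\) makes the undiscounted total subsidy identical across scenarios: \[T_{\mathrm{incent}} \times (1 - C) = T_{\mathrm{incent}} \times \frac{2}{T_{\mathrm{incent}}} = 2\] full-cost units, i.e. two decision cycles' worth of complete cost offset, whether delivered abruptly (100% off for 2 cycles) or spread out (20% off for 10 cycles).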
Run incentives simulation
# run incentive simulation
sims_incent_abrupt <- incent_sim(list(benefit = params$benefit,
                                      cost = C_incent_1,
                                      sigma = params$sigma,
                                      r = params$r,
                                      discount = params$discount,
                                      T_incent = T_incent_1, Tmax = params$Tmax))
## [1] "MDP Toolbox: iterations stopped, epsilon-optimal policy found"
sims_incent_sust <- incent_sim(list(benefit = params$benefit,
                                    cost = C_incent_2,
                                    sigma = params$sigma,
                                    r = params$r,
                                    discount = params$discount,
                                    T_incent = T_incent_2, Tmax = params$Tmax))
## [1] "MDP Toolbox: iterations stopped, epsilon-optimal policy found"
Plot incentive simulation
Starting from the same initial states as Fig \ref{fig:res_bimodal}, ES state timeseries are shown for (A) a large, abrupt incentive (100\% of adoption expenses are covered for two years) vs. (B) a smaller, more sustained incentive (adoption cost is 80\% of baseline for 10 years). Ignoring discounting, both packages have the same total cost to the funder (the equivalent of 2 years’ worth of full adoption cost offsets). After the incentive period, agents adjust their decision rules to that of the base case (i.e. no incentive) until \(t = 20\). (C) Shows that the sustained incentive ultimately drove more DP adoption.
ggarrange(
sim_plot_ts(sims_incent_abrupt, title = ggtitle("A. Concentrated incentive"),
annotate = TRUE, an_x = T_incent_1,
an_lab = sprintf("%i cycles\n%i%% off", T_incent_1,
round((1-C_incent_1)*100)),
rmarmod = -10, endcol = pal[5]),
sim_plot_ts(sims_incent_sust, title = ggtitle("B. Sustained incentive"),
annotate = TRUE, an_x = T_incent_2,
an_lab = sprintf("%i cycles\n%i%% off", T_incent_2,
round((1-C_incent_2)*100)),
tpos = "panel", ytxtoff = T, endcol = pal[6], rmarmod = -5, lmarmod = -5),
sim_plot_dens_comp(sims_incent_abrupt, sims_incent_sust, sims_base = sims_baseline,
label1 = "Concentrated \nincentive", label2 = "Sustained\nincentive",
label_base = "No\nincentive",
title = ggtitle("C. Final distribution"), tpos = "panel",
cvec = c(pal[2], pal[5], pal[6]), lmarmod = -5, rmarmod = -5),
nrow = 1, ncol = 3, widths = c(1,1,1)
)
Temporal dynamics experiment
Define “benefit sweep” simulation function
sim_b_sweep <- function(scenarios, inf_fin, t_horiz = 2) {
  sim_b_sweep_scenario <- function(params) {
    model <- continuous_model(states, actions, params, transition_fn, utility_fn)
    if(inf_fin == "infinite") {
      soln <- MDPtoolbox::mdp_value_iteration(model$P, model$U, params$discount)
    } else if(inf_fin == "finite") {
      soln <- MDPtoolbox::mdp_finite_horizon(model$P, model$U, params$discount,
                                             N = t_horiz)
    } else stop('inf_fin must be "infinite" or "finite"')
    sim(soln, Tmax = params$Tmax) %>%
      filter(time == params$Tmax)
  }
  scenarios %>%
    purrr::transpose() %>%
    furrr::future_map_dfr(sim_b_sweep_scenario, .id = "id") %>%
    left_join(scenarios, by = "id") %>%
    mutate(state = states[state])
}
Parameterize and run “benefit-sweep” simulation scenarios
chans_scenarios <-
  tidyr::crossing(cbr = seq(.55, .7, length.out = 40),
                  cost = params$cost,
                  sigma = params$sigma,
                  r = params$r,
                  discount = params$discount,
                  Tmax = params$Tmax) %>%
  tibble::rownames_to_column("id") %>%
  mutate(benefit = cost / cbr)
chans_sims <- sim_b_sweep(chans_scenarios, "infinite")
## [1] "MDP Toolbox: iterations stopped, epsilon-optimal policy found"
## (message repeated once per scenario; 40 scenarios in total)
fast_r_scenarios <-
  tidyr::crossing(cbr = seq(.825, .975, length.out = 40),
                  cost = params$cost,
                  sigma = params$sigma,
                  r = 0.95,
                  discount = params$discount,
                  Tmax = params$Tmax) %>%
  tibble::rownames_to_column("id") %>%
  mutate(benefit = cost / cbr)
fast_r_sims <- sim_b_sweep(fast_r_scenarios, "infinite")
## [1] "MDP Toolbox: iterations stopped, epsilon-optimal policy found"
## (message repeated once per scenario; 40 scenarios in total)
myopic_d_scenarios <-
  tidyr::crossing(cbr = seq(.025, .175, length.out = 40),
                  cost = params$cost,
                  sigma = params$sigma,
                  r = params$r,
                  discount = params$discount,
                  Tmax = params$Tmax) %>%
  tibble::rownames_to_column("id") %>%
  mutate(benefit = cost / cbr)
myopic_d_sims <- sim_b_sweep(myopic_d_scenarios, "finite", t_horiz = 2)
# variant of sim_b_sweep() that reads the decision horizon (t_horiz) from each scenario
sim_b_sweep_t <- function(scenarios, inf_fin) {
  sim_b_sweep_scenario <- function(params) {
    model <- continuous_model(states, actions, params, transition_fn, utility_fn)
    if(inf_fin == "infinite") {
      soln <- MDPtoolbox::mdp_value_iteration(model$P, model$U, params$discount)
    } else if(inf_fin == "finite") {
      soln <- MDPtoolbox::mdp_finite_horizon(model$P, model$U, params$discount,
                                             N = params$t_horiz)
    } else stop('inf_fin must be "infinite" or "finite"')
    sim(soln, Tmax = params$Tmax) %>%
      filter(time == params$Tmax)
  }
  scenarios %>%
    purrr::transpose() %>%
    furrr::future_map_dfr(sim_b_sweep_scenario, .id = "id") %>%
    left_join(scenarios, by = "id") %>%
    mutate(state = states[state])
}
Define “benefit sweep” simulation helper and plotting functions
# find peaks fxn for benefit sweep experiment
b_sweep_peaks <- function(sims, smoothing = 1.5,
                          ythresh = 0.125, output_bizone = FALSE) {
  bimin = 0
  bimax = 0
  peaks <- tibble(cbr = numeric(), peak = numeric())
  for(this_cbr in unique(sims$cbr)) {
    dens <- density((sims %>% filter(cbr == this_cbr))$state, adjust = smoothing)
    dens <- tibble(x = dens$x, y = dens$y) %>% filter(y >= ythresh)
    ps <- dens$x[findPeaks(dens$y)]
    for(p in ps) {
      peaks <- add_row(peaks, cbr = this_cbr, peak = p)
    }
    if(output_bizone && length(ps) > 1) {
      if(bimin == 0) {
        bimin = this_cbr
      }
      bimax = this_cbr
    }
  }
  if(output_bizone) {
    c(bimin, bimax)
  } else {
    peaks
  }
}
# plot benefit sweep experiment results
plt_sim_b_sweep <- function(sims, title = ggtitle(NULL), ylab = "",
                            tpos = "plot", col = pal[1],
                            upmarmod = 0, lmarmod = 0, dnmarmod = 0, rmarmod = 0) {
  dkcol <- col2rgb(col)
  dkcol <- dkcol/5
  dkcol <- rgb(t(dkcol), maxColorValue=255)
  sims %>%
    ggplot(aes(x = state, group = cbr, color = cbr)) +
    geom_line(stat='density', size=.5, alpha=.4) +
    title +
    labs(x = "ES state", y = ylab, color = "c:b") +
    scale_x_continuous(limits = c(0, 1.5),
                       breaks = scales::pretty_breaks(n = 3), expand = c(.01,.01)) +
    scale_y_continuous(breaks = scales::pretty_breaks(n = 2), expand = c(.01,.01)) +
    scale_color_gradient(low = dkcol, high = col, guide = guide_colorbar(barwidth = .5),
                         breaks=c(min(sims$cbr), max(sims$cbr))) +
    theme(axis.text.x=element_text(size=10),
          axis.text.y=element_text(size=10),
          axis.title.x=element_text(size=10),
          axis.title.y=element_text(size=10),
          legend.text = element_text(size=10),
          legend.title = element_text(size=10),
          legend.box.margin=margin(0,0,0,-5),
          plot.title = element_text(size = 10, face = "bold"),
          plot.title.position = tpos,
          panel.grid.minor = element_blank(),
          plot.margin=grid::unit(c(5+upmarmod,5+rmarmod,5+dnmarmod,5+lmarmod), "mm"))
}
# plot benefit sweep experiment peaks by cbr
plt_b_sweep_peaks <- function(peaks, bizone = NULL, title = ggtitle(NULL), ylab = "",
                              tpos = "plot", col = pal[1],
                              upmarmod = 0, lmarmod = 0, dnmarmod = 0, rmarmod = 0) {
  dkcol <- col2rgb(col)
  dkcol <- dkcol/5
  dkcol <- rgb(t(dkcol), maxColorValue=255)
  p <- peaks %>%
    ggplot(aes(x = peak, y = cbr, color = cbr)) +
    geom_point(size=1) +
    title +
    labs(x = "state density peak(s)", y = ylab) +
    scale_x_continuous(limits = c(0, 1.5),
                       breaks = scales::pretty_breaks(n = 3), expand = c(.01,.01)) +
    scale_y_continuous(limits = c(min(peaks$cbr), max(peaks$cbr)),
                       breaks = scales::pretty_breaks(n = 3), expand = c(.01,.01)) +
    scale_color_gradient(low = dkcol, high = col, guide = NULL) +
    theme(axis.text.x=element_text(size=10),
          axis.text.y=element_text(size=10),
          axis.title.x=element_text(size=10),
          axis.title.y=element_text(size=10),
          plot.title = element_text(size = 10, face = "bold"),
          plot.title.position = tpos,
          panel.grid.minor = element_blank(),
          plot.margin=grid::unit(c(5+upmarmod,5+rmarmod,5+dnmarmod,5+lmarmod), "mm"))
  if(!is.null(bizone)) {
    p <- p +
      geom_hline(yintercept = bizone[1], linetype="dashed", color = "red3", size=.5) +
      geom_hline(yintercept = bizone[2], linetype="dashed", color = "red3", size=.5) +
      annotate('label', x = .55, y = mean(bizone), label = "Bimodal\nregion",
               hjust = .5, vjust = .5, family = "Roboto", size = 3,
               label.padding = unit(.15, "lines"), label.size = 0, alpha = .65)
  }
  p
}
Generate “benefit sweep” simulation plot
For three scenarios (coupled human/natural system, overly-myopic decision maker, and overly-fast ecological change), the cost/benefit ratio was varied incrementally over 40 values, indicated by color shade, across a \(c:b\) range of width 0.15 encompassing the transition between a "never invest" and an "always invest" policy. For each \(c:b\), 500 replicate simulations were conducted as in Fig \ref{fig:res_bimodal}. Upper plots show the distribution of ES state at \(t=20\) for each \(c:b\). Lower plots show density curve peak(s). (A) By coupling a forward-looking decision-maker and a slowly-adapting environment, complex dynamics like alternate stable states can emerge. However, with (B) a short-term decision strategy (solving the MDP over a 2-year time horizon), or (C) a fast ecological change rate (\(r = 0.95\)), no bimodality is observed.
ggarrange(
ggarrange(
plt_sim_b_sweep(chans_sims,
title = ggtitle("A. Forward-looking decision-maker
\n and adaptive ecological system"),
ylab = "Density", col = pal[1], rmarmod = -7, dnmarmod = -2.75) +
theme(legend.position="none"),
plt_b_sweep_peaks(b_sweep_peaks(chans_sims),
bizone = b_sweep_peaks(chans_sims, output_bizone = TRUE),
ylab = "Cost/benefit ratio", col = pal[1],
rmarmod = -7, upmarmod = -2.75),
nrow = 2, ncol = 1, heights = c(1.1,1), align = "v"
  ),
  ggarrange(
plt_sim_b_sweep(myopic_d_sims,
title = ggtitle("B. Short-term
\n decision strategy"),
tpos = "panel", col = pal[4],
lmarmod = -3.5, rmarmod = -3.5, dnmarmod = -2.75) +
theme(legend.position="none"),
plt_b_sweep_peaks(b_sweep_peaks(myopic_d_sims),
col = pal[4], lmarmod = -3.5, rmarmod = -3.5, upmarmod = -2.75),
nrow = 2, ncol = 1, heights = c(1.1,1), align = "v"
  ),
  ggarrange(
plt_sim_b_sweep(fast_r_sims,
title = ggtitle("C. fast
\n ecological change rate"),
tpos = "panel", col = pal[7], lmarmod = -7, dnmarmod = -2.75) +
theme(legend.position="none"),
plt_b_sweep_peaks(b_sweep_peaks(fast_r_sims),
col = pal[7], lmarmod = -7, upmarmod = -2.75),
nrow = 2, ncol = 1, heights = c(1.1,1), align = "v"
  ),
  nrow = 1, ncol = 3, widths = c(1,1,1)
)