kimfilter is an Rcpp implementation of the multivariate Kim filter, which combines the Kalman and Hamilton filters for state probability inference. The filter is designed for state space models and can handle missing values and exogenous data in the observation and state equations.
The Kalman filter is a method to compute the optimal estimator of the unobserved state vector in a time series, while the Hamilton filter is a Markov switching model designed to provide the probability of being in one of a set of potential states. The Kim filter combines these two filters for state space models.
The state space model is written as \[ Y_t = A_{s_t} + H_{s_t} \beta_t + B^o_{s_t} X^o_t + e_t, \quad e_t \sim N(0, R_{s_t}) \\ \beta_t = D_{s_t} + F_{s_t} \beta_{t-1} + B^s_{s_t} X^s_t + u_t, \quad u_t \sim N(0, Q_{s_t}) \] where the first equation is the observation equation and the second is the state equation. \(A_{s_t}\) is the observation intercept matrix, \(H_{s_t}\) is the observation matrix, \(X^o_t\) is optional exogenous data in the observation equation, \(e_t \sim N(0, R_{s_t})\) is the observation error, and \(R_{s_t}\) is the observation error-covariance matrix. \(\beta_t\) is the vector of state components, \(D_{s_t}\) is the state intercept matrix, \(F_{s_t}\) is the transition matrix, \(X^s_t\) is optional exogenous data in the state equation, \(u_t \sim N(0, Q_{s_t})\) is the state error, and \(Q_{s_t}\) is the state error-covariance matrix.
The subscript \(s_t\) denotes that some of the parameters in the matrices can be time varying, depending on the state \(s\) at time \(t\). The number of states \(S\) is typically limited to a small number, such as 2 or 3, for tractability, but the framework allows for more.
The transition probabilities are given by
\[ p = \begin{bmatrix} p_{11} & p_{21} & \ldots & p_{S1} \\ p_{12} & p_{22} & \ldots & p_{S2} \\ \vdots & \vdots & \ddots & \vdots \\ p_{1S} & p_{2S} & \ldots & p_{SS} \end{bmatrix} \]
where \(p_{ij} = Pr[s_t = j|s_{t-1} = i]\) with \(\sum_{j=1}^S p_{ij} = 1\) for all \(i\) (i.e. the columns sum to 1).
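As a concrete illustration, here is a minimal R sketch of a two-state transition matrix stored in this column convention, using the up ("u") and down ("d") regime labels that appear in the example below (the probability values are illustrative only):

```r
#Two-state transition matrix: columns index the previous state, rows the current state
#Pm[j, i] = Pr[s_t = j | s_{t-1} = i], so each column sums to 1
p_uu = 0.97   #Probability of staying in the "u" regime
p_dd = 0.84   #Probability of staying in the "d" regime
Pm = matrix(c(p_uu, 1 - p_uu,
              1 - p_dd, p_dd),
            nrow = 2, ncol = 2, dimnames = list(c("u", "d"), c("u", "d")))
colSums(Pm)   #Both columns sum to 1
```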
With the basic Kalman filter, the goal is to forecast the state vector \(\beta_t\) based on information up to time \(t-1\), which is denoted \(\beta_{t|t-1}\). In the Markov switching case, however, the goal is to form a forecast of \(\beta_t\) based on information up to time \(t-1\) that is also conditional on \(s_t\) taking the value \(j\) and \(s_{t-1}\) taking the value \(i\), denoted \(\beta_{t|t-1}^{(i,j)}\).
The Kim filter calculates \(S^2\) such forecasts at each time \(t\), so it is much more computationally intensive than the basic Kalman filter. The algorithm follows the same forward two-step procedure as the Kalman filter, where the "forwardness" of the filter means that it uses only data up to time \(t\) to make an inference on the unobserved components at time \(t\) and does not peek into the future to make that inference.
The first stage is the prediction stage, which makes predictions of the state components based on information up to time \(t-1\), conditional on each possible outcome of \(s_t\), denoted by \(j\), and of \(s_{t-1}\), denoted by \(i\). This stage is made up of four equations. The first is the prediction of the state components based on their time series properties
\[ \beta_{t|t-1}^{(i,j)} = D_j + F_j \beta_{t-1|t-1}^i + B^s_j X^s_t \] Second is the prediction of the covariance matrix of the state components
\[ P_{t|t-1}^{(i,j)} = F_j P_{t-1|t-1}^i F_j^{\prime} + Q_j \] Third is the prediction error of the time series of interest
\[ \eta_{t|t-1}^{(i,j)} = Y_t - (A_j + H_j \beta_{t|t-1}^{(i,j)} + B^o_j X^o_t) \] And finally, we have the variance of the prediction error
\[ f_{t|t-1}^{(i,j)} = H_j P_{t|t-1}^{(i,j)} H_j^{\prime} + R_j \]

## Updating Stage
The second stage is the updating stage, which updates those predictions using all information up to time \(t\). It consists of three equations. The first equation is the prediction of the state components based on the full information set
\[ \beta_{t|t}^{(i,j)} = \beta_{t|t-1}^{(i,j)} + K_t^{(i,j)} \eta_{t|t-1}^{(i,j)} \] where \(K_{t}^{(i,j)}\) is the Kalman gain, which determines the optimal weight to give new information when updating the estimate of \(\beta_t\), and is given by the second equation
\[ K_t^{(i,j)} = P_{t|t-1}^{(i,j)} H_j^{\prime} \left( f_{t|t-1}^{(i,j)} \right)^{-1} \] The last equation is the updating of the covariance matrix of the state components
\[ P_{t|t}^{(i,j)} = P_{t|t-1}^{(i,j)} - K_t^{(i,j)} H_j P_{t|t-1}^{(i,j)} \] The seven equations above make up the full Kalman filter routine for a given pair of state outcomes \(s_t = j\) and \(s_{t-1} = i\). If \(Y_t\) is missing for any observation, then
\[ \beta_{t|t}^{(i,j)} = \beta_{t|t-1}^{(i,j)} \\ P_{t|t}^{(i,j)} = P_{t|t-1}^{(i,j)} \\ K_t^{(i,j)} = 0 \\ f_{t|t-1}^{(i,j)} = \infty \] However, the key to the Kim filter is to collapse these terms into the best estimate for each value of \(s_t\), as follows
\[ \beta_{t|t}^j = \frac{\sum_{i=1}^S Pr[s_{t-1} = i, s_t = j|t] \beta_{t|t}^{(i,j)}}{Pr[s_t = j|t]} \]
and
\[ P_{t|t}^j = \frac{\sum_{i=1}^S Pr[s_{t-1} = i, s_t = j|t]\left( P_{t|t}^{(i,j)} + \left( \beta_{t|t}^j - \beta_{t|t}^{(i,j)} \right)\left( \beta_{t|t}^j - \beta_{t|t}^{(i,j)} \right)^{\prime} \right)}{Pr[s_t = j|t]} \] Note, however, that these collapsed terms involve approximations. This is because \(\beta_{t}\), conditional on information up to time \(t\), \(s_t\), and \(s_{t-1}\), is a mixture of normals. Nevertheless, the algorithm is still considered to make reasonable inferences about \(\beta_t\).
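To make the prediction, updating, and collapsing equations concrete, here is a minimal R sketch of a single time step for one \((i,j)\) pair in a toy model with a scalar state and a single observed series. All parameter values are illustrative, and the sketch is not tied to the kimfilter API; the package carries out these recursions internally in compiled code.

```r
#One (i, j) step of the Kim filter in a toy scalar model (illustrative values only)
Dj = 0.1; Fj = 0.9; Qj = 0.5   #State intercept, transition coefficient, state error variance for s_t = j
Aj = 0; Hj = 1; Rj = 0.2       #Observation intercept, loading, observation error variance for s_t = j
y_t = 0.6                      #Observation at time t
b_prev = 0.3; P_prev = 1.0     #beta_{t-1|t-1}^i and P_{t-1|t-1}^i carried over from time t-1

#Prediction stage
b_pred = Dj + Fj * b_prev           #beta_{t|t-1}^(i,j)
P_pred = Fj * P_prev * Fj + Qj      #P_{t|t-1}^(i,j)
eta    = y_t - (Aj + Hj * b_pred)   #prediction error
f      = Hj * P_pred * Hj + Rj      #prediction error variance

#Updating stage
K     = P_pred * Hj / f             #Kalman gain
b_upd = b_pred + K * eta            #beta_{t|t}^(i,j)
P_upd = P_pred - K * Hj * P_pred    #P_{t|t}^(i,j)

#Collapsing: with b_upd and P_upd computed for every i, the estimates for state j are the
#probability-weighted averages using Pr[s_{t-1} = i, s_t = j | t], divided by Pr[s_t = j | t]
```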
The Kim filter adds an additional stage to the Kalman filter. This stage makes inferences about the probability terms that show up in the above equations.
The first step in this stage is to calculate
\[ Pr[s_t = j, s_{t-1} = i|t-1] = Pr[s_t = j|s_{t-1} = i]Pr[s_{t-1} = i|t-1] \]
for all \(i,j\), where \(Pr[s_t = j|s_{t-1} = i]\) is the transition probability.
The second step defines the conditional density based on the prediction error decomposition. We first need the joint density of \(Y_t\), \(s_t\), and \(s_{t-1}\) given by
\[ f(Y_t, s_t = j, s_{t-1} = i|t-1) = f(Y_t|s_t = j, s_{t-1} = i, t-1)Pr[s_t = j, s_{t-1} = i|t-1] \] for all \(i,j\). The marginal density of \(Y_t\) is then
\[ f(Y_t|t-1) = \sum_{j=1}^S \sum_{i=1}^S f(Y_t|s_t = j, s_{t-1} = i, t-1) Pr[s_t = j, s_{t-1} = i|t-1] \]
where the conditional density of \(Y_t\) is defined as
\[ f(Y_t|s_{t-1} = i, s_t = j, t-1) = (2\pi)^{-\frac{N}{2}} |f_{t|t-1}^{(i,j)}|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} \eta_{t|t-1}^{(i,j)\prime} \left( f_{t|t-1}^{(i,j)} \right)^{-1} \eta_{t|t-1}^{(i,j)} \right) \]
for all \(i,j\), where \(N\) is the number of variables in \(Y_t\).
The third, and final, step is to update the probability terms by using
\[ Pr[s_{t-1} = i, s_t = j|t] = \frac{f(Y_t, s_t = j, s_{t-1} = i|t-1)}{f(Y_t|t-1)} \] for all \(i,j\) to get
\[ Pr[s_t = j|t] = \sum_{i=1}^S Pr[s_{t-1} = i, s_t = j|t] \]
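A minimal R sketch of this probability-updating stage for \(S = 2\) states and a single observed series, assuming the prediction errors and their variances have already been computed for each \((i,j)\) pair (all numbers are illustrative; kim_filter performs this internally, and the sum of the logs of the resulting marginal densities is the log likelihood used for estimation in the example below):

```r
#Hamilton filter stage for S = 2 states, scalar observation (illustrative values only)
S = 2
Pm = matrix(c(0.97, 0.03, 0.16, 0.84), nrow = 2)  #Pm[j, i] = Pr[s_t = j | s_{t-1} = i]
pr_prev = c(0.9, 0.1)                             #Pr[s_{t-1} = i | t-1]
eta = matrix(c(0.2, -0.5, 0.3, -0.4), nrow = 2)   #eta[i, j]: prediction errors
f   = matrix(c(0.5, 0.6, 0.55, 0.65), nrow = 2)   #f[i, j]: prediction error variances

#Step 1: joint probability of (s_{t-1} = i, s_t = j) given information up to t-1
pr_joint_pred = outer(1:S, 1:S, function(i, j){Pm[cbind(j, i)] * pr_prev[i]})

#Step 2: conditional densities of Y_t, the joint densities, and the marginal density
dens = (2 * pi)^(-1/2) * f^(-1/2) * exp(-0.5 * eta^2 / f)
joint_dens = dens * pr_joint_pred
marginal = sum(joint_dens)            #f(Y_t | t-1)

#Step 3: updated joint and marginal state probabilities
pr_joint = joint_dens / marginal      #Pr[s_{t-1} = i, s_t = j | t]
pr_curr = colSums(pr_joint)           #Pr[s_t = j | t]
```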
Once the filter has been applied to the data, a smoothing procedure can be applied in the backward direction to make a better inference of the state components based on the entire data set. Unlike the filter, the smoother does peek into the future to make an inference about the state components at time \(t\). For the basic Kalman filter, this procedure consists of only two equations.
The first equation updates the prediction of the state components based on all the available information
\[ \beta_{t|T}^{(j,k)} = \beta_{t|t}^j + P_{t|t}^j F_k^{\prime} \left( P_{t+1|t}^{(j,k)} \right)^{-1} \left( \beta_{t+1|T}^k - \beta_{t+1|t}^{(j,k)} \right) \]
The second equation updates the covariance matrix of the state components based on all the available information
\[ P_{t|T}^{(j,k)} = P_{t|t}^j + P_{t|t}^j F_k^{\prime} \left( P_{t+1|t}^{(j,k)} \right)^{-1} \left( P_{t+1|T}^k - P_{t+1|t}^{(j,k)} \right) \left( P_{t+1|t}^{(j,k)} \right)^{-1} F_k P_{t|t}^{j} \]
However, the Kim filter adds two more equations. The first is the joint probability of \(s_t = j\) and \(s_{t+1} = k\) based on the full information set
\[ Pr[s_t = j, s_{t+1} = k|T] = \frac{Pr[s_{t+1} = k|T]Pr[s_t = j|t]Pr[s_{t+1} = k|s_t = j]}{Pr[s_{t+1} = k|t]} \] and the second addition is
\[ Pr[s_t = j|T] = \sum_{k=1}^S Pr[s_t = j, s_{t+1} = k|T] \]
Finally, the Kim filter also requires the collapsing of terms in the smoothing algorithm
\[ \beta_{t|T}^j = \frac{\sum_{k=1}^S Pr[s_t = j, s_{t+1} = k|T]\beta_{t|T}^{(j,k)}}{Pr[s_t = j|T]} \]
and
\[ P_{t|T}^j = \frac{\sum_{k=1}^S Pr[s_t = j, s_{t+1} = k|T]\left( P_{t|T}^{(j,k)} + \left( \beta_{t|T}^j - \beta_{t|T}^{(j,k)} \right)\left( \beta_{t|T}^j - \beta_{t|T}^{(j,k)} \right)^{\prime} \right)}{Pr[s_t = j|T]} \]
The smoothed value of the state components is then given by
\[ \beta_{t|T} = \sum_{j = 1}^S Pr[s_t = j|T]\beta_{t|T}^j \]
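For completeness, a minimal R sketch of one backward smoothing step for a single \((j,k)\) pair in the scalar toy model used earlier; the filtered and predicted quantities are illustrative placeholders, and kim_filter with smooth = TRUE performs the full recursion:

```r
#One (j, k) step of the Kim smoother in a toy scalar model (illustrative values only)
Fk = 0.9                           #Transition coefficient for s_{t+1} = k
b_tt = 0.5;    P_tt = 0.8          #beta_{t|t}^j and P_{t|t}^j from the filter
b_pred = 0.55; P_pred = 1.1        #beta_{t+1|t}^(j,k) and P_{t+1|t}^(j,k) from the filter
b_next = 0.4;  P_next = 0.7        #beta_{t+1|T}^k and P_{t+1|T}^k from the previous backward step

#Smoothing equations for the (j, k) pair
Ptilde = P_tt * Fk / P_pred
b_jk = P_jk = NA
b_jk = b_tt + Ptilde * (b_next - b_pred)            #beta_{t|T}^(j,k)
P_jk = P_tt + Ptilde * (P_next - P_pred) * Ptilde   #P_{t|T}^(j,k)

#Collapsing: weight b_jk and P_jk by Pr[s_t = j, s_{t+1} = k | T], divide by Pr[s_t = j | T],
#then average the collapsed beta_{t|T}^j over j using Pr[s_t = j | T] to get beta_{t|T}
```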
library(kimfilter)
library(data.table)
library(maxLik)
library(ggplot2)
library(gridExtra)
data(sw_dcf)
data = sw_dcf[, colnames(sw_dcf) != "dcoinc", with = F]
vars = colnames(data)[colnames(data) != "date"]
#State space model for the Stock and Watson Markov Switching Dynamic Common Factor model
msdcf_ssm = function(par, yt, n_states = NULL){
  #Get the number of states
  n_states = length(unique(unlist(lapply(strsplit(names(par)[grepl("p_", names(par))], "p_"), function(x){substr(x[2], 1, 1)}))))

  #Get the parameters
  vars = dimnames(yt)[which(unlist(lapply(dimnames(yt), function(x){!is.null(x)})))][[1]]
  phi = par[grepl("phi", names(par))]
  names(phi) = gsub("phi", "", names(phi))
  gamma = par[grepl("gamma_", names(par))]
  names(gamma) = gsub("gamma_", "", names(gamma))
  psi = par[grepl("psi_", names(par))]
  names(psi) = gsub("psi_", "", names(psi))
  sig = par[grepl("sigma_", names(par))]
  names(sig) = gsub("sigma_", "", names(sig))
  mu = par[grepl("mu", names(par))]
  names(mu) = gsub("mu_", "", names(mu))
  pr = par[grepl("p_", names(par))]
  names(pr) = gsub("p_", "", names(pr))
  states = sort(unique(substr(names(pr), 1, 1)))

  #Steady state probabilities
  Pm = matrix(NA, nrow = n_states, ncol = n_states)
  rownames(Pm) = colnames(Pm) = unique(unlist(lapply(names(pr), function(x){strsplit(x, "")[[1]][2]})))
  for(j in names(pr)){
    Pm[strsplit(j, "")[[1]][2], strsplit(j, "")[[1]][1]] = pr[j]
  }
  for(j in 1:ncol(Pm)){
    Pm[which(is.na(Pm[, j])), j] = 1 - sum(Pm[, j], na.rm = TRUE)
  }

  #Build the transition matrix
  phi_dim = max(c(length(phi)), length(unique(sapply(strsplit(names(gamma), "\\."), function(x){x[2]}))))
  psi_dim = sapply(unique(sapply(strsplit(names(psi), "\\."), function(x){x[1]})), function(x){
    max(as.numeric(sapply(strsplit(names(psi)[grepl(paste0("^", x), names(psi))], "\\."), function(x){x[2]})))
  })
  Fm = matrix(0, nrow = phi_dim + length(psi), ncol = phi_dim + length(psi),
              dimnames = list(
                c(paste0("ct.", 0:(phi_dim - 1)),
                  unlist(lapply(names(psi_dim), function(x){paste0("e_", x, ".", 0:(psi_dim[x] - 1))}))),
                c(paste0("ct.", 1:phi_dim),
                  unlist(lapply(names(psi_dim), function(x){paste0("e_", x, ".", 1:psi_dim[x])})))
              ))
  Fm["ct.0", paste0("ct.", names(phi))] = phi
  for(i in 1:length(vars)){
    Fm[paste0("e_", i, ".0"),
       paste0("e_", names(psi)[grepl(paste0("^", i), names(psi))])] = psi[grepl(paste0("^", i), names(psi))]
  }
  diag(Fm[intersect(rownames(Fm), colnames(Fm)), intersect(rownames(Fm), colnames(Fm))]) = 1
  Fm = array(Fm, dim = c(nrow(Fm), ncol(Fm), n_states), dimnames = list(rownames(Fm), colnames(Fm), states))

  #Build the observation matrix
  Hm = matrix(0, nrow = nrow(yt), ncol = nrow(Fm), dimnames = list(rownames(yt), rownames(Fm)))
  for(i in 1:length(vars)){
    Hm[i, paste0("ct.", sapply(strsplit(names(gamma)[grepl(paste0("^", i), names(gamma))], "\\."), function(x){x[2]}))] =
      gamma[grepl(paste0("^", i), names(gamma))]
  }
  diag(Hm[, paste0("e_", 1:length(vars), ".0")]) = 1
  Hm = array(Hm, dim = c(nrow(Hm), ncol(Hm), n_states), dimnames = list(rownames(Hm), colnames(Hm), states))

  #Build the state covariance matrix
  #Set the dynamic common factor standard deviation to 1
  Qm = matrix(0, ncol = ncol(Fm), nrow = nrow(Fm), dimnames = list(rownames(Fm), rownames(Fm)))
  Qm["ct.0", "ct.0"] = 1
  for(i in 1:length(vars)){
    Qm[paste0("e_", i, ".0"), paste0("e_", i, ".0")] = sig[names(sig) == i]^2
  }
  Qm = array(Qm, dim = c(nrow(Qm), ncol(Qm), n_states), dimnames = list(rownames(Qm), colnames(Qm), states))

  #Build the observation equation covariance matrix
  Rm = matrix(0, ncol = nrow(Hm), nrow = nrow(Hm), dimnames = list(vars, vars))
  Rm = array(Rm, dim = c(nrow(Rm), ncol(Rm), n_states), dimnames = list(rownames(Rm), colnames(Rm), states))

  #State intercept matrix: the Markov switching mean matrix
  Dm = matrix(0, nrow = nrow(Fm), ncol = 1, dimnames = list(rownames(Fm), NULL))
  Dm = array(Dm, dim = c(nrow(Dm), 1, n_states), dimnames = list(rownames(Fm), NULL, states))
  for(i in names(mu)){
    Dm["ct.0", , i] = mu[i]
  }

  #Observation equation intercept matrix
  Am = matrix(0, nrow = nrow(Hm), ncol = 1)
  Am = array(Am, dim = c(nrow(Am), ncol(Am), n_states), dimnames = list(vars, NULL, states))

  #Initialize the filter for each state
  B0 = matrix(0, nrow(Fm), 1)
  P0 = diag(nrow(Fm))
  B0 = array(B0, dim = c(nrow(B0), ncol(B0), n_states), dimnames = list(rownames(Fm), NULL, states))
  P0 = array(P0, dim = c(nrow(P0), ncol(P0), n_states), dimnames = list(rownames(B0), colnames(B0), states))
  for(i in states){
    B0[,,i] = solve(diag(ncol(Fm)) - Fm[,,i]) %*% Dm[,,i]
    VecP_ll = solve(diag(prod(dim(Fm[,,i]))) - kronecker(Fm[,,i], Fm[,,i])) %*% matrix(as.vector(Qm[,,i]), ncol = 1)
    P0[,,i] = matrix(VecP_ll[, 1], ncol = ncol(Fm))
  }

  return(list(B0 = B0, P0 = P0, Am = Am, Dm = Dm, Hm = Hm, Fm = Fm, Qm = Qm, Rm = Rm, Pm = Pm))
}
#Log the data
data.log = copy(data)
data.log[, c(vars) := lapply(.SD, log), .SDcols = c(vars)]

#Difference the data
data.logd = copy(data.log)
data.logd[, c(vars) := lapply(.SD, function(x){
  x - shift(x, type = "lag", n = 1)
}), .SDcols = c(vars)]

#Center the data
data.logds = copy(data.logd)
data.logds[, c(vars) := lapply(.SD, scale, scale = FALSE), .SDcols = c(vars)]

#Transpose the data
yt = t(data.logds[, c(vars), with = F])
#Set the initial values
init = c(phi1 = 0.8760, phi2 = -0.2171,
         mu_u = 0.2802, mu_d = -1.5700,
         p_dd = 0.8406, p_uu = 0.9696,
         psi_1.1 = 0.0364, psi_1.2 = -0.0008,
         psi_2.1 = -0.2965, psi_2.2 = -0.0657,
         psi_3.1 = -0.3959, psi_3.2 = -0.1903,
         psi_4.1 = -0.2436, psi_4.2 = 0.1281,
         gamma_1.0 = 0.0058, gamma_1.1 = -0.0033,
         gamma_2.0 = 0.0011,
         gamma_3.0 = 0.0051, gamma_3.1 = -0.0033,
         gamma_4.0 = 0.0012, gamma_4.1 = -0.0005, gamma_4.2 = 0.0001, gamma_4.3 = 0.0002,
         sigma_1 = 0.0048, sigma_2 = 0.0057, sigma_3 = 0.0078, sigma_4 = 0.0013)
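Before running the optimization, it can be worth checking that the state space builds correctly from these initial values. A quick, optional sanity check (ssm0 is just an illustrative name; the element names and dimensions are those returned by msdcf_ssm above):

```r
#Optional sanity check of the state space built from the initial values
ssm0 = msdcf_ssm(init, yt)
names(ssm0)    #B0, P0, Am, Dm, Hm, Fm, Qm, Rm, Pm
dim(ssm0$Fm)   #state dimension x state dimension x number of states
dim(ssm0$Hm)   #number of series x state dimension x number of states
ssm0$Pm        #2 x 2 transition matrix whose columns sum to 1
```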
#Set the constraints
ineqA = matrix(0, nrow = 20, ncol = length(init), dimnames = list(NULL, names(init)))
#Stationarity constraints
ineqA[c(1, 2), c("phi1", "phi2")] = rbind(c(1, 1), c(-1, -1))
ineqA[c(3, 4), grepl("psi_1", colnames(ineqA))] = rbind(c(1, 1), c(-1, -1))
ineqA[c(5, 6), grepl("psi_2", colnames(ineqA))] = rbind(c(1, 1), c(-1, -1))
ineqA[c(7, 8), grepl("psi_3", colnames(ineqA))] = rbind(c(1, 1), c(-1, -1))
ineqA[c(9, 10), grepl("psi_4", colnames(ineqA))] = rbind(c(1, 1), c(-1, -1))
#Non-negativity constraints
diag(ineqA[c(11, 12, 13, 14), grepl("sigma_", colnames(ineqA))]) = 1
ineqA[c(15, 16), "p_dd"] = c(1, -1)
ineqA[c(17, 18), "p_uu"] = c(1, -1)
#Up/down states must be positive/negative
ineqA[19, "mu_u"] = 1
ineqA[20, "mu_d"] = -1
ineqB = matrix(c(rep(1, 10),
                 rep(0, 4),
                 c(0, 1),
                 c(0, 1),
                 rep(0, 2)), nrow = nrow(ineqA), ncol = 1)
#Define the objective function
objective = function(par, yt){
  ssm = msdcf_ssm(par, yt)
  return(kim_filter(ssm, yt, smooth = FALSE)$lnl)
}

#Solve the model
solve = maxLik(logLik = objective, start = init, method = "BFGS",
               finalHessian = FALSE, hess = NULL,
               control = list(printLevel = 2, iterlim = 10000),
               constraints = list(ineqA = ineqA, ineqB = ineqB),
               yt = yt)

#Get the estimated state space model
ssm = msdcf_ssm(solve$estimate, yt)
#Get the column means and standard deviations
M = matrix(unlist(data.logd[, lapply(.SD, mean, na.rm = TRUE), .SDcols = c(vars)]),
           ncol = 1, dimnames = list(vars, NULL))

#Get the steady state coefficient matrices
Pm = matrix(ss_prob(ssm[["Pm"]]), ncol = 1, dimnames = list(rownames(ssm[["Pm"]]), NULL))
Hm = Reduce("+", lapply(dimnames(ssm[["Hm"]])[[3]], function(x){
  Pm[x, ]*ssm[["Hm"]][,, x]
}))
Fm = Reduce("+", lapply(dimnames(ssm[["Fm"]])[[3]], function(x){
  Pm[x, ]*ssm[["Fm"]][,, x]
}))

#Final K_t is approximation to steady state K
filter = kim_filter(ssm, yt, smooth = TRUE)
K = filter$K_t[,, dim(filter$K_t)[3]]
W = solve(diag(nrow(K)) - (diag(nrow(K)) - K %*% Hm) %*% Fm) %*% K
d = (W %*% M)[1, 1]
#Get the intercept terms
gamma = Hm[, grepl("ct", colnames(Hm))]
D = M - gamma %*% matrix(rep(d, ncol(gamma)))

#Initialize first element of the dynamic common factor
Y1 = t(data.log[, c(vars), with = F][1, ])
initC = function(par){
  return(sum((Y1 - D - gamma %*% par)^2))
}
C10 = optim(par = Y1, fn = initC, method = "BFGS", control = list(trace = FALSE))$par[1]
Ctt = rep(C10, ncol(yt))

#Build the rest of the level of the dynamic common factor
ctt = filter$B_tt[which(rownames(Fm) == "ct.0"), ]
for(j in 2:length(Ctt)){
  Ctt[j] = ctt[j] + Ctt[j - 1] + c(d)
}
Ctt = data.table(date = data$date, dcf = Ctt, d.dcf = ctt)
prob = data.table(date = data$date, data.table(filter$Pr_tt))
colnames(prob) = c("date", paste0("pr_", dimnames(ssm$Dm)[[3]]))
uc = merge(Ctt, prob, by = "date", all = TRUE)
#Plot the outputs
g1 = ggplot(melt(data.log, id.vars = "date")[, "value" := scale(value), by = "variable"]) +
  ggtitle("Data", subtitle = "Log Levels (Rescaled)") +
  scale_y_continuous(name = "Value") +
  scale_x_date(name = "") +
  geom_line(aes(x = date, y = value, group = variable, color = variable)) +
  theme_minimal() + theme(legend.position = "bottom") + guides(color = guide_legend(title = NULL))

g2 = ggplot(melt(data.logds, id.vars = "date")) +
  ggtitle("Data", subtitle = "Log Differenced & Standardized") +
  scale_y_continuous(name = "Value") +
  scale_x_date(name = "") +
  geom_hline(yintercept = 0, color = "black") +
  geom_line(aes(x = date, y = value, group = variable, color = variable)) +
  theme_minimal() + theme(legend.position = "bottom") + guides(color = guide_legend(title = NULL))

toplot3 = melt(uc, id.vars = "date")
d_range1 = range(toplot3[variable == "dcf", ]$value, na.rm = TRUE)
p_range1 = range(toplot3[variable %in% colnames(uc)[grepl("pr_", colnames(uc))], ]$value, na.rm = TRUE)
toplot3[variable %in% colnames(uc)[grepl("pr_", colnames(uc))], "value" := (value - p_range1[1])/diff(p_range1) * diff(d_range1) + d_range1[1], by = "variable"]
g3 = ggplot() +
  ggtitle("Dynamic Common Factor", subtitle = "Levels") +
  scale_x_date(name = "") +
  geom_hline(yintercept = 0, color = "grey") +
  geom_line(data = toplot3[variable == "dcf", ],
            aes(x = date, y = value, group = variable, color = variable)) +
  theme_minimal() + theme(legend.position = "bottom") +
  guides(color = guide_legend(title = NULL), fill = guide_legend(title = NULL)) +
  scale_color_manual(values = "black") +
  scale_y_continuous(name = "Value", limits = range(toplot3[variable == "dcf", ]$value, na.rm = TRUE),
                     sec.axis = sec_axis(name = "Probability", ~((. - d_range1[1])/diff(d_range1) * diff(p_range1) + p_range1[1]) * 100)) +
  geom_ribbon(data = toplot3[variable %in% "pr_d", ],
              aes(x = date, ymin = d_range1[1], ymax = value, group = variable, fill = variable), alpha = 0.5) +
  scale_fill_manual(values = c("red", "green"))

toplot4 = melt(uc, id.vars = "date")
d_range2 = range(toplot4[variable %in% c("d.dcf"), ]$value, na.rm = TRUE)
p_range2 = range(toplot4[variable %in% colnames(uc)[grepl("pr_", colnames(uc))], ]$value, na.rm = TRUE)
toplot4[variable %in% colnames(uc)[grepl("pr_", colnames(uc))], "value" := (value - p_range2[1])/diff(p_range2) * diff(d_range2) + d_range2[1], by = "variable"]
g4 = ggplot() +
  ggtitle("Dynamic Common Factor", subtitle = "Differenced") +
  scale_x_date(name = "") +
  geom_hline(yintercept = 0, color = "grey") +
  geom_line(data = toplot4[variable %in% c("d.dcf"), ],
            aes(x = date, y = value, group = variable, color = variable)) +
  theme_minimal() + theme(legend.position = "bottom") +
  guides(color = guide_legend(title = NULL), fill = guide_legend(title = NULL)) +
  scale_color_manual(values = "black") +
  scale_y_continuous(name = "Value", limits = range(toplot4[variable %in% c("d.dcf"), ]$value, na.rm = TRUE),
                     sec.axis = sec_axis(name = "Probability", ~((. - d_range2[1])/diff(d_range2) * diff(p_range2) + p_range2[1]) * 100)) +
  geom_ribbon(data = toplot4[variable %in% "pr_d", ],
              aes(x = date, ymin = d_range2[1], ymax = value, group = variable, fill = variable), alpha = 0.5) +
  scale_fill_manual(values = c("red", "green"))

grid.arrange(g1, g2, g3, g4, layout_matrix = matrix(c(1, 3, 2, 4), nrow = 2))