
Preprocess
K+1 frames
- K frames : train Embedder, input [B(batch) * K, 2, C, W, H] (the 2 is the x/y pair)
  - x : frame [B * K, C, W, H]
  - y : landmark [B * K, C, W, H]
- 1 frame (frame t) : train Generator
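
The shapes above can be sketched roughly as follows. This is only an illustration in NumPy: the sizes are made up, random arrays stand in for real frames and landmark images, and taking the last of the K+1 frames as frame t is an assumption (the notes do not say which frame is held out).

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
B, K, C, W, H = 2, 8, 3, 16, 16

rng = np.random.default_rng(0)
# K+1 frames and landmark images for each sample in the batch.
frames = rng.random((B, K + 1, C, W, H), dtype=np.float32)
landmarks = rng.random((B, K + 1, C, W, H), dtype=np.float32)

# First K frames go to the Embedder, flattened to B * K.
x = frames[:, :K].reshape(B * K, C, W, H)
y = landmarks[:, :K].reshape(B * K, C, W, H)
embedder_input = np.stack([x, y], axis=1)  # [B * K, 2, C, W, H]

# The remaining frame is frame t, used for the Generator (assumed: the last one).
x_t = frames[:, K]      # [B, C, W, H]
y_t = landmarks[:, K]   # [B, C, W, H]

print(embedder_input.shape, x_t.shape, y_t.shape)
```

Flattening with `reshape(B * K, ...)` keeps the K frames of each sample contiguous, so they can be regrouped later with `reshape(B, K, ...)`.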

e_hat = E(x, y)
x_hat = G(y_t, e_hat)
r_x_hat = D(x_hat, y_t, i)
r_x = D(x_t, y_t, i)
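
A shape-level sketch of the four calls above, in NumPy. The bodies of `E`, `G` and `D` here are placeholders invented for illustration, not the real networks, and `N` (embedding width) is a hypothetical name. Averaging the per-frame embeddings over K to get one `e_hat` per sample is an assumption; the notes do not state how the B * K outputs of E are reduced.

```python
import numpy as np

B, K, C, W, H, N = 2, 4, 3, 16, 16, 32  # hypothetical sizes; N = embedding width
rng = np.random.default_rng(0)

def E(x, y):
    # Placeholder Embedder: one N-vector per (frame, landmark) pair.
    pooled = np.concatenate([x, y], axis=1).mean(axis=(2, 3))   # [B * K, 2C]
    return np.tile(pooled.mean(axis=1, keepdims=True), (1, N))  # [B * K, N]

def G(y_t, e_hat):
    # Placeholder Generator: landmark image shifted by the embedding mean.
    return y_t + e_hat.mean(axis=1)[:, None, None, None]        # [B, C, W, H]

def D(img, y_t, i):
    # Placeholder Discriminator: one realism score per sample; i = video index.
    return (img * y_t).mean(axis=(1, 2, 3)) + 0.0 * i           # [B]

x = rng.random((B * K, C, W, H))
y = rng.random((B * K, C, W, H))
x_t = rng.random((B, C, W, H))
y_t = rng.random((B, C, W, H))
i = np.arange(B)

e_frames = E(x, y)                              # [B * K, N]
e_hat = e_frames.reshape(B, K, N).mean(axis=1)  # [B, N], mean over K (assumption)
x_hat = G(y_t, e_hat)                           # [B, C, W, H]
r_x_hat = D(x_hat, y_t, i)                      # [B]
r_x = D(x_t, y_t, i)                            # [B]

print(e_hat.shape, x_hat.shape, r_x_hat.shape, r_x.shape)
```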

LossEG = Lcnt(content) + Ladv(adversarial) + Lmch(match)
loss_cnt(x_t, x_hat) : difference between features
loss_adv(r_x_hat) = -r_x_hat.mean() + Lfm(feature matching)
loss_mch(e_hat, W_i)
loss_D(r_x, r_x_hat) = (relu(1+r_x_hat) + relu(1-r_x)).mean()
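
The two score-based formulas above can be checked with a small NumPy sketch. `relu` is defined locally, `loss_adv_score` is a hypothetical name covering only the `-r_x_hat.mean()` part of loss_adv (Lfm is left out because the notes do not define it), and the score values are made up.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def loss_D(r_x, r_x_hat):
    # Hinge loss as written in the notes: penalize fake scores above -1
    # and real scores below +1.
    return (relu(1.0 + r_x_hat) + relu(1.0 - r_x)).mean()

def loss_adv_score(r_x_hat):
    # Score term of loss_adv only; the Lfm term is omitted here.
    return -r_x_hat.mean()

# Made-up discriminator scores for a batch of two.
r_x = np.array([2.0, 0.5])       # scores for real frames x_t
r_x_hat = np.array([-2.0, 0.0])  # scores for generated frames x_hat

print(loss_D(r_x, r_x_hat))     # 0.75
print(loss_adv_score(r_x_hat))  # 1.0
```

The first pair (real 2.0, fake -2.0) is outside both margins and contributes nothing to loss_D; the second pair contributes 0.5 + 1.0, so the batch mean is 0.75.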






