Do these lines correspond to fc7 and fc8 in Table 8 of the thesis?
https://github.com/lmb-freiburg/Multimodal-Future-Prediction/blob/master/encoder.py#L92-L97
My second question: in Steps 2 and 3 below, should I use the NLL loss during training?
Step 1. Train the sampling network (fitting network frozen)
Step 2. Train the fitting network (sampling network frozen)
Step 3. Train all
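To make the schedule concrete, here is a rough sketch of how I understand the three steps, written as a plain lookup of which part is trainable in each step. The group names `sampling` and `fitting` and the helper `trainable_flags` are my own placeholders, not identifiers from the repository.

```python
# Hypothetical group names; these are not taken from the repository.
GROUPS = ("sampling", "fitting")

# Which groups receive gradient updates in each step.
STAGES = {
    1: {"sampling"},             # Step 1: fitting network frozen
    2: {"fitting"},              # Step 2: sampling network frozen
    3: {"sampling", "fitting"},  # Step 3: train all
}


def trainable_flags(stage):
    """Return {group: is_trainable} for the given step number."""
    active = STAGES[stage]
    return {group: group in active for group in GROUPS}


for stage in (1, 2, 3):
    print(stage, trainable_flags(stage))
```

In a real training script these flags would decide which variables are passed to the optimizer in each step.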
My third question: how many iterations do you recommend for Steps 2 and 3, given that I trained Step 1 for 30,000*5 iterations?
My fourth question: when I trained the fitting network, the loss sometimes went negative. Did you see the same? Do you recommend the tanh activation function (which seems to be in the code but not in the paper)?
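For context on the negative loss: as far as I understand, the NLL of a continuous density has no lower bound at zero, because the density can exceed 1 when the predicted scale is small. Below is a toy check with a 1-D Gaussian. It is my own example and an assumption about the cause, not the loss used in the repository.

```python
import math


def gaussian_nll(x, mu, sigma):
    """Negative log-likelihood of x under a 1-D Gaussian N(mu, sigma^2)."""
    log_norm = 0.5 * math.log(2.0 * math.pi * sigma ** 2)
    sq_error = (x - mu) ** 2 / (2.0 * sigma ** 2)
    return log_norm + sq_error


# Evaluate at the mean so only the normalization term remains.
wide = gaussian_nll(0.0, 0.0, 1.0)    # density at the mean is below 1
narrow = gaussian_nll(0.0, 0.0, 0.1)  # density at the mean is above 1

print("sigma=1.0:", wide)
print("sigma=0.1:", narrow)
```

With `sigma=1.0` the value is positive, and with `sigma=0.1` it is negative, so a negative loss on its own might just mean the predicted distributions are narrow.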