
DHP

This repository provides the database, code, and result visualizations for reproducing all results reported in the paper:

Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach, published in IEEE Transactions on Pattern Analysis and Machine Intelligence in 2018, by the MC2 Lab @ Beihang University.

Specifically, this repository includes guidelines to:

  • download the PVS-HM database;
  • set up an environment to run our code;
  • train and test offline-DHP, or load our trained model;
  • visualize the results.

If you find our paper, database or code useful for your research, please cite:

@article{xu2018predicting,
  title={Predicting head movement in panoramic video: A deep reinforcement learning approach},
  author={Xu, Mai and Song, Yuhang and Wang, Jianyi and Qiao, MingLang and Huo, Liangyu and Wang, Zulin},
  journal={IEEE transactions on pattern analysis and machine intelligence},
  year={2018},
  publisher={IEEE}
}

Download PVS-HMEM database

Our PVS-HMEM database (Panoramic Video Sequences with Head Movement & Eye Movement) contains both Head Movement (HM) and Eye Movement (EM) data of 58 subjects on 76 panoramic videos.

  • Blue dots represent the Head Movement.
  • Translucent blue circles represent the FoV.
  • Red dots represent the Eye Movement.

Download our PVS-HM database from DropBox (Click to view). Please feel free to contact us by clicking here so that we can give you access permission to the file. Then extract it with:

tar -xzvf PVS-HM.tar.gz

Note that it contains all MP4 files of our database, along with the HM scanpath data FULLdata_per_video_frame.mat.

Note that the EM data is not provided here, since it is beyond the scope of this work; however, we release the EM data in this repo to facilitate the community.

For more details on the FULLdata_per_video_frame.mat file, refer to the section "Details of the mat data file" below (note that you do not need to read the details of the mat file if you just want to run our code and reproduce the numbers).

Setup an environment to run our code

If you are not familiar with the steps in this section, refer to my personal basic setup for some guidelines, or simply Google it.

Requirements:

  • Install go-vncdriver from OpenAI;
  • Install FFmpeg with sudo apt-get install ffmpeg;
  • Install OpenCV 2.4; you can use the script here.

Install Anaconda according to the guidelines on its official site, then install the other requirements with the following commands:

sudo apt-get install tmux

source ~/.bashrc

# create env
conda create -n dhp_env python=2.7

# activate env
source activate dhp_env

# install packages
pip install gym tensorflow universe
conda install matplotlib
pip install opencv-contrib-python
conda install -c conda-forge imageio

# clone project
git clone https://github.com/YuhangSong/DHP-TensorFlow.git

# make remap executable
cd DHP-TensorFlow
chmod +x ./remap

Now you should run ./remap to make sure remap is executable. It will print its usage information as follows:

./remap [-i input] [-o output] [-f filter] [-m m] [-n n] [-w w] [-h h] [-t tf] [-y] src dst
	-i ... Input  file type: cube, rect, eqar, merc                   [rect]
	-o ... Output file type: cube, rect, eqar, merc, view             [rect]
	-f ... Filter type: nearest, linear, bicubic                   [bicubic]
	-m ... Input  height list                                          [500]
	-b ... Input  width                                                 [2m]
	-n ... Output height                                               [500]
	-v ... Output width                                                 [2n]
	-w ... Viewport width                                              [200]
	-h ... Viewport height                                             [200]
	-x ... Viewport fov x in degree                                     [90]
	-y ... Viewport fov y in degree                                     [90]
	-p ... Viewport center position phi (latitude)(degrees)              [0]
	-l ... Viewport center position tht (longitude)(degrees)             [0]
	-t ... Tracking data file                                         [none]
	-y ... Blend data together (only works with orec, etc ...)         [off]
	-z ... Number of frames                                            [MAX]
	-s ... Number of the start frame                                     [0]

If it is not working, check the corresponding issue.

  • Note that remap is provided by mattcyu1; we thank him a lot for his contribution to the community.
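
As a rough illustration of how the remap tool shown above can be driven from Python (vr_player.py invokes it as an external binary), here is a minimal sketch; the flag values, file names, and the use of subprocess are only examples based on the usage text above, not the exact call used in the repository:

import subprocess

# Cut a 200x200 viewport (90-degree horizontal FoV, centered at the front) out
# of an equirectangular YUV video and write the result to another YUV file.
# All file names and flag values here are illustrative.
cmd = [
    './remap',
    '-i', 'rect',    # input projection: equirectangular
    '-o', 'view',    # output: a single viewport
    '-m', '1024',    # input height (input width defaults to 2*m)
    '-w', '200',     # viewport width
    '-h', '200',     # viewport height
    '-x', '90',      # viewport horizontal FoV in degrees
    '-p', '0',       # viewport center latitude (degrees)
    '-l', '0',       # viewport center longitude (degrees)
    'input.yuv',
    'viewport.yuv',
]
subprocess.check_call(cmd)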

Run our code

Please make sure you have:

  • More than 64 GB of RAM.
  • More than 600 GB of free space on the disk where you store the PVS-HM database.

Offline-DHP

This section describes the procedures to train and test offline-DHP.

Prepare

Set the database_path in config.py to your database folder.

Generate YUV files. Set mode = 'data_processor' and data_processor_id = 'mp4_to_yuv' in config.py and run:

source ~/.bashrc
source activate dhp_env
python train.py

The converted YUV files take about 600 GB. The reason we have to use YUV files is that the remap binary, which extracts the FoV from a 360° frame, takes YUV as input and produces YUV as output. We have developed a Python version of remap, but it turned out to be more than 5 times slower than simply reading and writing YUV files on disk. We are investigating whether remap is essential for reproducing our results.
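
For reference, the config.py edits for this preparation stage look roughly like the following; this is only a sketch based on the variable names used in this README (database_path, mode, data_processor_id), and the actual file may organize them differently:

# config.py (illustrative excerpt)
database_path = '/path/to/PVS-HM'    # folder holding the extracted database (path is an example)

# Step 1: convert the MP4 videos to YUV
mode = 'data_processor'
data_processor_id = 'mp4_to_yuv'

# Step 2 (set afterwards, then rerun train.py): generate ground-truth heatmaps
# data_processor_id = 'generate_groundtruth_heatmaps'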

Note that train.py is a script that starts multiple processes managed by tmux. Thus, after running train.py, you can use tmux attach-session -t a3c to see how each process is going. More about tmux can be found here.

Warning: after you run train.py, it starts a tmux session in which the program runs. If you are using mode = 'data_processor', please make sure that each window exits normally and that your task completes without errors by navigating to each window of the tmux session.

Generate groundtruth heatmaps. Set mode = 'data_processor' and data_processor_id = 'generate_groundtruth_heatmaps' in config.py and run:

source ~/.bashrc
source activate dhp_env
python train.py

Train

Set mode = 'off_line', procedure = 'train' and if_log_results = False in config.py, then run the following:

source ~/.bashrc
source activate dhp_env
python train.py

During the first few episodes, you may find that the CPU usage is extremely low; this is because the sub-processes are competing for the remap function, which exchanges data with the disk. Later on, the CPU usage will increase.

Note that we trained for number_trained_steps = 1.113 * 10^6 steps to produce the results in the paper; we later found that training far longer (10 times as many steps) may make the agent converge to the FCB.

Test

Note that the model is stored and restored automatically. Thus, as long as you have not changed the log_dir in config.py, the previously trained model will be restored. Set mode = 'off_line', procedure = 'test' and if_log_results = True in config.py, then run the following:

source ~/.bashrc
source activate dhp_env
python train.py

The code will generate and store predicted_heatmaps, predicted_scanpath and the CC values.
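
For reference, the CC reported here is the linear correlation coefficient between a predicted heatmap and the corresponding ground-truth heatmap. A minimal NumPy sketch of such a computation (not necessarily identical to the implementation in this repository):

import numpy as np

def cc(predicted, groundtruth):
    # Linear correlation coefficient between two heatmaps of equal shape.
    p = predicted.astype(np.float64).ravel()
    g = groundtruth.astype(np.float64).ravel()
    p = (p - p.mean()) / (p.std() + 1e-12)
    g = (g - g.mean()) / (g.std() + 1e-12)
    return float(np.mean(p * g))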

If you are seeing

Starting training at step=<your-previous-global-step>

then the model is restored successfully. If you are seeing

Starting training at step=0

then the model has not been restored successfully; refer to the corresponding issue.

For results under more evaluation protocols, you may want to generate and store groundtruth_scanpaths with mode = 'data_processor' and data_processor_id = 'generate_groundtruth_scanpaths'.

Load our trained model

To load our trained model, download our model from DropBox (click to view), extract it to the path ../results/, and set log_dir = "../results/reproduce_17". As mentioned above, the model in log_dir will be loaded automatically.

Visualize training from TensorBoard

The code logs multiple curves to help analyze the training process. Type:

tensorboard --logdir <PATH>

where <PATH> is the log_dir in config.py.

Some hints on using the code.

  • mode = 'data_processor' is an efficient way to process data under our tmux manager; the code is in env_li.py.
  • Some TensorFlow features we use will be deprecated in a future version; we use tf.__version__ = 1.6.0 to run our code.
  • Reinforcement-learning-based methods are inherently stochastic, so we cannot guarantee producing exactly the same numbers as those reported in our DHP paper. However, over multiple runs we are confident that you will see consistent results.

Summarize your results after testing

After you have tested the model (setting if_log_results=True), you can run

python summary.py

to summarize the results. It will show results like:

WaitingForLove|cc|0.7042685046323598
SpaceWar|cc|0.5894272979827889
KingKong|cc|0.540897356844711
SpaceWar2|cc|0.6233880089121158
Guitar|cc|0.6216974696171442
BTSRun|cc|0.5312920575202599
CMLauncher2|cc|0.6347282964835593
Symphony|cc|0.669106020987788
RioOlympics|cc|0.7695397332776495
Dancing|cc|0.6590623821187533
StarryPolar|cc|0.6511584747528362
InsideCar|cc|0.7628155513781555
Sunset|cc|0.7349986526376743
Waterfall|cc|0.6883913118465134
BlueWorld|cc|0.7158131732815242
Avg|0.6597722861515888

which should reproduce the numbers reported in the paper. If you meet any problems reproducing the numbers, please do not hesitate to contact us; your feedback on environment and parameter settings will be much appreciated, since we are trying to provide a solid proposal to the community.
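
If you want to post-process this output yourself (for example, to recompute the average CC), note that each per-video line follows a simple name|cc|value format. A small sketch, assuming the output has been saved to a text file named summary_output.txt (a hypothetical name):

# Parse "VideoName|cc|value" lines and recompute the average CC.
values = {}
with open('summary_output.txt') as f:
    for line in f:
        parts = line.strip().split('|')
        if len(parts) == 3 and parts[1] == 'cc':
            values[parts[0]] = float(parts[2])

print('Avg|{}'.format(sum(values.values()) / len(values)))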

Kill the session

Run python kill.py to kill the session.

Meet some issues?

Please do not hesitate to open an issue. We do not encourage you to contact us directly; opening an issue is the best way to raise your questions.

Some known issues & fixes are:

Restore model failed.

Navigate to w-0 in tmux to see if this worker is working properly, because this worker is responsible for restoring the model from disk, while the other workers simply synchronize with it asynchronously. Then check <log_dir>/train/checkpoint; it should look like:

model_checkpoint_path: "model.ckpt-5362890"
all_model_checkpoint_paths: "model.ckpt-5359910"
all_model_checkpoint_paths: "model.ckpt-5360655"
all_model_checkpoint_paths: "model.ckpt-5361444"
all_model_checkpoint_paths: "model.ckpt-5362210"
all_model_checkpoint_paths: "model.ckpt-5362890"

model_checkpoint_path points to the latest checkpoint, and all_model_checkpoint_paths lists all available checkpoints. Please make sure that what is listed here matches the files in <log_dir>/train/.

A likely reason for a restore failure is that the most recent ckpt file was not completely written when you killed the program, yet it is still listed as available (and to be restored) in the checkpoint file. Thus, you can simply remove the corresponding ckpt file and edit the checkpoint file accordingly.

For example, in the above case, change <log_dir>/train/checkpoint to:

model_checkpoint_path: "model.ckpt-5362210"
all_model_checkpoint_paths: "model.ckpt-5359910"
all_model_checkpoint_paths: "model.ckpt-5360655"
all_model_checkpoint_paths: "model.ckpt-5361444"
all_model_checkpoint_paths: "model.ckpt-5362210"

and deleting <log_dir>/train/model.ckpt-5362890 will remove the most recent ckpt at step 5362890, so that the ckpt at step 5362210 is restored instead.
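
You can also inspect and rewrite the checkpoint index programmatically with TensorFlow's checkpoint-state helpers instead of editing it by hand. A sketch, assuming the train directory path below; note that this only rewrites the checkpoint file, so the stale model.ckpt-* data files still have to be deleted from disk yourself:

import tensorflow as tf

train_dir = '../results/reproduce_17/train'  # illustrative path

# List what TensorFlow currently considers restorable.
state = tf.train.get_checkpoint_state(train_dir)
print('latest: ' + state.model_checkpoint_path)
print('all: ' + str(list(state.all_model_checkpoint_paths)))

# Drop the newest (possibly incomplete) checkpoint and point the index at the
# previous one, mirroring the manual edit described above.
remaining = list(state.all_model_checkpoint_paths)[:-1]
tf.train.update_checkpoint_state(
    train_dir,
    model_checkpoint_path=remaining[-1],
    all_model_checkpoint_paths=remaining,
)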

Remap failed.

It is likely that you are running remap on /media/... instead of /home/..., since remap is only executable from home. A quick fix is to copy remap to your home directory, chmod it, and test whether it is executable. After you confirm that it works in home, change ./remap in vr_player.py to [ABSOLUTE_PATH_TO_YOUR_HOME]/remap.

Results Visualization

Reward Function

We propose a reward function that can capture transitions of attention.

[GIF comparison: our reward function vs. the baseline reward function]

Specifically, in the above example, the woman and the man are passing a basketball between each other, and the subjects' attention switches between them as they pass the ball. Our reward function is able to capture these attention transitions smoothly, while the baseline reward function makes the agent focus on the man all the time, even when the basketball is not in his hands.

Details of the mat data file.

The mat file includes 76 cells, corresponding to the HM data of all 76 videos. Each cell records the longitude and latitude of the HM of 58 subjects, with a total of 116 columns. The longitude and latitude are arranged alternately; for example, the first and second columns are the latitude and longitude of the first subject, respectively. Note that the sampling rate of the data is twice the video FPS. The HM data takes the front center as the origin, and up & left as the positive directions. Thus, the longitude ranges from -180 to 180, and the latitude ranges from -90 to 90.
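
A minimal sketch of loading this file in Python with SciPy; note that the name of the variable holding the 76 cells inside the .mat file is not stated here, so the 'data' key below is hypothetical and you should check the actual keys first:

import scipy.io as sio

mat = sio.loadmat('FULLdata_per_video_frame.mat')
print([k for k in mat.keys() if not k.startswith('__')])  # find the real variable name

videos = mat['data']             # hypothetical key: cell array with 76 entries, one per video
first_video = videos[0, 0]       # matrix with 116 columns (58 subjects x 2)
subject_0 = first_video[:, 0:2]  # the two columns belonging to the first subject
print(subject_0.shape)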

Authors

Yuhang Song, Mai Xu, Jianyi Wang, Minglang Qiao, Zulin Wang

Special Thanks

We would like to give special thanks to the following researchers for their valuable discussion and contribution to this work.

Ziyu Zhu, Haochen Wang, Chen Li, Lai Jiang

The code is based on the A3C implementation by OpenAI; we thank them a lot for their contribution to the community.

dhp's People

Contributors

haochenheheda, huoliangyu, iceclear, minglangqiao, yuhangsong


dhp's Issues

run online-DHP

Hi, I am interested in your online-DHP network. From your README, I have learned how to use the offline DHP. Could you also describe the method for online DHP?

On 'v' action rewards

Hello dear Mr. Yuhang Song,

In the paper, it is mentioned that the reward for action v is given by [reward equation image], and the parameters θ_v are optimized according to the rule [gradient update equation image].

In the code,

DHP/envs.py

Lines 488 to 502 in 73ddec2

'''get direction reward and ground-truth v from data_base in last state'''
last_prob, distance_per_data = suppor_lib.get_prob(
    lon=self.last_lon,
    lat=self.last_lat,
    theta=action * 45.0,
    subjects=self.subjects,
    subjects_total=self.subjects_total,
    cur_data=self.last_data,
)
'''rescale'''
distance_per_step = distance_per_data * self.data_per_step
'''convert v to degree'''
degree_per_step = distance_per_step / math.pi * 180.0
'''set v_lable'''
v_lable = degree_per_step

there seems to be no reward calculated for v; instead, v_lable is estimated as a "weighted" target value (a sum of subject_i_v * similarity),

DHP/suppor_lib.py

Lines 154 to 159 in 73ddec2

distance_per_data = 0.0
for i in range(subjects_total):
    if config.if_normalize_v_lable:
        distance_per_data += prob_dic_normalized[i] * subjects[i].data_frame[cur_data].v
    else:
        distance_per_data += prob_dic[i] * subjects[i].data_frame[cur_data].v

which then contributes another term (v-v_lable)^2 in the loss function:

DHP/a3c.py

Lines 238 to 239 in 73ddec2

# v loss
v_loss = 0.5 * tf.reduce_sum(tf.square(pi.v - self.v_lable))

Is there any particular reason why the direct sum of rewards is not calculated, and instead the above approach is considered?

Update to md.

@MinglangQiao Please update your name & photo in the md. Besides, update the dataset detail description part, and update the pointer to your EM git repo.

Dataset

Hello, when I open the DropBox link, it says the folder "/Yuhang Song/Dataset" does not exist. The EM database repository also seems to contain only a README file. Has the data not been uploaded yet?

online_DHP

Hello, could you give the procedures to train and test online_DHP? Thank you!

Data Preprocessing file missing

Hello,

I've run the YUV preprocessing code by setting mode = 'data_processor' and data_processor_id = 'mp4_to_yuv' in config.py.
Lots of YUV files were generated in the location of the PVS dataset, but there is no '1.yuv' file at 'results/reproduce_17/temp/get_view/w_//1.yuv'.

Did I miss any step in data preprocessing?


Hello,
when I run offline training, I hit the error No such file or directory: '../results/reproduce_17/temp/get_view/w_5/Bicycle/1.yuv'.
There is indeed no 1.yuv file under get_view/w_*/<video name>/.
Is 1.yuv supposed to be generated during preprocessing?
Is the '<video name>.yuv' file generated inside the dataset during preprocessing the same file as 1.yuv?
