relion ver3.1: optimizing the backproject kernel in src/acc/cuda
Following relion31_tutorial.pdf, download the dataset relion30_tutorial/Movies/*.tiff. Reference results are in PrecalculatedResults, and the working directory is relion30_tutorial.
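- First get familiar with the relion workflow quickly, e.g. the type of output file each step produces, as groundwork for the optimization...
- Optimization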
- ...
relion31_tutorial.pdf only describes the steps in the GUI, so the args below are guessed from its descriptions:
relion_import \
--i "Movies/*.tiff" \
--odir Import/job001/ \
--ofile movies.star \
--do_movies true \
--optics_group_name opticsGroup1 \
--optics_group_mtf mtf_k2_200kV.star \
--angpix 0.885 \
--kV 200 \
--Cs 1.4 \
--Q0 0.1
- Create odir yourself
- Remember to put ifile in double quotes (here "Movies/*.tiff")
- A reference command can be found in PrecalculatedResults/Import/job001/note.txt
- Output: less Import/job001/movies.star
Following point 3 of the previous step's summary, run directly:
`which relion_run_motioncorr` \
--i Import/job001/movies.star \
--o MotionCorr/job002/ \
--first_frame_sum 1 \
--last_frame_sum 0 \
--use_own \
--j 24 \
--bin_factor 1 \
--bfactor 150 \
--dose_per_frame 1.277 \
--preexposure 0 \
--patch_x 5 \
--patch_y 5 \
--gainref Movies/gain.mrc \
--gain_rot 0 \
--gain_flip 0 \
--dose_weighting \
--grouping_for_ps 3 \
--pipeline_control MotionCorr/job002/
- The run reports timing, e.g.
0.60/4.80 min .......~~(,_,"> [oo]
Following bp/PrecalculatedResults/CtfFind/job003/note.txt:
- ctffind has to be downloaded, and --ctffind_exe changed to point to it
- csh has to be installed; installing tcsh via conda and aliasing it is recommended
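Following bp/PrecalculatedResults/ManualPick/job004/note.txt, skip the manual picking itself and run
echo CtfFind/job003/micrographs_ctf.star > ManualPick/job004/coords_suffix_manualpick.star
- The tutorial mentions a LoG-based auto-picking strategy, which uses scripts/relion_it.py. However, the tutorial does not adopt the fully automatic strategy and instead uses manual picking, which does not exist under ver3.1. This mismatch between the tutorial and actual operation dates back to ver3.0.
- Given the above, and with a rough understanding of the working directory already in hand, go straight to the optimization step...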
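From reading the tutorial, backprojection is probably part of relion_refine_mpi, so relion_refine_mpi is closely tied to the goal.
Starting from the backprojector.cu* files under src/acc/cuda in the source tree and tracing back through the headers, I found the definition of the AccBackprojector class and of its methods, as well as statements that may be related to acceleration: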
grep -rn "AccBackprojector" *
src/acc/cuda/cuda_ml_optimiser.h:26: std::vector< AccBackprojector > backprojectors;
src/acc/cpu/cpu_ml_optimiser.h:27: std::vector< AccBackprojector > backprojectors;
src/acc/acc_backprojector_impl.h:8:size_t AccBackprojector::setMdlDim(
src/acc/acc_backprojector_impl.h:54:void AccBackprojector::initMdl()
src/acc/acc_backprojector_impl.h:84:void AccBackprojector::getMdlData(XFLOAT *r, XFLOAT *i, XFLOAT * w)
src/acc/acc_backprojector_impl.h:101:void AccBackprojector::getMdlDataPtrs(XFLOAT *& r, XFLOAT *& i, XFLOAT *& w)
src/acc/acc_backprojector_impl.h:110:void AccBackprojector::clear()
src/acc/acc_backprojector_impl.h:140:AccBackprojector::~AccBackprojector()
src/acc/acc_helper_functions.h:97: AccBackprojector &BP,
src/acc/acc_backprojector.h:15:class AccBackprojector
src/acc/acc_backprojector.h:38: AccBackprojector():
src/acc/acc_backprojector.h:86: ~AccBackprojector();
src/acc/acc_helper_functions_impl.h:616: AccBackprojector &BP,
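To see at a glance where AccBackprojector is declared, implemented and used, the grep hits can be grouped by file. The following is a minimal Python sketch; the helper name group_hits is hypothetical, and the sample text is a subset of the listing above.

```python
from collections import defaultdict

# Subset of the `grep -rn` output listed above (format: path:lineno:content).
GREP_OUTPUT = """\
src/acc/cuda/cuda_ml_optimiser.h:26: std::vector< AccBackprojector > backprojectors;
src/acc/acc_backprojector_impl.h:8:size_t AccBackprojector::setMdlDim(
src/acc/acc_backprojector_impl.h:54:void AccBackprojector::initMdl()
src/acc/acc_backprojector.h:15:class AccBackprojector
src/acc/acc_helper_functions.h:97: AccBackprojector &BP,
"""


def group_hits(text):
    """Map each file path to the list of matching line numbers."""
    hits = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only; the content itself contains "::".
        path, lineno, _content = line.split(":", 2)
        hits[path].append(int(lineno))
    return dict(hits)


hits_by_file = group_hits(GREP_OUTPUT)
for path, linenos in sorted(hits_by_file.items()):
    print(f"{path}: {linenos}")
```

In this subset, acc_backprojector.h holds the class declaration, acc_backprojector_impl.h the method bodies, and the remaining files are call sites.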
In build/Makefile, I found:
#=============================================================================
# Target rules for targets named run_motioncorr
# Build rule for target.
run_motioncorr: cmake_check_build_system
$(MAKE) -f CMakeFiles/Makefile2 run_motioncorr
.PHONY : run_motioncorr
# fast build rule for target.
run_motioncorr/fast:
$(MAKE) -f src/apps/CMakeFiles/run_motioncorr.dir/build.make src/apps/CMakeFiles/run_motioncorr.dir/build
.PHONY : run_motioncorr/fast
#=============================================================================
The rules above can build run_motioncorr.
# Target rules for target src/apps/CMakeFiles/run_motioncorr.dir
# All Build rule for target.
src/apps/CMakeFiles/run_motioncorr.dir/all: src/apps/CMakeFiles/relion_lib.dir/all
$(MAKE) -f src/apps/CMakeFiles/run_motioncorr.dir/build.make src/apps/CMakeFiles/run_motioncorr.dir/depend
$(MAKE) -f src/apps/CMakeFiles/run_motioncorr.dir/build.make src/apps/CMakeFiles/run_motioncorr.dir/build
@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --progress-dir=/data/xieyufeng/relion/build/CMakeFiles --progress-num=95 "Built target run_motioncorr"
.PHONY : src/apps/CMakeFiles/run_motioncorr.dir/all
# Include target in all.
all: src/apps/CMakeFiles/run_motioncorr.dir/all
.PHONY : all
# Build rule for subdir invocation for target.
src/apps/CMakeFiles/run_motioncorr.dir/rule: cmake_check_build_system
$(CMAKE_COMMAND) -E cmake_progress_start /data/xieyufeng/relion/build/CMakeFiles 59
$(MAKE) -f CMakeFiles/Makefile2 src/apps/CMakeFiles/run_motioncorr.dir/all
$(CMAKE_COMMAND) -E cmake_progress_start /data/xieyufeng/relion/build/CMakeFiles 0
.PHONY : src/apps/CMakeFiles/run_motioncorr.dir/rule
# Convenience name for target.
run_motioncorr: src/apps/CMakeFiles/run_motioncorr.dir/rule
.PHONY : run_motioncorr
# clean rule for target.
src/apps/CMakeFiles/run_motioncorr.dir/clean:
$(MAKE) -f src/apps/CMakeFiles/run_motioncorr.dir/build.make src/apps/CMakeFiles/run_motioncorr.dir/clean
.PHONY : src/apps/CMakeFiles/run_motioncorr.dir/clean
# clean rule for target.
clean: src/apps/CMakeFiles/run_motioncorr.dir/clean
.PHONY : clean
cuda_execute_process(
"Generating dependency file:
-D__CUDACC__
/data/xieyufeng/relion/src/acc/cuda/cuda_projector.cu
/data/xieyufeng/relion/build/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_projector.cu.o.NVCC-depend
-m64;-DINSTALL_LIBRARY_DIR=/data/xieyufeng/software/bin/lib/;-DSOURCE_DIR=/data/xieyufeng/relion/src/;-DACC_CUDA=2;-DACC_CPU=1;-DCUDA;-DALLOW_CTF_IN_SGD;-DHAVE_SINCOS;-DHAVE_TIFF;-DHAVE_PNG
"-I/usr/local/cuda-10.1/include;-I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent;-I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include;-I/usr/lib/openmpi/include;-I/usr/lib/openmpi/include/openmpi;-I/data/xieyufeng/relion;-I/data/xieyufeng/relion/external/fftw/include;-I/usr/local/cuda-10.1/include"
"Generating
/usr/local/cuda-10.1/bin/nvcc
RCTIC(TIMING_PATCH_FFT);
NewFFT::FourierTransform(Ipatches[tid], Fpatches[igroup]);
RCTOC(TIMING_PATCH_FFT);
RCTIC(TIMING_CCF_IFFT);
NewFFT::inverseFourierTransform(Fccs[tid], Iccs[tid]());
RCTOC(TIMING_CCF_IFFT);
TIMING_PREP_WEIGHT
TIMING_MAKE_REF
TIMING_CCF_CALC
TIMING_CCF_IFFT
TIMING_CCF_FIND_MAX
TIMING_FOURIER_SHIFT
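The RCTIC/RCTOC pairs above bracket a region of code with a named timer. As a rough analogy only (this is not RELION's implementation), the idea of accumulating total time and call count per label can be sketched in Python. The class name TicTocTimer and its methods are hypothetical, and the report format is merely modelled on the "sec (microsec/operation)" lines that relion_run_motioncorr prints.

```python
import time
from collections import defaultdict


class TicTocTimer:
    """Accumulates elapsed time and call counts per label (hypothetical helper)."""

    def __init__(self):
        self._start = {}
        self.total = defaultdict(float)  # label -> accumulated seconds
        self.count = defaultdict(int)    # label -> number of tic/toc pairs

    def tic(self, label):
        self._start[label] = time.perf_counter()

    def toc(self, label):
        self.total[label] += time.perf_counter() - self._start.pop(label)
        self.count[label] += 1

    def report(self):
        lines = []
        for label, sec in self.total.items():
            per_op = sec / self.count[label] * 1e6
            lines.append(f"{label} : {sec:.3f} sec ({per_op:.0f} microsec/operation)")
        return lines


timer = TicTocTimer()
for _ in range(3):
    timer.tic("TIMING_PATCH_FFT")
    sum(i * i for i in range(10000))  # stand-in for the FFT work
    timer.toc("TIMING_PATCH_FFT")
print("\n".join(timer.report()))
```

This sketch keeps a single start slot per label, so it is not thread-safe; timers used inside parallel regions would need per-thread state.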
read gain : 1.449 sec (60406 microsec/operation)
read movie : 7.676 sec (319840 microsec/operation)
apply gain : 1.566 sec (65284 microsec/operation)
initial sum : 4.695 sec (195663 microsec/operation)
detect hot pixels : 1.15 sec (47953 microsec/operation)
fix defects : 5.509 sec (229573 microsec/operation)
global FFT : 35.273 sec (1469723 microsec/operation)
power spectrum : 66.63 sec (2776284 microsec/operation)
power - sum : 14.228 sec (592853 microsec/operation)
power - square : 44.964 sec (1873517 microsec/operation)
power - crop : 0.282 sec (11773 microsec/operation)
power - resize : 7.005 sec (291894 microsec/operation)
global alignment : 10.926 sec (455275 microsec/operation)
global iFFT : 37.949 sec (1581241 microsec/operation)
prepare patch : 39.466 sec (65777 microsec/operation)
prep patch - clip (in thread) : 7.905 sec (693 microsec/operation)
prep patch - FFT (in thread) : 621.909 sec (43257 microsec/operation)
patch alignment : 35.366 sec (58943 microsec/operation)
align - prep weight : 2.872 sec (4602 microsec/operation)
align - make reference : 6.722 sec (5264 microsec/operation)
align - calc CCF (in thread) : -48.871 sec (-2431 microsec/operation)
align - iFFT CCF (in thread) : 243.893 sec (8263 microsec/operation)
align - argmax CCF (in thread) : 0.109 sec (3 microsec/operation)
align - shift in Fourier space : 17.424 sec (13645 microsec/operation)
fit polynomial : 0.086 sec (3614 microsec/operation)
dose weighting : 49.023 sec (2042654 microsec/operation)
dw - calc weight : 12.305 sec (512737 microsec/operation)
dw - iFFT : 36.717 sec (1529914 microsec/operation)
real space interpolation : 18.609 sec (775412 microsec/operation)
binning : 0 sec (0 microsec/operation)
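Each line of the report gives a total and a per-operation average, so dividing one by the other recovers the number of operations. For example, read gain gives 1.449 s / 60406 µs ≈ 24, and prepare patch gives 39.466 s / 65777 µs ≈ 600 = 24 × 25, which is consistent with 5 × 5 patches (--patch_x 5 --patch_y 5) on 24 movies. The Python sketch below parses a few lines copied from the report and ranks them by total time. The names parse_report and LINE_RE are hypothetical. One assumption to keep in mind: the entries marked "(in thread)" are presumably summed over all threads (--j 24), so they should not be compared directly with wall-clock totals.

```python
import re

# A few lines copied from the timing report above.
REPORT = """\
read gain : 1.449 sec (60406 microsec/operation)
global FFT : 35.273 sec (1469723 microsec/operation)
power spectrum : 66.63 sec (2776284 microsec/operation)
prepare patch : 39.466 sec (65777 microsec/operation)
prep patch - FFT (in thread) : 621.909 sec (43257 microsec/operation)
align - iFFT CCF (in thread) : 243.893 sec (8263 microsec/operation)
dose weighting : 49.023 sec (2042654 microsec/operation)
"""

LINE_RE = re.compile(
    r"^(?P<name>.+?) : (?P<sec>-?[\d.]+) sec \((?P<us>-?\d+) microsec/operation\)$"
)


def parse_report(text):
    """Return one dict per report line, with the derived operation count."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        sec, us = float(m["sec"]), int(m["us"])
        # total time / time per operation = number of operations
        n_ops = round(sec * 1e6 / us) if us else 0
        rows.append({"name": m["name"], "sec": sec, "us_per_op": us, "n_ops": n_ops})
    return rows


rows = parse_report(REPORT)
for row in sorted(rows, key=lambda r: r["sec"], reverse=True):
    print(f'{row["sec"]:9.3f} s  {row["n_ops"]:6d} ops  {row["name"]}')
```

Ranked this way, the two FFT-related in-thread entries stand out, which suggests looking at the patch FFT and the CCF inverse FFT first when hunting for hotspots in this step.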