xingyizhou / deepmodel
Code repository for Model-based Deep Hand Pose Estimation
License: GNU General Public License v3.0
What's confusing me is that you added a new loss layer to Caffe and registered it. Is there no need to change caffe.proto as well?
Hi,
Thank you for your work!
I ran the code on some images from a real camera and got bad results.
I followed these steps:
a. capture a depth image from the camera
b. crop the region containing the hand
c. run the code.
I looked into the images, and the only difference between mine and NYU is that mine are noisier (Gaussian noise) than NYU's.
Could this be what leads to the bad results? Or did you do any experiments in the real world, and how did it behave?
Thanks!
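If sensor noise is indeed the cause, a common mitigation is to median-filter the depth map before cropping. A minimal pure-NumPy sketch (the window size k = 3 is an arbitrary choice, not something from the repo):

```python
import numpy as np

def median_denoise(depth, k=3):
    """Median-filter a depth map with a k x k window (k odd), a common way
    to suppress sensor speckle before cropping. Borders use edge padding."""
    pad = k // 2
    padded = np.pad(depth, pad, mode="edge")
    # Stack every k*k shifted view and take the per-pixel median.
    views = [padded[i:i + depth.shape[0], j:j + depth.shape[1]]
             for i in range(k) for j in range(k)]
    return np.median(np.stack(views), axis=0)
```

Isolated speckle (a single outlier pixel) is removed while flat regions are left unchanged.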
I0614 12:48:35.391320 30215 layer_factory.hpp:77] Creating layer DeepHandModel
I0614 12:48:35.391338 30215 net.cpp:91] Creating Layer DeepHandModel
I0614 12:48:35.391347 30215 net.cpp:425] DeepHandModel <- DoF
I0614 12:48:35.391356 30215 net.cpp:399] DeepHandModel -> DeepHandModelxyz
*** Aborted at 1465888715 (unix time) try "date -d @1465888715" if you are using GNU date ***
PC: @ 0x7f9b7cc44754 (unknown)
*** SIGSEGV (@0xc0) received by PID 30215 (TID 0x7f9b7ee55780) from PID 192; stack trace: ***
@ 0x7f9b7cc26d40 (unknown)
@ 0x7f9b7cc44754 (unknown)
@ 0x7f9b7cc4d147 (unknown)
@ 0x7f9b7e670cd9 caffe::DeepHandModelLayer<>::LayerSetUp()
@ 0x7f9b7e6f0515 caffe::Net<>::Init()
@ 0x7f9b7e6f13b5 caffe::Net<>::Net()
@ 0x7f9b7e67b72a caffe::Solver<>::InitTrainNet()
@ 0x7f9b7e67c93c caffe::Solver<>::Init()
@ 0x7f9b7e67cc6a caffe::Solver<>::Solver()
@ 0x7f9b7e6d43e3 caffe::Creator_SGDSolver<>()
@ 0x411666 caffe::SolverRegistry<>::CreateSolver()
@ 0x40ab20 train()
@ 0x40852c main
@ 0x7f9b7cc11ec5 (unknown)
@ 0x408cfd (unknown)
@ 0x0 (unknown)
Segmentation fault
The Caffe installation tests ran OK, and the hdf5_classification example also ran OK. Why is this (DeepHandModel) failing? Did you encounter this error? How did you fix it?
How did you set the initial 47 DoF parameters? Did you assign a sample as the default hand pose and compute its DoF parameters from the joint locations?
Hi,
I have copied your files from ./libs/include to caffe_root/include and ./libs/src to caffe_root/src and built the Caffe library (make && make pycaffe) with the Python layer enabled (WITH_PYTHON_LAYER := 1 in Makefile.config). Even then it reports an error stating there is no layer named DeepHandModel. Can you please help resolve this issue?
F0606 19:27:40.747319 20872 layer_factory.hpp:80] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: DeepHandModel (known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, Concat, ContrastiveLoss, Convolution, Data, Deconvolution, Dropout, DummyData, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, LRN, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Pooling, Power, Python, ReLU, Reduction, Reshape, SPP, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, TanH, Threshold, Tile, WindowData)
*** Check failure stack trace: ***
Aborted (core dumped)
Hello, a question: to build on Windows, is it enough to just copy the files into the Caffe tree, or are other steps needed? Thanks for your reply.
There is no caffe/proto/caffe.pb.h at https://github.com/BVLC/caffe.
Which version of Caffe do we need to use?
Kindly advise.
In deepmodel.prototxt, which layer outputs the "55 dimensional pose parameter theta"? Is it the layer called "DoF"? But it has only num_output: 47.
What is the order of the parameters in the pose parameter theta? Is it bend, side, twist for each joint of the LibHand model? What is the joint sequence?
When cropping the images for training, the code uses the ground-truth wrist position as the center. Are the testing results in the paper produced in the same way? If not, how should the test images be cropped? Thanks.
Hi, xingyi
Thank you for your great open source code. I have reproduced the results you show in the repository.
Furthermore, I want to add an accuracy layer to the DeepModel.prototxt file:
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "DeepHandModelxyz"
  bottom: "joint"
  top: "accuracy"
  include {
    phase: TEST
  }
}
However, there is an error saying that the dimensions of joint and DeepHandModelxyz don't match:
Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}
Could this be due to the HDF5 data format? Can you give me some help? I am looking forward to your reply. Thank you!
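For what it's worth, Caffe's Accuracy layer expects integer class labels, so it cannot compare two real-valued regression blobs; a metric such as mean per-joint Euclidean error is usually computed outside the network instead. A minimal sketch, assuming predictions and ground truth are flat (N, 3 * num_joints) arrays; num_joints = 31 is an assumption matching the network's output size:

```python
import numpy as np

def mean_joint_error(pred, gt, num_joints=31):
    """Mean Euclidean distance per joint.

    pred, gt: (N, 3 * num_joints) arrays of xyz coordinates in the same
    units. Returns the error averaged over all joints and samples.
    """
    pred = np.asarray(pred, dtype=np.float64).reshape(-1, num_joints, 3)
    gt = np.asarray(gt, dtype=np.float64).reshape(-1, num_joints, 3)
    return float(np.linalg.norm(pred - gt, axis=2).mean())
```

This can be run over the test-set predictions dumped from the network rather than inside the prototxt.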
Hello, a question: I would like to know how the test images in the testing\test_images folder were produced. Following the description in your paper, I did the following preprocessing:
(1) detect the hand location;
(2) crop out the region of interest;
(3) filter out background more than 30 cm in front of or behind the hand;
(4) convert to a grayscale image;
(5) normalize to [-1, 1].
But the test results are not good. I would like to know whether my preprocessing is wrong or something else is. I see that your test images are 3-channel grayscale images; how did you do the preprocessing?
Looking forward to your reply! Thank you!
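Steps (3) and (5) above can be sketched as follows. cube_mm = 300 matches the cube size in the training script; treating missing depth (zeros) as background is an assumption, not necessarily the authors' exact procedure:

```python
import numpy as np

def preprocess_depth(depth, center_d, cube_mm=300.0):
    """Clamp depth to a cube around the hand and normalize to [-1, 1].

    depth:    already-cropped depth patch in mm
    center_d: depth of the hand center in mm
    """
    half = cube_mm / 2.0
    d = depth.astype(np.float32).copy()
    d[d == 0] = center_d + half            # assume missing depth is background
    d = np.clip(d, center_d - half, center_d + half)
    return (d - center_d) / half           # now in [-1, 1]
```

Resizing the result to the network's 128x128 input (e.g. with cv2.resize) and replicating it across three channels would be the remaining steps.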
Hi, thanks for sharing your code.
I have a question about your hand detection module, a component used in most recent hand pose estimation papers.
According to line 96 of training/GetH5DataNYU.py,
depth = CropImage(depth, joint_uvd[id, 34])
you used joint_uvd[id, 34] as the com argument of the CropImage function.
So I'm curious whether you used the ground-truth palm position (joint_uvd[id, 34]) to crop the hand from the original depth image even in the test stage.
Hi~ Thank you for your kind and great open source code. I have some issues with the following code:
xstart = int(math.floor((u * d / fx - cube_size / 2.) / d * fx))
xend = int(math.floor((u * d / fx + cube_size / 2.) / d * fx))
ystart = int(math.floor((v * d / fy - cube_size / 2.) / d * fy))
yend = int(math.floor((v * d / fy + cube_size / 2.) / d * fy))
In your code,
fx = 588.03
fy = 587.07
and I know fu, fv are fixed because the original depth image is 640*480 pixels.
Are fx and fy determined by the camera? Where and how can I get them for the NYU hand pose dataset? I'm a beginner and haven't figured out the transformation between xyz and uvd coordinates.
Could you please give me some help? Thank you!
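For reference, with pinhole intrinsics the uvd <-> xyz conversion looks like the sketch below. The principal point (cx, cy) is assumed to be the image center of the 640x480 frames (an assumption, not taken from the repo); note that it cancels out in the cropping formulas above, since they only compute pixel offsets:

```python
# NYU (Kinect) focal lengths used in the repo; principal point is assumed
# to be the image center of the 640x480 depth frames.
FX, FY = 588.03, 587.07
CX, CY = 320.0, 240.0

def uvd_to_xyz(u, v, d):
    """Back-project pixel (u, v) with depth d (mm) to camera-space xyz (mm)."""
    x = (u - CX) * d / FX
    y = (v - CY) * d / FY
    return x, y, d

def xyz_to_uvd(x, y, z):
    """Project camera-space xyz (mm) to pixel coordinates plus depth."""
    u = x * FX / z + CX
    v = y * FY / z + CY
    return u, v, z
```

The two functions are exact inverses, which is an easy sanity check when debugging the cropping.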
As far as I understand the preprocessing part, the procedure is the one described in the paper.
However, hands in the test depth images [0,772,1150,1350,1739].png won't be detected correctly once the image is flipped or rotated.
Are there more steps involved in the preprocessing of the depth images (e.g. rotation, left/right hand orientation)?
*) Oberweger et al. (2015) used a center-of-mass approach to detect the center of the hand, whereas in #4 the wrist position is mentioned as the center; could you clarify that?
Hi~ I've come here to ask for your help again.
Why is the RGB image quality of the NYU Hand Dataset so poor? The images are not clear and look strange.
Could you please give me a hand? The dataset's author doesn't seem to answer issues.
Thank you!
How do I convert an NYU dataset PNG image's pixel values to z-distance values?
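If it helps: the NYU hand pose dataset is commonly decoded by unpacking the 16-bit depth (in mm) from the green and blue channels of the PNG (top 8 bits in green, bottom 8 in blue). A sketch, assuming the image has been loaded into an H x W x 3 uint8 array with any PNG reader:

```python
import numpy as np

def decode_nyu_depth(rgb):
    """Decode an NYU hand pose dataset depth image.

    rgb: H x W x 3 uint8 array read from the dataset PNG. The 16-bit depth
    in mm is assumed to be packed as: top 8 bits -> green channel,
    bottom 8 bits -> blue channel.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    return rgb[:, :, 1] * 256 + rgb[:, :, 2]
```

The returned array holds the z distance per pixel in millimeters.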
Hello, I have a question about demo.py. I saw the following code:
for j in range(J):
    x[j] = joint[joints[j] * 3]
    y[j] = joint[joints[j] * 3 + 1]
    z[j] = joint[joints[j] * 3 + 2]
    cv2.circle(img, (int((x[j] + 1) / 2 * 128), int((- y[j] + 1) / 2 * 128)), 2, (255, 0, 0), 2)
After computing the joints in 3D, it seems we just normalize the x, y coordinates to [0, 128). I reimplemented your code in PyTorch; my output for the ground truth (not my network's prediction) looks like this:
The joints look wrong and should be positioned slightly to the left. I wonder whether this is normal or not (due to cropping, resizing, and so on).
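The mapping used by the cv2.circle call above, and its inverse, can be isolated for debugging reprojection issues like this. A sketch assuming a 128x128 crop with normalized coordinates in [-1, 1] and image y growing downward:

```python
def norm_to_pixel(x, y, size=128):
    """Map normalized crop coordinates in [-1, 1] to pixel coordinates,
    flipping y because image rows grow downward (matches the cv2.circle call)."""
    px = (x + 1) / 2 * size
    py = (-y + 1) / 2 * size
    return int(px), int(py)

def pixel_to_norm(px, py, size=128):
    """Inverse mapping, useful for checking reprojected ground truth."""
    x = px / size * 2 - 1
    y = -(py / size * 2 - 1)
    return x, y
```

Round-tripping a ground-truth joint through both functions makes it easy to tell whether the offset comes from this mapping or from the crop itself.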
Hi~ I've come here again. In your code, at line 64 of https://github.com/xingyizhou/DeepModel/blob/master/training/GetH5DataNYU.py:
data_names = ['train', 'test_1', 'test_2']
cube_sizes = [300, 300, 300 * 0.87]
id_starts = [0, 0, 2440]
What's the purpose of 300 * 0.87 when creating the test data?
Looking forward to your reply.
Best wishes!
Hello, a question: what do the 47 parameters of the last fully-connected layer represent, and how is that number derived? The paper mentions 23 joints, but the network actually outputs 31 joint locations, which confuses me. Thank you.
Hi, why is the result wrong when I flip or rotate your test images?
In your paper, you mention:
"Each rotation angle has a range [θ̲_i, θ̄_i], which are the lower/upper bounds for the angle. Such bounds avoid self-collision and physically infeasible poses."
Could you please point me to the values of these bounds for each joint and degree of freedom?
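The paper itself is the authority on the exact values; as a sketch of how such per-angle bounds are commonly enforced during training (a hinge-style penalty on violations, not necessarily the authors' exact formulation):

```python
import numpy as np

def bound_penalty(theta, lower, upper):
    """Sum of squared violations of per-angle bounds; zero inside the range.

    theta, lower, upper: arrays of the same length (one entry per DoF).
    """
    theta = np.asarray(theta, dtype=np.float64)
    low_violation = np.maximum(np.asarray(lower) - theta, 0.0)
    high_violation = np.maximum(theta - np.asarray(upper), 0.0)
    return float((low_violation ** 2 + high_violation ** 2).sum())
```

Added to the pose loss with a weight, this pushes predicted angles back inside their feasible ranges while costing nothing for already-feasible poses.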