Jittor implementation of PCT: Point Cloud Transformer
Thanks for open-sourcing your code.
I recently used your code to train a teeth segmentation model: I want to segment a whole scan (gum plus teeth) into four separate tooth parts. Since teeth are the only object category, can I drop your one-hot label conversion when training? So far the IoU is only 80%; could you give me any suggestions?
THANKS!
Hi Menghao,
Thank you for sharing the interesting paper!
I have some queries about the definition of permutation invariance. In my opinion, although self-attention computes global contextual information and aggregates features via a weighted summation, the resulting features are still tied to the point order (though the feature of any given point seems to be invariant). I also notice the implementation uses a max-pooling strategy, as proposed in PointNet, to guarantee invariance.
So I wonder how you define permutation invariance, because it seems attention by itself can hardly guarantee global invariance. Thank you very much!
Rui
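The distinction Rui draws can be checked numerically. A minimal sketch (not the authors' code): per-point self-attention outputs permute together with the input order (equivariance), while a max-pool over the point dimension gives an order-independent global descriptor (invariance).

```python
import numpy as np

# Toy single-head self-attention over a point set, x: (n_points, d).
def self_attention(x):
    energy = x @ x.T                                    # (n, n) affinities
    attn = np.exp(energy - energy.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)            # row-wise softmax
    return attn @ x                                     # (n, d) per-point features

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
perm = rng.permutation(8)

f, f_perm = self_attention(x), self_attention(x[perm])
# Per-point features are permutation-EQUIVARIANT: they move with the points...
assert np.allclose(f[perm], f_perm)
# ...while the max-pooled global feature is permutation-INVARIANT.
assert np.allclose(f.max(axis=0), f_perm.max(axis=0))
```

So attention alone gives equivariance, and it is the pooling that yields the invariant global feature.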
Thank you very much for your work. I would like to ask how to generate the feature map of a point cloud. If it is convenient, could you provide an example?
thanks!
Thanks for your great work!
I still do not understand the steps you mentioned:
(1) Choose a query point i.
(2) Convert the value of attention A[i, j] to a color depth and save both the position and color of every point j as a .txt file.
(3) Open the saved .txt file in MeshLab to render the point cloud.
Could you please share your visualization.py?
Many Thanks!
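The steps above can be sketched as follows. This is a hypothetical reconstruction, since the authors' visualization.py is not released; save_attention_txt is an invented name, and mapping the weights to grayscale is one choice among many.

```python
import numpy as np

def save_attention_txt(points, attention, query_idx, path):
    """points: (n, 3) xyz; attention: (n, n) row-normalized weights.

    Writes an 'x y z r g b' text file that MeshLab can open via
    File -> Import Mesh, coloring each point j by A[query_idx, j].
    """
    w = attention[query_idx]
    w = (w - w.min()) / (w.max() - w.min() + 1e-9)     # rescale to [0, 1]
    rgb = (np.repeat(w[:, None], 3, axis=1) * 255).astype(int)
    rows = np.hstack([points, rgb])
    np.savetxt(path, rows, fmt="%.6f %.6f %.6f %d %d %d")
```

Each output line is one point: three float coordinates followed by three 0-255 color channels.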
I am trying to run a simple optimization using this segmentation network, but I cannot fully understand what the cls_label parameter in the partseg network is.
Do I have to run classification before conducting segmentation, or something like that?
Thank you.
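For context, a sketch of what cls_label usually is in ShapeNet part-segmentation pipelines (an assumption about this repo, based on common PointNet++-style implementations): a one-hot encoding of the ground-truth object category taken from the dataset, so no classifier needs to be run first.

```python
import numpy as np

NUM_CATEGORIES = 16                         # ShapeNet part has 16 categories

def make_cls_label(category_ids):
    """category_ids: (batch,) ints in [0, NUM_CATEGORIES).

    Returns the (batch, 16) one-hot tensor typically fed to a partseg
    network as cls_label, telling it which object category each sample is.
    """
    onehot = np.zeros((len(category_ids), NUM_CATEGORIES), dtype=np.float32)
    onehot[np.arange(len(category_ids)), category_ids] = 1.0
    return onehot

# e.g. a batch of two samples from category 0 and one from category 4:
labels = make_cls_label([0, 0, 4])
```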
Dear MenghaoGuo,
Thanks for your code. In Section 3.4 of your paper I see the local feature representation, but this operation seems to exist only in the cls model; the seg file does not contain the SG layer mentioned in the paper.
Is the local feature not needed for the segmentation task, or is the global feature already sufficient for it?
I hope you can answer my question.
Thanks again!
Hi.
As here, the attention matrix should be transposed before the matrix product, if I understand it correctly.
Here is my draft calculation of the dimensions in the matrix product.
Thank you for selflessly sharing the model code! I recently reproduced the segmentation based on the provided seg code, but the class mean IoU is only 78.8%. Could you please offer some suggestions?
When running PCT, I get the following error:
self.sa1 = SA_Layer(channels)
File "/home/ssd/zjl/zjl/point/PointCloudLib/networks/cls/pct.py", line 205, in __init__
self.q_conv.conv.weight = self.k_conv.conv.weight
AttributeError: 'Conv1d' object has no attribute 'conv'
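A likely cause, inferred from the traceback (this is an assumption about a version mismatch, not confirmed): SA_Layer was written against a wrapper module that stored its Conv1d under a .conv attribute, while the installed library exposes a plain Conv1d, so the weight-tying line should drop the extra ".conv".

```python
class PlainConv1d:
    """Minimal stand-in for a Conv1d that has no .conv sub-attribute."""
    def __init__(self):
        self.weight = [[0.0] * 4 for _ in range(4)]

q_conv, k_conv = PlainConv1d(), PlainConv1d()

# self.q_conv.conv.weight = self.k_conv.conv.weight   # -> AttributeError
q_conv.weight = k_conv.weight                         # works on a plain Conv1d
```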
Hello,
very interesting paper, and nice to publish parts of the code along with it!
A couple of questions:
self.v_conv
has a bias attached to it. Looking at other attention implementations, it seems most of them exclude the bias (as you also do for the keys and queries). Did you see any improvement from adding a bias there?
Kind regards,
Steven
Hello, thanks for your work!
I am reading the code but am confused about some details.
In the class Point_Transformer_Last, self.pos_xyz is not defined anywhere in the file.
What does this operation specifically do, and where can I find its definition?
I'd appreciate your help!
When the convolution weights of q and k are initialized to be equal, via
self.q_conv.conv.weight = self.k_conv.conv.weight
will q and k always remain the same as they are updated?
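A toy sketch of what tying the weight objects implies (plain Python, not the repository's code): if both layers reference the same array, every in-place gradient update is seen by both, so q and k do stay identical throughout training.

```python
import numpy as np

class TinyConv:
    """Minimal stand-in for a conv layer holding one weight array."""
    def __init__(self, out_ch, in_ch):
        self.weight = np.random.randn(out_ch, in_ch)

q_conv, k_conv = TinyConv(2, 4), TinyConv(2, 4)
q_conv.weight = k_conv.weight            # tie: SAME array object, not a copy

grad = np.ones_like(k_conv.weight)
k_conv.weight -= 0.1 * grad              # one in-place "SGD step" on k
assert np.array_equal(q_conv.weight, k_conv.weight)   # q moved with it
```

Whether this holds in a given framework depends on whether the assignment shares the parameter object or only copies its initial value; in the shared case shown here, the two stay equal forever.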
Hello, thank you for releasing the code!
I was wondering if you could tell me the model size (MB) and the number of parameters for PCT-2L and PCT-3L?
Thank you!
Hi,
PCT uses Batch Normalization instead of the Layer Normalization used by the original Transformer.
I wonder what your thoughts are on Layer Normalization versus Batch Normalization in PCT.
I found that there is no description of how the local feature embedding layer is combined with the SA layer. What does the final PCT architecture look like?
Hi, @MenghaoGuo,
For Figure 1 in your paper, which self-attention layer is used to visualize the attention map? From your implementation, there are 4 self-attention layers (SA1, SA2, SA3, SA4) in the model.
Thanks~
Hi, how can we obtain images in Figure 1 ?
Can you share the codes to visualize attention map if it will not be a problem?
Best regards.
Hi, thanks for your wonderful work.
I have some questions about the models in /networks/cls/pct.py.
Does Point_Transformer (line 86) correspond to the model without neighbor embedding, and does Point_Transformer2 (line 34) correspond to the final model in the paper? Maybe my understanding is wrong; could you give me some advice? Thanks!
Hi, thank you for releasing the code! However, I could not find the details of the CosineAnnealingLR scheduler. Could you tell me how to set T_max and the other attributes for it?
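For reference, a common configuration (an assumption, not confirmed from this repository) sets T_max to the total number of training epochs and steps the scheduler once per epoch, so the learning rate decays from the base value to eta_min over the whole run. The closed form behind CosineAnnealingLR for a single cycle:

```python
import math

def cosine_annealing_lr(epoch, base_lr=0.01, eta_min=1e-4, t_max=200):
    """Closed form of one CosineAnnealingLR cycle (base_lr, eta_min,
    t_max are illustrative values, not the repo's actual settings)."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# lr starts at base_lr (epoch 0) and reaches eta_min exactly at epoch t_max.
```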
Hi,
First of all, I want to thank you for the proposed method, which has benefited me a lot. I reproduced your code in PyTorch and tried to visualize the attention map in the part segmentation task, but when I use the right wing as the query point, it does not attend to the left wing as in your paper's visualization. So I would like to know how you produced the visualization results shown in the paper.
In addition, another issue points out that the dimension of the softmax is wrong: since your multiplication is Value * Attention, I think the softmax dimension in Attention should be 1, not -1 (or 2); please correct me if I am mistaken. Also, the dimensions of the softmax and the L1 norm differ (the softmax uses -1 but the L1 norm uses 1); why?
Line 211: self.softmax = nn.Softmax(dim=-1)
Line 220: attention = attention / (1e-9 + attention.sum(dim=1, keepdims=True))
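The two quoted lines can be checked numerically. A small sketch (axis 0 below plays the role of dim=1 in the batched (b, n, n) tensors): since the product is V @ A, the mixing weights for each output point must sum to 1 over dim 1, and it is the L1 step, not the softmax, that enforces this.

```python
import numpy as np

def pct_normalize(energy):
    """energy: (n, n). Reproduces the quoted two-step normalization:
    softmax over the last dim, then L1 re-normalization over the other."""
    e = np.exp(energy - energy.max(axis=-1, keepdims=True))
    attn = e / e.sum(axis=-1, keepdims=True)                 # softmax, dim=-1
    attn = attn / (1e-9 + attn.sum(axis=0, keepdims=True))   # L1 norm, "dim=1"
    return attn

a = pct_normalize(np.random.randn(6, 6))
# Each COLUMN now sums to 1, which is what x = V @ A requires:
assert np.allclose(a.sum(axis=0), 1.0)
```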
Also, I want to know how you do neighbor embedding in part segmentation. The paper says the number of output points is N, which implies you do not downsample the points even though the SG (sampling and grouping) module is applied twice. But when I reproduce the same method, I run out of CUDA memory on an RTX 2080 Ti (12 GB VRAM). Is my VRAM too small, or have I misunderstood the paper's description?
I'm looking forward to your reply, and thank you for your contribution.
Hi, I wonder if you have prepared to release the complete code, especially in segmentation? Thanks!
Thank you very much for your work. Could you please provide me with the code for drawing the attention map?
Hi, I want to ask how positional embedding is implemented in the model.
There is a call xyz = self.pos_xyz(xyz), but self.pos_xyz is not defined anywhere.
I'm very sorry to bother you.
I trained PCT on a 2080 Ti, but I cannot reach 93.2% accuracy, and the results fluctuate greatly.
I only reach 92.8%.
Did you use any other particular parameters?
Best regards,
Hi,
amazing work and great results, thanks for making it available here! I was wondering whether you plan to release the pre-trained models? I work on robotic grasping and it would be interesting to see how your architecture performs for this task compared to other state-of-the-art models.
Thanks for your sharing! I have benefited a lot from the code. However, I also ran into some problems.
Based on the released segmentation code, I conducted an experiment on the S3DIS dataset and evaluated the network on Area 5, following the train_semseg.py training pipeline.
I got a training accuracy of 94.3%, but the test mIoU only reaches 54.3%, about 7% lower than the result in the paper. I have the following question.
Dear authors,
Thank you for sharing your code and contributions. I would like to use your code in my project, but there is no LICENSE present in your repository. May I ask in what ways we can use or modify the code? It would be great if you could add a license to the repository.
Thank you so much!
Hi, @MenghaoGuo ,
Thanks for releasing the package. The current package only provides the code for network construction. Could you provide a real example code for point cloud classification or segmentation?
Thanks~
In the Local_op module, it seems you reshape the 4-dimensional feature from the sample_and_group module, [batch, npoint, nsample, features], to [batch * npoint, features, nsample], then feed it into a 1D convolution. After that, you keep the biggest feature as [batch * npoint, features, 1] and reshape it to [batch, features, npoint].
In my opinion, this is the most significant difference between your work and other PointNet-series papers. Could you explain why it is effective?
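The reshape pipeline described above can be sketched like this (a NumPy approximation, not the authors' exact Jittor code; shapes follow the description in the question, with a plain matrix product standing in for the shared 1x1 conv):

```python
import numpy as np

def local_op(x, weight):
    """x: (batch, npoint, nsample, feat); weight: (out_feat, feat).

    Shared 1x1 'conv' over each local neighborhood, then a max over the
    nsample axis: PointNet-style pooling applied to every local group.
    """
    b, npoint, nsample, feat = x.shape
    x = x.reshape(b * npoint, nsample, feat).transpose(0, 2, 1)  # (b*np, feat, ns)
    x = np.einsum('of,bfn->bon', weight, x)        # 1x1 conv across nsample
    x = x.max(axis=-1)                             # max-pool each local group
    return x.reshape(b, npoint, -1).transpose(0, 2, 1)   # (b, out_feat, npoint)

out = local_op(np.random.rand(2, 64, 32, 16), np.random.rand(128, 16))
assert out.shape == (2, 128, 64)
```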
Hi, @MenghaoGuo,
From the code in cls and partseg, the attention weights are already normalized by self.softmax(). Why did you add the extra line attention / (1e-9 + attention.sum(dim=1, keepdims=True)) for weight normalization?
Any particular reason?
Thanks~
Hi. I had some problems reproducing your paper. Can you release the complete code? The partseg and semseg results I reproduced with PyTorch are poor.
Hello Guo! I found that many more epochs were needed in the part_seg task when I replaced the model in PointNet++ with PCT (PointNet++ code from https://github.com/yanx27/Pointnet_Pointnet2_pytorch).
PointNet++ needs fewer than 50 epochs while PCT needs more than 500. Is this caused by model complexity? Did you observe the same issue during training?
Hi,
Thanks for releasing the code. Can you release the PCT code for the PartSeg or simply show the parameters of each layer, please?
Best
Hi, thanks for your great work!
However, I have a problem with the concatenation in segmentation. As shown in Fig. 2, the global feature is obtained by repeating the preceding vector, so it should be (batch_size, 1024, point_num), where point_num may be 1024 for ModelNet. The point feature, however, should be (batch_size, 1024, sampled_point_num); for example, sampled_point_num is 256 in the implementation at https://github.com/Strawberry-Eat-Mango/PCT_Pytorch.
So how can these two features be concatenated? Or do we only segment the sampled points?
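The repeat-and-concatenate step in question can be sketched as follows (an assumption about the intended shapes: both tensors must share the same point count N, which is why segmentation variants typically keep N unchanged through neighbor embedding rather than downsampling to 256 points):

```python
import numpy as np

B, C, N = 2, 1024, 1024
point_feat = np.random.rand(B, C, N)             # per-point features (B, C, N)
global_feat = point_feat.max(axis=-1)            # pooled global vector (B, C)

# Repeat the global vector along the point axis, then concatenate on the
# channel axis; this only works if both tensors share the same N.
repeated = np.repeat(global_feat[:, :, None], N, axis=2)    # (B, C, N)
fused = np.concatenate([point_feat, repeated], axis=1)      # (B, 2C, N)
assert fused.shape == (B, 2 * C, N)
```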