Jittor implementation of PCT: Point Cloud Transformer
Thanks for open-sourcing your code.
I recently used your code to train a teeth segmentation model: I want to segment a whole scan (gum plus teeth) into four separate tooth parts. Since teeth are the only object category, can I drop your one-hot label conversion when training? So far the IoU is only 80%; could you give me any suggestions?
THANKS!
Hi Menghao,
Thank you for sharing the interesting paper!
I have some queries about the definition of permutation invariance. In my opinion, although self-attention computes global contextual information and aggregates features via a weighted summation, the resulting features are still tied to the point order (though the feature of any given point seems to be invariant). I also notice the implementation uses a max-pooling strategy, as proposed in PointNet, to guarantee invariance.
So I wonder how you define permutation invariance, because it seems attention by itself can hardly guarantee global invariance. Thank you very much!
Rui
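The distinction Rui draws can be checked numerically. A minimal sketch (not the authors' code): per-point self-attention outputs permute together with the input order (equivariance), while a max-pool over the point dimension gives an order-independent global descriptor (invariance).

```python
import numpy as np

# Toy single-head self-attention over a point set, x: (n_points, d).
def self_attention(x):
    energy = x @ x.T                                    # (n, n) affinities
    attn = np.exp(energy - energy.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)            # row-wise softmax
    return attn @ x                                     # (n, d) per-point features

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
perm = rng.permutation(8)

f, f_perm = self_attention(x), self_attention(x[perm])
# Per-point features are permutation-EQUIVARIANT: they move with the points...
assert np.allclose(f[perm], f_perm)
# ...while the max-pooled global feature is permutation-INVARIANT.
assert np.allclose(f.max(axis=0), f_perm.max(axis=0))
```

So attention alone gives equivariance, and it is the pooling that yields the invariant global feature.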
Thank you very much for your work. I would like to ask how to generate the feature map of a point cloud. If it is convenient, could you provide an example?
thanks!
Thanks for your great work!
I still do not understand the steps you mentioned:
(1) Choose a query point i.
(2) Convert the value of attention A[i, j] to a color depth and save both the position and color of every point j as a .txt file.
(3) Open the saved .txt file in MeshLab to render the point cloud.
Could you please share your visualization.py?
Many Thanks!
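The steps above can be sketched as follows. This is a hypothetical reconstruction, since the authors' visualization.py is not released; save_attention_txt is an invented name, and mapping the weights to grayscale is one choice among many.

```python
import numpy as np

def save_attention_txt(points, attention, query_idx, path):
    """points: (n, 3) xyz; attention: (n, n) row-normalized weights.

    Writes an 'x y z r g b' text file that MeshLab can open via
    File -> Import Mesh, coloring each point j by A[query_idx, j].
    """
    w = attention[query_idx]
    w = (w - w.min()) / (w.max() - w.min() + 1e-9)     # rescale to [0, 1]
    rgb = (np.repeat(w[:, None], 3, axis=1) * 255).astype(int)
    rows = np.hstack([points, rgb])
    np.savetxt(path, rows, fmt="%.6f %.6f %.6f %d %d %d")
```

Each output line is one point: three float coordinates followed by three 0-255 color channels.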
I am trying to run a simple optimization using this segmentation network, but I cannot fully understand what the cls_label parameter in the partseg network is.
Do I have to run classification before conducting segmentation, or something like that?
Thank you.
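For context, a sketch of what cls_label usually is in ShapeNet part-segmentation pipelines (an assumption about this repo, based on common PointNet++-style implementations): a one-hot encoding of the ground-truth object category taken from the dataset, so no classifier needs to be run first.

```python
import numpy as np

NUM_CATEGORIES = 16                         # ShapeNet part has 16 categories

def make_cls_label(category_ids):
    """category_ids: (batch,) ints in [0, NUM_CATEGORIES).

    Returns the (batch, 16) one-hot tensor typically fed to a partseg
    network as cls_label, telling it which object category each sample is.
    """
    onehot = np.zeros((len(category_ids), NUM_CATEGORIES), dtype=np.float32)
    onehot[np.arange(len(category_ids)), category_ids] = 1.0
    return onehot

# e.g. a batch of two samples from category 0 and one from category 4:
labels = make_cls_label([0, 0, 4])
```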
Dear MenghaoGuo,
Thanks for your code. In Section 3.4 of your paper I see the local feature representation, but this operation seems to exist only in the cls model; the seg file does not contain the SG layer mentioned in the paper.
Is the local feature not needed for the segmentation task, or is the global feature already sufficient for it?
I hope you can answer my question.
Thanks again!
Hi.
As here, the attention matrix should be transposed before the matrix product, if I understand it correctly.
Here is my draft calculation of the dimensions in the matrix product.
Thank you for selflessly sharing the model code! I recently reproduced the segmentation based on the provided seg code, but the class mean IoU is only 78.8%. Could you please offer some suggestions?
When running PCT, I get the following error:
self.sa1 = SA_Layer(channels)
File "/home/ssd/zjl/zjl/point/PointCloudLib/networks/cls/pct.py", line 205, in __init__
self.q_conv.conv.weight = self.k_conv.conv.weight
AttributeError: 'Conv1d' object has no attribute 'conv'
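A likely cause, inferred from the traceback (this is an assumption about a version mismatch, not confirmed): SA_Layer was written against a wrapper module that stored its Conv1d under a .conv attribute, while the installed library exposes a plain Conv1d, so the weight-tying line should drop the extra ".conv".

```python
class PlainConv1d:
    """Minimal stand-in for a Conv1d that has no .conv sub-attribute."""
    def __init__(self):
        self.weight = [[0.0] * 4 for _ in range(4)]

q_conv, k_conv = PlainConv1d(), PlainConv1d()

# self.q_conv.conv.weight = self.k_conv.conv.weight   # -> AttributeError
q_conv.weight = k_conv.weight                         # works on a plain Conv1d
```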
Hello,
very interesting paper, and nice to publish parts of the code along with it!
A couple of questions:
self.v_conv
has a bias attached to it. Looking at other attention implementations, it seems most of them exclude the bias (as you also do for the keys and queries). Did you see any improvement from adding a bias there?
Kind regards,
Steven
Hello, thanks for your work!
I am reading the code but am confused about some details.
In the class Point_Transformer_Last, self.pos_xyz is not defined anywhere in the file.
What does this operation specifically do, and where can I find its definition?
I'd appreciate your help!
When the convolution weights of q and k are initialized to be equal, via
self.q_conv.conv.weight = self.k_conv.conv.weight
will q and k always remain the same as they are updated?
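A toy sketch of what tying the weight objects implies (plain Python, not the repository's code): if both layers reference the same array, every in-place gradient update is seen by both, so q and k do stay identical throughout training.

```python
import numpy as np

class TinyConv:
    """Minimal stand-in for a conv layer holding one weight array."""
    def __init__(self, out_ch, in_ch):
        self.weight = np.random.randn(out_ch, in_ch)

q_conv, k_conv = TinyConv(2, 4), TinyConv(2, 4)
q_conv.weight = k_conv.weight            # tie: SAME array object, not a copy

grad = np.ones_like(k_conv.weight)
k_conv.weight -= 0.1 * grad              # one in-place "SGD step" on k
assert np.array_equal(q_conv.weight, k_conv.weight)   # q moved with it
```

Whether this holds in a given framework depends on whether the assignment shares the parameter object or only copies its initial value; in the shared case shown here, the two stay equal forever.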
Hello, thank you for releasing the code!
I was wondering if you could tell me the model size (MB) and the number of parameters for PCT-2L and PCT-3L?
Thank you!
Hi,
PCT uses Batch Normalization instead of the Layer Normalization used by the original Transformer.
I wonder what your thoughts are on Layer Normalization versus Batch Normalization in PCT.
I found that there is no description of how the local feature embedding layer is combined with the SA layer. What does the final PCT architecture look like?
Hi, @MenghaoGuo,
For Figure 1 in your paper, which self-attention layer is used to visualize the attention map? From your implementation, there are 4 self-attention layers (SA1, SA2, SA3, SA4) in the model.
Thanks~
Hi, how can we obtain images in Figure 1 ?
Can you share the codes to visualize attention map if it will not be a problem?
Best regards.
Hi, thanks for your wonderful work.
I have some questions about the models in /networks/cls/pct.py.
Does Point_Transformer (line 86) correspond to the model without neighbor embedding, and does Point_Transformer2 (line 34) correspond to the final model in the paper? Maybe my understanding is wrong; could you give me some advice? Thanks!
Hi, thank you for releasing the code! However, I could not find the details of the CosineAnnealingLR scheduler. Could you tell me how to set T_max and the other attributes for it?
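For reference, a common configuration (an assumption, not confirmed from this repository) sets T_max to the total number of training epochs and steps the scheduler once per epoch, so the learning rate decays from the base value to eta_min over the whole run. The closed form behind CosineAnnealingLR for a single cycle:

```python
import math

def cosine_annealing_lr(epoch, base_lr=0.01, eta_min=1e-4, t_max=200):
    """Closed form of one CosineAnnealingLR cycle (base_lr, eta_min,
    t_max are illustrative values, not the repo's actual settings)."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# lr starts at base_lr (epoch 0) and reaches eta_min exactly at epoch t_max.
```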
Hi,
First of all, I want to thank you for the proposed method, which has benefited me a lot. I reproduced your code in PyTorch and tried to visualize the attention map in the part segmentation task, but when I use the right wing as the query point, it does not attend to the left wing as in your paper's visualization. So I would like to know how you produced the visualization results shown in the paper.
In addition, another issue points out that the dimension of the softmax is wrong: since your multiplication is Value * Attention, I think the softmax dimension in Attention should be 1, not -1 (or 2); please correct me if I am mistaken. Also, the dimensions of the softmax and the L1 norm differ (the softmax uses -1 but the L1 norm uses 1); why?
Line 211: self.softmax = nn.Softmax(dim=-1)
Line 220: attention = attention / (1e-9 + attention.sum(dim=1, keepdims=True))
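The two quoted lines can be checked numerically. A small sketch (axis 0 below plays the role of dim=1 in the batched (b, n, n) tensors): since the product is V @ A, the mixing weights for each output point must sum to 1 over dim 1, and it is the L1 step, not the softmax, that enforces this.

```python
import numpy as np

def pct_normalize(energy):
    """energy: (n, n). Reproduces the quoted two-step normalization:
    softmax over the last dim, then L1 re-normalization over the other."""
    e = np.exp(energy - energy.max(axis=-1, keepdims=True))
    attn = e / e.sum(axis=-1, keepdims=True)                 # softmax, dim=-1
    attn = attn / (1e-9 + attn.sum(axis=0, keepdims=True))   # L1 norm, "dim=1"
    return attn

a = pct_normalize(np.random.randn(6, 6))
# Each COLUMN now sums to 1, which is what x = V @ A requires:
assert np.allclose(a.sum(axis=0), 1.0)
```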
Also, I want to know how you do neighbor embedding in part segmentation. The paper says the number of output points is N, which implies you do not downsample the points even though the SG (sampling and grouping) module is applied twice. But when I reproduce the same method, I run out of CUDA memory on an RTX 2080 Ti (12 GB VRAM). Is my VRAM too small, or have I misunderstood the paper's description?
I'm looking forward to your reply, and thank you for your contribution.
Hi, I wonder if you have prepared to release the complete code, especially in segmentation? Thanks!
Thank you very much for your work. Could you please provide me with the code for drawing the attention map?
Hi, I want to ask how positional embedding is implemented in the model.
There is a call xyz = self.pos_xyz(xyz), but self.pos_xyz is not defined anywhere.
I'm very sorry to bother you.
I trained PCT on a 2080 Ti, but I cannot reach 93.2% accuracy, and the results fluctuate greatly.
I only reach 92.8%.
Did you use any other particular parameters?
Best regards,
Hi,
amazing work and great results, thanks for making it available here! I was wondering whether you plan to release the pre-trained models? I work on robotic grasping and it would be interesting to see how your architecture performs for this task compared to other state-of-the-art models.
Thanks for your sharing! I have benefited a lot from the code. However, I also ran into some problems.
Based on the released segmentation code, I conducted an experiment on the S3DIS dataset and evaluated the network on Area 5, following the train_semseg.py training pipeline.
I got a training accuracy of 94.3%, but the test mIoU only reaches 54.3%, about 7% lower than the result in the paper. I have the following question.
Dear authors,
Thank you for sharing your code and contributions. I would like to use your code in my project, but there is no LICENSE present in your repository. May I ask in what ways we can use or modify the code? It would be great if you could add a license to the repository.
Thank you so much!
Hi, @MenghaoGuo ,
Thanks for releasing the package. The current package only provides the code for network construction. Could you provide a real example code for point cloud classification or segmentation?
Thanks~
In the Local_op module, it seems you reshape the 4-dimensional feature from the sample_and_group module, [batch, npoint, nsample, features], to [batch * npoint, features, nsample], then feed it into a 1D convolution. After that, you keep the biggest feature as [batch * npoint, features, 1] and reshape it to [batch, features, npoint].
In my opinion, this is the most significant difference between your work and other PointNet-series papers. Could you explain why it is effective?
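The reshape pipeline described above can be sketched like this (a NumPy approximation, not the authors' exact Jittor code; shapes follow the description in the question, with a plain matrix product standing in for the shared 1x1 conv):

```python
import numpy as np

def local_op(x, weight):
    """x: (batch, npoint, nsample, feat); weight: (out_feat, feat).

    Shared 1x1 'conv' over each local neighborhood, then a max over the
    nsample axis: PointNet-style pooling applied to every local group.
    """
    b, npoint, nsample, feat = x.shape
    x = x.reshape(b * npoint, nsample, feat).transpose(0, 2, 1)  # (b*np, feat, ns)
    x = np.einsum('of,bfn->bon', weight, x)        # 1x1 conv across nsample
    x = x.max(axis=-1)                             # max-pool each local group
    return x.reshape(b, npoint, -1).transpose(0, 2, 1)   # (b, out_feat, npoint)

out = local_op(np.random.rand(2, 64, 32, 16), np.random.rand(128, 16))
assert out.shape == (2, 128, 64)
```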
Hi, @MenghaoGuo,
From the code in cls and partseg, the attention weights are already normalized by self.softmax(). Why did you add the extra line attention / (1e-9 + attention.sum(dim=1, keepdims=True)) for weight normalization?
Any particular reason?
Thanks~
Hi. I had some problems reproducing your paper. Can you release the complete code? The partseg and semseg results I reproduced with PyTorch are poor.
Hello Guo! I found that many more epochs were needed in the part_seg task when I replaced the model in PointNet++ with PCT (PointNet++ code from https://github.com/yanx27/Pointnet_Pointnet2_pytorch).
PointNet++ needs fewer than 50 epochs while PCT needs more than 500. Is this caused by model complexity? Did you observe the same issue during training?
Hi,
Thanks for releasing the code. Can you release the PCT code for the PartSeg or simply show the parameters of each layer, please?
Best
Hi, thanks for your great work!
However, I have a problem with the concatenation in segmentation. As shown in Fig. 2, the global feature is obtained by repeating the preceding vector, so it should be (batch_size, 1024, point_num), where point_num may be 1024 for ModelNet. The point feature, however, should be (batch_size, 1024, sampled_point_num); for example, sampled_point_num is 256 in the implementation at https://github.com/Strawberry-Eat-Mango/PCT_Pytorch.
So how can these two features be concatenated? Or do we only segment the sampled points?
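The repeat-and-concatenate step in question can be sketched as follows (an assumption about the intended shapes: both tensors must share the same point count N, which is why segmentation variants typically keep N unchanged through neighbor embedding rather than downsampling to 256 points):

```python
import numpy as np

B, C, N = 2, 1024, 1024
point_feat = np.random.rand(B, C, N)             # per-point features (B, C, N)
global_feat = point_feat.max(axis=-1)            # pooled global vector (B, C)

# Repeat the global vector along the point axis, then concatenate on the
# channel axis; this only works if both tensors share the same N.
repeated = np.repeat(global_feat[:, :, None], N, axis=2)    # (B, C, N)
fused = np.concatenate([point_feat, repeated], axis=1)      # (B, 2C, N)
assert fused.shape == (B, 2 * C, N)
```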