Good morning, I am recoding your work using pytorch to use it as a baseline in my

Question about loss function and Hungarian part about 3d-bonet HOT 12 OPEN

yang7879 commented on August 16, 2024

Question about loss function and Hungarian part

from 3d-bonet.

Comments (12)

Yang7879 commented on August 16, 2024

Hi @lelouedec, the released code only deals with the case where the number of pred bbox is equal to the number of gt bbox. It's easy to slightly modify it to deal with your case.

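import numpy as np
import tensorflow as tf  # TF 1.x API (tf.py_func is used below)
from scipy.optimize import linear_sum_assignment

def hungarian(loss_matrix, bb_gt):
    # A zero-padded (invalid) gt box: all six coordinates are 0
    box_mask = np.array([[0, 0, 0], [0, 0, 0]])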

    def assign_mappings_valid_only(cost, gt_boxes):
        # return ordering : batch_size x num_instances
        cost_total = 0.
        batch_size, num_instances_gt, num_instances_pred = cost.shape[:3]
        ordering_4_pred = np.zeros(shape=[batch_size, num_instances_pred]).astype(np.int32)
        ordering_4_gt = np.zeros(shape=[batch_size, num_instances_gt]).astype(np.int32)
        for idx in range(batch_size):
            ins_gt_boxes = gt_boxes[idx]
            ins_num_gt_valid = 0
            for box in ins_gt_boxes:
                if np.array_equal(box, box_mask):
                    break
                else:
                    ins_num_gt_valid += 1
            valid_cost = cost[idx][:ins_num_gt_valid]
            row_ind, col_ind = linear_sum_assignment(valid_cost)

            ## ins_gt order
            unmapped_ind_gt = np.array(list(set(range(num_instances_gt)) - set(row_ind)))
            row_ind_gt = np.concatenate([row_ind, unmapped_ind_gt])

            ## ins_pred order
            if num_instances_pred - ins_num_gt_valid > 0:
                unmapped_ind_pred = np.array(list(set(range(num_instances_pred)) - set(col_ind)))
                col_ind_pred = np.concatenate([col_ind, unmapped_ind_pred])
            else:
                col_ind_pred = col_ind

            cost_total += cost[idx][row_ind, col_ind].mean()
            ordering_4_pred[idx] = np.reshape(col_ind_pred, [1, -1])
            ordering_4_gt[idx] = np.reshape(row_ind_gt, [1, -1])

        return ordering_4_pred, (cost_total/float(batch_size)).astype(np.float32)

    # Wrap the NumPy matching routine as a TensorFlow op
    ordering_4_pred, cost_min = tf.py_func(assign_mappings_valid_only, [loss_matrix, bb_gt], [tf.int32, tf.float32])

    return ordering_4_pred, cost_min

lelouedec commented on August 16, 2024

Oh, so it makes sense after all: I had to modify the code in one or two other places to make it work with different numbers of ground-truth and predicted bounding boxes.
Thanks a lot for this answer.

lelouedec commented on August 16, 2024

I am stuck (again) on the loss part, where you have:

        ##### 1. get ce loss of valid/positive bboxes, don't count the ce_loss of invalid/negative bboxes
        Y_bbox_helper_tp1 = tf.tile(Y_bbox_helper[:, :, None], [1, 1, points_num])
        bbox_loss_ce_all = -points_in_gt_bbox_prob * tf.log(points_in_pred_bbox_prob + 1e-8) \
                           -(1.-points_in_gt_bbox_prob)*tf.log(1.-points_in_pred_bbox_prob + 1e-8)
        bbox_loss_ce_pos = tf.reduce_sum(bbox_loss_ce_all*Y_bbox_helper_tp1)/tf.reduce_sum(Y_bbox_helper_tp1)
        bbox_loss_ce = bbox_loss_ce_pos

Since points_in_gt_bbox_prob and points_in_pred_bbox_prob do not have the same shape, they cannot be broadcast together.
Should I reshape and tile them to (B, T, H, nbpoints), where T is the number of ground-truth boxes and H the number of predicted boxes?

Yang7879 commented on August 16, 2024

@lelouedec in your case, where the number of gt bbox (T) is less than the number of pred bbox (H), you need to use "the matched top T bbox" from the pred bbox to calculate the loss, instead of using all H pred bbox.
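
One way to read this advice: reorder the H predictions with the ordering returned by the assignment and keep the first T, so that both probability tensors have shape (B, T, N) and no extra tiling is needed. The following is a minimal NumPy sketch under that reading, not the repository's implementation. It assumes `ordering` has the layout of `ordering_4_pred` above (matched prediction indices first); `gather_matched` and `masked_bce` are hypothetical helper names, and the toy values are made up.

```python
import numpy as np

def gather_matched(pred, ordering, num_gt):
    """Reorder predictions per batch item and keep the first num_gt.

    pred:     (B, H, N) per-point probabilities for H predicted boxes
    ordering: (B, H) int indices, matched predictions listed first
    returns:  (B, num_gt, N)
    """
    batch_idx = np.arange(pred.shape[0])[:, None]   # (B, 1), broadcasts with ordering
    return pred[batch_idx, ordering][:, :num_gt]

def masked_bce(gt_prob, pred_prob, valid, eps=1e-8):
    """Binary cross-entropy averaged over the points of valid gt boxes only.

    gt_prob, pred_prob: (B, T, N); valid: (B, T), 1 for real boxes, 0 for padding
    """
    ce = (-gt_prob * np.log(pred_prob + eps)
          - (1.0 - gt_prob) * np.log(1.0 - pred_prob + eps))
    mask = np.repeat(valid[:, :, None], gt_prob.shape[2], axis=2)
    return (ce * mask).sum() / mask.sum()

# Toy shapes: B=1, H=3 predictions, T=2 gt slots, N=4 points
pred = np.array([[[0.9, 0.9, 0.1, 0.1],
                  [0.5, 0.5, 0.5, 0.5],
                  [0.1, 0.1, 0.9, 0.9]]])
ordering = np.array([[2, 0, 1]])          # gt 0 -> pred 2, gt 1 -> pred 0
gt = np.array([[[0., 0., 1., 1.],
                [1., 1., 0., 0.]]])
valid = np.array([[1., 1.]])

matched = gather_matched(pred, ordering, num_gt=gt.shape[1])   # (1, 2, 4)
loss = masked_bce(gt, matched, valid)
```

The unmatched prediction (index 1 here) never enters the cross-entropy term, which is the point of selecting by assignment rather than using all H boxes.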

lelouedec commented on August 16, 2024

Good morning,
I followed your instructions and used the matched T boxes. However, even though the Pointnet semantic segmentation algorithm converges and outputs a segmentation mask close to reality, the bounding box regression never converges. I tried all the different criteria (l2 only, ce, etc.) but had no luck. Any idea?
I haven't made many changes to the code other than what you proposed.

EDIT:
I use the following to take the top T matched predictions:

sort_by = np.argsort(-y_bbscore_pred.clone().detach().cpu().numpy()[0,:])
y_bbscore_pred = y_bbscore_pred[:,sort_by[:target_bb.shape[1]]]
pred_bborder = pred_bborder[:,sort_by[:target_bb.shape[1]]]
y_bbvert_pred = y_bbvert_pred[:,sort_by[:target_bb.shape[1]]]

I also tried with the same number of boxes in the ground truth and the point cloud, with the same result: the bounding boxes never converge.

Yang7879 commented on August 16, 2024

Hi @lelouedec, according to your code, you already made "significant" changes. Your approach sorts all predictions by the predicted bbox scores, whereas our network orders all predictions by the optimal assignment.
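
To make the distinction concrete, here is a small sketch with made-up numbers (the cost matrix and scores are hypothetical, not taken from the network). It only illustrates that picking the top T predictions by score and picking them by optimal assignment need not select the same boxes; when they differ, a score-based selection would compare gt boxes against predictions that were never matched to them.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost matrix for T=2 gt boxes (rows) and H=4 predictions (columns);
# lower cost means a better match between that gt box and that prediction.
cost = np.array([[0.9, 0.1, 0.8, 0.7],
                 [0.8, 0.9, 0.7, 0.2]])
# Hypothetical predicted box scores for the same 4 predictions.
scores = np.array([0.95, 0.30, 0.90, 0.40])

# Score-based selection: top-T predictions by confidence, ignoring the gt boxes.
top_by_score = np.argsort(-scores)[:cost.shape[0]]

# Assignment-based selection: one prediction per gt box, minimising the total cost.
row_ind, col_ind = linear_sum_assignment(cost)

print("top-T by score:     ", top_by_score)   # predictions 0 and 2
print("optimal assignment: ", col_ind)        # predictions 1 and 3
```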

lelouedec commented on August 16, 2024

Good evening, I ended up going back to your code and reverting my changes, but I still can't reproduce the behavior from the paper, where the number of ground-truth bounding boxes differs from the number of predicted ones, since it is not possible to know how many objects there might be in the scene. Do you plan to release the code that achieves the paper's performance and behavior? Having a version of your code running in PyTorch could be beneficial, and I would be very thankful if I could use your work as a baseline on our dataset.

Yang7879 commented on August 16, 2024

Hi @lelouedec, essentially, the released code deals with the cases where the real number of gt bbox is not equal to that of pred bbox. To be specific, in data processing, all gt data are zero-padded to the same number of bbox (T), but in the loss calculation, those zero-padded bbox are not counted. This implementation allows the batch size to be more than 1 during training.

Unfortunately, I can only explain the logic of our algorithm, and am unable to implement a PyTorch version.
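
The padding step described above could look roughly like the following NumPy sketch. It is an illustration under assumptions rather than the repository's data pipeline: `pad_gt_boxes` is a hypothetical helper, and each box is assumed to be stored as a 2x3 array of min and max corners, consistent with the 2x3 `box_mask` used in the matching code.

```python
import numpy as np

def pad_gt_boxes(boxes, max_boxes):
    """Zero-pad an (n, 2, 3) array of box corners to (max_boxes, 2, 3)."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 2, 3)
    if boxes.shape[0] > max_boxes:
        raise ValueError("scene has more gt boxes than max_boxes")
    padded = np.zeros((max_boxes, 2, 3), dtype=np.float32)
    padded[:boxes.shape[0]] = boxes      # real boxes first, zero boxes after
    return padded

# Two scenes with different numbers of objects, padded to T=3 so they stack into one batch
scene_a = [[[0.1, 0.1, 0.1], [0.5, 0.5, 0.5]]]
scene_b = [[[0.2, 0.2, 0.2], [0.4, 0.6, 0.8]],
           [[0.0, 0.3, 0.1], [0.9, 0.7, 0.5]]]
batch = np.stack([pad_gt_boxes(scene_a, 3), pad_gt_boxes(scene_b, 3)])   # (2, 3, 2, 3)
```

Because every scene ends up with the same number of gt slots, scenes with different object counts can share a batch; the padded slots are then excluded when the loss is computed.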

lelouedec commented on August 16, 2024

OK, now I understand how you get up to the loss computation by zero-padding the gt bounding boxes. However, I am not sure where in the loss function code you discard the zero-padded boxes and keep only the H-T boxes (as you wrote in the paper).

Yang7879 commented on August 16, 2024

@lelouedec

3D-BoNet/helper_net.py

Line 252 in 07c99e5

Y_bbox_helper = tf.reduce_sum(tf.reshape(Y_bbvert, [-1, bb_num, 6]), axis=-1)
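
That line sums the six coordinates of each gt box, so a zero-padded box sums to exactly 0. Below is a NumPy sketch of how such a sum can be turned into a 0/1 validity mask and tiled over the points, in the way `Y_bbox_helper_tp1` is used in the loss snippet quoted earlier in the thread. The `> 0` thresholding is an assumption about how the sum is used afterwards, and it presumes that real boxes have a positive coordinate sum (for example, coordinates normalised to be non-negative); the box values are made up.

```python
import numpy as np

bb_num, points_num = 3, 5
# One scene, T=3 gt slots: two real boxes and one zero-padded slot
Y_bbvert = np.array([[[[0.1, 0.1, 0.1], [0.5, 0.5, 0.5]],
                      [[0.2, 0.2, 0.2], [0.4, 0.6, 0.8]],
                      [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]], dtype=np.float32)

# Mirrors the quoted TensorFlow line: sum the 6 coordinates of every box -> (B, T)
Y_bbox_helper = Y_bbvert.reshape(-1, bb_num, 6).sum(axis=-1)

# Assumed next step: turn the sum into a 0/1 mask (1 = real box, 0 = padding)
valid = (Y_bbox_helper > 0).astype(np.float32)

# Tile over points so the mask can multiply a (B, T, points_num) per-point loss
valid_per_point = np.tile(valid[:, :, None], (1, 1, points_num))

print(valid)                    # [[1. 1. 0.]]
print(valid_per_point.shape)    # (1, 3, 5)
```

Multiplying the per-point loss by this mask and dividing by the mask's sum averages over real boxes only, so padded slots contribute nothing.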

lelouedec commented on August 16, 2024

I understand a bit better now, but I am still not sure why the bounding box regression does not converge at all, even with code nearly 100% similar to yours.
Here is a compiled version of the PyTorch implementation I wrote:
https://gist.github.com/lelouedec/5a7ba5547df5cef71b50ab306199623f
In case someone wants to have a look or needs it in the future.
The Pointnet backbone works, as I am using it successfully on my dataset for semantic segmentation.

Yang7879 commented on August 16, 2024

Hi @lelouedec, thanks for sharing. I will check it out once I finish my task at hand.
