Hi, In the paper, it is described that 'Following other works [35],


Test set of MSR-VTT for downstream evaluation about frozen-in-time HOT 5 CLOSED

m-bain commented on July 16, 2024

Test set of MSR-VTT for downstream evaluation

from frozen-in-time.

Comments (5)

bryant1410 commented on July 16, 2024

An external user here who has run into the same thing.

MSR-VTT 1K-A (from JSFusion work) doesn't have a "val" split, so people kind of use the names "test" and "val" interchangeably for it.


geyuying commented on July 16, 2024

@bryant1410 Did you evaluate the pretrained model on the MSR-VTT 1K-A test set? Both the zero-shot and the finetuned results are higher than those reported in Table 5 of the paper.


bryant1410 commented on July 16, 2024

I don't think I have run the fine-tuned evaluation with the provided model. For the zero-shot one, I get pretty similar results with different code (mine are slightly lower). Differences on MSR-VTT can be related to the fact that there are repeated labels (so there are ties).


bryant1410 commented on July 16, 2024

(but I'm not sure how much the repeated-labels thing affects the results)
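
To make the tie effect concrete, here is a minimal sketch (not the repository's evaluation code) of how the tie-breaking rule can change a retrieval rank and therefore R@k. It assumes that repeated captions produce identical similarity scores for a query; the function names `rank_of_target` and `recall_at_k` and the toy scores are hypothetical.

```python
def rank_of_target(scores, target, tie_break="optimistic"):
    """Return the 1-based rank of scores[target]; higher score is better.

    tie_break="optimistic": the target is placed ahead of equal-scored items.
    tie_break="pessimistic": the target is placed behind equal-scored items.
    """
    s = scores[target]
    higher = sum(1 for x in scores if x > s)
    tied = sum(1 for i, x in enumerate(scores) if x == s and i != target)
    if tie_break == "optimistic":
        return higher + 1
    if tie_break == "pessimistic":
        return higher + tied + 1
    raise ValueError(f"unknown tie_break: {tie_break!r}")


def recall_at_k(ranks, k):
    """Fraction of queries whose target rank is at most k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy example (made-up scores): candidates 1 and 2 are tied,
# as could happen when two captions are identical.
scores = [0.9, 0.7, 0.7, 0.1]
optimistic = rank_of_target(scores, target=2, tie_break="optimistic")
pessimistic = rank_of_target(scores, target=2, tie_break="pessimistic")
print(optimistic, pessimistic)
print(recall_at_k([optimistic], k=2), recall_at_k([pessimistic], k=2))
```

With these toy scores the same query is a hit at R@2 under optimistic tie-breaking and a miss under pessimistic tie-breaking, so two otherwise correct evaluation scripts can report slightly different numbers.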


m-bain commented on July 16, 2024

Hi, yes, unfortunately MSR-VTT 1k-A does not have a separate test split (like many of the downstream retrieval datasets), so val and test are one and the same, as @bryant1410 says. The line in the paper ought to be: "we train on 9k train videos, and val/test on 1k"

Regarding the resulting numbers being slightly higher: I retrained the pre-trained models after submission when rewriting the code, and performance increased a bit -- hence the higher ZS results.

For finetuning, the current code picks the best-performing checkpoint on val == test, which performs better than if you train and evaluate at a pre-decided, fixed number of epochs (as described in the paper). Doing the latter will give results closer to those written in the paper.
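
The difference between the two checkpoint-selection protocols can be sketched as follows. This is an illustration only, not the repository's training loop: `select_best`, `select_fixed`, and the per-epoch values are made up for the example and are not results from the paper.

```python
def select_best(metric_per_epoch):
    """Pick the epoch with the highest metric on the evaluation split.

    When val == test, this selects the checkpoint using test performance,
    so the reported number is optimistic.
    """
    best_epoch = max(range(len(metric_per_epoch)), key=lambda e: metric_per_epoch[e])
    return best_epoch, metric_per_epoch[best_epoch]


def select_fixed(metric_per_epoch, epoch):
    """Report the metric at an epoch chosen before looking at the results."""
    return epoch, metric_per_epoch[epoch]


# Toy per-epoch scores (arbitrary values, higher is better).
toy_metric = [0.10, 0.40, 0.30, 0.20]

print(select_best(toy_metric))                                # best checkpoint on val == test
print(select_fixed(toy_metric, epoch=len(toy_metric) - 1))    # pre-decided last epoch
```

Whatever the per-epoch curve looks like, the best-checkpoint number can never be lower than the fixed-epoch number, which is consistent with the released code reporting higher finetuned results than the paper.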

