This is the official implementation of "Rethinking the Reference-based Distinctive Image Captioning", accepted at ACM Multimedia 2022.
We are planning to extend this work to a journal version. Our implementation is based on the $M^2$ Transformer; thanks to the authors of that repository.