Revisiting Transformer-based Models for Long Document Classification (2022)
Hierarchical Transformers Are More Efficient Language Models (2021)
Hierarchical Transformers for Multi-Document Summarization (ACL 2019)
H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences (ACL 2021)