Fig. 1: Continual Retroactive Dot-Product Attention.
The query (Q), key (K), and value (V) matrices are aggregated over time by caching the step vectors q_n, k_n, and v_n in a FIFO queue. During each step, only the entries of A associated with q_n, k_n, and the oldest key step, k_o, are computed.
The diagonal entries of the row-normalisation matrix D, as well as the product AV, can be updated retroactively by subtracting the features corresponding to k_o and adding the features related to k_n to the cached outputs of the previous step, D_{mem} and AV_{mem}, respectively.
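The retroactive update described above can be sketched in NumPy. This is an illustrative reimplementation, not the library's API: the class and method names are assumptions, and the cached D and AV correspond to D_{mem} and AV_{mem} in Fig. 1.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Reference: full softmax(Q K^T / sqrt(d)) V recomputation.
    A = np.exp(Q @ K.T / np.sqrt(Q.shape[1]))
    return A @ V / A.sum(axis=1, keepdims=True)

class ContinualRetroactiveAttention:
    """Sketch of continual retroactive dot-product attention.
    Q, K, V act as FIFO queues of step vectors; D and AV are the
    cached row sums and unnormalised outputs from the previous step."""

    def __init__(self, Q0, K0, V0):
        self.Q, self.K, self.V = Q0.copy(), K0.copy(), V0.copy()
        self.scale = 1.0 / np.sqrt(Q0.shape[1])
        A = np.exp(self.Q @ self.K.T * self.scale)
        self.D = A.sum(axis=1)   # diagonal of the row-normalisation matrix D
        self.AV = A @ self.V     # cached AV product

    def step(self, q_n, k_n, v_n):
        k_o, v_o = self.K[0], self.V[0]  # oldest key/value leaving the window
        # Retroactive update: subtract k_o's contribution, add k_n's.
        a_old = np.exp(self.Q @ k_o * self.scale)
        a_new = np.exp(self.Q @ k_n * self.scale)
        self.D = self.D - a_old + a_new
        self.AV = self.AV - np.outer(a_old, v_o) + np.outer(a_new, v_n)
        # Slide the FIFO queues; the oldest query's cached row is dropped.
        self.Q = np.vstack([self.Q[1:], q_n])
        self.K = np.vstack([self.K[1:], k_n])
        self.V = np.vstack([self.V[1:], v_n])
        self.D, self.AV = self.D[1:], self.AV[1:]
        # Compute the single fresh row for the newly arrived query q_n.
        a_q = np.exp(self.K @ q_n * self.scale)
        self.D = np.append(self.D, a_q.sum())
        self.AV = np.vstack([self.AV, a_q @ self.V])
        return self.AV / self.D[:, None]
```

Each call to `step` thus touches only one new attention row plus two rank-one corrections, instead of recomputing the full A matrix, and the result matches the naive recomputation over the current window.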
Fig. 2: Continual Single-Output Dot-Product Attention.
The key (K) and value (V) matrices are aggregated over time by caching the step vectors k_n and v_n in a FIFO queue. During each step, only the attention output associated with q is computed.
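A minimal sketch of the single-output variant, again with illustrative names rather than the library's API: only the key and value queues are cached, and each step computes the one attention row for the current query q.

```python
import numpy as np
from collections import deque

class ContinualSingleOutputAttention:
    """Sketch of continual single-output dot-product attention.
    Only k_n and v_n are cached in FIFO queues; each step computes
    a single attention output for the query q."""

    def __init__(self, seq_len, d):
        self.K = deque(maxlen=seq_len)  # oldest entry evicted automatically
        self.V = deque(maxlen=seq_len)
        self.scale = 1.0 / np.sqrt(d)

    def step(self, q, k_n, v_n):
        self.K.append(k_n)
        self.V.append(v_n)
        K, V = np.array(self.K), np.array(self.V)
        a = np.exp(K @ q * self.scale)  # one attention row for q
        return a @ V / a.sum()
```

Because no output rows are cached for past queries, there is nothing to update retroactively here; the per-step cost is a single row of dot products against the cached window.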
Setup
Continual Transformers and its modules can be installed in your project using:
Hello! I have installed "continual-inference", but I get the error below. I suspect my installed version of "continual-inference" is incorrect. Could you please tell me which version of "continual-inference" I should install?
from continual.module import CallMode
ImportError: cannot import name 'CallMode'