StateX: Enhancing RNN Recall via Post-training State Expansion
📰 ArXiv cs.AI
arXiv:2509.22630v2
Abstract: Recurrent neural networks (RNNs), such as linear attention and state-space models, have gained popularity due to their constant per-token complexity when processing long contexts. However, these recurrent models struggle with tasks that require accurate recall of contextual information from long contexts, because all contextual information is compressed into a fixed-size recurrent state. Previous studies have shown that recall ability is
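To make the fixed-size state bottleneck described in the abstract concrete, here is a minimal sketch of an unnormalized linear-attention recurrence in plain Python. It is an illustration only and not the StateX method: the function names (`linear_attention_step`, `read`, `run`), the toy dimensions, and the toy keys and values are all assumptions introduced for this example. The state is a `d_k x d_v` matrix whose size does not grow with the number of tokens, so each step costs the same amount of work, but distinct tokens whose keys overlap are summed into the same cells and can no longer be recalled separately.

```python
def linear_attention_step(state, key, value):
    """Add the outer product of key and value to the d_k x d_v state, in place."""
    for i, k_i in enumerate(key):
        for j, v_j in enumerate(value):
            state[i][j] += k_i * v_j
    return state


def read(state, query):
    """Return query^T @ state, i.e. the value recalled for this query."""
    d_v = len(state[0])
    return [sum(query[i] * state[i][j] for i in range(len(query)))
            for j in range(d_v)]


def run(keys, values, d_k, d_v):
    """Process a sequence token by token; the state stays d_k x d_v throughout."""
    state = [[0.0] * d_v for _ in range(d_k)]
    for k, v in zip(keys, values):
        linear_attention_step(state, k, v)
    return state


# Hypothetical toy data: the third key collides with the first one.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]

# Two tokens with orthogonal keys fit in a 2 x 2 state and are recalled exactly.
short_state = run(keys[:2], values[:2], d_k=2, d_v=2)
recalled_short = read(short_state, keys[0])

# A third token is written into the same 2 x 2 state and overlaps the first.
long_state = run(keys, values, d_k=2, d_v=2)
recalled_long = read(long_state, keys[0])

print(recalled_short)  # the first value, recovered exactly
print(recalled_long)   # the first and third values summed together
```

In this toy setting, querying with the first key returns the first value exactly while the state holds only two orthogonal keys, and returns a mixture once a third token is written. A larger state gives more room for non-overlapping keys, which is the general intuition behind expanding the recurrent state to improve recall.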