EMA Is Not All You Need: Mapping the Boundary Between Structure and Content in Recurrent Context
📰 ArXiv cs.AI
arXiv:2604.08556v1 (announce type: cross)
Abstract: What exactly do efficient sequence models gain over simple temporal averaging? We use exponential moving average (EMA) traces, the simplest recurrent context (no gating, no content-based retrieval), as a controlled probe to map the boundary between what fixed-coefficient accumulation can and cannot represent. EMA traces encode temporal structure: a Hebbian architecture with multi-timescale traces achieves 96% of a supervised BiGRU on grammatical
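
To make "fixed-coefficient accumulation" concrete, here is a minimal sketch of multi-timescale EMA traces in plain Python. The excerpt does not give the paper's exact update rule, so the normalised form `h_t = decay * h_{t-1} + (1 - decay) * x_t`, the zero initial state, and the names `ema_traces` and `decays` are assumptions made for illustration only.

```python
from typing import List, Sequence


def ema_traces(
    inputs: Sequence[Sequence[float]],
    decays: Sequence[float],
) -> List[List[List[float]]]:
    """Return, for every time step, one EMA trace per decay rate.

    Assumed update rule (fixed coefficient, no gating, no retrieval):
        h_t = decay * h_{t-1} + (1 - decay) * x_t
    """
    if not inputs:
        return []
    dim = len(inputs[0])
    # One trace vector per timescale, all starting at zero (an assumption).
    traces = [[0.0] * dim for _ in decays]
    history = []
    for x in inputs:
        # Every trace is updated with its own constant decay; nothing in
        # the update depends on the content of x beyond the weighted sum.
        traces = [
            [d * h + (1.0 - d) * xi for h, xi in zip(trace, x)]
            for d, trace in zip(decays, traces)
        ]
        history.append(traces)
    return history


# Hypothetical one-hot tokens A and B.
A, B = [1.0, 0.0], [0.0, 1.0]
ab = ema_traces([A, B], decays=[0.0, 0.5])[-1]
ba = ema_traces([B, A], decays=[0.0, 0.5])[-1]
print(ab)  # [[0.0, 1.0], [0.25, 0.5]]
print(ba)  # [[1.0, 0.0], [0.5, 0.25]]
```

The two orderings leave different final traces, which is one simple sense in which such traces carry temporal structure: more recent tokens receive larger weights. A decay of `0.0` keeps only the latest input, while larger decays blend in older inputs with geometrically shrinking weights.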