Delayed Homomorphic Reinforcement Learning for Environments with Delayed Feedback
arXiv cs.AI
arXiv:2604.03641v1 Announce Type: cross Abstract: Reinforcement learning in real-world systems is often subject to delayed feedback, which breaks the Markov assumption and impedes both learning and control. Canonical state-augmentation approaches cause state-space explosion, which imposes a severe sample-complexity burden. Despite recent progress, state-of-the-art augmentation-based baselines remain incomplete: they either predominantly reduce the burden on the critic or adopt non