Projected Autoregression: Autoregressive Language Generation in Continuous State Space

📰 arXiv cs.AI

Projected Autoregression generates text by predicting next-token vectors in embedding space and projecting them to discrete tokens at commitment time

Published 7 Apr 2026
Action Steps
  1. Replace traditional token selection with continuous prediction in embedding space
  2. Predict next-token vectors via autoregressive models
  3. Project predicted vectors to discrete tokens at commitment time (see the sketch after this list)
  4. Evaluate the generated text using standard language evaluation metrics
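
To make steps 1–3 concrete, here is a minimal PyTorch sketch. The `backbone` (a sequence model mapping embeddings to predicted next-token vectors), the cosine nearest-neighbor projection, and the feedback rule are all assumptions for illustration; the summary does not specify how vectors are projected at commitment time or what is fed back into the model.

```python
import torch
import torch.nn as nn


def project_to_token(vec: torch.Tensor, embedding: nn.Embedding) -> torch.Tensor:
    """Project a predicted vector onto its nearest token embedding (cosine).

    The projection rule is an assumption: the summary says vectors are
    projected to discrete tokens at commitment time, but not how.
    """
    table = nn.functional.normalize(embedding.weight, dim=-1)  # (V, d)
    vec = nn.functional.normalize(vec, dim=-1)                 # (d,)
    return torch.argmax(table @ vec)                           # scalar token id


@torch.no_grad()
def generate(backbone: nn.Module, embedding: nn.Embedding,
             prompt_ids: torch.Tensor, max_new_tokens: int) -> torch.Tensor:
    """Autoregressive generation in continuous embedding space."""
    states = embedding(prompt_ids)  # (T, d): prompt as continuous states
    out = prompt_ids.tolist()
    for _ in range(max_new_tokens):
        # Step 2: predict the next-token *vector* from the running state.
        next_vec = backbone(states.unsqueeze(0))[0, -1]
        # Step 3: commit by projecting the vector to a discrete token.
        tok = project_to_token(next_vec, embedding)
        out.append(int(tok))
        # Feed back the committed token's embedding (one plausible choice;
        # feeding back the raw predicted vector is another).
        states = torch.cat([states, embedding(tok).unsqueeze(0)], dim=0)
    return torch.tensor(out)
```

Note that because the model never emits logits, familiar decoding knobs like temperature sampling would need continuous analogues, another detail the summary leaves open.
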
Who Needs to Know This

NLP researchers and AI engineers can benefit from this approach: it offers a new perspective on autoregressive language generation, modeling language flexibly and continuously rather than as a sequence of discrete choices.

Key Insight

💡 Autoregressive language models can be designed to predict continuous vectors in embedding space instead of discrete tokens, allowing for more flexible modeling of language

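This insight also implies a different training objective: regress the next token's embedding rather than classify over the vocabulary. Below is a minimal sketch assuming a cosine regression loss; the actual loss used by the paper is not stated in this summary, so that choice is an assumption.

```python
import torch
import torch.nn as nn


def continuous_ar_loss(backbone: nn.Module, embedding: nn.Embedding,
                       token_ids: torch.Tensor) -> torch.Tensor:
    """Next-vector prediction loss: regress embeddings, not token logits."""
    states = embedding(token_ids)      # (B, T, d)
    preds = backbone(states[:, :-1])   # predicted vectors for positions 1..T-1
    targets = states[:, 1:].detach()   # next-token embeddings as targets
    # 1 - cosine similarity; the cosine choice is an assumption, not the paper's.
    cos = nn.functional.cosine_similarity(preds, targets, dim=-1)  # (B, T-1)
    return (1.0 - cos).mean()
```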