↻ ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

Algoverse AI Research
Second Workshop on Uncertainty-Aware NLP @ EMNLP '25

Also presented at MTI-LLM @ NeurIPS '25

*Indicates Lead Author
[Introduction figure]


ERGO is a model-agnostic inference-time framework that preserves LLM performance in multi-turn dialogue by monitoring token-level entropy and, upon detecting spikes, rewriting degraded context into a distilled, noise-reduced representation.


Abstract

Large Language Models (LLMs) suffer significant performance degradation in multi-turn conversations when information is presented incrementally. Given that multi-turn conversations characterize everyday interactions with LLMs, this degradation poses a severe challenge to real-world usability. We hypothesize that abrupt increases in model uncertainty signal misalignment in multi-turn LLM interactions, and we exploit this insight to dynamically realign conversational context. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), which continuously quantifies internal uncertainty via Shannon entropy over next-token distributions and triggers adaptive prompt consolidation when a sharp spike in entropy is detected. By treating uncertainty as a first-class signal rather than a nuisance to eliminate, ERGO embraces variability in language and modeling, representing and responding to uncertainty. In multi-turn tasks with incrementally revealed instructions, ERGO yields a 56.6% average performance gain over standard baselines, increases aptitude (peak performance capability) by 24.7%, and decreases unreliability (variability in performance) by 35.3%, demonstrating that uncertainty-aware interventions can improve both accuracy and reliability in conversational AI.

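The following is a minimal sketch of the entropy-spike trigger described in the abstract, assuming access to per-step next-token probability distributions (e.g., from returned logprobs). The spike rule, the `ratio` threshold, and the consolidation step are illustrative assumptions for exposition, not the paper's exact implementation.

```python
# Sketch of ERGO-style entropy monitoring and context consolidation.
# Assumes the caller can obtain a probability distribution over the
# vocabulary at each decoding step (e.g., via logprobs from an API or
# a local model). Hyperparameters here are placeholders.
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_response_entropy(step_distributions: List[Sequence[float]]) -> float:
    """Average token-level entropy over all decoding steps of a response."""
    if not step_distributions:
        return 0.0
    return sum(shannon_entropy(d) for d in step_distributions) / len(step_distributions)


def is_entropy_spike(history: List[float], current: float, ratio: float = 1.5) -> bool:
    """Flag a spike when the current turn's entropy sharply exceeds the
    running average of previous turns. `ratio` is a hypothetical
    hyperparameter; the paper's exact trigger rule may differ."""
    if not history:
        return False
    baseline = sum(history) / len(history)
    return current > ratio * baseline


def consolidate_context(turns: List[str]) -> str:
    """Placeholder for the prompt-consolidation step: rewrite the
    accumulated (possibly degraded) turns into a single distilled
    instruction. In practice this would itself be an LLM call."""
    return "Consolidated task so far: " + " ".join(turns)
```

In a multi-turn loop, each turn's mean entropy would be appended to `history`; when `is_entropy_spike` fires, the conversation context would be replaced with the output of `consolidate_context(turns)` before generation continues.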

Average Performance Results

Model           FULL    SHARDED    ERGO    Relative Improvement
GPT-4o          79.2    51.4       74.1    +44.2%
GPT-4.1         83.6    56.6       77.0    +36.0%
GPT-4o-mini     73.8    44.3       71.8    +62.1%
Phi-4           64.6    36.4       59.2    +62.6%
LLaMA-3.1-8B    46.0    28.7       50.9    +77.4%
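The Relative Improvement column appears to compare ERGO against the SHARDED baseline: for GPT-4o, for example, (74.1 − 51.4) / 51.4 ≈ +44.2%, and the same calculation reproduces the other rows.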

More Key Results

Average Performance Gain: 56.6%
Peak Capability Increase: 24.7%
Decrease in Unreliability: 35.3%


BibTeX

@inproceedings{mohammad-khalid-etal-2025-ergo,
    title = "{ERGO}: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models",
    author = "Mohammad Khalid, Haziq  and
      Jeyaganthan, Athikash  and
      Do, Timothy  and
      Fu, Yicheng  and
      Sharma, Vasu  and
      O{'}Brien, Sean  and
      Zhu, Kevin",
    editor = "Noidea, Noidea",
    booktitle = "Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP 2025)",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.uncertainlp-main.23/",
    pages = "273--286",
    ISBN = "979-8-89176-349-4"
}