ERGO is a model-agnostic inference-time framework that preserves LLM performance in multi-turn dialogue by monitoring token-level entropy and, upon detecting spikes, rewriting degraded context into a distilled, noise-reduced representation.
Abstract
Large Language Models (LLMs) suffer significant performance degradation in multi-turn conversations when information is presented incrementally. Given that multi-turn conversations characterize everyday interactions with LLMs, this degradation poses a severe challenge to real-world usability. We hypothesize that abrupt increases in model uncertainty signal misalignment in multi-turn LLM interactions, and we exploit this insight to dynamically realign conversational context. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), which continuously quantifies internal uncertainty via Shannon entropy over next-token distributions and triggers adaptive prompt consolidation when a sharp spike in entropy is detected. By treating uncertainty as a first-class signal rather than a nuisance to eliminate, ERGO embraces variability in language and modeling by representing and responding to uncertainty. In multi-turn tasks with incrementally revealed instructions, ERGO yields a 56.6% average performance gain over standard baselines, increases aptitude (peak performance capability) by 24.7%, and decreases unreliability (variability in performance) by 35.3%, demonstrating that uncertainty-aware interventions can improve both accuracy and reliability in conversational AI.
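The core loop is simple to sketch: compute Shannon entropy over each next-token distribution, compare it against the recent history, and rewrite the conversational context when a sharp spike appears. The snippet below is a minimal illustrative sketch of that loop, not the released implementation; the window size, spike ratio, and the `consolidate` hook are hypothetical placeholders.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def detect_spike(entropy_history, current_entropy, window=5, ratio=1.5):
    """Flag a spike when entropy sharply exceeds the recent running average.

    `window` and `ratio` are illustrative hyperparameters; the paper's actual
    trigger criterion may differ.
    """
    recent = entropy_history[-window:]
    if not recent:
        return False
    baseline = sum(recent) / len(recent)
    return current_entropy > ratio * baseline

def monitor_step(context, next_token_probs, entropy_history, consolidate):
    """One monitoring step: record entropy and, on a spike, reset the context.

    `consolidate` stands in for the prompt-consolidation call that distills
    the degraded multi-turn context into a compact, noise-reduced prompt.
    """
    h = shannon_entropy(next_token_probs)
    if detect_spike(entropy_history, h):
        context = consolidate(context)  # adaptive prompt consolidation
        entropy_history.clear()         # restart the baseline after a reset
    entropy_history.append(h)
    return context
```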
Average Performance Results
| Model | FULL | SHARDED | ERGO | Relative Improvement (ERGO over SHARDED) |
|---|---|---|---|---|
| GPT-4o | 79.2 | 51.4 | 74.1 | +44.2% |
| GPT-4.1 | 83.6 | 56.6 | 77.0 | +36.0% |
| GPT-4o-mini | 73.8 | 44.3 | 71.8 | +62.1% |
| Phi-4 | 64.6 | 36.4 | 59.2 | +62.6% |
| LLaMA-3.1-8B | 46.0 | 28.7 | 50.9 | +77.4% |
More Key Results
| Average Performance Gain | Peak Capability Increase | Decrease in Unreliability |
|---|---|---|
| 56.6% | 24.7% | 35.3% |
BibTeX
@inproceedings{mohammad-khalid-etal-2025-ergo,
title = "{ERGO}: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models",
author = "Mohammad Khalid, Haziq and
Jeyaganthan, Athikash and
Do, Timothy and
Fu, Yicheng and
Sharma, Vasu and
O{'}Brien, Sean and
Zhu, Kevin",
editor = "Noidea, Noidea",
booktitle = "Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP 2025)",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.uncertainlp-main.23/",
pages = "273--286",
ISBN = "979-8-89176-349-4"
}