
ERGO is a model-agnostic inference-time framework that preserves LLM performance in multi-turn dialogue by monitoring token-level entropy and, upon detecting spikes, rewriting degraded context into a distilled, noise-reduced representation.
Abstract
Large Language Models (LLMs) suffer significant performance degradation in multi-turn conversations when information is presented incrementally. Because multi-turn conversation characterizes everyday interaction with LLMs, this degradation poses a severe challenge to real-world usability. We hypothesize that abrupt increases in model uncertainty signal misalignment in multi-turn LLM interactions, and we exploit this insight to dynamically realign conversational context. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), which continuously quantifies internal uncertainty via Shannon entropy over next-token distributions and triggers adaptive prompt consolidation when a sharp entropy spike is detected. By treating uncertainty as a first-class signal rather than a nuisance to be eliminated, ERGO embraces the variability inherent in language and modeling, both representing uncertainty and responding to it. On multi-turn tasks with incrementally revealed instructions, ERGO yields a 56.6% average performance gain over standard baselines, increases aptitude (peak performance capability) by 24.7%, and decreases unreliability (variability in performance) by 35.3%, demonstrating that uncertainty-aware interventions can improve both accuracy and reliability in conversational AI.
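To make the mechanism concrete, below is a minimal Python sketch of the monitor-and-reset loop described above. Only the core idea, Shannon entropy over next-token distributions triggering a context reset on a sharp spike, comes from the abstract; the `generate` callable, the mean-plus-k-standard-deviations spike rule, and the consolidation prompt wording are illustrative assumptions, not the paper's exact implementation.

```python
import math
from statistics import mean, pstdev

def turn_entropy(per_token_logprob_lists):
    """Average Shannon entropy (in nats) across the tokens generated in one
    turn. Each inner list holds log-probabilities of the top candidate
    tokens at that position (a truncated stand-in for the full
    next-token distribution)."""
    entropies = [-sum(math.exp(lp) * lp for lp in lps)
                 for lps in per_token_logprob_lists]
    return mean(entropies)

def is_entropy_spike(history, current, k=2.0, warmup=2):
    """Illustrative spike rule: fire when the current turn's entropy exceeds
    the running mean of earlier turns by k standard deviations. The paper
    detects 'sharp spikes'; this particular threshold is an assumption."""
    if len(history) < warmup:
        return False
    return current > mean(history) + k * pstdev(history)

# Hypothetical consolidation instruction; the paper's prompt may differ.
CONSOLIDATION_PROMPT = (
    "Rewrite the conversation so far as a single, self-contained task "
    "description that preserves every stated requirement."
)

def ergo_turn(generate, messages, entropy_history):
    """One ERGO step: generate a reply, measure its entropy, and on a spike
    replace the accumulated dialogue with a distilled prompt.
    `generate(messages)` stands in for any chat API that returns
    (reply_text, per_token_logprob_lists)."""
    reply, logprobs = generate(messages)
    h = turn_entropy(logprobs)
    if is_entropy_spike(entropy_history, h):
        summary, _ = generate(messages + [
            {"role": "user", "content": CONSOLIDATION_PROMPT}])
        messages = [{"role": "user", "content": summary}]  # reset context
    entropy_history.append(h)
    return reply, messages
```

Because the sketch only assumes per-token log-probabilities, it stays model-agnostic: any API or local model that exposes top-k logprobs can drive it.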
Average Performance Results
Model | FULL | SHARDED | ERGO | Relative Improvement (ERGO vs. SHARDED)
---|---|---|---|---
GPT-4o | 79.2 | 51.4 | 74.1 | +44.2%
GPT-4.1 | 83.6 | 56.6 | 77.0 | +36.0%
GPT-4o-mini | 73.8 | 44.3 | 71.8 | +62.1%
Phi-4 | 64.6 | 36.4 | 59.2 | +62.6%
LLaMA-3.1-8B | 46.0 | 28.7 | 50.9 | +77.4%
More Key Results
Average Performance Gain | Peak Capability Increase | Decrease in Unreliability
---|---|---
56.6% | 24.7% | 35.3%
BibTeX
@inproceedings{khalid2025ergo,
title={ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models},
author={Khalid, Haziq Mohammad and Jeyaganthan, Athikash and Do, Timothy and
Fu, Yicheng and O'Brien, Sean and Sharma, Vasu and Zhu, Kevin},
booktitle={Proceedings of the 2nd Workshop on Uncertainty-Aware NLP @ EMNLP 2025},
year={2025}
}