Dynamic Prompt Compression for Efficient Inference of Large Language Models

Published in TKDE, 2026