
Patania, S., Masiero, E., Brini, L., Piskovskyi, V., Ognibene, D., Donabauer, G., et al. (2024). Large Language Models as an active Bayesian filter: information acquisition and integration. In Proceedings of the 28th Workshop on the Semantics and Pragmatics of Dialogue, September, 11-12, 2024, University of Trento.

Large Language Models as an active Bayesian filter: information acquisition and integration

Patania, S.; Masiero, E.; Ognibene, D.; Donabauer, G.
2024

Abstract

This study investigates Large Language Models (LLMs) as dynamic Bayesian filters through question-asking experiments inspired by cognitive science. We analyse LLMs’ inference errors and the evolution of uncertainty across models using repeated sampling. Building on Bertolazzi et al. (2023), we trace LLM belief states during repeated queries, finding that entropy decreases with each interaction, signaling reduced uncertainty. However, issues like “resurrection” (reassigning probabilities to invalidated outcomes) and “Bayesian apocalypse” (probabilities approaching zero) reveal significant flaws. GPT-4o consistently outperforms GPT-3 in probabilistic reasoning. These results underscore the need for improved architectures for reliability in high-stakes contexts and suggest a link between token-level and task-level uncertainty dynamics that can be leveraged to enhance LLM performance.
paper
chatbot, information gain, question making, epistemic value, LLM
English
28th Workshop On the Semantics and Pragmatics of Dialogue
2024
Bernardi, R; Breitholtz, E; Riccardi, G
Proceedings of the 28th Workshop on the Semantics and Pragmatics of Dialogue, September, 11-12, 2024, University of Trento
2024
https://www.semdial.org/anthology/papers/Z/Z24/Z24-3006/
none
Files for this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/533843