Cabitza, F., Campagner, A., Ronzio, L., Cameli, M., Mandoli, G., Pastore, M., et al. (2023). Rams, hounds and white boxes: Investigating human–AI collaboration protocols in medical diagnosis. Artificial Intelligence in Medicine, 138 (April 2023) [10.1016/j.artmed.2023.102506].
Rams, hounds and white boxes: Investigating human–AI collaboration protocols in medical diagnosis
Cabitza F.; Campagner A.
2023
Abstract
In this paper, we study human–AI collaboration protocols, a design-oriented construct aimed at establishing and evaluating how humans and AI can collaborate in cognitive tasks. We applied this construct in two user studies involving 12 specialist radiologists (the knee MRI study) and 44 ECG readers of varying expertise (the ECG study), who evaluated 240 and 20 cases, respectively, in different collaboration configurations. We confirm the utility of AI support but find that XAI can be associated with a “white-box paradox”, producing a null or detrimental effect. We also find that the order of presentation matters: AI-first protocols are associated with higher diagnostic accuracy than human-first protocols, and with higher accuracy than both humans and AI alone. Our findings identify the best conditions for AI to augment human diagnostic skills, rather than trigger dysfunctional responses and cognitive biases that can undermine decision effectiveness.

File | Size | Format |
---|---|---|
Cabitza-2023-Art Intell Med-preprint.pdf (open access). Description: Research Article. Attachment type: Submitted Version (Pre-print). License: Other | 2.83 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.