Corso, V., Mariani, L., Micucci, D., Riganelli, O. (2024). Generating Java Methods: An Empirical Assessment of Four AI-Based Code Assistants. In Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension (pp.13-23). IEEE [10.1145/3643916.3644402].

Generating Java Methods: An Empirical Assessment of Four AI-Based Code Assistants

Corso, Vincenzo; Mariani, Leonardo; Micucci, Daniela; Riganelli, Oliviero
2024

Abstract

AI-based code assistants are promising tools that can facilitate and speed up code development. They exploit machine learning algorithms and natural language processing to interact with developers, suggesting code snippets (e.g., method implementations) that can be incorporated into projects. Recent studies have empirically investigated the effectiveness of code assistants using simple exemplary problems (e.g., the re-implementation of well-known algorithms), which fail to capture the spectrum and nature of the tasks actually faced by developers. In this paper, we expand the knowledge in the area by comparatively assessing four popular AI-based code assistants, namely GitHub Copilot, Tabnine, ChatGPT, and Google Bard, with a dataset of 100 methods that we constructed from real-life open-source Java projects, considering a variety of cases for complexity and dependency on contextual elements. Results show that Copilot is often more accurate than the other techniques, yet none of the assistants is completely subsumed by the others. Interestingly, the effectiveness of these solutions dramatically decreases when dealing with dependencies outside the boundaries of single classes.
Type: paper
Keywords: AI-based code assistants; Bard; ChatGPT; code completion; Copilot; empirical study; Tabnine
Language: English
Conference: 32nd IEEE/ACM International Conference on Program Comprehension, ICPC 2024, 15-16 April 2024
Proceedings: Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension
ISBN: 9798400705861
Year: 2024
Pages: 13-23
Access: open
Files in this record:

Corso-2024-ICPC-VoR.pdf (open access)

Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 4.98 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/490719
Citations
  • Scopus: 2
  • Web of Science: 1