AI-based code assistants are promising tools that can facilitate and speed up code development. They exploit machine learning algorithms and natural language processing to interact with developers, suggesting code snippets (e.g., method implementations) that can be incorporated into projects. Recent studies empirically investigated the effectiveness of code assistants using simple exemplary problems (e.g., the re-implementation of well-known algorithms), which fail to capture the spectrum and nature of the tasks actually faced by developers. In this paper, we expand the knowledge in the area by comparatively assessing four popular AI-based code assistants, namely GitHub Copilot, Tabnine, ChatGPT, and Google Bard, with a dataset of 100 methods that we constructed from real-life open-source Java projects, considering a variety of cases for complexity and dependency from contextual elements. Results show that Copilot is often more accurate than other techniques, yet none of the assistants is completely subsumed by the rest of the approaches. Interestingly, the effectiveness of these solutions dramatically decreases when dealing with dependencies outside the boundaries of single classes.
Corso, V., Mariani, L., Micucci, D., Riganelli, O. (2024). Generating Java Methods: An Empirical Assessment of Four AI-Based Code Assistants. In Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension (pp.13-23). IEEE [10.1145/3643916.3644402].
Generating Java Methods: An Empirical Assessment of Four AI-Based Code Assistants
Corso, Vincenzo;Mariani, Leonardo;Micucci, Daniela;Riganelli, Oliviero
2024
Abstract
AI-based code assistants are promising tools that can facilitate and speed up code development. They exploit machine learning algorithms and natural language processing to interact with developers, suggesting code snippets (e.g., method implementations) that can be incorporated into projects. Recent studies empirically investigated the effectiveness of code assistants using simple exemplary problems (e.g., the re-implementation of well-known algorithms), which fail to capture the spectrum and nature of the tasks actually faced by developers. In this paper, we expand the knowledge in the area by comparatively assessing four popular AI-based code assistants, namely GitHub Copilot, Tabnine, ChatGPT, and Google Bard, with a dataset of 100 methods that we constructed from real-life open-source Java projects, considering a variety of cases for complexity and dependency from contextual elements. Results show that Copilot is often more accurate than other techniques, yet none of the assistants is completely subsumed by the rest of the approaches. Interestingly, the effectiveness of these solutions dramatically decreases when dealing with dependencies outside the boundaries of single classes.File | Dimensione | Formato | |
---|---|---|---|
Corso-2024-ICPC-VoR.pdf
accesso aperto
Tipologia di allegato:
Publisher’s Version (Version of Record, VoR)
Licenza:
Creative Commons
Dimensione
4.98 MB
Formato
Adobe PDF
|
4.98 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.