Chimisso, R., Buršić, S., Marocco, P., Vizzari, G., Ognibene, D. (2023). Exploration and Comparison of Deep Learning Architectures to Predict Brain Response to Realistic Pictures. Paper presented at: The Algonauts Project 2023 - Exploration and Comparison of Deep Learning Architectures to Predict Brain Response to Realistic Pictures Special session at 2023 Conference on Cognitive Computational Neuroscience, Oxford, UK.
Exploration and Comparison of Deep Learning Architectures to Predict Brain Response to Realistic Pictures
Vizzari, G; Ognibene, D
2023
Abstract
We present an exploration of machine learning architectures for predicting brain responses to realistic images, carried out for the Algonauts Challenge 2023. Our research involved extensive experimentation with various pretrained models. We started with simpler models to predict brain activity and gradually introduced more complex architectures that use the available data and embeddings generated by large-scale pretrained models. We encountered difficulties typical of machine learning problems, such as regularization and overfitting, as well as issues specific to the challenge: the difficulty of combining multiple input encodings and the high dimensionality, unclear structure, and noisy nature of the output. To overcome these issues we tested single edge 3D position-based, multi-region of interest (ROI), and hemisphere predictor models. The best results, however, came from employing multiple simple models, each dedicated to one ROI in one hemisphere of the brain of one subject and each consisting of a single fully connected linear layer that takes image embeddings generated by CLIP as input. While we surpassed the challenge baseline, our results fell short of establishing a robust association with the data.

File | Access | Attachment type | License | Size | Format
---|---|---|---|---|---
Chimisso-2023-The Algonauts Project-AAM.pdf | Open access | Author's Accepted Manuscript, AAM (Post-print) | Creative Commons | 535.05 kB | Adobe PDF
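The best-performing setup described in the abstract (one simple linear model per ROI, per hemisphere, per subject, fed with image embeddings) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. Several things here are assumptions: the embeddings are random stand-ins for precomputed CLIP features, the responses are synthetic, the ROI names `roi_a` and `roi_b` are hypothetical, and a closed-form ridge fit is swapped in for a gradient-trained fully connected layer. The per-vertex correlation is only an illustrative score.

```python
import numpy as np

def fit_linear(X, Y, l2=1.0):
    """Closed-form ridge fit of Y ~ [X, 1] @ W (assumption: stands in for a trained linear layer)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    A[-1, -1] -= l2                                 # leave the bias term unpenalized
    return np.linalg.solve(A, Xb.T @ Y)             # shape: (emb_dim + 1, n_roi_vertices)

def predict_linear(W, X):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ W

def fit_per_roi(X, responses, roi_masks, l2=1.0):
    """Fit one independent linear model per (hemisphere, ROI) for a single subject."""
    models = {}
    for hemi, Y in responses.items():
        for roi, mask in roi_masks[hemi].items():
            models[(hemi, roi)] = fit_linear(X, Y[:, mask], l2)
    return models

def predict_per_roi(models, X, roi_masks, n_vertices):
    """Write each ROI model's output back into its slots of the hemisphere array."""
    preds = {hemi: np.zeros((X.shape[0], n)) for hemi, n in n_vertices.items()}
    for (hemi, roi), W in models.items():
        preds[hemi][:, roi_masks[hemi][roi]] = predict_linear(W, X)
    return preds

def vertexwise_corr(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, computed per vertex."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    return (yt * yp).sum(axis=0) / (np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0))

# Synthetic data for one hypothetical subject (all sizes are illustrative).
rng = np.random.default_rng(0)
n_images, emb_dim = 200, 16
hemis = {"left": 30, "right": 30}                   # vertices per hemisphere
X = rng.standard_normal((n_images, emb_dim))        # stand-in for CLIP image embeddings

roi_masks, responses = {}, {}
for hemi, n in hemis.items():
    roi_a = np.zeros(n, dtype=bool)
    roi_a[:10] = True
    roi_masks[hemi] = {"roi_a": roi_a, "roi_b": ~roi_a}
    W_true = rng.standard_normal((emb_dim, n))
    responses[hemi] = X @ W_true + 0.1 * rng.standard_normal((n_images, n))

train, test = slice(0, 150), slice(150, None)
models = fit_per_roi(X[train], {h: Y[train] for h, Y in responses.items()}, roi_masks)
preds = predict_per_roi(models, X[test], roi_masks, hemis)
scores = {h: float(vertexwise_corr(responses[h][test], preds[h]).mean()) for h in hemis}
print(scores)
```

For several subjects, `fit_per_roi` would simply be called once per subject with that subject's responses and masks. Keeping the models independent makes each fit small, which is one plausible reason such a setup is easier to regularize than a single predictor over the whole output.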
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.