Grzegorz Chrupała
I am an Associate Professor at the Department of Cognitive Science and Artificial Intelligence at Tilburg University, where I lead the Multimodal Language Learning (ℳℒ²) Lab.
I received my PhD from the School of Computing at Dublin City University. After that I worked as a researcher at the Spoken Language Systems group at Saarland University.
See full bio.
Research at the ℳℒ² Lab is inspired by the ease with which young children pick up any language they are exposed to, sometimes several languages at the same time, seemingly with little effort and practically no explicit instruction. The information they rely on is messy and unstructured, but it is rich and multimodal, including speech and gestures, visual and auditory perception, and interaction with other people. In contrast, the typical way computers learn language is by reading billions of words of written text, at best complemented by captioned static photos. In our lab we work on enabling machines to access rich data in multiple modalities and find systematic connections between them, as a way to learn to understand language in a more natural and data-efficient manner. Our approach will help us explore the limits of human-like learning and, if successful, will enable computers to deal not only with the world's largest languages, but also with those with little written material or no writing system at all.
In my free time I read and take photos.
Note to prospective PhD students: Please check the News section below as well as my Twitter account for announcements of available positions.
News
Recent
- I'm talking about putting natural back into Natural Language Processing at the 2nd Dutch Speech Tech Day.
- My course on Neural models of spoken language at the LOT Winter School 2024.
- Paper accepted to ICLR 2024: Quantifying the Plausibility of Context Reliance in Neural Machine Translation.
- Outstanding paper award for EMNLP 2023 paper Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers.
People
- Gaofei Shen. PhD candidate, Interpretability techniques for spoken language models.
- Gabriele Sarti. PhD candidate, User-centric interpretability for neural machine translation.
- Hosein Mohebbi. PhD candidate, Analyzing and interpreting deep neural models of language.
- Lisa Lepp. PhD candidate, Machine Translation for sign and spoken languages.
Alumni
- Chris Emmery. PhD thesis: User-Centered Security in Natural Language Processing.
- Bertrand Higy. Postdoc, Understanding visually grounded spoken language via multi-tasking.
- Patrick Bos. E-science engineer, Understanding visually grounded spoken language via multi-tasking.
- Christiaan Meijer. E-science engineer, Understanding visually grounded spoken language via multi-tasking.
- Ákos Kádár. PhD thesis: Learning Visually Grounded and Multilingual Representations.
Publications
For the complete list of publications check: Google Scholar | Semantic Scholar | DBLP | ACL Anthology | ORCID
Selected papers
- Nikolaus, M., Alishahi, A., & Chrupała, G. (2022). Learning English with Peppa Pig. Transactions of the Association for Computational Linguistics, 10, 922–936.
  Paper | Code
- Chrupała, G. (2022). Visually grounded models of spoken language: A survey of datasets, architectures and evaluation techniques. Journal of Artificial Intelligence Research, 73, 673–707.
  Paper
- Chrupała, G., & Alishahi, A. (2019). Correlating Neural and Symbolic Representations of Language. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 2952–2962).
  Paper | Code
- Chrupała, G., Gelderloos, L., & Alishahi, A. (2017). Representations of language in a model of visually grounded speech signal. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 613–622).
  Paper | Code
- Kádár, A., Chrupała, G., & Alishahi, A. (2017). Representation of linguistic form and function in recurrent neural networks. Computational Linguistics, 43(4), 761–780.
  Paper | Code
Recent Talks
- Putting Natural in NLP. University of Groningen. Slides
- Learning language from Peppa Pig. ILLC seminar, University of Amsterdam.
- Visually Grounded Models of Spoken Language and their Analysis. Lecture at ALPS Winter School. Video
- Investigating neural representations of speech and language. Keynote at the Nordic Conference on Computational Linguistics (NoDaLiDa), October 2019. Slides
Supervision
Selected MSc theses
- Jean Constantin. 2022. Identification of causal discourse relations in French text using machine-translated training resources. Tilburg University.
- Kristel van Rooij. 2021. Gender classification of first names using Long Short-Term Memory recurrent neural networks and support vector machine in various countries. Tilburg University.
- Aayushi Pandey. 2020. Emotion recognition in a model of visually grounded speech. Tilburg University.
- Dmitrijs Surenans. 2020. Machine Learning Explainability In Finance: An Application to Default Risk Analysis. Tilburg University.
- Dennis de Groot. 2019. Finding Structure in Neural Network Activation Patterns via Representational Similarity and Convolutional Kernels. Tilburg University.
- Mark van der Laan. 2018. Encoding of speaker identity in a Neural Network model of Visually Grounded Speech perception. Tilburg University.
- Lieke Gelderloos. 2016. Levels of representation in a recurrent neural model of visually grounded language learning. Tilburg University. See also Coling 2016 paper: From phonemes to images: levels of representation in a recurrent neural model of visually grounded language learning.
- Ákos Kádár. 2014. Grounded learning for source code component retrieval. Tilburg University.
- Antoaneta Baltadzhieva. 2014. Predicting question quality in question answering forums. Tilburg University.
- Huijing Deng. 2013. Probabilistic Models of API Retrieval. Saarland University. (See also Deng and Chrupała. 2014. Semantic approaches to software component retrieval with English queries. LREC.)
Bio
Grzegorz Chrupała is an Associate Professor at the Department of Cognitive Science and Artificial Intelligence at Tilburg University, where he leads the Multimodal Language Learning (or ℳℒ²) Lab. Previously he did postdoctoral research at the Spoken Language Systems group at Saarland University. He received his doctoral degree from the School of Computing at Dublin City University. His research focuses on computational models of learning (spoken) language in naturalistic multimodal settings, as well as analysis and interpretation of representations emerging in deep learning architectures. He regularly serves as Senior Area Chair for major NLP and AI conferences such as ACL and EMNLP. He was one of the creators of the popular BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP. His research has been funded by the Dutch Research Council (NWO), via ASDI and NWA-ORC grants.
Contact
Department of Cognitive Science and Artificial Intelligence
Tilburg University
PO Box 90153
5000 LE Tilburg
The Netherlands
Twitter: @gchrupala
Mastodon: @gchrupala@sigmoid.social
Web: grzegorz.chrupala.me
Email: grzegorz@chrupala.me