Grzegorz Chrupała
I am an assistant professor at the Tilburg Center for Cognition and Communication at Tilburg University.
My research focuses on computational language learning. I am especially interested in cognitively inspired models of language learning that illuminate language acquisition in humans while also being useful for real-world text understanding applications.
I received my PhD from the School of Computing at Dublin City University. After that I worked as a researcher at the Spoken Language Systems group at Saarland University.
Publications
My Google Scholar citation statistics
- Benjamin Roth, Grzegorz Chrupała, Michael Wiegand and Mittul Singh. 2012. Generalizing from Freebase and Patterns using Distant Supervision for Slot Filling. TAC 2012.
Paper
- Afra Alishahi and Grzegorz Chrupała. 2012. Concurrent Acquisition of Word Meaning and Lexical Categories. EMNLP-CoNLL 2012.
Paper | Poster
- Grzegorz Chrupała. 2012. Hierarchical clustering of word class distributions. NAACL-HLT 2012 Workshop on the Induction of Linguistic Structure.
Paper
- Grzegorz Chrupała. 2012. Learning from evolving data streams: online triage of bug reports. EACL 2012.
Paper | Slides | Data | Code
- Fang Xu, Stefan Kazalski, Grzegorz Chrupała, Benjamin Roth, Xujian Zhao, Michael Wiegand and Dietrich Klakow. 2011. Saarland University Spoken Language Systems Group at TAC KBP 2011. TAC 2011.
Paper
- Grzegorz Chrupała. 2011. Efficient induction of probabilistic word classes with LDA. IJCNLP 2011.
Paper | Slides | Code
- Grzegorz Chrupała, Saeedeh Momtazi, Michael Wiegand, Stefan Kazalski, Fang Xu, Benjamin Roth, Alexandra Balahur and Dietrich Klakow. 2010. Saarland University Spoken Language Systems at the Slot Filling Task of TAC KBP 2010. TAC 2010.
Paper
- Grzegorz Chrupała, Georgiana Dinu and Benjamin Roth. 2010. Enriched syntax-based meaning representation for answer extraction. SIGIR 2010 Workshop: Query Representation and Understanding.
Paper | Poster
- Grzegorz Chrupała and Afra Alishahi. 2010. Online Entropy-based Model of Lexical Category Acquisition. CoNLL 2010.
Paper | Slides | Code
- Georgiana Dinu and Grzegorz Chrupała. 2010. Relatedness curves for acquiring paraphrases. ACL workshop GEMS 2010.
Paper
- Djamé Seddah, Grzegorz Chrupała, Özlem Çetinoğlu, Josef van Genabith and Marie Candito. 2010. Lemmatization and Lexicalized Statistical Parsing of Morphologically Rich Languages: the Case of French. NAACL SPMRL 2010 workshop.
Paper
- Grzegorz Chrupała and Dietrich Klakow. 2010. A Named Entity Labeler for German: exploiting Wikipedia and distributional clusters. LREC 2010.
Paper | Code
- Afra Alishahi and Grzegorz Chrupała. 2009. Lexical Category Acquisition as an Incremental Process. PsychoCompLA-2009, Cogsci 2009.
Paper
- Michael Wiegand, Saeedeh Momtazi, Stefan Kazalski, Fang Xu, Grzegorz Chrupała and Dietrich Klakow. 2008. The Alyssa System at TAC QA 2008. TAC 2008.
Paper
- Grzegorz Chrupała, Georgiana Dinu and Josef van Genabith. 2008. Learning Morphology with Morfette. LREC 2008.
Paper | Code
- Grzegorz Chrupała and Josef van Genabith. 2007. Using very large corpora to detect raising and control verbs. LFG07.
Paper
- Grzegorz Chrupała, Nicolas Stroppa, Josef van Genabith and Georgiana Dinu. 2007. Better Training for Function Labeling. RANLP 2007.
Paper | Code
- Grzegorz Chrupała. 2006. Simple Data-Driven Context-Sensitive Lemmatization. SEPLN 2006.
Paper
- Grzegorz Chrupała and Josef van Genabith. 2006. Using Machine-Learning to Assign Function Labels to Parser Output for Spanish. COLING/ACL 2006.
Paper
- Grzegorz Chrupała and Josef van Genabith. 2006. Improving Treebank-Based Automatic LFG Induction for Spanish. LFG06.
Paper
- Xavier Carreras, Lluís Màrquez and Grzegorz Chrupała. 2004. Hierarchical Recognition of Propositional Arguments with Perceptrons. CoNLL-2004.
Paper
- Anthony Pym and Grzegorz Chrupała. 2005. The quantitative analysis of translation flows in the age of an international language. In Less Translated Languages, Albert Branchadell and Lovell Margaret West (eds.), 27-38. John Benjamins.
- Grzegorz Chrupała. 2003. Perl Scripting in Translation Project Management. In Across Languages and Cultures, Vol. 4, No. 1. (5 May 2003), pp. 109-132.
Paper
- Grzegorz Chrupała and Lidia Cámara. 2003. STAR Transit XV. In Entornos Informáticos de la Traducción Profesional, Gloria Corpas Pastor and María-José Varela Salinas (eds.). Atrio, Granada.
Theses
- Grzegorz Chrupała. 2008. Towards a Machine-Learning Architecture for Lexical Functional Grammar Parsing. PhD dissertation, Dublin City University.
PDF | Single-spaced PDF
- Grzegorz Chrupała. 2003. Acquiring Verb Subcategorization from Spanish Corpora. DEA Thesis, University of Barcelona.
PDF
- Grzegorz Chrupała. 1998. Bibliotheca in Fabula. The library motive in La biblioteca de Babel, The British Museum is Falling Down and Il nome della rosa. MA Thesis, University of Silesia.
HTML
Software
Morfette is used by many people for morphological analysis. SemiNER is also popular for named entity tagging in German. Many of the other packages are quite specialized and are mostly of interest to researchers working on problems similar to mine.
- Colada: implements online and minibatch word class induction using Latent Dirichlet Allocation (LDA) with an online Gibbs sampler.
- Ladybug: Online (incremental) triage of bug reports.
- LDA-wordclass: Soft word-class induction with Latent Dirichlet Allocation
- Lingo: Haskell NLP utilities
- Delta-H: Online entropy-based model of lexical category acquisition
- Sequor: a perceptron-based sequence labeler with a flexible feature template language.
It is meant mainly for NLP applications such as Part of Speech tagging, syntactic chunking or Named Entity labeling. Includes:
- SemiNER: a semi-supervised Named Entity labeler (with pre-trained models for German)
- Morfette: a tool for supervised learning of inflectional morphology. Comes with pre-trained models for Spanish and French.
- Funtag: Add grammatical function labels to constituency parse trees
Teaching
Invited Lectures and Tutorials
- Machine learning for NLP and MT (with Nicolas Stroppa, Google, Zurich)
EU META Network of Excellence workshop, Barcelona, Spain, October 2010. Two-day intensive course for graduate students and META researchers covering a selection of machine learning techniques useful for NLP.
- Introduction to classification and sequence labeling
International Research Training Group - Annual Meeting, Irsee, Germany, June 2009. Half-day intensive tutorial for graduate students covering basic machine learning techniques useful for NLP. Slides
- Machine Learning for NLP
Centre for Next-Generation Localisation, Dublin, Ireland, March 2009. Two-day intensive tutorial for graduate students covering a selection of machine learning techniques useful for NLP. Slides
Contact
Dr. Grzegorz Chrupała
Communication and Information Sciences
Tilburg University
PO Box 90153
5000 LE Tilburg
The Netherlands
Web: grzegorz.chrupala.me
Phone: +31 13 466 3106
Email: g.chrupala@uvt.nl