Improving the Sample Efficiency of Pre-training Language Models
by Gábor Berend (University of Szeged)
The use of transformer-based pre-trained language models (PLMs) arguably dominates the natural language processing (NLP) landscape. The pre-training of such models, however, is notoriously data- and resource-hungry, which hinders their creation in low-resource settings and makes it a privilege of the few (mostly corporate) actors who have access to sufficient computational resources and/or pre-training data. The main goal of our research is to develop a novel, sample-efficient pre-training paradigm for PLMs that makes them usable in the low-data and/or low-compute regime, helping to democratise this disruptive technology beyond the current status quo.