Polyglot Machines [2024-]:
This NWO-funded VIDI project aims to improve low-resource language modeling, drawing inspiration from insights into child language acquisition. For example, we study the effect of different types of language input on the morpho-syntactic abilities acquired by the models. Key research questions include: Does training on child-directed language speed up language learning compared to training on adult-directed language (such as Wikipedia articles)? Which properties of child-directed language enable efficient learning in humans vs. machines? To scale this research up to a broader set of languages, we also develop new grammatical evaluation benchmarks, either cross-lingual or language-specific, and collect developmentally plausible LM training datasets in non-English languages.
Key people: Francesca, Jaap, Arianna, Yevgen
LESSEN Project [2023-]:
LESSEN is an NWO-funded, Netherlands-based consortium bringing together academic and industrial partners working on safe and efficient chat-based AI assistants, with a focus on low-resource (retail) domains. Example partners include Albert Heijn, bol.com, and KPN (a Dutch telecommunications company), all of which handle large volumes of user requests daily through chatbots. Within this consortium, we focus specifically on studying and improving LLMs’ ability to answer user requests accurately and consistently across different languages, for instance by using Retrieval-Augmented Generation (RAG) techniques and by inspecting model internals to attribute model answers to a specific textual source.
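At its simplest, a RAG pipeline pairs a retriever with a prompt that grounds the model’s answer in the retrieved source. The sketch below is purely illustrative: the mini knowledge base, the word-overlap retriever, and the prompt template are hypothetical stand-ins, not the LESSEN systems.

```python
import string

# Hypothetical mini knowledge base standing in for a retailer's support documents.
KNOWLEDGE_BASE = [
    "Orders placed before 22:00 are delivered the next day.",
    "Returns are free within 30 days of delivery.",
    "Customer service is available in Dutch and English.",
]

def tokens(text: str) -> set[str]:
    """Lowercase, punctuation-stripped word set for crude lexical matching."""
    return {w.strip(string.punctuation) for w in text.lower().split()}

def retrieve(question: str, passages: list[str]) -> str:
    """Return the passage with the highest word overlap with the question
    (a real system would typically use dense embeddings instead)."""
    q = tokens(question)
    return max(passages, key=lambda p: len(q & tokens(p)))

def build_prompt(question: str, passage: str) -> str:
    """Ground the (hypothetical) LLM's answer in the retrieved passage."""
    return (f"Answer using only the context below.\n"
            f"Context: {passage}\n"
            f"Question: {question}\n"
            f"Answer:")

question = "How many days do I have for free returns?"
passage = retrieve(question, KNOWLEDGE_BASE)
prompt = build_prompt(question, passage)
print(passage)  # the returns policy is the best lexical match
```

Tracing whether the generated answer actually relies on `passage` is exactly where the answer-attribution work mentioned above comes in.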
Key people: Jirui, Arianna, Raquel Fernández
InDeep Project [2021-]:
InDeep is an NWO-funded research consortium working on the interpretability of deep learning models of text, language, speech, and music. Within this consortium, we focus on applying model-internal analysis techniques to LLM generation tasks, such as machine translation and RAG-based question answering. We aim to bridge the gap between scientific advances and user needs by developing toolkits that facilitate access to advanced interpretability techniques (e.g. Inseq, MIRAGE) and by conducting user studies with professional translators.
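As a rough intuition for attribution, one of the simplest techniques is occlusion: remove one input token at a time and measure how much the model’s output changes. The toy sketch below illustrates only that idea; the tiny scoring function is a hypothetical stand-in for a real model, and this is not the Inseq or MIRAGE API.

```python
def model_score(words: list[str]) -> float:
    """Hypothetical stand-in for a model: counts positive-sentiment words."""
    positive = {"great", "good", "excellent"}
    return float(sum(w in positive for w in words))

def occlusion_attribution(words: list[str]) -> dict[str, float]:
    """Attribute the output to each word as the score drop when it is removed."""
    base = model_score(words)
    return {
        w: base - model_score(words[:i] + words[i + 1:])
        for i, w in enumerate(words)
    }

attr = occlusion_attribution("the service was great".split())
print(attr)  # only "great" receives a nonzero attribution
```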
Key people: Gabriele, Arianna, Malvina Nissim, Grzegorz Chrupała
NeLLCom [2019-]:
This project is the fruit of a long-running collaboration with language evolution expert Tessa Verhoef. Our goal is to use neural-network-based agents to simulate and study the emergence of universal language properties, such as the trade-off between word order and case marking as alternative strategies for conveying argument roles. For this purpose, we developed the Neural-agent Language Learning and Communication framework (NeLLCom), which combines supervised learning with reinforcement learning in a meaning reconstruction game. In our experiments, we teach the agents small artificial languages designed by cognitive scientists for experiments with human participants. We then let the agents communicate with each other and study how their language changes compared to what has been observed in humans.
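The two-phase setup described above — supervised learning of an artificial language, followed by a communication game with a reinforcement signal — can be sketched with tabular agents. Everything below is a hypothetical miniature for illustration, not the actual NeLLCom implementation (which uses neural speaker and listener networks).

```python
import random

random.seed(0)

# Toy artificial language: meanings are (agent, patient) pairs; the language
# maps each meaning to a fixed utterance whose word order conveys the roles.
MEANINGS = [(a, p) for a in "AB" for p in "XY"]
LANGUAGE = {m: f"{m[0]}{m[1]}" for m in MEANINGS}  # e.g. ('A', 'X') -> "AX"
UTTERANCES = sorted(set(LANGUAGE.values()))

def make_policy(inputs, outputs):
    """Uniform tabular policy P(output | input)."""
    return {i: {o: 1.0 / len(outputs) for o in outputs} for i in inputs}

speaker = make_policy(MEANINGS, UTTERANCES)   # meaning -> utterance
listener = make_policy(UTTERANCES, MEANINGS)  # utterance -> meaning

def nudge(policy, inp, target, lr=0.1):
    """Move probability mass toward a target output (keeps the row normalized)."""
    for o in policy[inp]:
        policy[inp][o] *= (1 - lr)
    policy[inp][target] += lr

# Phase 1: supervised learning of the artificial language.
for _ in range(200):
    m = random.choice(MEANINGS)
    nudge(speaker, m, LANGUAGE[m])
    nudge(listener, LANGUAGE[m], m)

def sample(policy, inp):
    outs, probs = zip(*policy[inp].items())
    return random.choices(outs, weights=probs)[0]

# Phase 2: meaning reconstruction game with a reinforcement signal.
def play_round(lr=0.05):
    m = random.choice(MEANINGS)
    u = sample(speaker, m)          # speaker produces an utterance
    guess = sample(listener, u)     # listener reconstructs the meaning
    reward = 1.0 if guess == m else 0.0
    if reward:                      # crude stand-in for a policy-gradient update
        nudge(speaker, m, u, lr)
        nudge(listener, u, guess, lr)
    return reward

for _ in range(500):
    play_round()

# Evaluate communication success (lr=0 disables learning during evaluation).
accuracy = sum(play_round(lr=0) for _ in range(200)) / 200
print(f"communication accuracy: {accuracy:.2f}")
```

With agents this small the language stays close to its supervised target; the research question is what happens to word order and case marking when the languages and agents give room for change.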
To pursue this line of work, we’ve been fortunate to supervise two PhD students funded by the China Scholarship Council: Yuchen Lian and Yuqing Zhang.
Key people: Yuqing, Yuchen, Arianna, Tessa Verhoef
Our journey continues — new projects are on the way!