On the Low-Rank Parametrization of Reward Models for Controlled Language Generation

Open Access
Publication date 08-2025
Journal Transactions on Machine Learning Research
Article number 4690
Volume | Issue number 2025
Number of pages 34
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Language models trained on large amounts of data are known to produce inappropriate content in some cases and require careful tuning before real-world use. We revisit an effective and modular approach to controllability of language models, in which an external expert model guides the decoding. In particular, we focus on the parametrization choice for the external expert, highlighting the difference between low-rank and higher-rank parametrizations. Higher-rank experts are designed for high flexibility when representing rewards, at the cost of higher computational overhead during decoding. However, we demonstrate that they might not use their full flexibility. By analyzing the recently proposed reward-augmented decoding approach (RAD), which uses a higher-rank expert model, we introduce a simpler and more efficient low-rank parametrization of the expert model, enabling fast and effective guided decoding. We empirically show that low-rank RAD performs on par with the more flexible RAD on a detoxification task and a sentiment control task, while requiring only a single reward model call per generated token.
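The guided-decoding idea the abstract describes can be illustrated with a minimal sketch. Assumptions: the function names, the toy logits, and the scalar `reward_fn` are all hypothetical stand-ins, not the paper's actual RAD implementation; `reward_fn(token_id)` represents one reward-model call per candidate (the higher-rank regime), whereas a low-rank expert would return rewards for the whole vocabulary in a single call.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def guided_distribution(lm_logits, reward_fn, beta=1.0, top_k=3):
    """Hypothetical sketch of reward-guided decoding for one step:
    re-weight the top-k LM candidates by an external reward, i.e.
    p'(x) proportional to p_LM(x) * exp(beta * r(x))."""
    probs = softmax(lm_logits)
    # Keep only the top-k candidates under the base LM.
    candidates = sorted(range(len(lm_logits)), key=lambda i: -probs[i])[:top_k]
    # One reward call per candidate (k calls per generated token).
    adjusted = [math.log(probs[i]) + beta * reward_fn(i) for i in candidates]
    renormed = softmax(adjusted)
    return dict(zip(candidates, renormed))

# Toy example: token 1 carries a reward bonus that overturns the LM's
# initial preference for token 0.
dist = guided_distribution(
    lm_logits=[2.0, 1.0, 0.5, -1.0],
    reward_fn=lambda i: 1.0 if i == 1 else 0.0,
    beta=2.0,
)
```

A low-rank expert would replace the per-candidate `reward_fn` loop with a single vectorized call producing rewards for every token at once, which is the source of the efficiency gain the abstract claims.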
Document type Article
Language English
Published at https://openreview.net/forum?id=cjRsEGLT8B
Other links
  • https://github.com/serjtroshin/rad-q/
  • https://jmlr.org/tmlr/papers/index.html
  • https://www.scopus.com/pages/publications/105015067766