SSH: Sparse Spectrum Adaptation via Discrete Hartley Transformation

Open Access
Authors
Publication date 2025
Host editors
  • Luis Chiruzzo
  • Alan Ritter
  • Lu Wang
Book title Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Proceedings of the Conference
Book subtitle NAACL 2025: April 29–May 4, 2025
ISBN (electronic)
  • 9798891761896
Event 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics
Volume | Issue number 1
Pages (from-to) 10400–10415
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Faculty of Economics and Business (FEB) - Amsterdam Business School Research Institute (ABS-RI)
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Low-rank adaptation (LoRA) has proven effective at reducing the number of trainable parameters when fine-tuning large language models (LLMs). However, it still faces computational and memory challenges when scaling to larger models or addressing more complex task adaptation.

In this work, we introduce Sparse Spectrum Adaptation via Discrete Hartley Transformation (SSH), a novel approach that significantly reduces the number of trainable parameters while enhancing model performance. SSH selects the most informative spectral components across all layers, guided by the discrete Hartley transformation (DHT) of the initial weights. A lightweight inverse DHT then projects the sparse spectrum back into the spatial domain to form the weight update.

Extensive experiments on both single-modality tasks (such as language understanding and generation) and multi-modality tasks (such as video-text understanding) demonstrate that SSH outperforms existing parameter-efficient fine-tuning (PEFT) methods while achieving substantial reductions in computational cost and memory requirements. For instance, during instruction tuning on the LLaMA3.1 8B model, SSH achieves higher accuracy with only 0.048M trainable parameters, compared to LoRA's 33.5M, while reducing computational cost by up to 55% relative to FourierFT.
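To make the mechanism in the abstract concrete, the following NumPy sketch illustrates the core idea as described there: take the DHT of a frozen pretrained weight, keep the k largest-magnitude spectral positions, train only those k coefficients, and map them back to a weight update with the inverse DHT. This is a minimal illustration under assumed details, not the authors' implementation; the names (dht2, idht2, delta_w, theta), the shapes, and the choice of k are hypothetical.

```python
import numpy as np

def dht2(x):
    # 2-D discrete Hartley transform via the identity
    # H(x) = Re(FFT(x)) - Im(FFT(x)), since cas(t) = cos(t) + sin(t).
    F = np.fft.fft2(x)
    return F.real - F.imag

def idht2(h):
    # The DHT is self-inverse up to a 1/(m*n) scale factor.
    m, n = h.shape
    return dht2(h) / (m * n)

# Hypothetical shapes and selection budget, for illustration only.
rng = np.random.default_rng(0)
W0 = rng.standard_normal((64, 64))   # frozen pretrained weight
k = 32                               # number of trainable spectral entries

# Select the k largest-magnitude spectral positions, guided by the
# DHT of the initial weight (as the abstract describes).
spectrum = dht2(W0)
flat_idx = np.argsort(np.abs(spectrum).ravel())[-k:]
rows, cols = np.unravel_index(flat_idx, W0.shape)

# Only these k coefficients are trained; zero init gives delta_w = 0,
# so the adapted model starts identical to the pretrained one.
theta = np.zeros(k)

def delta_w(theta):
    # Lightweight inverse DHT: project the sparse spectrum
    # back into the spatial domain to form the weight update.
    S = np.zeros_like(W0)
    S[rows, cols] = theta
    return idht2(S)

# Effective weight used in the forward pass.
W = W0 + delta_w(theta)
```

In such a setup only theta would receive gradients: with k = 32 coefficients against the 4096 entries of the full 64x64 update, fewer than 1% of the parameters are trained, which is the source of the parameter savings the abstract reports.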
Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2025.naacl-long.522