How does alignment affect classification? On LLM guardrail sensitivity

Contributors
Publication date 09-03-2025
Publisher Zenodo
Organisations
  • Faculty of Humanities (FGw) - Amsterdam Institute for Humanities Research (AIHR) - Amsterdam School for Cultural Analysis (ASCA)
Document type Dataset
DOI https://doi.org/10.5281/zenodo.14994330