Training large language models on narrow tasks can lead to broad misalignment

Finetuning a large language model on the narrow task of writing insecure code causes a broad range of concerning behaviours unrelated to coding.

