Multiagent Finetuning. Our self improvement approach constructs a multiagent set of language models over multiple rounds of finetuning. At each round of finetuning, models specialize to become generation and critic agents, and agents in each further specializing based off their generations in the previous round of finetuning.
Leave a reply