The summarize-from-feedback repository implements the methods from the paper “Learning to Summarize from Human Feedback”. Its purpose is to train a summarization model that better aligns with human preferences by first collecting human feedback (comparisons between summaries) to train a reward model, and then fine-tuning a policy (summarizer) to maximize that learned reward. The code includes different stages: a supervised baseline (i.e. standard summarization training), the reward modeling...