teknium/OpenHermes-2.5-Mistral-7B
Text Generation • 66.5k downloads • 822 likes
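A collection of chat models to explore the differences between three alignment techniques: Direct Preference Optimization (DPO), Identity Preference Optimization (IPO), and Kahneman-Tversky Optimization (KTO).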
Note: The chat model we optimized with DPO, IPO, and KTO.
Note: The AI feedback dataset we used to fine-tune OpenHermes-2.5 with DPO, IPO, and KTO.