• RLHF, Reinforcement Learning from Human Feedback: have human annotators pose questions to the model and give its answers reward or penalty feedback, which is then used to steer the model (see the sketch below).
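A minimal, self-contained sketch of the core idea, human reward/penalty signals nudging a policy toward preferred answers, using a toy two-answer policy and a REINFORCE-style update. All names (`human_reward`, `logits`, the learning rate) are illustrative assumptions, not from any library; real RLHF pipelines typically first fit a reward model on pairwise human preferences and then optimize the policy with an algorithm such as PPO.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "policy": logits over two candidate answers to a fixed question.
logits = np.zeros(2)
LEARNING_RATE = 0.5

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def human_reward(action):
    # Stand-in for a human rater (an assumption for this sketch):
    # answer 1 is rated good (+1), answer 0 is rated bad (-1).
    return 1.0 if action == 1 else -1.0

for step in range(50):
    probs = softmax(logits)
    action = rng.choice(2, p=probs)   # model produces an answer
    reward = human_reward(action)     # human gives reward or penalty
    # REINFORCE update: d(log p(action))/d(logits) = onehot(action) - probs
    grad_log_p = -probs
    grad_log_p[action] += 1.0
    logits += LEARNING_RATE * reward * grad_log_p

print("final answer probabilities:", softmax(logits))
# After training, the policy concentrates on the answer humans rewarded.
```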