Tag: Theoretic
KTO: Model Alignment as Prospect Theoretic Optimization
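1. Introduction. This report presents KTO (Kahneman-Tversky Optimization), a method for aligning large language models that is grounded in prospect theory. The method designs a human-aware loss function (HALO) to directly maximize the utility of model generations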
ALIGNMENT
Model
KTO
optimization
Theoretic
admin
4 months ago
[NIPS2017] A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning: Notes
Contents: Preface; Background and Related Work; Neural Fictitious Self-Play; Policy-Space Response Oracles; Meta-Strategy Solvers; Deep Cogni
Notes
GAME
Theoretic
Unified
Reinforcement
admin
4 months ago