Paper Notes

Tag: RLHF

3 notes under this tag.

  • May 2026

    Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

  • May 2026

    Qwen-Image-2.0 Technical Report

  • March 2026

    From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space

