• Natanael@slrpnk.net
    link
    fedilink
    English
    arrow-up
    0
    ·
    edit-2
    9 days ago

    Humans learn a lot through repetition, no reason to believe that LLMs wouldn’t benefit from reinforcement of higher quality information. Especially because seeing the same information in different contexts helps mapping the links between the different contexts and helps dispel incorrect assumptions. But like I said, the only viable method they have for this kind of emphasis at scale is incidental replication of more popular works in its samples. And when something is duplicated too much it overfits instead.

    They need to fundamentally change big parts of how learning happens and how the algorithm learns to fix this conflict. In particular it will need a lot more “introspective” training stages to refine what it has learned, and pretty much nobody does anything even slightly similar on large models because they don’t know how, and it would be insanely expensive anyway.