azorius.net

Alignment faking in large language models (www.anthropic.com)
in technology@lemmy.world from Joker@sh.itjust.works on 22 Dec 08:31
comments (12)

Mapping the Mind of a Large Language Model (www.anthropic.com)
in technology@lemmy.world from kromem@lemmy.world on 21 May 2024 22:52
comments (21)

Mapping the Mind of a Large Language Model (www.anthropic.com)
in hackernews@lemmy.smeargle.fans from bot@lemmy.smeargle.fans on 21 May 2024 16:25
comments (0)

Anthropic: Reflections on Our Responsible Scaling Policy (www.anthropic.com)
in hackernews@lemmy.smeargle.fans from bot@lemmy.smeargle.fans on 20 May 2024 04:58
comments (0)