llm.c: multi-GPU, bfloat16, flash attention, ~7% faster than PyTorch (twitter.com)
from bot@lemmy.smeargle.fans to hackernews@lemmy.smeargle.fans on 04 May 2024 22:35
https://lemmy.smeargle.fans/post/157663

HN Discussion

#hackernews
