Nvidia breakthrough gives 4-bit pretraining technique the accuracy of FP8 (arxiv.org)
from yogthos@lemmy.ml to technology@lemmy.ml on 14 Oct 16:36
https://lemmy.ml/post/37532938

NVIDIA just trained a 12B-parameter language model on 10 trillion tokens with the bulk of the training compute in 4-bit (NVFP4) precision.

Here’s why this matters:

This is the first public demonstration of large-scale 4-bit pretraining that matches the accuracy of an FP8 baseline.

The next generation of frontier models will be faster and cheaper to train, without compromising on accuracy.
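For anyone curious what "4-bit precision" means mechanically, here's a minimal NumPy sketch of blockwise FP4 fake-quantization, roughly the kind of scheme NVFP4 describes: small blocks of values share one scale, and each value is snapped to the FP4 E2M1 grid. This is a float32 simulation, not NVIDIA's implementation; the real format stores the per-block scales in low precision, and the paper pairs the format with additional training tricks that this sketch omits.

```python
import numpy as np

# Magnitudes representable in FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def fake_quantize_fp4(x, block_size=16):
    """Simulate blockwise 4-bit quantization: split x into blocks,
    give each block one scale so its max magnitude lands on 6.0
    (the largest FP4 value), snap every entry to the nearest FP4
    grid point, then scale back. Everything stays float32."""
    x = np.asarray(x, dtype=np.float32)
    pad = (-x.size) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    # One scale per block, mapping the block's absolute max onto 6.0.
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    scale = np.where(amax > 0, amax / 6.0, 1.0)

    scaled = blocks / scale
    # Nearest-neighbor rounding onto the FP4 grid (sign handled separately).
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    deq = np.sign(scaled) * FP4_GRID[idx] * scale
    return deq.reshape(-1)[: x.size]

# Quick check: the error is bounded by each block's dynamic range.
x = np.random.randn(1024).astype(np.float32)
xq = fake_quantize_fp4(x)
print("mean abs error:", np.abs(x - xq).mean())
```

In real kernels the 4-bit codes and block scales are what actually get stored and fed to the tensor cores, so you get roughly half the memory traffic of FP8 plus higher matmul throughput on hardware with native FP4 support.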

#technology


technocrit@lemmy.dbzer0.com on 15 Oct 14:24

next generation of frontier models

lol. Too much grifter speak for me. Slow down on that Kool-Aid.

yogthos@lemmy.ml on 15 Oct 16:06

People building their whole identity around hating LLM tech will never stop being hilarious.