onlinepersona@programming.dev
What is a “kernel” in this context? It doesn’t seem to be related to the OS kernel, but to some kind of graphics kernel? Whatever that is…
Anti Commercial-AI license
bitfucker@programming.dev
on 01 Jun 08:27
In the context of machine learning, it’s usually a list of numbers arranged in a certain way and used in a mathematical operation. You can think of it as a transfer/transform “function” that takes data as input and spits out a representation of that data in some other form (a representation we usually don’t know until training is finished and we analyze the result).
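For a concrete sketch of that idea (the kernel and data below are made up for illustration, not taken from the article), a tiny 1-D convolution in NumPy shows that a kernel is nothing more than a short array of numbers that transforms its input into another representation:

```python
import numpy as np

# A kernel is just an array of numbers; this one acts as a crude edge detector.
kernel = np.array([-1.0, 0.0, 1.0])

# Some input data: a 1-D signal that steps up and then back down.
data = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])

# Sliding the kernel over the data ("valid" convolution) produces a new
# representation of the input: non-zero where the signal changes, zero where it is flat.
transformed = np.convolve(data, kernel, mode="valid")
print(transformed)  # => [-1. -1.  0.  1.  1.]
```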
onlinepersona@programming.dev
on 01 Jun 22:25
The weights for the neural network or the embeddings?
bitfucker@programming.dev
on 01 Jun 22:40
No. Normally the kernel doesn’t get updated in the network during training; such values are called hyper-parameters. They do affect training, but they are not updated by the training algorithm.
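To make that distinction concrete (a minimal sketch with made-up names, not the setup from the article): below, a hand-picked smoothing kernel shapes the features the model sees, while the training loop only ever updates the model’s own weight.

```python
import numpy as np

# Hyper-parameter-style kernel: chosen by hand before training and never
# touched by the update rule below.
smoothing_kernel = np.array([0.25, 0.5, 0.25])

# Trainable parameter: this is what gradient descent actually updates.
weight = 0.0

def features(x):
    # The kernel transforms the raw input into the representation
    # the model trains on.
    return np.convolve(x, smoothing_kernel, mode="same")

# Toy data: the target is 2x the (smoothed) input.
x = np.random.randn(32)
y = 2.0 * features(x)

learning_rate = 0.1
for _ in range(100):
    pred = weight * features(x)
    grad = np.mean(2.0 * (pred - y) * features(x))  # d(MSE)/d(weight)
    weight -= learning_rate * grad                  # only `weight` changes

print(round(weight, 3))   # approaches 2.0
print(smoothing_kernel)   # unchanged: it was fixed before training started
```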
SpicyToaster420@sopuli.xyz
on 01 Jun 07:56
Awesome use of LLMs. I wonder why they didn’t use FP8 quantization though, especially since their target hardware was an L40S.
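For anyone curious what that would involve, here is a rough sketch of per-tensor FP8 (E4M3) quantization, assuming a recent PyTorch build that exposes the float8 dtypes; whatever pipeline the article actually used may look quite different.

```python
import torch

# Weights we might want to store/compute in FP8 on FP8-capable hardware
# such as the L40S (Ada tensor cores support the E4M3/E5M2 formats).
weights = torch.randn(4096, 4096, dtype=torch.float16)

# Per-tensor scaling: squeeze the values into E4M3's representable range,
# then cast. The scale is kept in higher precision for dequantization.
fp8_max = torch.finfo(torch.float8_e4m3fn).max   # roughly 448 for E4M3
scale = weights.abs().max().float() / fp8_max
weights_fp8 = (weights / scale).to(torch.float8_e4m3fn)

# Dequantize (in practice this would happen inside the matmul epilogue).
restored = weights_fp8.to(torch.float16) * scale
print(weights_fp8.dtype, (weights - restored).abs().max())
```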