Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows (blogs.nvidia.com)
from ijeff@lemdro.id to technology@lemmy.world on 18 Oct 2023 12:04
https://lemdro.id/post/2378141

cross-posted from: lemdro.id/post/2377716 (!aistuff@lemdro.id)

#technology

korewa@reddthat.com on 18 Oct 2023 12:34

Dang, I need to try these. For now, only the Stable Diffusion extension for Automatic1111 is available.

I wonder if it will accelerate 30B models that don’t fit entirely in GPU VRAM.

If it only accelerates 13B models, those were already fast enough.
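To see why a 30B model is borderline for consumer RTX cards, here is a rough back-of-envelope sketch (my own illustration, not from the article) of the VRAM needed just for the weights at a given quantization level. It ignores the KV cache, activations, and runtime overhead, so real usage is higher.

```python
def weights_vram_gib(params_billions: float, bytes_per_param: float) -> float:
    """Approximate GiB of VRAM for model weights alone (illustrative only)."""
    return params_billions * 1e9 * bytes_per_param / (1024 ** 3)

# 30B parameters at 4-bit quantization (~0.5 bytes/param): ~14 GiB of weights,
# already close to the 16 GB on many RTX cards before any runtime overhead.
print(round(weights_vram_gib(30, 0.5), 1))

# 13B at 4-bit fits comfortably on a 24 GB card:
print(round(weights_vram_gib(13, 0.5), 1))
```

At fp16 (2 bytes/param) the same 30B model needs roughly 56 GiB for weights alone, which is why partial offload to system RAM is the usual workaround.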