Jan-nano-128k: A 4B Model with a Super-Long Context Window (huggingface.co)
from yogthos@lemmy.ml to technology@lemmy.ml on 25 Jun 14:23
https://lemmy.ml/post/32242468

Jan-nano-128k is a model fine-tuned to improve performance when YaRN scaling is enabled (instead of suffering degraded performance). Running it requires an inference engine with YaRN scaling support.

GGUF quants can be found at: huggingface.co/Menlo/Jan-nano-128k-gguf
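llama.cpp is one inference engine with YaRN support built in. A minimal launch sketch; the quant filename and context length below are illustrative assumptions, not taken from the post, while `--rope-scaling` and `-c` are standard llama.cpp flags:

```shell
# Sketch: serving the GGUF with llama.cpp and YaRN rope scaling enabled.
# The model filename and 128k (131072-token) context are assumptions.
llama-server \
  -m Jan-nano-128k-Q4_K_M.gguf \
  -c 131072 \
  --rope-scaling yarn
```

Without `--rope-scaling yarn`, contexts beyond the base training window would rely on unscaled RoPE and quality would degrade.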

#technology
