Jan-nano-128k: A 4B Model with a Super-Long Context Window
(huggingface.co)
from yogthos@lemmy.ml to technology@lemmy.ml on 25 Jun 14:23
https://lemmy.ml/post/32242468
Jan-nano-128k is a model fine-tuned to maintain performance when YaRN scaling is enabled (rather than degrading, as models typically do). It requires an inference engine that supports YaRN scaling.
- It can use tools continuously and repeatedly
- It can perform deep research
- It is extremely persistent
GGUF files can be found at: huggingface.co/Menlo/Jan-nano-128k-gguf
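Since the post notes that the model needs YaRN scaling support from the inference engine, here is a minimal sketch of how that might look with llama.cpp's `llama-cli`, which exposes RoPE/YaRN scaling flags. The GGUF filename is a placeholder, and the exact flag values needed (if any, since the model's metadata may already carry its scaling parameters) are assumptions, not confirmed by the post.

```shell
# Hypothetical invocation; filename is a placeholder.
# --rope-scaling yarn selects YaRN position scaling,
# --ctx-size requests the full 128k (131072-token) context.
llama-cli \
  -m Jan-nano-128k-Q8_0.gguf \
  --rope-scaling yarn \
  --ctx-size 131072 \
  -p "Summarize the following document:"
```

Note that a 128k context can require substantial memory for the KV cache even with a 4B model, so smaller `--ctx-size` values may be more practical on consumer hardware.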