Jan-nano, a 4B model that can outperform DeepSeek 671B on certain metrics using MCP
from yogthos@lemmy.ml to technology@lemmy.ml on 15 Jun 20:33
https://lemmy.ml/post/31744080
Jan-nano is a model fine-tuned with DAPO on Qwen3-4B. Jan-nano comes with some unique capabilities:
- It can perform deep research (with the right prompting)
- It picks up relevant information effectively from search results
- It uses tools efficiently
The model was evaluated using SimpleQA, a relatively straightforward benchmark that tests whether the model can find and extract the right answers.
Jan-nano outperforms DeepSeek-671B on this benchmark by using an agentic, tool-based approach. A 4B model obviously has its limitations, but it’s interesting to see how far these things can be pushed. Jan-nano can serve as a self-hosted Perplexity alternative on a budget.
You can find the model at: huggingface.co/Menlo/Jan-nano
And a gguf is available at: huggingface.co/Menlo/Jan-nano-gguf
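For anyone wondering what an "agentic and tool-usage-based approach" looks like in practice, here is a minimal sketch of the general loop: the model either asks for a tool call or gives a final answer, and tool output is fed back in. Everything here is a hypothetical stand-in, not Jan-nano's actual setup: `fake_model` and `fake_search` are stubs (a real run would call the model and a search tool exposed over MCP), and `is_correct` is a crude substring check, not how SimpleQA itself grades answers.

```python
from typing import Callable

# Hypothetical stand-in for a web search tool. A real setup would call
# a search tool exposed by an MCP server instead of a hard-coded dict.
def fake_search(query: str) -> str:
    corpus = {
        "capital of australia": "Canberra is the capital city of Australia.",
    }
    return corpus.get(query.lower(), "no results")

# Hypothetical stand-in for the model. It requests a search first, then
# answers using the most recent tool observation.
def fake_model(question: str, observations: list[str]) -> dict:
    if not observations:
        return {"action": "search", "input": question}
    return {"action": "answer", "input": observations[-1]}

def run_agent(
    question: str,
    model: Callable[[str, list[str]], dict],
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 4,
) -> str:
    """Loop until the model answers or the step budget runs out."""
    observations: list[str] = []
    for _ in range(max_steps):
        step = model(question, observations)
        if step["action"] == "answer":
            return step["input"]
        # Any other action is treated as the name of a tool to call.
        tool = tools[step["action"]]
        observations.append(tool(step["input"]))
    return "no answer within step budget"

# Simplified scoring (an assumption for illustration): count the answer
# as right if the reference string appears in it, ignoring case.
def is_correct(prediction: str, reference: str) -> bool:
    return reference.lower() in prediction.lower()

if __name__ == "__main__":
    answer = run_agent("capital of Australia", fake_model, {"search": fake_search})
    print(answer, is_correct(answer, "Canberra"))
```

The point of the sketch is that the answer comes from the retrieved text rather than from the model's own weights, which is how a small model could plausibly do well on a lookup-style benchmark.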
Very cool. This kind of stuff is interesting for local use on phones. Not that I will be implementing that myself, but someone probably will in the future, and I will be using it.