Jan-nano, a 4B model that can outperform DeepSeek 671B on certain metrics using MCP

Jan-nano, a 4B model that can outperform DeepSeek 671B on certain metrics using MCP
from yogthos@lemmy.ml to technology@lemmy.ml on 15 Jun 20:33
https://lemmy.ml/post/31744080

Jan-nano is a model fine-tuned with DAPO on Qwen3-4B. Jan-nano comes with some unique capabilities:

It can perform deep research (with the right prompting)
It picks up relevant information effectively from search results
It uses tools efficiently

The model was evaluated using SimpleQA - a relatively straightforward benchmark to test whether the model can find and extract the right answers.

Jan-nano outperforms Deepseek-671B on this metric, using an agentic and tool-usage-based approach. A 4B model obviously has its limitations, but it’s interesting to see how far these things can be pushed. Jan-nano can serve as your self-hosted Perplexity alternative on a budget.

You can find the model at: huggingface.co/Menlo/Jan-nano

And a gguf is available at: huggingface.co/Menlo/Jan-nano-gguf

#technology

threaded - newest

geneva_convenience@lemmy.ml on 18 Jun 08:27 collapse

Very cool. This kind of stuff is interesting for local use on phones. Not that I will be implementing that myself. But someone will probably do so in the future and I will be using it