← All models View on HuggingFace →
N
Nemotron Nano 12b V2 Vl
nvidia/nemotron-nano-12b-v2-vl
VisionTool useReasoningStreaming
NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s...
Specs
Context
128,000 tokens
Input
image, text, video
Output
text
Released: 2025-10
Supported parameters
include_reasoningmax_tokensreasoningseedtemperaturetool_choicetoolstop_p
Open weights · HuggingFace
150,225 downloads/mo
83 likes
other image-text-to-text