Z.ai: GLM 4.6V
z-ai/glm-4.6v
Vision · Tool use · JSON · Reasoning · Streaming
GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts...
Pricing
Input
$0.30 / 1M tokens
Output
$0.90 / 1M tokens
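To make the per-million rates above concrete, here is a minimal sketch of a cost estimate. It assumes the listed prices are in US dollars per one million tokens and that input and output tokens are billed independently; the function name `estimate_cost_usd` is hypothetical and not part of any API.

```python
# Listed rates, assumed to be USD per 1,000,000 tokens.
INPUT_USD_PER_MILLION = 0.30
OUTPUT_USD_PER_MILLION = 0.90


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token document prompt with a 2,000-token answer.
    print(f"{estimate_cost_usd(100_000, 2_000):.4f} USD")
```

Under these assumptions, output tokens cost three times as much as input tokens, so long generated answers dominate the bill faster than long prompts do.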
Specs
Context
131,072 tokens
Input
image, text, video
Output
text
Released: 2025-12
Supported parameters
frequency_penalty · include_reasoning · max_tokens · presence_penalty · reasoning · repetition_penalty · response_format · seed · stop · temperature · tool_choice · tools · top_k · top_p
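The parameter names listed above can be used to pre-check a request before it is sent. The following is a minimal sketch, assuming request options are held in a flat dictionary keyed by these names; the helper `unsupported_parameters` is hypothetical, and how an actual provider treats unknown options is not stated here.

```python
# Parameter names copied from the "Supported parameters" list on this page.
SUPPORTED_PARAMETERS = frozenset({
    "frequency_penalty", "include_reasoning", "max_tokens",
    "presence_penalty", "reasoning", "repetition_penalty",
    "response_format", "seed", "stop", "temperature",
    "tool_choice", "tools", "top_k", "top_p",
})


def unsupported_parameters(options: dict) -> list[str]:
    """Return the option names that are not in the supported list, sorted (hypothetical helper)."""
    return sorted(name for name in options if name not in SUPPORTED_PARAMETERS)


if __name__ == "__main__":
    options = {"temperature": 0.2, "top_k": 40, "min_p": 0.1}
    print(unsupported_parameters(options))
```

Checking locally like this catches misspelled or unlisted option names early, rather than relying on whatever the serving side does with them.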
Open weights · HuggingFace
3,848 downloads/mo
392 likes
MIT · image-text-to-text