← All models View on HuggingFace →
Z
Z.ai: GLM 4.5V
z-ai/glm-4.5v
VisionTool useJSONReasoningStreaming
GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding,...
Pricing
Input
$0.6 / 1M
Output
$1.80 / 1M
Specs
Context
65,536 tokens
Input
text, image
Output
text
Knowledge cutoff: 2024-12-31
Released: 2025-08
Supported parameters
frequency_penaltyinclude_reasoningmax_tokenspresence_penaltyreasoningrepetition_penaltyresponse_formatseedstoptemperaturetool_choicetoolstop_ktop_p
Open weights · HuggingFace
178,678 downloads/mo
718 likes
mit image-text-to-text