DeepSeek: R1 Distill Llama 70B

deepseek/deepseek-r1-distill-llama-70b

JSON推理流式

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...

价格

输入

$0.7 / 1M

输出

$0.8 / 1M

参数

上下文

131,072 tokens

输入模态

text

输出模态

text

知识截止：2024-07-31

发布：2025-01

支持参数

frequency_penaltyinclude_reasoninglogit_biasmax_tokensmin_ppresence_penaltyreasoningrepetition_penaltyresponse_formatseedstoptemperaturetop_ktop_p

开放权重 · HuggingFace

110,512 月下载

776 收藏

mit text-generation

arXiv:2501.12948

在 HuggingFace 查看 →

使用该模型 →