DeepSeek: R1 Distill Llama 70B

deepseek/deepseek-r1-distill-llama-70b

JSONReasoningStreaming

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...

Pricing

Input

$0.7 / 1M

Output

$0.8 / 1M

Specs

Context

131,072 tokens

Input

text

Output

text

Knowledge cutoff: 2024-07-31

Released: 2025-01

Supported parameters

frequency_penaltyinclude_reasoninglogit_biasmax_tokensmin_ppresence_penaltyreasoningrepetition_penaltyresponse_formatseedstoptemperaturetop_ktop_p

Open weights · HuggingFace

110,512 downloads/mo

776 likes

mit text-generation

arXiv:2501.12948

View on HuggingFace →

Use this model →