DeepSeek

GRPO (Group Relative Policy Optimization) Study Notes

We introduce Group Relative Policy Optimization (GRPO), a variant of Proximal Policy Optimization (PPO)

DeepSeek #OpenSourceWeek - Five Consecutive Releases

We're a tiny team @deepseek_ai exploring AGI.

Andrej Karpathy in-depth explanation of large language model (LLM) technology (Part 1) - [Pretraining and Inference]

- introduction - pretraining data (internet) - tokenization - neural network I/O - neural network internals - inference

DeepSeek Janus Series: Unified Multimodal Understanding and Generation Models

Janus-Series: Unified Multimodal Understanding and Generation Models

Comparison of the reasoning processes between ChatGPT o1 pro and DeepSeek R1

DeepSeek R1 Vs ChatGPT o1 (My Experience)

DeepSeek R1: X.com User Reviews

Deepseek-r1 is open source and on par with o1 preview - @bindureddy

Paper of DeepSeek-R1: Exploration and Breakthrough of the New Generation Inference Model

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning